Re: [Gluster-users] 2 node replica 2 cluster - volume on one node stopped responding
Anyone who could help? We just ran into this exact same problem again. I just noticed we are running GlusterFS 3.7.1 on the clients (oVirt hosts/VDSM). Could this be an issue?

On 8 June 2015 at 11:40, Tiemen Ruiten t.rui...@rdmedia.com wrote:
> Some extra points:
>
> - 10.100.3.41 is one of the oVirt hosts.
> - I only needed to restart glusterfsd and glusterd in one of the gluster nodes (also the one where I pulled the logs from) to get everything in working order.
> - it's a separate gluster volume, not managed from oVirt engine.

--
Tiemen Ruiten
Systems Engineer
RD Media
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] 2 node replica 2 cluster - volume on one node stopped responding
I just found this on Bugzilla:

https://bugzilla.redhat.com/show_bug.cgi?id=1134305
rpc actor failed to complete successfully messages in Glusterd

Is it related?

On 9 June 2015 at 11:30, Tiemen Ruiten t.rui...@rdmedia.com wrote:
> Anyone who could help? We just ran into this exact same problem again. I just noticed we are running GlusterFS 3.7.1 on the clients (oVirt hosts/VDSM). Could this be an issue?

--
Tiemen Ruiten
Systems Engineer
RD Media
Re: [Gluster-users] 2 node replica 2 cluster - volume on one node stopped responding
Some extra points:

- 10.100.3.41 is one of the oVirt hosts.
- I only needed to restart glusterfsd and glusterd in one of the gluster nodes (also the one where I pulled the logs from) to get everything in working order.
- it's a separate gluster volume, not managed from oVirt engine.

On 8 June 2015 at 11:35, Tiemen Ruiten t.rui...@rdmedia.com wrote:
> Hello,
>
> We are running an oVirt cluster on top of a 2 node replica 2 Gluster volume. Yesterday we suddenly noticed VMs were not responding and quickly found out the Gluster volume had issues. These errors were filling up the etc-glusterfs-glusterd.log file:
>
> [2015-06-07 08:36:26.498012] W [rpcsvc.c:270:rpcsvc_program_actor] 0-rpc-service: RPC program not available (req 1298437 330) for 10.100.3.41:1022
> [2015-06-07 08:36:26.498073] E [rpcsvc.c:565:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
>
> A restart of glusterfsd and glusterd resolved the issue, but triggered a lot of self-heals. We are running glusterfs 3.7.0 on ZFS.
>
> I have attached etc-glusterfs-glusterd.log, the brick log file and the glustershd.log.
>
> I would be grateful if anyone could shed any light on what happened here and if there's anything we can do to prevent it.

--
Tiemen Ruiten
Systems Engineer
RD Media
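For anyone trying to work out which client is behind a similar flood of warnings, here is a minimal Python sketch that counts the "RPC program not available" warnings per client address. It assumes the log lines have exactly the layout of the two lines quoted above; the names `LOG_LINE`, `CLIENT` and `count_rpc_warnings` are made up for this illustration and are not part of GlusterFS.

```python
import re
from collections import Counter

# Assumed layout, taken from the two lines quoted in this thread:
# [timestamp] LEVEL [file:line:function] domain: message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"\[(?P<src>[^\]]+)\]\s+(?P<domain>[^:]+):\s+(?P<msg>.*)$"
)

# The warning ends with "for <client-ip>:<port>".
CLIENT = re.compile(r"RPC program not available .* for (?P<addr>[\d.]+):\d+$")


def count_rpc_warnings(lines):
    """Count 'RPC program not available' warnings per client IP."""
    counts = Counter()
    for line in lines:
        parsed = LOG_LINE.match(line.strip())
        if not parsed:
            continue  # skip anything that does not look like a log entry
        hit = CLIENT.search(parsed.group("msg"))
        if hit:
            counts[hit.group("addr")] += 1
    return counts


# The two lines from the original report, used as sample input.
sample = [
    "[2015-06-07 08:36:26.498012] W [rpcsvc.c:270:rpcsvc_program_actor] "
    "0-rpc-service: RPC program not available (req 1298437 330) "
    "for 10.100.3.41:1022",
    "[2015-06-07 08:36:26.498073] E [rpcsvc.c:565:rpcsvc_check_and_reply_error] "
    "0-rpcsvc: rpc actor failed to complete successfully",
]

for addr, n in count_rpc_warnings(sample).most_common():
    print(addr, n)
```

To run it against a real log, pass an open file object instead of `sample`. If one address dominates the counts, that client (here an oVirt host) is the one to look at first, for example to compare its GlusterFS version with the servers.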