Re: [Gluster-users] 2 node replica 2 cluster - volume on one node stopped responding

2015-06-09 Thread Tiemen Ruiten
Anyone who could help? We just ran into this exact same problem again. I
just noticed we are running GlusterFS 3.7.1 on the clients (oVirt
hosts/VDSM). Could this be an issue?

On 8 June 2015 at 11:40, Tiemen Ruiten t.rui...@rdmedia.com wrote:

 Some extra points:

 - 10.100.3.41 is one of the oVirt hosts.

 - I only needed to restart glusterfsd and glusterd on one of the gluster
 nodes (also the one where I pulled the logs from) to get everything in
 working order.

 - it's a separate gluster volume, not managed from oVirt engine.

 On 8 June 2015 at 11:35, Tiemen Ruiten t.rui...@rdmedia.com wrote:

 Hello,

 We are running an oVirt cluster on top of a 2 node replica 2 Gluster
 volume. Yesterday we suddenly noticed VMs were not responding and quickly
 found out the Gluster volume had issues. These errors were filling up the
 etc-glusterfs-glusterd.log file:

 [2015-06-07 08:36:26.498012] W [rpcsvc.c:270:rpcsvc_program_actor]
 0-rpc-service: RPC program not available (req 1298437 330) for
 10.100.3.41:1022
 [2015-06-07 08:36:26.498073] E
 [rpcsvc.c:565:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to
 complete successfully
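
 For anyone trying to gauge how often these warnings occur and which clients
 trigger them, a minimal Python sketch along these lines could tally them. It
 assumes each entry sits on a single line in the log file itself. The function
 name is hypothetical, and reading the two numbers after "req" as RPC program
 number and version is an assumption rather than something stated in this
 thread.

```python
import re
from collections import Counter

# Pattern for the glusterd warning quoted above. Assumption: in the log file
# itself every entry is on one line (no mail-client wrapping).
WARNING_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] W \[rpcsvc\.c:\d+:rpcsvc_program_actor\] "
    r"0-rpc-service: RPC program not available "
    r"\(req (?P<prog>\d+) (?P<vers>\d+)\) for (?P<client>\S+):(?P<port>\d+)$"
)


def count_unavailable_programs(lines):
    """Count 'RPC program not available' warnings per (client, prog, vers).

    'prog' and 'vers' are the two numbers after 'req'; treating them as
    program number and version is an assumption.
    """
    counts = Counter()
    for line in lines:
        match = WARNING_RE.match(line.strip())
        if match:
            key = (
                match.group("client"),
                int(match.group("prog")),
                int(match.group("vers")),
            )
            counts[key] += 1
    return counts


if __name__ == "__main__":
    # Sample entries copied from this thread, joined onto single lines.
    sample = [
        "[2015-06-07 08:36:26.498012] W [rpcsvc.c:270:rpcsvc_program_actor] "
        "0-rpc-service: RPC program not available (req 1298437 330) for "
        "10.100.3.41:1022",
        "[2015-06-07 08:36:26.498073] E "
        "[rpcsvc.c:565:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor "
        "failed to complete successfully",
    ]
    for (client, prog, vers), n in count_unavailable_programs(sample).items():
        print(f"{client}: {n} warning(s) for program {prog} version {vers}")
```

 To run it against a real log, pass an open file object instead of the sample
 list, since the function only iterates over lines.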


 A restart of glusterfsd and glusterd resolved the issue, but triggered a
 lot of self-heals.

 We are running glusterfs 3.7.0 on ZFS.

 I have attached etc-glusterfs-glusterd.log, the brick log file and the
 glustershd.log. I would be grateful if anyone could shed any light on what
 happened here and if there's anything we can do to prevent it.

 --
 Tiemen Ruiten
 Systems Engineer
 RD Media




 --
 Tiemen Ruiten
 Systems Engineer
 RD Media




-- 
Tiemen Ruiten
Systems Engineer
RD Media
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] 2 node replica 2 cluster - volume on one node stopped responding

2015-06-09 Thread Tiemen Ruiten
I just found this on Bugzilla:
https://bugzilla.redhat.com/show_bug.cgi?id=1134305
"rpc actor failed to complete successfully" messages in Glusterd

Is it related?

-- 
Tiemen Ruiten
Systems Engineer
RD Media

Re: [Gluster-users] 2 node replica 2 cluster - volume on one node stopped responding

2015-06-08 Thread Tiemen Ruiten
Some extra points:

- 10.100.3.41 is one of the oVirt hosts.

- I only needed to restart glusterfsd and glusterd on one of the gluster
nodes (also the one where I pulled the logs from) to get everything in
working order.

- it's a separate gluster volume, not managed from oVirt engine.

-- 
Tiemen Ruiten
Systems Engineer
RD Media