Hi,
So, can anyone try to reproduce this problem? I've downgraded to
v3.5.2, which I'm running in production, and I get the same behavior.
Steps to reproduce:
1. probe server2, create and start volume
2. do not mount volume
3. reboot/power off server2, or block server1 in its iptables (with -j
DROP, not -j REJECT)
4. on server1 (while server2 is rebooting or dropping traffic from
server1): time mount -t glusterfs server1:/volume /some/path
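A minimal sketch of steps 3-4 using iptables, assuming server1 = 192.168.0.10 and server2 = 192.168.2.10 as in the original report (requires root; adjust IPs/paths to your environment):

```shell
# On server2: silently drop all traffic from server1.
# -I INPUT inserts the rule at the top of the chain so it matches first.
iptables -I INPUT -s 192.168.0.10 -j DROP

# On server1 (while server2 is dropping packets): time the FUSE mount.
time mount -t glusterfs 192.168.0.10:/aloha /var/www/hawaii

# Cleanup on server2 afterwards: delete the rule again.
iptables -D INPUT -s 192.168.0.10 -j DROP
```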
PS: with -j REJECT it mounts instantly; with -j DROP it always waits
2 min 7 s.
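One possible explanation for that exact figure (my hypothesis, not confirmed from the logs): 2 min 7 s is 127 s, which is the total Linux TCP connect timeout when net.ipv4.tcp_syn_retries is 6 (initial 1 s timeout, doubled on each retransmission). With -j DROP the SYNs are silently discarded, so connect() waits out the full backoff; with -j REJECT an ICMP error comes back immediately. The arithmetic:

```shell
# Sum the SYN retransmission backoff: 1s initial timeout, doubled on
# each of 6 retries -> 1+2+4+8+16+32+64 = 127s, i.e. 2m7s.
total=0
rto=1
for retry in 0 1 2 3 4 5 6; do
    total=$((total + rto))
    rto=$((rto * 2))
done
echo "${total}s"    # prints 127s
```

If that theory is right, the wait should shrink when net.ipv4.tcp_syn_retries is lowered on the client.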
Thanks!
On 11/11/2014 01:19, Pranith Kumar Karampuri wrote:
On 11/10/2014 11:47 PM, A F wrote:
Hello,
I have two servers, 192.168.0.10 and 192.168.2.10, running gluster
3.6.1 (installed from the gluster repo) on AWS Linux. Both servers are
fully reachable on the LAN.
# rpm -qa|grep gluster
glusterfs-3.6.1-1.el6.x86_64
glusterfs-server-3.6.1-1.el6.x86_64
glusterfs-libs-3.6.1-1.el6.x86_64
glusterfs-api-3.6.1-1.el6.x86_64
glusterfs-cli-3.6.1-1.el6.x86_64
glusterfs-fuse-3.6.1-1.el6.x86_64
These are the commands I ran:
# gluster peer probe 192.168.2.10
# gluster volume create aloha replica 2 transport tcp
192.168.0.10:/var/aloha 192.168.2.10:/var/aloha force
# gluster volume start aloha
# gluster volume set aloha network.ping-timeout 5
# gluster volume set aloha nfs.disable on
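For reference, the options set above can be confirmed with `gluster volume info`; the output below is illustrative (exact formatting varies by version):

```shell
# Confirm the reconfigured options actually took effect:
gluster volume info aloha
# ...
# Options Reconfigured:
# nfs.disable: on
# network.ping-timeout: 5
```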
Problem number 1:
tail -f /var/log/glusterfs/etc-glusterfs-glusterd.vol.log shows the
log cluttered with:
[2014-11-10 17:41:26.328796] W [socket.c:611:__socket_rwv]
0-management: readv on
/var/run/38c520c774793c9cdae8ace327512027.socket failed (Invalid
argument)
This happens every 3 seconds on both servers. It is related to NFS
and probably rpcbind, but I want both of them disabled. As you can
see, I've set the volume to disable NFS - why doesn't it keep quiet
about it then?
Problem number 2:
In fstab on server 192.168.0.10:
192.168.0.10:/aloha /var/www/hawaii glusterfs defaults,_netdev 0 0
In fstab on server 192.168.2.10:
192.168.2.10:/aloha /var/www/hawaii glusterfs defaults,_netdev 0 0
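One thing that may be worth trying for the reboot case (an assumption on my part, not something taken from your logs): glusterfs-fuse supports a backup-volfile-servers mount option, so a client can fall back to the peer for fetching the volfile when its primary is down - check `man mount.glusterfs` for your version. A sketch of the fstab line on 192.168.0.10:

```shell
# /etc/fstab on 192.168.0.10 -- fall back to the peer for the volfile
# (backup-volfile-servers availability is an assumption; verify locally):
192.168.0.10:/aloha /var/www/hawaii glusterfs defaults,_netdev,backup-volfile-servers=192.168.2.10 0 0
```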
If I shut down one of the servers (192.168.2.10) and reboot the
remaining one (192.168.0.10), it doesn't come back up as fast as it
should: it lags a few minutes waiting for gluster. After it eventually
boots, the mount point is not mounted and the volume is stopped:
# gluster volume status
Status of volume: aloha
Gluster process                                Port    Online  Pid
------------------------------------------------------------------------------
Brick 192.168.0.10:/var/aloha                  N/A     N       N/A
Self-heal Daemon on localhost                  N/A     N       N/A

Task Status of Volume aloha
------------------------------------------------------------------------------
There are no active volume tasks
This didn't happen before, but fine - I first have to stop the volume
and then start it again. It now shows as online:
Brick 192.168.0.10:/var/aloha                  49155   Y       3473
Self-heal Daemon on localhost                  N/A     Y       3507
# time mount -a
real 2m7.307s
# time mount -t glusterfs 192.168.0.10:/aloha /var/www/hawaii
real 2m7.365s
# strace mount -t glusterfs 192.168.0.10:/aloha /var/www/hawaii
(attached)
# tail /var/log/glusterfs/* -f|grep -v readv
(attached)
I've done this setup before, so I'm amazed it doesn't work. I even
have the same options and setup in production at the moment, and
there, for example, I'm not getting the readv errors. I can't test
the mount behavior there, but I believe I covered it back when I was
testing that environment.
Any help is kindly appreciated.
CC glusterd folks
Pranith
_______________________________________________
Gluster-users mailing list
[email protected]
http://supercolony.gluster.org/mailman/listinfo/gluster-users