Hi,
So, can anyone try to reproduce this problem? I've downgraded to
v3.5.2, which I'm running in production, and I get the same behavior.
Steps to reproduce:
1. probe server2, create and start volume
2. do not mount volume
3. reboot/power off server2, or block server1 in its iptables (with -j
DROP, not -j REJECT)
4. on server1 (while server2 is rebooting or dropping traffic from
server1): time mount -t glusterfs server1:/volume /some/path
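A minimal sketch of steps 3-4 using iptables, assuming server1 = 192.168.0.10 and server2 = 192.168.2.10 as in the original report (requires root; adjust IPs/paths to your environment):

```shell
# On server2: silently drop all traffic from server1.
# -I INPUT inserts the rule at the top of the chain so it matches first.
iptables -I INPUT -s 192.168.0.10 -j DROP

# On server1 (while server2 is dropping packets): time the FUSE mount.
time mount -t glusterfs 192.168.0.10:/aloha /var/www/hawaii

# Cleanup on server2 afterwards: delete the rule again.
iptables -D INPUT -s 192.168.0.10 -j DROP
```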
PS: with -j REJECT it mounts instantly; with -j DROP it always waits
2 min 7 s.
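One possible explanation for that exact figure (my hypothesis, not confirmed from the logs): 2 min 7 s is 127 s, which is the total Linux TCP connect timeout when net.ipv4.tcp_syn_retries is 6 (initial 1 s timeout, doubled on each retransmission). With -j DROP the SYNs are silently discarded, so connect() waits out the full backoff; with -j REJECT an ICMP error comes back immediately. The arithmetic:

```shell
# Sum the SYN retransmission backoff: 1s initial timeout, doubled on
# each of 6 retries -> 1+2+4+8+16+32+64 = 127s, i.e. 2m7s.
total=0
rto=1
for retry in 0 1 2 3 4 5 6; do
    total=$((total + rto))
    rto=$((rto * 2))
done
echo "${total}s"    # prints 127s
```

If that theory is right, the wait should shrink when net.ipv4.tcp_syn_retries is lowered on the client.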
Thanks!
On 11/11/2014 01:19, Pranith Kumar Karampuri wrote:
On 11/10/2014 11:47 PM, A F wrote:
Hello,
I have two servers, 192.168.0.10 and 192.168.2.10, running gluster
3.6.1 (installed from the gluster repo) on AWS Linux. Both servers are
fully reachable on the LAN.
# rpm -qa|grep gluster
glusterfs-3.6.1-1.el6.x86_64
glusterfs-server-3.6.1-1.el6.x86_64
glusterfs-libs-3.6.1-1.el6.x86_64
glusterfs-api-3.6.1-1.el6.x86_64
glusterfs-cli-3.6.1-1.el6.x86_64
glusterfs-fuse-3.6.1-1.el6.x86_64
These are the commands I ran:
# gluster peer probe 192.168.2.10
# gluster volume create aloha replica 2 transport tcp
192.168.0.10:/var/aloha 192.168.2.10:/var/aloha force
# gluster volume start aloha
# gluster volume set aloha network.ping-timeout 5
# gluster volume set aloha nfs.disable on
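For reference, the options set above can be confirmed with `gluster volume info`; the output below is illustrative (exact formatting varies by version):

```shell
# Confirm the reconfigured options actually took effect:
gluster volume info aloha
# ...
# Options Reconfigured:
# nfs.disable: on
# network.ping-timeout: 5
```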
Problem number 1:
tail -f /var/log/glusterfs/etc-glusterfs-glusterd.vol.log shows the
log cluttered with:
[2014-11-10 17:41:26.328796] W [socket.c:611:__socket_rwv]
0-management: readv on
/var/run/38c520c774793c9cdae8ace327512027.socket failed (Invalid
argument)
This happens every 3 seconds on both servers. It is related to NFS
and probably rpcbind, but I want both of them disabled. As you can
see, I've set the volume to disable NFS - why doesn't it keep quiet
about it then?
Problem number 2:
In fstab on server 192.168.0.10:
192.168.0.10:/aloha /var/www/hawaii glusterfs defaults,_netdev 0 0
In fstab on server 192.168.2.10:
192.168.2.10:/aloha /var/www/hawaii glusterfs defaults,_netdev 0 0
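One thing that may be worth trying for the reboot case (an assumption on my part, not something taken from your logs): glusterfs-fuse supports a backup-volfile-servers mount option, so a client can fall back to the peer for fetching the volfile when its primary is down - check `man mount.glusterfs` for your version. A sketch of the fstab line on 192.168.0.10:

```shell
# /etc/fstab on 192.168.0.10 -- fall back to the peer for the volfile
# (backup-volfile-servers availability is an assumption; verify locally):
192.168.0.10:/aloha /var/www/hawaii glusterfs defaults,_netdev,backup-volfile-servers=192.168.2.10 0 0
```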
If I shut down one of the servers (192.168.2.10) and reboot the
remaining one (192.168.0.10), it doesn't come back up as fast as it
should: it lags a few minutes waiting for gluster. After it eventually
boots, the mount point is not mounted and the volume is stopped:
# gluster volume status
Status of volume: aloha
Gluster process                                Port    Online  Pid
------------------------------------------------------------------------------
Brick 192.168.0.10:/var/aloha                  N/A     N       N/A
Self-heal Daemon on localhost                  N/A     N       N/A

Task Status of Volume aloha
------------------------------------------------------------------------------
There are no active volume tasks
This didn't happen before, but fine - I first have to stop the volume
and then start it again. It now shows as online:
Brick 192.168.0.10:/var/aloha                  49155   Y       3473
Self-heal Daemon on localhost                  N/A     Y       3507
# time mount -a
real 2m7.307s
# time mount -t glusterfs 192.168.0.10:/aloha /var/www/hawaii
real 2m7.365s
# strace mount -t glusterfs 192.168.0.10:/aloha /var/www/hawaii
(attached)
# tail /var/log/glusterfs/* -f|grep -v readv
(attached)
I've done this setup before, so I'm amazed it doesn't work. I even
have the same options and setup in production at the moment, and
there, for example, I'm not getting the readv errors. I can't test
the mount behavior there, but I believe I covered it back when I was
testing that environment.
Any help is kindly appreciated.
CC glusterd folks
Pranith
_______________________________________________
Gluster-users mailing list
[email protected]
http://supercolony.gluster.org/mailman/listinfo/gluster-users