> On Apr 6, 2015, at 10:22 PM, Joe Julian <[email protected]> wrote:
> 
> On 04/06/2015 09:00 PM, Ravishankar N wrote:
>> 
>> 
>> On 04/07/2015 04:15 AM, CJ Baar wrote:
>>> I am hoping someone can give me some direction on this. I have been 
>>> searching and trying various tweaks all day. I am trying to set up a 
>>> two-node cluster with a replicated volume. Each node has a brick under 
>>> /export, and a local mount using glusterfs under /mnt. The volume was 
>>> created and mounted as:
>>>    gluster volume create test1 rep 2 g01.x.local:/exports/sdb1/brick 
>>> g02.x.local:/exports/sdb1/brick
>>>    gluster volume start test1
>>>    mount -t glusterfs g01.x.local:/test1 /mnt/test1
>>> When I write a file to one node, it shows up instantly on the other… just 
>>> as I expect it to.
>>> 
>>> My problem is that if I reboot one node, the mount on the other completely 
>>> hangs until the rebooted node comes back up. This seems to defeat the 
>>> purpose of being highly-available. Is there some setting I am missing? How 
>>> do I keep the volume on a single node alive during a failure?
>>> Any info is appreciated. Thank you.
>> 
>> You can explore the network.ping-timeout setting; try reducing it from the 
>> default value of 42 seconds.
>> -Ravi
> That's probably wrong. If you're doing a proper reboot, the services should 
> be stopped before shutting down, which does all the proper handshaking for 
> closing a TCP connection and allows the client to avoid the ping-timeout. 
> Ping-timeout only comes into play if there's a sudden, unexpected loss of 
> communication with the server, such as a power loss or network partition. 
> Most communication losses should be transient, and recovery is less 
> impactful if you can wait for the transient issue to resolve.
> 
> No, if you're hanging when one server is shut down, then your client isn't 
> connecting to all the servers as it should. Check your client logs to figure 
> out why.

The logs, as I interpret them, show both bricks being connected successfully 
when I do the mount (mount -t glusterfs g01.x.local:/test1 /mnt/test1). The 
client even claims to set the read preference to the correct local brick.

[2015-04-07 16:13:05.581085] I [rpc-clnt.c:1761:rpc_clnt_reconfig] 
0-test1-client-0: changing port to 49152 (from 0)
[2015-04-07 16:13:05.583826] I 
[client-handshake.c:1413:select_server_supported_programs] 0-test1-client-0: 
Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-04-07 16:13:05.584017] I [client-handshake.c:1200:client_setvolume_cbk] 
0-test1-client-0: Connected to test1-client-0, attached to remote volume 
'/exports/sdb1/brick'.
[2015-04-07 16:13:05.584030] I [client-handshake.c:1210:client_setvolume_cbk] 
0-test1-client-0: Server and Client lk-version numbers are not same, reopening 
the fds
[2015-04-07 16:13:05.584122] I [MSGID: 108005] [afr-common.c:3552:afr_notify] 
0-test1-replicate-0: Subvolume 'test1-client-0' came back up; going online.
[2015-04-07 16:13:05.584146] I 
[client-handshake.c:188:client_set_lk_version_cbk] 0-test1-client-0: Server lk 
version = 1
[2015-04-07 16:13:05.585647] I [rpc-clnt.c:1761:rpc_clnt_reconfig] 
0-test1-client-1: changing port to 49152 (from 0)
[2015-04-07 16:13:05.590017] I 
[client-handshake.c:1413:select_server_supported_programs] 0-test1-client-1: 
Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-04-07 16:13:05.591067] I [client-handshake.c:1200:client_setvolume_cbk] 
0-test1-client-1: Connected to test1-client-1, attached to remote volume 
'/exports/sdb1/brick'.
[2015-04-07 16:13:05.591079] I [client-handshake.c:1210:client_setvolume_cbk] 
0-test1-client-1: Server and Client lk-version numbers are not same, reopening 
the fds
[2015-04-07 16:13:05.595077] I [fuse-bridge.c:5080:fuse_graph_setup] 0-fuse: 
switched to graph 0
[2015-04-07 16:13:05.595144] I 
[client-handshake.c:188:client_set_lk_version_cbk] 0-test1-client-1: Server lk 
version = 1
[2015-04-07 16:13:05.595265] I [fuse-bridge.c:4009:fuse_init] 0-glusterfs-fuse: 
FUSE inited with protocol versions: glusterfs 7.22 kernel 7.22
[2015-04-07 16:13:05.596883] I [afr-common.c:1484:afr_local_discovery_cbk] 
0-test1-replicate-0: selecting local read_child test1-client-0
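
(As an additional check, I believe the client connections can also be listed 
from the server side with

    gluster volume status test1 clients

which should show the clients attached to each brick, although the mount log 
above already looks conclusive to me.)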


This is all the log I get on node1 when I drop node2. It takes almost two 
minutes for node1 to resume.

[2015-04-07 16:20:48.278742] W [socket.c:611:__socket_rwv] 0-management: readv 
on 172.32.65.241:24007 failed (No data available)
[2015-04-07 16:20:48.278837] I [MSGID: 106004] 
[glusterd-handler.c:4365:__glusterd_peer_rpc_notify] 0-management: Peer 
1069f037-13eb-458e-a9c4-0e7e79e595d0, in Peer in Cluster state, has 
disconnected from glusterd.
[2015-04-07 16:20:48.279062] W [glusterd-locks.c:647:glusterd_mgmt_v3_unlock] 
(--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x7f736ad56550] (--> 
/usr/lib64/glusterfs/3.6.2/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x428)[0x7f735fdf1df8]
 (--> 
/usr/lib64/glusterfs/3.6.2/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x262)[0x7f735fd662c2]
 (--> 
/usr/lib64/glusterfs/3.6.2/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x60)[0x7f735fd51a80]
 (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x1a3)[0x7f736ab2bf63] ))))) 
0-management: Lock for vol test1 not held
[2015-04-07 16:22:24.766177] W 
[glusterd-op-sm.c:4021:glusterd_op_modify_op_ctx] 0-management: op_ctx 
modification failed
[2015-04-07 16:22:24.766587] I 
[glusterd-handler.c:3803:__glusterd_handle_status_volume] 0-management: 
Received status volume req for volume test1
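
(Since this was a hard drop rather than a clean shutdown, I assume the delay 
is at least partly the network.ping-timeout that Ravi mentioned. If I do 
experiment with lowering it, I assume the command would be something like

    gluster volume set test1 network.ping-timeout 10

where the 10-second value is picked arbitrarily; I have not tried this yet.)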


If I try a “graceful” shutdown by manually stopping the glusterd service, the 
mount stays up and works… until the node itself is shut down. This is the log 
from node1 after issuing “service glusterd stop” on node2.

[2015-04-07 16:32:57.224545] W [socket.c:611:__socket_rwv] 0-management: readv 
on 172.32.65.241:24007 failed (No data available)
[2015-04-07 16:32:57.224612] I [MSGID: 106004] 
[glusterd-handler.c:4365:__glusterd_peer_rpc_notify] 0-management: Peer 
1069f037-13eb-458e-a9c4-0e7e79e595d0, in Peer in Cluster state, has 
disconnected from glusterd.
[2015-04-07 16:32:57.224829] W [glusterd-locks.c:647:glusterd_mgmt_v3_unlock] 
(--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x7f736ad56550] (--> 
/usr/lib64/glusterfs/3.6.2/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x428)[0x7f735fdf1df8]
 (--> 
/usr/lib64/glusterfs/3.6.2/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x262)[0x7f735fd662c2]
 (--> 
/usr/lib64/glusterfs/3.6.2/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x60)[0x7f735fd51a80]
 (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x1a3)[0x7f736ab2bf63] ))))) 
0-management: Lock for vol test1 not held
[2015-04-07 16:33:03.506088] W 
[glusterd-op-sm.c:4021:glusterd_op_modify_op_ctx] 0-management: op_ctx 
modification failed
[2015-04-07 16:33:03.506619] I 
[glusterd-handler.c:3803:__glusterd_handle_status_volume] 0-management: 
Received status volume req for volume test1
[2015-04-07 16:33:08.498391] E [socket.c:2267:socket_connect_finish] 
0-management: connection to 172.32.65.241:24007 failed (Connection refused)

At this point, the mount on node1 is still responsive, even though gluster 
itself is down on node2, as confirmed by the volume status output:
Status of volume: test1
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick g01.x.local:/exports/sdb1/brick                   49152   Y       22739
NFS Server on localhost                                 2049    Y       22746
Self-heal Daemon on localhost                           N/A     Y       22751
 
Task Status of Volume test1
------------------------------------------------------------------------------
There are no active volume tasks
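
(My assumption is that “service glusterd stop” stops only the management 
daemon and leaves the brick process, glusterfsd, running, which would explain 
why the client keeps working. Something like

    pgrep -a glusterfsd

on node2 should confirm whether the brick process is still up, though I have 
not verified this.)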

Then, I issue “init 0” on node2, and the mount on node1 becomes unresponsive. 
This is the log from node1:
[2015-04-07 16:36:04.250693] W 
[glusterd-op-sm.c:4021:glusterd_op_modify_op_ctx] 0-management: op_ctx 
modification failed
[2015-04-07 16:36:04.251102] I 
[glusterd-handler.c:3803:__glusterd_handle_status_volume] 0-management: 
Received status volume req for volume test1
The message "I [MSGID: 106004] 
[glusterd-handler.c:4365:__glusterd_peer_rpc_notify] 0-management: Peer 
1069f037-13eb-458e-a9c4-0e7e79e595d0, in Peer in Cluster state, has 
disconnected from glusterd." repeated 39 times between [2015-04-07 
16:34:40.609878] and [2015-04-07 16:36:37.752489]
[2015-04-07 16:36:40.755989] I [MSGID: 106004] 
[glusterd-handler.c:4365:__glusterd_peer_rpc_notify] 0-management: Peer 
1069f037-13eb-458e-a9c4-0e7e79e595d0, in Peer in Cluster state, has 
disconnected from glusterd.


This does not seem like the desired behaviour. I was trying to create this cluster 
because I was under the impression it would be more resilient than a 
single-point-of-failure NFS server. However, if the mount halts when one node 
in the cluster dies, then I’m no better off.

I also can’t seem to figure out how to bring a volume online if only one node 
in the cluster is running; again, not really functioning as HA. The gluster 
service runs and the volume “starts”, but it is not “online” or mountable until 
both nodes are running. In a situation where a node fails and we need storage 
online before we can troubleshoot the cause of the node failure, how do I get a 
volume to go online?
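
The closest candidate I have found is

    gluster volume start test1 force

but I am only assuming that a force start is the intended way to bring the 
surviving brick online while the peer is unreachable; I have not yet been able 
to confirm that it actually works in this state.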

Thanks.

