Is node 5 still showing "Disconnected" for node 6?
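
If it helps to check this on every node without comparing the output by eye, here is a minimal sketch that parses 'gluster peer status' output and lists any peer whose state is not "Peer in Cluster (Connected)". It assumes the output format pasted in the quoted messages below (blank-line-separated Hostname/Uuid/State blocks). The names parse_peer_status and unhealthy_peers are hypothetical helpers, not part of GlusterFS.

```python
# Sketch only: parse_peer_status() and unhealthy_peers() are hypothetical
# helpers, not part of GlusterFS. The output format is assumed from the
# 'gluster peer status' output pasted in this thread.

HEALTHY_STATE = "Peer in Cluster (Connected)"


def parse_peer_status(text):
    """Turn 'gluster peer status' output into a list of dicts, one per peer."""
    peers, current = [], {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line ends the current peer block, if one is open.
            if current:
                peers.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        # Lines such as "Number of Peers: 3" are ignored.
        if sep and key in ("Hostname", "Uuid", "State"):
            current[key.lower()] = value.strip()
    if current:
        peers.append(current)
    return peers


def unhealthy_peers(peers):
    """Return (hostname, state) for every peer not in the healthy state."""
    return [(p.get("hostname"), p.get("state"))
            for p in peers if p.get("state") != HEALTHY_STATE]


# Sample copied from the output James posted from jc1letgfs5.
SAMPLE = """\
Number of Peers: 3

Hostname: jc1letgfs6
Uuid: cd590fad-022c-4b9a-97f5-3262080d772d
State: Peer in Cluster (Disconnected)

Hostname: jc1letgfs7
Uuid: c5f40de4-9bb1-47ad-93b6-d52c6689ee29
State: Peer in Cluster (Connected)

Hostname: jc1letgfs8
Uuid: 13f4ce3f-042e-4144-a76c-d2b1b91676bd
State: Peer in Cluster (Connected)
"""

if __name__ == "__main__":
    for host, state in unhealthy_peers(parse_peer_status(SAMPLE)):
        print(f"{host}: {state}")
```

On a live node the SAMPLE text would be replaced by the real command output, for example captured with subprocess. A state such as "Peer Rejected (Connected)" is also reported, since only the exact healthy string is accepted.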

On Mon, Mar 21, 2011 at 12:08 PM, Burnash, James <[email protected]> wrote:
> After 'service glusterd restart' on node 6:
>
> root@jc1letgfs5:/etc/glusterd/vols# gluster peer status
> Number of Peers: 3
>
> Hostname: jc1letgfs6
> Uuid: cd590fad-022c-4b9a-97f5-3262080d772d
> State: Peer in Cluster (Disconnected)
>
> Hostname: jc1letgfs7
> Uuid: c5f40de4-9bb1-47ad-93b6-d52c6689ee29
> State: Peer in Cluster (Connected)
>
> Hostname: jc1letgfs8
> Uuid: 13f4ce3f-042e-4144-a76c-d2b1b91676bd
> State: Peer in Cluster (Connected)
>
> BUT ... after 'service glusterd restart' on node 5:
>
> root@jc1letgfs5:/etc/glusterd/vols# gluster peer status
> Number of Peers: 3
>
> Hostname: jc1letgfs7
> Uuid: c5f40de4-9bb1-47ad-93b6-d52c6689ee29
> State: Peer in Cluster (Connected)
>
> Hostname: jc1letgfs8
> Uuid: 13f4ce3f-042e-4144-a76c-d2b1b91676bd
> State: Peer in Cluster (Connected)
>
> Hostname: jc1letgfs6
> Uuid: cd590fad-022c-4b9a-97f5-3262080d772d
> State: Peer in Cluster (Connected)
>
> Works now. Thanks so much. I suspect a race condition of some sort, though
> exactly which I'll leave up to the devs.
>
> -----Original Message-----
> From: Mohit Anchlia [mailto:[email protected]]
> Sent: Monday, March 21, 2011 2:57 PM
> To: Burnash, James; [email protected]
> Subject: Re: [Gluster-users] What does this error mean?
>
> At this point, can you do /etc/init.d/glusterd stop and then start, and
> see if this changes anything? Or do you see the same behaviour? I am
> thinking gluster might have tried to start too soon on reboot.
>
> On Mon, Mar 21, 2011 at 11:43 AM, Burnash, James <[email protected]> wrote:
>> Short answers - yes all on the same subnet.
>> Every host can ping the others
>> Iptables shows empty entries for all filters
>>
>> Details are here - http://pastebin.com/eKtRMbGE
>>
>> I did explicitly turn the iptables off again, and then checked again:
>>
>> jc1letgfs5
>> Firewall is stopped.
>>
>> jc1letgfs6
>> Firewall is stopped.
>>
>> jc1letgfs7
>> Firewall is stopped.
>>
>> jc1letgfs8
>> Firewall is stopped.
>>
>> Thanks,
>>
>> James
>>
>> -----Original Message-----
>> From: Mohit Anchlia [mailto:[email protected]]
>> Sent: Monday, March 21, 2011 2:25 PM
>> To: Burnash, James
>> Cc: [email protected]
>> Subject: Re: [Gluster-users] What does this error mean?
>>
>> Are they in the same subnet? What happens if you ping these hosts
>> individually? Do they respond?
>>
>> I looked closely at the error you posted, and "connection to
>> 10.20.72.157:24007 failed (No route to host)" points to either a
>> firewall issue or a switch issue on the network. A ping test from each
>> host to every other host would be helpful.
>>
>> Can you post results of ping and also "service iptables status" from each 
>> node?
>>
>> On Mon, Mar 21, 2011 at 11:16 AM, Burnash, James <[email protected]> wrote:
>>> A little more information:
>>>
>>> From the original (first peer node):
>>> root@jc1letgfs5:/etc/glusterd/vols# gluster peer status
>>> Number of Peers: 3
>>>
>>> Hostname: jc1letgfs6
>>> Uuid: cd590fad-022c-4b9a-97f5-3262080d772d
>>> State: Peer in Cluster (Disconnected)
>>>
>>> Hostname: jc1letgfs7
>>> Uuid: c5f40de4-9bb1-47ad-93b6-d52c6689ee29
>>> State: Peer in Cluster (Connected)
>>>
>>> Hostname: jc1letgfs8
>>> Uuid: 13f4ce3f-042e-4144-a76c-d2b1b91676bd
>>> State: Peer in Cluster (Connected)
>>>
>>>
>>> From the problem node:
>>> *** NOTE - only one Peer seen
>>> root@jc1letgfs6:~# gluster peer status
>>> Number of Peers: 1
>>>
>>> Hostname: 10.20.72.156
>>> Uuid: 95e1d79a-632a-4774-9d7e-a7234cb084ca
>>> State: Peer in Cluster (Connected)
>>>
>>>
>>> From a different peer node:
>>> root@jc1letgfs8:~# gluster peer status
>>> Number of Peers: 3
>>>
>>> Hostname: jc1letgfs6
>>> Uuid: cd590fad-022c-4b9a-97f5-3262080d772d
>>> State: Peer Rejected (Connected)
>>>
>>> Hostname: jc1letgfs7
>>> Uuid: c5f40de4-9bb1-47ad-93b6-d52c6689ee29
>>> State: Peer in Cluster (Connected)
>>>
>>> Hostname: 10.20.72.156
>>> Uuid: 95e1d79a-632a-4774-9d7e-a7234cb084ca
>>> State: Peer in Cluster (Connected)
>>>
>>> -----Original Message-----
>>> From: [email protected] 
>>> [mailto:[email protected]] On Behalf Of Burnash, James
>>> Sent: Monday, March 21, 2011 2:05 PM
>>> To: Mohit Anchlia
>>> Cc: [email protected]
>>> Subject: Re: [Gluster-users] What does this error mean?
>>>
>>> I did do this, and nothing in particular stands out.
>>>
>>> I'll exercise it some more, and see if we can get something that will at 
>>> least point in the proper direction.
>>>
>>> I suspect that another reboot of the affected machine will fix this 
>>> condition - but it won't help me understand the root problem the next time 
>>> this happens.
>>>
>>> Thanks,
>>>
>>> James
>>>
>>> -----Original Message-----
>>> From: Mohit Anchlia [mailto:[email protected]]
>>> Sent: Monday, March 21, 2011 12:40 PM
>>> To: Burnash, James
>>> Cc: [email protected]
>>> Subject: Re: [Gluster-users] What does this error mean?
>>>
>>> Can you turn on DEBUG and see if there is something that stands out?
>>>
>>> On Mon, Mar 21, 2011 at 9:34 AM, Burnash, James <[email protected]> wrote:
>>>> Does anybody have any clue as to why this is happening? The problem has 
>>>> persisted for several days now, but I can't find anything at all in the 
>>>> logs to possibly explain why this is so.
>>>>
>>>> -----Original Message-----
>>>> From: [email protected] 
>>>> [mailto:[email protected]] On Behalf Of Burnash, James
>>>> Sent: Wednesday, March 16, 2011 9:10 AM
>>>> To: [email protected]
>>>> Subject: [SPAM?] [Gluster-users] What does this error mean?
>>>> Importance: Low
>>>>
>>>> Hello.
>>>>
>>>> After purposely crashing (via 'echo b > /proc/sysrq-trigger') node
>>>> jc1letgfs6 to test mirroring, even after the node has rebooted and is back 
>>>> online I am still seeing the statement "Disconnected" for that node when I 
>>>> execute the following command on the first storage node:
>>>>
>>>> root@jc1letgfs5:/etc/glusterd/vols# gluster peer status
>>>> Number of Peers: 3
>>>>
>>>> Hostname: jc1letgfs6
>>>> Uuid: cd590fad-022c-4b9a-97f5-3262080d772d
>>>> State: Peer in Cluster (Disconnected)
>>>>
>>>> Hostname: jc1letgfs7
>>>> Uuid: c5f40de4-9bb1-47ad-93b6-d52c6689ee29
>>>> State: Peer in Cluster (Disconnected)
>>>>
>>>> Hostname: jc1letgfs8
>>>> Uuid: 13f4ce3f-042e-4144-a76c-d2b1b91676bd
>>>> State: Peer in Cluster (Connected)
>>>>
>>>> This is running on 4 servers with CentOS 5.5 (x86_64), GlusterFS 3.1.1
>>>>
>>>> Here is the volume info:
>>>>
>>>> # gluster volume info
>>>>
>>>> Volume Name: test-pfs-ro1
>>>> Type: Distributed-Replicate
>>>> Status: Started
>>>> Number of Bricks: 4 x 2 = 8
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: jc1letgfs5:/export/read-only/g01
>>>> Brick2: jc1letgfs6:/export/read-only/g01
>>>> Brick3: jc1letgfs5:/export/read-only/g02
>>>> Brick4: jc1letgfs6:/export/read-only/g02
>>>> Brick5: jc1letgfs7:/export/read-only/g01
>>>> Brick6: jc1letgfs8:/export/read-only/g01
>>>> Brick7: jc1letgfs7:/export/read-only/g02
>>>> Brick8: jc1letgfs8:/export/read-only/g02
>>>> Options Reconfigured:
>>>> performance.stat-prefetch: on
>>>> performance.cache-size: 2GB
>>>> network.ping-timeout: 10
>>>>
>>>> Even with this error, mirroring functions as expected, and the node is 
>>>> recognized and utilized, as can be seen in this log fragment from 
>>>> jc1letgfs5: /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
>>>>
>>>> [2011-03-13 23:51:31.458329] E [socket.c:1656:socket_connect_finish] management: connection to 10.20.72.157:24007 failed (No route to host)
>>>> [2011-03-13 23:53:49.42170] I [glusterd3_1-mops.c:172:glusterd3_1_friend_add_cbk] glusterd: Received ACC from uuid: cd590fad-022c-4b9a-97f5-3262080d772d, host: jc1letgfs6, port: 0
>>>> [2011-03-13 23:53:49.42204] I [glusterd-utils.c:2062:glusterd_friend_find_by_uuid] glusterd: Friend found.. state: Peer in Cluster
>>>> [2011-03-13 23:53:49.42320] I [glusterd-utils.c:2062:glusterd_friend_find_by_uuid] glusterd: Friend found.. state: Peer in Cluster
>>>> [2011-03-13 23:53:49.42336] I [glusterd-handler.c:2267:glusterd_handle_friend_update] glusterd: Received friend update from uuid: cd590fad-022c-4b9a-97f5-3262080d772d
>>>> [2011-03-13 23:53:49.42359] I [glusterd-handler.c:2312:glusterd_handle_friend_update] : Received uuid: 95e1d79a-632a-4774-9d7e-a7234cb084ca, hostname:10.20.72.156
>>>> [2011-03-13 23:53:49.42412] I [glusterd-handler.c:2315:glusterd_handle_friend_update] : Received my uuid as Friend
>>>>
>>>>
>>>> Any pointers or help would be appreciated.
>>>>
>>>> James Burnash, Unix Engineering
>>>>
>>>>
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> [email protected]
>>>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>>>>
>>>
>>
>
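
For the "No route to host" error on 10.20.72.157:24007 quoted above, ping alone does not show whether the glusterd management port itself is reachable. The following is a minimal sketch, to be run from each node, that attempts a plain TCP connection to port 24007 on every peer and reports the error text if the connection fails. check_tcp is a hypothetical helper and the 3-second timeout is an arbitrary choice; the hostnames and port are the ones from this thread.

```python
import socket

GLUSTERD_PORT = 24007  # management port taken from the log line in this thread


def check_tcp(host, port=GLUSTERD_PORT, timeout=3.0):
    """Return None if a TCP connection succeeds, else the error text."""
    try:
        # The socket is closed again as soon as the connection is established.
        with socket.create_connection((host, port), timeout=timeout):
            return None
    except OSError as exc:
        # Covers refused connections, unreachable hosts, timeouts and
        # name-resolution failures.
        return exc.strerror or str(exc)


if __name__ == "__main__":
    for host in ("jc1letgfs5", "jc1letgfs6", "jc1letgfs7", "jc1letgfs8"):
        error = check_tcp(host)
        print(f"{host}:{GLUSTERD_PORT} -> {error or 'reachable'}")
```

A successful connection only shows that something is listening on the port; it says nothing about the peer state that glusterd itself reports.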
