Are they in the same subnet? What happens if you ping these hosts
individually? Do they respond?

I looked closely at the error you posted, and "connection to
10.20.72.157:24007 failed (No route to host)" points to either a
firewall issue or a switch issue on the network. A ping test from each
host to every other host would be helpful.

Can you post the ping results and also the output of "service iptables status" from each node?
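
Since ping only exercises ICMP, it may also be worth checking whether each node can open a TCP connection to the glusterd management port (24007, the port in the log line). Below is a rough sketch of such a check, not a tested diagnostic tool. The host list is copied from the peer status output in this thread, and the names check_port and main are made up for illustration. The hint about iptables REJECT rules reflects common behaviour on CentOS and is a guess about this setup, not a confirmed diagnosis.

```python
import errno
import socket

# Hostnames taken from the peer status output in this thread; adjust as needed.
PEERS = ["jc1letgfs5", "jc1letgfs6", "jc1letgfs7", "jc1letgfs8"]
GLUSTERD_PORT = 24007  # management port seen in the posted log line


def check_port(host, port=GLUSTERD_PORT, timeout=3.0):
    """Attempt a TCP connect and return (ok, detail)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except socket.timeout:
        return False, "timed out (packets possibly dropped)"
    except socket.gaierror as exc:
        return False, "name lookup failed: %s" % exc
    except OSError as exc:
        if exc.errno == errno.EHOSTUNREACH:
            # Assumption: an iptables REJECT rule can produce this error too,
            # not only a genuine routing or switch problem.
            return False, "no route to host (routing, switch, or firewall REJECT?)"
        if exc.errno == errno.ECONNREFUSED:
            return False, "connection refused (is glusterd listening?)"
        return False, str(exc)


def main(peers=PEERS):
    """Print one line per peer with the result of the connect attempt."""
    for peer in peers:
        ok, detail = check_port(peer)
        print("%-12s %-4s %s" % (peer, "OK" if ok else "FAIL", detail))
```

Calling main() on each of the four nodes in turn gives a small matrix of who can reach whom on 24007, which can be compared against the ping results.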

On Mon, Mar 21, 2011 at 11:16 AM, Burnash, James <[email protected]> wrote:
> A little more information:
>
> From the original (first peer node):
> root@jc1letgfs5:/etc/glusterd/vols# gluster peer status
> Number of Peers: 3
>
> Hostname: jc1letgfs6
> Uuid: cd590fad-022c-4b9a-97f5-3262080d772d
> State: Peer in Cluster (Disconnected)
>
> Hostname: jc1letgfs7
> Uuid: c5f40de4-9bb1-47ad-93b6-d52c6689ee29
> State: Peer in Cluster (Connected)
>
> Hostname: jc1letgfs8
> Uuid: 13f4ce3f-042e-4144-a76c-d2b1b91676bd
> State: Peer in Cluster (Connected)
>
>
> From the problem node:
> *** NOTE - only one Peer seen
> root@jc1letgfs6:~# gluster peer status
> Number of Peers: 1
>
> Hostname: 10.20.72.156
> Uuid: 95e1d79a-632a-4774-9d7e-a7234cb084ca
> State: Peer in Cluster (Connected)
>
>
> From a different peer node:
> root@jc1letgfs8:~# gluster peer status
> Number of Peers: 3
>
> Hostname: jc1letgfs6
> Uuid: cd590fad-022c-4b9a-97f5-3262080d772d
> State: Peer Rejected (Connected)
>
> Hostname: jc1letgfs7
> Uuid: c5f40de4-9bb1-47ad-93b6-d52c6689ee29
> State: Peer in Cluster (Connected)
>
> Hostname: 10.20.72.156
> Uuid: 95e1d79a-632a-4774-9d7e-a7234cb084ca
> State: Peer in Cluster (Connected)
>
> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Burnash, James
> Sent: Monday, March 21, 2011 2:05 PM
> To: Mohit Anchlia
> Cc: [email protected]
> Subject: Re: [Gluster-users] What does this error mean?
>
> I did do this, and nothing in particular stands out.
>
> I'll exercise it some more, and see if we can get something that will at 
> least point in the proper direction.
>
> I suspect that another reboot of the affected machine will fix this condition 
> - but it won't help me understand the root problem the next time this happens.
>
> Thanks,
>
> James
>
> -----Original Message-----
> From: Mohit Anchlia [mailto:[email protected]]
> Sent: Monday, March 21, 2011 12:40 PM
> To: Burnash, James
> Cc: [email protected]
> Subject: Re: [Gluster-users] What does this error mean?
>
> Can you turn on DEBUG and see if there is something that stands out?
>
> On Mon, Mar 21, 2011 at 9:34 AM, Burnash, James <[email protected]> wrote:
>> Does anybody have any clue as to why this is happening? The problem has 
>> persisted for several days now, but I can't find anything at all in the logs 
>> to possibly explain why this is so.
>>
>> -----Original Message-----
>> From: [email protected] 
>> [mailto:[email protected]] On Behalf Of Burnash, James
>> Sent: Wednesday, March 16, 2011 9:10 AM
>> To: [email protected]
>> Subject: [SPAM?] [Gluster-users] What does this error mean?
>> Importance: Low
>>
>> Hello.
>>
>> After purposely crashing (via 'echo b > /proc/sysrq-trigger') node jc1letgfs6 
>> to test mirroring, even after the node has rebooted and is back online I am 
>> still seeing the statement "Disconnected" for that node when I execute the 
>> following command on the first storage node:
>>
>> root@jc1letgfs5:/etc/glusterd/vols# gluster peer status
>> Number of Peers: 3
>>
>> Hostname: jc1letgfs6
>> Uuid: cd590fad-022c-4b9a-97f5-3262080d772d
>> State: Peer in Cluster (Disconnected)
>>
>> Hostname: jc1letgfs7
>> Uuid: c5f40de4-9bb1-47ad-93b6-d52c6689ee29
>> State: Peer in Cluster (Disconnected)
>>
>> Hostname: jc1letgfs8
>> Uuid: 13f4ce3f-042e-4144-a76c-d2b1b91676bd
>> State: Peer in Cluster (Connected)
>>
>> This is running on 4 servers with CentOS 5.5 (x86_64), GlusterFS 3.1.1
>>
>> Here is the volume info:
>>
>> # gluster volume info
>>
>> Volume Name: test-pfs-ro1
>> Type: Distributed-Replicate
>> Status: Started
>> Number of Bricks: 4 x 2 = 8
>> Transport-type: tcp
>> Bricks:
>> Brick1: jc1letgfs5:/export/read-only/g01
>> Brick2: jc1letgfs6:/export/read-only/g01
>> Brick3: jc1letgfs5:/export/read-only/g02
>> Brick4: jc1letgfs6:/export/read-only/g02
>> Brick5: jc1letgfs7:/export/read-only/g01
>> Brick6: jc1letgfs8:/export/read-only/g01
>> Brick7: jc1letgfs7:/export/read-only/g02
>> Brick8: jc1letgfs8:/export/read-only/g02
>> Options Reconfigured:
>> performance.stat-prefetch: on
>> performance.cache-size: 2GB
>> network.ping-timeout: 10
>>
>> Even with this error, mirroring functions as expected, and the node is 
>> recognized and utilized, as can be seen in this log fragment from 
>> jc1letgfs5: /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
>>
>> [2011-03-13 23:51:31.458329] E [socket.c:1656:socket_connect_finish] management: connection to 10.20.72.157:24007 failed (No route to host)
>> [2011-03-13 23:53:49.42170] I [glusterd3_1-mops.c:172:glusterd3_1_friend_add_cbk] glusterd: Received ACC from uuid: cd590fad-022c-4b9a-97f5-3262080d772d, host: jc1letgfs6, port: 0
>> [2011-03-13 23:53:49.42204] I [glusterd-utils.c:2062:glusterd_friend_find_by_uuid] glusterd: Friend found.. state: Peer in Cluster
>> [2011-03-13 23:53:49.42320] I [glusterd-utils.c:2062:glusterd_friend_find_by_uuid] glusterd: Friend found.. state: Peer in Cluster
>> [2011-03-13 23:53:49.42336] I [glusterd-handler.c:2267:glusterd_handle_friend_update] glusterd: Received friend update from uuid: cd590fad-022c-4b9a-97f5-3262080d772d
>> [2011-03-13 23:53:49.42359] I [glusterd-handler.c:2312:glusterd_handle_friend_update] : Received uuid: 95e1d79a-632a-4774-9d7e-a7234cb084ca, hostname:10.20.72.156
>> [2011-03-13 23:53:49.42412] I [glusterd-handler.c:2315:glusterd_handle_friend_update] : Received my uuid as Friend
>>
>>
>> Any pointers or help would be appreciated.
>>
>> James Burnash, Unix Engineering
>>
>>
>> _______________________________________________
>> Gluster-users mailing list
>> [email protected]
>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>>
>
