Hi Atin,

Nice catch. As you said, there was a mistake in the hosts file on gl4 where gl5 
and gl6 were missing; it works fine now.
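
For the record, the fix boiled down to adding the two missing entries. A 
minimal sketch of what /etc/hosts on gl4 now contains, with placeholder 
addresses (the 192.168.1.x values below are hypothetical, adjust to the real 
network):

192.168.1.15    gl5
192.168.1.16    gl6

After that, "getent hosts gl5" resolves correctly and the add-brick goes 
through.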

Thanks,

Cédric


> On 14 Dec 2016, at 05:10, Atin Mukherjee <[email protected]> wrote:
> 
> 
> 
> On Wed, Dec 14, 2016 at 9:36 AM, Atin Mukherjee <[email protected]> wrote:
> From the gl4.dump file:
> 
> glusterd.peer4.hostname=gl5
> glusterd.peer4.port=0
> glusterd.peer4.state=3
> glusterd.peer4.quorum-action=0
> glusterd.peer4.quorum-contrib=2
> glusterd.peer4.detaching=0
> glusterd.peer4.locked=0
> glusterd.peer4.rpc.peername=
> glusterd.peer4.rpc.connected=0       <===== this indicates that gl5 is not
> connected to gl4, so the add-brick command failed as it is supposed to in
> this case
> glusterd.peer4.rpc.total-bytes-read=0
> glusterd.peer4.rpc.total-bytes-written=0
> glusterd.peer4.rpc.ping_msgs_sent=0
> glusterd.peer4.rpc.msgs_sent=0
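> 
> (A quick way to pull the same information for every peer out of such a 
> statedump, using the gl4.dump file name from above, is something like:
> 
> grep -E 'glusterd\.peer[0-9]+\.(hostname|rpc\.connected)' gl4.dump
> 
> which prints each peer's hostname next to its rpc.connected flag.)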
> 
> And the same holds true for gl6 as per this dump, so the issue is with the 
> gl4 node.
> 
> Now, in gl4's glusterd log I see repetitive entries like the following:
> 
> [2016-12-13 16:35:31.438462] E 
> [name.c:262:af_inet_client_get_remote_sockaddr] 0-management: DNS resolution 
> failed on host gl5
> [2016-12-13 16:35:33.440155] E 
> [name.c:262:af_inet_client_get_remote_sockaddr] 0-management: DNS resolution 
> failed on host gl6
> [2016-12-13 16:35:34.441639] E 
> [name.c:262:af_inet_client_get_remote_sockaddr] 0-management: DNS resolution 
> failed on host gl5
> [2016-12-13 16:35:36.454546] E 
> [name.c:262:af_inet_client_get_remote_sockaddr] 0-management: DNS resolution 
> failed on host gl6
> [2016-12-13 16:35:37.456062] E 
> [name.c:262:af_inet_client_get_remote_sockaddr] 0-management: DNS resolution 
> failed on host gl5 
> 
> The above indicates that gl4 is not able to resolve the DNS names for gl5 & 
> gl6, whereas gl5 & gl6 can resolve gl4. Please check your DNS configuration 
> and see if there are any incorrect entries there. From our side, what we need 
> to check is why peer status didn't show both gl5 & gl6 as disconnected.
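> 
> (To confirm the resolution failure from gl4's side, something along these 
> lines should reproduce it; gl5 and gl6 are assumed to be the plain hostnames 
> used in the peer list:
> 
> getent hosts gl5
> getent hosts gl6
> ping -c1 gl5
> 
> If getent returns nothing, the names are known neither to /etc/hosts nor to 
> DNS, which matches the log entries above.)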
> 
> Can you run gluster peer status from gl4 and see whether both gl5 & gl6 are 
> shown as disconnected? If so, that is expected: since gl5 & gl6 were 
> connected to all the nodes apart from gl4, peer status on every node other 
> than gl4 would show them as connected, and that's the expected behaviour. 
> Please do confirm.
>  
> 
> 
> On Wed, Dec 14, 2016 at 12:44 AM, Cedric Lemarchand <[email protected]> wrote:
> Thanks Atin, the files you asked for: https://we.tl/XrOvFhffGq
> 
>> On 13 Dec 2016, at 19:08, Atin Mukherjee <[email protected]> wrote:
>> 
>> Thanks, we will get back on this. In the meantime, can you please also share 
>> the glusterd statedump files from both nodes? The way to take a statedump is 
>> 'kill -SIGUSR1 $(pidof glusterd)', and the file can be found in the 
>> /var/run/gluster directory.
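>> 
>> (Concretely, on each node, a minimal sketch would be, assuming glusterd runs 
>> as root and the default statedump location:
>> 
>> kill -SIGUSR1 $(pidof glusterd)    # ask glusterd to write its statedump
>> ls -lt /var/run/gluster/ | head    # the newest file there is the dump to share
>> )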
>> 
>> On Tue, 13 Dec 2016 at 22:11, Cedric Lemarchand <[email protected]> wrote:
>> 1. sorry, 3.9.0-1
>> 2. no, it does nothing
>> 3. here they are, from gl1 to gl6: https://we.tl/EPaMs6geoR
>> 
>> 
>> 
>>> On 13 Dec 2016, at 16:49, Atin Mukherjee <[email protected]> wrote:
>>> 
>>> And 3. In case 2 doesn't work, please provide the glusterd log files from 
>>> gl1 & gl5
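>>> 
>>> (The glusterd log normally lives under /var/log/glusterfs/ on each node; the 
>>> exact file name differs between versions, e.g. glusterd.log or 
>>> etc-glusterfs-glusterd.vol.log, so assuming that default directory it can be 
>>> located with:
>>> 
>>> ls -lt /var/log/glusterfs/ | head
>>> )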
>>> 
>>> On Tue, Dec 13, 2016 at 9:16 PM, Atin Mukherjee <[email protected]> wrote:
>>> 1. Could you mention which gluster version you are running?
>>> 2. Does restarting the glusterd instance on gl1 & gl5 solve the issue (after 
>>> removing the volume-id xattr from the bricks)?
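>>> 
>>> (A minimal sketch of that cleanup, run on each affected node; the brick path 
>>> /data/br1 is taken from your add-brick command, and the trusted.gfid / 
>>> .glusterfs steps are the generic recipe and may not all be needed here:
>>> 
>>> setfattr -x trusted.glusterfs.volume-id /data/br1   # drop the volume-id marker
>>> setfattr -x trusted.gfid /data/br1                  # drop the gfid marker, if present
>>> rm -rf /data/br1/.glusterfs                         # remove leftover gluster metadata, if any
>>> systemctl restart glusterd                          # or glusterfs-server, depending on the distro packaging
>>> )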
>>> 
>>> On Tue, Dec 13, 2016 at 8:56 PM, Cedric Lemarchand <[email protected]> wrote:
>>> Hello,
>>> 
>>> When I try to add 3 bricks to a working cluster composed of 3 nodes / 3 
>>> bricks in dispersed mode 2+1, it fails like this:
>>> 
>>> root@gl1:~# gluster volume add-brick vol1 gl4:/data/br1 gl5:/data/br1 
>>> gl6:/data/br1
>>> volume add-brick: failed: Pre Validation failed on gl4. Host gl5 not 
>>> connected
>>> 
>>> However all peers are connected and there are no networking issues:
>>> 
>>> root@gl1:~# gluster peer status
>>> Number of Peers: 5
>>> 
>>> Hostname: gl2
>>> Uuid: 616f100f-a3f4-46e4-b161-ee5db5a60e26
>>> State: Peer in Cluster (Connected)
>>> 
>>> Hostname: gl3
>>> Uuid: acb828b8-f4b3-42ab-a9d2-b3e7b917dc9a
>>> State: Peer in Cluster (Connected)
>>> 
>>> Hostname: gl4
>>> Uuid: 813ad056-5e84-4fdb-ac13-38d24c748bc4
>>> State: Peer in Cluster (Connected)
>>> 
>>> Hostname: gl5
>>> Uuid: a7933aeb-b08b-4ebb-a797-b8ecbe5a03c6
>>> State: Peer in Cluster (Connected)
>>> 
>>> Hostname: gl6
>>> Uuid: 63c9a6c1-0adf-4cf5-af7b-b28a60911c99
>>> State: Peer in Cluster (Connected)
>>> 
>>> When I try a second time, the error is different:
>>> 
>>> root@gl1:~# gluster volume add-brick vol1 gl4:/data/br1 gl5:/data/br1 
>>> gl6:/data/br1
>>> volume add-brick: failed: Pre Validation failed on gl5. /data/br1 is 
>>> already part of a volume
>>> Pre Validation failed on gl6. /data/br1 is already part of a volume
>>> Pre Validation failed on gl4. /data/br1 is already part of a volume
>>> 
>>> It seems the previous attempt, even though it failed, has created the 
>>> gluster attributes on the filesystem, as shown by attr on gl4/5/6:
>>> 
>>> Attribute "glusterfs.volume-id" has a 16 byte value for /data/br1
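>>> 
>>> (The same marker can be inspected with getfattr; a small sketch, assuming 
>>> the brick path above:
>>> 
>>> getfattr -d -m . -e hex /data/br1   # list all extended attributes in hex
>>> 
>>> It should show trusted.glusterfs.volume-id while the brick is considered 
>>> part of a volume.)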
>>> 
>>> I already purged gluster and reformatted the bricks on gl4/5/6 but the issue 
>>> persists. Any ideas? Did I miss something?
>>> 
>>> Some information:
>>> 
>>> root@gl1:~# gluster volume info
>>> 
>>> Volume Name: vol1
>>> Type: Disperse
>>> Volume ID: bb563884-0e2a-4757-9fd5-cb851ba113c3
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 1 x (2 + 1) = 3
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: gl1:/data/br1
>>> Brick2: gl2:/data/br1
>>> Brick3: gl3:/data/br1
>>> Options Reconfigured:
>>> features.scrub-freq: hourly
>>> features.scrub: Inactive
>>> features.bitrot: off
>>> cluster.disperse-self-heal-daemon: enable
>>> transport.address-family: inet
>>> performance.readdir-ahead: on
>>> nfs.disable: on
>>> 
>>> root@gl1:~# gluster volume status
>>> Status of volume: vol1
>>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>>> ------------------------------------------------------------------------------
>>> Brick gl1:/data/br1                         49152     0          Y       23403
>>> Brick gl2:/data/br1                         49152     0          Y       14545
>>> Brick gl3:/data/br1                         49152     0          Y       11348
>>> Self-heal Daemon on localhost               N/A       N/A        Y       24766
>>> Self-heal Daemon on gl4                     N/A       N/A        Y       1087
>>> Self-heal Daemon on gl5                     N/A       N/A        Y       1080
>>> Self-heal Daemon on gl3                     N/A       N/A        Y       12321
>>> Self-heal Daemon on gl2                     N/A       N/A        Y       15496
>>> Self-heal Daemon on gl6                     N/A       N/A        Y       1091
>>> 
>>> Task Status of Volume vol1
>>> ------------------------------------------------------------------------------
>>> There are no active volume tasks
>>> 
>>> root@gl1:~# gluster peer status
>>> Number of Peers: 5
>>> 
>>> Hostname: gl2
>>> Uuid: 616f100f-a3f4-46e4-b161-ee5db5a60e26
>>> State: Peer in Cluster (Connected)
>>> 
>>> Hostname: gl3
>>> Uuid: acb828b8-f4b3-42ab-a9d2-b3e7b917dc9a
>>> State: Peer in Cluster (Connected)
>>> 
>>> Hostname: gl4
>>> Uuid: 813ad056-5e84-4fdb-ac13-38d24c748bc4
>>> State: Peer in Cluster (Connected)
>>> 
>>> Hostname: gl5
>>> Uuid: a7933aeb-b08b-4ebb-a797-b8ecbe5a03c6
>>> State: Peer in Cluster (Connected)
>>> 
>>> Hostname: gl6
>>> Uuid: 63c9a6c1-0adf-4cf5-af7b-b28a60911c99
>>> State: Peer in Cluster (Connected)
>>> 
>>> -- 
>>> ~ Atin (atinm)
>>> 
>>> -- 
>>> ~ Atin (atinm)
>> 
>> -- 
>> - Atin (atinm)
> 
> -- 
> ~ Atin (atinm)
> 
> -- 
> ~ Atin (atinm)

_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users
