Hi Remi,
It seems split-brain has been detected on the following files:
/agc/production/log/809223185/contact.log
/agc/production/log/809223185/event.log
/agc/production/log/809223635/contact.log
/agc/production/log/809224061/contact.log
/agc/production/log/809224321/contact.log
/agc/production/log/809215319/event.log

Could you give the output of the following command for each of the files above,
on both bricks in the replica pair?

getfattr -d -m "trusted.afr" <filepath>
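If it is easier, a small loop like the one below can collect all of them at once on each brick. This is just a sketch: it assumes the brick root /var/glusterfs/bricks/shared from your vol config, and uses getfattr from the attr package.

```shell
#!/bin/sh
# Run on each brick server (web01 and web02).
# BRICK is the brick root from the vol config; FILES are the paths
# reported as split-brain.
BRICK=${BRICK:-/var/glusterfs/bricks/shared}
FILES="/agc/production/log/809223185/contact.log
/agc/production/log/809223185/event.log
/agc/production/log/809223635/contact.log
/agc/production/log/809224061/contact.log
/agc/production/log/809224321/contact.log
/agc/production/log/809215319/event.log"

collect_afr_xattrs() {
    for f in $FILES; do
        echo "== $BRICK$f =="
        # Dump the replicate (AFR) changelog xattrs in hex.
        getfattr -d -m "trusted.afr" -e hex "$BRICK$f" || true
    done
}

collect_afr_xattrs
```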

Thanks
Pranith

----- Original Message -----
From: "Remi Broemeling" <[email protected]>
To: [email protected]
Sent: Tuesday, May 17, 2011 9:02:44 PM
Subject: Re: [Gluster-users] Rebuild Distributed/Replicated Setup


Hi Pranith. Sure, here is a pastebin sampling of logs from one of the hosts: 
http://pastebin.com/1U1ziwjC 


On Mon, May 16, 2011 at 20:48, Pranith Kumar. Karampuri <[email protected]> wrote:


Hi Remi,
Would it be possible to post the client logs, so that we can identify the issue
you are running into?

Pranith 



----- Original Message ----- 
From: "Remi Broemeling" <[email protected]>
To: [email protected] 
Sent: Monday, May 16, 2011 10:47:33 PM 
Subject: [Gluster-users] Rebuild Distributed/Replicated Setup 


Hi, 

I've got a distributed/replicated GlusterFS v3.1.2 (installed via RPM) setup 
across two servers (web01 and web02) with the following vol config: 

volume shared-application-data-client-0
    type protocol/client
    option remote-host web01
    option remote-subvolume /var/glusterfs/bricks/shared
    option transport-type tcp
    option ping-timeout 5
end-volume

volume shared-application-data-client-1
    type protocol/client
    option remote-host web02
    option remote-subvolume /var/glusterfs/bricks/shared
    option transport-type tcp
    option ping-timeout 5
end-volume

volume shared-application-data-replicate-0
    type cluster/replicate
    subvolumes shared-application-data-client-0 shared-application-data-client-1
end-volume

volume shared-application-data-write-behind
    type performance/write-behind
    subvolumes shared-application-data-replicate-0
end-volume

volume shared-application-data-read-ahead
    type performance/read-ahead
    subvolumes shared-application-data-write-behind
end-volume

volume shared-application-data-io-cache
    type performance/io-cache
    subvolumes shared-application-data-read-ahead
end-volume

volume shared-application-data-quick-read
    type performance/quick-read
    subvolumes shared-application-data-io-cache
end-volume

volume shared-application-data-stat-prefetch
    type performance/stat-prefetch
    subvolumes shared-application-data-quick-read
end-volume

volume shared-application-data
    type debug/io-stats
    subvolumes shared-application-data-stat-prefetch
end-volume

In total, four servers mount this volume via the GlusterFS FUSE client. For
reasons I have not been able to determine, the filesystem has run into a
split-brain nightmare (although, to my knowledge, an actual split-brain
situation has never occurred in this environment). I am now seeing corruption
across the filesystem, along with complaints that files cannot be self-healed.

What I would like to do is completely empty one of the two servers (here, I am
trying to empty web01), making the other one (web02) the authoritative source
for the data, and then have web01 completely rebuild its mirror directly from
web02.

What's the easiest/safest way to do this? Is there a command I can run that
will force web01 to re-initialize its mirror directly from web02 (and thus
completely eradicate all of the split-brain errors and data inconsistencies)?
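For reference, the approach generally suggested for this generation of GlusterFS (pre-3.3, with no explicit heal command) was to clear or move aside the stale brick's contents and then force self-heal by walking the volume from a client mount, since replicate re-examines every file it stats. A rough sketch; the mount point is an assumption, not taken from this thread:

```shell
#!/bin/sh
# Sketch of the pre-3.3 self-heal trigger: after clearing the stale brick,
# a recursive stat from a CLIENT mount makes cluster/replicate re-examine
# (and re-sync) every file. The mount point is hypothetical.
trigger_selfheal() {
    # -noleaf avoids GNU find's leaf-count optimization, which can skip
    # entries on FUSE filesystems; stat output itself is discarded.
    find "$1" -noleaf -print0 | xargs -0 stat > /dev/null 2>&1
    echo "self-heal walk finished for $1"
}

trigger_selfheal "${MOUNT:-/mnt/shared}"
```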

Thanks! 
-- 

Remi Broemeling 
System Administrator 
Clio - Practice Management Simplified 
1-888-858-2546 x(2^5) | [email protected] 
www.goclio.com | blog | twitter | facebook 

_______________________________________________ 
Gluster-users mailing list 
[email protected] 
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users 



