On 09/25/2013 06:16 AM, Andrew Lau wrote:
That's where I found the 200+ entries
[root@hv01]# gluster volume heal STORAGE info split-brain
Gathering Heal info on volume STORAGE has been successful
Brick hv01:/data1
Number of entries: 271
at path on brick
2013-09-25 00:04:29 /6682d31f-39ce-4896-99ef-14e1c9682585/dom_md/ids
2013-09-25 00:04:29
/6682d31f-39ce-4896-99ef-14e1c9682585/images/5599c7c7-0c25-459a-9d7d-80190a7c739b/0593d351-2ab1-49cd-a9b6-c94c897ebcc7
2013-09-24 23:54:29 <gfid:9c83f7e4-6982-4477-816b-172e4e640566>
2013-09-24 23:54:29 <gfid:91e98909-c217-417b-a3c1-4cf0f2356e14>
<snip>
Brick hv02:/data1
Number of entries: 0
When I run the same command on hv02, it shows the reverse (the
other node having 0 entries).
I remember having to delete these files individually in a previous
split-brain case, but I was hoping there's a better solution
than going through 200+ entries.
While I haven't tried it out myself, Jeff Darcy has written a script
(https://github.com/jdarcy/glusterfs/tree/heal-script/extras/heal_script) which
helps automate the process. He has detailed its usage in his blog
post: http://hekafs.org/index.php/2012/06/healing-split-brain/
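If the script route doesn't work out, the manual per-file procedure is
roughly the following. This is only a sketch: it assumes the copy on
hv01 is the one you want to discard, and it uses one of the paths from
the output above as an example.

# run as root on the node holding the bad copy (assumed to be hv01 here)
brick=/data1
file=6682d31f-39ce-4896-99ef-14e1c9682585/dom_md/ids

# read the gfid so the hard link under .glusterfs can be removed too
gfid=$(getfattr -n trusted.gfid -e hex --absolute-names "$brick/$file" \
        | awk -F0x '/trusted.gfid/ {print $2}')
glink=$brick/.glusterfs/${gfid:0:2}/${gfid:2:2}/${gfid:0:8}-${gfid:8:4}-${gfid:12:4}-${gfid:16:4}-${gfid:20:12}

# delete the bad copy and its gfid link, then let self-heal pull the
# good replica back from hv02
rm -f "$brick/$file" "$glink"
gluster volume heal STORAGE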
Hope this helps.
-Ravi
Cheers.
On Wed, Sep 25, 2013 at 10:39 AM, Mohit Anchlia <[email protected]> wrote:
What's the output of
gluster volume heal $VOLUME info split-brain
On Tue, Sep 24, 2013 at 5:33 PM, Andrew Lau <[email protected]> wrote:
Found the BZ
https://bugzilla.redhat.com/show_bug.cgi?id=960190 - so I
restarted one of the volumes and it seems to have restarted
all the daemons again.
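For anyone else hitting that bug, a forced start should be enough to
respawn any daemons that have died, without unmounting clients - a
sketch, assuming the volume is named STORAGE:

# respawns missing brick, NFS and self-heal daemons for the volume
gluster volume start STORAGE force
# confirm the self-heal daemon shows as online again
gluster volume status STORAGE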
Self-heal started again, but I seem to have split-brain issues
everywhere. There are over 100 different entries on each node;
what's the best way to recover from this now, short of manually
going through and deleting 200+ files? It looks like a full
split-brain, as the file sizes on the two nodes are out of
balance by about 100GB or so.
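For reference, one way to pull the affected paths out into a flat list
for scripting (a sketch; it assumes each entry line in the heal info
output looks like "<date> <time> <path>"):

# collect the split-brain paths reported on this node
gluster volume heal STORAGE info split-brain \
    | awk '$3 ~ /^\// {print $3}' | sort -u > /tmp/splitbrain-paths.txt
wc -l /tmp/splitbrain-paths.txt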
Any suggestions would be much appreciated!
Cheers.
On Tue, Sep 24, 2013 at 10:32 PM, Andrew Lau <[email protected]> wrote:
Hi,
Right now I have a 2x1 replica. Ever since I had to
reinstall one of the gluster servers, there have been issues
with split-brain. The self-heal daemon doesn't seem to be
running on either of the nodes.
To reinstall the gluster server (the original brick data
was intact but the OS had to be reinstalled), these were the
steps (a consolidated sketch follows the list):
- Reinstalled gluster
- Copied over the old uuid from backup
- gluster peer probe
- gluster volume sync $othernode all
- mount -t glusterfs localhost:STORAGE /mnt
- find /mnt -noleaf -print0 | xargs --null stat >/dev/null
2>/var/log/glusterfs/mnt-selfheal.log
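Roughly, as one sequence (a sketch only; the UUID and peer hostname are
placeholders, and overwriting glusterd.info like this assumes it only
holds the UUID line on this version):

# on the freshly reinstalled node
service glusterd stop
echo "UUID=<uuid-from-backup>" > /var/lib/glusterd/glusterd.info
service glusterd start
gluster peer probe <surviving-node>
gluster volume sync <surviving-node> all
# crawl the whole mount once so every file gets examined by self-heal
mount -t glusterfs localhost:STORAGE /mnt
find /mnt -noleaf -print0 | xargs --null stat >/dev/null \
    2>/var/log/glusterfs/mnt-selfheal.log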
I let it resync and it was working fine, or at least so I
thought. I came back a few days later to see there's
a mismatch in the brick volumes. One is 50GB ahead of
the other.
# gluster volume heal STORAGE info
Status: self-heal-daemon is not running on
966456a1-b8a6-4ca8-9da7-d0eb96997cbe
/var/log/glusterfs/glustershd.log doesn't seem to have any
recent entries, only those from when the two original gluster
servers were running.
# gluster volume status
Gluster process                               Port    Online  Pid
Self-heal Daemon on localhost                 N/A     N       N/A
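For reference, those checks as commands (paths assume a default install,
where the self-heal daemon logs under /var/log/glusterfs/):

# is the self-heal daemon listed as online, and is the process alive?
gluster volume status STORAGE
ps aux | grep '[g]lustershd'
# look for recent activity in its log
tail -n 50 /var/log/glusterfs/glustershd.log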
Any suggestions would be much appreciated!
Cheers
Andrew.
_______________________________________________
Gluster-users mailing list
[email protected]
http://supercolony.gluster.org/mailman/listinfo/gluster-users