Hi Krutika,

Thanks for the reply. I am afraid it is rather too late for us, though: I already replaced the GlusterFS server and copied my data onto the new bricks, and it is now working flawlessly again, just as before. However, I still have the old server and its snapshots, so I will try to implement your solution there and let you know how it goes.
Regards
Ankur Pandey
+91 9702 831 855

On Tue, Sep 22, 2015 at 4:54 PM, Krutika Dhananjay <[email protected]> wrote:
> Hi Ankur,
>
> It looks like some of the files/directories are in gfid split-brain.
> From the logs that you attached, here is the list of gfids of the directories
> in gfid split-brain, based on the message id of the gfid split-brain log
> message (108008):
>
> [kdhananjay@dhcp35-215 logs]$ grep -iarT '108008' * | awk '{print $13}' | cut -f1 -d'/' | sort | uniq
> <16d8005d-3ae2-4c72-9097-2aedd458b5e0
> <3539c175-d694-409d-949f-f9a3e18df17b
> <3fd13508-b29e-4d52-8c9c-14ccd2f24b9f
> <6b1e5a5a-bb65-46c1-a7c3-0526847beece
> <971b5249-92fb-4166-b1a0-33b7efcc39a8
> <b582f326-c8ee-4b04-aba0-d37cb0a6f89a
> <cc9d0e49-c9ab-4dab-bca4-1c06c8a7a4e3
>
> There are 7 such directories.
>
> Also, there are 457 entries in gfid split-brain:
> [kdhananjay@dhcp35-215 logs]$ grep -iarT '108008' glustershd.log | awk '{print $13}' | sort | uniq | wc -l
> 457
>
> You will need to do the following to get things back to a normal state:
>
> 1) For each gfid in the list of the 7 directories in split-brain, get the
> list of files in split-brain.
> For example, for <16d8005d-3ae2-4c72-9097-2aedd458b5e0, the command would be
> `grep -iarT '108008' * | grep 16d8005d-3ae2-4c72-9097-2aedd458b5e0`.
> You will need to omit the repeating messages, of course.
> You will get messages of the following kind:
> glustershd.log:[2015-09-10 01:44:05.512589] E [MSGID: 108008]
> [afr-self-heal-entry.c:253:afr_selfheal_detect_gfid_and_type_mismatch]
> 0-repl-vol-replicate-0: Gfid mismatch detected for
> <16d8005d-3ae2-4c72-9097-2aedd458b5e0/100000075944.jpg>,
> d9f15b28-9c9c-4f31-ba3c-543a5331cb9d on repl-vol-client-1 and
> 583295f0-1ec4-4783-9b35-1e18b8b4f92c on repl-vol-client-0. Skipping
> conservative merge on the file.
>
> 2) Examine the two copies (one per replica) of each such file, choose one
> copy and delete the other copy from its replica.
> In the above example, the parent is 16d8005d-3ae2-4c72-9097-2aedd458b5e0
> and the entry is '100000075944.jpg'.
> So you can examine the two different copies at
> <brick-path>/.glusterfs/16/d8/16d8005d-3ae2-4c72-9097-2aedd458b5e0/100000075944.jpg
> to decide which one you want to keep.
> Once you have decided on the copy you want to keep, you need to delete the
> bad copy and its hard link. This assumes all of the entries in gfid
> split-brain are regular files; at least that is what I gathered from the
> logs, since they were all .jpg files.
> You can get the absolute path of the entry by noting down the inode number
> of the gfid link on the bad brick and then searching for the corresponding
> number under the same brick.
> In this example, the gfid link would be
> <bad-brick-path>/.glusterfs/16/d8/16d8005d-3ae2-4c72-9097-2aedd458b5e0/100000075944.jpg.
> So you would need to get its inode number (by running stat on it) and then
> run `find <bad-brick-path> -inum <inode number of the gfid link>` to get its
> absolute path.
> Once you have both, unlink them both. If further hard links exist, delete
> them on the bad brick as well.
>
> There are about 457 files for which you need to repeat this exercise.
>
> Once you are done, you can execute 'gluster volume heal <VOL>'. This will
> take care of healing the good copies onto the bricks where the file was
> deleted.
> After the heal is complete, 'heal info split-brain' should not show any
> entries.
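>
> To make these per-file steps concrete, here is a rough, untested sketch of
> how they could be scripted (BAD_BRICK, the parent gfid and the file name are
> placeholders taken from the example above; substitute your own values and
> review the output of the first find before deleting anything):
>
> #!/bin/bash
> # Untested sketch of the per-file cleanup described above.
> BAD_BRICK=/path/to/bad/brick   # brick holding the copy you decided to discard
> PARENT_GFID=16d8005d-3ae2-4c72-9097-2aedd458b5e0
> FNAME=100000075944.jpg
>
> # Reach the entry through the parent directory's gfid symlink under .glusterfs
> ENTRY=$BAD_BRICK/.glusterfs/${PARENT_GFID:0:2}/${PARENT_GFID:2:2}/$PARENT_GFID/$FNAME
>
> # Note its inode number, then list every path on the brick that shares that
> # inode (the file itself plus any hard links, including its own gfid link)
> INUM=$(stat -c %i "$ENTRY")
> find "$BAD_BRICK" -inum "$INUM"
>
> # Only once you are sure these paths all belong to the bad copy, remove them:
> # find "$BAD_BRICK" -inum "$INUM" -delete
>
> The -delete line is left commented out on purpose so that nothing is removed
> until the listed paths have been reviewed.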
>
> As for the performance problem, it is possible that it was caused by the
> self-heal daemon periodically trying, in vain, to heal the files in gfid
> split-brain; it should most likely go away once the split-brain is resolved.
>
> As an aside, it is not clear why so many files ran into gfid split-brain.
> You might want to check whether the network link between the clients and
> the servers was fine.
>
> Hope that helps. Let me know if you need more clarification.
> -Krutika
>
> ------------------------------
>
> From: "Ankur Pandey" <[email protected]>
> To: [email protected], [email protected]
> Cc: "Dhaval Kamani" <[email protected]>
> Sent: Saturday, September 12, 2015 12:51:31 PM
> Subject: [Gluster-users] GlusterFS Split Brain issue
>
> Hi Team GlusterFS,
>
> With reference to the question on Server Fault:
> http://serverfault.com/questions/721067/glusterfs-split-brain-issue
>
> On request of Pranith, I am sending you the logs. Please tell me if you
> need anything else.
>
> Attaching logs for the 2 master servers.
>
> Regards
> Ankur Pandey
> +91 9702 831 855
>
> _______________________________________________
> Gluster-users mailing list
> [email protected]
> http://www.gluster.org/mailman/listinfo/gluster-users
_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users
