Hi Ankur, 

It looks like some of the files/directories are in gfid split-brain. 
From the logs that you attached, here is the list of gfids of directories in
gfid split-brain, based on the message id for the gfid split-brain log message
(108008):

[kdhananjay@dhcp35-215 logs]$ grep -iarT '108008' * | awk '{print $13}' | cut -f1 -d'/' | sort | uniq
<16d8005d-3ae2-4c72-9097-2aedd458b5e0 
<3539c175-d694-409d-949f-f9a3e18df17b 
<3fd13508-b29e-4d52-8c9c-14ccd2f24b9f 
<6b1e5a5a-bb65-46c1-a7c3-0526847beece 
<971b5249-92fb-4166-b1a0-33b7efcc39a8 
<b582f326-c8ee-4b04-aba0-d37cb0a6f89a 
<cc9d0e49-c9ab-4dab-bca4-1c06c8a7a4e3 

There are 7 such directories. 

Also, there are 457 entries in gfid split-brain: 
[kdhananjay@dhcp35-215 logs]$ grep -iarT '108008' glustershd.log | awk '{print $13}' | sort | uniq | wc -l
457 

You will need to do the following to get things back to the normal state: 

1) For each gfid in the list of the 7 directories in split-brain, get the list 
of files in split-brain. 
For example, for 16d8005d-3ae2-4c72-9097-2aedd458b5e0, the command would be
`grep -iarT '108008' * | grep 16d8005d-3ae2-4c72-9097-2aedd458b5e0`.
You will need to omit the repeating messages, of course.
You would get messages of the following kind: 
glustershd.log :[2015-09-10 01:44:05.512589] E [MSGID: 108008] 
[afr-self-heal-entry.c:253:afr_selfheal_detect_gfid_and_type_mismatch] 
0-repl-vol-replicate-0: Gfid mismatch detected for 
<16d8005d-3ae2-4c72-9097-2aedd458b5e0/100000075944.jpg>, 
d9f15b28-9c9c-4f31-ba3c-543a5331cb9d on repl-vol-client-1 and 
583295f0-1ec4-4783-9b35-1e18b8b4f92c on repl-vol-client-0. Skipping 
conservative merge on the file. 
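
To list each affected entry under that directory only once, you could pipe the
same grep through sort -u; a rough sketch, assuming the field positions are the
same in all the mismatch messages:

grep -iarT '108008' glustershd.log | grep 16d8005d-3ae2-4c72-9097-2aedd458b5e0 | awk '{print $13}' | sort -u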

2) Examine the two copies (one per replica) of each such file, choose one copy
and delete the copy from the other replica.
In the above example, the parent is 16d8005d-3ae2-4c72-9097-2aedd458b5e0 and
the entry is '100000075944.jpg'.
So on each replica you can examine the copy at
<brick-path>/.glusterfs/16/d8/16d8005d-3ae2-4c72-9097-2aedd458b5e0/100000075944.jpg
to decide which one you want to keep.
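
For example, to compare the size, mtime and checksum of the two copies, you
could run something like this on each server (<brick-path> is a placeholder
for your actual brick path):

stat <brick-path>/.glusterfs/16/d8/16d8005d-3ae2-4c72-9097-2aedd458b5e0/100000075944.jpg
md5sum <brick-path>/.glusterfs/16/d8/16d8005d-3ae2-4c72-9097-2aedd458b5e0/100000075944.jpg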
Once you have decided on the copy you want to keep, you need to delete the
bad copy and its hard link. This is assuming all of the entries in gfid
split-brain are regular files; at least, that is what I gathered from the
logs, since they were all .jpg files.
You can get the absolute path of the entry by noting down the inode number of
the gfid link on the bad brick and then searching for that inode number under
the same brick.
In this example, the gfid link would be
<bad-brick-path>/.glusterfs/16/d8/16d8005d-3ae2-4c72-9097-2aedd458b5e0/100000075944.jpg.
So you would need to get its inode number (by doing a stat on it) and then run
'find <bad-brick-path> -inum <inode number of gfid link>' to get its absolute
path.
Once you have both, unlink them both. If further hard links exist, delete them
as well on the bad brick.
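
Putting that together for this example, the steps on the bad brick would look
something like the following (<bad-brick-path> again stands for your actual
brick path; go by the exact paths that find prints):

# inode number of the entry, reached via the parent directory's gfid symlink
ino=$(stat -c '%i' <bad-brick-path>/.glusterfs/16/d8/16d8005d-3ae2-4c72-9097-2aedd458b5e0/100000075944.jpg)
# every path on the brick sharing that inode, i.e. the file and its hard links
find <bad-brick-path> -inum "$ino"
# then rm each of the paths that find printed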

There are 457 such entries for which you need to repeat this exercise.
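
If you would rather enumerate all of them in one go than grep per directory, a
rough sketch (assuming the mismatch message format is identical everywhere):

grep -iarT '108008' glustershd.log | awk '{print $13}' | tr -d '<>,' | sort -u |
while IFS='/' read -r pgfid fname; do
    echo "parent gfid: $pgfid entry: $fname"
done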

Once you are done, you could execute 'gluster volume heal <VOL>'. This would
take care of healing the good copies to the bricks from which the bad copies
were deleted.
After the heal is complete, 'heal info split-brain' should not show any
entries.
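
For reference, the sequence would be along these lines, with <VOL> replaced by
your volume name (repl-vol, going by the logs):

gluster volume heal <VOL>
gluster volume heal <VOL> info                 # pending-heal counts should drop
gluster volume heal <VOL> info split-brain     # should eventually show no entries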

As for the performance problem, it is possible that it was caused by the
self-heal daemon periodically trying, and failing, to heal the files in gfid
split-brain; it should most likely go away once the split-brain is resolved.

As an aside, it is not clear why so many files ran into gfid split-brain. You 
might want to check if the network link between the clients and the servers was 
fine. 
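
One quick way to check is to grep the client logs for disconnect messages
around the times the mismatches were logged; a sketch, assuming the usual fuse
mount log location (the file is named after the mount point):

grep -i 'disconnect' /var/log/glusterfs/<mount-log>.log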

Hope that helps. Let me know if you need more clarification. 
-Krutika 
----- Original Message -----

> From: "Ankur Pandey" <[email protected]>
> To: [email protected], [email protected]
> Cc: "Dhaval Kamani" <[email protected]>
> Sent: Saturday, September 12, 2015 12:51:31 PM
> Subject: [Gluster-users] GlusterFS Split Brain issue

> HI Team GlusterFS,

> With reference to Question on server fault.

> http://serverfault.com/questions/721067/glusterfs-split-brain-issue

> On request of Pranith I am sending you logs. Please tell me if you need
> anything else.

> Attaching logs for 2 master servers.

> Regards
> Ankur Pandey
> +91 9702 831 855

_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users
