On 1/21/2014 11:05 PM, Mingfan Lu wrote:
I have a volume (distributed-replicated, replica 3), and today I found an interesting problem.

node22, node23 and node24 are the bricks of the replica-7 subvolume, as seen from client A.
The annoying thing is that when I create a directory or write a file from client A to replica-7,

 date;dd if=/dev/zero of=49 bs=1MB count=120
Wed Jan 22 11:51:41 CST 2014
120+0 records in
120+0 records out
120000000 bytes (120 MB) copied, 1.96257 s, 61.1 MB/s

I could only find that node23 and node24 have the file:
---------------
node23,node24
---------------
/mnt/xfsd/test-volume/test/49

(On client A, I used the find command to check.)

I used another machine as client B, mounted the test volume (a fresh mount), and ran:

 find /mnt/xfsd/test-volume/test/49

From client A, all three nodes have the file now:

---------------
node22,node23,node24
---------------
/mnt/xfsd/test-volume/test/49

But when I delete the file /mnt/xfsd/test-volume/test/49 from client A, node22 still has the file on its brick:

---------------
node22
---------------
/mnt/xfsd/test-volume/test/49

(But if I delete the newly created files from client B ...)
My question is: why does node22 not get the newly created/written directories and files? Do I have to use find to trigger the self-heal to fix that?

From client A's log, I see something like:

I [afr-self-heal-data.c:712:afr_sh_data_fix] 0-test-volume-replicate-7: no active sinks for performing self-heal on file /test/49

Is this harmless, since it is only at the informational level?

I also see something like:
[2014-01-19 10:23:48.422757] E [afr-self-heal-entry.c:2376:afr_sh_post_nonblocking_entry_cbk] 0-test-volume-replicate-7: Non Blocking entrylks failed for /test/video/2014/01.
[2014-01-19 10:23:48.423042] E [afr-self-heal-common.c:2160:afr_self_heal_completion_cbk] 0-test-volume-replicate-7: background entry self-heal failed on /test/video/2014/01

From the paths you are listing, it looks like you may be mounting the bricks, not the gluster volume.

You MUST mount the gluster volume, not the bricks that make up the volume. In your example, the mount looks like it is mounting the xfs volume. Your mount command should be something like:

   mount -t glusterfs <host name>:/test-volume /mount/gluster/test-volume

If a brick is part of a gluster volume, the brick must NEVER be written to directly. Yes, what you write MAY eventually be replicated over to the other nodes, but if and when that happens is unpredictable, and that is exactly the kind of inconsistent replication you are seeing.
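
If you are not sure which local directories are bricks (and therefore off-limits for direct writes), the volume definition lists them. Something like this should show them (volume name taken from your log messages):

   gluster volume info test-volume

The "Brick1:", "Brick2:", ... lines in the output are host:/path pairs; those paths are the brick directories that must never be written to or deleted from directly.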

The best way to test is to run "mount". If the line where you are mounting the gluster volume doesn't say "glusterfs" on it, you have it wrong. Also, the line you use in /etc/fstab must say "glusterfs", not "xfs" or "ext4".
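
For example, a correct glusterfs client mount shows up in "mount" output something like this (host name, mount point and the exact option list are just placeholders; yours will differ):

   <host name>:/test-volume on /mount/gluster/test-volume type fuse.glusterfs (rw,allow_other,default_permissions,max_read=131072)

and the matching /etc/fstab entry would look something like:

   <host name>:/test-volume  /mount/gluster/test-volume  glusterfs  defaults,_netdev  0 0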

If you are in doubt, include the output of "mount" in your next email to the list.

Ted Miller
Elkhart, IN, USA
