It looks like you haven't had a split-brain since the 9th of October...
On 11/21/2013 02:43 PM, Alexandre Fournier wrote:
I would also like to share the split-brain information we have:
2013-10-09 22:02:59 <gfid:83b39e48-f8eb-4149-b851-d3b97e18c4b6>
2013-10-09 21:52:59 <gfid:85378e3f-0dd1-4f8e-a7d5-70424b643fb9>
2013-10-09 21:52:59 <gfid:0a958cad-1615-4e1b-8e1a-9dc0356859d6>
2013-10-09 21:52:59 <gfid:54b02fde-69d2-4da2-8372-2a7af89a0ae1>
2013-10-09 21:52:59 <gfid:4702c3ab-a2bb-43e3-ae2c-ecb5b440f368>
2013-10-09 21:52:59 <gfid:8fe46824-a9f1-4095-b204-e9e137ae8643>
...
Count : 1023
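To see how recent the listed entries are, the listing can be summarized by timestamp. This is only a minimal sketch, assuming the output has been saved as text in the format shown above; the function names are made up for illustration and are not part of Gluster.

```python
from collections import Counter
from datetime import datetime

# Sample lines copied from the listing above; a real run would read the
# full saved output instead.
SAMPLE = """\
2013-10-09 22:02:59 <gfid:83b39e48-f8eb-4149-b851-d3b97e18c4b6>
2013-10-09 21:52:59 <gfid:85378e3f-0dd1-4f8e-a7d5-70424b643fb9>
2013-10-09 21:52:59 <gfid:0a958cad-1615-4e1b-8e1a-9dc0356859d6>
"""


def parse_split_brain(text):
    """Return (timestamp, entry) pairs; lines in any other shape are skipped."""
    entries = []
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        try:
            when = datetime.strptime(parts[0] + " " + parts[1],
                                     "%Y-%m-%d %H:%M:%S")
        except ValueError:
            # Not a "date time entry" line (e.g. a header or a count line).
            continue
        entries.append((when, parts[2]))
    return entries


def summarize(entries):
    """Return the most recent timestamp and the number of entries per timestamp."""
    per_stamp = Counter(when for when, _ in entries)
    latest = max(per_stamp) if per_stamp else None
    return latest, per_stamp


if __name__ == "__main__":
    latest, per_stamp = summarize(parse_split_brain(SAMPLE))
    print("most recent entry:", latest)
    for when, count in sorted(per_stamp.items()):
        print(when, count)
```

If the most recent timestamp stays fixed between runs, no new split-brain entries are being recorded.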
We tried to clean up all the dangling links, but they keep coming back and
the split-brain is not resolved.
This may be the root cause of the problem. How do we resolve these split-brains?
-----Original Message-----
From: Alexandre Fournier
Sent: 21 November 2013 15:20
To: 'Lalatendu Mohanty'; Pranith Kumar Karampuri
Cc: [email protected]; [email protected]
Subject: RE: [Gluster-devel] [Gluster-users] Self Heal and dangling symlinks
OK, here is the information:
Stat :
File: `/aa/aa/aa/aa/aa/aa/aa
Size: 14364 Blocks: 32 IO Block: 4096 regular file
Device: 822h/2082d Inode: 3155137861 Links: 2
Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data)
Access: 2013-11-21 13:14:58.527765935 +0000
Modify: 2013-11-13 13:19:13.736226050 +0000
Change: 2013-11-13 13:19:13.736226050 +0000
File: `/aa/aa/aa/aa/aa/aa/aa
Size: 14364 Blocks: 32 IO Block: 4096 regular file
Device: 822h/2082d Inode: 3076494286 Links: 2
Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data)
Access: 2013-11-21 13:14:58.527674754 +0000
Modify: 2013-11-13 13:19:13.736442464 +0000
Change: 2013-11-13 13:19:13.736442464 +0000
Birth: -
Attributes :
# file: aa/aa/aa/aa/aa/aa/aa
trusted.afr.gv0-client-0=0x000000000000000000000000
trusted.afr.gv0-client-1=0x000000000000000000000000
trusted.gfid=0xb5b8c3ec9dd24609b56476651113d3fa
# file: aa/aa/aa/aa/aa/aa/aa
trusted.afr.gv0-client-0=0x000000000000000000000000
trusted.afr.gv0-client-1=0x000000000000000000000000
trusted.gfid=0xb5b8c3ec9dd24609b56476651113d3fa
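For reading these values: each trusted.afr.* attribute is a 12-byte changelog that can be read as three big-endian 32-bit counters (pending data, metadata and entry operations), so an all-zero value means nothing is pending against that subvolume. The following is a minimal sketch of decoding the hex strings; the helper names are hypothetical and not part of any Gluster tool.

```python
import struct
import uuid


def _strip_0x(hex_value):
    """Drop an optional leading 0x/0X prefix from a hex string."""
    if hex_value.lower().startswith("0x"):
        return hex_value[2:]
    return hex_value


def decode_afr_changelog(hex_value):
    """Split a trusted.afr.* value into its three pending-operation counters."""
    raw = bytes.fromhex(_strip_0x(hex_value))
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


def gfid_to_uuid(hex_value):
    """Render a trusted.gfid hex value in the usual dashed UUID form."""
    return str(uuid.UUID(hex=_strip_0x(hex_value)))


if __name__ == "__main__":
    print(decode_afr_changelog("0x000000000000000000000000"))
    print(gfid_to_uuid("0xb5b8c3ec9dd24609b56476651113d3fa"))
```

With the values above, both bricks decode to zero pending operations and the same GFID, which matches the observation that this file is not in split-brain.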
-----Original Message-----
From: Lalatendu Mohanty [mailto:[email protected]]
Sent: 21 November 2013 14:05
To: Alexandre Fournier; Pranith Kumar Karampuri
Cc: [email protected]; [email protected]
Subject: Re: [Gluster-devel] [Gluster-users] Self Heal and dangling symlinks
On 11/21/2013 07:54 PM, Alexandre Fournier wrote:
Both are regular files on the node and on the replicas, and they have
the same GFID. I also ran the gluster volume heal gv0 info split-brain
command, and the file is not in the list. We do have an entire directory
listed, though (1023 entries on one node).
However, the file was already on the brick before the upload, and I noticed
that the write did not work, since the last modification date does not
match the upload time.
Through a web service, we let users upload files to the Gluster mount. The
web service uploads the file to a temporary folder and then MOVEs it onto
the Gluster mount.
Could the move operation cause strange behavior like this?
Alexandre,
No, it should not. Please let us know the answers to the questions Pranith and
I asked, so we can understand the root cause of your problem.
Alexandre Fournier
Tools Programmer
Ubisoft Production Services
-----Original Message-----
From: Pranith Kumar Karampuri [mailto:[email protected]]
Sent: 21 November 2013 00:47
To: Lalatendu Mohanty
Cc: Alexandre Fournier; [email protected];
[email protected]
Subject: Re: [Gluster-devel] [Gluster-users] Self Heal and dangling
symlinks
Alexandre,
Seems like there is an entry split-brain (same file/dir name but on one
brick it is a file and on the other it is a directory) according to the
following log:
[2013-11-18 18:18:43.052446] W [afr-common.c:1411:afr_conflicting_iattrs] 0-gv0-replicate-0: /aa/aa/aa/aa: filetype differs on subvolumes (0, 1)
Could you get us the output of "stat <brick-dir-path>/aa/aa/aa/aa/aa" and
"getfattr -d -m . -e hex <brick-dir-path>/aa/aa/aa/aa/aa" on both bricks?
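While collecting that output, a scripted check of the entry type can make a file/directory mismatch easy to spot. This is a minimal sketch, assuming it is run on each brick server in turn (or against brick paths that are reachable from one host); the helper names are made up for illustration and are not part of Gluster.

```python
import os
import stat


def file_type(path):
    """Classify a path without following symlinks, as 'stat' would report it."""
    mode = os.lstat(path).st_mode
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISLNK(mode):
        return "symbolic link"
    if stat.S_ISREG(mode):
        return "regular file"
    return "other"


def types_conflict(paths):
    """Return True when the given copies of an entry are not all the same type."""
    return len({file_type(p) for p in paths}) > 1
```

For example, calling types_conflict() with the two brick-side copies of /aa/aa/aa/aa would return True if one copy is a directory and the other a regular file, which is the situation the log message describes.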
Pranith
----- Original Message -----
From: "Lalatendu Mohanty" <[email protected]>
To: "Alexandre Fournier" <[email protected]>,
[email protected], [email protected]
Sent: Thursday, November 21, 2013 1:28:01 AM
Subject: Re: [Gluster-devel] [Gluster-users] Self Heal and dangling
symlinks
On 11/19/2013 10:49 PM, Alexandre Fournier wrote:
Hello,
We are experiencing strange behavior when writing files to the Gluster
mount point. On some occasions, a write to the Gluster mount fails with
an Open Stream error. We looked at the Gluster logs and found
the following faulty entries:
[From /var/log/glusterfs/mnt-gv0.log]
[2013-11-18 18:18:43.052446] W [afr-common.c:1411:afr_conflicting_iattrs] 0-gv0-replicate-0: /aa/aa/aa/aa: filetype differs on subvolumes (0, 1)
[2013-11-18 18:18:43.052468] E [afr-self-heal-common.c:1409:afr_sh_common_lookup_cbk] 0-gv0-replicate-0: Conflicting entries for /aa/aa/aa/aa
[2013-11-18 18:18:43.052757] E [afr-self-heal-common.c:2160:afr_self_heal_completion_cbk] 0-gv0-replicate-0: background meta-data data entry missing-entry gfid self-heal failed on /aa/aa/aa/aa/aa
[2013-11-18 18:18:43.052780] W [fuse-bridge.c:292:fuse_entry_cbk] 0-glusterfs-fuse: 439382194: LOOKUP() /aa/aa/aa/aa/aa => -1 (Input/output error)
We looked at the log file etc-glusterfs-glusterd.vol.log but found
nothing related to this problem. Then we looked at
/var/log/glusterfs/bricks/mnt-data.log and found 70 GB of
log entries of the same type:
[2013-11-19 17:13:32.269757] W [posix-handle.c:538:posix_handle_soft] 0-gv0-posix: symlink ../../ab/fe/abfeb61c-501d-4417-b8fb-0accdd57146f/cf -> /mnt/data/.glusterfs/ab/fe/abfeb61c-501d-4417-b8fb-0accdd57146f/cf failed (No such file or directory)
[2013-11-19 17:13:32.269978] W [posix-handle.c:538:posix_handle_soft] 0-gv0-posix: symlink ../../c7/8b/c78be78f-cc95-47b2-a27f-4217f1759b67/d2 -> /mnt/data/.glusterfs/c7/8b/c78be78f-cc95-47b2-a27f-4217f1759b67/d2 failed (No such file or directory)
[2013-11-19 17:13:32.270190] W [posix-handle.c:538:posix_handle_soft] 0-gv0-posix: symlink ../../5a/8f/5a8fa43c-4ccc-4d88-9122-a96bc8ffaebc/f2 -> /mnt/data/.glusterfs/5a/8f/5a8fa43c-4ccc-4d88-9122-a96bc8ffaebc/f2 failed (No such file or directory)
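The paths in these warnings follow the brick's .glusterfs handle layout, where the first two pairs of hex digits of a GFID name the two directory levels above its handle. As a rough illustration only, here is a sketch that builds a handle path from a GFID and lists symlinks under .glusterfs whose targets no longer exist. It only reports them and deletes nothing; the function names are hypothetical, and the brick root /mnt/data is taken from the log lines above.

```python
import os


def gfid_handle_path(brick_root, gfid):
    """Build <brick>/.glusterfs/<first 2 hex>/<next 2 hex>/<gfid>."""
    gfid = gfid.lower()
    return os.path.join(brick_root, ".glusterfs", gfid[0:2], gfid[2:4], gfid)


def find_dangling_symlinks(brick_root):
    """Return symlinks under .glusterfs whose targets cannot be resolved."""
    handle_root = os.path.join(brick_root, ".glusterfs")
    dangling = []
    # os.walk does not follow symlinked directories by default, so each
    # symlink is examined once and the scan stays inside .glusterfs.
    for dirpath, dirnames, filenames in os.walk(handle_root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # islink() inspects the link itself; exists() follows it.
            if os.path.islink(path) and not os.path.exists(path):
                dangling.append(path)
    return sorted(dangling)


if __name__ == "__main__":
    for path in find_dangling_symlinks("/mnt/data"):
        print(path)
```

Running this on the node that produces the warnings, before and after a cleanup attempt, would show whether the same links are being recreated.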
This looks like a bug, unless there is something wrong with the
set-up. I have copied gluster-devel on this thread, as I think they might be able to help.
Just curious: do all your Gluster nodes have the same time (i.e., are they NTP-synced)?
And it does not stop logging. It seems that self-heal does not
work properly when there are broken symlinks in Gluster. It
is also worth mentioning that this log is only produced on a single node,
even though writes fail on several nodes. We also tried to clean up the
symlinks manually, but they always come back.
Is it possible to recover from broken symlinks?
Configuration :
Gluster Version : 3.3.2
Cluster setup : 4 X 2
OS : Ubuntu
On Fuse
Thanks,
Alexandre
_______________________________________________
Gluster-users mailing list [email protected]
http://supercolony.gluster.org/mailman/listinfo/gluster-users
_______________________________________________
Gluster-devel mailing list
[email protected]
https://lists.nongnu.org/mailman/listinfo/gluster-devel