Hi,

On Wed, Sep 21, 2016 at 10:12:44PM +0530, Ravishankar N wrote:
> On 09/21/2016 06:45 PM, Pasi Kärkkäinen wrote:
> >Hello,
> >
> >I have a pretty basic two-node gluster 3.7 setup, with a volume
> >replicated/mirrored to both servers.
> >
> >One of the servers was down for hardware maintenance, and later when it got
> >back up, the healing process started, re-syncing files.
> >In the beginning there was some 200 files that need to be synced, and now
> >the number of files is down to 10, but it seems the last 10 files don't seem
> >to get synced..
> >
> >So the problem is the healing/re-sync never ends for these files..
> >
> >
> ># gluster volume heal gvol1 info
> >Brick gnode1:/bricks/vol1/brick1
> >/foo
> >/ - Possibly undergoing heal
> >
> >/foo6
> >/foo8
> >/foo7
> >/foo9
> >/foo2
> >/foo5
> >/foo4
> >/foo3
> >Status: Connected
> >Number of entries: 10
> >
> >Brick gnode2:/bricks/vol1/brick1
> >/
> >Status: Connected
> >Number of entries: 1
> >
> >
> >In the brick logs for the volume I see these errors repeating:
> >
> >[2016-09-21 12:41:43.063209] E [MSGID: 113002] [posix.c:252:posix_lookup]
> >0-gvol1-posix: buf->ia_gfid is null for /bricks/vol1/brick1/foo [No data
> >available]
> >[2016-09-21 12:41:43.063266] E [MSGID: 115050]
> >[server-rpc-fops.c:179:server_lookup_cbk] 0-gvol1-server: 1484202: LOOKUP
> >/foo (00000000-0000-0000-0000-000000000001/foo) ==> (No data available) [No
> >data available]
> >
> >
> >Any idea what might cause those errors? (/foo is exactly the file that is
> >being healed, but fails to heal)
> >Any tricks to try?
>
> Can you check if the 'trusted.gfid' xattr is present for those files
> on the bricks and the files also have the associated hardlink inside
> .glusterfs? You can refer to
> https://joejulian.name/blog/what-is-this-new-glusterfs-directory-in-33/
> if you are not familiar with the .glusterfs directory.
>
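
For reference, here is a rough sketch of how I understand the .glusterfs
hardlink location is derived from a trusted.gfid value: the 32 hex digits are
written as a UUID, and the first two pairs of digits name the two
subdirectories. The helper name gfid_backend_path is made up for illustration,
and the gfid used is the one getfattr reports for /foo on the healthy brick
further down in this mail. Treat the layout as my assumption based on the blog
post linked above, not as something verified against the 3.7 sources.

```python
import uuid
from pathlib import PurePosixPath


def gfid_backend_path(brick_root: str, gfid_hex: str) -> PurePosixPath:
    """Return the expected .glusterfs entry for a trusted.gfid value.

    Hypothetical helper: assumes the usual layout
    <brick>/.glusterfs/<first 2 hex digits>/<next 2 hex digits>/<uuid>.
    """
    # getfattr -e hex prints the value with a leading "0x"; drop it if present.
    digits = gfid_hex[2:] if gfid_hex.lower().startswith("0x") else gfid_hex
    # uuid.UUID validates the 32 hex digits and formats them with hyphens.
    gfid = str(uuid.UUID(hex=digits))
    return PurePosixPath(brick_root) / ".glusterfs" / gfid[:2] / gfid[2:4] / gfid


if __name__ == "__main__":
    print(gfid_backend_path("/bricks/vol1/brick1",
                            "0xc1ca778ed2af4828b981171c0c5bd45e"))
```

For a regular file, that entry should be a hardlink to the file itself, so
comparing the inode numbers of the two paths (for example with stat or ls -li)
on the brick should show whether the link exists.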
Let's see.

# getfattr -m . -d -e hex /bricks/vol1/brick1/foo
getfattr: Removing leading '/' from absolute path names
# file: bricks/vol1/brick1/foo
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000

So hmm.. no trusted.gfid it seems.. is that perhaps because this node was down
when the file was created?

On another node:

# getfattr -m . -d -e hex /bricks/vol1/brick1/foo
getfattr: Removing leading '/' from absolute path names
# file: bricks/vol1/brick1/foo
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.gvol1-client-1=0x000016620000000100000000
trusted.bit-rot.version=0x020000000000000057e00db5000624ed
trusted.gfid=0xc1ca778ed2af4828b981171c0c5bd45e

So there we have the gfid..
How do I fix this and allow healing process to continue/finish.. ?


Thanks,

-- Pasi

> -Ravi
>
> >
> >Software versions: CentOS 7 with gluster37 repo (running Gluster 3.7.15),
> >and nfs-ganesha 2.3.3.
> >
> >
> >Thanks a lot,
> >
> >-- Pasi
> >
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users