On 3 July 2018 at 23:37, Vlad Kopylov <[email protected]> wrote:
> might be too late but sort of simple always working solution for such
> cases is rebuilding .glusterfs
>
> kill it and query attr for all files again, it will recreate .glusterfs on
> all bricks
>
> something like mentioned here
> https://lists.gluster.org/pipermail/gluster-users/2018-January/033352.html
>
Is my problem with .glusterfs though? I'd be super cautious removing the
entire directory unless I'm sure that's the solution...

Cheers,

> On Tue, Jul 3, 2018 at 4:27 PM, Gambit15 <[email protected]> wrote:
>
>> On 1 July 2018 at 22:37, Ashish Pandey <[email protected]> wrote:
>>
>>>
>>> The only problem at the moment is that arbiter brick offline. You should
>>> only bother about completion of maintenance of arbiter brick ASAP.
>>> Bring this brick UP, start FULL heal or index heal and the volume will
>>> be in healthy state.
>>>
>>
>> Doesn't the arbiter only resolve split-brain situations? None of the
>> files that have been marked for healing are marked as in split-brain.
>>
>> The arbiter has now been brought back up, however the problem continues.
>>
>> I've found the following information in the client log:
>>
>> [2018-07-03 19:09:29.245089] W [MSGID: 108008]
>> [afr-self-heal-name.c:354:afr_selfheal_name_gfid_mismatch_check]
>> 0-engine-replicate-0: GFID mismatch for
>> <gfid:db9afb92-d2bc-49ed-8e34-dcd437ba7be2>/hosted-engine.metadata
>> 5e95ba8c-2f12-49bf-be2d-b4baf210d366 on engine-client-1 and
>> b9cd7613-3b96-415d-a549-1dc788a4f94d on engine-client-0
>> [2018-07-03 19:09:29.245585] W [fuse-bridge.c:471:fuse_entry_cbk]
>> 0-glusterfs-fuse: 10430040: LOOKUP()
>> /98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent/hosted-engine.metadata
>> => -1 (Input/output error)
>> [2018-07-03 19:09:30.619000] W [MSGID: 108008]
>> [afr-self-heal-name.c:354:afr_selfheal_name_gfid_mismatch_check]
>> 0-engine-replicate-0: GFID mismatch for
>> <gfid:db9afb92-d2bc-49ed-8e34-dcd437ba7be2>/hosted-engine.lockspace
>> 8e86902a-c31c-4990-b0c5-0318807edb8f on engine-client-1 and
>> e5899a4c-dc5d-487e-84b0-9bbc73133c25 on engine-client-0
>> [2018-07-03 19:09:30.619360] W [fuse-bridge.c:471:fuse_entry_cbk]
>> 0-glusterfs-fuse: 10430656: LOOKUP()
>> /98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent/hosted-engine.lockspace
>> => -1 (Input/output error)
>>
>> As you can see from the logs I
posted previously, neither of those two
>> files, on either of the two servers, have any of gluster's extended
>> attributes set.
>>
>> The arbiter doesn't have any record of the files in question, as they
>> were created after it went offline.
>>
>> How do I fix this? Is it possible to locate the correct gfids somewhere &
>> redefine them on the files manually?
>>
>> Cheers,
>> Doug
>>
>> ------------------------------
>>> *From: *"Gambit15" <[email protected]>
>>> *To: *"Ashish Pandey" <[email protected]>
>>> *Cc: *"gluster-users" <[email protected]>
>>> *Sent: *Monday, July 2, 2018 1:45:01 AM
>>> *Subject: *Re: [Gluster-users] Files not healing & missing their
>>> extended attributes - Help!
>>>
>>>
>>> Hi Ashish,
>>>
>>> The output is below. It's a rep 2+1 volume. The arbiter is offline for
>>> maintenance at the moment, however quorum is met & no files are reported as
>>> in split-brain (it hosts VMs, so files aren't accessed concurrently).
>>>
>>> ======================
>>> [root@v0 glusterfs]# gluster volume info engine
>>>
>>> Volume Name: engine
>>> Type: Replicate
>>> Volume ID: 279737d3-3e5a-4ee9-8d4a-97edcca42427
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 1 x (2 + 1) = 3
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: s0:/gluster/engine/brick
>>> Brick2: s1:/gluster/engine/brick
>>> Brick3: s2:/gluster/engine/arbiter (arbiter)
>>> Options Reconfigured:
>>> nfs.disable: on
>>> performance.readdir-ahead: on
>>> transport.address-family: inet
>>> performance.quick-read: off
>>> performance.read-ahead: off
>>> performance.io-cache: off
>>> performance.stat-prefetch: off
>>> cluster.eager-lock: enable
>>> network.remote-dio: enable
>>> cluster.quorum-type: auto
>>> cluster.server-quorum-type: server
>>> storage.owner-uid: 36
>>> storage.owner-gid: 36
>>> performance.low-prio-threads: 32
>>>
>>> ======================
>>>
>>> [root@v0 glusterfs]# gluster volume heal engine info
>>> Brick s0:/gluster/engine/brick
>>> /__DIRECT_IO_TEST__
>>> /98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent
>>> /98495dbc-a29c-4893-b6a0-0aa70860d0c9
>>> <LIST TRUNCATED FOR BREVITY>
>>> Status: Connected
>>> Number of entries: 34
>>>
>>> Brick s1:/gluster/engine/brick
>>> <SAME AS ABOVE - TRUNCATED FOR BREVITY>
>>> Status: Connected
>>> Number of entries: 34
>>>
>>> Brick s2:/gluster/engine/arbiter
>>> Status: Transport endpoint is not connected
>>> Number of entries: -
>>>
>>> ======================
>>> === PEER V0 ===
>>>
>>> [root@v0 glusterfs]# getfattr -m . -d -e hex /gluster/engine/brick/98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent
>>> getfattr: Removing leading '/' from absolute path names
>>> # file: gluster/engine/brick/98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent
>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
>>> trusted.afr.dirty=0x000000000000000000000000
>>> trusted.afr.engine-client-2=0x0000000000000000000024e8
>>> trusted.gfid=0xdb9afb92d2bc49ed8e34dcd437ba7be2
>>> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>>>
>>> [root@v0 glusterfs]# getfattr -m . -d -e hex /gluster/engine/brick/98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent/*
>>> getfattr: Removing leading '/' from absolute path names
>>> # file: gluster/engine/brick/98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent/hosted-engine.lockspace
>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6675736566735f743a733000
>>>
>>> # file: gluster/engine/brick/98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent/hosted-engine.metadata
>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6675736566735f743a733000
>>>
>>>
>>> === PEER V1 ===
>>>
>>> [root@v1 glusterfs]# getfattr -m .
-d -e hex /gluster/engine/brick/98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent
>>> getfattr: Removing leading '/' from absolute path names
>>> # file: gluster/engine/brick/98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent
>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
>>> trusted.afr.dirty=0x000000000000000000000000
>>> trusted.afr.engine-client-2=0x0000000000000000000024ec
>>> trusted.gfid=0xdb9afb92d2bc49ed8e34dcd437ba7be2
>>> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>>>
>>> ======================
>>>
>>> cmd_history.log-20180701:
>>>
>>> [2018-07-01 03:11:38.461175] : volume heal engine full : SUCCESS
>>> [2018-07-01 03:11:51.151891] : volume heal data full : SUCCESS
>>>
>>> glustershd.log-20180701:
>>> <LOGS FROM 06/01 TRUNCATED>
>>> [2018-07-01 07:15:04.779122] I [MSGID: 100011]
>>> [glusterfsd.c:1396:reincarnate] 0-glusterfsd: Fetching the volume file
>>> from server...
>>>
>>> glustershd.log:
>>> [2018-07-01 07:15:04.779693] I [glusterfsd-mgmt.c:1596:mgmt_getspec_cbk]
>>> 0-glusterfs: No change in volfile, continuing
>>>
>>> That's the *only* message in glustershd.log today.
>>>
>>> ======================
>>>
>>> [root@v0 glusterfs]# gluster volume status engine
>>> Status of volume: engine
>>> Gluster process                     TCP Port  RDMA Port  Online  Pid
>>> ------------------------------------------------------------------------------
>>> Brick s0:/gluster/engine/brick      49154     0          Y       2816
>>> Brick s1:/gluster/engine/brick      49154     0          Y       3995
>>> Self-heal Daemon on localhost       N/A       N/A        Y       2919
>>> Self-heal Daemon on s1              N/A       N/A        Y       4013
>>>
>>> Task Status of Volume engine
>>> ------------------------------------------------------------------------------
>>> There are no active volume tasks
>>>
>>> ======================
>>>
>>> Okay, so actually only the directory ha_agent is listed for healing (not
>>> its contents), & that does have attributes set.
>>>
>>> Many thanks for the reply!
>>>
>>>
>>> On 1 July 2018 at 15:34, Ashish Pandey <[email protected]> wrote:
>>>
>>>> You have not even talked about the volume type and configuration and
>>>> this issue would require lot of other information to fix it.
>>>>
>>>> 1 - What is the type of volume and config.
>>>> 2 - Provide the gluster v <volname> info out put
>>>> 3 - Heal info out put
>>>> 4 - getxattr of one of the file, which needs healing, from all the
>>>> bricks.
>>>> 5 - What lead to the healing of file?
>>>> 6 - gluster v <volname> status
>>>> 7 - glustershd.log out put just after you run full heal or index heal
>>>>
>>>> ----
>>>> Ashish
>>>>
>>>> ------------------------------
>>>> *From: *"Gambit15" <[email protected]>
>>>> *To: *"gluster-users" <[email protected]>
>>>> *Sent: *Sunday, July 1, 2018 11:50:16 PM
>>>> *Subject: *[Gluster-users] Files not healing & missing their
>>>> extended attributes - Help!
>>>>
>>>>
>>>> Hi Guys,
>>>> I had to restart our datacenter yesterday, but since doing so a number
>>>> of the files on my gluster share have been stuck, marked as healing. After
>>>> no signs of progress, I manually set off a full heal last night, but after
>>>> 24hrs, nothing's happened.
>>>>
>>>> The gluster logs all look normal, and there're no messages about failed
>>>> connections or heal processes kicking off.
>>>>
>>>> I checked the listed files' extended attributes on their bricks today,
>>>> and they only show the selinux attribute. There's none of the trusted.*
>>>> attributes I'd expect.
>>>> The healthy files on the bricks do have their extended attributes
>>>> though.
>>>>
>>>> I'm guessing that perhaps the files somehow lost their attributes, and
>>>> gluster is no longer able to work out what to do with them? It's not logged
>>>> any errors, warnings, or anything else out of the normal though, so I've no
>>>> idea what the problem is or how to resolve it.
>>>>
>>>> I've got 16 hours to get this sorted before the start of work, Monday.
>>>> Help!
>>>>
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> [email protected]
>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
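
For anyone cross-checking the IDs in this thread: the client log prints GFIDs in dashed UUID form, while getfattr -e hex prints trusted.gfid as a bare hex string. A minimal sketch of the conversion follows, together with the path where a brick would normally keep the matching .glusterfs entry. The function names are made up for illustration, and the .glusterfs/<first two hex digits>/<next two>/<gfid> layout is assumed to be the usual brick backend layout. The code only builds strings and never touches a brick.

```python
import uuid
from pathlib import PurePosixPath


def gfid_from_xattr(hex_value: str) -> str:
    """Turn a trusted.gfid value as printed by `getfattr -e hex`
    (e.g. 0xdb9a...) into the dashed UUID form used in the logs."""
    digits = hex_value.strip().lower()
    if digits.startswith("0x"):
        digits = digits[2:]
    # uuid.UUID validates the length (32 hex digits) and inserts the dashes.
    return str(uuid.UUID(hex=digits))


def gfid_backend_path(brick_root: str, gfid: str) -> PurePosixPath:
    """Build the expected .glusterfs entry for a GFID on a brick.
    Assumes the usual layout: .glusterfs/<aa>/<bb>/<full gfid>."""
    return PurePosixPath(brick_root) / ".glusterfs" / gfid[0:2] / gfid[2:4] / gfid


if __name__ == "__main__":
    # Value taken from the ha_agent directory's getfattr output in this thread.
    gfid = gfid_from_xattr("0xdb9afb92d2bc49ed8e34dcd437ba7be2")
    print(gfid)
    print(gfid_backend_path("/gluster/engine/brick", gfid))
```

Run against the ha_agent directory's trusted.gfid value, this gives the same db9afb92-... ID that the GFID mismatch warnings name as the parent, which at least confirms that the directory itself is consistent on both data bricks and that the mismatch concerns the two files inside it.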
