Re: [Gluster-users] Geo-replication: Entry not present on master. Fixing gfid mismatch in slave
Hi Strahil,

Thank you for that. Do you know if these "Stale file handle" errors on the geo-replication slave could be related?

[2020-06-10 01:02:32.268989] E [MSGID: 109040] [dht-helper.c:1332:dht_migration_complete_check_task] 0-gvol0-dht: /.gfid/d4265a0c-d881-48d8-8ca1-0920ab5ae9ba: failed to lookup the file on gvol0-dht [Stale file handle]
[2020-06-10 01:02:32.269092] W [fuse-bridge.c:897:fuse_attr_cbk] 0-glusterfs-fuse: 7434237: STAT() /.gfid/d4265a0c-d881-48d8-8ca1-0920ab5ae9ba => -1 (Stale file handle)
[2020-06-10 01:02:32.329280] W [fuse-bridge.c:897:fuse_attr_cbk] 0-glusterfs-fuse: 7434251: STAT() /.gfid/d4265a0c-d881-48d8-8ca1-0920ab5ae9ba => -1 (Stale file handle)
[2020-06-10 01:02:32.387129] W [fuse-bridge.c:897:fuse_attr_cbk] 0-glusterfs-fuse: 7434264: STAT() /.gfid/d4265a0c-d881-48d8-8ca1-0920ab5ae9ba => -1 (Stale file handle)
[2020-06-10 01:02:32.448838] W [fuse-bridge.c:897:fuse_attr_cbk] 0-glusterfs-fuse: 7434277: STAT() /.gfid/d4265a0c-d881-48d8-8ca1-0920ab5ae9ba => -1 (Stale file handle)
[2020-06-10 01:02:32.507196] W [fuse-bridge.c:897:fuse_attr_cbk] 0-glusterfs-fuse: 7434290: STAT() /.gfid/d4265a0c-d881-48d8-8ca1-0920ab5ae9ba => -1 (Stale file handle)
[2020-06-10 01:02:32.566033] W [fuse-bridge.c:897:fuse_attr_cbk] 0-glusterfs-fuse: 7434303: STAT() /.gfid/d4265a0c-d881-48d8-8ca1-0920ab5ae9ba => -1 (Stale file handle)
[2020-06-10 01:02:32.625168] W [fuse-bridge.c:897:fuse_attr_cbk] 0-glusterfs-fuse: 7434316: STAT() /.gfid/d4265a0c-d881-48d8-8ca1-0920ab5ae9ba => -1 (Stale file handle)
[2020-06-10 01:02:32.772442] W [fuse-bridge.c:897:fuse_attr_cbk] 0-glusterfs-fuse: 7434329: STAT() /.gfid/d4265a0c-d881-48d8-8ca1-0920ab5ae9ba => -1 (Stale file handle)
[2020-06-10 01:02:32.832481] W [fuse-bridge.c:897:fuse_attr_cbk] 0-glusterfs-fuse: 7434342: STAT() /.gfid/d4265a0c-d881-48d8-8ca1-0920ab5ae9ba => -1 (Stale file handle)
[2020-06-10 01:02:32.891835] W [fuse-bridge.c:897:fuse_attr_cbk] 0-glusterfs-fuse: 7434403: STAT() /.gfid/d4265a0c-d881-48d8-8ca1-0920ab5ae9ba => -1 (Stale file handle)

On Tue, 9 Jun 2020 at 16:31, Strahil Nikolov wrote:
> Hey David,
>
> Can you check the CPU usage in sar on the rest of the cluster (going backwards from the day you found the high CPU usage), so we can know whether this behaviour was observed on other nodes?
>
> Maybe that behaviour was "normal" for the push node (which could be another one).
>
> As this script is Python, I guess you can put some debug print statements in it.
>
> Best Regards,
> Strahil Nikolov
>
> On 9 June 2020 at 5:07:11 GMT+03:00, David Cunningham <dcunning...@voisonics.com> wrote:
> >Hi Sankarshan,
> >
> >Thanks for that. So what should we look for to figure out what this process is doing? In /var/log/glusterfs/geo-replication/gvol0_nvfs10_gvol0/gsyncd.log we see something like the following logged regularly:
> >
> >[2020-06-09 02:01:19.670595] D [master(worker /nodirectwritedata/gluster/gvol0):1454:changelogs_batch_process] _GMaster: processing changes batch=['/var/lib/misc/gluster/gsyncd/gvol0_nvfs10_gvol0/nodirectwritedata-gluster-gvol0/.processing/CHANGELOG.1591668040', '/var/lib/misc/gluster/gsyncd/gvol0_nvfs10_gvol0/nodirectwritedata-gluster-gvol0/.processing/CHANGELOG.1591668055']
> >[2020-06-09 02:01:19.674927] D [master(worker /nodirectwritedata/gluster/gvol0):1289:process] _GMaster: processing change changelog=/var/lib/misc/gluster/gsyncd/gvol0_nvfs10_gvol0/nodirectwritedata-gluster-gvol0/.processing/CHANGELOG.1591668040
> >[2020-06-09 02:01:19.683098] D [master(worker /nodirectwritedata/gluster/gvol0):1170:process_change] _GMaster: entries: []
> >[2020-06-09 02:01:19.695125] D [master(worker /nodirectwritedata/gluster/gvol0):312:a_syncdata] _GMaster: files files=set(['.gfid/0f98f9cd-1800-4c0f-b449-edcd7446bf17', '.gfid/512b4710-5af7-4e5a-8f3a-0a3dece42f77', '.gfid/779cd2b3-1571-446a-8903-48d6183d3dd0', '.gfid/8ae32eec-f766-4cd9-a788-4561ba1fa435'])
> >[2020-06-09 02:01:19.695344] D [master(worker /nodirectwritedata/gluster/gvol0):315:a_syncdata] _GMaster: candidate for syncing file=.gfid/0f98f9cd-1800-4c0f-b449-edcd7446bf17
> >[2020-06-09 02:01:19.695508] D [master(worker /nodirectwritedata/gluster/gvol0):315:a_syncdata] _GMaster: candidate for syncing file=.gfid/512b4710-5af7-4e5a-8f3a-0a3dece42f77
> >[2020-06-09 02:01:19.695638] D [master(worker /nodirectwritedata/gluster/gvol0):315:a_syncdata] _GMaster: candidate for syncing file=.gfid/779cd2b3-1571-446a-8903-48d6183d3dd0
> >[2020-06-09 02:01:19.695759] D [master(worker /nodirectwritedata/gluster/gvol0):315:a_syncdata] _GMaster: candidate for syncing file=.gfid/8ae32eec-f766-4cd9-a788-4561ba1fa435
> >[2020-06-09 02:01:19.695883] D [master(worker /nodirectwritedata/gluster/gvol0):1289:process] _GMaster: processing change
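
One way to see whether the two symptoms in this message touch the same files is to tally the GFIDs that appear in each log. The following is only a minimal sketch under stated assumptions: `count_gfids` is a hypothetical helper (not part of Gluster or gsyncd), the sample lines are copied from the excerpts in this thread, and each log entry is assumed to sit on a single line as it does in the real log files rather than wrapped by the mail client.

```python
import re
from collections import Counter

# In both logs quoted in this thread a GFID appears as a UUID after ".gfid/".
GFID_RE = re.compile(
    r"\.gfid/([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
)


def count_gfids(lines, marker):
    """Count GFIDs on lines containing `marker` (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        if marker in line:
            counts.update(GFID_RE.findall(line))
    return counts


# Sample entries from the slave mount log excerpt.
slave_log = [
    "[2020-06-10 01:02:32.268989] E [MSGID: 109040] [dht-helper.c:1332:dht_migration_complete_check_task] 0-gvol0-dht: /.gfid/d4265a0c-d881-48d8-8ca1-0920ab5ae9ba: failed to lookup the file on gvol0-dht [Stale file handle]",
    "[2020-06-10 01:02:32.269092] W [fuse-bridge.c:897:fuse_attr_cbk] 0-glusterfs-fuse: 7434237: STAT() /.gfid/d4265a0c-d881-48d8-8ca1-0920ab5ae9ba => -1 (Stale file handle)",
    "[2020-06-10 01:02:32.329280] W [fuse-bridge.c:897:fuse_attr_cbk] 0-glusterfs-fuse: 7434251: STAT() /.gfid/d4265a0c-d881-48d8-8ca1-0920ab5ae9ba => -1 (Stale file handle)",
]

# Sample entries from the gsyncd.log excerpt on the master.
master_log = [
    "[2020-06-09 02:01:19.695344] D [master(worker /nodirectwritedata/gluster/gvol0):315:a_syncdata] _GMaster: candidate for syncing file=.gfid/0f98f9cd-1800-4c0f-b449-edcd7446bf17",
    "[2020-06-09 02:01:19.695638] D [master(worker /nodirectwritedata/gluster/gvol0):315:a_syncdata] _GMaster: candidate for syncing file=.gfid/779cd2b3-1571-446a-8903-48d6183d3dd0",
    "[2020-06-09 02:01:19.714286] D [master(worker /nodirectwritedata/gluster/gvol0):315:a_syncdata] _GMaster: candidate for syncing file=.gfid/0f98f9cd-1800-4c0f-b449-edcd7446bf17",
    "[2020-06-09 02:01:20.739650] D [master(worker /nodirectwritedata/gluster/gvol0):321:regjob] _GMaster: synced file=.gfid/0f98f9cd-1800-4c0f-b449-edcd7446bf17",
]

stale = count_gfids(slave_log, "Stale file handle")
candidates = count_gfids(master_log, "candidate for syncing")

# GFIDs present in both tallies would be worth a closer look.
overlap = set(stale) & set(candidates)
print(stale.most_common(3))
print(candidates.most_common(3))
print(sorted(overlap))
```

On a real system the same function could be fed an open file object, for example /var/log/glusterfs/geo-replication/gvol0_nvfs10_gvol0/gsyncd.log on the master. In the excerpts quoted here the stale GFID does not appear among the sync candidates, so the sample alone does not show a link between the two.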
Re: [Gluster-users] Geo-replication: Entry not present on master. Fixing gfid mismatch in slave
Hey David,

Can you check the CPU usage in sar on the rest of the cluster (going backwards from the day you found the high CPU usage), so we can know whether this behaviour was observed on other nodes?

Maybe that behaviour was "normal" for the push node (which could be another one).

As this script is Python, I guess you can put some debug print statements in it.

Best Regards,
Strahil Nikolov

On 9 June 2020 at 5:07:11 GMT+03:00, David Cunningham wrote:
>Hi Sankarshan,
>
>Thanks for that. So what should we look for to figure out what this process is doing? In /var/log/glusterfs/geo-replication/gvol0_nvfs10_gvol0/gsyncd.log we see something like the following logged regularly:
>
>[2020-06-09 02:01:19.670595] D [master(worker /nodirectwritedata/gluster/gvol0):1454:changelogs_batch_process] _GMaster: processing changes batch=['/var/lib/misc/gluster/gsyncd/gvol0_nvfs10_gvol0/nodirectwritedata-gluster-gvol0/.processing/CHANGELOG.1591668040', '/var/lib/misc/gluster/gsyncd/gvol0_nvfs10_gvol0/nodirectwritedata-gluster-gvol0/.processing/CHANGELOG.1591668055']
>[2020-06-09 02:01:19.674927] D [master(worker /nodirectwritedata/gluster/gvol0):1289:process] _GMaster: processing change changelog=/var/lib/misc/gluster/gsyncd/gvol0_nvfs10_gvol0/nodirectwritedata-gluster-gvol0/.processing/CHANGELOG.1591668040
>[2020-06-09 02:01:19.683098] D [master(worker /nodirectwritedata/gluster/gvol0):1170:process_change] _GMaster: entries: []
>[2020-06-09 02:01:19.695125] D [master(worker /nodirectwritedata/gluster/gvol0):312:a_syncdata] _GMaster: files files=set(['.gfid/0f98f9cd-1800-4c0f-b449-edcd7446bf17', '.gfid/512b4710-5af7-4e5a-8f3a-0a3dece42f77', '.gfid/779cd2b3-1571-446a-8903-48d6183d3dd0', '.gfid/8ae32eec-f766-4cd9-a788-4561ba1fa435'])
>[2020-06-09 02:01:19.695344] D [master(worker /nodirectwritedata/gluster/gvol0):315:a_syncdata] _GMaster: candidate for syncing file=.gfid/0f98f9cd-1800-4c0f-b449-edcd7446bf17
>[2020-06-09 02:01:19.695508] D [master(worker /nodirectwritedata/gluster/gvol0):315:a_syncdata] _GMaster: candidate for syncing file=.gfid/512b4710-5af7-4e5a-8f3a-0a3dece42f77
>[2020-06-09 02:01:19.695638] D [master(worker /nodirectwritedata/gluster/gvol0):315:a_syncdata] _GMaster: candidate for syncing file=.gfid/779cd2b3-1571-446a-8903-48d6183d3dd0
>[2020-06-09 02:01:19.695759] D [master(worker /nodirectwritedata/gluster/gvol0):315:a_syncdata] _GMaster: candidate for syncing file=.gfid/8ae32eec-f766-4cd9-a788-4561ba1fa435
>[2020-06-09 02:01:19.695883] D [master(worker /nodirectwritedata/gluster/gvol0):1289:process] _GMaster: processing change changelog=/var/lib/misc/gluster/gsyncd/gvol0_nvfs10_gvol0/nodirectwritedata-gluster-gvol0/.processing/CHANGELOG.1591668055
>[2020-06-09 02:01:19.696170] D [master(worker /nodirectwritedata/gluster/gvol0):1170:process_change] _GMaster: entries: []
>[2020-06-09 02:01:19.714097] D [master(worker /nodirectwritedata/gluster/gvol0):312:a_syncdata] _GMaster: files files=set(['.gfid/0f98f9cd-1800-4c0f-b449-edcd7446bf17', '.gfid/512b4710-5af7-4e5a-8f3a-0a3dece42f77', '.gfid/8ae32eec-f766-4cd9-a788-4561ba1fa435'])
>[2020-06-09 02:01:19.714286] D [master(worker /nodirectwritedata/gluster/gvol0):315:a_syncdata] _GMaster: candidate for syncing file=.gfid/0f98f9cd-1800-4c0f-b449-edcd7446bf17
>[2020-06-09 02:01:19.714433] D [master(worker /nodirectwritedata/gluster/gvol0):315:a_syncdata] _GMaster: candidate for syncing file=.gfid/512b4710-5af7-4e5a-8f3a-0a3dece42f77
>[2020-06-09 02:01:19.714577] D [master(worker /nodirectwritedata/gluster/gvol0):315:a_syncdata] _GMaster: candidate for syncing file=.gfid/8ae32eec-f766-4cd9-a788-4561ba1fa435
>[2020-06-09 02:01:20.179656] D [resource(worker /nodirectwritedata/gluster/gvol0):1419:rsync] SSH: files: .gfid/0f98f9cd-1800-4c0f-b449-edcd7446bf17, .gfid/512b4710-5af7-4e5a-8f3a-0a3dece42f77, .gfid/779cd2b3-1571-446a-8903-48d6183d3dd0, .gfid/8ae32eec-f766-4cd9-a788-4561ba1fa435, .gfid/0f98f9cd-1800-4c0f-b449-edcd7446bf17, .gfid/512b4710-5af7-4e5a-8f3a-0a3dece42f77, .gfid/8ae32eec-f766-4cd9-a788-4561ba1fa435
>[2020-06-09 02:01:20.738632] I [master(worker /nodirectwritedata/gluster/gvol0):1954:syncjob] Syncer: Sync Time Taken duration=0.5588 num_files=7 job=2 return_code=0
>[2020-06-09 02:01:20.739650] D [master(worker /nodirectwritedata/gluster/gvol0):321:regjob] _GMaster: synced file=.gfid/0f98f9cd-1800-4c0f-b449-edcd7446bf17
>[2020-06-09 02:01:20.740041] D [master(worker /nodirectwritedata/gluster/gvol0):321:regjob] _GMaster: synced file=.gfid/512b4710-5af7-4e5a-8f3a-0a3dece42f77
>[2020-06-09 02:01:20.740200] D [master(worker /nodirectwritedata/gluster/gvol0):321:regjob] _GMaster: synced file=.gfid/779cd2b3-1571-446a-8903-48d6183d3dd0
>[2020-06-09 02:01:20.740343] D [master(worker /nodirectwritedata/gluster/gvol0):321:regjob] _GMaster: synced file=.gfid/8ae32eec-f766-4cd9-a788-4561ba1fa435
>[2020-06-09 02:01:20.740482] D [master(worker