> The gluster NFS log has this entry:
> [2020-01-25 19:07:33.085806] E [MSGID: 109040] 
> [dht-helper.c:1388:dht_migration_complete_check_task] 0-cm_shared-dht: 
> 19bd72f0-6863-4f1d-80dc-a426db8670b8: failed to lookup the file on 
> cm_shared-dht [Stale file handle]
> [2020-01-25 19:07:33.085848] W [MSGID: 112199] 
> [nfs3-helpers.c:3578:nfs3_log_commit_res] 0-nfs-nfsv3: 
> /image/images_rw_nfs/r41c4t1n1/rhel8.0/xfs-test.img => (XID: cb501b58, 
> COMMIT: NFS: 70(Invalid file handle), POSIX: 116(Stale file handle)), wverf: 
> 1579988225
> 

I've done more digging. I have access to an actual system that is
failing (instead of my test case) above. It appears to be the same
issue so that's good. (My access goes away in a couple hours).

The nodes don't hang at the mount, but rather, at a check in the code
for the existence of the image file. I'm not sure if the "holes" message
I share below is a problem or not, the file indeed does start sparse.

Restarting 'glusterd' on the problem server allows the node to boot.
However, it does seem like the problem image file disappears from the
face of the earth as far as I can tell (it doesn't exist in the gluster
mount to the same path).

Searching for all messages in nfs.log related to r17c3t6n3 (the problem
node with the problem nfs.img file), I see:

[root@leader1 glusterfs]# grep r17c3t6n3 nfs.log
[2020-01-24 12:29:42.412019] W [MSGID: 112199] 
[nfs3-helpers.c:3494:nfs3_log_write_res] 0-nfs-nfsv3: 
/image/images_rw_nfs/r17c3t6n3/rhel8.0/xfs.img => (XID: ca68a5fc, WRITE: NFS: 
70(Invalid file handle), POSIX: 116(Stale file handle)), count: 0, 
STABLE,wverf: 1579664973
[2020-01-25 04:57:10.199988] W [MSGID: 112199] 
[nfs3-helpers.c:3494:nfs3_log_write_res] 0-nfs-nfsv3: 
/image/images_rw_nfs/r17c3t6n3/rhel8.0/xfs.img => (XID: 1ec43ce0, WRITE: NFS: 
70(Invalid file handle), POSIX: 116(Stale file handle)), count: 0, 
STABLE,wverf: 1579664973 [Invalid argument]
[2020-01-25 04:57:10.200431] W [MSGID: 112199] 
[nfs3-helpers.c:3494:nfs3_log_write_res] 0-nfs-nfsv3: 
/image/images_rw_nfs/r17c3t6n3/rhel8.0/xfs.img => (XID: 20c43ce0, WRITE: NFS: 
70(Invalid file handle), POSIX: 116(Stale file handle)), count: 0, 
STABLE,wverf: 1579664973
[2020-01-25 04:57:10.200695] W [MSGID: 112199] 
[nfs3-helpers.c:3494:nfs3_log_write_res] 0-nfs-nfsv3: 
/image/images_rw_nfs/r17c3t6n3/rhel8.0/xfs.img => (XID: 21c43ce0, WRITE: NFS: 
70(Invalid file handle), POSIX: 116(Stale file handle)), count: 0, 
STABLE,wverf: 1579664973
[2020-01-25 04:57:10.200827] W [MSGID: 112199] 
[nfs3-helpers.c:3494:nfs3_log_write_res] 0-nfs-nfsv3: 
/image/images_rw_nfs/r17c3t6n3/rhel8.0/xfs.img => (XID: 1fc43ce0, WRITE: NFS: 
70(Invalid file handle), POSIX: 116(Stale file handle)), count: 0, 
STABLE,wverf: 1579664973
[2020-01-25 04:57:10.201808] W [MSGID: 112199] 
[nfs3-helpers.c:3494:nfs3_log_write_res] 0-nfs-nfsv3: 
/image/images_rw_nfs/r17c3t6n3/rhel8.0/xfs.img => (XID: 22c43ce0, WRITE: NFS: 
70(Invalid file handle), POSIX: 116(Stale file handle)), count: 0, 
STABLE,wverf: 1579664973 [Invalid argument]
[2020-01-25 23:32:09.629807] I [MSGID: 109063] 
[dht-layout.c:693:dht_layout_normalize] 0-cm_shared-dht: Found anomalies in 
/image/images_rw_nfs/r17c3t6n3/rhel8.0 (gfid = 
00000000-0000-0000-0000-000000000000). Holes=1 overlaps=0
[2020-01-26 02:42:33.712684] W [MSGID: 112199] 
[nfs3-helpers.c:3494:nfs3_log_write_res] 0-nfs-nfsv3: 
/image/images_rw_nfs/r17c3t6n3/rhel8.0/xfs.img => (XID: a0ca8fc3, WRITE: NFS: 
70(Invalid file handle), POSIX: 116(Stale file handle)), count: 0, 
STABLE,wverf: 1579664973


r17c3t4n1 is another case:



[2020-01-25 23:19:46.729427] I [MSGID: 109063] 
[dht-layout.c:693:dht_layout_normalize] 0-cm_shared-dht: Found anomalies in 
/image/images_rw_nfs/r17c3t4n1/rhel8.0 (gfid = 
00000000-0000-0000-0000-000000000000). Holes=1 overlaps=0
[2020-01-26 02:42:43.907163] W [MSGID: 112199] 
[nfs3-helpers.c:3494:nfs3_log_write_res] 0-nfs-nfsv3: 
/image/images_rw_nfs/r17c3t4n1/rhel8.0/xfs.img => (XID: a77e7e7d, WRITE: NFS: 
70(Invalid file handle), POSIX: 116(Stale file handle)), count: 0, 
STABLE,wverf: 1579664973
[2020-01-26 02:42:43.908045] W [MSGID: 112199] 
[nfs3-helpers.c:3494:nfs3_log_write_res] 0-nfs-nfsv3: 
/image/images_rw_nfs/r17c3t4n1/rhel8.0/xfs.img => (XID: a87e7e7d, WRITE: NFS: 
70(Invalid file handle), POSIX: 116(Stale file handle)), count: 0, 
STABLE,wverf: 1579664973
[2020-01-26 02:42:43.908194] W [MSGID: 112199] 
[nfs3-helpers.c:3494:nfs3_log_write_res] 0-nfs-nfsv3: 
/image/images_rw_nfs/r17c3t4n1/rhel8.0/xfs.img => (XID: a67e7e7d, WRITE: NFS: 
70(Invalid file handle), POSIX: 116(Stale file handle)), count: 0, 
STABLE,wverf: 1579664973



________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Reply via email to