While investigating a performance issue affecting timeshares at our institution (which I am provisionally blaming on other clients driving up IO load on the fileservers), I encountered a rerun of an issue that's been reported on openafs-info twice before:
[42342.692729] afs: disk cache read error in CacheItems slot 100849 off 8067940/8750020 code -5/80
(repeated)

But this one ends differently than
https://lists.openafs.org/pipermail/openafs-info/2018-October/042576.html or
https://lists.openafs.org/pipermail/openafs-info/2020-April/042930.html:

[42342.697743] afs: Failed to invalidate cache chunks for fid NNN.NNN.NNN.NNN; our local disk cache may be throwing errors. We must invalidate these chunks to avoid possibly serving incorrect data, so we'll retry until we succeed. If AFS access seems to hang, this may be why.
[42342.697771] openafs: assertion failed: WriteLocked(&tvc->lock), file: /var/lib/dkms/openafs/1.8.6-2.el7_9/build/src/libafs/MODLOAD-3.10.0-1160.6.1.el7.x86_64-SP/afs_daemons.c, line: 606

The first thing I'm going to assert is that this isn't a hardware error: it affects multiple virtual systems, and the kernel logs no IO errors. My assertion is that the EIO is coming from osi_rdwr, which turns a short read or write into EIO (the first sketch at the end of this message illustrates the pattern).

The working theory of myself and others who have looked at this is that the source of the problem is using ext4 as a cache (and perhaps also the dedicated cache filesystem being >80% full), and we're remediating that on these systems. This still leaves us with two problems in openafs:

* The use of EIO, leading to claims that people have hardware errors when they may not.
* The lock breakage.

For the former, I'd recommend either logging the short IOs or returning a different code (perhaps ENODATA, if available?) to differentiate them from hardware errors; the first sketch below shows both options.

For the latter, I believe there's an inconsistency in the locking requirements of afs_InvalidateAllSegments. This comment claims the lock is held:

/*
 * Ask a background daemon to do this request for us. Note that _we_ hold
 * the write lock on 'avc', while the background daemon does the work. This
 * is a little weird, but it helps avoid any issues with lock ordering
 * or if our caller does not expect avc->lock to be dropped while
 * running.
 */

When called from afs_StoreAllSegments's error path, avc->lock is clearly held, because afs_StoreAllSegments itself downgrades and upgrades the lock. When called from afs_dentry_iput via afs_InactiveVCache, it seems that it isn't: none of the callers on any platform seems to lock the vcache before calling inactive (unless on some platforms there's aliasing between a VFS-level lock and vc->lock), and afs_remunlink expects to be called with avc unlocked. The second sketch below mirrors the shape of this mismatch.
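To make the first point concrete, here is a small standalone sketch. It is not the actual osi_rdwr code; cache_read and its signature are made up for illustration. It just shows the pattern described above, where a short read is collapsed into EIO and so becomes indistinguishable from a hardware error, plus the alternative of logging the short IO and returning a distinct code such as ENODATA:

/*
 * Illustrative sketch only -- not OpenAFS code. Demonstrates how a short
 * read can be reported as EIO, and how it could instead be logged and
 * returned as a code distinct from hardware errors.
 */
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

static int
cache_read(int fd, void *buf, size_t len, off_t off)
{
    ssize_t got = pread(fd, buf, len, off);

    if (got < 0)
        return -errno;          /* genuine I/O error reported by the kernel */

    if ((size_t)got != len) {
        /* Behaviour as described above: a short read becomes EIO, so
         * callers and admins cannot tell it apart from bad hardware. */
        return -EIO;

        /*
         * Suggested alternative: log the short I/O and return a code that
         * is distinct from hardware errors, e.g. ENODATA where available:
         *
         *   fprintf(stderr, "short cache read: wanted %zu got %zd at %lld\n",
         *           len, got, (long long)off);
         *   return -ENODATA;
         */
    }
    return 0;
}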
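And to make the locking mismatch concrete, here is a minimal userspace analogue (pthread rwlock, placeholder names, not OpenAFS code). invalidate_segments stands in for afs_InvalidateAllSegments and asserts the caller already holds the write lock, as the quoted comment requires; store_error_path follows that convention, while inactive_path does not and trips the assertion -- the same shape as the WriteLocked(&tvc->lock) panic in afs_daemons.c:

#include <assert.h>
#include <pthread.h>

struct vnode_like {
    pthread_rwlock_t lock;
    int write_held;              /* stand-in for WriteLocked(&avc->lock) */
};

static void
invalidate_segments(struct vnode_like *v)
{
    assert(v->write_held);       /* analogue of the failed assertion */
    /* ... invalidate cache chunks, retrying on error ... */
}

static void
store_error_path(struct vnode_like *v)
{
    /* Analogue of afs_StoreAllSegments's error path: lock is held. */
    pthread_rwlock_wrlock(&v->lock);
    v->write_held = 1;
    invalidate_segments(v);
    v->write_held = 0;
    pthread_rwlock_unlock(&v->lock);
}

static void
inactive_path(struct vnode_like *v)
{
    /* Analogue of afs_dentry_iput -> afs_InactiveVCache: no lock taken. */
    invalidate_segments(v);      /* aborts on the assert above */
}

int
main(void)
{
    struct vnode_like v = { PTHREAD_RWLOCK_INITIALIZER, 0 };

    store_error_path(&v);        /* fine: convention followed */
    inactive_path(&v);           /* mirrors the reported panic */
    return 0;
}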