While investigating a performance issue affecting timeshares at our institution 
(which I am provisionally blaming on other clients driving up IO load on the 
fileservers), I encountered a rerun of an issue that's been reported on 
openafs-info twice before:

[42342.692729] afs: disk cache read error in CacheItems slot 100849 off 
8067940/8750020 code -5/80
(repeated)

But this one ends differently from 
https://lists.openafs.org/pipermail/openafs-info/2018-October/042576.html or 
https://lists.openafs.org/pipermail/openafs-info/2020-April/042930.html

[42342.697743] afs: Failed to invalidate cache chunks for fid NNN.NNN.NNN.NNN; 
our local disk cache may be throwing errors. We must invalidate these chunks to 
avoid possibly serving incorrect data, so we'll retry until we succeed. If AFS 
access seems to hang, this may be why.
[42342.697771] openafs: assertion failed: WriteLocked(&tvc->lock), file: 
/var/lib/dkms/openafs/1.8.6-2.el7_9/build/src/libafs/MODLOAD-3.10.0-1160.6.1.el7.x86_64-SP/afs_daemons.c,
 line: 606

The first thing I'll assert is that this isn't a hardware error: it affects 
multiple virtual systems, and no IO errors are logged by the kernel.
My belief is that the EIO is coming from osi_rdwr, which turns a short read 
or write into EIO. My supposition, shared by others who have looked at this, 
is that the root cause is using ext4 as the cache filesystem (perhaps 
compounded by the dedicated cache filesystem being >80% full), and we're 
remediating that on these systems.
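
To illustrate what I mean (this is only a sketch of the pattern, not the 
actual osi_rdwr source), the wrapper style I'm describing looks roughly like 
this:

    /* Sketch only, not the actual osi_rdwr code: the pattern is a wrapper
     * that reports a short read (or write) with the same code as a real
     * device error, even though the filesystem returned no error at all. */
    #include <errno.h>
    #include <unistd.h>

    static int
    sketch_cache_read(int fd, void *buf, size_t len)
    {
        ssize_t got = read(fd, buf, len);

        if (got < 0)
            return -errno;      /* a genuine IO error from the kernel */
        if ((size_t)got != len)
            return -EIO;        /* short read reported as if it were EIO */
        return 0;
    }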


This does leave us with two problems in openafs:

  *   The use of EIO, which leads people to suspect hardware errors when 
there may be none.
  *   The lock breakage.

For the former, I'd recommend that either the short IOs be logged, or that a 
different code (perhaps ENODATA, if available?) be used to differentiate the 
case from hardware errors.
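
For example (again just a sketch; the function name is mine, and I haven't 
checked which of our platforms define ENODATA), the same wrapper could log 
the short IO and return a distinguishable code:

    /* Hypothetical variant: log the short IO and, where the platform
     * defines ENODATA, return it instead of EIO so the cache message can
     * be told apart from a device-level error. */
    #include <errno.h>
    #include <stdio.h>
    #include <unistd.h>

    static int
    sketch_cache_read_logged(int fd, void *buf, size_t len)
    {
        ssize_t got = read(fd, buf, len);

        if (got < 0)
            return -errno;
        if ((size_t)got != len) {
            fprintf(stderr, "afs: short cache read, wanted %zu got %zd\n",
                    len, got);
    #ifdef ENODATA
            return -ENODATA;
    #else
            return -EIO;
    #endif
        }
        return 0;
    }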

For the latter, I believe there's an inconsistency in the locking 
requirements of afs_InvalidateAllSegments.
This comment claims the lock is held:
        /*
         * Ask a background daemon to do this request for us. Note that _we_ hold
         * the write lock on 'avc', while the background daemon does the work. This
         * is a little weird, but it helps avoid any issues with lock ordering
         * or if our caller does not expect avc->lock to be dropped while
         * running.
         */
When called from afs_StoreAllSegments's error path, avc->lock is clearly held, 
because StoreAllSegments itself downgrades and upgrades the lock.
When called from afs_dentry_iput via afs_InactiveVCache, it seems like it 
isn't: none of the callers on any platform appears to take the vcache lock 
before calling inactive (unless on some platforms there's aliasing between a 
VFS-level lock and vc->lock).
afs_remunlink expects to be called with avc unlocked.
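
If the comment is taken as the contract, one direction (sketched from memory, 
not a tested patch; the trace number and whether this is safe against lock 
ordering are my assumptions) would be for the inactive path to take the write 
lock itself before invalidating:

    /* Sketch only: make the afs_InactiveVCache path satisfy the documented
     * precondition, so the WriteLocked(&tvc->lock) assertion in
     * afs_daemons.c holds for every caller. */
    ObtainWriteLock(&avc->lock, 999);   /* 999 is a placeholder trace id */
    afs_InvalidateAllSegments(avc);
    ReleaseWriteLock(&avc->lock);

Or the comment and the assert could be relaxed instead; I don't know which 
way is intended.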
