Hi,

The current implementation of vdev_fault() marks the vdev VDEV_STATE_FAULTED first and then does the replicas test, and the comment says "Faulted state takes precedence over degraded". What is the reason for letting "Faulted" take precedence? Why not do the replicas test first and mark the vdev FAULTED only if the replicas test succeeds?
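To make the two orderings concrete, here is a minimal C sketch of what I am comparing. It is a toy model, not the actual OpenZFS code: the types, fields, and function names (model_vdev_t, has_other_replicas, fault_then_test, test_then_fault) are hypothetical stand-ins, and it assumes the replicas test simply answers "does another valid copy of the data exist?". It also ignores locking entirely, so it only shows that the first ordering passes through a FAULTED state, not whether anything else can observe it.

```c
#include <stdbool.h>

/* Simplified stand-ins; these are NOT the real vdev_state_t / vdev_t. */
typedef enum { ST_HEALTHY, ST_DEGRADED, ST_FAULTED } model_state_t;

typedef struct {
    model_state_t state;
    bool has_other_replicas; /* assumed result of the replicas test */
    bool readable;           /* assumed: device is still readable after reopen */
    bool saw_faulted;        /* set if FAULTED was ever the visible state */
} model_vdev_t;

static void model_set_state(model_vdev_t *vd, model_state_t s)
{
    vd->state = s;
    if (s == ST_FAULTED)
        vd->saw_faulted = true;
}

/* Ordering described above: mark FAULTED first, then test replicas. */
void fault_then_test(model_vdev_t *vd)
{
    model_set_state(vd, ST_FAULTED);
    /* Back off to DEGRADED if this is the only copy and it is readable. */
    if (!vd->has_other_replicas && vd->readable)
        model_set_state(vd, ST_DEGRADED);
}

/* Alternative ordering: test replicas first, fault only if safe. */
void test_then_fault(model_vdev_t *vd)
{
    if (!vd->has_other_replicas && vd->readable)
        model_set_state(vd, ST_DEGRADED);
    else
        model_set_state(vd, ST_FAULTED);
}
```

For a readable single-disk pool (has_other_replicas = false, readable = true), both functions end in ST_DEGRADED, but only fault_then_test() sets saw_faulted. That intermediate state is what my question is about.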
I ran into this while using fminject to inject ereport.io.scsi.disk.predictive-failure against a single-disk zpool, with fmd's io-retire disabled (so that the disk is not disabled by devfs underneath ZFS). I expected ZFS, with the help of zfs-retire, to mark the vdev DEGRADED, and most of the time the system behaves as I expected. In one test, however, the zpool was suspended. I cannot remember the exact vdev state of the failed pool (it might have been "REMOVED"), and I can't reproduce the problem.

Looking at the source code of vdev_fault(), I wonder whether there is a race with the zio error recovery logic. That is, after the vdev is marked FAULTED but before the replicas test starts, could the zio retry logic (or something else) decide that the vdev was removed and suspend the zpool? (Since vdev state changes are protected by the spa configuration lock, maybe the vdev state was first marked DEGRADED by vdev_fault() and then marked REMOVED by zio?)

Thanks.