Hi,

The current implementation of vdev_fault() first marks the vdev
VDEV_STATE_FAULTED and only then does the replicas test, and the comment says
"Faulted state takes precedence over degraded". What is the reason for letting
"Faulted" take precedence? Why not do the replicas test first and mark the
vdev Faulted only if the replicas test succeeds?
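
To make the two orderings concrete, here is a toy C model. It is only a
sketch and not the actual ZFS code: toy_vdev_t, toy_state_t, the
has_other_replica flag (a stand-in for the result of the replicas test) and
both functions are hypothetical names made up for illustration. The only
thing it tries to show is that both orderings end in the same final state,
but the fault-first ordering makes FAULTED visible for a moment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified states; not the real vdev_state_t. */
typedef enum { TOY_HEALTHY, TOY_DEGRADED, TOY_FAULTED } toy_state_t;

typedef struct {
    toy_state_t state;
    bool has_other_replica; /* stand-in for the replicas test result */
    bool saw_faulted;       /* true if FAULTED was ever set, even briefly */
} toy_vdev_t;

static void toy_set_state(toy_vdev_t *vd, toy_state_t s)
{
    vd->state = s;
    if (s == TOY_FAULTED)
        vd->saw_faulted = true;
}

/* Ordering described above: fault first, then back off to DEGRADED. */
static void toy_fault_then_test(toy_vdev_t *vd)
{
    toy_set_state(vd, TOY_FAULTED);
    if (!vd->has_other_replica)
        toy_set_state(vd, TOY_DEGRADED);
}

/* Proposed ordering: test first, fault only if another replica exists. */
static void toy_test_then_fault(toy_vdev_t *vd)
{
    if (vd->has_other_replica)
        toy_set_state(vd, TOY_FAULTED);
    else
        toy_set_state(vd, TOY_DEGRADED);
}
```

For a single-disk pool (has_other_replica == false) both functions leave the
toy vdev DEGRADED, but only toy_fault_then_test() sets saw_faulted. Whether
anything else can actually observe that intermediate state in the real code
depends on the locking, which this model does not capture.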


I ran into this problem when using fminject to inject
ereport.io.scsi.disk.predictive-failure against a single-disk zpool. io-retire
in fmd is disabled (so the disk will not be disabled by devfs underneath zfs),
so I expected zfs (with the help of zfs-retire) to mark the vdev DEGRADED, and
the system behaved as I expected most of the time. But in one test the zpool
got suspended. I cannot remember the exact vdev state of the failed pool (it
might have been "REMOVED"), and I can't reproduce the problem.

Looking at the source code of vdev_fault(), I am wondering if there is a race
with the zio error recovery logic. That is, after the vdev is marked FAULTED
but before the replicas test starts, could the zio retry logic (or something
else) determine that the vdev was removed and suspend the zpool? (Since the
vdev state change is protected by the spa configuration lock, maybe the vdev
state is marked DEGRADED first by vdev_fault() and then marked REMOVED by zio?)


Thanks.

 

------------------------------------------
openzfs: openzfs-developer
Permalink: 
https://openzfs.topicbox.com/groups/developer/Tae6e0dd85f3e1f45-M3f9dd4395457145750997fca