I sent this out almost two weeks ago but haven't heard back on whether this behavior is intentional. I suspect many people were out around then because of the US July 4 holiday, so I'm posting the small patch I came up with to see if anyone has feedback. If I can get a definitive answer that this behavior is a bug, I'll open a bug report and work on getting the patch integrated.

--- a/usr/src/uts/common/fs/zfs/spa.c
+++ b/usr/src/uts/common/fs/zfs/spa.c
@@ -6535,6 +6535,7 @@ spa_vdev_resilver_done_hunt(vdev_t *vd)
 
 	/*
 	 * Check for a completed resilver with the 'unspare' flag set.
+	 * Also potentially update faulted state.
 	 */
 	if (vd->vdev_ops == &vdev_spare_ops) {
 		vdev_t *first = vd->vdev_child[0];
@@ -6557,6 +6558,26 @@ spa_vdev_resilver_done_hunt(vdev_t *vd)
 			return (oldvd);
 
 		/*
+		 * We know the vdev is a spare (e.g. "spare-1") which just
+		 * finished resilvering. If it's faulted, and one of the
+		 * children is healthy, then set the spare's state to degraded
+		 * so that it will handle read operations.
+		 */
+		if (vd->vdev_state == VDEV_STATE_FAULTED &&
+		    vd->vdev_children >= 2) {
+			int i;
+
+			for (i = 0; i < vd->vdev_children; i++) {
+				if (vd->vdev_child[i]->vdev_state ==
+				    VDEV_STATE_HEALTHY) {
+					vdev_set_state(vd, B_FALSE,
+					    VDEV_STATE_DEGRADED, VDEV_AUX_NONE);
+					break;
+				}
+			}
+		}
+
+		/*
 		 * If there are more than two spares attached to a disk,
 		 * and those spares are not required, then we want to
 		 * attempt to free them up now so that they can be used

Thanks,
Jerry

On Thu, Jul 5, 2018 at 10:29 AM, Jerry Jelinek <jerry.jeli...@joyent.com> wrote:

> Is it intentional that ZFS does not read from a hot spare under a mirror,
> even after the resilver has completed?
>
> We have observed this behavior and I spent some time digging into it. Here
> is an example of the zpool status once the resilver has completed:
>
>         NAME         STATE     READ WRITE CKSUM
>         jjpool       DEGRADED     0     0     0
>           mirror-0   DEGRADED     0     0     0
>             c1t2d0   ONLINE       0     0     0
>             spare-1  FAULTED      0     0     0
>               c1t3d0 FAULTED      0     0     0  too many errors
>               c1t4d0 ONLINE       0     0     0
>         spares
>           c1t4d0     INUSE     currently in use
>
> Because the "spare-1" vdev stays in the "faulted" state, vdev_readable()
> always returns false for that vdev and all of the read workload is going to
> the other side of the mirror.
>
> I've tried to determine if this is a regression, intentional, or just an
> oversight, but I am not sure.
>
> If this is an oversight, I have a small change which will change the
> "spare" vdev state to "degraded" once the hot-spare resilver has completed.
> This allows reads to go to the healthy hot-spare under the "spare-1" vdev.
> If that seems reasonable, I could get the patch ready for review.
>
> Thanks,
> Jerry

------------------------------------------
openzfs: openzfs-developer
Permalink: https://openzfs.topicbox.com/groups/developer/Tc7b699f05a31ea88-Mb2c1261756537f66dfadfc13
Delivery options: https://openzfs.topicbox.com/groups
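
For anyone who wants to reason about the state transition without building the kernel, here is a minimal, self-contained sketch of the logic described above. It is only a toy model: the sk_* types and functions are hypothetical stand-ins, not the real vdev_t, vdev_readable() or vdev_set_state(), and it assumes that a vdev is readable only when its state is degraded or better, which is how I read the behavior described in the quoted message.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model only. These names are hypothetical and are NOT the real
 * ZFS definitions; the ordering mirrors the assumption that FAULTED
 * sorts below DEGRADED, which sorts below HEALTHY.
 */
typedef enum {
	SK_STATE_FAULTED,
	SK_STATE_DEGRADED,
	SK_STATE_HEALTHY
} sk_state_t;

typedef struct sk_vdev {
	sk_state_t	state;
	size_t		nchildren;
	struct sk_vdev	**child;
} sk_vdev_t;

/* Assumed rule: a vdev can serve reads only if it is degraded or better. */
static bool
sk_readable(const sk_vdev_t *vd)
{
	return (vd->state >= SK_STATE_DEGRADED);
}

/*
 * Model of the proposed change: once the spare's resilver is done, a
 * faulted spare vdev with at least two children and at least one
 * healthy child is moved to degraded so that it can serve reads.
 */
static void
sk_spare_resilver_done(sk_vdev_t *vd)
{
	if (vd->state != SK_STATE_FAULTED || vd->nchildren < 2)
		return;

	for (size_t i = 0; i < vd->nchildren; i++) {
		if (vd->child[i]->state == SK_STATE_HEALTHY) {
			vd->state = SK_STATE_DEGRADED;
			break;
		}
	}
}
```

Modeling "spare-1" as a faulted parent with one faulted child (c1t3d0) and one healthy child (c1t4d0), sk_readable() is false before calling sk_spare_resilver_done() and true afterwards, which is the change in read behavior the patch is after.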