Author: mav
Date: Mon Apr 16 04:11:48 2018
New Revision: 332548

  MFC r331703: MFV 331702:
  9187 racing condition between vdev label and spa_last_synced_txg in 
  ztest failed with uncorrectable IO error despite having the fix for #7163.
  Both sides of the mirror have CANT_OPEN_BAD_LABEL, which also distinguishes
  it from that issue.
  Definitely seems like a racing condition between the vdev_validate and 
  1. Thread A (spa_sync): vdev label is updated to latest txg
  2. Thread B (vdev_validate): vdev label's txg is compared to
     spa_last_synced_txg and is ahead.
  3. Thread A (spa_sync): spa_last_synced_txg is updated to latest txg.
  Solution: do not check txg in vdev_validate unless config lock is held.
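
  The interleaving in steps 1-3 can be replayed with a small standalone
  model. This is only a sketch, not the kernel code: struct pool_state,
  validate_txg_limit(), label_accepted() and replay_race() are hypothetical
  names, and the rule "a label is accepted when its txg does not exceed the
  limit" is a simplifying assumption about how the chosen txg is used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the spa_t state that the check consults. */
struct pool_state {
	bool     extreme_rewind;     /* models spa->spa_extreme_rewind */
	uint64_t last_synced_txg;    /* models spa_last_synced_txg(spa) */
	bool     config_writer_held; /* models spa_config_held(...) == SCL_CONFIG */
};

/* Upper bound on an acceptable label txg, mirroring the patched condition. */
static uint64_t
validate_txg_limit(const struct pool_state *ps)
{
	if (ps->extreme_rewind || ps->last_synced_txg == 0 ||
	    !ps->config_writer_held)
		return (UINT64_MAX);
	return (ps->last_synced_txg);
}

/* Assumption: a label is accepted when its txg does not exceed the limit. */
static bool
label_accepted(uint64_t label_txg, uint64_t limit)
{
	return (label_txg <= limit);
}

/*
 * Replay steps 1 and 2: the label already carries txg N+1 while
 * last_synced_txg is still N. Returns whether the label is accepted.
 */
static bool
replay_race(bool patched, bool config_writer_held)
{
	struct pool_state ps = { false, 100, config_writer_held };
	uint64_t label_txg = ps.last_synced_txg + 1;	/* step 1 */
	uint64_t limit;

	/* With last_synced_txg == 0 the check was already skipped. */
	assert(ps.last_synced_txg != 0);

	if (patched)
		limit = validate_txg_limit(&ps);
	else
		limit = ps.last_synced_txg;	/* old check ignores the lock */

	return (label_accepted(label_txg, limit));	/* step 2 */
}
```

  In this model replay_race(false, false) rejects the label, which
  corresponds to the spurious CANT_OPEN_BAD_LABEL, while
  replay_race(true, false) accepts it because the txg comparison is
  skipped when the config lock is not held as writer.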
  Reviewed by: George Wilson
  Reviewed by: Matt Ahrens
  Approved by: Robert Mustacchi
  Author: Pavel Zakharov

Directory Properties:
  stable/11/   (props changed)

Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c
--- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c     Mon Apr 16 04:10:56 2018        (r332547)
+++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c     Mon Apr 16 04:11:48 2018        (r332548)
@@ -1696,8 +1696,11 @@ vdev_validate(vdev_t *vd)
        /*
         * If we are performing an extreme rewind, we allow for a label that
         * was modified at a point after the current txg.
+        * If config lock is not held do not check for the txg. spa_sync could
+        * be updating the vdev's label before updating spa_last_synced_txg.
         */
-       if (spa->spa_extreme_rewind || spa_last_synced_txg(spa) == 0)
+       if (spa->spa_extreme_rewind || spa_last_synced_txg(spa) == 0 ||
+           spa_config_held(spa, SCL_CONFIG, RW_WRITER) != SCL_CONFIG)
                txg = UINT64_MAX;
        else
                txg = spa_last_synced_txg(spa);
