Reviewed by: Sebastien Roy <>
Reviewed by: Matt Ahrens <>
Reviewed by: Prashanth Sreenivasa <>

Seen in a test suite run of checkpoint_big_rewind, which uses a nested pool.
Does not appear to reproduce easily.

Current theory is that the upper pool’s dirty/anon data is preventing the
lower pool from adding writes into its open context.
The upper pool cannot make progress (and clear its dirty data) until the
write I/O it sent to the lower pool completes.
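
A minimal sketch of the suspected interaction, assuming the throttle is the
single global comparison named in the commit title ('anon_size > arc_c/2').
The struct and function names below are hypothetical and are not taken from
arc.c; the real check may include additional terms.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of a global ARC write throttle. */
typedef struct arc_model {
	uint64_t arc_c;		/* ARC target size, in bytes */
	uint64_t anon_size;	/* anonymous (dirty) bytes, summed over all pools */
} arc_model_t;

/*
 * Returns true if a new write reservation has to wait. The test is global:
 * it never asks which pool owns the anonymous data.
 */
static bool
must_wait_global(const arc_model_t *arc)
{
	return (arc->anon_size > arc->arc_c / 2);
}

/*
 * Scenario from the theory above: the upper pool holds most of the anonymous
 * data and the lower pool tries to accept a write on its behalf.
 */
static bool
lower_pool_write_stalls(uint64_t arc_c, uint64_t upper_anon,
    uint64_t lower_anon)
{
	arc_model_t arc = { .arc_c = arc_c, .anon_size = upper_anon + lower_anon };

	return (must_wait_global(&arc));
}
```

In this model, lower_pool_write_stalls(1000, 600, 0) is true even though the
lower pool has no anonymous data of its own. The upper pool's 600 bytes can
only drain through the lower pool, so neither side can make progress.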

nestedpool (upper)
waiting for I/O to a vdev in testpool
lots of dirty/anon data, but syncing is stalled
spa_syncing_txg is 162
vdev io is getting throttled

cv_wait+0x70(ffffff03385afb78, ffffff03385afb70)
dsl_pool_sync+0xf9(ffffff033df2cb00, a2)
spa_sync+0x456(ffffff0342c18000, a2)

testpool (lower)
stalled in dmu_tx_assign
waiting for anon_size to shrink

spa_syncing_txg is 11,800,157 (in less than 30 minutes!)
  typically a run of this test completes in under 200 TXGs

Stacks (one for each file vdev used by the upper pool):
cv_wait+0x70(ffffff035368549e, ffffff0353685458)
txg_wait_open+0xcb(ffffff0353685280, b40e5f)
dmu_tx_assign+0x8a(ffffff0328bc87c0, 1)
zfs_write+0x561(ffffff0527861980, ffffff000cdf1a80, 0, ffffff0310e98db0, 0)
fop_write+0x5b(ffffff0527861980, ffffff000cdf1a80, 0, ffffff0310e98db0, 0)
vn_rdwr+0x27a(1, ffffff0527861980, ffffff032dfdc000, 800, ec0000, 1)
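
The txg arguments in the stacks are printed in hexadecimal, so they can be
checked against the decimal spa_syncing_txg values quoted above. A small
sketch, assuming the second argument of dsl_pool_sync(), spa_sync() and
txg_wait_open() is the txg; the helper name hex_arg is made up for
illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* Parse a hexadecimal stack argument as printed in the traces (no 0x prefix). */
static uint64_t
hex_arg(const char *s)
{
	return ((uint64_t)strtoull(s, NULL, 16));
}
```

hex_arg("a2") is 162, which matches the upper pool's spa_syncing_txg.
hex_arg("b40e5f") is 11,800,159, two past the lower pool's spa_syncing_txg of
11,800,157. The lower pool's writers are therefore waiting for a txg that has
not opened yet, while the pool keeps syncing txgs at a very high rate.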

Upstream bug: DLPX-53592

-- Commit Summary --

  * 9465 ARC check for 'anon_size > arc_c/2' can stall the system

-- File Changes --

    M usr/src/uts/common/fs/zfs/arc.c (48)
    M usr/src/uts/common/fs/zfs/dsl_dir.c (2)
    M usr/src/uts/common/fs/zfs/spa_misc.c (6)
    M usr/src/uts/common/fs/zfs/sys/arc.h (2)
    M usr/src/uts/common/fs/zfs/sys/spa.h (1)
    M usr/src/uts/common/fs/zfs/sys/spa_impl.h (4)
