Reviewed by: Sebastien Roy <sebastien....@delphix.com>
Reviewed by: Matt Ahrens <m...@delphix.com>
Reviewed by: Prashanth Sreenivasa <p...@delphix.com>

Seen in a test suite run of checkpoint_big_rewind, which uses a nested pool.
The problem does not appear to reproduce easily.

The current theory is that the upper pool’s dirty/anon data is preventing the
lower pool from adding writes into its open context.
The upper pool cannot make progress (and clear its dirty data) until the
write I/O it sent to the lower pool completes, so neither pool can move forward.
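
The theory above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the code in arc.c: the `arc_model_t` type and the `model_*` functions are hypothetical names, and the check is reduced to the 'anon_size > arc_c/2' condition named in the commit summary. It shows how a limit computed over the anon data of all pools can reject a lower-pool write for as long as the upper pool holds the dirty data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of ARC state shared by every pool. */
typedef struct arc_model {
	uint64_t arc_c;		/* target ARC size, in bytes */
	uint64_t anon_size;	/* dirty (anon) bytes summed over ALL pools */
} arc_model_t;

/*
 * Returns true if a write of `reserve` bytes may enter open context.
 * The check is global: it cannot tell which pool owns the anon data.
 */
static bool
model_tempreserve_ok(const arc_model_t *arc, uint64_t reserve)
{
	assert(arc->arc_c > 0);
	return (reserve + arc->anon_size <= arc->arc_c / 2);
}

/*
 * Counts how many open TXGs the lower pool cycles through before its
 * write is admitted, giving up after max_txgs. In this model anon_size
 * never shrinks, because the upper pool that owns the dirty data is
 * itself waiting for this very write to complete.
 */
static uint64_t
model_lower_pool_retries(const arc_model_t *arc, uint64_t reserve,
    uint64_t max_txgs)
{
	uint64_t txgs = 0;

	while (txgs < max_txgs && !model_tempreserve_ok(arc, reserve))
		txgs++;		/* wait for the next open txg, then retry */
	return (txgs);
}
```

With little anon data the write is admitted immediately. Once the upper pool's dirty data pushes anon_size past half of arc_c, the retry loop only ends at the cap, which matches the runaway TXG count reported for testpool.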

nestedpool (upper)
waiting for I/O to a vdev in testpool
lots of dirty/anon data, but syncing is stalled
spa_syncing_txg is 162
vdev I/O is getting throttled
Stack:

swtch+0x141()
cv_wait+0x70(ffffff03385afb78, ffffff03385afb70)
zio_wait+0xbb(ffffff03385af800)
dsl_pool_sync+0xf9(ffffff033df2cb00, a2)
spa_sync+0x456(ffffff0342c18000, a2)
txg_sync_thread+0x260(ffffff033df2cb00)
thread_start+8()

testpool (lower)
stalled in dmu_tx_assign
waiting for anon_size to shrink
spa_syncing_txg is 11,800,157 (in less than 30 minutes!)
  typically a run of this test completes in under 200 TXGs
Stacks (one for each file vdev used by the upper pool):

swtch+0x141()
cv_wait+0x70(ffffff035368549e, ffffff0353685458)
txg_wait_open+0xcb(ffffff0353685280, b40e5f)
dmu_tx_wait+0x1d8(ffffff0328bc87c0)
dmu_tx_assign+0x8a(ffffff0328bc87c0, 1)
zfs_write+0x561(ffffff0527861980, ffffff000cdf1a80, 0, ffffff0310e98db0, 0)
fop_write+0x5b(ffffff0527861980, ffffff000cdf1a80, 0, ffffff0310e98db0, 0)
vn_rdwr+0x27a(1, ffffff0527861980, ffffff032dfdc000, 800, ec0000, 1)
vdev_file_io_strategy+0x65(ffffff033fd71380)
taskq_d_thread+0xb7(ffffff0322000568)
thread_start+8()
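
As a cross-check, the hexadecimal arguments in the two stacks can be compared with the reported spa_syncing_txg values. The sketch below assumes that the second argument of dsl_pool_sync(), spa_sync() and txg_wait_open() is the txg; the constant and helper names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Values copied from the stacks and pool summaries above. */
static const uint64_t upper_sync_arg = 0xa2;		/* dsl_pool_sync(..., a2) */
static const uint64_t lower_wait_arg = 0xb40e5f;	/* txg_wait_open(..., b40e5f) */
static const uint64_t lower_syncing_txg = 11800157;	/* testpool spa_syncing_txg */

/* How far ahead of the syncing txg the waited-for txg is. */
static uint64_t
txgs_ahead(uint64_t waited_txg, uint64_t syncing_txg)
{
	return (waited_txg - syncing_txg);
}

/* Lower bound on the TXG rate, given an upper bound on elapsed time. */
static uint64_t
min_txgs_per_second(uint64_t txgs, uint64_t seconds)
{
	return (txgs / seconds);
}
```

Under that assumption, 0xa2 is 162, which matches the upper pool's syncing txg, and 0xb40e5f is 11,800,159, two TXGs ahead of the lower pool's syncing txg. With 30 minutes taken as 1800 seconds, the lower pool advanced by at least 6555 TXGs per second.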

Upstream bug: DLPX-53592

You can view, comment on, or merge this pull request online at:

  https://github.com/openzfs/openzfs/pull/617

-- Commit Summary --

  * 9465 ARC check for 'anon_size > arc_c/2' can stall the system

-- File Changes --

    M usr/src/uts/common/fs/zfs/arc.c (48)
    M usr/src/uts/common/fs/zfs/dsl_dir.c (2)
    M usr/src/uts/common/fs/zfs/spa_misc.c (6)
    M usr/src/uts/common/fs/zfs/sys/arc.h (2)
    M usr/src/uts/common/fs/zfs/sys/spa.h (1)
    M usr/src/uts/common/fs/zfs/sys/spa_impl.h (4)

-- Patch Links --

https://github.com/openzfs/openzfs/pull/617.patch
https://github.com/openzfs/openzfs/pull/617.diff

------------------------------------------
openzfs: openzfs-developer
Permalink: 
https://openzfs.topicbox.com/groups/developer/discussions/T3e7a78b326af63d4-Maf89a1522dc668c66a6ff56d