Reviewed by: Matt Ahrens <[email protected]>
Reviewed by: Pavel Zakharov <[email protected]>
The timeline of the race condition is the following:
[1] Thread A is about to finish condesing the first vdev in
spa_condense_indirect_thread(),
so it calls the spa_condense_indirect_complete_sync() sync task which sets
the
spa_condensing_indirect field to NULL. Waiting for the sync task to finish,
thread A
sleeps until the txg is done. When this happens, thread A will acquire
spa_async_lock
and set spa_condense_thread to NULL.
[2] While thread A waits for the txg to finish, thread B which is running
spa_sync() checks
whether it should condense the second vdev in
vdev_indirect_should_condense() by checking
the spa_condensing_indirect field which was set to NULL by
spa_condense_indirect_thread()
from thread A. So it goes on and tries to spawn a new condensing thread in
spa_condense_indirect_start_sync() and the aforementioned assertions fails
because thread A
has not set spa_condense_thread to NULL (which is basically the last thing
it does before
returning).
The main issue here is that we rely on both spa_condensing_indirect and
spa_condense_thread to
signify whether a condensing thread is running. Ideally we would only use one
throughout the
codebase. In addition, for managing spa_condense_thread we currently use
spa_async_lock which
basically tights condensing to scrubing when it comes to pausing and resuming
those actions
during spa export.
This commit introduces the ZTHR infrastructure, which is basically threads
created during
spa_load()/spa_create() and exist until we export or destroy the pool. ZTHRs
sleep the majority
of the time, until they are notified to wake up and do some predefined type of
work.
In the context of the current bug, a zthr to does the condensing of indirect
mappings replacing
the older code that used bare kthreads. When a pool is created, the condensing
zthr is spawned
but sleeps right away, until it is awaken by a signal from spa_sync(). If an
existing pool is
loaded, the condensing zthr looks if there is anything to condense before going
to sleep, in
case we were condensing mappings in the pool before it got exported.
The benefits of this solution are the following:
- The current bug is fixed
- spa_condensing_indirect is the sole indicator of whether we are currently
condensing or not
- condensing is more decoupled from the spa_async_thread related functionality.
As a final note, this commit also sets up the path on upstreaming other
features that use
the ZTHR code like zpool checkpoint and fast clone deletion.
Related Bugs: DLPX-49690, DLPX-50734
You can view, comment on, or merge this pull request online at:
https://github.com/openzfs/openzfs/pull/541
-- Commit Summary --
* 9079 race condition in starting and ending condesing thread for indirect
vdevs
-- File Changes --
M usr/src/uts/common/Makefile.files (3)
M usr/src/uts/common/fs/zfs/spa.c (49)
M usr/src/uts/common/fs/zfs/sys/spa_impl.h (3)
M usr/src/uts/common/fs/zfs/sys/vdev_removal.h (2)
A usr/src/uts/common/fs/zfs/sys/zthr.h (52)
M usr/src/uts/common/fs/zfs/vdev_indirect.c (93)
A usr/src/uts/common/fs/zfs/zthr.c (319)
-- Patch Links --
https://github.com/openzfs/openzfs/pull/541.patch
https://github.com/openzfs/openzfs/pull/541.diff
--
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/openzfs/openzfs/pull/541
------------------------------------------
openzfs-developer
Archives:
https://openzfs.topicbox.com/groups/developer/discussions/T01ad831dada0a527-M905e5afca6c9ae35166ca955
Powered by Topicbox: https://topicbox.com