Reviewed by: Matt Ahrens <[email protected]>
Reviewed by: Pavel Zakharov <[email protected]>

The timeline of the race condition is the following:
[1] Thread A is about to finish condesing the first vdev in 
spa_condense_indirect_thread(),
    so it calls the spa_condense_indirect_complete_sync() sync task which sets 
the
    spa_condensing_indirect field to NULL. Waiting for the sync task to finish, 
thread A
    sleeps until the txg is done. When this happens, thread A will acquire 
spa_async_lock
    and set spa_condense_thread to NULL.
[2] While thread A waits for the txg to finish, thread B which is running 
spa_sync() checks
    whether it should condense the second vdev in 
vdev_indirect_should_condense() by checking
    the spa_condensing_indirect field which was set to NULL by 
spa_condense_indirect_thread()
    from thread A. So it goes on and tries to spawn a new condensing thread in
    spa_condense_indirect_start_sync() and the aforementioned assertions fails 
because thread A
    has not set spa_condense_thread to NULL (which is basically the last thing 
it does before
    returning).

The main issue here is that we rely on both spa_condensing_indirect and 
spa_condense_thread to
signify whether a condensing thread is running. Ideally we would only use one 
throughout the
codebase. In addition, for managing spa_condense_thread we currently use 
spa_async_lock which
basically tights condensing to scrubing when it comes to pausing and resuming 
those actions
during spa export.

This commit introduces the ZTHR infrastructure, which is basically threads 
created during
spa_load()/spa_create() and exist until we export or destroy the pool. ZTHRs 
sleep the majority
of the time, until they are notified to wake up and do some predefined type of 
work.

In the context of the current bug, a zthr to does the condensing of indirect 
mappings replacing
the older code that used bare kthreads. When a pool is created, the condensing 
zthr is spawned
but sleeps right away, until it is awaken by a signal from spa_sync(). If an 
existing pool is
loaded, the condensing zthr looks if there is anything to condense before going 
to sleep, in
case we were condensing mappings in the pool before it got exported.

The benefits of this solution are the following:
- The current bug is fixed
- spa_condensing_indirect is the sole indicator of whether we are currently 
condensing or not
- condensing is more decoupled from the spa_async_thread related functionality.

As a final note, this commit also sets up the path on upstreaming other 
features that use
the ZTHR code like zpool checkpoint and fast clone deletion.

Related Bugs: DLPX-49690, DLPX-50734
You can view, comment on, or merge this pull request online at:

  https://github.com/openzfs/openzfs/pull/541

-- Commit Summary --

  * 9079 race condition in starting and ending condesing thread for indirect 
vdevs

-- File Changes --

    M usr/src/uts/common/Makefile.files (3)
    M usr/src/uts/common/fs/zfs/spa.c (49)
    M usr/src/uts/common/fs/zfs/sys/spa_impl.h (3)
    M usr/src/uts/common/fs/zfs/sys/vdev_removal.h (2)
    A usr/src/uts/common/fs/zfs/sys/zthr.h (52)
    M usr/src/uts/common/fs/zfs/vdev_indirect.c (93)
    A usr/src/uts/common/fs/zfs/zthr.c (319)

-- Patch Links --

https://github.com/openzfs/openzfs/pull/541.patch
https://github.com/openzfs/openzfs/pull/541.diff

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/openzfs/openzfs/pull/541

------------------------------------------
openzfs-developer
Archives: 
https://openzfs.topicbox.com/groups/developer/discussions/T01ad831dada0a527-M905e5afca6c9ae35166ca955
Powered by Topicbox: https://topicbox.com

Reply via email to