On Fri, Sep 15, 2023 at 08:51:54AM -0400, Brian Foster wrote:
> Initial support for the vfs superblock freeze and unfreeze
> operations. Superblock freeze occurs in stages, where the vfs
> attempts to quiesce high level write operations, page faults, fs
> internal operations, and then finally calls into the filesystem for
> any last stage steps (i.e. log flushing, etc.) before marking the
> superblock frozen.
> 
> The majority of write paths are covered by freeze protection (i.e.
> sb_start_write() and friends) in higher level common code, with the
> exception of the fs-internal SB_FREEZE_FS stage (i.e.
> sb_start_intwrite()). This typically maps to active filesystem
> transactions in a manner that allows the vfs to implement a barrier
> of internal fs operations during the freeze sequence. This is not a
> viable model for bcachefs, however, because it utilizes transactions
> both to populate the journal as well as to perform journal reclaim.
> This means that mapping intwrite protection to transaction lifecycle
> or transaction commit is likely to deadlock freeze, as quiescing the
> journal requires transactional operations blocked by the final stage
> of freeze.
> 
> The flipside of this is that bcachefs does already maintain its own
> internal sets of write references for similar purposes, currently
> utilized for transitions from read-write to read-only mode. Since
> this largely mirrors the high level sequence involved with freeze,
> we can simply invoke this mechanism in the freeze callback to fully
> quiesce the filesystem in the final stage. This means that while the
> SB_FREEZE_FS stage is essentially a no-op, the ->freeze_fs()
> callback that immediately follows begins by performing effectively
> the same step by quiescing all internal write references.
> 
> One caveat to this approach is that without integration of internal
> freeze protection, write operations gated on internal write refs
> will fail with an internal -EROFS error rather than block on
> acquiring freeze protection. IOW, this is roughly equivalent to only
> having support for sb_start_intwrite_trylock(), and not the blocking
> variant. Many of these paths already use non-blocking internal write
> refs and so would map into an sb_start_intwrite_trylock() anyways.
> The only instance of this I've been able to uncover that doesn't
> explicitly rely on a higher level non-blocking write ref is the
> bch2_rbio_narrow_crcs() path, which updates crcs in certain read
> cases, and Kent has pointed out isn't critical if it happens to fail
> due to read-only status.
> 
> Given that, implement basic freeze support as described above and
> leave tighter integration with internal freeze protection as a
> possible future enhancement. There are multiple potential ideas
> worth exploring here. For example, we could implement a multi-stage
> freeze callback that might allow bcachefs to quiesce its internal
> write references without deadlocks, we could integrate intwrite
> protection with bcachefs' internal write references somehow or
> another, or perhaps consider implementing blocking support for
> internal write refs to be used specifically for freeze, etc. In the
> meantime, this enables functional freeze support and the associated
> test coverage that comes with it.
> 
> Signed-off-by: Brian Foster <[email protected]>

This is going to get a whole lot more test coverage and I'm pretty
excited about that - thanks, applied the series

Reply via email to