On Fri, Sep 15, 2023 at 08:51:54AM -0400, Brian Foster wrote: > Initial support for the vfs superblock freeze and unfreeze > operations. Superblock freeze occurs in stages, where the vfs > attempts to quiesce high level write operations, page faults, fs > internal operations, and then finally calls into the filesystem for > any last stage steps (i.e. log flushing, etc.) before marking the > superblock frozen. > > The majority of write paths are covered by freeze protection (i.e. > sb_start_write() and friends) in higher level common code, with the > exception of the fs-internal SB_FREEZE_FS stage (i.e. > sb_start_intwrite()). This typically maps to active filesystem > transactions in a manner that allows the vfs to implement a barrier > of internal fs operations during the freeze sequence. This is not a > viable model for bcachefs, however, because it utilizes transactions > both to populate the journal as well as to perform journal reclaim. > This means that mapping intwrite protection to transaction lifecycle > or transaction commit is likely to deadlock freeze, as quiescing the > journal requires transactional operations blocked by the final stage > of freeze. > > The flipside of this is that bcachefs does already maintain its own > internal sets of write references for similar purposes, currently > utilized for transitions from read-write to read-only mode. Since > this largely mirrors the high level sequence involved with freeze, > we can simply invoke this mechanism in the freeze callback to fully > quiesce the filesystem in the final stage. This means that while the > SB_FREEZE_FS stage is essentially a no-op, the ->freeze_fs() > callback that immediately follows begins by performing effectively > the same step by quiescing all internal write references. > > One caveat to this approach is that without integration of internal > freeze protection, write operations gated on internal write refs > will fail with an internal -EROFS error rather than block on > acquiring freeze protection. IOW, this is roughly equivalent to only > having support for sb_start_intwrite_trylock(), and not the blocking > variant. Many of these paths already use non-blocking internal write > refs and so would map into an sb_start_intwrite_trylock() anyways. > The only instance of this I've been able to uncover that doesn't > explicitly rely on a higher level non-blocking write ref is the > bch2_rbio_narrow_crcs() path, which updates crcs in certain read > cases, and Kent has pointed out isn't critical if it happens to fail > due to read-only status. > > Given that, implement basic freeze support as described above and > leave tighter integration with internal freeze protection as a > possible future enhancement. There are multiple potential ideas > worth exploring here. For example, we could implement a multi-stage > freeze callback that might allow bcachefs to quiesce its internal > write references without deadlocks, we could integrate intwrite > protection with bcachefs' internal write references somehow or > another, or perhaps consider implementing blocking support for > internal write refs to be used specifically for freeze, etc. In the > meantime, this enables functional freeze support and the associated > test coverage that comes with it. > > Signed-off-by: Brian Foster <[email protected]>
This is going to get a whole lot more test coverage and I'm pretty excited about that - thanks, applied the series
