On Tue, Dec 05, 2023 at 08:24:39AM -0500, Brian Foster wrote:
> When investigating transient failures of generic/441 on bcachefs, it
> was determined that the cause of the failure was a combination of
> unconditional emergency shutdown and racing between background
> journal activity and the test switchover from a working device
> mapper table to an error injecting table.
>
> Part of the reason for this sequence of events is that bcachefs
> aggressively flushes as much as possible during fsync(), regardless
> of errors. While this is reasonable behavior, it is technically
> unnecessary because once an error is returned from fsync(), the
> caller cannot make any assumptions about the resilience of data.
>
> Tweak the bch2_fsync() logic to return an error on failure of any of
> the steps involved in the flush. Note that this change alone does
> not prevent generic/441 failure, but in combination with a test
> tweak to avoid racing during the dm-error table switchover it avoids
> the unnecessary shutdowns and allows the test to pass reliably on
> bcachefs.
>
> Signed-off-by: Brian Foster <[email protected]>
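For readers following along, the behavioral difference described above can be sketched roughly as below. This is only an illustration of "flush everything, then report" versus "return on the first failing step". It is not the patch itself: the step functions, the counter, and both helper names are hypothetical stand-ins, not bcachefs code.

```c
#include <errno.h>

/* Hypothetical stand-in for one stage of an fsync-style flush. */
typedef int (*flush_step_fn)(void);

/* Counts how many steps actually ran (illustration only). */
int steps_run;

int step_ok(void)   { steps_run++; return 0; }
int step_fail(void) { steps_run++; return -EIO; }

/*
 * Flush as much as possible: every step runs even after a failure,
 * and the first error seen is what the caller gets back.
 */
int flush_all(flush_step_fn *steps, int n)
{
	int ret = 0;

	for (int i = 0; i < n; i++) {
		int r = steps[i]();

		if (!ret)
			ret = r;
	}
	return ret;
}

/*
 * Return on the first failure: later steps are skipped, since the
 * caller cannot rely on the data being durable once an error is
 * reported anyway.
 */
int flush_stop_on_error(flush_step_fn *steps, int n)
{
	for (int i = 0; i < n; i++) {
		int ret = steps[i]();

		if (ret)
			return ret;
	}
	return 0;
}
```

With a step list of { step_ok, step_fail, step_ok }, both helpers return -EIO, but the first runs all three steps while the second stops after two. The skipped work is what no longer gets issued once an earlier step has failed.
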
Applied
