On Tue, Dec 05, 2023 at 08:24:39AM -0500, Brian Foster wrote:
> When investigating transient failures of generic/441 on bcachefs, it
> was determined that the cause of the failure was a combination of
> unconditional emergency shutdown and a race between background
> journal activity and the test's switchover from a working
> device-mapper table to an error-injecting table.
> 
> Part of the reason for this sequence of events is that bcachefs
> aggressively flushes as much as possible during fsync(), regardless
> of errors. While this is reasonable behavior, it is technically
> unnecessary because once an error is returned from fsync(), the
> caller cannot make any assumptions about the resilience of data.
> 
> Tweak the bch2_fsync() logic to return an error on failure of any of
> the steps involved in the flush. Note that this change alone does
> not prevent generic/441 failure, but in combination with a test
> tweak to avoid racing during the dm-error table switchover it avoids
> the unnecessary shutdowns and allows the test to pass reliably on
> bcachefs.
> 
> Signed-off-by: Brian Foster <[email protected]>

Applied
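
For readers following along, the behavioral difference described in the
commit message can be sketched in userspace C. This is only an
illustration, not the bcachefs code: the flush_state struct and the
flush_data(), flush_metadata(), and flush_journal() helpers are
hypothetical stand-ins for the real flush steps, and the errors are
injected by hand. fsync_flush_all() models the "flush as much as
possible, then report the first error" approach, while
fsync_fail_fast() models returning as soon as any step fails.

```c
#include <errno.h>

/*
 * Hypothetical model of the flush steps behind fsync(). The *_err fields
 * are injected results; steps_run counts how many steps executed.
 */
struct flush_state {
	int data_err;
	int meta_err;
	int journal_err;
	int steps_run;
};

static int flush_data(struct flush_state *s)
{
	s->steps_run++;
	return s->data_err;
}

static int flush_metadata(struct flush_state *s)
{
	s->steps_run++;
	return s->meta_err;
}

static int flush_journal(struct flush_state *s)
{
	s->steps_run++;
	return s->journal_err;
}

/* Run every step regardless of failures, then report the first error. */
static int fsync_flush_all(struct flush_state *s)
{
	int ret = flush_data(s);
	int ret2 = flush_metadata(s);
	int ret3 = flush_journal(s);

	return ret ? ret : ret2 ? ret2 : ret3;
}

/* Stop at the first failing step and return its error. */
static int fsync_fail_fast(struct flush_state *s)
{
	int ret = flush_data(s);

	if (ret)
		return ret;

	ret = flush_metadata(s);
	if (ret)
		return ret;

	return flush_journal(s);
}
```

With an injected -EIO on the data step, both variants return -EIO, but
the fail-fast variant never reaches the later steps. In this model, that
is what keeps a failed fsync() from generating further I/O against the
failing device.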
