The bcachefs folio writeback code includes a bio_full() check as well as a fixed size check when it determines whether to submit the current write op or continue adding to the current bio. The fixed size check submits prematurely when the current folio fits exactly in the space remaining under the size limit, which typically results in an extent merge that would otherwise have been unnecessary. This can be observed with a buffered write sized exactly to the current maximum (BIO_MAX_VECS * PAGE_SIZE, i.e. 1MB with 4k pages) and with key_merging_disabled=1. The latter prevents the second write from being merged, so a subsequent check of the extent list shows a 1020k extent followed by a contiguous 4k extent.
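To illustrate the boundary case described above, here is a minimal userspace sketch, not the kernel code. The SKETCH_* constants and both helper names are made up for illustration and assume 4k pages and 256 vecs; fits_check_submits() is a hypothetical "does not fit" test for contrast and makes no claim about how bio_full() is implemented.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative values only; the real ones come from kernel headers. */
#define SKETCH_PAGE_SIZE	4096u
#define SKETCH_BIO_MAX_VECS	256u
#define SKETCH_MAX_BYTES	((size_t)SKETCH_BIO_MAX_VECS * SKETCH_PAGE_SIZE) /* 1MB */

/*
 * Mirrors the shape of the fixed size check: submit the current bio
 * when its size plus the incoming range reaches or exceeds the cap.
 */
static bool size_check_submits(size_t bio_bytes, size_t folio_bytes)
{
	return bio_bytes + folio_bytes >= SKETCH_MAX_BYTES;
}

/*
 * Hypothetical contrast: submit only when the incoming range would
 * actually overflow the cap, so an exact fit is still added.
 */
static bool fits_check_submits(size_t bio_bytes, size_t folio_bytes)
{
	return bio_bytes + folio_bytes > SKETCH_MAX_BYTES;
}
```

With a 1020k bio and a 4k folio, size_check_submits() returns true even though the folio would fit exactly, which corresponds to the 1020k + 4k extent split above. The contrast helper returns false for the same inputs.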
It's not totally clear why the fixed write size check exists. bio_full() already checks that the bio can accommodate the current dirty range being processed, so the only other concern is write latency. Even then, a 1MB cap seems rather small. For reference, iomap includes a folio batch size (of 4k) to mitigate latency associated with writeback completion folio processing, but that restricts writeback bios to somewhere in the range of 16MB-256MB depending on folio size (i.e. considering 4k to 64k pages). Unless there is some known reason for it, remove the size limit and rely on bio_full() to cap the size of the bio.

Signed-off-by: Brian Foster <[email protected]>
---
 fs/bcachefs/fs-io-buffered.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/fs/bcachefs/fs-io-buffered.c b/fs/bcachefs/fs-io-buffered.c
index 58ccc7b91ac7..d438b93a3a30 100644
--- a/fs/bcachefs/fs-io-buffered.c
+++ b/fs/bcachefs/fs-io-buffered.c
@@ -607,8 +607,6 @@ static int __bch2_writepage(struct folio *folio,
 	if (w->io &&
 	    (w->io->op.res.nr_replicas != nr_replicas_this_write ||
 	     bio_full(&w->io->op.wbio.bio, sectors << 9) ||
-	     w->io->op.wbio.bio.bi_iter.bi_size + (sectors << 9) >=
-	     (BIO_MAX_VECS * PAGE_SIZE) ||
 	     bio_end_sector(&w->io->op.wbio.bio) != sector))
 		bch2_writepage_do_io(w);
-- 
2.41.0
