On Wed, May 27, 2026 at 03:59:35PM +0000, Jaegeuk Kim wrote:
> F2FS merges bios before submit_bio, regardless of small or large folios,
> since the block addresses are consecutive. So, I think IO subsystem was
> working in full speed.

As does every other remotely modern file system. But that merging is
surprisingly expensive, which is why using large folios gives really
major performance improvements.

For one, the checks needed to merge touch quite a few cache lines.
Second, devices are often a lot more efficient if they see fewer SGL
entries, i.e. a 1MB bio described by a single SGL entry tends to work
better than one described by 256 of them.

The same is true in the kernel code itself, both in the submission path
(DMA mapping and co), and even more so in the page cache handling, both
before submission and in the completion path. See Bart's patch about
how long the walk of the bio_vecs in the f2fs completion path can take.
We had similar issues in XFS even in the workqueue completion path due
to a lack of rescheduling, and these simply go away when you do the
folio manipulation in larger chunks (LAZY_PREEMPT would avoid the need
for explicit rescheduling these days, but that just papers over the
symptoms in this case).
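
To put rough numbers on the SGL point, here is a back-of-the-envelope
sketch in user-space C. It assumes 4KiB pages and the worst case where
no two pages are physically contiguous, so every page needs its own
entry. sg_entries() is a made-up helper for illustration, not a kernel
function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel API): number of scatter-gather
 * entries needed to describe io_bytes of data when each physically
 * contiguous chunk is chunk_bytes long.
 */
size_t sg_entries(size_t io_bytes, size_t chunk_bytes)
{
	assert(chunk_bytes > 0);
	/* Round up so a partial trailing chunk still gets an entry. */
	return (io_bytes + chunk_bytes - 1) / chunk_bytes;
}
```

With these assumptions, sg_entries(1 << 20, 4096) comes out at 256 for
order-0 pages, while sg_entries(1 << 20, 1 << 20) is 1 for a single 1MB
folio. The same ratio applies to the number of per-folio steps in the
completion path.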
