From: Nanzhe <[email protected]>

This RFC series supports large folios for most readable/writable files in
buffered I/O paths, including normal files, block-layer encrypted files,
and atomic files. Compressed files are still excluded.

Atomic files need explicit support here because Android enables atomic
writes through ioctls, which mark the inode as an atomic file
(FI_ATOMIC_FILE). Once large-folio mapping is enabled for such a file,
the atomic buffered write path also needs to handle large folios
correctly.

In the write path, allocating f2fs_folio_state for large folios
conflicts with the f2fs page-private flags. This RFC series extends
f2fs_folio_state so that it can also store the f2fs private flags, and
updates the existing PAGE_PRIVATE helpers to work correctly on folios
that carry an f2fs_folio_state.
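
As a rough illustration of the idea above, here is a minimal userspace
sketch, not the code from this series. It assumes the conflict comes
from the per-folio state and the private flags both wanting the single
private slot of a folio. All demo_* names, the flag values, and the
struct layout are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical flag bits standing in for the f2fs PAGE_PRIVATE flags. */
enum demo_private_flag {
	DEMO_PRIVATE_ONGOING_MIGRATION = 0,
	DEMO_PRIVATE_ATOMIC_WRITE = 1,
};

/* Hypothetical per-folio state: read accounting plus a flags word. */
struct demo_folio_state {
	unsigned int read_pages_pending;
	unsigned long private_flags;
};

/* Hypothetical folio with one private slot (assumed to be the contended resource). */
struct demo_folio {
	void *private;	/* NULL, or a struct demo_folio_state */
};

/* Return the state attached to the folio, allocating it on first use. */
static struct demo_folio_state *demo_state_get(struct demo_folio *folio)
{
	if (!folio->private)
		folio->private = calloc(1, sizeof(struct demo_folio_state));
	return folio->private;
}

/* Set a flag inside the state instead of in the private slot itself. */
static bool demo_set_private_flag(struct demo_folio *folio, int flag)
{
	struct demo_folio_state *st = demo_state_get(folio);

	if (!st)
		return false;	/* allocation failed */
	st->private_flags |= 1UL << flag;
	return true;
}

/* A folio without state has no flags set. */
static bool demo_test_private_flag(const struct demo_folio *folio, int flag)
{
	const struct demo_folio_state *st = folio->private;

	return st && (st->private_flags & (1UL << flag));
}

/* Clearing a flag leaves the rest of the state untouched. */
static void demo_clear_private_flag(struct demo_folio *folio, int flag)
{
	struct demo_folio_state *st = folio->private;

	if (st)
		st->private_flags &= ~(1UL << flag);
}
```

With this layout, setting a flag and updating read_pages_pending on the
same folio no longer compete for the private slot, since both live in
one allocation.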

In addition, fio results with the max large-folio order set to 2 showed
that 4K read/write performance did not improve much. Analysis showed
that one important reason was the extra spinlock traffic from
incrementing read_pages_pending once per 4K subpage. This RFC series
therefore adds two optimizations to f2fs_read_data_large_folio:

1. batch read_pages_pending updates by the mapped block count instead of
   incrementing it once per 4K subpage;

2. skip f2fs_folio_state allocation when a single block mapping / BIO
   covers the whole folio, because folio_end_read() can complete such a
   folio without extra per-folio state.
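
The effect of the first optimization can be pictured with the
following userspace sketch. It is a simplified model, not the kernel
code: the spinlock is replaced by a counter of simulated lock round
trips, and all demo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-folio read state. */
struct demo_read_state {
	unsigned int read_pages_pending;	/* blocks still in flight */
	unsigned long lock_acquisitions;	/* simulated lock round trips */
};

/* One simulated lock/unlock pair per call, whatever the increment. */
static void demo_pending_add(struct demo_read_state *st, unsigned int nr)
{
	st->lock_acquisitions++;	/* stands in for a spinlock round trip */
	st->read_pages_pending += nr;
}

/* Per-subpage accounting: one locked update per 4K block. */
static void demo_account_per_block(struct demo_read_state *st,
				   unsigned int nr_blocks)
{
	for (unsigned int i = 0; i < nr_blocks; i++)
		demo_pending_add(st, 1);
}

/* Batched accounting: one locked update per contiguous mapped extent. */
static void demo_account_batched(struct demo_read_state *st,
				 const unsigned int *extent_blocks,
				 size_t nr_extents)
{
	for (size_t i = 0; i < nr_extents; i++) {
		assert(extent_blocks[i] > 0);
		demo_pending_add(st, extent_blocks[i]);
	}
}
```

For an order-2 folio (four 4K blocks) mapped as two extents of 3 and 1
blocks, both variants end with read_pages_pending == 4, but the batched
variant takes the simulated lock twice instead of four times.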

In the benchmark tables below, "skip=1" means the second optimization
above is enabled, i.e. f2fs_folio_state allocation is skipped for the
whole-folio-in-one-bio case. "batch=1" means the first optimization is
enabled, i.e. read_pages_pending is updated, and data is added to the
bio, in larger chunks instead of once per subpage.
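
The whole-folio-in-one-bio condition can be sketched as a simple
predicate. This is only an illustration under stated assumptions: the
block size equals the 4K page size, and the helper name and parameters
are hypothetical rather than taken from the series.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical check: can the folio be read without per-folio state?
 * first_block is the index, within the folio, of the first block of the
 * mapping; nr_blocks is how many contiguous blocks go into the one bio.
 */
static bool demo_can_skip_folio_state(unsigned int folio_order,
				      unsigned int first_block,
				      unsigned int nr_blocks)
{
	unsigned int blocks_per_folio = 1u << folio_order;

	assert(first_block + nr_blocks <= blocks_per_folio);

	/* Only a single mapping that spans the entire folio qualifies. */
	return first_block == 0 && nr_blocks == blocks_per_folio;
}
```

With the max folio order of 2 used in the tests, a single mapping of
four blocks starting at block 0 qualifies, while a mapping of three
blocks followed by a hole does not.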

Test environment:
- Device: Pixel 6 (device1A, 1A071FDF600053)
- Filesystem: f2fs on dm-49, inlinecrypt enabled
- File size: 256MB
- Repetitions: 10
- Prepare: end_fsync + sync + drop_caches
- Fio: psync, direct=0, iodepth=1
- Max folio order: 2

Table 1: HOLE_READ (10 repeats)
------------------------------------------------------------
All bandwidth numbers are in MiB/s. Non-baseline entries show the
absolute value followed by the percentage delta relative to the
order=0 baseline in parentheses.

| bs  | order=0 | order=2 |
|-----|---------|---------|
| 4k  | 469.6   | 521.9 (+11.1%) |
| 64k | 668.1   | 852.4 (+27.6%) |
| 1M  | 653.0   | 867.2 (+32.8%) |
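
For reference, the percentage deltas in these tables follow the usual
relative-change formula against the order=0 baseline. A minimal sketch
of that arithmetic (the helper names are hypothetical, and this is not
the script used to produce the tables):

```c
#include <assert.h>
#include <stdio.h>

/* Relative change of value against baseline, in percent. */
static double demo_delta_percent(double baseline, double value)
{
	assert(baseline > 0.0);
	return (value - baseline) / baseline * 100.0;
}

/* Format a table cell as "<value> (<signed delta>%)"; returns snprintf's result. */
static int demo_format_cell(char *buf, size_t len, double baseline, double value)
{
	return snprintf(buf, len, "%.1f (%+.1f%%)", value,
			demo_delta_percent(baseline, value));
}
```

For the 4k row of Table 1, a baseline of 469.6 MiB/s and a result of
521.9 MiB/s give (521.9 - 469.6) / 469.6 * 100, which rounds to +11.1%.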

Table 2: DATA_READ (10 repeats)
----------------------------------------
| bs  | order=0 | batch=0 | batch=1,skip=0 | batch=1,skip=1 |
|-----|---------|---------|----------------|----------------|
| 4k  | 441.6   | 456.5 (+3.4%) | 499.7 (+13.2%) | 544.4 (+23.3%) |
| 64k | 632.8   | 697.0 (+10.1%) | 837.8 (+32.4%) | 990.9 (+56.6%) |
| 1M  | 601.5   | 733.0 (+21.9%) | 927.5 (+54.2%) | 963.4 (+60.2%) |

Table 3: WRITE (10 repeats)
----------------------------------------------------------
O = overwrite (N = new write, Y = overwrite)
S = sync / fsync (Y = fsync enabled, N = no fsync)
W = writeback (Y = background writeback, N = no writeback)

| O,S,W | bs  | order=0 | order=2 |
|-------|-----|---------|---------|
| N,N,Y | 4k  | 263.4   | 286.3 (+8.7%) |
| N,N,Y | 64k | 683.5   | 1199.6 (+75.5%) |
| N,N,Y | 1M  | 767.4   | 1383.8 (+80.3%) |
| N,Y,N | 4k  | 10.8    | 9.1 (-15.7%) |
| N,Y,N | 64k | 69.3    | 50.3 (-27.4%) |
| N,Y,N | 1M  | 103.5   | 157.8 (+52.5%) |
| Y,N,Y | 4k  | 301.6   | 344.1 (+14.1%) |
| Y,N,Y | 64k | 691.3   | 865.9 (+25.3%) |
| Y,N,Y | 1M  | 742.3   | 969.2 (+30.6%) |
| Y,Y,N | 4k  | 9.5     | 17.1 (+80.0%) |
| Y,Y,N | 64k | 43.5    | 108.2 (+148.7%) |
| Y,Y,N | 1M  | 140.9   | 146.6 (+4.0%) |

Nanzhe (9):
  f2fs: extend folio state for large folio write path
  f2fs: carry subpage offset and count in write IO
  f2fs: support regular file buffered writes on large folios
  f2fs: support atomic file large folios buffered write
  f2fs: support large folio writeback
  f2fs: prepare mmap write faults for large folios
  f2fs: make GC migration large-folio aware
  f2fs: allow large folio support to writeable files
  f2fs: optimize small block size large folio read

 fs/f2fs/compress.c |    2 +
 fs/f2fs/data.c     | 1015 +++++++++++++++++++++++++++++++++++++++-----
 fs/f2fs/f2fs.h     |   75 +++-
 fs/f2fs/file.c     |   81 ++--
 fs/f2fs/gc.c       |   30 +-
 fs/f2fs/inode.c    |    6 +-
 fs/f2fs/segment.c  |    4 +-
 7 files changed, 1064 insertions(+), 149 deletions(-)

-- 
2.34.1


