On 09/16, hanqi wrote:
> Yes, for most storage devices, since disk performance is much lower
> than that of memory, uncached buffered I/O writes on F2FS do not bring
> significant performance benefits. Their advantages might only become
> apparent in scenarios where disk performance exceeds that of memory.
> 
> Therefore, I also agree that F2FS should first support uncached
> buffered I/O for reads, as Chao mentioned.

Thanks. Could you please post a patch and gather some comments?

> 
> Thanks,
> 
> On 2025/9/16 11:13, Chao Yu wrote:
> > On 9/12/25 07:53, Jaegeuk Kim wrote:
> > > Given the performance data and implementation overhead, I'm also
> > > questioning whether we really need to support this for writes. Can we
> > > get a common sense of the usage models?
> > Since the uncached write implementation affects performance quite a lot,
> > I don't see a good reason to merge it for now.
> > 
> > I think we can enable the uncached read functionality first and return
> > -EOPNOTSUPP for uncached writes; meanwhile, let's see whether any good
> > use case for uncached writes emerges.
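> > 
> > As a minimal sketch of that gating (the placement at the top of
> > f2fs_file_write_iter() is just my assumption, not a tested patch):
> > 
> > static ssize_t f2fs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
> > {
> > 	/*
> > 	 * Uncached buffered reads are supported, but uncached buffered
> > 	 * writes are not yet; reject the latter early so userspace can
> > 	 * fall back to a normal buffered write.
> > 	 */
> > 	if (iocb->ki_flags & IOCB_DONTCACHE)
> > 		return -EOPNOTSUPP;
> > 
> > 	/* ... existing write path continues unchanged ... */
> > }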
> > 
> > Thanks,
> > 
> > > On 08/28, Qi Han wrote:
> > > > In the link [1], we added uncached buffered I/O read support to f2fs.
> > > > Now, let's move forward with enabling uncached buffered I/O write
> > > > support in f2fs.
> > > > 
> > > > In f2fs_write_end_io, bios that contain FGP_DONTCACHE pages are handed
> > > > off to a separate asynchronous workqueue, which performs the page-drop
> > > > operation.
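> > > > 
> > > > For reference, the userspace side that exercises this path is a normal
> > > > buffered write issued with RWF_DONTCACHE via pwritev2(). A minimal
> > > > example (the mount point is arbitrary; the fallback define matches the
> > > > value in the uapi headers that introduced the flag):
> > > > 
> > > > #define _GNU_SOURCE
> > > > #include <fcntl.h>
> > > > #include <stdio.h>
> > > > #include <string.h>
> > > > #include <sys/uio.h>
> > > > #include <unistd.h>
> > > > 
> > > > #ifndef RWF_DONTCACHE
> > > > #define RWF_DONTCACHE	0x00000080	/* drop page cache after writeback */
> > > > #endif
> > > > 
> > > > int main(void)
> > > > {
> > > > 	static char buf[8192];
> > > > 	struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
> > > > 	int fd = open("/mnt/f2fs/testfile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
> > > > 
> > > > 	if (fd < 0)
> > > > 		return 1;
> > > > 	memset(buf, 'a', sizeof(buf));
> > > > 	/* Buffered write; the kernel kicks writeback and drops the pages
> > > > 	 * at end_io. Fails with EOPNOTSUPP where uncached buffered
> > > > 	 * writes are unsupported.
> > > > 	 */
> > > > 	if (pwritev2(fd, &iov, 1, 0, RWF_DONTCACHE) < 0)
> > > > 		perror("pwritev2");
> > > > 	close(fd);
> > > > 	return 0;
> > > > }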
> > > > 
> > > > The following patch was developed and tested on the v6.17-rc3 branch.
> > > > My local testing results are as follows, along with some issues
> > > > observed:
> > > > 1) Write performance degradation. An uncached buffered I/O write is
> > > > slower than a normal buffered write because uncached I/O kicks off
> > > > writeback for each I/O once the data has been copied into memory, so
> > > > that the pages can be dropped promptly at end_io; see the call chain
> > > > and the sketch that follows. I assume this impact might be less
> > > > visible on high-performance storage devices such as PCIe 6.0 SSDs.
> > > > - f2fs_file_write_iter
> > > >   - f2fs_buffered_write_iter
> > > >   - generic_write_sync
> > > >     - filemap_fdatawrite_range_kick
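> > > > 
> > > > The per-write kick comes from the IOCB_DONTCACHE branch of
> > > > generic_write_sync(); roughly, paraphrased from include/linux/fs.h as
> > > > of v6.17 rather than copied verbatim:
> > > > 
> > > > static inline ssize_t generic_write_sync(struct kiocb *iocb, ssize_t count)
> > > > {
> > > > 	if (iocb_is_dsync(iocb)) {
> > > > 		int ret = vfs_fsync_range(iocb->ki_filp,
> > > > 				iocb->ki_pos - count, iocb->ki_pos - 1,
> > > > 				(iocb->ki_flags & IOCB_SYNC) ? 0 : 1);
> > > > 		if (ret)
> > > > 			return ret;
> > > > 	} else if (iocb->ki_flags & IOCB_DONTCACHE) {
> > > > 		struct address_space *mapping = iocb->ki_filp->f_mapping;
> > > > 
> > > > 		/* Start async writeback on the just-written range so its
> > > > 		 * folios can be dropped at end_io -- the extra work every
> > > > 		 * uncached write pays.
> > > > 		 */
> > > > 		filemap_fdatawrite_range_kick(mapping, iocb->ki_pos - count,
> > > > 					      iocb->ki_pos - 1);
> > > > 	}
> > > > 
> > > > 	return count;
> > > > }
> > > > 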
> > > > 2) As expected, page cache usage does not increase significantly during
> > > > writes.
> > > > 3) The kswapd0 memory-reclaim thread remains mostly idle, but additional
> > > > asynchronous work overhead is introduced, e.g.:
> > > >    PID USER         PR  NI VIRT  RES  SHR S[%CPU] %MEM     TIME+ ARGS
> > > > 19650 root          0 -20    0    0    0 I  7.0   0.0   0:00.21 [kworker/u33:3-f2fs_post_write_wq]
> > > >     95 root          0 -20    0    0    0 I  6.6   0.0   0:02.08 [kworker/u33:0-f2fs_post_write_wq]
> > > > 19653 root          0 -20    0    0    0 I  4.6   0.0   0:01.25 [kworker/u33:6-f2fs_post_write_wq]
> > > > 19652 root          0 -20    0    0    0 I  4.3   0.0   0:00.92 [kworker/u33:5-f2fs_post_write_wq]
> > > > 19613 root          0 -20    0    0    0 I  4.3   0.0   0:00.99 [kworker/u33:1-f2fs_post_write_wq]
> > > > 19651 root          0 -20    0    0    0 I  3.6   0.0   0:00.98 [kworker/u33:4-f2fs_post_write_wq]
> > > > 19654 root          0 -20    0    0    0 I  3.0   0.0   0:01.05 [kworker/u33:7-f2fs_post_write_wq]
> > > > 19655 root          0 -20    0    0    0 I  2.3   0.0   0:01.18 [kworker/u33:8-f2fs_post_write_wq]
> > > > 
> > > > From these results on my test device, introducing uncached buffered I/O
> > > > writes in f2fs seems to bring more drawbacks than benefits. Do we really
> > > > need to support uncached buffered I/O writes in f2fs?
> > > > 
> > > > Write test data without uncached buffered I/O:
> > > > Starting 1 threads
> > > > pid: 17609
> > > > writing bs 8192, uncached 0
> > > >    1s: 753MB/sec, MB=753
> > > >    2s: 792MB/sec, MB=1546
> > > >    3s: 430MB/sec, MB=1978
> > > >    4s: 661MB/sec, MB=2636
> > > >    5s: 900MB/sec, MB=3542
> > > >    6s: 769MB/sec, MB=4308
> > > >    7s: 808MB/sec, MB=5113
> > > >    8s: 766MB/sec, MB=5884
> > > >    9s: 654MB/sec, MB=6539
> > > >   10s: 456MB/sec, MB=6995
> > > >   11s: 797MB/sec, MB=7793
> > > >   12s: 770MB/sec, MB=8563
> > > >   13s: 778MB/sec, MB=9341
> > > >   14s: 726MB/sec, MB=10077
> > > >   15s: 736MB/sec, MB=10803
> > > >   16s: 798MB/sec, MB=11602
> > > >   17s: 728MB/sec, MB=12330
> > > >   18s: 749MB/sec, MB=13080
> > > >   19s: 777MB/sec, MB=13857
> > > >   20s: 688MB/sec, MB=14395
> > > > 
> > > > 19:29:34      UID       PID    %usr %system  %guest   %wait    %CPU   CPU  Command
> > > > 19:29:35        0        94    0.00    0.00    0.00    0.00    0.00     4  kswapd0
> > > > 19:29:36        0        94    0.00    0.00    0.00    0.00    0.00     4  kswapd0
> > > > 19:29:37        0        94    0.00    0.00    0.00    0.00    0.00     4  kswapd0
> > > > 19:29:38        0        94    0.00    0.00    0.00    0.00    0.00     4  kswapd0
> > > > 19:29:39        0        94    0.00    0.00    0.00    0.00    0.00     4  kswapd0
> > > > 19:29:40        0        94    0.00    0.00    0.00    0.00    0.00     4  kswapd0
> > > > 19:29:41        0        94    0.00    2.00    0.00    0.00    2.00     0  kswapd0
> > > > 19:29:42        0        94    0.00   59.00    0.00    0.00   59.00     7  kswapd0
> > > > 19:29:43        0        94    0.00   45.00    0.00    0.00   45.00     7  kswapd0
> > > > 19:29:44        0        94    0.00   36.00    0.00    0.00   36.00     0  kswapd0
> > > > 19:29:45        0        94    0.00   27.00    0.00    1.00   27.00     0  kswapd0
> > > > 19:29:46        0        94    0.00   26.00    0.00    0.00   26.00     2  kswapd0
> > > > 19:29:47        0        94    0.00   57.00    0.00    0.00   57.00     7  kswapd0
> > > > 19:29:48        0        94    0.00   41.00    0.00    0.00   41.00     7  kswapd0
> > > > 19:29:49        0        94    0.00   38.00    0.00    0.00   38.00     7  kswapd0
> > > > 19:29:50        0        94    0.00   47.00    0.00    0.00   47.00     7  kswapd0
> > > > 19:29:51        0        94    0.00   43.00    0.00    1.00   43.00     7  kswapd0
> > > > 19:29:52        0        94    0.00   36.00    0.00    0.00   36.00     7  kswapd0
> > > > 19:29:53        0        94    0.00   39.00    0.00    0.00   39.00     2  kswapd0
> > > > 19:29:54        0        94    0.00   46.00    0.00    0.00   46.00     7  kswapd0
> > > > 19:29:55        0        94    0.00   43.00    0.00    0.00   43.00     7  kswapd0
> > > > 19:29:56        0        94    0.00   39.00    0.00    0.00   39.00     7  kswapd0
> > > > 19:29:57        0        94    0.00   29.00    0.00    1.00   29.00     1  kswapd0
> > > > 19:29:58        0        94    0.00   17.00    0.00    0.00   17.00     4  kswapd0
> > > > 
> > > > 19:29:33    kbmemfree   kbavail kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit  kbactive   kbinact   kbdirty
> > > > 19:29:34      4464588   6742648   4420876     38.12      6156   2032600 179730872    743.27   1863412   1822544         4
> > > > 19:29:35      4462572   6740784   4422752     38.13      6156   2032752 179739004    743.30   1863460   1823584        16
> > > > 19:29:36      4381512   6740856   4422420     38.13      6156   2114144 179746508    743.33   1863476   1905384     81404
> > > > 19:29:37      3619456   6741840   4421588     38.12      6156   2877032 179746652    743.33   1863536   2668896    592584
> > > > 19:29:38      2848184   6740720   4422472     38.13      6164   3646188 179746652    743.33   1863600   3438520    815692
> > > > 19:29:39      2436336   6739452   4423720     38.14      6164   4056772 179746652    743.33   1863604   3849164    357096
> > > > 19:29:40      1712660   6737700   4425140     38.15      6164   4779020 179746604    743.33   1863612   4571124    343716
> > > > 19:29:41       810664   6738020   4425004     38.15      6164   5681152 179746604    743.33   1863612   5473444    297268
> > > > 19:29:42       673756   6779120   4373200     37.71      5656   5869928 179746604    743.33   1902852   5589452    269032
> > > > 19:29:43       688480   6782024   4371012     37.69      5648   5856940 179750048    743.34   1926336   5542004    279344
> > > > 19:29:44       688956   6789028   4364260     37.63      5584   5863272 179750048    743.34   1941608   5518808    300096
> > > > 19:29:45       740768   6804560   4348772     37.49      5524   5827248 179750000    743.34   1954084   5452844    123120
> > > > 19:29:46       697936   6810612   4342768     37.44      5524   5876048 179750048    743.34   1962020   5483944    274908
> > > > 19:29:47       734504   6818716   4334156     37.37      5512   5849188 179750000    743.34   1978120   5426796    274504
> > > > 19:29:48       771696   6828316   4324180     37.28      5504   5820948 179762260    743.39   2006732   5354152    305388
> > > > 19:29:49       691944   6838812   4313108     37.19      5476   5912444 179749952    743.34   2021720   5418996    296852
> > > > 19:29:50       679392   6844496   4306892     37.13      5452   5931356 179749952    743.34   1982772   5463040    233600
> > > > 19:29:51       768528   6868080   4284224     36.94      5412   5865704 176317452    729.15   1990220   5359012    343160
> > > > 19:29:52       717880   6893940   4259968     36.73      5400   5942368 176317404    729.15   1965624   5444140    304856
> > > > 19:29:53       712408   6902660   4251268     36.65      5372   5956584 176318376    729.15   1969192   5442132    290224
> > > > 19:29:54       707184   6917512   4236160     36.52      5344   5976944 176318568    729.15   1968716   5443620    336948
> > > > 19:29:55       703172   6921608   4232332     36.49      5292   5984836 176318568    729.15   1979788   5429484    328716
> > > > 19:29:56       733256   6933020   4220864     36.39      5212   5966340 176318568    729.15   1983636   5396256    300008
> > > > 19:29:57       723308   6936340   4217280     36.36      5120   5979816 176318568    729.15   1987088   5394360    508792
> > > > 19:29:58       732148   6942972   4210680     36.30      5108   5977656 176311064    729.12   1990400   5379884    214936
> > > > 
> > > > Write test data with uncached buffered I/O:
> > > > Starting 1 threads
> > > > pid: 17742
> > > > writing bs 8192, uncached 1
> > > >    1s: 433MB/sec, MB=433
> > > >    2s: 195MB/sec, MB=628
> > > >    3s: 209MB/sec, MB=836
> > > >    4s: 54MB/sec, MB=883
> > > >    5s: 277MB/sec, MB=1169
> > > >    6s: 141MB/sec, MB=1311
> > > >    7s: 185MB/sec, MB=1495
> > > >    8s: 134MB/sec, MB=1631
> > > >    9s: 201MB/sec, MB=1834
> > > >   10s: 283MB/sec, MB=2114
> > > >   11s: 223MB/sec, MB=2339
> > > >   12s: 164MB/sec, MB=2506
> > > >   13s: 155MB/sec, MB=2657
> > > >   14s: 132MB/sec, MB=2792
> > > >   15s: 186MB/sec, MB=2965
> > > >   16s: 218MB/sec, MB=3198
> > > >   17s: 220MB/sec, MB=3412
> > > >   18s: 191MB/sec, MB=3606
> > > >   19s: 214MB/sec, MB=3828
> > > >   20s: 257MB/sec, MB=4085
> > > > 
> > > > 19:31:31      UID       PID    %usr %system  %guest   %wait    %CPU   CPU  Command
> > > > 19:31:32        0        94    0.00    0.00    0.00    0.00    0.00     4  kswapd0
> > > > 19:31:33        0        94    0.00    0.00    0.00    0.00    0.00     4  kswapd0
> > > > 19:31:34        0        94    0.00    0.00    0.00    0.00    0.00     4  kswapd0
> > > > 19:31:35        0        94    0.00    0.00    0.00    0.00    0.00     4  kswapd0
> > > > 19:31:36        0        94    0.00    0.00    0.00    0.00    0.00     4  kswapd0
> > > > 19:31:37        0        94    0.00    0.00    0.00    0.00    0.00     4  kswapd0
> > > > 19:31:38        0        94    0.00    0.00    0.00    0.00    0.00     4  kswapd0
> > > > 19:31:39        0        94    0.00    0.00    0.00    0.00    0.00     4  kswapd0
> > > > 19:31:40        0        94    0.00    0.00    0.00    0.00    0.00     4  kswapd0
> > > > 19:31:41        0        94    0.00    0.00    0.00    0.00    0.00     4  kswapd0
> > > > 19:31:42        0        94    0.00    0.00    0.00    0.00    0.00     4  kswapd0
> > > > 19:31:43        0        94    0.00    0.00    0.00    0.00    0.00     4  kswapd0
> > > > 19:31:44        0        94    0.00    0.00    0.00    0.00    0.00     4  kswapd0
> > > > 19:31:45        0        94    0.00    0.00    0.00    0.00    0.00     4  kswapd0
> > > > 19:31:46        0        94    0.00    0.00    0.00    0.00    0.00     4  kswapd0
> > > > 19:31:47        0        94    0.00    0.00    0.00    0.00    0.00     4  kswapd0
> > > > 19:31:48        0        94    0.00    0.00    0.00    0.00    0.00     4  kswapd0
> > > > 19:31:49        0        94    0.00    0.00    0.00    0.00    0.00     4  kswapd0
> > > > 
> > > > 19:31:31    kbmemfree   kbavail kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit  kbactive   kbinact   kbdirty
> > > > 19:31:32      4816812   6928788   4225812     36.43      5148   1879676 176322636    729.17   1920900   1336548    285748
> > > > 19:31:33      4781880   6889428   4265592     36.78      5148   1874860 176322636    729.17   1920920   1332268    279028
> > > > 19:31:34      4758972   6822588   4332376     37.35      5148   1830984 176322636    729.17   1920920   1288976    233040
> > > > 19:31:35      4850248   6766480   4387840     37.83      5148   1684244 176322636    729.17   1920920   1142408     90508
> > > > 19:31:36      4644176   6741676   4413256     38.05      5148   1864900 176322636    729.17   1920920   1323452    269380
> > > > 19:31:37      4637900   6681480   4473436     38.57      5148   1810996 176322588    729.17   1920920   1269612    217632
> > > > 19:31:38      4502108   6595508   4559500     39.31      5148   1860724 176322492    729.17   1920920   1319588    267760
> > > > 19:31:39      4498844   6551068   4603928     39.69      5148   1819528 176322492    729.17   1920920   1278440    226496
> > > > 19:31:40      4498812   6587396   4567340     39.38      5148   1856116 176322492    729.17   1920920   1314800    263292
> > > > 19:31:41      4656784   6706252   4448372     38.35      5148   1817112 176322492    729.17   1920920   1275704    224600
> > > > 19:31:42      4635032   6673328   4481436     38.64      5148   1805816 176322492    729.17   1920920   1264548    213436
> > > > 19:31:43      4636852   6679736   4474884     38.58      5148   1810548 176322492    729.17   1920932   1269796    218276
> > > > 19:31:44      4654740   6669104   4485544     38.67      5148   1782000 176322444    729.17   1920932   1241552    189880
> > > > 19:31:45      4821604   6693156   4461848     38.47      5148   1638864 176322444    729.17   1920932   1098784     31076
> > > > 19:31:46      4707548   6728796   4426400     38.16      5148   1788368 176322444    729.17   1920932   1248936    196596
> > > > 19:31:47      4683996   6747632   4407348     38.00      5148   1830968 176322444    729.17   1920932   1291396    239636
> > > > 19:31:48      4694648   6773808   4381320     37.78      5148   1846376 176322624    729.17   1920944   1307576    254800
> > > > 19:31:49      4663784   6730212   4424776     38.15      5148   1833784 176322772    729.17   1920948   1295156    242200
> > > > 
> > > > [1] https://lore.kernel.org/lkml/[email protected]/
> > > > 
> > > > Signed-off-by: Qi Han <[email protected]>
> > > > ---
> > > >   fs/f2fs/data.c    | 178 ++++++++++++++++++++++++++++++++++------------
> > > >   fs/f2fs/f2fs.h    |   5 ++
> > > >   fs/f2fs/file.c    |   2 +-
> > > >   fs/f2fs/iostat.c  |   8 ++-
> > > >   fs/f2fs/iostat.h  |   4 +-
> > > >   fs/f2fs/segment.c |   2 +-
> > > >   fs/f2fs/super.c   |  16 ++++-
> > > >   7 files changed, 161 insertions(+), 54 deletions(-)
> > > > 
> > > > diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> > > > index 7961e0ddfca3..4eeb2b36473d 100644
> > > > --- a/fs/f2fs/data.c
> > > > +++ b/fs/f2fs/data.c
> > > > @@ -30,8 +30,10 @@
> > > >   #define NUM_PREALLOC_POST_READ_CTXS   128
> > > >   static struct kmem_cache *bio_post_read_ctx_cache;
> > > > +static struct kmem_cache *bio_post_write_ctx_cache;
> > > >   static struct kmem_cache *bio_entry_slab;
> > > >   static mempool_t *bio_post_read_ctx_pool;
> > > > +static mempool_t *bio_post_write_ctx_pool;
> > > >   static struct bio_set f2fs_bioset;
> > > >   #define       F2FS_BIO_POOL_SIZE      NR_CURSEG_TYPE
> > > > @@ -120,6 +122,12 @@ struct bio_post_read_ctx {
> > > >         block_t fs_blkaddr;
> > > >   };
> > > > +struct bio_post_write_ctx {
> > > > +       struct bio *bio;
> > > > +       struct f2fs_sb_info *sbi;
> > > > +       struct work_struct work;
> > > > +};
> > > > +
> > > >   /*
> > > >    * Update and unlock a bio's pages, and free the bio.
> > > >    *
> > > > @@ -159,6 +167,56 @@ static void f2fs_finish_read_bio(struct bio *bio, bool in_task)
> > > >         bio_put(bio);
> > > >   }
> > > > +static void f2fs_finish_write_bio(struct f2fs_sb_info *sbi, struct bio *bio)
> > > > +{
> > > > +       struct folio_iter fi;
> > > > +       struct bio_post_write_ctx *write_ctx = (struct bio_post_write_ctx *)bio->bi_private;
> > > > +
> > > > +       bio_for_each_folio_all(fi, bio) {
> > > > +               struct folio *folio = fi.folio;
> > > > +               enum count_type type;
> > > > +
> > > > +               if (fscrypt_is_bounce_folio(folio)) {
> > > > +                       struct folio *io_folio = folio;
> > > > +
> > > > +                       folio = fscrypt_pagecache_folio(io_folio);
> > > > +                       fscrypt_free_bounce_page(&io_folio->page);
> > > > +               }
> > > > +
> > > > +#ifdef CONFIG_F2FS_FS_COMPRESSION
> > > > +               if (f2fs_is_compressed_page(folio)) {
> > > > +                       f2fs_compress_write_end_io(bio, folio);
> > > > +                       continue;
> > > > +               }
> > > > +#endif
> > > > +
> > > > +               type = WB_DATA_TYPE(folio, false);
> > > > +
> > > > +               if (unlikely(bio->bi_status != BLK_STS_OK)) {
> > > > +                       mapping_set_error(folio->mapping, -EIO);
> > > > +                       if (type == F2FS_WB_CP_DATA)
> > > > +                               f2fs_stop_checkpoint(sbi, true,
> > > > +                                               STOP_CP_REASON_WRITE_FAIL);
> > > > +               }
> > > > +
> > > > +               f2fs_bug_on(sbi, is_node_folio(folio) &&
> > > > +                               folio->index != nid_of_node(folio));
> > > > +
> > > > +               dec_page_count(sbi, type);
> > > > +               if (f2fs_in_warm_node_list(sbi, folio))
> > > > +                       f2fs_del_fsync_node_entry(sbi, folio);
> > > > +               folio_clear_f2fs_gcing(folio);
> > > > +               folio_end_writeback(folio);
> > > > +       }
> > > > +       if (!get_pages(sbi, F2FS_WB_CP_DATA) &&
> > > > +                               wq_has_sleeper(&sbi->cp_wait))
> > > > +               wake_up(&sbi->cp_wait);
> > > > +
> > > > +       if (write_ctx)
> > > > +               mempool_free(write_ctx, bio_post_write_ctx_pool);
> > > > +       bio_put(bio);
> > > > +}
> > > > +
> > > >   static void f2fs_verify_bio(struct work_struct *work)
> > > >   {
> > > >         struct bio_post_read_ctx *ctx =
> > > > @@ -314,58 +372,32 @@ static void f2fs_read_end_io(struct bio *bio)
> > > >         f2fs_verify_and_finish_bio(bio, intask);
> > > >   }
> > > > +static void f2fs_finish_write_bio_async_work(struct work_struct *work)
> > > > +{
> > > > +       struct bio_post_write_ctx *ctx =
> > > > +               container_of(work, struct bio_post_write_ctx, work);
> > > > +
> > > > +       f2fs_finish_write_bio(ctx->sbi, ctx->bio);
> > > > +}
> > > > +
> > > >   static void f2fs_write_end_io(struct bio *bio)
> > > >   {
> > > > -       struct f2fs_sb_info *sbi;
> > > > -       struct folio_iter fi;
> > > > +       struct f2fs_sb_info *sbi = F2FS_F_SB(bio_first_folio_all(bio));
> > > > +       struct bio_post_write_ctx *write_ctx;
> > > >         iostat_update_and_unbind_ctx(bio);
> > > > -       sbi = bio->bi_private;
> > > >         if (time_to_inject(sbi, FAULT_WRITE_IO))
> > > >                 bio->bi_status = BLK_STS_IOERR;
> > > > -       bio_for_each_folio_all(fi, bio) {
> > > > -               struct folio *folio = fi.folio;
> > > > -               enum count_type type;
> > > > -
> > > > -               if (fscrypt_is_bounce_folio(folio)) {
> > > > -                       struct folio *io_folio = folio;
> > > > -
> > > > -                       folio = fscrypt_pagecache_folio(io_folio);
> > > > -                       fscrypt_free_bounce_page(&io_folio->page);
> > > > -               }
> > > > -
> > > > -#ifdef CONFIG_F2FS_FS_COMPRESSION
> > > > -               if (f2fs_is_compressed_page(folio)) {
> > > > -                       f2fs_compress_write_end_io(bio, folio);
> > > > -                       continue;
> > > > -               }
> > > > -#endif
> > > > -
> > > > -               type = WB_DATA_TYPE(folio, false);
> > > > -
> > > > -               if (unlikely(bio->bi_status != BLK_STS_OK)) {
> > > > -                       mapping_set_error(folio->mapping, -EIO);
> > > > -                       if (type == F2FS_WB_CP_DATA)
> > > > -                               f2fs_stop_checkpoint(sbi, true,
> > > > -                                               STOP_CP_REASON_WRITE_FAIL);
> > > > -               }
> > > > -
> > > > -               f2fs_bug_on(sbi, is_node_folio(folio) &&
> > > > -                               folio->index != nid_of_node(folio));
> > > > -
> > > > -               dec_page_count(sbi, type);
> > > > -               if (f2fs_in_warm_node_list(sbi, folio))
> > > > -                       f2fs_del_fsync_node_entry(sbi, folio);
> > > > -               folio_clear_f2fs_gcing(folio);
> > > > -               folio_end_writeback(folio);
> > > > +       write_ctx = (struct bio_post_write_ctx *)bio->bi_private;
> > > > +       if (write_ctx) {
> > > > +               INIT_WORK(&write_ctx->work, f2fs_finish_write_bio_async_work);
> > > > +               queue_work(write_ctx->sbi->post_write_wq, &write_ctx->work);
> > > > +               return;
> > > >         }
> > > > -       if (!get_pages(sbi, F2FS_WB_CP_DATA) &&
> > > > -                               wq_has_sleeper(&sbi->cp_wait))
> > > > -               wake_up(&sbi->cp_wait);
> > > > -       bio_put(bio);
> > > > +       f2fs_finish_write_bio(sbi, bio);
> > > >   }
> > > >   #ifdef CONFIG_BLK_DEV_ZONED
> > > > @@ -467,11 +499,10 @@ static struct bio *__bio_alloc(struct f2fs_io_info *fio, int npages)
> > > >                 bio->bi_private = NULL;
> > > >         } else {
> > > >                 bio->bi_end_io = f2fs_write_end_io;
> > > > -               bio->bi_private = sbi;
> > > > +               bio->bi_private = NULL;
> > > >                 bio->bi_write_hint = f2fs_io_type_to_rw_hint(sbi,
> > > >                                                 fio->type, fio->temp);
> > > >         }
> > > > -       iostat_alloc_and_bind_ctx(sbi, bio, NULL);
> > > >         if (fio->io_wbc)
> > > >                 wbc_init_bio(fio->io_wbc, bio);
> > > > @@ -701,6 +732,7 @@ int f2fs_submit_page_bio(struct f2fs_io_info *fio)
> > > >         /* Allocate a new bio */
> > > >         bio = __bio_alloc(fio, 1);
> > > > +       iostat_alloc_and_bind_ctx(fio->sbi, bio, NULL, NULL);
> > > >         f2fs_set_bio_crypt_ctx(bio, fio_folio->mapping->host,
> > > >                         fio_folio->index, fio, GFP_NOIO);
> > > > @@ -899,6 +931,8 @@ int f2fs_merge_page_bio(struct f2fs_io_info *fio)
> > > >   alloc_new:
> > > >         if (!bio) {
> > > >                 bio = __bio_alloc(fio, BIO_MAX_VECS);
> > > > +               iostat_alloc_and_bind_ctx(fio->sbi, bio, NULL, NULL);
> > > > +
> > > >                 f2fs_set_bio_crypt_ctx(bio, folio->mapping->host,
> > > >                                 folio->index, fio, GFP_NOIO);
> > > > @@ -948,6 +982,7 @@ void f2fs_submit_page_write(struct f2fs_io_info *fio)
> > > >         struct f2fs_bio_info *io = sbi->write_io[btype] + fio->temp;
> > > >         struct folio *bio_folio;
> > > >         enum count_type type;
> > > > +       struct bio_post_write_ctx *write_ctx = NULL;
> > > >         f2fs_bug_on(sbi, is_read_io(fio->op));
> > > > @@ -1001,6 +1036,13 @@ void f2fs_submit_page_write(struct f2fs_io_info *fio)
> > > >                 f2fs_set_bio_crypt_ctx(io->bio, fio_inode(fio),
> > > >                                 bio_folio->index, fio, GFP_NOIO);
> > > >                 io->fio = *fio;
> > > > +
> > > > +               if (folio_test_dropbehind(bio_folio)) {
> > > > +                       write_ctx = mempool_alloc(bio_post_write_ctx_pool, GFP_NOFS);
> > > > +                       write_ctx->bio = io->bio;
> > > > +                       write_ctx->sbi = sbi;
> > > > +               }
> > > > +               iostat_alloc_and_bind_ctx(fio->sbi, io->bio, NULL, write_ctx);
> > > >         }
> > > >         if (!bio_add_folio(io->bio, bio_folio, folio_size(bio_folio), 0)) {
> > > > @@ -1077,7 +1119,7 @@ static struct bio *f2fs_grab_read_bio(struct inode *inode, block_t blkaddr,
> > > >                 ctx->decompression_attempted = false;
> > > >                 bio->bi_private = ctx;
> > > >         }
> > > > -       iostat_alloc_and_bind_ctx(sbi, bio, ctx);
> > > > +       iostat_alloc_and_bind_ctx(sbi, bio, ctx, NULL);
> > > >         return bio;
> > > >   }
> > > > @@ -3540,6 +3582,7 @@ static int f2fs_write_begin(const struct kiocb *iocb,
> > > >         bool use_cow = false;
> > > >         block_t blkaddr = NULL_ADDR;
> > > >         int err = 0;
> > > > +       fgf_t fgp = FGP_LOCK | FGP_WRITE | FGP_CREAT;
> > > >         trace_f2fs_write_begin(inode, pos, len);
> > > > @@ -3582,12 +3625,13 @@ static int f2fs_write_begin(const struct kiocb *iocb,
> > > >   #endif
> > > >   repeat:
> > > > +       if (iocb && iocb->ki_flags & IOCB_DONTCACHE)
> > > > +               fgp |= FGP_DONTCACHE;
> > > >         /*
> > > >          * Do not use FGP_STABLE to avoid deadlock.
> > > >          * Will wait that below with our IO control.
> > > >          */
> > > > -       folio = __filemap_get_folio(mapping, index,
> > > > -                               FGP_LOCK | FGP_WRITE | FGP_CREAT, GFP_NOFS);
> > > > +       folio = __filemap_get_folio(mapping, index, fgp, GFP_NOFS);
> > > >         if (IS_ERR(folio)) {
> > > >                 err = PTR_ERR(folio);
> > > >                 goto fail;
> > > > @@ -4127,12 +4171,38 @@ int __init f2fs_init_post_read_processing(void)
> > > >         return -ENOMEM;
> > > >   }
> > > > +int __init f2fs_init_post_write_processing(void)
> > > > +{
> > > > +       bio_post_write_ctx_cache =
> > > > +               kmem_cache_create("f2fs_bio_post_write_ctx",
> > > > +                               sizeof(struct bio_post_write_ctx), 0, 0, NULL);
> > > > +       if (!bio_post_write_ctx_cache)
> > > > +               goto fail;
> > > > +       bio_post_write_ctx_pool =
> > > > +               mempool_create_slab_pool(NUM_PREALLOC_POST_READ_CTXS,
> > > > +                               bio_post_write_ctx_cache);
> > > > +       if (!bio_post_write_ctx_pool)
> > > > +               goto fail_free_cache;
> > > > +       return 0;
> > > > +
> > > > +fail_free_cache:
> > > > +       kmem_cache_destroy(bio_post_write_ctx_cache);
> > > > +fail:
> > > > +       return -ENOMEM;
> > > > +}
> > > > +
> > > >   void f2fs_destroy_post_read_processing(void)
> > > >   {
> > > >         mempool_destroy(bio_post_read_ctx_pool);
> > > >         kmem_cache_destroy(bio_post_read_ctx_cache);
> > > >   }
> > > > +void f2fs_destroy_post_write_processing(void)
> > > > +{
> > > > +       mempool_destroy(bio_post_write_ctx_pool);
> > > > +       kmem_cache_destroy(bio_post_write_ctx_cache);
> > > > +}
> > > > +
> > > >   int f2fs_init_post_read_wq(struct f2fs_sb_info *sbi)
> > > >   {
> > > >         if (!f2fs_sb_has_encrypt(sbi) &&
> > > > @@ -4146,12 +4216,26 @@ int f2fs_init_post_read_wq(struct f2fs_sb_info *sbi)
> > > >         return sbi->post_read_wq ? 0 : -ENOMEM;
> > > >   }
> > > > +int f2fs_init_post_write_wq(struct f2fs_sb_info *sbi)
> > > > +{
> > > > +       sbi->post_write_wq = alloc_workqueue("f2fs_post_write_wq",
> > > > +                                                WQ_UNBOUND | WQ_HIGHPRI,
> > > > +                                                num_online_cpus());
> > > > +       return sbi->post_write_wq ? 0 : -ENOMEM;
> > > > +}
> > > > +
> > > >   void f2fs_destroy_post_read_wq(struct f2fs_sb_info *sbi)
> > > >   {
> > > >         if (sbi->post_read_wq)
> > > >                 destroy_workqueue(sbi->post_read_wq);
> > > >   }
> > > > +void f2fs_destroy_post_write_wq(struct f2fs_sb_info *sbi)
> > > > +{
> > > > +       if (sbi->post_write_wq)
> > > > +               destroy_workqueue(sbi->post_write_wq);
> > > > +}
> > > > +
> > > >   int __init f2fs_init_bio_entry_cache(void)
> > > >   {
> > > >         bio_entry_slab = f2fs_kmem_cache_create("f2fs_bio_entry_slab",
> > > > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> > > > index 46be7560548c..fe3f81876b23 100644
> > > > --- a/fs/f2fs/f2fs.h
> > > > +++ b/fs/f2fs/f2fs.h
> > > > @@ -1812,6 +1812,7 @@ struct f2fs_sb_info {
> > > >         /* Precomputed FS UUID checksum for seeding other checksums */
> > > >         __u32 s_chksum_seed;
> > > > +       struct workqueue_struct *post_write_wq;
> > > >         struct workqueue_struct *post_read_wq;  /* post read workqueue */
> > > >         /*
> > > > @@ -4023,9 +4024,13 @@ bool f2fs_release_folio(struct folio *folio, gfp_t wait);
> > > >   bool f2fs_overwrite_io(struct inode *inode, loff_t pos, size_t len);
> > > >   void f2fs_clear_page_cache_dirty_tag(struct folio *folio);
> > > >   int f2fs_init_post_read_processing(void);
> > > > +int f2fs_init_post_write_processing(void);
> > > >   void f2fs_destroy_post_read_processing(void);
> > > > +void f2fs_destroy_post_write_processing(void);
> > > >   int f2fs_init_post_read_wq(struct f2fs_sb_info *sbi);
> > > > +int f2fs_init_post_write_wq(struct f2fs_sb_info *sbi);
> > > >   void f2fs_destroy_post_read_wq(struct f2fs_sb_info *sbi);
> > > > +void f2fs_destroy_post_write_wq(struct f2fs_sb_info *sbi);
> > > >   extern const struct iomap_ops f2fs_iomap_ops;
> > > >   /*
> > > > diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> > > > index 42faaed6a02d..8aa6a4fd52e8 100644
> > > > --- a/fs/f2fs/file.c
> > > > +++ b/fs/f2fs/file.c
> > > > @@ -5443,5 +5443,5 @@ const struct file_operations f2fs_file_operations = {
> > > >         .splice_read    = f2fs_file_splice_read,
> > > >         .splice_write   = iter_file_splice_write,
> > > >         .fadvise        = f2fs_file_fadvise,
> > > > -       .fop_flags      = FOP_BUFFER_RASYNC,
> > > > +       .fop_flags      = FOP_BUFFER_RASYNC | FOP_DONTCACHE,
> > > >   };
> > > > diff --git a/fs/f2fs/iostat.c b/fs/f2fs/iostat.c
> > > > index f8703038e1d8..b2e6ce80c68d 100644
> > > > --- a/fs/f2fs/iostat.c
> > > > +++ b/fs/f2fs/iostat.c
> > > > @@ -245,7 +245,7 @@ void iostat_update_and_unbind_ctx(struct bio *bio)
> > > >         if (op_is_write(bio_op(bio))) {
> > > >                 lat_type = bio->bi_opf & REQ_SYNC ?
> > > >                                 WRITE_SYNC_IO : WRITE_ASYNC_IO;
> > > > -               bio->bi_private = iostat_ctx->sbi;
> > > > +               bio->bi_private = iostat_ctx->post_write_ctx;
> > > >         } else {
> > > >                 lat_type = READ_IO;
> > > >                 bio->bi_private = iostat_ctx->post_read_ctx;
> > > > @@ -256,7 +256,8 @@ void iostat_update_and_unbind_ctx(struct bio *bio)
> > > >   }
> > > >   void iostat_alloc_and_bind_ctx(struct f2fs_sb_info *sbi,
> > > > -               struct bio *bio, struct bio_post_read_ctx *ctx)
> > > > +               struct bio *bio, struct bio_post_read_ctx *read_ctx,
> > > > +               struct bio_post_write_ctx *write_ctx)
> > > >   {
> > > >         struct bio_iostat_ctx *iostat_ctx;
> > > >         /* Due to the mempool, this never fails. */
> > > > @@ -264,7 +265,8 @@ void iostat_alloc_and_bind_ctx(struct f2fs_sb_info *sbi,
> > > >         iostat_ctx->sbi = sbi;
> > > >         iostat_ctx->submit_ts = 0;
> > > >         iostat_ctx->type = 0;
> > > > -       iostat_ctx->post_read_ctx = ctx;
> > > > +       iostat_ctx->post_read_ctx = read_ctx;
> > > > +       iostat_ctx->post_write_ctx = write_ctx;
> > > >         bio->bi_private = iostat_ctx;
> > > >   }
> > > > diff --git a/fs/f2fs/iostat.h b/fs/f2fs/iostat.h
> > > > index eb99d05cf272..a358909bb5e8 100644
> > > > --- a/fs/f2fs/iostat.h
> > > > +++ b/fs/f2fs/iostat.h
> > > > @@ -40,6 +40,7 @@ struct bio_iostat_ctx {
> > > >         unsigned long submit_ts;
> > > >         enum page_type type;
> > > >         struct bio_post_read_ctx *post_read_ctx;
> > > > +       struct bio_post_write_ctx *post_write_ctx;
> > > >   };
> > > >   static inline void iostat_update_submit_ctx(struct bio *bio,
> > > > @@ -60,7 +61,8 @@ static inline struct bio_post_read_ctx *get_post_read_ctx(struct bio *bio)
> > > >   extern void iostat_update_and_unbind_ctx(struct bio *bio);
> > > >   extern void iostat_alloc_and_bind_ctx(struct f2fs_sb_info *sbi,
> > > > -               struct bio *bio, struct bio_post_read_ctx *ctx);
> > > > +               struct bio *bio, struct bio_post_read_ctx *read_ctx,
> > > > +               struct bio_post_write_ctx *write_ctx);
> > > >   extern int f2fs_init_iostat_processing(void);
> > > >   extern void f2fs_destroy_iostat_processing(void);
> > > >   extern int f2fs_init_iostat(struct f2fs_sb_info *sbi);
> > > > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> > > > index cc82d42ef14c..8501008e42b2 100644
> > > > --- a/fs/f2fs/segment.c
> > > > +++ b/fs/f2fs/segment.c
> > > > @@ -3856,7 +3856,7 @@ int f2fs_allocate_data_block(struct f2fs_sb_info *sbi, struct folio *folio,
> > > >                 f2fs_inode_chksum_set(sbi, folio);
> > > >         }
> > > > -       if (fio) {
> > > > +       if (fio && !folio_test_dropbehind(folio)) {
> > > >                 struct f2fs_bio_info *io;
> > > >                 INIT_LIST_HEAD(&fio->list);
> > > > diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> > > > index e16c4e2830c2..110dfe073aee 100644
> > > > --- a/fs/f2fs/super.c
> > > > +++ b/fs/f2fs/super.c
> > > > @@ -1963,6 +1963,7 @@ static void f2fs_put_super(struct super_block *sb)
> > > >         flush_work(&sbi->s_error_work);
> > > >         f2fs_destroy_post_read_wq(sbi);
> > > > +       f2fs_destroy_post_write_wq(sbi);
> > > >         kvfree(sbi->ckpt);
> > > > @@ -4959,6 +4960,12 @@ static int f2fs_fill_super(struct super_block *sb, struct fs_context *fc)
> > > >                 goto free_devices;
> > > >         }
> > > > +       err = f2fs_init_post_write_wq(sbi);
> > > > +       if (err) {
> > > > +               f2fs_err(sbi, "Failed to initialize post write workqueue");
> > > > +               goto free_devices;
> > > > +       }
> > > > +
> > > >         sbi->total_valid_node_count =
> > > >                                 le32_to_cpu(sbi->ckpt->valid_node_count);
> > > >         percpu_counter_set(&sbi->total_valid_inode_count,
> > > > @@ -5240,6 +5247,7 @@ static int f2fs_fill_super(struct super_block *sb, struct fs_context *fc)
> > > >         /* flush s_error_work before sbi destroy */
> > > >         flush_work(&sbi->s_error_work);
> > > >         f2fs_destroy_post_read_wq(sbi);
> > > > +       f2fs_destroy_post_write_wq(sbi);
> > > >   free_devices:
> > > >         destroy_device_list(sbi);
> > > >         kvfree(sbi->ckpt);
> > > > @@ -5435,9 +5443,12 @@ static int __init init_f2fs_fs(void)
> > > >         err = f2fs_init_post_read_processing();
> > > >         if (err)
> > > >                 goto free_root_stats;
> > > > -       err = f2fs_init_iostat_processing();
> > > > +       err = f2fs_init_post_write_processing();
> > > >         if (err)
> > > >                 goto free_post_read;
> > > > +       err = f2fs_init_iostat_processing();
> > > > +       if (err)
> > > > +               goto free_post_write;
> > > >         err = f2fs_init_bio_entry_cache();
> > > >         if (err)
> > > >                 goto free_iostat;
> > > > @@ -5469,6 +5480,8 @@ static int __init init_f2fs_fs(void)
> > > >         f2fs_destroy_bio_entry_cache();
> > > >   free_iostat:
> > > >         f2fs_destroy_iostat_processing();
> > > > +free_post_write:
> > > > +       f2fs_destroy_post_write_processing();
> > > >   free_post_read:
> > > >         f2fs_destroy_post_read_processing();
> > > >   free_root_stats:
> > > > @@ -5504,6 +5517,7 @@ static void __exit exit_f2fs_fs(void)
> > > >         f2fs_destroy_bio_entry_cache();
> > > >         f2fs_destroy_iostat_processing();
> > > >         f2fs_destroy_post_read_processing();
> > > > +       f2fs_destroy_post_write_processing();
> > > >         f2fs_destroy_root_stats();
> > > >         f2fs_exit_shrinker();
> > > >         f2fs_exit_sysfs();
> > > > -- 
> > > > 2.50.0


_______________________________________________
Linux-f2fs-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
