From: Yongpeng Yang <[email protected]>

Under stress tests with frequent metadata operations, checkpoint write
time can become excessively long. Analysis shows that the slowdown is
caused by synchronous, one-by-one reads of NAT blocks during checkpoint
processing.

The issue can be reproduced with the following workload (a scripted
sketch follows the list):
1. seq 1 650000 | xargs -P 16 -n 1 touch
2. sync # avoid checkpoint writes while deleting
3. delete 1 file every 455 files
4. echo 3 > /proc/sys/vm/drop_caches
5. sync # trigger checkpoint write
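
For convenience, the steps above can be driven by a script such as the
sketch below. The mount point /mnt/f2fs is an assumption, and the rm
invocation is one reading of step 3: the files created by touch are
named 1..650000, so removing every 455th name deletes 1 file per 455.

  #!/bin/sh
  # Assumed f2fs mount point under test.
  cd /mnt/f2fs || exit 1

  # Step 1: create 650000 empty files with 16 parallel workers.
  seq 1 650000 | xargs -P 16 -n 1 touch

  # Step 2: flush now so no checkpoint is written while deleting.
  sync

  # Step 3: delete 1 file every 455 files so that dirty NAT entries
  # are spread across many NAT blocks.
  seq 1 455 650000 | xargs rm -f

  # Step 4: drop caches so NAT blocks must be re-read from disk.
  echo 3 > /proc/sys/vm/drop_caches

  # Step 5: trigger the checkpoint write being measured.
  sync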

This patch submits read I/O in advance for all NAT blocks required by
the __flush_nat_entry_set() phase, so the flush no longer has to wait
synchronously for each individual NAT block read.

The NAT block flush latency before and after the change is shown
below:

|             |NAT blocks accessed|NAT blocks read|Flush time (ms)|
|-------------|-------------------|---------------|---------------|
|Before change|1205               |1191           |158            |
|After change |1264               |1242           |11             |

With a similar number of NAT blocks accessed and read from disk, adding
NAT block readahead reduces the total NAT block flush time by more than
90%.

Signed-off-by: Yongpeng Yang <[email protected]>
---
 fs/f2fs/node.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 99e425e8c00a..fa1ddfd6633f 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -3164,7 +3164,7 @@ int f2fs_flush_nat_entries(struct f2fs_sb_info *sbi, struct cp_control *cpc)
        struct f2fs_journal *journal = curseg->journal;
        struct nat_entry_set *setvec[NAT_VEC_SIZE];
        struct nat_entry_set *set, *tmp;
-       unsigned int found;
+       unsigned int found, entry_count = 0;
        nid_t set_idx = 0;
        LIST_HEAD(sets);
        int err = 0;
@@ -3204,6 +3204,17 @@ int f2fs_flush_nat_entries(struct f2fs_sb_info *sbi, struct cp_control *cpc)
                                                MAX_NAT_JENTRIES(journal));
        }
 
+       /*
+        * Readahead the NAT block backing each dirty set so that the
+        * read requests are not issued and waited on one by one.
+        */
+       list_for_each_entry(set, &sets, set_list) {
+               entry_count += set->entry_cnt;
+               if (!enabled_nat_bits(sbi, cpc) &&
+                       __has_cursum_space(journal, entry_count, NAT_JOURNAL))
+                       continue;
+               f2fs_ra_meta_pages(sbi, set->set, 1, META_NAT, true);
+       }
        /* flush dirty nats in nat entry set */
        list_for_each_entry_safe(set, tmp, &sets, set_list) {
                err = __flush_nat_entry_set(sbi, set, cpc);
-- 
2.43.0