From: Yongpeng Yang <[email protected]>

Under stress tests with frequent metadata operations, checkpoint write
time can become excessively long. Analysis shows that the slowdown is
caused by synchronous, one-by-one reads of NAT blocks during checkpoint
processing.

The issue can be reproduced with the following workload:
1. seq 1 650000 | xargs -P 16 -n 1 touch
2. sync # avoid checkpoint write during deleting
3. delete 1 file every 455 files
4. echo 3 > /proc/sys/vm/drop_caches
5. sync # trigger checkpoint write

This patch submits read I/O for all NAT blocks required in the
__flush_nat_entry_set() phase in advance, reducing the overhead of
synchronous waiting for individual NAT block reads.

The NAT block flush latency before and after the change is as below:

|             |NAT blocks accessed|NAT blocks read|Flush time (ms)|
|-------------|-------------------|---------------|---------------|
|Before change|1205               |1191           |158            |
|After change |1264               |1242           |11             |

With a similar number of NAT blocks accessed and read from disk, adding
NAT block readahead reduces the total NAT block flush time by more than
90%.

Signed-off-by: Yongpeng Yang <[email protected]>
---
 fs/f2fs/node.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 99e425e8c00a..fa1ddfd6633f 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -3164,7 +3164,7 @@ int f2fs_flush_nat_entries(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 	struct f2fs_journal *journal = curseg->journal;
 	struct nat_entry_set *setvec[NAT_VEC_SIZE];
 	struct nat_entry_set *set, *tmp;
-	unsigned int found;
+	unsigned int found, entry_count = 0;
 	nid_t set_idx = 0;
 	LIST_HEAD(sets);
 	int err = 0;
@@ -3204,6 +3204,17 @@ int f2fs_flush_nat_entries(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 						MAX_NAT_JENTRIES(journal));
 	}
 
+	/*
+	 * Readahead the current NAT block to prevent read requests from
+	 * being issued and waited on one by one.
+	 */
+	list_for_each_entry(set, &sets, set_list) {
+		entry_count += set->entry_cnt;
+		if (!enabled_nat_bits(sbi, cpc) &&
+			__has_cursum_space(journal, entry_count, NAT_JOURNAL))
+			continue;
+		f2fs_ra_meta_pages(sbi, set->set, 1, META_NAT, true);
+	}
 	/* flush dirty nats in nat entry set */
 	list_for_each_entry_safe(set, tmp, &sets, set_list) {
 		err = __flush_nat_entry_set(sbi, set, cpc);
-- 
2.43.0
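
The selection condition in the new loop can be illustrated with a small
userspace model. This is only a sketch, not kernel code: struct model_set,
pick_readahead() and free_slots are hypothetical names, free_slots stands
in for MAX_NAT_JENTRIES(journal), and NAT bits are assumed to be disabled,
so only the journal-space check decides whether a set is skipped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct nat_entry_set: one dirty NAT block. */
struct model_set {
	unsigned int block;     /* NAT block index (set->set in the patch) */
	unsigned int entry_cnt; /* number of dirty entries in that block */
};

/*
 * Model of the readahead selection loop, assuming NAT bits are disabled.
 * free_slots plays the role of MAX_NAT_JENTRIES(journal). The indexes of
 * the blocks chosen for readahead are stored in out[], which must have
 * room for n elements. Returns the number of blocks chosen.
 */
static size_t pick_readahead(const struct model_set *sets, size_t n,
			     unsigned int free_slots, unsigned int *out)
{
	unsigned int entry_count = 0;
	size_t picked = 0;

	assert(sets != NULL && out != NULL);

	for (size_t i = 0; i < n; i++) {
		entry_count += sets[i].entry_cnt;
		/* Sets that still fit in the journal need no NAT block read. */
		if (entry_count <= free_slots)
			continue;
		out[picked++] = sets[i].block;
	}
	return picked;
}
```

For example, with 3 free journal slots and sets holding 1, 2, 4 and 10
dirty entries (in that order), the running total is 1, 3, 7, 17: the first
two sets are skipped and the blocks of the last two are selected for
readahead.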
