[f2fs-dev] [PATCH v5 3/6] f2fs: compress: fix to check unreleased compressed cluster
From: Sheng Yong Compressed cluster may not be released due to we can fail in release_compress_blocks(), fix to handle reserved compressed cluster correctly in reserve_compress_blocks(). Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Sheng Yong Signed-off-by: Chao Yu --- v5: - fix wrong condition pointed out by Daeho. fs/f2fs/file.c | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 026d05a7edd8..4aef2310395f 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -3624,7 +3624,13 @@ static int reserve_compress_blocks(struct dnode_of_data *dn, pgoff_t count) goto next; } - if (__is_valid_data_blkaddr(blkaddr)) { + /* +* compressed cluster was not released due to it +* fails in release_compress_blocks(), so NEW_ADDR +* is a possible case. +*/ + if (blkaddr == NEW_ADDR || + __is_valid_data_blkaddr(blkaddr)) { compr_blocks++; continue; } @@ -3633,6 +3639,11 @@ static int reserve_compress_blocks(struct dnode_of_data *dn, pgoff_t count) } reserved = cluster_size - compr_blocks; + + /* for the case all blocks in cluster were reserved */ + if (reserved == 1) + goto next; + ret = inc_valid_block_count(sbi, dn->inode, &reserved); if (ret) return ret; -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH v5 5/6] f2fs: fix to remove unnecessary f2fs_bug_on() to avoid panic
verify_blkaddr() will trigger panic once we inject fault into f2fs_is_valid_blkaddr(), fix to remove this unnecessary f2fs_bug_on(). Fixes: 18792e64c86d ("f2fs: support fault injection for f2fs_is_valid_blkaddr()") Reviewed-by: Daeho Jeong Signed-off-by: Chao Yu --- fs/f2fs/f2fs.h | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 69e71460a950..ab710bb6d8b3 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -3470,11 +3470,9 @@ bool f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi, static inline void verify_blkaddr(struct f2fs_sb_info *sbi, block_t blkaddr, int type) { - if (!f2fs_is_valid_blkaddr(sbi, blkaddr, type)) { + if (!f2fs_is_valid_blkaddr(sbi, blkaddr, type)) f2fs_err(sbi, "invalid blkaddr: %u, type: %d, run fsck to fix.", blkaddr, type); - f2fs_bug_on(sbi, 1); - } } static inline bool __is_valid_data_blkaddr(block_t blkaddr) -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH v5 6/6] f2fs: introduce FAULT_BLKADDR_CONSISTENCE
We will encounter below inconsistent status when FAULT_BLKADDR type fault injection is on. Info: checkpoint state = d6 : nat_bits crc fsck compacted_summary orphan_inodes sudden-power-off [ASSERT] (fsck_chk_inode_blk:1254) --> ino: 0x1c100 has i_blocks: 00c0, but has 191 blocks [FIX] (fsck_chk_inode_blk:1260) --> [0x1c100] i_blocks=0x00c0 -> 0xbf [FIX] (fsck_chk_inode_blk:1269) --> [0x1c100] i_compr_blocks=0x0026 -> 0x27 [ASSERT] (fsck_chk_inode_blk:1254) --> ino: 0x1cadb has i_blocks: 002f, but has 46 blocks [FIX] (fsck_chk_inode_blk:1260) --> [0x1cadb] i_blocks=0x002f -> 0x2e [FIX] (fsck_chk_inode_blk:1269) --> [0x1cadb] i_compr_blocks=0x0011 -> 0x12 [ASSERT] (fsck_chk_inode_blk:1254) --> ino: 0x1c62c has i_blocks: 0002, but has 1 blocks [FIX] (fsck_chk_inode_blk:1260) --> [0x1c62c] i_blocks=0x0002 -> 0x1 After we inject fault into f2fs_is_valid_blkaddr() during truncation, a) it missed to increase @nr_free or @valid_blocks b) it can cause in blkaddr leak in truncated dnode Which may cause inconsistent status. This patch separates FAULT_BLKADDR_CONSISTENCE from FAULT_BLKADDR, and rename FAULT_BLKADDR to FAULT_BLKADDR_VALIDITY so that we can: a) use FAULT_BLKADDR_CONSISTENCE in f2fs_truncate_data_blocks_range() to simulate inconsistent issue independently, then it can verify fsck repair flow. b) FAULT_BLKADDR_VALIDITY fault will not cause any inconsistent status, we can just use it to check error path handling in kernel side. Reviewed-by: Daeho Jeong Signed-off-by: Chao Yu --- Documentation/ABI/testing/sysfs-fs-f2fs | 47 + Documentation/filesystems/f2fs.rst | 47 + fs/f2fs/checkpoint.c| 19 +++--- fs/f2fs/f2fs.h | 5 ++- fs/f2fs/file.c | 8 +++-- fs/f2fs/super.c | 37 +-- 6 files changed, 92 insertions(+), 71 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs index 99fa87a43926..48c135e24eb5 100644 --- a/Documentation/ABI/testing/sysfs-fs-f2fs +++ b/Documentation/ABI/testing/sysfs-fs-f2fs @@ -701,29 +701,30 @@ Description: Support configuring fault injection type, should be enabled with fault_injection option, fault type value is shown below, it supports single or combined type. - === === - Type_NameType_Value - === === - FAULT_KMALLOC0x1 - FAULT_KVMALLOC 0x2 - FAULT_PAGE_ALLOC 0x4 - FAULT_PAGE_GET 0x8 - FAULT_ALLOC_BIO 0x00010 (obsolete) - FAULT_ALLOC_NID 0x00020 - FAULT_ORPHAN 0x00040 - FAULT_BLOCK 0x00080 - FAULT_DIR_DEPTH 0x00100 - FAULT_EVICT_INODE0x00200 - FAULT_TRUNCATE 0x00400 - FAULT_READ_IO0x00800 - FAULT_CHECKPOINT 0x01000 - FAULT_DISCARD0x02000 - FAULT_WRITE_IO 0x04000 - FAULT_SLAB_ALLOC 0x08000 - FAULT_DQUOT_INIT 0x1 - FAULT_LOCK_OP0x2 - FAULT_BLKADDR0x4 - === === + === === + Type_NameType_Value + === === + FAULT_KMALLOC0x1 + FAULT_KVMALLOC 0x2 + FAULT_PAGE_ALLOC 0x4 + FAULT_PAGE_GET 0x8 + FAULT_ALLOC_BIO 0x00010 (obsolete) + FAULT_ALLOC_NID 0x00020 + FAULT_ORPHAN 0x00040 + FAULT_BLOCK 0x00080 + FAULT_DIR_DEPTH 0x00100 + FAULT_EVICT_INODE0x00200 + FAULT_TRUNCATE 0x00400 + FAULT_READ_IO0x00800 + FAULT_CHECKPOINT 0x01000 + FAULT_DISCARD0x02000 + FAULT_WRITE_IO 0x04000 + FAULT_SLAB_ALLOC 0x08000 + FAULT_DQUOT_INIT 0x1 + FAULT_LOCK_OP0x2
[f2fs-dev] [PATCH v5 4/6] f2fs: compress: fix to avoid inconsistence bewteen i_blocks and dnode
In reserve_compress_blocks(), we update blkaddrs of dnode in prior to inc_valid_block_count(), it may cause inconsistent status bewteen i_blocks and blkaddrs once inc_valid_block_count() fails. To fix this issue, it needs to reverse their invoking order. Fixes: c75488fb4d82 ("f2fs: introduce F2FS_IOC_RESERVE_COMPRESS_BLOCKS") Reviewed-by: Daeho Jeong Signed-off-by: Chao Yu --- fs/f2fs/data.c| 5 +++-- fs/f2fs/f2fs.h| 7 ++- fs/f2fs/file.c| 26 ++ fs/f2fs/segment.c | 2 +- 4 files changed, 24 insertions(+), 16 deletions(-) diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index b171a9980f6a..8d2ace723310 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -1219,7 +1219,8 @@ int f2fs_reserve_new_blocks(struct dnode_of_data *dn, blkcnt_t count) if (unlikely(is_inode_flag_set(dn->inode, FI_NO_ALLOC))) return -EPERM; - if (unlikely((err = inc_valid_block_count(sbi, dn->inode, &count + err = inc_valid_block_count(sbi, dn->inode, &count, true); + if (unlikely(err)) return err; trace_f2fs_reserve_new_blocks(dn->inode, dn->nid, @@ -1476,7 +1477,7 @@ static int __allocate_data_block(struct dnode_of_data *dn, int seg_type) dn->data_blkaddr = f2fs_data_blkaddr(dn); if (dn->data_blkaddr == NULL_ADDR) { - err = inc_valid_block_count(sbi, dn->inode, &count); + err = inc_valid_block_count(sbi, dn->inode, &count, true); if (unlikely(err)) return err; } diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 50f3d546ded8..69e71460a950 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -2252,7 +2252,7 @@ static inline bool __allow_reserved_blocks(struct f2fs_sb_info *sbi, static inline void f2fs_i_blocks_write(struct inode *, block_t, bool, bool); static inline int inc_valid_block_count(struct f2fs_sb_info *sbi, -struct inode *inode, blkcnt_t *count) +struct inode *inode, blkcnt_t *count, bool partial) { blkcnt_t diff = 0, release = 0; block_t avail_user_block_count; @@ -2292,6 +2292,11 @@ static inline int inc_valid_block_count(struct f2fs_sb_info *sbi, avail_user_block_count = 0; } if (unlikely(sbi->total_valid_block_count > avail_user_block_count)) { + if (!partial) { + spin_unlock(&sbi->stat_lock); + goto enospc; + } + diff = sbi->total_valid_block_count - avail_user_block_count; if (diff > *count) diff = *count; diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 4aef2310395f..a02c530c7e4b 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -3614,14 +3614,16 @@ static int reserve_compress_blocks(struct dnode_of_data *dn, pgoff_t count) blkcnt_t reserved; int ret; - for (i = 0; i < cluster_size; i++, dn->ofs_in_node++) { - blkaddr = f2fs_data_blkaddr(dn); + for (i = 0; i < cluster_size; i++) { + blkaddr = data_blkaddr(dn->inode, dn->node_page, + dn->ofs_in_node + i); if (i == 0) { - if (blkaddr == COMPRESS_ADDR) - continue; - dn->ofs_in_node += cluster_size; - goto next; + if (blkaddr != COMPRESS_ADDR) { + dn->ofs_in_node += cluster_size; + goto next; + } + continue; } /* @@ -3634,8 +3636,6 @@ static int reserve_compress_blocks(struct dnode_of_data *dn, pgoff_t count) compr_blocks++; continue; } - - f2fs_set_data_blkaddr(dn, NEW_ADDR); } reserved = cluster_size - compr_blocks; @@ -3644,12 +3644,14 @@ static int reserve_compress_blocks(struct dnode_of_data *dn, pgoff_t count) if (reserved == 1) goto next; - ret = inc_valid_block_count(sbi, dn->inode, &reserved); - if (ret) + ret = inc_valid_block_count(sbi, dn->inode, &reserved, false); + if (unlikely(ret)) return ret; - if (reserved != cluster_size - compr_blocks) - return -ENOSPC; + for (i = 0; i < cluster_size; i++, dn->ofs_in_node++) { +
Re: [f2fs-dev] [PATCH v4 2/6] f2fs: compress: fix to cover normal cluster write with cp_rwsem
Thanks, let me resend v5 w/ blow cleanups. On 2024/1/13 9:39, Jaegeuk Kim wrote: Cleaned up a bit: --- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -1443,13 +1443,14 @@ void f2fs_compress_write_end_io(struct bio *bio, struct page *page) } static int f2fs_write_raw_pages(struct compress_ctx *cc, - int *submitted, + int *submitted_p, struct writeback_control *wbc, enum iostat_type io_type) { struct address_space *mapping = cc->inode->i_mapping; struct f2fs_sb_info *sbi = F2FS_M_SB(mapping); - int _submitted, compr_blocks, ret = 0, i; + int submitted, compr_blocks, i; + int ret = 0; compr_blocks = f2fs_compressed_blocks(cc); @@ -1492,7 +1493,7 @@ static int f2fs_write_raw_pages(struct compress_ctx *cc, if (!clear_page_dirty_for_io(cc->rpages[i])) goto continue_unlock; - ret = f2fs_write_single_data_page(cc->rpages[i], &_submitted, + ret = f2fs_write_single_data_page(cc->rpages[i], &submitted, NULL, NULL, wbc, io_type, compr_blocks, false); if (ret) { @@ -1514,7 +1515,7 @@ static int f2fs_write_raw_pages(struct compress_ctx *cc, goto out; } - *submitted += _submitted; + *submitted_p += submitted; } out: On 01/11, Chao Yu wrote: When we overwrite compressed cluster w/ normal cluster, we should not unlock cp_rwsem during f2fs_write_raw_pages(), otherwise data will be corrupted if partial blocks were persisted before CP & SPOR, due to cluster metadata wasn't updated atomically. Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu --- fs/f2fs/compress.c | 20 ++-- fs/f2fs/data.c | 3 ++- 2 files changed, 16 insertions(+), 7 deletions(-) diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c index 9940b7886e5d..bf4cfab67aec 100644 --- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -1448,7 +1448,8 @@ static int f2fs_write_raw_pages(struct compress_ctx *cc, enum iostat_type io_type) { struct address_space *mapping = cc->inode->i_mapping; - int _submitted, compr_blocks, ret, i; + struct f2fs_sb_info *sbi = F2FS_M_SB(mapping); + int _submitted, compr_blocks, ret = 0, i; compr_blocks = f2fs_compressed_blocks(cc); @@ -1463,6 +1464,10 @@ static int f2fs_write_raw_pages(struct compress_ctx *cc, if (compr_blocks < 0) return compr_blocks; + /* overwrite compressed cluster w/ normal cluster */ + if (compr_blocks > 0) + f2fs_lock_op(sbi); + for (i = 0; i < cc->cluster_size; i++) { if (!cc->rpages[i]) continue; @@ -1495,26 +1500,29 @@ static int f2fs_write_raw_pages(struct compress_ctx *cc, unlock_page(cc->rpages[i]); ret = 0; } else if (ret == -EAGAIN) { + ret = 0; /* * for quota file, just redirty left pages to * avoid deadlock caused by cluster update race * from foreground operation. */ if (IS_NOQUOTA(cc->inode)) - return 0; - ret = 0; + goto out; f2fs_io_schedule_timeout(DEFAULT_IO_TIMEOUT); goto retry_write; } - return ret; + goto out; } *submitted += _submitted; } - f2fs_balance_fs(F2FS_M_SB(mapping), true); +out: + if (compr_blocks > 0) + f2fs_unlock_op(sbi); - return 0; + f2fs_balance_fs(sbi, true); + return ret; } int f2fs_write_multi_pages(struct compress_ctx *cc, diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index 81f9e2cc49e2..b171a9980f6a 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -2839,7 +2839,7 @@ int f2fs_write_single_data_page(struct page *page, int *submitted, .encrypted_page = NULL, .submitted = 0, .compr_blocks = compr_blocks, - .need_lock = LOCK_RETRY, + .need_lock = compr_blocks ? LOCK_DONE : LOCK_RETRY, .post_read = f2fs_post_read_required(inode) ? 1 : 0, .io_type = io_type,
Re: [f2fs-dev] [PATCH] f2fs: remove unnecessary f2fs_put_page in f2fs_rename
On 2024/1/13 1:16, Jaegeuk Kim wrote: [1] changed the below condition, which made f2fs_put_page() voided. This patch reapplies the AL's resolution in -next from [2]. - if (S_ISDIR(old_inode->i_mode)) { + if (old_is_dir && old_dir != new_dir) { old_dir_entry = f2fs_parent_dir(old_inode, &old_dir_page); if (!old_dir_entry) { if (IS_ERR(old_dir_page)) [1] 7deee77b993a ("f2fs: Avoid reading renamed directory if parent does not change") [2] https://lore.kernel.org/all/20231220013402.GW1674809@ZenIV/ Suggested-by: Al Viro Signed-off-by: Jaegeuk Kim Reviewed-by: Chao Yu Thanks, ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH] f2fs: fix double free of f2fs_sb_info
On 2024/1/13 8:57, Eric Biggers wrote: From: Eric Biggers kill_f2fs_super() is called even if f2fs_fill_super() fails. f2fs_fill_super() frees the struct f2fs_sb_info, so it must set sb->s_fs_info to NULL to prevent it from being freed again. Oh, I missed that case as well during reviewing, my bad. Fixes: 275dca4630c1 ("f2fs: move release of block devices to after kill_block_super()") Reported-by: syzbot+8f477ac014ff5b32d...@syzkaller.appspotmail.com Closes: https://lore.kernel.org/r/6cb174060ec34...@google.com Signed-off-by: Eric Biggers Reviewed-by: Chao Yu Thanks, --- fs/f2fs/super.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c index d00d21a8b53ad..d45ab0992ae59 100644 --- a/fs/f2fs/super.c +++ b/fs/f2fs/super.c @@ -4873,20 +4873,21 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent) kfree(F2FS_OPTION(sbi).s_qf_names[i]); #endif fscrypt_free_dummy_policy(&F2FS_OPTION(sbi).dummy_enc_policy); kvfree(options); free_sb_buf: kfree(raw_super); free_sbi: if (sbi->s_chksum_driver) crypto_free_shash(sbi->s_chksum_driver); kfree(sbi); + sb->s_fs_info = NULL; /* give only one another chance */ if (retry_cnt > 0 && skip_recovery) { retry_cnt--; shrink_dcache_sb(sb); goto try_onemore; } return err; } base-commit: 38814330fedd778edffcabe0c8cb462ee365782e
Re: [f2fs-dev] [PATCH] f2fs: check free sections before disable checkpoint
On 2023/12/29 11:25, Wu Bo wrote: 'f2fs_is_checkpoint_ready()' checks free sections. If there is not enough free sections, most f2fs operations will return -ENOSPC when checkpoint is disabled. It would be better to check free sections before disable checkpoint. Signed-off-by: Wu Bo Reviewed-by: Chao Yu Thanks, ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH v3 3/6] f2fs: compress: fix to check unreleased compressed cluster
On 2024/1/12 1:15, Daeho Jeong wrote: On Wed, Jan 10, 2024 at 5:33 PM Chao Yu wrote: On 2024/1/11 9:18, Daeho Jeong wrote: On Thu, Dec 28, 2023 at 6:33 AM Chao Yu wrote: From: Sheng Yong Compressed cluster may not be released due to we can fail in release_compress_blocks(), fix to handle reserved compressed cluster correctly in reserve_compress_blocks(). Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Sheng Yong Signed-off-by: Chao Yu --- fs/f2fs/file.c | 12 1 file changed, 12 insertions(+) diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 026d05a7edd8..782ae3be48f6 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -3624,6 +3624,15 @@ static int reserve_compress_blocks(struct dnode_of_data *dn, pgoff_t count) goto next; } + /* +* compressed cluster was not released due to +* it fails in release_compress_blocks(). +*/ + if (blkaddr == NEW_ADDR) { + compr_blocks++; + continue; + } + if (__is_valid_data_blkaddr(blkaddr)) { compr_blocks++; continue; How about merging two conditions like "blkaddr == NEW_ADDR || __is_valid_data_blkaddr(blkaddr)"? Oh, sure. @@ -3633,6 +3642,9 @@ static int reserve_compress_blocks(struct dnode_of_data *dn, pgoff_t count) } reserved = cluster_size - compr_blocks; + if (!reserved) + goto next; + How can the reserved variable be zero? I guess it can happen if a cluster was not released during release_compress_blocks(), then all blocks in the cluster should has been reserved, so, in this round of reserving, it needs to skip reserve blocks, right? Let's assume cluster_size is 4. How can compr_blocks be 4? if (i == 0) { if (blkaddr == COMPRESS_ADDR) continue; dn->ofs_in_node += cluster_size; goto next; } We skip the block having COMPRESS_ADDR when counting compr_blocks. So, the maximum value of compr_blocks should be 3, right? Ah, got it, and I think you're right. Should fix the condition as below? /* for the case all blocks in cluster were reserved */ if (reserved == 1) goto next; Thanks, Thanks, ret = inc_valid_block_count(sbi, dn->inode, &reserved); if (ret) return ret; -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH v3] f2fs: reduce expensive checkpoint trigger frequency
We may trigger high frequent checkpoint for below case: 1. mkdir /mnt/dir1; set dir1 encrypted 2. touch /mnt/file1; fsync /mnt/file1 3. mkdir /mnt/dir2; set dir2 encrypted 4. touch /mnt/file2; fsync /mnt/file2 ... Although, newly created dir and file are not related, due to commit bbf156f7afa7 ("f2fs: fix lost xattrs of directories"), we will trigger checkpoint whenever fsync() comes after a new encrypted dir created. In order to avoid such condition, let's record an entry including directory's ino into global cache when we initialize encryption policy in a checkpointed directory, and then only trigger checkpoint() when target file's parent has non-persisted encryption policy, for the case its parent is not checkpointed, need_do_checkpoint() has cover that by verifying it with f2fs_is_checkpointed_node(). Reported-by: Zhiguo Niu Tested-by: Zhiguo Niu Reported-by: Yunlei He Signed-off-by: Chao Yu --- v3: - Recently, Zhiguo Niu reported the same issue, so I repost this patch for comments. fs/f2fs/f2fs.h | 2 ++ fs/f2fs/file.c | 3 +++ fs/f2fs/xattr.c | 16 ++-- include/trace/events/f2fs.h | 3 ++- 4 files changed, 21 insertions(+), 3 deletions(-) diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index e2e0ca45f881..0094a8c85f4a 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -279,6 +279,7 @@ enum { APPEND_INO, /* for append ino list */ UPDATE_INO, /* for update ino list */ TRANS_DIR_INO, /* for transactions dir ino list */ + ENC_DIR_INO,/* for encrypted dir ino list */ FLUSH_INO, /* for multiple device flushing */ MAX_INO_ENTRY, /* max. list */ }; @@ -1147,6 +1148,7 @@ enum cp_reason_type { CP_FASTBOOT_MODE, CP_SPEC_LOG_NUM, CP_RECOVER_DIR, + CP_ENC_DIR, }; enum iostat_type { diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 8198afb5fb9c..18b33b1f0c83 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -218,6 +218,9 @@ static inline enum cp_reason_type need_do_checkpoint(struct inode *inode) f2fs_exist_written_data(sbi, F2FS_I(inode)->i_pino, TRANS_DIR_INO)) cp_reason = CP_RECOVER_DIR; + else if (f2fs_exist_written_data(sbi, F2FS_I(inode)->i_pino, + ENC_DIR_INO)) + cp_reason = CP_ENC_DIR; return cp_reason; } diff --git a/fs/f2fs/xattr.c b/fs/f2fs/xattr.c index f290fe9327c4..cbd1b88297fe 100644 --- a/fs/f2fs/xattr.c +++ b/fs/f2fs/xattr.c @@ -629,6 +629,7 @@ static int __f2fs_setxattr(struct inode *inode, int index, const char *name, const void *value, size_t size, struct page *ipage, int flags) { + struct f2fs_sb_info *sbi = F2FS_I_SB(inode); struct f2fs_xattr_entry *here, *last; void *base_addr, *last_base_addr; int found, newsize; @@ -772,8 +773,19 @@ static int __f2fs_setxattr(struct inode *inode, int index, if (index == F2FS_XATTR_INDEX_ENCRYPTION && !strcmp(name, F2FS_XATTR_NAME_ENCRYPTION_CONTEXT)) f2fs_set_encrypted_inode(inode); - if (S_ISDIR(inode->i_mode)) - set_sbi_flag(F2FS_I_SB(inode), SBI_NEED_CP); + + if (S_ISDIR(inode->i_mode)) { + /* +* In restrict mode, fsync() always tries triggering checkpoint +* for all metadata consistency, in other mode, it only triggers +* checkpoint when parent's encryption metadata updates. +*/ + if (F2FS_OPTION(sbi).fsync_mode == FSYNC_MODE_STRICT) + set_sbi_flag(F2FS_I_SB(inode), SBI_NEED_CP); + else if (IS_ENCRYPTED(inode) && + f2fs_is_checkpointed_node(sbi, inode->i_ino)) + f2fs_add_ino_entry(sbi, inode->i_ino, ENC_DIR_INO); + } same: if (is_inode_flag_set(inode, FI_ACL_MODE)) { diff --git a/include/trace/events/f2fs.h b/include/trace/events/f2fs.h index 7ed0fc430dc6..48f2e399e184 100644 --- a/include/trace/events/f2fs.h +++ b/include/trace/events/f2fs.h @@ -139,7 +139,8 @@ TRACE_DEFINE_ENUM(EX_BLOCK_AGE); { CP_NODE_NEED_CP, "node needs cp" }, \ { CP_FASTBOOT_MODE, "fastboot mode" }, \ { CP_SPEC_LOG_NUM, "log type is 2" }, \ - { CP_RECOVER_DIR, "dir needs recovery" }) + { CP_RECOVER_DIR, "dir needs recovery" }, \ + { CP_ENC_DIR, "persist encryption policy" }) #define show_shutdown_mode(type) \ __print_symbol
[f2fs-dev] [PATCH] f2fs: compress: fix to cover f2fs_disable_compressed_file() w/ i_sem
- f2fs_disable_compressed_file - check inode_has_data - f2fs_file_mmap - mkwrite - f2fs_get_block_locked : update metadata in compressed inode's disk layout - fi->i_flags &= ~F2FS_COMPR_FL - clear_inode_flag(inode, FI_COMPRESSED_FILE); we should use i_sem lock to prevent above race case. Meanwhile, this patch adds i_size check to restrict compress inode conversion condition. Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu --- fs/f2fs/f2fs.h | 18 -- fs/f2fs/file.c | 5 ++--- 2 files changed, 18 insertions(+), 5 deletions(-) diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 74729db0b381..e2e0ca45f881 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -4406,19 +4406,33 @@ static inline int set_compress_context(struct inode *inode) #endif } +static inline bool inode_has_data(struct inode *inode) +{ + return (S_ISREG(inode->i_mode) && + (F2FS_HAS_BLOCKS(inode) || i_size_read(inode))); +} + static inline bool f2fs_disable_compressed_file(struct inode *inode) { struct f2fs_inode_info *fi = F2FS_I(inode); - if (!f2fs_compressed_file(inode)) + f2fs_down_write(&F2FS_I(inode)->i_sem); + + if (!f2fs_compressed_file(inode)) { + f2fs_up_write(&F2FS_I(inode)->i_sem); return true; - if (S_ISREG(inode->i_mode) && F2FS_HAS_BLOCKS(inode)) + } + if (f2fs_is_mmap_file(inode) || inode_has_data(inode)) { + f2fs_up_write(&F2FS_I(inode)->i_sem); return false; + } fi->i_flags &= ~F2FS_COMPR_FL; stat_dec_compr_inode(inode); clear_inode_flag(inode, FI_COMPRESSED_FILE); f2fs_mark_inode_dirty_sync(inode, true); + + f2fs_up_write(&F2FS_I(inode)->i_sem); return true; } diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 0e4c871d6aed..5e5df234eb92 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -1926,8 +1926,7 @@ static int f2fs_setflags_common(struct inode *inode, u32 iflags, u32 mask) f2fs_down_write(&F2FS_I(inode)->i_sem); if (!f2fs_may_compress(inode) || - (S_ISREG(inode->i_mode) && - F2FS_HAS_BLOCKS(inode))) { + inode_has_data(inode)) { f2fs_up_write(&F2FS_I(inode)->i_sem); return -EINVAL; } @@ -4011,7 +4010,7 @@ static int f2fs_ioc_set_compress_option(struct file *filp, unsigned long arg) goto out; } - if (F2FS_HAS_BLOCKS(inode)) { + if (inode_has_data(inode)) { ret = -EFBIG; goto out; } -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH v4 2/6] f2fs: compress: fix to cover normal cluster write with cp_rwsem
When we overwrite compressed cluster w/ normal cluster, we should not unlock cp_rwsem during f2fs_write_raw_pages(), otherwise data will be corrupted if partial blocks were persisted before CP & SPOR, due to cluster metadata wasn't updated atomically. Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu --- fs/f2fs/compress.c | 20 ++-- fs/f2fs/data.c | 3 ++- 2 files changed, 16 insertions(+), 7 deletions(-) diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c index 9940b7886e5d..bf4cfab67aec 100644 --- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -1448,7 +1448,8 @@ static int f2fs_write_raw_pages(struct compress_ctx *cc, enum iostat_type io_type) { struct address_space *mapping = cc->inode->i_mapping; - int _submitted, compr_blocks, ret, i; + struct f2fs_sb_info *sbi = F2FS_M_SB(mapping); + int _submitted, compr_blocks, ret = 0, i; compr_blocks = f2fs_compressed_blocks(cc); @@ -1463,6 +1464,10 @@ static int f2fs_write_raw_pages(struct compress_ctx *cc, if (compr_blocks < 0) return compr_blocks; + /* overwrite compressed cluster w/ normal cluster */ + if (compr_blocks > 0) + f2fs_lock_op(sbi); + for (i = 0; i < cc->cluster_size; i++) { if (!cc->rpages[i]) continue; @@ -1495,26 +1500,29 @@ static int f2fs_write_raw_pages(struct compress_ctx *cc, unlock_page(cc->rpages[i]); ret = 0; } else if (ret == -EAGAIN) { + ret = 0; /* * for quota file, just redirty left pages to * avoid deadlock caused by cluster update race * from foreground operation. */ if (IS_NOQUOTA(cc->inode)) - return 0; - ret = 0; + goto out; f2fs_io_schedule_timeout(DEFAULT_IO_TIMEOUT); goto retry_write; } - return ret; + goto out; } *submitted += _submitted; } - f2fs_balance_fs(F2FS_M_SB(mapping), true); +out: + if (compr_blocks > 0) + f2fs_unlock_op(sbi); - return 0; + f2fs_balance_fs(sbi, true); + return ret; } int f2fs_write_multi_pages(struct compress_ctx *cc, diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index 81f9e2cc49e2..b171a9980f6a 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -2839,7 +2839,7 @@ int f2fs_write_single_data_page(struct page *page, int *submitted, .encrypted_page = NULL, .submitted = 0, .compr_blocks = compr_blocks, - .need_lock = LOCK_RETRY, + .need_lock = compr_blocks ? LOCK_DONE : LOCK_RETRY, .post_read = f2fs_post_read_required(inode) ? 1 : 0, .io_type = io_type, .io_wbc = wbc, @@ -2920,6 +2920,7 @@ int f2fs_write_single_data_page(struct page *page, int *submitted, if (err == -EAGAIN) { err = f2fs_do_write_data_page(&fio); if (err == -EAGAIN) { + f2fs_bug_on(sbi, compr_blocks); fio.need_lock = LOCK_REQ; err = f2fs_do_write_data_page(&fio); } -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH v4 5/6] f2fs: fix to remove unnecessary f2fs_bug_on() to avoid panic
verify_blkaddr() will trigger panic once we inject fault into f2fs_is_valid_blkaddr(), fix to remove this unnecessary f2fs_bug_on(). Fixes: 18792e64c86d ("f2fs: support fault injection for f2fs_is_valid_blkaddr()") Signed-off-by: Chao Yu --- fs/f2fs/f2fs.h | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 69e71460a950..ab710bb6d8b3 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -3470,11 +3470,9 @@ bool f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi, static inline void verify_blkaddr(struct f2fs_sb_info *sbi, block_t blkaddr, int type) { - if (!f2fs_is_valid_blkaddr(sbi, blkaddr, type)) { + if (!f2fs_is_valid_blkaddr(sbi, blkaddr, type)) f2fs_err(sbi, "invalid blkaddr: %u, type: %d, run fsck to fix.", blkaddr, type); - f2fs_bug_on(sbi, 1); - } } static inline bool __is_valid_data_blkaddr(block_t blkaddr) -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH v4 6/6] f2fs: introduce FAULT_BLKADDR_CONSISTENCE
We will encounter below inconsistent status when FAULT_BLKADDR type fault injection is on. Info: checkpoint state = d6 : nat_bits crc fsck compacted_summary orphan_inodes sudden-power-off [ASSERT] (fsck_chk_inode_blk:1254) --> ino: 0x1c100 has i_blocks: 00c0, but has 191 blocks [FIX] (fsck_chk_inode_blk:1260) --> [0x1c100] i_blocks=0x00c0 -> 0xbf [FIX] (fsck_chk_inode_blk:1269) --> [0x1c100] i_compr_blocks=0x0026 -> 0x27 [ASSERT] (fsck_chk_inode_blk:1254) --> ino: 0x1cadb has i_blocks: 002f, but has 46 blocks [FIX] (fsck_chk_inode_blk:1260) --> [0x1cadb] i_blocks=0x002f -> 0x2e [FIX] (fsck_chk_inode_blk:1269) --> [0x1cadb] i_compr_blocks=0x0011 -> 0x12 [ASSERT] (fsck_chk_inode_blk:1254) --> ino: 0x1c62c has i_blocks: 0002, but has 1 blocks [FIX] (fsck_chk_inode_blk:1260) --> [0x1c62c] i_blocks=0x0002 -> 0x1 After we inject fault into f2fs_is_valid_blkaddr() during truncation, a) it missed to increase @nr_free or @valid_blocks b) it can cause in blkaddr leak in truncated dnode Which may cause inconsistent status. This patch separates FAULT_BLKADDR_CONSISTENCE from FAULT_BLKADDR, and rename FAULT_BLKADDR to FAULT_BLKADDR_VALIDITY so that we can: a) use FAULT_BLKADDR_CONSISTENCE in f2fs_truncate_data_blocks_range() to simulate inconsistent issue independently, then it can verify fsck repair flow. b) FAULT_BLKADDR_VALIDITY fault will not cause any inconsistent status, we can just use it to check error path handling in kernel side. Signed-off-by: Chao Yu --- v4: - rename macro to FAULT_BLKADDR_CONSISTENCE and FAULT_BLKADDR_VALIDITY suggested by Jaegeuk. Documentation/ABI/testing/sysfs-fs-f2fs | 47 + Documentation/filesystems/f2fs.rst | 47 + fs/f2fs/checkpoint.c| 19 +++--- fs/f2fs/f2fs.h | 5 ++- fs/f2fs/file.c | 8 +++-- fs/f2fs/super.c | 37 +-- 6 files changed, 92 insertions(+), 71 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs index 99fa87a43926..48c135e24eb5 100644 --- a/Documentation/ABI/testing/sysfs-fs-f2fs +++ b/Documentation/ABI/testing/sysfs-fs-f2fs @@ -701,29 +701,30 @@ Description: Support configuring fault injection type, should be enabled with fault_injection option, fault type value is shown below, it supports single or combined type. - === === - Type_NameType_Value - === === - FAULT_KMALLOC0x1 - FAULT_KVMALLOC 0x2 - FAULT_PAGE_ALLOC 0x4 - FAULT_PAGE_GET 0x8 - FAULT_ALLOC_BIO 0x00010 (obsolete) - FAULT_ALLOC_NID 0x00020 - FAULT_ORPHAN 0x00040 - FAULT_BLOCK 0x00080 - FAULT_DIR_DEPTH 0x00100 - FAULT_EVICT_INODE0x00200 - FAULT_TRUNCATE 0x00400 - FAULT_READ_IO0x00800 - FAULT_CHECKPOINT 0x01000 - FAULT_DISCARD0x02000 - FAULT_WRITE_IO 0x04000 - FAULT_SLAB_ALLOC 0x08000 - FAULT_DQUOT_INIT 0x1 - FAULT_LOCK_OP0x2 - FAULT_BLKADDR0x4 - === === + === === + Type_NameType_Value + === === + FAULT_KMALLOC0x1 + FAULT_KVMALLOC 0x2 + FAULT_PAGE_ALLOC 0x4 + FAULT_PAGE_GET 0x8 + FAULT_ALLOC_BIO 0x00010 (obsolete) + FAULT_ALLOC_NID 0x00020 + FAULT_ORPHAN 0x00040 + FAULT_BLOCK 0x00080 + FAULT_DIR_DEPTH 0x00100 + FAULT_EVICT_INODE0x00200 + FAULT_TRUNCATE 0x00400 + FAULT_READ_IO0x00800 + FAULT_CHECKPOINT 0x01000 + FAULT_DISCARD0x02000 + FAULT_WRITE_IO 0x04000 + FAULT_SLAB_ALLOC 0x08000 + FAULT_DQUOT_INIT 0x
[f2fs-dev] [PATCH v4 1/6] f2fs: compress: fix to guarantee persisting compressed blocks by CP
If data block in compressed cluster is not persisted with metadata during checkpoint, after SPOR, the data may be corrupted, let's guarantee to write compressed page by checkpoint. Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu --- fs/f2fs/compress.c | 4 +++- fs/f2fs/data.c | 17 + fs/f2fs/f2fs.h | 4 +++- 3 files changed, 15 insertions(+), 10 deletions(-) diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c index c5a4364c4482..9940b7886e5d 100644 --- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -1418,6 +1418,8 @@ void f2fs_compress_write_end_io(struct bio *bio, struct page *page) struct f2fs_sb_info *sbi = bio->bi_private; struct compress_io_ctx *cic = (struct compress_io_ctx *)page_private(page); + enum count_type type = WB_DATA_TYPE(page, + f2fs_is_compressed_page(page)); int i; if (unlikely(bio->bi_status)) @@ -1425,7 +1427,7 @@ void f2fs_compress_write_end_io(struct bio *bio, struct page *page) f2fs_compress_free_page(page); - dec_page_count(sbi, F2FS_WB_DATA); + dec_page_count(sbi, type); if (atomic_dec_return(&cic->pending_pages)) return; diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index dce8defdf4c7..81f9e2cc49e2 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -48,7 +48,7 @@ void f2fs_destroy_bioset(void) bioset_exit(&f2fs_bioset); } -static bool __is_cp_guaranteed(struct page *page) +bool f2fs_is_cp_guaranteed(struct page *page) { struct address_space *mapping = page->mapping; struct inode *inode; @@ -65,8 +65,6 @@ static bool __is_cp_guaranteed(struct page *page) S_ISDIR(inode->i_mode)) return true; - if (f2fs_is_compressed_page(page)) - return false; if ((S_ISREG(inode->i_mode) && IS_NOQUOTA(inode)) || page_private_gcing(page)) return true; @@ -338,7 +336,7 @@ static void f2fs_write_end_io(struct bio *bio) bio_for_each_segment_all(bvec, bio, iter_all) { struct page *page = bvec->bv_page; - enum count_type type = WB_DATA_TYPE(page); + enum count_type type = WB_DATA_TYPE(page, false); if (page_private_dummy(page)) { clear_page_private_dummy(page); @@ -762,7 +760,7 @@ int f2fs_submit_page_bio(struct f2fs_io_info *fio) wbc_account_cgroup_owner(fio->io_wbc, fio->page, PAGE_SIZE); inc_page_count(fio->sbi, is_read_io(fio->op) ? - __read_io_type(page) : WB_DATA_TYPE(fio->page)); + __read_io_type(page) : WB_DATA_TYPE(fio->page, false)); if (is_read_io(bio_op(bio))) f2fs_submit_read_bio(fio->sbi, bio, fio->type); @@ -973,7 +971,7 @@ int f2fs_merge_page_bio(struct f2fs_io_info *fio) if (fio->io_wbc) wbc_account_cgroup_owner(fio->io_wbc, fio->page, PAGE_SIZE); - inc_page_count(fio->sbi, WB_DATA_TYPE(page)); + inc_page_count(fio->sbi, WB_DATA_TYPE(page, false)); *fio->last_block = fio->new_blkaddr; *fio->bio = bio; @@ -1007,6 +1005,7 @@ void f2fs_submit_page_write(struct f2fs_io_info *fio) enum page_type btype = PAGE_TYPE_OF_BIO(fio->type); struct f2fs_bio_info *io = sbi->write_io[btype] + fio->temp; struct page *bio_page; + enum count_type type; f2fs_bug_on(sbi, is_read_io(fio->op)); @@ -1046,7 +1045,8 @@ void f2fs_submit_page_write(struct f2fs_io_info *fio) /* set submitted = true as a return value */ fio->submitted = 1; - inc_page_count(sbi, WB_DATA_TYPE(bio_page)); + type = WB_DATA_TYPE(bio_page, fio->compressed_page); + inc_page_count(sbi, type); if (io->bio && (!io_is_mergeable(sbi, io->bio, io, fio, io->last_block_in_bio, @@ -1059,7 +1059,8 @@ void f2fs_submit_page_write(struct f2fs_io_info *fio) if (F2FS_IO_ALIGNED(sbi) && (fio->type == DATA || fio->type == NODE) && fio->new_blkaddr & F2FS_IO_SIZE_MASK(sbi)) { - dec_page_count(sbi, WB_DATA_TYPE(bio_page)); + dec_page_count(sbi, WB_DATA_TYPE(bio_page, + fio->compressed_page)); fio->retry = 1; goto skip; } diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 65294e3b0bef..50f3d546ded8 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -1080,7 +1080,8 @@ struct f2fs_sm_info { * f2fs monitors the number of several block types such as on-writeback, * dirty dentry b
[f2fs-dev] [PATCH v4 4/6] f2fs: compress: fix to avoid inconsistence bewteen i_blocks and dnode
In reserve_compress_blocks(), we update blkaddrs of dnode in prior to inc_valid_block_count(), it may cause inconsistent status bewteen i_blocks and blkaddrs once inc_valid_block_count() fails. To fix this issue, it needs to reverse their invoking order. Fixes: c75488fb4d82 ("f2fs: introduce F2FS_IOC_RESERVE_COMPRESS_BLOCKS") Signed-off-by: Chao Yu --- fs/f2fs/data.c| 5 +++-- fs/f2fs/f2fs.h| 7 ++- fs/f2fs/file.c| 26 ++ fs/f2fs/segment.c | 2 +- 4 files changed, 24 insertions(+), 16 deletions(-) diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index b171a9980f6a..8d2ace723310 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -1219,7 +1219,8 @@ int f2fs_reserve_new_blocks(struct dnode_of_data *dn, blkcnt_t count) if (unlikely(is_inode_flag_set(dn->inode, FI_NO_ALLOC))) return -EPERM; - if (unlikely((err = inc_valid_block_count(sbi, dn->inode, &count + err = inc_valid_block_count(sbi, dn->inode, &count, true); + if (unlikely(err)) return err; trace_f2fs_reserve_new_blocks(dn->inode, dn->nid, @@ -1476,7 +1477,7 @@ static int __allocate_data_block(struct dnode_of_data *dn, int seg_type) dn->data_blkaddr = f2fs_data_blkaddr(dn); if (dn->data_blkaddr == NULL_ADDR) { - err = inc_valid_block_count(sbi, dn->inode, &count); + err = inc_valid_block_count(sbi, dn->inode, &count, true); if (unlikely(err)) return err; } diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 50f3d546ded8..69e71460a950 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -2252,7 +2252,7 @@ static inline bool __allow_reserved_blocks(struct f2fs_sb_info *sbi, static inline void f2fs_i_blocks_write(struct inode *, block_t, bool, bool); static inline int inc_valid_block_count(struct f2fs_sb_info *sbi, -struct inode *inode, blkcnt_t *count) +struct inode *inode, blkcnt_t *count, bool partial) { blkcnt_t diff = 0, release = 0; block_t avail_user_block_count; @@ -2292,6 +2292,11 @@ static inline int inc_valid_block_count(struct f2fs_sb_info *sbi, avail_user_block_count = 0; } if (unlikely(sbi->total_valid_block_count > avail_user_block_count)) { + if (!partial) { + spin_unlock(&sbi->stat_lock); + goto enospc; + } + diff = sbi->total_valid_block_count - avail_user_block_count; if (diff > *count) diff = *count; diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 80d9c4c096f0..53c495651789 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -3614,14 +3614,16 @@ static int reserve_compress_blocks(struct dnode_of_data *dn, pgoff_t count) blkcnt_t reserved; int ret; - for (i = 0; i < cluster_size; i++, dn->ofs_in_node++) { - blkaddr = f2fs_data_blkaddr(dn); + for (i = 0; i < cluster_size; i++) { + blkaddr = data_blkaddr(dn->inode, dn->node_page, + dn->ofs_in_node + i); if (i == 0) { - if (blkaddr == COMPRESS_ADDR) - continue; - dn->ofs_in_node += cluster_size; - goto next; + if (blkaddr != COMPRESS_ADDR) { + dn->ofs_in_node += cluster_size; + goto next; + } + continue; } /* @@ -3634,20 +3636,20 @@ static int reserve_compress_blocks(struct dnode_of_data *dn, pgoff_t count) compr_blocks++; continue; } - - f2fs_set_data_blkaddr(dn, NEW_ADDR); } reserved = cluster_size - compr_blocks; if (!reserved) goto next; - ret = inc_valid_block_count(sbi, dn->inode, &reserved); - if (ret) + ret = inc_valid_block_count(sbi, dn->inode, &reserved, false); + if (unlikely(ret)) return ret; - if (reserved != cluster_size - compr_blocks) - return -ENOSPC; + for (i = 0; i < cluster_size; i++, dn->ofs_in_node++) { + if (f2fs_data_blkaddr(dn) == NULL_ADDR) + f2fs_set_data_blkaddr(dn, NEW_ADDR); + }
[f2fs-dev] [PATCH v4 3/6] f2fs: compress: fix to check unreleased compressed cluster
From: Sheng Yong Compressed cluster may not be released due to we can fail in release_compress_blocks(), fix to handle reserved compressed cluster correctly in reserve_compress_blocks(). Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Sheng Yong Signed-off-by: Chao Yu --- v4: - merge check condition suggested by Daeho. fs/f2fs/file.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 026d05a7edd8..80d9c4c096f0 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -3624,7 +3624,13 @@ static int reserve_compress_blocks(struct dnode_of_data *dn, pgoff_t count) goto next; } - if (__is_valid_data_blkaddr(blkaddr)) { + /* +* compressed cluster was not released due to it +* fails in release_compress_blocks(), so NEW_ADDR +* is a possible case. +*/ + if (blkaddr == NEW_ADDR || + __is_valid_data_blkaddr(blkaddr)) { compr_blocks++; continue; } @@ -3633,6 +3639,9 @@ static int reserve_compress_blocks(struct dnode_of_data *dn, pgoff_t count) } reserved = cluster_size - compr_blocks; + if (!reserved) + goto next; + ret = inc_valid_block_count(sbi, dn->inode, &reserved); if (ret) return ret; -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH 5/6] f2fs: fix to restrict condition of compress inode conversion
On 2024/1/3 6:54, Jaegeuk Kim wrote: On 12/28, Chao Yu wrote: On 2023/12/13 6:21, Jaegeuk Kim wrote: On 12/12, Chao Yu wrote: On 2023/12/12 6:11, Jaegeuk Kim wrote: On 12/10, Chao Yu wrote: This patch adds i_size check during compress inode conversion in order to avoid .page_mkwrite races w/ conversion. Which race condition do you see? Something like: - f2fs_setflags_common - check S_ISREG && F2FS_HAS_BLOCKS - mkwrite - f2fs_get_block_locked : update metadata in old inode's disk layout Don't we need to prevent setting this for mmapped file? Oh, we have used i_sem lock to prevent such race case, however we missed f2fs_disable_compressed_file(): - f2fs_disable_compressed_file - check inode_has_data - f2fs_file_mmap - mkwrite - f2fs_get_block_locked : update metadata in compressed inode's disk layout - fi->i_flags &= ~F2FS_COMPR_FL - clear_inode_flag(inode, FI_COMPRESSED_FILE); So, needing i_sem for disabling it on mmapped file? It seems i_size would not be enough? Agreed, let me update the patch. Thanks, Thanks, - set_compress_context Thanks, Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu --- fs/f2fs/f2fs.h | 8 +++- fs/f2fs/file.c | 5 ++--- 2 files changed, 9 insertions(+), 4 deletions(-) diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 65294e3b0bef..c9b8a1953913 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -4397,13 +4397,19 @@ static inline int set_compress_context(struct inode *inode) #endif } +static inline bool inode_has_data(struct inode *inode) +{ + return (S_ISREG(inode->i_mode) && + (F2FS_HAS_BLOCKS(inode) || i_size_read(inode))); +} + static inline bool f2fs_disable_compressed_file(struct inode *inode) { struct f2fs_inode_info *fi = F2FS_I(inode); if (!f2fs_compressed_file(inode)) return true; - if (S_ISREG(inode->i_mode) && F2FS_HAS_BLOCKS(inode)) + if (inode_has_data(inode)) return false; fi->i_flags &= ~F2FS_COMPR_FL; diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 1a3c29a9a6a0..8af4b29c3e1a 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -1922,8 +1922,7 @@ static int f2fs_setflags_common(struct inode *inode, u32 iflags, u32 mask) f2fs_down_write(&F2FS_I(inode)->i_sem); if (!f2fs_may_compress(inode) || - (S_ISREG(inode->i_mode) && - F2FS_HAS_BLOCKS(inode))) { + inode_has_data(inode)) { f2fs_up_write(&F2FS_I(inode)->i_sem); return -EINVAL; } @@ -3996,7 +3995,7 @@ static int f2fs_ioc_set_compress_option(struct file *filp, unsigned long arg) goto out; } - if (F2FS_HAS_BLOCKS(inode)) { + if (inode_has_data(inode)) { ret = -EFBIG; goto out; } -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH v3 6/6] f2fs: introduce FAULT_BLKADDR_INCONSISTENCE
On 2024/1/3 4:55, Jaegeuk Kim wrote: On 12/28, Chao Yu wrote: We will encounter below inconsistent status when FAULT_BLKADDR type fault injection is on. Info: checkpoint state = d6 : nat_bits crc fsck compacted_summary orphan_inodes sudden-power-off [ASSERT] (fsck_chk_inode_blk:1254) --> ino: 0x1c100 has i_blocks: 00c0, but has 191 blocks [FIX] (fsck_chk_inode_blk:1260) --> [0x1c100] i_blocks=0x00c0 -> 0xbf [FIX] (fsck_chk_inode_blk:1269) --> [0x1c100] i_compr_blocks=0x0026 -> 0x27 [ASSERT] (fsck_chk_inode_blk:1254) --> ino: 0x1cadb has i_blocks: 002f, but has 46 blocks [FIX] (fsck_chk_inode_blk:1260) --> [0x1cadb] i_blocks=0x002f -> 0x2e [FIX] (fsck_chk_inode_blk:1269) --> [0x1cadb] i_compr_blocks=0x0011 -> 0x12 [ASSERT] (fsck_chk_inode_blk:1254) --> ino: 0x1c62c has i_blocks: 0002, but has 1 blocks [FIX] (fsck_chk_inode_blk:1260) --> [0x1c62c] i_blocks=0x0002 -> 0x1 After we inject fault into f2fs_is_valid_blkaddr() during truncation, a) it missed to increase @nr_free or @valid_blocks b) it can cause in blkaddr leak in truncated dnode Which may cause inconsistent status. This patch separates FAULT_BLKADDR_INCONSISTENCE from FAULT_BLKADDR, so that we can: a) use FAULT_BLKADDR_INCONSISTENCE in f2fs_truncate_data_blocks_range() to simulate inconsistent issue independently, b) FAULT_BLKADDR fault will not cause any inconsistent status, we can just use it to check error path handling in kernel side. How about defining FAULT_BLKADDR_VALIDITY and FAULT_BLKADDR_CONSISTENCY? Better, :) Thanks, ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH v3 3/6] f2fs: compress: fix to check unreleased compressed cluster
On 2024/1/11 9:18, Daeho Jeong wrote: On Thu, Dec 28, 2023 at 6:33 AM Chao Yu wrote: From: Sheng Yong Compressed cluster may not be released due to we can fail in release_compress_blocks(), fix to handle reserved compressed cluster correctly in reserve_compress_blocks(). Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Sheng Yong Signed-off-by: Chao Yu --- fs/f2fs/file.c | 12 1 file changed, 12 insertions(+) diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 026d05a7edd8..782ae3be48f6 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -3624,6 +3624,15 @@ static int reserve_compress_blocks(struct dnode_of_data *dn, pgoff_t count) goto next; } + /* +* compressed cluster was not released due to +* it fails in release_compress_blocks(). +*/ + if (blkaddr == NEW_ADDR) { + compr_blocks++; + continue; + } + if (__is_valid_data_blkaddr(blkaddr)) { compr_blocks++; continue; How about merging two conditions like "blkaddr == NEW_ADDR || __is_valid_data_blkaddr(blkaddr)"? Oh, sure. @@ -3633,6 +3642,9 @@ static int reserve_compress_blocks(struct dnode_of_data *dn, pgoff_t count) } reserved = cluster_size - compr_blocks; + if (!reserved) + goto next; + How can the reserved variable be zero? I guess it can happen if a cluster was not released during release_compress_blocks(), then all blocks in the cluster should has been reserved, so, in this round of reserving, it needs to skip reserve blocks, right? Thanks, ret = inc_valid_block_count(sbi, dn->inode, &reserved); if (ret) return ret; -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH v3 1/6] f2fs: compress: fix to guarantee persisting compressed blocks by CP
On 2024/1/11 8:55, Daeho Jeong wrote: On Thu, Dec 28, 2023 at 6:33 AM Chao Yu wrote: If data block in compressed cluster is not persisted with metadata during checkpoint, after SPOR, the data may be corrupted, let's guarantee to write compressed page by checkpoint. Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu --- v3: - treat compressed page as CP guaranteed data explictly. fs/f2fs/compress.c | 4 +++- fs/f2fs/data.c | 17 + fs/f2fs/f2fs.h | 4 +++- 3 files changed, 15 insertions(+), 10 deletions(-) diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c index c5a4364c4482..9940b7886e5d 100644 --- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -1418,6 +1418,8 @@ void f2fs_compress_write_end_io(struct bio *bio, struct page *page) struct f2fs_sb_info *sbi = bio->bi_private; struct compress_io_ctx *cic = (struct compress_io_ctx *)page_private(page); + enum count_type type = WB_DATA_TYPE(page, + f2fs_is_compressed_page(page)); int i; if (unlikely(bio->bi_status)) @@ -1425,7 +1427,7 @@ void f2fs_compress_write_end_io(struct bio *bio, struct page *page) f2fs_compress_free_page(page); - dec_page_count(sbi, F2FS_WB_DATA); + dec_page_count(sbi, type); if (atomic_dec_return(&cic->pending_pages)) return; diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index dce8defdf4c7..81f9e2cc49e2 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -48,7 +48,7 @@ void f2fs_destroy_bioset(void) bioset_exit(&f2fs_bioset); } -static bool __is_cp_guaranteed(struct page *page) +bool f2fs_is_cp_guaranteed(struct page *page) { struct address_space *mapping = page->mapping; struct inode *inode; @@ -65,8 +65,6 @@ static bool __is_cp_guaranteed(struct page *page) S_ISDIR(inode->i_mode)) return true; - if (f2fs_is_compressed_page(page)) - return false; Out of curiosity, why don't we simply change the above to "return true"? Daeho, I used the implementation, please check v1 and related comments from Jaegeuk and me, let me know if that was not clear enough. :) https://lore.kernel.org/linux-f2fs-devel/aae654e7-8a7e-478d-9f5a-65807a0e0...@kernel.org/ if ((S_ISREG(inode->i_mode) && IS_NOQUOTA(inode)) || page_private_gcing(page)) return true; @@ -338,7 +336,7 @@ static void f2fs_write_end_io(struct bio *bio) bio_for_each_segment_all(bvec, bio, iter_all) { struct page *page = bvec->bv_page; - enum count_type type = WB_DATA_TYPE(page); + enum count_type type = WB_DATA_TYPE(page, false); if (page_private_dummy(page)) { clear_page_private_dummy(page); @@ -762,7 +760,7 @@ int f2fs_submit_page_bio(struct f2fs_io_info *fio) wbc_account_cgroup_owner(fio->io_wbc, fio->page, PAGE_SIZE); inc_page_count(fio->sbi, is_read_io(fio->op) ? - __read_io_type(page) : WB_DATA_TYPE(fio->page)); + __read_io_type(page) : WB_DATA_TYPE(fio->page, false)); if (is_read_io(bio_op(bio))) f2fs_submit_read_bio(fio->sbi, bio, fio->type); @@ -973,7 +971,7 @@ int f2fs_merge_page_bio(struct f2fs_io_info *fio) if (fio->io_wbc) wbc_account_cgroup_owner(fio->io_wbc, fio->page, PAGE_SIZE); - inc_page_count(fio->sbi, WB_DATA_TYPE(page)); + inc_page_count(fio->sbi, WB_DATA_TYPE(page, false)); *fio->last_block = fio->new_blkaddr; *fio->bio = bio; @@ -1007,6 +1005,7 @@ void f2fs_submit_page_write(struct f2fs_io_info *fio) enum page_type btype = PAGE_TYPE_OF_BIO(fio->type); struct f2fs_bio_info *io = sbi->write_io[btype] + fio->temp; struct page *bio_page; + enum count_type type; f2fs_bug_on(sbi, is_read_io(fio->op)); @@ -1046,7 +1045,8 @@ void f2fs_submit_page_write(struct f2fs_io_info *fio) /* set submitted = true as a return value */ fio->submitted = 1; - inc_page_count(sbi, WB_DATA_TYPE(bio_page)); + type = WB_DATA_TYPE(bio_page, fio->compressed_page); + inc_page_count(sbi, type); if (io->bio && (!io_is_mergeable(sbi, io->bio, io, fio, io->last_block_in_bio, @@ -1059,7 +1059,8 @@ void f2fs_submit_page_write(struct f2fs_io_info *fio) if (F2FS_IO_ALIGNED(sbi) && (fio->type == DATA || fio->type == NODE) && fio->new_blkaddr & F2FS_IO_SIZE_MASK(sbi)) { - dec_page_count(sbi, WB_DATA_TYPE(bio_pag
Re: [f2fs-dev] [PATCH 5/6] f2fs: fix to restrict condition of compress inode conversion
On 2023/12/13 6:21, Jaegeuk Kim wrote: On 12/12, Chao Yu wrote: On 2023/12/12 6:11, Jaegeuk Kim wrote: On 12/10, Chao Yu wrote: This patch adds i_size check during compress inode conversion in order to avoid .page_mkwrite races w/ conversion. Which race condition do you see? Something like: - f2fs_setflags_common - check S_ISREG && F2FS_HAS_BLOCKS - mkwrite - f2fs_get_block_locked : update metadata in old inode's disk layout Don't we need to prevent setting this for mmapped file? Oh, we have used i_sem lock to prevent such race case, however we missed f2fs_disable_compressed_file(): - f2fs_disable_compressed_file - check inode_has_data - f2fs_file_mmap - mkwrite - f2fs_get_block_locked : update metadata in compressed inode's disk layout - fi->i_flags &= ~F2FS_COMPR_FL - clear_inode_flag(inode, FI_COMPRESSED_FILE); Thanks, - set_compress_context Thanks, Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu --- fs/f2fs/f2fs.h | 8 +++- fs/f2fs/file.c | 5 ++--- 2 files changed, 9 insertions(+), 4 deletions(-) diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 65294e3b0bef..c9b8a1953913 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -4397,13 +4397,19 @@ static inline int set_compress_context(struct inode *inode) #endif } +static inline bool inode_has_data(struct inode *inode) +{ + return (S_ISREG(inode->i_mode) && + (F2FS_HAS_BLOCKS(inode) || i_size_read(inode))); +} + static inline bool f2fs_disable_compressed_file(struct inode *inode) { struct f2fs_inode_info *fi = F2FS_I(inode); if (!f2fs_compressed_file(inode)) return true; - if (S_ISREG(inode->i_mode) && F2FS_HAS_BLOCKS(inode)) + if (inode_has_data(inode)) return false; fi->i_flags &= ~F2FS_COMPR_FL; diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 1a3c29a9a6a0..8af4b29c3e1a 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -1922,8 +1922,7 @@ static int f2fs_setflags_common(struct inode *inode, u32 iflags, u32 mask) f2fs_down_write(&F2FS_I(inode)->i_sem); if (!f2fs_may_compress(inode) || - (S_ISREG(inode->i_mode) && - F2FS_HAS_BLOCKS(inode))) { + inode_has_data(inode)) { f2fs_up_write(&F2FS_I(inode)->i_sem); return -EINVAL; } @@ -3996,7 +3995,7 @@ static int f2fs_ioc_set_compress_option(struct file *filp, unsigned long arg) goto out; } - if (F2FS_HAS_BLOCKS(inode)) { + if (inode_has_data(inode)) { ret = -EFBIG; goto out; } -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH v3 4/6] f2fs: compress: fix to avoid inconsistent bewteen i_blocks and dnode
In reserve_compress_blocks(), we update blkaddrs of dnode in prior to inc_valid_block_count(), it may cause inconsistent status bewteen i_blocks and blkaddrs once inc_valid_block_count() fails. To fix this issue, it needs to reverse their invoking order. Fixes: c75488fb4d82 ("f2fs: introduce F2FS_IOC_RESERVE_COMPRESS_BLOCKS") Signed-off-by: Chao Yu --- fs/f2fs/data.c| 5 +++-- fs/f2fs/f2fs.h| 7 ++- fs/f2fs/file.c| 26 ++ fs/f2fs/segment.c | 2 +- 4 files changed, 24 insertions(+), 16 deletions(-) diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index b171a9980f6a..8d2ace723310 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -1219,7 +1219,8 @@ int f2fs_reserve_new_blocks(struct dnode_of_data *dn, blkcnt_t count) if (unlikely(is_inode_flag_set(dn->inode, FI_NO_ALLOC))) return -EPERM; - if (unlikely((err = inc_valid_block_count(sbi, dn->inode, &count + err = inc_valid_block_count(sbi, dn->inode, &count, true); + if (unlikely(err)) return err; trace_f2fs_reserve_new_blocks(dn->inode, dn->nid, @@ -1476,7 +1477,7 @@ static int __allocate_data_block(struct dnode_of_data *dn, int seg_type) dn->data_blkaddr = f2fs_data_blkaddr(dn); if (dn->data_blkaddr == NULL_ADDR) { - err = inc_valid_block_count(sbi, dn->inode, &count); + err = inc_valid_block_count(sbi, dn->inode, &count, true); if (unlikely(err)) return err; } diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 50f3d546ded8..69e71460a950 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -2252,7 +2252,7 @@ static inline bool __allow_reserved_blocks(struct f2fs_sb_info *sbi, static inline void f2fs_i_blocks_write(struct inode *, block_t, bool, bool); static inline int inc_valid_block_count(struct f2fs_sb_info *sbi, -struct inode *inode, blkcnt_t *count) +struct inode *inode, blkcnt_t *count, bool partial) { blkcnt_t diff = 0, release = 0; block_t avail_user_block_count; @@ -2292,6 +2292,11 @@ static inline int inc_valid_block_count(struct f2fs_sb_info *sbi, avail_user_block_count = 0; } if (unlikely(sbi->total_valid_block_count > avail_user_block_count)) { + if (!partial) { + spin_unlock(&sbi->stat_lock); + goto enospc; + } + diff = sbi->total_valid_block_count - avail_user_block_count; if (diff > *count) diff = *count; diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 782ae3be48f6..9f4e21b5916c 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -3614,14 +3614,16 @@ static int reserve_compress_blocks(struct dnode_of_data *dn, pgoff_t count) blkcnt_t reserved; int ret; - for (i = 0; i < cluster_size; i++, dn->ofs_in_node++) { - blkaddr = f2fs_data_blkaddr(dn); + for (i = 0; i < cluster_size; i++) { + blkaddr = data_blkaddr(dn->inode, dn->node_page, + dn->ofs_in_node + i); if (i == 0) { - if (blkaddr == COMPRESS_ADDR) - continue; - dn->ofs_in_node += cluster_size; - goto next; + if (blkaddr != COMPRESS_ADDR) { + dn->ofs_in_node += cluster_size; + goto next; + } + continue; } /* @@ -3637,20 +3639,20 @@ static int reserve_compress_blocks(struct dnode_of_data *dn, pgoff_t count) compr_blocks++; continue; } - - f2fs_set_data_blkaddr(dn, NEW_ADDR); } reserved = cluster_size - compr_blocks; if (!reserved) goto next; - ret = inc_valid_block_count(sbi, dn->inode, &reserved); - if (ret) + ret = inc_valid_block_count(sbi, dn->inode, &reserved, false); + if (unlikely(ret)) return ret; - if (reserved != cluster_size - compr_blocks) - return -ENOSPC; + for (i = 0; i < cluster_size; i++, dn->ofs_in_node++) { + if (f2fs_data_blkaddr(dn) == NULL_ADDR) + f2fs_set_data_blkaddr(dn, NEW_ADDR); + }
[f2fs-dev] [PATCH v3 6/6] f2fs: introduce FAULT_BLKADDR_INCONSISTENCE
We will encounter below inconsistent status when FAULT_BLKADDR type fault injection is on. Info: checkpoint state = d6 : nat_bits crc fsck compacted_summary orphan_inodes sudden-power-off [ASSERT] (fsck_chk_inode_blk:1254) --> ino: 0x1c100 has i_blocks: 00c0, but has 191 blocks [FIX] (fsck_chk_inode_blk:1260) --> [0x1c100] i_blocks=0x00c0 -> 0xbf [FIX] (fsck_chk_inode_blk:1269) --> [0x1c100] i_compr_blocks=0x0026 -> 0x27 [ASSERT] (fsck_chk_inode_blk:1254) --> ino: 0x1cadb has i_blocks: 002f, but has 46 blocks [FIX] (fsck_chk_inode_blk:1260) --> [0x1cadb] i_blocks=0x002f -> 0x2e [FIX] (fsck_chk_inode_blk:1269) --> [0x1cadb] i_compr_blocks=0x0011 -> 0x12 [ASSERT] (fsck_chk_inode_blk:1254) --> ino: 0x1c62c has i_blocks: 0002, but has 1 blocks [FIX] (fsck_chk_inode_blk:1260) --> [0x1c62c] i_blocks=0x0002 -> 0x1 After we inject fault into f2fs_is_valid_blkaddr() during truncation, a) it missed to increase @nr_free or @valid_blocks b) it can cause in blkaddr leak in truncated dnode Which may cause inconsistent status. This patch separates FAULT_BLKADDR_INCONSISTENCE from FAULT_BLKADDR, so that we can: a) use FAULT_BLKADDR_INCONSISTENCE in f2fs_truncate_data_blocks_range() to simulate inconsistent issue independently, b) FAULT_BLKADDR fault will not cause any inconsistent status, we can just use it to check error path handling in kernel side. Signed-off-by: Chao Yu --- v3: - rename FAULT_INCONSISTENCE as Jaegeuk's suggestion. Documentation/ABI/testing/sysfs-fs-f2fs | 47 + Documentation/filesystems/f2fs.rst | 47 + fs/f2fs/checkpoint.c| 19 +++--- fs/f2fs/f2fs.h | 3 ++ fs/f2fs/file.c | 8 +++-- fs/f2fs/super.c | 37 +-- 6 files changed, 91 insertions(+), 70 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs index 4f1d4e636d67..039a16ebaaaf 100644 --- a/Documentation/ABI/testing/sysfs-fs-f2fs +++ b/Documentation/ABI/testing/sysfs-fs-f2fs @@ -686,29 +686,30 @@ Description: Support configuring fault injection type, should be enabled with fault_injection option, fault type value is shown below, it supports single or combined type. - === === - Type_NameType_Value - === === - FAULT_KMALLOC0x1 - FAULT_KVMALLOC 0x2 - FAULT_PAGE_ALLOC 0x4 - FAULT_PAGE_GET 0x8 - FAULT_ALLOC_BIO 0x00010 (obsolete) - FAULT_ALLOC_NID 0x00020 - FAULT_ORPHAN 0x00040 - FAULT_BLOCK 0x00080 - FAULT_DIR_DEPTH 0x00100 - FAULT_EVICT_INODE0x00200 - FAULT_TRUNCATE 0x00400 - FAULT_READ_IO0x00800 - FAULT_CHECKPOINT 0x01000 - FAULT_DISCARD0x02000 - FAULT_WRITE_IO 0x04000 - FAULT_SLAB_ALLOC 0x08000 - FAULT_DQUOT_INIT 0x1 - FAULT_LOCK_OP0x2 - FAULT_BLKADDR0x4 - === === + === === + Type_NameType_Value + === === + FAULT_KMALLOC0x1 + FAULT_KVMALLOC 0x2 + FAULT_PAGE_ALLOC 0x4 + FAULT_PAGE_GET 0x8 + FAULT_ALLOC_BIO 0x00010 (obsolete) + FAULT_ALLOC_NID 0x00020 + FAULT_ORPHAN 0x00040 + FAULT_BLOCK 0x00080 + FAULT_DIR_DEPTH 0x00100 + FAULT_EVICT_INODE0x00200 + FAULT_TRUNCATE 0x00400 + FAULT_READ_IO0x00800 + FAULT_CHECKPOINT 0x01000 + FAULT_DISCARD0x02000 + FAULT_WRITE_IO 0x04000 + FAULT_SLAB_ALLOC 0x08000 + FAULT_DQUOT_INIT 0x1 + FAULT_LOCK_OP0x2 +
[f2fs-dev] [PATCH v3 2/6] f2fs: compress: fix to cover normal cluster write with cp_rwsem
When we overwrite compressed cluster w/ normal cluster, we should not unlock cp_rwsem during f2fs_write_raw_pages(), otherwise data will be corrupted if partial blocks were persisted before CP & SPOR, due to cluster metadata wasn't updated atomically. Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu --- fs/f2fs/compress.c | 20 ++-- fs/f2fs/data.c | 3 ++- 2 files changed, 16 insertions(+), 7 deletions(-) diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c index 9940b7886e5d..bf4cfab67aec 100644 --- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -1448,7 +1448,8 @@ static int f2fs_write_raw_pages(struct compress_ctx *cc, enum iostat_type io_type) { struct address_space *mapping = cc->inode->i_mapping; - int _submitted, compr_blocks, ret, i; + struct f2fs_sb_info *sbi = F2FS_M_SB(mapping); + int _submitted, compr_blocks, ret = 0, i; compr_blocks = f2fs_compressed_blocks(cc); @@ -1463,6 +1464,10 @@ static int f2fs_write_raw_pages(struct compress_ctx *cc, if (compr_blocks < 0) return compr_blocks; + /* overwrite compressed cluster w/ normal cluster */ + if (compr_blocks > 0) + f2fs_lock_op(sbi); + for (i = 0; i < cc->cluster_size; i++) { if (!cc->rpages[i]) continue; @@ -1495,26 +1500,29 @@ static int f2fs_write_raw_pages(struct compress_ctx *cc, unlock_page(cc->rpages[i]); ret = 0; } else if (ret == -EAGAIN) { + ret = 0; /* * for quota file, just redirty left pages to * avoid deadlock caused by cluster update race * from foreground operation. */ if (IS_NOQUOTA(cc->inode)) - return 0; - ret = 0; + goto out; f2fs_io_schedule_timeout(DEFAULT_IO_TIMEOUT); goto retry_write; } - return ret; + goto out; } *submitted += _submitted; } - f2fs_balance_fs(F2FS_M_SB(mapping), true); +out: + if (compr_blocks > 0) + f2fs_unlock_op(sbi); - return 0; + f2fs_balance_fs(sbi, true); + return ret; } int f2fs_write_multi_pages(struct compress_ctx *cc, diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index 81f9e2cc49e2..b171a9980f6a 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -2839,7 +2839,7 @@ int f2fs_write_single_data_page(struct page *page, int *submitted, .encrypted_page = NULL, .submitted = 0, .compr_blocks = compr_blocks, - .need_lock = LOCK_RETRY, + .need_lock = compr_blocks ? LOCK_DONE : LOCK_RETRY, .post_read = f2fs_post_read_required(inode) ? 1 : 0, .io_type = io_type, .io_wbc = wbc, @@ -2920,6 +2920,7 @@ int f2fs_write_single_data_page(struct page *page, int *submitted, if (err == -EAGAIN) { err = f2fs_do_write_data_page(&fio); if (err == -EAGAIN) { + f2fs_bug_on(sbi, compr_blocks); fio.need_lock = LOCK_REQ; err = f2fs_do_write_data_page(&fio); } -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH v3 5/6] f2fs: fix to remove unnecessary f2fs_bug_on() to avoid panic
verify_blkaddr() will trigger panic once we inject fault into f2fs_is_valid_blkaddr(), fix to remove this unnecessary f2fs_bug_on(). Fixes: 18792e64c86d ("f2fs: support fault injection for f2fs_is_valid_blkaddr()") Signed-off-by: Chao Yu --- fs/f2fs/f2fs.h | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 69e71460a950..ab710bb6d8b3 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -3470,11 +3470,9 @@ bool f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi, static inline void verify_blkaddr(struct f2fs_sb_info *sbi, block_t blkaddr, int type) { - if (!f2fs_is_valid_blkaddr(sbi, blkaddr, type)) { + if (!f2fs_is_valid_blkaddr(sbi, blkaddr, type)) f2fs_err(sbi, "invalid blkaddr: %u, type: %d, run fsck to fix.", blkaddr, type); - f2fs_bug_on(sbi, 1); - } } static inline bool __is_valid_data_blkaddr(block_t blkaddr) -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH v3 3/6] f2fs: compress: fix to check unreleased compressed cluster
From: Sheng Yong Compressed cluster may not be released due to we can fail in release_compress_blocks(), fix to handle reserved compressed cluster correctly in reserve_compress_blocks(). Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Sheng Yong Signed-off-by: Chao Yu --- fs/f2fs/file.c | 12 1 file changed, 12 insertions(+) diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 026d05a7edd8..782ae3be48f6 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -3624,6 +3624,15 @@ static int reserve_compress_blocks(struct dnode_of_data *dn, pgoff_t count) goto next; } + /* +* compressed cluster was not released due to +* it fails in release_compress_blocks(). +*/ + if (blkaddr == NEW_ADDR) { + compr_blocks++; + continue; + } + if (__is_valid_data_blkaddr(blkaddr)) { compr_blocks++; continue; @@ -3633,6 +3642,9 @@ static int reserve_compress_blocks(struct dnode_of_data *dn, pgoff_t count) } reserved = cluster_size - compr_blocks; + if (!reserved) + goto next; + ret = inc_valid_block_count(sbi, dn->inode, &reserved); if (ret) return ret; -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH v3 1/6] f2fs: compress: fix to guarantee persisting compressed blocks by CP
If data block in compressed cluster is not persisted with metadata during checkpoint, after SPOR, the data may be corrupted, let's guarantee to write compressed page by checkpoint. Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu --- v3: - treat compressed page as CP guaranteed data explictly. fs/f2fs/compress.c | 4 +++- fs/f2fs/data.c | 17 + fs/f2fs/f2fs.h | 4 +++- 3 files changed, 15 insertions(+), 10 deletions(-) diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c index c5a4364c4482..9940b7886e5d 100644 --- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -1418,6 +1418,8 @@ void f2fs_compress_write_end_io(struct bio *bio, struct page *page) struct f2fs_sb_info *sbi = bio->bi_private; struct compress_io_ctx *cic = (struct compress_io_ctx *)page_private(page); + enum count_type type = WB_DATA_TYPE(page, + f2fs_is_compressed_page(page)); int i; if (unlikely(bio->bi_status)) @@ -1425,7 +1427,7 @@ void f2fs_compress_write_end_io(struct bio *bio, struct page *page) f2fs_compress_free_page(page); - dec_page_count(sbi, F2FS_WB_DATA); + dec_page_count(sbi, type); if (atomic_dec_return(&cic->pending_pages)) return; diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index dce8defdf4c7..81f9e2cc49e2 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -48,7 +48,7 @@ void f2fs_destroy_bioset(void) bioset_exit(&f2fs_bioset); } -static bool __is_cp_guaranteed(struct page *page) +bool f2fs_is_cp_guaranteed(struct page *page) { struct address_space *mapping = page->mapping; struct inode *inode; @@ -65,8 +65,6 @@ static bool __is_cp_guaranteed(struct page *page) S_ISDIR(inode->i_mode)) return true; - if (f2fs_is_compressed_page(page)) - return false; if ((S_ISREG(inode->i_mode) && IS_NOQUOTA(inode)) || page_private_gcing(page)) return true; @@ -338,7 +336,7 @@ static void f2fs_write_end_io(struct bio *bio) bio_for_each_segment_all(bvec, bio, iter_all) { struct page *page = bvec->bv_page; - enum count_type type = WB_DATA_TYPE(page); + enum count_type type = WB_DATA_TYPE(page, false); if (page_private_dummy(page)) { clear_page_private_dummy(page); @@ -762,7 +760,7 @@ int f2fs_submit_page_bio(struct f2fs_io_info *fio) wbc_account_cgroup_owner(fio->io_wbc, fio->page, PAGE_SIZE); inc_page_count(fio->sbi, is_read_io(fio->op) ? - __read_io_type(page) : WB_DATA_TYPE(fio->page)); + __read_io_type(page) : WB_DATA_TYPE(fio->page, false)); if (is_read_io(bio_op(bio))) f2fs_submit_read_bio(fio->sbi, bio, fio->type); @@ -973,7 +971,7 @@ int f2fs_merge_page_bio(struct f2fs_io_info *fio) if (fio->io_wbc) wbc_account_cgroup_owner(fio->io_wbc, fio->page, PAGE_SIZE); - inc_page_count(fio->sbi, WB_DATA_TYPE(page)); + inc_page_count(fio->sbi, WB_DATA_TYPE(page, false)); *fio->last_block = fio->new_blkaddr; *fio->bio = bio; @@ -1007,6 +1005,7 @@ void f2fs_submit_page_write(struct f2fs_io_info *fio) enum page_type btype = PAGE_TYPE_OF_BIO(fio->type); struct f2fs_bio_info *io = sbi->write_io[btype] + fio->temp; struct page *bio_page; + enum count_type type; f2fs_bug_on(sbi, is_read_io(fio->op)); @@ -1046,7 +1045,8 @@ void f2fs_submit_page_write(struct f2fs_io_info *fio) /* set submitted = true as a return value */ fio->submitted = 1; - inc_page_count(sbi, WB_DATA_TYPE(bio_page)); + type = WB_DATA_TYPE(bio_page, fio->compressed_page); + inc_page_count(sbi, type); if (io->bio && (!io_is_mergeable(sbi, io->bio, io, fio, io->last_block_in_bio, @@ -1059,7 +1059,8 @@ void f2fs_submit_page_write(struct f2fs_io_info *fio) if (F2FS_IO_ALIGNED(sbi) && (fio->type == DATA || fio->type == NODE) && fio->new_blkaddr & F2FS_IO_SIZE_MASK(sbi)) { - dec_page_count(sbi, WB_DATA_TYPE(bio_page)); + dec_page_count(sbi, WB_DATA_TYPE(bio_page, + fio->compressed_page)); fio->retry = 1; goto skip; } diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 65294e3b0bef..50f3d546ded8 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -1080,7 +1080,8 @@ struct f2fs_sm_info { * f2fs monitors the number
Re: [f2fs-dev] [PATCH v2 6/6] f2fs: introduce FAULT_INCONSISTENCE
On 2023/12/28 7:06, Jaegeuk Kim wrote: On 12/25, Chao Yu wrote: We will encounter below inconsistent status when FAULT_BLKADDR type fault injection is on. Info: checkpoint state = d6 : nat_bits crc fsck compacted_summary orphan_inodes sudden-power-off [ASSERT] (fsck_chk_inode_blk:1254) --> ino: 0x1c100 has i_blocks: 00c0, but has 191 blocks [FIX] (fsck_chk_inode_blk:1260) --> [0x1c100] i_blocks=0x00c0 -> 0xbf [FIX] (fsck_chk_inode_blk:1269) --> [0x1c100] i_compr_blocks=0x0026 -> 0x27 [ASSERT] (fsck_chk_inode_blk:1254) --> ino: 0x1cadb has i_blocks: 002f, but has 46 blocks [FIX] (fsck_chk_inode_blk:1260) --> [0x1cadb] i_blocks=0x002f -> 0x2e [FIX] (fsck_chk_inode_blk:1269) --> [0x1cadb] i_compr_blocks=0x0011 -> 0x12 [ASSERT] (fsck_chk_inode_blk:1254) --> ino: 0x1c62c has i_blocks: 0002, but has 1 blocks [FIX] (fsck_chk_inode_blk:1260) --> [0x1c62c] i_blocks=0x0002 -> 0x1 After we inject fault into f2fs_is_valid_blkaddr() during truncation, a) it missed to increase @nr_free or @valid_blocks b) it can cause in blkaddr leak in truncated dnode Which may cause inconsistent status. This patch separates FAULT_INCONSISTENCE from FAULT_BLKADDR, so that Could you please rename FAULT_INCONSISTENCE to give exactly what it tries to break? Sure, maybe FAULT_BLKADDR_INCONSISTENCE... let me know if you want/have a better one. :) Thanks, we can: a) use FAULT_INCONSISTENCE in f2fs_truncate_data_blocks_range() to simulate inconsistent issue independently, b) FAULT_BLKADDR fault will not cause any inconsistent status, we can just use it to check error path handling in kernel side. Signed-off-by: Chao Yu --- v2: - make __f2fs_is_valid_blkaddr() void. Documentation/ABI/testing/sysfs-fs-f2fs | 1 + Documentation/filesystems/f2fs.rst | 1 + fs/f2fs/checkpoint.c| 19 +++ fs/f2fs/f2fs.h | 3 +++ fs/f2fs/file.c | 8 ++-- fs/f2fs/super.c | 1 + 6 files changed, 27 insertions(+), 6 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs index 4f1d4e636d67..649aabac16c2 100644 --- a/Documentation/ABI/testing/sysfs-fs-f2fs +++ b/Documentation/ABI/testing/sysfs-fs-f2fs @@ -708,6 +708,7 @@ Description:Support configuring fault injection type, should be FAULT_DQUOT_INIT 0x1 FAULT_LOCK_OP0x2 FAULT_BLKADDR0x4 + FAULT_INCONSISTENCE 0x8 === === What: /sys/fs/f2fs//discard_io_aware_gran diff --git a/Documentation/filesystems/f2fs.rst b/Documentation/filesystems/f2fs.rst index d32c6209685d..5616fb8ae207 100644 --- a/Documentation/filesystems/f2fs.rst +++ b/Documentation/filesystems/f2fs.rst @@ -206,6 +206,7 @@ fault_type=%dSupport configuring fault injection type, should be FAULT_DQUOT_INIT 0x1 FAULT_LOCK_OP0x2 FAULT_BLKADDR0x4 +FAULT_INCONSISTENCE 0x8 === === mode=%sControl block allocation mode which supports "adaptive" and "lfs". In "lfs" mode, there should be no random diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c index b0597a539fc5..84546f529cf0 100644 --- a/fs/f2fs/checkpoint.c +++ b/fs/f2fs/checkpoint.c @@ -170,12 +170,9 @@ static bool __is_bitmap_valid(struct f2fs_sb_info *sbi, block_t blkaddr, return exist; } -bool f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi, +static bool __f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi, block_t blkaddr, int type) { - if (time_to_inject(sbi, FAULT_BLKADDR)) - return false; - switch (type) { case META_NAT: break; @@ -230,6 +227,20 @@ bool f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi, return true; } +bool f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi, + block_t blkaddr, int type) +{ + if (time_to_inject(sbi, FAULT_BLKADDR)) + return false; + return __f2fs_is_valid_blkaddr(sbi, blkaddr, type); +} + +bool f2fs_is_valid_blkaddr_raw(struct f2fs_sb_info *sbi, + block_t blkaddr, int type) +{ + return __f2fs_is_valid_blkaddr(sbi, blkaddr, type); +} + /* * Readahead CP/NAT/SIT/SSA/POR pages */ diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 34b20700b5ec..3985296e64cb 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -61,6 +61,7 @@ enum { FAULT_DQUOT_INIT,
Re: [f2fs-dev] [PATCH 1/6] f2fs: compress: fix to guarantee persisting compressed blocks by CP
On 2023/12/28 6:55, Jaegeuk Kim wrote: On 12/27, Chao Yu wrote: On 2023/12/27 5:02, Jaegeuk Kim wrote: On 12/20, Chao Yu wrote: If data block in compressed cluster is not persisted with metadata during checkpoint, after SPOR, the data may be corrupted, let's guarantee to write compressed page by checkpoint. Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu --- fs/f2fs/compress.c | 3 ++- fs/f2fs/data.c | 12 +--- fs/f2fs/f2fs.h | 3 ++- 3 files changed, 13 insertions(+), 5 deletions(-) diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c index 5b076329e9bf..1122db8cc0b0 100644 --- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -1442,6 +1442,7 @@ void f2fs_compress_write_end_io(struct bio *bio, struct page *page) struct f2fs_sb_info *sbi = bio->bi_private; struct compress_io_ctx *cic = (struct compress_io_ctx *)page_private(page); + enum count_type type = WB_DATA_TYPE(page); int i; if (unlikely(bio->bi_status)) @@ -1449,7 +1450,7 @@ void f2fs_compress_write_end_io(struct bio *bio, struct page *page) f2fs_compress_free_page(page); - dec_page_count(sbi, F2FS_WB_DATA); + dec_page_count(sbi, type); if (atomic_dec_return(&cic->pending_pages)) return; diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index d28c97282e68..6c72a6e86ba8 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -48,7 +48,7 @@ void f2fs_destroy_bioset(void) bioset_exit(&f2fs_bioset); } -static bool __is_cp_guaranteed(struct page *page) +bool f2fs_is_cp_guaranteed(struct page *page) { struct address_space *mapping = page->mapping; struct inode *inode; @@ -66,7 +66,7 @@ static bool __is_cp_guaranteed(struct page *page) return true; if (f2fs_is_compressed_page(page)) - return false; + return true; if ((S_ISREG(inode->i_mode) && IS_NOQUOTA(inode)) || page_private_gcing(page)) return true; @@ -1007,6 +1007,7 @@ void f2fs_submit_page_write(struct f2fs_io_info *fio) enum page_type btype = PAGE_TYPE_OF_BIO(fio->type); struct f2fs_bio_info *io = sbi->write_io[btype] + fio->temp; struct page *bio_page; + enum count_type type; f2fs_bug_on(sbi, is_read_io(fio->op)); @@ -1046,7 +1047,12 @@ void f2fs_submit_page_write(struct f2fs_io_info *fio) /* set submitted = true as a return value */ fio->submitted = 1; - inc_page_count(sbi, WB_DATA_TYPE(bio_page)); + type = WB_DATA_TYPE(bio_page); + /* override count type if page is compressed one */ + if (fio->compressed_page) + type = WB_DATA_TYPE(fio->compressed_page); Doesn't bio_page already point fio->compressed_page? Please check below codes, bio_page will point to fio->encrypted_page if both software encryption feature and compression feature are on, for this case, we still need to account F2FS_WB_CP_DATA. So, it seems you want to make F2FS_WB_CP_DATA regardless of conditions. Then, how about making this explictly instead of implicit condition check of the page? #define WB_DATA_TYPE(p, f) (f || __is_cp_guaranteed(p) ? F2FS_WB_CP_DATA : F2FS_WB_DATA) inc_page_count(sbi, WB_DATA_TYPE(bio_page, bio_page == fio->compressed_page)); Do you mean inc_page_count(sbi, WB_DATA_TYPE(bio_page, fio->compressed_page));? If we use inc_page_count(sbi, WB_DATA_TYPE(bio_page, bio_page == fio->compressed_page)); if bio_page points to fio->encrypted_page, but fio->compressed_page is valid, then WB_DATA_TYPE() will return F2FS_WB_DATA, it doesn't as expect. Or am I missing something? Thanks, dec_page_count(sbi, WB_DATA_TYPE(page, f2fs_is_compressed_page(page))); if (fio->encrypted_page) bio_page = fio->encrypted_page; else if (fio->compressed_page) bio_page = fio->compressed_page; else bio_page = fio->page; Thanks, + + inc_page_count(sbi, type); if (io->bio && (!io_is_mergeable(sbi, io->bio, io, fio, io->last_block_in_bio, diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 76e9a8682e38..bcb3940ab5ba 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -1092,7 +1092,7 @@ struct f2fs_sm_info { * f2fs monitors the number of several block types such as on-writeback, * dirty dentry blocks, dirty node blocks, and dirty meta blocks. */ -#define WB_DATA_TYPE(p)(__is_cp_guaranteed(p) ? F2FS_WB_CP_DATA : F2FS_WB_DATA) +#define WB_DATA_TYPE(p)(f2fs_is_cp_guaranteed(p) ? F2FS_WB_CP_DATA : F2FS_WB_DATA) enum count_type { F2FS_DIRTY_DENTS, F2FS_DIRTY_DATA, @@ -3824,6 +3824,7 @@ void f2fs_init_ckpt_req_control(struct f2fs_sb_info *sbi);
Re: [f2fs-dev] [PATCH v2 1/2] f2fs: move release of block devices to after kill_block_super()
On 2023/12/28 1:14, Eric Biggers wrote: From: Eric Biggers Call destroy_device_list() and free the f2fs_sb_info from kill_f2fs_super(), after the call to kill_block_super(). This is necessary to order it after the call to fscrypt_destroy_keyring() once generic_shutdown_super() starts calling fscrypt_destroy_keyring() just after calling ->put_super. This is because fscrypt_destroy_keyring() may call into f2fs_get_devices() via the fscrypt_operations. Signed-off-by: Eric Biggers Reviewed-by: Chao Yu Thanks,
Re: [f2fs-dev] [PATCH 2/3] f2fs: move release of block devices to after kill_block_super()
On 2023/12/13 12:00, Eric Biggers wrote: From: Eric Biggers Call destroy_device_list() and free the f2fs_sb_info from kill_f2fs_super(), after the call to kill_block_super(). This is necessary to order it after the call to fscrypt_destroy_keyring() once generic_shutdown_super() starts calling fscrypt_destroy_keyring() just after calling ->put_super. This is because fscrypt_destroy_keyring() may call into f2fs_get_devices() via the fscrypt_operations. Signed-off-by: Eric Biggers --- fs/f2fs/super.c | 12 +++- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c index 033af907c3b1d..ba95a341a9a36 100644 --- a/fs/f2fs/super.c +++ b/fs/f2fs/super.c @@ -1710,42 +1710,39 @@ static void f2fs_put_super(struct super_block *sb) f2fs_destroy_node_manager(sbi); f2fs_destroy_segment_manager(sbi); /* flush s_error_work before sbi destroy */ flush_work(&sbi->s_error_work); f2fs_destroy_post_read_wq(sbi); kvfree(sbi->ckpt); - sb->s_fs_info = NULL; if (sbi->s_chksum_driver) crypto_free_shash(sbi->s_chksum_driver); kfree(sbi->raw_super); - destroy_device_list(sbi); f2fs_destroy_page_array_cache(sbi); f2fs_destroy_xattr_caches(sbi); mempool_destroy(sbi->write_io_dummy); #ifdef CONFIG_QUOTA for (i = 0; i < MAXQUOTAS; i++) kfree(F2FS_OPTION(sbi).s_qf_names[i]); #endif fscrypt_free_dummy_policy(&F2FS_OPTION(sbi).dummy_enc_policy); destroy_percpu_info(sbi); f2fs_destroy_iostat(sbi); for (i = 0; i < NR_PAGE_TYPE; i++) kvfree(sbi->write_io[i]); #if IS_ENABLED(CONFIG_UNICODE) utf8_unload(sb->s_encoding); #endif - kfree(sbi); } int f2fs_sync_fs(struct super_block *sb, int sync) { struct f2fs_sb_info *sbi = F2FS_SB(sb); int err = 0; if (unlikely(f2fs_cp_error(sbi))) return 0; if (unlikely(is_sbi_flag_set(sbi, SBI_CP_DISABLED))) @@ -4895,23 +4892,23 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent) } static struct dentry *f2fs_mount(struct file_system_type *fs_type, int flags, const char *dev_name, void *data) { return mount_bdev(fs_type, flags, dev_name, data, f2fs_fill_super); } static void kill_f2fs_super(struct super_block *sb) { - if (sb->s_root) { - struct f2fs_sb_info *sbi = F2FS_SB(sb); + struct f2fs_sb_info *sbi = F2FS_SB(sb); + if (sb->s_root) { set_sbi_flag(sbi, SBI_IS_CLOSE); f2fs_stop_gc_thread(sbi); f2fs_stop_discard_thread(sbi); #ifdef CONFIG_F2FS_FS_COMPRESSION /* * latter evict_inode() can bypass checking and invalidating * compress inode cache. */ if (test_opt(sbi, COMPRESS_CACHE)) @@ -4924,20 +4921,25 @@ static void kill_f2fs_super(struct super_block *sb) .reason = CP_UMOUNT, }; stat_inc_cp_call_count(sbi, TOTAL_CALL); f2fs_write_checkpoint(sbi, &cpc); } if (is_sbi_flag_set(sbi, SBI_IS_RECOVERED) && f2fs_readonly(sb)) sb->s_flags &= ~SB_RDONLY; } kill_block_super(sb); + if (sbi) { Can you please add one single line comment here to expand why we need to delay destroying device_list? Other code part looks good to me. Thanks, + destroy_device_list(sbi); + kfree(sbi); + sb->s_fs_info = NULL; + } } static struct file_system_type f2fs_fs_type = { .owner = THIS_MODULE, .name = "f2fs", .mount = f2fs_mount, .kill_sb= kill_f2fs_super, .fs_flags = FS_REQUIRES_DEV | FS_ALLOW_IDMAP, }; MODULE_ALIAS_FS("f2fs"); ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH 1/6] f2fs: compress: fix to guarantee persisting compressed blocks by CP
On 2023/12/27 5:02, Jaegeuk Kim wrote: On 12/20, Chao Yu wrote: If data block in compressed cluster is not persisted with metadata during checkpoint, after SPOR, the data may be corrupted, let's guarantee to write compressed page by checkpoint. Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu --- fs/f2fs/compress.c | 3 ++- fs/f2fs/data.c | 12 +--- fs/f2fs/f2fs.h | 3 ++- 3 files changed, 13 insertions(+), 5 deletions(-) diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c index 5b076329e9bf..1122db8cc0b0 100644 --- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -1442,6 +1442,7 @@ void f2fs_compress_write_end_io(struct bio *bio, struct page *page) struct f2fs_sb_info *sbi = bio->bi_private; struct compress_io_ctx *cic = (struct compress_io_ctx *)page_private(page); + enum count_type type = WB_DATA_TYPE(page); int i; if (unlikely(bio->bi_status)) @@ -1449,7 +1450,7 @@ void f2fs_compress_write_end_io(struct bio *bio, struct page *page) f2fs_compress_free_page(page); - dec_page_count(sbi, F2FS_WB_DATA); + dec_page_count(sbi, type); if (atomic_dec_return(&cic->pending_pages)) return; diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index d28c97282e68..6c72a6e86ba8 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -48,7 +48,7 @@ void f2fs_destroy_bioset(void) bioset_exit(&f2fs_bioset); } -static bool __is_cp_guaranteed(struct page *page) +bool f2fs_is_cp_guaranteed(struct page *page) { struct address_space *mapping = page->mapping; struct inode *inode; @@ -66,7 +66,7 @@ static bool __is_cp_guaranteed(struct page *page) return true; if (f2fs_is_compressed_page(page)) - return false; + return true; if ((S_ISREG(inode->i_mode) && IS_NOQUOTA(inode)) || page_private_gcing(page)) return true; @@ -1007,6 +1007,7 @@ void f2fs_submit_page_write(struct f2fs_io_info *fio) enum page_type btype = PAGE_TYPE_OF_BIO(fio->type); struct f2fs_bio_info *io = sbi->write_io[btype] + fio->temp; struct page *bio_page; + enum count_type type; f2fs_bug_on(sbi, is_read_io(fio->op)); @@ -1046,7 +1047,12 @@ void f2fs_submit_page_write(struct f2fs_io_info *fio) /* set submitted = true as a return value */ fio->submitted = 1; - inc_page_count(sbi, WB_DATA_TYPE(bio_page)); + type = WB_DATA_TYPE(bio_page); + /* override count type if page is compressed one */ + if (fio->compressed_page) + type = WB_DATA_TYPE(fio->compressed_page); Doesn't bio_page already point fio->compressed_page? Please check below codes, bio_page will point to fio->encrypted_page if both software encryption feature and compression feature are on, for this case, we still need to account F2FS_WB_CP_DATA. if (fio->encrypted_page) bio_page = fio->encrypted_page; else if (fio->compressed_page) bio_page = fio->compressed_page; else bio_page = fio->page; Thanks, + + inc_page_count(sbi, type); if (io->bio && (!io_is_mergeable(sbi, io->bio, io, fio, io->last_block_in_bio, diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 76e9a8682e38..bcb3940ab5ba 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -1092,7 +1092,7 @@ struct f2fs_sm_info { * f2fs monitors the number of several block types such as on-writeback, * dirty dentry blocks, dirty node blocks, and dirty meta blocks. */ -#define WB_DATA_TYPE(p)(__is_cp_guaranteed(p) ? F2FS_WB_CP_DATA : F2FS_WB_DATA) +#define WB_DATA_TYPE(p)(f2fs_is_cp_guaranteed(p) ? F2FS_WB_CP_DATA : F2FS_WB_DATA) enum count_type { F2FS_DIRTY_DENTS, F2FS_DIRTY_DATA, @@ -3824,6 +3824,7 @@ void f2fs_init_ckpt_req_control(struct f2fs_sb_info *sbi); */ int __init f2fs_init_bioset(void); void f2fs_destroy_bioset(void); +bool f2fs_is_cp_guaranteed(struct page *page); int f2fs_init_bio_entry_cache(void); void f2fs_destroy_bio_entry_cache(void); void f2fs_submit_read_bio(struct f2fs_sb_info *sbi, struct bio *bio, -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH v2 6/6] f2fs: introduce FAULT_INCONSISTENCE
We will encounter below inconsistent status when FAULT_BLKADDR type fault injection is on. Info: checkpoint state = d6 : nat_bits crc fsck compacted_summary orphan_inodes sudden-power-off [ASSERT] (fsck_chk_inode_blk:1254) --> ino: 0x1c100 has i_blocks: 00c0, but has 191 blocks [FIX] (fsck_chk_inode_blk:1260) --> [0x1c100] i_blocks=0x00c0 -> 0xbf [FIX] (fsck_chk_inode_blk:1269) --> [0x1c100] i_compr_blocks=0x0026 -> 0x27 [ASSERT] (fsck_chk_inode_blk:1254) --> ino: 0x1cadb has i_blocks: 002f, but has 46 blocks [FIX] (fsck_chk_inode_blk:1260) --> [0x1cadb] i_blocks=0x002f -> 0x2e [FIX] (fsck_chk_inode_blk:1269) --> [0x1cadb] i_compr_blocks=0x0011 -> 0x12 [ASSERT] (fsck_chk_inode_blk:1254) --> ino: 0x1c62c has i_blocks: 0002, but has 1 blocks [FIX] (fsck_chk_inode_blk:1260) --> [0x1c62c] i_blocks=0x0002 -> 0x1 After we inject fault into f2fs_is_valid_blkaddr() during truncation, a) it missed to increase @nr_free or @valid_blocks b) it can cause in blkaddr leak in truncated dnode Which may cause inconsistent status. This patch separates FAULT_INCONSISTENCE from FAULT_BLKADDR, so that we can: a) use FAULT_INCONSISTENCE in f2fs_truncate_data_blocks_range() to simulate inconsistent issue independently, b) FAULT_BLKADDR fault will not cause any inconsistent status, we can just use it to check error path handling in kernel side. Signed-off-by: Chao Yu --- v2: - make __f2fs_is_valid_blkaddr() void. Documentation/ABI/testing/sysfs-fs-f2fs | 1 + Documentation/filesystems/f2fs.rst | 1 + fs/f2fs/checkpoint.c| 19 +++ fs/f2fs/f2fs.h | 3 +++ fs/f2fs/file.c | 8 ++-- fs/f2fs/super.c | 1 + 6 files changed, 27 insertions(+), 6 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs index 4f1d4e636d67..649aabac16c2 100644 --- a/Documentation/ABI/testing/sysfs-fs-f2fs +++ b/Documentation/ABI/testing/sysfs-fs-f2fs @@ -708,6 +708,7 @@ Description:Support configuring fault injection type, should be FAULT_DQUOT_INIT 0x1 FAULT_LOCK_OP0x2 FAULT_BLKADDR0x4 + FAULT_INCONSISTENCE 0x8 === === What: /sys/fs/f2fs//discard_io_aware_gran diff --git a/Documentation/filesystems/f2fs.rst b/Documentation/filesystems/f2fs.rst index d32c6209685d..5616fb8ae207 100644 --- a/Documentation/filesystems/f2fs.rst +++ b/Documentation/filesystems/f2fs.rst @@ -206,6 +206,7 @@ fault_type=%dSupport configuring fault injection type, should be FAULT_DQUOT_INIT 0x1 FAULT_LOCK_OP0x2 FAULT_BLKADDR0x4 +FAULT_INCONSISTENCE 0x8 === === mode=%s Control block allocation mode which supports "adaptive" and "lfs". In "lfs" mode, there should be no random diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c index b0597a539fc5..84546f529cf0 100644 --- a/fs/f2fs/checkpoint.c +++ b/fs/f2fs/checkpoint.c @@ -170,12 +170,9 @@ static bool __is_bitmap_valid(struct f2fs_sb_info *sbi, block_t blkaddr, return exist; } -bool f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi, +static bool __f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi, block_t blkaddr, int type) { - if (time_to_inject(sbi, FAULT_BLKADDR)) - return false; - switch (type) { case META_NAT: break; @@ -230,6 +227,20 @@ bool f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi, return true; } +bool f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi, + block_t blkaddr, int type) +{ + if (time_to_inject(sbi, FAULT_BLKADDR)) + return false; + return __f2fs_is_valid_blkaddr(sbi, blkaddr, type); +} + +bool f2fs_is_valid_blkaddr_raw(struct f2fs_sb_info *sbi, + block_t blkaddr, int type) +{ + return __f2fs_is_valid_blkaddr(sbi, blkaddr, type); +} + /* * Readahead CP/NAT/SIT/SSA/POR pages */ diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 34b20700b5ec..3985296e64cb 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -61,6 +61,7 @@ enum { FAULT_DQUOT_INIT, FAULT_LOCK_OP, FAULT_BLKADDR, + FAULT_INCONSISTENCE, FAULT_MAX, }; @@ -3767,6 +3768,8 @@ struct page *f2fs_get_meta_page_retry(struct f2fs_sb_info *sbi, pgoff_t index); struct page *f2fs_get_tmp_page(struct f2fs_sb_info *sbi,
Re: [f2fs-dev] [PATCH v2] f2fs: Use wait_event_freezable_timeout() for freezable kthread
On 2023/12/21 14:49, Kevin Hao wrote: A freezable kernel thread can enter frozen state during freezing by either calling try_to_freeze() or using wait_event_freezable() and its variants. So for the following snippet of code in a kernel thread loop: wait_event_interruptible_timeout(); try_to_freeze(); We can change it to a simple wait_event_freezable_timeout() and then eliminate the function calls to try_to_freeze() and freezing(). Signed-off-by: Kevin Hao Reviewed-by: Chao Yu Thanks, ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH v2 4/6] f2fs: compress: fix to avoid inconsistent bewteen i_blocks and dnode
In reserve_compress_blocks(), we update blkaddrs of dnode in prior to inc_valid_block_count(), it may cause inconsistent status bewteen i_blocks and blkaddrs once inc_valid_block_count() fails. To fix this issue, it needs to reverse their invoking order. Fixes: c75488fb4d82 ("f2fs: introduce F2FS_IOC_RESERVE_COMPRESS_BLOCKS") Signed-off-by: Chao Yu --- v2: - rebase code to last dev-test branch fs/f2fs/data.c| 5 +++-- fs/f2fs/f2fs.h| 7 ++- fs/f2fs/file.c| 26 ++ fs/f2fs/segment.c | 2 +- 4 files changed, 24 insertions(+), 16 deletions(-) diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index 263053219b28..b6e35e601e24 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -1224,7 +1224,8 @@ int f2fs_reserve_new_blocks(struct dnode_of_data *dn, blkcnt_t count) if (unlikely(is_inode_flag_set(dn->inode, FI_NO_ALLOC))) return -EPERM; - if (unlikely((err = inc_valid_block_count(sbi, dn->inode, &count + err = inc_valid_block_count(sbi, dn->inode, &count, true); + if (unlikely(err)) return err; trace_f2fs_reserve_new_blocks(dn->inode, dn->nid, @@ -1481,7 +1482,7 @@ static int __allocate_data_block(struct dnode_of_data *dn, int seg_type) dn->data_blkaddr = f2fs_data_blkaddr(dn); if (dn->data_blkaddr == NULL_ADDR) { - err = inc_valid_block_count(sbi, dn->inode, &count); + err = inc_valid_block_count(sbi, dn->inode, &count, true); if (unlikely(err)) return err; } diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 7d0c2b05c5a8..dc1feafb4973 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -2251,7 +2251,7 @@ static inline bool __allow_reserved_blocks(struct f2fs_sb_info *sbi, static inline void f2fs_i_blocks_write(struct inode *, block_t, bool, bool); static inline int inc_valid_block_count(struct f2fs_sb_info *sbi, -struct inode *inode, blkcnt_t *count) +struct inode *inode, blkcnt_t *count, bool partial) { blkcnt_t diff = 0, release = 0; block_t avail_user_block_count; @@ -2291,6 +2291,11 @@ static inline int inc_valid_block_count(struct f2fs_sb_info *sbi, avail_user_block_count = 0; } if (unlikely(sbi->total_valid_block_count > avail_user_block_count)) { + if (!partial) { + spin_unlock(&sbi->stat_lock); + goto enospc; + } + diff = sbi->total_valid_block_count - avail_user_block_count; if (diff > *count) diff = *count; diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 782ae3be48f6..9f4e21b5916c 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -3614,14 +3614,16 @@ static int reserve_compress_blocks(struct dnode_of_data *dn, pgoff_t count) blkcnt_t reserved; int ret; - for (i = 0; i < cluster_size; i++, dn->ofs_in_node++) { - blkaddr = f2fs_data_blkaddr(dn); + for (i = 0; i < cluster_size; i++) { + blkaddr = data_blkaddr(dn->inode, dn->node_page, + dn->ofs_in_node + i); if (i == 0) { - if (blkaddr == COMPRESS_ADDR) - continue; - dn->ofs_in_node += cluster_size; - goto next; + if (blkaddr != COMPRESS_ADDR) { + dn->ofs_in_node += cluster_size; + goto next; + } + continue; } /* @@ -3637,20 +3639,20 @@ static int reserve_compress_blocks(struct dnode_of_data *dn, pgoff_t count) compr_blocks++; continue; } - - f2fs_set_data_blkaddr(dn, NEW_ADDR); } reserved = cluster_size - compr_blocks; if (!reserved) goto next; - ret = inc_valid_block_count(sbi, dn->inode, &reserved); - if (ret) + ret = inc_valid_block_count(sbi, dn->inode, &reserved, false); + if (unlikely(ret)) return ret; - if (reserved != cluster_size - compr_blocks) - return -ENOSPC; + for (i = 0; i < cluster_size; i++, dn->ofs_in_node++) { + if (f2fs_data_blkaddr(dn) == NULL_ADDR) + f2fs_set_data_blkaddr(
Re: [f2fs-dev] [PATCH 1/2] f2fs: Constrain the modification range of dir_level in the sysfs
On 2023/12/22 11:29, Yongpeng Yang wrote: The {struct f2fs_sb_info}->dir_level can be modified through the sysfs interface, but its value range is not limited. If the value exceeds MAX_DIR_HASH_DEPTH and the mount options include "noinline_dentry", the following error will occur: [root@fedora ~]# mount -o noinline_dentry /dev/sdb /mnt/sdb/ [root@fedora ~]# echo 128 > /sys/fs/f2fs/sdb/dir_level [root@fedora ~]# cd /mnt/sdb/ [root@fedora sdb]# mkdir test [root@fedora sdb]# cd test/ [root@fedora test]# mkdir test mkdir: cannot create directory 'test': Argument list too long Signed-off-by: Yongpeng Yang Reviewed-by: Chao Yu Thanks, ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH 2/2] f2fs: Add error handling for negative returns from do_garbage_collect
On 2023/12/22 11:29, Yongpeng Yang wrote: The function do_garbage_collect can return a value less than 0 due to f2fs_cp_error being true or page allocation failure, as a result of calling f2fs_get_sum_page. However, f2fs_gc does not account for such cases, which could potentially lead to an abnormal total_freed and thus cause subsequent code to behave unexpectedly. Given that an f2fs_cp_error is irrecoverable, and considering that do_garbage_collect already retries page allocation errors through its call to f2fs_get_sum_page->f2fs_get_meta_page_retry, any error reported by do_garbage_collect should immediately terminate the current GC. Signed-off-by: Yongpeng Yang Reviewed-by: Chao Yu Thanks, ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH 2/6] f2fs: compress: fix to cover normal cluster write with cp_rwsem
When we overwrite compressed cluster w/ normal cluster, we should not unlock cp_rwsem during f2fs_write_raw_pages(), otherwise data will be corrupted if partial blocks were persisted before CP & SPOR, due to cluster metadata wasn't updated atomically. Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu --- fs/f2fs/compress.c | 20 ++-- fs/f2fs/data.c | 3 ++- 2 files changed, 16 insertions(+), 7 deletions(-) diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c index 1122db8cc0b0..a1426c3eadcc 100644 --- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -1471,7 +1471,8 @@ static int f2fs_write_raw_pages(struct compress_ctx *cc, enum iostat_type io_type) { struct address_space *mapping = cc->inode->i_mapping; - int _submitted, compr_blocks, ret, i; + struct f2fs_sb_info *sbi = F2FS_M_SB(mapping); + int _submitted, compr_blocks, ret = 0, i; compr_blocks = f2fs_compressed_blocks(cc); @@ -1486,6 +1487,10 @@ static int f2fs_write_raw_pages(struct compress_ctx *cc, if (compr_blocks < 0) return compr_blocks; + /* overwrite compressed cluster w/ normal cluster */ + if (compr_blocks > 0) + f2fs_lock_op(sbi); + for (i = 0; i < cc->cluster_size; i++) { if (!cc->rpages[i]) continue; @@ -1518,26 +1523,29 @@ static int f2fs_write_raw_pages(struct compress_ctx *cc, unlock_page(cc->rpages[i]); ret = 0; } else if (ret == -EAGAIN) { + ret = 0; /* * for quota file, just redirty left pages to * avoid deadlock caused by cluster update race * from foreground operation. */ if (IS_NOQUOTA(cc->inode)) - return 0; - ret = 0; + goto out; f2fs_io_schedule_timeout(DEFAULT_IO_TIMEOUT); goto retry_write; } - return ret; + goto out; } *submitted += _submitted; } - f2fs_balance_fs(F2FS_M_SB(mapping), true); +out: + if (compr_blocks > 0) + f2fs_unlock_op(sbi); - return 0; + f2fs_balance_fs(sbi, true); + return ret; } int f2fs_write_multi_pages(struct compress_ctx *cc, diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index 6c72a6e86ba8..ceed5ac6c66b 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -2848,7 +2848,7 @@ int f2fs_write_single_data_page(struct page *page, int *submitted, .encrypted_page = NULL, .submitted = 0, .compr_blocks = compr_blocks, - .need_lock = LOCK_RETRY, + .need_lock = compr_blocks ? LOCK_DONE : LOCK_RETRY, .post_read = f2fs_post_read_required(inode) ? 1 : 0, .io_type = io_type, .io_wbc = wbc, @@ -2929,6 +2929,7 @@ int f2fs_write_single_data_page(struct page *page, int *submitted, if (err == -EAGAIN) { err = f2fs_do_write_data_page(&fio); if (err == -EAGAIN) { + f2fs_bug_on(sbi, compr_blocks); fio.need_lock = LOCK_REQ; err = f2fs_do_write_data_page(&fio); } -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH 3/6] f2fs: compress: fix to check unreleased compressed cluster
From: Sheng Yong Compressed cluster may not be released due to we can fail in release_compress_blocks(), fix to handle reserved compressed cluster correctly in reserve_compress_blocks(). Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Sheng Yong Signed-off-by: Chao Yu --- fs/f2fs/file.c | 12 1 file changed, 12 insertions(+) diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index c5e681fc1d58..c200b4c81baf 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -3643,6 +3643,15 @@ static int reserve_compress_blocks(struct dnode_of_data *dn, pgoff_t count) goto next; } + /* +* compressed cluster was not released due to +* it fails in release_compress_blocks(). +*/ + if (blkaddr == NEW_ADDR) { + compr_blocks++; + continue; + } + if (__is_valid_data_blkaddr(blkaddr)) { compr_blocks++; continue; @@ -3652,6 +3661,9 @@ static int reserve_compress_blocks(struct dnode_of_data *dn, pgoff_t count) } reserved = cluster_size - compr_blocks; + if (!reserved) + goto next; + ret = inc_valid_block_count(sbi, dn->inode, &reserved); if (ret) return ret; -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH 1/6] f2fs: compress: fix to guarantee persisting compressed blocks by CP
If data block in compressed cluster is not persisted with metadata during checkpoint, after SPOR, the data may be corrupted, let's guarantee to write compressed page by checkpoint. Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu --- fs/f2fs/compress.c | 3 ++- fs/f2fs/data.c | 12 +--- fs/f2fs/f2fs.h | 3 ++- 3 files changed, 13 insertions(+), 5 deletions(-) diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c index 5b076329e9bf..1122db8cc0b0 100644 --- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -1442,6 +1442,7 @@ void f2fs_compress_write_end_io(struct bio *bio, struct page *page) struct f2fs_sb_info *sbi = bio->bi_private; struct compress_io_ctx *cic = (struct compress_io_ctx *)page_private(page); + enum count_type type = WB_DATA_TYPE(page); int i; if (unlikely(bio->bi_status)) @@ -1449,7 +1450,7 @@ void f2fs_compress_write_end_io(struct bio *bio, struct page *page) f2fs_compress_free_page(page); - dec_page_count(sbi, F2FS_WB_DATA); + dec_page_count(sbi, type); if (atomic_dec_return(&cic->pending_pages)) return; diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index d28c97282e68..6c72a6e86ba8 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -48,7 +48,7 @@ void f2fs_destroy_bioset(void) bioset_exit(&f2fs_bioset); } -static bool __is_cp_guaranteed(struct page *page) +bool f2fs_is_cp_guaranteed(struct page *page) { struct address_space *mapping = page->mapping; struct inode *inode; @@ -66,7 +66,7 @@ static bool __is_cp_guaranteed(struct page *page) return true; if (f2fs_is_compressed_page(page)) - return false; + return true; if ((S_ISREG(inode->i_mode) && IS_NOQUOTA(inode)) || page_private_gcing(page)) return true; @@ -1007,6 +1007,7 @@ void f2fs_submit_page_write(struct f2fs_io_info *fio) enum page_type btype = PAGE_TYPE_OF_BIO(fio->type); struct f2fs_bio_info *io = sbi->write_io[btype] + fio->temp; struct page *bio_page; + enum count_type type; f2fs_bug_on(sbi, is_read_io(fio->op)); @@ -1046,7 +1047,12 @@ void f2fs_submit_page_write(struct f2fs_io_info *fio) /* set submitted = true as a return value */ fio->submitted = 1; - inc_page_count(sbi, WB_DATA_TYPE(bio_page)); + type = WB_DATA_TYPE(bio_page); + /* override count type if page is compressed one */ + if (fio->compressed_page) + type = WB_DATA_TYPE(fio->compressed_page); + + inc_page_count(sbi, type); if (io->bio && (!io_is_mergeable(sbi, io->bio, io, fio, io->last_block_in_bio, diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 76e9a8682e38..bcb3940ab5ba 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -1092,7 +1092,7 @@ struct f2fs_sm_info { * f2fs monitors the number of several block types such as on-writeback, * dirty dentry blocks, dirty node blocks, and dirty meta blocks. */ -#define WB_DATA_TYPE(p)(__is_cp_guaranteed(p) ? F2FS_WB_CP_DATA : F2FS_WB_DATA) +#define WB_DATA_TYPE(p)(f2fs_is_cp_guaranteed(p) ? F2FS_WB_CP_DATA : F2FS_WB_DATA) enum count_type { F2FS_DIRTY_DENTS, F2FS_DIRTY_DATA, @@ -3824,6 +3824,7 @@ void f2fs_init_ckpt_req_control(struct f2fs_sb_info *sbi); */ int __init f2fs_init_bioset(void); void f2fs_destroy_bioset(void); +bool f2fs_is_cp_guaranteed(struct page *page); int f2fs_init_bio_entry_cache(void); void f2fs_destroy_bio_entry_cache(void); void f2fs_submit_read_bio(struct f2fs_sb_info *sbi, struct bio *bio, -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH 4/6] f2fs: compress: fix to avoid inconsistent bewteen i_blocks and dnode
In reserve_compress_blocks(), we update blkaddrs of dnode in prior to inc_valid_block_count(), it may cause inconsistent status bewteen i_blocks and blkaddrs once inc_valid_block_count() fails. To fix this issue, it needs to reverse their invoking order. Fixes: c75488fb4d82 ("f2fs: introduce F2FS_IOC_RESERVE_COMPRESS_BLOCKS") Signed-off-by: Chao Yu --- fs/f2fs/data.c| 5 +++-- fs/f2fs/f2fs.h| 7 ++- fs/f2fs/file.c| 26 ++ fs/f2fs/segment.c | 2 +- 4 files changed, 24 insertions(+), 16 deletions(-) diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index ceed5ac6c66b..cc4cb9099db6 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -1224,7 +1224,8 @@ int f2fs_reserve_new_blocks(struct dnode_of_data *dn, blkcnt_t count) if (unlikely(is_inode_flag_set(dn->inode, FI_NO_ALLOC))) return -EPERM; - if (unlikely((err = inc_valid_block_count(sbi, dn->inode, &count + err = inc_valid_block_count(sbi, dn->inode, &count, true); + if (unlikely(err)) return err; trace_f2fs_reserve_new_blocks(dn->inode, dn->nid, @@ -1481,7 +1482,7 @@ static int __allocate_data_block(struct dnode_of_data *dn, int seg_type) dn->data_blkaddr = f2fs_data_blkaddr(dn); if (dn->data_blkaddr == NULL_ADDR) { - err = inc_valid_block_count(sbi, dn->inode, &count); + err = inc_valid_block_count(sbi, dn->inode, &count, true); if (unlikely(err)) return err; } diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index bcb3940ab5ba..541c52fe2872 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -2294,7 +2294,7 @@ static inline unsigned int get_available_block_count(struct f2fs_sb_info *sbi, static inline void f2fs_i_blocks_write(struct inode *, block_t, bool, bool); static inline int inc_valid_block_count(struct f2fs_sb_info *sbi, -struct inode *inode, blkcnt_t *count) +struct inode *inode, blkcnt_t *count, bool partial) { blkcnt_t diff = 0, release = 0; block_t avail_user_block_count; @@ -2320,6 +2320,11 @@ static inline int inc_valid_block_count(struct f2fs_sb_info *sbi, avail_user_block_count = get_available_block_count(sbi, inode, true); if (unlikely(sbi->total_valid_block_count > avail_user_block_count)) { + if (!partial) { + spin_unlock(&sbi->stat_lock); + goto enospc; + } + diff = sbi->total_valid_block_count - avail_user_block_count; if (diff > *count) diff = *count; diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index c200b4c81baf..4f3ed627f6a1 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -3633,14 +3633,16 @@ static int reserve_compress_blocks(struct dnode_of_data *dn, pgoff_t count) blkcnt_t reserved; int ret; - for (i = 0; i < cluster_size; i++, dn->ofs_in_node++) { - blkaddr = f2fs_data_blkaddr(dn); + for (i = 0; i < cluster_size; i++) { + blkaddr = data_blkaddr(dn->inode, dn->node_page, + dn->ofs_in_node + i); if (i == 0) { - if (blkaddr == COMPRESS_ADDR) - continue; - dn->ofs_in_node += cluster_size; - goto next; + if (blkaddr != COMPRESS_ADDR) { + dn->ofs_in_node += cluster_size; + goto next; + } + continue; } /* @@ -3656,20 +3658,20 @@ static int reserve_compress_blocks(struct dnode_of_data *dn, pgoff_t count) compr_blocks++; continue; } - - f2fs_set_data_blkaddr(dn, NEW_ADDR); } reserved = cluster_size - compr_blocks; if (!reserved) goto next; - ret = inc_valid_block_count(sbi, dn->inode, &reserved); - if (ret) + ret = inc_valid_block_count(sbi, dn->inode, &reserved, false); + if (unlikely(ret)) return ret; - if (reserved != cluster_size - compr_blocks) - return -ENOSPC; + for (i = 0; i < cluster_size; i++, dn->ofs_in_node++) { + if (f2fs_data_blkaddr(dn) == NULL_ADDR) + f2fs_set_data_blkaddr(dn, NEW_ADDR); +
[f2fs-dev] [PATCH 5/6] f2fs: fix to remove unnecessary f2fs_bug_on() to avoid panic
verify_blkaddr() will trigger panic once we inject fault into f2fs_is_valid_blkaddr(), fix to remove this unnecessary f2fs_bug_on(). Fixes: 18792e64c86d ("f2fs: support fault injection for f2fs_is_valid_blkaddr()") Signed-off-by: Chao Yu --- fs/f2fs/f2fs.h | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 541c52fe2872..9487581db08a 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -3489,11 +3489,9 @@ bool f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi, static inline void verify_blkaddr(struct f2fs_sb_info *sbi, block_t blkaddr, int type) { - if (!f2fs_is_valid_blkaddr(sbi, blkaddr, type)) { + if (!f2fs_is_valid_blkaddr(sbi, blkaddr, type)) f2fs_err(sbi, "invalid blkaddr: %u, type: %d, run fsck to fix.", blkaddr, type); - f2fs_bug_on(sbi, 1); - } } static inline bool __is_valid_data_blkaddr(block_t blkaddr) -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH 6/6] f2fs: introduce FAULT_INCONSISTENCE
We will encounter below inconsistent status when FAULT_BLKADDR type fault injection is on. Info: checkpoint state = d6 : nat_bits crc fsck compacted_summary orphan_inodes sudden-power-off [ASSERT] (fsck_chk_inode_blk:1254) --> ino: 0x1c100 has i_blocks: 00c0, but has 191 blocks [FIX] (fsck_chk_inode_blk:1260) --> [0x1c100] i_blocks=0x00c0 -> 0xbf [FIX] (fsck_chk_inode_blk:1269) --> [0x1c100] i_compr_blocks=0x0026 -> 0x27 [ASSERT] (fsck_chk_inode_blk:1254) --> ino: 0x1cadb has i_blocks: 002f, but has 46 blocks [FIX] (fsck_chk_inode_blk:1260) --> [0x1cadb] i_blocks=0x002f -> 0x2e [FIX] (fsck_chk_inode_blk:1269) --> [0x1cadb] i_compr_blocks=0x0011 -> 0x12 [ASSERT] (fsck_chk_inode_blk:1254) --> ino: 0x1c62c has i_blocks: 0002, but has 1 blocks [FIX] (fsck_chk_inode_blk:1260) --> [0x1c62c] i_blocks=0x0002 -> 0x1 After we inject fault into f2fs_is_valid_blkaddr() during truncation, a) it missed to increase @nr_free or @valid_blocks b) it can cause in blkaddr leak in truncated dnode Which may cause inconsistent status. This patch separates FAULT_INCONSISTENCE from FAULT_BLKADDR, so that we can: a) use FAULT_INCONSISTENCE in f2fs_truncate_data_blocks_range() to simulate inconsistent issue independently, b) FAULT_BLKADDR fault will not cause any inconsistent status, we can just use it to check error path handling in kernel side. Signed-off-by: Chao Yu --- Documentation/ABI/testing/sysfs-fs-f2fs | 1 + Documentation/filesystems/f2fs.rst | 1 + fs/f2fs/checkpoint.c| 19 +++ fs/f2fs/f2fs.h | 3 +++ fs/f2fs/file.c | 8 ++-- fs/f2fs/super.c | 1 + 6 files changed, 27 insertions(+), 6 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs index 4f1d4e636d67..649aabac16c2 100644 --- a/Documentation/ABI/testing/sysfs-fs-f2fs +++ b/Documentation/ABI/testing/sysfs-fs-f2fs @@ -708,6 +708,7 @@ Description:Support configuring fault injection type, should be FAULT_DQUOT_INIT 0x1 FAULT_LOCK_OP0x2 FAULT_BLKADDR0x4 + FAULT_INCONSISTENCE 0x8 === === What: /sys/fs/f2fs//discard_io_aware_gran diff --git a/Documentation/filesystems/f2fs.rst b/Documentation/filesystems/f2fs.rst index 5a1686cdd6b4..dfc1e73b17d9 100644 --- a/Documentation/filesystems/f2fs.rst +++ b/Documentation/filesystems/f2fs.rst @@ -209,6 +209,7 @@ fault_type=%dSupport configuring fault injection type, should be FAULT_DQUOT_INIT 0x1 FAULT_LOCK_OP0x2 FAULT_BLKADDR0x4 +FAULT_INCONSISTENCE 0x8 === === mode=%s Control block allocation mode which supports "adaptive" and "lfs". In "lfs" mode, there should be no random diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c index f3f3e98fd6b0..b59db6c1932c 100644 --- a/fs/f2fs/checkpoint.c +++ b/fs/f2fs/checkpoint.c @@ -170,12 +170,9 @@ static bool __is_bitmap_valid(struct f2fs_sb_info *sbi, block_t blkaddr, return exist; } -bool f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi, +bool __f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi, block_t blkaddr, int type) { - if (time_to_inject(sbi, FAULT_BLKADDR)) - return false; - switch (type) { case META_NAT: break; @@ -230,6 +227,20 @@ bool f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi, return true; } +bool f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi, + block_t blkaddr, int type) +{ + if (time_to_inject(sbi, FAULT_BLKADDR)) + return false; + return __f2fs_is_valid_blkaddr(sbi, blkaddr, type); +} + +bool f2fs_is_valid_blkaddr_raw(struct f2fs_sb_info *sbi, + block_t blkaddr, int type) +{ + return __f2fs_is_valid_blkaddr(sbi, blkaddr, type); +} + /* * Readahead CP/NAT/SIT/SSA/POR pages */ diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 9487581db08a..49e4de324841 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -61,6 +61,7 @@ enum { FAULT_DQUOT_INIT, FAULT_LOCK_OP, FAULT_BLKADDR, + FAULT_INCONSISTENCE, FAULT_MAX, }; @@ -3787,6 +3788,8 @@ struct page *f2fs_get_meta_page_retry(struct f2fs_sb_info *sbi, pgoff_t index); struct page *f2fs_get_tmp_page(struct f2fs_sb_info *sbi, pgoff_t index); bool f2fs_is_valid_blkaddr(struct f2fs_sb_info *s
Re: [f2fs-dev] [PATCH V5] f2fs: show more discard status by sysfs
On 2023/12/20 9:59, Zhiguo Niu wrote: The current pending_discard attr just only shows the discard_cmd_cnt information. More discard status can be shown so that we can check them through sysfs when needed. Signed-off-by: Zhiguo Niu Reviewed-by: Chao Yu Thanks, ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH V4] f2fs: show more discard status by sysfs
On 2023/12/19 12:09, Zhiguo Niu wrote: On Tue, Dec 19, 2023 at 12:00 PM Chao Yu wrote: On 2023/12/19 10:21, Zhiguo Niu wrote: The current pending_discard attr just only shows the discard_cmd_cnt information. More discard status can be shown so that we can check them through sysfs when needed. Signed-off-by: Zhiguo Niu --- changes of v2: Improve the patch according to Chao's suggestions. changes of v3: Add a blank line for easy reading. changes of v4: Split to three entries --- --- Documentation/ABI/testing/sysfs-fs-f2fs | 15 +++ fs/f2fs/sysfs.c | 33 + 2 files changed, 48 insertions(+) diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs index 4f1d4e6..606a298 100644 --- a/Documentation/ABI/testing/sysfs-fs-f2fs +++ b/Documentation/ABI/testing/sysfs-fs-f2fs @@ -159,6 +159,21 @@ Date:November 2021 Contact:"Jaegeuk Kim" Description:Shows the number of pending discard commands in the queue. +What: /sys/fs/f2fs//issued_discard Add them to /sys/fs/f2fs//stat/? I just want to keep them consistent with the entry "pending_discard" There are too many entries in root directory of f2fs sysfs entry, so I created the 'stat' sub-directory for later all read-only stat-related entry. I think it's fine to create new discard stat entries there. Thanks, if they are split to 3 entries. they are all discard related infos. Thanks Thanks, +Date: December 2023 +Contact:"Zhiguo Niu" +Description:Shows the number of issued discard. + +What: /sys/fs/f2fs//queued_discard +Date: December 2023 +Contact:"Zhiguo Niu" +Description:Shows the number of queued discard. + +What: /sys/fs/f2fs//undiscard_blks +Date: December 2023 +Contact:"Zhiguo Niu" +Description:Shows the total number of undiscard blocks. + What: /sys/fs/f2fs//max_victim_search Date: January 2014 Contact:"Jaegeuk Kim" diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c index 7099ffa..666efdd 100644 --- a/fs/f2fs/sysfs.c +++ b/fs/f2fs/sysfs.c @@ -143,6 +143,33 @@ static ssize_t pending_discard_show(struct f2fs_attr *a, &SM_I(sbi)->dcc_info->discard_cmd_cnt)); } +static ssize_t issued_discard_show(struct f2fs_attr *a, + struct f2fs_sb_info *sbi, char *buf) +{ + if (!SM_I(sbi)->dcc_info) + return -EINVAL; + return sysfs_emit(buf, "%llu\n", (unsigned long long)atomic_read( + &SM_I(sbi)->dcc_info->issued_discard)); +} + +static ssize_t queued_discard_show(struct f2fs_attr *a, + struct f2fs_sb_info *sbi, char *buf) +{ + if (!SM_I(sbi)->dcc_info) + return -EINVAL; + return sysfs_emit(buf, "%llu\n", (unsigned long long)atomic_read( + &SM_I(sbi)->dcc_info->queued_discard)); +} + +static ssize_t undiscard_blks_show(struct f2fs_attr *a, + struct f2fs_sb_info *sbi, char *buf) +{ + if (!SM_I(sbi)->dcc_info) + return -EINVAL; + return sysfs_emit(buf, "%u\n", + SM_I(sbi)->dcc_info->undiscard_blks); +} + static ssize_t gc_mode_show(struct f2fs_attr *a, struct f2fs_sb_info *sbi, char *buf) { @@ -1025,6 +1052,9 @@ static ssize_t f2fs_sb_feature_show(struct f2fs_attr *a, F2FS_GENERAL_RO_ATTR(mounted_time_sec); F2FS_GENERAL_RO_ATTR(main_blkaddr); F2FS_GENERAL_RO_ATTR(pending_discard); +F2FS_GENERAL_RO_ATTR(issued_discard); +F2FS_GENERAL_RO_ATTR(queued_discard); +F2FS_GENERAL_RO_ATTR(undiscard_blks); F2FS_GENERAL_RO_ATTR(gc_mode); #ifdef CONFIG_F2FS_STAT_FS F2FS_GENERAL_RO_ATTR(moved_blocks_background); @@ -1084,6 +1114,9 @@ static ssize_t f2fs_sb_feature_show(struct f2fs_attr *a, ATTR_LIST(max_ordered_discard), ATTR_LIST(discard_io_aware), ATTR_LIST(pending_discard), + ATTR_LIST(issued_discard), + ATTR_LIST(queued_discard), + ATTR_LIST(undiscard_blks), ATTR_LIST(gc_mode), ATTR_LIST(ipu_policy), ATTR_LIST(min_ipu_util), ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH V4] f2fs: show more discard status by sysfs
On 2023/12/19 10:21, Zhiguo Niu wrote: The current pending_discard attr just only shows the discard_cmd_cnt information. More discard status can be shown so that we can check them through sysfs when needed. Signed-off-by: Zhiguo Niu --- changes of v2: Improve the patch according to Chao's suggestions. changes of v3: Add a blank line for easy reading. changes of v4: Split to three entries --- --- Documentation/ABI/testing/sysfs-fs-f2fs | 15 +++ fs/f2fs/sysfs.c | 33 + 2 files changed, 48 insertions(+) diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs index 4f1d4e6..606a298 100644 --- a/Documentation/ABI/testing/sysfs-fs-f2fs +++ b/Documentation/ABI/testing/sysfs-fs-f2fs @@ -159,6 +159,21 @@ Date: November 2021 Contact: "Jaegeuk Kim" Description: Shows the number of pending discard commands in the queue. +What: /sys/fs/f2fs//issued_discard Add them to /sys/fs/f2fs//stat/? Thanks, +Date: December 2023 +Contact:"Zhiguo Niu" +Description:Shows the number of issued discard. + +What: /sys/fs/f2fs//queued_discard +Date: December 2023 +Contact:"Zhiguo Niu" +Description:Shows the number of queued discard. + +What: /sys/fs/f2fs//undiscard_blks +Date: December 2023 +Contact:"Zhiguo Niu" +Description:Shows the total number of undiscard blocks. + What: /sys/fs/f2fs//max_victim_search Date: January 2014 Contact: "Jaegeuk Kim" diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c index 7099ffa..666efdd 100644 --- a/fs/f2fs/sysfs.c +++ b/fs/f2fs/sysfs.c @@ -143,6 +143,33 @@ static ssize_t pending_discard_show(struct f2fs_attr *a, &SM_I(sbi)->dcc_info->discard_cmd_cnt)); } +static ssize_t issued_discard_show(struct f2fs_attr *a, + struct f2fs_sb_info *sbi, char *buf) +{ + if (!SM_I(sbi)->dcc_info) + return -EINVAL; + return sysfs_emit(buf, "%llu\n", (unsigned long long)atomic_read( + &SM_I(sbi)->dcc_info->issued_discard)); +} + +static ssize_t queued_discard_show(struct f2fs_attr *a, + struct f2fs_sb_info *sbi, char *buf) +{ + if (!SM_I(sbi)->dcc_info) + return -EINVAL; + return sysfs_emit(buf, "%llu\n", (unsigned long long)atomic_read( + &SM_I(sbi)->dcc_info->queued_discard)); +} + +static ssize_t undiscard_blks_show(struct f2fs_attr *a, + struct f2fs_sb_info *sbi, char *buf) +{ + if (!SM_I(sbi)->dcc_info) + return -EINVAL; + return sysfs_emit(buf, "%u\n", + SM_I(sbi)->dcc_info->undiscard_blks); +} + static ssize_t gc_mode_show(struct f2fs_attr *a, struct f2fs_sb_info *sbi, char *buf) { @@ -1025,6 +1052,9 @@ static ssize_t f2fs_sb_feature_show(struct f2fs_attr *a, F2FS_GENERAL_RO_ATTR(mounted_time_sec); F2FS_GENERAL_RO_ATTR(main_blkaddr); F2FS_GENERAL_RO_ATTR(pending_discard); +F2FS_GENERAL_RO_ATTR(issued_discard); +F2FS_GENERAL_RO_ATTR(queued_discard); +F2FS_GENERAL_RO_ATTR(undiscard_blks); F2FS_GENERAL_RO_ATTR(gc_mode); #ifdef CONFIG_F2FS_STAT_FS F2FS_GENERAL_RO_ATTR(moved_blocks_background); @@ -1084,6 +1114,9 @@ static ssize_t f2fs_sb_feature_show(struct f2fs_attr *a, ATTR_LIST(max_ordered_discard), ATTR_LIST(discard_io_aware), ATTR_LIST(pending_discard), + ATTR_LIST(issued_discard), + ATTR_LIST(queued_discard), + ATTR_LIST(undiscard_blks), ATTR_LIST(gc_mode), ATTR_LIST(ipu_policy), ATTR_LIST(min_ipu_util), ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH V3] f2fs: show more discard status by sysfs
On 2023/12/18 19:07, Zhiguo Niu wrote: The current pending_discard attr just only shows the discard_cmd_cnt information. More discard status can be shown so that we can check them through sysfs when needed. Signed-off-by: Zhiguo Niu Reviewed-by: Chao Yu Thanks, ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH V2] f2fs: unify the error handling of f2fs_is_valid_blkaddr
On 2023/12/12 13:23, Zhiguo Niu wrote: unify the error handling of ERROR_INVALID_BLKADDR in f2fs_is_valid_blkaddr and remove some redundant codes in f2fs_cache_compressed_page. Signed-off-by: Zhiguo Niu --- changes of v2: improve patch according Chao's suggestions. --- --- fs/f2fs/checkpoint.c | 43 +++ fs/f2fs/compress.c | 4 fs/f2fs/data.c | 24 fs/f2fs/extent_cache.c | 7 ++- fs/f2fs/f2fs.h | 5 ++--- fs/f2fs/file.c | 17 + fs/f2fs/gc.c | 2 -- fs/f2fs/inode.c| 5 ++--- fs/f2fs/node.c | 2 +- fs/f2fs/recovery.c | 13 ++--- fs/f2fs/segment.c | 2 -- 11 files changed, 45 insertions(+), 79 deletions(-) diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c index b0597a5..d8ff056 100644 --- a/fs/f2fs/checkpoint.c +++ b/fs/f2fs/checkpoint.c @@ -154,19 +154,17 @@ static bool __is_bitmap_valid(struct f2fs_sb_info *sbi, block_t blkaddr, if (unlikely(f2fs_cp_error(sbi))) return exist; - if (exist && type == DATA_GENERIC_ENHANCE_UPDATE) { - f2fs_err(sbi, "Inconsistent error blkaddr:%u, sit bitmap:%d", -blkaddr, exist); - set_sbi_flag(sbi, SBI_NEED_FSCK); - return exist; - } + if ((exist && type == DATA_GENERIC_ENHANCE_UPDATE) || + (!exist && type == DATA_GENERIC_ENHANCE)) + goto err; - if (!exist && type == DATA_GENERIC_ENHANCE) { - f2fs_err(sbi, "Inconsistent error blkaddr:%u, sit bitmap:%d", -blkaddr, exist); - set_sbi_flag(sbi, SBI_NEED_FSCK); - dump_stack(); - } + return exist; +err: + f2fs_err(sbi, "Inconsistent error blkaddr:%u, sit bitmap:%d", + blkaddr, exist); + set_sbi_flag(sbi, SBI_NEED_FSCK); + dump_stack(); + f2fs_handle_error(sbi, ERROR_INVALID_BLKADDR); return exist; } @@ -174,29 +172,29 @@ bool f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi, block_t blkaddr, int type) { if (time_to_inject(sbi, FAULT_BLKADDR)) - return false; + goto err; switch (type) { case META_NAT: break; case META_SIT: if (unlikely(blkaddr >= SIT_BLK_CNT(sbi))) - return false; + goto err; break; case META_SSA: if (unlikely(blkaddr >= MAIN_BLKADDR(sbi) || blkaddr < SM_I(sbi)->ssa_blkaddr)) - return false; + goto err; break; case META_CP: if (unlikely(blkaddr >= SIT_I(sbi)->sit_base_addr || blkaddr < __start_cp_addr(sbi))) - return false; + goto err; break; case META_POR: if (unlikely(blkaddr >= MAX_BLKADDR(sbi) || blkaddr < MAIN_BLKADDR(sbi))) - return false; + goto err; break; case DATA_GENERIC: case DATA_GENERIC_ENHANCE: @@ -213,7 +211,7 @@ bool f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi, blkaddr); set_sbi_flag(sbi, SBI_NEED_FSCK); dump_stack(); - return false; + goto err; } else { return __is_bitmap_valid(sbi, blkaddr, type); } @@ -221,13 +219,16 @@ bool f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi, case META_GENERIC: if (unlikely(blkaddr < SEG0_BLKADDR(sbi) || blkaddr >= MAIN_BLKADDR(sbi))) - return false; + goto err; break; default: BUG(); } return true; +err: + f2fs_handle_error(sbi, ERROR_INVALID_BLKADDR); + return false; } /* @@ -256,8 +257,10 @@ int f2fs_ra_meta_pages(struct f2fs_sb_info *sbi, block_t start, int nrpages, blk_start_plug(&plug); for (; nrpages-- > 0; blkno++) { - if (!f2fs_is_valid_blkaddr(sbi, blkno, type)) + if (!f2fs_is_valid_blkaddr(sbi, blkno, type)) { + err = -EFSCORRUPTED; goto out; + } switch (type) { case META_NAT: diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c index 36e5dab..bd5acb5 100644 --- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -1878,12 +1878,8 @@ void f2fs_cache_compressed_page(struct f2fs_sb_info *sbi, struct page *page, set_page_private_data(cpage, ino); - if (!f2fs_is_valid_blkaddr(sbi, blkaddr, DATA_GENERIC_ENHANCE
Re: [f2fs-dev] [PATCH V2] f2fs: fix to check return value of f2fs_recover_xattr_data
On 2023/12/12 10:15, Zhiguo Niu wrote: Should check return value of f2fs_recover_xattr_data in __f2fs_setxattr rather than doing invalid retry if error happen. Also just do set_page_dirty in f2fs_recover_xattr_data when page is changed really. Fixes: 50a472bbc79f ("f2fs: do not return EFSCORRUPTED, but try to run online repair") Signed-off-by: Zhiguo Niu Reviewed-by: Chao Yu Thanks, ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH] f2fs: fix to check return value of f2fs_recover_xattr_data
On 2023/12/8 16:50, Zhiguo Niu wrote: Should check return value of f2fs_recover_xattr_data in __f2fs_setxattr rather than doing invalid retry if error happen. Also just do set_page_dirty in f2fs_recover_xattr_data when page is changed really. Fixes: 50a472bbc79f ("f2fs: do not return EFSCORRUPTED, but try to run online repair") Signed-off-by: Zhiguo Niu --- fs/f2fs/node.c | 6 +++--- fs/f2fs/xattr.c | 12 +++- 2 files changed, 10 insertions(+), 8 deletions(-) diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c index e50a093..93bf724 100644 --- a/fs/f2fs/node.c +++ b/fs/f2fs/node.c @@ -2754,11 +2754,11 @@ int f2fs_recover_xattr_data(struct inode *inode, struct page *page) f2fs_update_inode_page(inode); /* 3: update and set xattr node page dirty */ - if (page) + if (page) { memcpy(F2FS_NODE(xpage), F2FS_NODE(page), VALID_XATTR_BLOCK_SIZE); - - set_page_dirty(xpage); + set_page_dirty(xpage); + } f2fs_put_page(xpage, 1); return 0; diff --git a/fs/f2fs/xattr.c b/fs/f2fs/xattr.c index 47e88b4..de92891 100644 --- a/fs/f2fs/xattr.c +++ b/fs/f2fs/xattr.c @@ -660,11 +660,13 @@ static int __f2fs_setxattr(struct inode *inode, int index, here = __find_xattr(base_addr, last_base_addr, NULL, index, len, name); if (!here) { if (!F2FS_I(inode)->i_xattr_nid) { - f2fs_notice(F2FS_I_SB(inode), - "recover xattr in inode (%lu)", inode->i_ino); - f2fs_recover_xattr_data(inode, NULL); - kfree(base_addr); - goto retry; + error = f2fs_recover_xattr_data(inode, NULL); How about printing here? f2fs_notice(F2FS_I_SB(inode), "recover xattr in inode (%lu), error: %d", inode->i_ino, error); Thanks, + if (!error) { + f2fs_notice(F2FS_I_SB(inode), + "recover xattr in inode (%lu)", inode->i_ino); + kfree(base_addr); + goto retry; + } } f2fs_err(F2FS_I_SB(inode), "set inode (%lu) has corrupted xattr", inode->i_ino); ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH] f2fs: unify the error handling of f2fs_is_valid_blkaddr
On 2023/12/1 13:42, Zhiguo Niu wrote: unify the error handling of f2fs_is_valid_blkaddr and remove some redundant codes in f2fs_cache_compressed_page. What about moving f2fs_handle_error(ERROR_INVALID_BLKADDR) into f2fs_is_valid_blkaddr() for cleanup? Thanks, Signed-off-by: Zhiguo Niu --- fs/f2fs/checkpoint.c | 6 +- fs/f2fs/compress.c | 8 +++- fs/f2fs/data.c | 3 ++- fs/f2fs/extent_cache.c | 7 --- fs/f2fs/f2fs.h | 6 -- fs/f2fs/file.c | 10 +++--- fs/f2fs/node.c | 7 +-- fs/f2fs/recovery.c | 15 +++ 8 files changed, 41 insertions(+), 21 deletions(-) diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c index b0597a5..47a1335 100644 --- a/fs/f2fs/checkpoint.c +++ b/fs/f2fs/checkpoint.c @@ -158,6 +158,7 @@ static bool __is_bitmap_valid(struct f2fs_sb_info *sbi, block_t blkaddr, f2fs_err(sbi, "Inconsistent error blkaddr:%u, sit bitmap:%d", blkaddr, exist); set_sbi_flag(sbi, SBI_NEED_FSCK); + dump_stack(); return exist; } @@ -256,8 +257,11 @@ int f2fs_ra_meta_pages(struct f2fs_sb_info *sbi, block_t start, int nrpages, blk_start_plug(&plug); for (; nrpages-- > 0; blkno++) { - if (!f2fs_is_valid_blkaddr(sbi, blkno, type)) + if (!f2fs_is_valid_blkaddr(sbi, blkno, type)) { + err = -EFSCORRUPTED; + f2fs_handle_error(sbi, ERROR_INVALID_BLKADDR); goto out; + } switch (type) { case META_NAT: diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c index 36e5dab..15d5035 100644 --- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -1853,8 +1853,10 @@ void f2fs_cache_compressed_page(struct f2fs_sb_info *sbi, struct page *page, if (!test_opt(sbi, COMPRESS_CACHE)) return; - if (!f2fs_is_valid_blkaddr(sbi, blkaddr, DATA_GENERIC_ENHANCE_READ)) + if (!f2fs_is_valid_blkaddr(sbi, blkaddr, DATA_GENERIC_ENHANCE_READ)) { + f2fs_handle_error(sbi, ERROR_INVALID_BLKADDR); return; + } if (!f2fs_available_free_memory(sbi, COMPRESS_PAGE)) return; @@ -1878,12 +1880,8 @@ void f2fs_cache_compressed_page(struct f2fs_sb_info *sbi, struct page *page, set_page_private_data(cpage, ino); - if (!f2fs_is_valid_blkaddr(sbi, blkaddr, DATA_GENERIC_ENHANCE_READ)) - goto out; - memcpy(page_address(cpage), page_address(page), PAGE_SIZE); SetPageUptodate(cpage); -out: f2fs_put_page(cpage, 1); } diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index 4e42b5f..e1a37ea 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -2309,7 +2309,8 @@ int f2fs_read_multi_pages(struct compress_ctx *cc, struct bio **bio_ret, break; if (!f2fs_is_valid_blkaddr(sbi, blkaddr, DATA_GENERIC)) { - ret = -EFAULT; + ret = -EFSCORRUPTED; + f2fs_handle_error(sbi, ERROR_INVALID_BLKADDR); goto out_put_dnode; } cc->nr_cpages++; diff --git a/fs/f2fs/extent_cache.c b/fs/f2fs/extent_cache.c index ad8dfac7..33e5632 100644 --- a/fs/f2fs/extent_cache.c +++ b/fs/f2fs/extent_cache.c @@ -43,7 +43,7 @@ bool sanity_check_extent_cache(struct inode *inode) if (!f2fs_is_valid_blkaddr(sbi, ei->blk, DATA_GENERIC_ENHANCE) || !f2fs_is_valid_blkaddr(sbi, ei->blk + ei->len - 1, DATA_GENERIC_ENHANCE)) { - set_sbi_flag(sbi, SBI_NEED_FSCK); + f2fs_handle_error(sbi, ERROR_INVALID_BLKADDR); f2fs_warn(sbi, "%s: inode (ino=%lx) extent info [%u, %u, %u] is incorrect, run fsck to fix", __func__, inode->i_ino, ei->blk, ei->fofs, ei->len); @@ -857,8 +857,9 @@ static int __get_new_block_age(struct inode *inode, struct extent_info *ei, if (__is_valid_data_blkaddr(blkaddr) && !f2fs_is_valid_blkaddr(sbi, blkaddr, DATA_GENERIC_ENHANCE)) { - f2fs_bug_on(sbi, 1); - return -EINVAL; + f2fs_handle_error(sbi, ERROR_INVALID_BLKADDR); + return -EFSCORRUPTED; + } out: /* diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 9043ced..a4b8d86 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -3454,13 +3454,16 @@ static inline int get_inline_xattr_addrs(struct inode *inode) bool f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi, block_t blkaddr, int type); +void f2fs_handle_error(struct f2fs_sb_info *sbi, unsigned char error); + static inline void verify_blkaddr(struct f2fs_sb_info *sbi, block_t blkaddr, int type) { if (!f2fs_is_valid_blkaddr(sbi, blkaddr, typ
Re: [f2fs-dev] [PATCH 5/6] f2fs: fix to restrict condition of compress inode conversion
On 2023/12/12 6:11, Jaegeuk Kim wrote: On 12/10, Chao Yu wrote: This patch adds i_size check during compress inode conversion in order to avoid .page_mkwrite races w/ conversion. Which race condition do you see? Something like: - f2fs_setflags_common - check S_ISREG && F2FS_HAS_BLOCKS - mkwrite - f2fs_get_block_locked : update metadata in old inode's disk layout - set_compress_context Thanks, Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu --- fs/f2fs/f2fs.h | 8 +++- fs/f2fs/file.c | 5 ++--- 2 files changed, 9 insertions(+), 4 deletions(-) diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 65294e3b0bef..c9b8a1953913 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -4397,13 +4397,19 @@ static inline int set_compress_context(struct inode *inode) #endif } +static inline bool inode_has_data(struct inode *inode) +{ + return (S_ISREG(inode->i_mode) && + (F2FS_HAS_BLOCKS(inode) || i_size_read(inode))); +} + static inline bool f2fs_disable_compressed_file(struct inode *inode) { struct f2fs_inode_info *fi = F2FS_I(inode); if (!f2fs_compressed_file(inode)) return true; - if (S_ISREG(inode->i_mode) && F2FS_HAS_BLOCKS(inode)) + if (inode_has_data(inode)) return false; fi->i_flags &= ~F2FS_COMPR_FL; diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 1a3c29a9a6a0..8af4b29c3e1a 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -1922,8 +1922,7 @@ static int f2fs_setflags_common(struct inode *inode, u32 iflags, u32 mask) f2fs_down_write(&F2FS_I(inode)->i_sem); if (!f2fs_may_compress(inode) || - (S_ISREG(inode->i_mode) && - F2FS_HAS_BLOCKS(inode))) { + inode_has_data(inode)) { f2fs_up_write(&F2FS_I(inode)->i_sem); return -EINVAL; } @@ -3996,7 +3995,7 @@ static int f2fs_ioc_set_compress_option(struct file *filp, unsigned long arg) goto out; } - if (F2FS_HAS_BLOCKS(inode)) { + if (inode_has_data(inode)) { ret = -EFBIG; goto out; } -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH v2 4/6] f2fs: don't set FI_PREALLOCATED_ALL for partial write
In f2fs_preallocate_blocks(), if it is partial write in 4KB, it's not necessary to call f2fs_map_blocks() and set FI_PREALLOCATED_ALL flag. Cc: Eric Biggers Signed-off-by: Chao Yu --- v2: - clean up codes fs/f2fs/file.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 79d5b64c109c..026d05a7edd8 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -4567,7 +4567,8 @@ static int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *iter, if (map.m_len > map.m_lblk) map.m_len -= map.m_lblk; else - map.m_len = 0; + return 0; + map.m_may_create = true; if (dio) { map.m_seg_type = f2fs_rw_hint_to_seg_type(inode->i_write_hint); -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH 6/6] f2fs: fix to update iostat correctly in f2fs_filemap_fault()
In f2fs_filemap_fault(), it fixes to update iostat info only if VM_FAULT_LOCKED is tagged in return value of filemap_fault(). Fixes: 8b83ac81f428 ("f2fs: support read iostat") Signed-off-by: Chao Yu --- fs/f2fs/file.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 8af4b29c3e1a..0626da43fa84 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -42,7 +42,7 @@ static vm_fault_t f2fs_filemap_fault(struct vm_fault *vmf) vm_fault_t ret; ret = filemap_fault(vmf); - if (!ret) + if (ret & VM_FAULT_LOCKED) f2fs_update_iostat(F2FS_I_SB(inode), inode, APP_MAPPED_READ_IO, F2FS_BLKSIZE); -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH 5/6] f2fs: fix to restrict condition of compress inode conversion
This patch adds i_size check during compress inode conversion in order to avoid .page_mkwrite races w/ conversion. Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu --- fs/f2fs/f2fs.h | 8 +++- fs/f2fs/file.c | 5 ++--- 2 files changed, 9 insertions(+), 4 deletions(-) diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 65294e3b0bef..c9b8a1953913 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -4397,13 +4397,19 @@ static inline int set_compress_context(struct inode *inode) #endif } +static inline bool inode_has_data(struct inode *inode) +{ + return (S_ISREG(inode->i_mode) && + (F2FS_HAS_BLOCKS(inode) || i_size_read(inode))); +} + static inline bool f2fs_disable_compressed_file(struct inode *inode) { struct f2fs_inode_info *fi = F2FS_I(inode); if (!f2fs_compressed_file(inode)) return true; - if (S_ISREG(inode->i_mode) && F2FS_HAS_BLOCKS(inode)) + if (inode_has_data(inode)) return false; fi->i_flags &= ~F2FS_COMPR_FL; diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 1a3c29a9a6a0..8af4b29c3e1a 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -1922,8 +1922,7 @@ static int f2fs_setflags_common(struct inode *inode, u32 iflags, u32 mask) f2fs_down_write(&F2FS_I(inode)->i_sem); if (!f2fs_may_compress(inode) || - (S_ISREG(inode->i_mode) && - F2FS_HAS_BLOCKS(inode))) { + inode_has_data(inode)) { f2fs_up_write(&F2FS_I(inode)->i_sem); return -EINVAL; } @@ -3996,7 +3995,7 @@ static int f2fs_ioc_set_compress_option(struct file *filp, unsigned long arg) goto out; } - if (F2FS_HAS_BLOCKS(inode)) { + if (inode_has_data(inode)) { ret = -EFBIG; goto out; } -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH 3/6] f2fs: fix to check compress file in f2fs_move_file_range()
f2fs_move_file_range() doesn't support migrating compressed cluster data, let's add the missing check condition and return -EOPNOTSUPP for the case until we support it. Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu --- fs/f2fs/file.c | 5 + 1 file changed, 5 insertions(+) diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 156b0ff05a3b..5c2f99ada6be 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -2813,6 +2813,11 @@ static int f2fs_move_file_range(struct file *file_in, loff_t pos_in, goto out; } + if (f2fs_compressed_file(src) || f2fs_compressed_file(dst)) { + ret = -EOPNOTSUPP; + goto out_unlock; + } + ret = -EINVAL; if (pos_in + len > src->i_size || pos_in + len < pos_in) goto out_unlock; -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH 4/6] f2fs: don't set FI_PREALLOCATED_ALL for partial write
In f2fs_preallocate_blocks(), if it is partial write in 4KB, it's not necessary to call f2fs_map_blocks() and set FI_PREALLOCATED_ALL flag. Cc: Eric Biggers Signed-off-by: Chao Yu --- fs/f2fs/file.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 5c2f99ada6be..1a3c29a9a6a0 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -4561,13 +4561,14 @@ static int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *iter, return ret; } - /* Do not preallocate blocks that will be written partially in 4KB. */ map.m_lblk = F2FS_BLK_ALIGN(pos); map.m_len = F2FS_BYTES_TO_BLK(pos + count); - if (map.m_len > map.m_lblk) - map.m_len -= map.m_lblk; - else - map.m_len = 0; + + /* Do not preallocate blocks that will be written partially in 4KB. */ + if (map.m_len <= map.m_lblk) + return 0; + + map.m_len -= map.m_lblk; map.m_may_create = true; if (dio) { map.m_seg_type = f2fs_rw_hint_to_seg_type(inode->i_write_hint); -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH 2/6] f2fs: fix to wait on block writeback for post_read case
If inode is compressed, but not encrypted, it missed to call f2fs_wait_on_block_writeback() to wait for GCed page writeback in IPU write path. Thread AGC-Thread - f2fs_gc - do_garbage_collect - gc_data_segment - move_data_block - f2fs_submit_page_write migrate normal cluster's block via meta_inode's page cache - f2fs_write_single_data_page - f2fs_do_write_data_page - f2fs_inplace_write_data - f2fs_submit_page_bio IRQ - f2fs_read_end_io IRQ old data overrides new data due to out-of-order GC and common IO. - f2fs_read_end_io Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu --- fs/f2fs/data.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index 27015b7875ae..dce8defdf4c7 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -2556,9 +2556,6 @@ int f2fs_encrypt_one_page(struct f2fs_io_info *fio) page = fio->compressed_page ? fio->compressed_page : fio->page; - /* wait for GCed page writeback via META_MAPPING */ - f2fs_wait_on_block_writeback(inode, fio->old_blkaddr); - if (fscrypt_inode_uses_inline_crypto(inode)) return 0; @@ -2745,6 +2742,10 @@ int f2fs_do_write_data_page(struct f2fs_io_info *fio) goto out_writepage; } + /* wait for GCed page writeback via META_MAPPING */ + if (fio->post_read) + f2fs_wait_on_block_writeback(inode, fio->old_blkaddr); + /* * If current allocation needs SSR, * it had better in-place writes for updated data. -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH 1/6] f2fs: fix to tag gcing flag on page during block migration
It needs to add missing gcing flag on page during block migration, in order to garantee migrated data be persisted during checkpoint, otherwise out-of-order persistency between data and node may cause data corruption after SPOR. Similar issue was fixed by commit 2d1fe8a86bf5 ("f2fs: fix to tag gcing flag on page during file defragment"). Signed-off-by: Chao Yu --- fs/f2fs/compress.c | 4 +++- fs/f2fs/file.c | 2 ++ 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c index b35be5799726..c5a4364c4482 100644 --- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -1036,8 +1036,10 @@ static void set_cluster_dirty(struct compress_ctx *cc) int i; for (i = 0; i < cc->cluster_size; i++) - if (cc->rpages[i]) + if (cc->rpages[i]) { set_page_dirty(cc->rpages[i]); + set_page_private_gcing(cc->rpages[i]); + } } static int prepare_compress_overwrite(struct compress_ctx *cc, diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 60290940018d..156b0ff05a3b 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -1312,6 +1312,7 @@ static int __clone_blkaddrs(struct inode *src_inode, struct inode *dst_inode, } memcpy_page(pdst, 0, psrc, 0, PAGE_SIZE); set_page_dirty(pdst); + set_page_private_gcing(pdst); f2fs_put_page(pdst, 1); f2fs_put_page(psrc, 1); @@ -4046,6 +4047,7 @@ static int redirty_blocks(struct inode *inode, pgoff_t page_idx, int len) f2fs_bug_on(F2FS_I_SB(inode), !page); set_page_dirty(page); + set_page_private_gcing(page); f2fs_put_page(page, 1); f2fs_put_page(page, 0); } -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH 5/6] f2fs: introduce f2fs_invalidate_internal_cache() for cleanup
Just cleanup, no logic changes. Signed-off-by: Chao Yu --- fs/f2fs/data.c| 8 +++- fs/f2fs/f2fs.h| 7 +++ fs/f2fs/gc.c | 5 ++--- fs/f2fs/segment.c | 14 -- 4 files changed, 16 insertions(+), 18 deletions(-) diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index d86419b01310..27015b7875ae 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -1484,11 +1484,9 @@ static int __allocate_data_block(struct dnode_of_data *dn, int seg_type) old_blkaddr = dn->data_blkaddr; f2fs_allocate_data_block(sbi, NULL, old_blkaddr, &dn->data_blkaddr, &sum, seg_type, NULL); - if (GET_SEGNO(sbi, old_blkaddr) != NULL_SEGNO) { - invalidate_mapping_pages(META_MAPPING(sbi), - old_blkaddr, old_blkaddr); - f2fs_invalidate_compress_page(sbi, old_blkaddr); - } + if (GET_SEGNO(sbi, old_blkaddr) != NULL_SEGNO) + f2fs_invalidate_internal_cache(sbi, old_blkaddr); + f2fs_update_data_blkaddr(dn, dn->data_blkaddr); return 0; } diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 50e666ebd987..65294e3b0bef 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -4613,6 +4613,13 @@ static inline bool f2fs_is_readonly(struct f2fs_sb_info *sbi) return f2fs_sb_has_readonly(sbi) || f2fs_readonly(sbi->sb); } +static inline void f2fs_invalidate_internal_cache(struct f2fs_sb_info *sbi, + block_t blkaddr) +{ + invalidate_mapping_pages(META_MAPPING(sbi), blkaddr, blkaddr); + f2fs_invalidate_compress_page(sbi, blkaddr); +} + #define EFSBADCRC EBADMSG /* Bad CRC detected */ #define EFSCORRUPTED EUCLEAN /* Filesystem is corrupted */ diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c index 2fbe16ad726f..405a6077bd83 100644 --- a/fs/f2fs/gc.c +++ b/fs/f2fs/gc.c @@ -1380,9 +1380,8 @@ static int move_data_block(struct inode *inode, block_t bidx, memcpy(page_address(fio.encrypted_page), page_address(mpage), PAGE_SIZE); f2fs_put_page(mpage, 1); - invalidate_mapping_pages(META_MAPPING(fio.sbi), - fio.old_blkaddr, fio.old_blkaddr); - f2fs_invalidate_compress_page(fio.sbi, fio.old_blkaddr); + + f2fs_invalidate_internal_cache(fio.sbi, fio.old_blkaddr); set_page_dirty(fio.encrypted_page); if (clear_page_dirty_for_io(fio.encrypted_page)) diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c index 007ebb107236..61da26eb61cc 100644 --- a/fs/f2fs/segment.c +++ b/fs/f2fs/segment.c @@ -2500,8 +2500,7 @@ void f2fs_invalidate_blocks(struct f2fs_sb_info *sbi, block_t addr) if (addr == NEW_ADDR || addr == COMPRESS_ADDR) return; - invalidate_mapping_pages(META_MAPPING(sbi), addr, addr); - f2fs_invalidate_compress_page(sbi, addr); + f2fs_invalidate_internal_cache(sbi, addr); /* add it into sit main buffer */ down_write(&sit_i->sentry_lock); @@ -3562,11 +3561,8 @@ static void do_write_page(struct f2fs_summary *sum, struct f2fs_io_info *fio) reallocate: f2fs_allocate_data_block(fio->sbi, fio->page, fio->old_blkaddr, &fio->new_blkaddr, sum, type, fio); - if (GET_SEGNO(fio->sbi, fio->old_blkaddr) != NULL_SEGNO) { - invalidate_mapping_pages(META_MAPPING(fio->sbi), - fio->old_blkaddr, fio->old_blkaddr); - f2fs_invalidate_compress_page(fio->sbi, fio->old_blkaddr); - } + if (GET_SEGNO(fio->sbi, fio->old_blkaddr) != NULL_SEGNO) + f2fs_invalidate_internal_cache(fio->sbi, fio->old_blkaddr); /* writeout dirty page into bdev */ f2fs_submit_page_write(fio); @@ -3762,9 +3758,7 @@ void f2fs_do_replace_block(struct f2fs_sb_info *sbi, struct f2fs_summary *sum, update_sit_entry(sbi, new_blkaddr, 1); } if (GET_SEGNO(sbi, old_blkaddr) != NULL_SEGNO) { - invalidate_mapping_pages(META_MAPPING(sbi), - old_blkaddr, old_blkaddr); - f2fs_invalidate_compress_page(sbi, old_blkaddr); + f2fs_invalidate_internal_cache(sbi, old_blkaddr); if (!from_gc) update_segment_mtime(sbi, old_blkaddr, 0); update_sit_entry(sbi, old_blkaddr, -1); -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH 1/6] f2fs: delete obsolete FI_FIRST_BLOCK_WRITTEN
Commit 3c6c2bebef79 ("f2fs: avoid punch_hole overhead when releasing volatile data") introduced FI_FIRST_BLOCK_WRITTEN as below reason: This patch is to avoid some punch_hole overhead when releasing volatile data. If volatile data was not written yet, we just can make the first page as zero. After commit 7bc155fec5b3 ("f2fs: kill volatile write support"), we won't support volatile write, but it missed to remove obsolete FI_FIRST_BLOCK_WRITTEN, delete it in this patch. Signed-off-by: Chao Yu --- fs/f2fs/compress.c | 2 -- fs/f2fs/data.c | 2 -- fs/f2fs/f2fs.h | 6 -- fs/f2fs/file.c | 3 --- fs/f2fs/gc.c | 2 -- fs/f2fs/inode.c| 25 - 6 files changed, 40 deletions(-) diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c index 36e5dab6baae..b35be5799726 100644 --- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -1369,8 +1369,6 @@ static int f2fs_write_compressed_pages(struct compress_ctx *cc, add_compr_block_stat(inode, cc->valid_nr_cpages); set_inode_flag(cc->inode, FI_APPEND_WRITE); - if (cc->cluster_idx == 0) - set_inode_flag(inode, FI_FIRST_BLOCK_WRITTEN); f2fs_put_dnode(&dn); if (quota_inode) diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index 42f0f6184f73..73d0726ac366 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -2810,8 +2810,6 @@ int f2fs_do_write_data_page(struct f2fs_io_info *fio) f2fs_outplace_write_data(&dn, fio); trace_f2fs_do_write_data_page(page, OPU); set_inode_flag(inode, FI_APPEND_WRITE); - if (page->index == 0) - set_inode_flag(inode, FI_FIRST_BLOCK_WRITTEN); out_writepage: f2fs_put_dnode(&dn); out: diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 86a145be4e53..be9a8e50ac50 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -781,7 +781,6 @@ enum { FI_UPDATE_WRITE,/* inode has in-place-update data */ FI_NEED_IPU,/* used for ipu per file */ FI_ATOMIC_FILE, /* indicate atomic file */ - FI_FIRST_BLOCK_WRITTEN, /* indicate #0 data block was written */ FI_DROP_CACHE, /* drop dirty page cache */ FI_DATA_EXIST, /* indicate data exists */ FI_INLINE_DOTS, /* indicate inline dot dentries */ @@ -3279,11 +3278,6 @@ static inline bool f2fs_is_cow_file(struct inode *inode) return is_inode_flag_set(inode, FI_COW_FILE); } -static inline bool f2fs_is_first_block_written(struct inode *inode) -{ - return is_inode_flag_set(inode, FI_FIRST_BLOCK_WRITTEN); -} - static inline bool f2fs_is_drop_cache(struct inode *inode) { return is_inode_flag_set(inode, FI_DROP_CACHE); diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 325dab01a29d..5025abf2d995 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -599,9 +599,6 @@ void f2fs_truncate_data_blocks_range(struct dnode_of_data *dn, int count) valid_blocks++; } - if (dn->ofs_in_node == 0 && IS_INODE(dn->node_page)) - clear_inode_flag(dn->inode, FI_FIRST_BLOCK_WRITTEN); - f2fs_invalidate_blocks(sbi, blkaddr); if (!released || blkaddr != COMPRESS_ADDR) diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c index f550cdeaa663..2fbe16ad726f 100644 --- a/fs/f2fs/gc.c +++ b/fs/f2fs/gc.c @@ -1405,8 +1405,6 @@ static int move_data_block(struct inode *inode, block_t bidx, f2fs_update_data_blkaddr(&dn, newaddr); set_inode_flag(inode, FI_APPEND_WRITE); - if (page->index == 0) - set_inode_flag(inode, FI_FIRST_BLOCK_WRITTEN); put_page_out: f2fs_put_page(fio.encrypted_page, 1); recover_block: diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c index 560bfcad1af2..108e3d00028a 100644 --- a/fs/f2fs/inode.c +++ b/fs/f2fs/inode.c @@ -76,20 +76,6 @@ static void __get_inode_rdev(struct inode *inode, struct f2fs_inode *ri) } } -static int __written_first_block(struct f2fs_sb_info *sbi, - struct f2fs_inode *ri) -{ - block_t addr = le32_to_cpu(ri->i_addr[offset_in_addr(ri)]); - - if (!__is_valid_data_blkaddr(addr)) - return 1; - if (!f2fs_is_valid_blkaddr(sbi, addr, DATA_GENERIC_ENHANCE)) { - f2fs_handle_error(sbi, ERROR_INVALID_BLKADDR); - return -EFSCORRUPTED; - } - return 0; -} - static void __set_inode_rdev(struct inode *inode, struct f2fs_inode *ri) { int extra_size = get_extra_isize(inode); @@ -398,7 +384,6 @@ static int do_read_inode(struct inode *inode) struct page *node_page; struct f2fs_inode *ri; projid_t i_projid; - int err; /* Check if ino is within scope */ if (f2fs_check_nid_range(sbi, inode->i_ino)) @@ -480,16 +465,6 @@ static int do_read_inode(struct inode *inode)
[f2fs-dev] [PATCH 3/6] f2fs: introduce get_dnode_addr() to clean up codes
Just cleanup, no logic changes. Signed-off-by: Chao Yu --- fs/f2fs/data.c | 11 ++- fs/f2fs/f2fs.h | 18 +++--- fs/f2fs/file.c | 8 +--- fs/f2fs/inode.c | 32 ++-- 4 files changed, 32 insertions(+), 37 deletions(-) diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index 7a81ff3c385a..ae46c4841ca9 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -1181,16 +1181,9 @@ static int f2fs_submit_page_read(struct inode *inode, struct page *page, static void __set_data_blkaddr(struct dnode_of_data *dn) { - struct f2fs_node *rn = F2FS_NODE(dn->node_page); - __le32 *addr_array; - int base = 0; + __le32 *addr = get_dnode_addr(dn->inode, dn->node_page); - if (IS_INODE(dn->node_page) && f2fs_has_extra_attr(dn->inode)) - base = get_extra_isize(dn->inode); - - /* Get physical address of data block */ - addr_array = blkaddr_in_node(rn); - addr_array[base + dn->ofs_in_node] = cpu_to_le32(dn->data_blkaddr); + addr[dn->ofs_in_node] = cpu_to_le32(dn->data_blkaddr); } /* diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 2f1d76088953..a06b8aad5117 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -3277,12 +3277,13 @@ static inline bool f2fs_is_cow_file(struct inode *inode) return is_inode_flag_set(inode, FI_COW_FILE); } +static inline __le32 *get_dnode_addr(struct inode *inode, + struct page *node_page); static inline void *inline_data_addr(struct inode *inode, struct page *page) { - struct f2fs_inode *ri = F2FS_INODE(page); - int extra_size = get_extra_isize(inode); + __le32 *addr = get_dnode_addr(inode, page); - return (void *)&(ri->i_addr[extra_size + DEF_INLINE_RESERVED_SIZE]); + return (void *)(addr + DEF_INLINE_RESERVED_SIZE); } static inline int f2fs_has_inline_dentry(struct inode *inode) @@ -3427,6 +3428,17 @@ static inline int get_inline_xattr_addrs(struct inode *inode) return F2FS_I(inode)->i_inline_xattr_size; } +static inline __le32 *get_dnode_addr(struct inode *inode, + struct page *node_page) +{ + int base = 0; + + if (IS_INODE(node_page) && f2fs_has_extra_attr(inode)) + base = get_extra_isize(inode); + + return blkaddr_in_node(F2FS_NODE(node_page)) + base; +} + #define f2fs_get_inode_mode(i) \ ((is_inode_flag_set(i, FI_ACL_MODE)) ? \ (F2FS_I(i)->i_acl_mode) : ((i)->i_mode)) diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 5025abf2d995..d0e7894e42d4 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -557,20 +557,14 @@ static int f2fs_file_open(struct inode *inode, struct file *filp) void f2fs_truncate_data_blocks_range(struct dnode_of_data *dn, int count) { struct f2fs_sb_info *sbi = F2FS_I_SB(dn->inode); - struct f2fs_node *raw_node; int nr_free = 0, ofs = dn->ofs_in_node, len = count; __le32 *addr; - int base = 0; bool compressed_cluster = false; int cluster_index = 0, valid_blocks = 0; int cluster_size = F2FS_I(dn->inode)->i_cluster_size; bool released = !atomic_read(&F2FS_I(dn->inode)->i_compr_blocks); - if (IS_INODE(dn->node_page) && f2fs_has_extra_attr(dn->inode)) - base = get_extra_isize(dn->inode); - - raw_node = F2FS_NODE(dn->node_page); - addr = blkaddr_in_node(raw_node) + base + ofs; + addr = get_dnode_addr(dn->inode, dn->node_page) + ofs; /* Assumption: truncation starts with cluster */ for (; count > 0; count--, addr++, dn->ofs_in_node++, cluster_index++) { diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c index 108e3d00028a..b31410c4afbc 100644 --- a/fs/f2fs/inode.c +++ b/fs/f2fs/inode.c @@ -61,35 +61,31 @@ void f2fs_set_inode_flags(struct inode *inode) S_ENCRYPTED|S_VERITY|S_CASEFOLD); } -static void __get_inode_rdev(struct inode *inode, struct f2fs_inode *ri) +static void __get_inode_rdev(struct inode *inode, struct page *node_page) { - int extra_size = get_extra_isize(inode); + __le32 *addr = get_dnode_addr(inode, node_page); if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode) || S_ISFIFO(inode->i_mode) || S_ISSOCK(inode->i_mode)) { - if (ri->i_addr[extra_size]) - inode->i_rdev = old_decode_dev( - le32_to_cpu(ri->i_addr[extra_size])); + if (addr[0]) + inode->i_rdev = old_decode_dev(le32_to_cpu(addr[0])); else - inode->i_rdev = new_decode_dev( - le32_to_cpu(ri->i_addr[extra_size + 1])); + inode->i_rdev = new_decode_dev(le32_to_cpu(addr[1]));
[f2fs-dev] [PATCH 4/6] f2fs: update blkaddr in __set_data_blkaddr() for cleanup
This patch allows caller to pass blkaddr to f2fs_set_data_blkaddr() and let __set_data_blkaddr() inside f2fs_set_data_blkaddr() to update dn->data_blkaddr w/ last value of blkaddr. Just cleanup, no logic changes. Signed-off-by: Chao Yu --- fs/f2fs/data.c | 13 ++--- fs/f2fs/f2fs.h | 2 +- fs/f2fs/file.c | 12 3 files changed, 11 insertions(+), 16 deletions(-) diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index ae46c4841ca9..d86419b01310 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -1179,10 +1179,11 @@ static int f2fs_submit_page_read(struct inode *inode, struct page *page, return 0; } -static void __set_data_blkaddr(struct dnode_of_data *dn) +static void __set_data_blkaddr(struct dnode_of_data *dn, block_t blkaddr) { __le32 *addr = get_dnode_addr(dn->inode, dn->node_page); + dn->data_blkaddr = blkaddr; addr[dn->ofs_in_node] = cpu_to_le32(dn->data_blkaddr); } @@ -1192,18 +1193,17 @@ static void __set_data_blkaddr(struct dnode_of_data *dn) * ->node_page *update block addresses in the node page */ -void f2fs_set_data_blkaddr(struct dnode_of_data *dn) +void f2fs_set_data_blkaddr(struct dnode_of_data *dn, block_t blkaddr) { f2fs_wait_on_page_writeback(dn->node_page, NODE, true, true); - __set_data_blkaddr(dn); + __set_data_blkaddr(dn, blkaddr); if (set_page_dirty(dn->node_page)) dn->node_changed = true; } void f2fs_update_data_blkaddr(struct dnode_of_data *dn, block_t blkaddr) { - dn->data_blkaddr = blkaddr; - f2fs_set_data_blkaddr(dn); + f2fs_set_data_blkaddr(dn, blkaddr); f2fs_update_read_extent_cache(dn); } @@ -1230,8 +1230,7 @@ int f2fs_reserve_new_blocks(struct dnode_of_data *dn, blkcnt_t count) block_t blkaddr = f2fs_data_blkaddr(dn); if (blkaddr == NULL_ADDR) { - dn->data_blkaddr = NEW_ADDR; - __set_data_blkaddr(dn); + __set_data_blkaddr(dn, NEW_ADDR); count--; } } diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index a06b8aad5117..50e666ebd987 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -3822,7 +3822,7 @@ void f2fs_submit_page_write(struct f2fs_io_info *fio); struct block_device *f2fs_target_device(struct f2fs_sb_info *sbi, block_t blk_addr, sector_t *sector); int f2fs_target_device_index(struct f2fs_sb_info *sbi, block_t blkaddr); -void f2fs_set_data_blkaddr(struct dnode_of_data *dn); +void f2fs_set_data_blkaddr(struct dnode_of_data *dn, block_t blkaddr); void f2fs_update_data_blkaddr(struct dnode_of_data *dn, block_t blkaddr); int f2fs_reserve_new_blocks(struct dnode_of_data *dn, blkcnt_t count); int f2fs_reserve_new_block(struct dnode_of_data *dn); diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index d0e7894e42d4..3c7e6bfc1265 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -582,8 +582,7 @@ void f2fs_truncate_data_blocks_range(struct dnode_of_data *dn, int count) if (blkaddr == NULL_ADDR) continue; - dn->data_blkaddr = NULL_ADDR; - f2fs_set_data_blkaddr(dn); + f2fs_set_data_blkaddr(dn, NULL_ADDR); if (__is_valid_data_blkaddr(blkaddr)) { if (!f2fs_is_valid_blkaddr(sbi, blkaddr, @@ -1478,8 +1477,7 @@ static int f2fs_do_zero_range(struct dnode_of_data *dn, pgoff_t start, } f2fs_invalidate_blocks(sbi, dn->data_blkaddr); - dn->data_blkaddr = NEW_ADDR; - f2fs_set_data_blkaddr(dn); + f2fs_set_data_blkaddr(dn, NEW_ADDR); } f2fs_update_read_extent_cache_range(dn, start, 0, index - start); @@ -3454,8 +3452,7 @@ static int release_compress_blocks(struct dnode_of_data *dn, pgoff_t count) if (blkaddr != NEW_ADDR) continue; - dn->data_blkaddr = NULL_ADDR; - f2fs_set_data_blkaddr(dn); + f2fs_set_data_blkaddr(dn, NULL_ADDR); } f2fs_i_compr_blocks_update(dn->inode, compr_blocks, false); @@ -3621,8 +3618,7 @@ static int reserve_compress_blocks(struct dnode_of_data *dn, pgoff_t count) continue; } - dn->data_blkaddr = NEW_ADDR; - f2fs_set_data_blkaddr(dn); + f2fs_set_data_blkaddr(dn, NEW_ADDR); } reserved = cluster_size - compr_blocks; -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH 2/6] f2fs: delete obsolete FI_DROP_CACHE
FI_DROP_CACHE was introduced in commit 1e84371ffeef ("f2fs: change atomic and volatile write policies") for volatile write feature, after commit 7bc155fec5b3 ("f2fs: kill volatile write support"), we won't support volatile write, let's delete related codes. Signed-off-by: Chao Yu --- fs/f2fs/data.c | 3 --- fs/f2fs/f2fs.h | 6 -- 2 files changed, 9 deletions(-) diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index 73d0726ac366..7a81ff3c385a 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -2892,9 +2892,6 @@ int f2fs_write_single_data_page(struct page *page, int *submitted, zero_user_segment(page, offset, PAGE_SIZE); write: - if (f2fs_is_drop_cache(inode)) - goto out; - /* Dentry/quota blocks are controlled by checkpoint */ if (S_ISDIR(inode->i_mode) || quota_inode) { /* diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index be9a8e50ac50..2f1d76088953 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -781,7 +781,6 @@ enum { FI_UPDATE_WRITE,/* inode has in-place-update data */ FI_NEED_IPU,/* used for ipu per file */ FI_ATOMIC_FILE, /* indicate atomic file */ - FI_DROP_CACHE, /* drop dirty page cache */ FI_DATA_EXIST, /* indicate data exists */ FI_INLINE_DOTS, /* indicate inline dot dentries */ FI_SKIP_WRITES, /* should skip data page writeback */ @@ -3278,11 +3277,6 @@ static inline bool f2fs_is_cow_file(struct inode *inode) return is_inode_flag_set(inode, FI_COW_FILE); } -static inline bool f2fs_is_drop_cache(struct inode *inode) -{ - return is_inode_flag_set(inode, FI_DROP_CACHE); -} - static inline void *inline_data_addr(struct inode *inode, struct page *page) { struct f2fs_inode *ri = F2FS_INODE(page); -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH 6/6] f2fs: add tracepoint for f2fs_vm_page_mkwrite()
This patch adds to support tracepoint for f2fs_vm_page_mkwrite(), meanwhile it prints more details for trace_f2fs_filemap_fault(). Signed-off-by: Chao Yu --- fs/f2fs/file.c | 25 ++-- include/trace/events/f2fs.h | 39 - 2 files changed, 41 insertions(+), 23 deletions(-) diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 3c7e6bfc1265..60290940018d 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -46,7 +46,7 @@ static vm_fault_t f2fs_filemap_fault(struct vm_fault *vmf) f2fs_update_iostat(F2FS_I_SB(inode), inode, APP_MAPPED_READ_IO, F2FS_BLKSIZE); - trace_f2fs_filemap_fault(inode, vmf->pgoff, (unsigned long)ret); + trace_f2fs_filemap_fault(inode, vmf->pgoff, vmf->vma->vm_flags, ret); return ret; } @@ -59,26 +59,29 @@ static vm_fault_t f2fs_vm_page_mkwrite(struct vm_fault *vmf) struct dnode_of_data dn; bool need_alloc = true; int err = 0; + vm_fault_t ret; if (unlikely(IS_IMMUTABLE(inode))) return VM_FAULT_SIGBUS; - if (is_inode_flag_set(inode, FI_COMPRESS_RELEASED)) - return VM_FAULT_SIGBUS; + if (is_inode_flag_set(inode, FI_COMPRESS_RELEASED)) { + err = -EIO; + goto out; + } if (unlikely(f2fs_cp_error(sbi))) { err = -EIO; - goto err; + goto out; } if (!f2fs_is_checkpoint_ready(sbi)) { err = -ENOSPC; - goto err; + goto out; } err = f2fs_convert_inline_inode(inode); if (err) - goto err; + goto out; #ifdef CONFIG_F2FS_FS_COMPRESSION if (f2fs_compressed_file(inode)) { @@ -86,7 +89,7 @@ static vm_fault_t f2fs_vm_page_mkwrite(struct vm_fault *vmf) if (ret < 0) { err = ret; - goto err; + goto out; } else if (ret) { need_alloc = false; } @@ -153,13 +156,15 @@ static vm_fault_t f2fs_vm_page_mkwrite(struct vm_fault *vmf) f2fs_update_iostat(sbi, inode, APP_MAPPED_IO, F2FS_BLKSIZE); f2fs_update_time(sbi, REQ_TIME); - trace_f2fs_vm_page_mkwrite(page, DATA); out_sem: filemap_invalidate_unlock_shared(inode->i_mapping); sb_end_pagefault(inode->i_sb); -err: - return vmf_fs_error(err); +out: + ret = vmf_fs_error(err); + + trace_f2fs_vm_page_mkwrite(inode, page->index, vmf->vma->vm_flags, ret); + return ret; } static const struct vm_operations_struct f2fs_file_vm_ops = { diff --git a/include/trace/events/f2fs.h b/include/trace/events/f2fs.h index 479d26eae22f..7ed0fc430dc6 100644 --- a/include/trace/events/f2fs.h +++ b/include/trace/events/f2fs.h @@ -1369,13 +1369,6 @@ DEFINE_EVENT(f2fs__page, f2fs_set_page_dirty, TP_ARGS(page, type) ); -DEFINE_EVENT(f2fs__page, f2fs_vm_page_mkwrite, - - TP_PROTO(struct page *page, int type), - - TP_ARGS(page, type) -); - TRACE_EVENT(f2fs_replace_atomic_write_block, TP_PROTO(struct inode *inode, struct inode *cow_inode, pgoff_t index, @@ -1413,30 +1406,50 @@ TRACE_EVENT(f2fs_replace_atomic_write_block, __entry->recovery) ); -TRACE_EVENT(f2fs_filemap_fault, +DECLARE_EVENT_CLASS(f2fs_mmap, - TP_PROTO(struct inode *inode, pgoff_t index, unsigned long ret), + TP_PROTO(struct inode *inode, pgoff_t index, + vm_flags_t flags, vm_fault_t ret), - TP_ARGS(inode, index, ret), + TP_ARGS(inode, index, flags, ret), TP_STRUCT__entry( __field(dev_t, dev) __field(ino_t, ino) __field(pgoff_t, index) - __field(unsigned long, ret) + __field(vm_flags_t, flags) + __field(vm_fault_t, ret) ), TP_fast_assign( __entry->dev= inode->i_sb->s_dev; __entry->ino= inode->i_ino; __entry->index = index; + __entry->flags = flags; __entry->ret= ret; ), - TP_printk("dev = (%d,%d), ino = %lu, index = %lu, ret = %lx", + TP_printk("dev = (%d,%d), ino = %lu, index = %lu, flags: %s, ret: %s", show_dev_ino(__entry), (unsigned long)__entry->index, - __entry->ret) + __print_flags(__entry->flags, "|", FAULT_FLAG_TRACE), + __print_flags(__entry->ret, "|", VM_FAULT_RESULT_TRACE)) +); + +DEFINE_EVENT(f2fs_mmap, f2fs_filemap_fault, + + TP_PROTO(struct inode *inode, pgoff_t index, + vm_flags_t flags, vm_fault_t ret), + + TP_ARGS(inode, i
Re: [f2fs-dev] [PATCH 1/1] f2fs: fix fallocate failed under pinned block situation
On 2023/11/28 20:51, Wu Bo wrote: On 2023/11/28 14:22, Chao Yu wrote: On 2023/11/17 7:34, Wu Bo wrote: On 2023/11/11 12:49, Chao Yu wrote: On 2023/11/8 21:48, Wu Bo wrote: On 2023/11/7 22:39, Chao Yu wrote: On 2023/10/30 17:40, Wu Bo wrote: If GC victim has pinned block, it can't be recycled. And if GC is foreground running, after many failure try, the pinned file is expected to be clear pin flag. To enable the section be recycled. But when fallocate trigger FG_GC, GC can never recycle the pinned section. Because GC will go to stop before the failure try meet the threshold: if (has_enough_free_secs(sbi, sec_freed, 0)) { if (!gc_control->no_bg_gc && total_sec_freed < gc_control->nr_free_secs) goto go_gc_more; goto stop; } So when fallocate trigger FG_GC, at least recycle one. Hmm... it may break pinfile's semantics at least on one pinned file? In this case, I prefer to fail fallocate() rather than unpinning file, in order to avoid leaving invalid LBA references of unpinned file held by userspace. As f2fs designed now, FG_GC is able to unpin the pinned file. fallocate() triggered FG_GC, but can't recycle space. It breaks the design logic of FG_GC. Yes, contradictoriness exists. IMO, unpin file by GC looks more dangerous, it may cause potential data corruption w/ below case: 1. app pins file & holds LBAs of data blocks. 2. GC unpins file and migrates its data to new LBAs. 3. other file reuses previous LBAs. 4. app read/write data via previous LBAs. So I suggest to normalize use of pinfile and do not add more unpin cases in filesystem inner processes. This issue is happened in Android OTA scenario. fallocate() always return failure cause OTA fail. Can you please check why other pinned files were so fragmented that f2fs_gc() can not recycle one free section? Not because pinned files were fragmented, but if the GC victim section has one block is pinned will cause this issue. If the section don't unpin the block, it can't be recycled. But there is high chance that the pinned section will be chosen next time under f2fs current victim selection strategy. So if we want to avoid unpin files, I think change victim selection to considering pinned blocks can fix this issue. Oh, I get it. How about this? diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 325dab01a29d..3fb52dec5df8 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -1730,7 +1730,10 @@ next_alloc: f2fs_down_write(&sbi->gc_lock); stat_inc_gc_call_count(sbi, FOREGROUND); err = f2fs_gc(sbi, &gc_control); - if (err && err != -ENODATA) + + if (err == -EAGAIN) + f2fs_balance_fs(sbi, true); + else if (err && err != -ENODATA) goto out_err; } Do you mean to call f2fs_balance_fs() to recycle one section? But in this situation, f2fs_balance_fs() will return at enough-free-section check: if (has_enough_free_secs(sbi, 0, 0)) return; As you said, there are lots of free segments, so I guess it's fine for latter 2m-aligned allocation, and for the case number of free section is lower than fggc threshold, we can call f2fs_balance_fs() to reclaim enough free sections. Thanks, However, the code won't fix contradictoriness issue, because the root cause is we left fragmented pinned data in filesystem, which should be avoided in GC-reliance LFS filesyetem as much as possible. Thanks, Thanks, And this commit changed previous behavior of fallocate(): Commit 2e42b7f817ac ("f2fs: stop allocating pinned sections if EAGAIN happens") Before this commit, if fallocate() meet this situation, it will trigger FG_GC to recycle pinned space finally. FG_GC is expected to recycle pinned space when there is no more free space. And this is the right time to do it when fallocate() need free space. It is weird when f2fs shows enough spare space but can't fallocate(). So I think it should be fixed. Thoughts? Thanks, This issue can be reproduced by filling f2fs space as following layout. Every segment has one block is pinned: +-+-+-+-+-+-+-+-+ | | |p| | | | ... | | seg_n +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+ | | |p| | | | ... | | seg_n+1 +-+-+-+-+-+-+-+-+ ... +-+-+-+-+-+-+-+-+ | | |p| | | | ... | | seg_n+k +-+-+-+-+-+-+-+-+ And following are steps to reproduce this issue: dd if=/dev/zero of=./f2fs_pin.img bs=2M count=1024 mkfs.f2fs f2fs_pin.img mkdir f2fs mount f2fs_pin.img ./f2fs cd f2fs dd if=/dev/zero of=./large_padding bs=1M count=1760 ./pin_filling.sh rm padding* sync touch fallocate_40m f2fs_io pinfile set fallocate_40m fallocate -l 41943040 fallocate_40m fallocate always fail with EAGAIN even there has enough free space. 'pin_filling.sh' is: count=1 while : do # filling the seg space for i in {1..511}:
Re: [f2fs-dev] [PATCH] f2fs: compress: do cleanup in f2fs_truncate_partial_cluster()
On 2023/11/30 17:23, Yangtao Li wrote: Remove unnecessary code logic. Signed-off-by: Yangtao Li --- fs/f2fs/compress.c | 30 +- 1 file changed, 13 insertions(+), 17 deletions(-) diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c index 36e5dab6baae..de55c266509a 100644 --- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -1170,7 +1170,9 @@ int f2fs_truncate_partial_cluster(struct inode *inode, u64 from, bool lock) int log_cluster_size = F2FS_I(inode)->i_log_cluster_size; pgoff_t start_idx = from >> (PAGE_SHIFT + log_cluster_size) << log_cluster_size; - int err; + struct page **rpages = fsdata; fsdata is NULL here. Thanks, + int cluster_size = F2FS_I(inode)->i_cluster_size; + int err, i; err = f2fs_is_compressed_cluster(inode, start_idx); if (err < 0) @@ -1190,25 +1192,19 @@ int f2fs_truncate_partial_cluster(struct inode *inode, u64 from, bool lock) if (err <= 0) return err; - if (err > 0) { - struct page **rpages = fsdata; - int cluster_size = F2FS_I(inode)->i_cluster_size; - int i; - - for (i = cluster_size - 1; i >= 0; i--) { - loff_t start = rpages[i]->index << PAGE_SHIFT; + for (i = cluster_size - 1; i >= 0; i--) { + loff_t start = rpages[i]->index << PAGE_SHIFT; - if (from <= start) { - zero_user_segment(rpages[i], 0, PAGE_SIZE); - } else { - zero_user_segment(rpages[i], from - start, - PAGE_SIZE); - break; - } + if (from <= start) { + zero_user_segment(rpages[i], 0, PAGE_SIZE); + } else { + zero_user_segment(rpages[i], from - start, + PAGE_SIZE); + break; } - - f2fs_compress_write_end(inode, fsdata, start_idx, true); } + + f2fs_compress_write_end(inode, fsdata, start_idx, true); return 0; } ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH 2/2] f2fs: show i_mode in trace_f2fs_new_inode()
This patch supports to show i_mode field in trace_f2fs_new_inode(). Signed-off-by: Chao Yu --- include/trace/events/f2fs.h | 19 ++- 1 file changed, 18 insertions(+), 1 deletion(-) diff --git a/include/trace/events/f2fs.h b/include/trace/events/f2fs.h index 36d9e29ca3c5..479d26eae22f 100644 --- a/include/trace/events/f2fs.h +++ b/include/trace/events/f2fs.h @@ -161,6 +161,19 @@ TRACE_DEFINE_ENUM(EX_BLOCK_AGE); { EX_READ, "Read" }, \ { EX_BLOCK_AGE, "Block Age" }) +#define show_inode_type(x) \ + __print_symbolic(x, \ + { S_IFLNK, "symbolic" }, \ + { S_IFREG, "regular" }, \ + { S_IFDIR, "directory" }, \ + { S_IFCHR, "character" }, \ + { S_IFBLK, "block" }, \ + { S_IFIFO, "fifo" }, \ + { S_IFSOCK, "sock" }) + +#define S_ALL_PERM (S_ISUID | S_ISGID | S_ISVTX | \ + S_IRWXU | S_IRWXG | S_IRWXO) + struct f2fs_sb_info; struct f2fs_io_info; struct extent_info; @@ -215,17 +228,21 @@ DECLARE_EVENT_CLASS(f2fs__inode_exit, TP_STRUCT__entry( __field(dev_t, dev) __field(ino_t, ino) + __field(umode_t, mode) __field(int,ret) ), TP_fast_assign( __entry->dev= inode->i_sb->s_dev; __entry->ino= inode->i_ino; + __entry->mode = inode->i_mode; __entry->ret= ret; ), - TP_printk("dev = (%d,%d), ino = %lu, ret = %d", + TP_printk("dev = (%d,%d), ino = %lu, type: %s, mode = 0%o, ret = %d", show_dev_ino(__entry), + show_inode_type(__entry->mode & S_IFMT), + __entry->mode & S_ALL_PERM, __entry->ret) ); -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH 1/2] f2fs: introduce tracepoint for f2fs_rename()
This patch adds tracepoints for f2fs_rename(). Signed-off-by: Chao Yu --- fs/f2fs/namei.c | 16 ++--- include/trace/events/f2fs.h | 69 + 2 files changed, 80 insertions(+), 5 deletions(-) diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c index a765db9e26c2..ede6afb81762 100644 --- a/fs/f2fs/namei.c +++ b/fs/f2fs/namei.c @@ -1315,21 +1315,27 @@ static int f2fs_rename2(struct mnt_idmap *idmap, if (flags & ~(RENAME_NOREPLACE | RENAME_EXCHANGE | RENAME_WHITEOUT)) return -EINVAL; + trace_f2fs_rename_start(old_dir, old_dentry, new_dir, new_dentry, + flags); + err = fscrypt_prepare_rename(old_dir, old_dentry, new_dir, new_dentry, flags); if (err) return err; - if (flags & RENAME_EXCHANGE) { - return f2fs_cross_rename(old_dir, old_dentry, -new_dir, new_dentry); - } + if (flags & RENAME_EXCHANGE) + err = f2fs_cross_rename(old_dir, old_dentry, + new_dir, new_dentry); + else /* * VFS has already handled the new dentry existence case, * here, we just deal with "RENAME_NOREPLACE" as regular rename. */ - return f2fs_rename(idmap, old_dir, old_dentry, + err = f2fs_rename(idmap, old_dir, old_dentry, new_dir, new_dentry, flags); + + trace_f2fs_rename_end(old_dentry, new_dentry, flags, err); + return err; } static const char *f2fs_encrypted_get_link(struct dentry *dentry, diff --git a/include/trace/events/f2fs.h b/include/trace/events/f2fs.h index 793f82cc1515..36d9e29ca3c5 100644 --- a/include/trace/events/f2fs.h +++ b/include/trace/events/f2fs.h @@ -866,6 +866,75 @@ TRACE_EVENT(f2fs_lookup_end, __entry->err) ); +TRACE_EVENT(f2fs_rename_start, + + TP_PROTO(struct inode *old_dir, struct dentry *old_dentry, + struct inode *new_dir, struct dentry *new_dentry, + unsigned int flags), + + TP_ARGS(old_dir, old_dentry, new_dir, new_dentry, flags), + + TP_STRUCT__entry( + __field(dev_t, dev) + __field(ino_t, ino) + __string(old_name, old_dentry->d_name.name) + __field(ino_t, new_pino) + __string(new_name, new_dentry->d_name.name) + __field(unsigned int, flags) + ), + + TP_fast_assign( + __entry->dev= old_dir->i_sb->s_dev; + __entry->ino= old_dir->i_ino; + __assign_str(old_name, old_dentry->d_name.name); + __entry->new_pino = new_dir->i_ino; + __assign_str(new_name, new_dentry->d_name.name); + __entry->flags = flags; + ), + + TP_printk("dev = (%d,%d), old_dir = %lu, old_name: %s, " + "new_dir = %lu, new_name: %s, flags = %u", + show_dev_ino(__entry), + __get_str(old_name), + __entry->new_pino, + __get_str(new_name), + __entry->flags) +); + +TRACE_EVENT(f2fs_rename_end, + + TP_PROTO(struct dentry *old_dentry, struct dentry *new_dentry, + unsigned int flags, int ret), + + TP_ARGS(old_dentry, new_dentry, flags, ret), + + TP_STRUCT__entry( + __field(dev_t, dev) + __field(ino_t, ino) + __string(old_name, old_dentry->d_name.name) + __string(new_name, new_dentry->d_name.name) + __field(unsigned int, flags) + __field(int,ret) + ), + + TP_fast_assign( + __entry->dev= old_dentry->d_sb->s_dev; + __entry->ino= old_dentry->d_inode->i_ino; + __assign_str(old_name, old_dentry->d_name.name); + __assign_str(new_name, new_dentry->d_name.name); + __entry->flags = flags; + __entry->ret= ret; + ), + + TP_printk("dev = (%d,%d), ino = %lu, old_name: %s, " + "new_name: %s, flags = %u, ret = %d", + show_dev_ino(__entry), + __get_str(old_name), + __get_str(new_name), + __entry->flags, + __entry->ret) +); + TRACE_EVENT(f2fs_readdir, TP_PROTO(struct inode *dir, loff_t start_pos, loff_t end_pos, int err), -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH] f2fs: fix to avoid dirent corruption
As Al reported in link[1]: f2fs_rename() ... if (old_dir != new_dir && !whiteout) f2fs_set_link(old_inode, old_dir_entry, old_dir_page, new_dir); else f2fs_put_page(old_dir_page, 0); You want correct inumber in the ".." link. And cross-directory rename does move the source to new parent, even if you'd been asked to leave a whiteout in the old place. [1] https://lore.kernel.org/all/20231017055040.GN800259@ZenIV/ With below testcase, it may cause dirent corruption, due to it missed to call f2fs_set_link() to update ".." link to new directory. - mkdir -p dir/foo - renameat2 -w dir/foo bar [ASSERT] (__chk_dots_dentries:1421) --> Bad inode number[0x4] for '..', parent parent ino is [0x3] [FSCK] other corrupted bugs [Fail] Fixes: 7e01e7ad746b ("f2fs: support RENAME_WHITEOUT") Cc: Jan Kara Reported-by: Al Viro Signed-off-by: Chao Yu --- fs/f2fs/namei.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c index 3b1793cfb002..ede6afb81762 100644 --- a/fs/f2fs/namei.c +++ b/fs/f2fs/namei.c @@ -1105,7 +1105,7 @@ static int f2fs_rename(struct mnt_idmap *idmap, struct inode *old_dir, } if (old_dir_entry) { - if (old_dir != new_dir && !whiteout) + if (old_dir != new_dir) f2fs_set_link(old_inode, old_dir_entry, old_dir_page, new_dir); else -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH v2] f2fs-tools: fixed incorrect error handling
On 2023/11/20 15:55, Maxim Korotkov wrote: Case of failed memory allocation of dev->zone_cap_blocks doesn't release heap allocated rep Found by RASU JSC Fixes: f8410857b7a8(f2fs-tools: zns zone-capacity support) Signed-off-by: Maxim Korotkov Reviewed-by: Chao Yu Thanks, ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH 1/1] f2fs: fix fallocate failed under pinned block situation
On 2023/11/17 7:34, Wu Bo wrote: On 2023/11/11 12:49, Chao Yu wrote: On 2023/11/8 21:48, Wu Bo wrote: On 2023/11/7 22:39, Chao Yu wrote: On 2023/10/30 17:40, Wu Bo wrote: If GC victim has pinned block, it can't be recycled. And if GC is foreground running, after many failure try, the pinned file is expected to be clear pin flag. To enable the section be recycled. But when fallocate trigger FG_GC, GC can never recycle the pinned section. Because GC will go to stop before the failure try meet the threshold: if (has_enough_free_secs(sbi, sec_freed, 0)) { if (!gc_control->no_bg_gc && total_sec_freed < gc_control->nr_free_secs) goto go_gc_more; goto stop; } So when fallocate trigger FG_GC, at least recycle one. Hmm... it may break pinfile's semantics at least on one pinned file? In this case, I prefer to fail fallocate() rather than unpinning file, in order to avoid leaving invalid LBA references of unpinned file held by userspace. As f2fs designed now, FG_GC is able to unpin the pinned file. fallocate() triggered FG_GC, but can't recycle space. It breaks the design logic of FG_GC. Yes, contradictoriness exists. IMO, unpin file by GC looks more dangerous, it may cause potential data corruption w/ below case: 1. app pins file & holds LBAs of data blocks. 2. GC unpins file and migrates its data to new LBAs. 3. other file reuses previous LBAs. 4. app read/write data via previous LBAs. So I suggest to normalize use of pinfile and do not add more unpin cases in filesystem inner processes. This issue is happened in Android OTA scenario. fallocate() always return failure cause OTA fail. Can you please check why other pinned files were so fragmented that f2fs_gc() can not recycle one free section? Not because pinned files were fragmented, but if the GC victim section has one block is pinned will cause this issue. If the section don't unpin the block, it can't be recycled. But there is high chance that the pinned section will be chosen next time under f2fs current victim selection strategy. So if we want to avoid unpin files, I think change victim selection to considering pinned blocks can fix this issue. Oh, I get it. How about this? diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 325dab01a29d..3fb52dec5df8 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -1730,7 +1730,10 @@ next_alloc: f2fs_down_write(&sbi->gc_lock); stat_inc_gc_call_count(sbi, FOREGROUND); err = f2fs_gc(sbi, &gc_control); - if (err && err != -ENODATA) + + if (err == -EAGAIN) + f2fs_balance_fs(sbi, true); + else if (err && err != -ENODATA) goto out_err; } However, the code won't fix contradictoriness issue, because the root cause is we left fragmented pinned data in filesystem, which should be avoided in GC-reliance LFS filesyetem as much as possible. Thanks, Thanks, And this commit changed previous behavior of fallocate(): Commit 2e42b7f817ac ("f2fs: stop allocating pinned sections if EAGAIN happens") Before this commit, if fallocate() meet this situation, it will trigger FG_GC to recycle pinned space finally. FG_GC is expected to recycle pinned space when there is no more free space. And this is the right time to do it when fallocate() need free space. It is weird when f2fs shows enough spare space but can't fallocate(). So I think it should be fixed. Thoughts? Thanks, This issue can be reproduced by filling f2fs space as following layout. Every segment has one block is pinned: +-+-+-+-+-+-+-+-+ | | |p| | | | ... | | seg_n +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+ | | |p| | | | ... | | seg_n+1 +-+-+-+-+-+-+-+-+ ... +-+-+-+-+-+-+-+-+ | | |p| | | | ... | | seg_n+k +-+-+-+-+-+-+-+-+ And following are steps to reproduce this issue: dd if=/dev/zero of=./f2fs_pin.img bs=2M count=1024 mkfs.f2fs f2fs_pin.img mkdir f2fs mount f2fs_pin.img ./f2fs cd f2fs dd if=/dev/zero of=./large_padding bs=1M count=1760 ./pin_filling.sh rm padding* sync touch fallocate_40m f2fs_io pinfile set fallocate_40m fallocate -l 41943040 fallocate_40m fallocate always fail with EAGAIN even there has enough free space. 'pin_filling.sh' is: count=1 while : do # filling the seg space for i in {1..511}: do name=padding_$count-$i echo write $name dd if=/dev/zero of=./$name bs=4K count=1 > /dev/null 2>&1 if [ $? -ne 0 ]; then exit 0 fi done sync # pin one block in a segment name=pin_file$count dd if=/dev/zero of=./$name bs=4K count=1 > /dev/null 2>&1 sync f2fs_io pinfile set $name c
Re: [f2fs-dev] [PATCH] f2fs: add support for an i_version counter
On 2023/11/20 17:54, Yangtao Li wrote: NFSv4 mandates a change attribute to avoid problems with timestamp granularity, which Linux implements using the i_version counter. This is particularly important when the underlying filesystem is fast. Signed-off-by: Yangtao Li --- fs/f2fs/f2fs.h | 1 + fs/f2fs/inode.c | 1 + fs/f2fs/super.c | 1 + 3 files changed, 3 insertions(+) diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 9043cedfa12b..68fd2ef35104 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -28,6 +28,7 @@ #include #include +#include struct pagevec; diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c index 560bfcad1af2..2604fa4a0704 100644 --- a/fs/f2fs/inode.c +++ b/fs/f2fs/inode.c @@ -32,6 +32,7 @@ void f2fs_mark_inode_dirty_sync(struct inode *inode, bool sync) if (f2fs_inode_dirtied(inode, sync)) return; + inode_inc_iversion(inode); For the case f2fs is not used by nfs as backend storage, there will be overhead for iversion update logic. So what about introducing a new mount option to enable the iversion functionality, and disabling it by default? Thanks, mark_inode_dirty_sync(inode); } diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c index 033af907c3b1..2bddccd47e3a 100644 --- a/fs/f2fs/super.c +++ b/fs/f2fs/super.c @@ -2204,6 +2204,7 @@ static void default_options(struct f2fs_sb_info *sbi, bool remount) set_opt(sbi, MERGE_CHECKPOINT); F2FS_OPTION(sbi).unusable_cap = 0; sbi->sb->s_flags |= SB_LAZYTIME; + sbi->sb->s_flags |= SB_I_VERSION; if (!f2fs_is_readonly(sbi)) set_opt(sbi, FLUSH_MERGE); if (f2fs_sb_has_blkzoned(sbi)) ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH] f2fs: show more discard stat by sysfs
On 2023/11/24 9:08, Zhiguo Niu wrote: The current pending_discard attr just only shows the discard_cmd_cnt information, which is not very meaningful. More discard information can be shown so that we can check them through sysfs when needed. What about adding this entry to /sys/fs/f2fs//stat/? Signed-off-by: Zhiguo Niu --- fs/f2fs/sysfs.c | 21 +++-- 1 file changed, 15 insertions(+), 6 deletions(-) diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c index 417fae96..f98e680 100644 --- a/fs/f2fs/sysfs.c +++ b/fs/f2fs/sysfs.c @@ -134,13 +134,22 @@ static ssize_t cp_status_show(struct f2fs_attr *a, return sysfs_emit(buf, "%x\n", le32_to_cpu(F2FS_CKPT(sbi)->ckpt_flags)); } -static ssize_t pending_discard_show(struct f2fs_attr *a, +static ssize_t discard_stat_show(struct f2fs_attr *a, struct f2fs_sb_info *sbi, char *buf) { - if (!SM_I(sbi)->dcc_info) + struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info; + + if (!dcc) return -EINVAL; - return sysfs_emit(buf, "%llu\n", (unsigned long long)atomic_read( - &SM_I(sbi)->dcc_info->discard_cmd_cnt)); It's better to keep the old one for any potential user. + + return sysfs_emit(buf, "%llu, %llu, %llu, %u\n", + (unsigned long long)atomic_read( + &dcc->discard_cmd_cnt), + (unsigned long long)atomic_read( + &dcc->issued_discard), + (unsigned long long)atomic_read( + &dcc->queued_discard), + dcc->undiscard_blks); } static ssize_t gc_mode_show(struct f2fs_attr *a, @@ -1016,7 +1025,7 @@ static ssize_t f2fs_sb_feature_show(struct f2fs_attr *a, F2FS_GENERAL_RO_ATTR(encoding); F2FS_GENERAL_RO_ATTR(mounted_time_sec); F2FS_GENERAL_RO_ATTR(main_blkaddr); -F2FS_GENERAL_RO_ATTR(pending_discard); +F2FS_GENERAL_RO_ATTR(discard_stat); It needs to adjust Documentation/ABI/testing/sysfs-fs-f2fs. Thanks, F2FS_GENERAL_RO_ATTR(gc_mode); #ifdef CONFIG_F2FS_STAT_FS F2FS_GENERAL_RO_ATTR(moved_blocks_background); @@ -1074,7 +1083,7 @@ static ssize_t f2fs_sb_feature_show(struct f2fs_attr *a, ATTR_LIST(discard_urgent_util), ATTR_LIST(discard_granularity), ATTR_LIST(max_ordered_discard), - ATTR_LIST(pending_discard), + ATTR_LIST(discard_stat), ATTR_LIST(gc_mode), ATTR_LIST(ipu_policy), ATTR_LIST(min_ipu_util), ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH 2/3] f2fs-tools: Wait for Block Size to initialize Cache
On 2023/11/28 8:52, Daniel Rosenberg wrote: On Sun, Nov 26, 2023 at 5:42 PM Chao Yu wrote: Hi Daniel, How about this? It be more explicit to indicate the logic? --- fsck/mount.c | 2 ++ include/f2fs_fs.h | 3 +++ lib/libf2fs_io.c | 4 3 files changed, 9 insertions(+) diff --git a/fsck/mount.c b/fsck/mount.c index 72516f4..4dfb996 100644 --- a/fsck/mount.c +++ b/fsck/mount.c @@ -1238,6 +1238,8 @@ int init_sb_info(struct f2fs_sb_info *sbi) MSG(0, "Info: total FS sectors = %"PRIu64" (%"PRIu64" MB)\n", total_sectors, total_sectors >> (20 - get_sb(log_sectorsize))); + + c.cache_config.blksize_initialized = true; return 0; } diff --git a/include/f2fs_fs.h b/include/f2fs_fs.h index 6df2e73..e377d48 100644 --- a/include/f2fs_fs.h +++ b/include/f2fs_fs.h @@ -460,6 +460,9 @@ typedef struct { unsigned max_hash_collision; bool dbg_en; + + /* whether blksize has been initialized */ + bool blksize_initialized; } dev_cache_config_t; /* f2fs_configration for compression used for sload.f2fs */ diff --git a/lib/libf2fs_io.c b/lib/libf2fs_io.c index 39d3777..bb77418 100644 --- a/lib/libf2fs_io.c +++ b/lib/libf2fs_io.c @@ -199,6 +199,10 @@ void dcache_init(void) { long n; + /* Must not initiate cache until block size is known */ + if (!c.cache_config.blksize_initialized) + return; + if (c.cache_config.num_cache_entry <= 0) return; -- 2.40.1 I think that works too. I was initially going to go with code like that, but I was unsure if read/writes that aren't in the fsck path used caching as well. There's a few uses of those in the mkfs/ paths, but I don't think there's anything that sets cache_config.num_cache_entry there. I was worried about missing code paths like that, so I had restricted the changes to where I knew we were reading the superblock off disk. If there is a problem, it could be covered by setting c.cache_config.blksize_initialized = true within the mkfs code after the blocksize is set in f2fs_parse_options. But that isn't needed if there's no caching in mkfs. Oh, I think the most useful case of dcache is fsck, not sure about resize, but I don't think it will be benefit to mkfs/sload, based on IO pattern of these cases. IMO, we can get rid of dcache for mkfs/sload cases in order to avoid maintaining complicated design/logic. Thanks, ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH v1] f2fs: New victim selection for GC
Hi Yonggil, On 2023/10/26 17:18, Yonggil Song wrote: Overview Introduce a new way to select the data section first when selecting a victim in foreground GC. This victim selection method works when the prefer_data_victim mount option is enabled. If foreground GC migrates only data sections and runs out of free sections, it cleans dirty node sections to get more free sections. What about introducing parameter to adjust cost calculated by get_gc_cost()? Something like: get_gc_cost() if (p->gc_mode == GC_GREEDY) { vblocks = get_valid_blocks(); if (seg_type is data) return vblocks * data_factor; return vblock * node_factor; } If we prefer to select data segment during fggc, we can config data/node factor as 1 and 512? Thoughts? Thanks, Problem === If the total amount of nodes is larger than the size of one section, nodes occupy multiple sections, and node victims are often selected because the gc cost is lowered by data block migration in foreground gc. Since moving the data section causes frequent node victim selection, victim threshing occurs in the node section. This results in an increase in WAF. Experiment == Test environment is as follows. System info - 3.6GHz, 16 core CPU - 36GiB Memory Device info - a conventional null_blk with 228MiB - a sequential null_blk with 4068 zones of 8MiB Format - mkfs.f2fs -c -m -Z 8 -o 3.89 Mount - mount -o prefer_data_victim Fio script - fio --rw=randwrite --bs=4k --ba=4k --filesize=31187m --norandommap --overwrite=1 --name=job1 --filename=./mnt/sustain --io_size=128g WAF calculation - (IOs on conv. null_blk + IOs on seq. null_blk) / random write IOs Conclusion == This experiment showed that the WAF was reduced by 29% (18.75 -> 13.3) when the data section was selected first when selecting GC victims. This was achieved by reducing the migration of the node blocks by 69.4% (253,131,743 blks -> 77,463,278 blks). It is possible to achieve low WAF performance with the GC victim selection method in environments where the section size is relatively small. Signed-off-by: Yonggil Song --- Documentation/filesystems/f2fs.rst | 3 + fs/f2fs/f2fs.h | 2 + fs/f2fs/gc.c | 100 +++-- fs/f2fs/segment.h | 2 + fs/f2fs/super.c| 9 +++ 5 files changed, 95 insertions(+), 21 deletions(-) diff --git a/Documentation/filesystems/f2fs.rst b/Documentation/filesystems/f2fs.rst index d32c6209685d..58e6d001d7ab 100644 --- a/Documentation/filesystems/f2fs.rst +++ b/Documentation/filesystems/f2fs.rst @@ -367,6 +367,9 @@ errors=%sSpecify f2fs behavior on critical errors. This supports modes: pending node write dropkeep N/A pending meta write keepkeep N/A == === === +prefer_data_victim When selecting victims in foreground GC, victims of data type +are prioritized. This option minimizes GC victim threshing +in the node section to reduce WAF. Debugfs Entries diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 6d688e42d89c..8b31fa2ea09a 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -108,6 +108,7 @@ extern const char *f2fs_fault_name[FAULT_MAX]; #define F2FS_MOUNT_GC_MERGE 0x0200 #define F2FS_MOUNT_COMPRESS_CACHE 0x0400 #define F2FS_MOUNT_AGE_EXTENT_CACHE 0x0800 +#define F2FS_MOUNT_PREFER_DATA_VICTIM 0x1000 #define F2FS_OPTION(sbi) ((sbi)->mount_opt) #define clear_opt(sbi, option)(F2FS_OPTION(sbi).opt &= ~F2FS_MOUNT_##option) @@ -1648,6 +1649,7 @@ struct f2fs_sb_info { struct f2fs_mount_info mount_opt; /* mount options */ /* for cleaning operations */ + bool need_node_clean; /* only used for prefer_data_victim */ struct f2fs_rwsem gc_lock; /* * semaphore for GC, avoid * race between GC and GC or CP diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c index f550cdeaa663..8a2da808a5fb 100644 --- a/fs/f2fs/gc.c +++ b/fs/f2fs/gc.c @@ -752,6 +752,8 @@ int f2fs_get_victim(struct f2fs_sb_info *sbi, unsigned int *result, unsigned int last_segment; unsigned int nsearched; bool is_atgc; + bool is_prefer_data_victim = + test_opt(sbi, PREFER_DATA_VICTIM) && gc_type == FG_GC; int ret = 0; mutex_lock(&dirty_i->seglist_lock
Re: [f2fs-dev] [PATCH 3/3] f2fs-tools: Fix dqb_curspace to reflect blocksize
On 2023/11/18 10:03, Daniel Rosenberg wrote: The initial sizes for dqblk.dqb_curspace should reflect the block size, as that's the minimal filesize. Signed-off-by: Daniel Rosenberg Reviewed-by: Chao Yu Thanks, ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH 2/3] f2fs-tools: Wait for Block Size to initialize Cache
On 2023/11/18 10:03, Daniel Rosenberg wrote: The cache is initialized during the first read, however, it requires the block size to establish its buffer. This disables the cache until the block size is known. Hi Daniel, How about this? It be more explicit to indicate the logic? --- fsck/mount.c | 2 ++ include/f2fs_fs.h | 3 +++ lib/libf2fs_io.c | 4 3 files changed, 9 insertions(+) diff --git a/fsck/mount.c b/fsck/mount.c index 72516f4..4dfb996 100644 --- a/fsck/mount.c +++ b/fsck/mount.c @@ -1238,6 +1238,8 @@ int init_sb_info(struct f2fs_sb_info *sbi) MSG(0, "Info: total FS sectors = %"PRIu64" (%"PRIu64" MB)\n", total_sectors, total_sectors >> (20 - get_sb(log_sectorsize))); + + c.cache_config.blksize_initialized = true; return 0; } diff --git a/include/f2fs_fs.h b/include/f2fs_fs.h index 6df2e73..e377d48 100644 --- a/include/f2fs_fs.h +++ b/include/f2fs_fs.h @@ -460,6 +460,9 @@ typedef struct { unsigned max_hash_collision; bool dbg_en; + + /* whether blksize has been initialized */ + bool blksize_initialized; } dev_cache_config_t; /* f2fs_configration for compression used for sload.f2fs */ diff --git a/lib/libf2fs_io.c b/lib/libf2fs_io.c index 39d3777..bb77418 100644 --- a/lib/libf2fs_io.c +++ b/lib/libf2fs_io.c @@ -199,6 +199,10 @@ void dcache_init(void) { long n; + /* Must not initiate cache until block size is known */ + if (!c.cache_config.blksize_initialized) + return; + if (c.cache_config.num_cache_entry <= 0) return; -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH 1/3] f2fs-tools: Fix debug size print
On 2023/11/18 10:03, Daniel Rosenberg wrote: The conversion from block size to MB in this debug statement assumes a block size of 4K. This switches it to properly use the filesystem's block size. Signed-off-by: Daniel Rosenberg Reviewed-by: Chao Yu Thanks, ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH] f2fs: sysfs: support discard_io_aware
It gives a way to enable/disable IO aware feature for background discard, so that we can tune background discard more precisely based on undiscard condition. e.g. force to disable IO aware if there are large number of discard extents, and discard IO may always be interrupted by frequent common IO. Signed-off-by: Chao Yu --- Documentation/ABI/testing/sysfs-fs-f2fs | 6 ++ fs/f2fs/f2fs.h | 7 +++ fs/f2fs/segment.c | 6 +- fs/f2fs/sysfs.c | 9 + 4 files changed, 27 insertions(+), 1 deletion(-) diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs index 36c3cb547901..4f1d4e636d67 100644 --- a/Documentation/ABI/testing/sysfs-fs-f2fs +++ b/Documentation/ABI/testing/sysfs-fs-f2fs @@ -740,3 +740,9 @@ Description:When compress cache is on, it controls cached page If cached page percent exceed threshold, then deny caching compress page. The value should be in range of (0, 100], by default it was initialized as 20(%). + +What: /sys/fs/f2fs//discard_io_aware +Date: November 2023 +Contact: "Chao Yu" +Description: It controls to enable/disable IO aware feature for background discard. + By default, the value is 1 which indicates IO aware is on. diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 9043cedfa12b..86a145be4e53 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -374,6 +374,12 @@ enum { MAX_DPOLICY, }; +enum { + DPOLICY_IO_AWARE_DISABLE, /* force to not be aware of IO */ + DPOLICY_IO_AWARE_ENABLE,/* force to be aware of IO */ + DPOLICY_IO_AWARE_MAX, +}; + struct discard_policy { int type; /* type of discard */ unsigned int min_interval; /* used for candidates exist */ @@ -406,6 +412,7 @@ struct discard_cmd_control { unsigned int discard_urgent_util; /* utilization which issue discard proactively */ unsigned int discard_granularity; /* discard granularity */ unsigned int max_ordered_discard; /* maximum discard granularity issued by lba order */ + unsigned int discard_io_aware; /* io_aware policy */ unsigned int undiscard_blks;/* # of undiscard blocks */ unsigned int next_pos; /* next discard position */ atomic_t issued_discard;/* # of issued discard */ diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c index f4ffd64b44b2..08e2f44a1264 100644 --- a/fs/f2fs/segment.c +++ b/fs/f2fs/segment.c @@ -1172,7 +1172,10 @@ static void __init_discard_policy(struct f2fs_sb_info *sbi, dpolicy->min_interval = dcc->min_discard_issue_time; dpolicy->mid_interval = dcc->mid_discard_issue_time; dpolicy->max_interval = dcc->max_discard_issue_time; - dpolicy->io_aware = true; + if (dcc->discard_io_aware == DPOLICY_IO_AWARE_ENABLE) + dpolicy->io_aware = true; + else if (dcc->discard_io_aware == DPOLICY_IO_AWARE_DISABLE) + dpolicy->io_aware = false; dpolicy->sync = false; dpolicy->ordered = true; if (utilization(sbi) > dcc->discard_urgent_util) { @@ -2275,6 +2278,7 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi) dcc->discard_io_aware_gran = MAX_PLIST_NUM; dcc->discard_granularity = DEFAULT_DISCARD_GRANULARITY; dcc->max_ordered_discard = DEFAULT_MAX_ORDERED_DISCARD_GRANULARITY; + dcc->discard_io_aware = DPOLICY_IO_AWARE_ENABLE; if (F2FS_OPTION(sbi).discard_unit == DISCARD_UNIT_SEGMENT) dcc->discard_granularity = sbi->blocks_per_seg; else if (F2FS_OPTION(sbi).discard_unit == DISCARD_UNIT_SECTION) diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c index 417fae96890f..7099ffa57299 100644 --- a/fs/f2fs/sysfs.c +++ b/fs/f2fs/sysfs.c @@ -516,6 +516,13 @@ static ssize_t __sbi_store(struct f2fs_attr *a, return count; } + if (!strcmp(a->attr.name, "discard_io_aware")) { + if (t >= DPOLICY_IO_AWARE_MAX) + return -EINVAL; + *ui = t; + return count; + } + if (!strcmp(a->attr.name, "migration_granularity")) { if (t == 0 || t > sbi->segs_per_sec) return -EINVAL; @@ -926,6 +933,7 @@ DCC_INFO_GENERAL_RW_ATTR(discard_io_aware_gran); DCC_INFO_GENERAL_RW_ATTR(discard_urgent_util); DCC_INFO_GENERAL_RW_ATTR(discard_granularity); DCC_INFO_GENERAL_RW_ATTR(max_ordered_discard); +DCC_INFO_GENERAL_RW_ATTR(discard_io_aware); /* NM_INFO ATTR */ NM_INFO_RW_ATTR(max_roll_forward_node_blocks, max_rf_node_bl
Re: [f2fs-dev] [PATCH] f2fs-tools: adjust nat and block release logic
On 2023/11/18 4:38, Daeho Jeong wrote: From: Daeho Jeong Fixes: 0f503e443ccb ("f2fs-tools: do not reuse corrupted quota inodes") Signed-off-by: Daeho Jeong Reviewed-by: Chao Yu Thanks, ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH] f2fs: skip adding a discard command if exists
On 2023/11/18 1:41, Jaegeuk Kim wrote: Not sure other cases yet.. let's do one by one, since I hit this in real. Sure. Signed-off-by: Jaegeuk Kim Reviewed-by: Chao Yu Thansk, ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH 3/3] f2fs: fix to check return value of f2fs_reserve_new_block()
Let's check return value of f2fs_reserve_new_block() in do_recover_data() rather than letting it fails silently. Also refactoring check condition on return value of f2fs_reserve_new_block() as below: - trigger f2fs_bug_on() only for ENOSPC case; - use do-while statement to avoid redundant codes; Signed-off-by: Chao Yu --- fs/f2fs/recovery.c | 23 +-- 1 file changed, 17 insertions(+), 6 deletions(-) diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c index b56d0f1078a7..16415c770b45 100644 --- a/fs/f2fs/recovery.c +++ b/fs/f2fs/recovery.c @@ -712,7 +712,16 @@ static int do_recover_data(struct f2fs_sb_info *sbi, struct inode *inode, */ if (dest == NEW_ADDR) { f2fs_truncate_data_blocks_range(&dn, 1); - f2fs_reserve_new_block(&dn); + do { + err = f2fs_reserve_new_block(&dn); + if (err == -ENOSPC) { + f2fs_bug_on(sbi, 1); + break; + } + } while (err && + IS_ENABLED(CONFIG_F2FS_FAULT_INJECTION)); + if (err) + goto err; continue; } @@ -720,12 +729,14 @@ static int do_recover_data(struct f2fs_sb_info *sbi, struct inode *inode, if (f2fs_is_valid_blkaddr(sbi, dest, META_POR)) { if (src == NULL_ADDR) { - err = f2fs_reserve_new_block(&dn); - while (err && - IS_ENABLED(CONFIG_F2FS_FAULT_INJECTION)) + do { err = f2fs_reserve_new_block(&dn); - /* We should not get -ENOSPC */ - f2fs_bug_on(sbi, err); + if (err == -ENOSPC) { + f2fs_bug_on(sbi, 1); + break; + } + } while (err && + IS_ENABLED(CONFIG_F2FS_FAULT_INJECTION)); if (err) goto err; } -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH 2/3] f2fs: use shared inode lock during f2fs_fiemap()
f2fs_fiemap() will only traverse metadata of inode, let's use shared inode lock for it to avoid unnecessary race on inode lock. Signed-off-by: Chao Yu --- fs/f2fs/data.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index 4e42b5f24deb..42f0f6184f73 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -1992,7 +1992,7 @@ int f2fs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo, if (ret) return ret; - inode_lock(inode); + inode_lock_shared(inode); maxbytes = max_file_blocks(inode) << F2FS_BLKSIZE_BITS; if (start > maxbytes) { @@ -2112,7 +2112,7 @@ int f2fs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo, if (ret == 1) ret = 0; - inode_unlock(inode); + inode_unlock_shared(inode); return ret; } -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [PATCH 1/3] f2fs: clean up w/ dotdot_name
Just cleanup, no logic changes. Signed-off-by: Chao Yu --- fs/f2fs/namei.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c index d0053b0284d8..a9360ee02da1 100644 --- a/fs/f2fs/namei.c +++ b/fs/f2fs/namei.c @@ -459,7 +459,6 @@ static int __recover_dot_dentries(struct inode *dir, nid_t pino) { struct f2fs_sb_info *sbi = F2FS_I_SB(dir); struct qstr dot = QSTR_INIT(".", 1); - struct qstr dotdot = QSTR_INIT("..", 2); struct f2fs_dir_entry *de; struct page *page; int err = 0; @@ -497,13 +496,13 @@ static int __recover_dot_dentries(struct inode *dir, nid_t pino) goto out; } - de = f2fs_find_entry(dir, &dotdot, &page); + de = f2fs_find_entry(dir, &dotdot_name, &page); if (de) f2fs_put_page(page, 0); else if (IS_ERR(page)) err = PTR_ERR(page); else - err = f2fs_do_add_link(dir, &dotdot, NULL, pino, S_IFDIR); + err = f2fs_do_add_link(dir, &dotdot_name, NULL, pino, S_IFDIR); out: if (!err) clear_inode_flag(dir, FI_INLINE_DOTS); -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH] f2fs-tools: do not put CP_UMOUNT_FLAG for roll forward recovery
On 2023/10/27 23:49, Daeho Jeong wrote: From: Daeho Jeong If we write CP_UMOUNT_FLAG in fsck, f2fs will not do foll forward recovery :s/foll/roll Otherwise, it looks good to me. Reviewed-by: Chao Yu Thanks, even though it has to do. Signed-off-by: Daeho Jeong --- fsck/fsck.c | 3 ++- fsck/mount.c | 5 - include/f2fs_fs.h | 1 + 3 files changed, 7 insertions(+), 2 deletions(-) diff --git a/fsck/fsck.c b/fsck/fsck.c index f1a55db..126458c 100644 --- a/fsck/fsck.c +++ b/fsck/fsck.c @@ -2526,7 +2526,8 @@ static void fix_checkpoint(struct f2fs_sb_info *sbi) struct f2fs_super_block *sb = F2FS_RAW_SUPER(sbi); struct f2fs_checkpoint *cp = F2FS_CKPT(sbi); unsigned long long cp_blk_no; - u32 flags = c.alloc_failed ? CP_FSCK_FLAG: CP_UMOUNT_FLAG; + u32 flags = c.alloc_failed ? CP_FSCK_FLAG : + (c.roll_forward ? 0 : CP_UMOUNT_FLAG); block_t orphan_blks = 0; block_t cp_blocks; u32 i; diff --git a/fsck/mount.c b/fsck/mount.c index 3b02d73..805671c 100644 --- a/fsck/mount.c +++ b/fsck/mount.c @@ -3218,7 +3218,7 @@ void write_checkpoint(struct f2fs_sb_info *sbi) struct f2fs_super_block *sb = F2FS_RAW_SUPER(sbi); block_t orphan_blks = 0; unsigned long long cp_blk_no; - u32 flags = CP_UMOUNT_FLAG; + u32 flags = c.roll_forward ? 0 : CP_UMOUNT_FLAG; int i, ret; uint32_t crc = 0; @@ -3837,6 +3837,9 @@ static int record_fsync_data(struct f2fs_sb_info *sbi) if (ret) goto out; + if (c.func == FSCK && inode_list.next != &inode_list) + c.roll_forward = 1; + ret = late_build_segment_manager(sbi); if (ret < 0) { ERR_MSG("late_build_segment_manager failed\n"); diff --git a/include/f2fs_fs.h b/include/f2fs_fs.h index abd5abf..faa5d6b 100644 --- a/include/f2fs_fs.h +++ b/include/f2fs_fs.h @@ -1513,6 +1513,7 @@ struct f2fs_configuration { unsigned int feature; /* defined features */ unsigned int quota_bits;/* quota bits */ time_t fixed_time; + int roll_forward; /* mkfs parameters */ int fake_seed; ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH] f2fs-tools: use total_node_count when creating a new node block in fsck
On 2023/10/27 23:30, Daeho Jeong wrote: From: Daeho Jeong We might allocate more node blocks than total_valid_node_count, when we recreate quota files. Signed-off-by: Daeho Jeong Reviewed-by: Chao Yu Thanks, ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH] f2fs-tools: ensure that unused xattr space is zeroized
On 2023/10/26 10:57, Eric Biggers wrote: From: Eric Biggers Make fsck.f2fs zeroize the unused xattr space, i.e. the space after the end of the zero-terminated xattr list, if it isn't already zeroized. This is important because the kernel currently does not explicitly zero-terminate the list when writing xattrs. So, the kernel relies on the unused space containing zeroes. Also, add a missing free() to fix a memory leak. Signed-off-by: Eric Biggers Reviewed-by: Chao Yu Thanks, ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH] f2fs-tools: fix corrupted xattr entry
On 2023/10/23 23:50, Jaegeuk Kim wrote: From: Daeho Jeong Detect and fix a corrupted xattr entry. Signed-off-by: Daeho Jeong Reviewed-by: Chao Yu Thanks, ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH v2] f2fs-tools: do not reuse corrupted quota inodes
On 2023/10/27 8:21, Daeho Jeong wrote: From: Daeho Jeong When we detect quota inode corruption, we better deallocate the current space and allocate new ones for a clean start. Signed-off-by: Daeho Jeong --- v2: change node count check when creating a new node block --- fsck/fsck.c| 148 +++-- fsck/segment.c | 5 +- 2 files changed, 109 insertions(+), 44 deletions(-) diff --git a/fsck/fsck.c b/fsck/fsck.c index 99cface..dc8e282 100644 --- a/fsck/fsck.c +++ b/fsck/fsck.c @@ -67,6 +67,14 @@ int f2fs_set_sit_bitmap(struct f2fs_sb_info *sbi, u32 blk) return f2fs_set_bit(BLKOFF_FROM_MAIN(sbi, blk), fsck->sit_area_bitmap); } +static inline int f2fs_clear_sit_bitmap(struct f2fs_sb_info *sbi, u32 blk) +{ + struct f2fs_fsck *fsck = F2FS_FSCK(sbi); + + return f2fs_clear_bit(BLKOFF_FROM_MAIN(sbi, blk), + fsck->sit_area_bitmap); +} + static int add_into_hard_link_list(struct f2fs_sb_info *sbi, u32 nid, u32 link_cnt) { @@ -2150,6 +2158,9 @@ int fsck_chk_quota_node(struct f2fs_sb_info *sbi) return ret; } +static void fsck_disconnect_file(struct f2fs_sb_info *sbi, nid_t ino, + bool dealloc); + int fsck_chk_quota_files(struct f2fs_sb_info *sbi) { struct f2fs_fsck *fsck = F2FS_FSCK(sbi); @@ -2181,6 +2192,8 @@ int fsck_chk_quota_files(struct f2fs_sb_info *sbi) if (c.fix_on) { DBG(0, "Fixing Quota file ([%3d] ino [0x%x])\n", qtype, ino); + fsck_disconnect_file(sbi, ino, true); + f2fs_rebuild_qf_inode(sbi, qtype); f2fs_filesize_update(sbi, ino, 0); ret = quota_write_inode(sbi, qtype); if (!ret) { @@ -2874,10 +2887,53 @@ static int fsck_do_reconnect_file(struct f2fs_sb_info *sbi, return 0; } -static void fsck_failed_reconnect_file_dnode(struct f2fs_sb_info *sbi, - struct f2fs_inode *inode, nid_t nid) +static inline void release_inode_cnt(struct f2fs_sb_info *sbi, bool dealloc) +{ + F2FS_FSCK(sbi)->chk.valid_inode_cnt--; + if (dealloc) + sbi->total_valid_inode_count--; +} + +static inline void release_node_cnt(struct f2fs_sb_info *sbi, bool dealloc) +{ + F2FS_FSCK(sbi)->chk.valid_node_cnt--; + if (dealloc) + sbi->total_valid_node_count--; +} + +static inline void release_block_cnt(struct f2fs_sb_info *sbi, bool dealloc) +{ + F2FS_FSCK(sbi)->chk.valid_blk_cnt--; + if (dealloc) + sbi->total_valid_block_count--; +} + +static inline void release_block(struct f2fs_sb_info *sbi, u64 blkaddr, + bool dealloc) +{ + f2fs_clear_main_bitmap(sbi, blkaddr); + if (dealloc) { + struct seg_entry *se; + u64 offset; + + se = get_seg_entry(sbi, GET_SEGNO(sbi, blkaddr)); + offset = OFFSET_IN_SEG(sbi, blkaddr); + se->valid_blocks--; + f2fs_clear_bit(offset, (char *)se->cur_valid_map); + se->dirty = 1; + f2fs_clear_sit_bitmap(sbi, blkaddr); + } +} + +static inline void release_nat_entry(struct f2fs_sb_info *sbi, u32 nid) +{ + nullify_nat_entry(sbi, nid); + F2FS_FSCK(sbi)->chk.valid_nat_entry_cnt--; +} + +static void fsck_disconnect_file_dnode(struct f2fs_sb_info *sbi, + struct f2fs_inode *inode, nid_t nid, bool dealloc) { - struct f2fs_fsck *fsck = F2FS_FSCK(sbi); struct f2fs_node *node; struct node_info ni; u32 addr; @@ -2890,27 +2946,29 @@ static void fsck_failed_reconnect_file_dnode(struct f2fs_sb_info *sbi, err = dev_read_block(node, ni.blk_addr); ASSERT(err >= 0); - fsck->chk.valid_node_cnt--; - fsck->chk.valid_blk_cnt--; - f2fs_clear_main_bitmap(sbi, ni.blk_addr); + release_node_cnt(sbi, dealloc); + release_block_cnt(sbi, dealloc); + release_block(sbi, ni.blk_addr, dealloc); for (i = 0; i < ADDRS_PER_BLOCK(inode); i++) { addr = le32_to_cpu(node->dn.addr[i]); if (!addr) continue; - fsck->chk.valid_blk_cnt--; + release_block_cnt(sbi, dealloc); if (addr == NEW_ADDR || addr == COMPRESS_ADDR) continue; - f2fs_clear_main_bitmap(sbi, addr); + release_block(sbi, addr, dealloc); } + if (dealloc) + release_nat_entry(sbi, nid); + free(node); } -static void fsck_failed_reconnect_file_idnode(struct f2fs_sb_info *sbi, - struct f2fs_inode *inode, nid_t nid) +static void fsck_disconnect_file_idnode(
Re: [f2fs-dev] [PATCH v2] f2fs_io: add list/setattr command
On 2023/10/20 10:39, Jaegeuk Kim wrote: Let's add list/set/removexattrs commands. It looks it missed to add related description in man/f2fs_io.8. Thanks, Signed-off-by: Jaegeuk Kim --- - add removexattrs tools/f2fs_io/f2fs_io.c | 107 1 file changed, 107 insertions(+) diff --git a/tools/f2fs_io/f2fs_io.c b/tools/f2fs_io/f2fs_io.c index c812aa1458a2..e7d286a67939 100644 --- a/tools/f2fs_io/f2fs_io.c +++ b/tools/f2fs_io/f2fs_io.c @@ -35,6 +35,7 @@ #include #include #include +#include #ifdef HAVE_CONFIG_H #include @@ -1526,6 +1527,109 @@ static void do_gc_range(int argc, char **argv, const struct cmd_desc *cmd) exit(0); } +#define listxattr_desc "listxattr" +#define listxattr_help "f2fs_io listxattr [file_path]\n\n" + +static void do_listxattr(int argc, char **argv, const struct cmd_desc *cmd) +{ + char *buf, *key, *val; + ssize_t buflen, vallen, keylen; + + if (argc != 2) { + fputs("Excess arguments\n\n", stderr); + fputs(cmd->cmd_help, stderr); + exit(1); + } + + buflen = listxattr(argv[1], NULL, 0); + if (buflen == -1) { + perror("listxattr"); + exit(1); + } + if (buflen == 0) { + printf("%s has no attributes.\n", argv[1]); + exit(0); + } + buf = xmalloc(buflen); + buflen = listxattr(argv[1], buf, buflen); + if (buflen == -1) { + perror("listxattr"); + exit(1); + } + + key = buf; + while (buflen > 0) { + printf("%s: ", key); + vallen = getxattr(argv[1], key, NULL, 0); + if (vallen == -1) { + perror("getxattr"); + exit(1); + } + if (vallen == 0) { + printf(""); + } else { + val = xmalloc(vallen + 1); + vallen = getxattr(argv[1], key, val, vallen); + if (vallen == -1) { + perror("getxattr"); + exit(1); + } + val[vallen] = 0; + printf("%s", val); + free(val); + } + printf("\n"); + keylen = strlen(key) + 1; + buflen -= keylen; + key += keylen; + } + exit(0); +} + +#define setxattr_desc "setxattr" +#define setxattr_help "f2fs_io setxattr [name] [value] [file_path]\n\n" + +static void do_setxattr(int argc, char **argv, const struct cmd_desc *cmd) +{ + int ret; + + if (argc != 4) { + fputs("Excess arguments\n\n", stderr); + fputs(cmd->cmd_help, stderr); + exit(1); + } + + ret = setxattr(argv[3], argv[1], argv[2], strlen(argv[2]), XATTR_CREATE); + printf("setxattr %s CREATE: name: %s, value: %s: ret=%d\n", + argv[3], argv[1], argv[2], ret); + if (ret < 0 && errno == EEXIST) { + ret = setxattr(argv[3], argv[1], argv[2], strlen(argv[2]), XATTR_REPLACE); + printf("setxattr %s REPLACE: name: %s, value: %s: ret=%d\n", + argv[3], argv[1], argv[2], ret); + } + if (ret < 0) + perror("setxattr"); + exit(0); +} + +#define removexattr_desc "removexattr" +#define removexattr_help "f2fs_io removexattr [name] [file_path]\n\n" + +static void do_removexattr(int argc, char **argv, const struct cmd_desc *cmd) +{ + int ret; + + if (argc != 3) { + fputs("Excess arguments\n\n", stderr); + fputs(cmd->cmd_help, stderr); + exit(1); + } + + ret = removexattr(argv[2], argv[1]); + printf("removexattr %s REMOVE: name: %s: ret=%d\n", argv[1], argv[2], ret); + exit(0); +} + #define CMD_HIDDEN0x0001 #define CMD(name) { #name, do_##name, name##_desc, name##_help, 0 } #define _CMD(name) { #name, do_##name, NULL, NULL, CMD_HIDDEN } @@ -1564,6 +1668,9 @@ const struct cmd_desc cmd_list[] = { CMD(precache_extents), CMD(move_range), CMD(gc_range), + CMD(listxattr), + CMD(setxattr), + CMD(removexattr), { NULL, NULL, NULL, NULL, 0 } }; ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH 2/2] fsck.f2fs: fix cache offset for multiple partitions
On 2023/10/17 5:58, Jaegeuk Kim wrote: The cache offset should have been considered multiple partitions per fd. Let's fix. Signed-off-by: Jaegeuk Kim Reviewed-by: Chao Yu Thanks, ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH 1/2] Revert "f2fs-tools: do not support user-space cache"
On 2023/10/17 5:58, Jaegeuk Kim wrote: This reverts commit 2835107ae3908576b41ff5f6a4e63ba7ec9a6246. There's a report that the impact was true. Signed-off-by: Jaegeuk Kim Sorry for delay reply, feel free to add: Reviewed-by: Chao Yu Thanks, ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH] f2fs: skip adding a discard command if exists
On 2023/11/15 10:45, Jaegeuk Kim wrote: On 11/15, Chao Yu wrote: On 2023/11/15 5:24, Jaegeuk Kim wrote: When recovering zoned UFS, sometimes we add the same zone to discard multiple times. Simple workaround is to bypass adding it. What about skipping f2fs_bug_on() just for zoned UFS case? so that the check condition can still be used for non-zoned UFS case. Hmm, I've never seen this bug_on before, but even this really happens, it does I've never seen it was been triggered as well. not make sense to move forward to create duplicate commands resulting in a loop. Agreed. It looks those codes were copied from extent_cache code base, do we need to fix all cases to avoid loop? So, the question is, do we really need to check this? Have we hit this before? Not sure, just be worry about that flaw of newly developed feature can make code run into that branch. Thanks, Thanks, Signed-off-by: Jaegeuk Kim --- fs/f2fs/segment.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c index 727d016318f9..f4ffd64b44b2 100644 --- a/fs/f2fs/segment.c +++ b/fs/f2fs/segment.c @@ -1380,7 +1380,8 @@ static void __insert_discard_cmd(struct f2fs_sb_info *sbi, p = &(*p)->rb_right; leftmost = false; } else { - f2fs_bug_on(sbi, 1); + /* Let's skip to add, if exists */ + return; } } ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH] f2fs: skip adding a discard command if exists
On 2023/11/15 5:24, Jaegeuk Kim wrote: When recovering zoned UFS, sometimes we add the same zone to discard multiple times. Simple workaround is to bypass adding it. What about skipping f2fs_bug_on() just for zoned UFS case? so that the check condition can still be used for non-zoned UFS case. Thanks, Signed-off-by: Jaegeuk Kim --- fs/f2fs/segment.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c index 727d016318f9..f4ffd64b44b2 100644 --- a/fs/f2fs/segment.c +++ b/fs/f2fs/segment.c @@ -1380,7 +1380,8 @@ static void __insert_discard_cmd(struct f2fs_sb_info *sbi, p = &(*p)->rb_right; leftmost = false; } else { - f2fs_bug_on(sbi, 1); + /* Let's skip to add, if exists */ + return; } } ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH 1/1] f2fs: fix fallocate failed under pinned block situation
On 2023/11/8 21:48, Wu Bo wrote: On 2023/11/7 22:39, Chao Yu wrote: On 2023/10/30 17:40, Wu Bo wrote: If GC victim has pinned block, it can't be recycled. And if GC is foreground running, after many failure try, the pinned file is expected to be clear pin flag. To enable the section be recycled. But when fallocate trigger FG_GC, GC can never recycle the pinned section. Because GC will go to stop before the failure try meet the threshold: if (has_enough_free_secs(sbi, sec_freed, 0)) { if (!gc_control->no_bg_gc && total_sec_freed < gc_control->nr_free_secs) goto go_gc_more; goto stop; } So when fallocate trigger FG_GC, at least recycle one. Hmm... it may break pinfile's semantics at least on one pinned file? In this case, I prefer to fail fallocate() rather than unpinning file, in order to avoid leaving invalid LBA references of unpinned file held by userspace. As f2fs designed now, FG_GC is able to unpin the pinned file. fallocate() triggered FG_GC, but can't recycle space. It breaks the design logic of FG_GC. Yes, contradictoriness exists. IMO, unpin file by GC looks more dangerous, it may cause potential data corruption w/ below case: 1. app pins file & holds LBAs of data blocks. 2. GC unpins file and migrates its data to new LBAs. 3. other file reuses previous LBAs. 4. app read/write data via previous LBAs. So I suggest to normalize use of pinfile and do not add more unpin cases in filesystem inner processes. This issue is happened in Android OTA scenario. fallocate() always return failure cause OTA fail. Can you please check why other pinned files were so fragmented that f2fs_gc() can not recycle one free section? Thanks, And this commit changed previous behavior of fallocate(): Commit 2e42b7f817ac ("f2fs: stop allocating pinned sections if EAGAIN happens") Before this commit, if fallocate() meet this situation, it will trigger FG_GC to recycle pinned space finally. FG_GC is expected to recycle pinned space when there is no more free space. And this is the right time to do it when fallocate() need free space. It is weird when f2fs shows enough spare space but can't fallocate(). So I think it should be fixed. Thoughts? Thanks, This issue can be reproduced by filling f2fs space as following layout. Every segment has one block is pinned: +-+-+-+-+-+-+-+-+ | | |p| | | | ... | | seg_n +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+ | | |p| | | | ... | | seg_n+1 +-+-+-+-+-+-+-+-+ ... +-+-+-+-+-+-+-+-+ | | |p| | | | ... | | seg_n+k +-+-+-+-+-+-+-+-+ And following are steps to reproduce this issue: dd if=/dev/zero of=./f2fs_pin.img bs=2M count=1024 mkfs.f2fs f2fs_pin.img mkdir f2fs mount f2fs_pin.img ./f2fs cd f2fs dd if=/dev/zero of=./large_padding bs=1M count=1760 ./pin_filling.sh rm padding* sync touch fallocate_40m f2fs_io pinfile set fallocate_40m fallocate -l 41943040 fallocate_40m fallocate always fail with EAGAIN even there has enough free space. 'pin_filling.sh' is: count=1 while : do # filling the seg space for i in {1..511}: do name=padding_$count-$i echo write $name dd if=/dev/zero of=./$name bs=4K count=1 > /dev/null 2>&1 if [ $? -ne 0 ]; then exit 0 fi done sync # pin one block in a segment name=pin_file$count dd if=/dev/zero of=./$name bs=4K count=1 > /dev/null 2>&1 sync f2fs_io pinfile set $name count=$(($count + 1)) done Signed-off-by: Wu Bo --- fs/f2fs/file.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index ca5904129b16..e8a13616543f 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -1690,7 +1690,7 @@ static int f2fs_expand_inode_data(struct inode *inode, loff_t offset, .init_gc_type = FG_GC, .should_migrate_blocks = false, .err_gc_skipped = true, - .nr_free_secs = 0 }; + .nr_free_secs = 1 }; pgoff_t pg_start, pg_end; loff_t new_size; loff_t off_end; ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH V2] libf2fs: Fix using uninitialized variables error in get_device_info()
On 2023/11/9 23:47, zangyangyang1 wrote: This issue comes from a static code scanning tool. When c.sparse_mode is 1, stat_buf will not be initialized, but it will be used next. If this issue does not require modification, please ignore this commit. Signed-off-by: zangyangyang1 Reviewed-by: Chao Yu Thanks, ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH] libf2fs: Fix using uninitialized variables error in get_device_info()
On 2023/11/8 20:38, zangyangyang1 wrote: This issue comes from a static code scanning tool. When c.sparse_mode is 1, stat_buf will not be initialized, but it will be used next. If this issue does not require modification, please ignore this commit. Signed-off-by: zangyangyang1 --- lib/libf2fs.c | 1 + 1 file changed, 1 insertion(+) diff --git a/lib/libf2fs.c b/lib/libf2fs.c index 995e42d..0398c52 100644 --- a/lib/libf2fs.c +++ b/lib/libf2fs.c @@ -933,6 +933,7 @@ int get_device_info(int i) stat_buf = malloc(sizeof(struct stat)); Use calloc() instead of malloc()? Thanks, ASSERT(stat_buf); + memset(stat_buf, 0, sizeof(struct stat)); if (!c.sparse_mode) { if (stat(dev->path, stat_buf) < 0 ) { -- 2.40.1 ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH] f2fs: data: fix possible overflow in check_swap_activate()
Hi Sergey, Thanks for the patch. On 2023/10/26 4:20, Sergey Shtylyov wrote: In check_swap_activate(), if the *while* loop exits early (0- or 1-page long swap file), an overflow happens while calculating the value of the span parameter as the lowest_pblock variable ends up being greater than the highest_pblock variable. Let's set *span to 0 in this case... What do you think of returning -EINVAL for such case? I assume this is a corner case. Found by Linux Verification Center (linuxtesting.org) with the SVACE static analysis tool. Signed-off-by: Sergey Shtylyov --- This patch is against the 'master' branch of Jaegeuk Kim's F2FS repo... fs/f2fs/data.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index 916e317ac925..342cb0d5056d 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -4047,7 +4047,10 @@ static int check_swap_activate(struct swap_info_struct *sis, cur_lblock += nr_pblocks; } ret = nr_extents; - *span = 1 + highest_pblock - lowest_pblock; + if (lowest_pblock <= highest_pblock) if (unlikely(higest_pblock < lowest_pblock)) return -EINVAL; *span = 1 + highest_pblock - lowest_pblock; Thanks, + *span = 1 + highest_pblock - lowest_pblock; + else + *span = 0; if (cur_lblock == 0) cur_lblock = 1; /* force Empty message */ sis->max = cur_lblock; ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel