[f2fs-dev] [PATCH] f2fs_io: fix output of do_read()

2024-05-23 Thread Chao Yu
echo 1 > file
f2fs_io read 1 0 1 dio 4096 ./file
Read 0 bytes total_time = 17 us, print 4096 bytes:
 : ffd537 ffc957 0500     
0100 :        
0200 :        
0300 :     ffc10f 0200  

When a read crosses EOF, do_read() fails to copy the data actually
returned into print_buf.

After:
f2fs_io read 1 0 1 dio 4096 ./file
pread expected: 4096, readed: 2
Read 2 bytes total_time = 177 us, print 4096 bytes:
 : 310a       
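
The short-read handling described above can be sketched as a standalone program fragment. This is only an illustration under assumptions, not the f2fs_io code: read_chunks() and short_read_demo() are hypothetical names, and unlike the patch this sketch copies only the bytes that pread() actually returned.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of f2fs_io): read up to `count` chunks of
 * `buf_size` bytes starting at `offset`. Bytes returned by the first
 * pread() are copied into print_buf even when the read is short.
 * Returns the total number of bytes read, or -1 on error.
 */
static ssize_t read_chunks(int fd, char *buf, size_t buf_size, size_t count,
			   off_t offset, char *print_buf, size_t print_bytes)
{
	ssize_t total = 0;

	assert(buf_size > 0);
	for (size_t i = 0; i < count; i++) {
		ssize_t ret = pread(fd, buf, buf_size,
				    offset + (off_t)(buf_size * i));
		if (ret < 0)
			return -1;
		if (i == 0 && ret > 0) {
			size_t n = (size_t)ret < print_bytes ?
					(size_t)ret : print_bytes;
			memcpy(print_buf, buf, n);
		}
		total += ret;
		if ((size_t)ret != buf_size)
			break;	/* short read, typically EOF */
	}
	return total;
}

/* Demo: a 2-byte file ("1\n") read with a 4096-byte buffer. */
static ssize_t short_read_demo(char *print_buf, size_t print_bytes)
{
	char path[] = "/tmp/short_read_XXXXXX";
	char buf[4096];
	ssize_t n = -1;
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	if (write(fd, "1\n", 2) == 2)
		n = read_chunks(fd, buf, sizeof(buf), 1, 0,
				print_buf, print_bytes);
	close(fd);
	unlink(path);
	return n;
}
```

Calling short_read_demo() with a small output buffer should report 2 bytes read, with the bytes 0x31 0x0a at the start of the buffer, matching the "After" output above.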

Signed-off-by: Chao Yu 
---
 tools/f2fs_io/f2fs_io.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/tools/f2fs_io/f2fs_io.c b/tools/f2fs_io/f2fs_io.c
index a7b593a..79b4d04 100644
--- a/tools/f2fs_io/f2fs_io.c
+++ b/tools/f2fs_io/f2fs_io.c
@@ -867,8 +867,15 @@ static void do_read(int argc, char **argv, const struct cmd_desc *cmd)
if (!do_mmap) {
for (i = 0; i < count; i++) {
ret = pread(fd, buf, buf_size, offset + buf_size * i);
-   if (ret != buf_size)
+   if (ret != buf_size) {
+   printf("pread expected: %"PRIu64", readed: %"PRIu64"\n",
+   buf_size, ret);
+   if (ret > 0) {
+   read_cnt += ret;
+   memcpy(print_buf, buf, print_bytes);
+   }
break;
+   }
 
read_cnt += ret;
if (i == 0)
-- 
2.40.1



___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


[f2fs-dev] [PATCH] f2fs: fix to force buffered IO on inline_data inode

2024-05-23 Thread Chao Yu
DIO reads from an inline_data inode return all-zero data, because
f2fs_iomap_begin() incorrectly assigns IOMAP_HOLE to iomap->type in
this case.

We could let the iomap framework handle inline data by assigning
iomap->type and iomap->inline_data correctly; however, handling the
race between direct IO and buffered IO would then be somewhat
complicated.

So, let's force buffered IO to fix this issue.
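
As a rough illustration of the decision this patch changes, the sketch below models the check in user space. struct inode_model and force_buffered_io() are hypothetical stand-ins; the real f2fs_force_buffered_io() tests more conditions than shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the inode state the check looks at. */
struct inode_model {
	bool compressed;	/* stands in for f2fs_compressed_file() */
	bool inline_data;	/* stands in for f2fs_has_inline_data() */
};

/*
 * Return true when a direct IO request should fall back to buffered IO.
 * The inline_data test mirrors the two lines added by the patch.
 */
static bool force_buffered_io(const struct inode_model *inode)
{
	assert(inode);
	if (inode->compressed)
		return true;
	if (inode->inline_data)
		return true;
	return false;
}
```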

Cc: sta...@vger.kernel.org
Reported-by: Barry Song 
Signed-off-by: Chao Yu 
---
 fs/f2fs/file.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index db6236f27852..e038910ad1e5 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -851,6 +851,8 @@ static bool f2fs_force_buffered_io(struct inode *inode, int rw)
return true;
if (f2fs_compressed_file(inode))
return true;
+   if (f2fs_has_inline_data(inode))
+   return true;
 
/* disallow direct IO if any of devices has unaligned blksize */
if (f2fs_is_multi_device(sbi) && !sbi->aligned_blksize)
-- 
2.40.1





[f2fs-dev] [PATCH 2/2] f2fs: fix to do sanity check on blocks for inline_data inode

2024-05-21 Thread Chao Yu
An inode can be fuzzed so that it has the F2FS_INLINE_DATA flag set
together with valid i_blocks/i_nid values; this patch adds an extra
sanity check to detect such a corrupted state.
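
The predicate at the heart of this check can be modelled in user space as below. This is a simplified sketch under assumptions: struct inode_model is hypothetical, NIDS_PER_INODE is assumed to equal DEF_NIDS_PER_INODE, and data_blocks > 0 is a simplification of F2FS_HAS_BLOCKS(). It shows only the "does this inode reference any blocks" test, not the return-value convention of f2fs_sanity_check_inline_data().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NIDS_PER_INODE 5	/* assumed to match DEF_NIDS_PER_INODE */

/* Hypothetical, simplified view of the on-disk fields involved. */
struct inode_model {
	uint64_t data_blocks;		/* blocks beyond the inode block */
	uint32_t i_nid[NIDS_PER_INODE];	/* node ids referenced by the inode */
};

/* True if the inode owns data blocks or points at any node block. */
static bool inode_has_blocks(const struct inode_model *ri)
{
	assert(ri);
	if (ri->data_blocks > 0)
		return true;
	for (int i = 0; i < NIDS_PER_INODE; i++) {
		if (ri->i_nid[i])
			return true;
	}
	return false;
}
```

An inode that carries the inline_data flag and for which this predicate is true is in the inconsistent state the commit message describes.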

Signed-off-by: Chao Yu 
---
 fs/f2fs/f2fs.h   |  2 +-
 fs/f2fs/inline.c | 20 +++++++++++++++++++-
 fs/f2fs/inode.c  |  2 +-
 3 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 1974b6aff397..f463961b497c 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -4149,7 +4149,7 @@ extern struct kmem_cache *f2fs_inode_entry_slab;
  * inline.c
  */
 bool f2fs_may_inline_data(struct inode *inode);
-bool f2fs_sanity_check_inline_data(struct inode *inode);
+bool f2fs_sanity_check_inline_data(struct inode *inode, struct page *ipage);
 bool f2fs_may_inline_dentry(struct inode *inode);
 void f2fs_do_read_inline_data(struct folio *folio, struct page *ipage);
 void f2fs_truncate_inline_inode(struct inode *inode,
diff --git a/fs/f2fs/inline.c b/fs/f2fs/inline.c
index 7638d0d7b7ee..0203c3baabb6 100644
--- a/fs/f2fs/inline.c
+++ b/fs/f2fs/inline.c
@@ -33,11 +33,29 @@ bool f2fs_may_inline_data(struct inode *inode)
return !f2fs_post_read_required(inode);
 }
 
-bool f2fs_sanity_check_inline_data(struct inode *inode)
+static bool inode_has_blocks(struct inode *inode, struct page *ipage)
+{
+   struct f2fs_inode *ri = F2FS_INODE(ipage);
+   int i;
+
+   if (F2FS_HAS_BLOCKS(inode))
+   return true;
+
+   for (i = 0; i < DEF_NIDS_PER_INODE; i++) {
+   if (ri->i_nid[i])
+   return true;
+   }
+   return false;
+}
+
+bool f2fs_sanity_check_inline_data(struct inode *inode, struct page *ipage)
 {
if (!f2fs_has_inline_data(inode))
return false;
 
+   if (inode_has_blocks(inode, ipage))
+   return false;
+
if (!support_inline_data(inode))
return true;
 
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index 791c06e159fd..4b39aebd3c70 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -344,7 +344,7 @@ static bool sanity_check_inode(struct inode *inode, struct page *node_page)
}
}
 
-   if (f2fs_sanity_check_inline_data(inode)) {
+   if (f2fs_sanity_check_inline_data(inode, node_page)) {
f2fs_warn(sbi, "%s: inode (ino=%lx, mode=%u) should not have inline_data, run fsck to fix",
  __func__, inode->i_ino, inode->i_mode);
return false;
-- 
2.40.1





[f2fs-dev] [PATCH 1/2] f2fs: fix to do sanity check on F2FS_INLINE_DATA flag in inode during GC

2024-05-21 Thread Chao Yu
syzbot reports a f2fs bug as below:

------------[ cut here ]------------
kernel BUG at fs/f2fs/inline.c:258!
CPU: 1 PID: 34 Comm: kworker/u8:2 Not tainted 6.9.0-rc6-syzkaller-00012-g9e4bc4bcae01 #0
RIP: 0010:f2fs_write_inline_data+0x781/0x790 fs/f2fs/inline.c:258
Call Trace:
 f2fs_write_single_data_page+0xb65/0x1d60 fs/f2fs/data.c:2834
 f2fs_write_cache_pages fs/f2fs/data.c:3133 [inline]
 __f2fs_write_data_pages fs/f2fs/data.c:3288 [inline]
 f2fs_write_data_pages+0x1efe/0x3a90 fs/f2fs/data.c:3315
 do_writepages+0x35b/0x870 mm/page-writeback.c:2612
 __writeback_single_inode+0x165/0x10b0 fs/fs-writeback.c:1650
 writeback_sb_inodes+0x905/0x1260 fs/fs-writeback.c:1941
 wb_writeback+0x457/0xce0 fs/fs-writeback.c:2117
 wb_do_writeback fs/fs-writeback.c:2264 [inline]
 wb_workfn+0x410/0x1090 fs/fs-writeback.c:2304
 process_one_work kernel/workqueue.c:3254 [inline]
 process_scheduled_works+0xa12/0x17c0 kernel/workqueue.c:3335
 worker_thread+0x86d/0xd70 kernel/workqueue.c:3416
 kthread+0x2f2/0x390 kernel/kthread.c:388
 ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

The root cause is: an inline_data inode can be fuzzed so that there is
a valid blkaddr in its direct node. Once f2fs triggers background GC
to migrate the block, it hits f2fs_bug_on() during dirty page
writeback.

Let's add a sanity check on the F2FS_INLINE_DATA flag in the inode
during GC, so that GC refuses to migrate an inline_data inode's data
block.
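
The skip-and-flag behaviour described above can be sketched as a small user-space model. struct gc_candidate and gc_scan() are hypothetical names used for illustration only; the real gc_data_segment() does far more work per block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a block owner found while scanning a GC victim. */
struct gc_candidate {
	unsigned long ino;
	bool inline_data;	/* stands in for f2fs_has_inline_data() */
};

/*
 * Walk the candidates. An inode that claims inline data yet owns a data
 * block is skipped and the need for fsck is recorded (the model's
 * counterpart of SBI_NEED_FSCK). Returns the number of blocks that
 * would be migrated.
 */
static size_t gc_scan(const struct gc_candidate *c, size_t n, bool *need_fsck)
{
	size_t migrated = 0;

	assert(need_fsck);
	for (size_t i = 0; i < n; i++) {
		if (c[i].inline_data) {
			*need_fsck = true;
			continue;
		}
		migrated++;
	}
	return migrated;
}
```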

Reported-by: syzbot+848062ba19c8782ca...@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-f2fs-devel/d103ce06174d7...@google.com
Signed-off-by: Chao Yu 
---
 fs/f2fs/gc.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 6066c6eecf41..20e2f989013b 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1563,6 +1563,16 @@ static int gc_data_segment(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
continue;
}
 
+   if (f2fs_has_inline_data(inode)) {
+   iput(inode);
+   set_sbi_flag(sbi, SBI_NEED_FSCK);
+   f2fs_err_ratelimited(sbi,
+   "inode %lx has both inline_data flag and "
+   "data block, nid=%u, ofs_in_node=%u",
+   inode->i_ino, dni.nid, ofs_in_node);
+   continue;
+   }
+
err = f2fs_gc_pinned_control(inode, gc_type, segno);
if (err == -EAGAIN) {
iput(inode);
-- 
2.40.1





[f2fs-dev] [PATCH] f2fs: fix to truncate preallocated blocks in f2fs_file_open()

2024-05-19 Thread Chao Yu
chenyuwen reports a f2fs bug as below:

Unable to handle kernel NULL pointer dereference at virtual address 0011
 fscrypt_set_bio_crypt_ctx+0x78/0x1e8
 f2fs_grab_read_bio+0x78/0x208
 f2fs_submit_page_read+0x44/0x154
 f2fs_get_read_data_page+0x288/0x5f4
 f2fs_get_lock_data_page+0x60/0x190
 truncate_partial_data_page+0x108/0x4fc
 f2fs_do_truncate_blocks+0x344/0x5f0
 f2fs_truncate_blocks+0x6c/0x134
 f2fs_truncate+0xd8/0x200
 f2fs_iget+0x20c/0x5ac
 do_garbage_collect+0x5d0/0xf6c
 f2fs_gc+0x22c/0x6a4
 f2fs_disable_checkpoint+0xc8/0x310
 f2fs_fill_super+0x14bc/0x1764
 mount_bdev+0x1b4/0x21c
 f2fs_mount+0x20/0x30
 legacy_get_tree+0x50/0xbc
 vfs_get_tree+0x5c/0x1b0
 do_new_mount+0x298/0x4cc
 path_mount+0x33c/0x5fc
 __arm64_sys_mount+0xcc/0x15c
 invoke_syscall+0x60/0x150
 el0_svc_common+0xb8/0xf8
 do_el0_svc+0x28/0xa0
 el0_svc+0x24/0x84
 el0t_64_sync_handler+0x88/0xec

It is because inode.i_crypt_info is not initialized in the path below:
- mount
 - f2fs_fill_super
  - f2fs_disable_checkpoint
   - f2fs_gc
    - f2fs_iget
     - f2fs_truncate

So, let's relocate truncation of preallocated blocks to f2fs_file_open(),
after fscrypt_file_open().

Fixes: d4dd19ec1ea0 ("f2fs: do not expose unwritten blocks to user by DIO")
Reported-by: chenyuwen 
Closes: https://lore.kernel.org/linux-kernel/20240517085327.1188515-1-yuwen.c...@xjmz.com
Signed-off-by: Chao Yu 
---
 fs/f2fs/file.c  | 28 +++++++++++++++++++++++++++-
 fs/f2fs/inode.c |  8 --------
 2 files changed, 27 insertions(+), 9 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index ef4cfb4436ef..058fcc83a2fc 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -554,6 +554,28 @@ static int f2fs_file_mmap(struct file *file, struct vm_area_struct *vma)
return 0;
 }
 
+static int finish_preallocate_blocks(struct inode *inode)
+{
+   int ret;
+
+   if (is_sbi_flag_set(F2FS_I_SB(inode), SBI_POR_DOING))
+   return 0;
+
+   inode_lock(inode);
+   if (!file_should_truncate(inode)) {
+   inode_unlock(inode);
+   return 0;
+   }
+
+   ret = f2fs_truncate(inode);
+   inode_unlock(inode);
+   if (ret)
+   return ret;
+
+   file_dont_truncate(inode);
+   return 0;
+}
+
 static int f2fs_file_open(struct inode *inode, struct file *filp)
 {
int err = fscrypt_file_open(inode, filp);
@@ -571,7 +593,11 @@ static int f2fs_file_open(struct inode *inode, struct file *filp)
filp->f_mode |= FMODE_NOWAIT | FMODE_BUF_RASYNC;
filp->f_mode |= FMODE_CAN_ODIRECT;
 
-   return dquot_file_open(inode, filp);
+   err = dquot_file_open(inode, filp);
+   if (err)
+   return err;
+
+   return finish_preallocate_blocks(inode);
 }
 
 void f2fs_truncate_data_blocks_range(struct dnode_of_data *dn, int count)
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index 005dde72aff3..791c06e159fd 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -610,14 +610,6 @@ struct inode *f2fs_iget(struct super_block *sb, unsigned long ino)
}
f2fs_set_inode_flags(inode);
 
-   if (file_should_truncate(inode) &&
-   !is_sbi_flag_set(sbi, SBI_POR_DOING)) {
-   ret = f2fs_truncate(inode);
-   if (ret)
-   goto bad_inode;
-   file_dont_truncate(inode);
-   }
-
unlock_new_inode(inode);
trace_f2fs_iget(inode);
return inode;
-- 
2.40.1





Re: [f2fs-dev] [PATCH] f2fs: fix to check return value of f2fs_allocate_new_section

2024-05-19 Thread Chao Yu

On 2024/5/17 19:26, Zhiguo Niu wrote:

commit 245930617c9b ("f2fs: fix to handle error paths of {new,change}_curseg()")
missed this allocation path; fix it.

Signed-off-by: Zhiguo Niu 


Reviewed-by: Chao Yu 

Thanks,




Re: [f2fs-dev] Reply: [External Mail][PATCH] f2fs: fix panic in f2fs_put_super

2024-05-16 Thread Chao Yu

On 2024/5/16 18:15, 孙士杰 wrote:

I didn't get it. If there is no cp_err, f2fs_write_checkpoint() in
f2fs_put_super() will flush all dirty pages of node_inode; if there is
a cp_err, the flow below truncates all remaining dirty pages, and
there is a sanity check on all types of dirty pages.

===>
I understand what you mean. Would it be better to modify it this way? Please
help to check, thank you.


Hi, let's figure out the root cause first?

Thanks,



--
*From:* sunshijie 
*Sent:* 2024-05-16 18:13:38
*To:* jaeg...@kernel.org; c...@kernel.org; linux-f2fs-devel@lists.sourceforge.net; linux-ker...@vger.kernel.org
*Cc:* 孙士杰
*Subject:* [External Mail][PATCH] f2fs: fix panic in f2fs_put_super

When thread A calls kill_f2fs_super(), it first executes
sbi->node_inode = NULL;
It may then submit a bio from within iput(sbi->meta_inode) and enter
the D (uninterruptible sleep) state. When that bio completes,
f2fs_write_end_io() is called and may trigger a null-ptr-deref in
NODE_MAPPING.

Thread A                                  IRQ context
- f2fs_put_super
 - sbi->node_inode = NULL;
 - iput(sbi->meta_inode);
  - iput_final
   - write_inode_now
    - writeback_single_inode
     - __writeback_single_inode
      - filemap_fdatawait
       - filemap_fdatawait_range
        - __kcfi_typeid_free_transhuge_page
         - __filemap_fdatawait_range
          - wait_on_page_writeback
           - folio_wait_writeback
            - folio_wait_bit
             - folio_wait_bit_common
              - io_schedule

                                          - __handle_irq_event_percpu
                                           - ufs_qcom_mcq_esi_handler
                                            - ufshcd_mcq_poll_cqe_nolock
                                             - ufshcd_compl_one_cqe
                                              - scsi_done
                                               - scsi_done_internal
                                                - blk_mq_complete_request
                                                 - scsi_complete
                                                  - scsi_finish_command
                                                   - scsi_io_completion
                                                    - scsi_end_request
                                                     - blk_update_request
                                                      - bio_endio
                                                       - f2fs_write_end_io
                                                        - NODE_MAPPING(sbi)

Signed-off-by: sunshijie 
---
  fs/f2fs/super.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index adffc9b80a9c..62d4f229f601 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1642,9 +1642,9 @@ static void f2fs_put_super(struct super_block *sb)
     f2fs_destroy_compress_inode(sbi);

     iput(sbi->node_inode);
-   sbi->node_inode = NULL;
-
     iput(sbi->meta_inode);
+
+   sbi->node_inode = NULL;
     sbi->meta_inode = NULL;

     mutex_unlock(&sbi->umount_mutex);
--
2.34.1


Re: [f2fs-dev] [PATCH] f2fs: fix panic in f2fs_put_super

2024-05-16 Thread Chao Yu

On 2024/5/16 16:55, sunshijie wrote:

When thread A calls kill_f2fs_super(), it first executes
sbi->node_inode = NULL;
It may then submit a bio from within iput(sbi->meta_inode) and enter
the D (uninterruptible sleep) state. When that bio completes,
f2fs_write_end_io() is called and may trigger a null-ptr-deref in
NODE_MAPPING.


I didn't get it. If there is no cp_err, f2fs_write_checkpoint() in
f2fs_put_super() will flush all dirty pages of node_inode; if there is
a cp_err, the flow below truncates all remaining dirty pages, and
there is a sanity check on all types of dirty pages.

/* our cp_error case, we can wait for any writeback page */
f2fs_flush_merged_writes(sbi);

f2fs_wait_on_all_pages(sbi, F2FS_WB_CP_DATA);

if (err || f2fs_cp_error(sbi)) {
truncate_inode_pages_final(NODE_MAPPING(sbi));
truncate_inode_pages_final(META_MAPPING(sbi));
}

for (i = 0; i < NR_COUNT_TYPE; i++) {
if (!get_pages(sbi, i))
continue;
f2fs_err(sbi, "detect filesystem reference count leak during "
"umount, type: %d, count: %lld", i, get_pages(sbi, i));
f2fs_bug_on(sbi, 1);
}

So, is there any case in which f2fs_put_super() misses a dirty page of
node_inode?

Thanks,



Thread A                                  IRQ context
- f2fs_put_super
 - sbi->node_inode = NULL;
 - iput(sbi->meta_inode);
  - iput_final
   - write_inode_now
    - writeback_single_inode
     - __writeback_single_inode
      - filemap_fdatawait
       - filemap_fdatawait_range
        - __kcfi_typeid_free_transhuge_page
         - __filemap_fdatawait_range
          - wait_on_page_writeback
           - folio_wait_writeback
            - folio_wait_bit
             - folio_wait_bit_common
              - io_schedule

                                          - __handle_irq_event_percpu
                                           - ufs_qcom_mcq_esi_handler
                                            - ufshcd_mcq_poll_cqe_nolock
                                             - ufshcd_compl_one_cqe
                                              - scsi_done
                                               - scsi_done_internal
                                                - blk_mq_complete_request
                                                 - scsi_complete
                                                  - scsi_finish_command
                                                   - scsi_io_completion
                                                    - scsi_end_request
                                                     - blk_update_request
                                                      - bio_endio
                                                       - f2fs_write_end_io
                                                        - NODE_MAPPING(sbi)

Signed-off-by: sunshijie 
---
  fs/f2fs/super.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index adffc9b80a9c..aeb085e11f9a 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1641,12 +1641,12 @@ static void f2fs_put_super(struct super_block *sb)
 
 	f2fs_destroy_compress_inode(sbi);
 
-	iput(sbi->node_inode);
-	sbi->node_inode = NULL;
-
 	iput(sbi->meta_inode);
 	sbi->meta_inode = NULL;
 
+	iput(sbi->node_inode);
+	sbi->node_inode = NULL;
+
 	mutex_unlock(&sbi->umount_mutex);
 
 	/*





Re: [f2fs-dev] [PATCH] f2fs:modify the entering condition for f2fs_migrate_blocks()

2024-05-16 Thread Chao Yu

On 2024/5/15 16:24, Liao Yuanhong wrote:

Currently, when we allocate a swap file on zoned UFS, the file is
created on the conventional UFS area. If the swap file size is not
aligned with the zone size, the last extent enters
f2fs_migrate_blocks(), resulting in significant additional I/O
overhead and prolonged lock occupancy. In most cases this is
unnecessary because, on conventional UFS, as long as the start block
of the swap file is aligned with a zone, the file is sequentially
aligned. To circumvent this issue, we alter the condition for entering
f2fs_migrate_blocks(): if the start block of the last extent is
aligned with the start of a zone, we avoid entering
f2fs_migrate_blocks().


Hi,

Is it possible that we can pin swapfile, and fallocate on it aligned to
zone size, then mkswap and swapon?
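
The size arithmetic behind this suggestion can be sketched as follows. This is only an illustration: round_up_to_zone() and fallocate_zone_aligned() are hypothetical helpers, the zone size is assumed to be known by the caller, and pinning the file as well as mkswap/swapon are separate steps not shown here.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>

/* Round size up to the next multiple of zone_size (any non-zero value). */
static uint64_t round_up_to_zone(uint64_t size, uint64_t zone_size)
{
	assert(zone_size > 0);
	return (size + zone_size - 1) / zone_size * zone_size;
}

/*
 * Preallocate a zone-aligned length for an already opened swap file.
 * Returns 0 on success or an errno value, as posix_fallocate() does.
 */
static int fallocate_zone_aligned(int fd, uint64_t size, uint64_t zone_size)
{
	return posix_fallocate(fd, 0,
			       (off_t)round_up_to_zone(size, zone_size));
}
```

With a file length that is a multiple of the zone size, the last extent no longer ends in the middle of a zone, which is the situation the patch above tries to special-case.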

Thanks,



Signed-off-by: Liao Yuanhong 
Signed-off-by: Wu Bo 
---
  fs/f2fs/data.c | 23 +++++++++++++++++++++--
  1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 50ceb25b3..4d58fb6c2 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -3925,10 +3925,12 @@ static int check_swap_activate(struct swap_info_struct *sis,
 block_t pblock;
 block_t lowest_pblock = -1;
 block_t highest_pblock = 0;
+   block_t blk_start;
 int nr_extents = 0;
 unsigned int nr_pblocks;
 unsigned int blks_per_sec = BLKS_PER_SEC(sbi);
 unsigned int not_aligned = 0;
+   unsigned int cur_sec;
 int ret = 0;

 /*
@@ -3965,23 +3967,39 @@ static int check_swap_activate(struct swap_info_struct *sis,
 pblock = map.m_pblk;
 nr_pblocks = map.m_len;

-   if ((pblock - SM_I(sbi)->main_blkaddr) % blks_per_sec ||
+   blk_start = pblock - SM_I(sbi)->main_blkaddr;
+
+   if (blk_start % blks_per_sec ||
 nr_pblocks % blks_per_sec ||
 !f2fs_valid_pinned_area(sbi, pblock)) {
 bool last_extent = false;

 not_aligned++;

+   cur_sec = (blk_start + nr_pblocks) / BLKS_PER_SEC(sbi);
 nr_pblocks = roundup(nr_pblocks, blks_per_sec);
-   if (cur_lblock + nr_pblocks > sis->max)
+   if (cur_lblock + nr_pblocks > sis->max) {
 nr_pblocks -= blks_per_sec;

+   /* the start address is aligned to section */
+   if (!(blk_start % blks_per_sec))
+   last_extent = true;
+   }
+
 /* this extent is last one */
 if (!nr_pblocks) {
 nr_pblocks = last_lblock - cur_lblock;
 last_extent = true;
 }

+   /*
+* the last extent which located on conventional UFS doesn't
+* need migrate
+*/
+   if (last_extent && f2fs_sb_has_blkzoned(sbi) &&
+   cur_sec < GET_SEC_FROM_SEG(sbi, first_zoned_segno(sbi)))
+   goto next;
+
 ret = f2fs_migrate_blocks(inode, cur_lblock,
 nr_pblocks);
 if (ret) {
@@ -3994,6 +4012,7 @@ static int check_swap_activate(struct swap_info_struct *sis,
 goto retry;
 }

+next:
 if (cur_lblock + nr_pblocks >= sis->max)
 nr_pblocks = sis->max - cur_lblock;

--
2.25.1






[f2fs-dev] [PATCH] f2fs: add support for FS_IOC_GETFSSYSFSPATH

2024-05-15 Thread Chao Yu
The FS_IOC_GETFSSYSFSPATH ioctl expects each filesystem to provide its
sysfs sub-path, in the form "$FSTYP/$SYSFS_IDENTIFIER" under /sys/fs; this
helps standardize how sysfs data is exported across filesystems.

This patch wires up FS_IOC_GETFSSYSFSPATH for f2fs, it will output
"f2fs/".
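
From user space, the ioctl could be exercised roughly as below. This is a hedged sketch: get_fs_sysfs_path() is a hypothetical helper, and struct sysfs_path_buf and GETFSSYSFSPATH_IOCTL are local copies that are assumed to match struct fs_sysfs_path and FS_IOC_GETFSSYSFSPATH in <linux/fs.h> of kernels that provide the ioctl.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Local copies of the uapi definitions so the sketch builds with older
 * headers. Assumption: they match the kernel's struct fs_sysfs_path and
 * FS_IOC_GETFSSYSFSPATH.
 */
struct sysfs_path_buf {
	unsigned char len;
	unsigned char name[128];
};
#define GETFSSYSFSPATH_IOCTL _IOR(0x15, 1, struct sysfs_path_buf)

/*
 * Store the sysfs sub-path of the filesystem holding `path` in `out`
 * (NUL-terminated). Returns 0 on success, -1 on failure, for example
 * when the kernel or the filesystem does not support the ioctl.
 */
static int get_fs_sysfs_path(const char *path, char *out, size_t out_len)
{
	struct sysfs_path_buf buf;
	size_t n;
	int fd, ret;

	assert(out && out_len > 0);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	memset(&buf, 0, sizeof(buf));
	ret = ioctl(fd, GETFSSYSFSPATH_IOCTL, &buf);
	close(fd);
	if (ret < 0)
		return -1;

	n = (size_t)buf.len;
	if (n > sizeof(buf.name))
		n = sizeof(buf.name);
	if (n > out_len - 1)
		n = out_len - 1;
	memcpy(out, buf.name, n);
	out[n] = '\0';
	return 0;
}
```

On an f2fs mount with this patch applied, the returned string would be expected to start with "f2fs/", per the commit message.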

Signed-off-by: Chao Yu 
---
 fs/f2fs/super.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index daf2c4dbe150..1f0f306cbcac 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -4481,6 +4481,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
sb->s_flags = (sb->s_flags & ~SB_POSIXACL) |
(test_opt(sbi, POSIX_ACL) ? SB_POSIXACL : 0);
super_set_uuid(sb, (void *) raw_super->uuid, sizeof(raw_super->uuid));
+   super_set_sysfs_name_bdev(sb);
sb->s_iflags |= SB_I_CGROUPWB;
 
/* init f2fs-specific super block info */
-- 
2.40.1





Re: [f2fs-dev] [PATCH v2] f2fs: fix to avoid racing in between read and OPU dio write

2024-05-15 Thread Chao Yu

On 2024/5/15 12:42, Jaegeuk Kim wrote:

On 05/15, Chao Yu wrote:

On 2024/5/15 0:09, Jaegeuk Kim wrote:

On 05/10, Chao Yu wrote:

If lfs mode is on, a buffered read may race with an OPU dio write as
shown below, which may cause the buffered read to hit unwritten data
unexpectedly; the same race condition exists for dio reads.

Thread A                                Thread B
- f2fs_file_write_iter
 - f2fs_dio_write_iter
  - __iomap_dio_rw
   - f2fs_iomap_begin
    - f2fs_map_blocks
     - __allocate_data_block
      - allocated blkaddr #x
   - iomap_dio_submit_bio
                                        - f2fs_file_read_iter
                                         - filemap_read
                                          - f2fs_read_data_folio
                                           - f2fs_mpage_readpages
                                            - f2fs_map_blocks
                                             : get blkaddr #x
                                            - f2fs_submit_read_bio
                                        IRQ
                                        - f2fs_read_end_io
                                         : read IO on blkaddr #x complete
IRQ
- iomap_dio_bio_end_io
 : direct write IO on blkaddr #x complete

This patch introduces a new per-inode i_opu_rwsem lock to avoid
such race condition.


Wasn't this supposed to be managed by user-land?


Actually, the test case is:

1. mount w/ lfs mode
2. touch file;
3. initialize file w/ 4k zeroed data; fsync;
4. continue triggering dio write 4k zeroed data to file;
5. and meanwhile, continue triggering buf/dio 4k read in file,
use md5sum to verify the 4k data;

It expects data is all zero, however it turned out it's not.
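
Steps 2 to 5 above could be approximated by a small program like the following. It is a sketch under assumptions, not the test actually used: race_check() and is_all_zero() are hypothetical names, a forked child stands in for the dio writer, the reader uses buffered reads only, and a direct all-zero comparison replaces md5sum. Mounting f2fs in lfs mode (step 1) is left to the caller.

```c
#define _GNU_SOURCE		/* for O_DIRECT */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define BLK 4096

/* Return 1 if every byte in buf is zero, 0 otherwise. */
static int is_all_zero(const unsigned char *buf, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (buf[i])
			return 0;
	return 1;
}

/*
 * Create `path` with 4k of zeros, then let a child rewrite the block with
 * O_DIRECT while the parent reads it back `rounds` times.
 * Returns the number of reads that saw non-zero data, or -1 on setup
 * failure.
 */
static int race_check(const char *path, int rounds)
{
	void *mem;
	int fd, status, bad = 0;
	pid_t pid;

	assert(rounds > 0);
	if (posix_memalign(&mem, BLK, BLK))
		return -1;
	memset(mem, 0, BLK);

	fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);
	if (fd < 0 || pwrite(fd, mem, BLK, 0) != BLK || fsync(fd)) {
		if (fd >= 0)
			close(fd);
		free(mem);
		return -1;
	}

	pid = fork();
	if (pid < 0) {
		close(fd);
		free(mem);
		return -1;
	}
	if (pid == 0) {		/* writer: dio write of zeroed 4k */
		int wfd = open(path, O_WRONLY | O_DIRECT);

		if (wfd < 0)
			_exit(1);
		for (int i = 0; i < rounds; i++)
			if (pwrite(wfd, mem, BLK, 0) != BLK)
				_exit(1);
		_exit(0);
	}

	for (int i = 0; i < rounds; i++) {	/* reader: buffered 4k read */
		unsigned char rbuf[BLK];

		if (pread(fd, rbuf, BLK, 0) == BLK && !is_all_zero(rbuf, BLK))
			bad++;
	}
	waitpid(pid, &status, 0);
	close(fd);
	free(mem);
	return bad;
}
```

A non-zero return value from race_check() would correspond to the unexpected result reported here; a dio reader variant would open the file with O_DIRECT and use an aligned buffer.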


Can we check outstanding write bios instead of abusing locks?


I didn't figure out a way to solve this w/o a lock, because:
- write bios can be issued after the check for outstanding write bios,
which results in the race.
- once read() detects that there are outstanding write bios, we need to
delay the read flow rather than fail it, right? It looks like using a lock
is more proper here?

Any suggestion?

Thanks,





Thanks,





Fixes: f847c699cff3 ("f2fs: allow out-place-update for direct IO in LFS mode")
Signed-off-by: Chao Yu 
---
v2:
- fix to cover dio read path w/ i_opu_rwsem as well.
   fs/f2fs/f2fs.h  |  1 +
   fs/f2fs/file.c  | 28 ++++++++++++++++++++++++++--
   fs/f2fs/super.c |  1 +
   3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 30058e16a5d0..91cf4b3d6bc6 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -847,6 +847,7 @@ struct f2fs_inode_info {
/* avoid racing between foreground op and gc */
struct f2fs_rwsem i_gc_rwsem[2];
struct f2fs_rwsem i_xattr_sem; /* avoid racing between reading and changing EAs */
+ struct f2fs_rwsem i_opu_rwsem;  /* avoid racing between buf read and opu dio write */

int i_extra_isize;  /* size of extra space located in i_addr */
kprojid_t i_projid; /* id for project quota */
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 72ce1a522fb2..4ec260af321f 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -4445,6 +4445,7 @@ static ssize_t f2fs_dio_read_iter(struct kiocb *iocb, struct iov_iter *to)
const loff_t pos = iocb->ki_pos;
const size_t count = iov_iter_count(to);
struct iomap_dio *dio;
+ bool do_opu = f2fs_lfs_mode(sbi);
ssize_t ret;

if (count == 0)
@@ -4457,8 +4458,14 @@ static ssize_t f2fs_dio_read_iter(struct kiocb *iocb, struct iov_iter *to)
ret = -EAGAIN;
goto out;
}
+ if (do_opu && !f2fs_down_read_trylock(&fi->i_opu_rwsem)) {
+ f2fs_up_read(&fi->i_gc_rwsem[READ]);
+ ret = -EAGAIN;
+ goto out;
+ }
} else {
f2fs_down_read(&fi->i_gc_rwsem[READ]);
+ f2fs_down_read(&fi->i_opu_rwsem);
}

/*
@@ -4477,6 +4484,7 @@ static ssize_t f2fs_dio_read_iter(struct kiocb *iocb, struct iov_iter *to)
ret = iomap_dio_complete(dio);
}

+ f2fs_up_read(&fi->i_opu_rwsem);
f2fs_up_read(&fi->i_gc_rwsem[READ]);

file_accessed(file);
@@ -4523,7 +4531,13 @@ static ssize_t f2fs_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
if (f2fs_should_use_dio(inode, iocb, to)) {
ret = f2fs_dio_read_iter(iocb, to);
} else {
+ bool do_opu = f2fs_lfs_mode(F2FS_I_SB(inode));
+
+ if (do_opu)
+ f2fs_down_read(&F2FS_I(inode)->i_opu_rwsem);
ret = filemap_read(iocb, to, 0);
+ if (do_opu)
+ f2fs_up_read(&F2FS_I(inode)->i_opu_rwsem);
if (ret > 0)
f2fs_update_iostat(F2FS_I_SB(inode), inode,
APP_BUFFERED_READ_IO, ret);
@@ -4748,14 +4762,22 @@ static ssize_t f2fs_dio_write

Re: [f2fs-dev] [PATCH 3/3] f2fs: fix to do sanity check on i_nid for inline_data inode

2024-05-15 Thread Chao Yu

On 2024/5/15 12:39, Jaegeuk Kim wrote:

On 05/15, Chao Yu wrote:

On 2024/5/15 0:07, Jaegeuk Kim wrote:



On 05/11, Chao Yu wrote:

On 2024/5/11 8:38, Jaegeuk Kim wrote:

On 05/10, Chao Yu wrote:

On 2024/5/10 11:36, Jaegeuk Kim wrote:

On 05/10, Chao Yu wrote:

On 2024/5/9 23:52, Jaegeuk Kim wrote:

On 05/06, Chao Yu wrote:

syzbot reports a f2fs bug as below:

------------[ cut here ]------------
kernel BUG at fs/f2fs/inline.c:258!
CPU: 1 PID: 34 Comm: kworker/u8:2 Not tainted 6.9.0-rc6-syzkaller-00012-g9e4bc4bcae01 #0
RIP: 0010:f2fs_write_inline_data+0x781/0x790 fs/f2fs/inline.c:258
Call Trace:
  f2fs_write_single_data_page+0xb65/0x1d60 fs/f2fs/data.c:2834
  f2fs_write_cache_pages fs/f2fs/data.c:3133 [inline]
  __f2fs_write_data_pages fs/f2fs/data.c:3288 [inline]
  f2fs_write_data_pages+0x1efe/0x3a90 fs/f2fs/data.c:3315
  do_writepages+0x35b/0x870 mm/page-writeback.c:2612
  __writeback_single_inode+0x165/0x10b0 fs/fs-writeback.c:1650
  writeback_sb_inodes+0x905/0x1260 fs/fs-writeback.c:1941
  wb_writeback+0x457/0xce0 fs/fs-writeback.c:2117
  wb_do_writeback fs/fs-writeback.c:2264 [inline]
  wb_workfn+0x410/0x1090 fs/fs-writeback.c:2304
  process_one_work kernel/workqueue.c:3254 [inline]
  process_scheduled_works+0xa12/0x17c0 kernel/workqueue.c:3335
  worker_thread+0x86d/0xd70 kernel/workqueue.c:3416
  kthread+0x2f2/0x390 kernel/kthread.c:388
  ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:147
  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

The root cause is: inline_data inode can be fuzzed, so that there may
be valid blkaddr in its direct node, once f2fs triggers background GC
to migrate the block, it will hit f2fs_bug_on() during dirty page
writeback.

Let's add sanity check on i_nid field for inline_data inode, meanwhile,
forbid to migrate inline_data inode's data block to fix this issue.

Reported-by: syzbot+848062ba19c8782ca...@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-f2fs-devel/d103ce06174d7...@google.com
Signed-off-by: Chao Yu 
---
  fs/f2fs/f2fs.h   |  2 +-
  fs/f2fs/gc.c     |  6 ++++++
  fs/f2fs/inline.c | 17 ++++++++++++++++-
  fs/f2fs/inode.c  |  2 +-
  4 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index fced2b7652f4..c876813b5532 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -4146,7 +4146,7 @@ extern struct kmem_cache *f2fs_inode_entry_slab;
   * inline.c
   */
  bool f2fs_may_inline_data(struct inode *inode);
-bool f2fs_sanity_check_inline_data(struct inode *inode);
+bool f2fs_sanity_check_inline_data(struct inode *inode, struct page *ipage);
  bool f2fs_may_inline_dentry(struct inode *inode);
  void f2fs_do_read_inline_data(struct page *page, struct page *ipage);
  void f2fs_truncate_inline_inode(struct inode *inode,
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index e86c7f01539a..041957750478 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1563,6 +1563,12 @@ static int gc_data_segment(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
continue;
}
+ if (f2fs_has_inline_data(inode)) {
+ iput(inode);
+ set_sbi_flag(sbi, SBI_NEED_FSCK);
+ continue;


Any race condition that could make this a false alarm?


Since there is no reproducer for the bug, I suspect it was caused by metadata
fuzzing, something like this:

- inline inode has one valid blkaddr in i_addr or in dnode reference by i_nid;
- SIT/SSA entry of the block is valid;
- background GC migrates the block;
- kworker writeback it, and trigger the bug_on().


Wasn't detected by sanity_check_inode?


I fuzzed non-inline inode w/ below metadata fields:
- i_blocks = 1
- i_size = 2048
- i_inline |= 0x02

sanity_check_inode() doesn't complain.


I mean, the below sanity_check_inode() can cover the fuzzed case? I'm wondering


I didn't figure out a generic way in sanity_check_inode() to catch all fuzzed cases.



The patch described:
   "The root cause is: inline_data inode can be fuzzed, so that there may
   be valid blkaddr in its direct node, once f2fs triggers background GC
   to migrate the block, it will hit f2fs_bug_on() during dirty page
   writeback."

Do you suspect the node block address was suddenly assigned after f2fs_iget()?


No, I suspect that the image was fuzzed by tools offline, not in runtime after
mount().


Otherwise, it looks checking them in sanity_check_inode would be enough.



e.g.
case #1
- blkaddr, its dnode, SSA and SIT are consistent
- dnode.footer.ino points to inline inode
- inline inode doesn't link to the dnode

Something like fuzzed special file, please check details in below commit:

9056d6489f5a ("f2fs: fix to do sanity check on inode type during garbage collection")

case #2
- blkaddr, its dnode, SSA an

Re: [f2fs-dev] [PATCH] f2fs: Add inline to f2fs_build_fault_attr() stub

2024-05-14 Thread Chao Yu

On 2024/5/13 23:40, Nathan Chancellor wrote:

When building without CONFIG_F2FS_FAULT_INJECTION, there is a warning
from each file that includes f2fs.h because the stub for
f2fs_build_fault_attr() is missing inline:

   In file included from fs/f2fs/segment.c:21:
   fs/f2fs/f2fs.h:4605:12: warning: 'f2fs_build_fault_attr' defined but not used [-Wunused-function]
    4605 | static int f2fs_build_fault_attr(struct f2fs_sb_info *sbi, unsigned long rate,
         |            ^

Add the missing inline to resolve all of the warnings for this
configuration.
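
The stub pattern at issue can be shown with a small header-style fragment. The names are placeholders chosen for illustration: FEATURE_FAULT_INJECTION stands in for CONFIG_F2FS_FAULT_INJECTION and build_fault_attr() for the real helper.

```c
/*
 * Hypothetical header fragment illustrating the config-dependent stub.
 */
#ifdef FEATURE_FAULT_INJECTION
int build_fault_attr(unsigned long rate, unsigned long type);
#else
/*
 * Stub for builds without the feature. `static inline` lets every file
 * include the header without -Wunused-function firing when the stub is
 * never called; a plain `static` definition warns in each such file.
 */
static inline int build_fault_attr(unsigned long rate, unsigned long type)
{
	(void)rate;
	(void)type;
	return 0;
}
#endif
```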

Fixes: 4ed886b187f4 ("f2fs: check validation of fault attrs in f2fs_build_fault_attr()")
Signed-off-by: Nathan Chancellor 


Reviewed-by: Chao Yu 

Thanks,




Re: [f2fs-dev] [PATCH] f2fs: initialize last_block_in_bio variable

2024-05-14 Thread Chao Yu

On 2024/5/14 19:35, Wu Bo wrote:

Initialize last_block_in_bio of struct f2fs_bio_info and clean up code.

Signed-off-by: Wu Bo 


Reviewed-by: Chao Yu 

Thanks,




Re: [f2fs-dev] [PATCH v2] f2fs: fix to avoid racing in between read and OPU dio write

2024-05-14 Thread Chao Yu

On 2024/5/15 0:09, Jaegeuk Kim wrote:

On 05/10, Chao Yu wrote:

If lfs mode is on, a buffered read may race with an OPU dio write as
shown below, which may cause the buffered read to hit unwritten data
unexpectedly; the same race condition exists for dio reads.

Thread A                                Thread B
- f2fs_file_write_iter
 - f2fs_dio_write_iter
  - __iomap_dio_rw
   - f2fs_iomap_begin
    - f2fs_map_blocks
     - __allocate_data_block
      - allocated blkaddr #x
       - iomap_dio_submit_bio
                                        - f2fs_file_read_iter
                                         - filemap_read
                                          - f2fs_read_data_folio
                                           - f2fs_mpage_readpages
                                            - f2fs_map_blocks
                                             : get blkaddr #x
                                            - f2fs_submit_read_bio
                                        IRQ
                                        - f2fs_read_end_io
                                         : read IO on blkaddr #x complete
IRQ
- iomap_dio_bio_end_io
 : direct write IO on blkaddr #x complete

This patch introduces a new per-inode i_opu_rwsem lock to avoid
such race condition.
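
The exclusion idea can be sketched in user space as follows. This is a
single-threaded toy model, not the kernel's f2fs_rwsem: toy_rwsem and all
toy_* helpers are hypothetical. It only shows why a reader that tries the
lock while an OPU-style writer holds it gets -EAGAIN instead of looking up a
block whose write IO has not completed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy reader/writer lock state: many readers or one writer. */
struct toy_rwsem {
	int readers;
	bool writer;
};

static bool toy_down_read_trylock(struct toy_rwsem *s)
{
	if (s->writer)
		return false;
	s->readers++;
	return true;
}

static void toy_up_read(struct toy_rwsem *s)
{
	assert(s->readers > 0);
	s->readers--;
}

static bool toy_down_write_trylock(struct toy_rwsem *s)
{
	if (s->writer || s->readers > 0)
		return false;
	s->writer = true;
	return true;
}

static void toy_up_write(struct toy_rwsem *s)
{
	assert(s->writer);
	s->writer = false;
}

/* NOWAIT-style read: 0 on success, -EAGAIN while a writer is in flight. */
static int toy_read_nowait(struct toy_rwsem *opu)
{
	if (!toy_down_read_trylock(opu))
		return -EAGAIN;
	/* block lookup and read IO would happen here */
	toy_up_read(opu);
	return 0;
}
```

In this model the writer would hold the lock from block allocation until its
write IO completes, so a concurrent toy_read_nowait() fails with -EAGAIN
during that window and succeeds once toy_up_write() has been called.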


Wasn't this supposed to be managed by user-land?


Actually, the test case is:

1. mount w/ lfs mode
2. touch file;
3. initialize file w/ 4k zeroed data; fsync;
4. continue triggering dio write 4k zeroed data to file;
5. and meanwhile, continue triggering buf/dio 4k read in file,
use md5sum to verify the 4k data;

The data is expected to be all zero; however, it turned out not to be.
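
The reader side of step 5 could be sketched roughly as below. This is an
assumption-laden illustration rather than the actual test: it checks the block
byte by byte instead of using md5sum, it uses a plain buffered pread() (a dio
variant would additionally need O_DIRECT and an aligned buffer), and
toy_block_is_zero(), toy_verify_zero_block() and TOY_BLOCK_SIZE are
hypothetical names.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#define TOY_BLOCK_SIZE 4096

/* Return true if every byte in buf[0..len) is zero. */
static bool toy_block_is_zero(const unsigned char *buf, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		if (buf[i] != 0)
			return false;
	}
	return true;
}

/* Read one 4k block at offset and verify it.
 * Returns 1 if all zero, 0 if non-zero data was seen,
 * -1 on error or short read. */
static int toy_verify_zero_block(int fd, off_t offset)
{
	unsigned char buf[TOY_BLOCK_SIZE];
	ssize_t ret;

	assert(fd >= 0);
	ret = pread(fd, buf, sizeof(buf), offset);
	if (ret != (ssize_t)sizeof(buf))
		return -1;
	if (!toy_block_is_zero(buf, sizeof(buf))) {
		fprintf(stderr, "non-zero data at offset %lld\n",
			(long long)offset);
		return 0;
	}
	return 1;
}
```

A test loop would call toy_verify_zero_block(fd, 0) repeatedly while another
process keeps issuing 4k dio writes of zeroed data to the same file; a return
value of 0 would correspond to the unexpected result described above.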

Thanks,





Fixes: f847c699cff3 ("f2fs: allow out-place-update for direct IO in LFS mode")
Signed-off-by: Chao Yu 
---
v2:
- fix to cover dio read path w/ i_opu_rwsem as well.
  fs/f2fs/f2fs.h  |  1 +
  fs/f2fs/file.c  | 28 ++--
  fs/f2fs/super.c |  1 +
  3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 30058e16a5d0..91cf4b3d6bc6 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -847,6 +847,7 @@ struct f2fs_inode_info {
   /* avoid racing between foreground op and gc */
   struct f2fs_rwsem i_gc_rwsem[2];
   struct f2fs_rwsem i_xattr_sem; /* avoid racing between reading and 
changing EAs */
+ struct f2fs_rwsem i_opu_rwsem;  /* avoid racing between buf read and opu 
dio write */

   int i_extra_isize;  /* size of extra space located in i_addr 
*/
   kprojid_t i_projid; /* id for project quota */
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 72ce1a522fb2..4ec260af321f 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -4445,6 +4445,7 @@ static ssize_t f2fs_dio_read_iter(struct kiocb *iocb, 
struct iov_iter *to)
   const loff_t pos = iocb->ki_pos;
   const size_t count = iov_iter_count(to);
   struct iomap_dio *dio;
+ bool do_opu = f2fs_lfs_mode(sbi);
   ssize_t ret;

   if (count == 0)
@@ -4457,8 +4458,14 @@ static ssize_t f2fs_dio_read_iter(struct kiocb *iocb, 
struct iov_iter *to)
   ret = -EAGAIN;
   goto out;
   }
+ if (do_opu && !f2fs_down_read_trylock(&fi->i_opu_rwsem)) {
+ f2fs_up_read(&fi->i_gc_rwsem[READ]);
+ ret = -EAGAIN;
+ goto out;
+ }
   } else {
   f2fs_down_read(&fi->i_gc_rwsem[READ]);
+ f2fs_down_read(&fi->i_opu_rwsem);
   }

   /*
@@ -4477,6 +4484,7 @@ static ssize_t f2fs_dio_read_iter(struct kiocb *iocb, 
struct iov_iter *to)
   ret = iomap_dio_complete(dio);
   }

+ f2fs_up_read(&fi->i_opu_rwsem);
   f2fs_up_read(&fi->i_gc_rwsem[READ]);

   file_accessed(file);
@@ -4523,7 +4531,13 @@ static ssize_t f2fs_file_read_iter(struct kiocb *iocb, 
struct iov_iter *to)
   if (f2fs_should_use_dio(inode, iocb, to)) {
   ret = f2fs_dio_read_iter(iocb, to);
   } else {
+ bool do_opu = f2fs_lfs_mode(F2FS_I_SB(inode));
+
+ if (do_opu)
+ f2fs_down_read(&F2FS_I(inode)->i_opu_rwsem);
   ret = filemap_read(iocb, to, 0);
+ if (do_opu)
+ f2fs_up_read(&F2FS_I(inode)->i_opu_rwsem);
   if (ret > 0)
   f2fs_update_iostat(F2FS_I_SB(inode), inode,
   APP_BUFFERED_READ_IO, ret);
@@ -4748,14 +4762,22 @@ static ssize_t f2fs_dio_write_iter(struct kiocb *iocb, 
struct iov_iter *from,
   ret = -EAGAIN;
   goto out;
   }
+ if (do_opu && !f2fs_down_write_trylock(&fi->i_opu_rwsem)) {
+ f2fs_up_read(&fi->i_gc_rwsem[READ]);
+ f2fs_up_read(&fi->i_gc_rwsem[WRITE]);
+ ret = -EAGAIN;
+ goto out;
+ }
   } else {
   ret = f2fs_convert_inline_inode(inode);
   if (ret)
  

Re: [f2fs-dev] [PATCH 3/3] f2fs: fix to do sanity check on i_nid for inline_data inode

2024-05-14 Thread Chao Yu

On 2024/5/15 0:07, Jaegeuk Kim wrote:



On 05/11, Chao Yu wrote:

On 2024/5/11 8:38, Jaegeuk Kim wrote:

On 05/10, Chao Yu wrote:

On 2024/5/10 11:36, Jaegeuk Kim wrote:

On 05/10, Chao Yu wrote:

On 2024/5/9 23:52, Jaegeuk Kim wrote:

On 05/06, Chao Yu wrote:

syzbot reports a f2fs bug as below:

[ cut here ]
kernel BUG at fs/f2fs/inline.c:258!
CPU: 1 PID: 34 Comm: kworker/u8:2 Not tainted 
6.9.0-rc6-syzkaller-00012-g9e4bc4bcae01 #0
RIP: 0010:f2fs_write_inline_data+0x781/0x790 fs/f2fs/inline.c:258
Call Trace:
 f2fs_write_single_data_page+0xb65/0x1d60 fs/f2fs/data.c:2834
 f2fs_write_cache_pages fs/f2fs/data.c:3133 [inline]
 __f2fs_write_data_pages fs/f2fs/data.c:3288 [inline]
 f2fs_write_data_pages+0x1efe/0x3a90 fs/f2fs/data.c:3315
 do_writepages+0x35b/0x870 mm/page-writeback.c:2612
 __writeback_single_inode+0x165/0x10b0 fs/fs-writeback.c:1650
 writeback_sb_inodes+0x905/0x1260 fs/fs-writeback.c:1941
 wb_writeback+0x457/0xce0 fs/fs-writeback.c:2117
 wb_do_writeback fs/fs-writeback.c:2264 [inline]
 wb_workfn+0x410/0x1090 fs/fs-writeback.c:2304
 process_one_work kernel/workqueue.c:3254 [inline]
 process_scheduled_works+0xa12/0x17c0 kernel/workqueue.c:3335
 worker_thread+0x86d/0xd70 kernel/workqueue.c:3416
 kthread+0x2f2/0x390 kernel/kthread.c:388
 ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

The root cause is: inline_data inode can be fuzzed, so that there may
be valid blkaddr in its direct node, once f2fs triggers background GC
to migrate the block, it will hit f2fs_bug_on() during dirty page
writeback.

Let's add sanity check on i_nid field for inline_data inode, meanwhile,
forbid to migrate inline_data inode's data block to fix this issue.

Reported-by: syzbot+848062ba19c8782ca...@syzkaller.appspotmail.com
Closes: 
https://lore.kernel.org/linux-f2fs-devel/d103ce06174d7...@google.com
Signed-off-by: Chao Yu 
---
 fs/f2fs/f2fs.h   |  2 +-
 fs/f2fs/gc.c |  6 ++
 fs/f2fs/inline.c | 17 -
 fs/f2fs/inode.c  |  2 +-
 4 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index fced2b7652f4..c876813b5532 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -4146,7 +4146,7 @@ extern struct kmem_cache *f2fs_inode_entry_slab;
  * inline.c
  */
 bool f2fs_may_inline_data(struct inode *inode);
-bool f2fs_sanity_check_inline_data(struct inode *inode);
+bool f2fs_sanity_check_inline_data(struct inode *inode, struct page *ipage);
 bool f2fs_may_inline_dentry(struct inode *inode);
 void f2fs_do_read_inline_data(struct page *page, struct page *ipage);
 void f2fs_truncate_inline_inode(struct inode *inode,
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index e86c7f01539a..041957750478 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1563,6 +1563,12 @@ static int gc_data_segment(struct f2fs_sb_info *sbi, 
struct f2fs_summary *sum,
   continue;
   }
+ if (f2fs_has_inline_data(inode)) {
+ iput(inode);
+ set_sbi_flag(sbi, SBI_NEED_FSCK);
+ continue;


Is there any race condition that could raise this as a false alarm?


Since there is no reproducer for the bug, I suspect it was caused by metadata
fuzzing, something like this:

- inline inode has one valid blkaddr in i_addr or in a dnode referenced by i_nid;
- SIT/SSA entry of the block is valid;
- background GC migrates the block;
- kworker writes it back and triggers the bug_on().


Wasn't detected by sanity_check_inode?


I fuzzed non-inline inode w/ below metadata fields:
- i_blocks = 1
- i_size = 2048
- i_inline |= 0x02

sanity_check_inode() doesn't complain.


I mean, the below sanity_check_inode() can cover the fuzzed case? I'm wondering


I didn't figure out a generic way in sanity_check_inode() to catch all fuzzed 
cases.



The patch described:
  "The root cause is: inline_data inode can be fuzzed, so that there may
  be valid blkaddr in its direct node, once f2fs triggers background GC
  to migrate the block, it will hit f2fs_bug_on() during dirty page
  writeback."

Do you suspect the node block address was suddenly assigned after f2fs_iget()?


No, I suspect that the image was fuzzed by tools offline, not in runtime after
mount().


Otherwise, it looks like checking them in sanity_check_inode() would be enough.



e.g.
case #1
- blkaddr, its dnode, SSA and SIT are consistent
- dnode.footer.ino points to inline inode
- inline inode doesn't link to the dnode

Something like a fuzzed special file; please check the details in the commit below:

9056d6489f5a ("f2fs: fix to do sanity check on inode type during garbage 
collection")

case #2
- blkaddr, its dnode, SSA and SIT are consistent
- blkaddr locates in inline inode's i_addr


The image status is something like 

Re: [f2fs-dev] [PATCH 3/3] f2fs: fix to do sanity check on i_nid for inline_data inode

2024-05-10 Thread Chao Yu

On 2024/5/11 8:38, Jaegeuk Kim wrote:

On 05/10, Chao Yu wrote:

On 2024/5/10 11:36, Jaegeuk Kim wrote:

On 05/10, Chao Yu wrote:

On 2024/5/9 23:52, Jaegeuk Kim wrote:

On 05/06, Chao Yu wrote:

syzbot reports a f2fs bug as below:

[ cut here ]
kernel BUG at fs/f2fs/inline.c:258!
CPU: 1 PID: 34 Comm: kworker/u8:2 Not tainted 
6.9.0-rc6-syzkaller-00012-g9e4bc4bcae01 #0
RIP: 0010:f2fs_write_inline_data+0x781/0x790 fs/f2fs/inline.c:258
Call Trace:
f2fs_write_single_data_page+0xb65/0x1d60 fs/f2fs/data.c:2834
f2fs_write_cache_pages fs/f2fs/data.c:3133 [inline]
__f2fs_write_data_pages fs/f2fs/data.c:3288 [inline]
f2fs_write_data_pages+0x1efe/0x3a90 fs/f2fs/data.c:3315
do_writepages+0x35b/0x870 mm/page-writeback.c:2612
__writeback_single_inode+0x165/0x10b0 fs/fs-writeback.c:1650
writeback_sb_inodes+0x905/0x1260 fs/fs-writeback.c:1941
wb_writeback+0x457/0xce0 fs/fs-writeback.c:2117
wb_do_writeback fs/fs-writeback.c:2264 [inline]
wb_workfn+0x410/0x1090 fs/fs-writeback.c:2304
process_one_work kernel/workqueue.c:3254 [inline]
process_scheduled_works+0xa12/0x17c0 kernel/workqueue.c:3335
worker_thread+0x86d/0xd70 kernel/workqueue.c:3416
kthread+0x2f2/0x390 kernel/kthread.c:388
ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

The root cause is: inline_data inode can be fuzzed, so that there may
be valid blkaddr in its direct node, once f2fs triggers background GC
to migrate the block, it will hit f2fs_bug_on() during dirty page
writeback.

Let's add sanity check on i_nid field for inline_data inode, meanwhile,
forbid to migrate inline_data inode's data block to fix this issue.

Reported-by: syzbot+848062ba19c8782ca...@syzkaller.appspotmail.com
Closes: 
https://lore.kernel.org/linux-f2fs-devel/d103ce06174d7...@google.com
Signed-off-by: Chao Yu 
---
fs/f2fs/f2fs.h   |  2 +-
fs/f2fs/gc.c |  6 ++
fs/f2fs/inline.c | 17 -
fs/f2fs/inode.c  |  2 +-
4 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index fced2b7652f4..c876813b5532 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -4146,7 +4146,7 @@ extern struct kmem_cache *f2fs_inode_entry_slab;
 * inline.c
 */
bool f2fs_may_inline_data(struct inode *inode);
-bool f2fs_sanity_check_inline_data(struct inode *inode);
+bool f2fs_sanity_check_inline_data(struct inode *inode, struct page *ipage);
bool f2fs_may_inline_dentry(struct inode *inode);
void f2fs_do_read_inline_data(struct page *page, struct page *ipage);
void f2fs_truncate_inline_inode(struct inode *inode,
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index e86c7f01539a..041957750478 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1563,6 +1563,12 @@ static int gc_data_segment(struct f2fs_sb_info *sbi, 
struct f2fs_summary *sum,
continue;
}
+   if (f2fs_has_inline_data(inode)) {
+   iput(inode);
+   set_sbi_flag(sbi, SBI_NEED_FSCK);
+   continue;


Is there any race condition that could raise this as a false alarm?


Since there is no reproducer for the bug, I suspect it was caused by metadata
fuzzing, something like this:

- inline inode has one valid blkaddr in i_addr or in a dnode referenced by i_nid;
- SIT/SSA entry of the block is valid;
- background GC migrates the block;
- kworker writes it back and triggers the bug_on().


Wasn't detected by sanity_check_inode?


I fuzzed non-inline inode w/ below metadata fields:
- i_blocks = 1
- i_size = 2048
- i_inline |= 0x02

sanity_check_inode() doesn't complain.


I mean, the below sanity_check_inode() can cover the fuzzed case? I'm wondering


I didn't figure out a generic way in sanity_check_inode() to catch all fuzzed 
cases.

e.g.
case #1
- blkaddr, its dnode, SSA and SIT are consistent
- dnode.footer.ino points to inline inode
- inline inode doesn't link to the dnode

Something like a fuzzed special file; please check the details in the commit below:

9056d6489f5a ("f2fs: fix to do sanity check on inode type during garbage 
collection")

case #2
- blkaddr, its dnode, SSA and SIT are consistent
- blkaddr locates in inline inode's i_addr

Thanks,


whether we really need to check it in the gc path.



Thanks,





Thoughts?

Thanks,




+   }
+
err = f2fs_gc_pinned_control(inode, gc_type, segno);
if (err == -EAGAIN) {
iput(inode);
diff --git a/fs/f2fs/inline.c b/fs/f2fs/inline.c
index ac00423f117b..067600fed3d4 100644
--- a/fs/f2fs/inline.c
+++ b/fs/f2fs/inline.c
@@ -33,11 +33,26 @@ bool f2fs_may_inline_data(struct inode *inode)
return !f2fs_post_read_required(inode);
}
-bool f2fs_sanity_check_inline_data(struct inode *inode)
+static bool has_node_blocks(st

Re: [f2fs-dev] [PATCH 3/3] f2fs: fix to do sanity check on i_nid for inline_data inode

2024-05-10 Thread Chao Yu

On 2024/5/10 11:36, Jaegeuk Kim wrote:

On 05/10, Chao Yu wrote:

On 2024/5/9 23:52, Jaegeuk Kim wrote:

On 05/06, Chao Yu wrote:

syzbot reports a f2fs bug as below:

[ cut here ]
kernel BUG at fs/f2fs/inline.c:258!
CPU: 1 PID: 34 Comm: kworker/u8:2 Not tainted 
6.9.0-rc6-syzkaller-00012-g9e4bc4bcae01 #0
RIP: 0010:f2fs_write_inline_data+0x781/0x790 fs/f2fs/inline.c:258
Call Trace:
   f2fs_write_single_data_page+0xb65/0x1d60 fs/f2fs/data.c:2834
   f2fs_write_cache_pages fs/f2fs/data.c:3133 [inline]
   __f2fs_write_data_pages fs/f2fs/data.c:3288 [inline]
   f2fs_write_data_pages+0x1efe/0x3a90 fs/f2fs/data.c:3315
   do_writepages+0x35b/0x870 mm/page-writeback.c:2612
   __writeback_single_inode+0x165/0x10b0 fs/fs-writeback.c:1650
   writeback_sb_inodes+0x905/0x1260 fs/fs-writeback.c:1941
   wb_writeback+0x457/0xce0 fs/fs-writeback.c:2117
   wb_do_writeback fs/fs-writeback.c:2264 [inline]
   wb_workfn+0x410/0x1090 fs/fs-writeback.c:2304
   process_one_work kernel/workqueue.c:3254 [inline]
   process_scheduled_works+0xa12/0x17c0 kernel/workqueue.c:3335
   worker_thread+0x86d/0xd70 kernel/workqueue.c:3416
   kthread+0x2f2/0x390 kernel/kthread.c:388
   ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:147
   ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

The root cause is: inline_data inode can be fuzzed, so that there may
be valid blkaddr in its direct node, once f2fs triggers background GC
to migrate the block, it will hit f2fs_bug_on() during dirty page
writeback.

Let's add sanity check on i_nid field for inline_data inode, meanwhile,
forbid to migrate inline_data inode's data block to fix this issue.

Reported-by: syzbot+848062ba19c8782ca...@syzkaller.appspotmail.com
Closes: 
https://lore.kernel.org/linux-f2fs-devel/d103ce06174d7...@google.com
Signed-off-by: Chao Yu 
---
   fs/f2fs/f2fs.h   |  2 +-
   fs/f2fs/gc.c |  6 ++
   fs/f2fs/inline.c | 17 -
   fs/f2fs/inode.c  |  2 +-
   4 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index fced2b7652f4..c876813b5532 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -4146,7 +4146,7 @@ extern struct kmem_cache *f2fs_inode_entry_slab;
* inline.c
*/
   bool f2fs_may_inline_data(struct inode *inode);
-bool f2fs_sanity_check_inline_data(struct inode *inode);
+bool f2fs_sanity_check_inline_data(struct inode *inode, struct page *ipage);
   bool f2fs_may_inline_dentry(struct inode *inode);
   void f2fs_do_read_inline_data(struct page *page, struct page *ipage);
   void f2fs_truncate_inline_inode(struct inode *inode,
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index e86c7f01539a..041957750478 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1563,6 +1563,12 @@ static int gc_data_segment(struct f2fs_sb_info *sbi, 
struct f2fs_summary *sum,
continue;
}
+   if (f2fs_has_inline_data(inode)) {
+   iput(inode);
+   set_sbi_flag(sbi, SBI_NEED_FSCK);
+   continue;


Is there any race condition that could raise this as a false alarm?


Since there is no reproducer for the bug, I suspect it was caused by metadata
fuzzing, something like this:

- inline inode has one valid blkaddr in i_addr or in a dnode referenced by i_nid;
- SIT/SSA entry of the block is valid;
- background GC migrates the block;
- kworker writes it back and triggers the bug_on().


Wasn't detected by sanity_check_inode?


I fuzzed non-inline inode w/ below metadata fields:
- i_blocks = 1
- i_size = 2048
- i_inline |= 0x02

sanity_check_inode() doesn't complain.

Thanks,





Thoughts?

Thanks,




+   }
+
err = f2fs_gc_pinned_control(inode, gc_type, segno);
if (err == -EAGAIN) {
iput(inode);
diff --git a/fs/f2fs/inline.c b/fs/f2fs/inline.c
index ac00423f117b..067600fed3d4 100644
--- a/fs/f2fs/inline.c
+++ b/fs/f2fs/inline.c
@@ -33,11 +33,26 @@ bool f2fs_may_inline_data(struct inode *inode)
return !f2fs_post_read_required(inode);
   }
-bool f2fs_sanity_check_inline_data(struct inode *inode)
+static bool has_node_blocks(struct inode *inode, struct page *ipage)
+{
+   struct f2fs_inode *ri = F2FS_INODE(ipage);
+   int i;
+
+   for (i = 0; i < DEF_NIDS_PER_INODE; i++) {
+   if (ri->i_nid[i])
+   return true;
+   }
+   return false;
+}
+
+bool f2fs_sanity_check_inline_data(struct inode *inode, struct page *ipage)
   {
if (!f2fs_has_inline_data(inode))
return false;
+   if (has_node_blocks(inode, ipage))
+   return false;
+
if (!support_inline_data(inode))
return true;
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index c26effdce9aa..1423cd27a477 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -343,7 +343,7 @@ static bool sanity_check

[f2fs-dev] [PATCH v2 2/3] f2fs: fix to add missing iput() in gc_data_segment()

2024-05-09 Thread Chao Yu
In gc_data_segment(), if the inode state is abnormal, the code fails to call
iput(); fix it.

Fixes: b73e52824c89 ("f2fs: reposition unlock_new_inode to prevent accessing 
invalid inode")
Fixes: 9056d6489f5a ("f2fs: fix to do sanity check on inode type during garbage 
collection")
Signed-off-by: Chao Yu 
---
v2:
- fix wrong fixes tag line.
 fs/f2fs/gc.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index ac4cbbe50c2f..6066c6eecf41 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1554,10 +1554,15 @@ static int gc_data_segment(struct f2fs_sb_info *sbi, 
struct f2fs_summary *sum,
int err;
 
inode = f2fs_iget(sb, dni.ino);
-   if (IS_ERR(inode) || is_bad_inode(inode) ||
-   special_file(inode->i_mode))
+   if (IS_ERR(inode))
continue;
 
+   if (is_bad_inode(inode) ||
+   special_file(inode->i_mode)) {
+   iput(inode);
+   continue;
+   }
+
err = f2fs_gc_pinned_control(inode, gc_type, segno);
if (err == -EAGAIN) {
iput(inode);
-- 
2.40.1





Re: [f2fs-dev] [PATCH 2/3] f2fs: fix to add missing iput() in gc_data_segment()

2024-05-09 Thread Chao Yu

On 2024/5/9 10:49, Chao Yu wrote:

On 2024/5/9 8:46, Jaegeuk Kim wrote:

On 05/06, Chao Yu wrote:

In gc_data_segment(), if the inode state is abnormal, the code fails to call
iput(); fix it.

Fixes: 132e3209789c ("f2fs: remove false alarm on iget failure during GC")


Oh, this line should be replaced w/ below one, let me revise the patch.

Fixes: b73e52824c89 ("f2fs: reposition unlock_new_inode to prevent accessing invalid 
inode").

Thanks,


Fixes: 9056d6489f5a ("f2fs: fix to do sanity check on inode type during garbage 
collection")
Signed-off-by: Chao Yu 
---
  fs/f2fs/gc.c | 9 +++--
  1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 8852814dab7f..e86c7f01539a 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1554,10 +1554,15 @@ static int gc_data_segment(struct f2fs_sb_info *sbi, 
struct f2fs_summary *sum,
  int err;
  inode = f2fs_iget(sb, dni.ino);
-    if (IS_ERR(inode) || is_bad_inode(inode) ||
-    special_file(inode->i_mode))
+    if (IS_ERR(inode))
  continue;
+    if (is_bad_inode(inode) ||
+    special_file(inode->i_mode)) {
+    iput(inode);


iget_failed() called iput()?


It looks like the bad inode was referenced in this context, so it needs to be
iput()ed here.

The bad inode was made in another thread; please check the description in commit
b73e52824c89 ("f2fs: reposition unlock_new_inode to prevent accessing invalid
inode").
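
The reference leak under discussion can be illustrated with a toy refcount
model. Everything here is hypothetical (toy_inode, toy_iget(), toy_iput(),
toy_gc_pass()) and deliberately much simpler than the VFS: the point is only
that a "continue" taken after a successful get, without a matching put, leaves
the count elevated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_inode {
	int refcount;
	bool bad;	/* stands in for is_bad_inode()/special_file() */
};

/* Take a reference; in this toy model the lookup always succeeds. */
static struct toy_inode *toy_iget(struct toy_inode *inode)
{
	inode->refcount++;
	return inode;
}

/* Drop a reference taken by toy_iget(). */
static void toy_iput(struct toy_inode *inode)
{
	assert(inode->refcount > 0);
	inode->refcount--;
}

/* Walk the inodes and skip abnormal ones. With fixed == false the skip
 * path forgets to drop the reference, as in the code before the patch.
 * Returns the number of inodes actually processed. */
static int toy_gc_pass(struct toy_inode *inodes, size_t n, bool fixed)
{
	int processed = 0;

	for (size_t i = 0; i < n; i++) {
		struct toy_inode *inode = toy_iget(&inodes[i]);

		if (inode->bad) {
			if (fixed)
				toy_iput(inode);
			continue;	/* reference leaks when !fixed */
		}
		processed++;
		toy_iput(inode);
	}
	return processed;
}
```

Running one pass over a bad inode with fixed == false leaves its refcount at 1,
while the fixed variant returns it to 0.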

Thanks,





+    continue;
+    }
+
  err = f2fs_gc_pinned_control(inode, gc_type, segno);
  if (err == -EAGAIN) {
  iput(inode);
--
2.40.1





[f2fs-dev] [PATCH v3 5/5] f2fs: compress: don't allow unaligned truncation on released compress inode

2024-05-09 Thread Chao Yu
f2fs image may be corrupted after below testcase:
- mkfs.f2fs -O extra_attr,compression -f /dev/vdb
- mount /dev/vdb /mnt/f2fs
- touch /mnt/f2fs/file
- f2fs_io setflags compression /mnt/f2fs/file
- dd if=/dev/zero of=/mnt/f2fs/file bs=4k count=4
- f2fs_io release_cblocks /mnt/f2fs/file
- truncate -s 8192 /mnt/f2fs/file
- umount /mnt/f2fs
- fsck.f2fs /dev/vdb

[ASSERT] (fsck_chk_inode_blk:1256)  --> ino: 0x5 has i_blocks: 0x0002, but has 0x3 blocks
[FSCK] valid_block_count matching with CP [Fail] [0x4, 0x5]
[FSCK] other corrupted bugs   [Fail]

The reason is: partial truncation assumes that the compressed inode has reserved
blocks; after partial truncation, the valid block count may change w/o
.i_blocks and .total_valid_block_count being updated, resulting in corruption.

This patch fixes it by only allowing cluster size aligned truncation on a
released compress inode.
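
As a rough illustration of the alignment rule, assuming 4 KiB blocks and a
cluster of 4 blocks (both assumptions, chosen to match the dd bs=4k count=4
step above), the check could be modelled in user space like this.
toy_check_truncate() and TOY_BLKSIZE are hypothetical names, not the kernel
helpers used in the diff below.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_BLKSIZE 4096ULL	/* assumed block size in bytes */

/* Return 0 if truncating to new_size is allowed, -EINVAL if the inode has
 * released its compressed blocks and new_size is not a multiple of the
 * cluster size in bytes. */
static int toy_check_truncate(bool released, uint64_t new_size,
			      unsigned int cluster_blocks)
{
	uint64_t cluster_bytes;

	assert(cluster_blocks > 0);
	cluster_bytes = TOY_BLKSIZE * cluster_blocks;

	if (released && (new_size % cluster_bytes) != 0)
		return -EINVAL;
	return 0;
}
```

Under these assumptions one cluster is 16384 bytes, so the "truncate -s 8192"
step from the testcase would be rejected on a released inode, while
truncating to 0 or 16384 would still be allowed.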

Fixes: c61404153eb6 ("f2fs: introduce FI_COMPRESS_RELEASED instead of using 
IMMUTABLE bit")
Signed-off-by: Chao Yu 
---
v3:
- fix typo in commit description: w/ -> w/o
 fs/f2fs/file.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 7371f485b3f7..15f4222da891 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -952,9 +952,14 @@ int f2fs_setattr(struct mnt_idmap *idmap, struct dentry 
*dentry,
  ATTR_GID | ATTR_TIMES_SET
return -EPERM;
 
-   if ((attr->ia_valid & ATTR_SIZE) &&
-   !f2fs_is_compress_backend_ready(inode))
-   return -EOPNOTSUPP;
+   if ((attr->ia_valid & ATTR_SIZE)) {
+   if (!f2fs_is_compress_backend_ready(inode))
+   return -EOPNOTSUPP;
+   if (is_inode_flag_set(inode, FI_COMPRESS_RELEASED) &&
+   !IS_ALIGNED(attr->ia_size,
+   F2FS_BLK_TO_BYTES(F2FS_I(inode)->i_cluster_size)))
+   return -EINVAL;
+   }
 
err = setattr_prepare(idmap, dentry, attr);
if (err)
-- 
2.40.1





[f2fs-dev] [PATCH v2] f2fs: fix to avoid racing in between read and OPU dio write

2024-05-09 Thread Chao Yu
If lfs mode is on, a buffered read may race w/ an OPU dio write as shown below,
which may cause the buffered read to hit unwritten data unexpectedly; the same
race condition exists for dio read as well.

Thread A                                Thread B
- f2fs_file_write_iter
 - f2fs_dio_write_iter
  - __iomap_dio_rw
   - f2fs_iomap_begin
    - f2fs_map_blocks
     - __allocate_data_block
      - allocated blkaddr #x
       - iomap_dio_submit_bio
                                        - f2fs_file_read_iter
                                         - filemap_read
                                          - f2fs_read_data_folio
                                           - f2fs_mpage_readpages
                                            - f2fs_map_blocks
                                             : get blkaddr #x
                                            - f2fs_submit_read_bio
                                        IRQ
                                        - f2fs_read_end_io
                                         : read IO on blkaddr #x complete
IRQ
- iomap_dio_bio_end_io
 : direct write IO on blkaddr #x complete

This patch introduces a new per-inode i_opu_rwsem lock to avoid
such race condition.

Fixes: f847c699cff3 ("f2fs: allow out-place-update for direct IO in LFS mode")
Signed-off-by: Chao Yu 
---
v2:
- fix to cover dio read path w/ i_opu_rwsem as well.
 fs/f2fs/f2fs.h  |  1 +
 fs/f2fs/file.c  | 28 ++--
 fs/f2fs/super.c |  1 +
 3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 30058e16a5d0..91cf4b3d6bc6 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -847,6 +847,7 @@ struct f2fs_inode_info {
/* avoid racing between foreground op and gc */
struct f2fs_rwsem i_gc_rwsem[2];
struct f2fs_rwsem i_xattr_sem; /* avoid racing between reading and 
changing EAs */
+   struct f2fs_rwsem i_opu_rwsem;  /* avoid racing between buf read and 
opu dio write */
 
int i_extra_isize;  /* size of extra space located in 
i_addr */
kprojid_t i_projid; /* id for project quota */
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 72ce1a522fb2..4ec260af321f 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -4445,6 +4445,7 @@ static ssize_t f2fs_dio_read_iter(struct kiocb *iocb, 
struct iov_iter *to)
const loff_t pos = iocb->ki_pos;
const size_t count = iov_iter_count(to);
struct iomap_dio *dio;
+   bool do_opu = f2fs_lfs_mode(sbi);
ssize_t ret;
 
if (count == 0)
@@ -4457,8 +4458,14 @@ static ssize_t f2fs_dio_read_iter(struct kiocb *iocb, 
struct iov_iter *to)
ret = -EAGAIN;
goto out;
}
+   if (do_opu && !f2fs_down_read_trylock(&fi->i_opu_rwsem)) {
+   f2fs_up_read(&fi->i_gc_rwsem[READ]);
+   ret = -EAGAIN;
+   goto out;
+   }
} else {
f2fs_down_read(&fi->i_gc_rwsem[READ]);
+   f2fs_down_read(&fi->i_opu_rwsem);
}
 
/*
@@ -4477,6 +4484,7 @@ static ssize_t f2fs_dio_read_iter(struct kiocb *iocb, 
struct iov_iter *to)
ret = iomap_dio_complete(dio);
}
 
+   f2fs_up_read(&fi->i_opu_rwsem);
f2fs_up_read(&fi->i_gc_rwsem[READ]);
 
file_accessed(file);
@@ -4523,7 +4531,13 @@ static ssize_t f2fs_file_read_iter(struct kiocb *iocb, 
struct iov_iter *to)
if (f2fs_should_use_dio(inode, iocb, to)) {
ret = f2fs_dio_read_iter(iocb, to);
} else {
+   bool do_opu = f2fs_lfs_mode(F2FS_I_SB(inode));
+
+   if (do_opu)
+   f2fs_down_read(&F2FS_I(inode)->i_opu_rwsem);
ret = filemap_read(iocb, to, 0);
+   if (do_opu)
+   f2fs_up_read(&F2FS_I(inode)->i_opu_rwsem);
if (ret > 0)
f2fs_update_iostat(F2FS_I_SB(inode), inode,
APP_BUFFERED_READ_IO, ret);
@@ -4748,14 +4762,22 @@ static ssize_t f2fs_dio_write_iter(struct kiocb *iocb, 
struct iov_iter *from,
ret = -EAGAIN;
goto out;
}
+   if (do_opu && !f2fs_down_write_trylock(&fi->i_opu_rwsem)) {
+   f2fs_up_read(&fi->i_gc_rwsem[READ]);
+   f2fs_up_read(&fi->i_gc_rwsem[WRITE]);
+   ret = -EAGAIN;
+   goto out;
+   }
} else {
ret = f2fs_convert_inline_inode(inode);
if (ret)
goto out;
 
f2fs_down_read(&fi->i_gc_rwsem[WRITE]);
-   if (do_opu)
+   if (do_opu) {
f2fs_down_read(&fi->i_gc_rwsem[READ]);
+   f2fs_down_write(&fi->i_opu_rwsem);
+   }
}
 
/*
@@ -4779,8 +4801,10 @@ static ssize_t f2fs_dio_write_iter(struct kiocb *

Re: [f2fs-dev] [PATCH 3/3] f2fs: fix to do sanity check on i_nid for inline_data inode

2024-05-09 Thread Chao Yu

On 2024/5/9 23:52, Jaegeuk Kim wrote:

On 05/06, Chao Yu wrote:

syzbot reports a f2fs bug as below:

[ cut here ]
kernel BUG at fs/f2fs/inline.c:258!
CPU: 1 PID: 34 Comm: kworker/u8:2 Not tainted 
6.9.0-rc6-syzkaller-00012-g9e4bc4bcae01 #0
RIP: 0010:f2fs_write_inline_data+0x781/0x790 fs/f2fs/inline.c:258
Call Trace:
  f2fs_write_single_data_page+0xb65/0x1d60 fs/f2fs/data.c:2834
  f2fs_write_cache_pages fs/f2fs/data.c:3133 [inline]
  __f2fs_write_data_pages fs/f2fs/data.c:3288 [inline]
  f2fs_write_data_pages+0x1efe/0x3a90 fs/f2fs/data.c:3315
  do_writepages+0x35b/0x870 mm/page-writeback.c:2612
  __writeback_single_inode+0x165/0x10b0 fs/fs-writeback.c:1650
  writeback_sb_inodes+0x905/0x1260 fs/fs-writeback.c:1941
  wb_writeback+0x457/0xce0 fs/fs-writeback.c:2117
  wb_do_writeback fs/fs-writeback.c:2264 [inline]
  wb_workfn+0x410/0x1090 fs/fs-writeback.c:2304
  process_one_work kernel/workqueue.c:3254 [inline]
  process_scheduled_works+0xa12/0x17c0 kernel/workqueue.c:3335
  worker_thread+0x86d/0xd70 kernel/workqueue.c:3416
  kthread+0x2f2/0x390 kernel/kthread.c:388
  ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:147
  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

The root cause is: inline_data inode can be fuzzed, so that there may
be valid blkaddr in its direct node, once f2fs triggers background GC
to migrate the block, it will hit f2fs_bug_on() during dirty page
writeback.

Let's add sanity check on i_nid field for inline_data inode, meanwhile,
forbid to migrate inline_data inode's data block to fix this issue.
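
The i_nid part of this check can be modelled in user space as below. This is
only a sketch on a toy on-disk layout: toy_inode and the toy_* helpers are
hypothetical, TOY_NIDS_PER_INODE is assumed to mirror DEF_NIDS_PER_INODE, and
the return convention (true means inconsistent) is the sketch's own, not
necessarily that of f2fs_sanity_check_inline_data().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_INLINE_DATA		0x02	/* inline_data bit in i_inline */
#define TOY_NIDS_PER_INODE	5	/* assumed number of i_nid slots */

struct toy_inode {
	uint8_t i_inline;
	uint32_t i_nid[TOY_NIDS_PER_INODE];
};

/* True if any node block is referenced from the inode. */
static bool toy_has_node_blocks(const struct toy_inode *ri)
{
	assert(ri != NULL);
	for (size_t i = 0; i < TOY_NIDS_PER_INODE; i++) {
		if (ri->i_nid[i])
			return true;
	}
	return false;
}

/* An inline_data inode keeps its data inside the inode block, so any
 * non-zero i_nid slot is treated as a sign of corruption here. */
static bool toy_inline_inode_is_corrupted(const struct toy_inode *ri)
{
	if (!(ri->i_inline & TOY_INLINE_DATA))
		return false;
	return toy_has_node_blocks(ri);
}
```

Note that this only covers block addresses reached through i_nid; a stray
block address stored directly in the inode's own address array would need a
separate check.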

Reported-by: syzbot+848062ba19c8782ca...@syzkaller.appspotmail.com
Closes: 
https://lore.kernel.org/linux-f2fs-devel/d103ce06174d7...@google.com
Signed-off-by: Chao Yu 
---
  fs/f2fs/f2fs.h   |  2 +-
  fs/f2fs/gc.c |  6 ++
  fs/f2fs/inline.c | 17 -
  fs/f2fs/inode.c  |  2 +-
  4 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index fced2b7652f4..c876813b5532 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -4146,7 +4146,7 @@ extern struct kmem_cache *f2fs_inode_entry_slab;
   * inline.c
   */
  bool f2fs_may_inline_data(struct inode *inode);
-bool f2fs_sanity_check_inline_data(struct inode *inode);
+bool f2fs_sanity_check_inline_data(struct inode *inode, struct page *ipage);
  bool f2fs_may_inline_dentry(struct inode *inode);
  void f2fs_do_read_inline_data(struct page *page, struct page *ipage);
  void f2fs_truncate_inline_inode(struct inode *inode,
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index e86c7f01539a..041957750478 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1563,6 +1563,12 @@ static int gc_data_segment(struct f2fs_sb_info *sbi, 
struct f2fs_summary *sum,
continue;
}
  
+			if (f2fs_has_inline_data(inode)) {

+   iput(inode);
+   set_sbi_flag(sbi, SBI_NEED_FSCK);
+   continue;


Is there any race condition that could raise this as a false alarm?


Since there is no reproducer for the bug, I suspect it was caused by metadata
fuzzing, something like this:

- inline inode has one valid blkaddr in i_addr or in a dnode referenced by i_nid;
- SIT/SSA entry of the block is valid;
- background GC migrates the block;
- kworker writes it back and triggers the bug_on().

Thoughts?

Thanks,




+   }
+
err = f2fs_gc_pinned_control(inode, gc_type, segno);
if (err == -EAGAIN) {
iput(inode);
diff --git a/fs/f2fs/inline.c b/fs/f2fs/inline.c
index ac00423f117b..067600fed3d4 100644
--- a/fs/f2fs/inline.c
+++ b/fs/f2fs/inline.c
@@ -33,11 +33,26 @@ bool f2fs_may_inline_data(struct inode *inode)
return !f2fs_post_read_required(inode);
  }
  
-bool f2fs_sanity_check_inline_data(struct inode *inode)

+static bool has_node_blocks(struct inode *inode, struct page *ipage)
+{
+   struct f2fs_inode *ri = F2FS_INODE(ipage);
+   int i;
+
+   for (i = 0; i < DEF_NIDS_PER_INODE; i++) {
+   if (ri->i_nid[i])
+   return true;
+   }
+   return false;
+}
+
+bool f2fs_sanity_check_inline_data(struct inode *inode, struct page *ipage)
  {
if (!f2fs_has_inline_data(inode))
return false;
  
+	if (has_node_blocks(inode, ipage))

+   return false;
+
if (!support_inline_data(inode))
return true;
  
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c

index c26effdce9aa..1423cd27a477 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -343,7 +343,7 @@ static bool sanity_check_inode(struct inode *inode, struct 
page *node_page)
}
}
  
-	if (f2fs_sanity_check_inline_data(inode)) {

+   if (f2fs_sanity_check_inline_data(inode, node_page)) {
f2fs_warn(sbi, "%s: inode (ino=%lx, mode=%u) should not have 
inline_data, run 

Re: [f2fs-dev] [PATCH 2/3] f2fs: fix to add missing iput() in gc_data_segment()

2024-05-08 Thread Chao Yu

On 2024/5/9 8:46, Jaegeuk Kim wrote:

On 05/06, Chao Yu wrote:

In gc_data_segment(), if the inode state is abnormal, the code fails to call
iput(); fix it.

Fixes: 132e3209789c ("f2fs: remove false alarm on iget failure during GC")
Fixes: 9056d6489f5a ("f2fs: fix to do sanity check on inode type during garbage 
collection")
Signed-off-by: Chao Yu 
---
  fs/f2fs/gc.c | 9 +++--
  1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 8852814dab7f..e86c7f01539a 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1554,10 +1554,15 @@ static int gc_data_segment(struct f2fs_sb_info *sbi, 
struct f2fs_summary *sum,
int err;
  
  			inode = f2fs_iget(sb, dni.ino);

-   if (IS_ERR(inode) || is_bad_inode(inode) ||
-   special_file(inode->i_mode))
+   if (IS_ERR(inode))
continue;
  
+			if (is_bad_inode(inode) ||

+   special_file(inode->i_mode)) {
+   iput(inode);


iget_failed() called iput()?


It looks like the bad inode is referenced in this context, so it needs to be
iput()ed here.

The bad inode was created by another thread; please check the description in
commit b73e52824c89 ("f2fs: reposition unlock_new_inode to prevent accessing
invalid inode").

Thanks,





+   continue;
+   }
+
err = f2fs_gc_pinned_control(inode, gc_type, segno);
if (err == -EAGAIN) {
iput(inode);
--
2.40.1





[f2fs-dev] [PATCH] f2fs: fix to avoid racing in between buffered read and OPU dio write

2024-05-07 Thread Chao Yu
If lfs mode is on, a buffered read may race with an OPU dio write as below,
which may cause the buffered read to hit unwritten data unexpectedly.

Thread A                                Thread B
- f2fs_file_write_iter
 - f2fs_dio_write_iter
  - __iomap_dio_rw
   - f2fs_iomap_begin
    - f2fs_map_blocks
     - __allocate_data_block
      - allocated blkaddr #x
   - iomap_dio_submit_bio
                                        - f2fs_file_read_iter
                                         - filemap_read
                                          - f2fs_read_data_folio
                                           - f2fs_mpage_readpages
                                            - f2fs_map_blocks
                                             : get blkaddr #x
                                            - f2fs_submit_read_bio
                                        IRQ
                                        - f2fs_read_end_io
                                         : read IO on blkaddr #x complete
IRQ
- iomap_dio_bio_end_io
 : direct write IO on blkaddr #x complete

This patch introduces a new per-inode i_opu_rwsem lock to avoid
such a race condition.
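
The locking rule can be sketched in userspace as below. This is only an
illustration under stated assumptions, not the kernel code: the demo_*
names are hypothetical, the lock is a toy try-only reader/writer lock built
on C11 atomics, and only the nowait (trylock) write path is modelled.
Buffered readers share the lock, while an OPU direct write needs it
exclusively and backs off with -EAGAIN when a reader is active.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* state > 0: number of readers, state == 0: free, state == -1: one writer */
struct demo_rwsem {
	atomic_int state;
};

static void demo_rwsem_init(struct demo_rwsem *s)
{
	atomic_init(&s->state, 0);
}

/* Shared side, taken around the buffered read. */
static bool demo_down_read_trylock(struct demo_rwsem *s)
{
	int old = atomic_load(&s->state);

	while (old >= 0) {
		/* on failure, old is reloaded and the loop re-checks it */
		if (atomic_compare_exchange_weak(&s->state, &old, old + 1))
			return true;
	}
	return false;	/* a writer holds the lock */
}

static void demo_up_read(struct demo_rwsem *s)
{
	int prev = atomic_fetch_sub(&s->state, 1);

	assert(prev > 0);	/* must have been read-locked */
	(void)prev;
}

/* Exclusive side, taken around the OPU direct write. */
static bool demo_down_write_trylock(struct demo_rwsem *s)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&s->state, &expected, -1);
}

static void demo_up_write(struct demo_rwsem *s)
{
	atomic_store(&s->state, 0);
}

/* Hypothetical nowait direct write: refuse to run next to a reader. */
static int demo_dio_write_nowait(struct demo_rwsem *opu)
{
	if (!demo_down_write_trylock(opu))
		return -EAGAIN;
	/* ... allocate the new block, submit and complete the write ... */
	demo_up_write(opu);
	return 0;
}
```

With a reader holding the lock, demo_dio_write_nowait() returns -EAGAIN;
once the reader drops it, the write proceeds. The blocking variants used in
the patch are left out of this sketch.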

Fixes: f847c699cff3 ("f2fs: allow out-place-update for direct IO in LFS mode")
Signed-off-by: Chao Yu 
---
 fs/f2fs/f2fs.h  |  1 +
 fs/f2fs/file.c  | 20 ++--
 fs/f2fs/super.c |  1 +
 3 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 145b985bf252..b69ec1109572 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -847,6 +847,7 @@ struct f2fs_inode_info {
/* avoid racing between foreground op and gc */
struct f2fs_rwsem i_gc_rwsem[2];
struct f2fs_rwsem i_xattr_sem; /* avoid racing between reading and 
changing EAs */
+   struct f2fs_rwsem i_opu_rwsem;  /* avoid racing between buf read and 
opu dio write */
 
int i_extra_isize;  /* size of extra space located in 
i_addr */
kprojid_t i_projid; /* id for project quota */
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index ef4cfb4436ef..c761db952b37 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -4545,7 +4545,13 @@ static ssize_t f2fs_file_read_iter(struct kiocb *iocb, 
struct iov_iter *to)
if (f2fs_should_use_dio(inode, iocb, to)) {
ret = f2fs_dio_read_iter(iocb, to);
} else {
+   bool do_opu = f2fs_lfs_mode(F2FS_I_SB(inode));
+
+   if (do_opu)
+   f2fs_down_read(&F2FS_I(inode)->i_opu_rwsem);
ret = filemap_read(iocb, to, 0);
+   if (do_opu)
+   f2fs_up_read(&F2FS_I(inode)->i_opu_rwsem);
if (ret > 0)
f2fs_update_iostat(F2FS_I_SB(inode), inode,
APP_BUFFERED_READ_IO, ret);
@@ -4770,14 +4776,22 @@ static ssize_t f2fs_dio_write_iter(struct kiocb *iocb, 
struct iov_iter *from,
ret = -EAGAIN;
goto out;
}
+   if (do_opu && !f2fs_down_write_trylock(&fi->i_opu_rwsem)) {
+   f2fs_up_read(&fi->i_gc_rwsem[READ]);
+   f2fs_up_read(&fi->i_gc_rwsem[WRITE]);
+   ret = -EAGAIN;
+   goto out;
+   }
} else {
ret = f2fs_convert_inline_inode(inode);
if (ret)
goto out;
 
f2fs_down_read(&fi->i_gc_rwsem[WRITE]);
-   if (do_opu)
+   if (do_opu) {
f2fs_down_read(&fi->i_gc_rwsem[READ]);
+   f2fs_down_write(&fi->i_opu_rwsem);
+   }
}
 
/*
@@ -4801,8 +4815,10 @@ static ssize_t f2fs_dio_write_iter(struct kiocb *iocb, 
struct iov_iter *from,
ret = iomap_dio_complete(dio);
}
 
-   if (do_opu)
+   if (do_opu) {
+   f2fs_up_write(&fi->i_opu_rwsem);
f2fs_up_read(&fi->i_gc_rwsem[READ]);
+   }
f2fs_up_read(&fi->i_gc_rwsem[WRITE]);
 
if (ret < 0)
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index daf2c4dbe150..b4ed3b094366 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1428,6 +1428,7 @@ static struct inode *f2fs_alloc_inode(struct super_block 
*sb)
init_f2fs_rwsem(&fi->i_gc_rwsem[READ]);
init_f2fs_rwsem(&fi->i_gc_rwsem[WRITE]);
init_f2fs_rwsem(&fi->i_xattr_sem);
+   init_f2fs_rwsem(&fi->i_opu_rwsem);
 
/* Will be used by directory only */
fi->i_dir_level = F2FS_SB(sb)->dir_level;
-- 
2.40.1





[f2fs-dev] [PATCH v2 5/5] f2fs: compress: don't allow unaligned truncation on released compress inode

2024-05-07 Thread Chao Yu
An f2fs image may be corrupted by the testcase below:
- mkfs.f2fs -O extra_attr,compression -f /dev/vdb
- mount /dev/vdb /mnt/f2fs
- touch /mnt/f2fs/file
- f2fs_io setflags compression /mnt/f2fs/file
- dd if=/dev/zero of=/mnt/f2fs/file bs=4k count=4
- f2fs_io release_cblocks /mnt/f2fs/file
- truncate -s 8192 /mnt/f2fs/file
- umount /mnt/f2fs
- fsck.f2fs /dev/vdb

[ASSERT] (fsck_chk_inode_blk:1256)  --> ino: 0x5 has i_blocks: 0x0002, but 
has 0x3 blocks
[FSCK] valid_block_count matching with CP [Fail] [0x4, 0x5]
[FSCK] other corrupted bugs   [Fail]

The reason is that partial truncation assumes the compressed inode has
reserved blocks. After partial truncation, the valid block count may change
with the .i_blocks and .total_valid_block_count update, resulting in
corruption.

This patch allows only cluster-size-aligned truncation on a released
compress inode to fix this.
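
The alignment rule can be sketched in plain C as follows. It is a userspace
illustration, not the kernel code: demo_check_truncate() and DEMO_BLKSIZE are
hypothetical names, and a 4 KiB block size is assumed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_BLKSIZE 4096ULL	/* assumption: 4 KiB filesystem blocks */

/*
 * Return 0 if truncating to new_size is acceptable, -EINVAL if the inode
 * has released its compressed blocks and new_size is not a multiple of
 * the cluster size in bytes.
 */
static int demo_check_truncate(bool released, uint64_t new_size,
			       unsigned int cluster_blocks)
{
	uint64_t cluster_bytes;

	assert(cluster_blocks != 0);
	cluster_bytes = (uint64_t)cluster_blocks * DEMO_BLKSIZE;

	if (released && (new_size % cluster_bytes) != 0)
		return -EINVAL;
	return 0;
}
```

Assuming a cluster of 4 blocks (16384 bytes), the "truncate -s 8192" step of
the testcase would be rejected on a released inode, while truncating to 0 or
16384 would still be accepted.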

Fixes: c61404153eb6 ("f2fs: introduce FI_COMPRESS_RELEASED instead of using 
IMMUTABLE bit")
Signed-off-by: Chao Yu 
---
v2:
- fix compile warning reported by lkp.
 fs/f2fs/file.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 3f0db351e976..0c8194dc6807 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -952,9 +952,14 @@ int f2fs_setattr(struct mnt_idmap *idmap, struct dentry 
*dentry,
  ATTR_GID | ATTR_TIMES_SET
return -EPERM;
 
-   if ((attr->ia_valid & ATTR_SIZE) &&
-   !f2fs_is_compress_backend_ready(inode))
-   return -EOPNOTSUPP;
+   if ((attr->ia_valid & ATTR_SIZE)) {
+   if (!f2fs_is_compress_backend_ready(inode))
+   return -EOPNOTSUPP;
+   if (is_inode_flag_set(inode, FI_COMPRESS_RELEASED) &&
+   !IS_ALIGNED(attr->ia_size,
+   F2FS_BLK_TO_BYTES(F2FS_I(inode)->i_cluster_size)))
+   return -EINVAL;
+   }
 
err = setattr_prepare(idmap, dentry, attr);
if (err)
-- 
2.40.1





[f2fs-dev] [PATCH v2] f2fs: check validation of fault attrs in f2fs_build_fault_attr()

2024-05-06 Thread Chao Yu
- parse_options() did not validate the fault attrs; fix this by adding
the check to f2fs_build_fault_attr().
- Use f2fs_build_fault_attr() in __sbi_store() to clean up the code.
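
The validation described above can be sketched in userspace as follows. The
demo_* names and the DEMO_FAULT_MAX value are hypothetical; only the range
checks are meant to mirror what the patch adds.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <string.h>

#define DEMO_FAULT_MAX 20	/* assumption: number of fault types */

struct demo_fault_info {
	int inject_rate;
	unsigned int inject_type;
};

/* Validate and apply rate/type; both zero clears the whole structure. */
static int demo_build_fault_attr(struct demo_fault_info *ffi,
				 unsigned long rate, unsigned long type)
{
	assert(ffi != NULL);

	if (rate) {
		if (rate > INT_MAX)
			return -EINVAL;	/* would not fit in an int */
		ffi->inject_rate = (int)rate;
	}

	if (type) {
		if (type >= (1UL << DEMO_FAULT_MAX))
			return -EINVAL;	/* bit outside the known fault types */
		ffi->inject_type = (unsigned int)type;
	}

	if (!rate && !type)
		memset(ffi, 0, sizeof(*ffi));
	return 0;
}
```

Because the mount-option path and the sysfs path both go through one
function, an out-of-range value is rejected the same way in both places.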

Signed-off-by: Chao Yu 
---
v2:
- add static for f2fs_build_fault_attr().
 fs/f2fs/f2fs.h  | 12 
 fs/f2fs/super.c | 27 ---
 fs/f2fs/sysfs.c | 14 ++
 3 files changed, 38 insertions(+), 15 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 95a40d4f778f..a29576f46796 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -72,7 +72,7 @@ enum {
 
 struct f2fs_fault_info {
atomic_t inject_ops;
-   unsigned int inject_rate;
+   int inject_rate;
unsigned int inject_type;
 };
 
@@ -4597,10 +4597,14 @@ static inline bool f2fs_need_verity(const struct inode 
*inode, pgoff_t idx)
 }
 
 #ifdef CONFIG_F2FS_FAULT_INJECTION
-extern void f2fs_build_fault_attr(struct f2fs_sb_info *sbi, unsigned int rate,
-   unsigned int type);
+extern int f2fs_build_fault_attr(struct f2fs_sb_info *sbi, unsigned long rate,
+   unsigned long type);
 #else
-#define f2fs_build_fault_attr(sbi, rate, type) do { } while (0)
+static int f2fs_build_fault_attr(struct f2fs_sb_info *sbi, unsigned long rate,
+   unsigned long type)
+{
+   return 0;
+}
 #endif
 
 static inline bool is_journalled_quota(struct f2fs_sb_info *sbi)
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index a4bc26dfdb1a..94918ae7eddb 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -66,21 +66,31 @@ const char *f2fs_fault_name[FAULT_MAX] = {
[FAULT_NO_SEGMENT]  = "no free segment",
 };
 
-void f2fs_build_fault_attr(struct f2fs_sb_info *sbi, unsigned int rate,
-   unsigned int type)
+int f2fs_build_fault_attr(struct f2fs_sb_info *sbi, unsigned long rate,
+   unsigned long type)
 {
struct f2fs_fault_info *ffi = &F2FS_OPTION(sbi).fault_info;
 
if (rate) {
+   if (rate > INT_MAX)
+   return -EINVAL;
atomic_set(&ffi->inject_ops, 0);
-   ffi->inject_rate = rate;
+   ffi->inject_rate = (int)rate;
}
 
-   if (type)
-   ffi->inject_type = type;
+   if (type) {
+   if (type >= BIT(FAULT_MAX))
+   return -EINVAL;
+   ffi->inject_type = (unsigned int)type;
+   }
 
if (!rate && !type)
memset(ffi, 0, sizeof(struct f2fs_fault_info));
+   else
+   f2fs_info(sbi,
+   "build fault injection attr: rate: %lu, type: 0x%lx",
+   rate, type);
+   return 0;
 }
 #endif
 
@@ -886,14 +896,17 @@ static int parse_options(struct super_block *sb, char 
*options, bool is_remount)
case Opt_fault_injection:
if (args->from && match_int(args, &arg))
return -EINVAL;
-   f2fs_build_fault_attr(sbi, arg, F2FS_ALL_FAULT_TYPE);
+   if (f2fs_build_fault_attr(sbi, arg,
+   F2FS_ALL_FAULT_TYPE))
+   return -EINVAL;
set_opt(sbi, FAULT_INJECTION);
break;
 
case Opt_fault_type:
if (args->from && match_int(args, &arg))
return -EINVAL;
-   f2fs_build_fault_attr(sbi, 0, arg);
+   if (f2fs_build_fault_attr(sbi, 0, arg))
+   return -EINVAL;
set_opt(sbi, FAULT_INJECTION);
break;
 #else
diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
index a568ce96cf56..7aa3844e7a80 100644
--- a/fs/f2fs/sysfs.c
+++ b/fs/f2fs/sysfs.c
@@ -484,10 +484,16 @@ static ssize_t __sbi_store(struct f2fs_attr *a,
if (ret < 0)
return ret;
 #ifdef CONFIG_F2FS_FAULT_INJECTION
-   if (a->struct_type == FAULT_INFO_TYPE && t >= BIT(FAULT_MAX))
-   return -EINVAL;
-   if (a->struct_type == FAULT_INFO_RATE && t >= UINT_MAX)
-   return -EINVAL;
+   if (a->struct_type == FAULT_INFO_TYPE) {
+   if (f2fs_build_fault_attr(sbi, 0, t))
+   return -EINVAL;
+   return count;
+   }
+   if (a->struct_type == FAULT_INFO_RATE) {
+   if (f2fs_build_fault_attr(sbi, t, 0))
+   return -EINVAL;
+   return count;
+   }
 #endif
if (a->struct_type == RESERVED_BLOCKS) {
spi

[f2fs-dev] [PATCH v2 1/3] f2fs: fix to release node block count in error path of f2fs_new_node_page()

2024-05-06 Thread Chao Yu
The error path failed to call dec_valid_node_count() to release the node
block count; fix it.

Fixes: 141170b759e0 ("f2fs: fix to avoid use f2fs_bug_on() in 
f2fs_new_node_page()")
Signed-off-by: Chao Yu 
---
v2:
- avoid compile warning if CONFIG_F2FS_CHECK_FS is off.
 fs/f2fs/node.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index b3de6d6cdb02..7df5ad84cb5e 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1319,6 +1319,7 @@ struct page *f2fs_new_node_page(struct dnode_of_data *dn, 
unsigned int ofs)
}
if (unlikely(new_ni.blk_addr != NULL_ADDR)) {
err = -EFSCORRUPTED;
+   dec_valid_node_count(sbi, dn->inode, !ofs);
set_sbi_flag(sbi, SBI_NEED_FSCK);
f2fs_handle_error(sbi, ERROR_INVALID_BLKADDR);
goto fail;
@@ -1345,7 +1346,6 @@ struct page *f2fs_new_node_page(struct dnode_of_data *dn, 
unsigned int ofs)
if (ofs == 0)
inc_valid_inode_count(sbi);
return page;
-
 fail:
clear_node_page_dirty(page);
f2fs_put_page(page, 1);
-- 
2.40.1





[f2fs-dev] [PATCH] f2fs: use f2fs_{err, info}_ratelimited() for cleanup

2024-05-06 Thread Chao Yu
Commit b1c9d3f833ba ("f2fs: support printk_ratelimited() in f2fs_printk()")
missed some cases; convert all remaining ones for cleanup.

Signed-off-by: Chao Yu 
---
 fs/f2fs/compress.c | 54 +-
 fs/f2fs/segment.c  |  5 ++---
 2 files changed, 26 insertions(+), 33 deletions(-)

diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index 8892c8262141..3c70a9697063 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -198,8 +198,8 @@ static int lzo_compress_pages(struct compress_ctx *cc)
ret = lzo1x_1_compress(cc->rbuf, cc->rlen, cc->cbuf->cdata,
&cc->clen, cc->private);
if (ret != LZO_E_OK) {
-   printk_ratelimited("%sF2FS-fs (%s): lzo compress failed, 
ret:%d\n",
-   KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id, ret);
+   f2fs_err_ratelimited(F2FS_I_SB(cc->inode),
+   "lzo compress failed, ret:%d", ret);
return -EIO;
}
return 0;
@@ -212,17 +212,15 @@ static int lzo_decompress_pages(struct decompress_io_ctx 
*dic)
ret = lzo1x_decompress_safe(dic->cbuf->cdata, dic->clen,
dic->rbuf, &dic->rlen);
if (ret != LZO_E_OK) {
-   printk_ratelimited("%sF2FS-fs (%s): lzo decompress failed, 
ret:%d\n",
-   KERN_ERR, F2FS_I_SB(dic->inode)->sb->s_id, ret);
+   f2fs_err_ratelimited(F2FS_I_SB(dic->inode),
+   "lzo decompress failed, ret:%d", ret);
return -EIO;
}
 
if (dic->rlen != PAGE_SIZE << dic->log_cluster_size) {
-   printk_ratelimited("%sF2FS-fs (%s): lzo invalid rlen:%zu, "
-   "expected:%lu\n", KERN_ERR,
-   F2FS_I_SB(dic->inode)->sb->s_id,
-   dic->rlen,
-   PAGE_SIZE << dic->log_cluster_size);
+   f2fs_err_ratelimited(F2FS_I_SB(dic->inode),
+   "lzo invalid rlen:%zu, expected:%lu",
+   dic->rlen, PAGE_SIZE << dic->log_cluster_size);
return -EIO;
}
return 0;
@@ -294,16 +292,15 @@ static int lz4_decompress_pages(struct decompress_io_ctx 
*dic)
ret = LZ4_decompress_safe(dic->cbuf->cdata, dic->rbuf,
dic->clen, dic->rlen);
if (ret < 0) {
-   printk_ratelimited("%sF2FS-fs (%s): lz4 decompress failed, 
ret:%d\n",
-   KERN_ERR, F2FS_I_SB(dic->inode)->sb->s_id, ret);
+   f2fs_err_ratelimited(F2FS_I_SB(dic->inode),
+   "lz4 decompress failed, ret:%d", ret);
return -EIO;
}
 
if (ret != PAGE_SIZE << dic->log_cluster_size) {
-   printk_ratelimited("%sF2FS-fs (%s): lz4 invalid ret:%d, "
-   "expected:%lu\n", KERN_ERR,
-   F2FS_I_SB(dic->inode)->sb->s_id, ret,
-   PAGE_SIZE << dic->log_cluster_size);
+   f2fs_err_ratelimited(F2FS_I_SB(dic->inode),
+   "lz4 invalid ret:%d, expected:%lu",
+   ret, PAGE_SIZE << dic->log_cluster_size);
return -EIO;
}
return 0;
@@ -350,9 +347,8 @@ static int zstd_init_compress_ctx(struct compress_ctx *cc)
 
stream = zstd_init_cstream(, 0, workspace, workspace_size);
if (!stream) {
-   printk_ratelimited("%sF2FS-fs (%s): %s zstd_init_cstream 
failed\n",
-   KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id,
-   __func__);
+   f2fs_err_ratelimited(F2FS_I_SB(cc->inode),
+   "%s zstd_init_cstream failed", __func__);
kvfree(workspace);
return -EIO;
}
@@ -390,16 +386,16 @@ static int zstd_compress_pages(struct compress_ctx *cc)
 
ret = zstd_compress_stream(stream, , );
if (zstd_is_error(ret)) {
-   printk_ratelimited("%sF2FS-fs (%s): %s zstd_compress_stream 
failed, ret: %d\n",
-   KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id,
+   f2fs_err_ratelimited(F2FS_I_SB(cc->inode),
+   "%s zstd_compress_stream failed, ret: %d",
__func__, zstd_get_error_code(ret));

[f2fs-dev] [PATCH] f2fs: check validation of fault attrs in f2fs_build_fault_attr()

2024-05-06 Thread Chao Yu
- parse_options() did not validate the fault attrs; fix this by adding
the check to f2fs_build_fault_attr().
- Use f2fs_build_fault_attr() in __sbi_store() to clean up the code.

Signed-off-by: Chao Yu 
---
 fs/f2fs/f2fs.h  | 12 
 fs/f2fs/super.c | 27 ---
 fs/f2fs/sysfs.c | 14 ++
 3 files changed, 38 insertions(+), 15 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 95a40d4f778f..b03d75e4eedc 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -72,7 +72,7 @@ enum {
 
 struct f2fs_fault_info {
atomic_t inject_ops;
-   unsigned int inject_rate;
+   int inject_rate;
unsigned int inject_type;
 };
 
@@ -4597,10 +4597,14 @@ static inline bool f2fs_need_verity(const struct inode 
*inode, pgoff_t idx)
 }
 
 #ifdef CONFIG_F2FS_FAULT_INJECTION
-extern void f2fs_build_fault_attr(struct f2fs_sb_info *sbi, unsigned int rate,
-   unsigned int type);
+extern int f2fs_build_fault_attr(struct f2fs_sb_info *sbi, unsigned long rate,
+   unsigned long type);
 #else
-#define f2fs_build_fault_attr(sbi, rate, type) do { } while (0)
+int f2fs_build_fault_attr(struct f2fs_sb_info *sbi, unsigned long rate,
+   unsigned long type)
+{
+   return 0;
+}
 #endif
 
 static inline bool is_journalled_quota(struct f2fs_sb_info *sbi)
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index a4bc26dfdb1a..94918ae7eddb 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -66,21 +66,31 @@ const char *f2fs_fault_name[FAULT_MAX] = {
[FAULT_NO_SEGMENT]  = "no free segment",
 };
 
-void f2fs_build_fault_attr(struct f2fs_sb_info *sbi, unsigned int rate,
-   unsigned int type)
+int f2fs_build_fault_attr(struct f2fs_sb_info *sbi, unsigned long rate,
+   unsigned long type)
 {
struct f2fs_fault_info *ffi = &F2FS_OPTION(sbi).fault_info;
 
if (rate) {
+   if (rate > INT_MAX)
+   return -EINVAL;
atomic_set(&ffi->inject_ops, 0);
-   ffi->inject_rate = rate;
+   ffi->inject_rate = (int)rate;
}
 
-   if (type)
-   ffi->inject_type = type;
+   if (type) {
+   if (type >= BIT(FAULT_MAX))
+   return -EINVAL;
+   ffi->inject_type = (unsigned int)type;
+   }
 
if (!rate && !type)
memset(ffi, 0, sizeof(struct f2fs_fault_info));
+   else
+   f2fs_info(sbi,
+   "build fault injection attr: rate: %lu, type: 0x%lx",
+   rate, type);
+   return 0;
 }
 #endif
 
@@ -886,14 +896,17 @@ static int parse_options(struct super_block *sb, char 
*options, bool is_remount)
case Opt_fault_injection:
if (args->from && match_int(args, &arg))
return -EINVAL;
-   f2fs_build_fault_attr(sbi, arg, F2FS_ALL_FAULT_TYPE);
+   if (f2fs_build_fault_attr(sbi, arg,
+   F2FS_ALL_FAULT_TYPE))
+   return -EINVAL;
set_opt(sbi, FAULT_INJECTION);
break;
 
case Opt_fault_type:
if (args->from && match_int(args, &arg))
return -EINVAL;
-   f2fs_build_fault_attr(sbi, 0, arg);
+   if (f2fs_build_fault_attr(sbi, 0, arg))
+   return -EINVAL;
set_opt(sbi, FAULT_INJECTION);
break;
 #else
diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
index a568ce96cf56..7aa3844e7a80 100644
--- a/fs/f2fs/sysfs.c
+++ b/fs/f2fs/sysfs.c
@@ -484,10 +484,16 @@ static ssize_t __sbi_store(struct f2fs_attr *a,
if (ret < 0)
return ret;
 #ifdef CONFIG_F2FS_FAULT_INJECTION
-   if (a->struct_type == FAULT_INFO_TYPE && t >= BIT(FAULT_MAX))
-   return -EINVAL;
-   if (a->struct_type == FAULT_INFO_RATE && t >= UINT_MAX)
-   return -EINVAL;
+   if (a->struct_type == FAULT_INFO_TYPE) {
+   if (f2fs_build_fault_attr(sbi, 0, t))
+   return -EINVAL;
+   return count;
+   }
+   if (a->struct_type == FAULT_INFO_RATE) {
+   if (f2fs_build_fault_attr(sbi, t, 0))
+   return -EINVAL;
+   return count;
+   }
 #endif
if (a->struct_type == RESERVED_BLOCKS) {
spin_lock(&sbi->stat_lock);
-- 
2.40.1




[f2fs-dev] [PATCH 2/2] f2fs: fix to limit gc_pin_file_threshold

2024-05-06 Thread Chao Yu
The types of f2fs_inode.i_gc_failures, f2fs_inode_info.i_gc_failures, and
f2fs_sb_info.gc_pin_file_threshold are __le16, unsigned int, and u64
respectively, which causes truncation during comparison and persistence.

Unify the type of these three variables to unsigned short, and
add an upper bound for gc_pin_file_threshold.

Signed-off-by: Chao Yu 
---
 Documentation/ABI/testing/sysfs-fs-f2fs |  2 +-
 fs/f2fs/f2fs.h  |  4 ++--
 fs/f2fs/file.c  | 11 ++-
 fs/f2fs/gc.h|  1 +
 fs/f2fs/sysfs.c |  7 +++
 5 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs 
b/Documentation/ABI/testing/sysfs-fs-f2fs
index 1a4d83953379..cad6c3dc1f9c 100644
--- a/Documentation/ABI/testing/sysfs-fs-f2fs
+++ b/Documentation/ABI/testing/sysfs-fs-f2fs
@@ -331,7 +331,7 @@ Date:   January 2018
 Contact:   Jaegeuk Kim 
 Description:   This indicates how many GC can be failed for the pinned
file. If it exceeds this, F2FS doesn't guarantee its pinning
-   state. 2048 trials is set by default.
+   state. 2048 trials is set by default, and 65535 as maximum.
 
 What:  /sys/fs/f2fs//extension_list
 Date:  February 2018
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 400ff8e1abe0..3dff45cd6cde 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -813,7 +813,7 @@ struct f2fs_inode_info {
unsigned char i_dir_level;  /* use for dentry level for large dir */
union {
unsigned int i_current_depth;   /* only for directory depth */
-   unsigned int i_gc_failures; /* for gc failure statistic */
+   unsigned short i_gc_failures;   /* for gc failure statistic */
};
unsigned int i_pino;/* parent inode number */
umode_t i_acl_mode; /* keep file acl mode temporarily */
@@ -1672,7 +1672,7 @@ struct f2fs_sb_info {
unsigned long long skipped_gc_rwsem;/* FG_GC only */
 
/* threshold for gc trials on pinned files */
-   u64 gc_pin_file_threshold;
+   unsigned short gc_pin_file_threshold;
struct f2fs_rwsem pin_sem;
 
/* maximum # of trials to find a victim segment for SSR and GC */
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 200cafc75dce..1b1b08923f7d 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -3194,16 +3194,17 @@ int f2fs_pin_file_control(struct inode *inode, bool inc)
struct f2fs_inode_info *fi = F2FS_I(inode);
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
 
-   /* Use i_gc_failures for normal file as a risk signal. */
-   if (inc)
-   f2fs_i_gc_failures_write(inode, fi->i_gc_failures + 1);
-
-   if (fi->i_gc_failures > sbi->gc_pin_file_threshold) {
+   if (fi->i_gc_failures >= sbi->gc_pin_file_threshold) {
f2fs_warn(sbi, "%s: Enable GC = ino %lx after %x GC trials",
  __func__, inode->i_ino, fi->i_gc_failures);
clear_inode_flag(inode, FI_PIN_FILE);
return -EAGAIN;
}
+
+   /* Use i_gc_failures for normal file as a risk signal. */
+   if (inc)
+   f2fs_i_gc_failures_write(inode, fi->i_gc_failures + 1);
+
return 0;
 }
 
diff --git a/fs/f2fs/gc.h b/fs/f2fs/gc.h
index 9c0d06c4d19a..a8ea3301b815 100644
--- a/fs/f2fs/gc.h
+++ b/fs/f2fs/gc.h
@@ -26,6 +26,7 @@
 #define LIMIT_FREE_BLOCK   40 /* percentage over invalid + free space */
 
 #define DEF_GC_FAILED_PINNED_FILES 2048
+#define MAX_GC_FAILED_PINNED_FILES USHRT_MAX
 
 /* Search max. number of dirty segments to select a victim segment */
 #define DEF_MAX_VICTIM_SEARCH 4096 /* covers 8GB */
diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
index 7aa3844e7a80..09d3ecfaa4f1 100644
--- a/fs/f2fs/sysfs.c
+++ b/fs/f2fs/sysfs.c
@@ -681,6 +681,13 @@ static ssize_t __sbi_store(struct f2fs_attr *a,
return count;
}
 
+   if (!strcmp(a->attr.name, "gc_pin_file_threshold")) {
+   if (t > MAX_GC_FAILED_PINNED_FILES)
+   return -EINVAL;
+   sbi->gc_pin_file_threshold = t;
+   return count;
+   }
+
if (!strcmp(a->attr.name, "gc_reclaimed_segments")) {
if (t != 0)
return -EINVAL;
-- 
2.40.1





[f2fs-dev] [PATCH 1/2] f2fs: remove unused GC_FAILURE_PIN

2024-05-06 Thread Chao Yu
Commit 3db1de0e582c ("f2fs: change the current atomic write way") removed
all GC_FAILURE_ATOMIC usage, so let's change the i_gc_failures[]
array to i_pin_failure for cleanup.

Meanwhile, let's define i_current_depth and i_gc_failures as a union,
because they are never valid at the same time.
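
The union layout can be sketched as below. The struct and the is_dir flag
are hypothetical stand-ins (the kernel decides by inode mode), and an
anonymous union requires C11 or a GNU dialect.

```c
#include <assert.h>
#include <stdbool.h>

struct demo_inode_info {
	bool is_dir;	/* stand-in for the S_ISDIR() test on the inode mode */
	union {
		unsigned int i_current_depth;	/* directories only */
		unsigned int i_gc_failures;	/* regular files only */
	};
};

/* Only regular files may read the GC failure counter. */
static unsigned int demo_gc_failures(const struct demo_inode_info *fi)
{
	assert(!fi->is_dir);
	return fi->i_gc_failures;
}
```

Both members share the same storage, so the structure shrinks by one field,
at the cost that callers must check the inode type before touching either
member.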

Signed-off-by: Chao Yu 
---
 fs/f2fs/f2fs.h | 14 +-
 fs/f2fs/file.c | 12 +---
 fs/f2fs/inode.c|  6 ++
 fs/f2fs/recovery.c |  3 +--
 4 files changed, 13 insertions(+), 22 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index b03d75e4eedc..400ff8e1abe0 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -765,11 +765,6 @@ enum {
 
 #define DEF_DIR_LEVEL  0
 
-enum {
-   GC_FAILURE_PIN,
-   MAX_GC_FAILURE
-};
-
 /* used for f2fs_inode_info->flags */
 enum {
FI_NEW_INODE,   /* indicate newly allocated inode */
@@ -816,9 +811,10 @@ struct f2fs_inode_info {
unsigned long i_flags;  /* keep an inode flags for ioctl */
unsigned char i_advise; /* use to give file attribute hints */
unsigned char i_dir_level;  /* use for dentry level for large dir */
-   unsigned int i_current_depth;   /* only for directory depth */
-   /* for gc failure statistic */
-   unsigned int i_gc_failures[MAX_GC_FAILURE];
+   union {
+   unsigned int i_current_depth;   /* only for directory depth */
+   unsigned int i_gc_failures; /* for gc failure statistic */
+   };
unsigned int i_pino;/* parent inode number */
umode_t i_acl_mode; /* keep file acl mode temporarily */
 
@@ -3133,7 +3129,7 @@ static inline void f2fs_i_depth_write(struct inode 
*inode, unsigned int depth)
 static inline void f2fs_i_gc_failures_write(struct inode *inode,
unsigned int count)
 {
-   F2FS_I(inode)->i_gc_failures[GC_FAILURE_PIN] = count;
+   F2FS_I(inode)->i_gc_failures = count;
f2fs_mark_inode_dirty_sync(inode, true);
 }
 
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index ac9d6380e433..200cafc75dce 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -3196,13 +3196,11 @@ int f2fs_pin_file_control(struct inode *inode, bool inc)
 
/* Use i_gc_failures for normal file as a risk signal. */
if (inc)
-   f2fs_i_gc_failures_write(inode,
-   fi->i_gc_failures[GC_FAILURE_PIN] + 1);
+   f2fs_i_gc_failures_write(inode, fi->i_gc_failures + 1);
 
-   if (fi->i_gc_failures[GC_FAILURE_PIN] > sbi->gc_pin_file_threshold) {
+   if (fi->i_gc_failures > sbi->gc_pin_file_threshold) {
f2fs_warn(sbi, "%s: Enable GC = ino %lx after %x GC trials",
- __func__, inode->i_ino,
- fi->i_gc_failures[GC_FAILURE_PIN]);
+ __func__, inode->i_ino, fi->i_gc_failures);
clear_inode_flag(inode, FI_PIN_FILE);
return -EAGAIN;
}
@@ -3266,7 +3264,7 @@ static int f2fs_ioc_set_pin_file(struct file *filp, 
unsigned long arg)
}
 
set_inode_flag(inode, FI_PIN_FILE);
-   ret = F2FS_I(inode)->i_gc_failures[GC_FAILURE_PIN];
+   ret = F2FS_I(inode)->i_gc_failures;
 done:
f2fs_update_time(sbi, REQ_TIME);
 out:
@@ -3281,7 +3279,7 @@ static int f2fs_ioc_get_pin_file(struct file *filp, 
unsigned long arg)
__u32 pin = 0;
 
if (is_inode_flag_set(inode, FI_PIN_FILE))
-   pin = F2FS_I(inode)->i_gc_failures[GC_FAILURE_PIN];
+   pin = F2FS_I(inode)->i_gc_failures;
return put_user(pin, (u32 __user *)arg);
 }
 
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index 1423cd27a477..9a8c2b63f56d 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -408,8 +408,7 @@ static int do_read_inode(struct inode *inode)
if (S_ISDIR(inode->i_mode))
fi->i_current_depth = le32_to_cpu(ri->i_current_depth);
else if (S_ISREG(inode->i_mode))
-   fi->i_gc_failures[GC_FAILURE_PIN] =
-   le16_to_cpu(ri->i_gc_failures);
+   fi->i_gc_failures = le16_to_cpu(ri->i_gc_failures);
fi->i_xattr_nid = le32_to_cpu(ri->i_xattr_nid);
fi->i_flags = le32_to_cpu(ri->i_flags);
if (S_ISREG(inode->i_mode))
@@ -679,8 +678,7 @@ void f2fs_update_inode(struct inode *inode, struct page 
*node_page)
ri->i_current_depth =
cpu_to_le32(F2FS_I(inode)->i_current_depth);
else if (S_ISREG(inode->i_mode))
-   ri->i_gc_failures =
-   
cpu_to_le16(F2FS_I(inode)->i_gc_failures[GC_FAILURE_PIN]);
+   ri->i_gc_failures = cpu_to_le16(F2FS_I(inode)->i_gc_failures);
ri->i_xattr_nid = cpu_to_le32(F2FS_I(in

[f2fs-dev] [PATCH 1/5] f2fs: compress: fix to update i_compr_blocks correctly

2024-05-06 Thread Chao Yu
Previously, both reserved blocks and compressed blocks were accounted in
@compr_blocks, so f2fs_i_compr_blocks_update(,compr_blocks) updated
i_compr_blocks incorrectly; fix it.

Meanwhile, for the case where all blocks in the cluster were reserved, fix to
update dn->ofs_in_node correctly.
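
The corrected accounting can be sketched as follows. This is a userspace
model with hypothetical demo_* names; it assumes slot 0 of each cluster holds
the compressed-cluster marker and is skipped when counting, which the hunk
below does not show.

```c
#include <assert.h>

enum demo_blk {
	DEMO_NULL_ADDR,		/* nothing allocated or reserved */
	DEMO_NEW_ADDR,		/* already reserved */
	DEMO_COMPRESS_ADDR,	/* cluster marker in slot 0 */
	DEMO_VALID_ADDR		/* holds compressed data */
};

struct demo_counts {
	int compr_blocks;	/* blocks holding compressed data */
	int reserved;		/* blocks that were already reserved */
	int to_reserved;	/* cluster_size - compr_blocks - reserved */
};

static struct demo_counts demo_count_cluster(const enum demo_blk *cluster,
					     int cluster_size)
{
	struct demo_counts c = { 0, 0, 0 };
	int i;

	assert(cluster_size > 0 && cluster[0] == DEMO_COMPRESS_ADDR);

	for (i = 1; i < cluster_size; i++) {
		if (cluster[i] == DEMO_NEW_ADDR)
			c.reserved++;		/* no longer counted as compressed */
		else if (cluster[i] == DEMO_VALID_ADDR)
			c.compr_blocks++;
	}
	c.to_reserved = cluster_size - c.compr_blocks - c.reserved;
	return c;
}
```

For a 4-block cluster with one compressed block and two already reserved
blocks, this yields compr_blocks = 1, reserved = 2 and to_reserved = 1, which
is the value the patch treats as "all blocks in cluster were reserved". The
old code would have counted all three blocks as compr_blocks.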

Fixes: eb8fbaa53374 ("f2fs: compress: fix to check unreleased compressed 
cluster")
Signed-off-by: Chao Yu 
---
 fs/f2fs/file.c | 21 ++---
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 1761ad125f97..6c84485687d3 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -3641,7 +3641,8 @@ static int reserve_compress_blocks(struct dnode_of_data 
*dn, pgoff_t count,
 
while (count) {
int compr_blocks = 0;
-   blkcnt_t reserved;
+   blkcnt_t reserved = 0;
+   blkcnt_t to_reserved;
int ret;
 
for (i = 0; i < cluster_size; i++) {
@@ -3661,20 +3662,26 @@ static int reserve_compress_blocks(struct dnode_of_data 
*dn, pgoff_t count,
 * fails in release_compress_blocks(), so NEW_ADDR
 * is a possible case.
 */
-   if (blkaddr == NEW_ADDR ||
-   __is_valid_data_blkaddr(blkaddr)) {
+   if (blkaddr == NEW_ADDR) {
+   reserved++;
+   continue;
+   }
+   if (__is_valid_data_blkaddr(blkaddr)) {
compr_blocks++;
continue;
}
}
 
-   reserved = cluster_size - compr_blocks;
+   to_reserved = cluster_size - compr_blocks - reserved;
 
/* for the case all blocks in cluster were reserved */
-   if (reserved == 1)
+   if (to_reserved == 1) {
+   dn->ofs_in_node += cluster_size;
goto next;
+   }
 
-   ret = inc_valid_block_count(sbi, dn->inode, &reserved, false);
+   ret = inc_valid_block_count(sbi, dn->inode,
+   &to_reserved, false);
if (unlikely(ret))
return ret;
 
@@ -3685,7 +3692,7 @@ static int reserve_compress_blocks(struct dnode_of_data 
*dn, pgoff_t count,
 
f2fs_i_compr_blocks_update(dn->inode, compr_blocks, true);
 
-   *reserved_blocks += reserved;
+   *reserved_blocks += to_reserved;
 next:
count -= cluster_size;
}
-- 
2.40.1





[f2fs-dev] [PATCH 2/5] f2fs: compress: fix error path of inc_valid_block_count()

2024-05-06 Thread Chao Yu
If inc_valid_block_count() cannot allocate all requested blocks,
it needs to release the block count in .total_valid_block_count and
the reservation blocks in the inode.
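
The reworked accounting can be sketched in userspace as below. The demo_*
names are hypothetical, locking and per-inode accounting are left out, and
the release amount is returned through an extra parameter purely for
illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct demo_sb {
	long long total_valid;	/* blocks currently accounted */
	long long avail;	/* blocks the user may consume */
};

/*
 * Try to account *count blocks. On return, *count is the number actually
 * granted and *release is the number the caller must give back.
 */
static int demo_inc_valid_block_count(struct demo_sb *sb, long long *count,
				      bool partial, long long *release)
{
	long long diff;

	assert(*count > 0);
	*release = 0;

	/* signed arithmetic: how far the request overshoots the limit */
	diff = sb->total_valid + *count - sb->avail;
	if (diff > 0) {
		if (!partial) {
			*release = *count;	/* nothing granted at all */
			return -ENOSPC;
		}
		if (diff > *count)
			diff = *count;
		*count -= diff;
		*release = diff;
		if (!*count)
			return -ENOSPC;
	}
	/* the total is only updated once the final count is known */
	sb->total_valid += *count;
	return 0;
}
```

The point of the reordering is that total_valid is never bumped for blocks
that end up being refused, so the error path has nothing to undo there and
only needs to release what the caller had reserved.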

Fixes: 54607494875e ("f2fs: compress: fix to avoid inconsistence bewteen 
i_blocks and dnode")
Signed-off-by: Chao Yu 
---
 fs/f2fs/f2fs.h | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index c876813b5532..95a40d4f778f 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -2309,7 +2309,7 @@ static inline void f2fs_i_blocks_write(struct inode *, 
block_t, bool, bool);
 static inline int inc_valid_block_count(struct f2fs_sb_info *sbi,
 struct inode *inode, blkcnt_t *count, bool 
partial)
 {
-   blkcnt_t diff = 0, release = 0;
+   long long diff = 0, release = 0;
block_t avail_user_block_count;
int ret;
 
@@ -2329,26 +2329,27 @@ static inline int inc_valid_block_count(struct 
f2fs_sb_info *sbi,
percpu_counter_add(&sbi->alloc_valid_block_count, (*count));
 
spin_lock(&sbi->stat_lock);
-   sbi->total_valid_block_count += (block_t)(*count);
-   avail_user_block_count = get_available_block_count(sbi, inode, true);
 
-   if (unlikely(sbi->total_valid_block_count > avail_user_block_count)) {
+   avail_user_block_count = get_available_block_count(sbi, inode, true);
+   diff = (long long)sbi->total_valid_block_count + *count -
+   avail_user_block_count;
+   if (unlikely(diff > 0)) {
if (!partial) {
spin_unlock(&sbi->stat_lock);
+   release = *count;
goto enospc;
}
-
-   diff = sbi->total_valid_block_count - avail_user_block_count;
if (diff > *count)
diff = *count;
*count -= diff;
release = diff;
-   sbi->total_valid_block_count -= diff;
if (!*count) {
spin_unlock(&sbi->stat_lock);
goto enospc;
}
}
+   sbi->total_valid_block_count += (block_t)(*count);
+
spin_unlock(&sbi->stat_lock);
 
if (unlikely(release)) {
-- 
2.40.1





[f2fs-dev] [PATCH 5/5] f2fs: compress: don't allow unaligned truncation on released compress inode

2024-05-06 Thread Chao Yu
An f2fs image may be corrupted by the testcase below:
- mkfs.f2fs -O extra_attr,compression -f /dev/vdb
- mount /dev/vdb /mnt/f2fs
- touch /mnt/f2fs/file
- f2fs_io setflags compression /mnt/f2fs/file
- dd if=/dev/zero of=/mnt/f2fs/file bs=4k count=4
- f2fs_io release_cblocks /mnt/f2fs/file
- truncate -s 8192 /mnt/f2fs/file
- umount /mnt/f2fs
- fsck.f2fs /dev/vdb

[ASSERT] (fsck_chk_inode_blk:1256)  --> ino: 0x5 has i_blocks: 0x0002, but 
has 0x3 blocks
[FSCK] valid_block_count matching with CP [Fail] [0x4, 0x5]
[FSCK] other corrupted bugs   [Fail]

The reason is that partial truncation assumes the compressed inode has
reserved blocks. After partial truncation, the valid block count may change
with the .i_blocks and .total_valid_block_count update, resulting in
corruption.

This patch allows only cluster-size-aligned truncation on a released
compress inode to fix this.

Fixes: c61404153eb6 ("f2fs: introduce FI_COMPRESS_RELEASED instead of using 
IMMUTABLE bit")
Signed-off-by: Chao Yu 
---
 fs/f2fs/file.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 3f0db351e976..ac9d6380e433 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -952,9 +952,14 @@ int f2fs_setattr(struct mnt_idmap *idmap, struct dentry 
*dentry,
  ATTR_GID | ATTR_TIMES_SET))))
return -EPERM;
 
-   if ((attr->ia_valid & ATTR_SIZE) &&
-   !f2fs_is_compress_backend_ready(inode))
-   return -EOPNOTSUPP;
+   if ((attr->ia_valid & ATTR_SIZE)) {
+   if (!f2fs_is_compress_backend_ready(inode))
+   return -EOPNOTSUPP;
+   if (is_inode_flag_set(inode, FI_COMPRESS_RELEASED) &&
+   (attr->ia_size %
+   F2FS_BLK_TO_BYTES(F2FS_I(inode)->i_cluster_size)))
+   return -EINVAL;
+   }
 
err = setattr_prepare(idmap, dentry, attr);
if (err)
-- 
2.40.1
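
The alignment test added to f2fs_setattr() above can be tried in isolation. A minimal sketch, assuming 4 KiB blocks; `sketch_check_truncate_size()` and `SKETCH_BLKSIZE_BITS` are hypothetical names, and the cluster size is passed in as a block count the way `i_cluster_size` is used in the patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BLKSIZE_BITS 12	/* assumption: 4 KiB blocks */

/*
 * Hypothetical stand-in for the new check: on an inode whose compressed
 * blocks were released, only a size that is a multiple of the cluster
 * size (in bytes) is accepted.
 */
static int sketch_check_truncate_size(uint64_t new_size,
				      unsigned int cluster_blocks,
				      bool compress_released)
{
	uint64_t cluster_bytes = (uint64_t)cluster_blocks << SKETCH_BLKSIZE_BITS;

	if (compress_released && cluster_bytes && (new_size % cluster_bytes))
		return -EINVAL;
	return 0;
}
```

Assuming a 4-block cluster (16384 bytes), the `truncate -s 8192` step of the test case is rejected with -EINVAL, while truncating to 16384 or to 0 is still allowed.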





[f2fs-dev] [PATCH 3/5] f2fs: compress: fix typo in f2fs_reserve_compress_blocks()

2024-05-06 Thread Chao Yu
s/released/reserved.

Signed-off-by: Chao Yu 
---
 fs/f2fs/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 6c84485687d3..e77e958a9f92 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -3785,7 +3785,7 @@ static int f2fs_reserve_compress_blocks(struct file 
*filp, unsigned long arg)
} else if (reserved_blocks &&
atomic_read(&F2FS_I(inode)->i_compr_blocks)) {
set_sbi_flag(sbi, SBI_NEED_FSCK);
-   f2fs_warn(sbi, "%s: partial blocks were released i_ino=%lx "
+   f2fs_warn(sbi, "%s: partial blocks were reserved i_ino=%lx "
"iblocks=%llu, reserved=%u, compr_blocks=%u, "
"run fsck to fix.",
__func__, inode->i_ino, inode->i_blocks,
-- 
2.40.1





[f2fs-dev] [PATCH 4/5] f2fs: compress: fix to cover {reserve, release}_compress_blocks() w/ cp_rwsem lock

2024-05-06 Thread Chao Yu
{reserve,release}_compress_blocks() must be covered by the cp_rwsem lock
to avoid racing with checkpoint; otherwise, filesystem metadata including
the blkaddr in dnodes, inode fields and .total_valid_block_count may be
corrupted after a sudden power-off (SPO).

Fixes: ef8d563f184e ("f2fs: introduce F2FS_IOC_RELEASE_COMPRESS_BLOCKS")
Fixes: c75488fb4d82 ("f2fs: introduce F2FS_IOC_RESERVE_COMPRESS_BLOCKS")
Signed-off-by: Chao Yu 
---
 fs/f2fs/file.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index e77e958a9f92..3f0db351e976 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -3570,9 +3570,12 @@ static int f2fs_release_compress_blocks(struct file 
*filp, unsigned long arg)
struct dnode_of_data dn;
pgoff_t end_offset, count;
 
+   f2fs_lock_op(sbi);
+
set_new_dnode(&dn, inode, NULL, NULL, 0);
ret = f2fs_get_dnode_of_data(&dn, page_idx, LOOKUP_NODE);
if (ret) {
+   f2fs_unlock_op(sbi);
if (ret == -ENOENT) {
page_idx = f2fs_get_next_page_offset(&dn,
page_idx);
@@ -3590,6 +3593,8 @@ static int f2fs_release_compress_blocks(struct file 
*filp, unsigned long arg)
 
f2fs_put_dnode(&dn);
 
+   f2fs_unlock_op(sbi);
+
if (ret < 0)
break;
 
@@ -3742,9 +3747,12 @@ static int f2fs_reserve_compress_blocks(struct file 
*filp, unsigned long arg)
struct dnode_of_data dn;
pgoff_t end_offset, count;
 
+   f2fs_lock_op(sbi);
+
set_new_dnode(&dn, inode, NULL, NULL, 0);
ret = f2fs_get_dnode_of_data(&dn, page_idx, LOOKUP_NODE);
if (ret) {
+   f2fs_unlock_op(sbi);
if (ret == -ENOENT) {
page_idx = f2fs_get_next_page_offset(&dn,
page_idx);
@@ -3762,6 +3770,8 @@ static int f2fs_reserve_compress_blocks(struct file 
*filp, unsigned long arg)
 
f2fs_put_dnode(&dn);
 
+   f2fs_unlock_op(sbi);
+
if (ret < 0)
break;
 
-- 
2.40.1
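
The point of the hunks above is that every exit from one loop iteration drops the lock taken at its top, including the lookup-failure path. A minimal sketch of that shape, with a plain counter standing in for cp_rwsem; all `sketch_*` names are hypothetical and the array of return codes models the result of f2fs_get_dnode_of_data().

```c
#include <errno.h>

static int sketch_lock_depth;	/* models how many times the lock is held */

static void sketch_lock_op(void)   { sketch_lock_depth++; }
static void sketch_unlock_op(void) { sketch_lock_depth--; }

/*
 * Walk n "pages". lookup_ret[i] models the dnode lookup result for page i:
 * 0 on success, -ENOENT for a hole (skipped), anything else aborts.
 */
static int sketch_walk(const int *lookup_ret, int n)
{
	int ret = 0;

	for (int i = 0; i < n; i++) {
		sketch_lock_op();

		ret = lookup_ret[i];
		if (ret) {
			/* Error path: unlock before deciding what to do. */
			sketch_unlock_op();
			if (ret == -ENOENT) {
				ret = 0;
				continue;
			}
			break;
		}

		/* ... block addresses and counters are updated here ... */

		sketch_unlock_op();
	}
	return ret;
}
```

After any call the counter is back at zero, whether the walk completed, skipped a hole, or stopped on an error.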





[f2fs-dev] [PATCH 2/3] f2fs: fix to add missing iput() in gc_data_segment()

2024-05-06 Thread Chao Yu
During gc_data_segment(), if the inode state is abnormal, the iput() call
is missing. Fix it.

Fixes: 132e3209789c ("f2fs: remove false alarm on iget failure during GC")
Fixes: 9056d6489f5a ("f2fs: fix to do sanity check on inode type during garbage 
collection")
Signed-off-by: Chao Yu 
---
 fs/f2fs/gc.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 8852814dab7f..e86c7f01539a 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1554,10 +1554,15 @@ static int gc_data_segment(struct f2fs_sb_info *sbi, 
struct f2fs_summary *sum,
int err;
 
inode = f2fs_iget(sb, dni.ino);
-   if (IS_ERR(inode) || is_bad_inode(inode) ||
-   special_file(inode->i_mode))
+   if (IS_ERR(inode))
continue;
 
+   if (is_bad_inode(inode) ||
+   special_file(inode->i_mode)) {
+   iput(inode);
+   continue;
+   }
+
err = f2fs_gc_pinned_control(inode, gc_type, segno);
if (err == -EAGAIN) {
iput(inode);
-- 
2.40.1





[f2fs-dev] [PATCH 3/3] f2fs: fix to do sanity check on i_nid for inline_data inode

2024-05-06 Thread Chao Yu
syzbot reports a f2fs bug as below:

------------[ cut here ]------------
kernel BUG at fs/f2fs/inline.c:258!
CPU: 1 PID: 34 Comm: kworker/u8:2 Not tainted 
6.9.0-rc6-syzkaller-00012-g9e4bc4bcae01 #0
RIP: 0010:f2fs_write_inline_data+0x781/0x790 fs/f2fs/inline.c:258
Call Trace:
 f2fs_write_single_data_page+0xb65/0x1d60 fs/f2fs/data.c:2834
 f2fs_write_cache_pages fs/f2fs/data.c:3133 [inline]
 __f2fs_write_data_pages fs/f2fs/data.c:3288 [inline]
 f2fs_write_data_pages+0x1efe/0x3a90 fs/f2fs/data.c:3315
 do_writepages+0x35b/0x870 mm/page-writeback.c:2612
 __writeback_single_inode+0x165/0x10b0 fs/fs-writeback.c:1650
 writeback_sb_inodes+0x905/0x1260 fs/fs-writeback.c:1941
 wb_writeback+0x457/0xce0 fs/fs-writeback.c:2117
 wb_do_writeback fs/fs-writeback.c:2264 [inline]
 wb_workfn+0x410/0x1090 fs/fs-writeback.c:2304
 process_one_work kernel/workqueue.c:3254 [inline]
 process_scheduled_works+0xa12/0x17c0 kernel/workqueue.c:3335
 worker_thread+0x86d/0xd70 kernel/workqueue.c:3416
 kthread+0x2f2/0x390 kernel/kthread.c:388
 ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

The root cause is: an inline_data inode can be fuzzed so that there is a
valid blkaddr in its direct node. Once f2fs triggers background GC to
migrate the block, it hits f2fs_bug_on() during dirty page writeback.

Let's add a sanity check on the i_nid field for inline_data inodes and,
meanwhile, forbid migrating an inline_data inode's data block to fix this
issue.

Reported-by: syzbot+848062ba19c8782ca...@syzkaller.appspotmail.com
Closes: 
https://lore.kernel.org/linux-f2fs-devel/d103ce06174d7...@google.com
Signed-off-by: Chao Yu 
---
 fs/f2fs/f2fs.h   |  2 +-
 fs/f2fs/gc.c     |  6 ++++++
 fs/f2fs/inline.c | 17 ++++++++++++++++-
 fs/f2fs/inode.c  |  2 +-
 4 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index fced2b7652f4..c876813b5532 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -4146,7 +4146,7 @@ extern struct kmem_cache *f2fs_inode_entry_slab;
  * inline.c
  */
 bool f2fs_may_inline_data(struct inode *inode);
-bool f2fs_sanity_check_inline_data(struct inode *inode);
+bool f2fs_sanity_check_inline_data(struct inode *inode, struct page *ipage);
 bool f2fs_may_inline_dentry(struct inode *inode);
 void f2fs_do_read_inline_data(struct page *page, struct page *ipage);
 void f2fs_truncate_inline_inode(struct inode *inode,
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index e86c7f01539a..041957750478 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1563,6 +1563,12 @@ static int gc_data_segment(struct f2fs_sb_info *sbi, 
struct f2fs_summary *sum,
continue;
}
 
+   if (f2fs_has_inline_data(inode)) {
+   iput(inode);
+   set_sbi_flag(sbi, SBI_NEED_FSCK);
+   continue;
+   }
+
err = f2fs_gc_pinned_control(inode, gc_type, segno);
if (err == -EAGAIN) {
iput(inode);
diff --git a/fs/f2fs/inline.c b/fs/f2fs/inline.c
index ac00423f117b..067600fed3d4 100644
--- a/fs/f2fs/inline.c
+++ b/fs/f2fs/inline.c
@@ -33,11 +33,26 @@ bool f2fs_may_inline_data(struct inode *inode)
return !f2fs_post_read_required(inode);
 }
 
-bool f2fs_sanity_check_inline_data(struct inode *inode)
+static bool has_node_blocks(struct inode *inode, struct page *ipage)
+{
+   struct f2fs_inode *ri = F2FS_INODE(ipage);
+   int i;
+
+   for (i = 0; i < DEF_NIDS_PER_INODE; i++) {
+   if (ri->i_nid[i])
+   return true;
+   }
+   return false;
+}
+
+bool f2fs_sanity_check_inline_data(struct inode *inode, struct page *ipage)
 {
if (!f2fs_has_inline_data(inode))
return false;
 
+   if (has_node_blocks(inode, ipage))
+   return false;
+
if (!support_inline_data(inode))
return true;
 
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index c26effdce9aa..1423cd27a477 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -343,7 +343,7 @@ static bool sanity_check_inode(struct inode *inode, struct 
page *node_page)
}
}
 
-   if (f2fs_sanity_check_inline_data(inode)) {
+   if (f2fs_sanity_check_inline_data(inode, node_page)) {
f2fs_warn(sbi, "%s: inode (ino=%lx, mode=%u) should not have 
inline_data, run fsck to fix",
  __func__, inode->i_ino, inode->i_mode);
return false;
-- 
2.40.1
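
The i_nid scan introduced above is simple enough to exercise on its own. A minimal sketch, assuming five node-id slots per inode (mirroring DEF_NIDS_PER_INODE); `struct sketch_inode` and `sketch_has_node_blocks()` are hypothetical user-space stand-ins, not the on-disk layout.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NIDS_PER_INODE 5	/* assumption: mirrors DEF_NIDS_PER_INODE */

/* Hypothetical, trimmed-down view of the raw inode. */
struct sketch_inode {
	bool has_inline_data;
	uint32_t i_nid[SKETCH_NIDS_PER_INODE];
};

/*
 * Any non-zero slot means the inode references a direct or indirect node
 * block. A zero test gives the same answer for either byte order, so no
 * endian conversion is needed here.
 */
static bool sketch_has_node_blocks(const struct sketch_inode *ri)
{
	for (int i = 0; i < SKETCH_NIDS_PER_INODE; i++) {
		if (ri->i_nid[i])
			return true;
	}
	return false;
}
```

An inode with inline data is expected to have every slot zero; a fuzzed image that sets any of them is what the new check looks for.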





[f2fs-dev] [PATCH 1/3] f2fs: fix to release node block count in error path of f2fs_new_node_page()

2024-05-06 Thread Chao Yu
The error path fails to call dec_valid_node_count() to release the node
block count. Fix it.

Fixes: 141170b759e0 ("f2fs: fix to avoid use f2fs_bug_on() in 
f2fs_new_node_page()")
Signed-off-by: Chao Yu 
---
 fs/f2fs/node.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index b3de6d6cdb02..ae39971825bc 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1313,15 +1313,14 @@ struct page *f2fs_new_node_page(struct dnode_of_data 
*dn, unsigned int ofs)
 
 #ifdef CONFIG_F2FS_CHECK_FS
err = f2fs_get_node_info(sbi, dn->nid, &new_ni, false);
-   if (err) {
-   dec_valid_node_count(sbi, dn->inode, !ofs);
-   goto fail;
-   }
+   if (err)
+   goto out_dec;
+
if (unlikely(new_ni.blk_addr != NULL_ADDR)) {
err = -EFSCORRUPTED;
set_sbi_flag(sbi, SBI_NEED_FSCK);
f2fs_handle_error(sbi, ERROR_INVALID_BLKADDR);
-   goto fail;
+   goto out_dec;
}
 #endif
new_ni.nid = dn->nid;
@@ -1345,7 +1344,8 @@ struct page *f2fs_new_node_page(struct dnode_of_data *dn, 
unsigned int ofs)
if (ofs == 0)
inc_valid_inode_count(sbi);
return page;
-
+out_dec:
+   dec_valid_node_count(sbi, dn->inode, !ofs);
 fail:
clear_node_page_dirty(page);
f2fs_put_page(page, 1);
-- 
2.40.1
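
The stacked labels in this patch (out_dec falling through into fail) are the usual goto-unwind shape: each label undoes one step, in reverse order of acquisition. A minimal sketch with two counters standing in for the node count and the grabbed page; every `sketch_*` name is hypothetical and the error codes are stand-ins.

```c
#include <errno.h>
#include <stdbool.h>

static int sketch_node_count;	/* models the valid node counter */
static int sketch_pages_held;	/* models the grabbed node page */

static int sketch_new_node(bool inc_fails, bool lookup_fails, bool addr_in_use)
{
	int err;

	sketch_pages_held++;		/* step 1: grab the page */

	if (inc_fails) {
		err = -ENOSPC;
		goto fail;		/* nothing counted yet */
	}
	sketch_node_count++;		/* step 2: account the node */

	if (lookup_fails) {
		err = -EIO;
		goto out_dec;
	}
	if (addr_in_use) {
		err = -EINVAL;		/* stand-in for -EFSCORRUPTED */
		goto out_dec;
	}
	return 0;

out_dec:
	sketch_node_count--;		/* undo step 2 */
fail:
	sketch_pages_held--;		/* undo step 1 */
	return err;
}
```

Whichever check fails, both counters return to their values from before the call; only the success path keeps the page and the count.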





Re: [f2fs-dev] [PATCH] mkfs.f2fs: align each device to zone size

2024-04-29 Thread Chao Yu

On 2024/4/10 20:38, Sheng Yong wrote:

For multiple device, each device should be aligned to zone size, instead
of aligning the total size.

Signed-off-by: Sheng Yong 


Reviewed-by: Chao Yu 

Thanks,




Re: [f2fs-dev] [PATCH v2] f2fs: fix block migration when section is not aligned to pow2

2024-04-29 Thread Chao Yu

On 2024/4/29 11:51, Wu Bo wrote:

As for zoned-UFS, f2fs section size is forced to zone size. And zone
size may not be aligned to pow2.

Fixes: 859fca6b706e ("f2fs: swap: support migrating swapfile in aligned write 
mode")
Signed-off-by: Liao Yuanhong 
Signed-off-by: Wu Bo 


Reviewed-by: Chao Yu 

Thanks,




Re: [f2fs-dev] [PATCH] f2fs:remove the restriction on zone sector being align to pow2

2024-04-28 Thread Chao Yu

On 2024/4/28 19:14, Liao Yuanhong wrote:

For zoned-UFS, sector size may not be aligned to pow2, so we need to remove
the pow2 limitation.

Signed-off-by: Liao Yuanhong 
---
  drivers/md/dm-table.c | 4 ----
  1 file changed, 4 deletions(-)

diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 41f1d731ae5a..823f2f6a2d53 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c


Hi, please discuss this in dm-de...@lists.linux.dev, thanks.

Thanks,


@@ -1663,10 +1663,6 @@ static int validate_hardware_zoned(struct dm_table *t, 
bool zoned,
return -EINVAL;
}
  
-	/* Check zone size validity and compatibility */

-   if (!zone_sectors || !is_power_of_2(zone_sectors))
-   return -EINVAL;
-
if (dm_table_any_dev_attr(t, device_not_matches_zone_sectors, 
&zone_sectors)) {
DMERR("%s: zone sectors is not consistent across all zoned 
devices",
  dm_device_name(t->md));
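
For reference, the removed test is the common "exactly one bit set" check, and a zone size that fails it can still be handled with division instead of shift-and-mask. A minimal sketch; the `sketch_*` helpers are hypothetical and are not how dm or f2fs compute zone numbers.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when exactly one bit is set, i.e. n is a power of two. */
static bool sketch_is_power_of_2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Zone lookup that works for any non-zero zone size. */
static uint64_t sketch_zone_index(uint64_t sector, uint64_t zone_sectors)
{
	return sector / zone_sectors;
}

static uint64_t sketch_zone_offset(uint64_t sector, uint64_t zone_sectors)
{
	return sector % zone_sectors;
}
```

A made-up zone size of 393216 sectors (3 * 131072) fails the power-of-two test, yet sector 1000000 still maps cleanly to zone 2 at offset 213568.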





Re: [f2fs-dev] [PATCH 3/3] f2fs: fix false alarm on invalid block address

2024-04-28 Thread Chao Yu

On 2024/4/28 9:23, Daeho Jeong wrote:

I have a question. Is it okay for META_GENERIC?


It seems all users of META_GENERIC comes from IO paths:
a) f2fs_merge_page_bio
b) f2fs_submit_page_bio
c) f2fs_submit_page_write - verify_fio_blkaddr

They are all impossible cases? so it's fine to record the error
for this case?

Thanks,




Re: [f2fs-dev] [PATCH v2 3/8] f2fs: drop usage of page_index

2024-04-27 Thread Chao Yu

On 2024/4/24 6:58, Matthew Wilcox wrote:

On Wed, Apr 24, 2024 at 01:03:34AM +0800, Kairui Song wrote:

@@ -4086,8 +4086,7 @@ void f2fs_clear_page_cache_dirty_tag(struct page *page)
unsigned long flags;
  
  	xa_lock_irqsave(&mapping->i_pages, flags);

-   __xa_clear_mark(&mapping->i_pages, page_index(page),
-   PAGECACHE_TAG_DIRTY);
+   __xa_clear_mark(&mapping->i_pages, page->index, PAGECACHE_TAG_DIRTY);
xa_unlock_irqrestore(&mapping->i_pages, flags);
  }


I just sent a patch which is going to conflict with this:

https://lore.kernel.org/linux-mm/20240423225552.4113447-3-wi...@infradead.org/

Chao Yu, Jaegeuk Kim; what are your plans for converting f2fs to use
folios?  This is getting quite urgent.


Hi Matthew,

I've converted .read_folio and .readahead of f2fs to use folios with the
patchset below, and I'll take a look at how to support and enable large
folios...

https://lore.kernel.org/linux-f2fs-devel/20240422062417.2421616-1-c...@kernel.org/

Thanks,





[f2fs-dev] [PATCH v2] f2fs: zone: fix to don't trigger OPU on pinfile for direct IO

2024-04-27 Thread Chao Yu
Otherwise, it breaks pinfile's semantics.

Cc: Daeho Jeong 
Signed-off-by: Chao Yu 
---
v2:
- fix to disallow OPU on pinfile no matter what device type f2fs uses.
 fs/f2fs/data.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index d8e4434e8801..56600dd43834 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1595,8 +1595,9 @@ int f2fs_map_blocks(struct inode *inode, struct 
f2fs_map_blocks *map, int flag)
}
 
/* use out-place-update for direct IO under LFS mode */
-   if (map->m_may_create &&
-   (is_hole || (f2fs_lfs_mode(sbi) && flag == F2FS_GET_BLOCK_DIO))) {
+   if (map->m_may_create && (is_hole ||
+   (flag == F2FS_GET_BLOCK_DIO && f2fs_lfs_mode(sbi) &&
+   !f2fs_is_pinned_file(inode)))) {
if (unlikely(f2fs_cp_error(sbi))) {
err = -EIO;
goto sync_out;
-- 
2.40.1
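
The v2 condition reduces to a small boolean predicate, which makes the pinfile exception easy to check. A minimal sketch; `sketch_use_opu()` is a hypothetical helper whose flags stand in for map->m_may_create, is_hole, flag == F2FS_GET_BLOCK_DIO, f2fs_lfs_mode() and f2fs_is_pinned_file().

```c
#include <stdbool.h>

/*
 * Out-of-place update (OPU) is chosen for holes, or for direct IO under
 * LFS mode unless the file is pinned.
 */
static bool sketch_use_opu(bool may_create, bool is_hole, bool is_dio,
			   bool lfs_mode, bool pinned)
{
	return may_create &&
	       (is_hole || (is_dio && lfs_mode && !pinned));
}
```

A direct IO overwrite of a pinned file under LFS mode no longer takes the OPU branch, while a hole in the same file still allocates, and the device type plays no part in the decision.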





Re: [f2fs-dev] [PATCH] f2fs: fix block migration when section is not aligned to pow2

2024-04-26 Thread Chao Yu

On 2024/4/26 18:41, Wu Bo wrote:

As for zoned-UFS, f2fs section size is forced to zone size. And zone
size may not be aligned to pow2.

Fixes: 859fca6b706e ("f2fs: swap: support migrating swapfile in aligned write 
mode")
Signed-off-by: Liao Yuanhong 
Signed-off-by: Wu Bo 


Reviewed-by: Chao Yu 

Thanks,




Re: [f2fs-dev] [PATCH] f2fs: zone: fix to don't trigger OPU on pinfile for direct IO

2024-04-26 Thread Chao Yu

On 2024/4/26 19:30, Zhiguo Niu wrote:

Dear Chao,

On Fri, Apr 26, 2024 at 6:37 PM Chao Yu  wrote:


Otherwise, it breaks pinfile's semantics.

Cc: Daeho Jeong 
Signed-off-by: Chao Yu 
---
  fs/f2fs/data.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index bee1e45f76b8..e29000d83d52 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1596,7 +1596,8 @@ int f2fs_map_blocks(struct inode *inode, struct 
f2fs_map_blocks *map, int flag)

 /* use out-place-update for direct IO under LFS mode */
 if (map->m_may_create &&
-   (is_hole || (f2fs_lfs_mode(sbi) && flag == F2FS_GET_BLOCK_DIO))) {
+   (is_hole || (flag == F2FS_GET_BLOCK_DIO && (f2fs_lfs_mode(sbi) &&
+   (!f2fs_sb_has_blkzoned(sbi) || !f2fs_is_pinned_file(inode)))))) {

Excuse me, I have a small question: shouldn't pinned files avoid being
written in OPU mode regardless of device type (conventional or zoned)?


Agreed, so it looks we need remove !f2fs_sb_has_blkzoned condition here...

Thanks,


thanks!

 if (unlikely(f2fs_cp_error(sbi))) {
 err = -EIO;
 goto sync_out;
--
2.40.1








Re: [f2fs-dev] [PATCH] f2fs: zone: fix to don't trigger OPU on pinfile for direct IO

2024-04-26 Thread Chao Yu

On 2024/4/26 22:14, Daeho Jeong wrote:

On Fri, Apr 26, 2024 at 3:35 AM Chao Yu  wrote:


Otherwise, it breaks pinfile's semantics.

Cc: Daeho Jeong 
Signed-off-by: Chao Yu 
---
  fs/f2fs/data.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index bee1e45f76b8..e29000d83d52 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1596,7 +1596,8 @@ int f2fs_map_blocks(struct inode *inode, struct 
f2fs_map_blocks *map, int flag)

 /* use out-place-update for direct IO under LFS mode */
 if (map->m_may_create &&
-   (is_hole || (f2fs_lfs_mode(sbi) && flag == F2FS_GET_BLOCK_DIO))) {
+   (is_hole || (flag == F2FS_GET_BLOCK_DIO && (f2fs_lfs_mode(sbi) &&
+   (!f2fs_sb_has_blkzoned(sbi) || !f2fs_is_pinned_file(inode)))))) {
 if (unlikely(f2fs_cp_error(sbi))) {
 err = -EIO;
 goto sync_out;
--
2.40.1


So, we block overwrite io for the pinfile here.


I guess you mean we blocked append write for pinfile, right?



static ssize_t f2fs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)

{
...
 if (f2fs_is_pinned_file(inode) &&
 !f2fs_overwrite_io(inode, pos, count)) {


If !f2fs_overwrite_io() is true, it means it may trigger append write on
pinfile?

Thanks,


 ret = -EIO;
 goto out_unlock;
 }









Re: [f2fs-dev] [syzbot] [f2fs?] KASAN: slab-out-of-bounds Read in f2fs_get_node_info

2024-04-26 Thread Chao Yu

#syz test git://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git 
bugfix/syzbot

On 2024/4/25 15:59, syzbot wrote:

Hello,

syzbot found the following issue on:

HEAD commit:ed30a4a51bb1 Linux 6.9-rc5
git tree:   upstream
console+strace: https://syzkaller.appspot.com/x/log.txt?x=1116bc3098
kernel config:  https://syzkaller.appspot.com/x/.config?x=5a05c230e142f2bc
dashboard link: https://syzkaller.appspot.com/bug?extid=3694e283cf5c40df6d14
compiler:   Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 
2.40
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1128486b18
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1516bc3098

Downloadable assets:
disk image: 
https://storage.googleapis.com/syzbot-assets/7a2e1a02882c/disk-ed30a4a5.raw.xz
vmlinux: 
https://storage.googleapis.com/syzbot-assets/329966999344/vmlinux-ed30a4a5.xz
kernel image: 
https://storage.googleapis.com/syzbot-assets/1befbdf4dcac/bzImage-ed30a4a5.xz
mounted in repro: 
https://storage.googleapis.com/syzbot-assets/42ddf2738cf7/mount_0.gz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+3694e283cf5c40df6...@syzkaller.appspotmail.com

F2FS-fs (loop0): Mounted with checkpoint version = 48b305e4
==
BUG: KASAN: slab-out-of-bounds in f2fs_test_bit fs/f2fs/f2fs.h:2933 [inline]
BUG: KASAN: slab-out-of-bounds in current_nat_addr fs/f2fs/node.h:213 [inline]
BUG: KASAN: slab-out-of-bounds in f2fs_get_node_info+0xece/0x1200 
fs/f2fs/node.c:600
Read of size 1 at addr 88807a58c76c by task syz-executor280/5076

CPU: 1 PID: 5076 Comm: syz-executor280 Not tainted 6.9.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
03/27/2024
Call Trace:
  
  __dump_stack lib/dump_stack.c:88 [inline]
  dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
  print_address_description mm/kasan/report.c:377 [inline]
  print_report+0x169/0x550 mm/kasan/report.c:488
  kasan_report+0x143/0x180 mm/kasan/report.c:601
  f2fs_test_bit fs/f2fs/f2fs.h:2933 [inline]
  current_nat_addr fs/f2fs/node.h:213 [inline]
  f2fs_get_node_info+0xece/0x1200 fs/f2fs/node.c:600
  f2fs_xattr_fiemap fs/f2fs/data.c:1848 [inline]
  f2fs_fiemap+0x55d/0x1ee0 fs/f2fs/data.c:1925
  ioctl_fiemap fs/ioctl.c:220 [inline]
  do_vfs_ioctl+0x1c07/0x2e50 fs/ioctl.c:838
  __do_sys_ioctl fs/ioctl.c:902 [inline]
  __se_sys_ioctl+0x81/0x170 fs/ioctl.c:890
  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
  do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
  entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f60d34ae739
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 61 17 00 00 90 48 89 f8 48 89 f7 48 89 d6 
48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 
c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:7ffc9f2f1148 EFLAGS: 0246 ORIG_RAX: 0010
RAX: ffda RBX: 7ffc9f2f1318 RCX: 7f60d34ae739
RDX: 2040 RSI: c020660b RDI: 0004
RBP: 7f60d3527610 R08:  R09: 7ffc9f2f1318
R10: 551a R11: 0246 R12: 0001
R13: 7ffc9f2f1308 R14: 0001 R15: 0001
  

Allocated by task 5076:
  kasan_save_stack mm/kasan/common.c:47 [inline]
  kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
  poison_kmalloc_redzone mm/kasan/common.c:370 [inline]
  __kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:387
  kasan_kmalloc include/linux/kasan.h:211 [inline]
  __do_kmalloc_node mm/slub.c:3966 [inline]
  __kmalloc_node_track_caller+0x24e/0x4e0 mm/slub.c:3986
  kmemdup+0x2a/0x60 mm/util.c:131
  init_node_manager fs/f2fs/node.c:3268 [inline]
  f2fs_build_node_manager+0x8cc/0x2870 fs/f2fs/node.c:3329
  f2fs_fill_super+0x583c/0x8120 fs/f2fs/super.c:4540
  mount_bdev+0x20a/0x2d0 fs/super.c:1658
  legacy_get_tree+0xee/0x190 fs/fs_context.c:662
  vfs_get_tree+0x90/0x2a0 fs/super.c:1779
  do_new_mount+0x2be/0xb40 fs/namespace.c:3352
  do_mount fs/namespace.c:3692 [inline]
  __do_sys_mount fs/namespace.c:3898 [inline]
  __se_sys_mount+0x2d9/0x3c0 fs/namespace.c:3875
  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
  do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
  entry_SYSCALL_64_after_hwframe+0x77/0x7f

The buggy address belongs to the object at 88807a58c700
  which belongs to the cache kmalloc-64 of size 64
The buggy address is located 44 bytes to the right of
  allocated 64-byte region [88807a58c700, 88807a58c740)

The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping: index:0x0 pfn:0x7a58c
flags: 0xfff8000800(slab|node=0|zone=1|lastcpupid=0xfff)
page_type: 0x()
raw: 00fff8000800 888015041640 eaaa6400 dead0004
raw:  00200020 0001 
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated

[f2fs-dev] [PATCH] f2fs: zone: fix to don't trigger OPU on pinfile for direct IO

2024-04-26 Thread Chao Yu
Otherwise, it breaks pinfile's semantics.

Cc: Daeho Jeong 
Signed-off-by: Chao Yu 
---
 fs/f2fs/data.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index bee1e45f76b8..e29000d83d52 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1596,7 +1596,8 @@ int f2fs_map_blocks(struct inode *inode, struct 
f2fs_map_blocks *map, int flag)
 
/* use out-place-update for direct IO under LFS mode */
if (map->m_may_create &&
-   (is_hole || (f2fs_lfs_mode(sbi) && flag == F2FS_GET_BLOCK_DIO))) {
+   (is_hole || (flag == F2FS_GET_BLOCK_DIO && (f2fs_lfs_mode(sbi) &&
+   (!f2fs_sb_has_blkzoned(sbi) || !f2fs_is_pinned_file(inode)))))) {
if (unlikely(f2fs_cp_error(sbi))) {
err = -EIO;
goto sync_out;
-- 
2.40.1





Re: [f2fs-dev] [PATCH] f2fs: remove redundant parameter in is_next_segment_free()

2024-04-26 Thread Chao Yu

On 2024/4/25 22:55, Yifan Zhao wrote:

is_next_segment_free() takes a redundant `type` parameter. Remove it.

Signed-off-by: Yifan Zhao 


Reviewed-by: Chao Yu 

Thanks,




Re: [f2fs-dev] [PATCH 2/2] f2fs: remove unnecessary block size check in init_f2fs_fs()

2024-04-26 Thread Chao Yu

On 2024/4/16 19:12, Zhiguo Niu wrote:

On Tue, Apr 16, 2024 at 3:22 PM Chao Yu  wrote:


After commit d7e9a9037de2 ("f2fs: Support Block Size == Page Size"),
F2FS_BLKSIZE equals PAGE_SIZE, so remove the unnecessary check.

Signed-off-by: Chao Yu 
---
  fs/f2fs/super.c | 6 ------
  1 file changed, 6 deletions(-)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 6d1e4fc629e2..32aa6d6fa871 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -4933,12 +4933,6 @@ static int __init init_f2fs_fs(void)
  {
 int err;

-   if (PAGE_SIZE != F2FS_BLKSIZE) {
-   printk("F2FS not supported on PAGE_SIZE(%lu) != 
BLOCK_SIZE(%lu)\n",
-   PAGE_SIZE, F2FS_BLKSIZE);
-   return -EINVAL;
-   }
-
 err = init_inodecache();
 if (err)
 goto fail;

Dear Chao,

Can you help modify the following  comment msg together with this patch?
They are also related to commit d7e9a9037de2 ("f2fs: Support Block
Size == Page Size").
If you think there is a more suitable description, please help modify
it directly.


Zhiguo,

I missed replying to this. I guess you can update
"f2fs: fix some ambiguous comments".


thanks!

diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h
index a357287..241e7b18 100644
--- a/include/linux/f2fs_fs.h
+++ b/include/linux/f2fs_fs.h
@@ -394,7 +394,8 @@ struct f2fs_nat_block {

  /*
   * F2FS uses 4 bytes to represent block address. As a result, supported size 
of
- * disk is 16 TB and it equals to 16 * 1024 * 1024 / 2 segments.
+ * disk is 16 TB for a 4K page size and 64 TB for a 16K page size and it equals


disk is 16 TB for 4K size block and 64 TB for 16K size block and it equals
to (1 << 32) / 512 segments.

#define F2FS_MAX_SEGMENT((1 << 32) / 512)

Thanks,


+ * to 16 * 1024 * 1024 / 2 segments.
   */
  #define F2FS_MAX_SEGMENT   ((16 * 1024 * 1024) / 2)

@@ -424,8 +425,10 @@ struct f2fs_sit_block {
  /*
   * For segment summary
   *
- * One summary block contains exactly 512 summary entries, which represents
- * exactly one segment by default. Not allow to change the basic units.
+ * One summary block with 4KB size contains exactly 512 summary entries, which
+ * represents exactly one segment with 2MB size.
+ * Similarly, in the case of 16k block size, it represents one
segment with 8MB size.
+ * Not allow to change the basic units.
   *
   * NOTE: For initializing fields, you must use set_summary
   *
@@ -556,6 +559,7 @@ struct f2fs_summary_block {

  /*
   * space utilization of regular dentry and inline dentry (w/o extra
reservation)
+ * when block size is 4KB.




--
2.40.1
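
The figures in the comment under discussion can be checked with a few lines of arithmetic. A minimal sketch, assuming 512 blocks per segment as the quoted comment states; the `sketch_*` names are hypothetical. Note that a literal `1 << 32` on a plain int overflows in C, so the sketch uses a 64-bit constant.

```c
#include <stdint.h>

#define SKETCH_BLKS_PER_SEG 512ULL	/* blocks per segment, per the comment */

/* A 4-byte block address can name 2^32 distinct blocks. */
static uint64_t sketch_max_blocks(void)
{
	return 1ULL << 32;
}

/* Segment count is independent of the block size. */
static uint64_t sketch_max_segments(void)
{
	return sketch_max_blocks() / SKETCH_BLKS_PER_SEG;
}

/* Largest addressable device size in bytes for a given block size. */
static uint64_t sketch_max_bytes(uint64_t block_size)
{
	return sketch_max_blocks() * block_size;
}

/* Segment size in bytes for a given block size. */
static uint64_t sketch_segment_bytes(uint64_t block_size)
{
	return SKETCH_BLKS_PER_SEG * block_size;
}
```

Both spellings of the segment limit agree: (1 << 32) / 512 and 16 * 1024 * 1024 / 2 are each 8388608. With 4 KiB blocks that gives 16 TiB in 2 MiB segments, and with 16 KiB blocks 64 TiB in 8 MiB segments.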








[f2fs-dev] [PATCH] f2fs: fix to avoid allocating WARM_DATA segment for direct IO

2024-04-26 Thread Chao Yu
If active_log is not 6, the WARM_DATA segment is never used, so let's
avoid allocating a WARM_DATA segment for direct IO.

Signed-off-by: Yunlei He 
Signed-off-by: Chao Yu 
---
 fs/f2fs/data.c|  3 ++-
 fs/f2fs/f2fs.h|  2 +-
 fs/f2fs/file.c|  5 +++--
 fs/f2fs/segment.c | 11 +++++++++--
 4 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index bee1e45f76b8..0c516c653f05 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -4179,7 +4179,8 @@ static int f2fs_iomap_begin(struct inode *inode, loff_t 
offset, loff_t length,
map.m_lblk = bytes_to_blks(inode, offset);
map.m_len = bytes_to_blks(inode, offset + length - 1) - map.m_lblk + 1;
map.m_next_pgofs = &next_pgofs;
-   map.m_seg_type = f2fs_rw_hint_to_seg_type(inode->i_write_hint);
+   map.m_seg_type = f2fs_rw_hint_to_seg_type(F2FS_I_SB(inode),
+   inode->i_write_hint);
if (flags & IOMAP_WRITE)
map.m_may_create = true;
 
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index e8ff301eaf32..6dd50a6075c0 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -3747,7 +3747,7 @@ int f2fs_build_segment_manager(struct f2fs_sb_info *sbi);
 void f2fs_destroy_segment_manager(struct f2fs_sb_info *sbi);
 int __init f2fs_create_segment_manager_caches(void);
 void f2fs_destroy_segment_manager_caches(void);
-int f2fs_rw_hint_to_seg_type(enum rw_hint hint);
+int f2fs_rw_hint_to_seg_type(struct f2fs_sb_info *sbi, enum rw_hint hint);
 enum rw_hint f2fs_io_type_to_rw_hint(struct f2fs_sb_info *sbi,
enum page_type type, enum temp_type temp);
 unsigned int f2fs_usable_segs_in_sec(struct f2fs_sb_info *sbi,
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 856a5d3bd6bf..23601d747716 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -4643,7 +4643,8 @@ static int f2fs_preallocate_blocks(struct kiocb *iocb, 
struct iov_iter *iter,
 
map.m_may_create = true;
if (dio) {
-   map.m_seg_type = f2fs_rw_hint_to_seg_type(inode->i_write_hint);
+   map.m_seg_type = f2fs_rw_hint_to_seg_type(sbi,
+   inode->i_write_hint);
flag = F2FS_GET_BLOCK_PRE_DIO;
} else {
map.m_seg_type = NO_CHECK_TYPE;
@@ -4696,7 +4697,7 @@ static void f2fs_dio_write_submit_io(const struct 
iomap_iter *iter,
 {
struct inode *inode = iter->inode;
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
-   int seg_type = f2fs_rw_hint_to_seg_type(inode->i_write_hint);
+   int seg_type = f2fs_rw_hint_to_seg_type(sbi, inode->i_write_hint);
enum temp_type temp = f2fs_get_segment_temp(seg_type);
 
bio->bi_write_hint = f2fs_io_type_to_rw_hint(sbi, DATA, temp);
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 8313d6aeaf41..94f3380be04c 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -3358,8 +3358,14 @@ int f2fs_trim_fs(struct f2fs_sb_info *sbi, struct 
fstrim_range *range)
return err;
 }
 
-int f2fs_rw_hint_to_seg_type(enum rw_hint hint)
+int f2fs_rw_hint_to_seg_type(struct f2fs_sb_info *sbi, enum rw_hint hint)
 {
+   if (F2FS_OPTION(sbi).active_logs == 2)
+   return CURSEG_HOT_DATA;
+   else if (F2FS_OPTION(sbi).active_logs == 4)
+   return CURSEG_COLD_DATA;
+
+   /* active_log == 6 */
switch (hint) {
case WRITE_LIFE_SHORT:
return CURSEG_HOT_DATA;
@@ -3499,7 +3505,8 @@ static int __get_segment_type_6(struct f2fs_io_info *fio)
is_inode_flag_set(inode, FI_HOT_DATA) ||
f2fs_is_cow_file(inode))
return CURSEG_HOT_DATA;
-   return f2fs_rw_hint_to_seg_type(inode->i_write_hint);
+   return f2fs_rw_hint_to_seg_type(F2FS_I_SB(inode),
+   inode->i_write_hint);
} else {
if (IS_DNODE(fio->page))
return is_cold_node(fio->page) ? CURSEG_WARM_NODE :
-- 
2.40.1
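
Note on the hunk above: the new selection order can be modeled in userspace
as follows. This is only a sketch under stated assumptions. The enums are
stand-ins rather than the kernel definitions, seg_type_for_hint() is a
hypothetical name, and the mappings for hints other than the short-lifetime
one are assumed, because the quoted hunk ends after the first case.

```c
#include <assert.h>

/* Stand-in enums; the numeric values are illustrative, not the kernel's. */
enum hint_model { HINT_NOT_SET, HINT_SHORT, HINT_MEDIUM, HINT_LONG, HINT_EXTREME };
enum seg_model { SEG_HOT_DATA, SEG_WARM_DATA, SEG_COLD_DATA };

/*
 * Hypothetical model of f2fs_rw_hint_to_seg_type() after the patch:
 * with 2 or 4 active logs the write hint is ignored, and only the
 * 6-log configuration consults it.
 */
static enum seg_model seg_type_for_hint(int active_logs, enum hint_model hint)
{
	/* The patch only distinguishes these three configurations. */
	assert(active_logs == 2 || active_logs == 4 || active_logs == 6);

	if (active_logs == 2)
		return SEG_HOT_DATA;
	if (active_logs == 4)
		return SEG_COLD_DATA;

	/* active_logs == 6 */
	switch (hint) {
	case HINT_SHORT:
		return SEG_HOT_DATA;	/* shown in the hunk */
	case HINT_EXTREME:
		return SEG_COLD_DATA;	/* assumption, not shown in the hunk */
	default:
		return SEG_WARM_DATA;	/* assumption, not shown in the hunk */
	}
}
```

Passing the active_logs value explicitly mirrors why the patch adds the sbi
argument: the mount option lives in the superblock info, not in the inode.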



___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


Re: [f2fs-dev] [syzbot] [f2fs?] KASAN: slab-out-of-bounds Read in f2fs_get_node_info

2024-04-25 Thread Chao Yu

#syz test git://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git 
bugfix/syzbot

On 2024/4/25 15:59, syzbot wrote:

Hello,

syzbot found the following issue on:

HEAD commit:ed30a4a51bb1 Linux 6.9-rc5
git tree:   upstream
console+strace: https://syzkaller.appspot.com/x/log.txt?x=1116bc3098
kernel config:  https://syzkaller.appspot.com/x/.config?x=5a05c230e142f2bc
dashboard link: https://syzkaller.appspot.com/bug?extid=3694e283cf5c40df6d14
compiler:   Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 
2.40
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1128486b18
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1516bc3098

Downloadable assets:
disk image: 
https://storage.googleapis.com/syzbot-assets/7a2e1a02882c/disk-ed30a4a5.raw.xz
vmlinux: 
https://storage.googleapis.com/syzbot-assets/329966999344/vmlinux-ed30a4a5.xz
kernel image: 
https://storage.googleapis.com/syzbot-assets/1befbdf4dcac/bzImage-ed30a4a5.xz
mounted in repro: 
https://storage.googleapis.com/syzbot-assets/42ddf2738cf7/mount_0.gz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+3694e283cf5c40df6...@syzkaller.appspotmail.com

F2FS-fs (loop0): Mounted with checkpoint version = 48b305e4
==
BUG: KASAN: slab-out-of-bounds in f2fs_test_bit fs/f2fs/f2fs.h:2933 [inline]
BUG: KASAN: slab-out-of-bounds in current_nat_addr fs/f2fs/node.h:213 [inline]
BUG: KASAN: slab-out-of-bounds in f2fs_get_node_info+0xece/0x1200 
fs/f2fs/node.c:600
Read of size 1 at addr 88807a58c76c by task syz-executor280/5076

CPU: 1 PID: 5076 Comm: syz-executor280 Not tainted 6.9.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
03/27/2024
Call Trace:
  
  __dump_stack lib/dump_stack.c:88 [inline]
  dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
  print_address_description mm/kasan/report.c:377 [inline]
  print_report+0x169/0x550 mm/kasan/report.c:488
  kasan_report+0x143/0x180 mm/kasan/report.c:601
  f2fs_test_bit fs/f2fs/f2fs.h:2933 [inline]
  current_nat_addr fs/f2fs/node.h:213 [inline]
  f2fs_get_node_info+0xece/0x1200 fs/f2fs/node.c:600
  f2fs_xattr_fiemap fs/f2fs/data.c:1848 [inline]
  f2fs_fiemap+0x55d/0x1ee0 fs/f2fs/data.c:1925
  ioctl_fiemap fs/ioctl.c:220 [inline]
  do_vfs_ioctl+0x1c07/0x2e50 fs/ioctl.c:838
  __do_sys_ioctl fs/ioctl.c:902 [inline]
  __se_sys_ioctl+0x81/0x170 fs/ioctl.c:890
  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
  do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
  entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f60d34ae739
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 61 17 00 00 90 48 89 f8 48 89 f7 48 89 d6 
48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 
c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:7ffc9f2f1148 EFLAGS: 0246 ORIG_RAX: 0010
RAX: ffda RBX: 7ffc9f2f1318 RCX: 7f60d34ae739
RDX: 2040 RSI: c020660b RDI: 0004
RBP: 7f60d3527610 R08:  R09: 7ffc9f2f1318
R10: 551a R11: 0246 R12: 0001
R13: 7ffc9f2f1308 R14: 0001 R15: 0001
  

Allocated by task 5076:
  kasan_save_stack mm/kasan/common.c:47 [inline]
  kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
  poison_kmalloc_redzone mm/kasan/common.c:370 [inline]
  __kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:387
  kasan_kmalloc include/linux/kasan.h:211 [inline]
  __do_kmalloc_node mm/slub.c:3966 [inline]
  __kmalloc_node_track_caller+0x24e/0x4e0 mm/slub.c:3986
  kmemdup+0x2a/0x60 mm/util.c:131
  init_node_manager fs/f2fs/node.c:3268 [inline]
  f2fs_build_node_manager+0x8cc/0x2870 fs/f2fs/node.c:3329
  f2fs_fill_super+0x583c/0x8120 fs/f2fs/super.c:4540
  mount_bdev+0x20a/0x2d0 fs/super.c:1658
  legacy_get_tree+0xee/0x190 fs/fs_context.c:662
  vfs_get_tree+0x90/0x2a0 fs/super.c:1779
  do_new_mount+0x2be/0xb40 fs/namespace.c:3352
  do_mount fs/namespace.c:3692 [inline]
  __do_sys_mount fs/namespace.c:3898 [inline]
  __se_sys_mount+0x2d9/0x3c0 fs/namespace.c:3875
  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
  do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
  entry_SYSCALL_64_after_hwframe+0x77/0x7f

The buggy address belongs to the object at 88807a58c700
  which belongs to the cache kmalloc-64 of size 64
The buggy address is located 44 bytes to the right of
  allocated 64-byte region [88807a58c700, 88807a58c740)

The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping: index:0x0 pfn:0x7a58c
flags: 0xfff8000800(slab|node=0|zone=1|lastcpupid=0xfff)
page_type: 0x()
raw: 00fff8000800 888015041640 eaaa6400 dead0004
raw:  00200020 0001 
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated

Re: [f2fs-dev] [PATCH] f2fs: use helper to print zone condition

2024-04-25 Thread Chao Yu

On 2024/4/23 19:27, Wu Bo wrote:

To make the code cleaner, use blk_zone_cond_str() to print debug information.

Signed-off-by: Wu Bo 


Reviewed-by: Chao Yu 

Thanks,




[f2fs-dev] [PATCH] f2fs: fix to do sanity check on i_xattr_nid in sanity_check_inode()

2024-04-25 Thread Chao Yu
syzbot reports a kernel bug as below:

F2FS-fs (loop0): Mounted with checkpoint version = 48b305e4
==
BUG: KASAN: slab-out-of-bounds in f2fs_test_bit fs/f2fs/f2fs.h:2933 [inline]
BUG: KASAN: slab-out-of-bounds in current_nat_addr fs/f2fs/node.h:213 [inline]
BUG: KASAN: slab-out-of-bounds in f2fs_get_node_info+0xece/0x1200 
fs/f2fs/node.c:600
Read of size 1 at addr 88807a58c76c by task syz-executor280/5076

CPU: 1 PID: 5076 Comm: syz-executor280 Not tainted 6.9.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
03/27/2024
Call Trace:
 
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
 print_address_description mm/kasan/report.c:377 [inline]
 print_report+0x169/0x550 mm/kasan/report.c:488
 kasan_report+0x143/0x180 mm/kasan/report.c:601
 f2fs_test_bit fs/f2fs/f2fs.h:2933 [inline]
 current_nat_addr fs/f2fs/node.h:213 [inline]
 f2fs_get_node_info+0xece/0x1200 fs/f2fs/node.c:600
 f2fs_xattr_fiemap fs/f2fs/data.c:1848 [inline]
 f2fs_fiemap+0x55d/0x1ee0 fs/f2fs/data.c:1925
 ioctl_fiemap fs/ioctl.c:220 [inline]
 do_vfs_ioctl+0x1c07/0x2e50 fs/ioctl.c:838
 __do_sys_ioctl fs/ioctl.c:902 [inline]
 __se_sys_ioctl+0x81/0x170 fs/ioctl.c:890
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

The root cause is that we missed a sanity check on i_xattr_nid during
f2fs_iget(), so in the fiemap() path, current_nat_addr() accesses
nat_bitmap w/ an offset derived from an invalid i_xattr_nid, which
triggers the KASAN bug report. Fix it.

Reported-by: syzbot+3694e283cf5c40df6...@syzkaller.appspotmail.com
Closes: 
https://lore.kernel.org/linux-f2fs-devel/94036c0616e72...@google.com
Signed-off-by: Chao Yu 
---
 fs/f2fs/inode.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index d7a5a88a1a5e..7968b14d49f4 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -362,6 +362,12 @@ static bool sanity_check_inode(struct inode *inode, struct 
page *node_page)
return false;
}
 
+   if (fi->i_xattr_nid && f2fs_check_nid_range(sbi, fi->i_xattr_nid)) {
+   f2fs_warn(sbi, "%s: inode (ino=%lx) has corrupted i_xattr_nid: 
%u, run fsck to fix.",
+ __func__, inode->i_ino, fi->i_xattr_nid);
+   return false;
+   }
+
return true;
 }
 
-- 
2.40.1
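
Note on the fix above: the failure mode can be illustrated with a small
userspace model. This is a sketch under assumptions, not the kernel code.
struct nat_model, nid_in_range() and nat_block_bit() are hypothetical names,
and the entries-per-block constant and the MSB-first bit order are assumed
for illustration. The point is that the nid must be validated before it is
turned into a bitmap offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_ENTRIES_PER_BLOCK 455u	/* assumption: NAT entries per block */

/* Hypothetical stand-in for the node manager state used by the lookup. */
struct nat_model {
	const uint8_t *bitmap;	/* one bit per NAT block */
	size_t bitmap_bytes;	/* allocated size of the bitmap */
	uint32_t root_ino;	/* lowest valid nid (assumption) */
	uint32_t max_nid;	/* first invalid nid */
};

/* Range check in the spirit of the one the patch adds for i_xattr_nid. */
static bool nid_in_range(const struct nat_model *nm, uint32_t nid)
{
	return nid >= nm->root_ino && nid < nm->max_nid;
}

/* Returns the bit value (0 or 1), or -1 when the nid is rejected. */
static int nat_block_bit(const struct nat_model *nm, uint32_t nid)
{
	if (!nid_in_range(nm, nid))
		return -1;	/* without this, byte_off below may exceed the bitmap */

	uint32_t block_off = nid / MODEL_ENTRIES_PER_BLOCK;
	size_t byte_off = block_off >> 3;

	/* Holds as long as max_nid was sized to match the bitmap. */
	assert(byte_off < nm->bitmap_bytes);

	/* MSB-first bit order within a byte (assumption). */
	return (nm->bitmap[byte_off] >> (7 - (block_off & 7))) & 1;
}
```

In the patch itself a zero i_xattr_nid is skipped before the range check,
since zero means the inode has no xattr node; the model above does not cover
that special case and simply rejects it.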





Re: [f2fs-dev] [PATCH 3/3] f2fs: fix false alarm on invalid block address

2024-04-25 Thread Chao Yu

On 2024/4/19 18:27, Juhyung Park wrote:

On Sat, Apr 13, 2024 at 5:57 AM Jaegeuk Kim  wrote:


On 04/11, Chao Yu wrote:

On 2024/4/10 4:34, Jaegeuk Kim wrote:

f2fs_ra_meta_pages can try to read ahead on an invalid block address, which
is not a corruption case.


In which case will we read ahead invalid meta pages? Recovery w/ META_POR?


In my case, it seems like it's META_SIT, and it's triggered right after mount.


Ah, I see, it actually hits this case. Thanks for the information.

Thanks,


fsck detects invalid_blkaddr, and when the kernel mounts it, it
immediately flags invalid_blkaddr again:

[6.333498] init: [libfs_mgr] Running /system/bin/fsck.f2fs -a -c
1 --debug-cache /dev/block/sda13
[6.337671] fsck.f2fs: Info: Fix the reported corruption.
[6.337947] fsck.f2fs: Info: not exist /proc/version!
[6.338010] fsck.f2fs: Info: can't find /sys, assuming normal block device
[6.338294] fsck.f2fs: Info: MKFS version
[6.338319] fsck.f2fs:   "5.10.160-android12-9-ge5cfec41c8e2"
[6.338366] fsck.f2fs: Info: FSCK version
[6.338380] fsck.f2fs:   from "5.10-arter97"
[6.338393] fsck.f2fs: to "5.10-arter97"
[6.338414] fsck.f2fs: Info: superblock features = 1499 :  encrypt
verity extra_attr project_quota quota_ino casefold
[6.338429] fsck.f2fs: Info: superblock encrypt level = 0, salt =

[6.338442] fsck.f2fs: Info: checkpoint stop reason: shutdown(180)
[6.338455] fsck.f2fs: Info: fs errors: invalid_blkaddr
[6.338468] fsck.f2fs: Info: Segments per section = 1
[6.338480] fsck.f2fs: Info: Sections per zone = 1
[6.338492] fsck.f2fs: Info: total FS sectors = 58971571 (230357 MB)
[6.340599] fsck.f2fs: Info: CKPT version = 2b7e3b29
[6.340620] fsck.f2fs: Info: version timestamp cur: 19789296, prev: 18407008
[6.677041] fsck.f2fs: Info: checkpoint state = 46 :  crc
compacted_summary orphan_inodes sudden-power-off
[6.677052] fsck.f2fs: [FSCK] Check node 1 / 712937 (0.00%)
[8.997922] fsck.f2fs: [FSCK] Check node 71294 / 712937 (10.00%)
[   10.629205] fsck.f2fs: [FSCK] Check node 142587 / 712937 (20.00%)
[   12.278186] fsck.f2fs: [FSCK] Check node 213880 / 712937 (30.00%)
[   13.768177] fsck.f2fs: [FSCK] Check node 285173 / 712937 (40.00%)
[   17.446971] fsck.f2fs: [FSCK] Check node 356466 / 712937 (50.00%)
[   19.891623] fsck.f2fs: [FSCK] Check node 427759 / 712937 (60.00%)
[   23.251327] fsck.f2fs: [FSCK] Check node 499052 / 712937 (70.00%)
[   28.493457] fsck.f2fs: [FSCK] Check node 570345 / 712937 (80.00%)
[   29.640800] fsck.f2fs: [FSCK] Check node 641638 / 712937 (90.00%)
[   30.718347] fsck.f2fs: [FSCK] Check node 712931 / 712937 (100.00%)
[   30.724176] fsck.f2fs:
[   30.737160] fsck.f2fs: [FSCK] Max image size: 167506 MB, Free space: 62850 MB
[   30.737164] fsck.f2fs: [FSCK] Unreachable nat entries
  [Ok..] [0x0]
[   30.737638] fsck.f2fs: [FSCK] SIT valid block bitmap checking
  [Ok..]
[   30.737640] fsck.f2fs: [FSCK] Hard link checking for regular file
  [Ok..] [0xd]
[   30.737641] fsck.f2fs: [FSCK] valid_block_count matching with CP
  [Ok..] [0x28b98e6]
[   30.737644] fsck.f2fs: [FSCK] valid_node_count matching with CP (de
lookup)  [Ok..] [0xae0e9]
[   30.737646] fsck.f2fs: [FSCK] valid_node_count matching with CP
(nat lookup) [Ok..] [0xae0e9]
[   30.737647] fsck.f2fs: [FSCK] valid_inode_count matched with CP
  [Ok..] [0xa74a3]
[   30.737649] fsck.f2fs: [FSCK] free segment_count matched with CP
  [Ok..] [0x7aa3]
[   30.737662] fsck.f2fs: [FSCK] next block offset is free
  [Ok..]
[   30.737663] fsck.f2fs: [FSCK] fixing SIT types
[   30.737867] fsck.f2fs: [FSCK] other corrupted bugs
  [Ok..]
[   30.737893] fsck.f2fs: [update_superblock: 765] Info: Done to
update superblock
[   30.960610] fsck.f2fs:
[   30.960618] fsck.f2fs: Done: 24.622956 secs
[   30.960620] fsck.f2fs:
[   30.960622] fsck.f2fs: c, u, RA, CH, CM, Repl=
[   30.960627] fsck.f2fs: 1 1 43600517 42605434 995083 985083
[   30.963274] F2FS-fs (sda13): Using encoding defined by superblock:
utf8-12.1.0 with flags 0x0
[   30.995360] __f2fs_is_valid_blkaddr: type=2

(Manually added that print ^)

[   30.995369] [ cut here ]
[   30.995375] WARNING: CPU: 7 PID: 1 at f2fs_handle_error+0x18/0x3c
[   30.995378] CPU: 7 PID: 1 Comm: init Tainted: G S  W
5.10.209-arter97-r15-kernelsu-g0867d0e4f1d2 #6
[   30.995379] Hardware name: Qualcomm Technologies, Inc. Cape QRD
with PM8010 (DT)
[   30.995380] pstate: 2245 (nzCv daif +PAN -UAO +TCO BTYPE=--)
[   30.995382] pc : f2fs_handle_error+0x18/0x3c
[   30.995384] lr : __f2fs_is_valid_blkaddr+0x2a4/0x2b0
[   30.995385] sp : ff80209e79b0
[   30.995386] x29: ff80209e79b0 x28: 0037
[   30.995388] x27: 01c7 x26: 20120121
[   30.995389] x25: 00d9 x24: 
[   30.995390] x23: 00f1a700 x22: 0

Re: [f2fs-dev] [PATCH 2/3 v2] f2fs: clear writeback when compression failed

2024-04-24 Thread Chao Yu

On 2024/4/17 0:49, Jaegeuk Kim wrote:

Let's stop issuing compressed writes and clear their writeback flags.

Signed-off-by: Jaegeuk Kim 


Reviewed-by: Chao Yu 

Thanks,




Re: [f2fs-dev] [PATCH] f2fs: fix false alarm on invalid block address

2024-04-24 Thread Chao Yu

On 2024/4/25 1:35, Jaegeuk Kim wrote:

f2fs_ra_meta_pages can try to read ahead on an invalid block address, which
is not a corruption case.

Cc:  # v6.9+
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=218770
Fixes: 31f85ccc84b8 ("f2fs: unify the error handling of f2fs_is_valid_blkaddr")
Signed-off-by: Jaegeuk Kim 


Reviewed-by: Chao Yu 

Thanks,




Re: [f2fs-dev] [PATCH v2 1/2] f2fs: use per-log target_bitmap to improve lookup performace of ssr allocation

2024-04-22 Thread Chao Yu

Jaegeuk, any comments on this series?

On 2024/4/11 16:23, Chao Yu wrote:

After commit 899fee36fac0 ("f2fs: fix to avoid data corruption by
forbidding SSR overwrite"), the valid block bitmap of the currently opened
segment is fixed, so let's introduce a per-log bitmap instead of the temp
bitmap to avoid unnecessary calculation overhead whenever allocating a
free slot w/ the SSR allocator.

Signed-off-by: Chao Yu 
---
v2:
- rebase to last dev-test branch.
  fs/f2fs/segment.c | 30 ++
  fs/f2fs/segment.h |  1 +
  2 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 6474b7338e81..af716925db19 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -2840,31 +2840,39 @@ static int new_curseg(struct f2fs_sb_info *sbi, int 
type, bool new_sec)
return 0;
  }
  
-static int __next_free_blkoff(struct f2fs_sb_info *sbi,

-   int segno, block_t start)
+static void __get_segment_bitmap(struct f2fs_sb_info *sbi,
+   unsigned long *target_map,
+   int segno)
  {
struct seg_entry *se = get_seg_entry(sbi, segno);
int entries = SIT_VBLOCK_MAP_SIZE / sizeof(unsigned long);
-   unsigned long *target_map = SIT_I(sbi)->tmp_map;
unsigned long *ckpt_map = (unsigned long *)se->ckpt_valid_map;
unsigned long *cur_map = (unsigned long *)se->cur_valid_map;
int i;
  
  	for (i = 0; i < entries; i++)

target_map[i] = ckpt_map[i] | cur_map[i];
+}
+
+static int __next_free_blkoff(struct f2fs_sb_info *sbi, unsigned long *bitmap,
+   int segno, block_t start)
+{
+   __get_segment_bitmap(sbi, bitmap, segno);
  
-	return __find_rev_next_zero_bit(target_map, BLKS_PER_SEG(sbi), start);

+   return __find_rev_next_zero_bit(bitmap, BLKS_PER_SEG(sbi), start);
  }
  
  static int f2fs_find_next_ssr_block(struct f2fs_sb_info *sbi,

-   struct curseg_info *seg)
+   struct curseg_info *seg)
  {
-   return __next_free_blkoff(sbi, seg->segno, seg->next_blkoff + 1);
+   return __find_rev_next_zero_bit(seg->target_map,
+   BLKS_PER_SEG(sbi), seg->next_blkoff + 1);
  }
  
  bool f2fs_segment_has_free_slot(struct f2fs_sb_info *sbi, int segno)

  {
-   return __next_free_blkoff(sbi, segno, 0) < BLKS_PER_SEG(sbi);
+   return __next_free_blkoff(sbi, SIT_I(sbi)->tmp_map, segno, 0) <
+   BLKS_PER_SEG(sbi);
  }
  
  /*

@@ -2890,7 +2898,8 @@ static int change_curseg(struct f2fs_sb_info *sbi, int 
type)
  
  	reset_curseg(sbi, type, 1);

curseg->alloc_type = SSR;
-   curseg->next_blkoff = __next_free_blkoff(sbi, curseg->segno, 0);
+   curseg->next_blkoff = __next_free_blkoff(sbi, curseg->target_map,
+   curseg->segno, 0);
  
  	sum_page = f2fs_get_sum_page(sbi, new_segno);

if (IS_ERR(sum_page)) {
@@ -4635,6 +4644,10 @@ static int build_curseg(struct f2fs_sb_info *sbi)
sizeof(struct f2fs_journal), GFP_KERNEL);
if (!array[i].journal)
return -ENOMEM;
+   array[i].target_map = f2fs_kzalloc(sbi, SIT_VBLOCK_MAP_SIZE,
+   GFP_KERNEL);
+   if (!array[i].target_map)
+   return -ENOMEM;
if (i < NR_PERSISTENT_LOG)
array[i].seg_type = CURSEG_HOT_DATA + i;
else if (i == CURSEG_COLD_DATA_PINNED)
@@ -5453,6 +5466,7 @@ static void destroy_curseg(struct f2fs_sb_info *sbi)
for (i = 0; i < NR_CURSEG_TYPE; i++) {
kfree(array[i].sum_blk);
kfree(array[i].journal);
+   kfree(array[i].target_map);
}
kfree(array);
  }
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index e1c0f418aa11..10f3e44f036f 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -292,6 +292,7 @@ struct curseg_info {
struct f2fs_summary_block *sum_blk; /* cached summary block */
struct rw_semaphore journal_rwsem;  /* protect journal area */
struct f2fs_journal *journal;   /* cached journal info */
+   unsigned long *target_map;  /* bitmap for SSR allocator */
unsigned char alloc_type;   /* current allocation type */
unsigned short seg_type;/* segment type like 
CURSEG_XXX_TYPE */
unsigned int segno; /* current segment number */
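
Note on the quoted patch: the two steps it separates (build the merged
bitmap once, then only search it) can be sketched in userspace as below.
This is a simplified model under assumptions. merge_valid_maps() and
next_zero_bit() are hypothetical names, and the search uses plain LSB-first
bit numbering, whereas the kernel helper __find_rev_next_zero_bit() uses its
own bit order.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define MODEL_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Step 1: a block is unusable for SSR if it is valid in either map. */
static void merge_valid_maps(unsigned long *target, const unsigned long *ckpt,
			     const unsigned long *cur, size_t words)
{
	assert(target && ckpt && cur);
	for (size_t i = 0; i < words; i++)
		target[i] = ckpt[i] | cur[i];
}

/*
 * Step 2: find the first clear bit at or after 'start'.
 * Returns nbits when no free slot is left.
 */
static size_t next_zero_bit(const unsigned long *map, size_t nbits, size_t start)
{
	for (size_t bit = start; bit < nbits; bit++) {
		unsigned long mask = 1UL << (bit % MODEL_BITS_PER_WORD);

		if (!(map[bit / MODEL_BITS_PER_WORD] & mask))
			return bit;
	}
	return nbits;
}
```

With a per-log target map, step 1 runs once when the log switches to a
segment, and each later allocation only needs step 2, which is the saving
the commit message describes.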





Re: [f2fs-dev] [PATCH v4] f2fs: zone: don't block IO if there is remained open zone

2024-04-22 Thread Chao Yu

On 2024/4/22 1:28, Juhyung Park wrote:

Hi Chao, a small nit.. :)

s/openned/opened/g


Juhyung, thanks for the report, I've fixed it in v5. :)

Thanks,



$ git grep openned v6.9-rc1 | wc -l
2
$ git grep opened v6.9-rc1 | wc -l
2130

On Thu, Apr 11, 2024 at 5:33 PM Chao Yu  wrote:


The max open zone count may be larger than the number of f2fs log
headers. In that case, there is no need to wait for the last IO in the
previous zone. Let's introduce an available_open_zones semaphore,
decrease it once we submit the first write IO in a zone, and increase
it after completion of the last IO in the zone.

Cc: Daeho Jeong 
Signed-off-by: Chao Yu 
---
v4:
- avoid unneeded condition in f2fs_blkzoned_submit_merged_write().
  fs/f2fs/data.c| 105 ++
  fs/f2fs/f2fs.h|  34 ---
  fs/f2fs/iostat.c  |   7 
  fs/f2fs/iostat.h  |   2 +
  fs/f2fs/segment.c |  43 ---
  fs/f2fs/segment.h |  12 +-
  fs/f2fs/super.c   |   2 +
  7 files changed, 156 insertions(+), 49 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 60056b9a51be..71472ab6b7e7 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -373,11 +373,10 @@ static void f2fs_write_end_io(struct bio *bio)
  #ifdef CONFIG_BLK_DEV_ZONED
  static void f2fs_zone_write_end_io(struct bio *bio)
  {
-   struct f2fs_bio_info *io = (struct f2fs_bio_info *)bio->bi_private;
+   struct f2fs_sb_info *sbi = iostat_get_bio_private(bio);

-   bio->bi_private = io->bi_private;
-   complete(&io->zone_wait);
 f2fs_write_end_io(bio);
+   up(&sbi->available_open_zones);
  }
  #endif

@@ -531,6 +530,24 @@ static void __submit_merged_bio(struct f2fs_bio_info *io)
 if (!io->bio)
 return;

+#ifdef CONFIG_BLK_DEV_ZONED
+   if (io->open_zone) {
+   /*
+* if there is no open zone, it will wait for last IO in
+* previous zone before submitting new IO.
+*/
+   down(&io->sbi->available_open_zones);
+   io->open_zone = false;
+   io->zone_openned = true;
+   }
+
+   if (io->close_zone) {
+   io->bio->bi_end_io = f2fs_zone_write_end_io;
+   io->zone_openned = false;
+   io->close_zone = false;
+   }
+#endif
+
 if (is_read_io(fio->op)) {
 trace_f2fs_prepare_read_bio(io->sbi->sb, fio->type, io->bio);
 f2fs_submit_read_bio(io->sbi, io->bio, fio->type);
@@ -601,9 +618,9 @@ int f2fs_init_write_merge_io(struct f2fs_sb_info *sbi)
 INIT_LIST_HEAD(&sbi->write_io[i][j].bio_list);
 init_f2fs_rwsem(&sbi->write_io[i][j].bio_list_lock);
  #ifdef CONFIG_BLK_DEV_ZONED
-   init_completion(&sbi->write_io[i][j].zone_wait);
-   sbi->write_io[i][j].zone_pending_bio = NULL;
-   sbi->write_io[i][j].bi_private = NULL;
+   sbi->write_io[i][j].open_zone = false;
+   sbi->write_io[i][j].zone_openned = false;
+   sbi->write_io[i][j].close_zone = false;
  #endif
 }
 }
@@ -634,6 +651,31 @@ static void __f2fs_submit_merged_write(struct f2fs_sb_info 
*sbi,
 f2fs_up_write(&io->io_rwsem);
  }

+void f2fs_blkzoned_submit_merged_write(struct f2fs_sb_info *sbi, int type)
+{
+#ifdef CONFIG_BLK_DEV_ZONED
+   struct f2fs_bio_info *io;
+
+   if (!f2fs_sb_has_blkzoned(sbi))
+   return;
+
+   io = sbi->write_io[PAGE_TYPE(type)] + type_to_temp(type);
+
+   f2fs_down_write(&io->io_rwsem);
+   if (io->zone_openned) {
+   if (io->bio) {
+   io->close_zone = true;
+   __submit_merged_bio(io);
+   } else {
+   up(&sbi->available_open_zones);
+   io->zone_openned = false;
+   }
+   }
+   f2fs_up_write(&io->io_rwsem);
+#endif
+
+}
+
  static void __submit_merged_write_cond(struct f2fs_sb_info *sbi,
 struct inode *inode, struct page *page,
 nid_t ino, enum page_type type, bool force)
@@ -918,22 +960,16 @@ int f2fs_merge_page_bio(struct f2fs_io_info *fio)
  }

  #ifdef CONFIG_BLK_DEV_ZONED
-static bool is_end_zone_blkaddr(struct f2fs_sb_info *sbi, block_t blkaddr)
+static bool is_blkaddr_zone_boundary(struct f2fs_sb_info *sbi,
+   block_t blkaddr, bool start)
  {
-   int devi = 0;
+   if (!f2fs_blkaddr_in_seqzone(sbi, blkaddr))
+   return false;
+
+   if (start)
+   return (blkaddr % sbi->blocks_per_blkz) == 0;
+   return (blkaddr % sbi->blocks_per_blkz == sbi->blocks_per_blkz - 1);

-   if (f2fs_is_multi_device(sbi)) {
-   devi = f2fs_targe

[f2fs-dev] [PATCH v5] f2fs: zone: don't block IO if there is remained open zone

2024-04-22 Thread Chao Yu
The max open zone count may be larger than the number of f2fs log
headers. In that case, there is no need to wait for the last IO in the
previous zone. Let's introduce an available_open_zones semaphore,
decrease it once we submit the first write IO in a zone, and increase
it after completion of the last IO in the zone.

Cc: Daeho Jeong 
Signed-off-by: Chao Yu 
Reviewed-by: Daeho Jeong 
---
v5:
- fix `openned` typo pointed out by Juhyung Park
 fs/f2fs/data.c| 105 ++
 fs/f2fs/f2fs.h|  31 +++---
 fs/f2fs/iostat.c  |   7 
 fs/f2fs/iostat.h  |   2 +
 fs/f2fs/segment.c |  37 +++-
 fs/f2fs/segment.h |   3 +-
 fs/f2fs/super.c   |   2 +
 7 files changed, 143 insertions(+), 44 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index d01345af5f3e..657579358498 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -373,11 +373,10 @@ static void f2fs_write_end_io(struct bio *bio)
 #ifdef CONFIG_BLK_DEV_ZONED
 static void f2fs_zone_write_end_io(struct bio *bio)
 {
-   struct f2fs_bio_info *io = (struct f2fs_bio_info *)bio->bi_private;
+   struct f2fs_sb_info *sbi = iostat_get_bio_private(bio);
 
-   bio->bi_private = io->bi_private;
-   complete(&io->zone_wait);
f2fs_write_end_io(bio);
+   up(&sbi->available_open_zones);
 }
 #endif
 
@@ -533,6 +532,24 @@ static void __submit_merged_bio(struct f2fs_bio_info *io)
if (!io->bio)
return;
 
+#ifdef CONFIG_BLK_DEV_ZONED
+   if (io->open_zone) {
+   /*
+* if there is no open zone, it will wait for last IO in
+* previous zone before submitting new IO.
+*/
+   down(&io->sbi->available_open_zones);
+   io->open_zone = false;
+   io->zone_opened = true;
+   }
+
+   if (io->close_zone) {
+   io->bio->bi_end_io = f2fs_zone_write_end_io;
+   io->zone_opened = false;
+   io->close_zone = false;
+   }
+#endif
+
if (is_read_io(fio->op)) {
trace_f2fs_prepare_read_bio(io->sbi->sb, fio->type, io->bio);
f2fs_submit_read_bio(io->sbi, io->bio, fio->type);
@@ -603,9 +620,9 @@ int f2fs_init_write_merge_io(struct f2fs_sb_info *sbi)
INIT_LIST_HEAD(&sbi->write_io[i][j].bio_list);
init_f2fs_rwsem(&sbi->write_io[i][j].bio_list_lock);
 #ifdef CONFIG_BLK_DEV_ZONED
-   init_completion(&sbi->write_io[i][j].zone_wait);
-   sbi->write_io[i][j].zone_pending_bio = NULL;
-   sbi->write_io[i][j].bi_private = NULL;
+   sbi->write_io[i][j].open_zone = false;
+   sbi->write_io[i][j].zone_opened = false;
+   sbi->write_io[i][j].close_zone = false;
 #endif
}
}
@@ -636,6 +653,31 @@ static void __f2fs_submit_merged_write(struct f2fs_sb_info 
*sbi,
f2fs_up_write(&io->io_rwsem);
 }
 
+void f2fs_blkzoned_submit_merged_write(struct f2fs_sb_info *sbi, int type)
+{
+#ifdef CONFIG_BLK_DEV_ZONED
+   struct f2fs_bio_info *io;
+
+   if (!f2fs_sb_has_blkzoned(sbi))
+   return;
+
+   io = sbi->write_io[PAGE_TYPE(type)] + type_to_temp(type);
+
+   f2fs_down_write(&io->io_rwsem);
+   if (io->zone_opened) {
+   if (io->bio) {
+   io->close_zone = true;
+   __submit_merged_bio(io);
+   } else {
+   up(&sbi->available_open_zones);
+   io->zone_opened = false;
+   }
+   }
+   f2fs_up_write(&io->io_rwsem);
+#endif
+
+}
+
 static void __submit_merged_write_cond(struct f2fs_sb_info *sbi,
struct inode *inode, struct page *page,
nid_t ino, enum page_type type, bool force)
@@ -920,22 +962,16 @@ int f2fs_merge_page_bio(struct f2fs_io_info *fio)
 }
 
 #ifdef CONFIG_BLK_DEV_ZONED
-static bool is_end_zone_blkaddr(struct f2fs_sb_info *sbi, block_t blkaddr)
+static bool is_blkaddr_zone_boundary(struct f2fs_sb_info *sbi,
+   block_t blkaddr, bool start)
 {
-   int devi = 0;
+   if (!f2fs_blkaddr_in_seqzone(sbi, blkaddr))
+   return false;
+
+   if (start)
+   return (blkaddr % sbi->blocks_per_blkz) == 0;
+   return (blkaddr % sbi->blocks_per_blkz == sbi->blocks_per_blkz - 1);
 
-   if (f2fs_is_multi_device(sbi)) {
-   devi = f2fs_target_device_index(sbi, blkaddr);
-   if (blkaddr < FDEV(devi).start_blk ||
-   blkaddr > FDEV(devi).end_blk) {
-   f2fs_err(sbi, "Invalid block %x", blkaddr);
-   return false;
-   }
-   blkaddr -= FDEV(devi).start_blk;
-   }
-   ret
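
Note on the patch above: the two ideas it combines, detecting the first and
last block of a zone and counting how many zones may be open at once, can be
sketched in userspace as below. This is a single-threaded model under
assumptions. struct zone_budget and its helpers are hypothetical names, the
counter stands in for the kernel semaphore and never blocks, and the
sequential-zone check done by f2fs_blkaddr_in_seqzone() is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* True when blkaddr is the first (start) or last (!start) block of its zone. */
static bool is_zone_boundary(unsigned int blkaddr, unsigned int blocks_per_zone,
			     bool start)
{
	assert(blocks_per_zone > 0);

	unsigned int off = blkaddr % blocks_per_zone;

	return start ? off == 0 : off == blocks_per_zone - 1;
}

/* Hypothetical stand-in for the available_open_zones semaphore. */
struct zone_budget {
	unsigned int available;	/* zones that may still be opened */
};

/* Models down(): taken when the first write IO of a zone is submitted. */
static bool zone_budget_try_open(struct zone_budget *b)
{
	if (b->available == 0)
		return false;	/* the kernel would sleep here instead */
	b->available--;
	return true;
}

/* Models up(): released when the last write IO of a zone completes. */
static void zone_budget_close(struct zone_budget *b)
{
	b->available++;
}
```

As long as the budget is larger than the number of logs writing at once, an
open never has to wait, which is the case the commit message targets.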

[f2fs-dev] [PATCH v2 2/4] f2fs: convert f2fs_read_single_page() to use folio

2024-04-22 Thread Chao Yu
Convert f2fs_read_single_page() to use folio and related
functionality.

Signed-off-by: Chao Yu 
---
v2:
- no change.
 fs/f2fs/data.c | 27 ++-
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 6419cf020327..bb6c0e955d7e 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2063,7 +2063,7 @@ static inline loff_t f2fs_readpage_limit(struct inode 
*inode)
return i_size_read(inode);
 }
 
-static int f2fs_read_single_page(struct inode *inode, struct page *page,
+static int f2fs_read_single_page(struct inode *inode, struct folio *folio,
unsigned nr_pages,
struct f2fs_map_blocks *map,
struct bio **bio_ret,
@@ -2076,9 +2076,10 @@ static int f2fs_read_single_page(struct inode *inode, 
struct page *page,
sector_t last_block;
sector_t last_block_in_file;
sector_t block_nr;
+   pgoff_t index = folio_index(folio);
int ret = 0;
 
-   block_in_file = (sector_t)page_index(page);
+   block_in_file = (sector_t)index;
last_block = block_in_file + nr_pages;
last_block_in_file = bytes_to_blks(inode,
f2fs_readpage_limit(inode) + blocksize - 1);
@@ -2109,7 +2110,7 @@ static int f2fs_read_single_page(struct inode *inode, 
struct page *page,
 got_it:
if ((map->m_flags & F2FS_MAP_MAPPED)) {
block_nr = map->m_pblk + block_in_file - map->m_lblk;
-   SetPageMappedToDisk(page);
+   folio_set_mappedtodisk(folio);
 
if (!f2fs_is_valid_blkaddr(F2FS_I_SB(inode), block_nr,
DATA_GENERIC_ENHANCE_READ)) {
@@ -2118,15 +2119,15 @@ static int f2fs_read_single_page(struct inode *inode, 
struct page *page,
}
} else {
 zero_out:
-   zero_user_segment(page, 0, PAGE_SIZE);
-   if (f2fs_need_verity(inode, page->index) &&
-   !fsverity_verify_page(page)) {
+   folio_zero_segment(folio, 0, folio_size(folio));
+   if (f2fs_need_verity(inode, index) &&
+   !fsverity_verify_folio(folio)) {
ret = -EIO;
goto out;
}
-   if (!PageUptodate(page))
-   SetPageUptodate(page);
-   unlock_page(page);
+   if (!folio_test_uptodate(folio))
+   folio_mark_uptodate(folio);
+   folio_unlock(folio);
goto out;
}
 
@@ -2136,14 +2137,14 @@ static int f2fs_read_single_page(struct inode *inode, 
struct page *page,
 */
if (bio && (!page_is_mergeable(F2FS_I_SB(inode), bio,
   *last_block_in_bio, block_nr) ||
-   !f2fs_crypt_mergeable_bio(bio, inode, page->index, NULL))) {
+   !f2fs_crypt_mergeable_bio(bio, inode, index, NULL))) {
 submit_and_realloc:
f2fs_submit_read_bio(F2FS_I_SB(inode), bio, DATA);
bio = NULL;
}
if (bio == NULL) {
bio = f2fs_grab_read_bio(inode, block_nr, nr_pages,
-   is_readahead ? REQ_RAHEAD : 0, page->index,
+   is_readahead ? REQ_RAHEAD : 0, index,
false);
if (IS_ERR(bio)) {
ret = PTR_ERR(bio);
@@ -2158,7 +2159,7 @@ static int f2fs_read_single_page(struct inode *inode, 
struct page *page,
 */
f2fs_wait_on_block_writeback(inode, block_nr);
 
-   if (bio_add_page(bio, page, blocksize, 0) < blocksize)
+   if (!bio_add_folio(bio, folio, blocksize, 0))
goto submit_and_realloc;
 
inc_page_count(F2FS_I_SB(inode), F2FS_RD_DATA);
@@ -2423,7 +2424,7 @@ static int f2fs_mpage_readpages(struct inode *inode,
goto next_page;
 read_single_page:
 #endif
-   ret = f2fs_read_single_page(inode, &folio->page, max_nr_pages, &map,
+   ret = f2fs_read_single_page(inode, folio, max_nr_pages, &map,
&bio, &last_block_in_bio, rac);
if (ret) {
 #ifdef CONFIG_F2FS_FS_COMPRESSION
-- 
2.40.1





[f2fs-dev] [PATCH v2 3/4] f2fs: convert f2fs_read_inline_data() to use folio

2024-04-22 Thread Chao Yu
Convert f2fs_read_inline_data() to use folio and related
functionality, and also convert its caller to use folio.

Signed-off-by: Chao Yu 
---
v2:
- no change.
 fs/f2fs/data.c   | 11 +--
 fs/f2fs/f2fs.h   |  4 ++--
 fs/f2fs/inline.c | 34 +-
 3 files changed, 24 insertions(+), 25 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index bb6c0e955d7e..24f9a39ffd56 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2457,20 +2457,19 @@ static int f2fs_mpage_readpages(struct inode *inode,
 
 static int f2fs_read_data_folio(struct file *file, struct folio *folio)
 {
-   struct page *page = &folio->page;
-   struct inode *inode = page_file_mapping(page)->host;
+   struct inode *inode = folio_file_mapping(folio)->host;
int ret = -EAGAIN;
 
-   trace_f2fs_readpage(page, DATA);
+   trace_f2fs_readpage(&folio->page, DATA);
 
if (!f2fs_is_compress_backend_ready(inode)) {
-   unlock_page(page);
+   folio_unlock(folio);
return -EOPNOTSUPP;
}
 
/* If the file has inline data, try to read it directly */
if (f2fs_has_inline_data(inode))
-   ret = f2fs_read_inline_data(inode, page);
+   ret = f2fs_read_inline_data(inode, folio);
if (ret == -EAGAIN)
ret = f2fs_mpage_readpages(inode, NULL, folio);
return ret;
@@ -3399,7 +3398,7 @@ static int prepare_write_begin(struct f2fs_sb_info *sbi,
 
if (f2fs_has_inline_data(inode)) {
if (pos + len <= MAX_INLINE_DATA(inode)) {
-   f2fs_do_read_inline_data(page, ipage);
+   f2fs_do_read_inline_data(page_folio(page), ipage);
set_inode_flag(inode, FI_DATA_EXIST);
if (inode->i_nlink)
set_page_private_inline(ipage);
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 3f7196122574..a0ae99bcca39 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -4154,10 +4154,10 @@ extern struct kmem_cache *f2fs_inode_entry_slab;
 bool f2fs_may_inline_data(struct inode *inode);
 bool f2fs_sanity_check_inline_data(struct inode *inode);
 bool f2fs_may_inline_dentry(struct inode *inode);
-void f2fs_do_read_inline_data(struct page *page, struct page *ipage);
+void f2fs_do_read_inline_data(struct folio *folio, struct page *ipage);
 void f2fs_truncate_inline_inode(struct inode *inode,
struct page *ipage, u64 from);
-int f2fs_read_inline_data(struct inode *inode, struct page *page);
+int f2fs_read_inline_data(struct inode *inode, struct folio *folio);
 int f2fs_convert_inline_page(struct dnode_of_data *dn, struct page *page);
 int f2fs_convert_inline_inode(struct inode *inode);
 int f2fs_try_convert_inline_dir(struct inode *dir, struct dentry *dentry);
diff --git a/fs/f2fs/inline.c b/fs/f2fs/inline.c
index 3d3218a4b29d..7638d0d7b7ee 100644
--- a/fs/f2fs/inline.c
+++ b/fs/f2fs/inline.c
@@ -61,22 +61,22 @@ bool f2fs_may_inline_dentry(struct inode *inode)
return true;
 }
 
-void f2fs_do_read_inline_data(struct page *page, struct page *ipage)
+void f2fs_do_read_inline_data(struct folio *folio, struct page *ipage)
 {
-   struct inode *inode = page->mapping->host;
+   struct inode *inode = folio_file_mapping(folio)->host;
 
-   if (PageUptodate(page))
+   if (folio_test_uptodate(folio))
return;
 
-   f2fs_bug_on(F2FS_P_SB(page), page->index);
+   f2fs_bug_on(F2FS_I_SB(inode), folio_index(folio));
 
-   zero_user_segment(page, MAX_INLINE_DATA(inode), PAGE_SIZE);
+   folio_zero_segment(folio, MAX_INLINE_DATA(inode), folio_size(folio));
 
/* Copy the whole inline data block */
-   memcpy_to_page(page, 0, inline_data_addr(inode, ipage),
+   memcpy_to_folio(folio, 0, inline_data_addr(inode, ipage),
   MAX_INLINE_DATA(inode));
-   if (!PageUptodate(page))
-   SetPageUptodate(page);
+   if (!folio_test_uptodate(folio))
+   folio_mark_uptodate(folio);
 }
 
 void f2fs_truncate_inline_inode(struct inode *inode,
@@ -97,13 +97,13 @@ void f2fs_truncate_inline_inode(struct inode *inode,
clear_inode_flag(inode, FI_DATA_EXIST);
 }
 
-int f2fs_read_inline_data(struct inode *inode, struct page *page)
+int f2fs_read_inline_data(struct inode *inode, struct folio *folio)
 {
struct page *ipage;
 
ipage = f2fs_get_node_page(F2FS_I_SB(inode), inode->i_ino);
if (IS_ERR(ipage)) {
-   unlock_page(page);
+   folio_unlock(folio);
return PTR_ERR(ipage);
}
 
@@ -112,15 +112,15 @@ int f2fs_read_inline_data(struct inode *inode, struct 
page *page)
return -EAGAIN;
}
 
-   if (page->index)
-   zero_user_segment(page, 0, PAGE_SIZE);
+   if (folio_index(folio))
+   folio_zer

[f2fs-dev] [PATCH v2 1/4] f2fs: convert f2fs_mpage_readpages() to use folio

2024-04-22 Thread Chao Yu
Convert f2fs_mpage_readpages() to use folio and related
functionality.

Signed-off-by: Chao Yu 
---
v2:
- fix compile warning w/o CONFIG_F2FS_FS_COMPRESSION reported by lkp
 fs/f2fs/data.c | 81 +-
 1 file changed, 40 insertions(+), 41 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index ed7d08785fcf..6419cf020327 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2345,7 +2345,7 @@ int f2fs_read_multi_pages(struct compress_ctx *cc, struct 
bio **bio_ret,
  * Major change was from block_size == page_size in f2fs by default.
  */
 static int f2fs_mpage_readpages(struct inode *inode,
-   struct readahead_control *rac, struct page *page)
+   struct readahead_control *rac, struct folio *folio)
 {
struct bio *bio = NULL;
sector_t last_block_in_bio = 0;
@@ -2362,6 +2362,7 @@ static int f2fs_mpage_readpages(struct inode *inode,
.nr_cpages = 0,
};
pgoff_t nc_cluster_idx = NULL_CLUSTER;
+   pgoff_t index;
 #endif
unsigned nr_pages = rac ? readahead_count(rac) : 1;
unsigned max_nr_pages = nr_pages;
@@ -2378,64 +2379,62 @@ static int f2fs_mpage_readpages(struct inode *inode,
 
for (; nr_pages; nr_pages--) {
if (rac) {
-   page = readahead_page(rac);
-   prefetchw(&page->flags);
+   folio = readahead_folio(rac);
+   prefetchw(&folio->flags);
}
 
 #ifdef CONFIG_F2FS_FS_COMPRESSION
-   if (f2fs_compressed_file(inode)) {
-   /* there are remained compressed pages, submit them */
-   if (!f2fs_cluster_can_merge_page(&cc, page->index)) {
-   ret = f2fs_read_multi_pages(&cc, &bio,
-   max_nr_pages,
-   &last_block_in_bio,
-   rac != NULL, false);
-   f2fs_destroy_compress_ctx(&cc, false);
-   if (ret)
-   goto set_error_page;
-   }
-   if (cc.cluster_idx == NULL_CLUSTER) {
-   if (nc_cluster_idx ==
-   page->index >> cc.log_cluster_size) {
-   goto read_single_page;
-   }
-
-   ret = f2fs_is_compressed_cluster(inode, 
page->index);
-   if (ret < 0)
-   goto set_error_page;
-   else if (!ret) {
-   nc_cluster_idx =
-   page->index >> 
cc.log_cluster_size;
-   goto read_single_page;
-   }
-
-   nc_cluster_idx = NULL_CLUSTER;
-   }
-   ret = f2fs_init_compress_ctx(&cc);
+   index = folio_index(folio);
+
+   if (!f2fs_compressed_file(inode))
+   goto read_single_page;
+
+   /* there are remained compressed pages, submit them */
+   if (!f2fs_cluster_can_merge_page(&cc, index)) {
+   ret = f2fs_read_multi_pages(&cc, &bio,
+   max_nr_pages,
+   &last_block_in_bio,
+   rac != NULL, false);
+   f2fs_destroy_compress_ctx(&cc, false);
if (ret)
goto set_error_page;
+   }
+   if (cc.cluster_idx == NULL_CLUSTER) {
+   if (nc_cluster_idx == index >> cc.log_cluster_size)
+   goto read_single_page;
 
-   f2fs_compress_ctx_add_page(&cc, page);
+   ret = f2fs_is_compressed_cluster(inode, index);
+   if (ret < 0)
+   goto set_error_page;
+   else if (!ret) {
+   nc_cluster_idx =
+   index >> cc.log_cluster_size;
+   goto read_single_page;
+   }
 
-   goto next_page;
+   nc_cluster_idx = NULL_CLUSTER;
}
+   ret = f2fs_init_compress_ctx(&cc);
+   if (ret)
+   goto set_error_page;
+
+   f2fs_compress_ctx_add_page(&cc, &folio->page);
+
+   goto next_page;
 read_single_page:
 #endif
-
-   ret = f2fs_read_single_page(inode, page, max_nr_pages, &map,
+   ret

[f2fs-dev] [PATCH v2 4/4] f2fs: convert f2fs__page tracepoint class to use folio

2024-04-22 Thread Chao Yu
Convert f2fs__page tracepoint class() and its instances to use folio
and related functionality, and rename it to f2fs__folio().

Signed-off-by: Chao Yu 
---
v2:
- no change.
 fs/f2fs/checkpoint.c|  4 ++--
 fs/f2fs/data.c  | 10 -
 fs/f2fs/node.c  |  4 ++--
 include/trace/events/f2fs.h | 42 ++---
 4 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index eac698b8dd38..5d05a413f451 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -345,7 +345,7 @@ static int __f2fs_write_meta_page(struct page *page,
 {
struct f2fs_sb_info *sbi = F2FS_P_SB(page);
 
-   trace_f2fs_writepage(page, META);
+   trace_f2fs_writepage(page_folio(page), META);
 
if (unlikely(f2fs_cp_error(sbi))) {
if (is_sbi_flag_set(sbi, SBI_IS_CLOSE)) {
@@ -492,7 +492,7 @@ long f2fs_sync_meta_pages(struct f2fs_sb_info *sbi, enum 
page_type type,
 static bool f2fs_dirty_meta_folio(struct address_space *mapping,
struct folio *folio)
 {
-   trace_f2fs_set_page_dirty(&folio->page, META);
+   trace_f2fs_set_page_dirty(folio, META);
 
if (!folio_test_uptodate(folio))
folio_mark_uptodate(folio);
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 24f9a39ffd56..21d4c1c9b25b 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2460,7 +2460,7 @@ static int f2fs_read_data_folio(struct file *file, struct 
folio *folio)
struct inode *inode = folio_file_mapping(folio)->host;
int ret = -EAGAIN;
 
-   trace_f2fs_readpage(&folio->page, DATA);
+   trace_f2fs_readpage(folio, DATA);
 
if (!f2fs_is_compress_backend_ready(inode)) {
folio_unlock(folio);
@@ -2709,7 +2709,7 @@ int f2fs_do_write_data_page(struct f2fs_io_info *fio)
} else {
set_inode_flag(inode, FI_UPDATE_WRITE);
}
-   trace_f2fs_do_write_data_page(fio->page, IPU);
+   trace_f2fs_do_write_data_page(page_folio(page), IPU);
return err;
}
 
@@ -2738,7 +2738,7 @@ int f2fs_do_write_data_page(struct f2fs_io_info *fio)
 
/* LFS mode write path */
f2fs_outplace_write_data(&dn, fio);
-   trace_f2fs_do_write_data_page(page, OPU);
+   trace_f2fs_do_write_data_page(page_folio(page), OPU);
set_inode_flag(inode, FI_APPEND_WRITE);
 out_writepage:
f2fs_put_dnode(&dn);
@@ -2785,7 +2785,7 @@ int f2fs_write_single_data_page(struct page *page, int 
*submitted,
.last_block = last_block,
};
 
-   trace_f2fs_writepage(page, DATA);
+   trace_f2fs_writepage(page_folio(page), DATA);
 
/* we should bypass data pages to proceed the kworker jobs */
if (unlikely(f2fs_cp_error(sbi))) {
@@ -3759,7 +3759,7 @@ static bool f2fs_dirty_data_folio(struct address_space 
*mapping,
 {
struct inode *inode = mapping->host;
 
-   trace_f2fs_set_page_dirty(&folio->page, DATA);
+   trace_f2fs_set_page_dirty(folio, DATA);
 
if (!folio_test_uptodate(folio))
folio_mark_uptodate(folio);
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 3b9eb5693683..95cecf08cb37 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1624,7 +1624,7 @@ static int __write_node_page(struct page *page, bool 
atomic, bool *submitted,
};
unsigned int seq;
 
-   trace_f2fs_writepage(page, NODE);
+   trace_f2fs_writepage(page_folio(page), NODE);
 
if (unlikely(f2fs_cp_error(sbi))) {
/* keep node pages in remount-ro mode */
@@ -2171,7 +2171,7 @@ static int f2fs_write_node_pages(struct address_space 
*mapping,
 static bool f2fs_dirty_node_folio(struct address_space *mapping,
struct folio *folio)
 {
-   trace_f2fs_set_page_dirty(&folio->page, NODE);
+   trace_f2fs_set_page_dirty(folio, NODE);
 
if (!folio_test_uptodate(folio))
folio_mark_uptodate(folio);
diff --git a/include/trace/events/f2fs.h b/include/trace/events/f2fs.h
index 7ed0fc430dc6..371ba28415f5 100644
--- a/include/trace/events/f2fs.h
+++ b/include/trace/events/f2fs.h
@@ -1304,11 +1304,11 @@ TRACE_EVENT(f2fs_write_end,
__entry->copied)
 );
 
-DECLARE_EVENT_CLASS(f2fs__page,
+DECLARE_EVENT_CLASS(f2fs__folio,
 
-   TP_PROTO(struct page *page, int type),
+   TP_PROTO(struct folio *folio, int type),
 
-   TP_ARGS(page, type),
+   TP_ARGS(folio, type),
 
TP_STRUCT__entry(
__field(dev_t,  dev)
@@ -1321,14 +1321,14 @@ DECLARE_EVENT_CLASS(f2fs__page,
),
 
TP_fast_assign(
-   __entry->dev= page_file_mapping(page)->host->i_sb->s_dev;
-   __entry->ino= page_file_mapping(page)->host->i_ino;
+   __entry->dev= folio_file_mapping(folio)->host->i_sb->s_dev;
+   __entry->ino= folio_fil

Re: [f2fs-dev] [PATCH] f2fs: assign write hint in direct write IO path

2024-04-19 Thread Chao Yu

On 2024/4/20 1:53, Jaegeuk Kim wrote:

Thanks, Chao,

If you don't mind, can I merge this into my patch. Ok?


No problem. :)

Thanks,



On 04/18, Chao Yu wrote:

f2fs has its own write_hint policy, let's assign write hint for
direct write bio.

Cc: Hyunchul Lee 
Signed-off-by: Chao Yu 
---
  fs/f2fs/f2fs.h|  1 +
  fs/f2fs/file.c| 15 ++-
  fs/f2fs/segment.c | 17 +++--
  3 files changed, 26 insertions(+), 7 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index b3b878acc86b..3f7196122574 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -3722,6 +3722,7 @@ void f2fs_replace_block(struct f2fs_sb_info *sbi, struct 
dnode_of_data *dn,
block_t old_addr, block_t new_addr,
unsigned char version, bool recover_curseg,
bool recover_newaddr);
+int f2fs_get_segment_temp(int seg_type);
  int f2fs_allocate_data_block(struct f2fs_sb_info *sbi, struct page *page,
block_t old_blkaddr, block_t *new_blkaddr,
struct f2fs_summary *sum, int type,
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index ac1ae85f3cc3..d382f8bc2fbe 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -4685,8 +4685,21 @@ static int f2fs_dio_write_end_io(struct kiocb *iocb, 
ssize_t size, int error,
return 0;
  }
  
+static void f2fs_dio_write_submit_io(const struct iomap_iter *iter,

+   struct bio *bio, loff_t file_offset)
+{
+   struct inode *inode = iter->inode;
+   struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+   int seg_type = f2fs_rw_hint_to_seg_type(inode->i_write_hint);
+   enum temp_type temp = f2fs_get_segment_temp(seg_type);
+
+   bio->bi_write_hint = f2fs_io_type_to_rw_hint(sbi, DATA, temp);
+   submit_bio(bio);
+}
+
  static const struct iomap_dio_ops f2fs_iomap_dio_write_ops = {
-   .end_io = f2fs_dio_write_end_io,
+   .end_io = f2fs_dio_write_end_io,
+   .submit_io  = f2fs_dio_write_submit_io,
  };
  
  static void f2fs_flush_buffered_write(struct address_space *mapping,

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index daa94669f7ee..2206199e8099 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -3502,6 +3502,15 @@ static int __get_segment_type_6(struct f2fs_io_info *fio)
}
  }
  
+int f2fs_get_segment_temp(int seg_type)

+{
+   if (IS_HOT(seg_type))
+   return HOT;
+   else if (IS_WARM(seg_type))
+   return WARM;
+   return COLD;
+}
+
  static int __get_segment_type(struct f2fs_io_info *fio)
  {
int type = 0;
@@ -3520,12 +3529,8 @@ static int __get_segment_type(struct f2fs_io_info *fio)
f2fs_bug_on(fio->sbi, true);
}
  
-	if (IS_HOT(type))

-   fio->temp = HOT;
-   else if (IS_WARM(type))
-   fio->temp = WARM;
-   else
-   fio->temp = COLD;
+   fio->temp = f2fs_get_segment_temp(type);
+
return type;
  }
  
--

2.40.1



___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


[f2fs-dev] [PATCH] f2fs: assign write hint in direct write IO path

2024-04-17 Thread Chao Yu
f2fs has its own write_hint policy, let's assign write hint for
direct write bio.

Cc: Hyunchul Lee 
Signed-off-by: Chao Yu 
---
 fs/f2fs/f2fs.h|  1 +
 fs/f2fs/file.c| 15 ++-
 fs/f2fs/segment.c | 17 +++--
 3 files changed, 26 insertions(+), 7 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index b3b878acc86b..3f7196122574 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -3722,6 +3722,7 @@ void f2fs_replace_block(struct f2fs_sb_info *sbi, struct 
dnode_of_data *dn,
block_t old_addr, block_t new_addr,
unsigned char version, bool recover_curseg,
bool recover_newaddr);
+int f2fs_get_segment_temp(int seg_type);
 int f2fs_allocate_data_block(struct f2fs_sb_info *sbi, struct page *page,
block_t old_blkaddr, block_t *new_blkaddr,
struct f2fs_summary *sum, int type,
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index ac1ae85f3cc3..d382f8bc2fbe 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -4685,8 +4685,21 @@ static int f2fs_dio_write_end_io(struct kiocb *iocb, 
ssize_t size, int error,
return 0;
 }
 
+static void f2fs_dio_write_submit_io(const struct iomap_iter *iter,
+   struct bio *bio, loff_t file_offset)
+{
+   struct inode *inode = iter->inode;
+   struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+   int seg_type = f2fs_rw_hint_to_seg_type(inode->i_write_hint);
+   enum temp_type temp = f2fs_get_segment_temp(seg_type);
+
+   bio->bi_write_hint = f2fs_io_type_to_rw_hint(sbi, DATA, temp);
+   submit_bio(bio);
+}
+
 static const struct iomap_dio_ops f2fs_iomap_dio_write_ops = {
-   .end_io = f2fs_dio_write_end_io,
+   .end_io = f2fs_dio_write_end_io,
+   .submit_io  = f2fs_dio_write_submit_io,
 };
 
 static void f2fs_flush_buffered_write(struct address_space *mapping,
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index daa94669f7ee..2206199e8099 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -3502,6 +3502,15 @@ static int __get_segment_type_6(struct f2fs_io_info *fio)
}
 }
 
+int f2fs_get_segment_temp(int seg_type)
+{
+   if (IS_HOT(seg_type))
+   return HOT;
+   else if (IS_WARM(seg_type))
+   return WARM;
+   return COLD;
+}
+
 static int __get_segment_type(struct f2fs_io_info *fio)
 {
int type = 0;
@@ -3520,12 +3529,8 @@ static int __get_segment_type(struct f2fs_io_info *fio)
f2fs_bug_on(fio->sbi, true);
}
 
-   if (IS_HOT(type))
-   fio->temp = HOT;
-   else if (IS_WARM(type))
-   fio->temp = WARM;
-   else
-   fio->temp = COLD;
+   fio->temp = f2fs_get_segment_temp(type);
+
return type;
 }
 
-- 
2.40.1
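
For readers following the refactoring above, the temperature mapping that
f2fs_get_segment_temp() factors out can be modelled in user space. This is
only a sketch: the seg_type_model enum, its ordering, and the IS_HOT()/IS_WARM()
definitions below are assumptions written for illustration and are not copied
from fs/f2fs/segment.h. Only the if/else-if/fallthrough shape comes from the patch.

```c
/* Userspace model of the helper introduced by the patch above. */
enum temp_type { HOT, WARM, COLD };

/* Assumed set of log types, named for illustration only. */
enum seg_type_model {
	SEG_HOT_DATA, SEG_WARM_DATA, SEG_COLD_DATA,
	SEG_HOT_NODE, SEG_WARM_NODE, SEG_COLD_NODE,
};

/* Assumed definitions: a type is hot/warm if it is the hot/warm data or node log. */
#define IS_HOT(t)  ((t) == SEG_HOT_NODE || (t) == SEG_HOT_DATA)
#define IS_WARM(t) ((t) == SEG_WARM_NODE || (t) == SEG_WARM_DATA)

/* Same control flow as f2fs_get_segment_temp(): anything not hot or warm is cold. */
int get_segment_temp_model(int seg_type)
{
	if (IS_HOT(seg_type))
		return HOT;
	else if (IS_WARM(seg_type))
		return WARM;
	return COLD;
}
```

With one helper, the buffered path (__get_segment_type()) and the new direct
write path (f2fs_dio_write_submit_io()) derive the temperature the same way.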





Re: [f2fs-dev] [PATCH] f2fs: assign the write hint per stream by default

2024-04-17 Thread Chao Yu

On 2024/4/18 5:12, Jaegeuk Kim wrote:

This reverts commit 930e2607638d ("f2fs: remove obsolete whint_mode"), as we
decide to pass write hints to the disk.

Signed-off-by: Jaegeuk Kim 
---
  Documentation/filesystems/f2fs.rst | 29 +++
  fs/f2fs/data.c |  2 +
  fs/f2fs/f2fs.h |  2 +
  fs/f2fs/segment.c  | 59 ++
  4 files changed, 92 insertions(+)

diff --git a/Documentation/filesystems/f2fs.rst 
b/Documentation/filesystems/f2fs.rst
index efc3493fd6f8..68a0885fb5e6 100644
--- a/Documentation/filesystems/f2fs.rst
+++ b/Documentation/filesystems/f2fs.rst
@@ -774,6 +774,35 @@ In order to identify whether the data in the victim 
segment are valid or not,
  F2FS manages a bitmap. Each bit represents the validity of a block, and the
  bitmap is composed of a bit stream covering whole blocks in main area.
  
+Write-hint Policy

+-----------------
+
+F2FS sets the whint all the time with the below policy.


No user-based mode?

Thanks,


+
+===================== ======================== ===================
+User                  F2FS                     Block
+===================== ======================== ===================
+N/A                   META                     WRITE_LIFE_NONE|REQ_META
+N/A                   HOT_NODE                 WRITE_LIFE_NONE
+N/A                   WARM_NODE                WRITE_LIFE_MEDIUM
+N/A                   COLD_NODE                WRITE_LIFE_LONG
+ioctl(COLD)           COLD_DATA                WRITE_LIFE_EXTREME
+extension list        "                        "
+
+-- buffered io
+N/A                   COLD_DATA                WRITE_LIFE_EXTREME
+N/A                   HOT_DATA                 WRITE_LIFE_SHORT
+N/A                   WARM_DATA                WRITE_LIFE_NOT_SET
+
+-- direct io
+WRITE_LIFE_EXTREME    COLD_DATA                WRITE_LIFE_EXTREME
+WRITE_LIFE_SHORT      HOT_DATA                 WRITE_LIFE_SHORT
+WRITE_LIFE_NOT_SET    WARM_DATA                WRITE_LIFE_NOT_SET
+WRITE_LIFE_NONE       "                        WRITE_LIFE_NONE
+WRITE_LIFE_MEDIUM     "                        WRITE_LIFE_MEDIUM
+WRITE_LIFE_LONG       "                        WRITE_LIFE_LONG
+===================== ======================== ===================
+
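
The META, NODE and buffered-io DATA rows of the table above can be read as a
lookup from (page type, temperature) to a block-layer hint. Below is a minimal
userspace sketch of that reading. All enum and function names are hypothetical
stand-ins chosen for this note, the numeric values are arbitrary, and the
REQ_META flag and the pass-through rows of the direct-io section are not
modelled.

```c
/* Hypothetical stand-ins for the kernel's page type, temperature and hint enums. */
enum page_kind { KIND_DATA, KIND_NODE, KIND_META };
enum temp_kind { TEMP_HOT, TEMP_WARM, TEMP_COLD };
enum life_hint {
	LIFE_NOT_SET, LIFE_NONE, LIFE_SHORT,
	LIFE_MEDIUM, LIFE_LONG, LIFE_EXTREME,
};

/* Map a log (kind, temp) to the hint listed in the "Block" column. */
enum life_hint block_hint_for(enum page_kind kind, enum temp_kind temp)
{
	switch (kind) {
	case KIND_DATA:
		switch (temp) {
		case TEMP_HOT:  return LIFE_SHORT;    /* HOT_DATA  */
		case TEMP_WARM: return LIFE_NOT_SET;  /* WARM_DATA */
		case TEMP_COLD: return LIFE_EXTREME;  /* COLD_DATA */
		}
		break;
	case KIND_NODE:
		switch (temp) {
		case TEMP_HOT:  return LIFE_NONE;     /* HOT_NODE  */
		case TEMP_WARM: return LIFE_MEDIUM;   /* WARM_NODE */
		case TEMP_COLD: return LIFE_LONG;     /* COLD_NODE */
		}
		break;
	case KIND_META:
		return LIFE_NONE;                     /* META */
	}
	return LIFE_NONE; /* fallback for out-of-range input in this model */
}
```
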
  Fallocate(2) Policy
  -------------------
  
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c

index 5d641fac02ba..ed7d08785fcf 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -465,6 +465,8 @@ static struct bio *__bio_alloc(struct f2fs_io_info *fio, 
int npages)
} else {
bio->bi_end_io = f2fs_write_end_io;
bio->bi_private = sbi;
+   bio->bi_write_hint = f2fs_io_type_to_rw_hint(sbi,
+   fio->type, fio->temp);
}
iostat_alloc_and_bind_ctx(sbi, bio, NULL);
  
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h

index dd530dc70005..b3b878acc86b 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -3745,6 +3745,8 @@ void f2fs_destroy_segment_manager(struct f2fs_sb_info 
*sbi);
  int __init f2fs_create_segment_manager_caches(void);
  void f2fs_destroy_segment_manager_caches(void);
  int f2fs_rw_hint_to_seg_type(enum rw_hint hint);
+enum rw_hint f2fs_io_type_to_rw_hint(struct f2fs_sb_info *sbi,
+   enum page_type type, enum temp_type temp);
  unsigned int f2fs_usable_segs_in_sec(struct f2fs_sb_info *sbi,
unsigned int segno);
  unsigned int f2fs_usable_blks_in_seg(struct f2fs_sb_info *sbi,
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index f0da516ba8dc..daa94669f7ee 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -3364,6 +3364,65 @@ int f2fs_rw_hint_to_seg_type(enum rw_hint hint)
}
  }
  
+/*

+ * This returns write hints for each segment type. This hints will be
+ * passed down to block layer as below by default.
+ *
+ * User                  F2FS                     Block
+ * ----                  ----                     -----
+ *                       META                     WRITE_LIFE_NONE|REQ_META
+ *                       HOT_NODE                 WRITE_LIFE_NONE
+ *                       WARM_NODE                WRITE_LIFE_MEDIUM
+ *                       COLD_NODE                WRITE_LIFE_LONG
+ * ioctl(COLD)           COLD_DATA                WRITE_LIFE_EXTREME
+ * extension list        "                        "
+ *
+ * -- buffered io
+ *                       COLD_DATA                WRITE_LIFE_EXTREME
+ *                       HOT_DATA                 WRITE_LIFE_SHORT
+ *                       WARM_DATA                WRITE_LIFE_NOT_SET
+ *
+ * -- direct io
+ * WRITE_LIFE_EXTREME    COLD_DATA                WRITE_LIFE_EXTREME
+ * WRITE_LIFE_SHORT      HOT_DATA                 WRITE_LIFE_SHORT
+ * WRITE_LIFE_NOT_SET    WARM_DATA                WRITE_LIFE_NOT_SET
+ * WRITE_LIFE_NONE       "                        WRITE_LIFE_NONE
+ * WRITE_LIFE_MEDIUM "   

Re: [f2fs-dev] [PATCH v3] f2fs: zone: don't block IO if there is remained open zone

2024-04-16 Thread Chao Yu

On 2024/4/17 0:51, Jaegeuk Kim wrote:

On 04/16, Chao Yu wrote:

On 2024/4/15 22:01, Chao Yu wrote:

On 2024/4/15 11:26, Chao Yu wrote:

On 2024/4/14 23:19, Jaegeuk Kim wrote:

It seems this caused kernel hang. Chao, have you tested this patch enough?


Jaegeuk,

Oh, I've checked this patch w/ fsstress before submitting it, but missed
the SPO testcase... do you encounter kernel hang w/ SPO testcase?


I did see any hang issue w/ por_fsstress testcase, which testcase do you use?


Sorry, I mean I haven't reproduced it yet...


I'd prefer to check this patch later. Have you tested on Zoned device with
nullblk?


Yes, I enabled blkzoned feature w/ nullblk device, and set
/sys/kernel/config/nullb/nullb0/zone_max_open to six, so that it can
emulate ZUFS' configuration.

Thanks,





Thanks,



Thanks,



Anyway, let me test it more.

Thanks,



On 04/13, Chao Yu wrote:

On 2024/4/13 5:11, Jaegeuk Kim wrote:

On 04/07, Chao Yu wrote:

max open zone may be larger than log header number of f2fs, for
such case, it doesn't need to wait last IO in previous zone, let's
introduce available_open_zone semaphore, and reduce it once we
submit first write IO in a zone, and increase it after completion
of last IO in the zone.
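
The accounting described in the paragraph above can be sketched as a
single-threaded counting model. It is not kernel code: struct zone_budget and
the zone_* helpers are hypothetical names, and where the kernel's down() would
sleep until a slot is free, this model simply returns false.

```c
#include <stdbool.h>

/* Hypothetical model of the open-zone budget; not kernel code. */
struct zone_budget {
	unsigned int available_open_zones; /* free open-zone slots */
};

/* Start with as many slots as the device allows zones to be open at once. */
void zone_budget_init(struct zone_budget *zb, unsigned int max_open_zones)
{
	zb->available_open_zones = max_open_zones;
}

/*
 * First write IO into a zone takes one slot. Returns false in the case
 * where the kernel's down() would block until an earlier zone completes
 * its last IO.
 */
bool zone_try_open(struct zone_budget *zb)
{
	if (zb->available_open_zones == 0)
		return false;
	zb->available_open_zones--;
	return true;
}

/* Completion of the last IO in a zone returns the slot, like up(). */
void zone_close(struct zone_budget *zb)
{
	zb->available_open_zones++;
}
```

As long as the budget is larger than the number of active logs, opening a new
zone never has to wait for the previous zone's last IO, which is the case the
patch wants to stop blocking.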

Cc: Daeho Jeong 
Signed-off-by: Chao Yu 
---
v3:
- avoid race condition in between __submit_merged_bio()
and __allocate_new_segment().
    fs/f2fs/data.c    | 105 ++
    fs/f2fs/f2fs.h    |  34 ---
    fs/f2fs/iostat.c  |   7 
    fs/f2fs/iostat.h  |   2 +
    fs/f2fs/segment.c |  43 ---
    fs/f2fs/segment.h |  12 +-
    fs/f2fs/super.c   |   2 +
    7 files changed, 156 insertions(+), 49 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 0d88649c60a5..18a4ac0a06bc 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -373,11 +373,10 @@ static void f2fs_write_end_io(struct bio *bio)
    #ifdef CONFIG_BLK_DEV_ZONED
    static void f2fs_zone_write_end_io(struct bio *bio)
    {
-    struct f2fs_bio_info *io = (struct f2fs_bio_info *)bio->bi_private;
+    struct f2fs_sb_info *sbi = iostat_get_bio_private(bio);
-    bio->bi_private = io->bi_private;
-    complete(&io->zone_wait);
    f2fs_write_end_io(bio);
+    up(&sbi->available_open_zones);
    }
    #endif
@@ -531,6 +530,24 @@ static void __submit_merged_bio(struct f2fs_bio_info *io)
    if (!io->bio)
    return;
+#ifdef CONFIG_BLK_DEV_ZONED
+    if (io->open_zone) {
+    /*
+ * if there is no open zone, it will wait for last IO in
+ * previous zone before submitting new IO.
+ */
+    down(&io->sbi->available_open_zones);
+    io->open_zone = false;
+    io->zone_openned = true;
+    }
+
+    if (io->close_zone) {
+    io->bio->bi_end_io = f2fs_zone_write_end_io;
+    io->zone_openned = false;
+    io->close_zone = false;
+    }
+#endif
+
    if (is_read_io(fio->op)) {
    trace_f2fs_prepare_read_bio(io->sbi->sb, fio->type, io->bio);
    f2fs_submit_read_bio(io->sbi, io->bio, fio->type);
@@ -601,9 +618,9 @@ int f2fs_init_write_merge_io(struct f2fs_sb_info *sbi)
    INIT_LIST_HEAD(&sbi->write_io[i][j].bio_list);
    init_f2fs_rwsem(&sbi->write_io[i][j].bio_list_lock);
    #ifdef CONFIG_BLK_DEV_ZONED
-    init_completion(&sbi->write_io[i][j].zone_wait);
-    sbi->write_io[i][j].zone_pending_bio = NULL;
-    sbi->write_io[i][j].bi_private = NULL;
+    sbi->write_io[i][j].open_zone = false;
+    sbi->write_io[i][j].zone_openned = false;
+    sbi->write_io[i][j].close_zone = false;
    #endif
    }
    }
@@ -634,6 +651,31 @@ static void __f2fs_submit_merged_write(struct f2fs_sb_info 
*sbi,
    f2fs_up_write(&io->io_rwsem);
    }
+void f2fs_blkzoned_submit_merged_write(struct f2fs_sb_info *sbi, int type)
+{
+#ifdef CONFIG_BLK_DEV_ZONED
+    struct f2fs_bio_info *io;
+
+    if (!f2fs_sb_has_blkzoned(sbi))
+    return;
+
+    io = sbi->write_io[PAGE_TYPE(type)] + type_to_temp(type);
+
+    f2fs_down_write(&io->io_rwsem);
+    if (io->zone_openned) {
+    if (io->bio) {
+    io->close_zone = true;
+    __submit_merged_bio(io);
+    } else if (io->zone_openned) {
+    up(&sbi->available_open_zones);
+    io->zone_openned = false;
+    }
+    }
+    f2fs_up_write(&io->io_rwsem);
+#endif
+
+}
+
    static void __submit_merged_write_cond(struct f2fs_sb_info *sbi,
    struct inode *inode, struct page *page,
    nid_t ino, enum page_type type, bool force)
@@ -918,22 +960,16 @@ int f2fs_merge_page_bio(struct f2fs_io_info *fio)
    }
    #ifdef CONFIG_BLK_DEV_ZONED
-static bool is_end_zone_blkaddr(struct f2fs_sb_info *sbi, block_t blkaddr)
+static bool is_blkaddr_zone_boundary(struct f2fs_sb_info *sbi,
+    block_t blkaddr,

Re: [f2fs-dev] [PATCH] common/quota: fix keywords of quota feature in _require_prjquota() for f2fs

2024-04-16 Thread Chao Yu

On 2024/4/16 16:49, Zorro Lang wrote:

On Tue, Apr 16, 2024 at 03:18:19PM +0800, Chao Yu wrote:

Previously, in f2fs, sysfile quota feature has different name:
- "quota" in mkfs.f2fs
- and "quota_ino" in dump.f2fs

Now, it has unified the name to "quota" since commit 92cc5edeb7
("f2fs-tools: reuse feature_table to clean up print_sb_state()").

It needs to fix keywords in _require_prjquota() for f2fs, Otherwise,
quota testcase will fail.

generic/383 1s ... [not run] quota sysfile not enabled in this device /dev/vdc

Cc: Jaegeuk Kim 
Signed-off-by: Chao Yu 
---
  common/quota | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/common/quota b/common/quota
index 6b529bf4..cfe3276f 100644
--- a/common/quota
+++ b/common/quota
@@ -145,7 +145,7 @@ _require_prjquota()
  if [ "$FSTYP" == "f2fs" ]; then
dump.f2fs $_dev 2>&1 | grep -qw project_quota
[ $? -ne 0 ] && _notrun "Project quota not enabled in this device $_dev"
-   dump.f2fs $_dev 2>&1 | grep -qw quota_ino
+   dump.f2fs $_dev 2>&1 | grep -qw quota


This will _notrun on old f2fs-tools, due to `grep -w quota` doesn't match
old "quota_ino". So how about grep -Eqw "quota|quota_ino", or any better idea
you have.


Thanks for your suggestion, I fix this in v2, I've tested v2 w/ old f2fs-tools,
it works fine.

Thanks,



Thanks,
Zorro


[ $? -ne 0 ] && _notrun "quota sysfile not enabled in this device $_dev"
cat /sys/fs/f2fs/features/project_quota | grep -qw supported
[ $? -ne 0 ] && _notrun "Installed kernel does not support project 
quotas"
--
2.40.1









[f2fs-dev] [PATCH v2] common/quota: update keywords of quota feature in _require_prjquota() for f2fs

2024-04-16 Thread Chao Yu
Previously, in f2fs, sysfile quota feature has different name:
- "quota" in mkfs.f2fs
- and "quota_ino" in dump.f2fs

Now, it has unified the name to "quota" since commit 92cc5edeb7
("f2fs-tools: reuse feature_table to clean up print_sb_state()").

The keyword check in _require_prjquota() needs to accept "quota" for f2fs;
otherwise, quota testcases will fail as below.

generic/383 1s ... [not run] quota sysfile not enabled in this device /dev/vdc

This patch keeps keywords "quota_ino" in _require_prjquota() to
keep compatibility for old f2fs-tools.

Cc: Jaegeuk Kim 
Signed-off-by: Chao Yu 
---
v2:
- keep keywords "quota_ino" for compatibility of old f2fs-tools
suggested by Zorro Lang.
 common/quota | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/common/quota b/common/quota
index 6b529bf4..4c1d3dcd 100644
--- a/common/quota
+++ b/common/quota
@@ -145,7 +145,7 @@ _require_prjquota()
 if [ "$FSTYP" == "f2fs" ]; then
dump.f2fs $_dev 2>&1 | grep -qw project_quota
[ $? -ne 0 ] && _notrun "Project quota not enabled in this device $_dev"
-   dump.f2fs $_dev 2>&1 | grep -qw quota_ino
+   dump.f2fs $_dev 2>&1 | grep -Eqw "quota|quota_ino"
[ $? -ne 0 ] && _notrun "quota sysfile not enabled in this device $_dev"
cat /sys/fs/f2fs/features/project_quota | grep -qw supported
[ $? -ne 0 ] && _notrun "Installed kernel does not support project 
quotas"
-- 
2.40.1
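
To see why the plain `grep -qw quota` from v1 misses the old "quota_ino"
keyword, recall that grep -w only matches when the pattern is delimited by
non-word characters, and that letters, digits and the underscore all count as
word characters. The helper below is a small userspace sketch of that rule.
contains_whole_word() is a hypothetical name, and the sample strings in the
note after it are made up rather than real dump.f2fs output.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Word characters as grep -w treats them: letters, digits, underscore. */
static bool is_word_char(char c)
{
	return isalnum((unsigned char)c) || c == '_';
}

/* True if `word` occurs in `line` with non-word characters (or the
 * string boundary) on both sides. */
bool contains_whole_word(const char *line, const char *word)
{
	size_t len = strlen(word);
	const char *p = line;

	if (len == 0)
		return false;

	while ((p = strstr(p, word)) != NULL) {
		bool left_ok = (p == line) || !is_word_char(p[-1]);
		bool right_ok = !is_word_char(p[len]);

		if (left_ok && right_ok)
			return true;
		p++; /* keep scanning after a partial-word hit */
	}
	return false;
}
```

Under this rule "quota" is not found as a whole word in "quota_ino" or in
"project_quota", which is why v2 checks for either keyword with
`grep -Eqw "quota|quota_ino"`.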





[f2fs-dev] [PATCH 4/4] f2fs: convert f2fs__page tracepoint class to use folio

2024-04-16 Thread Chao Yu
Convert f2fs__page tracepoint class() and its instances to use folio
and related functionality, and rename it to f2fs__folio().

Signed-off-by: Chao Yu 
---
 fs/f2fs/checkpoint.c|  4 ++--
 fs/f2fs/data.c  | 10 -
 fs/f2fs/node.c  |  4 ++--
 include/trace/events/f2fs.h | 42 ++---
 4 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index eac698b8dd38..5d05a413f451 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -345,7 +345,7 @@ static int __f2fs_write_meta_page(struct page *page,
 {
struct f2fs_sb_info *sbi = F2FS_P_SB(page);
 
-   trace_f2fs_writepage(page, META);
+   trace_f2fs_writepage(page_folio(page), META);
 
if (unlikely(f2fs_cp_error(sbi))) {
if (is_sbi_flag_set(sbi, SBI_IS_CLOSE)) {
@@ -492,7 +492,7 @@ long f2fs_sync_meta_pages(struct f2fs_sb_info *sbi, enum 
page_type type,
 static bool f2fs_dirty_meta_folio(struct address_space *mapping,
struct folio *folio)
 {
-   trace_f2fs_set_page_dirty(&folio->page, META);
+   trace_f2fs_set_page_dirty(folio, META);
 
if (!folio_test_uptodate(folio))
folio_mark_uptodate(folio);
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 3eb90b9b0f8b..cf6d31e3e630 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2490,7 +2490,7 @@ static int f2fs_read_data_folio(struct file *file, struct 
folio *folio)
struct inode *inode = folio_file_mapping(folio)->host;
int ret = -EAGAIN;
 
-   trace_f2fs_readpage(&folio->page, DATA);
+   trace_f2fs_readpage(folio, DATA);
 
if (!f2fs_is_compress_backend_ready(inode)) {
folio_unlock(folio);
@@ -2739,7 +2739,7 @@ int f2fs_do_write_data_page(struct f2fs_io_info *fio)
} else {
set_inode_flag(inode, FI_UPDATE_WRITE);
}
-   trace_f2fs_do_write_data_page(fio->page, IPU);
+   trace_f2fs_do_write_data_page(page_folio(page), IPU);
return err;
}
 
@@ -2768,7 +2768,7 @@ int f2fs_do_write_data_page(struct f2fs_io_info *fio)
 
/* LFS mode write path */
f2fs_outplace_write_data(&dn, fio);
-   trace_f2fs_do_write_data_page(page, OPU);
+   trace_f2fs_do_write_data_page(page_folio(page), OPU);
set_inode_flag(inode, FI_APPEND_WRITE);
 out_writepage:
f2fs_put_dnode(&dn);
@@ -2815,7 +2815,7 @@ int f2fs_write_single_data_page(struct page *page, int 
*submitted,
.last_block = last_block,
};
 
-   trace_f2fs_writepage(page, DATA);
+   trace_f2fs_writepage(page_folio(page), DATA);
 
/* we should bypass data pages to proceed the kworker jobs */
if (unlikely(f2fs_cp_error(sbi))) {
@@ -3789,7 +3789,7 @@ static bool f2fs_dirty_data_folio(struct address_space 
*mapping,
 {
struct inode *inode = mapping->host;
 
-   trace_f2fs_set_page_dirty(&folio->page, DATA);
+   trace_f2fs_set_page_dirty(folio, DATA);
 
if (!folio_test_uptodate(folio))
folio_mark_uptodate(folio);
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 3b9eb5693683..95cecf08cb37 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1624,7 +1624,7 @@ static int __write_node_page(struct page *page, bool 
atomic, bool *submitted,
};
unsigned int seq;
 
-   trace_f2fs_writepage(page, NODE);
+   trace_f2fs_writepage(page_folio(page), NODE);
 
if (unlikely(f2fs_cp_error(sbi))) {
/* keep node pages in remount-ro mode */
@@ -2171,7 +2171,7 @@ static int f2fs_write_node_pages(struct address_space 
*mapping,
 static bool f2fs_dirty_node_folio(struct address_space *mapping,
struct folio *folio)
 {
-   trace_f2fs_set_page_dirty(&folio->page, NODE);
+   trace_f2fs_set_page_dirty(folio, NODE);
 
if (!folio_test_uptodate(folio))
folio_mark_uptodate(folio);
diff --git a/include/trace/events/f2fs.h b/include/trace/events/f2fs.h
index 7ed0fc430dc6..371ba28415f5 100644
--- a/include/trace/events/f2fs.h
+++ b/include/trace/events/f2fs.h
@@ -1304,11 +1304,11 @@ TRACE_EVENT(f2fs_write_end,
__entry->copied)
 );
 
-DECLARE_EVENT_CLASS(f2fs__page,
+DECLARE_EVENT_CLASS(f2fs__folio,
 
-   TP_PROTO(struct page *page, int type),
+   TP_PROTO(struct folio *folio, int type),
 
-   TP_ARGS(page, type),
+   TP_ARGS(folio, type),
 
TP_STRUCT__entry(
__field(dev_t,  dev)
@@ -1321,14 +1321,14 @@ DECLARE_EVENT_CLASS(f2fs__page,
),
 
TP_fast_assign(
-   __entry->dev= page_file_mapping(page)->host->i_sb->s_dev;
-   __entry->ino= page_file_mapping(page)->host->i_ino;
+   __entry->dev= folio_file_mapping(folio)->host->i_sb->s_dev;
+   __entry->ino= folio_file_mapping

[f2fs-dev] [PATCH 1/4] f2fs: convert f2fs_mpage_readpages() to use folio

2024-04-16 Thread Chao Yu
Convert f2fs_mpage_readpages() to use folio and related
functionality.

Signed-off-by: Chao Yu 
---
 fs/f2fs/data.c | 80 +-
 1 file changed, 40 insertions(+), 40 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 9c5512be1a1b..14dcd621acaa 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2374,7 +2374,7 @@ int f2fs_read_multi_pages(struct compress_ctx *cc, struct 
bio **bio_ret,
  * Major change was from block_size == page_size in f2fs by default.
  */
 static int f2fs_mpage_readpages(struct inode *inode,
-   struct readahead_control *rac, struct page *page)
+   struct readahead_control *rac, struct folio *folio)
 {
struct bio *bio = NULL;
sector_t last_block_in_bio = 0;
@@ -2394,6 +2394,7 @@ static int f2fs_mpage_readpages(struct inode *inode,
 #endif
unsigned nr_pages = rac ? readahead_count(rac) : 1;
unsigned max_nr_pages = nr_pages;
+   pgoff_t index;
int ret = 0;
 
map.m_pblk = 0;
@@ -2407,64 +2408,63 @@ static int f2fs_mpage_readpages(struct inode *inode,
 
for (; nr_pages; nr_pages--) {
if (rac) {
-   page = readahead_page(rac);
-   prefetchw(&page->flags);
+   folio = readahead_folio(rac);
+   prefetchw(&folio->flags);
}
 
-#ifdef CONFIG_F2FS_FS_COMPRESSION
-   if (f2fs_compressed_file(inode)) {
-   /* there are remained compressed pages, submit them */
-   if (!f2fs_cluster_can_merge_page(&cc, page->index)) {
-   ret = f2fs_read_multi_pages(&cc, &bio,
-   max_nr_pages,
-   &last_block_in_bio,
-   rac != NULL, false);
-   f2fs_destroy_compress_ctx(&cc, false);
-   if (ret)
-   goto set_error_page;
-   }
-   if (cc.cluster_idx == NULL_CLUSTER) {
-   if (nc_cluster_idx ==
-   page->index >> cc.log_cluster_size) {
-   goto read_single_page;
-   }
-
-   ret = f2fs_is_compressed_cluster(inode, page->index);
-   if (ret < 0)
-   goto set_error_page;
-   else if (!ret) {
-   nc_cluster_idx =
-   page->index >> cc.log_cluster_size;
-   goto read_single_page;
-   }
+   index = folio_index(folio);
 
-   nc_cluster_idx = NULL_CLUSTER;
-   }
-   ret = f2fs_init_compress_ctx(&cc);
+#ifdef CONFIG_F2FS_FS_COMPRESSION
+   if (!f2fs_compressed_file(inode))
+   goto read_single_page;
+
+   /* there are remained compressed pages, submit them */
+   if (!f2fs_cluster_can_merge_page(&cc, index)) {
+   ret = f2fs_read_multi_pages(&cc, &bio,
+   max_nr_pages,
+   &last_block_in_bio,
+   rac != NULL, false);
+   f2fs_destroy_compress_ctx(&cc, false);
if (ret)
goto set_error_page;
+   }
+   if (cc.cluster_idx == NULL_CLUSTER) {
+   if (nc_cluster_idx == index >> cc.log_cluster_size)
+   goto read_single_page;
 
-   f2fs_compress_ctx_add_page(&cc, page);
+   ret = f2fs_is_compressed_cluster(inode, index);
+   if (ret < 0)
+   goto set_error_page;
+   else if (!ret) {
+   nc_cluster_idx =
+   index >> cc.log_cluster_size;
+   goto read_single_page;
+   }
 
-   goto next_page;
+   nc_cluster_idx = NULL_CLUSTER;
}
+   ret = f2fs_init_compress_ctx(&cc);
+   if (ret)
+   goto set_error_page;
+
+   f2fs_compress_ctx_add_page(&cc, &folio->page);
+
+   goto next_page;
 read_single_page:
 #endif
 
-   ret = f2fs_read_single_page(inode, page, max_nr_pages, &map,
+   ret = f2fs_read_single_page(inode, &folio->page, max_nr_pages, &map,

[f2fs-dev] [PATCH 3/4] f2fs: convert f2fs_read_inline_data() to use folio

2024-04-16 Thread Chao Yu
Convert f2fs_read_inline_data() to use folio and related
functionality, and also convert its caller to use folio.

Signed-off-by: Chao Yu 
---
 fs/f2fs/data.c   | 11 +--
 fs/f2fs/f2fs.h   |  4 ++--
 fs/f2fs/inline.c | 34 +-
 3 files changed, 24 insertions(+), 25 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index c35107657c97..3eb90b9b0f8b 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2487,20 +2487,19 @@ static int f2fs_mpage_readpages(struct inode *inode,
 
 static int f2fs_read_data_folio(struct file *file, struct folio *folio)
 {
-   struct page *page = &folio->page;
-   struct inode *inode = page_file_mapping(page)->host;
+   struct inode *inode = folio_file_mapping(folio)->host;
int ret = -EAGAIN;
 
-   trace_f2fs_readpage(page, DATA);
+   trace_f2fs_readpage(&folio->page, DATA);
 
if (!f2fs_is_compress_backend_ready(inode)) {
-   unlock_page(page);
+   folio_unlock(folio);
return -EOPNOTSUPP;
}
 
/* If the file has inline data, try to read it directly */
if (f2fs_has_inline_data(inode))
-   ret = f2fs_read_inline_data(inode, page);
+   ret = f2fs_read_inline_data(inode, folio);
if (ret == -EAGAIN)
ret = f2fs_mpage_readpages(inode, NULL, folio);
return ret;
@@ -3429,7 +3428,7 @@ static int prepare_write_begin(struct f2fs_sb_info *sbi,
 
if (f2fs_has_inline_data(inode)) {
if (pos + len <= MAX_INLINE_DATA(inode)) {
-   f2fs_do_read_inline_data(page, ipage);
+   f2fs_do_read_inline_data(page_folio(page), ipage);
set_inode_flag(inode, FI_DATA_EXIST);
if (inode->i_nlink)
set_page_private_inline(ipage);
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 34acd791c198..13dee521fbe8 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -4153,10 +4153,10 @@ extern struct kmem_cache *f2fs_inode_entry_slab;
 bool f2fs_may_inline_data(struct inode *inode);
 bool f2fs_sanity_check_inline_data(struct inode *inode);
 bool f2fs_may_inline_dentry(struct inode *inode);
-void f2fs_do_read_inline_data(struct page *page, struct page *ipage);
+void f2fs_do_read_inline_data(struct folio *folio, struct page *ipage);
 void f2fs_truncate_inline_inode(struct inode *inode,
struct page *ipage, u64 from);
-int f2fs_read_inline_data(struct inode *inode, struct page *page);
+int f2fs_read_inline_data(struct inode *inode, struct folio *folio);
 int f2fs_convert_inline_page(struct dnode_of_data *dn, struct page *page);
 int f2fs_convert_inline_inode(struct inode *inode);
 int f2fs_try_convert_inline_dir(struct inode *dir, struct dentry *dentry);
diff --git a/fs/f2fs/inline.c b/fs/f2fs/inline.c
index 3d3218a4b29d..7638d0d7b7ee 100644
--- a/fs/f2fs/inline.c
+++ b/fs/f2fs/inline.c
@@ -61,22 +61,22 @@ bool f2fs_may_inline_dentry(struct inode *inode)
return true;
 }
 
-void f2fs_do_read_inline_data(struct page *page, struct page *ipage)
+void f2fs_do_read_inline_data(struct folio *folio, struct page *ipage)
 {
-   struct inode *inode = page->mapping->host;
+   struct inode *inode = folio_file_mapping(folio)->host;
 
-   if (PageUptodate(page))
+   if (folio_test_uptodate(folio))
return;
 
-   f2fs_bug_on(F2FS_P_SB(page), page->index);
+   f2fs_bug_on(F2FS_I_SB(inode), folio_index(folio));
 
-   zero_user_segment(page, MAX_INLINE_DATA(inode), PAGE_SIZE);
+   folio_zero_segment(folio, MAX_INLINE_DATA(inode), folio_size(folio));
 
/* Copy the whole inline data block */
-   memcpy_to_page(page, 0, inline_data_addr(inode, ipage),
+   memcpy_to_folio(folio, 0, inline_data_addr(inode, ipage),
   MAX_INLINE_DATA(inode));
-   if (!PageUptodate(page))
-   SetPageUptodate(page);
+   if (!folio_test_uptodate(folio))
+   folio_mark_uptodate(folio);
 }
 
 void f2fs_truncate_inline_inode(struct inode *inode,
@@ -97,13 +97,13 @@ void f2fs_truncate_inline_inode(struct inode *inode,
clear_inode_flag(inode, FI_DATA_EXIST);
 }
 
-int f2fs_read_inline_data(struct inode *inode, struct page *page)
+int f2fs_read_inline_data(struct inode *inode, struct folio *folio)
 {
struct page *ipage;
 
ipage = f2fs_get_node_page(F2FS_I_SB(inode), inode->i_ino);
if (IS_ERR(ipage)) {
-   unlock_page(page);
+   folio_unlock(folio);
return PTR_ERR(ipage);
}
 
@@ -112,15 +112,15 @@ int f2fs_read_inline_data(struct inode *inode, struct page *page)
return -EAGAIN;
}
 
-   if (page->index)
-   zero_user_segment(page, 0, PAGE_SIZE);
+   if (folio_index(folio))
+   folio_zero_segment(folio,

[f2fs-dev] [PATCH 2/4] f2fs: convert f2fs_read_single_page() to use folio

2024-04-16 Thread Chao Yu
Convert f2fs_read_single_page() to use folio and related
functionality.

Signed-off-by: Chao Yu 
---
 fs/f2fs/data.c | 27 ++-
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 14dcd621acaa..c35107657c97 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2092,7 +2092,7 @@ static inline loff_t f2fs_readpage_limit(struct inode *inode)
return i_size_read(inode);
 }
 
-static int f2fs_read_single_page(struct inode *inode, struct page *page,
+static int f2fs_read_single_page(struct inode *inode, struct folio *folio,
unsigned nr_pages,
struct f2fs_map_blocks *map,
struct bio **bio_ret,
@@ -2105,9 +2105,10 @@ static int f2fs_read_single_page(struct inode *inode, struct page *page,
sector_t last_block;
sector_t last_block_in_file;
sector_t block_nr;
+   pgoff_t index = folio_index(folio);
int ret = 0;
 
-   block_in_file = (sector_t)page_index(page);
+   block_in_file = (sector_t)index;
last_block = block_in_file + nr_pages;
last_block_in_file = bytes_to_blks(inode,
f2fs_readpage_limit(inode) + blocksize - 1);
@@ -2138,7 +2139,7 @@ static int f2fs_read_single_page(struct inode *inode, struct page *page,
 got_it:
if ((map->m_flags & F2FS_MAP_MAPPED)) {
block_nr = map->m_pblk + block_in_file - map->m_lblk;
-   SetPageMappedToDisk(page);
+   folio_set_mappedtodisk(folio);
 
if (!f2fs_is_valid_blkaddr(F2FS_I_SB(inode), block_nr,
DATA_GENERIC_ENHANCE_READ)) {
@@ -2147,15 +2148,15 @@ static int f2fs_read_single_page(struct inode *inode, struct page *page,
}
} else {
 zero_out:
-   zero_user_segment(page, 0, PAGE_SIZE);
-   if (f2fs_need_verity(inode, page->index) &&
-   !fsverity_verify_page(page)) {
+   folio_zero_segment(folio, 0, folio_size(folio));
+   if (f2fs_need_verity(inode, index) &&
+   !fsverity_verify_folio(folio)) {
ret = -EIO;
goto out;
}
-   if (!PageUptodate(page))
-   SetPageUptodate(page);
-   unlock_page(page);
+   if (!folio_test_uptodate(folio))
+   folio_mark_uptodate(folio);
+   folio_unlock(folio);
goto out;
}
 
@@ -2165,14 +2166,14 @@ static int f2fs_read_single_page(struct inode *inode, struct page *page,
 */
if (bio && (!page_is_mergeable(F2FS_I_SB(inode), bio,
   *last_block_in_bio, block_nr) ||
-   !f2fs_crypt_mergeable_bio(bio, inode, page->index, NULL))) {
+   !f2fs_crypt_mergeable_bio(bio, inode, index, NULL))) {
 submit_and_realloc:
f2fs_submit_read_bio(F2FS_I_SB(inode), bio, DATA);
bio = NULL;
}
if (bio == NULL) {
bio = f2fs_grab_read_bio(inode, block_nr, nr_pages,
-   is_readahead ? REQ_RAHEAD : 0, page->index,
+   is_readahead ? REQ_RAHEAD : 0, index,
false);
if (IS_ERR(bio)) {
ret = PTR_ERR(bio);
@@ -2187,7 +2188,7 @@ static int f2fs_read_single_page(struct inode *inode, struct page *page,
 */
f2fs_wait_on_block_writeback(inode, block_nr);
 
-   if (bio_add_page(bio, page, blocksize, 0) < blocksize)
+   if (!bio_add_folio(bio, folio, blocksize, 0))
goto submit_and_realloc;
 
inc_page_count(F2FS_I_SB(inode), F2FS_RD_DATA);
@@ -2453,7 +2454,7 @@ static int f2fs_mpage_readpages(struct inode *inode,
 read_single_page:
 #endif
 
-   ret = f2fs_read_single_page(inode, &folio->page, max_nr_pages, &map,
+   ret = f2fs_read_single_page(inode, folio, max_nr_pages, &map,
&bio, &last_block_in_bio, rac);
if (ret) {
 #ifdef CONFIG_F2FS_FS_COMPRESSION
-- 
2.40.1





[f2fs-dev] [PATCH 2/2] f2fs: remove unnecessary block size check in init_f2fs_fs()

2024-04-16 Thread Chao Yu
After commit d7e9a9037de2 ("f2fs: Support Block Size == Page Size"),
F2FS_BLKSIZE always equals PAGE_SIZE, so remove the unnecessary check.

Signed-off-by: Chao Yu 
---
 fs/f2fs/super.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 6d1e4fc629e2..32aa6d6fa871 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -4933,12 +4933,6 @@ static int __init init_f2fs_fs(void)
 {
int err;
 
-   if (PAGE_SIZE != F2FS_BLKSIZE) {
-   printk("F2FS not supported on PAGE_SIZE(%lu) != BLOCK_SIZE(%lu)\n",
-   PAGE_SIZE, F2FS_BLKSIZE);
-   return -EINVAL;
-   }
-
err = init_inodecache();
if (err)
goto fail;
-- 
2.40.1





[f2fs-dev] [PATCH 1/2] f2fs: fix comment in sanity_check_raw_super()

2024-04-16 Thread Chao Yu
Commit d7e9a9037de2 ("f2fs: Support Block Size == Page Size") did not
update the comment in sanity_check_raw_super(); fix it.

Signed-off-by: Chao Yu 
---
 fs/f2fs/super.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 0a34c8746782..6d1e4fc629e2 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -3456,7 +3456,7 @@ static int sanity_check_raw_super(struct f2fs_sb_info *sbi,
}
}
 
-   /* Currently, support only 4KB block size */
+   /* only support block_size equals to PAGE_SIZE */
if (le32_to_cpu(raw_super->log_blocksize) != F2FS_BLKSIZE_BITS) {
f2fs_info(sbi, "Invalid log_blocksize (%u), supports only %u",
  le32_to_cpu(raw_super->log_blocksize),
-- 
2.40.1





[f2fs-dev] [PATCH] common/quota: fix keywords of quota feature in _require_prjquota() for f2fs

2024-04-16 Thread Chao Yu
Previously, f2fs-tools used different names for the sysfile quota feature:
- "quota" in mkfs.f2fs
- and "quota_ino" in dump.f2fs

The name has been unified to "quota" since commit 92cc5edeb7
("f2fs-tools: reuse feature_table to clean up print_sb_state()").

The keyword in _require_prjquota() needs to be updated for f2fs accordingly;
otherwise, quota testcases will fail to run:

generic/383 1s ... [not run] quota sysfile not enabled in this device /dev/vdc

Cc: Jaegeuk Kim 
Signed-off-by: Chao Yu 
---
 common/quota | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/common/quota b/common/quota
index 6b529bf4..cfe3276f 100644
--- a/common/quota
+++ b/common/quota
@@ -145,7 +145,7 @@ _require_prjquota()
 if [ "$FSTYP" == "f2fs" ]; then
dump.f2fs $_dev 2>&1 | grep -qw project_quota
[ $? -ne 0 ] && _notrun "Project quota not enabled in this device $_dev"
-   dump.f2fs $_dev 2>&1 | grep -qw quota_ino
+   dump.f2fs $_dev 2>&1 | grep -qw quota
[ $? -ne 0 ] && _notrun "quota sysfile not enabled in this device $_dev"
cat /sys/fs/f2fs/features/project_quota | grep -qw supported
[ $? -ne 0 ] && _notrun "Installed kernel does not support project quotas"
-- 
2.40.1
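
The keyword change in the patch above depends on the whole-word matching that grep -w performs: letters, digits and the underscore are word constituents, so "quota" does not match inside "quota_ino" or "project_quota". The following is a minimal user-space sketch of that rule for a fixed string. The helper names (is_word_char, contains_word) and the sample feature strings are hypothetical and are not taken from xfstests or f2fs-tools.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Word-constituent characters as grep -w treats them: letters, digits, '_'. */
static bool is_word_char(char c)
{
	return isalnum((unsigned char)c) || c == '_';
}

/*
 * Return true if needle occurs in line as a whole word, i.e. it is not
 * directly preceded or followed by a word-constituent character.
 * Hypothetical helper, fixed-string matching only (no regex support).
 */
static bool contains_word(const char *line, const char *needle)
{
	size_t n = strlen(needle);
	const char *p = line;

	if (n == 0)
		return false;

	while ((p = strstr(p, needle)) != NULL) {
		bool left_ok = (p == line) || !is_word_char(p[-1]);
		bool right_ok = (p[n] == '\0') || !is_word_char(p[n]);

		if (left_ok && right_ok)
			return true;
		p++;	/* this candidate failed, keep scanning */
	}
	return false;
}
```

With a feature list that contains only "project_quota quota_ino", a whole-word search for "quota" finds nothing; once the tool prints "quota" as its own word, the same search succeeds, which is why the test helper has to look for the unified name.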





[f2fs-dev] [PATCH] mkfs.f2fs: add description for ro feature in manual

2024-04-16 Thread Chao Yu
Add missing description for readonly feature in manual of mkfs.f2fs.

Signed-off-by: Chao Yu 
---
 man/mkfs.f2fs.8 | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/man/mkfs.f2fs.8 b/man/mkfs.f2fs.8
index 0dc367b..1f0c724 100644
--- a/man/mkfs.f2fs.8
+++ b/man/mkfs.f2fs.8
@@ -208,6 +208,9 @@ Enable casefolding support in the filesystem. Optional flags can be passed with
 .TP
 .B compression
 Enable support for filesystem level compression. Requires extra attr.
+.TP
+.B ro
+Enable readonly feature to eliminate OVP/SSA on-disk layout for small readonly partition.
 .RE
 .TP
 .BI \-C " encoding:flags"
-- 
2.40.1





Re: [f2fs-dev] [PATCH v3] f2fs: zone: don't block IO if there is remained open zone

2024-04-15 Thread Chao Yu

On 2024/4/15 22:01, Chao Yu wrote:

On 2024/4/15 11:26, Chao Yu wrote:

On 2024/4/14 23:19, Jaegeuk Kim wrote:

It seems this caused kernel hang. Chao, have you tested this patch enough?


Jaegeuk,

Oh, I've checked this patch w/ fsstress before submitting it, but missed
the SPO testcase... do you encounter kernel hang w/ SPO testcase?


I did see any hang issue w/ por_fsstress testcase, which testcase do you use?


Sorry, I mean I haven't reproduced it yet...

Thanks,



Thanks,



Anyway, let me test it more.

Thanks,



On 04/13, Chao Yu wrote:

On 2024/4/13 5:11, Jaegeuk Kim wrote:

On 04/07, Chao Yu wrote:

max open zone may be larger than log header number of f2fs, for
such case, it doesn't need to wait last IO in previous zone, let's
introduce available_open_zone semaphore, and reduce it once we
submit first write IO in a zone, and increase it after completion
of last IO in the zone.

Cc: Daeho Jeong 
Signed-off-by: Chao Yu 
---
v3:
- avoid race condition in between __submit_merged_bio()
and __allocate_new_segment().
   fs/f2fs/data.c    | 105 ++
   fs/f2fs/f2fs.h    |  34 ---
   fs/f2fs/iostat.c  |   7 
   fs/f2fs/iostat.h  |   2 +
   fs/f2fs/segment.c |  43 ---
   fs/f2fs/segment.h |  12 +-
   fs/f2fs/super.c   |   2 +
   7 files changed, 156 insertions(+), 49 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 0d88649c60a5..18a4ac0a06bc 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -373,11 +373,10 @@ static void f2fs_write_end_io(struct bio *bio)
   #ifdef CONFIG_BLK_DEV_ZONED
   static void f2fs_zone_write_end_io(struct bio *bio)
   {
-    struct f2fs_bio_info *io = (struct f2fs_bio_info *)bio->bi_private;
+    struct f2fs_sb_info *sbi = iostat_get_bio_private(bio);
-    bio->bi_private = io->bi_private;
-    complete(&io->zone_wait);
   f2fs_write_end_io(bio);
+    up(&sbi->available_open_zones);
   }
   #endif
@@ -531,6 +530,24 @@ static void __submit_merged_bio(struct f2fs_bio_info *io)
   if (!io->bio)
   return;
+#ifdef CONFIG_BLK_DEV_ZONED
+    if (io->open_zone) {
+    /*
+ * if there is no open zone, it will wait for last IO in
+ * previous zone before submitting new IO.
+ */
+    down(&io->sbi->available_open_zones);
+    io->open_zone = false;
+    io->zone_openned = true;
+    }
+
+    if (io->close_zone) {
+    io->bio->bi_end_io = f2fs_zone_write_end_io;
+    io->zone_openned = false;
+    io->close_zone = false;
+    }
+#endif
+
   if (is_read_io(fio->op)) {
   trace_f2fs_prepare_read_bio(io->sbi->sb, fio->type, io->bio);
   f2fs_submit_read_bio(io->sbi, io->bio, fio->type);
@@ -601,9 +618,9 @@ int f2fs_init_write_merge_io(struct f2fs_sb_info *sbi)
   INIT_LIST_HEAD(&sbi->write_io[i][j].bio_list);
   init_f2fs_rwsem(&sbi->write_io[i][j].bio_list_lock);
   #ifdef CONFIG_BLK_DEV_ZONED
-    init_completion(&sbi->write_io[i][j].zone_wait);
-    sbi->write_io[i][j].zone_pending_bio = NULL;
-    sbi->write_io[i][j].bi_private = NULL;
+    sbi->write_io[i][j].open_zone = false;
+    sbi->write_io[i][j].zone_openned = false;
+    sbi->write_io[i][j].close_zone = false;
   #endif
   }
   }
@@ -634,6 +651,31 @@ static void __f2fs_submit_merged_write(struct f2fs_sb_info *sbi,
   f2fs_up_write(&io->io_rwsem);
   }
+void f2fs_blkzoned_submit_merged_write(struct f2fs_sb_info *sbi, int type)
+{
+#ifdef CONFIG_BLK_DEV_ZONED
+    struct f2fs_bio_info *io;
+
+    if (!f2fs_sb_has_blkzoned(sbi))
+    return;
+
+    io = sbi->write_io[PAGE_TYPE(type)] + type_to_temp(type);
+
+    f2fs_down_write(&io->io_rwsem);
+    if (io->zone_openned) {
+    if (io->bio) {
+    io->close_zone = true;
+    __submit_merged_bio(io);
+    } else if (io->zone_openned) {
+    up(&sbi->available_open_zones);
+    io->zone_openned = false;
+    }
+    }
+    f2fs_up_write(&io->io_rwsem);
+#endif
+
+}
+
   static void __submit_merged_write_cond(struct f2fs_sb_info *sbi,
   struct inode *inode, struct page *page,
   nid_t ino, enum page_type type, bool force)
@@ -918,22 +960,16 @@ int f2fs_merge_page_bio(struct f2fs_io_info *fio)
   }
   #ifdef CONFIG_BLK_DEV_ZONED
-static bool is_end_zone_blkaddr(struct f2fs_sb_info *sbi, block_t blkaddr)
+static bool is_blkaddr_zone_boundary(struct f2fs_sb_info *sbi,
+    block_t blkaddr, bool start)
   {
-    int devi = 0;
+    if (!f2fs_blkaddr_in_seqzone(sbi, blkaddr))
+    return false;
+
+    if (start)
+    return (blkaddr % sbi->blocks_per_blkz) == 0;
+    return (blkaddr % sbi->blocks_per_blkz == sbi->blocks_per_blkz - 1);
-    if (f2fs_is_multi_device(sbi)) {
-    devi = f2fs_target_device_index(sbi, blkaddr);
-

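The accounting described in the quoted commit message (take one open-zone slot when the first write to a zone is submitted, give it back once the last IO in that zone has completed) can be modelled in user space as follows. This is a simplified single-threaded sketch under stated assumptions: the type and function names (zone_budget, log_state, zone_open, zone_close) are hypothetical, a plain counter stands in for the kernel semaphore, and the case where the patch would sleep in down() is reported as a false return value instead.

```c
#include <stdbool.h>

/* Hypothetical model: how many more zones may be opened right now. */
struct zone_budget {
	unsigned int available;
};

/* Hypothetical per-log state: does this log currently hold a slot? */
struct log_state {
	bool zone_opened;
};

/*
 * First write into a new zone: take a slot if one is free.
 * Returns false where the kernel patch would block in down().
 */
static bool zone_open(struct zone_budget *b, struct log_state *log)
{
	if (log->zone_opened)
		return true;	/* slot already held by this log */
	if (b->available == 0)
		return false;	/* no slot left, caller has to wait */
	b->available--;
	log->zone_opened = true;
	return true;
}

/* Last IO of the zone has completed: release the slot again. */
static void zone_close(struct zone_budget *b, struct log_state *log)
{
	if (!log->zone_opened)
		return;
	log->zone_opened = false;
	b->available++;
}
```

The model merges two steps that the patch keeps apart: there, zone_openned is cleared when the closing bio is submitted, while the slot itself is released later from the bio completion handler.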
Re: [f2fs-dev] [PATCH v3] f2fs: zone: don't block IO if there is remained open zone

2024-04-15 Thread Chao Yu

On 2024/4/15 11:26, Chao Yu wrote:

On 2024/4/14 23:19, Jaegeuk Kim wrote:

It seems this caused kernel hang. Chao, have you tested this patch enough?


Jaegeuk,

Oh, I've checked this patch w/ fsstress before submitting it, but missed
the SPO testcase... do you encounter kernel hang w/ SPO testcase?


I did see any hang issue w/ por_fsstress testcase, which testcase do you use?

Thanks,



Anyway, let me test it more.

Thanks,



On 04/13, Chao Yu wrote:

On 2024/4/13 5:11, Jaegeuk Kim wrote:

On 04/07, Chao Yu wrote:

max open zone may be larger than log header number of f2fs, for
such case, it doesn't need to wait last IO in previous zone, let's
introduce available_open_zone semaphore, and reduce it once we
submit first write IO in a zone, and increase it after completion
of last IO in the zone.

Cc: Daeho Jeong 
Signed-off-by: Chao Yu 
---
v3:
- avoid race condition in between __submit_merged_bio()
and __allocate_new_segment().
   fs/f2fs/data.c    | 105 ++
   fs/f2fs/f2fs.h    |  34 ---
   fs/f2fs/iostat.c  |   7 
   fs/f2fs/iostat.h  |   2 +
   fs/f2fs/segment.c |  43 ---
   fs/f2fs/segment.h |  12 +-
   fs/f2fs/super.c   |   2 +
   7 files changed, 156 insertions(+), 49 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 0d88649c60a5..18a4ac0a06bc 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -373,11 +373,10 @@ static void f2fs_write_end_io(struct bio *bio)
   #ifdef CONFIG_BLK_DEV_ZONED
   static void f2fs_zone_write_end_io(struct bio *bio)
   {
-    struct f2fs_bio_info *io = (struct f2fs_bio_info *)bio->bi_private;
+    struct f2fs_sb_info *sbi = iostat_get_bio_private(bio);
-    bio->bi_private = io->bi_private;
-    complete(&io->zone_wait);
   f2fs_write_end_io(bio);
+    up(&sbi->available_open_zones);
   }
   #endif
@@ -531,6 +530,24 @@ static void __submit_merged_bio(struct f2fs_bio_info *io)
   if (!io->bio)
   return;
+#ifdef CONFIG_BLK_DEV_ZONED
+    if (io->open_zone) {
+    /*
+ * if there is no open zone, it will wait for last IO in
+ * previous zone before submitting new IO.
+ */
+    down(&io->sbi->available_open_zones);
+    io->open_zone = false;
+    io->zone_openned = true;
+    }
+
+    if (io->close_zone) {
+    io->bio->bi_end_io = f2fs_zone_write_end_io;
+    io->zone_openned = false;
+    io->close_zone = false;
+    }
+#endif
+
   if (is_read_io(fio->op)) {
   trace_f2fs_prepare_read_bio(io->sbi->sb, fio->type, io->bio);
   f2fs_submit_read_bio(io->sbi, io->bio, fio->type);
@@ -601,9 +618,9 @@ int f2fs_init_write_merge_io(struct f2fs_sb_info *sbi)
   INIT_LIST_HEAD(&sbi->write_io[i][j].bio_list);
   init_f2fs_rwsem(&sbi->write_io[i][j].bio_list_lock);
   #ifdef CONFIG_BLK_DEV_ZONED
-    init_completion(&sbi->write_io[i][j].zone_wait);
-    sbi->write_io[i][j].zone_pending_bio = NULL;
-    sbi->write_io[i][j].bi_private = NULL;
+    sbi->write_io[i][j].open_zone = false;
+    sbi->write_io[i][j].zone_openned = false;
+    sbi->write_io[i][j].close_zone = false;
   #endif
   }
   }
@@ -634,6 +651,31 @@ static void __f2fs_submit_merged_write(struct f2fs_sb_info *sbi,
   f2fs_up_write(&io->io_rwsem);
   }
+void f2fs_blkzoned_submit_merged_write(struct f2fs_sb_info *sbi, int type)
+{
+#ifdef CONFIG_BLK_DEV_ZONED
+    struct f2fs_bio_info *io;
+
+    if (!f2fs_sb_has_blkzoned(sbi))
+    return;
+
+    io = sbi->write_io[PAGE_TYPE(type)] + type_to_temp(type);
+
+    f2fs_down_write(&io->io_rwsem);
+    if (io->zone_openned) {
+    if (io->bio) {
+    io->close_zone = true;
+    __submit_merged_bio(io);
+    } else if (io->zone_openned) {
+    up(&sbi->available_open_zones);
+    io->zone_openned = false;
+    }
+    }
+    f2fs_up_write(&io->io_rwsem);
+#endif
+
+}
+
   static void __submit_merged_write_cond(struct f2fs_sb_info *sbi,
   struct inode *inode, struct page *page,
   nid_t ino, enum page_type type, bool force)
@@ -918,22 +960,16 @@ int f2fs_merge_page_bio(struct f2fs_io_info *fio)
   }
   #ifdef CONFIG_BLK_DEV_ZONED
-static bool is_end_zone_blkaddr(struct f2fs_sb_info *sbi, block_t blkaddr)
+static bool is_blkaddr_zone_boundary(struct f2fs_sb_info *sbi,
+    block_t blkaddr, bool start)
   {
-    int devi = 0;
+    if (!f2fs_blkaddr_in_seqzone(sbi, blkaddr))
+    return false;
+
+    if (start)
+    return (blkaddr % sbi->blocks_per_blkz) == 0;
+    return (blkaddr % sbi->blocks_per_blkz == sbi->blocks_per_blkz - 1);
-    if (f2fs_is_multi_device(sbi)) {
-    devi = f2fs_target_device_index(sbi, blkaddr);
-    if (blkaddr < FDEV(devi).start_blk ||
-    blkaddr > FDEV(devi).end_b

Re: [f2fs-dev] [PATCH v3] f2fs: zone: don't block IO if there is remained open zone

2024-04-14 Thread Chao Yu

On 2024/4/14 23:19, Jaegeuk Kim wrote:

It seems this caused kernel hang. Chao, have you tested this patch enough?


Jaegeuk,

Oh, I've checked this patch w/ fsstress before submitting it, but missed
the SPO testcase... do you encounter kernel hang w/ SPO testcase?

Anyway, let me test it more.

Thanks,



On 04/13, Chao Yu wrote:

On 2024/4/13 5:11, Jaegeuk Kim wrote:

On 04/07, Chao Yu wrote:

max open zone may be larger than log header number of f2fs, for
such case, it doesn't need to wait last IO in previous zone, let's
introduce available_open_zone semaphore, and reduce it once we
submit first write IO in a zone, and increase it after completion
of last IO in the zone.

Cc: Daeho Jeong 
Signed-off-by: Chao Yu 
---
v3:
- avoid race condition in between __submit_merged_bio()
and __allocate_new_segment().
   fs/f2fs/data.c| 105 ++
   fs/f2fs/f2fs.h|  34 ---
   fs/f2fs/iostat.c  |   7 
   fs/f2fs/iostat.h  |   2 +
   fs/f2fs/segment.c |  43 ---
   fs/f2fs/segment.h |  12 +-
   fs/f2fs/super.c   |   2 +
   7 files changed, 156 insertions(+), 49 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 0d88649c60a5..18a4ac0a06bc 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -373,11 +373,10 @@ static void f2fs_write_end_io(struct bio *bio)
   #ifdef CONFIG_BLK_DEV_ZONED
   static void f2fs_zone_write_end_io(struct bio *bio)
   {
-   struct f2fs_bio_info *io = (struct f2fs_bio_info *)bio->bi_private;
+   struct f2fs_sb_info *sbi = iostat_get_bio_private(bio);
-   bio->bi_private = io->bi_private;
-   complete(&io->zone_wait);
f2fs_write_end_io(bio);
+   up(&sbi->available_open_zones);
   }
   #endif
@@ -531,6 +530,24 @@ static void __submit_merged_bio(struct f2fs_bio_info *io)
if (!io->bio)
return;
+#ifdef CONFIG_BLK_DEV_ZONED
+   if (io->open_zone) {
+   /*
+* if there is no open zone, it will wait for last IO in
+* previous zone before submitting new IO.
+*/
+   down(&io->sbi->available_open_zones);
+   io->open_zone = false;
+   io->zone_openned = true;
+   }
+
+   if (io->close_zone) {
+   io->bio->bi_end_io = f2fs_zone_write_end_io;
+   io->zone_openned = false;
+   io->close_zone = false;
+   }
+#endif
+
if (is_read_io(fio->op)) {
trace_f2fs_prepare_read_bio(io->sbi->sb, fio->type, io->bio);
f2fs_submit_read_bio(io->sbi, io->bio, fio->type);
@@ -601,9 +618,9 @@ int f2fs_init_write_merge_io(struct f2fs_sb_info *sbi)
INIT_LIST_HEAD(&sbi->write_io[i][j].bio_list);
init_f2fs_rwsem(&sbi->write_io[i][j].bio_list_lock);
   #ifdef CONFIG_BLK_DEV_ZONED
-   init_completion(&sbi->write_io[i][j].zone_wait);
-   sbi->write_io[i][j].zone_pending_bio = NULL;
-   sbi->write_io[i][j].bi_private = NULL;
+   sbi->write_io[i][j].open_zone = false;
+   sbi->write_io[i][j].zone_openned = false;
+   sbi->write_io[i][j].close_zone = false;
   #endif
}
}
@@ -634,6 +651,31 @@ static void __f2fs_submit_merged_write(struct f2fs_sb_info *sbi,
f2fs_up_write(&io->io_rwsem);
   }
+void f2fs_blkzoned_submit_merged_write(struct f2fs_sb_info *sbi, int type)
+{
+#ifdef CONFIG_BLK_DEV_ZONED
+   struct f2fs_bio_info *io;
+
+   if (!f2fs_sb_has_blkzoned(sbi))
+   return;
+
+   io = sbi->write_io[PAGE_TYPE(type)] + type_to_temp(type);
+
+   f2fs_down_write(&io->io_rwsem);
+   if (io->zone_openned) {
+   if (io->bio) {
+   io->close_zone = true;
+   __submit_merged_bio(io);
+   } else if (io->zone_openned) {
+   up(&sbi->available_open_zones);
+   io->zone_openned = false;
+   }
+   }
+   f2fs_up_write(&io->io_rwsem);
+#endif
+
+}
+
   static void __submit_merged_write_cond(struct f2fs_sb_info *sbi,
struct inode *inode, struct page *page,
nid_t ino, enum page_type type, bool force)
@@ -918,22 +960,16 @@ int f2fs_merge_page_bio(struct f2fs_io_info *fio)
   }
   #ifdef CONFIG_BLK_DEV_ZONED
-static bool is_end_zone_blkaddr(struct f2fs_sb_info *sbi, block_t blkaddr)
+static bool is_blkaddr_zone_boundary(struct f2fs_sb_info *sbi,
+   block_t blkaddr, bool start)
   {
-   int devi = 0;
+   if (!f2fs_blkaddr_in_seqzone(sbi, blkaddr))
+   return false;
+
+   if (start)
+   return (blkaddr % sbi->blocks_per_blkz) == 0;
+   r

Re: [f2fs-dev] [PATCH 2/2] f2fs: allow direct io of pinned files for zoned storage

2024-04-13 Thread Chao Yu

On 2024/4/12 2:37, Daeho Jeong wrote:

From: Daeho Jeong 

Since the allocation happens in conventional LU for zoned storage, we
can allow direct io for that.

Signed-off-by: Daeho Jeong 


Reviewed-by: Chao Yu 

Thanks,




Re: [f2fs-dev] [PATCH 1/2] f2fs: prevent writing without fallocate() for pinned files

2024-04-13 Thread Chao Yu

On 2024/4/12 1:54, Daeho Jeong wrote:

From: Daeho Jeong 

When writing without fallocate(), we can't guarantee the blocks are allocated
in the conventional area for zoned storage. To make it consistent across
storage devices, we disallow it regardless of storage device types.

Signed-off-by: Daeho Jeong 


Reviewed-by: Chao Yu 

Thanks,




Re: [f2fs-dev] [PATCH v3] f2fs: zone: don't block IO if there is remained open zone

2024-04-12 Thread Chao Yu

On 2024/4/13 5:11, Jaegeuk Kim wrote:

On 04/07, Chao Yu wrote:

max open zone may be larger than log header number of f2fs, for
such case, it doesn't need to wait last IO in previous zone, let's
introduce available_open_zone semaphore, and reduce it once we
submit first write IO in a zone, and increase it after completion
of last IO in the zone.

Cc: Daeho Jeong 
Signed-off-by: Chao Yu 
---
v3:
- avoid race condition in between __submit_merged_bio()
and __allocate_new_segment().
  fs/f2fs/data.c| 105 ++
  fs/f2fs/f2fs.h|  34 ---
  fs/f2fs/iostat.c  |   7 
  fs/f2fs/iostat.h  |   2 +
  fs/f2fs/segment.c |  43 ---
  fs/f2fs/segment.h |  12 +-
  fs/f2fs/super.c   |   2 +
  7 files changed, 156 insertions(+), 49 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 0d88649c60a5..18a4ac0a06bc 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -373,11 +373,10 @@ static void f2fs_write_end_io(struct bio *bio)
  #ifdef CONFIG_BLK_DEV_ZONED
  static void f2fs_zone_write_end_io(struct bio *bio)
  {
-   struct f2fs_bio_info *io = (struct f2fs_bio_info *)bio->bi_private;
+   struct f2fs_sb_info *sbi = iostat_get_bio_private(bio);
  
-	bio->bi_private = io->bi_private;

-   complete(&io->zone_wait);
f2fs_write_end_io(bio);
+   up(&sbi->available_open_zones);
  }
  #endif
  
@@ -531,6 +530,24 @@ static void __submit_merged_bio(struct f2fs_bio_info *io)

if (!io->bio)
return;
  
+#ifdef CONFIG_BLK_DEV_ZONED

+   if (io->open_zone) {
+   /*
+* if there is no open zone, it will wait for last IO in
+* previous zone before submitting new IO.
+*/
+   down(&io->sbi->available_open_zones);
+   io->open_zone = false;
+   io->zone_openned = true;
+   }
+
+   if (io->close_zone) {
+   io->bio->bi_end_io = f2fs_zone_write_end_io;
+   io->zone_openned = false;
+   io->close_zone = false;
+   }
+#endif
+
if (is_read_io(fio->op)) {
trace_f2fs_prepare_read_bio(io->sbi->sb, fio->type, io->bio);
f2fs_submit_read_bio(io->sbi, io->bio, fio->type);
@@ -601,9 +618,9 @@ int f2fs_init_write_merge_io(struct f2fs_sb_info *sbi)
INIT_LIST_HEAD(&sbi->write_io[i][j].bio_list);
init_f2fs_rwsem(&sbi->write_io[i][j].bio_list_lock);
  #ifdef CONFIG_BLK_DEV_ZONED
-   init_completion(&sbi->write_io[i][j].zone_wait);
-   sbi->write_io[i][j].zone_pending_bio = NULL;
-   sbi->write_io[i][j].bi_private = NULL;
+   sbi->write_io[i][j].open_zone = false;
+   sbi->write_io[i][j].zone_openned = false;
+   sbi->write_io[i][j].close_zone = false;
  #endif
}
}
@@ -634,6 +651,31 @@ static void __f2fs_submit_merged_write(struct f2fs_sb_info *sbi,
f2fs_up_write(&io->io_rwsem);
  }
  
+void f2fs_blkzoned_submit_merged_write(struct f2fs_sb_info *sbi, int type)

+{
+#ifdef CONFIG_BLK_DEV_ZONED
+   struct f2fs_bio_info *io;
+
+   if (!f2fs_sb_has_blkzoned(sbi))
+   return;
+
+   io = sbi->write_io[PAGE_TYPE(type)] + type_to_temp(type);
+
+   f2fs_down_write(&io->io_rwsem);
+   if (io->zone_openned) {
+   if (io->bio) {
+   io->close_zone = true;
+   __submit_merged_bio(io);
+   } else if (io->zone_openned) {
+   up(&sbi->available_open_zones);
+   io->zone_openned = false;
+   }
+   }
+   f2fs_up_write(&io->io_rwsem);
+#endif
+
+}
+
  static void __submit_merged_write_cond(struct f2fs_sb_info *sbi,
struct inode *inode, struct page *page,
nid_t ino, enum page_type type, bool force)
@@ -918,22 +960,16 @@ int f2fs_merge_page_bio(struct f2fs_io_info *fio)
  }
  
  #ifdef CONFIG_BLK_DEV_ZONED

-static bool is_end_zone_blkaddr(struct f2fs_sb_info *sbi, block_t blkaddr)
+static bool is_blkaddr_zone_boundary(struct f2fs_sb_info *sbi,
+   block_t blkaddr, bool start)
  {
-   int devi = 0;
+   if (!f2fs_blkaddr_in_seqzone(sbi, blkaddr))
+   return false;
+
+   if (start)
+   return (blkaddr % sbi->blocks_per_blkz) == 0;
+   return (blkaddr % sbi->blocks_per_blkz == sbi->blocks_per_blkz - 1);
  
-	if (f2fs_is_multi_device(sbi)) {

-   devi = f2fs_target_device_index(sbi, blkaddr);
-   if (blkaddr < FDEV(devi).start_blk ||
-   blkaddr > FDEV(devi).end_blk) {
-   f2fs_err(sbi, "Invalid block %x", bl

Re: [f2fs-dev] [PATCH] f2fs: Fix incorrect return value

2024-04-11 Thread Chao Yu

On 2024/4/9 14:47, wangjianjian (C) via Linux-f2fs-devel wrote:

On 2024/4/7 14:23, Chao Yu wrote:

On 2024/4/4 21:47, Wang Jianjian wrote:

dquot_mark_dquot_dirty returns old dirty state not the error code.


I think it's fine to just pass return value of dquot_mark_dquot_dirty()
to caller, because caller can distinguish status from return value:
1) < 0, there is an error, 2) >= 0, there is no error, previously it is
dirty if it is 1.
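
For illustration only, the convention described above can be sketched in plain userspace C. The names `mark_dirty_status` and `classify_mark_dirty` are hypothetical and not part of the kernel quota API; the only thing taken from this thread is the rule itself (negative means error, 0 or 1 means success, and 1 means the dquot was already dirty).

```c
#include <stdbool.h>

/* Hypothetical type for illustration; not part of the kernel quota API. */
struct mark_dirty_status {
	bool failed;     /* ret < 0: an error occurred */
	bool was_dirty;  /* ret == 1: the dquot was already dirty */
	int err;         /* negative error code, or 0 on success */
};

/* Interpret a return value under the convention described above. */
static struct mark_dirty_status classify_mark_dirty(int ret)
{
	struct mark_dirty_status st = { false, false, 0 };

	if (ret < 0) {
		st.failed = true;
		st.err = ret;
	} else {
		st.was_dirty = (ret == 1);
	}
	return st;
}
```

A caller that only cares about failure would test `failed` and ignore `was_dirty`.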

mark_all_dquot_dirty() checks whether the return value is 0 when saving the
error code, so returning the old dirty state may cause confusion.


I didn't get your point...

No caller of mark_all_dquot_dirty() cares about its return value, so,
I think there is no practical problem now.


By the way, I am fine with not changing it.



Thanks,



Signed-off-by: Wang Jianjian 
---
  fs/f2fs/super.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index a6867f26f141..af07027475d9 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -3063,13 +3063,13 @@ static int f2fs_dquot_mark_dquot_dirty(struct dquot *dquot)
  {
  struct super_block *sb = dquot->dq_sb;
  struct f2fs_sb_info *sbi = F2FS_SB(sb);
-    int ret = dquot_mark_dquot_dirty(dquot);
+    dquot_mark_dquot_dirty(dquot);
  /* if we are using journalled quota */
  if (is_journalled_quota(sbi))
  set_sbi_flag(sbi, SBI_QUOTA_NEED_FLUSH);
-    return ret;
+    return 0;
  }
  static int f2fs_dquot_commit_info(struct super_block *sb, int type)



___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel





Re: [f2fs-dev] [PATCH 3/3] f2fs: fix false alarm on invalid block address

2024-04-11 Thread Chao Yu

On 2024/4/10 4:34, Jaegeuk Kim wrote:

f2fs_ra_meta_pages can try to read ahead on invalid block address which is
not the corruption case.


In which case will we read ahead invalid meta pages? Recovery w/ META_POR?

Thanks,



Fixes: 31f85ccc84b8 ("f2fs: unify the error handling of f2fs_is_valid_blkaddr")
Signed-off-by: Jaegeuk Kim 
---
  fs/f2fs/checkpoint.c | 9 +
  1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index eac698b8dd38..b01320502624 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -179,22 +179,22 @@ static bool __f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi,
break;
case META_SIT:
if (unlikely(blkaddr >= SIT_BLK_CNT(sbi)))
-   goto err;
+   goto check_only;
break;
case META_SSA:
if (unlikely(blkaddr >= MAIN_BLKADDR(sbi) ||
blkaddr < SM_I(sbi)->ssa_blkaddr))
-   goto err;
+   goto check_only;
break;
case META_CP:
if (unlikely(blkaddr >= SIT_I(sbi)->sit_base_addr ||
blkaddr < __start_cp_addr(sbi)))
-   goto err;
+   goto check_only;
break;
case META_POR:
if (unlikely(blkaddr >= MAX_BLKADDR(sbi) ||
blkaddr < MAIN_BLKADDR(sbi)))
-   goto err;
+   goto check_only;
break;
case DATA_GENERIC:
case DATA_GENERIC_ENHANCE:
@@ -228,6 +228,7 @@ static bool __f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi,
return true;
  err:
f2fs_handle_error(sbi, ERROR_INVALID_BLKADDR);
+check_only:
return false;
  }
  





Re: [f2fs-dev] [PATCH 2/3] f2fs: clear writeback when compression failed

2024-04-11 Thread Chao Yu

On 2024/4/10 4:34, Jaegeuk Kim wrote:

Let's stop issuing compressed writes and clear their writeback flags.

Signed-off-by: Jaegeuk Kim 
---
  fs/f2fs/compress.c | 33 +++--
  1 file changed, 31 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index d67c471ab5df..3a8ecc6aee84 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -1031,6 +1031,25 @@ static void set_cluster_writeback(struct compress_ctx *cc)
}
  }
  
+static void cancel_cluster_writeback(struct compress_ctx *cc, int submitted)

+{
+   int i;
+
+   for (i = 0; i < cc->cluster_size; i++) {
+   if (!cc->rpages[i])
+   continue;
+   if (i < submitted) {
+   if (i)
+   f2fs_wait_on_page_writeback(cc->rpages[i],
+   DATA, true, true);
+   inode_inc_dirty_pages(cc->inode);
+   lock_page(cc->rpages[i]);
+   }
+   clear_page_private_gcing(cc->rpages[i]);
+   end_page_writeback(cc->rpages[i]);
+   }
+}
+
  static void set_cluster_dirty(struct compress_ctx *cc)
  {
int i;
@@ -1232,7 +1251,6 @@ static int f2fs_write_compressed_pages(struct compress_ctx *cc,
.page = NULL,
.encrypted_page = NULL,
.compressed_page = NULL,
-   .submitted = 0,
.io_type = io_type,
.io_wbc = wbc,
.encrypted = fscrypt_inode_uses_fs_layer_crypto(cc->inode) ?
@@ -1358,7 +1376,15 @@ static int f2fs_write_compressed_pages(struct compress_ctx *cc,
fio.compressed_page = cc->cpages[i - 1];
  
  		cc->cpages[i - 1] = NULL;

+   fio.submitted = 0;
f2fs_outplace_write_data(&dn, &fio);
+   if (unlikely(!fio.submitted)) {
+   cancel_cluster_writeback(cc, i);
+
+   /* To call fscrypt_finalize_bounce_page */
+   i = cc->valid_nr_cpages;


*submitted = 0; ?

Thanks,


+   goto out_destroy_crypt;
+   }
(*submitted)++;
  unlock_continue:
inode_dec_dirty_pages(cc->inode);
@@ -1392,8 +1418,11 @@ static int f2fs_write_compressed_pages(struct compress_ctx *cc,
  out_destroy_crypt:
page_array_free(cc->inode, cic->rpages, cc->cluster_size);
  
-	for (--i; i >= 0; i--)

+   for (--i; i >= 0; i--) {
+   if (!cc->cpages[i])
+   continue;
fscrypt_finalize_bounce_page(&cc->cpages[i]);
+   }
  out_put_cic:
kmem_cache_free(cic_entry_slab, cic);
  out_put_dnode:





Re: [f2fs-dev] [PATCH 1/3] f2fs: use folio_test_writeback

2024-04-11 Thread Chao Yu

On 2024/4/10 4:34, Jaegeuk Kim wrote:

Let's convert PageWriteback to folio_test_writeback.

Signed-off-by: Jaegeuk Kim 


Reviewed-by: Chao Yu 

Thanks,




[f2fs-dev] [PATCH v4] f2fs: zone: don't block IO if there is remained open zone

2024-04-11 Thread Chao Yu
The maximum number of open zones may be larger than the number of f2fs
log headers. In that case there is no need to wait for the last IO in
the previous zone, so let's introduce the available_open_zones semaphore:
decrease it once we submit the first write IO in a zone, and increase it
after completion of the last IO in the zone.
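
The accounting described above can be sketched as a single-threaded userspace model. This is only an illustration under stated assumptions: `zone_budget`, `log_model`, `log_open_zone` and `log_close_zone` are hypothetical names, a plain counter stands in for the kernel semaphore, and where the kernel code would sleep in down() the model simply returns false.

```c
#include <stdbool.h>

/* Hypothetical model; the patch itself uses a kernel semaphore. */
struct zone_budget {
	unsigned int available_open_zones; /* free open-zone slots */
};

struct log_model {
	bool zone_opened; /* this log currently holds one slot */
};

static void zone_budget_init(struct zone_budget *zb, unsigned int max_open_zones)
{
	zb->available_open_zones = max_open_zones;
}

/*
 * First write IO into a new zone: take a slot, like down().
 * Returns false in the case where the kernel code would block.
 */
static bool log_open_zone(struct zone_budget *zb, struct log_model *log)
{
	if (log->zone_opened)
		return true;
	if (zb->available_open_zones == 0)
		return false;
	zb->available_open_zones--;
	log->zone_opened = true;
	return true;
}

/* Last IO in the zone completed: give the slot back, like up(). */
static void log_close_zone(struct zone_budget *zb, struct log_model *log)
{
	if (!log->zone_opened)
		return;
	log->zone_opened = false;
	zb->available_open_zones++;
}
```

With a budget of two slots, a third log cannot open a zone until one of the first two closes its zone, which is the only case where waiting is still needed.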

Cc: Daeho Jeong 
Signed-off-by: Chao Yu 
---
v4:
- avoid unneeded condition in f2fs_blkzoned_submit_merged_write().
 fs/f2fs/data.c| 105 ++
 fs/f2fs/f2fs.h|  34 ---
 fs/f2fs/iostat.c  |   7 
 fs/f2fs/iostat.h  |   2 +
 fs/f2fs/segment.c |  43 ---
 fs/f2fs/segment.h |  12 +-
 fs/f2fs/super.c   |   2 +
 7 files changed, 156 insertions(+), 49 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 60056b9a51be..71472ab6b7e7 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -373,11 +373,10 @@ static void f2fs_write_end_io(struct bio *bio)
 #ifdef CONFIG_BLK_DEV_ZONED
 static void f2fs_zone_write_end_io(struct bio *bio)
 {
-   struct f2fs_bio_info *io = (struct f2fs_bio_info *)bio->bi_private;
+   struct f2fs_sb_info *sbi = iostat_get_bio_private(bio);
 
-   bio->bi_private = io->bi_private;
-   complete(&io->zone_wait);
f2fs_write_end_io(bio);
+   up(&sbi->available_open_zones);
 }
 #endif
 
@@ -531,6 +530,24 @@ static void __submit_merged_bio(struct f2fs_bio_info *io)
if (!io->bio)
return;
 
+#ifdef CONFIG_BLK_DEV_ZONED
+   if (io->open_zone) {
+   /*
+* if there is no open zone, it will wait for last IO in
+* previous zone before submitting new IO.
+*/
+   down(&io->sbi->available_open_zones);
+   io->open_zone = false;
+   io->zone_openned = true;
+   }
+
+   if (io->close_zone) {
+   io->bio->bi_end_io = f2fs_zone_write_end_io;
+   io->zone_openned = false;
+   io->close_zone = false;
+   }
+#endif
+
if (is_read_io(fio->op)) {
trace_f2fs_prepare_read_bio(io->sbi->sb, fio->type, io->bio);
f2fs_submit_read_bio(io->sbi, io->bio, fio->type);
@@ -601,9 +618,9 @@ int f2fs_init_write_merge_io(struct f2fs_sb_info *sbi)
INIT_LIST_HEAD(&sbi->write_io[i][j].bio_list);
init_f2fs_rwsem(&sbi->write_io[i][j].bio_list_lock);
 #ifdef CONFIG_BLK_DEV_ZONED
-   init_completion(&sbi->write_io[i][j].zone_wait);
-   sbi->write_io[i][j].zone_pending_bio = NULL;
-   sbi->write_io[i][j].bi_private = NULL;
+   sbi->write_io[i][j].open_zone = false;
+   sbi->write_io[i][j].zone_openned = false;
+   sbi->write_io[i][j].close_zone = false;
 #endif
}
}
@@ -634,6 +651,31 @@ static void __f2fs_submit_merged_write(struct f2fs_sb_info *sbi,
f2fs_up_write(&io->io_rwsem);
 }
 
+void f2fs_blkzoned_submit_merged_write(struct f2fs_sb_info *sbi, int type)
+{
+#ifdef CONFIG_BLK_DEV_ZONED
+   struct f2fs_bio_info *io;
+
+   if (!f2fs_sb_has_blkzoned(sbi))
+   return;
+
+   io = sbi->write_io[PAGE_TYPE(type)] + type_to_temp(type);
+
+   f2fs_down_write(&io->io_rwsem);
+   if (io->zone_openned) {
+   if (io->bio) {
+   io->close_zone = true;
+   __submit_merged_bio(io);
+   } else {
+   up(&sbi->available_open_zones);
+   io->zone_openned = false;
+   }
+   }
+   f2fs_up_write(&io->io_rwsem);
+#endif
+
+}
+
 static void __submit_merged_write_cond(struct f2fs_sb_info *sbi,
struct inode *inode, struct page *page,
nid_t ino, enum page_type type, bool force)
@@ -918,22 +960,16 @@ int f2fs_merge_page_bio(struct f2fs_io_info *fio)
 }
 
 #ifdef CONFIG_BLK_DEV_ZONED
-static bool is_end_zone_blkaddr(struct f2fs_sb_info *sbi, block_t blkaddr)
+static bool is_blkaddr_zone_boundary(struct f2fs_sb_info *sbi,
+   block_t blkaddr, bool start)
 {
-   int devi = 0;
+   if (!f2fs_blkaddr_in_seqzone(sbi, blkaddr))
+   return false;
+
+   if (start)
+   return (blkaddr % sbi->blocks_per_blkz) == 0;
+   return (blkaddr % sbi->blocks_per_blkz == sbi->blocks_per_blkz - 1);
 
-   if (f2fs_is_multi_device(sbi)) {
-   devi = f2fs_target_device_index(sbi, blkaddr);
-   if (blkaddr < FDEV(devi).start_blk ||
-   blkaddr > FDEV(devi).end_blk) {
-   f2fs_err(sbi, "Invalid block %x", blkaddr);
-   return false;
-   }
-   blkaddr -= FDEV(devi).start_blk;
-   }
-   ret

[f2fs-dev] [PATCH v2 2/2] f2fs: introduce written_map to indicate written datas

2024-04-11 Thread Chao Yu
Currently, __exchange_data_block() checks the checkpointed state of the
data; if the data is not checkpointed, it tries to exchange blkaddrs
directly in the dnode.

However, after commit 899fee36fac0 ("f2fs: fix to avoid data corruption
by forbidding SSR overwrite"), in order to keep the SSR allocator from
reusing any written data/node type block, all written blocks are marked
as checkpointed.

In order to re-enable the metadata exchange functionality, let's introduce
written_map to indicate all written blocks, including checkpointed ones
and newly written and then invalidated ones, and use it for SSR allocation.
Then ckpt_valid_bitmap indicates the real checkpointed status, and
we can use it correctly in __exchange_data_block().
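
As a rough userspace illustration of the split described above (not the patch's code): the sketch assumes a hypothetical 64-block segment held in two `uint64_t` bitmaps, and `seg_model`, `ssr_may_reuse` and `can_exchange_in_dnode` are made-up names. The point is only that SSR consults the written bitmap while the exchange path consults the checkpointed bitmap.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 64-block segment; the real bitmaps are byte arrays. */
struct seg_model {
	uint64_t ckpt_valid_map; /* blocks covered by a checkpoint */
	uint64_t written_map;    /* every written block, checkpointed or not */
};

static uint64_t blk_bit(unsigned int off)
{
	return UINT64_C(1) << (off & 63);
}

/* A newly written block is tracked only in written_map. */
static void seg_mark_written(struct seg_model *se, unsigned int off)
{
	se->written_map |= blk_bit(off);
}

/* A checkpointed block counts as written as well. */
static void seg_mark_checkpointed(struct seg_model *se, unsigned int off)
{
	se->ckpt_valid_map |= blk_bit(off);
	se->written_map |= blk_bit(off);
}

/* SSR must not reuse any block that has been written. */
static bool ssr_may_reuse(const struct seg_model *se, unsigned int off)
{
	return !(se->written_map & blk_bit(off));
}

/* Direct blkaddr exchange is only tried for non-checkpointed data. */
static bool can_exchange_in_dnode(const struct seg_model *se, unsigned int off)
{
	return !(se->ckpt_valid_map & blk_bit(off));
}
```

A block that was written but not yet checkpointed is therefore protected from SSR reuse and still eligible for the direct exchange.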

[testcase]
xfs_io -f /mnt/f2fs/src -c "pwrite 0 2m"
xfs_io -f /mnt/f2fs/dst -c "pwrite 0 2m"
xfs_io /mnt/f2fs/src -c "fiemap -v"
xfs_io /mnt/f2fs/dst -c "fiemap -v"
f2fs_io move_range /mnt/f2fs/src /mnt/f2fs/dst 0 0 2097152
xfs_io /mnt/f2fs/src -c "fiemap -v"
xfs_io /mnt/f2fs/dst -c "fiemap -v"

[before]
/mnt/f2fs/src:
 EXT: FILE-OFFSET      BLOCK-RANGE          TOTAL FLAGS
   0: [0..4095]:       8445952..8450047      4096 0x1001
/mnt/f2fs/dst:
 EXT: FILE-OFFSET      BLOCK-RANGE          TOTAL FLAGS
   0: [0..4095]:       143360..147455        4096 0x1001

/mnt/f2fs/src:
/mnt/f2fs/dst:
 EXT: FILE-OFFSET      BLOCK-RANGE          TOTAL FLAGS
   0: [0..4095]:       4284416..4288511      4096 0x1001

[after]
/mnt/f2fs/src:
 EXT: FILE-OFFSET      BLOCK-RANGE          TOTAL FLAGS
   0: [0..4095]:       147456..151551        4096 0x1001
/mnt/f2fs/dst:
 EXT: FILE-OFFSET      BLOCK-RANGE          TOTAL FLAGS
   0: [0..4095]:       151552..155647        4096 0x1001

/mnt/f2fs/src:
/mnt/f2fs/dst:
 EXT: FILE-OFFSET      BLOCK-RANGE          TOTAL FLAGS
   0: [0..4095]:       147456..151551        4096 0x1001

Signed-off-by: Chao Yu 
---
v2:
- introduce written_blocks in struct seg_entry for
ssr allocator.
 fs/f2fs/gc.c  |  2 +-
 fs/f2fs/segment.c | 22 --
 fs/f2fs/segment.h | 19 ++-
 3 files changed, 27 insertions(+), 16 deletions(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 8852814dab7f..ea7b5ca6f09b 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -366,7 +366,7 @@ static inline unsigned int get_gc_cost(struct f2fs_sb_info *sbi,
unsigned int segno, struct victim_sel_policy *p)
 {
if (p->alloc_mode == SSR)
-   return get_seg_entry(sbi, segno)->ckpt_valid_blocks;
+   return get_seg_entry(sbi, segno)->written_blocks;
 
/* alloc_mode == LFS */
if (p->gc_mode == GC_GREEDY)
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index af716925db19..0d110908e383 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -2456,12 +2456,13 @@ static void update_sit_entry(struct f2fs_sb_info *sbi, block_t blkaddr, int del)
sbi->discard_blks--;
 
/*
-* SSR should never reuse block which is checkpointed
-* or newly invalidated.
+* if checkpoint disabling is enabled, SSR is allowed to reuse
+* newly invalidated blocks; otherwise forbid it to protect
+* fsynced data.
 */
if (!is_sbi_flag_set(sbi, SBI_CP_DISABLED)) {
-   if (!f2fs_test_and_set_bit(offset, se->ckpt_valid_map))
-   se->ckpt_valid_blocks++;
+   if (!f2fs_test_and_set_bit(offset, se->written_map))
+   se->written_blocks++;
}
} else {
exist = f2fs_test_and_clear_bit(offset, se->cur_valid_map);
@@ -2498,8 +2499,6 @@ static void update_sit_entry(struct f2fs_sb_info *sbi, block_t blkaddr, int del)
f2fs_test_and_clear_bit(offset, se->discard_map))
sbi->discard_blks++;
}
-   if (!f2fs_test_bit(offset, se->ckpt_valid_map))
-   se->ckpt_valid_blocks += del;
 
__mark_sit_entry_dirty(sbi, segno);
 
@@ -2847,11 +2846,11 @@ static void __get_segment_bitmap(struct f2fs_sb_info *sbi,
struct seg_entry *se = get_seg_entry(sbi, segno);
int entries = SIT_VBLOCK_MAP_SIZE / sizeof(unsigned long);
unsigned long *ckpt_map = (unsigned long *)se->ckpt_valid_map;
-   unsigned long *cur_map = (unsigned long *)se->cur_valid_map;
+   unsigned long *written_map = (unsigned long *)se->written_map;
int i;
 
for (i = 0; i < entries; i++)
-   target_map[i] = ckpt_map[i] | cur_map[i];
+   target_map[i] = ckpt_map[i] | written_map[i];
 }
 
 static int __next_free_blkoff(struct f2fs_sb_info *sbi, unsigned long *bitmap,
@@ -4512,9 +4511,9 @@ static int build_sit_info(struct f2fs_sb_info *sbi)
return -ENOMEM;
 
 #ifdef CONFIG_F2FS_CHECK_FS
-   bitmap_size = MAIN_SEGS(s

[f2fs-dev] [PATCH v2 1/2] f2fs: use per-log target_bitmap to improve lookup performace of ssr allocation

2024-04-11 Thread Chao Yu
After commit 899fee36fac0 ("f2fs: fix to avoid data corruption by
forbidding SSR overwrite"), the valid block bitmap of the currently opened
segment is fixed, so let's introduce a per-log bitmap instead of the temp
bitmap to avoid unnecessary calculation overhead whenever allocating a
free slot w/ the SSR allocator.
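
The idea can be sketched in userspace C as follows. This is a minimal sketch, not the kernel code: `build_target_map` and `find_next_zero` are hypothetical helpers, and the bit order is plain LSB-first rather than the reversed order used by __find_rev_next_zero_bit(). The merged bitmap is built once when a log switches to a segment, and each later allocation only scans the cached result.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* OR the two source bitmaps into target; done once per segment switch. */
static void build_target_map(unsigned long *target,
			     const unsigned long *ckpt_map,
			     const unsigned long *cur_map, size_t words)
{
	size_t i;

	for (i = 0; i < words; i++)
		target[i] = ckpt_map[i] | cur_map[i];
}

/*
 * Return the first clear bit at or after start, or nbits if there is
 * none. Bits are numbered LSB-first within each word.
 */
static size_t find_next_zero(const unsigned long *map, size_t nbits,
			     size_t start)
{
	size_t i;

	for (i = start; i < nbits; i++)
		if (!(map[i / BITS_PER_WORD] & (1UL << (i % BITS_PER_WORD))))
			return i;
	return nbits;
}
```

Without the cached map, every call to find the next free slot would repeat the OR loop over the whole bitmap before scanning it.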

Signed-off-by: Chao Yu 
---
v2:
- rebase to last dev-test branch.
 fs/f2fs/segment.c | 30 ++
 fs/f2fs/segment.h |  1 +
 2 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 6474b7338e81..af716925db19 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -2840,31 +2840,39 @@ static int new_curseg(struct f2fs_sb_info *sbi, int type, bool new_sec)
return 0;
 }
 
-static int __next_free_blkoff(struct f2fs_sb_info *sbi,
-   int segno, block_t start)
+static void __get_segment_bitmap(struct f2fs_sb_info *sbi,
+   unsigned long *target_map,
+   int segno)
 {
struct seg_entry *se = get_seg_entry(sbi, segno);
int entries = SIT_VBLOCK_MAP_SIZE / sizeof(unsigned long);
-   unsigned long *target_map = SIT_I(sbi)->tmp_map;
unsigned long *ckpt_map = (unsigned long *)se->ckpt_valid_map;
unsigned long *cur_map = (unsigned long *)se->cur_valid_map;
int i;
 
for (i = 0; i < entries; i++)
target_map[i] = ckpt_map[i] | cur_map[i];
+}
+
+static int __next_free_blkoff(struct f2fs_sb_info *sbi, unsigned long *bitmap,
+   int segno, block_t start)
+{
+   __get_segment_bitmap(sbi, bitmap, segno);
 
-   return __find_rev_next_zero_bit(target_map, BLKS_PER_SEG(sbi), start);
+   return __find_rev_next_zero_bit(bitmap, BLKS_PER_SEG(sbi), start);
 }
 
 static int f2fs_find_next_ssr_block(struct f2fs_sb_info *sbi,
-   struct curseg_info *seg)
+   struct curseg_info *seg)
 {
-   return __next_free_blkoff(sbi, seg->segno, seg->next_blkoff + 1);
+   return __find_rev_next_zero_bit(seg->target_map,
+   BLKS_PER_SEG(sbi), seg->next_blkoff + 1);
 }
 
 bool f2fs_segment_has_free_slot(struct f2fs_sb_info *sbi, int segno)
 {
-   return __next_free_blkoff(sbi, segno, 0) < BLKS_PER_SEG(sbi);
+   return __next_free_blkoff(sbi, SIT_I(sbi)->tmp_map, segno, 0) <
+   BLKS_PER_SEG(sbi);
 }
 
 /*
@@ -2890,7 +2898,8 @@ static int change_curseg(struct f2fs_sb_info *sbi, int type)
 
reset_curseg(sbi, type, 1);
curseg->alloc_type = SSR;
-   curseg->next_blkoff = __next_free_blkoff(sbi, curseg->segno, 0);
+   curseg->next_blkoff = __next_free_blkoff(sbi, curseg->target_map,
+   curseg->segno, 0);
 
sum_page = f2fs_get_sum_page(sbi, new_segno);
if (IS_ERR(sum_page)) {
@@ -4635,6 +4644,10 @@ static int build_curseg(struct f2fs_sb_info *sbi)
sizeof(struct f2fs_journal), GFP_KERNEL);
if (!array[i].journal)
return -ENOMEM;
+   array[i].target_map = f2fs_kzalloc(sbi, SIT_VBLOCK_MAP_SIZE,
+   GFP_KERNEL);
+   if (!array[i].target_map)
+   return -ENOMEM;
if (i < NR_PERSISTENT_LOG)
array[i].seg_type = CURSEG_HOT_DATA + i;
else if (i == CURSEG_COLD_DATA_PINNED)
@@ -5453,6 +5466,7 @@ static void destroy_curseg(struct f2fs_sb_info *sbi)
for (i = 0; i < NR_CURSEG_TYPE; i++) {
kfree(array[i].sum_blk);
kfree(array[i].journal);
+   kfree(array[i].target_map);
}
kfree(array);
 }
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index e1c0f418aa11..10f3e44f036f 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -292,6 +292,7 @@ struct curseg_info {
struct f2fs_summary_block *sum_blk; /* cached summary block */
struct rw_semaphore journal_rwsem;  /* protect journal area */
struct f2fs_journal *journal;   /* cached journal info */
+   unsigned long *target_map;  /* bitmap for SSR allocator */
unsigned char alloc_type;   /* current allocation type */
unsigned short seg_type;    /* segment type like CURSEG_XXX_TYPE */
unsigned int segno; /* current segment number */
-- 
2.40.1





Re: [f2fs-dev] [PATCH] f2fs_io: support unset subcommand for pinfile

2024-04-09 Thread Chao Yu

Ping,

Did this patch get missed?

On 2024/3/29 18:25, Chao Yu wrote:

This patch adds unset subcommand for pinfile command.

Usage: f2fs_io pinfile unset [target_file]

Signed-off-by: Chao Yu 
---
  man/f2fs_io.8   |  2 +-
  tools/f2fs_io/f2fs_io.c | 11 +--
  2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/man/f2fs_io.8 b/man/f2fs_io.8
index f097bde..b9c9dc8 100644
--- a/man/f2fs_io.8
+++ b/man/f2fs_io.8
@@ -44,7 +44,7 @@ going down with metadata flush
  going down with fsck mark
  .RE
  .TP
-\fBpinfile\fR \fI[get|set] [file]\fR
+\fBpinfile\fR \fI[get|set|unset] [file]\fR
  Get or set the pinning status on a file.
  .TP
  \fBfadvise\fR \fI[advice] [offset] [length] [file]\fR
diff --git a/tools/f2fs_io/f2fs_io.c b/tools/f2fs_io/f2fs_io.c
index b8e4f02..a7b593a 100644
--- a/tools/f2fs_io/f2fs_io.c
+++ b/tools/f2fs_io/f2fs_io.c
@@ -442,7 +442,7 @@ static void do_fadvise(int argc, char **argv, const struct cmd_desc *cmd)
  
  #define pinfile_desc "pin file control"

  #define pinfile_help  \
-"f2fs_io pinfile [get|set] [file]\n\n"   \
+"f2fs_io pinfile [get|set|unset] [file]\n\n" \
  "get/set pinning given the file\n"  \
  
  static void do_pinfile(int argc, char **argv, const struct cmd_desc *cmd)

@@ -464,7 +464,14 @@ static void do_pinfile(int argc, char **argv, const struct cmd_desc *cmd)
ret = ioctl(fd, F2FS_IOC_SET_PIN_FILE, &pin);
if (ret != 0)
die_errno("F2FS_IOC_SET_PIN_FILE failed");
-   printf("set_pin_file: %u blocks moved in %s\n", ret, argv[2]);
+   printf("%s pinfile: %u blocks moved in %s\n",
+   argv[1], ret, argv[2]);
+   } else if (!strcmp(argv[1], "unset")) {
+   pin = 0;
+   ret = ioctl(fd, F2FS_IOC_SET_PIN_FILE, &pin);
+   if (ret != 0)
+   die_errno("F2FS_IOC_SET_PIN_FILE failed");
+   printf("%s pinfile in %s\n", argv[1], argv[2]);
} else if (!strcmp(argv[1], "get")) {
unsigned int flags;
  





Re: [f2fs-dev] [PATCH] f2fs: write missing last sum blk of file pinning section

2024-04-09 Thread Chao Yu

On 2024/4/10 7:34, Daeho Jeong wrote:

From: Daeho Jeong 

While changing the code to stop allocating a new section in advance for the
file pinning area, I missed that we should write the sum block for the last
segment of a file pinning section.

Fixes: 9703d69d9d15 ("f2fs: support file pinning for zoned devices")
Signed-off-by: Daeho Jeong 


Reviewed-by: Chao Yu 

Thanks,




Re: [f2fs-dev] [PATCH] f2fs: don't set RO when shutting down f2fs

2024-04-09 Thread Chao Yu

On 2024/4/10 0:21, Jaegeuk Kim wrote:

On 04/09, Chao Yu wrote:

On 2024/4/5 3:52, Jaegeuk Kim wrote:

Shutdown does not check the error of thaw_super due to readonly, which
causes a deadlock like below.

f2fs_ioc_shutdown(F2FS_GOING_DOWN_FULLSYNC)        issue_discard_thread
 - bdev_freeze
  - freeze_super
 - f2fs_stop_checkpoint()
  - f2fs_handle_critical_error                     - sb_start_write
    - set RO                                         - waiting
 - bdev_thaw
  - thaw_super_locked
    - return -EINVAL, if sb_rdonly()
 - f2fs_stop_discard_thread
  -> wait for kthread_stop(discard_thread);

Reported-by: "Light Hsieh (謝明燈)" 
Signed-off-by: Jaegeuk Kim 
---
   fs/f2fs/super.c | 11 +--
   1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index df9765b41dac..ba6288e870c5 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -4135,9 +4135,16 @@ void f2fs_handle_critical_error(struct f2fs_sb_info *sbi, unsigned char reason,
if (shutdown)
set_sbi_flag(sbi, SBI_IS_SHUTDOWN);
-   /* continue filesystem operators if errors=continue */
-   if (continue_fs || f2fs_readonly(sb))
+   /*
+* Continue filesystem operators if errors=continue. Should not set
+* RO by shutdown, since RO bypasses thaw_super which can hang the
+* system.
+*/
+   if (continue_fs || f2fs_readonly(sb) ||
+   reason == STOP_CP_REASON_SHUTDOWN) {
+   f2fs_warn(sbi, "Stopped filesystem due to reason: %d", reason);
return;


Do we need to set RO after bdev_thaw() in f2fs_do_shutdown()?


IIRC, shutdown doesn't need to set RO as we stopped the checkpoint.
I'm more concerned on any side effect caused by this RO change.


Okay, I was just wondering whether we need to follow the errors=remount-ro
semantics, but it looks fine, since a shutdown operation simulated by ioctl
should not be treated as an internal critical error:

errors=%s    Specify f2fs behavior on critical errors. This supports modes:
             "panic", "continue" and "remount-ro", respectively, trigger
             panic immediately, continue without doing anything, and remount
             the partition in read-only mode. By default it uses "continue"
             mode.

Also, it keeps the behavior consistent w/ what we do for errors=panic case.

if (F2FS_OPTION(sbi).errors == MOUNT_ERRORS_PANIC &&
!shutdown && !system_going_down() &&
^
!is_sbi_flag_set(sbi, SBI_IS_SHUTDOWN))
panic("F2FS-fs (device %s): panic forced after error\n",
sb->s_id);
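
To make the resulting policy easier to follow, here is a simplified userspace model of the decision discussed in this thread. It is a sketch under assumptions, not the kernel function: the enum and function names are hypothetical, and the `shutdown` argument stands in for both the shutdown flag and the shutdown stop reason.

```c
#include <stdbool.h>

/* Hypothetical names modelling the errors= mount option. */
enum errors_mode { ERRORS_PANIC, ERRORS_CONTINUE, ERRORS_REMOUNT_RO };
enum error_action { ACTION_NONE, ACTION_PANIC, ACTION_REMOUNT_RO };

/*
 * Decide what to do on a critical error. A shutdown never panics and,
 * with the change discussed here, never sets the filesystem read-only.
 */
static enum error_action critical_error_action(enum errors_mode mode,
					       bool shutdown,
					       bool system_going_down,
					       bool already_readonly)
{
	if (mode == ERRORS_PANIC && !shutdown && !system_going_down)
		return ACTION_PANIC;
	if (mode == ERRORS_CONTINUE || already_readonly || shutdown)
		return ACTION_NONE;
	return ACTION_REMOUNT_RO;
}
```

In this model a shutdown behaves the same under every errors= mode, which matches the consistency argument made above.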

Thanks,





Thanks,


+   }
f2fs_warn(sbi, "Remounting filesystem read-only");
/*





Re: [f2fs-dev] [PATCH v2] f2fs: don't set RO when shutting down f2fs

2024-04-09 Thread Chao Yu

On 2024/4/10 0:20, Jaegeuk Kim wrote:

Shutdown does not check the error of thaw_super due to readonly, which
causes a deadlock like below.

f2fs_ioc_shutdown(F2FS_GOING_DOWN_FULLSYNC)        issue_discard_thread
 - bdev_freeze
  - freeze_super
 - f2fs_stop_checkpoint()
  - f2fs_handle_critical_error                     - sb_start_write
    - set RO                                         - waiting
 - bdev_thaw
  - thaw_super_locked
    - return -EINVAL, if sb_rdonly()
 - f2fs_stop_discard_thread
  -> wait for kthread_stop(discard_thread);

Reported-by: "Light Hsieh (謝明燈)" 
Signed-off-by: Jaegeuk Kim 


Reviewed-by: Chao Yu 

Thanks,




Re: [f2fs-dev] [PATCH] f2fs-tools: give 6 sections for overprovision buffer

2024-04-08 Thread Chao Yu

On 2024/4/3 7:54, Jaegeuk Kim wrote:

This addresses high GC cost at runtime.

Signed-off-by: Jaegeuk Kim 


Reviewed-by: Chao Yu 

Thanks,




Re: [f2fs-dev] [PATCH] f2fs-tools: print extension list properly

2024-04-08 Thread Chao Yu

On 2024/4/8 21:11, Sheng Yong wrote:

The "hot file extensions" list does not print properly.

**Before**

extension_count [0x  23 : 35]
cold file extentsions
 [mp   wm   og   jp   ]
 [avi  m4v  m4p  mkv  ]
 [mov  webm wav  m4a  ]
 [3gp  opus flac gif  ]
 [png  svg  webp jar  ]
 [deb  iso  gz   xz   ]
 [zst  pdf  pyc  ttc  ]
 [ttf  exe  apk  cnt  ]
 [exo  odex vdex]
hot_ext_count   [0x   1 : 1]
hot file extentsions
db   ]
cp_payload  [0x   0 : 0]

**After**

extension_count [0x  23 : 35]
cold file extentsions
 [mp   wm   og   jp   ]
 [avi  m4v  m4p  mkv  ]
 [mov  webm wav  m4a  ]
 [3gp  opus flac gif  ]
 [png  svg  webp jar  ]
 [deb  iso  gz   xz   ]
 [zst  pdf  pyc  ttc  ]
 [ttf  exe  apk  cnt  ]
 [exo  odex vdex]
hot_ext_count   [0x   1 : 1]
hot file extentsions
 [db   ]
cp_payload  [0x   0 : 0]

Signed-off-by: Sheng Yong 


Reviewed-by: Chao Yu 

Thanks,



