On 12/26/2025 9:22 PM, Yongpeng Yang wrote:
On 12/26/25 10:23, Chao Yu via Linux-f2fs-devel wrote:
On 12/24/2025 9:16 PM, Yongpeng Yang wrote:
From: Yongpeng Yang <[email protected]>

Nice catch!


During SPO tests, when mounting F2FS, an -EINVAL error was returned from
f2fs_recover_inode_page. The issue occurred under the following scenario

Thread A                                     Thread B
f2fs_ioc_commit_atomic_write
   - f2fs_do_sync_file // atomic = true
    - f2fs_fsync_node_pages
      : last_folio = inode folio
      : schedule before folio_lock(last_folio) f2fs_write_checkpoint
                                                - block_operations//
writeback last_folio
                                                - schedule before
f2fs_flush_nat_entries
      : set_fsync_mark(last_folio, 1)
      : set_dentry_mark(last_folio, 1)
      : folio_mark_dirty(last_folio)
      : __write_node_folio(last_folio)

What do you think of relocating set_fsync_mark & set_dentry_mark logic
into __write_node_folio(), so that we can cover them w/ existed .node_write
lock in __write_node_folio(), it can avoid checkpoint racing as well.

Yes, this makes more sense. I'll fix this in v2 patch.


                                                - f2fs_flush_nat_entries
                                                  : {struct nat_entry}-
flag |= BIT(IS_CHECKPOINTED)
                                               SPO

Thread A calls f2fs_need_dentry_mark(sbi, ino), and the last_folio has
already been written once. However, the {struct nat_entry}->flag did not
have the IS_CHECKPOINTED set, causing set_dentry_mark(last_folio, 1) and
write last_folio again.

After SPO and reboot, it was detected that {struct node_info}->blk_addr
was not NULL_ADDR because Thread B successfully write the checkpoint.

This issue only occurs in atomic write scenarios, and does not affect

If atomic is false, we will encounter such issue as well? or am I missing
something?

In the case of atomic is true, the folio must be non-dirty.
For regular file fsync operations, the folio must be dirty. If
block_operations->f2fs_sync_node_pages successfully submit the folio
write, this path will not be executed. Otherwise, the
f2fs_write_checkpoint will need to wait for the folio write submission
to complete, as sbi->nr_pages[F2FS_DIRTY_NODES] > 0. Therefore, the
situation where f2fs_need_dentry_mark checks that the {struct
nat_entry}->flag does not have the IS_CHECKPOINTED flag, but the folio
write has already been submitted, will not occur.

IIUC, you mean only atomic write will redirty last folio after 
block_operations(),
then, set_dentry_mark() can race w/ __flush_nat_entry_set()?

        if (atomic && !marked) {
                f2fs_debug(sbi, "Retry to write fsync mark: ino=%u, idx=%lx",
                           ino, last_folio->index);
                folio_lock(last_folio);
                f2fs_folio_wait_writeback(last_folio, NODE, true, true);
                folio_mark_dirty(last_folio);
                folio_unlock(last_folio);
                goto retry;
        }

If that's what you mean, please add these details into commit message.

Thanks,


Thanks
Yongpeng,


             if (!atomic || folio == last_folio) {
                 set_fsync_mark(folio, 1);
                 percpu_counter_inc(&sbi->rf_node_block_count);
                 if (IS_INODE(folio)) {
                     if (is_inode_flag_set(inode,
                                 FI_DIRTY_INODE))
                         f2fs_update_inode(inode, folio);
                     set_dentry_mark(folio,
                         f2fs_need_dentry_mark(sbi, ino));
                 }
                 /* may be written by other thread */
                 if (!folio_test_dirty(folio))
                     folio_mark_dirty(folio);
             }

Thanks,

regular file fsync operations. Therefore, for atomic file fsync,
sbi->cp_rwsem should be acquired to ensure that the IS_CHECKPOINTED flag
correctly indicates that the checkpoint write has been completed.

Fixes: 608514deba38 ("f2fs: set fsync mark only for the last dnode")
Signed-off-by: Sheng Yong <[email protected]>
Signed-off-by: Yongpeng Yang <[email protected]>
---
   fs/f2fs/node.c | 8 +++++++-
   1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 482a362f2625..e482a38444f1 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1854,7 +1854,7 @@ int f2fs_fsync_node_pages(struct f2fs_sb_info
*sbi, struct inode *inode,
       struct folio_batch fbatch;
       int ret = 0;
       struct folio *last_folio = NULL;
-    bool marked = false;
+    bool marked = false, locked = false;
       nid_t ino = inode->i_ino;
       int nr_folios;
       int nwritten = 0;
@@ -1889,6 +1889,10 @@ int f2fs_fsync_node_pages(struct f2fs_sb_info
*sbi, struct inode *inode,
               if (ino_of_node(folio) != ino)
                   continue;
   +            if (atomic && folio == last_folio) {
+                f2fs_lock_op(sbi);
+                locked = true;
+            }
               folio_lock(folio);
                 if (unlikely(!is_node_folio(folio))) {
@@ -1959,6 +1963,8 @@ int f2fs_fsync_node_pages(struct f2fs_sb_info
*sbi, struct inode *inode,
           goto retry;
       }
   out:
+    if (locked)
+        f2fs_unlock_op(sbi);
       if (nwritten)
           f2fs_submit_merged_write_cond(sbi, NULL, NULL, ino, NODE);
       return ret;



_______________________________________________
Linux-f2fs-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel




_______________________________________________
Linux-f2fs-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

Reply via email to