On 12/26/25 10:23, Chao Yu via Linux-f2fs-devel wrote:
> On 12/24/2025 9:16 PM, Yongpeng Yang wrote:
>> From: Yongpeng Yang <[email protected]>
> 
> Nice catch!
> 
>>
>> During SPO tests, when mounting F2FS, an -EINVAL error was returned from
>> f2fs_recover_inode_page. The issue occurred under the following scenario
>>
>> Thread A                                     Thread B
>> f2fs_ioc_commit_atomic_write
>>   - f2fs_do_sync_file // atomic = true
>>    - f2fs_fsync_node_pages
>>      : last_folio = inode folio
>>      : schedule before folio_lock(last_folio) f2fs_write_checkpoint
>>                                                - block_operations//
>> writeback last_folio
>>                                                - schedule before
>> f2fs_flush_nat_entries
>>      : set_fsync_mark(last_folio, 1)
>>      : set_dentry_mark(last_folio, 1)
>>      : folio_mark_dirty(last_folio)
>>      : __write_node_folio(last_folio)
> 
> What do you think of relocating set_fsync_mark & set_dentry_mark logic
> into __write_node_folio(), so that we can cover them w/ existed .node_write
> lock in __write_node_folio(), it can avoid checkpoint racing as well.

Yes, this makes more sense. I'll fix this in v2 patch.

> 
>>                                                - f2fs_flush_nat_entries
>>                                                  : {struct nat_entry}-
>> >flag |= BIT(IS_CHECKPOINTED)
>>                                               SPO
>>
>> Thread A calls f2fs_need_dentry_mark(sbi, ino), and the last_folio has
>> already been written once. However, the {struct nat_entry}->flag did not
>> have the IS_CHECKPOINTED set, causing set_dentry_mark(last_folio, 1) and
>> write last_folio again.
>>
>> After SPO and reboot, it was detected that {struct node_info}->blk_addr
>> was not NULL_ADDR because Thread B successfully write the checkpoint.
>>
>> This issue only occurs in atomic write scenarios, and does not affect
> 
> If atomic is false, we will encounter such issue as well? or am I missing
> something?

In the case of atomic is true, the folio must be non-dirty.
For regular file fsync operations, the folio must be dirty. If
block_operations->f2fs_sync_node_pages successfully submit the folio
write, this path will not be executed. Otherwise, the
f2fs_write_checkpoint will need to wait for the folio write submission
to complete, as sbi->nr_pages[F2FS_DIRTY_NODES] > 0. Therefore, the
situation where f2fs_need_dentry_mark checks that the {struct
nat_entry}->flag does not have the IS_CHECKPOINTED flag, but the folio
write has already been submitted, will not occur.

Thanks
Yongpeng,

> 
>             if (!atomic || folio == last_folio) {
>                 set_fsync_mark(folio, 1);
>                 percpu_counter_inc(&sbi->rf_node_block_count);
>                 if (IS_INODE(folio)) {
>                     if (is_inode_flag_set(inode,
>                                 FI_DIRTY_INODE))
>                         f2fs_update_inode(inode, folio);
>                     set_dentry_mark(folio,
>                         f2fs_need_dentry_mark(sbi, ino));
>                 }
>                 /* may be written by other thread */
>                 if (!folio_test_dirty(folio))
>                     folio_mark_dirty(folio);
>             }
> 
> Thanks,
> 
>> regular file fsync operations. Therefore, for atomic file fsync,
>> sbi->cp_rwsem should be acquired to ensure that the IS_CHECKPOINTED flag
>> correctly indicates that the checkpoint write has been completed.
>>
>> Fixes: 608514deba38 ("f2fs: set fsync mark only for the last dnode")
>> Signed-off-by: Sheng Yong <[email protected]>
>> Signed-off-by: Yongpeng Yang <[email protected]>
>> ---
>>   fs/f2fs/node.c | 8 +++++++-
>>   1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
>> index 482a362f2625..e482a38444f1 100644
>> --- a/fs/f2fs/node.c
>> +++ b/fs/f2fs/node.c
>> @@ -1854,7 +1854,7 @@ int f2fs_fsync_node_pages(struct f2fs_sb_info
>> *sbi, struct inode *inode,
>>       struct folio_batch fbatch;
>>       int ret = 0;
>>       struct folio *last_folio = NULL;
>> -    bool marked = false;
>> +    bool marked = false, locked = false;
>>       nid_t ino = inode->i_ino;
>>       int nr_folios;
>>       int nwritten = 0;
>> @@ -1889,6 +1889,10 @@ int f2fs_fsync_node_pages(struct f2fs_sb_info
>> *sbi, struct inode *inode,
>>               if (ino_of_node(folio) != ino)
>>                   continue;
>>   +            if (atomic && folio == last_folio) {
>> +                f2fs_lock_op(sbi);
>> +                locked = true;
>> +            }
>>               folio_lock(folio);
>>                 if (unlikely(!is_node_folio(folio))) {
>> @@ -1959,6 +1963,8 @@ int f2fs_fsync_node_pages(struct f2fs_sb_info
>> *sbi, struct inode *inode,
>>           goto retry;
>>       }
>>   out:
>> +    if (locked)
>> +        f2fs_unlock_op(sbi);
>>       if (nwritten)
>>           f2fs_submit_merged_write_cond(sbi, NULL, NULL, ino, NODE);
>>       return ret;
> 
> 
> 
> _______________________________________________
> Linux-f2fs-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel



_______________________________________________
Linux-f2fs-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

Reply via email to