On 2026/3/4 12:57, Shinichiro Kawasaki wrote:
On Feb 27, 2026 / 20:32, Jaegeuk Kim wrote:
On 02/24, Shinichiro Kawasaki wrote:
On Feb 24, 2026 / 03:26, Jaegeuk Kim wrote:
On 02/18, Shin'ichiro Kawasaki wrote:
From: Shin'ichiro Kawasaki via Linux-f2fs-devel 
<[email protected]>

A lockdep WARN is observed recently under the following steps:

1) Create a zoned TCMU device
2) Create a f2fs filesystem on the zoned TCMU device and mount it
3) Fill the filesystem with files and trigger GC
4) Unmout the filesystem
5) Remove the zoned TCMU device

The lockdep WARN indicates that a circular lock depedency formed by four
contexts, as described below.

a) TCMU device removal context:
  - call del_gendisk() to get q->q_usage_counter
  - call start_flush_work() to get work_completion of wb->dwork
b) f2fs writeback context:
  - in wb_workfn(), which holds work_completion of wb->dwork
  - call f2fs_balance_fs() to get sbi->gc_lock
c) f2fs vfs_write context:
  - call f2fs_gc() to get sbi->gc_lock
  - call f2fs_write_checkpoint() to get sbi->cp_global_sem
d) f2fs mount context:
  - call recover_fsync_data() to get sbi->cp_global_sem
  - call f2fs_check_and_fix_write_pointer() to call blkdev_report_zones()
    that goes down to blk_mq_alloc_request and get q->q_usage_counter

To suppress the WARN, cut the dependency d) between sbi->cp_global_sem
and q->q_usage_counter. For that purpose, move the
f2fs_check_and_fix_write_pointer() call outside of the critical section
of sbi->cp_global_sem in f2fs_recovery_fsync_data(). This change is fine
because the write pointer fix operation only affects the main segments
and does not interact with the check point metadata. Furthermore,
conflicts between the write pointer fix operation and data/node flush
operations remain protected by SBI_POR_DOING.

Fixes: c426d99127b1 ("f2fs: Check write pointer consistency of open zones")
Reviewed-by: Damien Le Moal <[email protected]>
Signed-off-by: Shin'ichiro Kawasaki <[email protected]>
---
  fs/f2fs/recovery.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
index a26071f2b0bc..87fd6cd436fe 100644
--- a/fs/f2fs/recovery.c
+++ b/fs/f2fs/recovery.c
@@ -922,6 +922,8 @@ int f2fs_recover_fsync_data(struct f2fs_sb_info *sbi, bool 
check_only)
                truncate_inode_pages_final(META_MAPPING(sbi));
        }
+ f2fs_up_write_trace(&sbi->cp_global_sem, &lc);
+
        /*
         * If fsync data succeeds or there is no fsync data to recover,
         * and the f2fs is not read only, check and fix zoned block devices'
@@ -933,8 +935,6 @@ int f2fs_recover_fsync_data(struct f2fs_sb_info *sbi, bool 
check_only)
        if (!err)
                clear_sbi_flag(sbi, SBI_POR_DOING);
- f2fs_up_write_trace(&sbi->cp_global_sem, &lc);
-

This was a guard to prevent checkpoint during f2fs_check_and_fix_write_pointer()
where it changes the checkpoint as well?

I checked f2fs_check_and_fix_write_pointer() again, and it does not look
changing the checkpoint to me. FYI, here I show the rough function call chain
from f2fs_check_and_fix_write_pointer() as below. I guess this call chain does
not change the checkpoint, but if I misunderstand anything, please let me know.

  f2fs_check_and_fix_write_pointer()
   fix_curseg_write_pointer()
    do_fix_curseg_write_pointer()
     blkdev_report_zones()
      report_one_zone_cb()
     f2fs_allocate_new_section()
      __allocate_new_segment()
       new_curseg()

E.g., curseg.


Thanks, I looked in do_checkpoint() in fs/f2fs/checkpoing.c, and found it refers
to cursegs. Then, this patch will allow recording the cursegs in parallel of
f2fs_check_and_fix_write_pointer(), and it will results in inconsistent curseg
values in checkpoints. Not good. Let me drop this patch.

I will seek out other ways to avoid the lockdep. I have no idea how to do that
at this moment, though.

Shinichiro,

IMO, this looks like a false alarm of lockdep, what do you think of this fix?

https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/commit/?h=bugfix/syzbot&id=3b19564b95e9ba9803ef30e90eace0977b9d140d

---
 fs/f2fs/f2fs.h  | 3 +++
 fs/f2fs/super.c | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index bb34e864d0ef..5b400e99f332 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -2042,6 +2042,9 @@ struct f2fs_sb_info {
        spinlock_t iostat_lat_lock;
        struct iostat_lat_info *iostat_io_lat;
 #endif
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+       struct lock_class_key cp_global_sem_key;
+#endif
 };

 /* Definitions to access f2fs_sb_info */
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 8774c60b4be4..9e85f31fa828 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -4948,6 +4948,9 @@ static int f2fs_fill_super(struct super_block *sb, struct 
fs_context *fc)
        init_f2fs_rwsem_trace(&sbi->gc_lock, sbi, LOCK_NAME_GC_LOCK);
        mutex_init(&sbi->writepages);
        init_f2fs_rwsem_trace(&sbi->cp_global_sem, sbi, LOCK_NAME_CP_GLOBAL);
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+       lockdep_set_class(&sbi->cp_global_sem, &sbi->cp_global_sem_key);
+#endif
        init_f2fs_rwsem_trace(&sbi->node_write, sbi, LOCK_NAME_NODE_WRITE);
        init_f2fs_rwsem_trace(&sbi->node_change, sbi, LOCK_NAME_NODE_CHANGE);
        spin_lock_init(&sbi->stat_lock);
--
2.49.0




_______________________________________________
Linux-f2fs-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

Reply via email to