>From: Chao Yu <c...@kernel.org> >Sent: Monday, March 31, 2025 8:36 PM >To: 정요한(JOUNG YOHAN) Mobile AE <yohan.jo...@sk.com>; linux-f2fs- >de...@lists.sourceforge.net >Cc: c...@kernel.org; jaeg...@kernel.org; jyh...@gmail.com; linux- >ker...@vger.kernel.org; 김필현(KIM PILHYUN) Mobile AE <pilhyun....@sk.com> >Subject: [External Mail] Re: [External Mail] Re: [f2fs-dev] [External Mail] >Re: [External Mail] Re: [PATCH] f2fs: prevent the current section from >being selected as a victim during garbage collection > >On 3/31/25 13:13, yohan.joung wrote: >>> On 2025/3/28 15:25, yohan.joung wrote: >>>>> On 2025/3/28 11:40, yohan.joung wrote: >>>>>>> From: Chao Yu <c...@kernel.org> >>>>>>> Sent: Thursday, March 27, 2025 10:48 PM >>>>>>> To: 정요한(JOUNG YOHAN) Mobile AE <yohan.jo...@sk.com>; Yohan Joung >>>>>>> <jyh...@gmail.com>; jaeg...@kernel.org; daeh...@gmail.com >>>>>>> Cc: c...@kernel.org; linux-f2fs-devel@lists.sourceforge.net; >>>>>>> linux- ker...@vger.kernel.org; 김필현(KIM PILHYUN) Mobile AE >>>>>>> <pilhyun....@sk.com> >>>>>>> Subject: [External Mail] Re: [External Mail] Re: [External Mail] Re: >>>>>>> [PATCH] f2fs: prevent the current section from being selected as >>>>>>> a victim during garbage collection >>>>>>> >>>>>>> On 2025/3/27 16:00, yohan.jo...@sk.com wrote: >>>>>>>>> From: Chao Yu <c...@kernel.org> >>>>>>>>> Sent: Thursday, March 27, 2025 4:30 PM >>>>>>>>> To: 정요한(JOUNG YOHAN) Mobile AE <yohan.jo...@sk.com>; Yohan >>>>>>>>> Joung <jyh...@gmail.com>; jaeg...@kernel.org; daeh...@gmail.com >>>>>>>>> Cc: c...@kernel.org; linux-f2fs-devel@lists.sourceforge.net; >>>>>>>>> linux- ker...@vger.kernel.org; 김필현(KIM PILHYUN) Mobile AE >>>>>>>>> <pilhyun....@sk.com> >>>>>>>>> Subject: [External Mail] Re: [External Mail] Re: [PATCH] f2fs: >>>>>>>>> prevent the current section from being selected as a victim >>>>>>>>> during garbage collection >>>>>>>>> >>>>>>>>> On 3/27/25 14:43, yohan.jo...@sk.com wrote: >>>>>>>>>>> From: Chao Yu <c...@kernel.org> >>>>>>>>>>> Sent: Thursday, March 27, 2025 3:02 PM >>>>>>>>>>> To: Yohan Joung <jyh...@gmail.com>; jaeg...@kernel.org; >>>>>>>>>>> daeh...@gmail.com >>>>>>>>>>> Cc: c...@kernel.org; linux-f2fs-devel@lists.sourceforge.net; >>>>>>>>>>> linux- ker...@vger.kernel.org; 정요한(JOUNG YOHAN) Mobile AE >>>>>>>>>>> <yohan.jo...@sk.com> >>>>>>>>>>> Subject: [External Mail] Re: [PATCH] f2fs: prevent the >>>>>>>>>>> current section from being selected as a victim during >>>>>>>>>>> garbage collection >>>>>>>>>>> >>>>>>>>>>> On 3/26/25 22:14, Yohan Joung wrote: >>>>>>>>>>>> When selecting a victim using next_victim_seg in a large >>>>>>>>>>>> section, the selected section might already have been >>>>>>>>>>>> cleared and designated as the new current section, making it >>>>>>>>>>>> actively in >>>>> use. >>>>>>>>>>>> This behavior causes inconsistency between the SIT and SSA. >>>>>>>>>>> >>>>>>>>>>> Hi, does this fix your issue? >>>>>>>>>> >>>>>>>>>> This is an issue that arises when dividing a large section >>>>>>>>>> into segments for garbage collection. >>>>>>>>>> caused by the background GC (garbage collection) thread in >>>>>>>>>> large section >>>>>>>>>> f2fs_gc(victim_section) -> >>>>>>>>>> f2fs_clear_prefree_segments(victim_section)-> >>>>>>>>>> cursec(victim_section) -> f2fs_gc(victim_section by >>>>>>>>>> next_victim_seg) >>>>>>>>> >>>>>>>>> I didn't get it, why f2fs_get_victim() will return section >>>>>>>>> which is used by curseg? It should be avoided by checking w/ >>> sec_usage_check(). >>>>>>>>> >>>>>>>>> Or we missed to check gcing section which next_victim_seg >>>>>>>>> points to during get_new_segment()? >>>>>>>>> >>>>>>>>> Can this happen? >>>>>>>>> >>>>>>>>> e.g. >>>>>>>>> - bggc selects sec #0 >>>>>>>>> - next_victim_seg: seg #0 >>>>>>>>> - migrate seg #0 and stop >>>>>>>>> - next_victim_seg: seg #1 >>>>>>>>> - checkpoint, set sec #0 free if sec #0 has no valid blocks >>>>>>>>> - allocate seg #0 in sec #0 for curseg >>>>>>>>> - curseg moves to seg #1 after allocation >>>>>>>>> - bggc tries to migrate seg #1 >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>> That's correct >>>>>>>> In f2fs_get_victim, we use next_victim_seg to directly jump to >>>>>>>> got_result, thereby bypassing sec_usage_check What do you think >>>>>>>> about this change? >>>>>>>> >>>>>>>> @@ -850,15 +850,20 @@ int f2fs_get_victim(struct f2fs_sb_info >>>>>>>> *sbi, >>>>>>> unsigned int *result, >>>>>>>> p.min_segno = sbi->next_victim_seg[BG_GC]; >>>>>>>> *result = p.min_segno; >>>>>>>> sbi->next_victim_seg[BG_GC] = NULL_SEGNO; >>>>>>>> - goto got_result; >>>>>>>> } >>>>>>>> if (gc_type == FG_GC && >>>>>>>> sbi->next_victim_seg[FG_GC] >>>>>>>> != NULL_SEGNO) >>> { >>>>>>>> p.min_segno = sbi->next_victim_seg[FG_GC]; >>>>>>>> *result = p.min_segno; >>>>>>>> sbi->next_victim_seg[FG_GC] = NULL_SEGNO; >>>>>>>> - goto got_result; >>>>>>>> } >>>>>>>> + >>>>>>>> + secno = GET_SEC_FROM_SEG(sbi, segno); >>>>>>>> + >>>>>>>> + if (sec_usage_check(sbi, secno)) >>>>>>>> + goto next; >>>>>>>> + >>>>>>>> + goto got_result; >>>>>>>> } >>>>>>> >>>>>>> But still allocator can assign this segment after >>>>>>> sec_usage_check() in race condition, right? >>>>>> Since the BG GC using next_victim takes place after the SIT >>>>>> update in do_checkpoint, it seems unlikely that a race condition >>>>>> with >>>>> sec_usage_check will occur. >>>>> >>>>> I mean this: >>>>> >>>>> - gc_thread >>>>> - f2fs_gc >>>>> - f2fs_get_victim >>>>> - sec_usage_check --- segno #1 is not used in any cursegs >>>>> - f2fs_allocate_data_block >>>>> - new_curseg >>>>> - get_new_segment find segno #1 >>>>> >>>>> - do_garbage_collect >>>>> >>>>> Thanks, >>>> >>>> do_checkpoint sec0 free >>>> If sec0 is not freed, then >>> segno1 within sec0 cannot be >>>> allocated >>>> - gc_thread >>>> - f2fs_gc >>>> - f2fs_get_victim >>>> - sec_usage_check --- segno #1 is not used in any cursegs (but >>>> sec0 >>> is already used) >>>> - >>>> f2fs_allocate_data_block >>>> - new_curseg >>>> - get_new_segment find >>> segno #1 >>>> >>>> - do_garbage_collect >>>> >>>> I appreciate your patch, it is under testing. >>>> but I'm wondering if there's a risk of a race condition in this >>>> situation >>> >>> Oh, yes, I may missed that get_new_segment can return a free segment >>> in partial used section. >>> >>> So what do you think of this? >>> - check CURSEG() in do_garbage_collect() and get_victim() >>> - reset next_victim_seg[] in get_new_segment() and >>> __set_test_and_free() during checkpoint. >>> >>> Thanks, >> >> How about using victim_secmap? >> gc_thread >> mutex_lock(&DIRTY_I(sbi)->seglist_lock); >> __set_test_and_free >> check cur section next_victim clear >> mutex_unlock(&dirty_i->seglist_lock); >> >> mutex_lock(&dirty->seglist_lock); >> f2fs_get_victim >> mutex_unlock(&dirty_i->seglist_lock); >> >> static inline void __set_test_and_free(struct f2fs_sb_info *sbi, >> if (next >= start_segno + usable_segs) { >> if (test_and_clear_bit(secno, free_i->free_secmap)) >> free_i->free_sections++; >> + >> + if (test_and_clear_bit(secno, >> dirty_i->victim_secmap)) >> + sbi->next_victim_seg[BG_GC] = >> + NULL_SEGNO; > >Can this happen? > >segs_per_sec=2 > >- seg#0 and seg#1 are all dirty >- all valid blocks are removed in seg#1 >- checkpoint -> seg#1 becomes free >- gc select this sec and next_victim_seg=seg#0 >- migrate seg#0, next_victim_seg=seg#1 >- allocator assigns seg#1 to curseg >- gc tries to migrate seg#1 > >Thanks, The detailed scenario segs_per_sec=2 - seg#0 and seg#1 are all dirty - all valid blocks are removed in seg#1 - gc select this sec and next_victim_seg=seg#0 - migrate seg#0, next_victim_seg=seg#1 - checkpoint -> sec(seg#0, seg#1) becomes free - allocator assigns sec(seg#0, seg#1) to curseg - gc tries to migrate seg#1 > >> } >> } >>> >>>> >>>> >>>>> >>>>>>> >>>>>>> IMO, we can clear next_victim_seg[] once section is free in >>>>>>> __set_test_and_free()? something like this: >>>>>> I will test it according to your suggestion. >>>>>> If there are no issues, can I submit it again with the patch? >>>>>> Thanks >>>>>>> >>>>>>> --- >>>>>>> fs/f2fs/segment.h | 13 ++++++++++--- >>>>>>> 1 file changed, 10 insertions(+), 3 deletions(-) >>>>>>> >>>>>>> diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h index >>>>>>> 0465dc00b349..826e37999085 100644 >>>>>>> --- a/fs/f2fs/segment.h >>>>>>> +++ b/fs/f2fs/segment.h >>>>>>> @@ -473,9 +473,16 @@ static inline void >>>>>>> __set_test_and_free(struct f2fs_sb_info *sbi, >>>>>>> goto skip_free; >>>>>>> next = find_next_bit(free_i->free_segmap, >>>>>>> start_segno + SEGS_PER_SEC(sbi), >>> start_segno); >>>>>>> - if (next >= start_segno + usable_segs) { >>>>>>> - if (test_and_clear_bit(secno, free_i- >>free_secmap)) >>>>>>> - free_i->free_sections++; >>>>>>> + if ((next >= start_segno + usable_segs) && >>>>>>> + test_and_clear_bit(secno, free_i->free_secmap)) >{ >>>>>>> + free_i->free_sections++; >>>>>>> + >>>>>>> + if >>>>>>> (GET_SEC_FROM_SEG(sbi->next_victim_seg[BG_GC]) >== >>>>>>> + >>>>>>> secno) >>>>>>> + sbi->next_victim_seg[BG_GC] = >>>>>>> NULL_SEGNO; >>>>>>> + if >>>>>>> (GET_SEC_FROM_SEG(sbi->next_victim_seg[FG_GC]) >== >>>>>>> + >>>>>>> secno) >>>>>>> + sbi->next_victim_seg[FG_GC] = >>>>>>> NULL_SEGNO; >>>>>>> } >>>>>>> } >>>>>>> skip_free: >>>>>>> -- >>>>>>> 2.40.1 >>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Because the call stack is different, I think that in order to >>>>>>>>>> handle everything at once, we need to address it within >>>>>>>>>> do_garbage_collect, or otherwise include it on both sides. >>>>>>>>>> What do you think? >>>>>>>>>> >>>>>>>>>> [30146.337471][ T1300] F2FS-fs (dm-54): Inconsistent segment >>>>>>>>>> (70961) type [0, 1] in SSA and SIT [30146.346151][ T1300] Call >>> trace: >>>>>>>>>> [30146.346152][ T1300] dump_backtrace+0xe8/0x10c >>>>>>>>>> [30146.346157][ T1300] show_stack+0x18/0x28 [30146.346158][ >>>>>>>>>> T1300] dump_stack_lvl+0x50/0x6c [30146.346161][ T1300] >>>>>>>>>> dump_stack+0x18/0x28 [30146.346162][ T1300] >>>>>>>>>> f2fs_stop_checkpoint+0x1c/0x3c [30146.346165][ T1300] >>>>>>>>>> do_garbage_collect+0x41c/0x271c [30146.346167][ T1300] >>>>>>>>>> f2fs_gc+0x27c/0x828 [30146.346168][ T1300] >>>>>>>>>> gc_thread_func+0x290/0x88c [30146.346169][ T1300] >>>>>>>>>> kthread+0x11c/0x164 [30146.346172][ T1300] >>>>>>>>>> ret_from_fork+0x10/0x20 >>>>>>>>>> >>>>>>>>>> struct curseg_info : 0xffffff803f95e800 { >>>>>>>>>> segno : 0x11531 : 70961 >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> struct f2fs_sb_info : 0xffffff8811d12000 { >>>>>>>>>> next_victim_seg[0] : 0x11531 : 70961 } >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> https://lore.kernel.org/linux-f2fs-devel/20250325080646.32919 >>>>>>>>>>> 47 >>>>>>>>>>> -2 >>>>>>>>>>> - >>>>>>>>>>> c...@kernel.org >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Signed-off-by: Yohan Joung <yohan.jo...@sk.com> >>>>>>>>>>>> --- >>>>>>>>>>>> fs/f2fs/gc.c | 4 ++++ >>>>>>>>>>>> 1 file changed, 4 insertions(+) >>>>>>>>>>>> >>>>>>>>>>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c index >>>>>>>>>>>> 2b8f9239bede..4b5d18e395eb 100644 >>>>>>>>>>>> --- a/fs/f2fs/gc.c >>>>>>>>>>>> +++ b/fs/f2fs/gc.c >>>>>>>>>>>> @@ -1926,6 +1926,10 @@ int f2fs_gc(struct f2fs_sb_info *sbi, >>>>>>>>>>>> struct >>>>>>>>>>> f2fs_gc_control *gc_control) >>>>>>>>>>>> goto stop; >>>>>>>>>>>> } >>>>>>>>>>>> >>>>>>>>>>>> + if (__is_large_section(sbi) && >>>>>>>>>>>> + IS_CURSEC(sbi, GET_SEC_FROM_SEG(sbi, segno))) >>>>>>>>>>>> + goto stop; >>>>>>>>>>>> + >>>>>>>>>>>> seg_freed = do_garbage_collect(sbi, segno, &gc_list, >gc_type, >>>>>>>>>>>> >>>>>>>>>>>> gc_control->should_migrate_blocks, >>>>>>>>>>>> gc_control->one_time); >>>>>>>>>> >>>>>>>> >>>>>> >>>>
_______________________________________________ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel