>On 2025/3/28 15:25, yohan.joung wrote: >>> On 2025/3/28 11:40, yohan.joung wrote: >>>>> From: Chao Yu <c...@kernel.org> >>>>> Sent: Thursday, March 27, 2025 10:48 PM >>>>> To: 정요한(JOUNG YOHAN) Mobile AE <yohan.jo...@sk.com>; Yohan Joung >>>>> <jyh...@gmail.com>; jaeg...@kernel.org; daeh...@gmail.com >>>>> Cc: c...@kernel.org; linux-f2fs-devel@lists.sourceforge.net; linux- >>>>> ker...@vger.kernel.org; 김필현(KIM PILHYUN) Mobile AE >>>>> <pilhyun....@sk.com> >>>>> Subject: [External Mail] Re: [External Mail] Re: [External Mail] Re: >>>>> [PATCH] f2fs: prevent the current section from being selected as a >>>>> victim during garbage collection >>>>> >>>>> On 2025/3/27 16:00, yohan.jo...@sk.com wrote: >>>>>>> From: Chao Yu <c...@kernel.org> >>>>>>> Sent: Thursday, March 27, 2025 4:30 PM >>>>>>> To: 정요한(JOUNG YOHAN) Mobile AE <yohan.jo...@sk.com>; Yohan Joung >>>>>>> <jyh...@gmail.com>; jaeg...@kernel.org; daeh...@gmail.com >>>>>>> Cc: c...@kernel.org; linux-f2fs-devel@lists.sourceforge.net; >>>>>>> linux- ker...@vger.kernel.org; 김필현(KIM PILHYUN) Mobile AE >>>>>>> <pilhyun....@sk.com> >>>>>>> Subject: [External Mail] Re: [External Mail] Re: [PATCH] f2fs: >>>>>>> prevent the current section from being selected as a victim >>>>>>> during garbage collection >>>>>>> >>>>>>> On 3/27/25 14:43, yohan.jo...@sk.com wrote: >>>>>>>>> From: Chao Yu <c...@kernel.org> >>>>>>>>> Sent: Thursday, March 27, 2025 3:02 PM >>>>>>>>> To: Yohan Joung <jyh...@gmail.com>; jaeg...@kernel.org; >>>>>>>>> daeh...@gmail.com >>>>>>>>> Cc: c...@kernel.org; linux-f2fs-devel@lists.sourceforge.net; >>>>>>>>> linux- ker...@vger.kernel.org; 정요한(JOUNG YOHAN) Mobile AE >>>>>>>>> <yohan.jo...@sk.com> >>>>>>>>> Subject: [External Mail] Re: [PATCH] f2fs: prevent the current >>>>>>>>> section from being selected as a victim during garbage >>>>>>>>> collection >>>>>>>>> >>>>>>>>> On 3/26/25 22:14, Yohan Joung wrote: >>>>>>>>>> When selecting a victim using next_victim_seg in a large >>>>>>>>>> section, the selected section might already have been cleared >>>>>>>>>> and designated as the new current section, making it actively >>>>>>>>>> in >>> use. >>>>>>>>>> This behavior causes inconsistency between the SIT and SSA. >>>>>>>>> >>>>>>>>> Hi, does this fix your issue? >>>>>>>> >>>>>>>> This is an issue that arises when dividing a large section into >>>>>>>> segments for garbage collection. >>>>>>>> caused by the background GC (garbage collection) thread in large >>>>>>>> section >>>>>>>> f2fs_gc(victim_section) -> >>>>>>>> f2fs_clear_prefree_segments(victim_section)-> >>>>>>>> cursec(victim_section) -> f2fs_gc(victim_section by >>>>>>>> next_victim_seg) >>>>>>> >>>>>>> I didn't get it, why f2fs_get_victim() will return section which >>>>>>> is used by curseg? It should be avoided by checking w/ >sec_usage_check(). >>>>>>> >>>>>>> Or we missed to check gcing section which next_victim_seg points >>>>>>> to during get_new_segment()? >>>>>>> >>>>>>> Can this happen? >>>>>>> >>>>>>> e.g. >>>>>>> - bggc selects sec #0 >>>>>>> - next_victim_seg: seg #0 >>>>>>> - migrate seg #0 and stop >>>>>>> - next_victim_seg: seg #1 >>>>>>> - checkpoint, set sec #0 free if sec #0 has no valid blocks >>>>>>> - allocate seg #0 in sec #0 for curseg >>>>>>> - curseg moves to seg #1 after allocation >>>>>>> - bggc tries to migrate seg #1 >>>>>>> >>>>>>> Thanks, >>>>>> That's correct >>>>>> In f2fs_get_victim, we use next_victim_seg to directly jump to >>>>>> got_result, thereby bypassing sec_usage_check What do you think >>>>>> about this change? >>>>>> >>>>>> @@ -850,15 +850,20 @@ int f2fs_get_victim(struct f2fs_sb_info >>>>>> *sbi, >>>>> unsigned int *result, >>>>>> p.min_segno = sbi->next_victim_seg[BG_GC]; >>>>>> *result = p.min_segno; >>>>>> sbi->next_victim_seg[BG_GC] = NULL_SEGNO; >>>>>> - goto got_result; >>>>>> } >>>>>> if (gc_type == FG_GC && >>>>>> sbi->next_victim_seg[FG_GC] != >>>>>> NULL_SEGNO) >{ >>>>>> p.min_segno = sbi->next_victim_seg[FG_GC]; >>>>>> *result = p.min_segno; >>>>>> sbi->next_victim_seg[FG_GC] = NULL_SEGNO; >>>>>> - goto got_result; >>>>>> } >>>>>> + >>>>>> + secno = GET_SEC_FROM_SEG(sbi, segno); >>>>>> + >>>>>> + if (sec_usage_check(sbi, secno)) >>>>>> + goto next; >>>>>> + >>>>>> + goto got_result; >>>>>> } >>>>> >>>>> But still allocator can assign this segment after sec_usage_check() >>>>> in race condition, right? >>>> Since the BG GC using next_victim takes place after the SIT update >>>> in do_checkpoint, it seems unlikely that a race condition with >>> sec_usage_check will occur. >>> >>> I mean this: >>> >>> - gc_thread >>> - f2fs_gc >>> - f2fs_get_victim >>> - sec_usage_check --- segno #1 is not used in any cursegs >>> - f2fs_allocate_data_block >>> - new_curseg >>> - get_new_segment find segno #1 >>> >>> - do_garbage_collect >>> >>> Thanks, >> >> do_checkpoint sec0 free >> If sec0 is not freed, then >segno1 within sec0 cannot be >> allocated >> - gc_thread >> - f2fs_gc >> - f2fs_get_victim >> - sec_usage_check --- segno #1 is not used in any cursegs (but sec0 >is already used) >> - >> f2fs_allocate_data_block >> - new_curseg >> - get_new_segment find >segno #1 >> >> - do_garbage_collect >> >> I appreciate your patch, it is under testing. >> but I'm wondering if there's a risk of a race condition in this >> situation > >Oh, yes, I may missed that get_new_segment can return a free segment in >partial used section. > >So what do you think of this? >- check CURSEG() in do_garbage_collect() and get_victim() >- reset next_victim_seg[] in get_new_segment() and __set_test_and_free() >during checkpoint. > >Thanks,
How about using victim_secmap? gc_thread mutex_lock(&DIRTY_I(sbi)->seglist_lock); __set_test_and_free check cur section next_victim clear mutex_unlock(&dirty_i->seglist_lock); mutex_lock(&dirty->seglist_lock); f2fs_get_victim mutex_unlock(&dirty_i->seglist_lock); static inline void __set_test_and_free(struct f2fs_sb_info *sbi, if (next >= start_segno + usable_segs) { if (test_and_clear_bit(secno, free_i->free_secmap)) free_i->free_sections++; + + if (test_and_clear_bit(secno, dirty_i->victim_secmap)) + sbi->next_victim_seg[BG_GC] = NULL_SEGNO; } } > >> >> >>> >>>>> >>>>> IMO, we can clear next_victim_seg[] once section is free in >>>>> __set_test_and_free()? something like this: >>>> I will test it according to your suggestion. >>>> If there are no issues, can I submit it again with the patch? >>>> Thanks >>>>> >>>>> --- >>>>> fs/f2fs/segment.h | 13 ++++++++++--- >>>>> 1 file changed, 10 insertions(+), 3 deletions(-) >>>>> >>>>> diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h index >>>>> 0465dc00b349..826e37999085 100644 >>>>> --- a/fs/f2fs/segment.h >>>>> +++ b/fs/f2fs/segment.h >>>>> @@ -473,9 +473,16 @@ static inline void __set_test_and_free(struct >>>>> f2fs_sb_info *sbi, >>>>> goto skip_free; >>>>> next = find_next_bit(free_i->free_segmap, >>>>> start_segno + SEGS_PER_SEC(sbi), >start_segno); >>>>> - if (next >= start_segno + usable_segs) { >>>>> - if (test_and_clear_bit(secno, free_i->free_secmap)) >>>>> - free_i->free_sections++; >>>>> + if ((next >= start_segno + usable_segs) && >>>>> + test_and_clear_bit(secno, free_i->free_secmap)) { >>>>> + free_i->free_sections++; >>>>> + >>>>> + if (GET_SEC_FROM_SEG(sbi->next_victim_seg[BG_GC]) == >>>>> + secno) >>>>> + sbi->next_victim_seg[BG_GC] = NULL_SEGNO; >>>>> + if (GET_SEC_FROM_SEG(sbi->next_victim_seg[FG_GC]) == >>>>> + secno) >>>>> + sbi->next_victim_seg[FG_GC] = NULL_SEGNO; >>>>> } >>>>> } >>>>> skip_free: >>>>> -- >>>>> 2.40.1 >>>>> >>>>>>> >>>>>>>> >>>>>>>> Because the call stack is different, I think that in order to >>>>>>>> handle everything at once, we need to address it within >>>>>>>> do_garbage_collect, or otherwise include it on both sides. >>>>>>>> What do you think? >>>>>>>> >>>>>>>> [30146.337471][ T1300] F2FS-fs (dm-54): Inconsistent segment >>>>>>>> (70961) type [0, 1] in SSA and SIT [30146.346151][ T1300] Call >trace: >>>>>>>> [30146.346152][ T1300] dump_backtrace+0xe8/0x10c >>>>>>>> [30146.346157][ T1300] show_stack+0x18/0x28 [30146.346158][ >>>>>>>> T1300] dump_stack_lvl+0x50/0x6c [30146.346161][ T1300] >>>>>>>> dump_stack+0x18/0x28 [30146.346162][ T1300] >>>>>>>> f2fs_stop_checkpoint+0x1c/0x3c [30146.346165][ T1300] >>>>>>>> do_garbage_collect+0x41c/0x271c [30146.346167][ T1300] >>>>>>>> f2fs_gc+0x27c/0x828 [30146.346168][ T1300] >>>>>>>> gc_thread_func+0x290/0x88c [30146.346169][ T1300] >>>>>>>> kthread+0x11c/0x164 [30146.346172][ T1300] >>>>>>>> ret_from_fork+0x10/0x20 >>>>>>>> >>>>>>>> struct curseg_info : 0xffffff803f95e800 { >>>>>>>> segno : 0x11531 : 70961 >>>>>>>> } >>>>>>>> >>>>>>>> struct f2fs_sb_info : 0xffffff8811d12000 { >>>>>>>> next_victim_seg[0] : 0x11531 : 70961 } >>>>>>>> >>>>>>>>> >>>>>>>>> https://lore.kernel.org/linux-f2fs-devel/20250325080646.3291947 >>>>>>>>> -2 >>>>>>>>> - >>>>>>>>> c...@kernel.org >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Signed-off-by: Yohan Joung <yohan.jo...@sk.com> >>>>>>>>>> --- >>>>>>>>>> fs/f2fs/gc.c | 4 ++++ >>>>>>>>>> 1 file changed, 4 insertions(+) >>>>>>>>>> >>>>>>>>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c index >>>>>>>>>> 2b8f9239bede..4b5d18e395eb 100644 >>>>>>>>>> --- a/fs/f2fs/gc.c >>>>>>>>>> +++ b/fs/f2fs/gc.c >>>>>>>>>> @@ -1926,6 +1926,10 @@ int f2fs_gc(struct f2fs_sb_info *sbi, >>>>>>>>>> struct >>>>>>>>> f2fs_gc_control *gc_control) >>>>>>>>>> goto stop; >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> + if (__is_large_section(sbi) && >>>>>>>>>> + IS_CURSEC(sbi, GET_SEC_FROM_SEG(sbi, segno))) >>>>>>>>>> + goto stop; >>>>>>>>>> + >>>>>>>>>> seg_freed = do_garbage_collect(sbi, segno, &gc_list, gc_type, >>>>>>>>>> gc_control->should_migrate_blocks, >>>>>>>>>> gc_control->one_time); >>>>>>>> >>>>>> >>>> >> _______________________________________________ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel