On 4/1/25 09:51, yohan.joung wrote: >> From: Chao Yu <c...@kernel.org> >> Sent: Monday, March 31, 2025 8:36 PM >> To: 정요한(JOUNG YOHAN) Mobile AE <yohan.jo...@sk.com>; linux-f2fs- >> de...@lists.sourceforge.net >> Cc: c...@kernel.org; jaeg...@kernel.org; jyh...@gmail.com; linux- >> ker...@vger.kernel.org; 김필현(KIM PILHYUN) Mobile AE <pilhyun....@sk.com> >> Subject: [External Mail] Re: [External Mail] Re: [f2fs-dev] [External Mail] >> Re: [External Mail] Re: [PATCH] f2fs: prevent the current section from >> being selected as a victim during garbage collection >> >> On 3/31/25 13:13, yohan.joung wrote: >>>> On 2025/3/28 15:25, yohan.joung wrote: >>>>>> On 2025/3/28 11:40, yohan.joung wrote: >>>>>>>> From: Chao Yu <c...@kernel.org> >>>>>>>> Sent: Thursday, March 27, 2025 10:48 PM >>>>>>>> To: 정요한(JOUNG YOHAN) Mobile AE <yohan.jo...@sk.com>; Yohan Joung >>>>>>>> <jyh...@gmail.com>; jaeg...@kernel.org; daeh...@gmail.com >>>>>>>> Cc: c...@kernel.org; linux-f2fs-devel@lists.sourceforge.net; >>>>>>>> linux- ker...@vger.kernel.org; 김필현(KIM PILHYUN) Mobile AE >>>>>>>> <pilhyun....@sk.com> >>>>>>>> Subject: [External Mail] Re: [External Mail] Re: [External Mail] Re: >>>>>>>> [PATCH] f2fs: prevent the current section from being selected as >>>>>>>> a victim during garbage collection >>>>>>>> >>>>>>>> On 2025/3/27 16:00, yohan.jo...@sk.com wrote: >>>>>>>>>> From: Chao Yu <c...@kernel.org> >>>>>>>>>> Sent: Thursday, March 27, 2025 4:30 PM >>>>>>>>>> To: 정요한(JOUNG YOHAN) Mobile AE <yohan.jo...@sk.com>; Yohan >>>>>>>>>> Joung <jyh...@gmail.com>; jaeg...@kernel.org; daeh...@gmail.com >>>>>>>>>> Cc: c...@kernel.org; linux-f2fs-devel@lists.sourceforge.net; >>>>>>>>>> linux- ker...@vger.kernel.org; 김필현(KIM PILHYUN) Mobile AE >>>>>>>>>> <pilhyun....@sk.com> >>>>>>>>>> Subject: [External Mail] Re: [External Mail] Re: [PATCH] f2fs: >>>>>>>>>> prevent the current section from being selected as a victim >>>>>>>>>> during garbage collection >>>>>>>>>> >>>>>>>>>> On 3/27/25 14:43, yohan.jo...@sk.com wrote: >>>>>>>>>>>> From: Chao Yu <c...@kernel.org> >>>>>>>>>>>> Sent: Thursday, March 27, 2025 3:02 PM >>>>>>>>>>>> To: Yohan Joung <jyh...@gmail.com>; jaeg...@kernel.org; >>>>>>>>>>>> daeh...@gmail.com >>>>>>>>>>>> Cc: c...@kernel.org; linux-f2fs-devel@lists.sourceforge.net; >>>>>>>>>>>> linux- ker...@vger.kernel.org; 정요한(JOUNG YOHAN) Mobile AE >>>>>>>>>>>> <yohan.jo...@sk.com> >>>>>>>>>>>> Subject: [External Mail] Re: [PATCH] f2fs: prevent the >>>>>>>>>>>> current section from being selected as a victim during >>>>>>>>>>>> garbage collection >>>>>>>>>>>> >>>>>>>>>>>> On 3/26/25 22:14, Yohan Joung wrote: >>>>>>>>>>>>> When selecting a victim using next_victim_seg in a large >>>>>>>>>>>>> section, the selected section might already have been >>>>>>>>>>>>> cleared and designated as the new current section, making it >>>>>>>>>>>>> actively in >>>>>> use. >>>>>>>>>>>>> This behavior causes inconsistency between the SIT and SSA. >>>>>>>>>>>> >>>>>>>>>>>> Hi, does this fix your issue? >>>>>>>>>>> >>>>>>>>>>> This is an issue that arises when dividing a large section >>>>>>>>>>> into segments for garbage collection. >>>>>>>>>>> caused by the background GC (garbage collection) thread in >>>>>>>>>>> large section >>>>>>>>>>> f2fs_gc(victim_section) -> >>>>>>>>>>> f2fs_clear_prefree_segments(victim_section)-> >>>>>>>>>>> cursec(victim_section) -> f2fs_gc(victim_section by >>>>>>>>>>> next_victim_seg) >>>>>>>>>> >>>>>>>>>> I didn't get it, why f2fs_get_victim() will return section >>>>>>>>>> which is used by curseg? It should be avoided by checking w/ >>>> sec_usage_check(). >>>>>>>>>> >>>>>>>>>> Or we missed to check gcing section which next_victim_seg >>>>>>>>>> points to during get_new_segment()? >>>>>>>>>> >>>>>>>>>> Can this happen? >>>>>>>>>> >>>>>>>>>> e.g. >>>>>>>>>> - bggc selects sec #0 >>>>>>>>>> - next_victim_seg: seg #0 >>>>>>>>>> - migrate seg #0 and stop >>>>>>>>>> - next_victim_seg: seg #1 >>>>>>>>>> - checkpoint, set sec #0 free if sec #0 has no valid blocks >>>>>>>>>> - allocate seg #0 in sec #0 for curseg >>>>>>>>>> - curseg moves to seg #1 after allocation >>>>>>>>>> - bggc tries to migrate seg #1 >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>> That's correct >>>>>>>>> In f2fs_get_victim, we use next_victim_seg to directly jump to >>>>>>>>> got_result, thereby bypassing sec_usage_check What do you think >>>>>>>>> about this change? >>>>>>>>> >>>>>>>>> @@ -850,15 +850,20 @@ int f2fs_get_victim(struct f2fs_sb_info >>>>>>>>> *sbi, >>>>>>>> unsigned int *result, >>>>>>>>> p.min_segno = sbi->next_victim_seg[BG_GC]; >>>>>>>>> *result = p.min_segno; >>>>>>>>> sbi->next_victim_seg[BG_GC] = NULL_SEGNO; >>>>>>>>> - goto got_result; >>>>>>>>> } >>>>>>>>> if (gc_type == FG_GC && >>>>>>>>> sbi->next_victim_seg[FG_GC] >>>>>>>>> != NULL_SEGNO) >>>> { >>>>>>>>> p.min_segno = sbi->next_victim_seg[FG_GC]; >>>>>>>>> *result = p.min_segno; >>>>>>>>> sbi->next_victim_seg[FG_GC] = NULL_SEGNO; >>>>>>>>> - goto got_result; >>>>>>>>> } >>>>>>>>> + >>>>>>>>> + secno = GET_SEC_FROM_SEG(sbi, segno); >>>>>>>>> + >>>>>>>>> + if (sec_usage_check(sbi, secno)) >>>>>>>>> + goto next; >>>>>>>>> + >>>>>>>>> + goto got_result; >>>>>>>>> } >>>>>>>> >>>>>>>> But still allocator can assign this segment after >>>>>>>> sec_usage_check() in race condition, right? >>>>>>> Since the BG GC using next_victim takes place after the SIT >>>>>>> update in do_checkpoint, it seems unlikely that a race condition >>>>>>> with >>>>>> sec_usage_check will occur. >>>>>> >>>>>> I mean this: >>>>>> >>>>>> - gc_thread >>>>>> - f2fs_gc >>>>>> - f2fs_get_victim >>>>>> - sec_usage_check --- segno #1 is not used in any cursegs >>>>>> - f2fs_allocate_data_block >>>>>> - new_curseg >>>>>> - get_new_segment find segno #1 >>>>>> >>>>>> - do_garbage_collect >>>>>> >>>>>> Thanks, >>>>> >>>>> do_checkpoint sec0 free >>>>> If sec0 is not freed, then >>>> segno1 within sec0 cannot be >>>>> allocated >>>>> - gc_thread >>>>> - f2fs_gc >>>>> - f2fs_get_victim >>>>> - sec_usage_check --- segno #1 is not used in any cursegs (but >>>>> sec0 >>>> is already used) >>>>> - >>>>> f2fs_allocate_data_block >>>>> - new_curseg >>>>> - get_new_segment find >>>> segno #1 >>>>> >>>>> - do_garbage_collect >>>>> >>>>> I appreciate your patch, it is under testing. >>>>> but I'm wondering if there's a risk of a race condition in this >>>>> situation >>>> >>>> Oh, yes, I may missed that get_new_segment can return a free segment >>>> in partial used section. >>>> >>>> So what do you think of this? >>>> - check CURSEG() in do_garbage_collect() and get_victim() >>>> - reset next_victim_seg[] in get_new_segment() and >>>> __set_test_and_free() during checkpoint. >>>> >>>> Thanks, >>> >>> How about using victim_secmap? >>> gc_thread >>> mutex_lock(&DIRTY_I(sbi)->seglist_lock); >>> __set_test_and_free >>> check cur section next_victim clear >>> mutex_unlock(&dirty_i->seglist_lock); >>> >>> mutex_lock(&dirty->seglist_lock); >>> f2fs_get_victim >>> mutex_unlock(&dirty_i->seglist_lock); >>> >>> static inline void __set_test_and_free(struct f2fs_sb_info *sbi, >>> if (next >= start_segno + usable_segs) { >>> if (test_and_clear_bit(secno, free_i->free_secmap)) >>> free_i->free_sections++; >>> + >>> + if (test_and_clear_bit(secno, >>> dirty_i->victim_secmap)) >>> + sbi->next_victim_seg[BG_GC] = >>> + NULL_SEGNO; >> >> Can this happen? >> >> segs_per_sec=2 >> >> - seg#0 and seg#1 are all dirty >> - all valid blocks are removed in seg#1 >> - checkpoint -> seg#1 becomes free >> - gc select this sec and next_victim_seg=seg#0 >> - migrate seg#0, next_victim_seg=seg#1 >> - allocator assigns seg#1 to curseg >> - gc tries to migrate seg#1
I meant for above case, below change still can not catch it, right? since next_victim_seg[] was assigned after checkpoint. + if (test_and_clear_bit(secno, dirty_i->victim_secmap)) + sbi->next_victim_seg[BG_GC] = NULL_SEGNO; Thanks, >> >> Thanks, > The detailed scenario > segs_per_sec=2 > - seg#0 and seg#1 are all dirty > - all valid blocks are removed in seg#1 > - gc select this sec and next_victim_seg=seg#0 > - migrate seg#0, next_victim_seg=seg#1 > - checkpoint -> sec(seg#0, seg#1) becomes free > - allocator assigns sec(seg#0, seg#1) to curseg > - gc tries to migrate seg#1 >> >>> } >>> } >>>> >>>>> >>>>> >>>>>> >>>>>>>> >>>>>>>> IMO, we can clear next_victim_seg[] once section is free in >>>>>>>> __set_test_and_free()? something like this: >>>>>>> I will test it according to your suggestion. >>>>>>> If there are no issues, can I submit it again with the patch? >>>>>>> Thanks >>>>>>>> >>>>>>>> --- >>>>>>>> fs/f2fs/segment.h | 13 ++++++++++--- >>>>>>>> 1 file changed, 10 insertions(+), 3 deletions(-) >>>>>>>> >>>>>>>> diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h index >>>>>>>> 0465dc00b349..826e37999085 100644 >>>>>>>> --- a/fs/f2fs/segment.h >>>>>>>> +++ b/fs/f2fs/segment.h >>>>>>>> @@ -473,9 +473,16 @@ static inline void >>>>>>>> __set_test_and_free(struct f2fs_sb_info *sbi, >>>>>>>> goto skip_free; >>>>>>>> next = find_next_bit(free_i->free_segmap, >>>>>>>> start_segno + SEGS_PER_SEC(sbi), >>>> start_segno); >>>>>>>> - if (next >= start_segno + usable_segs) { >>>>>>>> - if (test_and_clear_bit(secno, free_i- >>> free_secmap)) >>>>>>>> - free_i->free_sections++; >>>>>>>> + if ((next >= start_segno + usable_segs) && >>>>>>>> + test_and_clear_bit(secno, free_i->free_secmap)) >> { >>>>>>>> + free_i->free_sections++; >>>>>>>> + >>>>>>>> + if >>>>>>>> (GET_SEC_FROM_SEG(sbi->next_victim_seg[BG_GC]) >> == >>>>>>>> + >>>>>>>> secno) >>>>>>>> + sbi->next_victim_seg[BG_GC] = >>>>>>>> NULL_SEGNO; >>>>>>>> + if >>>>>>>> (GET_SEC_FROM_SEG(sbi->next_victim_seg[FG_GC]) >> == >>>>>>>> + >>>>>>>> secno) >>>>>>>> + sbi->next_victim_seg[FG_GC] = >>>>>>>> NULL_SEGNO; >>>>>>>> } >>>>>>>> } >>>>>>>> skip_free: >>>>>>>> -- >>>>>>>> 2.40.1 >>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Because the call stack is different, I think that in order to >>>>>>>>>>> handle everything at once, we need to address it within >>>>>>>>>>> do_garbage_collect, or otherwise include it on both sides. >>>>>>>>>>> What do you think? >>>>>>>>>>> >>>>>>>>>>> [30146.337471][ T1300] F2FS-fs (dm-54): Inconsistent segment >>>>>>>>>>> (70961) type [0, 1] in SSA and SIT [30146.346151][ T1300] Call >>>> trace: >>>>>>>>>>> [30146.346152][ T1300] dump_backtrace+0xe8/0x10c >>>>>>>>>>> [30146.346157][ T1300] show_stack+0x18/0x28 [30146.346158][ >>>>>>>>>>> T1300] dump_stack_lvl+0x50/0x6c [30146.346161][ T1300] >>>>>>>>>>> dump_stack+0x18/0x28 [30146.346162][ T1300] >>>>>>>>>>> f2fs_stop_checkpoint+0x1c/0x3c [30146.346165][ T1300] >>>>>>>>>>> do_garbage_collect+0x41c/0x271c [30146.346167][ T1300] >>>>>>>>>>> f2fs_gc+0x27c/0x828 [30146.346168][ T1300] >>>>>>>>>>> gc_thread_func+0x290/0x88c [30146.346169][ T1300] >>>>>>>>>>> kthread+0x11c/0x164 [30146.346172][ T1300] >>>>>>>>>>> ret_from_fork+0x10/0x20 >>>>>>>>>>> >>>>>>>>>>> struct curseg_info : 0xffffff803f95e800 { >>>>>>>>>>> segno : 0x11531 : 70961 >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> struct f2fs_sb_info : 0xffffff8811d12000 { >>>>>>>>>>> next_victim_seg[0] : 0x11531 : 70961 } >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> https://lore.kernel.org/linux-f2fs-devel/20250325080646.32919 >>>>>>>>>>>> 47 >>>>>>>>>>>> -2 >>>>>>>>>>>> - >>>>>>>>>>>> c...@kernel.org >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Signed-off-by: Yohan Joung <yohan.jo...@sk.com> >>>>>>>>>>>>> --- >>>>>>>>>>>>> fs/f2fs/gc.c | 4 ++++ >>>>>>>>>>>>> 1 file changed, 4 insertions(+) >>>>>>>>>>>>> >>>>>>>>>>>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c index >>>>>>>>>>>>> 2b8f9239bede..4b5d18e395eb 100644 >>>>>>>>>>>>> --- a/fs/f2fs/gc.c >>>>>>>>>>>>> +++ b/fs/f2fs/gc.c >>>>>>>>>>>>> @@ -1926,6 +1926,10 @@ int f2fs_gc(struct f2fs_sb_info *sbi, >>>>>>>>>>>>> struct >>>>>>>>>>>> f2fs_gc_control *gc_control) >>>>>>>>>>>>> goto stop; >>>>>>>>>>>>> } >>>>>>>>>>>>> >>>>>>>>>>>>> + if (__is_large_section(sbi) && >>>>>>>>>>>>> + IS_CURSEC(sbi, GET_SEC_FROM_SEG(sbi, segno))) >>>>>>>>>>>>> + goto stop; >>>>>>>>>>>>> + >>>>>>>>>>>>> seg_freed = do_garbage_collect(sbi, segno, &gc_list, >> gc_type, >>>>>>>>>>>>> >>>>>>>>>>>>> gc_control->should_migrate_blocks, >>>>>>>>>>>>> gc_control->one_time); >>>>>>>>>>> >>>>>>>>> >>>>>>> >>>>> > > _______________________________________________ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel