On 06/12, Chao Yu wrote:
> On 6/8/26 17:09, zhaoyang.huang wrote:
> > From: Zhaoyang Huang <[email protected]>
> >
> > This reverts commit 9609dd704725a40cd63d915f2ab6c44248a44598.
> >
> > Kernel panics keep being reported, especially when the f2fs partition
> > gets almost full. Our investigation shows that an f2fs page is freed to
> > the buddy allocator without being deleted from the LRU, and that the
> > root cause is the race shown in [2], which was introduced by this commit.
> >
> > Three processes race in this scenario; their main activities are
> > described below.
> >
> > The changed code in move_data_block() lets the GC path evict the tail-end
> > folio from the page cache through folio_end_dropbehind(). Once
> > folio_unmap_invalidate() removes the folio from mapping->i_pages, the
> > page-cache references for all pages in the folio are dropped. The folio
> > is then kept alive only by temporary external references, which allows a
> > later split to operate on a folio whose subpages are no longer protected
> > by page-cache references.
> >
> > After the page-cache references are gone, split_folio_to_order() can
> > split the big folio into individual pages and put the resulting subpages
> > back on the LRU. For tail pages beyond EOF, split removes them from the
> > page cache and drops their page-cache references. A tail page can then
> > remain on the LRU with PG_lru set while holding only the split caller's
> > temporary reference. When free_folio_and_swap_cache() drops that final
> > reference, the page enters the final folio_put() release path.
> >
> > In parallel, folio_isolate_lru() can observe the same tail page with a
> > non-zero refcount and PG_lru set. It clears PG_lru before taking its own
> > reference. If this races with the final folio_put() from the split path,
> > __folio_put() sees PG_lru already cleared and skips lruvec_del_folio().
> > The page is then freed back to the allocator while its lru links are
> > still present in the LRU list. A later LRU operation on a neighboring
> > page detects the stale link and reports list corruption.
> >
> > [1]
> > [   22.486082] list_del corruption. next->prev should be fffffffec10e0ac8, but was dead000000000122. (next=fffffffec10e0a88)
> > [   22.486130] ------------[ cut here ]------------
> > [   22.486134] kernel BUG at lib/list_debug.c:67!
> > [   22.486141] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
> > [   22.488502] Tainted: [W]=WARN, [O]=OOT_MODULE
> > [   22.488506] Hardware name: Spreadtrum UMS9230 1H10 SoC (DT)
> > [   22.488511] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > [   22.488517] pc : __list_del_entry_valid_or_report+0x14c/0x154
> > [   22.488531] lr : __list_del_entry_valid_or_report+0x14c/0x154
> > [   22.488539] sp : ffffffc08006b830
> > [   22.488542] x29: ffffffc08006b868 x28: 0000000000003020 x27: 0000000000000000
> > [   22.488553] x26: 0000000000000000 x25: 0000000000000004 x24: fffffffec10e0ac0
> > [   22.488564] x23: 00000000000000e8 x22: 0000000000000024 x21: dead000000000122
> > [   22.488574] x20: fffffffec10e0a88 x19: fffffffec10e0ac8 x18: ffffffc080061060
> > [   22.488585] x17: 20747562202c3863 x16: 6130653031636566 x15: 0000000000000058
> > [   22.488595] x14: 0000000000000004 x13: ffffff80f91e0000 x12: 0000000000000003
> > [   22.488605] x11: 0000000000000003 x10: 0000000000000001 x9 : ffe85721f0e25f00
> > [   22.488615] x8 : ffe85721f0e25f00 x7 : 0000000000000000 x6 : 6c65645f7473696c
> > [   22.488625] x5 : ffffffed39b23026 x4 : 0000000000000000 x3 : 0000000000000010
> > [   22.488636] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000006d
> > [   22.488647] Call trace:
> > [   22.488651]  __list_del_entry_valid_or_report+0x14c/0x154 (P)
> > [   22.488661]  __folio_put+0x2bc/0x434
> > [   22.488670]  folio_put+0x28/0x58
> > [   22.488678]  do_garbage_collect+0x1a34/0x2584
> > [   22.488689]  f2fs_gc+0x230/0x9b4
> > [   22.488697]  f2fs_fallocate+0xb90/0xdf4
> > [   22.488706]  vfs_fallocate+0x1b4/0x2bc
> > [   22.488716]  __arm64_sys_fallocate+0x44/0x78
> > [   22.488725]  invoke_syscall+0x58/0xe4
> > [   22.488732]  do_el0_svc+0x48/0xdc
> > [   22.488739]  el0_svc+0x3c/0x98
> > [   22.488747]  el0t_64_sync_handler+0x20/0x130
> > [   22.488754]  el0t_64_sync+0x1c4/0x1c8
> >
> > [2]
> > CPU0 (f2fs GC)                  CPU1 (split_folio_to_order)         CPU2 (folio_isolate_lru)
> >
> > F: pagecache refs = n
> > F: extra refs = GC + split
> > F: PG_lru set
> > move_data_block()
> >   folio = f2fs_grab_cache_folio(F)
> >   ...
> >   __folio_set_dropbehind(F)
> >   folio_unlock(F)
> >   folio_end_dropbehind(F)
> >     folio_unmap_invalidate(F)
> >       __filemap_remove_folio(F)
> >       folio_put_refs(F, n)
> >   folio_put(F)
> >                                 split_folio_to_order(F)
> >                                   folio_ref_freeze(F, 1)
> >                                   ...
> >                                   lru_add_split_folio(T)
> >                                     list_add_tail(&T->lru, &F->lru)
> >                                     folio_set_lru(T)
> >                                   __filemap_remove_folio(T)
> >                                   folio_put_refs(T, 1)
> >                                   /* T refcount == 1, PageLRU set */
> >
> >                                                                     folio_isolate_lru(T)
> >
> >                                                                       folio_test_clear_lru(T)
> >                                 free_folio_and_swap_cache(T)
> >                                   folio_put(T)
> >                                   /* refcount: 1 -> 0 */
> >                                   __folio_put(T)
> >                                     __page_cache_release(T)
> >                                       folio_test_lru(T) == false
> >                                       /* skip lruvec_del_folio(T) */
> >                                     free_frozen_pages(T)
> >
> >                                                                       folio_get(T)
> >
> >                                                                       lruvec_del_folio(T)
> > later:
> >   list_del(adjacent->lru)
> >     next == &T->lru
> >     next->prev == LIST_POISON / PCP freelist
> >     BUG
> >

Missing Fixes and Cc: stable lines.
Applied with them.

> >
> > Signed-off-by: Zhaoyang Huang <[email protected]>
>
> I suspect this is a bug in MM; we can revert this first and reapply it
> after we fix the issue in MM.
>
> Thanks,
>
> > ---
> >  fs/f2fs/gc.c | 6 +-----
> >  1 file changed, 1 insertion(+), 5 deletions(-)
> >
> > diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> > index ba93010924c0..3084e05e22f2 100644
> > --- a/fs/f2fs/gc.c
> > +++ b/fs/f2fs/gc.c
> > @@ -1468,11 +1468,7 @@ static int move_data_block(struct inode *inode, block_t bidx,
> >  put_out:
> >  	f2fs_put_dnode(&dn);
> >  out:
> > -	if (!folio_test_uptodate(folio))
> > -		__folio_set_dropbehind(folio);
> > -	folio_unlock(folio);
> > -	folio_end_dropbehind(folio);
> > -	folio_put(folio);
> > +	f2fs_folio_put(folio, true);
> >  	return err;
> >  }
>
_______________________________________________
Linux-f2fs-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
