From: Chao Yu <[email protected]>

[ Upstream commit 0a736109c9d29de0c26567e42cb99b27861aa8ba ]

Add node footer sanity check during node folio's writeback, if sanity
check fails, let's shutdown filesystem to avoid looping to redirty
and writeback in .writepages.

Signed-off-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---

LLM Generated explanations, may be completely bogus:

`sanity_check_node_footer` does not exist in 6.12.y, so the fix would
need more substantial adaptation for older trees. The core issue
(looping redirty on a corrupted footer) still exists there: the old code
uses `f2fs_bug_on`, which only triggers a WARN_ON (unless
CONFIG_F2FS_CHECK_FS is set) and then continues processing the corrupted
node.

## Analysis Summary

### What the commit does

The commit replaces `f2fs_bug_on(sbi, folio->index != nid)` in
`__write_node_folio()` with a proper call to
`sanity_check_node_footer()` + `f2fs_handle_critical_error()`.

### The bug

When a node folio has a corrupted footer (nid mismatch or other
inconsistency):

1. **Old behavior**: `f2fs_bug_on()` is either a `BUG_ON()` (kernel
   crash, with `CONFIG_F2FS_CHECK_FS`) or a `WARN_ON` that sets
   `SBI_NEED_FSCK` and **continues execution**. In the WARN_ON case, the
   corrupted node is still processed and written. More critically, if
   the write later bails out to `redirty_out`, the folio is redirtied
   and picked up again by `.writepages`, over and over. This is an
   **infinite loop**, since nothing stops the cycle.

2. **New behavior**: `sanity_check_node_footer()` detects the corruption
   more thoroughly (checking multiple footer fields, not just nid), and
   `f2fs_handle_critical_error()` shuts down the filesystem to **break
   the infinite writeback loop**.
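
The contrast between the two behaviors can be sketched in userspace C.
This is a toy model, not kernel code: `toy_sbi`, `toy_folio`,
`toy_write_old` and `toy_write_new` are hypothetical names, and the
footer check is reduced to the single nid comparison, whereas the
analysis above says the real `sanity_check_node_footer()` checks more
fields.

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-ins; none of these are the kernel's real types. */
struct toy_sbi {
	bool need_fsck;	/* models the SBI_NEED_FSCK flag */
	bool shut_down;	/* models the state after a critical error */
};

struct toy_folio {
	unsigned long index;		/* page-cache index of the node folio */
	unsigned long footer_nid;	/* nid recorded in the node footer */
};

/* Old style (non-CHECK_FS build): warn, flag for fsck, keep going. */
static bool toy_write_old(struct toy_sbi *sbi, const struct toy_folio *f)
{
	if (f->index != f->footer_nid) {
		fprintf(stderr, "WARN: footer nid mismatch\n");
		sbi->need_fsck = true;
	}
	return true;	/* the caller still proceeds with the write */
}

/* New style: treat the mismatch as fatal and refuse to write. */
static bool toy_write_new(struct toy_sbi *sbi, const struct toy_folio *f)
{
	if (f->index != f->footer_nid) {
		sbi->shut_down = true;	/* models f2fs_handle_critical_error() */
		return false;		/* models "goto redirty_out" */
	}
	return true;
}
```

Calling both functions on a folio with `index = 5` and `footer_nid = 7`
shows the difference: the old style returns `true` with only `need_fsck`
set, the new style returns `false` with `shut_down` set.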

### Why it matters

- **Infinite loop / soft lockup**: Without this fix, a corrupted node
  footer causes the kernel to loop endlessly trying to write the page,
  consuming CPU and potentially hanging the system.
- **Filesystem corruption defense**: On corrupted or fuzzed images, the
  old code would continue operating on inconsistent data.
- **Small, surgical fix**: 5 insertions and 1 deletion in a single
  file, replacing an assertion with a proper error handling path.
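
Why a shutdown ends the loop can be illustrated with a small userspace
simulation. It is only a sketch under stated assumptions: all names
(`toy_fs`, `toy_node`, `toy_write_node`, `toy_writepages`) are
hypothetical, the pass cap exists only so the unpatched case terminates,
and the rule that a shut-down filesystem drops the dirty node instead of
retrying is an assumption of this model, not a claim about the exact
f2fs code path.

```c
#include <stdbool.h>

struct toy_fs {
	bool shut_down;			/* set once a critical error is handled */
	bool stop_on_corruption;	/* true models the patched behaviour */
};

struct toy_node {
	bool dirty;
	bool footer_ok;
};

/* One writeback attempt; returns true if the node is no longer dirty. */
static bool toy_write_node(struct toy_fs *fs, struct toy_node *n)
{
	if (fs->shut_down) {
		/* Assumption: after shutdown the node is dropped, not retried. */
		n->dirty = false;
		return true;
	}
	if (!n->footer_ok) {
		if (fs->stop_on_corruption)
			fs->shut_down = true;
		n->dirty = true;	/* redirty: the node stays queued */
		return false;
	}
	n->dirty = false;	/* healthy node: written out */
	return true;
}

/* Retry until the node is clean; returns passes used, capped at max_passes. */
static int toy_writepages(struct toy_fs *fs, struct toy_node *n, int max_passes)
{
	int passes = 0;

	while (n->dirty && passes < max_passes) {
		toy_write_node(fs, n);
		passes++;
	}
	return passes;
}
```

In this model, a node with a bad footer exhausts every allowed pass when
`stop_on_corruption` is false, and settles after two passes when it is
true: the first pass triggers the shutdown, the second drops the node.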

### Dependencies

The fix calls `sanity_check_node_footer()` (introduced in v6.15-rc1 by
commit 1cf6b5670af1f) and `f2fs_handle_critical_error()` +
`STOP_CP_REASON_CORRUPTED_NID`.

Crucially, `sanity_check_node_footer()` was **already backported** to
6.17.y and 6.18.y as part of commit c18ecd99e0c70's backport. Both
`f2fs_handle_critical_error` and `STOP_CP_REASON_CORRUPTED_NID` also
exist in these stable trees.

### Risk assessment

- **Very low risk**: The change replaces a debug assertion with proper
  error handling + filesystem shutdown. The error path (`goto
  redirty_out`) already exists and is well-tested.
- **Applies cleanly** to 6.17.y and 6.18.y (minor context difference
  with `f2fs_down_read` vs `f2fs_down_read_trace` but that's outside the
  hunk).
- For older trees (6.12.y and below), `sanity_check_node_footer()`
  doesn't exist, so a more complex backport would be needed.

## Verification

- **git show 0a736109c9d29**: Confirmed the actual commit is +5/-1
  lines, single file change in fs/f2fs/node.c
- **git show 1cf6b5670af1f**: Confirmed this is the prerequisite that
  introduced `sanity_check_node_footer()`, first appeared in v6.15-rc1
- **git tag --contains 1cf6b5670af1f**: Confirmed first release is
  v6.15-rc1
- **git show v6.17.13:fs/f2fs/node.c**: Confirmed
  `sanity_check_node_footer` exists (line 1503), and `f2fs_bug_on(sbi,
  folio->index != nid)` still present (the buggy line) at line 1754
- **git show v6.18.10:fs/f2fs/node.c**: Same -
  `sanity_check_node_footer` exists and the buggy `f2fs_bug_on` line is
  present
- **git show v6.17.13:fs/f2fs/f2fs.h**: Confirmed
  `f2fs_handle_critical_error` is declared
- **git show v6.17.13:include/linux/f2fs_fs.h**: Confirmed
  `STOP_CP_REASON_CORRUPTED_NID` exists
- **git show v6.18.10**: Same confirmations for both dependencies
- **git show v6.12.71:fs/f2fs/node.c**: Confirmed
  `sanity_check_node_footer` does NOT exist in 6.12.y (would need more
  work)
- **f2fs_bug_on definition** (fs/f2fs/f2fs.h:34-40): Confirmed that
  without CONFIG_F2FS_CHECK_FS, it's just WARN_ON + SBI_NEED_FSCK,
  allowing execution to continue
- **git show 186098f34b8a5**: Confirmed that c18ecd99e0c70 was
  backported to 6.17.y (combined with prerequisites), establishing
  `sanity_check_node_footer` in stable

This is a small, surgical fix that prevents an infinite writeback loop
on corrupted f2fs node footers. It replaces a weak assertion with proper
error handling and filesystem shutdown. All dependencies exist in 6.17.y
and 6.18.y stable trees. The fix is clearly correct, fixes a real bug
(system hang), and is appropriately scoped.

**YES**

 fs/f2fs/node.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 482a362f26254..a963c4165bc4b 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1751,7 +1751,11 @@ static bool __write_node_folio(struct folio *folio, bool atomic, bool *submitted
 
        /* get old block addr of this node page */
        nid = nid_of_node(folio);
-       f2fs_bug_on(sbi, folio->index != nid);
+
+       if (sanity_check_node_footer(sbi, folio, nid, NODE_TYPE_REGULAR)) {
+               f2fs_handle_critical_error(sbi, STOP_CP_REASON_CORRUPTED_NID);
+               goto redirty_out;
+       }
 
        if (f2fs_get_node_info(sbi, nid, &ni, !do_balance))
                goto redirty_out;
-- 
2.51.0



