Filesystems implement the .error_remove_folio address_space operation
to truncate a folio once a memory failure has been discovered in the
memory backing that folio.

Currently, MF_DELAYED is treated as an error, causing "Failed to punch
page" to be written to the console. MF_DELAYED is then relayed to the
caller of truncate_error_folio() as MF_FAILED. This further causes
memory_failure() to return -EBUSY, which then always causes a SIGBUS.

This also implies that regardless of whether the thread's memory
corruption kill policy is PR_MCE_KILL_EARLY or PR_MCE_KILL_LATE, a
memory failure with MF_DELAYED will always cause a SIGBUS.

Update truncate_error_folio() to return MF_DELAYED to the caller if the
.error_remove_folio() callback reports MF_DELAYED.

Signed-off-by: Lisa Wang <[email protected]>
---
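Note for reviewers: the return-value handling described above can be
modeled in user space. This is only a sketch under stated assumptions,
not the kernel code: the enum copies the ordering of the kernel's
enum mf_result, the callback type and the cb_*() and truncate_model_*()
names are hypothetical, and the filemap_release_folio() step is assumed
to succeed.

```c
#include <stdio.h>

/* Simplified copy of the kernel's enum mf_result (same ordering). */
enum mf_result { MF_IGNORED, MF_FAILED, MF_DELAYED, MF_RECOVERED };

/* Hypothetical stand-in for a filesystem's .error_remove_folio callback. */
typedef int (*remove_cb)(void);

int cb_ok(void)      { return 0; }
int cb_delayed(void) { return MF_DELAYED; }
int cb_error(void)   { return -5; }	/* e.g. -EIO */

/* Before the patch: every non-zero result is logged and becomes MF_FAILED. */
int truncate_model_old(remove_cb cb, unsigned long pfn)
{
	int ret = MF_FAILED;
	int err = cb();

	if (err != 0)
		printf("%#lx: Failed to punch page: %d\n", pfn, err);
	else
		ret = MF_RECOVERED;	/* buffer release assumed to succeed */

	return ret;
}

/* After the patch: MF_DELAYED is passed through without a message. */
int truncate_model_new(remove_cb cb, unsigned long pfn)
{
	int ret = MF_FAILED;
	int err = cb();

	if (err == MF_DELAYED)
		ret = err;
	else if (err != 0)
		printf("%#lx: Failed to punch page: %d\n", pfn, err);
	else
		ret = MF_RECOVERED;	/* buffer release assumed to succeed */

	return ret;
}
```

With cb_delayed, truncate_model_old() yields MF_FAILED and prints the
"Failed to punch page" message, while truncate_model_new() yields
MF_DELAYED silently. Real errors and success behave the same in both.
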
 mm/memory-failure.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 2e53b3024391..fd9ed2cd761d 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -941,7 +941,9 @@ static int truncate_error_folio(struct folio *folio, unsigned long pfn,
        if (mapping->a_ops->error_remove_folio) {
                int err = mapping->a_ops->error_remove_folio(mapping, folio);
 
-               if (err != 0)
+               if (err == MF_DELAYED)
+                       ret = err;
+               else if (err != 0)
                        pr_info("%#lx: Failed to punch page: %d\n", pfn, err);
                else if (!filemap_release_folio(folio, GFP_NOIO))
                        pr_info("%#lx: failed to release buffers\n", pfn);

-- 
2.53.0.1213.gd9a14994de-goog

