On 9 Jun 2026, at 6:12, Michael S. Tsirkin wrote:

> TestSetPageHWPoison() is called without zone->lock, so its atomic
> update to page->flags can race with non-atomic flag operations
> that run under zone->lock in the buddy allocator.
>
> In particular, __free_pages_prepare() does:
>
>     page->flags.f &= ~PAGE_FLAGS_CHECK_AT_PREP;
>
> This non-atomic read-modify-write, while correctly excluding
> __PG_HWPOISON from the mask, can still lose a concurrent
> TestSetPageHWPoison if the read happens before the poison bit
> is set and the write happens after.  This will only get worse if/when
> we add more non-atomic flag operations.
>
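
For anyone following along, the lost update described above can be replayed by hand in userspace. This is only a sketch under stated assumptions: the DEMO_* bit values and the function name are hypothetical and do not match the kernel's real flag layout, and the "concurrent" steps are interleaved manually on one thread so the outcome is deterministic.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit positions for illustration; not the kernel's layout. */
#define DEMO_PG_HWPOISON  (1UL << 0)
#define DEMO_FLAGS_CHECK  (1UL << 1)  /* stands in for PAGE_FLAGS_CHECK_AT_PREP */

/*
 * Replay the racy interleaving step by step:
 *   1. the free path loads the flags word
 *   2. the poison path sets the poison bit
 *   3. the free path stores its stale, masked copy
 * Returns true if the poison bit is still set afterwards.
 */
static bool demo_poison_survives_racy_interleaving(unsigned long initial)
{
    unsigned long flags = initial;

    unsigned long snapshot = flags;        /* step 1: load (free path) */
    flags |= DEMO_PG_HWPOISON;             /* step 2: set (poison path) */
    flags = snapshot & ~DEMO_FLAGS_CHECK;  /* step 3: stale store (free path) */

    return (flags & DEMO_PG_HWPOISON) != 0;
}
```

The mask never names the poison bit, yet the bit is gone after step 3, because the store writes back a copy that was read before the bit was set.
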
> Fix by acquiring zone->lock around TestSetPageHWPoison and
> around ClearPageHWPoison in the retry path.  This
> serializes with all buddy flag manipulation.  The cost is
> negligible: one lock/unlock in an extremely rare path
> (hardware memory errors).
>
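
The serialization argument can be sketched in userspace too. This is not the patch's code: a pthread mutex stands in for zone->lock (the kernel takes a spinlock), and struct demo_page, the DEMO_* bits and all function names are made up for illustration. Both paths touch the flags word only while holding the lock, so the non-atomic read-modify-write can no longer straddle the poison update.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical bit positions for illustration; not the kernel's layout. */
#define DEMO_PG_HWPOISON  (1UL << 0)
#define DEMO_FLAGS_CHECK  (1UL << 1)

struct demo_page {
    pthread_mutex_t lock;  /* stands in for zone->lock */
    unsigned long flags;
};

/* Free path: repeated non-atomic flag updates, always under the lock. */
static void *demo_free_path(void *arg)
{
    struct demo_page *page = arg;

    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&page->lock);
        page->flags |= DEMO_FLAGS_CHECK;
        page->flags &= ~DEMO_FLAGS_CHECK;  /* plain RMW, safe under lock */
        pthread_mutex_unlock(&page->lock);
    }
    return NULL;
}

/* Poison path: takes the same lock before setting the poison bit. */
static void *demo_poison_path(void *arg)
{
    struct demo_page *page = arg;

    pthread_mutex_lock(&page->lock);
    page->flags |= DEMO_PG_HWPOISON;
    pthread_mutex_unlock(&page->lock);
    return NULL;
}

/* Run both paths concurrently; report whether the poison bit survived. */
static bool demo_poison_survives_with_lock(void)
{
    struct demo_page page = { .flags = 0 };
    pthread_t freer, poisoner;

    pthread_mutex_init(&page.lock, NULL);
    if (pthread_create(&freer, NULL, demo_free_path, &page) != 0)
        return false;
    if (pthread_create(&poisoner, NULL, demo_poison_path, &page) != 0) {
        pthread_join(freer, NULL);
        return false;
    }
    pthread_join(freer, NULL);
    pthread_join(poisoner, NULL);
    pthread_mutex_destroy(&page.lock);

    return (page.flags & DEMO_PG_HWPOISON) != 0;
}
```

Build with -pthread. Whatever order the scheduler picks, each update completes as a unit, so the poison bit is still set once both threads have finished.
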
> Note: SetPageHWPoison and TestClearPageHWPoison calls elsewhere
> in this file operate on pages already removed from the buddy
> allocator or on non-buddy pages (DAX, hugetlb), so they do not
> need zone->lock protection.
>
> Fixes: 6a46079cf57a ("HWPOISON: The high level memory error handler in the VM v7")
> Acked-by: Miaohe Lin <[email protected]>
> Signed-off-by: Michael S. Tsirkin <[email protected]>
> Assisted-by: Claude:claude-opus-4-6
> ---
>
> Sending separately as suggested by multiple people. I also added
> a Fixes tag.
>
>
>  mm/memory-failure.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
Makes sense to me. Thanks.

Acked-by: Zi Yan <[email protected]>

Best Regards,
Yan, Zi
