On 2026/6/10 5:00, Michael S. Tsirkin wrote: > On Tue, Jun 09, 2026 at 04:54:01PM -0400, Zi Yan wrote: >> On 9 Jun 2026, at 16:34, Michael S. Tsirkin wrote: >> >>> On Tue, Jun 09, 2026 at 02:52:47PM -0400, Zi Yan wrote: >>>> On 9 Jun 2026, at 14:39, Zi Yan wrote: >>>> >>>>> On 9 Jun 2026, at 14:38, David Hildenbrand (Arm) wrote: >>>>> >>>>>> On 6/9/26 20:10, Andrew Morton wrote: >>>>>>> On Tue, 9 Jun 2026 06:12:49 -0400 "Michael S. Tsirkin" >>>>>>> <[email protected]> wrote: >>>>>>> >>>>>>>> TestSetPageHWPoison() is called without zone->lock, so its atomic >>>>>>>> update to page->flags can race with non-atomic flag operations >>>>>>>> that run under zone->lock in the buddy allocator. >>>>>>>> >>>>>>>> In particular, __free_pages_prepare() does: >>>>>>>> >>>>>>>> page->flags.f &= ~PAGE_FLAGS_CHECK_AT_PREP; >>>>>>>> >>>>>>>> This non-atomic read-modify-write, while correctly excluding >>>>>>>> __PG_HWPOISON from the mask, can still lose a concurrent >>>>>>>> TestSetPageHWPoison if the read happens before the poison bit >>>>>>>> is set and the write happens after. Will only get worse if/when >>>>>>>> we add more non-atomic flag operations. >>>>>>>> >>>>>>>> Fix by acquiring zone->lock around TestSetPageHWPoison and >>>>>>>> around ClearPageHWPoison in the retry path. This >>>>>>>> serializes with all buddy flag manipulation. The cost is >>>>>>>> negligible: one lock/unlock in an extremely rare path >>>>>>>> (hardware memory errors). >>>>>>>> >>>>>>>> Note: SetPageHWPoison and TestClearPageHWPoison calls elsewhere >>>>>>>> in this file operate on pages already removed from the buddy >>>>>>>> allocator or on non-buddy pages (DAX, hugetlb), so they do not >>>>>>>> need zone->lock protection. >>>>>>> >>>>>>> Sashiko is saying this doesn't do anything "Because >>>>>>> __free_pages_prepare() executes entirely locklessly". Did it goof? >>>>>>> >>>>>>> https://sashiko.dev/#/patchset/df06b66fe4ff8e925ee0714955abc2183a727b90.1780998980.git....@redhat.com >>>>>> >>>>>> Battle of the bots: it's right. >>>>> >>>>> Yep, __free_pages_prepare() changes the page flag without holding >>>>> zone->lock. >>>> >>>> __free_pages_prepare() works on frozen pages and assumes no one else >>>> touches the input page. To avoid this race, memory_failure() might >>>> want to try_get_page() before TestClearPageHWPoison(), but I am not >>>> sure if that works along with memory failure flow. >>>> >>>> Best Regards, >>>> Yan, Zi >>> >>> >>> >>> Actually memory failure already plays with this down the road no? >>> >>> So maybe it's enough to just SetPageHWPoison afterwards again? >>> >>> >>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c >>> index ee42d4361309..4758fea94a96 100644 >>> --- a/mm/memory-failure.c >>> +++ b/mm/memory-failure.c >>> @@ -2415,6 +2415,7 @@ int memory_failure(unsigned long pfn, int flags) >>> if (!res) { >>> if (is_free_buddy_page(p)) { >>> if (take_page_off_buddy(p)) { >>> + SetPageHWPoison(p); >>> page_ref_inc(p); >>> res = MF_RECOVERED; >>> } else { >>> >>> >>> and maybe in a bunch of other places in there? >> >> You mean for fear of losing HWPoison flag in the earlier >> TestSetPageHWPoison(), >> just set it again here? > > Yea. > >> Why not do it after get_hwpoison_page(), since that >> is the expected page flag? > > It's still in the buddy at that point right? I'm worried buddy might > poke at flags.
Since __free_pages_prepare() executes entirely locklessly, the only way to ensure HWPoison flag won't be lost might be only set hwpoison flag iff we can make sure pages are not on the way to buddy... Thanks. .

