On Tue, Jun 09, 2026 at 11:10:20AM -0700, Andrew Morton wrote: > On Tue, 9 Jun 2026 06:12:49 -0400 "Michael S. Tsirkin" <[email protected]> > wrote: > > > TestSetPageHWPoison() is called without zone->lock, so its atomic > > update to page->flags can race with non-atomic flag operations > > that run under zone->lock in the buddy allocator. > > > > In particular, __free_pages_prepare() does: > > > > page->flags.f &= ~PAGE_FLAGS_CHECK_AT_PREP; > > > > This non-atomic read-modify-write, while correctly excluding > > __PG_HWPOISON from the mask, can still lose a concurrent > > TestSetPageHWPoison if the read happens before the poison bit > > is set and the write happens after. Will only get worse if/when > > we add more non-atomic flag operations. > > > > Fix by acquiring zone->lock around TestSetPageHWPoison and > > around ClearPageHWPoison in the retry path. This > > serializes with all buddy flag manipulation. The cost is > > negligible: one lock/unlock in an extremely rare path > > (hardware memory errors). > > > > Note: SetPageHWPoison and TestClearPageHWPoison calls elsewhere > > in this file operate on pages already removed from the buddy > > allocator or on non-buddy pages (DAX, hugetlb), so they do not > > need zone->lock protection. > > Sashiko is saying this doesn't do anything "Because > __free_pages_prepare() executes entirely locklessly". Did it goof? > > https://sashiko.dev/#/patchset/df06b66fe4ff8e925ee0714955abc2183a727b90.1780998980.git....@redhat.com >
Oh. So it only helps with the prezero patches. Maybe other places where flags are touched locklessly. Not __free_pages_prepare. I was too focused on that. Scrap this please. I'll try to think of something. -- MST

