On 2026/6/8 22:15, Breno Leitao wrote:
> On Fri, Jun 05, 2026 at 11:42:53AM +0200, David Hildenbrand (Arm) wrote:
>> On 6/5/26 11:35, Breno Leitao wrote:
>>> On Wed, Jun 03, 2026 at 10:33:04AM +0800, Miaohe Lin wrote:
>>>> On 2026/6/2 17:41, David Hildenbrand (Arm) wrote:
>>>>>
>>>>> Races are fine. We might miss some pages, but that can happen on races
>>>>> either way.
>>>>>
>>>>>
>>>>> I'd just do something like
>>>>>
>>>>> if (PageReserved(page))
>>>>> return true;
>>>>>
>>>>> head = compound_head(page);
>>>>
>>>> If @head is split just after compound_head. And then @head is freed into
>>>> buddy and re-allocated as slab
>>>> page while @page is still in the buddy. We would panic on this scene as
>>>> @head is PageSlab. But we were
>>>> supposed to successfully handle @page. Or am I miss something?
>>>
>>> You're right that it is racy, but I think it is an acceptable race here.
>>>
>>
>> I mean, any such races can currently already happen one way or the other?
>>
>> Really, the only way to not get races is to tryget the (compound)page,
>> revalidate that the page is still part of the compound page.
>>
>> I'm not sure if that's really a good idea.
>>
>> But my memory is a bit vague in which scenarios we already hold a page
>> reference
>> here to prevent any concurrent freeing?
>
> No, we don't hold one here in the case that matters.
>
> HWPoisonKernelOwned() runs at the very top of get_any_page(), before
> try_again: and before __get_hwpoison_page(). The first refcount taken in
> the whole path is the folio_try_get() inside __get_hwpoison_page(), which
> runs *after* the short-circuit.
>
> So get_any_page() itself never holds a reference at the check -- the only way
> one exists is if the caller passed MF_COUNT_INCREASED (count_increased ==
> true).
>
> So on the MCE/GHES path -- the one this panic option exists for -- no
> reference is held when HWPoisonKernelOwned() does its compound_head() +
> PageSlab()/PageTable()/PageLargeKmalloc() checks.
>
> Given that, I'd rather keep it racy and take no refcount than add a
> tryget + revalidate purely for this check. As I've said earleir, an operator
Would it be acceptable to add a simple recheck? Something like below:
retry:
head = compound_head(page);
PageSlab()/PageTable()/PageLargeKmalloc() checks
if (head != compound_head(page))
goto retry
Thanks both.
.
> who enabled it has chosen to crash rather than run on corrupted memory;
> mis-attributing one such rare, genuinely-poisoned page is within that
> contract.
> .
>