On Tue, Dec 30 2025, Pasha Tatashin wrote:

> On Mon, Dec 29, 2025 at 4:03 PM Pratyush Yadav <[email protected]> wrote:
>>
>> On Tue, Dec 23 2025, Pasha Tatashin wrote:
[...]
>> >
>> > I kind of do not like relying on magic to decide whether to initialize
>> > the struct page. I would prefer to avoid this magic marker altogether:
>> > i.e. struct page is either initialized or not, not halfway
>> > initialized, etc.
>>
>> The magic is purely sanity checking. It is not used to decide anything
>> other than to make sure this is actually a KHO page. I don't intend to
>> change that. My point is, if we make sure the KHO pages are properly
>> initialized during MM init, then restoring can actually be a very cheap
>> operation, where you only do the sanity checking. You can even put the
>> magic check behind CONFIG_KEXEC_HANDOVER_DEBUG if you want, but I think
>> it is useful enough to keep in production systems too.
>
> It is part of a critical hotpath during blackout, should really be
> behind CONFIG_KEXEC_HANDOVER_DEBUG
>
>> > Magic is not reliable. During machine reset in many firmware
>> > implementations, and in every kexec reboot, memory is not zeroed. The
>> > kernel usually allocates vmemmap using exactly the same pages, so
>> > there is just too high a chance of getting magic values accidentally
>> > inherited from the previous boot.
>>
>> I don't think that can happen. All the pages are zeroed when
>> initialized, which will clear the magic. We should only be setting the
>> magic on an initialized struct page.
>
> This can happen due to bugs when we use a partially initialized
> "struct page", something that Mike has been looking to do. So, pass
> some information in a struct page before it is fully initialized.

The magic is checked at restore time though, and by then all non-KHO
pages should be properly initialized and have their magic cleared.

Also, the magic is cleared by kho_restore_folio(), so this can only ever
happen for pages that were preserved but not restored in the previous
boot. I don't think that is a common use case in the first place.
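
To make the argument above concrete, here is a toy userspace model of the
lifecycle. It is only a sketch under stated assumptions, not the kernel
code: struct toy_page, TOY_KHO_MAGIC and the toy_*() helpers are
hypothetical names made up for illustration, and the real struct page and
kho_restore_folio() look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct page; not the kernel layout. */
struct toy_page {
	unsigned long private;	/* holds the marker while preserved */
	unsigned int refcount;
};

/* Made-up marker value, chosen only for this sketch. */
#define TOY_KHO_MAGIC 0x4b484f50UL

/* Models MM init: zeroing the whole struct wipes any stale marker. */
static void toy_init_page(struct toy_page *p)
{
	memset(p, 0, sizeof(*p));
}

/* Models preservation: the marker is only set on an initialized page. */
static void toy_preserve(struct toy_page *p)
{
	p->private = TOY_KHO_MAGIC;
}

/*
 * Models restore: sanity check the marker, then clear it so that the
 * same page cannot be restored twice or leak the marker further.
 */
static bool toy_restore(struct toy_page *p)
{
	if (p->private != TOY_KHO_MAGIC)
		return false;

	p->private = 0;
	p->refcount = 1;
	return true;
}
```

In this model, a page that still carries a marker from the previous boot
fails toy_restore() once toy_init_page() has run on it, and a second
toy_restore() on the same page fails because the first one cleared the
marker.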

[...]

-- 
Regards,
Pratyush Yadav