When `CONFIG_DEFERRED_STRUCT_PAGE_INIT` is enabled, struct page initialization is deferred to parallel kthreads that run later in the boot process.
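As a quick illustration of what that deferral means for early code (a minimal, hypothetical sketch, not part of this patch; `early_get_page()` is a made-up name and it relies on the `init_deferred_page()`/`early_pfn_to_nid()` behaviour used further down):

```
#include <linux/mm.h>

/*
 * Sketch only: before page_alloc_init_late() has run, code cannot assume
 * that pfn_to_page(pfn) points at an initialized struct page for higher
 * memory. It can force initialization on demand; init_deferred_page() is
 * a no-op when deferred init is disabled or the page is already
 * initialized.
 */
static struct page *__init early_get_page(unsigned long pfn)
{
        init_deferred_page(pfn, early_pfn_to_nid(pfn));

        return pfn_to_page(pfn);
}
```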
During KHO restoration, `deserialize_bitmap()` writes metadata for each preserved memory region into its struct page. However, if the struct page has not been initialized yet, this write targets uninitialized memory, potentially leading to errors like:

```
BUG: unable to handle page fault for address: ...
```

Fix this by introducing `kho_get_preserved_page()`, which ensures all struct pages in a preserved region are initialized by calling `init_deferred_page()`, which is a no-op when deferred init is disabled or when the struct page is already initialized.

Signed-off-by: Evangelos Petrongonas <[email protected]>
Signed-off-by: Michal Clapinski <[email protected]>
---
v2: updated a comment

I think we can't initialize those struct pages in kho_restore_page.
I encountered this stack:

  page_zone(start_page)
  __pageblock_pfn_to_page
  set_zone_contiguous
  page_alloc_init_late

So, at the end of page_alloc_init_late, struct pages are expected to be already initialized. set_zone_contiguous() looks at the first and last struct page of each pageblock in each populated zone to figure out whether the zone is contiguous. If a KHO page lands on a pageblock boundary, this leads to an access of an uninitialized struct page.

There is also page_ext_init, which invokes pfn_to_nid, which in turn calls page_to_nid for each section-aligned page. There might be other places that do something similar. Therefore, it's a good idea to initialize all struct pages by the end of deferred struct page init. That's why I'm resending Evangelos's patch.

I also tried to implement Pratyush's idea, i.e. iterating over zones and then getting the node from the zone. I didn't notice any performance difference even with 8GB of KHO-preserved memory.

I repeated Evangelos's testing: in order to test the fix, I modified the KHO selftest to allocate more memory and to do so from higher memory to trigger the incompatibility. The branch with those changes can be found at:
https://git.infradead.org/?p=users/vpetrog/linux.git;a=shortlog;h=refs/heads/kho-deferred-struct-page-init
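To make the pageblock-boundary case concrete, here is a simplified sketch of the walk that set_zone_contiguous() effectively does once page_alloc_init_late() finishes (my reading of the stack above, not the actual mm/page_alloc.c code; `zone_contiguous_sketch()` is a made-up name):

```
/* Sketch only; __pageblock_pfn_to_page() lives in mm/page_alloc.c. */
#include <linux/mmzone.h>
#include <linux/minmax.h>

static bool zone_contiguous_sketch(struct zone *zone)
{
        unsigned long pfn = zone->zone_start_pfn;
        unsigned long end = zone_end_pfn(zone);

        for (; pfn < end; pfn += pageblock_nr_pages) {
                unsigned long block_end = min(pfn + pageblock_nr_pages, end);

                /*
                 * Looks at the first and last struct page of the pageblock
                 * (page_zone(start_page) in the stack above). If a
                 * KHO-preserved page on this boundary was never initialized,
                 * this is the faulting access.
                 */
                if (!__pageblock_pfn_to_page(pfn, block_end, zone))
                        return false;
        }

        return true;
}
```

This also matches the argument above for why deferring the initialization to kho_restore_page() is not enough: the walk dereferences the struct pages regardless of whether anyone has restored them yet.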
---
 kernel/liveupdate/Kconfig          |  2 --
 kernel/liveupdate/kexec_handover.c | 23 ++++++++++++++++++++++-
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/kernel/liveupdate/Kconfig b/kernel/liveupdate/Kconfig
index 1a8513f16ef7..c13af38ba23a 100644
--- a/kernel/liveupdate/Kconfig
+++ b/kernel/liveupdate/Kconfig
@@ -1,12 +1,10 @@
 # SPDX-License-Identifier: GPL-2.0-only
 
 menu "Live Update and Kexec HandOver"
-        depends on !DEFERRED_STRUCT_PAGE_INIT
 
 config KEXEC_HANDOVER
         bool "kexec handover"
         depends on ARCH_SUPPORTS_KEXEC_HANDOVER && ARCH_SUPPORTS_KEXEC_FILE
-        depends on !DEFERRED_STRUCT_PAGE_INIT
         select MEMBLOCK_KHO_SCRATCH
         select KEXEC_FILE
         select LIBFDT
diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
index b851b09a8e99..26bb45b25809 100644
--- a/kernel/liveupdate/kexec_handover.c
+++ b/kernel/liveupdate/kexec_handover.c
@@ -457,6 +457,27 @@ static int kho_mem_serialize(struct kho_out *kho_out)
         return err;
 }
 
+/*
+ * With CONFIG_DEFERRED_STRUCT_PAGE_INIT, struct pages in higher memory regions
+ * may not be initialized yet at the time KHO deserializes preserved memory.
+ * KHO uses the struct page to store metadata and a later initialization would
+ * overwrite it.
+ * Ensure all the struct pages in the preservation are
+ * initialized. deserialize_bitmap() marks the reservation as noinit to make
+ * sure they don't get re-initialized later.
+ */
+static struct page *__init kho_get_preserved_page(phys_addr_t phys,
+                                                  unsigned int order)
+{
+        unsigned long pfn = PHYS_PFN(phys);
+        int nid = early_pfn_to_nid(pfn);
+
+        for (int i = 0; i < (1 << order); i++)
+                init_deferred_page(pfn + i, nid);
+
+        return pfn_to_page(pfn);
+}
+
 static void __init deserialize_bitmap(unsigned int order,
                                       struct khoser_mem_bitmap_ptr *elm)
 {
@@ -467,7 +488,7 @@ static void __init deserialize_bitmap(unsigned int order,
                 int sz = 1 << (order + PAGE_SHIFT);
                 phys_addr_t phys =
                         elm->phys_start + (bit << (order + PAGE_SHIFT));
-                struct page *page = phys_to_page(phys);
+                struct page *page = kho_get_preserved_page(phys, order);
                 union kho_page_info info;
 
                 memblock_reserve(phys, sz);
-- 
2.53.0.rc2.204.g2597b5adb4-goog
