On Wed, Mar 26, 2025 at 4:59 AM Mike Rapoport <r...@kernel.org> wrote:
[...]
> > There has, for example, been some talk about making hugetlbfs
> > persistent. You could have hugetlb_cma active. The hugetlb CMA areas
> > are set up quite early, quite some time before KHO restores memory. So
> > that would have to be changed somehow if the location of the KHO init
> > call would remain as close as possible to buddy init. I
> > suspect there may be other uses.
>
> I think we can address this when/if implementing preservation for hugetlbfs
> and it will be tricky.
> If hugetlb in the first kernel uses a lot of memory, we just won't have
> enough scratch space for early hugetlb reservations in the second kernel
> regardless of hugetlb_cma. On the other hand, we already have the preserved
> hugetlbfs memory, so we'd probably need to reserve less memory in the
> second kernel.
>
> But anyway, it's a completely different discussion about how to preserve
> hugetlbfs.

Right, there would have to be a KHO interface to carry over the early
reserved memory and reinit it early too.

>
> > > > current requirement in the patch set seems to be "after sparse/page
> > > > init", but I'm not sure why it needs to be as close as possible to
> > > > buddy init.
> > >
> > > Why would you say that sparse/page init would be a requirement here?
> >
> > At least in its current form, the KHO code expects vmemmap to be
> > initialized, as it does its restore based on page structures, as
> > deserialize_bitmap expects them. I think the use of the page->private
> > field was discussed in a separate thread. If that is done
> > differently, it wouldn't rely on vmemmap being initialized.
>
> In the current form KHO does rely on vmemmap being allocated, but it does
> not rely on it being initialized. Marking memblock ranges NOINIT ensures
> nothing touches the corresponding struct pages and KHO can use their fields
> up to the point the memory is returned to KHO callers.
>
> > A few more things I've noticed (not sure if these were discussed before):
> >
> > * Should KHO depend on CONFIG_DEFERRED_STRUCT_PAGE_INIT? Essentially,
> > marking memblock ranges as NOINIT doesn't work without
> > DEFERRED_STRUCT_PAGE_INIT. Although, if the page->private use
> > disappears, this wouldn't be an issue anymore.
>
> It does.
> memmap_init_reserved_pages() is called always, no matter whether
> CONFIG_DEFERRED_STRUCT_PAGE_INIT is set or not, and it skips initialization
> of NOINIT regions.

Yeah, I see - the ordering makes this work out. MEMBLOCK_RSRV_NOINIT is
a bit confusing in the sense that if you do a memblock allocation in
the !CONFIG_DEFERRED_STRUCT_PAGE_INIT case, and that allocation is done
before free_area_init(), the pages will always get initialized
regardless, since memmap_init_range() will do it. But this is done
before the KHO deserialize, so it works out.

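To spell out the ordering argument, here is a toy userspace model in C.
It is only a sketch and not kernel code: every toy_* name, the structs,
the page counts and the "cookie" value are made up for illustration.
Running the reserved-pages pass after the deserialize step is also an
assumption on my part, not something stated above. The model only shows
why a blanket init is harmless if it runs before the deserialize, and
why the later pass has to skip NOINIT regions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 8

/* Toy stand-in for struct page: only the bits the argument needs. */
struct toy_page {
	bool initialized;
	unsigned long private;	/* field the deserialize step relies on */
};

/* Toy stand-in for a reserved memblock region, pages [start, end). */
struct toy_region {
	size_t start, end;
	bool noinit;		/* models MEMBLOCK_RSRV_NOINIT */
};

/* Models the blanket init: resets every page in the range, flags ignored. */
static void toy_memmap_init_range(struct toy_page *map, size_t start, size_t end)
{
	for (size_t i = start; i < end; i++) {
		map[i].initialized = true;
		map[i].private = 0;
	}
}

/* Models the reserved-pages pass: re-inits regions unless marked NOINIT. */
static void toy_memmap_init_reserved(struct toy_page *map,
				     const struct toy_region *regions, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (regions[i].noinit)
			continue;
		toy_memmap_init_range(map, regions[i].start, regions[i].end);
	}
}

/* Models the deserialize step stashing a value in the first page. */
static void toy_kho_deserialize(struct toy_page *map,
				const struct toy_region *r, unsigned long cookie)
{
	assert(r->start < r->end && r->end <= TOY_NPAGES);
	map[r->start].private = cookie;
}

/*
 * Runs one toy boot and returns what is left in ->private of the first
 * preserved page. blanket_init_first selects whether the blanket init
 * runs before or after the deserialize step.
 */
unsigned long toy_boot(bool blanket_init_first, bool noinit)
{
	struct toy_page map[TOY_NPAGES] = {0};
	const struct toy_region preserved = { .start = 2, .end = 6, .noinit = noinit };

	if (blanket_init_first)
		toy_memmap_init_range(map, 0, TOY_NPAGES);
	toy_kho_deserialize(map, &preserved, 2);
	if (!blanket_init_first)
		toy_memmap_init_range(map, 0, TOY_NPAGES);
	toy_memmap_init_reserved(map, &preserved, 1);

	return map[preserved.start].private;
}
```

In this model toy_boot(true, true) returns the stashed value 2, while
toy_boot(false, true) and toy_boot(true, false) both return 0: the value
is lost if the blanket init runs after the deserialize step, or if the
region is not marked NOINIT.
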
>
> > * As a future extension, it could be nice to store vmemmap init
> > information in the KHO FDT. Then you can use that to init ranges in an
> > optimized way (HVO hugetlb or DAX-style persisted ranges) straight
> > away.
>
> These days memmap contents are unstable because of the folio/memdesc
> project, but in general carrying memory map data from kernel to kernel is
> indeed something to consider.

Yes, I think we might have a need for that, but we'll see.

Thanks,

- Frank