Hi Muchun,
Muchun Song <[email protected]> wrote:
> mm/sparse: Move subsection_map_init() into sparse_init()
>
> This commit moves subsection_map_init() from free_area_init() into
> sparse_init() so that sparse-specific setup stays together instead of being
> split across the generic free_area_init() path.
This patch introduces a new `sparse_init_subsection_map()` that iterates
over all memblock ranges and calls `sparse_init_subsection_map_range()`:
> +void __init sparse_init_subsection_map(void)
> +{
> +	int i, nid;
> +	unsigned long start, end;
> +
> +	for_each_mem_pfn_range(i, MAX_NUMNODES, &start, &end, &nid)
> +		sparse_init_subsection_map_range(start, end - start);
However, earlier in `sparse_init()`, `memblocks_present()` calls
`memory_present()`, which internally caps PFN ranges at
`max_sparsemem_pfn` via `mminit_validate_memmodel_limits()`. Sections
beyond this cap never have `ms->usage` allocated.
`for_each_mem_pfn_range()` returns the raw, uncapped memblock ranges.
If a range extends beyond `max_sparsemem_pfn`, then inside
`sparse_init_subsection_map_range()`:
ms = __nr_to_section(nr);
subsection_mask_set(ms->usage->subsection_map, pfn, pfns);
`ms->usage` is NULL because `sparse_init_early_section()` was never
called for this section, causing a NULL pointer dereference.
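To make the mismatch concrete, here is a minimal user-space sketch. It is not kernel code: every `model_*` name and `MODEL_*` constant is hypothetical, the section size and cap are tiny made-up values, and `has_usage` merely stands in for `ms->usage != NULL`. One helper clamps a range to the cap before marking sections (roughly what `memory_present()` does), while the other walks the raw range the way the new loop does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy values, chosen only to keep the example small. */
#define MODEL_PAGES_PER_SECTION 8UL
#define MODEL_MAX_SPARSEMEM_PFN 32UL	/* stand-in for max_sparsemem_pfn */
#define MODEL_NR_SECTIONS 16UL

struct model_section {
	bool has_usage;			/* stand-in for ms->usage != NULL */
};

static struct model_section model_sections[MODEL_NR_SECTIONS];

/* Capped path: clamp [start, end) to the limit, then mark each section. */
static void model_memory_present(unsigned long start, unsigned long end)
{
	unsigned long pfn;

	if (start > MODEL_MAX_SPARSEMEM_PFN)
		start = MODEL_MAX_SPARSEMEM_PFN;
	if (end > MODEL_MAX_SPARSEMEM_PFN)
		end = MODEL_MAX_SPARSEMEM_PFN;

	/* Align down to a section boundary before walking. */
	start -= start % MODEL_PAGES_PER_SECTION;
	for (pfn = start; pfn < end; pfn += MODEL_PAGES_PER_SECTION)
		model_sections[pfn / MODEL_PAGES_PER_SECTION].has_usage = true;
}

/*
 * Uncapped path: walk the raw range (requires start < end) and count the
 * sections that were never marked. Each such section is one the real code
 * would dereference through a NULL ms->usage.
 */
static unsigned long model_count_unbacked_sections(unsigned long start,
						   unsigned long end)
{
	unsigned long first = start / MODEL_PAGES_PER_SECTION;
	unsigned long last = (end - 1) / MODEL_PAGES_PER_SECTION;
	unsigned long nr, count = 0;

	for (nr = first; nr <= last && nr < MODEL_NR_SECTIONS; nr++)
		if (!model_sections[nr].has_usage)
			count++;
	return count;
}
```

In this model, marking the range [0, 48) and then walking the same raw range finds two sections past the cap with no usage, whereas walking only [0, 32) finds none.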
I was able to reproduce this on x86_64 with 4-level paging by booting
with `memmap=4G@0x400080000000` to place a memblock range beyond the
~64 TiB `max_sparsemem_pfn` limit. The kernel crashes during early boot:
node -1: [mem 0x0000400080000000-0x000040017fffffff]
------------[ cut here ]------------
WARNING: mm/sparse.c:142 at sparse_init+0x1ac/0x8a0
...
PANIC: early exception 0x0d IP 10:...sparse_init_subsection_map+0x12f/0x250
RIP: 0010:sparse_init_subsection_map+0x12f/0x250
Call Trace:
sparse_init+0x69f/0x8a0
mm_core_init_early+0x12fa/0x20c0
start_kernel+0x89/0x4e0
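As a sanity check on the reproducer address, the sketch below compares a physical address against a 2^46-byte (64 TiB) limit. The constant and the helper name are illustrative only, and the value 46 is an assumption corresponding to x86_64 with 4-level paging.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 46 physical address bits (64 TiB), x86_64 4-level paging. */
#define ASSUMED_MAX_PHYSMEM_BITS 46

/* Hypothetical helper: true if addr lies at or above the assumed limit. */
static bool phys_addr_beyond_sparsemem_limit(uint64_t addr)
{
	return addr >= (UINT64_C(1) << ASSUMED_MAX_PHYSMEM_BITS);
}
```

Under that assumption, the start address 0x400080000000 used above is 2^46 + 2^31, so the whole 4G range sits just past the limit.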
The fix is a small NULL check in `sparse_init_subsection_map_range()`:
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -608,6 +608,8 @@ void __init sparse_init_subsection_map(unsigned long pfn,
 		pfns = min(nr_pages, PAGES_PER_SECTION
 				- (pfn & ~PAGE_SECTION_MASK));
 		ms = __nr_to_section(nr);
+		if (!ms->usage)
+			continue;
 		subsection_mask_set(ms->usage->subsection_map, pfn, pfns);
On most systems `max_sparsemem_pfn` is large enough that this is never
hit, but on 32-bit or PAE configurations where the limit is much lower,
the mismatch between `for_each_mem_pfn_range()` and
`mminit_validate_memmodel_limits()` can trigger with reasonable memory
sizes.
Thanks,
Xiao