On 03/12/18 at 08:04pm, Chao Fan wrote:
> On Mon, Mar 12, 2018 at 11:57:27AM +0100, Ingo Molnar wrote:
> >> > > ***Background:
> >> > > People reported that kaslr may randomly choose positions that
> >> > > are located in movable memory regions. This breaks the memory
> >> > > hotplug feature.
> >> > 
> >> > [...]
> >> > 
> >> > > ***Solutions:
> >> > > Introduce a new kernel parameter 'kaslr_boot_mem=nn@ss' to let users
> >> > > specify the memory regions where the kernel is allowed to randomize
> >> > > safely.
> >> > 
> >> > Manual solutions like that are pretty suboptimal to users, aren't they?
> >> > 
> >> > In what way does memory hotplug feature 'break'? Does it crash or
> >> > misbehave? Or simply does it not allow the movement of the affected
> >> > memory region, while still allowing the rest to be moved?
> >> 
> >> AFAICT, if the kernel is randomized into a movable memory region, the
> >> affected memory region cannot be hot added/removed since it holds kernel
> >> data. Surely, the system can still work, and the unaffected part can still
> >> be moved. Still, it causes a regression in memory hotplug.
> >> 
> >> Mainly, we parse the SRAT table to get the ranges of memory provided by
> >> hot-added memory devices in initmem_init(), and that is very late. During
> >> boot, we don't know them. Chao once posted patches to grab SRAT at the
> >> decompressing stage, but the code is very complicated and not elegant, and
> >> the ACPI maintainer NACKed that.
> Thanks for Ingo's suggestion and Baoquan's explanation.
> Yes, I once tried to dig out the SRAT table during boot in an early RFC PATCH:
> https://lkml.org/lkml/2017/9/3/77
> But the change was too large, so I made this patchset to avoid the bug with a
> small change that does not make the code look messy.

ACPI tables are not independent. To parse SRAT for hotplug memory
information, we first need the RSDP pointer, which points at the RSDT or
XSDT, and then we find SRAT from them. The RSDP is not at a fixed location;
there are several candidate positions, as can be seen in
acpi_find_root_pointer() in drivers/acpi/acpica/tbxfroot.c. After that we
have to iterate the RSDT/XSDT to search for SRAT. This code cannot be shared
between kaslr.c and drivers/acpi because the ACPI code has special handling,
so it would bloat the kaslr boot code. This is why both Rafael and I think it
might not be good to parse the ACPI SRAT table in kaslr boot code.
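
To make the chain concrete, here is a rough sketch of the lookup order
(RSDP, then XSDT, then SRAT). It is not the ACPICA code and not what any
posted patch does. It treats "physical memory" as a plain byte buffer, with
offsets standing in for physical addresses, skips all checksum validation
and the RSDT fallback, and assumes a little-endian host. The helper names
(rd32, rd64, find_rsdp, find_table, find_srat) are made up for illustration;
only the table signatures and field offsets come from the ACPI layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HDR_LEN        36  /* size of the common ACPI table header */
#define RSDP_LEN       36  /* size of an ACPI 2.0+ RSDP */
#define RSDP_XSDT_OFF  24  /* offset of the 64-bit XSDT address in the RSDP */

/* Unaligned reads; host byte order is assumed to be little-endian. */
static uint32_t rd32(const uint8_t *p) { uint32_t v; memcpy(&v, p, 4); return v; }
static uint64_t rd64(const uint8_t *p) { uint64_t v; memcpy(&v, p, 8); return v; }

/*
 * Scan [start, end) on 16-byte boundaries for the "RSD PTR " signature.
 * Returns the offset of the first match, or -1. No checksum is verified.
 */
static long find_rsdp(const uint8_t *mem, size_t start, size_t end)
{
	assert(start % 16 == 0);
	for (size_t off = start; off + RSDP_LEN <= end; off += 16)
		if (memcmp(mem + off, "RSD PTR ", 8) == 0)
			return (long)off;
	return -1;
}

/*
 * Walk the 64-bit entries of the XSDT at xsdt_off and return the offset of
 * the first table whose 4-byte signature equals sig, or -1.
 */
static long find_table(const uint8_t *mem, size_t mem_len, size_t xsdt_off,
		       const char *sig)
{
	if (xsdt_off + HDR_LEN > mem_len ||
	    memcmp(mem + xsdt_off, "XSDT", 4) != 0)
		return -1;

	uint32_t len = rd32(mem + xsdt_off + 4);   /* total table length */
	if (len < HDR_LEN || xsdt_off + len > mem_len)
		return -1;

	size_t n = (len - HDR_LEN) / 8;            /* number of entries */
	for (size_t i = 0; i < n; i++) {
		uint64_t addr = rd64(mem + xsdt_off + HDR_LEN + 8 * i);
		if (addr + 4 <= mem_len && memcmp(mem + addr, sig, 4) == 0)
			return (long)addr;
	}
	return -1;
}

/* RSDP -> XSDT -> SRAT, scanning [scan_start, scan_end) for the RSDP. */
static long find_srat(const uint8_t *mem, size_t mem_len,
		      size_t scan_start, size_t scan_end)
{
	if (scan_end > mem_len)
		return -1;

	long rsdp = find_rsdp(mem, scan_start, scan_end);
	if (rsdp < 0)
		return -1;

	uint64_t xsdt = rd64(mem + rsdp + RSDP_XSDT_OFF);
	if (xsdt >= mem_len)
		return -1;
	return find_table(mem, mem_len, (size_t)xsdt, "SRAT");
}
```

Even this stripped-down version needs three steps and its own bounds
checks. Real boot code would also have to handle the several RSDP candidate
locations and the RSDT case, and would then still have to parse the SRAT
entries themselves.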

> >
> >So there's apparently a mis-design here:
> >
> > - KASLR needs to be done very early on during bootup: - it's not realistic
> >   to expect KASLR to be done with a booted up kernel, because pointers to
> >   various KASLR-ed objects are already widely spread out in memory.
> >
> > - But for some unfathomable reason the memory hotplug attribute of memory
> >   regions is not part of the regular memory map but part of late-init ACPI
> >   data structures.
> >
> >The right solution would be _not_ to fudge the KASLR location, but to
> >provide the memory hotplug information to early code, preferably via the
> >primary memory map. KASLR can then make use of it and avoid those regions,
> >just like it avoids other memory regions already.
> >
> >In addition to that hardware makers (including virtualized hardware) should
> >also fix their systems to provide memory hotplug information to early code.

The hugepage allocation on a kvm guest is a different situation. If people
want to allocate n pages of 1G size, they will occasionally get one page
fewer on a kaslr-enabled kernel than on a kaslr-disabled kernel, because the
kernel might be randomized into one of those 1G-aligned huge pages. In the
no-kaslr case, the kernel is put at 16M.

default_hugepagesz=1G hugepagesz=1G hugepages='n'

For this issue, the only option would be an algorithm that analyzes the
kernel cmdline and makes a flexible estimate to avoid those 1G-aligned huge
pages. Even then, we can't avoid the case where memblock breaks up a good 1G
page. I can't think of a good way to fix this in kaslr boot code.
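
The "one page fewer" arithmetic can be sketched as below. This is only an
illustration, not kernel code: free_1g_pages() is a hypothetical helper, and
the region and kernel addresses used with it are made-up numbers. It counts
the 1G-aligned, 1G-sized pages inside a RAM region [start, end) that do not
overlap the kernel image at [kstart, kend).

```c
#include <assert.h>
#include <stdint.h>

#define SZ_1G (1ULL << 30)
#define SZ_1M (1ULL << 20)

/*
 * Count 1G-aligned, 1G-sized pages fully inside [start, end) that do not
 * overlap the kernel image at [kstart, kend).
 */
static uint64_t free_1g_pages(uint64_t start, uint64_t end,
			      uint64_t kstart, uint64_t kend)
{
	assert(start <= end && kstart <= kend);

	/* Round start up to the next 1G boundary. */
	uint64_t first = (start + SZ_1G - 1) & ~(SZ_1G - 1);
	uint64_t count = 0;

	for (uint64_t p = first; p + SZ_1G <= end; p += SZ_1G) {
		/* The page survives only if the kernel range misses it. */
		if (kend <= p || kstart >= p + SZ_1G)
			count++;
	}
	return count;
}
```

With an assumed region of [1G, 5G) and a 64M kernel image, a kernel at 16M
leaves all four 1G pages available, while a kernel randomized to 2G+512M
leaves only three. A kernel that happens to straddle a 1G boundary would
cost two pages.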


> >
> >
