Could the unusual memory config be due to one empty DIMM slot on my motherboard? I have 9 slots, but only 8 are filled (8 x 16G). The 6th slot on the motherboard is empty -- which is a valid config according to the manual.
On Fri, Jan 25, 2019 at 11:39 AM Michal Hocko <mho...@kernel.org> wrote:
>
> On Fri 25-01-19 11:16:30, robert shteynfeld wrote:
> > Attached is the dmesg from patched kernel.
>
> Your Node1 physical memory range precedes Node0 which is quite unusual
> but it shouldn't be a huge problem on its own. But memory ranges are
> not aligned to the memory section
>
> [ 0.286954] Early memory node ranges
> [ 0.286955] node 1: [mem 0x0000000000001000-0x0000000000090fff]
> [ 0.286955] node 1: [mem 0x0000000000100000-0x00000000dbdf8fff]
> [ 0.286956] node 1: [mem 0x0000000100000000-0x0000001423ffffff]
> [ 0.286956] node 0: [mem 0x0000001424000000-0x0000002023ffffff]
>
> As you can see the last pfn for the node1 is inside the section and
> Node0 starts right after. This is quite unusual as well. If for no other
> reasons then the memmap of those struct pages will be remote for one or
> the other. Actually I am not even sure we can handle that properly
> because we do expect 1:1 mapping between sections and nodes.
>
> Now it also makes some sense why 2830bf6f05fb ("mm, memory_hotplug:
> initialize struct pages for the full memory section") made any
> difference. We simply write over a potentially initialized struct page
> and blow up on that. I strongly suspect that the commit just uncovered
> a pre-existing problem. Let me think what we can do about that.
>
> > I'm not an expert at debugging the kernel, obviously. I tried setting
> > up a serial console before without much luck as part of this debugging
> > session.
>
> Ubuntu has a nice howto for netconsole configuration
> https://wiki.ubuntu.com/Kernel/Netconsole. It is quite important to get
> the actual failure.
> --
> Michal Hocko
> SUSE Labs
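
For anyone following along, the section-alignment point can be checked with a bit of arithmetic on the ranges quoted above. This is only a rough sketch, not kernel code. It assumes an x86_64 SPARSEMEM kernel, where SECTION_SIZE_BITS is 27 (128 MiB sections) and pages are 4 KiB; neither value is stated in the thread. The helper names section_of and shared_sections are made up for illustration.

```python
# Assumptions (not stated in the thread): x86_64 with SPARSEMEM, so
# SECTION_SIZE_BITS = 27 (128 MiB memory sections) and 4 KiB pages.
SECTION_SIZE_BITS = 27
PAGE_SHIFT = 12
SECTION_SIZE = 1 << SECTION_SIZE_BITS

# (node, start, end) copied from the "Early memory node ranges" dmesg
# lines; the end address is inclusive.
RANGES = [
    (1, 0x0000000000001000, 0x0000000000090fff),
    (1, 0x0000000000100000, 0x00000000dbdf8fff),
    (1, 0x0000000100000000, 0x0000001423ffffff),
    (0, 0x0000001424000000, 0x0000002023ffffff),
]


def section_of(addr):
    """Hypothetical helper: index of the memory section containing addr."""
    return addr >> SECTION_SIZE_BITS


def shared_sections(ranges):
    """Hypothetical helper: map each section touched by more than one
    node to the sorted list of nodes that touch it."""
    owners = {}
    for node, start, end in ranges:
        for sec in range(section_of(start), section_of(end) + 1):
            owners.setdefault(sec, set()).add(node)
    return {sec: sorted(nodes) for sec, nodes in owners.items() if len(nodes) > 1}


# First address of node 0, i.e. one byte past the last node 1 range.
boundary = 0x0000001424000000
offset_in_section = boundary & (SECTION_SIZE - 1)
shared = shared_sections(RANGES)

print("node boundary is in section", section_of(boundary))
print("offset into that section: %d MiB (%d pages)"
      % (offset_in_section >> 20, offset_in_section >> PAGE_SHIFT))
print("sections touched by more than one node:", shared)
```

Under those assumptions the node 1 / node 0 boundary lands 64 MiB into a 128 MiB section, so that one section is split between both nodes, which matches the observation that the last pfn of node 1 is inside a section and node 0 starts right after it.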