Re: [PATCH v2] arm64: kernel: numa: fix ACPI boot cpu numa node mapping
On 2016/10/18 16:39, Hanjun Guo wrote:
> On 2016/10/17 22:56, Lorenzo Pieralisi wrote:
>> Commit 7ba5f605f3a0 ("arm64/numa: remove the limitation that cpu0 must
>> bind to node0") removed the numa cpu<->node mapping restriction whereby
>> logical cpu 0 always corresponds to numa node 0; removing the
>> restriction was correct, in that it does not really exist in practice
>> but the commit only updated the early mapping of logical cpu 0 to its
>> real numa node for the DT boot path, missing the ACPI one, leading to
>> boot failures on ACPI systems owing to missing cpu<->node map for
>> logical cpu 0.
>>
>> Fix the issue by updating the ACPI boot path with code that carries out
>> the early cpu<->node mapping also for the boot cpu (ie cpu 0), mirroring
>> what is currently done in the DT boot path.
>>
>> Fixes: 7ba5f605f3a0 ("arm64/numa: remove the limitation that cpu0 must bind
>> to node0")
>> Signed-off-by: Lorenzo Pieralisi
>> Tested-by: Laszlo Ersek
>> Reported-by: Laszlo Ersek
>> Cc: Will Deacon
>> Cc: Laszlo Ersek
>> Cc: Hanjun Guo
>
> Thanks for the quick response and fix,
>
> Acked-by: Hanjun Guo
>
> By the way, I got another boot failure [1] on a multi-node NUMA
> system with some memory-less nodes (only one node has memory).
> We are looking into it now; this patch needs to be merged first.

You should apply my numa MEMORYLESS patches first, because the two
patches have not been upstreamed yet. I just tested it based on 4.9-rc1
for DT numa, and it worked well. I will contact you to check what is
wrong on ACPI numa.

>
> Thanks
> Hanjun
>
> [1]: boot failure log:
> [0.00] NUMA: Adding memblock [0x0 - 0x3fff] on node 0
> [0.00] ACPI: SRAT: Node 0 PXM 0 [mem 0x-0x3fff]
> [0.00] NUMA: Adding memblock [0x14 - 0x17] on node 1
> [0.00] ACPI: SRAT: Node 1 PXM 1 [mem 0x14-0x17]
> [0.00] NUMA: Adding memblock [0x10 - 0x13] on node 0
> [0.00] ACPI: SRAT: Node 0 PXM 0 [mem 0x10-0x13]
> [0.00] NUMA: Initmem setup node 0 [mem 0x-0x13fbff]
> [0.00] NUMA: NODE_DATA [mem 0x13fbffe500-0x13fbff]
> [0.00] NUMA: Initmem setup node 1 [mem 0x14-0x17fbff]
> [0.00] NUMA: NODE_DATA [mem 0x17fbfec500-0x17fbfedfff]
> [0.00] NUMA: Initmem setup node 2 [mem 0x-0x]
> [0.00] NUMA: NODE_DATA [mem 0x17fbfeaa00-0x17fbfec4ff]
> [0.00] NUMA: NODE_DATA(2) on node 1
> [0.00] NUMA: Initmem setup node 3 [mem 0x-0x]
> [0.00] NUMA: NODE_DATA [mem 0x17fbfe8f00-0x17fbfea9ff]
> [0.00] NUMA: NODE_DATA(3) on node 1
> [0.00] Zone ranges:
> [0.00] DMA [mem 0x-0x]
> [0.00] Normal [mem 0x0001-0x0017fbff]
> [0.00] Movable zone start for each node
> [0.00] Early memory node ranges
> [0.00] node 0: [mem 0x-0x00024fff]
> [0.00] node 0: [mem 0x00026000-0x319d]
> [0.00] node 0: [mem 0x319e-0x31a4]
> [0.00] node 0: [mem 0x31a5-0x31b2]
> [0.00] node 0: [mem 0x31b3-0x31b3]
> [0.00] node 0: [mem 0x31b4-0x39ba]
> [0.00] node 0: [mem 0x39bb-0x3a143fff]
> [0.00] node 0: [mem 0x3a144000-0x3f12]
> [0.00] node 0: [mem 0x3f13-0x3f15]
> [0.00] node 0: [mem 0x3f16-0x3fbf]
> [0.00] node 0: [mem 0x00104000-0x0013fbff]
> [0.00] node 1: [mem 0x0014-0x0017fbff]
> [0.00] Initmem setup node 0 [mem
> 0x-0x0013fbff]
> [0.00] Initmem setup node 1 [mem
> 0x0014-0x0017fbff]
> [0.00] Could not find start_pfn for node 2
> [0.00] Initmem setup node 2 [mem
> 0x-0x]
> [0.00] Could not find start_pfn for node 3
> [0.00] Initmem setup node 3 [mem
> 0x-0x]
> [0.00] psci: probing for conduit method from ACPI.
> [0.00] [ cut here ]
> [0.00] kernel BUG at mm/percpu.c:1916!
> [0.00] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> [0.00] Modules linked in:
> [0.00] CPU: 0 PID: 0 Comm: swapper Not tainted
> 4.9.0-rc1-00083-g3dd62e5 #680
> [0.00] Hardware name: Hisilicon Hi1616 Evaluation Board (DT)
> [0.00] task: 08d5e980 task.stack: 08d5
> [0.00] PC is at pcpu_embed_first_chunk+0x464/0x754
> [0.00] LR is at pcpu_embed_first_chunk+0x3f8/0x754
> [0.00] pc : [] lr : [] pstate:
> 20c5
> [0.00]

Re: [PATCH v2] arm64: kernel: numa: fix ACPI boot cpu numa node mapping
On 2016/10/17 22:56, Lorenzo Pieralisi wrote:
> Commit 7ba5f605f3a0 ("arm64/numa: remove the limitation that cpu0 must
> bind to node0") removed the numa cpu<->node mapping restriction whereby
> logical cpu 0 always corresponds to numa node 0; removing the
> restriction was correct, in that it does not really exist in practice
> but the commit only updated the early mapping of logical cpu 0 to its
> real numa node for the DT boot path, missing the ACPI one, leading to
> boot failures on ACPI systems owing to missing cpu<->node map for
> logical cpu 0.
>
> Fix the issue by updating the ACPI boot path with code that carries out
> the early cpu<->node mapping also for the boot cpu (ie cpu 0), mirroring
> what is currently done in the DT boot path.
>
> Fixes: 7ba5f605f3a0 ("arm64/numa: remove the limitation that cpu0 must bind
> to node0")
> Signed-off-by: Lorenzo Pieralisi
> Tested-by: Laszlo Ersek
> Reported-by: Laszlo Ersek
> Cc: Will Deacon
> Cc: Laszlo Ersek
> Cc: Hanjun Guo

Thanks for the quick response and fix,

Acked-by: Hanjun Guo

By the way, I got another boot failure [1] on a multi-node NUMA
system with some memory-less nodes (only one node has memory).
We are looking into it now; this patch needs to be merged first.

Thanks
Hanjun

[1]: boot failure log:
[0.00] NUMA: Adding memblock [0x0 - 0x3fff] on node 0
[0.00] ACPI: SRAT: Node 0 PXM 0 [mem 0x-0x3fff]
[0.00] NUMA: Adding memblock [0x14 - 0x17] on node 1
[0.00] ACPI: SRAT: Node 1 PXM 1 [mem 0x14-0x17]
[0.00] NUMA: Adding memblock [0x10 - 0x13] on node 0
[0.00] ACPI: SRAT: Node 0 PXM 0 [mem 0x10-0x13]
[0.00] NUMA: Initmem setup node 0 [mem 0x-0x13fbff]
[0.00] NUMA: NODE_DATA [mem 0x13fbffe500-0x13fbff]
[0.00] NUMA: Initmem setup node 1 [mem 0x14-0x17fbff]
[0.00] NUMA: NODE_DATA [mem 0x17fbfec500-0x17fbfedfff]
[0.00] NUMA: Initmem setup node 2 [mem 0x-0x]
[0.00] NUMA: NODE_DATA [mem 0x17fbfeaa00-0x17fbfec4ff]
[0.00] NUMA: NODE_DATA(2) on node 1
[0.00] NUMA: Initmem setup node 3 [mem 0x-0x]
[0.00] NUMA: NODE_DATA [mem 0x17fbfe8f00-0x17fbfea9ff]
[0.00] NUMA: NODE_DATA(3) on node 1
[0.00] Zone ranges:
[0.00] DMA [mem 0x-0x]
[0.00] Normal [mem 0x0001-0x0017fbff]
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00] node 0: [mem 0x-0x00024fff]
[0.00] node 0: [mem 0x00026000-0x319d]
[0.00] node 0: [mem 0x319e-0x31a4]
[0.00] node 0: [mem 0x31a5-0x31b2]
[0.00] node 0: [mem 0x31b3-0x31b3]
[0.00] node 0: [mem 0x31b4-0x39ba]
[0.00] node 0: [mem 0x39bb-0x3a143fff]
[0.00] node 0: [mem 0x3a144000-0x3f12]
[0.00] node 0: [mem 0x3f13-0x3f15]
[0.00] node 0: [mem 0x3f16-0x3fbf]
[0.00] node 0: [mem 0x00104000-0x0013fbff]
[0.00] node 1: [mem 0x0014-0x0017fbff]
[0.00] Initmem setup node 0 [mem 0x-0x0013fbff]
[0.00] Initmem setup node 1 [mem 0x0014-0x0017fbff]
[0.00] Could not find start_pfn for node 2
[0.00] Initmem setup node 2 [mem 0x-0x]
[0.00] Could not find start_pfn for node 3
[0.00] Initmem setup node 3 [mem 0x-0x]
[0.00] psci: probing for conduit method from ACPI.
[0.00] [ cut here ]
[0.00] kernel BUG at mm/percpu.c:1916!
[0.00] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[0.00] Modules linked in:
[0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0-rc1-00083-g3dd62e5 #680
[0.00] Hardware name: Hisilicon Hi1616 Evaluation Board (DT)
[0.00] task: 08d5e980 task.stack: 08d5
[0.00] PC is at pcpu_embed_first_chunk+0x464/0x754
[0.00] LR is at pcpu_embed_first_chunk+0x3f8/0x754
[0.00] pc : [] lr : [] pstate: 20c5
[0.00] sp : 08d53e90
[0.00] x29: 08d53e90
[0.00] x28:
[0.00]
[0.00] x27: 08d55e50
[0.00] x26: 0042
[0.00]
[0.00] x25: 08d55d28
[0.00] x24: 0046
[0.00]
[0.00] x23: 0040
[0.00] x22: 8017fbfcff00
[0.00]
[0.00] x21: 08ca6e20
[0.00] x20: 8017fbfd0518
[0

[PATCH v2] arm64: kernel: numa: fix ACPI boot cpu numa node mapping
Commit 7ba5f605f3a0 ("arm64/numa: remove the limitation that cpu0 must
bind to node0") removed the numa cpu<->node mapping restriction whereby
logical cpu 0 always corresponds to numa node 0; removing the
restriction was correct, in that it does not really exist in practice
but the commit only updated the early mapping of logical cpu 0 to its
real numa node for the DT boot path, missing the ACPI one, leading to
boot failures on ACPI systems owing to missing cpu<->node map for
logical cpu 0.

Fix the issue by updating the ACPI boot path with code that carries out
the early cpu<->node mapping also for the boot cpu (ie cpu 0), mirroring
what is currently done in the DT boot path.

Fixes: 7ba5f605f3a0 ("arm64/numa: remove the limitation that cpu0 must bind to node0")
Signed-off-by: Lorenzo Pieralisi
Tested-by: Laszlo Ersek
Reported-by: Laszlo Ersek
Cc: Will Deacon
Cc: Laszlo Ersek
Cc: Hanjun Guo
Cc: Andrew Jones
Cc: Zhen Lei
Cc: Catalin Marinas
---
v1 -> v2

- Updated commit log to reflect boot failures set-ups

 arch/arm64/kernel/smp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index d3f151c..8507703 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -544,6 +544,7 @@ acpi_map_gic_cpu_interface(struct acpi_madt_generic_interrupt *processor)
 			return;
 		}
 		bootcpu_valid = true;
+		early_map_cpu_to_node(0, acpi_numa_get_nid(0, hwid));
 		return;
 	}
 
-- 
2.10.0