Re: [PATCH v2] arm64: kernel: numa: fix ACPI boot cpu numa node mapping

2016-10-18 Thread Leizhen (ThunderTown)


On 2016/10/18 16:39, Hanjun Guo wrote:
> On 2016/10/17 22:56, Lorenzo Pieralisi wrote:
>> Commit 7ba5f605f3a0 ("arm64/numa: remove the limitation that cpu0 must
>> bind to node0") removed the numa cpu<->node mapping restriction whereby
>> logical cpu 0 always corresponds to numa node 0; removing the
>> restriction was correct, in that it does not really exist in practice
>> but the commit only updated the early mapping of logical cpu 0 to its
>> real numa node for the DT boot path, missing the ACPI one, leading to
>> boot failures on ACPI systems owing to missing cpu<->node map for
>> logical cpu 0.
>>
>> Fix the issue by updating the ACPI boot path with code that carries out
>> the early cpu<->node mapping also for the boot cpu (ie cpu 0), mirroring
>> what is currently done in the DT boot path.
>>
>> Fixes: 7ba5f605f3a0 ("arm64/numa: remove the limitation that cpu0 must bind to node0")
>> Signed-off-by: Lorenzo Pieralisi 
>> Tested-by: Laszlo Ersek 
>> Reported-by: Laszlo Ersek 
>> Cc: Will Deacon 
>> Cc: Laszlo Ersek 
>> Cc: Hanjun Guo 
> 
> Thanks for the quick response and fix,
> 
> Acked-by: Hanjun Guo 
> 
> By the way, I got another boot failure [1] on a multi-node NUMA
> system with some memory-less nodes (only one node has memory).
> We are looking into it now; this patch needs to be merged
> first.
You should apply my NUMA MEMORYLESS patches first, because the two patches have
not been upstreamed yet.
I just tested it based on 4.9-rc1 for DT NUMA, and it worked well. I will contact
you to check what's wrong on ACPI NUMA.

> 
> Thanks
> Hanjun
> 
> [1]: boot failure log:
> [0.00] NUMA: Adding memblock [0x0 - 0x3fff] on node 0
> [0.00] ACPI: SRAT: Node 0 PXM 0 [mem 0x-0x3fff]
> [0.00] NUMA: Adding memblock [0x14 - 0x17] on node 1
> [0.00] ACPI: SRAT: Node 1 PXM 1 [mem 0x14-0x17]
> [0.00] NUMA: Adding memblock [0x10 - 0x13] on node 0
> [0.00] ACPI: SRAT: Node 0 PXM 0 [mem 0x10-0x13]
> [0.00] NUMA: Initmem setup node 0 [mem 0x-0x13fbff]
> [0.00] NUMA: NODE_DATA [mem 0x13fbffe500-0x13fbff]
> [0.00] NUMA: Initmem setup node 1 [mem 0x14-0x17fbff]
> [0.00] NUMA: NODE_DATA [mem 0x17fbfec500-0x17fbfedfff]
> [0.00] NUMA: Initmem setup node 2 [mem 0x-0x]
> [0.00] NUMA: NODE_DATA [mem 0x17fbfeaa00-0x17fbfec4ff]
> [0.00] NUMA: NODE_DATA(2) on node 1
> [0.00] NUMA: Initmem setup node 3 [mem 0x-0x]
> [0.00] NUMA: NODE_DATA [mem 0x17fbfe8f00-0x17fbfea9ff]
> [0.00] NUMA: NODE_DATA(3) on node 1
> [0.00] Zone ranges:
> [0.00]   DMA  [mem 0x-0x]
> [0.00]   Normal   [mem 0x0001-0x0017fbff]
> [0.00] Movable zone start for each node
> [0.00] Early memory node ranges
> [0.00]   node   0: [mem 0x-0x00024fff]
> [0.00]   node   0: [mem 0x00026000-0x319d]
> [0.00]   node   0: [mem 0x319e-0x31a4]
> [0.00]   node   0: [mem 0x31a5-0x31b2]
> [0.00]   node   0: [mem 0x31b3-0x31b3]
> [0.00]   node   0: [mem 0x31b4-0x39ba]
> [0.00]   node   0: [mem 0x39bb-0x3a143fff]
> [0.00]   node   0: [mem 0x3a144000-0x3f12]
> [0.00]   node   0: [mem 0x3f13-0x3f15]
> [0.00]   node   0: [mem 0x3f16-0x3fbf]
> [0.00]   node   0: [mem 0x00104000-0x0013fbff]
> [0.00]   node   1: [mem 0x0014-0x0017fbff]
> [0.00] Initmem setup node 0 [mem 0x-0x0013fbff]
> [0.00] Initmem setup node 1 [mem 0x0014-0x0017fbff]
> [0.00] Could not find start_pfn for node 2
> [0.00] Initmem setup node 2 [mem 0x-0x]
> [0.00] Could not find start_pfn for node 3
> [0.00] Initmem setup node 3 [mem 0x-0x]
> [0.00] psci: probing for conduit method from ACPI.
> [0.00] [ cut here ]
> [0.00] kernel BUG at mm/percpu.c:1916!
> [0.00] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> [0.00] Modules linked in:
> [0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0-rc1-00083-g3dd62e5 #680
> [0.00] Hardware name: Hisilicon Hi1616 Evaluation Board (DT)
> [0.00] task: 08d5e980 task.stack: 08d5
> [0.00] PC is at pcpu_embed_first_chunk+0x464/0x754
> [0.00] LR is at pcpu_embed_first_chunk+0x3f8/0x754
> [0.00] pc : [] lr : [] pstate: 20c5
> [0.00]

Re: [PATCH v2] arm64: kernel: numa: fix ACPI boot cpu numa node mapping

2016-10-18 Thread Hanjun Guo

On 2016/10/17 22:56, Lorenzo Pieralisi wrote:

Commit 7ba5f605f3a0 ("arm64/numa: remove the limitation that cpu0 must
bind to node0") removed the numa cpu<->node mapping restriction whereby
logical cpu 0 always corresponds to numa node 0; removing the
restriction was correct, in that it does not really exist in practice
but the commit only updated the early mapping of logical cpu 0 to its
real numa node for the DT boot path, missing the ACPI one, leading to
boot failures on ACPI systems owing to missing cpu<->node map for
logical cpu 0.

Fix the issue by updating the ACPI boot path with code that carries out
the early cpu<->node mapping also for the boot cpu (ie cpu 0), mirroring
what is currently done in the DT boot path.

Fixes: 7ba5f605f3a0 ("arm64/numa: remove the limitation that cpu0 must bind to node0")
Signed-off-by: Lorenzo Pieralisi 
Tested-by: Laszlo Ersek 
Reported-by: Laszlo Ersek 
Cc: Will Deacon 
Cc: Laszlo Ersek 
Cc: Hanjun Guo 


Thanks for the quick response and fix,

Acked-by: Hanjun Guo 

By the way, I got another boot failure [1] on a multi-node NUMA
system with some memory-less nodes (only one node has memory).
We are looking into it now; this patch needs to be merged
first.

Thanks
Hanjun

[1]: boot failure log:
[0.00] NUMA: Adding memblock [0x0 - 0x3fff] on node 0
[0.00] ACPI: SRAT: Node 0 PXM 0 [mem 0x-0x3fff]
[0.00] NUMA: Adding memblock [0x14 - 0x17] on node 1
[0.00] ACPI: SRAT: Node 1 PXM 1 [mem 0x14-0x17]
[0.00] NUMA: Adding memblock [0x10 - 0x13] on node 0
[0.00] ACPI: SRAT: Node 0 PXM 0 [mem 0x10-0x13]
[0.00] NUMA: Initmem setup node 0 [mem 0x-0x13fbff]
[0.00] NUMA: NODE_DATA [mem 0x13fbffe500-0x13fbff]
[0.00] NUMA: Initmem setup node 1 [mem 0x14-0x17fbff]
[0.00] NUMA: NODE_DATA [mem 0x17fbfec500-0x17fbfedfff]
[0.00] NUMA: Initmem setup node 2 [mem 0x-0x]
[0.00] NUMA: NODE_DATA [mem 0x17fbfeaa00-0x17fbfec4ff]
[0.00] NUMA: NODE_DATA(2) on node 1
[0.00] NUMA: Initmem setup node 3 [mem 0x-0x]
[0.00] NUMA: NODE_DATA [mem 0x17fbfe8f00-0x17fbfea9ff]
[0.00] NUMA: NODE_DATA(3) on node 1
[0.00] Zone ranges:
[0.00]   DMA  [mem 0x-0x]
[0.00]   Normal   [mem 0x0001-0x0017fbff]
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]   node   0: [mem 0x-0x00024fff]
[0.00]   node   0: [mem 0x00026000-0x319d]
[0.00]   node   0: [mem 0x319e-0x31a4]
[0.00]   node   0: [mem 0x31a5-0x31b2]
[0.00]   node   0: [mem 0x31b3-0x31b3]
[0.00]   node   0: [mem 0x31b4-0x39ba]
[0.00]   node   0: [mem 0x39bb-0x3a143fff]
[0.00]   node   0: [mem 0x3a144000-0x3f12]
[0.00]   node   0: [mem 0x3f13-0x3f15]
[0.00]   node   0: [mem 0x3f16-0x3fbf]
[0.00]   node   0: [mem 0x00104000-0x0013fbff]
[0.00]   node   1: [mem 0x0014-0x0017fbff]
[0.00] Initmem setup node 0 [mem 0x-0x0013fbff]
[0.00] Initmem setup node 1 [mem 0x0014-0x0017fbff]
[0.00] Could not find start_pfn for node 2
[0.00] Initmem setup node 2 [mem 0x-0x]
[0.00] Could not find start_pfn for node 3
[0.00] Initmem setup node 3 [mem 0x-0x]
[0.00] psci: probing for conduit method from ACPI.
[0.00] [ cut here ]
[0.00] kernel BUG at mm/percpu.c:1916!
[0.00] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[0.00] Modules linked in:
[0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0-rc1-00083-g3dd62e5 #680
[0.00] Hardware name: Hisilicon Hi1616 Evaluation Board (DT)
[0.00] task: 08d5e980 task.stack: 08d5
[0.00] PC is at pcpu_embed_first_chunk+0x464/0x754
[0.00] LR is at pcpu_embed_first_chunk+0x3f8/0x754
[0.00] pc : [] lr : [] pstate: 20c5
[0.00] sp : 08d53e90
[0.00] x29: 08d53e90 [0.00] x28: 
[0.00]
[0.00] x27: 08d55e50 [0.00] x26: 0042
[0.00]
[0.00] x25: 08d55d28 [0.00] x24: 0046
[0.00]
[0.00] x23: 0040 [0.00] x22: 8017fbfcff00
[0.00]
[0.00] x21: 08ca6e20 [0.00] x20: 8017fbfd0518
[0

[PATCH v2] arm64: kernel: numa: fix ACPI boot cpu numa node mapping

2016-10-17 Thread Lorenzo Pieralisi
Commit 7ba5f605f3a0 ("arm64/numa: remove the limitation that cpu0 must
bind to node0") removed the numa cpu<->node mapping restriction whereby
logical cpu 0 always corresponds to numa node 0; removing the
restriction was correct, in that it does not really exist in practice
but the commit only updated the early mapping of logical cpu 0 to its
real numa node for the DT boot path, missing the ACPI one, leading to
boot failures on ACPI systems owing to missing cpu<->node map for
logical cpu 0.

Fix the issue by updating the ACPI boot path with code that carries out
the early cpu<->node mapping also for the boot cpu (ie cpu 0), mirroring
what is currently done in the DT boot path.

Fixes: 7ba5f605f3a0 ("arm64/numa: remove the limitation that cpu0 must bind to node0")
Signed-off-by: Lorenzo Pieralisi 
Tested-by: Laszlo Ersek 
Reported-by: Laszlo Ersek 
Cc: Will Deacon 
Cc: Laszlo Ersek 
Cc: Hanjun Guo 
Cc: Andrew Jones 
Cc: Zhen Lei 
Cc: Catalin Marinas 
---
v1 -> v2
- Updated commit log to reflect boot failures set-ups
 
 arch/arm64/kernel/smp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index d3f151c..8507703 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -544,6 +544,7 @@ acpi_map_gic_cpu_interface(struct acpi_madt_generic_interrupt *processor)
 			return;
 		}
 		bootcpu_valid = true;
+		early_map_cpu_to_node(0, acpi_numa_get_nid(0, hwid));
 		return;
 	}
 
-- 
2.10.0
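
For readers who want to see the effect of the one-line change outside the kernel, the following is a minimal user-space sketch, not the kernel implementation. Every name prefixed with sketch_ or SKETCH_ is hypothetical, the table entries stand in for firmware-provided CPU entries, and the fallback to node 0 for a negative node id is an assumption made for the illustration. It models a table walk in which the boot cpu entry is recognised by its hardware id and, depending on a flag, is either skipped (the behaviour described as broken above) or mapped to its node like every other cpu.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 4
#define SKETCH_NO_NODE (-1)

/* Hypothetical stand-in for one firmware CPU entry and its NUMA node. */
struct sketch_cpu_entry {
	uint64_t hwid; /* hardware id of the cpu */
	int nid;       /* NUMA node the firmware reports for this cpu */
};

/* Logical cpu -> NUMA node map; SKETCH_NO_NODE means "never mapped". */
static int sketch_cpu_to_node[SKETCH_NR_CPUS];

/* Record the node of a logical cpu (assumption: invalid nodes fall back to 0). */
static void sketch_early_map_cpu_to_node(unsigned int cpu, int nid)
{
	assert(cpu < SKETCH_NR_CPUS);
	if (nid < 0)
		nid = 0;
	sketch_cpu_to_node[cpu] = nid;
}

/*
 * Walk the entries. The boot cpu is always logical cpu 0; secondary cpus
 * get logical ids in table order. With map_boot_cpu == false the boot cpu
 * entry is only recognised, never mapped. Returns the number of logical cpus.
 */
static unsigned int sketch_parse_entries(const struct sketch_cpu_entry *e,
					 unsigned int n, uint64_t boot_hwid,
					 bool map_boot_cpu)
{
	unsigned int cpu_count = 1; /* slot 0 is reserved for the boot cpu */
	unsigned int i;

	for (i = 0; i < SKETCH_NR_CPUS; i++)
		sketch_cpu_to_node[i] = SKETCH_NO_NODE;

	for (i = 0; i < n; i++) {
		if (e[i].hwid == boot_hwid) {
			if (map_boot_cpu)
				sketch_early_map_cpu_to_node(0, e[i].nid);
			continue;
		}
		if (cpu_count >= SKETCH_NR_CPUS)
			continue; /* no room for more logical cpus */
		sketch_early_map_cpu_to_node(cpu_count, e[i].nid);
		cpu_count++;
	}
	return cpu_count;
}
```

Calling sketch_parse_entries() with map_boot_cpu set to false leaves sketch_cpu_to_node[0] at SKETCH_NO_NODE whenever the boot cpu appears in the table, which is the missing cpu<->node map the commit log refers to; with the flag set to true, logical cpu 0 receives its node in the same pass as the secondary cpus.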