Re: [BUG] PowerNV crash with 4.4.0-rc8 at sched_init_numa (related to commit c118baf80256)

2016-01-15 Thread Raghavendra K T
* Jan Stancek  [2016-01-09 18:03:55]:

> Hi,
> 
> I'm seeing bare metal ppc64le system crashing early during boot
> with latest upstream kernel (4.4.0-rc8):
> 
> # git describe
> v4.4-rc8-96-g751e5f5
> 
> [0.625451] Unable to handle kernel paging request for data at address 
> 0x
> [0.625586] Faulting instruction address: 0xc04ae000
> [0.625698] Oops: Kernel access of bad area, sig: 11 [#1]
> [0.625789] SMP NR_CPUS=2048 NUMA PowerNV
> [0.625879] Modules linked in:
> [0.625973] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.4.0-rc8+ #6
> [0.626087] task: c02ff430 ti: c02ff6084000 task.ti: 
> c02ff6084000
> [0.626224] NIP: c04ae000 LR: c090b9e4 CTR: 
> 0003
> [0.626361] REGS: c02ff6087930 TRAP: 0300   Not tainted  (4.4.0-rc8+)
> [0.626475] MSR: 90019033   CR: 48002044  
> XER: 2000
> [0.626808] CFAR: c0008468 DAR:  DSISR: 4000 
> SOFTE: 1
> GPR00: c090b9ac c02ff6087bb0 c1700900 c03ff229e080
> GPR04: c03ff229e080  0003 0001
> GPR08:   0010 90011003
> GPR12: 2200 cfb4 c000bd68 0002
> GPR16: 0028 c0b25940 c173ffa4 
> GPR20: c0b259d8 c0b259e0 c0b259e8 
> GPR24: c03ff229e080  c189b180 
> GPR28:  c1740a94 0002 0002
> [0.627925] NIP [c04ae000] __bitmap_or+0x30/0x50
> [0.627973] LR [c090b9e4] sched_init_numa+0x440/0x7c8
> [0.628030] Call Trace:
> [0.628054] [c02ff6087bb0] [c090b9ac] 
> sched_init_numa+0x408/0x7c8 (unreliable)
> [0.628136] [c02ff6087ca0] [c0c60718] sched_init_smp+0x60/0x238
> [0.628206] [c02ff6087d00] [c0c44294] 
> kernel_init_freeable+0x1fc/0x3b4
> [0.628286] [c02ff6087dc0] [c000bd84] kernel_init+0x24/0x140
> [0.628356] [c02ff6087e30] [c0009544] 
> ret_from_kernel_thread+0x5c/0x98
> [0.628435] Instruction dump:
> [0.628470] 38c6003f 78c9d183 4d820020 38c9 3920 78c60020 38c60001 
> 7cc903a6
> [0.628587] 6000 6000 6000 6042 <7d05482a> 7d44482a 
> 7d0a5378 7d43492a
> [0.628711] ---[ end trace b423f3e02b333fbf ]---
> [0.628757]
> [2.628822] Kernel panic - not syncing: Fatal exception
> [2.628969] Rebooting in 10 seconds..[0.00] OPAL V3 detected !
> 
 
> The crash goes away if I revert following commit:
>   commit c118baf802562688d46e6002f2b5fe66b947da21
>   Author: Raghavendra K T 
>   Date:   Thu Nov 5 18:46:29 2015 -0800
> arch/powerpc/mm/numa.c: do not allocate bootmem memory for non existing 
> nodes
>

Something like the patch below should fix it. I'll send it in a separate
email, CCing Peter and Ingo. Basically, the for_each_node conversion
targeted only slow paths and use-once functions, but it seems there was
a cpumask_or in sched_init_numa that used an unallocated node.

Sorry for getting back late. I was being overcautious, checking x86 and
powerpc with and without DEBUG_PER_CPU_MAPS.
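
To illustrate the failure mode, here is a minimal userspace sketch in C.
It is not kernel code. It assumes node ids 0, 1, 16 and 17 as in Jan's
numactl output, so the ids run up to 17 with holes at 2-15. The names
node_mask, setup_nodes, count_unallocated_ids and or_existing_nodes are
hypothetical and only model per-node masks that may be unallocated; the
kernel's for_each_node() walks the possible-node map rather than testing
pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: ids run 0..17 because the highest node id reported is 17. */
#define NR_NODE_IDS 18

/* Hypothetical stand-in for the per-node cpumask pointers.
 * NULL models a node id whose mask was never allocated. */
static uint64_t *node_mask[NR_NODE_IDS];
static uint64_t mask_storage[4];

/* Node ids that exist on the reporter's machine (from numactl -H). */
static const int existing_nodes[4] = { 0, 1, 16, 17 };

/* Allocate a one-bit mask for each existing node; leave the rest NULL. */
static void setup_nodes(void)
{
	for (int k = 0; k < NR_NODE_IDS; k++)
		node_mask[k] = NULL;
	for (int i = 0; i < 4; i++) {
		mask_storage[i] = UINT64_C(1) << i;
		node_mask[existing_nodes[i]] = &mask_storage[i];
	}
}

/* Ids that a plain "k < NR_NODE_IDS" loop would dereference as NULL. */
static int count_unallocated_ids(void)
{
	int n = 0;

	for (int k = 0; k < NR_NODE_IDS; k++)
		if (node_mask[k] == NULL)
			n++;
	return n;
}

/* Models the fixed loop: OR in only the nodes that have a mask. */
static int or_existing_nodes(uint64_t *dst)
{
	int visited = 0;

	assert(dst != NULL);
	for (int k = 0; k < NR_NODE_IDS; k++) {
		if (node_mask[k] == NULL)
			continue;	/* skipped, as for_each_node() would */
		*dst |= *node_mask[k];
		visited++;
	}
	return visited;
}
```

In this model, a loop over every id below NR_NODE_IDS would hit the 14
unallocated ids, while the filtered loop visits only the 4 real nodes.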

---8<- 
From 6680994a5a8dde7eccfbd2bffde341fdff2aed63 Mon Sep 17 00:00:00 2001
From: Raghavendra K T 
Date: Fri, 15 Jan 2016 18:19:56 +0530
Subject: [PATCH] Fix: PowerNV crash with 4.4.0-rc8 at sched_init_numa

Commit c118baf80256 ("arch/powerpc/mm/numa.c: do not allocate bootmem
memory for non existing nodes") avoided bootmem memory allocation for
non-existent nodes.

When DEBUG_PER_CPU_MAPS is enabled, a PowerNV system fails to boot
because sched_init_numa performs a cpumask_or operation on unallocated
nodes. Fix that by performing the cpumask_or operation only on existing
nodes.

[ Tested with and w/o DEBUG_PER_CPU_MAPS on x86 and powerpc ]

Reported-by: Jan Stancek 
Signed-off-by: Raghavendra K T 
---
 kernel/sched/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 44253ad..474658b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6840,7 +6840,7 @@ static void sched_init_numa(void)
 
 			sched_domains_numa_masks[i][j] = mask;
 
-			for (k = 0; k < nr_node_ids; k++) {
+			for_each_node(k) {
 				if (node_distance(j, k) > sched_domains_numa_distance[i])
 					continue;
 
-- 
1.7.11.7

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [BUG] PowerNV crash with 4.4.0-rc8 at sched_init_numa (related to commit c118baf80256)

2016-01-15 Thread Jan Stancek


- Original Message -
> From: "Raghavendra K T" 
> To: "Jan Stancek" 
> Cc: linuxppc-dev@lists.ozlabs.org, "raghavendra kt" 
> , vdavy...@parallels.com,
> b...@kernel.crashing.org, pau...@samba.org, m...@ellerman.id.au, 
> an...@samba.org, n...@linux.vnet.ibm.com,
> gk...@linux.vnet.ibm.com, "grant likely" , 
> nik...@linux.vnet.ibm.com, "Steve Best"
> , "Gustavo Duarte" , "Thomas Huth" 
> 
> Sent: Friday, 15 January, 2016 2:43:07 PM
> Subject: Re: [BUG] PowerNV crash with 4.4.0-rc8 at sched_init_numa (related 
> to commit c118baf80256)
> 
> * Jan Stancek  [2016-01-09 18:03:55]:
> 
> > Hi,
> > 
> > I'm seeing bare metal ppc64le system crashing early during boot
> > with latest upstream kernel (4.4.0-rc8):
> > 
> > # git describe
> > v4.4-rc8-96-g751e5f5
> > 
> > [0.625451] Unable to handle kernel paging request for data at address
> > 0x
> > [0.625586] Faulting instruction address: 0xc04ae000
> > [0.625698] Oops: Kernel access of bad area, sig: 11 [#1]
> > [0.625789] SMP NR_CPUS=2048 NUMA PowerNV
> > [0.625879] Modules linked in:
> > [0.625973] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.4.0-rc8+ #6
> > [0.626087] task: c02ff430 ti: c02ff6084000 task.ti:
> > c02ff6084000
> > [0.626224] NIP: c04ae000 LR: c090b9e4 CTR:
> > 0003
> > [0.626361] REGS: c02ff6087930 TRAP: 0300   Not tainted
> > (4.4.0-rc8+)
> > [0.626475] MSR: 90019033   CR:
> > 48002044  XER: 2000
> > [0.626808] CFAR: c0008468 DAR:  DSISR: 4000
> > SOFTE: 1
> > GPR00: c090b9ac c02ff6087bb0 c1700900 c03ff229e080
> > GPR04: c03ff229e080  0003 0001
> > GPR08:   0010 90011003
> > GPR12: 2200 cfb4 c000bd68 0002
> > GPR16: 0028 c0b25940 c173ffa4 
> > GPR20: c0b259d8 c0b259e0 c0b259e8 
> > GPR24: c03ff229e080  c189b180 
> > GPR28:  c1740a94 0002 0002
> > [0.627925] NIP [c04ae000] __bitmap_or+0x30/0x50
> > [0.627973] LR [c090b9e4] sched_init_numa+0x440/0x7c8
> > [0.628030] Call Trace:
> > [0.628054] [c02ff6087bb0] [c090b9ac]
> > sched_init_numa+0x408/0x7c8 (unreliable)
> > [0.628136] [c02ff6087ca0] [c0c60718]
> > sched_init_smp+0x60/0x238
> > [0.628206] [c02ff6087d00] [c0c44294]
> > kernel_init_freeable+0x1fc/0x3b4
> > [0.628286] [c02ff6087dc0] [c000bd84] kernel_init+0x24/0x140
> > [0.628356] [c02ff6087e30] [c0009544]
> > ret_from_kernel_thread+0x5c/0x98
> > [0.628435] Instruction dump:
> > [0.628470] 38c6003f 78c9d183 4d820020 38c9 3920 78c60020
> > 38c60001 7cc903a6
> > [0.628587] 6000 6000 6000 6042 <7d05482a> 7d44482a
> > 7d0a5378 7d43492a
> > [0.628711] ---[ end trace b423f3e02b333fbf ]---
> > [0.628757]
> > [2.628822] Kernel panic - not syncing: Fatal exception
> > [2.628969] Rebooting in 10 seconds..[0.00] OPAL V3 detected !
> > 
> 
> > The crash goes away if I revert following commit:
> >   commit c118baf802562688d46e6002f2b5fe66b947da21
> >   Author: Raghavendra K T 
> >   Date:   Thu Nov 5 18:46:29 2015 -0800
> > arch/powerpc/mm/numa.c: do not allocate bootmem memory for non existing
> > nodes
> >
> 
> Something like below should fix. I 'll send it in a separate email
>  marking Peter and Ingo. Basically for_each_node conversion
> has targeted only slowpaths / used_once sort of functions.
> But it seems there was a cpumask_or in sched_init_numa that used
> unallocated node.
> 
> Sorry for getting back late.. Was overcautious checking x86/power
> w/ and w/o DEBUG_PER_CPU_MAPS

Hi,

I ran it on my setup (same config as before) on top of v4.4-5966-g7d1fc01.
System now booted OK, dmesg looks clean.

Regards,
Jan

> 
> ---8<-
> From 6680994a5a8dde7eccfbd2bffde341fdff2aed63 Mon Sep 17 00:00:00 2001
> From: Raghavendra K T 
> Date: Fri, 15 Jan 2016 18:19:56 +0530
> Subject: [PATCH] Fix: PowerNV crash with 4.4.0-rc8 at sched_init_numa
> 
> Commit c118baf80256 ("arch/powerpc/mm/numa.c: do not allocate bootmem
> memory for non existing nodes") avoided bootmem memory allocation for
> non existent nodes.
> 
> When DEBUG_PER_CPU_MAPS enabled, powerNV system failed to boot because
> in sched_init_numa, cpumask_or operation was done on unallocated nodes.
> Fix that by making cpumask_or operation only on 

Re: [BUG] PowerNV crash with 4.4.0-rc8 at sched_init_numa (related to commit c118baf80256)

2016-01-11 Thread Raghavendra K T

On 01/10/2016 04:33 AM, Jan Stancek wrote:

Hi,

I'm seeing bare metal ppc64le system crashing early during boot
with latest upstream kernel (4.4.0-rc8):



Jan,
Do you mind sharing the .config you used for the kernel?
I am not able to reproduce with the one that I have :(


Re: [BUG] PowerNV crash with 4.4.0-rc8 at sched_init_numa (related to commit c118baf80256)

2016-01-11 Thread Raghavendra K T

On 01/11/2016 05:22 PM, Raghavendra K T wrote:

On 01/10/2016 04:33 AM, Jan Stancek wrote:

Hi,

I'm seeing bare metal ppc64le system crashing early during boot
with latest upstream kernel (4.4.0-rc8):



Jan,
Do you mind sharing the .config you used for the kernel.
Not able to reproduce with the one that I have :(



Never mind. I enabled DEBUG_PER_CPU_MAPS and was able to hit it.
/me goes back..


Re: [BUG] PowerNV crash with 4.4.0-rc8 at sched_init_numa (related to commit c118baf80256)

2016-01-10 Thread Jan Stancek




- Original Message -
> From: "Raghavendra K T" 
> To: "Jan Stancek" 
> Cc: linuxppc-dev@lists.ozlabs.org, vdavy...@parallels.com, 
> b...@kernel.crashing.org, pau...@samba.org,
> m...@ellerman.id.au, an...@samba.org, n...@linux.vnet.ibm.com, 
> gk...@linux.vnet.ibm.com, "grant likely"
> , nik...@linux.vnet.ibm.com, "Steve Best" 
> , "Gustavo Duarte"
> , "Thomas Huth" 
> Sent: Sunday, 10 January, 2016 7:47:31 AM
> Subject: Re: [BUG] PowerNV crash with 4.4.0-rc8 at sched_init_numa (related 
> to commit c118baf80256)
> 
> On 01/10/2016 04:33 AM, Jan Stancek wrote:
> > Hi,
> >
> > I'm seeing bare metal ppc64le system crashing early during boot
> > with latest upstream kernel (4.4.0-rc8):
> >
> 
> Hi Jan,
> Thanks for reporting. Let me try to reproduce the issue.
> 
> (Between if you think there is anything special in the .config
> that I need for testing .. please share).

Config has many debug options turned on, so my guess was SCHED_DEBUG.
I've uploaded my config here:
  http://jan.stancek.eu/tmp/powernv_crash_sched_init_numa/config-powernv-crash-4.4.0-rc8

Regards,
Jan

> 
> - Raghu
> 
> > # git describe
> > v4.4-rc8-96-g751e5f5
> >
> > [0.625451] Unable to handle kernel paging request for data at address
> > 0x
> > [0.625586] Faulting instruction address: 0xc04ae000
> > [0.625698] Oops: Kernel access of bad area, sig: 11 [#1]
> > [0.625789] SMP NR_CPUS=2048 NUMA PowerNV
> > [0.625879] Modules linked in:
> > [0.625973] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.4.0-rc8+ #6
> > [0.626087] task: c02ff430 ti: c02ff6084000 task.ti:
> > c02ff6084000
> > [0.626224] NIP: c04ae000 LR: c090b9e4 CTR:
> > 0003
> > [0.626361] REGS: c02ff6087930 TRAP: 0300   Not tainted
> > (4.4.0-rc8+)
> > [0.626475] MSR: 90019033   CR:
> > 48002044  XER: 2000
> > [0.626808] CFAR: c0008468 DAR:  DSISR: 4000
> > SOFTE: 1
> > GPR00: c090b9ac c02ff6087bb0 c1700900 c03ff229e080
> > GPR04: c03ff229e080  0003 0001
> > GPR08:   0010 90011003
> > GPR12: 2200 cfb4 c000bd68 0002
> > GPR16: 0028 c0b25940 c173ffa4 
> > GPR20: c0b259d8 c0b259e0 c0b259e8 
> > GPR24: c03ff229e080  c189b180 
> > GPR28:  c1740a94 0002 0002
> > [0.627925] NIP [c04ae000] __bitmap_or+0x30/0x50
> > [0.627973] LR [c090b9e4] sched_init_numa+0x440/0x7c8
> > [0.628030] Call Trace:
> > [0.628054] [c02ff6087bb0] [c090b9ac]
> > sched_init_numa+0x408/0x7c8 (unreliable)
> > [0.628136] [c02ff6087ca0] [c0c60718]
> > sched_init_smp+0x60/0x238
> > [0.628206] [c02ff6087d00] [c0c44294]
> > kernel_init_freeable+0x1fc/0x3b4
> > [0.628286] [c02ff6087dc0] [c000bd84] kernel_init+0x24/0x140
> > [0.628356] [c02ff6087e30] [c0009544]
> > ret_from_kernel_thread+0x5c/0x98
> > [0.628435] Instruction dump:
> > [0.628470] 38c6003f 78c9d183 4d820020 38c9 3920 78c60020
> > 38c60001 7cc903a6
> > [0.628587] 6000 6000 6000 6042 <7d05482a> 7d44482a
> > 7d0a5378 7d43492a
> > [0.628711] ---[ end trace b423f3e02b333fbf ]---
> > [0.628757]
> > [2.628822] Kernel panic - not syncing: Fatal exception
> > [2.628969] Rebooting in 10 seconds..[0.00] OPAL V3 detected !
> >
> > # numactl -H
> > available: 4 nodes (0-1,16-17)
> > node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
> > 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
> > node 0 size: 64941 MB
> > node 0 free: 64210 MB
> > node 1 cpus: 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
> > 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79
> > node 1 size: 65456 MB
> > node 1 free: 62424 MB
> > node 16 cpus: 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
> > 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117
> > 118 119
> > node 16 size: 65457 MB
> > node 16 free: 65258 MB
> > node 17 cpus: 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134
> > 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151
> > node 17 size: 65186 MB
> > node 17 free: 65001 MB
> > node distances:
> > node   0   1  16  17
> >0:  10  20  40  40
> >1:  20  10  40  40
> >   16:  40  40  10  20
> >   17:  40  40  20  10
> >
> > The crash goes away if I revert following commit:
> >commit c118baf802562688d46e6002f2b5fe66b947da21
> >Author: Raghavendra K T 

[BUG] PowerNV crash with 4.4.0-rc8 at sched_init_numa (related to commit c118baf80256)

2016-01-09 Thread Jan Stancek
Hi,

I'm seeing bare metal ppc64le system crashing early during boot
with latest upstream kernel (4.4.0-rc8):

# git describe
v4.4-rc8-96-g751e5f5

[0.625451] Unable to handle kernel paging request for data at address 0x
[0.625586] Faulting instruction address: 0xc04ae000
[0.625698] Oops: Kernel access of bad area, sig: 11 [#1]
[0.625789] SMP NR_CPUS=2048 NUMA PowerNV
[0.625879] Modules linked in:
[0.625973] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.4.0-rc8+ #6
[0.626087] task: c02ff430 ti: c02ff6084000 task.ti: c02ff6084000
[0.626224] NIP: c04ae000 LR: c090b9e4 CTR: 0003
[0.626361] REGS: c02ff6087930 TRAP: 0300   Not tainted  (4.4.0-rc8+)
[0.626475] MSR: 90019033   CR: 48002044  XER: 2000
[0.626808] CFAR: c0008468 DAR:  DSISR: 4000 SOFTE: 1
GPR00: c090b9ac c02ff6087bb0 c1700900 c03ff229e080
GPR04: c03ff229e080  0003 0001
GPR08:   0010 90011003
GPR12: 2200 cfb4 c000bd68 0002
GPR16: 0028 c0b25940 c173ffa4 
GPR20: c0b259d8 c0b259e0 c0b259e8 
GPR24: c03ff229e080  c189b180 
GPR28:  c1740a94 0002 0002
[0.627925] NIP [c04ae000] __bitmap_or+0x30/0x50
[0.627973] LR [c090b9e4] sched_init_numa+0x440/0x7c8
[0.628030] Call Trace:
[0.628054] [c02ff6087bb0] [c090b9ac] sched_init_numa+0x408/0x7c8 (unreliable)
[0.628136] [c02ff6087ca0] [c0c60718] sched_init_smp+0x60/0x238
[0.628206] [c02ff6087d00] [c0c44294] kernel_init_freeable+0x1fc/0x3b4
[0.628286] [c02ff6087dc0] [c000bd84] kernel_init+0x24/0x140
[0.628356] [c02ff6087e30] [c0009544] ret_from_kernel_thread+0x5c/0x98
[0.628435] Instruction dump:
[0.628470] 38c6003f 78c9d183 4d820020 38c9 3920 78c60020 38c60001 7cc903a6
[0.628587] 6000 6000 6000 6042 <7d05482a> 7d44482a 7d0a5378 7d43492a
[0.628711] ---[ end trace b423f3e02b333fbf ]---
[0.628757]
[2.628822] Kernel panic - not syncing: Fatal exception
[2.628969] Rebooting in 10 seconds..
[0.00] OPAL V3 detected !

# numactl -H
available: 4 nodes (0-1,16-17)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
node 0 size: 64941 MB
node 0 free: 64210 MB
node 1 cpus: 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79
node 1 size: 65456 MB
node 1 free: 62424 MB
node 16 cpus: 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119
node 16 size: 65457 MB
node 16 free: 65258 MB
node 17 cpus: 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151
node 17 size: 65186 MB
node 17 free: 65001 MB
node distances:
node   0   1  16  17
  0:  10  20  40  40
  1:  20  10  40  40
 16:  40  40  10  20
 17:  40  40  20  10

The crash goes away if I revert following commit:
  commit c118baf802562688d46e6002f2b5fe66b947da21
  Author: Raghavendra K T 
  Date:   Thu Nov 5 18:46:29 2015 -0800
arch/powerpc/mm/numa.c: do not allocate bootmem memory for non existing 
nodes

Regards,
Jan

Re: [BUG] PowerNV crash with 4.4.0-rc8 at sched_init_numa (related to commit c118baf80256)

2016-01-09 Thread Raghavendra K T

On 01/10/2016 04:33 AM, Jan Stancek wrote:

Hi,

I'm seeing bare metal ppc64le system crashing early during boot
with latest upstream kernel (4.4.0-rc8):



Hi Jan,
Thanks for reporting. Let me try to reproduce the issue.

(By the way, if you think there is anything special in the .config
that I need for testing, please share.)

- Raghu


# git describe
v4.4-rc8-96-g751e5f5

[0.625451] Unable to handle kernel paging request for data at address 
0x
[0.625586] Faulting instruction address: 0xc04ae000
[0.625698] Oops: Kernel access of bad area, sig: 11 [#1]
[0.625789] SMP NR_CPUS=2048 NUMA PowerNV
[0.625879] Modules linked in:
[0.625973] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.4.0-rc8+ #6
[0.626087] task: c02ff430 ti: c02ff6084000 task.ti: 
c02ff6084000
[0.626224] NIP: c04ae000 LR: c090b9e4 CTR: 0003
[0.626361] REGS: c02ff6087930 TRAP: 0300   Not tainted  (4.4.0-rc8+)
[0.626475] MSR: 90019033   CR: 48002044  
XER: 2000
[0.626808] CFAR: c0008468 DAR:  DSISR: 4000 
SOFTE: 1
GPR00: c090b9ac c02ff6087bb0 c1700900 c03ff229e080
GPR04: c03ff229e080  0003 0001
GPR08:   0010 90011003
GPR12: 2200 cfb4 c000bd68 0002
GPR16: 0028 c0b25940 c173ffa4 
GPR20: c0b259d8 c0b259e0 c0b259e8 
GPR24: c03ff229e080  c189b180 
GPR28:  c1740a94 0002 0002
[0.627925] NIP [c04ae000] __bitmap_or+0x30/0x50
[0.627973] LR [c090b9e4] sched_init_numa+0x440/0x7c8
[0.628030] Call Trace:
[0.628054] [c02ff6087bb0] [c090b9ac] 
sched_init_numa+0x408/0x7c8 (unreliable)
[0.628136] [c02ff6087ca0] [c0c60718] sched_init_smp+0x60/0x238
[0.628206] [c02ff6087d00] [c0c44294] 
kernel_init_freeable+0x1fc/0x3b4
[0.628286] [c02ff6087dc0] [c000bd84] kernel_init+0x24/0x140
[0.628356] [c02ff6087e30] [c0009544] 
ret_from_kernel_thread+0x5c/0x98
[0.628435] Instruction dump:
[0.628470] 38c6003f 78c9d183 4d820020 38c9 3920 78c60020 38c60001 
7cc903a6
[0.628587] 6000 6000 6000 6042 <7d05482a> 7d44482a 7d0a5378 
7d43492a
[0.628711] ---[ end trace b423f3e02b333fbf ]---
[0.628757]
[2.628822] Kernel panic - not syncing: Fatal exception
[2.628969] Rebooting in 10 seconds..[0.00] OPAL V3 detected !

# numactl -H
available: 4 nodes (0-1,16-17)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 
25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
node 0 size: 64941 MB
node 0 free: 64210 MB
node 1 cpus: 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 
62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79
node 1 size: 65456 MB
node 1 free: 62424 MB
node 16 cpus: 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 
101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119
node 16 size: 65457 MB
node 16 free: 65258 MB
node 17 cpus: 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 
136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151
node 17 size: 65186 MB
node 17 free: 65001 MB
node distances:
node   0   1  16  17
   0:  10  20  40  40
   1:  20  10  40  40
  16:  40  40  10  20
  17:  40  40  20  10

The crash goes away if I revert following commit:
   commit c118baf802562688d46e6002f2b5fe66b947da21
   Author: Raghavendra K T 
   Date:   Thu Nov 5 18:46:29 2015 -0800
 arch/powerpc/mm/numa.c: do not allocate bootmem memory for non existing 
nodes

Regards,
Jan




