Re: crash in kmem_cache_init

2008-01-23 Thread Pekka Enberg
Hi Christoph, On Jan 23, 2008 1:18 AM, Christoph Lameter [EMAIL PROTECTED] wrote: My patch is useless (fascinating history of the changelog there though). fallback_alloc calls kmem_getpages without GFP_THISNODE. This means that alloc_pages_node() will try to allocate on the current node but

Re: crash in kmem_cache_init

2008-01-23 Thread Olaf Hering
On Wed, Jan 23, Pekka Enberg wrote: Hi Christoph, On Jan 23, 2008 1:18 AM, Christoph Lameter [EMAIL PROTECTED] wrote: My patch is useless (fascinating history of the changelog there though). fallback_alloc calls kmem_getpages without GFP_THISNODE. This means that alloc_pages_node()

Re: crash in kmem_cache_init

2008-01-23 Thread Mel Gorman
On (23/01/08 08:58), Olaf Hering didst pronounce: On Tue, Jan 22, Christoph Lameter wrote: 0xc00fe018 is in setup_cpu_cache (/home/olaf/kernel/git/linux-2.6-numa/mm/slab.c:2111). 2106 BUG_ON(!cachep->nodelists[node]); 2107

Re: crash in kmem_cache_init

2008-01-23 Thread Olaf Hering
On Wed, Jan 23, Mel Gorman wrote: Sorry this is dragging out. Can you post the full dmesg with loglevel=8 of the following patch against 2.6.24-rc8 please? It contains the debug information that helped me figure out what was going wrong on the PPC64 machine here, the revert and the !l3 checks

Re: crash in kmem_cache_init

2008-01-23 Thread Olaf Hering
On Wed, Jan 23, Olaf Hering wrote: On Wed, Jan 23, Mel Gorman wrote: Sorry this is dragging out. Can you post the full dmesg with loglevel=8 of the following patch against 2.6.24-rc8 please? It contains the debug information that helped me figure out what was going wrong on the PPC64

Re: crash in kmem_cache_init

2008-01-23 Thread Mel Gorman
On (23/01/08 13:14), Olaf Hering didst pronounce: On Wed, Jan 23, Mel Gorman wrote: Sorry this is dragging out. Can you post the full dmesg with loglevel=8 of the following patch against 2.6.24-rc8 please? It contains the debug information that helped me figure out what was going wrong

Re: crash in kmem_cache_init

2008-01-22 Thread Mel Gorman
On (22/01/08 12:11), Christoph Lameter didst pronounce: On Tue, 22 Jan 2008, Mel Gorman wrote: Christoph/Pekka, this patch is papering over the problem and something more fundamental may be going wrong. The crash occurs because l3 is NULL and the cache is kmem_cache so this is early in

Re: crash in kmem_cache_init

2008-01-22 Thread Christoph Lameter
On Tue, 22 Jan 2008, Mel Gorman wrote: After you reverted the slab memoryless node patch there should be per node structures created for node 0 unless the node is marked offline. Is it? If so then you are booting a cpu that is associated with an offline node. I'll roll a patch that

Re: crash in kmem_cache_init

2008-01-22 Thread Olaf Hering
On Tue, Jan 22, Mel Gorman wrote: http://www.csn.ul.ie/~mel/postings/slab-20080122/partial-revert-slab-changes.patch .. Can you please check on your machine if it fixes your problem? It does not fix or change the nature of the crash. Olaf, please confirm whether you need the patch below as

Re: crash in kmem_cache_init

2008-01-22 Thread Nish Aravamudan
On 1/22/08, Olaf Hering [EMAIL PROTECTED] wrote: On Tue, Jan 22, Mel Gorman wrote: http://www.csn.ul.ie/~mel/postings/slab-20080122/partial-revert-slab-changes.patch .. Can you please check on your machine if it fixes your problem? It does not fix or change the nature of the crash.

Re: crash in kmem_cache_init

2008-01-22 Thread Christoph Lameter
On Tue, 22 Jan 2008, Olaf Hering wrote: It crashes now in a different way if the patch below is applied: Yup no l3 structure for the current node. We are early in bootstrap. You could just check if the l3 is there and if not just skip starting the reaper? This will be redone later anyways. Not
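The check suggested above can be sketched in plain userspace C. This is only an illustration of the idea, not the kernel patch: `struct list3` and `start_reaper_if_ready()` are hypothetical names invented for the sketch, and the real l3 structure and reaper setup in mm/slab.c look different.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-node l3 structure (assumption,
 * not the kernel's definition). */
struct list3 {
    int reaper_started;
};

/*
 * Start the reaper only when the per-node structure exists.
 * Returns 1 if the reaper was started, 0 if the step was skipped.
 */
static int start_reaper_if_ready(struct list3 *l3)
{
    if (l3 == NULL)
        return 0;               /* early in bootstrap: skip, a later pass redoes this */

    l3->reaper_started = 1;
    return 1;
}
```

Skipping is only safe under the assumption stated in the message above, namely that the setup is redone later once the per-node structures exist.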

Re: crash in kmem_cache_init

2008-01-22 Thread Mel Gorman
On (22/01/08 13:34), Christoph Lameter didst pronounce: On Tue, 22 Jan 2008, Mel Gorman wrote: After you reverted the slab memoryless node patch there should be per node structures created for node 0 unless the node is marked offline. Is it? If so then you are booting a cpu

Re: crash in kmem_cache_init

2008-01-22 Thread Christoph Lameter
On Tue, 22 Jan 2008, Mel Gorman wrote: Whether this was a problem fixed in the past or not, it's broken again now :( . It's possible that there is a __GFP_THISNODE that can be dropped early at boot-time that would also fix this problem in a way that doesn't affect runtime

Re: crash in kmem_cache_init

2008-01-22 Thread Mel Gorman
On (22/01/08 14:57), Christoph Lameter didst pronounce: On Tue, 22 Jan 2008, Mel Gorman wrote: Whether this was a problem fixed in the past or not, it's broken again now :( . It's possible that there is a __GFP_THISNODE that can be dropped early at boot-time that would

Re: crash in kmem_cache_init

2008-01-22 Thread Christoph Lameter
On Wed, 23 Jan 2008, Pekka Enberg wrote: When we call fallback_alloc() because the current node has ->nodelists set to NULL, we end up calling kmem_getpages() with -1 as the node id which is then translated to numa_node_id() by alloc_pages_node. But the reason we called fallback_alloc() in the
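The loop described above can be modelled in a few lines of userspace C. This is a rough sketch, not kernel code: `MAX_NODES`, `NODE_ANY`, `struct node_lists`, `current_node`, `resolve_node()` and `fallback_hits_unprepared_node()` are all hypothetical names made up for the model. It only shows why passing -1 leads straight back to the node whose lists were NULL in the first place.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 4         /* hypothetical node count for this model */
#define NODE_ANY  (-1)      /* "no preferred node", like the -1 node id above */

/* Hypothetical stand-in for per-node slab lists; NULL means "not set up yet". */
struct node_lists {
    int free_objects;
};

static struct node_lists *nodelists[MAX_NODES];
static int current_node;    /* models what numa_node_id() would return */

/* Models the translation described above: a negative id becomes the current node. */
static int resolve_node(int nid)
{
    int node = (nid < 0) ? current_node : nid;

    assert(node >= 0 && node < MAX_NODES);
    return node;
}

/*
 * Returns 1 if a fallback allocation made with NODE_ANY resolves to a node
 * whose lists are still NULL, i.e. the situation that triggered the
 * fallback in the first place.
 */
static int fallback_hits_unprepared_node(void)
{
    int node = resolve_node(NODE_ANY);

    return nodelists[node] == NULL;
}
```

In this model the problem only shows up when the current node is the one without lists; an explicit node id, or a current node that is already set up, avoids it.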

Re: crash in kmem_cache_init

2008-01-22 Thread Pekka Enberg
Hi, Mel Gorman wrote: Faulting instruction address: 0xc03c8c00 cpu 0x0: Vector: 300 (Data Access) at [c05c3840] pc: c03c8c00: __lock_text_start+0x20/0x88 lr: c00dadec: .cache_grow+0x7c/0x338 sp: c05c3ac0 msr: 80009032 dar:

Re: crash in kmem_cache_init

2008-01-22 Thread Christoph Lameter
On Tue, 22 Jan 2008, Mel Gorman wrote: Rather it should be 2. I'll admit the physical setup of this machine is less than ideal but clearly it's something that can happen even if it's a bad idea. Ok. Let's hope that Pekka's find does the trick. But this would mean that fallback gets

Re: crash in kmem_cache_init

2008-01-22 Thread Christoph Lameter
On Tue, 22 Jan 2008, Christoph Lameter wrote: But I doubt that this is it. The fallback logic was added later and it worked fine. My patch is useless (fascinating history of the changelog there though). fallback_alloc calls kmem_getpages without GFP_THISNODE. This means that

Re: crash in kmem_cache_init

2008-01-22 Thread Olaf Hering
On Tue, Jan 22, Christoph Lameter wrote: 0xc00fe018 is in setup_cpu_cache (/home/olaf/kernel/git/linux-2.6-numa/mm/slab.c:2111). 2106 BUG_ON(!cachep->nodelists[node]); 2107

Re: crash in kmem_cache_init

2008-01-18 Thread Christoph Lameter
On Fri, 18 Jan 2008, Christoph Lameter wrote: Memoryless nodes: Set N_NORMAL_MEMORY for a node if we do not support HIGHMEM If !CONFIG_HIGHMEM then enum node_states { #ifdef CONFIG_HIGHMEM N_HIGH_MEMORY, /* The node has regular or high memory */ #else N_HIGH_MEMORY =

Re: crash in kmem_cache_init

2008-01-18 Thread Nish Aravamudan
On 1/18/08, Christoph Lameter [EMAIL PROTECTED] wrote: Could you try this patch? Memoryless nodes: Set N_NORMAL_MEMORY for a node if we do not support HIGHMEM It seems that we scan through zones to set N_NORMAL_MEMORY only if CONFIG_HIGHMEM and CONFIG_NUMA are set. We need to set

Re: crash in kmem_cache_init

2008-01-18 Thread Christoph Lameter
Could you try this patch? Memoryless nodes: Set N_NORMAL_MEMORY for a node if we do not support HIGHMEM It seems that we scan through zones to set N_NORMAL_MEMORY only if CONFIG_HIGHMEM and CONFIG_NUMA are set. We need to set N_NORMAL_MEMORY in the !CONFIG_HIGHMEM case. Signed-off-by:

Re: crash in kmem_cache_init

2008-01-18 Thread Mel Gorman
On (18/01/08 10:47), Christoph Lameter didst pronounce: On Thu, 17 Jan 2008, Olaf Hering wrote: early_node_map[1] active PFN ranges 1:0 - 892928 Could not find start_pfn for node 0 Corrupted min_pfn? Doubtful. Node 0 has no memory but it is still being initialised.

Re: crash in kmem_cache_init

2008-01-17 Thread Pekka Enberg
Hi Olaf, [Adding Christoph as cc.] On Jan 15, 2008 5:09 PM, Olaf Hering [EMAIL PROTECTED] wrote: Current linus tree crashes in kmem_cache_init, as shown below. The system is an 8-CPU 2.2GHz POWER5 system, model 9117-570, with 4GB RAM. Firmware is 240_332, 2.6.23 boots ok with the same config.

Re: crash in kmem_cache_init

2008-01-17 Thread Christoph Lameter
On Thu, 17 Jan 2008, Pekka Enberg wrote: Looks similar to the one discussed on linux-mm ([BUG] at mm/slab.c:3320 thread). Christoph? Right. Try the latest version of the patch to fix it: Index: linux-2.6/mm/slab.c === ---

Re: crash in kmem_cache_init

2008-01-17 Thread Olaf Hering
On Thu, Jan 17, Christoph Lameter wrote: On Thu, 17 Jan 2008, Pekka Enberg wrote: Looks similar to the one discussed on linux-mm ([BUG] at mm/slab.c:3320 thread). Christoph? Right. Try the latest version of the patch to fix it: The patch does not help. Index: linux-2.6/mm/slab.c

Re: crash in kmem_cache_init

2008-01-17 Thread Christoph Lameter
Could you try Pekka's suggestion of reverting 04231b3002ac53f8a64a7bd142fde3fa4b6808c6 ? ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev

Re: crash in kmem_cache_init

2008-01-17 Thread Olaf Hering
On Thu, Jan 17, Christoph Lameter wrote: freeing bootmem node 1 Memory: 3496632k/3571712k available (6188k kernel code, 75080k reserved, 1324k data, 1220k bss, 304k init) cache_grow(2781) swapper(0):c0,j4294937299 cp c06a4fb8 !l3 Is there more backtrace information? What

Re: crash in kmem_cache_init

2008-01-17 Thread Christoph Lameter
On Thu, 17 Jan 2008, Olaf Hering wrote: The patch does not help. Duh. We need to know more about the problem. --- linux-2.6.orig/mm/slab.c 2008-01-03 12:26:42.0 -0800 +++ linux-2.6/mm/slab.c 2008-01-09 15:59:49.0 -0800 @@ -2977,7 +2977,10 @@ retry: }

Re: crash in kmem_cache_init

2008-01-17 Thread Olaf Hering
On Thu, Jan 17, Olaf Hering wrote: Since -mm boots further, what patch should I try? rc8-mm1 crashes as well, l3 passed to reap_alien() is NULL.

Re: crash in kmem_cache_init

2008-01-17 Thread Olaf Hering
On Thu, Jan 17, Christoph Lameter wrote: On Thu, 17 Jan 2008, Olaf Hering wrote: The patch does not help. Duh. We need to know more about the problem. cache_grow is called from 3 places. The third call has cleared l3 for some reason. Allocated 00a0 bytes for kernel @ 0020

crash in kmem_cache_init

2008-01-15 Thread Olaf Hering
Current linus tree crashes in kmem_cache_init, as shown below. The system is an 8-CPU 2.2GHz POWER5 system, model 9117-570, with 4GB RAM. Firmware is 240_332, 2.6.23 boots ok with the same config. There is a series of mm-related patches in 2.6.24-rc1: commit