Re: Next May 11 : BUG during scsi initialization

2009-05-14 Thread Sachin Sant
Nick Piggin wrote: This one possibly looks like a problem with remote memory allocation or memory hotplug or something like that. I'll do a bit of code review Removed linux-scsi from the cc list. I can recreate this issue with today's next tree. Nick let me know if you need any other

Re: Next May 11 : BUG during scsi initialization

2009-05-14 Thread Pekka Enberg
Hi Sachin, On Thu, 2009-05-14 at 14:00 +0530, Sachin Sant wrote: Nick Piggin wrote: This one possibly looks like a problem with remote memory allocation or memory hotplug or something like that. I'll do a bit of code review Removed linux-scsi from the cc list. I can recreate this

Re: Next May 11 : BUG during scsi initialization

2009-05-14 Thread Pekka Enberg
On Thu, 2009-05-14 at 15:24 +0530, Sachin Sant wrote: Pekka Enberg wrote: I wasn't able to find your .config in this thread. Can you please post it? Config attached. Thanks! Can you please enable CONFIG_DEBUG_VM, CONFIG_DEBUG_LIST, and decrease CONFIG_NR_CPUS=1024 to, say, 32 and

Re: Next May 11 : BUG during scsi initialization

2009-05-14 Thread Pekka Enberg
Hi Sachin, On Thu, 2009-05-14 at 12:59 +0300, Pekka Enberg wrote: On Thu, 2009-05-14 at 15:24 +0530, Sachin Sant wrote: Pekka Enberg wrote: I wasn't able to find your .config in this thread. Can you please post it? Config attached. Thanks! Can you please enable

Re: Next May 11 : BUG during scsi initialization

2009-05-14 Thread Sachin Sant
Pekka Enberg wrote: Thanks! Can you please enable CONFIG_DEBUG_VM, CONFIG_DEBUG_LIST, and decrease CONFIG_NR_CPUS=1024 to, say, 32 and retest? Perhaps we'll get some clues as to what's going on here. Furthermore, you might want to test with CONFIG_PPC_4K_PAGES and CONFIG_PPC_16K_PAGES to

Re: Next May 11 : BUG during scsi initialization

2009-05-12 Thread Nick Piggin
On Tue, May 12, 2009 at 03:56:13PM +1000, Stephen Rothwell wrote: Hi Nick, On Tue, 12 May 2009 06:57:16 +0200 Nick Piggin npig...@suse.de wrote: Hmm, I think (hope) your problems were fixed with the recent memory corruption bug fix for SLQB. (if not, let me know) This one possibly

Re: Next May 11 : BUG during scsi initialization

2009-05-12 Thread Stephen Rothwell
On Tue, 12 May 2009 07:59:18 +0200 Nick Piggin npig...@suse.de wrote: On Tue, May 12, 2009 at 03:56:13PM +1000, Stephen Rothwell wrote: Hi Nick, On Tue, 12 May 2009 06:57:16 +0200 Nick Piggin npig...@suse.de wrote: Hmm, I think (hope) your problems were fixed with the recent memory

Re: Next May 11 : BUG during scsi initialization

2009-05-12 Thread Stephen Rothwell
Hi Nick, On Tue, 12 May 2009 16:03:52 +1000 Stephen Rothwell s...@canb.auug.org.au wrote: This is what I have been getting for the last few days: bisected into the net changes, I will follow up there, sorry. -- Cheers, Stephen Rothwell s...@canb.auug.org.au

Re: Next May 11 : BUG during scsi initialization

2009-05-12 Thread Nick Piggin
On Tue, May 12, 2009 at 04:52:45PM +1000, Stephen Rothwell wrote: Hi Nick, On Tue, 12 May 2009 16:03:52 +1000 Stephen Rothwell s...@canb.auug.org.au wrote: This is what I have been getting for the last few days: bisected into the net changes, I will follow up there, sorry. No

Next May 11 : BUG during scsi initialization

2009-05-11 Thread Sachin Sant
Today's Next tree failed to boot on a Power6 box with the following BUG: BUG: spinlock bad magic on CPU#1, modprobe/63 Unable to handle kernel paging request for data at address 0xc994838 Faulting instruction address: 0xc035f5a8 Oops: Kernel access of bad area, sig: 11 [#1] SMP

Re: Next May 11 : BUG during scsi initialization

2009-05-11 Thread Matthew Wilcox
On Mon, May 11, 2009 at 05:16:10PM +0530, Sachin Sant wrote: Today's Next tree failed to boot on a Power6 box with the following BUG: This doesn't actually appear to be a SCSI bug ... it looks like SCSI tried to allocate memory and things went wrong in the memory allocator: [c000c7d038b0]

Re: Next May 11 : BUG during scsi initialization

2009-05-11 Thread Sachin Sant
Matthew Wilcox wrote: On Mon, May 11, 2009 at 05:16:10PM +0530, Sachin Sant wrote: Today's Next tree failed to boot on a Power6 box with the following BUG: This doesn't actually appear to be a SCSI bug ... it looks like SCSI tried to allocate memory and things went wrong in the memory

Re: Next May 11 : BUG during scsi initialization

2009-05-11 Thread Matthew Wilcox
On Mon, May 11, 2009 at 05:34:07PM +0530, Sachin Sant wrote: Matthew Wilcox wrote: On Mon, May 11, 2009 at 05:16:10PM +0530, Sachin Sant wrote: Today's Next tree failed to boot on a Power6 box with the following BUG: This doesn't actually appear to be a SCSI bug ... it looks like SCSI tried

Re: Next May 11 : BUG during scsi initialization

2009-05-11 Thread Sachin Sant
Matthew Wilcox wrote: Default one. SLQB CONFIG_SLQB_ALLOCATOR=y CONFIG_SLQB=y Page size is 64K with Config DEBUG_PAGEALLOC set. CONFIG_PPC_HAS_HASH_64K=y CONFIG_PPC_64K_PAGES=y CONFIG_DEBUG_PAGEALLOC=y Hm. We've seen some similar problems at Intel while doing database performance

Re: Next May 11 : BUG during scsi initialization

2009-05-11 Thread Matthew Wilcox
On Mon, May 11, 2009 at 09:49:55PM +0530, Sachin Sant wrote: Yeah so the problem seems to be with SLQB. I was able to boot Next 11 with SLUB on the same machine. Is it 100% reproducible with SLQB? Our errors were fairly hard to tickle on demand. -- Matthew Wilcox

Re: Next May 11 : BUG during scsi initialization

2009-05-11 Thread Sachin Sant
Matthew Wilcox wrote: On Mon, May 11, 2009 at 09:49:55PM +0530, Sachin Sant wrote: Yeah so the problem seems to be with SLQB. I was able to boot Next 11 with SLUB on the same machine. Is it 100% reproducible with SLQB? Our errors were fairly hard to tickle on demand. Yes. I am

Re: Next May 11 : BUG during scsi initialization

2009-05-11 Thread Nick Piggin
On Mon, May 11, 2009 at 06:21:35AM -0600, Matthew Wilcox wrote: On Mon, May 11, 2009 at 05:34:07PM +0530, Sachin Sant wrote: Matthew Wilcox wrote: On Mon, May 11, 2009 at 05:16:10PM +0530, Sachin Sant wrote: Today's Next tree failed to boot on a Power6 box with the following BUG: This

Re: Next May 11 : BUG during scsi initialization

2009-05-11 Thread Stephen Rothwell
Hi Nick, On Tue, 12 May 2009 06:57:16 +0200 Nick Piggin npig...@suse.de wrote: Hmm, I think (hope) your problems were fixed with the recent memory corruption bug fix for SLQB. (if not, let me know) This one possibly looks like a problem with remote memory allocation or memory hotplug or