Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc
On Thu, Feb 21 2008, Andrew Morton wrote:
> On Tue, 19 Feb 2008 09:36:34 +0100 Jens Axboe <[EMAIL PROTECTED]> wrote:
>
> > But I think the radix 'scan over entire tree' is a bit fragile.
>
> eek, it had better not be. Was this an error in the caller? Hope so.

The cfq use of it, not the radix tree code! It juggled the keys and
wants to make sure that we see all users, modulo raced added ones (ok if
we see them, doesn't matter if we don't).

> > This
> > patch adds a parallel hlist for ease of properly browsing the members,
>
> Even though io_contexts are fairly uncommon, adding more stuff to a data
> structure was a pretty sad alternative to fixing a bug in
> radix_tree_gang_lookup(), or to fixing a bug in a caller of it.
>
> IOW: what exactly went wrong here??

I could not convince myself that the current code would always do the
right thing. We should not have been seeing ->key == NULL entries in
there, it implied a double exit of that process. So I decided to fix it
by making the code a lot more readable (the patch in question deleted a
lot more than it added), at the cost of that hlist head + node.

-- 
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc
On Tue, 19 Feb 2008 09:36:34 +0100 Jens Axboe <[EMAIL PROTECTED]> wrote:

> But I think the radix 'scan over entire tree' is a bit fragile.

eek, it had better not be. Was this an error in the caller? Hope so.

> This
> patch adds a parallel hlist for ease of properly browsing the members,

Even though io_contexts are fairly uncommon, adding more stuff to a data
structure was a pretty sad alternative to fixing a bug in
radix_tree_gang_lookup(), or to fixing a bug in a caller of it.

IOW: what exactly went wrong here??
Re: Linux 2.6.25-rc2
On 2/20/2008, "Zhang, Yanmin" <[EMAIL PROTECTED]> wrote:
> Kernel with the reverting patch is ok.
> I ran reboot/hackbench for more than 10 times on every one of my 3 x86-64
> machines, and kernel didn't crash.

Great, Linus reverted the patch yesterday. Thanks for testing!
Re: Linux 2.6.25-rc2
On Wed, 2008-02-20 at 10:08 +0800, Zhang, Yanmin wrote:
> On Wed, 2008-02-20 at 08:36 +0800, Zhang, Yanmin wrote:
> > On Tue, 2008-02-19 at 17:52 +0200, Pekka Enberg wrote:
> > > Ingo Molnar wrote:
> > > > * Pekka Enberg <[EMAIL PROTECTED]> wrote:
> > > >
> > > >>> Yes, this can happen. Are you saying it is not safe to be in the
> > > >>> lockless path when an IRQ triggers?
> > > >> Hmm. The barrier() in slab_free() looks fishy. The comment says it's
> > > >> there to make sure we've retrieved c->freelist before c->page but then
> > > >> it uses a _compiler barrier_ which doesn't affect the CPU and the
> > > >> reads may still be re-ordered... Not sure if that matters here though.
> > > >
> > > > find a fix patch for that below - most systems affected seem to be SMP
> > > > ones.
> > > >
> > > > If this (or my other patch) indeed solves the problem i'd still favor a
> > > > full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it looks
> > > > quite un-cooked and quite un-tested for multiple independent reasons.
> > > >
> > > > Sigh, why do i again have to be the messenger who brings the bad news to
> > > > SLUB land, and again when poor Christoph went on vacation? :-/
> > > >
> > > > Ingo
> > > >
> > > > -->
> > > > Subject: SLUB: barrier fix
> > > > From: Ingo Molnar <[EMAIL PROTECTED]>
> > > >
> > > > ---
> > > >  mm/slub.c |    2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > Index: linux/mm/slub.c
> > > > ===
> > > > --- linux.orig/mm/slub.c
> > > > +++ linux/mm/slub.c
> > > > @@ -1862,7 +1862,7 @@ static __always_inline void slab_free(st
> > > >          debug_check_no_locks_freed(object, s->objsize);
> > > >          do {
> > > >                  freelist = c->freelist;
> > > > -                barrier();
> > > > +                smp_mb();
> > > >                  /*
> > > >                   * If the compiler would reorder the retrieval of c->page to
> > > >                   * come before c->freelist then an interrupt could
> > >
> > > Torsten/Yamin, does this fix things for you? What about reverting commit
> > > 1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c ("SLUB: Alternate fast paths
> > > using cmpxchg_local")?
> > I'm busy in another issue and will test it ASAP. Sorry.
> I tested it on my 3 x86-64 machines. The small fix to use smp_mb to replace
> barrier in slab_free doesn't work. Kernel still crashed at the same place.
>
> I will test the reverting patch.

Kernel with the reverting patch is ok.
I ran reboot/hackbench for more than 10 times on every one of my 3 x86-64
machines, and kernel didn't crash.

-yanmin
Re: Linux 2.6.25-rc2
On Wed, 2008-02-20 at 08:36 +0800, Zhang, Yanmin wrote:
> On Tue, 2008-02-19 at 17:52 +0200, Pekka Enberg wrote:
> > Ingo Molnar wrote:
> > > * Pekka Enberg <[EMAIL PROTECTED]> wrote:
> > >
> > >>> Yes, this can happen. Are you saying it is not safe to be in the
> > >>> lockless path when an IRQ triggers?
> > >> Hmm. The barrier() in slab_free() looks fishy. The comment says it's
> > >> there to make sure we've retrieved c->freelist before c->page but then
> > >> it uses a _compiler barrier_ which doesn't affect the CPU and the
> > >> reads may still be re-ordered... Not sure if that matters here though.
> > >
> > > find a fix patch for that below - most systems affected seem to be SMP
> > > ones.
> > >
> > > If this (or my other patch) indeed solves the problem i'd still favor a
> > > full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it looks
> > > quite un-cooked and quite un-tested for multiple independent reasons.
> > >
> > > Sigh, why do i again have to be the messenger who brings the bad news to
> > > SLUB land, and again when poor Christoph went on vacation? :-/
> > >
> > > Ingo
> > >
> > > -->
> > > Subject: SLUB: barrier fix
> > > From: Ingo Molnar <[EMAIL PROTECTED]>
> > >
> > > ---
> > >  mm/slub.c |    2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > Index: linux/mm/slub.c
> > > ===
> > > --- linux.orig/mm/slub.c
> > > +++ linux/mm/slub.c
> > > @@ -1862,7 +1862,7 @@ static __always_inline void slab_free(st
> > >          debug_check_no_locks_freed(object, s->objsize);
> > >          do {
> > >                  freelist = c->freelist;
> > > -                barrier();
> > > +                smp_mb();
> > >                  /*
> > >                   * If the compiler would reorder the retrieval of c->page to
> > >                   * come before c->freelist then an interrupt could
> >
> > Torsten/Yamin, does this fix things for you? What about reverting commit
> > 1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c ("SLUB: Alternate fast paths
> > using cmpxchg_local")?
> I'm busy in another issue and will test it ASAP. Sorry.

I tested it on my 3 x86-64 machines. The small fix to use smp_mb to replace
barrier in slab_free doesn't work. Kernel still crashed at the same place.

I will test the reverting patch.

-yanmin
Re: Linux 2.6.25-rc2
On Tue, 2008-02-19 at 17:52 +0200, Pekka Enberg wrote:
> Ingo Molnar wrote:
> > * Pekka Enberg <[EMAIL PROTECTED]> wrote:
> >
> >>> Yes, this can happen. Are you saying it is not safe to be in the
> >>> lockless path when an IRQ triggers?
> >> Hmm. The barrier() in slab_free() looks fishy. The comment says it's
> >> there to make sure we've retrieved c->freelist before c->page but then
> >> it uses a _compiler barrier_ which doesn't affect the CPU and the
> >> reads may still be re-ordered... Not sure if that matters here though.
> >
> > find a fix patch for that below - most systems affected seem to be SMP
> > ones.
> >
> > If this (or my other patch) indeed solves the problem i'd still favor a
> > full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it looks
> > quite un-cooked and quite un-tested for multiple independent reasons.
> >
> > Sigh, why do i again have to be the messenger who brings the bad news to
> > SLUB land, and again when poor Christoph went on vacation? :-/
> >
> > Ingo
> >
> > -->
> > Subject: SLUB: barrier fix
> > From: Ingo Molnar <[EMAIL PROTECTED]>
> >
> > ---
> >  mm/slub.c |    2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > Index: linux/mm/slub.c
> > ===
> > --- linux.orig/mm/slub.c
> > +++ linux/mm/slub.c
> > @@ -1862,7 +1862,7 @@ static __always_inline void slab_free(st
> >          debug_check_no_locks_freed(object, s->objsize);
> >          do {
> >                  freelist = c->freelist;
> > -                barrier();
> > +                smp_mb();
> >                  /*
> >                   * If the compiler would reorder the retrieval of c->page to
> >                   * come before c->freelist then an interrupt could
>
> Torsten/Yamin, does this fix things for you? What about reverting commit
> 1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c ("SLUB: Alternate fast paths
> using cmpxchg_local")?

I'm busy in another issue and will test it ASAP. Sorry.

-yanmin
Re: Linux 2.6.25-rc2
* Pekka Enberg ([EMAIL PROTECTED]) wrote:
> Hi Mathieu,
>
> On Feb 19, 2008 4:02 PM, Mathieu Desnoyers <[EMAIL PROTECTED]> wrote:
> > - stat(c, ALLOC_FASTPATH); seems to be using a var++, therefore
> >   indicating it is not reentrant if IRQs are disabled. Since those are
> >   only stats, I guess it's ok, but still weird.
>
> What is not re-entrant?
>

Incrementing the variable with a "++" when interrupts are not disabled.
It's not an atomic add and it's racy. The code within stat() does
exactly this.

> On Feb 19, 2008 4:02 PM, Mathieu Desnoyers <[EMAIL PROTECTED]> wrote:
> > Since this shows mostly with network card drivers, I think the most
> > plausible cause would be an IRQ nesting over kmem_cache_alloc_node and
> > calling it.
>
> Yes, this can happen. Are you saying it is not safe to be in the
> lockless path when an IRQ triggers?

It should be safe, but I think Eric pointed out the correct problem in
his reply.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
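[Editorial illustration] The lost update Mathieu describes can be
replayed in plain userspace C. This is only a sketch under stated
assumptions: alloc_fastpath, irq_handler_sim() and
racy_increment_replay() are hypothetical names, not kernel code, and the
"interrupt" is an ordinary function call placed between the load and the
store. Whether a real "var++" is split into separate load and store
instructions depends on the compiler and the architecture.

```c
#include <assert.h>

/* Hypothetical per-cpu statistics counter, standing in for the field
 * that stat(c, ALLOC_FASTPATH) increments. */
static unsigned long alloc_fastpath;

/* What an interrupt handler that also allocates would do to the counter. */
static void irq_handler_sim(void)
{
        alloc_fastpath++;
}

/*
 * A non-atomic "var++" is logically a load, an add and a store. This
 * replays the case where an interrupt fires between the load and the
 * store. Two increments are attempted; the final value is returned.
 */
static unsigned long racy_increment_replay(void)
{
        unsigned long tmp;

        alloc_fastpath = 0;

        tmp = alloc_fastpath;           /* load: tmp is 0 */
        irq_handler_sim();              /* "interrupt" bumps the counter */
        assert(alloc_fastpath == 1);
        alloc_fastpath = tmp + 1;       /* store: writes 1 again */

        return alloc_fastpath;          /* one of the two updates is lost */
}
```

For a statistics counter the damage is limited to an undercount, which
matches Mathieu's "only stats, I guess it's ok" remark.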
Re: Linux 2.6.25-rc2
* Eric Dumazet ([EMAIL PROTECTED]) wrote:
> On Tue, 19 Feb 2008 09:02:30 -0500
> Mathieu Desnoyers <[EMAIL PROTECTED]> wrote:
>
> > * Pekka Enberg ([EMAIL PROTECTED]) wrote:
> > > On Feb 19, 2008 8:54 AM, Torsten Kaiser <[EMAIL PROTECTED]> wrote:
> > > > > > [ 5282.056415] [ cut here ]
> > > > > > [ 5282.059757] kernel BUG at lib/list_debug.c:33!
> > > > > > [ 5282.062055] invalid opcode: [1] SMP
> > > > > > [ 5282.062055] CPU 3
> > > > >
> > > > > hm. Your crashes do seem to span multiple subsystems, but it always
> > > > > seems to be around the SLUB code. Could you try the patch below? The
> > > > > SLUB code has a new optimization and i'm not 100% sure about it. [the
> > > > > hack below switches the SLUB optimization off by disabling the CPU
> > > > > feature it relies on.]
> > > > >
> > > > > Ingo
> > > > >
> > > > > ->
> > > > >  arch/x86/Kconfig |    4
> > > > >  1 file changed, 4 deletions(-)
> > > > >
> > > > > Index: linux/arch/x86/Kconfig
> > > > > ===
> > > > > --- linux.orig/arch/x86/Kconfig
> > > > > +++ linux/arch/x86/Kconfig
> > > > > @@ -59,10 +59,6 @@ config HAVE_LATENCYTOP_SUPPORT
> > > > >  config SEMAPHORE_SLEEPERS
> > > > >          def_bool y
> > > > >
> > > > > -config FAST_CMPXCHG_LOCAL
> > > > > -        bool
> > > > > -        default y
> > > > > -
> > > > >  config MMU
> > > > >          def_bool y
> > > > >
> > > >
> > > > $ grep FAST_CMPXCHG_LOCAL */.config
> > > > linux-2.6.24-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > > linux-2.6.24-rc3-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > > linux-2.6.24-rc3-mm2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > > linux-2.6.24-rc6-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > > linux-2.6.24-rc8-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > > linux-2.6.25-rc1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > > linux-2.6.25-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > > linux-2.6.25-rc2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > >
> > > > -rc2-mm1 still worked for me.
> > > >
> > > > Did you mean the new SLUB_FASTPATH?
> > > > $ grep "define SLUB_FASTPATH" */mm/slub.c
> > > > linux-2.6.25-rc1/mm/slub.c:#define SLUB_FASTPATH
> > > > linux-2.6.25-rc2-mm1/mm/slub.c:#define SLUB_FASTPATH
> > > > linux-2.6.25-rc2/mm/slub.c:#define SLUB_FASTPATH
> > > >
> > > > The 2.6.24-rc3+ mm-kernels did crash for me, but don't seem to contain
> > > > this...
> > > >
> > > > On the other hand:
> > > > From the crash in 2.6.25-rc2-mm1:
> > > > [59987.116182] RIP [] kmem_cache_alloc_node+0x6d/0xa0
> > > >
> > > > (gdb) list *0x8029f83d
> > > > 0x8029f83d is in kmem_cache_alloc_node (mm/slub.c:1646).
> > > > 1641                    if (unlikely(is_end(object) || !node_match(c, node))) {
> > > > 1642                            object = __slab_alloc(s, gfpflags, node, addr, c);
> > > > 1643                            break;
> > > > 1644                    }
> > > > 1645                    stat(c, ALLOC_FASTPATH);
> > > > 1646            } while (cmpxchg_local(&c->freelist, object, object[c->offset])
> > > > 1647                                                            != object);
> > > > 1648    #else
> > > > 1649            unsigned long flags;
> > > > 1650
> > > >
> > > > That code is part for SLUB_FASTPATH.
> > > >
> > > > I'm willing to test the patch, but don't know how fast I can find the
> > > > time to do it, so my answer if your patch helps might be delayed until
> > > > the weekend.
> > >
> > > Mathieu, Christoph is on vacation and I'm not at all that familiar
> > > with this cmpxchg_local() optimization, so if you could take a peek at
> > > this bug report to see if you can spot something obviously wrong with
> > > it, I would much appreciate that.
> >
> > Sure,
> >
> > I
Re: Linux 2.6.25-rc2
On Feb 19, 2008 5:20 PM, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> So:
>  - it might be something else entirely
>  - it might still be the local cmpxchg, just Torsten didn't happen to
>    notice it until later.

My new hackbench-testcase also killed 2.6.24-rc2-mm1, so I really
noticed too late.

>  - it might still be the local cmpxchg, but something else changed its
>    patterns to actually make it start triggering.
>
> and in general I don't think we should revert it unless we have stronger
> indications that it really is the problem (eg somebody finds the actual
> bug, or a reporter can confirm that it goes away when the local cmpxchg
> optimization is disabled).

I tried the following three patches:
switching the barrier() for a smp_mb() in 2.6.25-rc2-mm1:
-> crashed
reverting the FASTPATH-patch in 2.6.25-rc2:
-> worked
only removed FAST_CMPXCHG_LOCAL from arch/x86/Kconfig
-> worked

So all of these tests seem to confirm that the bug is in the new SLUB
fastpath.

Torsten
Re: Linux 2.6.25-rc2
* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> Earlier today i turned off local-cmpxchg and havent had a crash or
> hang since then - but at 200 bootups and 4-5 crashes in a week that's
> not conclusive yet. I think others might have workloads that trigger
> this bug more often.

i mean, today i've only done 200 randconfig bootups since i did the
cmpxchg SLUB revert, and given the statistics of the bug (thousands of
bootups and just 3 provable crashes) i cannot yet conclude that the bug
is truly gone.

Ingo
Re: Linux 2.6.25-rc2
* Linus Torvalds <[EMAIL PROTECTED]> wrote:

> So:
>  - it might be something else entirely
>  - it might still be the local cmpxchg, just Torsten didn't happen to
>    notice it until later.
>  - it might still be the local cmpxchg, but something else changed its
>    patterns to actually make it start triggering.
>
> and in general I don't think we should revert it unless we have
> stronger indications that it really is the problem (eg somebody finds
> the actual bug, or a reporter can confirm that it goes away when the
> local cmpxchg optimization is disabled).

yeah - my revert suggestions were all completely conditional on such
type of test feedback.

Btw., i did trigger occasional SLUB crashes myself starting at around
-rc1, on the order of one per 200-300 straight random bootups, and
yesterday i did a 50-bootups series of a specific .config that crashed,
to try to reproduce one of them but failed - so bisection was not an
option and i had nothing concrete and repeatable to report either.

I had a few complete lockups and only 3 usable backtraces - find them
below. Networking features in all of the backtraces - and so does the
VFS.

All of the crashes are on SMP - and given that 50% of the bootups are UP
this gives us a 1:8 chance hint that this bug is SMP specific. (All the
crashes are in distccd - that is what this build cluster does mainly so
it's the main activity of the box - so they dont necessarily indicate
anything workload specific.)

Earlier today i turned off local-cmpxchg and havent had a crash or hang
since then - but at 200 bootups and 4-5 crashes in a week that's not
conclusive yet. I think others might have workloads that trigger this
bug more often.

Ingo

>
mercury login: [ 582.671916] Oops: [#1] SMP DEBUG_PAGEALLOC
[ 582.672334]
[ 582.672334] Pid: 3776, comm: distccd Not tainted (2.6.25-rc2 #5)
[ 582.672334] EIP: 0060:[] EFLAGS: 00010246 CPU: 0
[ 582.672334] EIP is at kmem_cache_alloc+0x2a/0x90
[ 582.672334] EAX: EBX: 861c ECX: c069ed1c EDX: 01060002
[ 582.672334] ESI: c0aeffc8 EDI: c1d11714 EBP: f6eddcdc ESP: f6eddcc4
[ 582.672334] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 582.672334] Process distccd (pid: 3776, ti=f6edc000 task=f508c000 task.ti=f6edc000)
[ 582.672334] Stack: c06a3d48 f6eddce4 0020 861c 066c c0aeffc8 f6eddcf8 c069ed1c
[ 582.672334] 0020 861c f7ce6580 f7ce6580 f6eddd18 c045e7bb
[ 582.672334] f7f683e0 861c f52136c0 f7ce6580 f6eddd58 c0461de5 f508c000
[ 582.672334] Call Trace:
[ 582.672334] [] ? netif_receive_skb+0x2a8/0x320
[ 582.672334] [] ? __alloc_skb+0x2c/0x110
[ 582.672334] [] ? nv_alloc_rx_optimized+0x10b/0x1a0
[ 582.672334] [] ? nv_napi_poll+0x1b5/0x730
[ 582.672334] [] ? net_rx_action+0x16b/0x200
[ 582.672334] [] ? net_rx_action+0x88/0x200
[ 582.672334] [] ? __do_softirq+0x93/0x120
[ 582.672334] [] ? do_softirq+0x57/0x60
[ 582.672334] [] ? irq_exit+0x69/0x80
[ 582.672334] [] ? do_IRQ+0x45/0x80
[ 582.672334] [] ? d_instantiate+0x42/0x60
[ 582.672334] [] ? common_interrupt+0x28/0x30
[ 582.672334] [] ? d_instantiate+0x42/0x60
[ 582.672334] [] ? lock_release+0xc0/0x1b0
[ 582.672334] [] ? _spin_unlock+0x16/0x20
[ 582.672334] [] ? d_instantiate+0x42/0x60
[ 582.672334] [] ? ext3_add_nondir+0x34/0x50
[ 582.672334] [] ? ext3_create+0x9e/0xe0
[ 582.672334] [] ? vfs_create+0xb8/0x100
[ 582.672334] [] ? open_namei+0x4d0/0x5a0
[ 582.672334] [] ? in_group_p+0x26/0x30
[ 582.672334] [] ? ext3_permission+0x0/0x10
[ 582.672334] [] ? do_filp_open+0x31/0x50
[ 582.672334] [] ? _spin_unlock+0x1d/0x20
[ 582.672334] [] ? get_unused_fd_flags+0xbb/0xe0
[ 582.672334] [] ? do_sys_open+0x4d/0xf0
[ 582.672334] [] ? trace_hardirqs_on_thunk+0xc/0x10
[ 582.672334] [] ? trace_hardirqs_on_caller+0xbd/0x140
[ 582.672334] [] ? sys_open+0x1c/0x20
[ 582.672334] [] ? sysenter_past_esp+0x5f/0x99
[ 582.672334] ===
[ 582.672334] Code: c3 55 89 e5 57 56 89 c6 53 83 ec 0c 8b 4d 04 89 55 f0 64 a1 04 40 b7 c0 8b 7c 86 64 90 8d 74 26 00 8b 17 f6 c2 01 75 41 8b 47 0c <8b> 1c 82 89 d0 0f b1 1f 39 d0 89 c3 75 e8 66 83 7d f0 00 79 1f
[ 582.672334] EIP: [] kmem_cache_alloc+0x2a/0x90 SS:ESP 0068:f6eddcc4
[ 582.672343] Kernel panic - not syncing: Fatal exception in interrupt
[ 582.673337] Pid: 3776, comm: distccd Tainted: G D 2.6.25-rc2 #5
[ 582.674342] [] panic+0x46/0x120
[ 582.676335] [] die+0x134/0x150
[ 582.678335] [] do_page_fault+0x188/0x610
[ 582.680335] [] ? ip_local_deliver+0xf6/0x1c0
[ 582.682335] [] ? do_page_fault+0x0/0x610
[ 582.685334] [] error_code+0x72/0x80
[ 582.687334] [] ? __alloc_skb+0x2c/0x110
[ 582.689334] [] ? kmem_cache_alloc+0x2a/0x90
[ 582.691333] [] ? netif_receive_skb+0x2a8/0x320
[ 582.69] [] __alloc_skb+0x2c/0x110
[ 582.695333] [] nv_alloc_rx_optimized+0x10b/0x1a0
[ 582.697332] [] nv_napi_poll+0x1b5/0x730
[
Re: Linux 2.6.25-rc2
On Tue, 19 Feb 2008, Eric Dumazet wrote:
>
> cmpxchg_local(&c->freelist, object, object[c->offset]) can succeed,
> while an interrupt came (on this cpu), and several allocations were done,
> and one free was performed at the end of this interruption, so 'object'
> was recycled.

I think you may well be right. This looks like a good clue.

I'll do the revert. I wanted either a confirmation that reverting it
actually fixes something, _or_ an actual bug description, and this seems
to be a quite possible case of the latter.

Linus
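[Editorial illustration] The interleaving Eric describes is the classic
ABA problem, and it can be replayed deterministically in userspace C.
This is a minimal sketch under stated assumptions, not the SLUB code:
struct object, cmpxchg_sim(), pop(), push() and aba_corrupts_freelist()
are hypothetical names, cmpxchg_sim() only mimics the compare-and-store
semantics of cmpxchg_local(), and the "interrupt" is simply code run
between the two halves of the fast path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a slab object: the first word links to the
 * next free object, as on a freelist. */
struct object {
        struct object *next;
};

/* Single-threaded stand-in for cmpxchg_local(): store 'newval' only if
 * *ptr still equals 'old'; always return the value that was seen. */
static struct object *cmpxchg_sim(struct object **ptr, struct object *old,
                                  struct object *newval)
{
        struct object *seen = *ptr;

        if (seen == old)
                *ptr = newval;
        return seen;
}

/* Allocation as done with interrupts off: take the head of the list. */
static struct object *pop(struct object **freelist)
{
        struct object *obj = *freelist;

        assert(obj != NULL);
        *freelist = obj->next;
        return obj;
}

/* Free: put the object back at the head of the list. */
static void push(struct object **freelist, struct object *obj)
{
        obj->next = *freelist;
        *freelist = obj;
}

/*
 * Replays the interleaving from the message above. Returns 1 if the
 * freelist ends up pointing at an object that is still allocated.
 */
static int aba_corrupts_freelist(void)
{
        struct object a, b, c;
        struct object *freelist;
        struct object *object, *next, *irq_a, *irq_b;

        /* freelist: a -> b -> c */
        c.next = NULL;
        b.next = &c;
        a.next = &b;
        freelist = &a;

        /* Fast path, first half: sample the head and its successor. */
        object = freelist;              /* &a */
        next = object->next;            /* &b */

        /* "Interrupt": two allocations, then one free of the first. */
        irq_a = pop(&freelist);         /* takes a */
        irq_b = pop(&freelist);         /* takes b, which stays allocated */
        push(&freelist, irq_a);         /* a is recycled: list is a -> c */

        /* Fast path, second half: the head is &a again, so the compare
         * passes even though a's successor has changed. */
        if (cmpxchg_sim(&freelist, object, next) != object)
                return 0;               /* a retry; not reached here */

        /* The freelist head is now b, which the "interrupt" still owns. */
        return freelist == irq_b;
}
```

The compare only checks that the head pointer has the same value, not
that the list behind it is unchanged, which is why the stale successor
gets installed.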
Re: Linux 2.6.25-rc2
On Tue, 19 Feb 2008 09:02:30 -0500 Mathieu Desnoyers <[EMAIL PROTECTED]> wrote:

> * Pekka Enberg ([EMAIL PROTECTED]) wrote:
> > On Feb 19, 2008 8:54 AM, Torsten Kaiser <[EMAIL PROTECTED]> wrote:
> > > > > [ 5282.056415] [ cut here ]
> > > > > [ 5282.059757] kernel BUG at lib/list_debug.c:33!
> > > > > [ 5282.062055] invalid opcode: [1] SMP
> > > > > [ 5282.062055] CPU 3
> > > >
> > > > hm. Your crashes do seem to span multiple subsystems, but it always
> > > > seems to be around the SLUB code. Could you try the patch below? The
> > > > SLUB code has a new optimization and i'm not 100% sure about it. [the
> > > > hack below switches the SLUB optimization off by disabling the CPU
> > > > feature it relies on.]
> > > >
> > > > Ingo
> > > >
> > > > ->
> > > >  arch/x86/Kconfig |    4
> > > >  1 file changed, 4 deletions(-)
> > > >
> > > > Index: linux/arch/x86/Kconfig
> > > > ===
> > > > --- linux.orig/arch/x86/Kconfig
> > > > +++ linux/arch/x86/Kconfig
> > > > @@ -59,10 +59,6 @@ config HAVE_LATENCYTOP_SUPPORT
> > > >  config SEMAPHORE_SLEEPERS
> > > >          def_bool y
> > > >
> > > > -config FAST_CMPXCHG_LOCAL
> > > > -        bool
> > > > -        default y
> > > > -
> > > >  config MMU
> > > >          def_bool y
> > > >
> > >
> > > $ grep FAST_CMPXCHG_LOCAL */.config
> > > linux-2.6.24-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > linux-2.6.24-rc3-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > linux-2.6.24-rc3-mm2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > linux-2.6.24-rc6-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > linux-2.6.24-rc8-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > linux-2.6.25-rc1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > linux-2.6.25-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > linux-2.6.25-rc2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > >
> > > -rc2-mm1 still worked for me.
> > >
> > > Did you mean the new SLUB_FASTPATH?
> > > $ grep "define SLUB_FASTPATH" */mm/slub.c
> > > linux-2.6.25-rc1/mm/slub.c:#define SLUB_FASTPATH
> > > linux-2.6.25-rc2-mm1/mm/slub.c:#define SLUB_FASTPATH
> > > linux-2.6.25-rc2/mm/slub.c:#define SLUB_FASTPATH
> > >
> > > The 2.6.24-rc3+ mm-kernels did crash for me, but don't seem to contain
> > > this...
> > >
> > > On the other hand:
> > > From the crash in 2.6.25-rc2-mm1:
> > > [59987.116182] RIP [] kmem_cache_alloc_node+0x6d/0xa0
> > >
> > > (gdb) list *0x8029f83d
> > > 0x8029f83d is in kmem_cache_alloc_node (mm/slub.c:1646).
> > > 1641                    if (unlikely(is_end(object) || !node_match(c, node))) {
> > > 1642                            object = __slab_alloc(s, gfpflags, node, addr, c);
> > > 1643                            break;
> > > 1644                    }
> > > 1645                    stat(c, ALLOC_FASTPATH);
> > > 1646            } while (cmpxchg_local(&c->freelist, object, object[c->offset])
> > > 1647                                                            != object);
> > > 1648    #else
> > > 1649            unsigned long flags;
> > > 1650
> > >
> > > That code is part for SLUB_FASTPATH.
> > >
> > > I'm willing to test the patch, but don't know how fast I can find the
> > > time to do it, so my answer if your patch helps might be delayed until
> > > the weekend.
> >
> > Mathieu, Christoph is on vacation and I'm not at all that familiar
> > with this cmpxchg_local() optimization, so if you could take a peek at
> > this bug report to see if you can spot something obviously wrong with
> > it, I would much appreciate that.
>
> Sure,
>
> Initial thoughts :
>
> I'd like to get the complete config causing this bug. I suspect either :
>
> - A race between the lockless algo and an IRQ in a driver allocating
>   memory.
> - stat(c, ALLOC_FASTPATH); seems to be using a var++, therefore
>   indicating it is not reentrant if IRQs are disabled. Since those are
>   only stats, I guess it's ok, but still weird.
> - CPU hotplug problem.
>   http://bugzilla.kernel.org/attachment.cgi?id=14877=view shows
>   last sysf
Re: Linux 2.6.25-rc2
On Tue, 19 Feb 2008, Pekka Enberg wrote:
>
> Hmm. The barrier() in slab_free() looks fishy. The comment says it's
> there to make sure we've retrieved c->freelist before c->page but then
> it uses a _compiler barrier_ which doesn't affect the CPU and the
> reads may still be re-ordered... Not sure if that matters here though.

No, no. The comment says that it's purely there to serialize an
*interrupt*, and as such, a compiler-only barrier is sufficient (or the
comment is wrong).

Interrupts are "totally ordered" within a cpu (of course, in theory a CPU
might have speculative work etc reordering, but the CPU also guarantees
that interrupt acts _as_if_ it was exact), so a compiler barrier is
sufficient.

Of course, if we're talking about interrupts on another CPU, that's a
different issue, but the fact is, in that case it's not about interrupts
any more (might as well be other code just running normally on another
CPU), and a barrier doesn't help, it needs real locking.

So that barrier is fine per se. Of course, the whole code (and/or just
the comment!) may be buggered, but any CPU SMP-aware barriers shouldn't
be relevant.

What's much more likely to be an issue is simply the fact that since the
fastpath now accesses the per-cpu freelist without any locking, if there
is *any* sequence what-so-ever that does it from another CPU and assumes
the old locking behaviour, the list will be corrupted.

And from a quick look-through, I certainly cannot guarantee that isn't
the case. There's still a lot of cases that do direct assignments to
"c->freelist" without using a guaranteed atomic sequence. They *should*
be safe if it's guaranteed that
 (a) they always run with interrupts disabled
AND
 (b) 'c' is _always_ the "current CPU" list
but I can't quickly see that guarantee for either.

I'd happily just revert this thing, but it would be really good to have
confirmation that it seems to matter. But Torsten's partial bisection
seems to say that the quicklist thing went into -mm before the crash
even started.

So:
 - it might be something else entirely
 - it might still be the local cmpxchg, just Torsten didn't happen to
   notice it until later.
 - it might still be the local cmpxchg, but something else changed its
   patterns to actually make it start triggering.

and in general I don't think we should revert it unless we have stronger
indications that it really is the problem (eg somebody finds the actual
bug, or a reporter can confirm that it goes away when the local cmpxchg
optimization is disabled).

Linus
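[Editorial illustration] The distinction Linus draws between a
compiler-only barrier and a CPU barrier has a rough userspace analogue
in C11. This is a sketch under stated assumptions, not the kernel's
barrier() or smp_mb(): atomic_signal_fence() constrains only the
compiler and is meant for ordering against a signal handler on the same
thread (the closest userspace counterpart of an interrupt on the same
CPU), while atomic_thread_fence() also orders against other threads.
struct cpu_slab_sim, struct snapshot and the three functions are
hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the two per-cpu fields read in slab_free(). */
struct cpu_slab_sim {
        void *freelist;
        void *page;
};

struct snapshot {
        void *freelist;
        void *page;
};

/* Ordering needed against an "interrupt" on the same CPU: the compiler
 * must not move the page load above the freelist load. */
static struct snapshot read_ordered_vs_interrupt(const struct cpu_slab_sim *c)
{
        struct snapshot s;

        assert(c != NULL);
        s.freelist = c->freelist;
        atomic_signal_fence(memory_order_seq_cst);      /* compiler only */
        s.page = c->page;
        return s;
}

/* Ordering against another CPU: a full fence. On its own this still
 * does not make concurrent updates of the structure safe. */
static struct snapshot read_ordered_vs_other_cpu(const struct cpu_slab_sim *c)
{
        struct snapshot s;

        assert(c != NULL);
        s.freelist = c->freelist;
        atomic_thread_fence(memory_order_seq_cst);      /* compiler + CPU */
        s.page = c->page;
        return s;
}

/* Single-threaded check: both variants read the same two values. */
static int fences_agree_single_threaded(void)
{
        int obj, pg;
        struct cpu_slab_sim c = { &obj, &pg };
        struct snapshot a = read_ordered_vs_interrupt(&c);
        struct snapshot b = read_ordered_vs_other_cpu(&c);

        return a.freelist == (void *)&obj && a.page == (void *)&pg &&
               b.freelist == a.freelist && b.page == a.page;
}
```

In line with the message above, the stronger fence only changes what
other CPUs may observe; if another CPU can modify the same list, a fence
is no substitute for locking.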
Re: Linux 2.6.25-rc2
Ingo Molnar wrote:
> * Ingo Molnar <[EMAIL PROTECTED]> wrote:
>
>> If this (or my other patch) indeed solves the problem i'd still favor
>> a full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it
>> looks quite un-cooked and quite un-tested for multiple independent
>> reasons.
>>
>> Sigh, why do i again have to be the messenger who brings the bad news
>> to SLUB land, and again when poor Christoph went on vacation? :-/
>
> the revert patch is below. (manually done due to other changes since
> 1f84260c8ce3b1ce26d4 was committed, but trivial)

I am ok with this if someone can actually confirm it fixes things.

Pekka
Re: Linux 2.6.25-rc2
Ingo Molnar wrote:
> * Pekka Enberg <[EMAIL PROTECTED]> wrote:
>
>>> Yes, this can happen. Are you saying it is not safe to be in the
>>> lockless path when an IRQ triggers?
>>
>> Hmm. The barrier() in slab_free() looks fishy. The comment says it's
>> there to make sure we've retrieved c->freelist before c->page but then
>> it uses a _compiler barrier_ which doesn't affect the CPU and the
>> reads may still be re-ordered... Not sure if that matters here though.
>
> find a fix patch for that below - most systems affected seem to be SMP
> ones.
>
> If this (or my other patch) indeed solves the problem i'd still favor
> a full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it
> looks quite un-cooked and quite un-tested for multiple independent
> reasons.
>
> Sigh, why do i again have to be the messenger who brings the bad news
> to SLUB land, and again when poor Christoph went on vacation? :-/
>
> 	Ingo
>
> -->
> Subject: SLUB: barrier fix
> From: Ingo Molnar <[EMAIL PROTECTED]>
>
> ---
>  mm/slub.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> Index: linux/mm/slub.c
> ===================================================================
> --- linux.orig/mm/slub.c
> +++ linux/mm/slub.c
> @@ -1862,7 +1862,7 @@ static __always_inline void slab_free(st
>  	debug_check_no_locks_freed(object, s->objsize);
>  	do {
>  		freelist = c->freelist;
> -		barrier();
> +		smp_mb();
>  		/*
>  		 * If the compiler would reorder the retrieval of c->page to
>  		 * come before c->freelist then an interrupt could

Torsten/Yamin, does this fix things for you? What about reverting commit 1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c ("SLUB: Alternate fast paths using cmpxchg_local")?

			Pekka
Re: Linux 2.6.25-rc2
* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> If this (or my other patch) indeed solves the problem i'd still favor
> a full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it
> looks quite un-cooked and quite un-tested for multiple independent
> reasons.
>
> Sigh, why do i again have to be the messenger who brings the bad news
> to SLUB land, and again when poor Christoph went on vacation? :-/

the revert patch is below. (manually done due to other changes since 1f84260c8ce3b1ce26d4 was committed, but trivial)

	Ingo

->
Subject: slub: fastpath optimization revert
From: Ingo Molnar <[EMAIL PROTECTED]>
Date: Tue Feb 19 15:46:37 CET 2008

revert:

  commit 1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c
  Author: Christoph Lameter <[EMAIL PROTECTED]>
  Date:   Mon Jan 7 23:20:30 2008 -0800

      SLUB: Alternate fast paths using cmpxchg_local

it was causing problems (crashes) and was incomplete.

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 mm/slub.c |   87 --
 1 file changed, 87 deletions(-)

Index: linux-x86.q/mm/slub.c
===================================================================
--- linux-x86.q.orig/mm/slub.c
+++ linux-x86.q/mm/slub.c
@@ -149,13 +149,6 @@ static inline void ClearSlabDebug(struct
 /* Enable to test recovery from slab corruption on boot */
 #undef SLUB_RESILIENCY_TEST
 
-/*
- * Currently fastpath is not supported if preemption is enabled.
- */
-#if defined(CONFIG_FAST_CMPXCHG_LOCAL) && !defined(CONFIG_PREEMPT)
-#define SLUB_FASTPATH
-#endif
-
 #if PAGE_SHIFT <= 12
 
 /*
@@ -1514,11 +1507,6 @@ static void *__slab_alloc(struct kmem_ca
 {
 	void **object;
 	struct page *new;
-#ifdef SLUB_FASTPATH
-	unsigned long flags;
-
-	local_irq_save(flags);
-#endif
 
 	if (!c->page)
 		goto new_slab;
@@ -1541,9 +1529,6 @@ load_freelist:
 unlock_out:
 	slab_unlock(c->page);
 	stat(c, ALLOC_SLOWPATH);
-#ifdef SLUB_FASTPATH
-	local_irq_restore(flags);
-#endif
 	return object;
 
 another_slab:
@@ -1575,9 +1560,6 @@ new_slab:
 		c->page = new;
 		goto load_freelist;
 	}
-#ifdef SLUB_FASTPATH
-	local_irq_restore(flags);
-#endif
 	/*
 	 * No memory available.
 	 *
@@ -1619,34 +1601,6 @@ static __always_inline void *slab_alloc(
 {
 	void **object;
 	struct kmem_cache_cpu *c;
-
-/*
- * The SLUB_FASTPATH path is provisional and is currently disabled if the
- * kernel is compiled with preemption or if the arch does not support
- * fast cmpxchg operations. There are a couple of coming changes that will
- * simplify matters and allow preemption. Ultimately we may end up making
- * SLUB_FASTPATH the default.
- *
- * 1. The introduction of the per cpu allocator will avoid array lookups
- *    through get_cpu_slab(). A special register can be used instead.
- *
- * 2. The introduction of per cpu atomic operations (cpu_ops) means that
- *    we can realize the logic here entirely with per cpu atomics. The
- *    per cpu atomic ops will take care of the preemption issues.
- */
-
-#ifdef SLUB_FASTPATH
-	c = get_cpu_slab(s, raw_smp_processor_id());
-	do {
-		object = c->freelist;
-		if (unlikely(is_end(object) || !node_match(c, node))) {
-			object = __slab_alloc(s, gfpflags, node, addr, c);
-			break;
-		}
-		stat(c, ALLOC_FASTPATH);
-	} while (cmpxchg_local(&c->freelist, object, object[c->offset])
-								!= object);
-#else
 	unsigned long flags;
 
 	local_irq_save(flags);
@@ -1661,7 +1615,6 @@ static __always_inline void *slab_alloc(
 		stat(c, ALLOC_FASTPATH);
 	}
 	local_irq_restore(flags);
-#endif
 
 	if (unlikely((gfpflags & __GFP_ZERO) && object))
 		memset(object, 0, c->objsize);
@@ -1698,11 +1651,6 @@ static void __slab_free(struct kmem_cach
 	void **object = (void *)x;
 	struct kmem_cache_cpu *c;
 
-#ifdef SLUB_FASTPATH
-	unsigned long flags;
-
-	local_irq_save(flags);
-#endif
 	c = get_cpu_slab(s, raw_smp_processor_id());
 	stat(c, FREE_SLOWPATH);
 	slab_lock(page);
@@ -1734,9 +1682,6 @@ checks_ok:
 
 out_unlock:
 	slab_unlock(page);
-#ifdef SLUB_FASTPATH
-	local_irq_restore(flags);
-#endif
 	return;
 
 slab_empty:
@@ -1749,9 +1694,6 @@ slab_empty:
 	}
 	slab_unlock(page);
 	stat(c, FREE_SLAB);
-#ifdef SLUB_FASTPATH
-	local_irq_restore(flags);
-#endif
 	discard_slab(s, page);
 	return;
 
@@ -1777,34 +1719,6 @@ static __always_inline void slab_free(st
 {
 	void **object = (void *)x;
 	struct kmem_cache_cpu *c;
-
-#ifdef SLUB_FASTPATH
-	void **freelist;
-
-	c = get_cpu_slab(s, raw_smp_processor_id());
-	debug_check_no_locks_freed(object,
Re: Linux 2.6.25-rc2
* Pekka Enberg <[EMAIL PROTECTED]> wrote:

> > Yes, this can happen. Are you saying it is not safe to be in the
> > lockless path when an IRQ triggers?
>
> Hmm. The barrier() in slab_free() looks fishy. The comment says it's
> there to make sure we've retrieved c->freelist before c->page but then
> it uses a _compiler barrier_ which doesn't affect the CPU and the
> reads may still be re-ordered... Not sure if that matters here though.

find a fix patch for that below - most systems affected seem to be SMP ones.

If this (or my other patch) indeed solves the problem i'd still favor a full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it looks quite un-cooked and quite un-tested for multiple independent reasons.

Sigh, why do i again have to be the messenger who brings the bad news to SLUB land, and again when poor Christoph went on vacation? :-/

	Ingo

-->
Subject: SLUB: barrier fix
From: Ingo Molnar <[EMAIL PROTECTED]>

---
 mm/slub.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/mm/slub.c
===================================================================
--- linux.orig/mm/slub.c
+++ linux/mm/slub.c
@@ -1862,7 +1862,7 @@ static __always_inline void slab_free(st
 	debug_check_no_locks_freed(object, s->objsize);
 	do {
 		freelist = c->freelist;
-		barrier();
+		smp_mb();
 		/*
 		 * If the compiler would reorder the retrieval of c->page to
 		 * come before c->freelist then an interrupt could
Re: Linux 2.6.25-rc2
Hi Mathieu,

On Feb 19, 2008 4:02 PM, Mathieu Desnoyers <[EMAIL PROTECTED]> wrote:
> > Since this shows mostly with network card drivers, I think the most
> > plausible cause would be an IRQ nesting over kmem_cache_alloc_node and
> > calling it.

On Feb 19, 2008 4:21 PM, Pekka Enberg <[EMAIL PROTECTED]> wrote:
> Yes, this can happen. Are you saying it is not safe to be in the
> lockless path when an IRQ triggers?

Hmm. The barrier() in slab_free() looks fishy. The comment says it's there to make sure we've retrieved c->freelist before c->page but then it uses a _compiler barrier_ which doesn't affect the CPU and the reads may still be re-ordered... Not sure if that matters here though.
Re: Linux 2.6.25-rc2
Hi Mathieu,

On Feb 19, 2008 4:02 PM, Mathieu Desnoyers <[EMAIL PROTECTED]> wrote:
> - stat(c, ALLOC_FASTPATH); seems to be using a var++, therefore
>   indicating it is not reentrant if IRQs are disabled. Since those are
>   only stats, I guess it's ok, but still weird.

What is not re-entrant?

On Feb 19, 2008 4:02 PM, Mathieu Desnoyers <[EMAIL PROTECTED]> wrote:
> Since this shows mostly with network card drivers, I think the most
> plausible cause would be an IRQ nesting over kmem_cache_alloc_node and
> calling it.

Yes, this can happen. Are you saying it is not safe to be in the lockless path when an IRQ triggers?
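One possible reading of the var++ concern quoted above, sketched in plain C: a non-atomic increment is a load followed by a store, so an interrupt that nests between the two and bumps the same counter has its update overwritten. The names below (counter, irq_increment, increment_with_nested_irq) are hypothetical and only model the effect; they are not the SLUB stat() code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-CPU statistics counter. */
static unsigned long counter;

/* What an interrupt handler doing its own counter++ would do. */
static void irq_increment(void)
{
	counter++;
}

/* Models counter++ split into its load and store halves, with an
 * optional "interrupt" delivered in between. */
static void increment_with_nested_irq(void (*irq)(void))
{
	unsigned long tmp = counter;	/* load */

	if (irq != NULL)
		irq();			/* nested interrupt runs here */
	counter = tmp + 1;		/* store overwrites the irq's update */
}
```

With a nested interrupt, two increments are attempted but the counter only advances by one, which is harmless for statistics and would be a real bug for anything else.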
Re: Linux 2.6.25-rc2
* Pekka Enberg ([EMAIL PROTECTED]) wrote:
> On Feb 19, 2008 8:54 AM, Torsten Kaiser <[EMAIL PROTECTED]> wrote:
> > > > [ 5282.056415] ------------[ cut here ]------------
> > > > [ 5282.059757] kernel BUG at lib/list_debug.c:33!
> > > > [ 5282.062055] invalid opcode: [1] SMP
> > > > [ 5282.062055] CPU 3
> > >
> > > hm. Your crashes do seem to span multiple subsystems, but it always
> > > seems to be around the SLUB code. Could you try the patch below? The
> > > SLUB code has a new optimization and i'm not 100% sure about it. [the
> > > hack below switches the SLUB optimization off by disabling the CPU
> > > feature it relies on.]
> > >
> > >         Ingo
> > >
> > > ->
> > >  arch/x86/Kconfig |    4 ----
> > >  1 file changed, 4 deletions(-)
> > >
> > > Index: linux/arch/x86/Kconfig
> > > ===================================================================
> > > --- linux.orig/arch/x86/Kconfig
> > > +++ linux/arch/x86/Kconfig
> > > @@ -59,10 +59,6 @@ config HAVE_LATENCYTOP_SUPPORT
> > >  config SEMAPHORE_SLEEPERS
> > >         def_bool y
> > >
> > > -config FAST_CMPXCHG_LOCAL
> > > -       bool
> > > -       default y
> > > -
> > >  config MMU
> > >         def_bool y
> > >
> >
> > $ grep FAST_CMPXCHG_LOCAL */.config
> > linux-2.6.24-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > linux-2.6.24-rc3-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > linux-2.6.24-rc3-mm2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > linux-2.6.24-rc6-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > linux-2.6.24-rc8-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > linux-2.6.25-rc1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > linux-2.6.25-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > linux-2.6.25-rc2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> >
> > -rc2-mm1 still worked for me.
> >
> > Did you mean the new SLUB_FASTPATH?
> > $ grep "define SLUB_FASTPATH" */mm/slub.c
> > linux-2.6.25-rc1/mm/slub.c:#define SLUB_FASTPATH
> > linux-2.6.25-rc2-mm1/mm/slub.c:#define SLUB_FASTPATH
> > linux-2.6.25-rc2/mm/slub.c:#define SLUB_FASTPATH
> >
> > The 2.6.24-rc3+ mm-kernels did crash for me, but don't seem to contain
> > this...
> >
> > On the other hand:
> > From the crash in 2.6.25-rc2-mm1:
> > [59987.116182] RIP [] kmem_cache_alloc_node+0x6d/0xa0
> >
> > (gdb) list *0x8029f83d
> > 0x8029f83d is in kmem_cache_alloc_node (mm/slub.c:1646).
> > 1641                    if (unlikely(is_end(object) || !node_match(c, node))) {
> > 1642                            object = __slab_alloc(s, gfpflags, node, addr, c);
> > 1643                            break;
> > 1644                    }
> > 1645                    stat(c, ALLOC_FASTPATH);
> > 1646            } while (cmpxchg_local(&c->freelist, object, object[c->offset])
> > 1647                                                                    != object);
> > 1648    #else
> > 1649            unsigned long flags;
> > 1650
> >
> > That code is part for SLUB_FASTPATH.
> >
> > I'm willing to test the patch, but don't know how fast I can find the
> > time to do it, so my answer if your patch helps might be delayed until
> > the weekend.
>
> Mathieu, Christoph is on vacation and I'm not at all that familiar
> with this cmpxchg_local() optimization, so if you could take a peek at
> this bug report to see if you can spot something obviously wrong with
> it, I would much appreciate that.

Sure,

Initial thoughts :

I'd like to get the complete config causing this bug. I suspect either :

- A race between the lockless algo and an IRQ in a driver allocating memory.
- stat(c, ALLOC_FASTPATH); seems to be using a var++, therefore indicating it is not reentrant if IRQs are disabled. Since those are only stats, I guess it's ok, but still weird.
- CPU hotplug problem.
  http://bugzilla.kernel.org/attachment.cgi?id=14877&action=view shows
  last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
  -- is this linked to a cpu up/down event ?

Since this shows mostly with network card drivers, I think the most plausible cause would be an IRQ nesting over kmem_cache_alloc_node and calling it.

Will dig further...

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
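The lockless fast path under suspicion in this thread is a compare-and-exchange loop on the per-CPU freelist. A rough user-space sketch of that pattern follows, with loud assumptions: C11 atomics stand in for cmpxchg_local() (which in the kernel is a CPU-local, non-LOCK-prefixed operation), a NULL pointer stands in for is_end(), and struct object, struct cpu_slab, fast_alloc() and fast_free() are hypothetical names. It ignores node matching, the slow path, and ABA hazards.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* A free object stores the pointer to the next free object inside itself. */
struct object {
	struct object *next;
};

/* Hypothetical stand-in for the per-CPU slab structure. */
struct cpu_slab {
	_Atomic(struct object *) freelist;
};

/* Pop the head of the freelist. If something nested in between (an
 * interrupt, in the kernel case) and changed the head, the exchange
 * fails, obj is refreshed with the current head, and we retry. */
static struct object *fast_alloc(struct cpu_slab *c)
{
	struct object *obj;

	assert(c != NULL);
	obj = atomic_load(&c->freelist);
	while (obj != NULL &&
	       !atomic_compare_exchange_weak(&c->freelist, &obj, obj->next))
		;
	return obj;
}

/* Push an object back onto the freelist with the same retry scheme. */
static void fast_free(struct cpu_slab *c, struct object *obj)
{
	struct object *head;

	assert(c != NULL && obj != NULL);
	head = atomic_load(&c->freelist);
	do {
		obj->next = head;
	} while (!atomic_compare_exchange_weak(&c->freelist, &head, obj));
}
```

The retry loop only protects the single head-pointer update. Any other code that assigns the freelist directly, or that runs against another CPU's structure, falls outside what this pattern guarantees.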
Re: Linux 2.6.25-rc2
* Mathieu Desnoyers <[EMAIL PROTECTED]> wrote:

> Ingo, a comment in slub.c explains it :
>
> /*
>  * The SLUB_FASTPATH path is provisional and is currently disabled if the
>  * kernel is compiled with preemption or if the arch does not support
>  * fast cmpxchg operations. There are a couple of coming changes that will
>  * simplify matters and allow preemption. Ultimately we may end up making
>  * SLUB_FASTPATH the default.

well the feature is not complete and there are no reasons given _why_ it's not complete ... and even if there's a reason it should have been deferred to the next merge window. We still have 10 year old "this is a temporary hack" comments in the kernel ;-)

"hardware does not support it" is a valid argument, "kernel developer had no time to implement it properly" is not ;-)

	Ingo
Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc
Jens Axboe wrote:
> On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
>> On Sun, 17 Feb 2008 20:29:13 +0100
>> Jens Axboe <[EMAIL PROTECTED]> wrote:
>>
>>> It's odd stuff. Could you perhaps try and add some printks to
>>> block/cfq-iosched.c:call_for_each_cic(), like dumping the 'nr' return
>>> from radix_tree_gang_lookup() and the pointer value of cics[i] in the
>>> for() loop after the lookup?
>>>
>> I met the same issue on ia64/NUMA box.
>> seems cisc[]->key is NULL and index for radix_tree_gang_lookup() was
>> always '1'.
>
> Why does it keep repeating then? If ->key is NULL, the next lookup index
> should be 1UL.
>
> But I think the radix 'scan over entire tree' is a bit fragile. This
> patch adds a parallel hlist for ease of properly browsing the members,
> does that work for you? It compiles, but I haven't booted it here yet...
>
>> Attached patch works well for me, but I don't know much about cfq.
>> please confirm.
>
> It doesn't make a lot of sense, I'm afraid.
>
> block/blk-ioc.c | 35 +++
> block/cfq-iosched.c | 37 +++--
> include/linux/iocontext.h |2 ++
> 3 files changed, 28 insertions(+), 46 deletions(-)
>
> diff --git a/block/blk-ioc.c b/block/blk-ioc.c
> index 80245dc..73c7002 100644
> --- a/block/blk-ioc.c

Hi Jens,

Thanks for the patch. The patch works fine, machine boots up without the kernel panic.

-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
Re: Linux 2.6.25-rc2
* Ingo Molnar ([EMAIL PROTECTED]) wrote:
>
> * Pekka Enberg <[EMAIL PROTECTED]> wrote:
>
> > Mathieu, Christoph is on vacation and I'm not at all that familiar
> > with this cmpxchg_local() optimization, so if you could take a peek at
> > this bug report to see if you can spot something obviously wrong with
> > it, I would much appreciate that.
>
> hm, it's bad for at least one other reason as well (which is probably
> unrelated to this crash):
>
> /*
>  * Currently fastpath is not supported if preemption is enabled.
>  */
> #if defined(CONFIG_FAST_CMPXCHG_LOCAL) && !defined(CONFIG_PREEMPT)
> #define SLUB_FASTPATH
> #endif
>
> such !PREEMPT exceptions tend to show "i didnt want to think too hard
> about the preemptible case so just turn it off" thinking.
>

Ingo, a comment in slub.c explains it :

/*
 * The SLUB_FASTPATH path is provisional and is currently disabled if the
 * kernel is compiled with preemption or if the arch does not support
 * fast cmpxchg operations. There are a couple of coming changes that will
 * simplify matters and allow preemption. Ultimately we may end up making
 * SLUB_FASTPATH the default.
 *
 * 1. The introduction of the per cpu allocator will avoid array lookups
 *    through get_cpu_slab(). A special register can be used instead.
 *
 * 2. The introduction of per cpu atomic operations (cpu_ops) means that
 *    we can realize the logic here entirely with per cpu atomics. The
 *    per cpu atomic ops will take care of the preemption issues.
 */

So there is more coming in the preemption area.

> Also, why isnt this "SLUB_FASTPATH" flag done in the Kconfig space?
>

Eventually, I think only CONFIG_FAST_CMPXCHG_LOCAL will be needed (when the code will support preemption). Therefore, this SLUB_FASTPATH define seems to be only here temporarily.

I'm looking at the code right now.. more to come.

Mathieu

> 	Ingo

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
Re: Linux 2.6.25-rc2
Hi,

Pekka Enberg <[EMAIL PROTECTED]> wrote:
> > Mathieu, Christoph is on vacation and I'm not at all that familiar
> > with this cmpxchg_local() optimization, so if you could take a peek at
> > this bug report to see if you can spot something obviously wrong with
> > it, I would much appreciate that.

On Feb 19, 2008 12:27 PM, Ingo Molnar <[EMAIL PROTECTED]> wrote:
> hm, it's bad for at least one other reason as well (which is probably
> unrelated to this crash):
>
> /*
>  * Currently fastpath is not supported if preemption is enabled.
>  */
> #if defined(CONFIG_FAST_CMPXCHG_LOCAL) && !defined(CONFIG_PREEMPT)
> #define SLUB_FASTPATH
> #endif
>
> such !PREEMPT exceptions tend to show "i didnt want to think too hard
> about the preemptible case so just turn it off" thinking.
>
> Also, why isnt this "SLUB_FASTPATH" flag done in the Kconfig space?

Hmm, no idea. I think there might have been some mix-up with merging the patch. The one I saw was:

http://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24/2.6.24-mm1/broken-out/slub-optional-fast-path-using-cmpxchg_local.patch

But I don't remember giving out a Reviewed-by for it (and my mailbox confirms that). Furthermore, somehow it turned into this when merged:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c

In any case, if Torsten/someone can verify that reverting 1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c ("SLUB: Alternate fast paths using cmpxchg_local") fixes these problems, I think we should just do it and let Christoph sort it out when he gets back.

			Pekka
Re: Linux 2.6.25-rc2
* Pekka Enberg <[EMAIL PROTECTED]> wrote:

> Mathieu, Christoph is on vacation and I'm not at all that familiar
> with this cmpxchg_local() optimization, so if you could take a peek at
> this bug report to see if you can spot something obviously wrong with
> it, I would much appreciate that.

hm, it's bad for at least one other reason as well (which is probably unrelated to this crash):

/*
 * Currently fastpath is not supported if preemption is enabled.
 */
#if defined(CONFIG_FAST_CMPXCHG_LOCAL) && !defined(CONFIG_PREEMPT)
#define SLUB_FASTPATH
#endif

such !PREEMPT exceptions tend to show "i didnt want to think too hard about the preemptible case so just turn it off" thinking.

Also, why isnt this "SLUB_FASTPATH" flag done in the Kconfig space?

	Ingo
Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc
On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
> On Tue, 19 Feb 2008 09:58:38 +0100
> Jens Axboe <[EMAIL PROTECTED]> wrote:
> > > when I inserted printk here
> > > ==
> > > 	for (i = 0; i < nr; i++)
> > > 		func(ioc, cics[i]);
> > > 	printk("%d %lx\n", nr, index);
> > > ==
> > > index was always "1" and nr was always 32.
> > >
> > > So, cics[31]->key was always NULL when index=1 is passed to
> > > radix_tree_gang_lookup().
> >
> > Hang on, it returned 32? It should not return more than 16, since that
> > is what we have room for and asked for.
> sorry. Of course, it was 16 ;(

I expected so, otherwise we would have had far more serious problems :-)

> your patch works well. thank you.

It's committed now and posted in the relevant bugzilla as well (#9948).

-- 
Jens Axboe
Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc
On Tue, 19 Feb 2008 09:58:38 +0100
Jens Axboe <[EMAIL PROTECTED]> wrote:
> > when I inserted printk here
> > ==
> > 	for (i = 0; i < nr; i++)
> > 		func(ioc, cics[i]);
> > 	printk("%d %lx\n", nr, index);
> > ==
> > index was always "1" and nr was always 32.
> >
> > So, cics[31]->key was always NULL when index=1 is passed to
> > radix_tree_gang_lookup().
>
> Hang on, it returned 32? It should not return more than 16, since that
> is what we have room for and asked for.
sorry. Of course, it was 16 ;(

your patch works well. thank you.

-Kame
Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc
On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
> On Tue, 19 Feb 2008 09:36:34 +0100
> Jens Axboe <[EMAIL PROTECTED]> wrote:
>
> > On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
> > > On Sun, 17 Feb 2008 20:29:13 +0100
> > > Jens Axboe <[EMAIL PROTECTED]> wrote:
> > >
> > > > It's odd stuff. Could you perhaps try and add some printks to
> > > > block/cfq-iosched.c:call_for_each_cic(), like dumping the 'nr' return
> > > > from radix_tree_gang_lookup() and the pointer value of cics[i] in the
> > > > for() loop after the lookup?
> > > >
> > > I met the same issue on ia64/NUMA box.
> > > seems cisc[]->key is NULL and index for radix_tree_gang_lookup() was
> > > always '1'.
> >
> > Why does it keep repeating then? If ->key is NULL, the next lookup index
> > should be 1UL.
> >
> > But I think the radix 'scan over entire tree' is a bit fragile. This
> > patch adds a parallel hlist for ease of properly browsing the members,
> > does that work for you? It compiles, but I haven't booted it here yet...
> >
> Works well for me and my box booted !

Super, I'll get it upstream. Thanks for testing and debugging!

-- 
Jens Axboe
Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc
On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
> On Tue, 19 Feb 2008 09:36:34 +0100
> Jens Axboe <[EMAIL PROTECTED]> wrote:
>
> > On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
> > > On Sun, 17 Feb 2008 20:29:13 +0100
> > > Jens Axboe <[EMAIL PROTECTED]> wrote:
> > >
> > > > It's odd stuff. Could you perhaps try and add some printks to
> > > > block/cfq-iosched.c:call_for_each_cic(), like dumping the 'nr' return
> > > > from radix_tree_gang_lookup() and the pointer value of cics[i] in the
> > > > for() loop after the lookup?
> > > >
> > > I met the same issue on ia64/NUMA box.
> > > seems cisc[]->key is NULL and index for radix_tree_gang_lookup() was
> > > always '1'.
> >
> > Why does it keep repeating then? If ->key is NULL, the next lookup index
> > should be 1UL.
> >
> when I inserted printk here
> ==
> 	for (i = 0; i < nr; i++)
> 		func(ioc, cics[i]);
> 	printk("%d %lx\n", nr, index);
> ==
> index was always "1" and nr was always 32.
>
> So, cics[31]->key was always NULL when index=1 is passed to
> radix_tree_gang_lookup().

Hang on, it returned 32? It should not return more than 16, since that is what we have room for and asked for.

Using ->dead_key when ->key is NULL is correct btw, since that is the correct location in the tree once the process has exited. But that should not happen until AFTER the func() call, so I still think the list patch is safer.

> > But I think the radix 'scan over entire tree' is a bit fragile. This
> > patch adds a parallel hlist for ease of properly browsing the
> > members, does that work for you? It compiles, but I haven't booted
> > it here yet...
> >
> will try. please wait a bit.

It boots here, so at least it passes normal sanity tests. It should solve your problem as well, hopefully.

-- 
Jens Axboe
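The repeating lookup described in this exchange can be modelled outside the kernel. The sketch below is an assumption-laden toy: gang_lookup() is a linear stand-in for radix_tree_gang_lookup(), struct cic is a hypothetical reduction of cfq_io_context to three fields, and scan() imitates a batched walk that derives the next start index from the last entry's key. It only illustrates why a cleared ->key on the last entry of a full batch restarts the scan at index 1 forever, and why falling back to ->dead_key lets it advance.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH 16	/* entries requested per lookup, as discussed above */

/* Hypothetical reduction of cfq_io_context: slot is the position in the
 * tree; key is cleared to 0 when the process exits, and the old value
 * is kept in dead_key. */
struct cic {
	unsigned long slot;
	unsigned long key;
	unsigned long dead_key;
};

/* Toy gang lookup: tree[] is sorted by slot; return up to max entries
 * whose slot is >= first. */
static size_t gang_lookup(struct cic *tree, size_t n, unsigned long first,
			  struct cic **out, size_t max)
{
	size_t found = 0;

	for (size_t i = 0; i < n && found < max; i++)
		if (tree[i].slot >= first)
			out[found++] = &tree[i];
	return found;
}

/* Walk the whole tree in batches. Returns the number of lookups made,
 * or -1 if more than limit lookups were needed (the scan is stuck). */
static long scan(struct cic *tree, size_t n, int use_dead_key, long limit)
{
	struct cic *batch[BATCH];
	unsigned long index = 0;
	long passes = 0;
	size_t nr;

	assert(tree != NULL);
	do {
		if (++passes > limit)
			return -1;
		nr = gang_lookup(tree, n, index, batch, BATCH);
		if (nr == 0)
			break;
		unsigned long k = batch[nr - 1]->key;
		if (k == 0 && use_dead_key)
			k = batch[nr - 1]->dead_key;
		index = k + 1;	/* with k == 0 this is always 1 */
	} while (nr == BATCH);
	return passes;
}
```

With 40 entries and the 16th one exited, the plain-key variant never gets past the first batch, matching the "index was always 1, nr was always 16" observation, while the dead_key variant finishes in three lookups. A separate list of members, as in the patch discussed here, avoids deriving the iteration position from a field that can change underneath the walk.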
Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc
On Tue, 19 Feb 2008 09:36:34 +0100
Jens Axboe <[EMAIL PROTECTED]> wrote:

> On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
> > On Sun, 17 Feb 2008 20:29:13 +0100
> > Jens Axboe <[EMAIL PROTECTED]> wrote:
> >
> > > It's odd stuff. Could you perhaps try and add some printks to
> > > block/cfq-iosched.c:call_for_each_cic(), like dumping the 'nr' return
> > > from radix_tree_gang_lookup() and the pointer value of cics[i] in the
> > > for() loop after the lookup?
> > >
> > I met the same issue on ia64/NUMA box.
> > seems cisc[]->key is NULL and index for radix_tree_gang_lookup() was
> > always '1'.
>
> Why does it keep repeating then? If ->key is NULL, the next lookup index
> should be 1UL.
>
> But I think the radix 'scan over entire tree' is a bit fragile. This
> patch adds a parallel hlist for ease of properly browsing the members,
> does that work for you? It compiles, but I haven't booted it here yet...
>
Works well for me and my box booted !

Thanks,
-Kame
Re: Linux 2.6.25-rc2 regression: LVM cannot find volume group
[added CCs from the other thread on this topic]

Alasdair G Kergon wrote:
> On Sat, Feb 16, 2008 at 11:37:37PM +0100, Jiri Slaby wrote:
>> # CONFIG_SYSFS_DEPRECATED is not set
>
> IMHO That should be *set* by default until everyone has had time to
> update their userspace software to cope with the changed sysfs layout.

It *is* set by default. The root cause of the trouble is that its semantics are changing.

At one point in time (sorry, don't remember which kernel release exactly) I tested whether the openSUSE 10.3 userspace supported a CONFIG_SYSFS_DEPRECATED=n kernel and found that it did. From then on, "make oldconfig" would carry that setting over to every new kernel I built, which was fine while the meaning of this setting - ie. the difference in sysfs layout it controlled - stayed the same.

With commit edfaa7c36574f1bf09c65ad602412db9da5f96bf however, the sysfs layout changed again, so the same CONFIG_SYSFS_DEPRECATED setting now controls a different difference (argh) in sysfs layout. That kind of situation is not handled very well by "make oldconfig", which basically starts from the assumption that a setting that was ok for the previous kernel version is still ok for the new one.

I see two ways of avoiding that problem: either create a new backward compatibility config setting for that new sysfs change, or create a way of telling "make oldconfig" that the semantics of CONFIG_SYSFS_DEPRECATED have changed and it should ask the user for that again even if there is a previous setting.

HTH
T.

-- 
Tilman Schmidt                    E-Mail: [EMAIL PROTECTED]
Bonn, Germany
This message consists of 100% recycled bits.
Best before, if unopened: (see reverse side)
Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc
On Tue, 19 Feb 2008 09:36:34 +0100
Jens Axboe <[EMAIL PROTECTED]> wrote:

> On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
> > On Sun, 17 Feb 2008 20:29:13 +0100
> > Jens Axboe <[EMAIL PROTECTED]> wrote:
> >
> > > It's odd stuff. Could you perhaps try and add some printks to
> > > block/cfq-iosched.c:call_for_each_cic(), like dumping the 'nr' return
> > > from radix_tree_gang_lookup() and the pointer value of cics[i] in the
> > > for() loop after the lookup?
> > >
> > I met the same issue on ia64/NUMA box.
> > seems cisc[]->key is NULL and index for radix_tree_gang_lookup() was
> > always '1'.
>
> Why does it keep repeating then? If ->key is NULL, the next lookup index
> should be 1UL.
>
when I inserted printk here
==
	for (i = 0; i < nr; i++)
		func(ioc, cics[i]);
	printk("%d %lx\n", nr, index);
==
index was always "1" and nr was always 32.
So, cics[31]->key was always NULL when index=1 is passed to
radix_tree_gang_lookup().

> But I think the radix 'scan over entire tree' is a bit fragile. This
> patch adds a parallel hlist for ease of properly browsing the members,
> does that work for you? It compiles, but I haven't booted it here yet...
>
will try. please wait a bit.

Thanks,
-Kame
Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc
On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
> On Sun, 17 Feb 2008 20:29:13 +0100
> Jens Axboe <[EMAIL PROTECTED]> wrote:
>
> > It's odd stuff. Could you perhaps try and add some printks to
> > block/cfq-iosched.c:call_for_each_cic(), like dumping the 'nr' return
> > from radix_tree_gang_lookup() and the pointer value of cics[i] in the
> > for() loop after the lookup?
> >
> I met the same issue on ia64/NUMA box.
> seems cisc[]->key is NULL and index for radix_tree_gang_lookup() was
> always '1'.

Why does it keep repeating then? If ->key is NULL, the next lookup index
should be 1UL.

But I think the radix 'scan over entire tree' is a bit fragile. This
patch adds a parallel hlist for ease of properly browsing the members,
does that work for you? It compiles, but I haven't booted it here yet...

> Attached patch works well for me, but I don't know much about cfq.
> please confirm.

It doesn't make a lot of sense, I'm afraid.

 block/blk-ioc.c           |   35 +++
 block/cfq-iosched.c       |   37 +++--
 include/linux/iocontext.h |    2 ++
 3 files changed, 28 insertions(+), 46 deletions(-)

diff --git a/block/blk-ioc.c b/block/blk-ioc.c
index 80245dc..73c7002 100644
--- a/block/blk-ioc.c
+++ b/block/blk-ioc.c
@@ -17,17 +17,13 @@ static struct kmem_cache *iocontext_cachep;
 
 static void cfq_dtor(struct io_context *ioc)
 {
-	struct cfq_io_context *cic[1];
-	int r;
+	if (!hlist_empty(&ioc->cic_list)) {
+		struct cfq_io_context *cic;
 
-	/*
-	 * We don't have a specific key to lookup with, so use the gang
-	 * lookup to just retrieve the first item stored. The cfq exit
-	 * function will iterate the full tree, so any member will do.
-	 */
-	r = radix_tree_gang_lookup(&ioc->radix_root, (void **) cic, 0, 1);
-	if (r > 0)
-		cic[0]->dtor(ioc);
+		cic = list_entry(ioc->cic_list.first, struct cfq_io_context,
+								cic_list);
+		cic->dtor(ioc);
+	}
 }
 
 /*
@@ -57,18 +53,16 @@ EXPORT_SYMBOL(put_io_context);
 
 static void cfq_exit(struct io_context *ioc)
 {
-	struct cfq_io_context *cic[1];
-	int r;
-
 	rcu_read_lock();
-	/*
-	 * See comment for cfq_dtor()
-	 */
-	r = radix_tree_gang_lookup(&ioc->radix_root, (void **) cic, 0, 1);
-	rcu_read_unlock();
 
-	if (r > 0)
-		cic[0]->exit(ioc);
+	if (!hlist_empty(&ioc->cic_list)) {
+		struct cfq_io_context *cic;
+
+		cic = list_entry(ioc->cic_list.first, struct cfq_io_context,
+								cic_list);
+		cic->exit(ioc);
+	}
+	rcu_read_unlock();
 }
 
 /* Called by the exitting task */
@@ -105,6 +99,7 @@ struct io_context *alloc_io_context(gfp_t gfp_flags, int node)
 		ret->nr_batch_requests = 0; /* because this is 0 */
 		ret->aic = NULL;
 		INIT_RADIX_TREE(&ret->radix_root, GFP_ATOMIC | __GFP_HIGH);
+		INIT_HLIST_HEAD(&ret->cic_list);
 		ret->ioc_data = NULL;
 	}
 
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index ca198e6..62eda3f 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -1145,38 +1145,19 @@ static void cfq_put_queue(struct cfq_queue *cfqq)
 /*
  * Call func for each cic attached to this ioc. Returns number of cic's seen.
  */
-#define CIC_GANG_NR	16
 static unsigned int
 call_for_each_cic(struct io_context *ioc,
 		  void (*func)(struct io_context *, struct cfq_io_context *))
 {
-	struct cfq_io_context *cics[CIC_GANG_NR];
-	unsigned long index = 0;
-	unsigned int called = 0;
-	int nr;
+	struct cfq_io_context *cic;
+	struct hlist_node *n;
+	int called = 0;
 
 	rcu_read_lock();
-
-	do {
-		int i;
-
-		/*
-		 * Perhaps there's a better way - this just gang lookups from
-		 * 0 to the end, restarting after each CIC_GANG_NR from the
-		 * last key + 1.
-		 */
-		nr = radix_tree_gang_lookup(&ioc->radix_root, (void **) cics,
-						index, CIC_GANG_NR);
-		if (!nr)
-			break;
-
-		called += nr;
-		index = 1 + (unsigned long) cics[nr - 1]->key;
-
-		for (i = 0; i < nr; i++)
-			func(ioc, cics[i]);
-	} while (nr == CIC_GANG_NR);
-
+	hlist_for_each_entry_rcu(cic, n, &ioc->cic_list, cic_list) {
+		func(ioc, cic);
+		called++;
+	}
 	rcu_read_unlock();
 
 	return called;
@@ -1190,6 +1171,7 @@ static void cic_free_func(struct io_context *ioc, struct cfq_io_context *cic)
 
 	spin_lock_irqsave(&ioc->lock, flags);
 	radix_tree_delete(&ioc->radix_root, cic->dead_key);
+
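The idea behind the parallel list can be sketched in userspace C. This is only an illustration under stated assumptions: `struct hnode`, `struct hhead`, `struct ioc`, `struct cic`, `cic_add()`, `call_for_each()` and `note_null_key()` are hypothetical stand-ins written for this sketch, not the kernel's hlist API, and no RCU or locking is modelled. What it shows is that a walk over an intrusive list visits every member exactly once, whatever the value of `->key`.

```c
#include <stddef.h>

/* Hypothetical minimal intrusive list, loosely modelled on an hlist. */
struct hnode { struct hnode *next; };
struct hhead { struct hnode *first; };

/* Stand-ins for io_context and cfq_io_context (assumed, simplified). */
struct ioc { struct hhead cic_list; };
struct cic { void *key; struct hnode cic_list; };

/* Recover the containing structure from a pointer to its list node. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Link a cic at the head of the ioc's list. */
static void cic_add(struct ioc *ioc, struct cic *cic)
{
	cic->cic_list.next = ioc->cic_list.first;
	ioc->cic_list.first = &cic->cic_list;
}

/* Visit every member once; the walk never looks at ->key. */
static unsigned int call_for_each(struct ioc *ioc,
				  void (*func)(struct ioc *, struct cic *))
{
	unsigned int called = 0;

	for (struct hnode *n = ioc->cic_list.first; n; n = n->next) {
		func(ioc, container_of(n, struct cic, cic_list));
		called++;
	}
	return called;
}

/* Example callback: count members whose key has already been cleared. */
static int null_keys_seen;

static void note_null_key(struct ioc *ioc, struct cic *cic)
{
	(void)ioc;
	if (!cic->key)
		null_keys_seen++;
}
```

A member with a NULL key is simply one more node in the walk, so there is no "next index" to compute and nothing to restart from.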
Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc
On Sun, 17 Feb 2008 20:29:13 +0100
Jens Axboe <[EMAIL PROTECTED]> wrote:

> It's odd stuff. Could you perhaps try and add some printks to
> block/cfq-iosched.c:call_for_each_cic(), like dumping the 'nr' return
> from radix_tree_gang_lookup() and the pointer value of cics[i] in the
> for() loop after the lookup?
>
I met the same issue on ia64/NUMA box.
seems cisc[]->key is NULL and index for radix_tree_gang_lookup() was
always '1'.

Attached patch works well for me, but I don't know much about cfq.
please confirm.

Regards,
-Kame
==
cics[]->key can be NULL. In that case, cics[]->dead_key has key value.

Signed-off-by: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]>

Index: linux-2.6.25-rc2/block/cfq-iosched.c
===========
--- linux-2.6.25-rc2.orig/block/cfq-iosched.c
+++ linux-2.6.25-rc2/block/cfq-iosched.c
@@ -1171,7 +1171,11 @@ call_for_each_cic(struct io_context *ioc
 			break;
 
 		called += nr;
-		index = 1 + (unsigned long) cics[nr - 1]->key;
+
+		if (!cics[nr - 1]->key)
+			index = 1 + (unsigned long) cics[nr - 1]->dead_key;
+		else
+			index = 1 + (unsigned long) cics[nr - 1]->key;
 
 		for (i = 0; i < nr; i++)
 			func(ioc, cics[i]);
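The repeating lookup reported above can be reproduced with a toy model. This is a sketch under assumptions, not the kernel code: the "tree" is a sorted array, keys are small integers instead of pointers, and `struct entry`, `gang_lookup()`, `scan()` and `fill_full_batch()` are hypothetical names invented for the illustration. It shows that when a full batch of CIC_GANG_NR entries ends in a member whose key is 0, "last key + 1" evaluates to 1 and the same batch is returned again, while falling back to dead_key lets the scan advance.

```c
#include <stddef.h>

#define CIC_GANG_NR 16

/* Simplified stand-in for a cic: 'slot' is its position in the tree,
 * 'key' is cleared to 0 on exit, 'dead_key' keeps the old position. */
struct entry {
	unsigned long slot;
	unsigned long key;
	unsigned long dead_key;
};

/* Toy gang lookup over an array sorted by slot: collect up to 'max'
 * entries with slot >= first_index and return how many were found. */
static unsigned int gang_lookup(const struct entry *tree, size_t n,
				const struct entry **out,
				unsigned long first_index, unsigned int max)
{
	unsigned int found = 0;

	for (size_t i = 0; i < n && found < max; i++)
		if (tree[i].slot >= first_index)
			out[found++] = &tree[i];
	return found;
}

/* Scan the whole tree in batches. Returns the number of lookup rounds,
 * or -1 if the scan is still going after max_rounds (no progress). */
static int scan(const struct entry *tree, size_t n,
		int use_dead_key, int max_rounds)
{
	const struct entry *batch[CIC_GANG_NR];
	unsigned long index = 0;
	unsigned int nr;
	int rounds = 0;

	do {
		const struct entry *last;

		if (rounds++ >= max_rounds)
			return -1;
		nr = gang_lookup(tree, n, batch, index, CIC_GANG_NR);
		if (!nr)
			break;
		last = batch[nr - 1];
		if (!last->key && use_dead_key)
			index = 1 + last->dead_key;	/* fallback */
		else
			index = 1 + last->key;		/* 1 when key == 0 */
	} while (nr == CIC_GANG_NR);

	return rounds;
}

/* Sixteen entries at slots 1..16; the last one has already exited. */
static void fill_full_batch(struct entry *tree)
{
	for (unsigned long i = 0; i < CIC_GANG_NR; i++) {
		tree[i].slot = i + 1;
		tree[i].key = i + 1;
		tree[i].dead_key = 0;
	}
	tree[CIC_GANG_NR - 1].key = 0;
	tree[CIC_GANG_NR - 1].dead_key = CIC_GANG_NR;
}
```

With the key-only rule the model never leaves index 1, which matches the printk output described in the thread; with the dead_key fallback it finishes on the second round.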
Re: Linux 2.6.25-rc2
* Pekka Enberg <[EMAIL PROTECTED]> wrote:

> Mathieu, Christoph is on vacation and I'm not at all that familiar
> with this cmpxchg_local() optimization, so if you could take a peek
> at this bug report to see if you can spot something obviously wrong
> with it, I would much appreciate that.

hm, it's bad for at least one other reason as well (which is probably
unrelated to this crash):

/*
 * Currently fastpath is not supported if preemption is enabled.
 */
#if defined(CONFIG_FAST_CMPXCHG_LOCAL) && !defined(CONFIG_PREEMPT)
#define SLUB_FASTPATH
#endif

such !PREEMPT exceptions tend to show "i didnt want to think too hard
about the preemptible case so just turn it off" thinking.

Also, why isnt this SLUB_FASTPATH flag done in the Kconfig space?

	Ingo
Re: Linux 2.6.25-rc2
Hi,

Pekka Enberg <[EMAIL PROTECTED]> wrote:
> > Mathieu, Christoph is on vacation and I'm not at all that familiar
> > with this cmpxchg_local() optimization, so if you could take a peek
> > at this bug report to see if you can spot something obviously wrong
> > with it, I would much appreciate that.

On Feb 19, 2008 12:27 PM, Ingo Molnar <[EMAIL PROTECTED]> wrote:
> hm, it's bad for at least one other reason as well (which is probably
> unrelated to this crash):
>
> /*
>  * Currently fastpath is not supported if preemption is enabled.
>  */
> #if defined(CONFIG_FAST_CMPXCHG_LOCAL) && !defined(CONFIG_PREEMPT)
> #define SLUB_FASTPATH
> #endif
>
> such !PREEMPT exceptions tend to show "i didnt want to think too hard
> about the preemptible case so just turn it off" thinking.
>
> Also, why isnt this SLUB_FASTPATH flag done in the Kconfig space?

Hmm, no idea. I think there might have been some mix-up with merging
the patch. The one I saw was:

  http://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24/2.6.24-mm1/broken-out/slub-optional-fast-path-using-cmpxchg_local.patch

But I don't remember giving out a Reviewed-by for it (and my mailbox
confirms that). Furthermore, somehow it turned into this when merged:

  http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c

In any case, if Torsten/someone can verify that reverting
1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c ("SLUB: Alternate fast paths
using cmpxchg_local") fixes these problems, I think we should just do
it and let Christoph sort it out when he gets back.

			Pekka
Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc
On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
> On Tue, 19 Feb 2008 09:58:38 +0100
> Jens Axboe <[EMAIL PROTECTED]> wrote:
>
> > > when I inserted printk here
> > > ==
> > > 	for (i = 0; i < nr; i++)
> > > 		func(ioc, cics[i]);
> > > 	printk("%d %lx\n", nr, index);
> > > ==
> > > index was always "1" and nr was always 32.
> > > So, cics[31]->key was always NULL when index=1 is passed to
> > > radix_tree_gang_lookup().
> >
> > Hang on, it returned 32? It should not return more than 16, since that
> > is what we have room for and asked for.
> >
> sorry. Of course, it was 16 ;(

I expected so, otherwise we would have had far more serious problems :-)

> your patch works well. thank you.

It's committed now and posted in the relevant bugzilla as well (#9948).

--
Jens Axboe
Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc
On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
> On Tue, 19 Feb 2008 09:36:34 +0100
> Jens Axboe <[EMAIL PROTECTED]> wrote:
>
> > On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
> > > On Sun, 17 Feb 2008 20:29:13 +0100
> > > Jens Axboe <[EMAIL PROTECTED]> wrote:
> > >
> > > > It's odd stuff. Could you perhaps try and add some printks to
> > > > block/cfq-iosched.c:call_for_each_cic(), like dumping the 'nr' return
> > > > from radix_tree_gang_lookup() and the pointer value of cics[i] in the
> > > > for() loop after the lookup?
> > > >
> > > I met the same issue on ia64/NUMA box.
> > > seems cisc[]->key is NULL and index for radix_tree_gang_lookup() was
> > > always '1'.
> >
> > Why does it keep repeating then? If ->key is NULL, the next lookup index
> > should be 1UL.
> >
> when I inserted printk here
> ==
> 	for (i = 0; i < nr; i++)
> 		func(ioc, cics[i]);
> 	printk("%d %lx\n", nr, index);
> ==
> index was always "1" and nr was always 32.
> So, cics[31]->key was always NULL when index=1 is passed to
> radix_tree_gang_lookup().

Hang on, it returned 32? It should not return more than 16, since that
is what we have room for and asked for.

Using ->dead_key when ->key is NULL is correct btw, since that is the
correct location in the tree once the process has exited. But that
should not happen until AFTER the func() call, so I still think the list
patch is safer.

> > But I think the radix 'scan over entire tree' is a bit fragile. This
> > patch adds a parallel hlist for ease of properly browsing the members,
> > does that work for you? It compiles, but I haven't booted it here yet...
> >
> will try. please wait a bit.

It boots here, so at least it passes normal sanity tests. It should
solve your problem as well, hopefully.

--
Jens Axboe
Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc
On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
> On Tue, 19 Feb 2008 09:36:34 +0100
> Jens Axboe <[EMAIL PROTECTED]> wrote:
>
> > On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
> > > On Sun, 17 Feb 2008 20:29:13 +0100
> > > Jens Axboe <[EMAIL PROTECTED]> wrote:
> > >
> > > > It's odd stuff. Could you perhaps try and add some printks to
> > > > block/cfq-iosched.c:call_for_each_cic(), like dumping the 'nr' return
> > > > from radix_tree_gang_lookup() and the pointer value of cics[i] in the
> > > > for() loop after the lookup?
> > > >
> > > I met the same issue on ia64/NUMA box.
> > > seems cisc[]->key is NULL and index for radix_tree_gang_lookup() was
> > > always '1'.
> >
> > Why does it keep repeating then? If ->key is NULL, the next lookup index
> > should be 1UL.
> >
> > But I think the radix 'scan over entire tree' is a bit fragile. This
> > patch adds a parallel hlist for ease of properly browsing the members,
> > does that work for you? It compiles, but I haven't booted it here yet...
> >
> Works well for me and my box booted !

Super, I'll get it upstream. Thanks for testing and debugging!

--
Jens Axboe
Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc
On Tue, 19 Feb 2008 09:58:38 +0100
Jens Axboe <[EMAIL PROTECTED]> wrote:

> > when I inserted printk here
> > ==
> > 	for (i = 0; i < nr; i++)
> > 		func(ioc, cics[i]);
> > 	printk("%d %lx\n", nr, index);
> > ==
> > index was always "1" and nr was always 32.
> > So, cics[31]->key was always NULL when index=1 is passed to
> > radix_tree_gang_lookup().
>
> Hang on, it returned 32? It should not return more than 16, since that
> is what we have room for and asked for.
>
sorry. Of course, it was 16 ;(

your patch works well. thank you.

-Kame
Re: Linux 2.6.25-rc2
* Ingo Molnar ([EMAIL PROTECTED]) wrote:
>
> * Pekka Enberg <[EMAIL PROTECTED]> wrote:
>
> > Mathieu, Christoph is on vacation and I'm not at all that familiar
> > with this cmpxchg_local() optimization, so if you could take a peek
> > at this bug report to see if you can spot something obviously wrong
> > with it, I would much appreciate that.
>
> hm, it's bad for at least one other reason as well (which is probably
> unrelated to this crash):
>
> /*
>  * Currently fastpath is not supported if preemption is enabled.
>  */
> #if defined(CONFIG_FAST_CMPXCHG_LOCAL) && !defined(CONFIG_PREEMPT)
> #define SLUB_FASTPATH
> #endif
>
> such !PREEMPT exceptions tend to show "i didnt want to think too hard
> about the preemptible case so just turn it off" thinking.
>

Ingo, a comment in slub.c explains it :

/*
 * The SLUB_FASTPATH path is provisional and is currently disabled if the
 * kernel is compiled with preemption or if the arch does not support
 * fast cmpxchg operations. There are a couple of coming changes that will
 * simplify matters and allow preemption. Ultimately we may end up making
 * SLUB_FASTPATH the default.
 *
 * 1. The introduction of the per cpu allocator will avoid array lookups
 *    through get_cpu_slab(). A special register can be used instead.
 *
 * 2. The introduction of per cpu atomic operations (cpu_ops) means that
 *    we can realize the logic here entirely with per cpu atomics. The
 *    per cpu atomic ops will take care of the preemption issues.
 */

So there is more coming in the preemption area.

> Also, why isnt this SLUB_FASTPATH flag done in the Kconfig space?
>

Eventually, I think only CONFIG_FAST_CMPXCHG_LOCAL will be needed (when
the code will support preemption). Therefore, this SLUB_FASTPATH define
seems to be only here temporarily.

I'm looking at the code right now.. more to come.

Mathieu

> 	Ingo

--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc
Jens Axboe wrote:
> On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
> > On Sun, 17 Feb 2008 20:29:13 +0100
> > Jens Axboe <[EMAIL PROTECTED]> wrote:
> >
> > > It's odd stuff. Could you perhaps try and add some printks to
> > > block/cfq-iosched.c:call_for_each_cic(), like dumping the 'nr' return
> > > from radix_tree_gang_lookup() and the pointer value of cics[i] in the
> > > for() loop after the lookup?
> > >
> > I met the same issue on ia64/NUMA box.
> > seems cisc[]->key is NULL and index for radix_tree_gang_lookup() was
> > always '1'.
>
> Why does it keep repeating then? If ->key is NULL, the next lookup index
> should be 1UL.
>
> But I think the radix 'scan over entire tree' is a bit fragile. This
> patch adds a parallel hlist for ease of properly browsing the members,
> does that work for you? It compiles, but I haven't booted it here yet...
>
> > Attached patch works well for me, but I don't know much about cfq.
> > please confirm.
>
> It doesn't make a lot of sense, I'm afraid.
>
>  block/blk-ioc.c           |   35 +++
>  block/cfq-iosched.c       |   37 +++--
>  include/linux/iocontext.h |    2 ++
>  3 files changed, 28 insertions(+), 46 deletions(-)
>
> diff --git a/block/blk-ioc.c b/block/blk-ioc.c
> index 80245dc..73c7002 100644
> --- a/block/blk-ioc.c
<snip>

Hi Jens,

Thanks for the patch. The patch works fine, machine boots up without
the kernel panic.

--
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
Re: Linux 2.6.25-rc2
* Mathieu Desnoyers <[EMAIL PROTECTED]> wrote:

> Ingo, a comment in slub.c explains it :
>
> /*
>  * The SLUB_FASTPATH path is provisional and is currently disabled if the
>  * kernel is compiled with preemption or if the arch does not support
>  * fast cmpxchg operations. There are a couple of coming changes that will
>  * simplify matters and allow preemption. Ultimately we may end up making
>  * SLUB_FASTPATH the default.

well the feature is not complete and there are no reasons given _why_
it's not complete ... and even if there's a reason it should have been
deferred to the next merge window. We still have 10 year old "this is a
temporary hack" comments in the kernel ;-)

"hardware does not support it" is a valid argument, "kernel developer
had no time to implement it properly" is not ;-)

	Ingo
Re: Linux 2.6.25-rc2
Hi Mathieu,

On Feb 19, 2008 4:02 PM, Mathieu Desnoyers <[EMAIL PROTECTED]> wrote:
> > Since this shows mostly with network card drivers, I think the most
> > plausible cause would be an IRQ nesting over kmem_cache_alloc_node and
> > calling it.

On Feb 19, 2008 4:21 PM, Pekka Enberg <[EMAIL PROTECTED]> wrote:
> Yes, this can happen. Are you saying it is not safe to be in the
> lockless path when an IRQ triggers?

Hmm. The barrier() in slab_free() looks fishy. The comment says it's
there to make sure we've retrieved c->freelist before c->page but then
it uses a _compiler barrier_ which doesn't affect the CPU and the
reads may still be re-ordered... Not sure if that matters here though.
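The distinction drawn above between a compiler barrier and a CPU barrier can be illustrated in portable C11. This is a loose userspace analogy and an assumption of this sketch, not the kernel primitives: `atomic_signal_fence()` only restrains compiler reordering (roughly the role of barrier()), while `atomic_thread_fence()` may also emit a hardware fence (roughly the role of smp_mb()). `struct cpu_slab`, `struct snapshot` and both functions are hypothetical names for the illustration.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for per-CPU slab state; not the kernel's struct. */
struct cpu_slab {
	_Atomic(void *) freelist;
	_Atomic(void *) page;
};

struct snapshot {
	void *freelist;
	void *page;
};

/* Read freelist, then page, with a compiler-only fence in between.
 * The compiler may not swap the two loads, but no fence instruction
 * is required, so only same-thread observers (such as a signal or
 * interrupt handler) are guaranteed to see them in this order. */
static struct snapshot read_with_compiler_barrier(struct cpu_slab *c)
{
	struct snapshot s;

	s.freelist = atomic_load_explicit(&c->freelist, memory_order_relaxed);
	atomic_signal_fence(memory_order_seq_cst);
	s.page = atomic_load_explicit(&c->page, memory_order_relaxed);
	return s;
}

/* Same reads, but with a fence that also orders them for other CPUs. */
static struct snapshot read_with_cpu_barrier(struct cpu_slab *c)
{
	struct snapshot s;

	s.freelist = atomic_load_explicit(&c->freelist, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	s.page = atomic_load_explicit(&c->page, memory_order_relaxed);
	return s;
}
```

Both functions return the same values in a single-threaded run; the difference is only in which observers are guaranteed to see the two loads in program order. Whether the stronger fence is actually needed depends on whether the racing party is an interrupt on the same CPU or code on another CPU, which is the open question in the message above.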
Re: Linux 2.6.25-rc2
* Pekka Enberg ([EMAIL PROTECTED]) wrote:
> On Feb 19, 2008 8:54 AM, Torsten Kaiser <[EMAIL PROTECTED]> wrote:
> > > > [ 5282.056415] [ cut here ]
> > > > [ 5282.059757] kernel BUG at lib/list_debug.c:33!
> > > > [ 5282.062055] invalid opcode: [1] SMP
> > > > [ 5282.062055] CPU 3
> > >
> > > hm. Your crashes do seem to span multiple subsystems, but it always
> > > seems to be around the SLUB code. Could you try the patch below? The
> > > SLUB code has a new optimization and i'm not 100% sure about it. [the
> > > hack below switches the SLUB optimization off by disabling the CPU
> > > feature it relies on.]
> > >
> > > 	Ingo
> > >
> > > -
> > >  arch/x86/Kconfig |    4 ----
> > >  1 file changed, 4 deletions(-)
> > >
> > > Index: linux/arch/x86/Kconfig
> > > ===
> > > --- linux.orig/arch/x86/Kconfig
> > > +++ linux/arch/x86/Kconfig
> > > @@ -59,10 +59,6 @@ config HAVE_LATENCYTOP_SUPPORT
> > >  config SEMAPHORE_SLEEPERS
> > >  	def_bool y
> > >
> > > -config FAST_CMPXCHG_LOCAL
> > > -	bool
> > > -	default y
> > > -
> > >  config MMU
> > >  	def_bool y
> > >
> >
> > $ grep FAST_CMPXCHG_LOCAL */.config
> > linux-2.6.24-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > linux-2.6.24-rc3-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > linux-2.6.24-rc3-mm2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > linux-2.6.24-rc6-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > linux-2.6.24-rc8-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > linux-2.6.25-rc1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > linux-2.6.25-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > linux-2.6.25-rc2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> >
> > -rc2-mm1 still worked for me.
> >
> > Did you mean the new SLUB_FASTPATH?
> > $ grep "define SLUB_FASTPATH" */mm/slub.c
> > linux-2.6.25-rc1/mm/slub.c:#define SLUB_FASTPATH
> > linux-2.6.25-rc2-mm1/mm/slub.c:#define SLUB_FASTPATH
> > linux-2.6.25-rc2/mm/slub.c:#define SLUB_FASTPATH
> >
> > The 2.6.24-rc3+ mm-kernels did crash for me, but don't seem to contain this...
> >
> > On the other hand:
> > From the crash in 2.6.25-rc2-mm1:
> > [59987.116182] RIP [8029f83d] kmem_cache_alloc_node+0x6d/0xa0
> >
> > (gdb) list *0x8029f83d
> > 0x8029f83d is in kmem_cache_alloc_node (mm/slub.c:1646).
> > 1641			if (unlikely(is_end(object) || !node_match(c, node))) {
> > 1642				object = __slab_alloc(s, gfpflags, node, addr, c);
> > 1643				break;
> > 1644			}
> > 1645			stat(c, ALLOC_FASTPATH);
> > 1646		} while (cmpxchg_local(&c->freelist, object, object[c->offset])
> > 1647									!= object);
> > 1648	#else
> > 1649		unsigned long flags;
> > 1650
> >
> > That code is part for SLUB_FASTPATH.
> >
> > I'm willing to test the patch, but don't know how fast I can find the
> > time to do it, so my answer if your patch helps might be delayed until
> > the weekend.
>
> Mathieu, Christoph is on vacation and I'm not at all that familiar
> with this cmpxchg_local() optimization, so if you could take a peek at
> this bug report to see if you can spot something obviously wrong with
> it, I would much appreciate that.
>

Sure,

Initial thoughts :

I'd like to get the complete config causing this bug. I suspect either :

- A race between the lockless algo and an IRQ in a driver allocating
  memory.
- stat(c, ALLOC_FASTPATH); seems to be using a var++, therefore
  indicating it is not reentrant if IRQs are disabled. Since those are
  only stats, I guess it's ok, but still weird.
- CPU hotplug problem.

http://bugzilla.kernel.org/attachment.cgi?id=14877&action=view
shows
last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
-- is this linked to a cpu up/down event ?

Since this shows mostly with network card drivers, I think the most
plausible cause would be an IRQ nesting over kmem_cache_alloc_node and
calling it.

Will dig further...

Mathieu

--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
Re: Linux 2.6.25-rc2
* Pekka Enberg [EMAIL PROTECTED] wrote: Yes, this can happen. Are you saying it is not safe to be in the lockless path when an IRQ triggers? Hmm. The barrier() in slab_free() looks fishy. The comment says it's there to make sure we've retrieved c-freelist before c-page but then it uses a _compiler barrier_ which doesn't affect the CPU and the reads may still be re-ordered... Not sure if that matters here though. find a fix patch for that below - most systems affected seem to be SMP ones. If this (or my other patch) indeed solves the problem i'd still favor a full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it looks quite un-cooked and quite un-tested for multiple independent reasons. Sigh, why do i again have to be the messenger who brings the bad news to SLUB land, and again when poor Christoph went on vacation? :-/ Ingo -- Subject: SLUB: barrier fix From: Ingo Molnar [EMAIL PROTECTED] --- mm/slub.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Index: linux/mm/slub.c === --- linux.orig/mm/slub.c +++ linux/mm/slub.c @@ -1862,7 +1862,7 @@ static __always_inline void slab_free(st debug_check_no_locks_freed(object, s-objsize); do { freelist = c-freelist; - barrier(); + smp_mb(); /* * If the compiler would reorder the retrieval of c-page to * come before c-freelist then an interrupt could - Subject: slub: fastpath optimization revert From: Ingo Molnar [EMAIL PROTECTED] Date: Tue Feb 19 15:46:37 CET 2008 revert: commit 1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c Author: Christoph Lameter [EMAIL PROTECTED] Date: Mon Jan 7 23:20:30 2008 -0800 SLUB: Alternate fast paths using cmpxchg_local it was causing problems (crashes) and was incomplete. 
Signed-off-by: Ingo Molnar [EMAIL PROTECTED] --- mm/slub.c | 87 -- 1 file changed, 87 deletions(-) Index: linux-x86.q/mm/slub.c === --- linux-x86.q.orig/mm/slub.c +++ linux-x86.q/mm/slub.c @@ -149,13 +149,6 @@ static inline void ClearSlabDebug(struct /* Enable to test recovery from slab corruption on boot */ #undef SLUB_RESILIENCY_TEST -/* - * Currently fastpath is not supported if preemption is enabled. - */ -#if defined(CONFIG_FAST_CMPXCHG_LOCAL) !defined(CONFIG_PREEMPT) -#define SLUB_FASTPATH -#endif - #if PAGE_SHIFT = 12 /* @@ -1514,11 +1507,6 @@ static void *__slab_alloc(struct kmem_ca { void **object; struct page *new; -#ifdef SLUB_FASTPATH - unsigned long flags; - - local_irq_save(flags); -#endif if (!c-page) goto new_slab; @@ -1541,9 +1529,6 @@ load_freelist: unlock_out: slab_unlock(c-page); stat(c, ALLOC_SLOWPATH); -#ifdef SLUB_FASTPATH - local_irq_restore(flags); -#endif return object; another_slab: @@ -1575,9 +1560,6 @@ new_slab: c-page = new; goto load_freelist; } -#ifdef SLUB_FASTPATH - local_irq_restore(flags); -#endif /* * No memory available. * @@ -1619,34 +1601,6 @@ static __always_inline void *slab_alloc( { void **object; struct kmem_cache_cpu *c; - -/* - * The SLUB_FASTPATH path is provisional and is currently disabled if the - * kernel is compiled with preemption or if the arch does not support - * fast cmpxchg operations. There are a couple of coming changes that will - * simplify matters and allow preemption. Ultimately we may end up making - * SLUB_FASTPATH the default. - * - * 1. The introduction of the per cpu allocator will avoid array lookups - *through get_cpu_slab(). A special register can be used instead. - * - * 2. The introduction of per cpu atomic operations (cpu_ops) means that - *we can realize the logic here entirely with per cpu atomics. The - *per cpu atomic ops will take care of the preemption issues. 
- */ - -#ifdef SLUB_FASTPATH - c = get_cpu_slab(s, raw_smp_processor_id()); - do { - object = c-freelist; - if (unlikely(is_end(object) || !node_match(c, node))) { - object = __slab_alloc(s, gfpflags, node, addr, c); - break; - } - stat(c, ALLOC_FASTPATH); - } while (cmpxchg_local(c-freelist, object, object[c-offset]) - != object); -#else unsigned long flags; local_irq_save(flags); @@ -1661,7 +1615,6 @@ static __always_inline void *slab_alloc( stat(c, ALLOC_FASTPATH); } local_irq_restore(flags); -#endif if (unlikely((gfpflags __GFP_ZERO) object)) memset(object, 0, c-objsize);
Re: Linux 2.6.25-rc2
* Ingo Molnar [EMAIL PROTECTED] wrote: If this (or my other patch) indeed solves the problem i'd still favor a full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it looks quite un-cooked and quite un-tested for multiple independent reasons. Sigh, why do i again have to be the messenger who brings the bad news to SLUB land, and again when poor Christoph went on vacation? :-/ the revert patch is below. (manually done due to other changes since 1f84260c8ce3b1ce26d4 was commited, but trivial) Ingo - Subject: slub: fastpath optimization revert From: Ingo Molnar [EMAIL PROTECTED] Date: Tue Feb 19 15:46:37 CET 2008 revert: commit 1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c Author: Christoph Lameter [EMAIL PROTECTED] Date: Mon Jan 7 23:20:30 2008 -0800 SLUB: Alternate fast paths using cmpxchg_local it was causing problems (crashes) and was incomplete. Signed-off-by: Ingo Molnar [EMAIL PROTECTED] --- mm/slub.c | 87 -- 1 file changed, 87 deletions(-) Index: linux-x86.q/mm/slub.c === --- linux-x86.q.orig/mm/slub.c +++ linux-x86.q/mm/slub.c @@ -149,13 +149,6 @@ static inline void ClearSlabDebug(struct /* Enable to test recovery from slab corruption on boot */ #undef SLUB_RESILIENCY_TEST -/* - * Currently fastpath is not supported if preemption is enabled. - */ -#if defined(CONFIG_FAST_CMPXCHG_LOCAL) !defined(CONFIG_PREEMPT) -#define SLUB_FASTPATH -#endif - #if PAGE_SHIFT = 12 /* @@ -1514,11 +1507,6 @@ static void *__slab_alloc(struct kmem_ca { void **object; struct page *new; -#ifdef SLUB_FASTPATH - unsigned long flags; - - local_irq_save(flags); -#endif if (!c-page) goto new_slab; @@ -1541,9 +1529,6 @@ load_freelist: unlock_out: slab_unlock(c-page); stat(c, ALLOC_SLOWPATH); -#ifdef SLUB_FASTPATH - local_irq_restore(flags); -#endif return object; another_slab: @@ -1575,9 +1560,6 @@ new_slab: c-page = new; goto load_freelist; } -#ifdef SLUB_FASTPATH - local_irq_restore(flags); -#endif /* * No memory available. 
* @@ -1619,34 +1601,6 @@ static __always_inline void *slab_alloc( { void **object; struct kmem_cache_cpu *c; - -/* - * The SLUB_FASTPATH path is provisional and is currently disabled if the - * kernel is compiled with preemption or if the arch does not support - * fast cmpxchg operations. There are a couple of coming changes that will - * simplify matters and allow preemption. Ultimately we may end up making - * SLUB_FASTPATH the default. - * - * 1. The introduction of the per cpu allocator will avoid array lookups - *through get_cpu_slab(). A special register can be used instead. - * - * 2. The introduction of per cpu atomic operations (cpu_ops) means that - *we can realize the logic here entirely with per cpu atomics. The - *per cpu atomic ops will take care of the preemption issues. - */ - -#ifdef SLUB_FASTPATH - c = get_cpu_slab(s, raw_smp_processor_id()); - do { - object = c-freelist; - if (unlikely(is_end(object) || !node_match(c, node))) { - object = __slab_alloc(s, gfpflags, node, addr, c); - break; - } - stat(c, ALLOC_FASTPATH); - } while (cmpxchg_local(c-freelist, object, object[c-offset]) - != object); -#else unsigned long flags; local_irq_save(flags); @@ -1661,7 +1615,6 @@ static __always_inline void *slab_alloc( stat(c, ALLOC_FASTPATH); } local_irq_restore(flags); -#endif if (unlikely((gfpflags __GFP_ZERO) object)) memset(object, 0, c-objsize); @@ -1698,11 +1651,6 @@ static void __slab_free(struct kmem_cach void **object = (void *)x; struct kmem_cache_cpu *c; -#ifdef SLUB_FASTPATH - unsigned long flags; - - local_irq_save(flags); -#endif c = get_cpu_slab(s, raw_smp_processor_id()); stat(c, FREE_SLOWPATH); slab_lock(page); @@ -1734,9 +1682,6 @@ checks_ok: out_unlock: slab_unlock(page); -#ifdef SLUB_FASTPATH - local_irq_restore(flags); -#endif return; slab_empty: @@ -1749,9 +1694,6 @@ slab_empty: } slab_unlock(page); stat(c, FREE_SLAB); -#ifdef SLUB_FASTPATH - local_irq_restore(flags); -#endif discard_slab(s, page); return; @@ -1777,34 +1719,6 @@ static 
__always_inline void slab_free(st { void **object = (void *)x; struct kmem_cache_cpu *c; - -#ifdef SLUB_FASTPATH - void **freelist; - - c = get_cpu_slab(s, raw_smp_processor_id()); - debug_check_no_locks_freed(object, s-objsize); - do { -
Re: Linux 2.6.25-rc2
Hi Mathieu,

On Feb 19, 2008 4:02 PM, Mathieu Desnoyers [EMAIL PROTECTED] wrote:
> - stat(c, ALLOC_FASTPATH); seems to be using a var++, therefore
>   indicating it is not reentrant if IRQs are disabled. Since those are
>   only stats, I guess it's ok, but still weird.

What is not re-entrant?

On Feb 19, 2008 4:02 PM, Mathieu Desnoyers [EMAIL PROTECTED] wrote:
> Since this shows mostly with network card drivers, I think the most
> plausible cause would be an IRQ nesting over kmem_cache_alloc_node and
> calling it.

Yes, this can happen. Are you saying it is not safe to be in the lockless path when an IRQ triggers?
Re: Linux 2.6.25-rc2
On Tue, 19 Feb 2008, Eric Dumazet wrote:
>
> cmpxchg_local(&c->freelist, object, object[c->offset]) can succeed,
> while an interrupt came (on this cpu), and several allocations were done,
> and one free was performed at the end of this interruption, so 'object'
> was recycled.

I think you may well be right. This looks like a good clue.

I'll do the revert. I wanted either a confirmation that reverting it actually fixes something, _or_ an actual bug description, and this seems to be a quite possible case of the latter.

		Linus
Re: Linux 2.6.25-rc2
Ingo Molnar wrote:
> * Ingo Molnar [EMAIL PROTECTED] wrote:
>
> > If this (or my other patch) indeed solves the problem i'd still favor
> > a full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it
> > looks quite un-cooked and quite un-tested for multiple independent
> > reasons. Sigh, why do i again have to be the messenger who brings the
> > bad news to SLUB land, and again when poor Christoph went on vacation?
> > :-/
>
> the revert patch is below. (manually done due to other changes since
> 1f84260c8ce3b1ce26d4 was committed, but trivial)

I am ok with this if someone can actually confirm it fixes things.

			Pekka
Re: Linux 2.6.25-rc2
Ingo Molnar wrote:
> * Pekka Enberg [EMAIL PROTECTED] wrote:
>
> > > Yes, this can happen. Are you saying it is not safe to be in the
> > > lockless path when an IRQ triggers?
> >
> > Hmm. The barrier() in slab_free() looks fishy. The comment says it's
> > there to make sure we've retrieved c->freelist before c->page but then
> > it uses a _compiler barrier_ which doesn't affect the CPU and the reads
> > may still be re-ordered... Not sure if that matters here though.
>
> find a fix patch for that below - most systems affected seem to be SMP
> ones.
>
> If this (or my other patch) indeed solves the problem i'd still favor a
> full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it looks
> quite un-cooked and quite un-tested for multiple independent reasons.
> Sigh, why do i again have to be the messenger who brings the bad news to
> SLUB land, and again when poor Christoph went on vacation? :-/
>
>	Ingo
>
> -->
> Subject: SLUB: barrier fix
> From: Ingo Molnar [EMAIL PROTECTED]
>
> ---
>  mm/slub.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> Index: linux/mm/slub.c
> ===================================================================
> --- linux.orig/mm/slub.c
> +++ linux/mm/slub.c
> @@ -1862,7 +1862,7 @@ static __always_inline void slab_free(st
>  	debug_check_no_locks_freed(object, s->objsize);
>  	do {
>  		freelist = c->freelist;
> -		barrier();
> +		smp_mb();
>  		/*
>  		 * If the compiler would reorder the retrieval of c->page to
>  		 * come before c->freelist then an interrupt could

Torsten/Yamin, does this fix things for you? What about reverting commit 1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c (SLUB: Alternate fast paths using cmpxchg_local)?

			Pekka
Re: Linux 2.6.25-rc2
* Ingo Molnar [EMAIL PROTECTED] wrote:

> Earlier today i turned off local-cmpxchg and havent had a crash or
> hang since then - but at 200 bootups and 4-5 crashes in a week that's
> not conclusive yet. I think others might have workloads that trigger
> this bug more often.

i mean, today i've only done 200 randconfig bootups since i did the cmpxchg SLUB revert, and given the statistics of the bug (thousands of bootups and just 3 provable crashes) i cannot yet conclude that the bug is truly gone.

	Ingo
Re: Linux 2.6.25-rc2
On Tue, 19 Feb 2008, Pekka Enberg wrote:
>
> Hmm. The barrier() in slab_free() looks fishy. The comment says it's
> there to make sure we've retrieved c->freelist before c->page but then
> it uses a _compiler barrier_ which doesn't affect the CPU and the reads
> may still be re-ordered... Not sure if that matters here though.

No, no. The comment says that it's purely there to serialize an *interrupt*, and as such, a compiler-only barrier is sufficient (or the comment is wrong).

Interrupts are totally ordered within a cpu (of course, in theory a CPU might have speculative work etc reordering, but the CPU also guarantees that interrupt acts _as_if_ it was exact), so a compiler barrier is sufficient.

Of course, if we're talking about interrupts on another CPU, that's a different issue, but the fact is, in that case it's not about interrupts any more (might as well be other code just running normally on another CPU), and a barrier doesn't help, it needs real locking.

So that barrier is fine per se. Of course, the whole code (and/or just the comment!) may be buggered, but any CPU SMP-aware barriers shouldn't be relevant.

What's much more likely to be an issue is simply the fact that since the fastpath now accesses the per-cpu freelist without any locking, if there is *any* sequence what-so-ever that does it from another CPU and assumes the old locking behaviour, the list will be corrupted.

And from a quick look-through, I certainly cannot guarantee that isn't the case. There's still a lot of cases that do direct assignments to c->freelist without using a guaranteed atomic sequence.

They *should* be safe if it's guaranteed that
 (a) they always run with interrupts disabled
AND
 (b) 'c' is _always_ the current CPU list
but I can't quickly see that guarantee for either.

I'd happily just revert this thing, but it would be really good to have confirmation that it seems to matter. But Torsten's partial bisection seems to say that the quicklist thing went into -mm before the crash even started.

So:
 - it might be something else entirely
 - it might still be the local cmpxchg, just Torsten didn't happen to notice it until later.
 - it might still be the local cmpxchg, but something else changed its patterns to actually make it start triggering.

and in general I don't think we should revert it unless we have stronger indications that it really is the problem (eg somebody finds the actual bug, or a reporter can confirm that it goes away when the local cmpxchg optimization is disabled).

		Linus
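[Editorial sketch, not part of the thread.] The distinction Linus draws between a compiler-only barrier and a CPU barrier can be illustrated in userspace C11. This is only an analogy: atomic_signal_fence() and atomic_thread_fence() are standard C11 calls standing in for the kernel's barrier() and smp_mb(), and struct cpu_slab_model and the function names are hypothetical. GCC and Clang implement atomic_signal_fence() as a compiler-only barrier that emits no instruction, which is enough to keep two reads in program order with respect to a handler running on the same CPU; atomic_thread_fence() additionally emits a hardware fence where the architecture needs one.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical two-field model of the per-cpu slab structure. */
struct cpu_slab_model {
    void *freelist;
    void *page;
};

/*
 * Read freelist, then page, separated by a compiler-only fence.
 * This is the analogue of barrier(): it constrains the compiler, not the
 * CPU, which is all that is needed against a handler on the same CPU.
 */
static void snapshot_vs_local_irq(const struct cpu_slab_model *c,
                                  void **freelist, void **page)
{
    *freelist = c->freelist;
    atomic_signal_fence(memory_order_seq_cst);
    *page = c->page;
}

/*
 * Same reads separated by a full fence, the analogue of smp_mb().
 * As the message above argues, a fence alone does not make cross-CPU
 * access to the list safe; that case needs real locking.
 */
static void snapshot_with_cpu_fence(const struct cpu_slab_model *c,
                                    void **freelist, void **page)
{
    *freelist = c->freelist;
    atomic_thread_fence(memory_order_seq_cst);
    *page = c->page;
}

/* Single-threaded check: both variants return the same two values. */
static int snapshots_agree(void)
{
    int slab_object = 0, slab_page = 0;
    struct cpu_slab_model c = { &slab_object, &slab_page };
    void *f1 = NULL, *p1 = NULL, *f2 = NULL, *p2 = NULL;

    snapshot_vs_local_irq(&c, &f1, &p1);
    snapshot_with_cpu_fence(&c, &f2, &p2);
    return f1 == f2 && p1 == p2 &&
           f1 == (void *)&slab_object && p1 == (void *)&slab_page;
}
```

In a single-threaded run the two variants are indistinguishable, which is the point: the stronger fence changes nothing for same-CPU ordering.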
Re: Linux 2.6.25-rc2
On Tue, 19 Feb 2008 09:02:30 -0500 Mathieu Desnoyers [EMAIL PROTECTED] wrote: * Pekka Enberg ([EMAIL PROTECTED]) wrote: On Feb 19, 2008 8:54 AM, Torsten Kaiser [EMAIL PROTECTED] wrote: [ 5282.056415] [ cut here ] [ 5282.059757] kernel BUG at lib/list_debug.c:33! [ 5282.062055] invalid opcode: [1] SMP [ 5282.062055] CPU 3 hm. Your crashes do seem to span multiple subsystems, but it always seems to be around the SLUB code. Could you try the patch below? The SLUB code has a new optimization and i'm not 100% sure about it. [the hack below switches the SLUB optimization off by disabling the CPU feature it relies on.] Ingo - arch/x86/Kconfig |4 1 file changed, 4 deletions(-) Index: linux/arch/x86/Kconfig === --- linux.orig/arch/x86/Kconfig +++ linux/arch/x86/Kconfig @@ -59,10 +59,6 @@ config HAVE_LATENCYTOP_SUPPORT config SEMAPHORE_SLEEPERS def_bool y -config FAST_CMPXCHG_LOCAL - bool - default y - config MMU def_bool y $ grep FAST_CMPXCHG_LOCAL */.config linux-2.6.24-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.24-rc3-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.24-rc3-mm2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.24-rc6-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.24-rc8-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.25-rc1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.25-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.25-rc2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y -rc2-mm1 still worked for me. Did you mean the new SLUB_FASTPATH? $ grep define SLUB_FASTPATH */mm/slub.c linux-2.6.25-rc1/mm/slub.c:#define SLUB_FASTPATH linux-2.6.25-rc2-mm1/mm/slub.c:#define SLUB_FASTPATH linux-2.6.25-rc2/mm/slub.c:#define SLUB_FASTPATH The 2.6.24-rc3+ mm-kernels did crash for me, but don't seem to contain this... On the other hand: From the crash in 2.6.25-rc2-mm1: [59987.116182] RIP [8029f83d] kmem_cache_alloc_node+0x6d/0xa0 (gdb) list *0x8029f83d 0x8029f83d is in kmem_cache_alloc_node (mm/slub.c:1646). 
1641if (unlikely(is_end(object) || !node_match(c, node))) { 1642object = __slab_alloc(s, gfpflags, node, addr, c); 1643break; 1644} 1645stat(c, ALLOC_FASTPATH); 1646} while (cmpxchg_local(c-freelist, object, object[c-offset]) 1647 != object); 1648#else 1649unsigned long flags; 1650 That code is part for SLUB_FASTPATH. I'm willing to test the patch, but don't know how fast I can find the time to do it, so my answer if your patch helps might be delayed until the weekend. Mathieu, Christoph is on vacation and I'm not at all that familiar with this cmpxchg_local() optimization, so if you could take a peek at this bug report to see if you can spot something obviously wrong with it, I would much appreciate that. Sure, Initial thoughts : I'd like to get the complete config causing this bug. I suspect either : - A race between the lockless algo and an IRQ in a driver allocating memory. - stat(c, ALLOC_FASTPATH); seems to be using a var++, therefore indicating it is not reentrant if IRQs are disabled. Since those are only stats, I guess it's ok, but still weird. - CPU hotplug problem. http://bugzilla.kernel.org/attachment.cgi?id=14877action=view shows last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map -- is this linked to a cpu up/down event ? Since this shows mostly with network card drivers, I think the most plausible cause would be an IRQ nesting over kmem_cache_alloc_node and calling it. Will dig further... I wonder how SLUB_FASTPATH is supposed to work, since it is affected by a classical ABA problem of lockless algo. cmpxchg_local(c-freelist, object, object[c-offset]) can succeed, while an interrupt came (on this cpu), and several allocations were done, and one free was performed at the end of this interruption, so 'object' was recycled. c-freelist can then contain the previous value (object), but object[c-offset] was changed by IRQ. We then put back in freelist an already allocated object. 
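[Editorial sketch, not part of the thread.] The ABA sequence Eric describes can be replayed deterministically in userspace. This is a single-threaded model under stated assumptions: struct obj, struct cpu_slab, cmpxchg_model(), pop() and push() are hypothetical stand-ins, the "interrupt" is just code placed between the two halves of the fast path, and none of this is the actual mm/slub.c code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slab object: 'next' plays the role of object[c->offset]. */
struct obj {
    struct obj *next;
    int allocated;
};

/* Hypothetical per-cpu structure holding only the freelist head. */
struct cpu_slab {
    struct obj *freelist;
};

/* Model of cmpxchg_local(): store new_val only if *ptr == old. */
static struct obj *cmpxchg_model(struct obj **ptr, struct obj *old,
                                 struct obj *new_val)
{
    struct obj *prev = *ptr;

    if (prev == old)
        *ptr = new_val;
    return prev;
}

/* Allocation and free as performed by the interrupting code. */
static struct obj *pop(struct cpu_slab *c)
{
    struct obj *o = c->freelist;

    if (o) {
        c->freelist = o->next;
        o->allocated = 1;
    }
    return o;
}

static void push(struct cpu_slab *c, struct obj *o)
{
    o->allocated = 0;
    o->next = c->freelist;
    c->freelist = o;
}

/* Returns 1 if the freelist head ends up on an already allocated object. */
static int aba_demo(void)
{
    struct obj a = { 0 }, b = { 0 }, d = { 0 };
    struct cpu_slab c;

    /* Initial freelist: a -> b -> d */
    a.next = &b;
    b.next = &d;
    d.next = NULL;
    c.freelist = &a;

    /* Fast path, first half: sample the head and its successor. */
    struct obj *object = c.freelist;   /* a */
    struct obj *next = object->next;   /* b, about to become stale */

    /* "Interrupt": two allocations, then one free of the first object. */
    struct obj *x = pop(&c);           /* takes a */
    struct obj *y = pop(&c);           /* takes b, which stays allocated */
    push(&c, x);                       /* freelist is now a -> d */
    (void)y;

    /* Fast path, second half: head is 'a' again, so the compare passes. */
    struct obj *prev = cmpxchg_model(&c.freelist, object, next);
    assert(prev == object);
    object->allocated = 1;

    /* The head now points at b, which the interrupt already handed out. */
    return c.freelist == &b && b.allocated;
}
```

The compare succeeds because only the head pointer is checked; the stale successor sampled before the interrupt is what ends up on the freelist.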
Re: Linux 2.6.25-rc2
* Linus Torvalds [EMAIL PROTECTED] wrote: So: - it might be something else entirely - it might still be the local cmpxchg, just Torsten didn't happen to notice it until later. - it might still be the local cmpxchg, but something else changed its patterns to actually make it start triggering. and in general I don't think we should revert it unless we have stronger indications that it really is the problem (eg somebody finds the actual bug, or a reporter can confirm that it goes away when the local cmpxchg optimization is disabled). yeah - my revert suggestions were all completely conditional on such type of test feedback. Btw., i did trigger occasional SLUB crashes myself starting at around -rc1, on the order of one per 200-300 straight random bootups, and yesterday i did a 50-bootups series of a specific .config that crashed, to try to reproduce one of them but failed - so bisection was not an option and i had nothing concrete and repeatable to report either. I had a few complete lockups and only 3 usable backtraces - find them below. Networking features in all of the backtraces - and so does the VFS. All of the crashes are on SMP - and given that 50% of the bootups are UP this gives us a 1:8 chance hint that this bug is SMP specific. (All the crashes are in distccd - that is what this build cluster does mainly so it's the main activity of the box - so they dont necessarily indicate anything workload specific.) Earlier today i turned off local-cmpxchg and havent had a crash or hang since then - but at 200 bootups and 4-5 crashes in a week that's not conclusive yet. I think others might have workloads that trigger this bug more often. 
Ingo mercury login: [ 582.671916] Oops: [#1] SMP DEBUG_PAGEALLOC [ 582.672334] [ 582.672334] Pid: 3776, comm: distccd Not tainted (2.6.25-rc2 #5) [ 582.672334] EIP: 0060:[c0174fda] EFLAGS: 00010246 CPU: 0 [ 582.672334] EIP is at kmem_cache_alloc+0x2a/0x90 [ 582.672334] EAX: EBX: 861c ECX: c069ed1c EDX: 01060002 [ 582.672334] ESI: c0aeffc8 EDI: c1d11714 EBP: f6eddcdc ESP: f6eddcc4 [ 582.672334] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 [ 582.672334] Process distccd (pid: 3776, ti=f6edc000 task=f508c000 task.ti=f6edc000) [ 582.672334] Stack: c06a3d48 f6eddce4 0020 861c 066c c0aeffc8 f6eddcf8 c069ed1c [ 582.672334] 0020 861c f7ce6580 f7ce6580 f6eddd18 c045e7bb [ 582.672334] f7f683e0 861c f52136c0 f7ce6580 f6eddd58 c0461de5 f508c000 [ 582.672334] Call Trace: [ 582.672334] [c06a3d48] ? netif_receive_skb+0x2a8/0x320 [ 582.672334] [c069ed1c] ? __alloc_skb+0x2c/0x110 [ 582.672334] [c045e7bb] ? nv_alloc_rx_optimized+0x10b/0x1a0 [ 582.672334] [c0461de5] ? nv_napi_poll+0x1b5/0x730 [ 582.672334] [c06a62cb] ? net_rx_action+0x16b/0x200 [ 582.672334] [c06a61e8] ? net_rx_action+0x88/0x200 [ 582.672334] [c012d713] ? __do_softirq+0x93/0x120 [ 582.672334] [c012d7f7] ? do_softirq+0x57/0x60 [ 582.672334] [c012dcc9] ? irq_exit+0x69/0x80 [ 582.672334] [c0106325] ? do_IRQ+0x45/0x80 [ 582.672334] [c018a2a2] ? d_instantiate+0x42/0x60 [ 582.672334] [c0103fd8] ? common_interrupt+0x28/0x30 [ 582.672334] [c018a2a2] ? d_instantiate+0x42/0x60 [ 582.672334] [c0149e50] ? lock_release+0xc0/0x1b0 [ 582.672334] [c07d0816] ? _spin_unlock+0x16/0x20 [ 582.672334] [c018a2a2] ? d_instantiate+0x42/0x60 [ 582.672334] [c0202a84] ? ext3_add_nondir+0x34/0x50 [ 582.672334] [c0202fde] ? ext3_create+0x9e/0xe0 [ 582.672334] [c0181498] ? vfs_create+0xb8/0x100 [ 582.672334] [c01838c0] ? open_namei+0x4d0/0x5a0 [ 582.672334] [c0136346] ? in_group_p+0x26/0x30 [ 582.672334] [c020cd40] ? ext3_permission+0x0/0x10 [ 582.672334] [c01770c1] ? do_filp_open+0x31/0x50 [ 582.672334] [c07d081d] ? 
_spin_unlock+0x1d/0x20 [ 582.672334] [c0176e1b] ? get_unused_fd_flags+0xbb/0xe0 [ 582.672334] [c017712d] ? do_sys_open+0x4d/0xf0 [ 582.672334] [c0327894] ? trace_hardirqs_on_thunk+0xc/0x10 [ 582.672334] [c014869d] ? trace_hardirqs_on_caller+0xbd/0x140 [ 582.672334] [c017720c] ? sys_open+0x1c/0x20 [ 582.672334] [c0102fc6] ? sysenter_past_esp+0x5f/0x99 [ 582.672334] === [ 582.672334] Code: c3 55 89 e5 57 56 89 c6 53 83 ec 0c 8b 4d 04 89 55 f0 64 a1 04 40 b7 c0 8b 7c 86 64 90 8d 74 26 00 8b 17 f6 c2 01 75 41 8b 47 0c 8b 1c 82 89 d0 0f b1 1f 39 d0 89 c3 75 e8 66 83 7d f0 00 79 1f [ 582.672334] EIP: [c0174fda] kmem_cache_alloc+0x2a/0x90 SS:ESP 0068:f6eddcc4 [ 582.672343] Kernel panic - not syncing: Fatal exception in interrupt [ 582.673337] Pid: 3776, comm: distccd Tainted: G D 2.6.25-rc2 #5 [ 582.674342] [c0128516] panic+0x46/0x120 [ 582.676335] [c0104be4] die+0x134/0x150 [ 582.678335] [c01182a8] do_page_fault+0x188/0x610 [ 582.680335] [c06c6016] ? ip_local_deliver+0xf6/0x1c0 [ 582.682335] [c0118120] ? do_page_fault+0x0/0x610 [ 582.685334] [c07d0f82] error_code+0x72/0x80
Re: Linux 2.6.25-rc2
On Feb 19, 2008 5:20 PM, Linus Torvalds [EMAIL PROTECTED] wrote:
> So:
>  - it might be something else entirely
>  - it might still be the local cmpxchg, just Torsten didn't happen to
>    notice it until later.

My new hackbench-testcase also killed 2.6.24-rc2-mm1, so I really noticed too late.

>  - it might still be the local cmpxchg, but something else changed its
>    patterns to actually make it start triggering.
>
> and in general I don't think we should revert it unless we have stronger
> indications that it really is the problem (eg somebody finds the actual
> bug, or a reporter can confirm that it goes away when the local cmpxchg
> optimization is disabled).

I tried the following three patches:

switching the barrier() for a smp_mb() in 2.6.25-rc2-mm1:
-> crashed
reverting the FASTPATH-patch in 2.6.25-rc2:
-> worked
only removed FAST_CMPXCHG_LOCAL from arch/x86/Kconfig:
-> worked

So all of these tests seem to confirm that the bug is in the new SLUB fastpath.

Torsten
Re: Linux 2.6.25-rc2
* Eric Dumazet ([EMAIL PROTECTED]) wrote: On Tue, 19 Feb 2008 09:02:30 -0500 Mathieu Desnoyers [EMAIL PROTECTED] wrote: * Pekka Enberg ([EMAIL PROTECTED]) wrote: On Feb 19, 2008 8:54 AM, Torsten Kaiser [EMAIL PROTECTED] wrote: [ 5282.056415] [ cut here ] [ 5282.059757] kernel BUG at lib/list_debug.c:33! [ 5282.062055] invalid opcode: [1] SMP [ 5282.062055] CPU 3 hm. Your crashes do seem to span multiple subsystems, but it always seems to be around the SLUB code. Could you try the patch below? The SLUB code has a new optimization and i'm not 100% sure about it. [the hack below switches the SLUB optimization off by disabling the CPU feature it relies on.] Ingo - arch/x86/Kconfig |4 1 file changed, 4 deletions(-) Index: linux/arch/x86/Kconfig === --- linux.orig/arch/x86/Kconfig +++ linux/arch/x86/Kconfig @@ -59,10 +59,6 @@ config HAVE_LATENCYTOP_SUPPORT config SEMAPHORE_SLEEPERS def_bool y -config FAST_CMPXCHG_LOCAL - bool - default y - config MMU def_bool y $ grep FAST_CMPXCHG_LOCAL */.config linux-2.6.24-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.24-rc3-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.24-rc3-mm2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.24-rc6-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.24-rc8-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.25-rc1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.25-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.25-rc2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y -rc2-mm1 still worked for me. Did you mean the new SLUB_FASTPATH? $ grep define SLUB_FASTPATH */mm/slub.c linux-2.6.25-rc1/mm/slub.c:#define SLUB_FASTPATH linux-2.6.25-rc2-mm1/mm/slub.c:#define SLUB_FASTPATH linux-2.6.25-rc2/mm/slub.c:#define SLUB_FASTPATH The 2.6.24-rc3+ mm-kernels did crash for me, but don't seem to contain this... On the other hand: From the crash in 2.6.25-rc2-mm1: [59987.116182] RIP [8029f83d] kmem_cache_alloc_node+0x6d/0xa0 (gdb) list *0x8029f83d 0x8029f83d is in kmem_cache_alloc_node (mm/slub.c:1646). 
1641if (unlikely(is_end(object) || !node_match(c, node))) { 1642object = __slab_alloc(s, gfpflags, node, addr, c); 1643break; 1644} 1645stat(c, ALLOC_FASTPATH); 1646} while (cmpxchg_local(c-freelist, object, object[c-offset]) 1647 != object); 1648#else 1649unsigned long flags; 1650 That code is part for SLUB_FASTPATH. I'm willing to test the patch, but don't know how fast I can find the time to do it, so my answer if your patch helps might be delayed until the weekend. Mathieu, Christoph is on vacation and I'm not at all that familiar with this cmpxchg_local() optimization, so if you could take a peek at this bug report to see if you can spot something obviously wrong with it, I would much appreciate that. Sure, Initial thoughts : I'd like to get the complete config causing this bug. I suspect either : - A race between the lockless algo and an IRQ in a driver allocating memory. - stat(c, ALLOC_FASTPATH); seems to be using a var++, therefore indicating it is not reentrant if IRQs are disabled. Since those are only stats, I guess it's ok, but still weird. - CPU hotplug problem. http://bugzilla.kernel.org/attachment.cgi?id=14877action=view shows last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map -- is this linked to a cpu up/down event ? Since this shows mostly with network card drivers, I think the most plausible cause would be an IRQ nesting over kmem_cache_alloc_node and calling it. Will dig further... I wonder how SLUB_FASTPATH is supposed to work, since it is affected by a classical ABA problem of lockless algo. cmpxchg_local(c-freelist, object, object[c-offset]) can succeed, while an interrupt came (on this cpu), and several allocations were done, and one free was performed at the end of this interruption, so 'object' was recycled. c-freelist can then contain the previous value (object), but object[c-offset] was changed by IRQ. We then put back in freelist an already allocated object. I think you are right. 
A way to fix this would use the fact that the freelist is only useful to point to the first free object in a page. We could change it to an offset rather than an address. The freelist would become a counter of type long which increments until
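The interleaving Eric describes can be replayed deterministically in user space. The following is only a sketch under stated assumptions: `struct object`, `struct freelist`, `pop()`, `push()` and `interrupted_alloc_corrupts()` are hypothetical names, the "interrupt" is simulated by ordinary function calls, and the `in_use` flag exists purely so the result can be checked. The `seq` counter shows the generic version-tag remedy for ABA; it is not presented as the fix proposed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical object: a free pointer plus a flag used only for checking. */
struct object {
    struct object *next;   /* next free object, meaningful only while free */
    bool in_use;
};

/* Hypothetical per-CPU list head with a version tag bumped on every update. */
struct freelist {
    struct object *first;
    unsigned long seq;
};

/* Allocation as performed by the nested interrupt handler. */
static struct object *pop(struct freelist *fl)
{
    struct object *obj = fl->first;

    fl->first = obj->next;
    fl->seq++;
    obj->in_use = true;
    return obj;
}

/* Free as performed by the nested interrupt handler. */
static void push(struct freelist *fl, struct object *obj)
{
    obj->in_use = false;
    obj->next = fl->first;
    fl->first = obj;
    fl->seq++;
}

/*
 * Replays the interrupted fast path. Returns true when the list head ends
 * up pointing at an object that is still allocated. With use_seq set, the
 * final compare also checks the version tag.
 */
static bool interrupted_alloc_corrupts(bool use_seq)
{
    struct object a = {0}, b = {0}, c = {0};
    struct freelist fl = { .first = NULL, .seq = 0 };

    push(&fl, &c);
    push(&fl, &b);
    push(&fl, &a);                        /* free list: a -> b -> c */

    /* Fast path, part 1: snapshot the head and its successor. */
    struct object *object = fl.first;     /* a */
    struct object *next = object->next;   /* b */
    unsigned long seq = fl.seq;

    /* Simulated interrupt on the same CPU: two allocations, one free. */
    struct object *x = pop(&fl);          /* takes a */
    struct object *y = pop(&fl);          /* takes b, which stays allocated */
    (void)y;
    push(&fl, x);                         /* a is recycled: a -> c */

    /* Fast path, part 2: the compare-and-exchange on the head. */
    if (fl.first == object && (!use_seq || fl.seq == seq)) {
        fl.first = next;                  /* installs the stale successor b */
        fl.seq++;
        object->in_use = true;
    }
    /* On a failed compare the real fast path would loop and retry. */

    return fl.first != NULL && fl.first->in_use;
}
```

Calling `interrupted_alloc_corrupts(false)` reports corruption because the head compares equal even though its successor changed underneath it, while `interrupted_alloc_corrupts(true)` detects the intervening updates and leaves the list intact.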
Re: Linux 2.6.25-rc2
* Pekka Enberg ([EMAIL PROTECTED]) wrote:
> Hi Mathieu,
>
> On Feb 19, 2008 4:02 PM, Mathieu Desnoyers [EMAIL PROTECTED] wrote:
> > - stat(c, ALLOC_FASTPATH); seems to be using a var++, therefore
> >   indicating it is not reentrant if IRQs are disabled. Since those are
> >   only stats, I guess it's ok, but still weird.
>
> What is not re-entrant?

incrementing the variable with a ++ when interrupts are not disabled.
It's not an atomic add and it's racy. The code within stat() does
exactly this.

> On Feb 19, 2008 4:02 PM, Mathieu Desnoyers [EMAIL PROTECTED] wrote:
> > Since this shows mostly with network card drivers, I think the most
> > plausible cause would be an IRQ nesting over kmem_cache_alloc_node and
> > calling it.
>
> Yes, this can happen. Are you saying it is not safe to be in the
> lockless path when an IRQ triggers?

It should be safe, but I think Eric pointed the correct problem in his
reply.

Thanks,

Mathieu

--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
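The race Mathieu points at comes from `var++` being a load, an add and a store that an interrupt can split. A minimal sketch, assuming the compiler emits those three separate steps (it is not guaranteed to): the function name `count_two_events()` is hypothetical and the nested handler is simulated inline rather than by a real interrupt.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Counts two events: one from the interrupted code, one from a nested
 * handler. interrupt_mid_update selects whether the handler runs between
 * the load and the store of the interrupted increment, or after it.
 */
static unsigned long count_two_events(bool interrupt_mid_update)
{
    unsigned long counter = 0;

    unsigned long tmp = counter;   /* load half of counter++ */
    if (interrupt_mid_update)
        counter++;                 /* nested handler increments here */
    counter = tmp + 1;             /* add and store: overwrites the handler's update */
    if (!interrupt_mid_update)
        counter++;                 /* handler ran after the increment completed */

    return counter;
}
```

With the handler landing in the middle, `count_two_events(true)` yields 1 instead of 2. For statistics counters such a lost update is harmless, which matches the "only stats" remark in the message above.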
Re: Linux 2.6.25-rc2
On Tue, 2008-02-19 at 17:52 +0200, Pekka Enberg wrote:
> Ingo Molnar wrote:
> > * Pekka Enberg [EMAIL PROTECTED] wrote:
> > > Yes, this can happen. Are you saying it is not safe to be in the
> > > lockless path when an IRQ triggers?
> > >
> > > Hmm. The barrier() in slab_free() looks fishy. The comment says it's
> > > there to make sure we've retrieved c->freelist before c->page but then
> > > it uses a _compiler barrier_ which doesn't affect the CPU and the reads
> > > may still be re-ordered... Not sure if that matters here though.
> >
> > find a fix patch for that below - most systems affected seem to be SMP
> > ones.
> >
> > If this (or my other patch) indeed solves the problem i'd still favor a
> > full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it looks
> > quite un-cooked and quite un-tested for multiple independent reasons.
> >
> > Sigh, why do i again have to be the messenger who brings the bad news to
> > SLUB land, and again when poor Christoph went on vacation? :-/
> >
> >         Ingo
> >
> > --
> > Subject: SLUB: barrier fix
> > From: Ingo Molnar [EMAIL PROTECTED]
> >
> > ---
> >  mm/slub.c |    2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > Index: linux/mm/slub.c
> > ===================================================================
> > --- linux.orig/mm/slub.c
> > +++ linux/mm/slub.c
> > @@ -1862,7 +1862,7 @@ static __always_inline void slab_free(st
> >         debug_check_no_locks_freed(object, s->objsize);
> >         do {
> >                 freelist = c->freelist;
> > -               barrier();
> > +               smp_mb();
> >                 /*
> >                  * If the compiler would reorder the retrieval of c->page to
> >                  * come before c->freelist then an interrupt could
>
> Torsten/Yamin, does this fix things for you? What about reverting commit
> 1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c (SLUB: Alternate fast paths using
> cmpxchg_local)?

I'm busy in another issue and will test it ASAP. Sorry.

-yanmin
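The distinction Pekka raises, a compiler-only barrier versus a barrier that also orders the CPU, has a rough user-space analogue in C11. This is a sketch only: `atomic_signal_fence()` and `atomic_thread_fence()` stand in for the kernel's `barrier()` and `smp_mb()` by assumption, and `struct cpu_slab_sketch`, `read_local()` and `read_shared()` are hypothetical names, not SLUB code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the two per-CPU fields read in slab_free(). */
struct cpu_slab_sketch {
    _Atomic(void *) freelist;
    _Atomic(void *) page;
};

struct snapshot {
    void *freelist;
    void *page;
};

/*
 * Data touched only by this CPU and by handlers interrupting it:
 * atomic_signal_fence() keeps the compiler from reordering the two loads
 * and emits no fence instruction.
 */
static struct snapshot read_local(struct cpu_slab_sketch *c)
{
    struct snapshot s;

    s.freelist = atomic_load_explicit(&c->freelist, memory_order_relaxed);
    atomic_signal_fence(memory_order_seq_cst);
    s.page = atomic_load_explicit(&c->page, memory_order_relaxed);
    return s;
}

/*
 * Data that other CPUs may update concurrently: atomic_thread_fence()
 * additionally orders the loads at the hardware level.
 */
static struct snapshot read_shared(struct cpu_slab_sketch *c)
{
    struct snapshot s;

    s.freelist = atomic_load_explicit(&c->freelist, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    s.page = atomic_load_explicit(&c->page, memory_order_relaxed);
    return s;
}
```

Which variant is needed depends on who else writes the data. A CPU observes its own accesses in program order, so for state shared only with interrupt handlers on the same CPU the compiler-only form is usually considered enough; the stronger fence matters once another CPU can write the fields.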
Re: Linux 2.6.25-rc2
On Wed, 2008-02-20 at 08:36 +0800, Zhang, Yanmin wrote:
> On Tue, 2008-02-19 at 17:52 +0200, Pekka Enberg wrote:
> > Ingo Molnar wrote:
> > > * Pekka Enberg [EMAIL PROTECTED] wrote:
> > > > Yes, this can happen. Are you saying it is not safe to be in the
> > > > lockless path when an IRQ triggers?
> > > >
> > > > Hmm. The barrier() in slab_free() looks fishy. The comment says it's
> > > > there to make sure we've retrieved c->freelist before c->page but then
> > > > it uses a _compiler barrier_ which doesn't affect the CPU and the reads
> > > > may still be re-ordered... Not sure if that matters here though.
> > >
> > > find a fix patch for that below - most systems affected seem to be SMP
> > > ones.
> > >
> > > If this (or my other patch) indeed solves the problem i'd still favor a
> > > full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it looks
> > > quite un-cooked and quite un-tested for multiple independent reasons.
> > >
> > > Sigh, why do i again have to be the messenger who brings the bad news to
> > > SLUB land, and again when poor Christoph went on vacation? :-/
> > >
> > >         Ingo
> > >
> > > --
> > > Subject: SLUB: barrier fix
> > > From: Ingo Molnar [EMAIL PROTECTED]
> > >
> > > ---
> > >  mm/slub.c |    2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > Index: linux/mm/slub.c
> > > ===================================================================
> > > --- linux.orig/mm/slub.c
> > > +++ linux/mm/slub.c
> > > @@ -1862,7 +1862,7 @@ static __always_inline void slab_free(st
> > >         debug_check_no_locks_freed(object, s->objsize);
> > >         do {
> > >                 freelist = c->freelist;
> > > -               barrier();
> > > +               smp_mb();
> > >                 /*
> > >                  * If the compiler would reorder the retrieval of c->page to
> > >                  * come before c->freelist then an interrupt could
> >
> > Torsten/Yamin, does this fix things for you? What about reverting commit
> > 1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c (SLUB: Alternate fast paths using
> > cmpxchg_local)?
>
> I'm busy in another issue and will test it ASAP. Sorry.

I tested it on my 3 x86-64 machines. The small fix to use smp_mb to
replace barrier in slab_free doesn't work. Kernel still crashed at the
same place. I will test the reverting patch.

-yanmin
Re: Linux 2.6.25-rc2
On Wed, 2008-02-20 at 10:08 +0800, Zhang, Yanmin wrote:
> On Wed, 2008-02-20 at 08:36 +0800, Zhang, Yanmin wrote:
> > On Tue, 2008-02-19 at 17:52 +0200, Pekka Enberg wrote:
> > > Ingo Molnar wrote:
> > > > * Pekka Enberg [EMAIL PROTECTED] wrote:
> > > > > Yes, this can happen. Are you saying it is not safe to be in the
> > > > > lockless path when an IRQ triggers?
> > > > >
> > > > > Hmm. The barrier() in slab_free() looks fishy. The comment says it's
> > > > > there to make sure we've retrieved c->freelist before c->page but then
> > > > > it uses a _compiler barrier_ which doesn't affect the CPU and the reads
> > > > > may still be re-ordered... Not sure if that matters here though.
> > > >
> > > > find a fix patch for that below - most systems affected seem to be SMP
> > > > ones.
> > > >
> > > > If this (or my other patch) indeed solves the problem i'd still favor a
> > > > full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it looks
> > > > quite un-cooked and quite un-tested for multiple independent reasons.
> > > >
> > > > Sigh, why do i again have to be the messenger who brings the bad news to
> > > > SLUB land, and again when poor Christoph went on vacation? :-/
> > > >
> > > >         Ingo
> > > >
> > > > --
> > > > Subject: SLUB: barrier fix
> > > > From: Ingo Molnar [EMAIL PROTECTED]
> > > >
> > > > ---
> > > >  mm/slub.c |    2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > Index: linux/mm/slub.c
> > > > ===================================================================
> > > > --- linux.orig/mm/slub.c
> > > > +++ linux/mm/slub.c
> > > > @@ -1862,7 +1862,7 @@ static __always_inline void slab_free(st
> > > >         debug_check_no_locks_freed(object, s->objsize);
> > > >         do {
> > > >                 freelist = c->freelist;
> > > > -               barrier();
> > > > +               smp_mb();
> > > >                 /*
> > > >                  * If the compiler would reorder the retrieval of c->page to
> > > >                  * come before c->freelist then an interrupt could
> > >
> > > Torsten/Yamin, does this fix things for you? What about reverting commit
> > > 1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c (SLUB: Alternate fast paths using
> > > cmpxchg_local)?
> >
> > I'm busy in another issue and will test it ASAP. Sorry.
>
> I tested it on my 3 x86-64 machines. The small fix to use smp_mb to
> replace barrier in slab_free doesn't work. Kernel still crashed at the
> same place. I will test the reverting patch.

Kernel with the reverting patch is ok. I ran reboot/hackbench for more
than 10 times on every one of my 3 x86-64 machines, and kernel didn't
crash.

-yanmin
Re: Linux 2.6.25-rc2
On 2/20/2008, Zhang, Yanmin [EMAIL PROTECTED] wrote:
> Kernel with the reverting patch is ok. I ran reboot/hackbench for more
> than 10 times on every one of my 3 x86-64 machines, and kernel didn't
> crash.

Great, Linus reverted the patch yesterday. Thanks for testing!
Re: Linux 2.6.25-rc2
On Feb 19, 2008 8:54 AM, Torsten Kaiser <[EMAIL PROTECTED]> wrote:
> > > [ 5282.056415] ------------[ cut here ]------------
> > > [ 5282.059757] kernel BUG at lib/list_debug.c:33!
> > > [ 5282.062055] invalid opcode: [1] SMP
> > > [ 5282.062055] CPU 3
> >
> > hm. Your crashes do seem to span multiple subsystems, but it always
> > seems to be around the SLUB code. Could you try the patch below? The
> > SLUB code has a new optimization and i'm not 100% sure about it. [the
> > hack below switches the SLUB optimization off by disabling the CPU
> > feature it relies on.]
> >
> >         Ingo
> >
> > ->
> >  arch/x86/Kconfig |    4 ----
> >  1 file changed, 4 deletions(-)
> >
> > Index: linux/arch/x86/Kconfig
> > ===================================================================
> > --- linux.orig/arch/x86/Kconfig
> > +++ linux/arch/x86/Kconfig
> > @@ -59,10 +59,6 @@ config HAVE_LATENCYTOP_SUPPORT
> >  config SEMAPHORE_SLEEPERS
> >         def_bool y
> >
> > -config FAST_CMPXCHG_LOCAL
> > -       bool
> > -       default y
> > -
> >  config MMU
> >         def_bool y
>
> $ grep FAST_CMPXCHG_LOCAL */.config
> linux-2.6.24-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> linux-2.6.24-rc3-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> linux-2.6.24-rc3-mm2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> linux-2.6.24-rc6-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> linux-2.6.24-rc8-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> linux-2.6.25-rc1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> linux-2.6.25-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> linux-2.6.25-rc2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
>
> -rc2-mm1 still worked for me.
>
> Did you mean the new SLUB_FASTPATH?
> $ grep "define SLUB_FASTPATH" */mm/slub.c
> linux-2.6.25-rc1/mm/slub.c:#define SLUB_FASTPATH
> linux-2.6.25-rc2-mm1/mm/slub.c:#define SLUB_FASTPATH
> linux-2.6.25-rc2/mm/slub.c:#define SLUB_FASTPATH
>
> The 2.6.24-rc3+ mm-kernels did crash for me, but don't seem to contain this...
>
> On the other hand:
> From the crash in 2.6.25-rc2-mm1:
> [59987.116182] RIP [8029f83d] kmem_cache_alloc_node+0x6d/0xa0
>
> (gdb) list *0x8029f83d
> 0x8029f83d is in kmem_cache_alloc_node (mm/slub.c:1646).
> 1641            if (unlikely(is_end(object) || !node_match(c, node))) {
> 1642                    object = __slab_alloc(s, gfpflags, node, addr, c);
> 1643                    break;
> 1644            }
> 1645            stat(c, ALLOC_FASTPATH);
> 1646    } while (cmpxchg_local(&c->freelist, object, object[c->offset])
> 1647                                                            != object);
> 1648    #else
> 1649            unsigned long flags;
> 1650
>
> That code is part for SLUB_FASTPATH.
>
> I'm willing to test the patch, but don't know how fast I can find the
> time to do it, so my answer if your patch helps might be delayed until
> the weekend.

Mathieu, Christoph is on vacation and I'm not at all that familiar with
this cmpxchg_local() optimization, so if you could take a peek at this
bug report to see if you can spot something obviously wrong with it, I
would much appreciate that.
Re: Linux 2.6.25-rc2
On Feb 19, 2008 7:11 AM, Ingo Molnar <[EMAIL PROTECTED]> wrote:
> * Torsten Kaiser <[EMAIL PROTECTED]> wrote:
> > On Feb 15, 2008 10:23 PM, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> > >
> > > Ok,
> > >  this kernel is a winner.
> >
> > Sadly not for me:
> > [ 5282.056415] ------------[ cut here ]------------
> > [ 5282.059757] kernel BUG at lib/list_debug.c:33!
> > [ 5282.062055] invalid opcode: [1] SMP
> > [ 5282.062055] CPU 3
>
> hm. Your crashes do seem to span multiple subsystems, but it always
> seems to be around the SLUB code. Could you try the patch below? The
> SLUB code has a new optimization and i'm not 100% sure about it. [the
> hack below switches the SLUB optimization off by disabling the CPU
> feature it relies on.]
>
>         Ingo
>
> ->
>  arch/x86/Kconfig |    4 ----
>  1 file changed, 4 deletions(-)
>
> Index: linux/arch/x86/Kconfig
> ===================================================================
> --- linux.orig/arch/x86/Kconfig
> +++ linux/arch/x86/Kconfig
> @@ -59,10 +59,6 @@ config HAVE_LATENCYTOP_SUPPORT
>  config SEMAPHORE_SLEEPERS
>         def_bool y
>
> -config FAST_CMPXCHG_LOCAL
> -       bool
> -       default y
> -
>  config MMU
>         def_bool y

$ grep FAST_CMPXCHG_LOCAL */.config
linux-2.6.24-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.24-rc3-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.24-rc3-mm2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.24-rc6-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.24-rc8-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.25-rc1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.25-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.25-rc2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y

-rc2-mm1 still worked for me.

Did you mean the new SLUB_FASTPATH?
$ grep "define SLUB_FASTPATH" */mm/slub.c
linux-2.6.25-rc1/mm/slub.c:#define SLUB_FASTPATH
linux-2.6.25-rc2-mm1/mm/slub.c:#define SLUB_FASTPATH
linux-2.6.25-rc2/mm/slub.c:#define SLUB_FASTPATH

The 2.6.24-rc3+ mm-kernels did crash for me, but don't seem to contain this...

On the other hand:
From the crash in 2.6.25-rc2-mm1:
[59987.116182] RIP [8029f83d] kmem_cache_alloc_node+0x6d/0xa0

(gdb) list *0x8029f83d
0x8029f83d is in kmem_cache_alloc_node (mm/slub.c:1646).
1641            if (unlikely(is_end(object) || !node_match(c, node))) {
1642                    object = __slab_alloc(s, gfpflags, node, addr, c);
1643                    break;
1644            }
1645            stat(c, ALLOC_FASTPATH);
1646    } while (cmpxchg_local(&c->freelist, object, object[c->offset])
1647                                                            != object);
1648    #else
1649            unsigned long flags;
1650

That code is part for SLUB_FASTPATH.

I'm willing to test the patch, but don't know how fast I can find the
time to do it, so my answer if your patch helps might be delayed until
the weekend.

Torsten
Re: Linux 2.6.25-rc2
On Feb 19, 2008 12:54 AM, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> On Sat, 16 Feb 2008, Torsten Kaiser wrote:
> >
> > [ 5282.056415] ------------[ cut here ]------------
> > [ 5282.059757] kernel BUG at lib/list_debug.c:33!
>
> Is there any chance that you could try to bisect this, if it's repeatable
> enough for you? Even if you can't bisect it *all* the way, it would be
> really good to do a handful of bisection runs which should already
> hopefully narrow it down a bit more.
>
>                 Linus

It's repeatable, but not in a really reliable way. So to mark a kernel
good I need to compile around 100 KDE packages, and even then I'm not
100% sure if it's good or if I was just lucky.

But I did a partial bisect against 2.6.24-rc6-mm1:
2.6.24-rc6 + mm-patches up to (including) git.nfsd -> worked
2.6.24-rc6 + mm-patches up to (including) git.xfs -> crashed

I think the only added patches between rc2-mm1 and rc3-mm2 in that range
were the iommu changes that I later ruled out.

That leaves some git trees as suspects:
git-ocfs2.patch
git-selinux.patch
git-s390.patch
git-sched.patch
git-sh.patch
git-scsi-misc.patch
git-unionfs.patch
git-v9fs.patch
git-watchdog.patch
git-wireless.patch
git-ipwireless_cs.patch
git-x86.patch
git-xfs.patch

(see http://marc.info/?l=linux-kernel&m=120276641105256 )

Torsten
Re: Linux 2.6.25-rc2
* Torsten Kaiser <[EMAIL PROTECTED]> wrote:
> On Feb 15, 2008 10:23 PM, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> >
> > Ok,
> >  this kernel is a winner.
>
> Sadly not for me:
> [ 5282.056415] ------------[ cut here ]------------
> [ 5282.059757] kernel BUG at lib/list_debug.c:33!
> [ 5282.062055] invalid opcode: [1] SMP
> [ 5282.062055] CPU 3

hm. Your crashes do seem to span multiple subsystems, but it always
seems to be around the SLUB code. Could you try the patch below? The
SLUB code has a new optimization and i'm not 100% sure about it. [the
hack below switches the SLUB optimization off by disabling the CPU
feature it relies on.]

        Ingo

->
 arch/x86/Kconfig |    4 ----
 1 file changed, 4 deletions(-)

Index: linux/arch/x86/Kconfig
===================================================================
--- linux.orig/arch/x86/Kconfig
+++ linux/arch/x86/Kconfig
@@ -59,10 +59,6 @@ config HAVE_LATENCYTOP_SUPPORT
 config SEMAPHORE_SLEEPERS
        def_bool y

-config FAST_CMPXCHG_LOCAL
-       bool
-       default y
-
 config MMU
        def_bool y
Re: Linux 2.6.25-rc2 regression: LVM cannot find volume group
On Sat, Feb 16, 2008 at 11:37:37PM +0100, Jiri Slaby wrote:
> # CONFIG_SYSFS_DEPRECATED is not set

IMHO That should be *set* by default until everyone has had time to
update their userspace software to cope with the changed sysfs layout.

Alasdair
--
[EMAIL PROTECTED]
Re: Linux 2.6.25-rc2
On Sat, 16 Feb 2008, Torsten Kaiser wrote:
>
> [ 5282.056415] ------------[ cut here ]------------
> [ 5282.059757] kernel BUG at lib/list_debug.c:33!

Is there any chance that you could try to bisect this, if it's repeatable
enough for you? Even if you can't bisect it *all* the way, it would be
really good to do a handful of bisection runs which should already
hopefully narrow it down a bit more.

                Linus
Re: [BUG] Linux 2.6.25-rc2 - Kernel Ooops while running dbench
Jeff Garzik wrote:
> Two x86-64 boxes here lock up here on 2.6.25-rc2, shortly after boot.
> One running Fedora 8 + X (GNOME) and one a headless file server.
> configs and lspci attached. Unable to capture any splatter so far.

Sounds like it may be http://lkml.org/lkml/2008/2/17/78.
Suggest you try reverting that before doing the bisect.

Cheers,
FJP
Re: [BUG] Linux 2.6.25-rc2 - Kernel Ooops while running dbench
Two x86-64 boxes here lock up here on 2.6.25-rc2, shortly after boot. One running Fedora 8 + X (GNOME) and one a headless file server. configs and lspci attached. Unable to capture any splatter so far. Bisecting... 00:00.0 Host bridge: Intel Corporation 82955X Memory Controller Hub 00:01.0 PCI bridge: Intel Corporation 82955X PCI Express Root Port 00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 (rev 01) 00:1c.4 PCI bridge: Intel Corporation 82801GR/GH/GHM (ICH7 Family) PCI Express Port 5 (rev 01) 00:1c.5 PCI bridge: Intel Corporation 82801GR/GH/GHM (ICH7 Family) PCI Express Port 6 (rev 01) 00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #1 (rev 01) 00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #2 (rev 01) 00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #3 (rev 01) 00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #4 (rev 01) 00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI Controller (rev 01) 00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev e1) 00:1f.0 ISA bridge: Intel Corporation 82801GB/GR (ICH7 Family) LPC Interface Bridge (rev 01) 00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 01) 00:1f.2 SATA controller: Intel Corporation 82801GR/GH (ICH7 Family) SATA AHCI Controller (rev 01) 00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller (rev 01) 01:00.0 VGA compatible controller: nVidia Corporation NV44 [Quadro NVS 285] (rev a1) 04:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5751 Gigabit Ethernet PCI Express (rev 01) 05:02.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5701 Gigabit Ethernet (rev 15) 00:00.0 Host bridge: Intel Corporation 82975X Memory Controller Hub 00:01.0 PCI bridge: Intel Corporation 82975X PCI Express Root Port 00:1b.0 Audio device: Intel Corporation 82801G (ICH7 
Family) High Definition Audio Controller (rev 01) 00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 (rev 01) 00:1c.4 PCI bridge: Intel Corporation 82801GR/GH/GHM (ICH7 Family) PCI Express Port 5 (rev 01) 00:1c.5 PCI bridge: Intel Corporation 82801GR/GH/GHM (ICH7 Family) PCI Express Port 6 (rev 01) 00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #1 (rev 01) 00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #2 (rev 01) 00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #3 (rev 01) 00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #4 (rev 01) 00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI Controller (rev 01) 00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev e1) 00:1f.0 ISA bridge: Intel Corporation 82801GH (ICH7DH) LPC Interface Bridge (rev 01) 00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 01) 00:1f.2 SATA controller: Intel Corporation 82801GR/GH (ICH7 Family) SATA AHCI Controller (rev 01) 00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller (rev 01) 01:00.0 VGA compatible controller: ATI Technologies Inc R580 [Radeon X1900 XT] (Primary) 01:00.1 Display controller: ATI Technologies Inc R580 [Radeon X1900 XT] (Secondary) 02:00.0 Multimedia controller: Philips Semiconductors Unknown device 7162 04:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller 05:02.0 Network controller: RaLink RT2561/RT61 802.11g PCI 05:04.0 FireWire (IEEE 1394): Texas Instruments TSB43AB23 IEEE-1394a-2000 Controller (PHY/Link) 05:05.0 RAID bus controller: Silicon Image, Inc. SiI 3114 [SATALink/SATARaid] Serial ATA Controller (rev 02) pretzel.bz2 Description: application/bzip core.bz2 Description: application/bzip
Re: Linux 2.6.25-rc2 regression: LVM cannot find volume group
On 17.02.2008 Jeff Chua wrote:
> I faced the same problem, but resolved with ...
>
> vgscan
> vgchange -a y

Sorry, I'm not sure what to do with those two commands. Running them
once manually doesn't seem to change anything, and my initrd already
contains them AFAICS.

> Also, ensure you set "write_cache_state = 1" in /etc/lvm.conf before
> running the above.

That was already set by default.

Thanks,
Tilman

--
Tilman Schmidt                    E-Mail: [EMAIL PROTECTED]
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)
Re: [BUG] Linux 2.6.25-rc2 - Kernel Ooops while running dbench
On Sat, 16 Feb 2008 11:14:46 +0530 Kamalesh Babulal [EMAIL PROTECTED] wrote:
> The 2.6.25-rc2 kernel oopses while running dbench on ext3 filesystem
> mounted with mount -o data=writeback,nobh option on the x86_64 box
>
> BUG: unable to handle kernel NULL pointer dereference at
> IP: [80274972] kmem_cache_alloc+0x3a/0x6c
> PGD 1f6860067 PUD 1f5d64067 PMD 0
> Oops: [1] SMP
> CPU 3
> Modules linked in:
> Pid: 4271, comm: dbench Not tainted 2.6.25-rc2-autotest #1
> RIP: 0010:[80274972] [80274972] kmem_cache_alloc+0x3a/0x6c
> RSP: :8101fb041dc8 EFLAGS: 00010246
> RAX: RBX: 810180033c00 RCX: 8027b269
> RDX: RSI: 80d0 RDI: 80632d70
> RBP: 80d0 R08: 0001 R09:
> R10: 8101feb36e50 R11: 0190 R12: 0001
> R13: R14: 8101f8f38000 R15: ff9c
> FS: () GS:8101fff0f000(0063) knlGS:f7e41460
> CS: 0010 DS: 002b ES: 002b CR0: 80050033
> CR2: CR3: 0001f562 CR4: 06e0
> DR0: DR1: DR2:
> DR3: DR6: 0ff0 DR7: 0400
> Process dbench (pid: 4271, threadinfo 8101fb04, task 8101fb18)
> Stack: 0001 8101fb041ea8 0001 8027b269
> 8101fb041ea8 80281fe8 0001
> 8101fb041ea8 ff9c 000b 0001
> Call Trace:
> [8027b269] get_empty_filp+0x55/0xf9
> [80281fe8] __path_lookup_intent_open+0x22/0x8f
> [80282853] open_namei+0x86/0x5a7
> [8027d019] vfs_stat_fd+0x3c/0x4a
> [80279ab1] do_filp_open+0x1c/0x3d
> [80279c2c] get_unused_fd_flags+0x79/0x111
> [80279dce] do_sys_open+0x46/0xca
> [80221c82] ia32_sysret+0x0/0xa

Looks to me like we broke slab. Christoph is offline until the 27th..
Re: [BUG] Linux 2.6.25-rc2 - Kernel Ooops while running dbench
Two x86-64 boxes here lock up here on 2.6.25-rc2, shortly after boot. One running Fedora 8 + X (GNOME) and one a headless file server. configs and lspci attached. Unable to capture any splatter so far. Bisecting... 00:00.0 Host bridge: Intel Corporation 82955X Memory Controller Hub 00:01.0 PCI bridge: Intel Corporation 82955X PCI Express Root Port 00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 (rev 01) 00:1c.4 PCI bridge: Intel Corporation 82801GR/GH/GHM (ICH7 Family) PCI Express Port 5 (rev 01) 00:1c.5 PCI bridge: Intel Corporation 82801GR/GH/GHM (ICH7 Family) PCI Express Port 6 (rev 01) 00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #1 (rev 01) 00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #2 (rev 01) 00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #3 (rev 01) 00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #4 (rev 01) 00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI Controller (rev 01) 00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev e1) 00:1f.0 ISA bridge: Intel Corporation 82801GB/GR (ICH7 Family) LPC Interface Bridge (rev 01) 00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 01) 00:1f.2 SATA controller: Intel Corporation 82801GR/GH (ICH7 Family) SATA AHCI Controller (rev 01) 00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller (rev 01) 01:00.0 VGA compatible controller: nVidia Corporation NV44 [Quadro NVS 285] (rev a1) 04:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5751 Gigabit Ethernet PCI Express (rev 01) 05:02.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5701 Gigabit Ethernet (rev 15) 00:00.0 Host bridge: Intel Corporation 82975X Memory Controller Hub 00:01.0 PCI bridge: Intel Corporation 82975X PCI Express Root Port 00:1b.0 Audio device: Intel Corporation 82801G (ICH7 
Family) High Definition Audio Controller (rev 01) 00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 (rev 01) 00:1c.4 PCI bridge: Intel Corporation 82801GR/GH/GHM (ICH7 Family) PCI Express Port 5 (rev 01) 00:1c.5 PCI bridge: Intel Corporation 82801GR/GH/GHM (ICH7 Family) PCI Express Port 6 (rev 01) 00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #1 (rev 01) 00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #2 (rev 01) 00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #3 (rev 01) 00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #4 (rev 01) 00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI Controller (rev 01) 00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev e1) 00:1f.0 ISA bridge: Intel Corporation 82801GH (ICH7DH) LPC Interface Bridge (rev 01) 00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 01) 00:1f.2 SATA controller: Intel Corporation 82801GR/GH (ICH7 Family) SATA AHCI Controller (rev 01) 00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller (rev 01) 01:00.0 VGA compatible controller: ATI Technologies Inc R580 [Radeon X1900 XT] (Primary) 01:00.1 Display controller: ATI Technologies Inc R580 [Radeon X1900 XT] (Secondary) 02:00.0 Multimedia controller: Philips Semiconductors Unknown device 7162 04:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller 05:02.0 Network controller: RaLink RT2561/RT61 802.11g PCI 05:04.0 FireWire (IEEE 1394): Texas Instruments TSB43AB23 IEEE-1394a-2000 Controller (PHY/Link) 05:05.0 RAID bus controller: Silicon Image, Inc. SiI 3114 [SATALink/SATARaid] Serial ATA Controller (rev 02) pretzel.bz2 Description: application/bzip core.bz2 Description: application/bzip
Re: [BUG] Linux 2.6.25-rc2 - Kernel Ooops while running dbench
Jeff Garzik wrote: Two x86-64 boxes here lock up here on 2.6.25-rc2, shortly after boot. One running Fedora 8 + X (GNOME) and one a headless file server. configs and lspci attached. Unable to capture any splatter so far. Sounds like it may be http://lkml.org/lkml/2008/2/17/78. Suggest you try reverting that before doing the bisect. Cheers, FJP -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Linux 2.6.25-rc2
On Sat, 16 Feb 2008, Torsten Kaiser wrote:
> [ 5282.056415] ------------[ cut here ]------------
> [ 5282.059757] kernel BUG at lib/list_debug.c:33!

Is there any chance that you could try to bisect this, if it's repeatable enough for you? Even if you can't bisect it *all* the way, it would be really good to do a handful of bisection runs which should already hopefully narrow it down a bit more.

Linus
Re: Linux 2.6.25-rc2 regression: LVM cannot find volume group
On Sat, Feb 16, 2008 at 11:37:37PM +0100, Jiri Slaby wrote:
> # CONFIG_SYSFS_DEPRECATED is not set

IMHO That should be *set* by default until everyone has had time to update their userspace software to cope with the changed sysfs layout.

Alasdair
Re: Linux 2.6.25-rc2
* Torsten Kaiser <[EMAIL PROTECTED]> wrote:

> On Feb 15, 2008 10:23 PM, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> >
> > Ok,
> > this kernel is a winner.
>
> Sadly not for me:
> [ 5282.056415] ------------[ cut here ]------------
> [ 5282.059757] kernel BUG at lib/list_debug.c:33!
> [ 5282.062055] invalid opcode: [1] SMP
> [ 5282.062055] CPU 3

hm. Your crashes do seem to span multiple subsystems, but it always seems to be around the SLUB code. Could you try the patch below? The SLUB code has a new optimization and i'm not 100% sure about it. [the hack below switches the SLUB optimization off by disabling the CPU feature it relies on.]

Ingo

---
 arch/x86/Kconfig |    4 ----
 1 file changed, 4 deletions(-)

Index: linux/arch/x86/Kconfig
===================================================================
--- linux.orig/arch/x86/Kconfig
+++ linux/arch/x86/Kconfig
@@ -59,10 +59,6 @@ config HAVE_LATENCYTOP_SUPPORT
 config SEMAPHORE_SLEEPERS
 	def_bool y
 
-config FAST_CMPXCHG_LOCAL
-	bool
-	default y
-
 config MMU
 	def_bool y
 
Re: Linux 2.6.25-rc2
On Feb 19, 2008 12:54 AM, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> On Sat, 16 Feb 2008, Torsten Kaiser wrote:
> > [ 5282.056415] ------------[ cut here ]------------
> > [ 5282.059757] kernel BUG at lib/list_debug.c:33!
>
> Is there any chance that you could try to bisect this, if it's repeatable
> enough for you? Even if you can't bisect it *all* the way, it would be
> really good to do a handful of bisection runs which should already
> hopefully narrow it down a bit more.
>
> Linus

It's repeatable, but not in a really reliable way. So to mark a kernel good I need to compile around 100 KDE packages, and even then I'm not 100% sure whether it's good or I was just lucky.

But I did a partial bisect against 2.6.24-rc6-mm1:
2.6.24-rc6 + mm-patches up to (including) git.nfsd - worked
2.6.24-rc6 + mm-patches up to (including) git.xfs - crashed

I think the only patches added between rc2-mm1 and rc3-mm2 in that range were the iommu changes that I later ruled out.
That leaves some git trees as suspects:
git-ocfs2.patch
git-selinux.patch
git-s390.patch
git-sched.patch
git-sh.patch
git-scsi-misc.patch
git-unionfs.patch
git-v9fs.patch
git-watchdog.patch
git-wireless.patch
git-ipwireless_cs.patch
git-x86.patch
git-xfs.patch
(see http://marc.info/?l=linux-kernel&m=120276641105256 )

Torsten
Re: Linux 2.6.25-rc2
On Feb 19, 2008 7:11 AM, Ingo Molnar <[EMAIL PROTECTED]> wrote:
> * Torsten Kaiser <[EMAIL PROTECTED]> wrote:
>
> > On Feb 15, 2008 10:23 PM, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> > >
> > > Ok,
> > > this kernel is a winner.
> >
> > Sadly not for me:
> > [ 5282.056415] ------------[ cut here ]------------
> > [ 5282.059757] kernel BUG at lib/list_debug.c:33!
> > [ 5282.062055] invalid opcode: [1] SMP
> > [ 5282.062055] CPU 3
>
> hm. Your crashes do seem to span multiple subsystems, but it always
> seems to be around the SLUB code. Could you try the patch below? The
> SLUB code has a new optimization and i'm not 100% sure about it. [the
> hack below switches the SLUB optimization off by disabling the CPU
> feature it relies on.]
>
> Ingo
>
> ---
>  arch/x86/Kconfig |    4 ----
>  1 file changed, 4 deletions(-)
>
> Index: linux/arch/x86/Kconfig
> ===================================================================
> --- linux.orig/arch/x86/Kconfig
> +++ linux/arch/x86/Kconfig
> @@ -59,10 +59,6 @@ config HAVE_LATENCYTOP_SUPPORT
>  config SEMAPHORE_SLEEPERS
>  	def_bool y
>
> -config FAST_CMPXCHG_LOCAL
> -	bool
> -	default y
> -
>  config MMU
>  	def_bool y

$ grep FAST_CMPXCHG_LOCAL */.config
linux-2.6.24-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.24-rc3-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.24-rc3-mm2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.24-rc6-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.24-rc8-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.25-rc1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.25-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.25-rc2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y

-rc2-mm1 still worked for me.

Did you mean the new SLUB_FASTPATH?
$ grep "define SLUB_FASTPATH" */mm/slub.c
linux-2.6.25-rc1/mm/slub.c:#define SLUB_FASTPATH
linux-2.6.25-rc2-mm1/mm/slub.c:#define SLUB_FASTPATH
linux-2.6.25-rc2/mm/slub.c:#define SLUB_FASTPATH

The 2.6.24-rc3+ mm-kernels did crash for me, but don't seem to contain this...

On the other hand:
From the crash in 2.6.25-rc2-mm1:
[59987.116182] RIP [8029f83d] kmem_cache_alloc_node+0x6d/0xa0

(gdb) list *0x8029f83d
0x8029f83d is in kmem_cache_alloc_node (mm/slub.c:1646).
1641            if (unlikely(is_end(object) || !node_match(c, node))) {
1642                    object = __slab_alloc(s, gfpflags, node, addr, c);
1643                    break;
1644            }
1645            stat(c, ALLOC_FASTPATH);
1646    } while (cmpxchg_local(&c->freelist, object, object[c->offset])
1647                                                            != object);
1648 #else
1649    unsigned long flags;
1650

That code is part of SLUB_FASTPATH.

I'm willing to test the patch, but I don't know how quickly I can find the time to do it, so my answer on whether your patch helps might be delayed until the weekend.

Torsten
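For readers following the gdb listing above: the loop at mm/slub.c:1646 is a compare-and-exchange retry loop that pops the first object off a freelist. A rough user-space sketch of that pattern follows. Everything in it is an assumption made for illustration: the names `struct freelist`, `freelist_push` and `freelist_pop` are hypothetical, C11 `atomic_compare_exchange_weak` stands in for the kernel's `cmpxchg_local()` (a different primitive, which as far as I understand is only atomic with respect to the local CPU), and the `__slab_alloc()` slow path is left out, so an empty list simply yields NULL.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* A free object keeps the pointer to the next free object in its first word. */
struct object {
    struct object *next_free;
};

/* Hypothetical freelist head; not the kernel's per-CPU slab structure. */
struct freelist {
    _Atomic(struct object *) head;
};

/* Push obj onto the list, retrying until the head swap succeeds. */
static void freelist_push(struct freelist *fl, struct object *obj)
{
    assert(fl != NULL && obj != NULL);
    struct object *old = atomic_load(&fl->head);
    do {
        obj->next_free = old;
        /* On failure, old is reloaded with the current head. */
    } while (!atomic_compare_exchange_weak(&fl->head, &old, obj));
}

/*
 * Pop the first object: read the head, then try to replace it with the
 * object's successor. If the head changed in between, retry with the new
 * head. Returns NULL when the list is empty (the real code would take a
 * slow path instead).
 */
static struct object *freelist_pop(struct freelist *fl)
{
    assert(fl != NULL);
    struct object *obj = atomic_load(&fl->head);
    while (obj != NULL &&
           !atomic_compare_exchange_weak(&fl->head, &obj, obj->next_free)) {
        /* obj now holds the current head; loop and try again. */
    }
    return obj;
}
```

This sketch ignores the ABA problem that a general multi-threaded lock-free stack has to deal with, so it should be read as an illustration of the retry loop only, not as a safe concurrent allocator.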
Re: Linux 2.6.25-rc2 regression: LVM cannot find volume group
On Feb 18, 2008 8:57 AM, Tilman Schmidt <[EMAIL PROTECTED]> wrote:
> On 16.02.2008 23:37, Jiri Slaby wrote:
> > On 02/16/2008 09:12 PM, Alan Cox wrote:
> > Try to upgrade to at least lvm 2.02.29 (I guess this is the first version which
> > understands the new sysfs layout).
> I'll have to investigate how to do that without breaking anything.

I faced the same problem, but resolved it with ...

vgscan
vgchange -a y

Also, ensure you set "write_cache_state = 1" in /etc/lvm.conf before running the above.

Let me know if this helps.

Thanks,
Jeff.
Re: Linux 2.6.25-rc2 regression: LVM cannot find volume group
On 16.02.2008 23:37, Jiri Slaby wrote:
> On 02/16/2008 09:12 PM, Alan Cox wrote:
> > On Sat, 16 Feb 2008 20:14:30 +0100 Tilman Schmidt <[EMAIL PROTECTED]> wrote:
> > > 2.6.25-rc2 fails to bring up my openSUSE 10.3 PC because LVM cannot find
> > > the volume group containing the root file system.
> > > 2.6.25-rc1 has the same problem, 2.6.24 works fine.
> > > Bisection says:
> > >
> > > edfaa7c36574f1bf09c65ad602412db9da5f96bf is first bad commit
> > > commit edfaa7c36574f1bf09c65ad602412db9da5f96bf
> > > Author: Kay Sievers <[EMAIL PROTECTED]>
> > > Date: Mon May 21 22:08:01 2007 +0200
> > >
> > >     Driver core: convert block from raw kobjects to core devices
> > >
> > >     This moves the block devices to /sys/class/block. It will create a
> > >     flat list of all block devices, with the disks and partitions in one
> > >     directory. For compatibility /sys/block is created and contains
> > >     symlinks to the disks.
> > >
> > > Apparently, compatibility is in the eye of the beholder - in this case, LVM.
> >
> > Compile in SCSI disk support. Modular even if loaded in initrd it seems to
> > have broken somewhere.

Setting
CONFIG_SCSI=y
CONFIG_BLK_DEV_SD=y
does not help. The problem persists.

> > > # CONFIG_SYSFS_DEPRECATED is not set
>
> I would suspect this.

Setting CONFIG_SYSFS_DEPRECATED=y does indeed fix the problem and allows me to boot successfully. Pity, I was so happy getting rid of that a couple of releases ago.

> Try to upgrade to at least lvm 2.02.29 (I guess this is the first version which
> understands the new sysfs layout).

I'll have to investigate how to do that without breaking anything.

HTH
T.

--
Tilman Schmidt E-Mail: [EMAIL PROTECTED]
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)
Re: Linux 2.6.25-rc2
On Feb 17, 2008 9:25 PM, Rafael J. Wysocki <[EMAIL PROTECTED]> wrote:
> There's the Bugzilla entry for it at
> http://bugzilla.kernel.org/show_bug.cgi?id=9973

Thank you.

> Please update it with the current information.

Crash for 2.6.25-rc2-mm1 added. That one had a complete stacktrace, but the trace looks like others I already reported, so no real new information... :-(

Torsten
Re: Linux 2.6.25-rc2
On Saturday, 16 of February 2008, Torsten Kaiser wrote:
> On Feb 15, 2008 10:23 PM, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> >
> > Ok,
> > this kernel is a winner.
>
> Sadly not for me:
> [ 5282.056415] ------------[ cut here ]------------
> [ 5282.059757] kernel BUG at lib/list_debug.c:33!
> [ 5282.062055] invalid opcode: [1] SMP
> [ 5282.062055] CPU 3
> [ 5282.062055] Modules linked in: radeon drm w83792d ipv6 tuner
> tea5767 tda8290 tuner_xc2028 tda9887 tuner_simple mt20xx tea5761
> tvaudio msp3400 bttv videodev v4l1_compat ir_common compat_ioctl32
> v4l2_common videobuf_dma_sg videobuf_core btcx_risc tveeprom usbhid
> pata_amd i2c_nforce2 hid sg
> [ 5282.062055] Pid: 12937, comm: sed Not tainted 2.6.25-rc2 #1
> [ 5282.062055] RIP: 0010:[803bffe4]
> -> then the output from the serial console stopped. I was in X, so I
> could not see, if there was anything more on the real console.
>
> (gdb) list *0x803bffe4
> 0x803bffe4 is in __list_add (lib/list_debug.c:33).
> 28         }
> 29         if (unlikely(prev->next != next)) {
> 30                 printk(KERN_ERR "list_add corruption. prev->next should be "
> 31                         "next (%p), but was %p. (prev=%p).\n",
> 32                         next, prev->next, prev);
> 33                 BUG();
> 34         }
> 35         next->prev = new;
> 36         new->next = next;
> 37         new->prev = prev;
>
> For more on this problem see
> http://marc.info/?l=linux-kernel&m=120293042005445

There's the Bugzilla entry for it at
http://bugzilla.kernel.org/show_bug.cgi?id=9973

Please update it with the current information.

Thanks,
Rafael
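For anyone not familiar with the check that fired in the report above: before linking a new entry between two neighbours, the debug version of list insertion verifies that `prev->next` still points at `next`. A minimal user-space sketch of that idea follows. It is an illustration under assumptions, not the kernel source: `struct list_node`, `list_init` and `checked_list_add` are hypothetical names, only the one check visible in the gdb listing is reproduced, and the function returns false instead of calling `BUG()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Circular doubly linked list node, modelled loosely on a kernel list head. */
struct list_node {
    struct list_node *next;
    struct list_node *prev;
};

/* An empty list is a head that points at itself in both directions. */
static void list_init(struct list_node *head)
{
    head->next = head;
    head->prev = head;
}

/*
 * Insert entry between prev and next. Before touching any pointer, verify
 * that prev still points forward to next; if it does not, some earlier
 * operation corrupted the list, so report it and refuse to insert.
 */
static bool checked_list_add(struct list_node *entry,
                             struct list_node *prev,
                             struct list_node *next)
{
    assert(entry != NULL && prev != NULL && next != NULL);

    if (prev->next != next) {
        fprintf(stderr,
                "list_add corruption. prev->next should be next (%p), "
                "but was %p. (prev=%p).\n",
                (void *)next, (void *)prev->next, (void *)prev);
        return false;
    }

    next->prev = entry;
    entry->next = next;
    entry->prev = prev;
    prev->next = entry;
    return true;
}
```

The point of such a check is that it reports the corruption at the next insertion, which may be far away from the code that actually damaged the list. That would be consistent with crashes showing up in several unrelated subsystems.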
Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc
On Saturday, 16 of February 2008, Kamalesh Babulal wrote:
> Hi,

Hi,

> The softlockup is seen from 2.6.25-rc1-git{1,3} and is visible in the 2.6.24-rc2 kernel,
> While booting up with the 2.6.25-rc1-git{1,3} and 2.6.25-rc2 kernel(s) on the powerbox

Can you update the Bugzilla entry at:
http://bugzilla.kernel.org/show_bug.cgi?id=9948
with the above information, please?

Rafael

> Loading st.ko module
> BUG: soft lockup - CPU#1 stuck for 61s! [insmod:379]
> NIP: c01b0620 LR: c01a5dcc CTR: 0040
> REGS: c0077caab8a0 TRAP: 0901 Not tainted (2.6.25-rc2-autotest)
> MSR: 80009032 <EE,ME,IR,DR> CR: 84004088 XER: 2000
> TASK = c0077cb450a0[379] 'insmod' THREAD: c0077caa8000 CPU: 1
> GPR00: c0077c9d4000 c0077caabb20 c0538a40 000b
> GPR04: ffc0 c0077e0c 0036 000a
> GPR08: 0040 c0077c9d4250 c000
> GPR12: c0077c9d4230 c0481d00
> NIP [c01b0620] .radix_tree_gang_lookup+0x100/0x1e4
> LR [c01a5dcc] .call_for_each_cic+0x50/0x10c
> Call Trace:
> [c0077caabb20] [c01a5e2c] .call_for_each_cic+0xb0/0x10c (unreliable)
> [c0077caabc60] [c019dba4] .exit_io_context+0xf0/0x110
> [c0077caabcf0] [c0061e38] .do_exit+0x820/0x850
> [c0077caabda0] [c0061f34] .do_group_exit+0xcc/0xe8
> [c0077caabe30] [c000872c] syscall_exit+0x0/0x40
> Instruction dump:
> 7d296214 39290018 e809 7caa2038 39290008 2fa0 409e0018 7caa4215
> 396b0001 418200cc 424000b8 4bdc <79691f24> 7d296214 e9690018 2fab
> INFO: task insmod:387 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> insmod D 1000e144 12144 387 1
> Call Trace:
> [c0077cb97600] [c08fae80] 0xc08fae80 (unreliable)
> [c0077cb977d0] [c0010c7c] .__switch_to+0x11c/0x154
> [c0077cb97860] [c0344498] .schedule+0x5d0/0x6b0
> [c0077cb97950] [c03447d8] .schedule_timeout+0x3c/0xe8
> [c0077cb97a20] [c0343d34] .wait_for_common+0x150/0x22c
> [c0077cb97ae0] [c008ef00] .__stop_machine_run+0xbc/0xf0
> [c0077cb97bb0] [c008ef70] .stop_machine_run+0x3c/0x80
> [c0077cb97c50] [c00891f0] .sys_init_module+0x14e4/0x1af4
> [c0077cb97e30] [c000872c] syscall_exit+0x0/0x40
> -- 0:conmux-control -- time-stamp -- Feb/15/08 16:04:12 --
> INFO: task insmod:387 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> insmod D 1000e144 12144 387 1
> Call Trace:
> [c0077cb97600] [c08fae80] 0xc08fae80 (unreliable)
> [c0077cb977d0] [c0010c7c] .__switch_to+0x11c/0x154
> [c0077cb97860] [c0344498] .schedule+0x5d0/0x6b0
> [c0077cb97950] [c03447d8] .schedule_timeout+0x3c/0xe8
> [c0077cb97a20] [c0343d34] .wait_for_common+0x150/0x22c
> [c0077cb97ae0] [c008ef00] .__stop_machine_run+0xbc/0xf0
> [c0077cb97bb0] [c008ef70] .stop_machine_run+0x3c/0x80
> [c0077cb97c50] [c00891f0] .sys_init_module+0x14e4/0x1af4
> [c0077cb97e30] [c000872c] syscall_exit+0x0/0x40
> -- 0:conmux-control -- time-stamp -- Feb/15/08 16:06:21 --

--
"Premature optimization is the root of all evil." - Donald Knuth
Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc
On Sat, Feb 16 2008, Kamalesh Babulal wrote:
> Hi,
>
> The softlockup is seen from 2.6.25-rc1-git{1,3} and is visible in the 2.6.24-rc2 kernel,
> While booting up with the 2.6.25-rc1-git{1,3} and 2.6.25-rc2 kernel(s) on the powerbox
>
> Loading st.ko module
> BUG: soft lockup - CPU#1 stuck for 61s! [insmod:379]
> NIP: c01b0620 LR: c01a5dcc CTR: 0040
> REGS: c0077caab8a0 TRAP: 0901 Not tainted (2.6.25-rc2-autotest)
> MSR: 80009032 <EE,ME,IR,DR> CR: 84004088 XER: 2000
> TASK = c0077cb450a0[379] 'insmod' THREAD: c0077caa8000 CPU: 1
> GPR00: c0077c9d4000 c0077caabb20 c0538a40 000b
> GPR04: ffc0 c0077e0c 0036 000a
> GPR08: 0040 c0077c9d4250 c000
> GPR12: c0077c9d4230 c0481d00
> NIP [c01b0620] .radix_tree_gang_lookup+0x100/0x1e4
> LR [c01a5dcc] .call_for_each_cic+0x50/0x10c
> Call Trace:
> [c0077caabb20] [c01a5e2c] .call_for_each_cic+0xb0/0x10c (unreliable)
> [c0077caabc60] [c019dba4] .exit_io_context+0xf0/0x110
> [c0077caabcf0] [c0061e38] .do_exit+0x820/0x850
> [c0077caabda0] [c0061f34] .do_group_exit+0xcc/0xe8
> [c0077caabe30] [c000872c] syscall_exit+0x0/0x40
> Instruction dump:
> 7d296214 39290018 e809 7caa2038 39290008 2fa0 409e0018 7caa4215
> 396b0001 418200cc 424000b8 4bdc <79691f24> 7d296214 e9690018 2fab

It's odd stuff. Could you perhaps try and add some printks to block/cfq-iosched.c:call_for_each_cic(), like dumping the 'nr' return from radix_tree_gang_lookup() and the pointer value of cics[i] in the for() loop after the lookup?

How many SCSI devices are online?

--
Jens Axboe
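The debugging requested above amounts to printing, on every pass of a batched lookup loop, how many entries came back and the pointer value of each one. A user-space sketch of that kind of instrumentation follows. It is not the cfq code: `gang_lookup` is a hypothetical stand-in that scans a sorted array (it is not `radix_tree_gang_lookup()`), `GANG_NR` is an arbitrary batch size, `struct item` and `scan_all` are made-up names, and `fprintf` plays the role of `printk`.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

#define GANG_NR 4 /* batch size, chosen arbitrarily for this sketch */

struct item {
    unsigned long key;
};

/*
 * Hypothetical stand-in for a gang lookup: copy up to max items whose key
 * is >= first from a table sorted by key, and return how many were copied.
 */
static unsigned int gang_lookup(struct item *const *table, size_t table_len,
                                struct item **results,
                                unsigned long first, unsigned int max)
{
    unsigned int nr = 0;

    for (size_t i = 0; i < table_len && nr < max; i++) {
        if (table[i]->key >= first)
            results[nr++] = table[i];
    }
    return nr;
}

/*
 * Visit every item in batches, restarting each lookup just past the last
 * key seen, and dump 'nr' plus each returned pointer along the way.
 * Returns the number of items visited.
 */
static unsigned int scan_all(struct item *const *table, size_t table_len)
{
    struct item *batch[GANG_NR];
    unsigned long index = 0;
    unsigned int seen = 0;
    unsigned int nr;

    assert(table != NULL || table_len == 0);

    do {
        nr = gang_lookup(table, table_len, batch, index, GANG_NR);
        fprintf(stderr, "lookup from index %lu returned nr=%u\n", index, nr);
        if (nr == 0)
            break;

        for (unsigned int i = 0; i < nr; i++)
            fprintf(stderr, "  batch[%u]=%p key=%lu\n",
                    i, (void *)batch[i], batch[i]->key);

        seen += nr;
        if (batch[nr - 1]->key == ULONG_MAX)
            break; /* key + 1 would wrap around and restart the scan */
        index = batch[nr - 1]->key + 1;
    } while (nr == GANG_NR);

    return seen;
}
```

A scan like this only terminates if the restart index keeps moving forward, so output showing the same index or the same pointers repeating would point at the loop rather than at the lookup itself.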