Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-21 Thread Jens Axboe

On Thu, Feb 21 2008, Andrew Morton wrote:
> On Tue, 19 Feb 2008 09:36:34 +0100 Jens Axboe <[EMAIL PROTECTED]> wrote:
> 
> > But I think the radix 'scan over entire tree' is a bit fragile.
> 
> eek, it had better not be.  Was this an error in the caller?  Hope so.

The cfq use of it, not the radix tree code! It juggled the keys and
wants to make sure that we see all users, modulo raced added ones (ok if
we see them, doesn't matter if we don't).

> > This
> > patch adds a parallel hlist for ease of properly browsing the members,
> 
> Even though io_contexts are fairly uncommon, adding more stuff to a data
> structure was a pretty sad alternative to fixing a bug in
> radix_tree_gang_lookup(), or to fixing a bug in a caller of it.
> 
> IOW: what exactly went wrong here??

I could not convince myself that the current code would always do the
right thing. We should not have been seeing ->key == NULL entries in
there, it implied a double exit of that process. So I decided to fix it
by making the code a lot more readable (the patch in question deleted a
lot more than it added), at the cost of that hlist head + node.

-- 
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-21 Thread Andrew Morton

On Tue, 19 Feb 2008 09:36:34 +0100 Jens Axboe <[EMAIL PROTECTED]> wrote:

> But I think the radix 'scan over entire tree' is a bit fragile.

eek, it had better not be.  Was this an error in the caller?  Hope so.

> This
> patch adds a parallel hlist for ease of properly browsing the members,

Even though io_contexts are fairly uncommon, adding more stuff to a data
structure was a pretty sad alternative to fixing a bug in
radix_tree_gang_lookup(), or to fixing a bug in a caller of it.

IOW: what exactly went wrong here??

Re: Linux 2.6.25-rc2

2008-02-19 Thread Pekka Enberg


On 2/20/2008, "Zhang, Yanmin" <[EMAIL PROTECTED]> wrote:
> Kernel with the reverting patch is ok.
> I ran reboot/hackbench for more than 10 times on every one of my 3 x86-64 
> machines, and kernel didn't crash.

Great, Linus reverted the patch yesterday. Thanks for testing!

Re: Linux 2.6.25-rc2

2008-02-19 Thread Zhang, Yanmin

On Wed, 2008-02-20 at 10:08 +0800, Zhang, Yanmin wrote:
> On Wed, 2008-02-20 at 08:36 +0800, Zhang, Yanmin wrote:
> > On Tue, 2008-02-19 at 17:52 +0200, Pekka Enberg wrote:
> > > Ingo Molnar wrote:
> > > > * Pekka Enberg <[EMAIL PROTECTED]> wrote:
> > > > 
> > > >>> Yes, this can happen. Are you saying it is not safe to be in the 
> > > >>> lockless path when an IRQ triggers?
> > > >> Hmm. The barrier() in slab_free() looks fishy. The comment says it's 
> > > >> there to make sure we've retrieved c->freelist before c->page but then 
> > > >> it uses a _compiler barrier_ which doesn't affect the CPU and the 
> > > >> reads may still be re-ordered... Not sure if that matters here though.
> > > > 
> > > > find a fix patch for that below - most systems affected seem to be SMP 
> > > > ones.
> > > > 
> > > > > If this (or my other patch) indeed solves the problem i'd still favor a 
> > > > > full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it looks 
> > > > > quite un-cooked and quite un-tested for multiple independent reasons.
> > > > > 
> > > > > Sigh, why do i again have to be the messenger who brings the bad news to 
> > > > > SLUB land, and again when poor Christoph went on vacation? :-/
> > > > 
> > > > Ingo
> > > > 
> > > > -->
> > > > Subject: SLUB: barrier fix
> > > > From: Ingo Molnar <[EMAIL PROTECTED]>
> > > > 
> > > > ---
> > > >  mm/slub.c |2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > Index: linux/mm/slub.c
> > > > ===
> > > > --- linux.orig/mm/slub.c
> > > > +++ linux/mm/slub.c
> > > > @@ -1862,7 +1862,7 @@ static __always_inline void slab_free(st
> > > > debug_check_no_locks_freed(object, s->objsize);
> > > > do {
> > > > freelist = c->freelist;
> > > > -   barrier();
> > > > +   smp_mb();
> > > > /*
> > > > >  * If the compiler would reorder the retrieval of c->page to
> > > >  * come before c->freelist then an interrupt could
> > > 
> > > Torsten/Yamin, does this fix things for you? What about reverting commit 
> > > 1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c ("SLUB: Alternate fast paths 
> > > using cmpxchg_local")?
> > I'm busy in another issue and will test it ASAP. Sorry.
> I tested it on my 3 x86-64 machines. The small fix to use smp_mb to replace
> barrier in slab_free doesn't work. Kernel still crashed at the same place.
> 
> I will test the reverting patch.
Kernel with the reverting patch is ok.
I ran reboot/hackbench for more than 10 times on every one of my 3 x86-64
machines, and kernel didn't crash.

-yanmin



Re: Linux 2.6.25-rc2

2008-02-19 Thread Zhang, Yanmin

On Wed, 2008-02-20 at 08:36 +0800, Zhang, Yanmin wrote:
> On Tue, 2008-02-19 at 17:52 +0200, Pekka Enberg wrote:
> > Ingo Molnar wrote:
> > > * Pekka Enberg <[EMAIL PROTECTED]> wrote:
> > > 
> > >>> Yes, this can happen. Are you saying it is not safe to be in the 
> > >>> lockless path when an IRQ triggers?
> > >> Hmm. The barrier() in slab_free() looks fishy. The comment says it's 
> > >> there to make sure we've retrieved c->freelist before c->page but then 
> > >> it uses a _compiler barrier_ which doesn't affect the CPU and the 
> > >> reads may still be re-ordered... Not sure if that matters here though.
> > > 
> > > find a fix patch for that below - most systems affected seem to be SMP 
> > > ones.
> > > 
> > > If this (or my other patch) indeed solves the problem i'd still favor a 
> > > full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it looks 
> > > quite un-cooked and quite un-tested for multiple independent reasons.
> > > 
> > > Sigh, why do i again have to be the messenger who brings the bad news to 
> > > SLUB land, and again when poor Christoph went on vacation? :-/
> > > 
> > >   Ingo
> > > 
> > > -->
> > > Subject: SLUB: barrier fix
> > > From: Ingo Molnar <[EMAIL PROTECTED]>
> > > 
> > > ---
> > >  mm/slub.c |2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > Index: linux/mm/slub.c
> > > ===
> > > --- linux.orig/mm/slub.c
> > > +++ linux/mm/slub.c
> > > @@ -1862,7 +1862,7 @@ static __always_inline void slab_free(st
> > >   debug_check_no_locks_freed(object, s->objsize);
> > >   do {
> > >   freelist = c->freelist;
> > > - barrier();
> > > + smp_mb();
> > >   /*
> > >* If the compiler would reorder the retrieval of c->page to
> > >* come before c->freelist then an interrupt could
> > 
> > Torsten/Yamin, does this fix things for you? What about reverting commit 
> > 1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c ("SLUB: Alternate fast paths 
> > using cmpxchg_local")?
> I'm busy in another issue and will test it ASAP. Sorry.
I tested it on my 3 x86-64 machines. The small fix to use smp_mb to replace
barrier in slab_free doesn't work. Kernel still crashed at the same place.

I will test the reverting patch.

-yanmin



Re: Linux 2.6.25-rc2

2008-02-19 Thread Zhang, Yanmin

On Tue, 2008-02-19 at 17:52 +0200, Pekka Enberg wrote:
> Ingo Molnar wrote:
> > * Pekka Enberg <[EMAIL PROTECTED]> wrote:
> > 
> >>> Yes, this can happen. Are you saying it is not safe to be in the 
> >>> lockless path when an IRQ triggers?
> >> Hmm. The barrier() in slab_free() looks fishy. The comment says it's 
> >> there to make sure we've retrieved c->freelist before c->page but then 
> >> it uses a _compiler barrier_ which doesn't affect the CPU and the 
> >> reads may still be re-ordered... Not sure if that matters here though.
> > 
> > find a fix patch for that below - most systems affected seem to be SMP 
> > ones.
> > 
> > If this (or my other patch) indeed solves the problem i'd still favor a 
> > full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it looks 
> > quite un-cooked and quite un-tested for multiple independent reasons.
> > 
> > Sigh, why do i again have to be the messenger who brings the bad news to 
> > SLUB land, and again when poor Christoph went on vacation? :-/
> > 
> > Ingo
> > 
> > -->
> > Subject: SLUB: barrier fix
> > From: Ingo Molnar <[EMAIL PROTECTED]>
> > 
> > ---
> >  mm/slub.c |2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > Index: linux/mm/slub.c
> > ===
> > --- linux.orig/mm/slub.c
> > +++ linux/mm/slub.c
> > @@ -1862,7 +1862,7 @@ static __always_inline void slab_free(st
> > debug_check_no_locks_freed(object, s->objsize);
> > do {
> > freelist = c->freelist;
> > -   barrier();
> > +   smp_mb();
> > /*
> >  * If the compiler would reorder the retrieval of c->page to
> >  * come before c->freelist then an interrupt could
> 
> Torsten/Yamin, does this fix things for you? What about reverting commit 
> 1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c ("SLUB: Alternate fast paths 
> using cmpxchg_local")?
I'm busy in another issue and will test it ASAP. Sorry.

-yanmin



Re: Linux 2.6.25-rc2

2008-02-19 Thread Mathieu Desnoyers

* Pekka Enberg ([EMAIL PROTECTED]) wrote:
> Hi Mathieu,
> 
> On Feb 19, 2008 4:02 PM, Mathieu Desnoyers <[EMAIL PROTECTED]> wrote:
> > - stat(c, ALLOC_FASTPATH); seems to be using a var++, therefore
> >   indicating it is not reentrant if IRQs are disabled. Since those are
> >   only stats, I guess it's ok, but still weird.
> 
> What is not re-entrant?
> 

incrementing the variable with a "++" when interrupts are not disabled.
It's not an atomic add and it's racy. The code within stat() does
exactly this.

> On Feb 19, 2008 4:02 PM, Mathieu Desnoyers <[EMAIL PROTECTED]> wrote:
> > Since this shows mostly with network card drivers, I think the most
> > plausible cause would be an IRQ nesting over kmem_cache_alloc_node and
> > calling it.
> 
> Yes, this can happen. Are you saying it is not safe to be in the
> lockless path when an IRQ triggers?

It should be safe, but I think Eric pointed the correct problem in his
reply.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
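
The "++" race described in this message can be sketched in user space.
This is only an illustration under stated assumptions, not the kernel's
stat() code: stat_counter, fake_irq(), racy_inc() and lost_update_demo()
are hypothetical names, and the "interrupt" is modelled as a plain
function call placed between the load and the store that a compiler may
emit for var++.

```c
#include <assert.h>

/* Hypothetical statistics counter, standing in for one per-CPU stat slot. */
static unsigned long stat_counter;

/* Simulated interrupt handler that bumps the same counter. */
static void fake_irq(void)
{
	stat_counter++;
}

/*
 * "var++" spelled out as the load / add / store a compiler may emit.
 * The optional irq callback models an interrupt landing between the
 * load and the store.
 */
static void racy_inc(void (*irq)(void))
{
	unsigned long tmp = stat_counter;	/* load */

	if (irq)
		irq();				/* interrupt fires here */
	stat_counter = tmp + 1;			/* store overwrites the IRQ's update */
}

/* Returns the final counter value after two logical increments. */
static unsigned long lost_update_demo(void)
{
	stat_counter = 0;
	racy_inc(fake_irq);
	assert(stat_counter <= 2);
	return stat_counter;
}
```

Calling lost_update_demo() performs two logical increments but leaves
the counter at 1, which is the lost update being described. For
statistics that is tolerable; for a freelist pointer it would not be.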

Re: Linux 2.6.25-rc2

2008-02-19 Thread Mathieu Desnoyers

* Eric Dumazet ([EMAIL PROTECTED]) wrote:
> On Tue, 19 Feb 2008 09:02:30 -0500
> Mathieu Desnoyers <[EMAIL PROTECTED]> wrote:
> 
> > * Pekka Enberg ([EMAIL PROTECTED]) wrote:
> > > On Feb 19, 2008 8:54 AM, Torsten Kaiser <[EMAIL PROTECTED]> wrote:
> > > > > > [ 5282.056415] [ cut here ]
> > > > > > [ 5282.059757] kernel BUG at lib/list_debug.c:33!
> > > > > > [ 5282.062055] invalid opcode:  [1] SMP
> > > > > > [ 5282.062055] CPU 3
> > > > >
> > > > > hm. Your crashes do seem to span multiple subsystems, but it always
> > > > > seems to be around the SLUB code. Could you try the patch below? The
> > > > > SLUB code has a new optimization and i'm not 100% sure about it. [the
> > > > > hack below switches the SLUB optimization off by disabling the CPU
> > > > > feature it relies on.]
> > > > >
> > > > > Ingo
> > > > >
> > > > > ->
> > > > >  arch/x86/Kconfig |4 
> > > > >  1 file changed, 4 deletions(-)
> > > > >
> > > > > Index: linux/arch/x86/Kconfig
> > > > > ===
> > > > > --- linux.orig/arch/x86/Kconfig
> > > > > +++ linux/arch/x86/Kconfig
> > > > > @@ -59,10 +59,6 @@ config HAVE_LATENCYTOP_SUPPORT
> > > > >  config SEMAPHORE_SLEEPERS
> > > > > def_bool y
> > > > >
> > > > > -config FAST_CMPXCHG_LOCAL
> > > > > -   bool
> > > > > -   default y
> > > > > -
> > > > >  config MMU
> > > > > def_bool y
> > > > >
> > > >
> > > > $ grep FAST_CMPXCHG_LOCAL */.config
> > > > linux-2.6.24-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > > linux-2.6.24-rc3-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > > linux-2.6.24-rc3-mm2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > > linux-2.6.24-rc6-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > > linux-2.6.24-rc8-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > > linux-2.6.25-rc1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > > linux-2.6.25-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > > linux-2.6.25-rc2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > >
> > > > -rc2-mm1 still worked for me.
> > > >
> > > > Did you mean the new SLUB_FASTPATH?
> > > > $ grep "define SLUB_FASTPATH" */mm/slub.c
> > > > linux-2.6.25-rc1/mm/slub.c:#define SLUB_FASTPATH
> > > > linux-2.6.25-rc2-mm1/mm/slub.c:#define SLUB_FASTPATH
> > > > linux-2.6.25-rc2/mm/slub.c:#define SLUB_FASTPATH
> > > >
> > > > The 2.6.24-rc3+ mm-kernels did crash for me, but don't seem to contain 
> > > > this...
> > > >
> > > > On the other hand:
> > > > From the crash in 2.6.25-rc2-mm1:
> > > > [59987.116182] RIP  [] kmem_cache_alloc_node+0x6d/0xa0
> > > >
> > > > (gdb) list *0x8029f83d
> > > > 0x8029f83d is in kmem_cache_alloc_node (mm/slub.c:1646).
> > > > > 1641            if (unlikely(is_end(object) || !node_match(c, node))) {
> > > > > 1642                    object = __slab_alloc(s, gfpflags, node, addr, c);
> > > > > 1643                    break;
> > > > > 1644            }
> > > > > 1645            stat(c, ALLOC_FASTPATH);
> > > > > 1646    } while (cmpxchg_local(&c->freelist, object, object[c->offset])
> > > > > 1647                                    != object);
> > > > > 1648    #else
> > > > > 1649            unsigned long flags;
> > > > > 1650
> > > >
> > > > That code is part for SLUB_FASTPATH.
> > > >
> > > > I'm willing to test the patch, but don't know how fast I can find the
> > > > time to do it, so my answer if your patch helps might be delayed until
> > > > the weekend.
> > > 
> > > Mathieu, Christoph is on vacation and I'm not at all that familiar
> > > with this cmpxchg_local() optimization, so if you could take a peek at
> > > this bug report to see if you can spot something obviously wrong with
> > > it, I would much appreciate that.
> > 
> > Sure,
> > 
> > I

Re: Linux 2.6.25-rc2

2008-02-19 Thread Torsten Kaiser

On Feb 19, 2008 5:20 PM, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> So:
>  - it might be something else entirely
>  - it might still be the local cmpxchg, just Torsten didn't happen to
>notice it until later.

My new hackbench-testcase also killed 2.6.24-rc2-mm1, so I really
noticed too late.

>  - it might still be the local cmpxchg, but something else changed its
>patterns to actually make it start triggering.
>
> and in general I don't think we should revert it unless we have stronger
> indications that it really is the problem (eg somebody finds the actual
> bug, or a reporter can confirm that it goes away when the local cmpxchg
> optimization is disabled).

I tried the following three patches:

switching the barrier() for a smp_mb() in 2.6.25-rc2-mm1:
-> crashed

reverting the FASTPATH-patch in 2.6.25-rc2:
-> worked

only removed FAST_CMPXCHG_LOCAL from arch/x86/Kconfig
-> worked

So all of these tests seem to confirm that the bug is in the new SLUB fastpath.

Torsten

Re: Linux 2.6.25-rc2

2008-02-19 Thread Ingo Molnar

* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> Earlier today i turned off local-cmpxchg and havent had a crash or 
> hang since then - but at 200 bootups and 4-5 crashes in a week that's 
> not conclusive yet. I think others might have workloads that trigger 
> this bug more often.

i mean, today i've only done 200 randconfig bootups since i did the 
cmpxchg SLUB revert, and given the statistics of the bug (thousands of 
bootups and just 3 provable crashes) i cannot yet conclude that the bug 
is truly gone.

Ingo

Re: Linux 2.6.25-rc2

2008-02-19 Thread Ingo Molnar


* Linus Torvalds <[EMAIL PROTECTED]> wrote:

> So:
>  - it might be something else entirely
>  - it might still be the local cmpxchg, just Torsten didn't happen to 
>notice it until later.
>  - it might still be the local cmpxchg, but something else changed its 
>patterns to actually make it start triggering.
> 
> and in general I don't think we should revert it unless we have 
> stronger indications that it really is the problem (eg somebody finds 
> the actual bug, or a reporter can confirm that it goes away when the 
> local cmpxchg optimization is disabled).

yeah - my revert suggestions were all completely conditional on such 
type of test feedback.

Btw., i did trigger occasional SLUB crashes myself starting at around 
-rc1, on the order of one per 200-300 straight random bootups, and 
yesterday i did a 50-bootups series of a specific .config that crashed, 
to try to reproduce one of them but failed - so bisection was not an 
option and i had nothing concrete and repeatable to report either. I had 
a few complete lockups and only 3 usable backtraces - find them below.

Networking features in all of the backtraces - and so does the VFS. All 
of the crashes are on SMP - and given that 50% of the bootups are UP 
this gives us a 1:8 chance hint that this bug is SMP specific. (All the 
crashes are in distccd - that is what this build cluster does mainly so 
it's the main activity of the box - so they dont necessarily indicate 
anything workload specific.)

Earlier today i turned off local-cmpxchg and havent had a crash or hang 
since then - but at 200 bootups and 4-5 crashes in a week that's not 
conclusive yet. I think others might have workloads that trigger this 
bug more often.

Ingo

>
mercury login: [  582.671916] Oops:  [#1] SMP DEBUG_PAGEALLOC
[  582.672334] 
[  582.672334] Pid: 3776, comm: distccd Not tainted (2.6.25-rc2 #5)
[  582.672334] EIP: 0060:[] EFLAGS: 00010246 CPU: 0
[  582.672334] EIP is at kmem_cache_alloc+0x2a/0x90
[  582.672334] EAX:  EBX: 861c ECX: c069ed1c EDX: 01060002
[  582.672334] ESI: c0aeffc8 EDI: c1d11714 EBP: f6eddcdc ESP: f6eddcc4
[  582.672334]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[  582.672334] Process distccd (pid: 3776, ti=f6edc000 task=f508c000 
task.ti=f6edc000)
[  582.672334] Stack: c06a3d48 f6eddce4 0020 861c 066c c0aeffc8 
f6eddcf8 c069ed1c 
[  582.672334] 0020 861c f7ce6580 f7ce6580 f6eddd18 
c045e7bb  
[  582.672334] f7f683e0 861c f52136c0 f7ce6580 f6eddd58 
c0461de5 f508c000 
[  582.672334] Call Trace:
[  582.672334]  [] ? netif_receive_skb+0x2a8/0x320
[  582.672334]  [] ? __alloc_skb+0x2c/0x110
[  582.672334]  [] ? nv_alloc_rx_optimized+0x10b/0x1a0
[  582.672334]  [] ? nv_napi_poll+0x1b5/0x730
[  582.672334]  [] ? net_rx_action+0x16b/0x200
[  582.672334]  [] ? net_rx_action+0x88/0x200
[  582.672334]  [] ? __do_softirq+0x93/0x120
[  582.672334]  [] ? do_softirq+0x57/0x60
[  582.672334]  [] ? irq_exit+0x69/0x80
[  582.672334]  [] ? do_IRQ+0x45/0x80
[  582.672334]  [] ? d_instantiate+0x42/0x60
[  582.672334]  [] ? common_interrupt+0x28/0x30
[  582.672334]  [] ? d_instantiate+0x42/0x60
[  582.672334]  [] ? lock_release+0xc0/0x1b0
[  582.672334]  [] ? _spin_unlock+0x16/0x20
[  582.672334]  [] ? d_instantiate+0x42/0x60
[  582.672334]  [] ? ext3_add_nondir+0x34/0x50
[  582.672334]  [] ? ext3_create+0x9e/0xe0
[  582.672334]  [] ? vfs_create+0xb8/0x100
[  582.672334]  [] ? open_namei+0x4d0/0x5a0
[  582.672334]  [] ? in_group_p+0x26/0x30
[  582.672334]  [] ? ext3_permission+0x0/0x10
[  582.672334]  [] ? do_filp_open+0x31/0x50
[  582.672334]  [] ? _spin_unlock+0x1d/0x20
[  582.672334]  [] ? get_unused_fd_flags+0xbb/0xe0
[  582.672334]  [] ? do_sys_open+0x4d/0xf0
[  582.672334]  [] ? trace_hardirqs_on_thunk+0xc/0x10
[  582.672334]  [] ? trace_hardirqs_on_caller+0xbd/0x140
[  582.672334]  [] ? sys_open+0x1c/0x20
[  582.672334]  [] ? sysenter_past_esp+0x5f/0x99
[  582.672334]  ===
[  582.672334] Code: c3 55 89 e5 57 56 89 c6 53 83 ec 0c 8b 4d 04 89 55 f0 64 
a1 04 40 b7 c0 8b 7c 86 64 90 8d 74 26 00 8b 17 f6 c2 01 75 41 8b 47 0c <8b> 1c 
82 89 d0 0f b1 1f 39 d0 89 c3 75 e8 66 83 7d f0 00 79 1f 
[  582.672334] EIP: [] kmem_cache_alloc+0x2a/0x90 SS:ESP 0068:f6eddcc4
[  582.672343] Kernel panic - not syncing: Fatal exception in interrupt
[  582.673337] Pid: 3776, comm: distccd Tainted: G  D  2.6.25-rc2 #5
[  582.674342]  [] panic+0x46/0x120
[  582.676335]  [] die+0x134/0x150
[  582.678335]  [] do_page_fault+0x188/0x610
[  582.680335]  [] ? ip_local_deliver+0xf6/0x1c0
[  582.682335]  [] ? do_page_fault+0x0/0x610
[  582.685334]  [] error_code+0x72/0x80
[  582.687334]  [] ? __alloc_skb+0x2c/0x110
[  582.689334]  [] ? kmem_cache_alloc+0x2a/0x90
[  582.691333]  [] ? netif_receive_skb+0x2a8/0x320
[  582.69]  [] __alloc_skb+0x2c/0x110
[  582.695333]  [] nv_alloc_rx_optimized+0x10b/0x1a0
[  582.697332]  [] nv_napi_poll+0x1b5/0x730
[

Re: Linux 2.6.25-rc2

2008-02-19 Thread Linus Torvalds

On Tue, 19 Feb 2008, Eric Dumazet wrote:
> 
> cmpxchg_local(&c->freelist, object, object[c->offset]) can succeed,
> while an interrupt came (on this cpu), and several allocations were done,
> and one free was performed at the end of this interruption, so 'object'
> was recycled.

I think you may well be right. This looks like a good clue.

I'll do the revert. I wanted either a confirmation that reverting it 
actually fixes something, _or_ an actual bug description, and this seems 
to be a quite possible case of the latter.

Linus
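
The sequence Eric describes is the classic ABA problem, and it can be
reproduced deterministically in user space. The sketch below is an
assumption-laden model, not the SLUB code: struct object, struct
cpu_slab, cmpxchg_local_sim(), fake_irq(), racy_alloc() and aba_demo()
are all hypothetical names, the interrupt is a direct function call, and
the retry loop is reduced to a single attempt.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical free object: its first word links to the next free object. */
struct object {
	struct object *next;
};

/* Hypothetical per-CPU slab state; only the freelist head is modelled. */
struct cpu_slab {
	struct object *freelist;
};

/* Single-CPU stand-in for cmpxchg_local(): returns the value found at *ptr. */
static struct object *cmpxchg_local_sim(struct object **ptr,
					struct object *old,
					struct object *newval)
{
	struct object *cur = *ptr;

	if (cur == old)
		*ptr = newval;
	return cur;
}

static struct object *pop(struct cpu_slab *c)
{
	struct object *o = c->freelist;

	c->freelist = o->next;
	return o;
}

static void push(struct cpu_slab *c, struct object *o)
{
	o->next = c->freelist;
	c->freelist = o;
}

static struct object *still_in_use;	/* object the "interrupt" keeps */

/* Simulated interrupt: two allocations, then one free at the end. */
static void fake_irq(struct cpu_slab *c)
{
	struct object *a = pop(c);

	still_in_use = pop(c);	/* B stays allocated */
	push(c, a);		/* A is recycled: freelist is now A -> C */
}

/* Fast-path allocation with an interrupt between the read and the cmpxchg. */
static struct object *racy_alloc(struct cpu_slab *c)
{
	struct object *object = c->freelist;
	struct object *next = object->next;	/* snapshot: B */

	fake_irq(c);
	/* Head is A again, so the compare succeeds and installs stale B. */
	if (cmpxchg_local_sim(&c->freelist, object, next) != object)
		return NULL;
	return object;
}

/* Returns 1 if the freelist ends up pointing at an object still in use. */
static int aba_demo(void)
{
	struct object objs[3];
	struct cpu_slab c = { .freelist = NULL };
	struct object *got;

	push(&c, &objs[2]);
	push(&c, &objs[1]);
	push(&c, &objs[0]);	/* freelist: A -> B -> C */

	got = racy_alloc(&c);
	assert(got == &objs[0]);
	return c.freelist == still_in_use;
}
```

aba_demo() returns 1: the compare only checks that the head is A again,
so it cannot tell that the saved "next" pointer went stale while the
interrupt ran. The next allocation would hand out B a second time.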


Re: Linux 2.6.25-rc2

2008-02-19 Thread Eric Dumazet

On Tue, 19 Feb 2008 09:02:30 -0500
Mathieu Desnoyers <[EMAIL PROTECTED]> wrote:

> * Pekka Enberg ([EMAIL PROTECTED]) wrote:
> > On Feb 19, 2008 8:54 AM, Torsten Kaiser <[EMAIL PROTECTED]> wrote:
> > > > > [ 5282.056415] [ cut here ]
> > > > > [ 5282.059757] kernel BUG at lib/list_debug.c:33!
> > > > > [ 5282.062055] invalid opcode:  [1] SMP
> > > > > [ 5282.062055] CPU 3
> > > >
> > > > hm. Your crashes do seem to span multiple subsystems, but it always
> > > > seems to be around the SLUB code. Could you try the patch below? The
> > > > SLUB code has a new optimization and i'm not 100% sure about it. [the
> > > > hack below switches the SLUB optimization off by disabling the CPU
> > > > feature it relies on.]
> > > >
> > > > Ingo
> > > >
> > > > ->
> > > >  arch/x86/Kconfig |4 
> > > >  1 file changed, 4 deletions(-)
> > > >
> > > > Index: linux/arch/x86/Kconfig
> > > > ===
> > > > --- linux.orig/arch/x86/Kconfig
> > > > +++ linux/arch/x86/Kconfig
> > > > @@ -59,10 +59,6 @@ config HAVE_LATENCYTOP_SUPPORT
> > > >  config SEMAPHORE_SLEEPERS
> > > > def_bool y
> > > >
> > > > -config FAST_CMPXCHG_LOCAL
> > > > -   bool
> > > > -   default y
> > > > -
> > > >  config MMU
> > > > def_bool y
> > > >
> > >
> > > $ grep FAST_CMPXCHG_LOCAL */.config
> > > linux-2.6.24-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > linux-2.6.24-rc3-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > linux-2.6.24-rc3-mm2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > linux-2.6.24-rc6-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > linux-2.6.24-rc8-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > linux-2.6.25-rc1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > linux-2.6.25-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > > linux-2.6.25-rc2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > >
> > > -rc2-mm1 still worked for me.
> > >
> > > Did you mean the new SLUB_FASTPATH?
> > > $ grep "define SLUB_FASTPATH" */mm/slub.c
> > > linux-2.6.25-rc1/mm/slub.c:#define SLUB_FASTPATH
> > > linux-2.6.25-rc2-mm1/mm/slub.c:#define SLUB_FASTPATH
> > > linux-2.6.25-rc2/mm/slub.c:#define SLUB_FASTPATH
> > >
> > > The 2.6.24-rc3+ mm-kernels did crash for me, but don't seem to contain 
> > > this...
> > >
> > > On the other hand:
> > > From the crash in 2.6.25-rc2-mm1:
> > > [59987.116182] RIP  [] kmem_cache_alloc_node+0x6d/0xa0
> > >
> > > (gdb) list *0x8029f83d
> > > 0x8029f83d is in kmem_cache_alloc_node (mm/slub.c:1646).
> > > > 1641            if (unlikely(is_end(object) || !node_match(c, node))) {
> > > > 1642                    object = __slab_alloc(s, gfpflags, node, addr, c);
> > > > 1643                    break;
> > > > 1644            }
> > > > 1645            stat(c, ALLOC_FASTPATH);
> > > > 1646    } while (cmpxchg_local(&c->freelist, object, object[c->offset])
> > > > 1647                                    != object);
> > > > 1648    #else
> > > > 1649            unsigned long flags;
> > > > 1650
> > >
> > > That code is part for SLUB_FASTPATH.
> > >
> > > I'm willing to test the patch, but don't know how fast I can find the
> > > time to do it, so my answer if your patch helps might be delayed until
> > > the weekend.
> > 
> > Mathieu, Christoph is on vacation and I'm not at all that familiar
> > with this cmpxchg_local() optimization, so if you could take a peek at
> > this bug report to see if you can spot something obviously wrong with
> > it, I would much appreciate that.
> 
> Sure,
> 
> Initial thoughts :
> 
> I'd like to get the complete config causing this bug. I suspect either :
> 
> - A race between the lockless algo and an IRQ in a driver allocating
>   memory.
> - stat(c, ALLOC_FASTPATH); seems to be using a var++, therefore
>   indicating it is not reentrant if IRQs are disabled. Since those are
>   only stats, I guess it's ok, but still weird.
> - CPU hotplug problem. 
>   http://bugzilla.kernel.org/attachment.cgi?id=14877&action=view shows
>   last sysf

Re: Linux 2.6.25-rc2

2008-02-19 Thread Linus Torvalds

On Tue, 19 Feb 2008, Pekka Enberg wrote:
> 
> Hmm. The barrier() in slab_free() looks fishy. The comment says it's
> there to make sure we've retrieved c->freelist before c->page but then
> it uses a _compiler barrier_ which doesn't affect the CPU and the
> reads may still be re-ordered... Not sure if that matters here though.

No, no. The comment says that it's purely there to serialize an 
*interrupt*, and as such, a compiler-only barrier is sufficient (or the 
comment is wrong).

Interrupts are "totally ordered" within a cpu (of course, in theory a CPU 
might have speculative work etc reordering, but the CPU also guarantees 
that interrupt acts _as_if_ it was exact), so a compiler barrier is 
sufficient.

Of course, if we're talking about interrupts on another CPU, that's a 
different issue, but the fact is, in that case it's not about interrupts 
any more (might as well be other code just running normally on another 
CPU), and a barrier doesn't help, it needs real locking.

So that barrier is fine per se. Of course, the whole code (and/or just the 
comment!) may be buggered, but any CPU SMP-aware barriers shouldn't be 
relevant.

What's much more likely to be an issue is simply the fact that since the 
fastpath now accesses the per-cpu freelist without any locking, if there 
is *any* sequence what-so-ever that does it from another CPU and assumes 
the old locking behaviour, the list will be corrupted. And from a quick 
look-through, I certainly cannot guarantee that isn't the case.

There's still a lot of cases that do direct assignments to "c->freelist" 
without using a guaranteed atomic sequence. They *should* be safe if it's 
guaranteed that 
 (a) they always run with interrupts disabled
AND
 (b) 'c' is _always_ the "current CPU" list

but I can't quickly see that guarantee for either.

I'd happily just revert this thing, but it would be really good to have 
confirmation that it seems to matter. But Torsten's partial bisection 
seems to say that the quicklist thing went into -mm before the crash even 
started.

So:
 - it might be something else entirely
 - it might still be the local cmpxchg, just Torsten didn't happen to 
   notice it until later.
 - it might still be the local cmpxchg, but something else changed its 
   patterns to actually make it start triggering.

and in general I don't think we should revert it unless we have stronger 
indications that it really is the problem (eg somebody finds the actual 
bug, or a reporter can confirm that it goes away when the local cmpxchg 
optimization is disabled).

Linus

Re: Linux 2.6.25-rc2

2008-02-19 Thread Pekka Enberg


Ingo Molnar wrote:
> * Ingo Molnar <[EMAIL PROTECTED]> wrote:
>
> > If this (or my other patch) indeed solves the problem i'd still favor
> > a full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it
> > looks quite un-cooked and quite un-tested for multiple independent
> > reasons.
> >
> > Sigh, why do i again have to be the messenger who brings the bad news
> > to SLUB land, and again when poor Christoph went on vacation? :-/
>
> the revert patch is below. (manually done due to other changes since
> 1f84260c8ce3b1ce26d4 was committed, but trivial)

I am ok with this if someone can actually confirm it fixes things.

Pekka

Re: Linux 2.6.25-rc2

2008-02-19 Thread Pekka Enberg


Ingo Molnar wrote:
> * Pekka Enberg <[EMAIL PROTECTED]> wrote:
>
> > > Yes, this can happen. Are you saying it is not safe to be in the
> > > lockless path when an IRQ triggers?
> >
> > Hmm. The barrier() in slab_free() looks fishy. The comment says it's
> > there to make sure we've retrieved c->freelist before c->page but then
> > it uses a _compiler barrier_ which doesn't affect the CPU and the
> > reads may still be re-ordered... Not sure if that matters here though.
>
> find a fix patch for that below - most systems affected seem to be SMP
> ones.
>
> If this (or my other patch) indeed solves the problem i'd still favor a
> full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it looks
> quite un-cooked and quite un-tested for multiple independent reasons.
>
> Sigh, why do i again have to be the messenger who brings the bad news to
> SLUB land, and again when poor Christoph went on vacation? :-/
>
> Ingo
>
> -->
> Subject: SLUB: barrier fix
> From: Ingo Molnar <[EMAIL PROTECTED]>
>
> ---
>  mm/slub.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> Index: linux/mm/slub.c
> ===
> --- linux.orig/mm/slub.c
> +++ linux/mm/slub.c
> @@ -1862,7 +1862,7 @@ static __always_inline void slab_free(st
>  	debug_check_no_locks_freed(object, s->objsize);
>  	do {
>  		freelist = c->freelist;
> -		barrier();
> +		smp_mb();
>  		/*
>  		 * If the compiler would reorder the retrieval of c->page to
>  		 * come before c->freelist then an interrupt could

Torsten/Yamin, does this fix things for you? What about reverting commit 
1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c ("SLUB: Alternate fast paths 
using cmpxchg_local")?


Pekka

Re: Linux 2.6.25-rc2

2008-02-19 Thread Ingo Molnar


* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> If this (or my other patch) indeed solves the problem i'd still favor 
> a full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it 
> looks quite un-cooked and quite un-tested for multiple independent 
> reasons.
> 
> Sigh, why do i again have to be the messenger who brings the bad news 
> to SLUB land, and again when poor Christoph went on vacation? :-/

the revert patch is below. (manually done due to other changes since 
1f84260c8ce3b1ce26d4 was committed, but trivial)

Ingo

->
Subject: slub: fastpath optimization revert
From: Ingo Molnar <[EMAIL PROTECTED]>
Date: Tue Feb 19 15:46:37 CET 2008

revert:

  commit 1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c
  Author: Christoph Lameter <[EMAIL PROTECTED]>
  Date:   Mon Jan 7 23:20:30 2008 -0800

  SLUB: Alternate fast paths using cmpxchg_local

it was causing problems (crashes) and was incomplete.

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 mm/slub.c |   87 --
 1 file changed, 87 deletions(-)

Index: linux-x86.q/mm/slub.c
===
--- linux-x86.q.orig/mm/slub.c
+++ linux-x86.q/mm/slub.c
@@ -149,13 +149,6 @@ static inline void ClearSlabDebug(struct
 /* Enable to test recovery from slab corruption on boot */
 #undef SLUB_RESILIENCY_TEST
 
-/*
- * Currently fastpath is not supported if preemption is enabled.
- */
-#if defined(CONFIG_FAST_CMPXCHG_LOCAL) && !defined(CONFIG_PREEMPT)
-#define SLUB_FASTPATH
-#endif
-
 #if PAGE_SHIFT <= 12
 
 /*
@@ -1514,11 +1507,6 @@ static void *__slab_alloc(struct kmem_ca
 {
 	void **object;
 	struct page *new;
-#ifdef SLUB_FASTPATH
-	unsigned long flags;
-
-	local_irq_save(flags);
-#endif
 	if (!c->page)
 		goto new_slab;
 
@@ -1541,9 +1529,6 @@ load_freelist:
 unlock_out:
 	slab_unlock(c->page);
 	stat(c, ALLOC_SLOWPATH);
-#ifdef SLUB_FASTPATH
-	local_irq_restore(flags);
-#endif
 	return object;
 
 another_slab:
@@ -1575,9 +1560,6 @@ new_slab:
 		c->page = new;
 		goto load_freelist;
 	}
-#ifdef SLUB_FASTPATH
-	local_irq_restore(flags);
-#endif
 	/*
 	 * No memory available.
 	 *
@@ -1619,34 +1601,6 @@ static __always_inline void *slab_alloc(
 {
 	void **object;
 	struct kmem_cache_cpu *c;
-
-/*
- * The SLUB_FASTPATH path is provisional and is currently disabled if the
- * kernel is compiled with preemption or if the arch does not support
- * fast cmpxchg operations. There are a couple of coming changes that will
- * simplify matters and allow preemption. Ultimately we may end up making
- * SLUB_FASTPATH the default.
- *
- * 1. The introduction of the per cpu allocator will avoid array lookups
- *    through get_cpu_slab(). A special register can be used instead.
- *
- * 2. The introduction of per cpu atomic operations (cpu_ops) means that
- *    we can realize the logic here entirely with per cpu atomics. The
- *    per cpu atomic ops will take care of the preemption issues.
- */
-
-#ifdef SLUB_FASTPATH
-	c = get_cpu_slab(s, raw_smp_processor_id());
-	do {
-		object = c->freelist;
-		if (unlikely(is_end(object) || !node_match(c, node))) {
-			object = __slab_alloc(s, gfpflags, node, addr, c);
-			break;
-		}
-		stat(c, ALLOC_FASTPATH);
-	} while (cmpxchg_local(&c->freelist, object, object[c->offset])
-								!= object);
-#else
 	unsigned long flags;
 
 	local_irq_save(flags);
@@ -1661,7 +1615,6 @@ static __always_inline void *slab_alloc(
 		stat(c, ALLOC_FASTPATH);
 	}
 	local_irq_restore(flags);
-#endif
 
 	if (unlikely((gfpflags & __GFP_ZERO) && object))
 		memset(object, 0, c->objsize);
@@ -1698,11 +1651,6 @@ static void __slab_free(struct kmem_cach
 	void **object = (void *)x;
 	struct kmem_cache_cpu *c;
 
-#ifdef SLUB_FASTPATH
-	unsigned long flags;
-
-	local_irq_save(flags);
-#endif
 	c = get_cpu_slab(s, raw_smp_processor_id());
 	stat(c, FREE_SLOWPATH);
 	slab_lock(page);
@@ -1734,9 +1682,6 @@ checks_ok:
 
 out_unlock:
 	slab_unlock(page);
-#ifdef SLUB_FASTPATH
-	local_irq_restore(flags);
-#endif
 	return;
 
 slab_empty:
@@ -1749,9 +1694,6 @@ slab_empty:
 	}
 	slab_unlock(page);
 	stat(c, FREE_SLAB);
-#ifdef SLUB_FASTPATH
-	local_irq_restore(flags);
-#endif
 	discard_slab(s, page);
 	return;
 
@@ -1777,34 +1719,6 @@ static __always_inline void slab_free(st
 {
 	void **object = (void *)x;
 	struct kmem_cache_cpu *c;
-
-#ifdef SLUB_FASTPATH
-	void **freelist;
-
-	c = get_cpu_slab(s, raw_smp_processor_id());
-	debug_check_no_locks_freed(object,

Re: Linux 2.6.25-rc2

2008-02-19 Thread Ingo Molnar


* Pekka Enberg <[EMAIL PROTECTED]> wrote:

> > Yes, this can happen. Are you saying it is not safe to be in the 
> > lockless path when an IRQ triggers?
> 
> Hmm. The barrier() in slab_free() looks fishy. The comment says it's 
> there to make sure we've retrieved c->freelist before c->page but then 
> it uses a _compiler barrier_ which doesn't affect the CPU and the 
> reads may still be re-ordered... Not sure if that matters here though.

find a fix patch for that below - most systems affected seem to be SMP 
ones.

If this (or my other patch) indeed solves the problem i'd still favor a 
full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it looks 
quite un-cooked and quite un-tested for multiple independent reasons.

Sigh, why do i again have to be the messenger who brings the bad news to 
SLUB land, and again when poor Christoph went on vacation? :-/

Ingo

-->
Subject: SLUB: barrier fix
From: Ingo Molnar <[EMAIL PROTECTED]>

---
 mm/slub.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/mm/slub.c
===
--- linux.orig/mm/slub.c
+++ linux/mm/slub.c
@@ -1862,7 +1862,7 @@ static __always_inline void slab_free(st
 	debug_check_no_locks_freed(object, s->objsize);
 	do {
 		freelist = c->freelist;
-		barrier();
+		smp_mb();
 		/*
 		 * If the compiler would reorder the retrieval of c->page to
 		 * come before c->freelist then an interrupt could


Re: Linux 2.6.25-rc2

2008-02-19 Thread Pekka Enberg

Hi Mathieu,

On Feb 19, 2008 4:02 PM, Mathieu Desnoyers <[EMAIL PROTECTED]> wrote:
> > Since this shows mostly with network card drivers, I think the most
> > plausible cause would be an IRQ nesting over kmem_cache_alloc_node and
> > calling it.

On Feb 19, 2008 4:21 PM, Pekka Enberg <[EMAIL PROTECTED]> wrote:
> Yes, this can happen. Are you saying it is not safe to be in the
> lockless path when an IRQ triggers?

Hmm. The barrier() in slab_free() looks fishy. The comment says it's
there to make sure we've retrieved c->freelist before c->page but then
it uses a _compiler barrier_ which doesn't affect the CPU and the
reads may still be re-ordered... Not sure if that matters here though.

Re: Linux 2.6.25-rc2

2008-02-19 Thread Pekka Enberg

Hi Mathieu,

On Feb 19, 2008 4:02 PM, Mathieu Desnoyers <[EMAIL PROTECTED]> wrote:
> - stat(c, ALLOC_FASTPATH); seems to be using a var++, therefore
>   indicating it is not reentrant if IRQs are disabled. Since those are
>   only stats, I guess it's ok, but still weird.

What is not re-entrant?

On Feb 19, 2008 4:02 PM, Mathieu Desnoyers <[EMAIL PROTECTED]> wrote:
> Since this shows mostly with network card drivers, I think the most
> plausible cause would be an IRQ nesting over kmem_cache_alloc_node and
> calling it.

Yes, this can happen. Are you saying it is not safe to be in the
lockless path when an IRQ triggers?

Re: Linux 2.6.25-rc2

2008-02-19 Thread Mathieu Desnoyers

* Pekka Enberg ([EMAIL PROTECTED]) wrote:
> On Feb 19, 2008 8:54 AM, Torsten Kaiser <[EMAIL PROTECTED]> wrote:
> > > > [ 5282.056415] [ cut here ]
> > > > [ 5282.059757] kernel BUG at lib/list_debug.c:33!
> > > > [ 5282.062055] invalid opcode:  [1] SMP
> > > > [ 5282.062055] CPU 3
> > >
> > > hm. Your crashes do seem to span multiple subsystems, but it always
> > > seems to be around the SLUB code. Could you try the patch below? The
> > > SLUB code has a new optimization and i'm not 100% sure about it. [the
> > > hack below switches the SLUB optimization off by disabling the CPU
> > > feature it relies on.]
> > >
> > > Ingo
> > >
> > > ->
> > >  arch/x86/Kconfig |4 
> > >  1 file changed, 4 deletions(-)
> > >
> > > Index: linux/arch/x86/Kconfig
> > > ===
> > > --- linux.orig/arch/x86/Kconfig
> > > +++ linux/arch/x86/Kconfig
> > > @@ -59,10 +59,6 @@ config HAVE_LATENCYTOP_SUPPORT
> > >  config SEMAPHORE_SLEEPERS
> > >  	def_bool y
> > >
> > > -config FAST_CMPXCHG_LOCAL
> > > -	bool
> > > -	default y
> > > -
> > >  config MMU
> > >  	def_bool y
> > >
> >
> > $ grep FAST_CMPXCHG_LOCAL */.config
> > linux-2.6.24-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > linux-2.6.24-rc3-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > linux-2.6.24-rc3-mm2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > linux-2.6.24-rc6-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > linux-2.6.24-rc8-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > linux-2.6.25-rc1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > linux-2.6.25-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> > linux-2.6.25-rc2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> >
> > -rc2-mm1 still worked for me.
> >
> > Did you mean the new SLUB_FASTPATH?
> > $ grep "define SLUB_FASTPATH" */mm/slub.c
> > linux-2.6.25-rc1/mm/slub.c:#define SLUB_FASTPATH
> > linux-2.6.25-rc2-mm1/mm/slub.c:#define SLUB_FASTPATH
> > linux-2.6.25-rc2/mm/slub.c:#define SLUB_FASTPATH
> >
> > The 2.6.24-rc3+ mm-kernels did crash for me, but don't seem to contain 
> > this...
> >
> > On the other hand:
> > From the crash in 2.6.25-rc2-mm1:
> > [59987.116182] RIP  [] kmem_cache_alloc_node+0x6d/0xa0
> >
> > (gdb) list *0x8029f83d
> > 0x8029f83d is in kmem_cache_alloc_node (mm/slub.c:1646).
> > 1641			if (unlikely(is_end(object) || !node_match(c, node))) {
> > 1642				object = __slab_alloc(s, gfpflags, node, addr, c);
> > 1643				break;
> > 1644			}
> > 1645			stat(c, ALLOC_FASTPATH);
> > 1646		} while (cmpxchg_local(&c->freelist, object, object[c->offset])
> > 1647									!= object);
> > 1648	#else
> > 1649		unsigned long flags;
> > 1650
> >
> > That code is part of SLUB_FASTPATH.
> >
> > I'm willing to test the patch, but don't know how fast I can find the
> > time to do it, so my answer if your patch helps might be delayed until
> > the weekend.
> 
> Mathieu, Christoph is on vacation and I'm not at all that familiar
> with this cmpxchg_local() optimization, so if you could take a peek at
> this bug report to see if you can spot something obviously wrong with
> it, I would much appreciate that.

Sure,

Initial thoughts :

I'd like to get the complete config causing this bug. I suspect either :

- A race between the lockless algo and an IRQ in a driver allocating
  memory.
- stat(c, ALLOC_FASTPATH); seems to be using a var++, therefore
  indicating it is not reentrant if IRQs are disabled. Since those are
  only stats, I guess it's ok, but still weird.
- CPU hotplug problem. 
  http://bugzilla.kernel.org/attachment.cgi?id=14877&action=view shows
  last sysfs file:
  /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
  -- is this linked to a cpu up/down event ?
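
The second item can be sketched in user space by spelling var++ out as a load, a modify and a store. This is a deterministic toy model with a hypothetical counter fake_stat, where the "interrupt" is a callback injected between the load and the store; it only illustrates why a plain increment can lose an update when it is interrupted.

```c
#include <assert.h>
#include <stddef.h>

static unsigned long fake_stat;         /* hypothetical per-cpu counter */

/* Stand-in for an interrupt handler that also bumps the counter. */
static void fake_irq_stat(void)
{
	fake_stat++;
}

/* var++ written as load / modify / store, with an optional "interrupt"
 * delivered between the load and the store. */
static void stat_inc(void (*irq)(void))
{
	unsigned long tmp = fake_stat;  /* load */

	if (irq)
		irq();                  /* nested increment lands here */
	fake_stat = tmp + 1;            /* store overwrites the nested update */
}

static unsigned long demo_lost_update(void)
{
	fake_stat = 0;
	stat_inc(NULL);                 /* uninterrupted: counter becomes 1 */
	stat_inc(fake_irq_stat);        /* two increments, only one survives */
	return fake_stat;
}
```

Three increments were requested but demo_lost_update() ends at 2. For a statistics counter that is an accuracy problem only, which matches the "I guess it's ok" remark above.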

Since this shows mostly with network card drivers, I think the most
plausible cause would be an IRQ nesting over kmem_cache_alloc_node and
calling it.
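
For reference, the shape of the lockless loop and of the nesting scenario can be modelled in user space as below. This is a sketch under stated assumptions, not the SLUB code: C11 atomic_compare_exchange_strong() stands in for cmpxchg_local() (and is stronger than it), the "IRQ" is a callback run between reading the head and the compare-and-exchange, and fake_freelist, fast_alloc and nested_alloc are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct obj {
	struct obj *next;
};

static _Atomic(struct obj *) fake_freelist;     /* hypothetical per-cpu head */
static int retries;

/* Lockless pop: retry until the head we read is still the head we swap. */
static struct obj *fast_alloc(void (*irq)(void))
{
	struct obj *object, *next;

	for (;;) {
		object = atomic_load(&fake_freelist);
		if (!object)
			return NULL;            /* the real code takes a slow path */
		next = object->next;
		if (irq) {
			irq();                  /* "IRQ" between read and cmpxchg */
			irq = NULL;             /* deliver it only once */
		}
		if (atomic_compare_exchange_strong(&fake_freelist, &object, next))
			return object;
		retries++;                      /* head moved under us: try again */
	}
}

/* The nested allocation done from the "IRQ". */
static void nested_alloc(void)
{
	(void)fast_alloc(NULL);
}

/* Returns 1 if the interrupted allocation retried once and got the
 * second object, leaving the third as the new head. */
static int demo_nested_alloc(void)
{
	static struct obj pool[3];
	struct obj *got;

	pool[0].next = &pool[1];
	pool[1].next = &pool[2];
	pool[2].next = NULL;
	atomic_store(&fake_freelist, &pool[0]);
	retries = 0;

	got = fast_alloc(nested_alloc);
	return got == &pool[1] && retries == 1 &&
	       atomic_load(&fake_freelist) == &pool[2];
}
```

In this model the nesting is handled correctly because every update of the head goes through the compare-and-exchange. It says nothing about paths in the real code that update the list by other means.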

Will dig further...

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

Re: Linux 2.6.25-rc2

2008-02-19 Thread Ingo Molnar


* Mathieu Desnoyers <[EMAIL PROTECTED]> wrote:

> Ingo, a comment in slub.c explains it :
> 
> /*
>  * The SLUB_FASTPATH path is provisional and is currently disabled if the
>  * kernel is compiled with preemption or if the arch does not support
>  * fast cmpxchg operations. There are a couple of coming changes that will
>  * simplify matters and allow preemption. Ultimately we may end up making
>  * SLUB_FASTPATH the default.

well the feature is not complete and there are no reasons given _why_ 
it's not complete ... and even if there's a reason it should have been 
deferred to the next merge window. We still have 10 year old "this is a 
temporary hack" comments in the kernel ;-)

"hardware does not support it" is a valid argument, "kernel developer 
had no time to implement it properly" is not ;-)

Ingo

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread Kamalesh Babulal

Jens Axboe wrote:
> On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
>> On Sun, 17 Feb 2008 20:29:13 +0100
>> Jens Axboe <[EMAIL PROTECTED]> wrote:
>>
>>> It's odd stuff. Could you perhaps try and add some printks to
>>> block/cfq-iosched.c:call_for_each_cic(), like dumping the 'nr' return
>>> from radix_tree_gang_lookup() and the pointer value of cics[i] in the
>>> for() loop after the lookup?
>>>
>> I met the same issue on ia64/NUMA box.
>> seems cics[]->key is NULL and index for radix_tree_gang_lookup() was
>> always '1'.
> 
> Why does it keep repeating then? If ->key is NULL, the next lookup index
> should be 1UL.
> 
> But I think the radix 'scan over entire tree' is a bit fragile. This
> patch adds a parallel hlist for ease of properly browsing the members,
> does that work for you? It compiles, but I haven't booted it here yet...
> 
>> Attached patch works well for me, but I don't know much about cfq.
>> please confirm. 
> 
> It doesn't make a lot of sense, I'm afraid.
> 
>  block/blk-ioc.c   |   35 +++
>  block/cfq-iosched.c   |   37 +++--
>  include/linux/iocontext.h |2 ++
>  3 files changed, 28 insertions(+), 46 deletions(-)
> 
> diff --git a/block/blk-ioc.c b/block/blk-ioc.c
> index 80245dc..73c7002 100644
> --- a/block/blk-ioc.c



Hi Jens,

Thanks for the patch. The patch works fine, machine boots up without the kernel 
panic.


-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.

Re: Linux 2.6.25-rc2

2008-02-19 Thread Mathieu Desnoyers

* Ingo Molnar ([EMAIL PROTECTED]) wrote:
> 
> * Pekka Enberg <[EMAIL PROTECTED]> wrote:
> 
> > Mathieu, Christoph is on vacation and I'm not at all that familiar 
> > with this cmpxchg_local() optimization, so if you could take a peek at 
> > this bug report to see if you can spot something obviously wrong with 
> > it, I would much appreciate that.
> 
> hm, it's bad for at least one other reason as well (which is probably 
> unrelated to this crash):
> 
>  /*
>   * Currently fastpath is not supported if preemption is enabled.
>   */
>  #if defined(CONFIG_FAST_CMPXCHG_LOCAL) && !defined(CONFIG_PREEMPT)
>  #define SLUB_FASTPATH
>  #endif
> 
> such !PREEMPT exceptions tend to show "i didnt want to think too hard 
> about the preemptible case so just turn it off" thinking.
> 

Ingo, a comment in slub.c explains it :

/*
 * The SLUB_FASTPATH path is provisional and is currently disabled if the
 * kernel is compiled with preemption or if the arch does not support
 * fast cmpxchg operations. There are a couple of coming changes that will
 * simplify matters and allow preemption. Ultimately we may end up making
 * SLUB_FASTPATH the default.
 *
 * 1. The introduction of the per cpu allocator will avoid array lookups
 *through get_cpu_slab(). A special register can be used instead.
 *
 * 2. The introduction of per cpu atomic operations (cpu_ops) means that
 *we can realize the logic here entirely with per cpu atomics. The
 *per cpu atomic ops will take care of the preemption issues.
 */

So there is more coming in the preemption area.

> Also, why isnt this "SLUB_FASTPATH" flag done in the Kconfig space?
> 

Eventually, I think only CONFIG_FAST_CMPXCHG_LOCAL will be needed (when
the code will support preemption). Therefore, this SLUB_FASTPATH define
seems to be only here temporarily.

I'm looking at the code right now.. more to come.

Mathieu

>   Ingo

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

Re: Linux 2.6.25-rc2

2008-02-19 Thread Pekka Enberg

Hi,

Pekka Enberg <[EMAIL PROTECTED]> wrote:
> > Mathieu, Christoph is on vacation and I'm not at all that familiar
> > with this cmpxchg_local() optimization, so if you could take a peek at
> > this bug report to see if you can spot something obviously wrong with
> > it, I would much appreciate that.

On Feb 19, 2008 12:27 PM, Ingo Molnar <[EMAIL PROTECTED]> wrote:
> hm, it's bad for at least one other reason as well (which is probably
> unrelated to this crash):
>
>  /*
>   * Currently fastpath is not supported if preemption is enabled.
>   */
>  #if defined(CONFIG_FAST_CMPXCHG_LOCAL) && !defined(CONFIG_PREEMPT)
>  #define SLUB_FASTPATH
>  #endif
>
> such !PREEMPT exceptions tend to show "i didnt want to think too hard
> about the preemptible case so just turn it off" thinking.
>
> Also, why isnt this "SLUB_FASTPATH" flag done in the Kconfig space?

Hmm, no idea. I think there might have been some mix-up with merging the
patch. The one I saw was:

http://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24/2.6.24-mm1/broken-out/slub-optional-fast-path-using-cmpxchg_local.patch

But I don't remember giving out a Reviewed-by for it (and my mailbox
confirms that). Furthermore, somehow it turned into this when merged:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c

In any case, if Torsten/someone can verify that reverting
1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c ("SLUB: Alternate fast paths
using cmpxchg_local") fixes these problems, I think we should just do
it and let Christoph sort it out when he gets back.

   Pekka

Re: Linux 2.6.25-rc2

2008-02-19 Thread Ingo Molnar

* Pekka Enberg <[EMAIL PROTECTED]> wrote:

> Mathieu, Christoph is on vacation and I'm not at all that familiar 
> with this cmpxchg_local() optimization, so if you could take a peek at 
> this bug report to see if you can spot something obviously wrong with 
> it, I would much appreciate that.

hm, it's bad for at least one other reason as well (which is probably 
unrelated to this crash):

 /*
  * Currently fastpath is not supported if preemption is enabled.
  */
 #if defined(CONFIG_FAST_CMPXCHG_LOCAL) && !defined(CONFIG_PREEMPT)
 #define SLUB_FASTPATH
 #endif

such !PREEMPT exceptions tend to show "i didnt want to think too hard 
about the preemptible case so just turn it off" thinking.

Also, why isnt this "SLUB_FASTPATH" flag done in the Kconfig space?

Ingo

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread Jens Axboe

On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
> On Tue, 19 Feb 2008 09:58:38 +0100
> Jens Axboe <[EMAIL PROTECTED]> wrote:
> > > when I inserted printk here
> > > ==
> > >   for (i = 0; i < nr; i++)
> > >   func(ioc, cics[i]);
> > >   printk("%d %lx\n", nr, index);
> > > ==
> > > index was always "1" and  nr was always 32.
> > > 
> > > So, cics[31]->key was always NULL when index=1 is passed to
> > > radix_tree_gang_lookup().
> > 
> > Hang on, it returned 32? It should not return more than 16, since that
> > is what we have room for and asked for. 
> sorry. Of course, it was 16 ;(

I expected so, otherwise we would have had far more serious problems :-)

> your patch works well. thank you.

It's committed now and posted in the relevant bugzilla as well (#9948).

-- 
Jens Axboe


Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread KAMEZAWA Hiroyuki

On Tue, 19 Feb 2008 09:58:38 +0100
Jens Axboe <[EMAIL PROTECTED]> wrote:
> > when I inserted printk here
> > ==
> > for (i = 0; i < nr; i++)
> > func(ioc, cics[i]);
> > printk("%d %lx\n", nr, index);
> > ==
> > index was always "1" and  nr was always 32.
> > 
> > So, cics[31]->key was always NULL when index=1 is passed to
> > radix_tree_gang_lookup().
> 
> Hang on, it returned 32? It should not return more than 16, since that
> is what we have room for and asked for. 
sorry. Of course, it was 16 ;(

your patch works well. thank you.

-Kame


Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread Jens Axboe

On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
> On Tue, 19 Feb 2008 09:36:34 +0100
> Jens Axboe <[EMAIL PROTECTED]> wrote:
> 
> > On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
> > > On Sun, 17 Feb 2008 20:29:13 +0100
> > > Jens Axboe <[EMAIL PROTECTED]> wrote:
> > > 
> > > > It's odd stuff. Could you perhaps try and add some printks to
> > > > block/cfq-iosched.c:call_for_each_cic(), like dumping the 'nr' return
> > > > from radix_tree_gang_lookup() and the pointer value of cics[i] in the
> > > > for() loop after the lookup?
> > > > 
> > > I met the same issue on ia64/NUMA box.
> > > seems cics[]->key is NULL and index for radix_tree_gang_lookup() was
> > > always '1'.
> > 
> > Why does it keep repeating then? If ->key is NULL, the next lookup index
> > should be 1UL.
> > 
> > But I think the radix 'scan over entire tree' is a bit fragile. This
> > patch adds a parallel hlist for ease of properly browsing the members,
> > does that work for you? It compiles, but I haven't booted it here yet...
> > 
> Works well for me and my box booted !

Super, I'll get it upstream. Thanks for testing and debugging!

-- 
Jens Axboe


Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread Jens Axboe

On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
> On Tue, 19 Feb 2008 09:36:34 +0100
> Jens Axboe <[EMAIL PROTECTED]> wrote:
> 
> > On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
> > > On Sun, 17 Feb 2008 20:29:13 +0100
> > > Jens Axboe <[EMAIL PROTECTED]> wrote:
> > > 
> > > > It's odd stuff. Could you perhaps try and add some printks to
> > > > block/cfq-iosched.c:call_for_each_cic(), like dumping the 'nr' return
> > > > from radix_tree_gang_lookup() and the pointer value of cics[i] in the
> > > > for() loop after the lookup?
> > > > 
> > > I met the same issue on ia64/NUMA box.
> > > seems cics[]->key is NULL and index for radix_tree_gang_lookup() was
> > > always '1'.
> > 
> > Why does it keep repeating then? If ->key is NULL, the next lookup index
> > should be 1UL.
> > 
> when I inserted printk here
> ==
>   for (i = 0; i < nr; i++)
>   func(ioc, cics[i]);
>   printk("%d %lx\n", nr, index);
> ==
> index was always "1" and  nr was always 32.
> 
> So, cics[31]->key was always NULL when index=1 is passed to
> radix_tree_gang_lookup().

Hang on, it returned 32? It should not return more than 16, since that
is what we have room for and asked for. Using ->dead_key when ->key is
NULL is correct btw, since that is the correct location in the tree once
the process has exited. But that should not happen until AFTER the
func() call, so I still think the list patch is safer.
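
The repeating scan described above can be reproduced with a toy model. This is a sketch, not the cfq or radix tree code: a sorted array stands in for the tree, gang_lookup() is a hypothetical replacement for a batched lookup, the batch size of 16 is the one mentioned in this thread, and a key of 0 plays the role of ->key == NULL. The resume index is computed as "last key plus one", which is an assumption about how such a loop is typically written.

```c
#include <assert.h>
#include <stddef.h>

#define GANG_NR 16                      /* batch size mentioned in the thread */
#define NITEMS  20

struct item {
	unsigned long slot;             /* position in the fake "tree" */
	unsigned long key;              /* 0 plays the role of ->key == NULL */
};

static struct item items[NITEMS];       /* kept sorted by slot */

/* Toy gang lookup: collect up to max items whose slot is >= first,
 * in slot order, and return how many were found. */
static unsigned int gang_lookup(struct item **res, unsigned long first,
				unsigned int max)
{
	unsigned int n = 0;

	for (size_t i = 0; i < NITEMS && n < max; i++)
		if (items[i].slot >= first)
			res[n++] = &items[i];
	return n;
}

/* Walk everything in batches. Stop after 'limit' rounds so that a scan
 * that never advances shows up as a return value instead of a hang. */
static unsigned int scan_rounds(unsigned int limit)
{
	struct item *batch[GANG_NR];
	unsigned long index = 0;
	unsigned int nr, rounds = 0;

	do {
		nr = gang_lookup(batch, index, GANG_NR);
		if (nr)
			index = 1 + batch[nr - 1]->key; /* resume after last item */
		rounds++;
	} while (nr == GANG_NR && rounds < limit);
	return rounds;
}

/* Fill slots 1..NITEMS; optionally clear the key of one item. */
static void fill(int clear_key_at)
{
	for (int i = 0; i < NITEMS; i++) {
		items[i].slot = (unsigned long)i + 1;
		items[i].key = (i == clear_key_at) ? 0 : items[i].slot;
	}
}
```

With all keys set, fill(-1) followed by scan_rounds(100) finishes in 2 rounds. After fill(15) the last item of the first full batch has a zero key, the index goes back to 1, and the scan repeats the same 16 items until the round limit is hit, which matches the "index was always 1" observation quoted above.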

> > But I think the radix 'scan over entire tree' is a bit fragile. This
> > patch adds a parallel hlist for ease of properly browsing the
> > members, does that work for you? It compiles, but I haven't booted
> > it here yet...
> > 
> will try. please wait a bit.

It boots here, so at least it passes normal sanity tests. It should
solve your problem as well, hopefully.

-- 
Jens Axboe


Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread KAMEZAWA Hiroyuki

On Tue, 19 Feb 2008 09:36:34 +0100
Jens Axboe <[EMAIL PROTECTED]> wrote:

> On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
> > On Sun, 17 Feb 2008 20:29:13 +0100
> > Jens Axboe <[EMAIL PROTECTED]> wrote:
> > 
> > > It's odd stuff. Could you perhaps try and add some printks to
> > > block/cfq-iosched.c:call_for_each_cic(), like dumping the 'nr' return
> > > from radix_tree_gang_lookup() and the pointer value of cics[i] in the
> > > for() loop after the lookup?
> > > 
> > I met the same issue on ia64/NUMA box.
> > seems cics[]->key is NULL and index for radix_tree_gang_lookup() was
> > always '1'.
> 
> Why does it keep repeating then? If ->key is NULL, the next lookup index
> should be 1UL.
> 
> But I think the radix 'scan over entire tree' is a bit fragile. This
> patch adds a parallel hlist for ease of properly browsing the members,
> does that work for you? It compiles, but I haven't booted it here yet...
> 
Works well for me and my box booted !

Thanks,
-Kame


Re: Linux 2.6.25-rc2 regression: LVM cannot find volume group

2008-02-19 Thread Tilman Schmidt


[added CCs from the other thread on this topic]

Alasdair G Kergon wrote:
> On Sat, Feb 16, 2008 at 11:37:37PM +0100, Jiri Slaby wrote:
>> # CONFIG_SYSFS_DEPRECATED is not set
>
> IMHO That should be *set* by default until everyone has had time to
> update their userspace software to cope with the changed sysfs layout.


It *is* set by default.

The root cause of the trouble is that its semantics are changing.
At one point in time (sorry, don't remember which kernel release
exactly) I tested whether the openSUSE 10.3 userspace supported
a CONFIG_SYSFS_DEPRECATED=n kernel and found that it did. From
then on, "make oldconfig" would carry that setting over to every
new kernel I built, which was fine while the meaning of this
setting - i.e. the difference in sysfs layout it controlled -
stayed the same.

With commit edfaa7c36574f1bf09c65ad602412db9da5f96bf however, the
sysfs layout changed again, so the same CONFIG_SYSFS_DEPRECATED
setting now controls a different difference (argh) in sysfs
layout. That kind of situation is not handled very well by
"make oldconfig", which basically starts from the assumption
that a setting that was ok for the previous kernel version is
still ok for the new one.

I see two ways of avoiding that problem: either create a new
backward compatibility config setting for that new sysfs change,
or create a way of telling "make oldconfig" that the semantics of
CONFIG_SYSFS_DEPRECATED have changed and it should ask the user
for that again even if there is a previous setting.

HTH
T.

--
Tilman Schmidt                    E-Mail: [EMAIL PROTECTED]
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)




Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread KAMEZAWA Hiroyuki

On Tue, 19 Feb 2008 09:36:34 +0100
Jens Axboe <[EMAIL PROTECTED]> wrote:

> On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
> > On Sun, 17 Feb 2008 20:29:13 +0100
> > Jens Axboe <[EMAIL PROTECTED]> wrote:
> > 
> > > It's odd stuff. Could you perhaps try and add some printks to
> > > block/cfq-iosched.c:call_for_each_cic(), like dumping the 'nr' return
> > > from radix_tree_gang_lookup() and the pointer value of cics[i] in the
> > > for() loop after the lookup?
> > > 
> > I met the same issue on ia64/NUMA box.
> > seems cics[]->key is NULL and index for radix_tree_gang_lookup() was
> > always '1'.
> 
> Why does it keep repeating then? If ->key is NULL, the next lookup index
> should be 1UL.
> 
when I inserted printk here
==
	for (i = 0; i < nr; i++)
		func(ioc, cics[i]);
	printk("%d %lx\n", nr, index);
==
index was always "1" and  nr was always 32.

So, cics[31]->key was always NULL when index=1 is passed to 
radix_tree_gang_lookup().


> But I think the radix 'scan over entire tree' is a bit fragile. This
> patch adds a parallel hlist for ease of properly browsing the members,
> does that work for you? It compiles, but I haven't booted it here yet...
> 
will try. please wait a bit.

Thanks,
-Kame


Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread Jens Axboe

On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
> On Sun, 17 Feb 2008 20:29:13 +0100
> Jens Axboe <[EMAIL PROTECTED]> wrote:
> 
> > It's odd stuff. Could you perhaps try and add some printks to
> > block/cfq-iosched.c:call_for_each_cic(), like dumping the 'nr' return
> > from radix_tree_gang_lookup() and the pointer value of cics[i] in the
> > for() loop after the lookup?
> > 
> I met the same issue on ia64/NUMA box.
> seems cics[]->key is NULL and index for radix_tree_gang_lookup() was
> always '1'.

Why does it keep repeating then? If ->key is NULL, the next lookup index
should be 1UL.

But I think the radix 'scan over entire tree' is a bit fragile. This
patch adds a parallel hlist for ease of properly browsing the members,
does that work for you? It compiles, but I haven't booted it here yet...
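
For illustration, walking a list that runs parallel to the tree looks roughly
like this in user space. This is a sketch under assumptions: the hlist types
are minimal re-implementations, sketch_ioc, sketch_cic, sketch_add(),
sketch_sum() and sketch_for_each() are hypothetical names, and the RCU
protection and locking the real patch relies on are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal user-space copies of the hlist types, for illustration only. */
struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

#define container_of(ptr, type, member) \
	((type *) ((char *) (ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for cfq_io_context and io_context. */
struct sketch_cic {
	int id;
	struct hlist_node cic_list;  /* links this entry into the ioc's list */
};

struct sketch_ioc {
	struct hlist_head cic_list;  /* every member, independent of tree keys */
	int sum;                     /* scratch field used by sketch_sum() */
};

/* Add an entry at the head of the list (no locking in this sketch). */
static void sketch_add(struct sketch_ioc *ioc, struct sketch_cic *cic)
{
	struct hlist_node *n = &cic->cic_list;
	struct hlist_node *first = ioc->cic_list.first;

	assert(cic != NULL);
	n->next = first;
	if (first)
		first->pprev = &n->next;
	ioc->cic_list.first = n;
	n->pprev = &ioc->cic_list.first;
}

/* Example callback: accumulate the ids of all entries seen. */
static void sketch_sum(struct sketch_ioc *ioc, struct sketch_cic *cic)
{
	ioc->sum += cic->id;
}

/* Call func for each entry on the list; returns the number of entries seen. */
static unsigned int sketch_for_each(struct sketch_ioc *ioc,
		void (*func)(struct sketch_ioc *, struct sketch_cic *))
{
	struct hlist_node *n;
	unsigned int called = 0;

	for (n = ioc->cic_list.first; n; n = n->next) {
		func(ioc, container_of(n, struct sketch_cic, cic_list));
		called++;
	}
	return called;
}
```

The walk visits each member exactly once without consulting any key, so a
cleared ->key cannot send it back to the start.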

> Attached patch works well for me, but I don't know much about cfq.
> please confirm. 

It doesn't make a lot of sense, I'm afraid.

 block/blk-ioc.c   |   35 +++
 block/cfq-iosched.c   |   37 +++--
 include/linux/iocontext.h |2 ++
 3 files changed, 28 insertions(+), 46 deletions(-)

diff --git a/block/blk-ioc.c b/block/blk-ioc.c
index 80245dc..73c7002 100644
--- a/block/blk-ioc.c
+++ b/block/blk-ioc.c
@@ -17,17 +17,13 @@ static struct kmem_cache *iocontext_cachep;
 
 static void cfq_dtor(struct io_context *ioc)
 {
-	struct cfq_io_context *cic[1];
-	int r;
+	if (!hlist_empty(&ioc->cic_list)) {
+		struct cfq_io_context *cic;
 
-	/*
-	 * We don't have a specific key to lookup with, so use the gang
-	 * lookup to just retrieve the first item stored. The cfq exit
-	 * function will iterate the full tree, so any member will do.
-	 */
-	r = radix_tree_gang_lookup(&ioc->radix_root, (void **) cic, 0, 1);
-	if (r > 0)
-		cic[0]->dtor(ioc);
+		cic = list_entry(ioc->cic_list.first, struct cfq_io_context,
+								cic_list);
+		cic->dtor(ioc);
+	}
 }
 
 /*
@@ -57,18 +53,16 @@ EXPORT_SYMBOL(put_io_context);
 
 static void cfq_exit(struct io_context *ioc)
 {
-	struct cfq_io_context *cic[1];
-	int r;
-
 	rcu_read_lock();
-	/*
-	 * See comment for cfq_dtor()
-	 */
-	r = radix_tree_gang_lookup(&ioc->radix_root, (void **) cic, 0, 1);
-	rcu_read_unlock();
 
-	if (r > 0)
-		cic[0]->exit(ioc);
+	if (!hlist_empty(&ioc->cic_list)) {
+		struct cfq_io_context *cic;
+
+		cic = list_entry(ioc->cic_list.first, struct cfq_io_context,
+								cic_list);
+		cic->exit(ioc);
+	}
+	rcu_read_unlock();
 }
 
 /* Called by the exitting task */
@@ -105,6 +99,7 @@ struct io_context *alloc_io_context(gfp_t gfp_flags, int node)
 		ret->nr_batch_requests = 0; /* because this is 0 */
 		ret->aic = NULL;
 		INIT_RADIX_TREE(&ret->radix_root, GFP_ATOMIC | __GFP_HIGH);
+		INIT_HLIST_HEAD(&ret->cic_list);
 		ret->ioc_data = NULL;
 	}
 
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index ca198e6..62eda3f 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -1145,38 +1145,19 @@ static void cfq_put_queue(struct cfq_queue *cfqq)
 /*
  * Call func for each cic attached to this ioc. Returns number of cic's seen.
  */
-#define CIC_GANG_NR	16
 static unsigned int
 call_for_each_cic(struct io_context *ioc,
 		  void (*func)(struct io_context *, struct cfq_io_context *))
 {
-	struct cfq_io_context *cics[CIC_GANG_NR];
-	unsigned long index = 0;
-	unsigned int called = 0;
-	int nr;
+	struct cfq_io_context *cic;
+	struct hlist_node *n;
+	int called = 0;
 
 	rcu_read_lock();
-
-	do {
-		int i;
-
-		/*
-		 * Perhaps there's a better way - this just gang lookups from
-		 * 0 to the end, restarting after each CIC_GANG_NR from the
-		 * last key + 1.
-		 */
-		nr = radix_tree_gang_lookup(&ioc->radix_root, (void **) cics,
-						index, CIC_GANG_NR);
-		if (!nr)
-			break;
-
-		called += nr;
-		index = 1 + (unsigned long) cics[nr - 1]->key;
-
-		for (i = 0; i < nr; i++)
-			func(ioc, cics[i]);
-	} while (nr == CIC_GANG_NR);
-
+	hlist_for_each_entry_rcu(cic, n, &ioc->cic_list, cic_list) {
+		func(ioc, cic);
+		called++;
+	}
 	rcu_read_unlock();
 
 	return called;
@@ -1190,6 +1171,7 @@ static void cic_free_func(struct io_context *ioc, struct cfq_io_context *cic)
 
 	spin_lock_irqsave(&ioc->lock, flags);
 	radix_tree_delete(&ioc->radix_root, cic->dead_key);
+

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread KAMEZAWA Hiroyuki

On Sun, 17 Feb 2008 20:29:13 +0100
Jens Axboe <[EMAIL PROTECTED]> wrote:

> It's odd stuff. Could you perhaps try and add some printks to
> block/cfq-iosched.c:call_for_each_cic(), like dumping the 'nr' return
> from radix_tree_gang_lookup() and the pointer value of cics[i] in the
> for() loop after the lookup?
> 
I met the same issue on ia64/NUMA box.
seems cics[]->key is NULL and index for radix_tree_gang_lookup() was always '1'.

Attached patch works well for me, 
but I don't know much about cfq. please confirm. 

Regards,
-Kame

==
cics[]->key can be NULL.
In that case, cics[]->dead_key has key value.

Signed-off-by: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]>

Index: linux-2.6.25-rc2/block/cfq-iosched.c
===========
--- linux-2.6.25-rc2.orig/block/cfq-iosched.c
+++ linux-2.6.25-rc2/block/cfq-iosched.c
@@ -1171,7 +1171,11 @@ call_for_each_cic(struct io_context *ioc
 			break;
 
 		called += nr;
-		index = 1 + (unsigned long) cics[nr - 1]->key;
+
+		if (!cics[nr - 1]->key)
+			index = 1 + (unsigned long) cics[nr - 1]->dead_key;
+		else
+			index = 1 + (unsigned long) cics[nr - 1]->key;
 
 		for (i = 0; i < nr; i++)
 			func(ioc, cics[i]);


Re: Linux 2.6.25-rc2

2008-02-19 Thread Ingo Molnar


* Pekka Enberg <[EMAIL PROTECTED]> wrote:

> Mathieu, Christoph is on vacation and I'm not at all that familiar 
> with this cmpxchg_local() optimization, so if you could take a peek at 
> this bug report to see if you can spot something obviously wrong with 
> it, I would much appreciate that.

hm, it's bad for at least one other reason as well (which is probably 
unrelated to this crash):

> /*
>  * Currently fastpath is not supported if preemption is enabled.
>  */
> #if defined(CONFIG_FAST_CMPXCHG_LOCAL) && !defined(CONFIG_PREEMPT)
> #define SLUB_FASTPATH
> #endif

such !PREEMPT exceptions tend to show "i didnt want to think too hard 
about the preemptible case so just turn it off" thinking.

Also, why isnt this SLUB_FASTPATH flag done in the Kconfig space?

Ingo

Re: Linux 2.6.25-rc2

2008-02-19 Thread Pekka Enberg

Hi,

Pekka Enberg <[EMAIL PROTECTED]> wrote:
> > Mathieu, Christoph is on vacation and I'm not at all that familiar
> > with this cmpxchg_local() optimization, so if you could take a peek at
> > this bug report to see if you can spot something obviously wrong with
> > it, I would much appreciate that.

On Feb 19, 2008 12:27 PM, Ingo Molnar <[EMAIL PROTECTED]> wrote:
> hm, it's bad for at least one other reason as well (which is probably
> unrelated to this crash):
>
> > /*
> >  * Currently fastpath is not supported if preemption is enabled.
> >  */
> > #if defined(CONFIG_FAST_CMPXCHG_LOCAL) && !defined(CONFIG_PREEMPT)
> > #define SLUB_FASTPATH
> > #endif
>
> such !PREEMPT exceptions tend to show "i didnt want to think too hard
> about the preemptible case so just turn it off" thinking.
>
> Also, why isnt this SLUB_FASTPATH flag done in the Kconfig space?

Hmm, no idea. I think there might have been some mix-up with merging the
patch. The one I saw was:

http://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24/2.6.24-mm1/broken-out/slub-optional-fast-path-using-cmpxchg_local.patch

But I don't remember giving out a Reviewed-by for it (and my mailbox
confirms that). Furthermore, somehow it turned into this when merged:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c

In any case, if Torsten/someone can verify that reverting
1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c (SLUB: Alternate fast paths
using cmpxchg_local) fixes these problems, I think we should just do
it and let Christoph sort it out when he gets back.

   Pekka

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread Jens Axboe

On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
> On Tue, 19 Feb 2008 09:58:38 +0100
> Jens Axboe <[EMAIL PROTECTED]> wrote:
> > > when I inserted printk here
> > > ==
> > > 	for (i = 0; i < nr; i++)
> > > 		func(ioc, cics[i]);
> > > 	printk("%d %lx\n", nr, index);
> > > ==
> > > index was always "1" and  nr was always 32.
> > > 
> > > So, cics[31]->key was always NULL when index=1 is passed to
> > > radix_tree_gang_lookup().
> > 
> > Hang on, it returned 32? It should not return more than 16, since that
> > is what we have room for and asked for. 
> sorry. Of course, it was 16 ;(

I expected so, otherwise we would have had far more serious problems :-)

> your patch works well. thank you.

It's committed now and posted in the relevant bugzilla as well (#9948).

-- 
Jens Axboe


Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread Jens Axboe

On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
> On Tue, 19 Feb 2008 09:36:34 +0100
> Jens Axboe <[EMAIL PROTECTED]> wrote:
> 
> > On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
> > > On Sun, 17 Feb 2008 20:29:13 +0100
> > > Jens Axboe <[EMAIL PROTECTED]> wrote:
> > > 
> > > > It's odd stuff. Could you perhaps try and add some printks to
> > > > block/cfq-iosched.c:call_for_each_cic(), like dumping the 'nr' return
> > > > from radix_tree_gang_lookup() and the pointer value of cics[i] in the
> > > > for() loop after the lookup?
> > > > 
> > > I met the same issue on ia64/NUMA box.
> > > seems cics[]->key is NULL and index for radix_tree_gang_lookup() was
> > > always '1'.
> > 
> > Why does it keep repeating then? If ->key is NULL, the next lookup index
> > should be 1UL.
> > 
> > But I think the radix 'scan over entire tree' is a bit fragile. This
> > patch adds a parallel hlist for ease of properly browsing the members,
> > does that work for you? It compiles, but I haven't booted it here yet...
> > 
> Works well for me and my box booted !

Super, I'll get it upstream. Thanks for testing and debugging!

-- 
Jens Axboe


Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread KAMEZAWA Hiroyuki

On Tue, 19 Feb 2008 09:58:38 +0100
Jens Axboe <[EMAIL PROTECTED]> wrote:
> > when I inserted printk here
> > ==
> > 	for (i = 0; i < nr; i++)
> > 		func(ioc, cics[i]);
> > 	printk("%d %lx\n", nr, index);
> > ==
> > index was always "1" and  nr was always 32.
> > 
> > So, cics[31]->key was always NULL when index=1 is passed to
> > radix_tree_gang_lookup().
> 
> Hang on, it returned 32? It should not return more than 16, since that
> is what we have room for and asked for. 
sorry. Of course, it was 16 ;(

your patch works well. thank you.

-Kame


Re: Linux 2.6.25-rc2

2008-02-19 Thread Mathieu Desnoyers

* Ingo Molnar ([EMAIL PROTECTED]) wrote:
> 
> * Pekka Enberg <[EMAIL PROTECTED]> wrote:
> 
> > Mathieu, Christoph is on vacation and I'm not at all that familiar 
> > with this cmpxchg_local() optimization, so if you could take a peek at 
> > this bug report to see if you can spot something obviously wrong with 
> > it, I would much appreciate that.
> 
> hm, it's bad for at least one other reason as well (which is probably 
> unrelated to this crash):
> 
> > /*
> >  * Currently fastpath is not supported if preemption is enabled.
> >  */
> > #if defined(CONFIG_FAST_CMPXCHG_LOCAL) && !defined(CONFIG_PREEMPT)
> > #define SLUB_FASTPATH
> > #endif
> 
> such !PREEMPT exceptions tend to show "i didnt want to think too hard 
> about the preemptible case so just turn it off" thinking.
> 

Ingo, a comment in slub.c explains it :

/*
 * The SLUB_FASTPATH path is provisional and is currently disabled if the
 * kernel is compiled with preemption or if the arch does not support
 * fast cmpxchg operations. There are a couple of coming changes that will
 * simplify matters and allow preemption. Ultimately we may end up making
 * SLUB_FASTPATH the default.
 *
 * 1. The introduction of the per cpu allocator will avoid array lookups
 *    through get_cpu_slab(). A special register can be used instead.
 *
 * 2. The introduction of per cpu atomic operations (cpu_ops) means that
 *    we can realize the logic here entirely with per cpu atomics. The
 *    per cpu atomic ops will take care of the preemption issues.
 */

So there is more coming in the preemption area.


> Also, why isnt this SLUB_FASTPATH flag done in the Kconfig space?
> 

Eventually, I think only CONFIG_FAST_CMPXCHG_LOCAL will be needed (when
the code will support preemption). Therefore, this SLUB_FASTPATH define
seems to be only here temporarily.

I'm looking at the code right now.. more to come.

Mathieu

> 	Ingo

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread Kamalesh Babulal

Jens Axboe wrote:
> On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote:
>> On Sun, 17 Feb 2008 20:29:13 +0100
>> Jens Axboe <[EMAIL PROTECTED]> wrote:
>>
>>> It's odd stuff. Could you perhaps try and add some printks to
>>> block/cfq-iosched.c:call_for_each_cic(), like dumping the 'nr' return
>>> from radix_tree_gang_lookup() and the pointer value of cics[i] in the
>>> for() loop after the lookup?
>>>
>> I met the same issue on ia64/NUMA box.
>> seems cics[]->key is NULL and index for radix_tree_gang_lookup() was
>> always '1'.
> 
> Why does it keep repeating then? If ->key is NULL, the next lookup index
> should be 1UL.
> 
> But I think the radix 'scan over entire tree' is a bit fragile. This
> patch adds a parallel hlist for ease of properly browsing the members,
> does that work for you? It compiles, but I haven't booted it here yet...
> 
>> Attached patch works well for me, but I don't know much about cfq.
>> please confirm. 
> 
> It doesn't make a lot of sense, I'm afraid.
> 
>  block/blk-ioc.c   |   35 +++
>  block/cfq-iosched.c   |   37 +++--
>  include/linux/iocontext.h |2 ++
>  3 files changed, 28 insertions(+), 46 deletions(-)
> 
> diff --git a/block/blk-ioc.c b/block/blk-ioc.c
> index 80245dc..73c7002 100644
> --- a/block/blk-ioc.c

<snip>

Hi Jens,

Thanks for the patch. It works fine; the machine boots up without the
kernel panic.


-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
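
A note on the lookup loop discussed in the quoted exchange above. The toy model
below is not kernel code; it only sketches one way a "scan in batches, resume at
last key + 1" loop could stop making progress if an entry that is still stored
reports a key of 0. The names toy_cic, toy_gang_lookup(), toy_scan(), TOY_SLOTS
and TOY_GANG are hypothetical, and whether this is what actually happened in
call_for_each_cic() is not established in the thread.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SLOTS 8   /* hypothetical size of the toy "tree" */
#define TOY_GANG  2   /* hypothetical batch size per lookup */

struct toy_cic {
    unsigned long key;   /* index the entry believes it is stored at */
};

/* Collect up to max entries stored at slot >= first, in slot order. */
static unsigned int toy_gang_lookup(struct toy_cic *slots[TOY_SLOTS],
                                    struct toy_cic **out,
                                    unsigned long first, unsigned int max)
{
    unsigned int n = 0;
    unsigned long i;

    for (i = first; i < TOY_SLOTS && n < max; i++)
        if (slots[i])
            out[n++] = slots[i];
    return n;
}

/*
 * Scan everything in batches, resuming at "last key + 1".
 * Returns the number of lookups done, or -1 if max_rounds was
 * exceeded, i.e. the scan stopped making progress.
 */
static int toy_scan(struct toy_cic *slots[TOY_SLOTS], int max_rounds)
{
    struct toy_cic *batch[TOY_GANG];
    unsigned long index = 0;
    unsigned int nr;
    int rounds = 0;

    assert(max_rounds > 0);
    do {
        if (rounds++ >= max_rounds)
            return -1;
        nr = toy_gang_lookup(slots, batch, index, TOY_GANG);
        if (nr)
            index = 1 + batch[nr - 1]->key;   /* key 0 resets index to 1 */
    } while (nr == TOY_GANG);
    return rounds;
}
```

With three entries whose keys match their slots the scan finishes in two
lookups. If the last entry of a full batch has its key cleared to 0, the next
lookup index becomes 1 again and the same batch is returned every time.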

Re: Linux 2.6.25-rc2

2008-02-19 Thread Ingo Molnar


* Mathieu Desnoyers [EMAIL PROTECTED] wrote:

> Ingo, a comment in slub.c explains it :
> 
> /*
>  * The SLUB_FASTPATH path is provisional and is currently disabled if the
>  * kernel is compiled with preemption or if the arch does not support
>  * fast cmpxchg operations. There are a couple of coming changes that will
>  * simplify matters and allow preemption. Ultimately we may end up making
>  * SLUB_FASTPATH the default.

well the feature is not complete and there are no reasons given _why_ 
it's not complete ... and even if there's a reason it should have been 
deferred to the next merge window. We still have 10 year old "this is a 
temporary hack" comments in the kernel ;-)

"hardware does not support it" is a valid argument, "kernel developer 
had no time to implement it properly" is not ;-)

Ingo

Re: Linux 2.6.25-rc2

2008-02-19 Thread Pekka Enberg

Hi Mathieu,

On Feb 19, 2008 4:02 PM, Mathieu Desnoyers [EMAIL PROTECTED] wrote:
> > Since this shows mostly with network card drivers, I think the most
> > plausible cause would be an IRQ nesting over kmem_cache_alloc_node and
> > calling it.

On Feb 19, 2008 4:21 PM, Pekka Enberg [EMAIL PROTECTED] wrote:
> Yes, this can happen. Are you saying it is not safe to be in the
> lockless path when an IRQ triggers?

Hmm. The barrier() in slab_free() looks fishy. The comment says it's
there to make sure we've retrieved c->freelist before c->page but then
it uses a _compiler barrier_ which doesn't affect the CPU and the
reads may still be re-ordered... Not sure if that matters here though.

Re: Linux 2.6.25-rc2

2008-02-19 Thread Mathieu Desnoyers

* Pekka Enberg ([EMAIL PROTECTED]) wrote:
 On Feb 19, 2008 8:54 AM, Torsten Kaiser [EMAIL PROTECTED] wrote:
[ 5282.056415] [ cut here ]
[ 5282.059757] kernel BUG at lib/list_debug.c:33!
[ 5282.062055] invalid opcode:  [1] SMP
[ 5282.062055] CPU 3
  
   hm. Your crashes do seem to span multiple subsystems, but it always
   seems to be around the SLUB code. Could you try the patch below? The
   SLUB code has a new optimization and i'm not 100% sure about it. [the
   hack below switches the SLUB optimization off by disabling the CPU
   feature it relies on.]
  
   Ingo
  
   -
arch/x86/Kconfig |4 
1 file changed, 4 deletions(-)
  
   Index: linux/arch/x86/Kconfig
   ===
   --- linux.orig/arch/x86/Kconfig
   +++ linux/arch/x86/Kconfig
   @@ -59,10 +59,6 @@ config HAVE_LATENCYTOP_SUPPORT
config SEMAPHORE_SLEEPERS
   def_bool y
  
   -config FAST_CMPXCHG_LOCAL
   -   bool
   -   default y
   -
config MMU
   def_bool y
  
 
  $ grep FAST_CMPXCHG_LOCAL */.config
  linux-2.6.24-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
  linux-2.6.24-rc3-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
  linux-2.6.24-rc3-mm2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
  linux-2.6.24-rc6-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
  linux-2.6.24-rc8-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
  linux-2.6.25-rc1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
  linux-2.6.25-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
  linux-2.6.25-rc2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
 
  -rc2-mm1 still worked for me.
 
  Did you mean the new SLUB_FASTPATH?
  $ grep define SLUB_FASTPATH */mm/slub.c
  linux-2.6.25-rc1/mm/slub.c:#define SLUB_FASTPATH
  linux-2.6.25-rc2-mm1/mm/slub.c:#define SLUB_FASTPATH
  linux-2.6.25-rc2/mm/slub.c:#define SLUB_FASTPATH
 
  The 2.6.24-rc3+ mm-kernels did crash for me, but don't seem to contain 
  this...
 
  On the other hand:
  From the crash in 2.6.25-rc2-mm1:
  [59987.116182] RIP  [8029f83d] kmem_cache_alloc_node+0x6d/0xa0
 
  (gdb) list *0x8029f83d
  0x8029f83d is in kmem_cache_alloc_node (mm/slub.c:1646).
  1641            if (unlikely(is_end(object) || !node_match(c, node))) {
  1642                    object = __slab_alloc(s, gfpflags, node, addr, c);
  1643                    break;
  1644            }
  1645            stat(c, ALLOC_FASTPATH);
  1646    } while (cmpxchg_local(&c->freelist, object, object[c->offset])
  1647                                                    != object);
  1648    #else
  1649            unsigned long flags;
  1650
 
  That code is part for SLUB_FASTPATH.
 
  I'm willing to test the patch, but don't know how fast I can find the
  time to do it, so my answer if your patch helps might be delayed until
  the weekend.
 
 Mathieu, Christoph is on vacation and I'm not at all that familiar
 with this cmpxchg_local() optimization, so if you could take a peek at
 this bug report to see if you can spot something obviously wrong with
 it, I would much appreciate that.

Sure,

Initial thoughts :

I'd like to get the complete config causing this bug. I suspect either :

- A race between the lockless algo and an IRQ in a driver allocating
  memory.
- stat(c, ALLOC_FASTPATH); seems to be using a var++, therefore
  indicating it is not reentrant if IRQs are disabled. Since those are
  only stats, I guess it's ok, but still weird.
- CPU hotplug problem. 
  http://bugzilla.kernel.org/attachment.cgi?id=14877&action=view shows
  last sysfs file:
  /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
  -- is this linked to a cpu up/down event ?

Since this shows mostly with network card drivers, I think the most
plausible cause would be an IRQ nesting over kmem_cache_alloc_node and
calling it.

Will dig further...

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
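
On the stat(c, ALLOC_FASTPATH) point above: the sketch below is a userspace toy,
not the kernel's stat() implementation. It assumes the counter is bumped with a
plain var++ and models that as separate load and store steps, with an
"interrupt" (an ordinary callback here) landing in between. The names toy_stat,
toy_irq_bump() and toy_stat_inc_interrupted() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct toy_stat {
    unsigned long count;
};

typedef void (*toy_irq_fn)(struct toy_stat *);

/* What the "interrupt handler" does: its own count++. */
static void toy_irq_bump(struct toy_stat *s)
{
    s->count++;
}

/*
 * count++ split into load / add / store. If irq is non-NULL it runs
 * between the load and the store, like an IRQ nesting over the
 * increment when interrupts are left enabled.
 */
static void toy_stat_inc_interrupted(struct toy_stat *s, toy_irq_fn irq)
{
    unsigned long tmp;

    assert(s != NULL);
    tmp = s->count;        /* load */
    if (irq)
        irq(s);            /* handler increments the same counter */
    s->count = tmp + 1;    /* store overwrites the handler's update */
}
```

Starting from zero, one interrupted increment leaves the counter at 1 even
though two increments ran. For a statistics counter that only loses a count,
which is consistent with the "only stats" remark above.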

Re: Linux 2.6.25-rc2

2008-02-19 Thread Ingo Molnar


* Pekka Enberg [EMAIL PROTECTED] wrote:

> > Yes, this can happen. Are you saying it is not safe to be in the 
> > lockless path when an IRQ triggers?
> 
> Hmm. The barrier() in slab_free() looks fishy. The comment says it's 
> there to make sure we've retrieved c->freelist before c->page but then 
> it uses a _compiler barrier_ which doesn't affect the CPU and the 
> reads may still be re-ordered... Not sure if that matters here though.

find a fix patch for that below - most systems affected seem to be SMP 
ones.

If this (or my other patch) indeed solves the problem i'd still favor a 
full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it looks 
quite un-cooked and quite un-tested for multiple independent reasons.

Sigh, why do i again have to be the messenger who brings the bad news to 
SLUB land, and again when poor Christoph went on vacation? :-/

Ingo

--
Subject: SLUB: barrier fix
From: Ingo Molnar [EMAIL PROTECTED]

---
 mm/slub.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/mm/slub.c
===
--- linux.orig/mm/slub.c
+++ linux/mm/slub.c
@@ -1862,7 +1862,7 @@ static __always_inline void slab_free(st
 	debug_check_no_locks_freed(object, s->objsize);
 	do {
 		freelist = c->freelist;
-		barrier();
+		smp_mb();
 		/*
 		 * If the compiler would reorder the retrieval of c->page to
 		 * come before c->freelist then an interrupt could

Re: Linux 2.6.25-rc2

2008-02-19 Thread Ingo Molnar


* Ingo Molnar [EMAIL PROTECTED] wrote:

> If this (or my other patch) indeed solves the problem i'd still favor 
> a full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it 
> looks quite un-cooked and quite un-tested for multiple independent 
> reasons.
> 
> Sigh, why do i again have to be the messenger who brings the bad news 
> to SLUB land, and again when poor Christoph went on vacation? :-/

the revert patch is below. (manually done due to other changes since 
1f84260c8ce3b1ce26d4 was committed, but trivial)

Ingo

-
Subject: slub: fastpath optimization revert
From: Ingo Molnar [EMAIL PROTECTED]
Date: Tue Feb 19 15:46:37 CET 2008

revert:

  commit 1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c
  Author: Christoph Lameter [EMAIL PROTECTED]
  Date:   Mon Jan 7 23:20:30 2008 -0800

  SLUB: Alternate fast paths using cmpxchg_local

it was causing problems (crashes) and was incomplete.

Signed-off-by: Ingo Molnar [EMAIL PROTECTED]
---
 mm/slub.c |   87 --
 1 file changed, 87 deletions(-)

Index: linux-x86.q/mm/slub.c
===
--- linux-x86.q.orig/mm/slub.c
+++ linux-x86.q/mm/slub.c
@@ -149,13 +149,6 @@ static inline void ClearSlabDebug(struct
 /* Enable to test recovery from slab corruption on boot */
 #undef SLUB_RESILIENCY_TEST
 
-/*
- * Currently fastpath is not supported if preemption is enabled.
- */
-#if defined(CONFIG_FAST_CMPXCHG_LOCAL) && !defined(CONFIG_PREEMPT)
-#define SLUB_FASTPATH
-#endif
-
 #if PAGE_SHIFT <= 12
 
 /*
@@ -1514,11 +1507,6 @@ static void *__slab_alloc(struct kmem_ca
 {
 	void **object;
 	struct page *new;
-#ifdef SLUB_FASTPATH
-	unsigned long flags;
-
-	local_irq_save(flags);
-#endif
 	if (!c->page)
 		goto new_slab;
 
@@ -1541,9 +1529,6 @@ load_freelist:
 unlock_out:
 	slab_unlock(c->page);
 	stat(c, ALLOC_SLOWPATH);
-#ifdef SLUB_FASTPATH
-	local_irq_restore(flags);
-#endif
 	return object;
 
 another_slab:
@@ -1575,9 +1560,6 @@ new_slab:
 		c->page = new;
 		goto load_freelist;
 	}
-#ifdef SLUB_FASTPATH
-	local_irq_restore(flags);
-#endif
 	/*
 	 * No memory available.
 	 *
@@ -1619,34 +1601,6 @@ static __always_inline void *slab_alloc(
 {
 	void **object;
 	struct kmem_cache_cpu *c;
-
-/*
- * The SLUB_FASTPATH path is provisional and is currently disabled if the
- * kernel is compiled with preemption or if the arch does not support
- * fast cmpxchg operations. There are a couple of coming changes that will
- * simplify matters and allow preemption. Ultimately we may end up making
- * SLUB_FASTPATH the default.
- *
- * 1. The introduction of the per cpu allocator will avoid array lookups
- *    through get_cpu_slab(). A special register can be used instead.
- *
- * 2. The introduction of per cpu atomic operations (cpu_ops) means that
- *    we can realize the logic here entirely with per cpu atomics. The
- *    per cpu atomic ops will take care of the preemption issues.
- */
-
-#ifdef SLUB_FASTPATH
-	c = get_cpu_slab(s, raw_smp_processor_id());
-	do {
-		object = c->freelist;
-		if (unlikely(is_end(object) || !node_match(c, node))) {
-			object = __slab_alloc(s, gfpflags, node, addr, c);
-			break;
-		}
-		stat(c, ALLOC_FASTPATH);
-	} while (cmpxchg_local(&c->freelist, object, object[c->offset])
-								!= object);
-#else
 	unsigned long flags;
 
 	local_irq_save(flags);
@@ -1661,7 +1615,6 @@ static __always_inline void *slab_alloc(
 		stat(c, ALLOC_FASTPATH);
 	}
 	local_irq_restore(flags);
-#endif
 
 	if (unlikely((gfpflags & __GFP_ZERO) && object))
 		memset(object, 0, c->objsize);
@@ -1698,11 +1651,6 @@ static void __slab_free(struct kmem_cach
 	void **object = (void *)x;
 	struct kmem_cache_cpu *c;
 
-#ifdef SLUB_FASTPATH
-	unsigned long flags;
-
-	local_irq_save(flags);
-#endif
 	c = get_cpu_slab(s, raw_smp_processor_id());
 	stat(c, FREE_SLOWPATH);
 	slab_lock(page);
@@ -1734,9 +1682,6 @@ checks_ok:
 
 out_unlock:
 	slab_unlock(page);
-#ifdef SLUB_FASTPATH
-	local_irq_restore(flags);
-#endif
 	return;
 
 slab_empty:
@@ -1749,9 +1694,6 @@ slab_empty:
 	}
 	slab_unlock(page);
 	stat(c, FREE_SLAB);
-#ifdef SLUB_FASTPATH
-	local_irq_restore(flags);
-#endif
 	discard_slab(s, page);
 	return;
 
@@ -1777,34 +1719,6 @@ static __always_inline void slab_free(st
 {
 	void **object = (void *)x;
 	struct kmem_cache_cpu *c;
-
-#ifdef SLUB_FASTPATH
-	void **freelist;
-
-	c = get_cpu_slab(s, raw_smp_processor_id());
-	debug_check_no_locks_freed(object, s->objsize);
-	do {
-

Re: Linux 2.6.25-rc2

2008-02-19 Thread Pekka Enberg

Hi Mathieu,

On Feb 19, 2008 4:02 PM, Mathieu Desnoyers [EMAIL PROTECTED] wrote:
> - stat(c, ALLOC_FASTPATH); seems to be using a var++, therefore
>   indicating it is not reentrant if IRQs are disabled. Since those are
>   only stats, I guess it's ok, but still weird.

What is not re-entrant?

On Feb 19, 2008 4:02 PM, Mathieu Desnoyers [EMAIL PROTECTED] wrote:
> Since this shows mostly with network card drivers, I think the most
> plausible cause would be an IRQ nesting over kmem_cache_alloc_node and
> calling it.

Yes, this can happen. Are you saying it is not safe to be in the
lockless path when an IRQ triggers?

Re: Linux 2.6.25-rc2

2008-02-19 Thread Linus Torvalds



On Tue, 19 Feb 2008, Eric Dumazet wrote:
 
> cmpxchg_local(&c->freelist, object, object[c->offset]) can succeed,
> while an interrupt came (on this cpu), and several allocations were done,
> and one free was performed at the end of this interruption, so 'object'
> was recycled.

I think you may well be right. This looks like a good clue.

I'll do the revert. I wanted either a confirmation that reverting it 
actually fixes something, _or_ an actual bug description, and this seems 
to be a quite possible case of the latter.

Linus


Re: Linux 2.6.25-rc2

2008-02-19 Thread Pekka Enberg


Ingo Molnar wrote:

> * Ingo Molnar [EMAIL PROTECTED] wrote:
> 
> > If this (or my other patch) indeed solves the problem i'd still favor 
> > a full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it 
> > looks quite un-cooked and quite un-tested for multiple independent 
> > reasons.
> > 
> > Sigh, why do i again have to be the messenger who brings the bad news 
> > to SLUB land, and again when poor Christoph went on vacation? :-/
> 
> the revert patch is below. (manually done due to other changes since 
> 1f84260c8ce3b1ce26d4 was committed, but trivial)


I am ok with this if someone can actually confirm it fixes things.

Pekka

Re: Linux 2.6.25-rc2

2008-02-19 Thread Pekka Enberg


Ingo Molnar wrote:

> * Pekka Enberg [EMAIL PROTECTED] wrote:
> 
> > > Yes, this can happen. Are you saying it is not safe to be in the 
> > > lockless path when an IRQ triggers?
> > 
> > Hmm. The barrier() in slab_free() looks fishy. The comment says it's 
> > there to make sure we've retrieved c->freelist before c->page but then 
> > it uses a _compiler barrier_ which doesn't affect the CPU and the 
> > reads may still be re-ordered... Not sure if that matters here though.
> 
> find a fix patch for that below - most systems affected seem to be SMP 
> ones.
> 
> If this (or my other patch) indeed solves the problem i'd still favor a 
> full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it looks 
> quite un-cooked and quite un-tested for multiple independent reasons.
> 
> Sigh, why do i again have to be the messenger who brings the bad news to 
> SLUB land, and again when poor Christoph went on vacation? :-/
> 
> Ingo
> 
> --
> Subject: SLUB: barrier fix
> From: Ingo Molnar [EMAIL PROTECTED]
> 
> ---
>  mm/slub.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Index: linux/mm/slub.c
> ===
> --- linux.orig/mm/slub.c
> +++ linux/mm/slub.c
> @@ -1862,7 +1862,7 @@ static __always_inline void slab_free(st
>  	debug_check_no_locks_freed(object, s->objsize);
>  	do {
>  		freelist = c->freelist;
> -		barrier();
> +		smp_mb();
>  		/*
>  		 * If the compiler would reorder the retrieval of c->page to
>  		 * come before c->freelist then an interrupt could


Torsten/Yamin, does this fix things for you? What about reverting commit 
1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c (SLUB: Alternate fast paths 
using cmpxchg_local)?


Pekka

Re: Linux 2.6.25-rc2

2008-02-19 Thread Ingo Molnar


* Ingo Molnar [EMAIL PROTECTED] wrote:

> Earlier today i turned off local-cmpxchg and havent had a crash or 
> hang since then - but at 200 bootups and 4-5 crashes in a week that's 
> not conclusive yet. I think others might have workloads that trigger 
> this bug more often.

i mean, today i've only done 200 randconfig bootups since i did the 
cmpxchg SLUB revert, and given the statistics of the bug (thousands of 
bootups and just 3 provable crashes) i cannot yet conclude that the bug 
is truly gone.

Ingo

Re: Linux 2.6.25-rc2

2008-02-19 Thread Linus Torvalds



On Tue, 19 Feb 2008, Pekka Enberg wrote:
 
> Hmm. The barrier() in slab_free() looks fishy. The comment says it's
> there to make sure we've retrieved c->freelist before c->page but then
> it uses a _compiler barrier_ which doesn't affect the CPU and the
> reads may still be re-ordered... Not sure if that matters here though.

No, no. The comment says that it's purely there to serialize an 
*interrupt*, and as such, a compiler-only barrier is sufficient (or the 
comment is wrong).

Interrupts are totally ordered within a cpu (of course, in theory a CPU 
might have speculative work etc reordering, but the CPU also guarantees 
that interrupt acts _as_if_ it was exact), so a compiler barrier is 
sufficient.

Of course, if we're talking about interrupts on another CPU, that's a 
different issue, but the fact is, in that case it's not about interrupts 
any more (might as well be other code just running normally on another 
CPU), and a barrier doesn't help, it needs real locking.

So that barrier is fine per se. Of course, the whole code (and/or just the 
comment!) may be buggered, but any CPU SMP-aware barriers shouldn't be 
relevant.

What's much more likely to be an issue is simply the fact that since the 
fastpath now accesses the per-cpu freelist without any locking, if there 
is *any* sequence what-so-ever that does it from another CPU and assumes 
the old locking behaviour, the list will be corrupted. And from a quick 
look-through, I certainly cannot guarantee that isn't the case.

There's still a lot of cases that do direct assignments to c->freelist 
without using a guaranteed atomic sequence. They *should* be safe if it's 
guaranteed that 
 (a) they always run with interrupts disabled
AND
 (b) 'c' is _always_ the current CPU list

but I can't quickly see that guarantee for either.

I'd happily just revert this thing, but it would be really good to have 
confirmation that it seems to matter. But Torsten's partial bisection 
seems to say that the quicklist thing went into -mm before the crash even 
started.

So:
 - it might be something else entirely
 - it might still be the local cmpxchg, just Torsten didn't happen to 
   notice it until later.
 - it might still be the local cmpxchg, but something else changed its 
   patterns to actually make it start triggering.

and in general I don't think we should revert it unless we have stronger 
indications that it really is the problem (eg somebody finds the actual 
bug, or a reporter can confirm that it goes away when the local cmpxchg 
optimization is disabled).

Linus
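
To make the distinction above concrete, here is a small userspace sketch of a
compiler-only barrier next to a full hardware fence. It assumes GCC or Clang.
toy_barrier() uses the same empty-asm-with-memory-clobber idiom as the kernel's
barrier(); toy_smp_mb() is only a stand-in for smp_mb() built on the
__sync_synchronize() builtin. struct toy_cpu_slab and both reader functions are
hypothetical names for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Compiler-only barrier: emits no instruction, but the compiler may not
 * move memory accesses across it. */
#define toy_barrier() __asm__ __volatile__("" ::: "memory")

/* Full fence: also orders the accesses as seen by other CPUs. */
#define toy_smp_mb()  __sync_synchronize()

struct toy_cpu_slab {
    void *freelist;
    void *page;
};

/* Enough when the only concurrent actor is an interrupt on this CPU. */
static void toy_read_local(struct toy_cpu_slab *c, void **freelist, void **page)
{
    assert(c != NULL);
    *freelist = c->freelist;
    toy_barrier();          /* freelist is read before page in program order */
    *page = c->page;
}

/* Same reads with a hardware fence in between. */
static void toy_read_smp(struct toy_cpu_slab *c, void **freelist, void **page)
{
    assert(c != NULL);
    *freelist = c->freelist;
    toy_smp_mb();
    *page = c->page;
}
```

Both functions return the same values in a single-threaded run; the difference
is only in what the compiler and CPU are allowed to reorder. As noted above, if
another CPU can touch the same per-cpu structure, neither barrier replaces
locking.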

Re: Linux 2.6.25-rc2

2008-02-19 Thread Eric Dumazet

On Tue, 19 Feb 2008 09:02:30 -0500
Mathieu Desnoyers [EMAIL PROTECTED] wrote:

 * Pekka Enberg ([EMAIL PROTECTED]) wrote:
  On Feb 19, 2008 8:54 AM, Torsten Kaiser [EMAIL PROTECTED] wrote:
 [ 5282.056415] [ cut here ]
 [ 5282.059757] kernel BUG at lib/list_debug.c:33!
 [ 5282.062055] invalid opcode:  [1] SMP
 [ 5282.062055] CPU 3
   
hm. Your crashes do seem to span multiple subsystems, but it always
seems to be around the SLUB code. Could you try the patch below? The
SLUB code has a new optimization and i'm not 100% sure about it. [the
hack below switches the SLUB optimization off by disabling the CPU
feature it relies on.]
   
Ingo
   
-
 arch/x86/Kconfig |4 
 1 file changed, 4 deletions(-)
   
Index: linux/arch/x86/Kconfig
===
--- linux.orig/arch/x86/Kconfig
+++ linux/arch/x86/Kconfig
@@ -59,10 +59,6 @@ config HAVE_LATENCYTOP_SUPPORT
 config SEMAPHORE_SLEEPERS
def_bool y
   
-config FAST_CMPXCHG_LOCAL
-   bool
-   default y
-
 config MMU
def_bool y
   
  
   $ grep FAST_CMPXCHG_LOCAL */.config
   linux-2.6.24-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
   linux-2.6.24-rc3-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
   linux-2.6.24-rc3-mm2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
   linux-2.6.24-rc6-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
   linux-2.6.24-rc8-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
   linux-2.6.25-rc1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
   linux-2.6.25-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
   linux-2.6.25-rc2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
  
   -rc2-mm1 still worked for me.
  
   Did you mean the new SLUB_FASTPATH?
   $ grep define SLUB_FASTPATH */mm/slub.c
   linux-2.6.25-rc1/mm/slub.c:#define SLUB_FASTPATH
   linux-2.6.25-rc2-mm1/mm/slub.c:#define SLUB_FASTPATH
   linux-2.6.25-rc2/mm/slub.c:#define SLUB_FASTPATH
  
   The 2.6.24-rc3+ mm-kernels did crash for me, but don't seem to contain 
   this...
  
   On the other hand:
   From the crash in 2.6.25-rc2-mm1:
   [59987.116182] RIP  [8029f83d] kmem_cache_alloc_node+0x6d/0xa0
  
   (gdb) list *0x8029f83d
   0x8029f83d is in kmem_cache_alloc_node (mm/slub.c:1646).
   1641            if (unlikely(is_end(object) || !node_match(c, node))) {
   1642                    object = __slab_alloc(s, gfpflags, node, addr, c);
   1643                    break;
   1644            }
   1645            stat(c, ALLOC_FASTPATH);
   1646    } while (cmpxchg_local(&c->freelist, object, object[c->offset])
   1647                                                    != object);
   1648    #else
   1649            unsigned long flags;
   1650
  
   That code is part for SLUB_FASTPATH.
  
   I'm willing to test the patch, but don't know how fast I can find the
   time to do it, so my answer if your patch helps might be delayed until
   the weekend.
  
  Mathieu, Christoph is on vacation and I'm not at all that familiar
  with this cmpxchg_local() optimization, so if you could take a peek at
  this bug report to see if you can spot something obviously wrong with
  it, I would much appreciate that.
 
 Sure,
 
 Initial thoughts :
 
 I'd like to get the complete config causing this bug. I suspect either :
 
 - A race between the lockless algo and an IRQ in a driver allocating
   memory.
 - stat(c, ALLOC_FASTPATH); seems to be using a var++, therefore
   indicating it is not reentrant if IRQs are disabled. Since those are
   only stats, I guess it's ok, but still weird.
 - CPU hotplug problem. 
   http://bugzilla.kernel.org/attachment.cgi?id=14877&action=view shows
   last sysfs file:
   /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
   -- is this linked to a cpu up/down event ?
 
 Since this shows mostly with network card drivers, I think the most
 plausible cause would be an IRQ nesting over kmem_cache_alloc_node and
 calling it.
 
 Will dig further...

I wonder how SLUB_FASTPATH is supposed to work, since it is affected by
the classic ABA problem of lockless algorithms.

cmpxchg_local(&c->freelist, object, object[c->offset]) can succeed
even though an interrupt came in (on this cpu), several allocations were
done, and one free was performed at the end of this interruption, so
'object' was recycled.

c->freelist can then contain the previous value (object), but
object[c->offset] was changed by the IRQ.

We then put an already allocated object back on the freelist.
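
The sequence above can be replayed in a single-threaded userspace toy. This is
a sketch under assumptions, not the SLUB code: the freelist link sits at offset
0 of each object, toy_cmpxchg() is a plain non-atomic stand-in for
cmpxchg_local(), and the "interrupt" is just code that runs between the two
steps of the fast path. All toy_* names and aba_demo() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Each free object stores the pointer to the next free object. */
struct toy_obj {
    struct toy_obj *next;
};

/* Non-atomic stand-in for cmpxchg_local(): returns the old value. */
static struct toy_obj *toy_cmpxchg(struct toy_obj **ptr, struct toy_obj *old,
                                   struct toy_obj *new_val)
{
    struct toy_obj *cur = *ptr;

    if (cur == old)
        *ptr = new_val;
    return cur;
}

static struct toy_obj *toy_alloc(struct toy_obj **freelist)
{
    struct toy_obj *o = *freelist;

    if (o)
        *freelist = o->next;
    return o;
}

static void toy_free(struct toy_obj **freelist, struct toy_obj *o)
{
    o->next = *freelist;
    *freelist = o;
}

/* Returns 1 if the freelist head ends up being a still-allocated object. */
static int aba_demo(void)
{
    struct toy_obj a, b, c;
    struct toy_obj *freelist, *object, *next, *irq_a, *irq_b;

    c.next = NULL;                 /* freelist: a -> b -> c */
    b.next = &c;
    a.next = &b;
    freelist = &a;

    object = freelist;             /* fast path reads head: &a */
    next = object->next;           /* and its successor:    &b */

    /* "Interrupt": two allocations, then one free. */
    irq_a = toy_alloc(&freelist);  /* takes a, head is b */
    irq_b = toy_alloc(&freelist);  /* takes b, head is c; b stays in use */
    toy_free(&freelist, irq_a);    /* a.next = &c, head is a again */
    assert(freelist == &a);

    /* The compare sees the same head pointer and succeeds. */
    if (toy_cmpxchg(&freelist, object, next) != object)
        return 0;

    return freelist == irq_b;      /* head is b, which is still allocated */
}
```

aba_demo() returns 1: the compare-and-exchange only checks that the head
pointer is unchanged, not that the successor read earlier is still valid.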


Re: Linux 2.6.25-rc2

2008-02-19 Thread Ingo Molnar


* Linus Torvalds [EMAIL PROTECTED] wrote:

> So:
>  - it might be something else entirely
>  - it might still be the local cmpxchg, just Torsten didn't happen to 
>    notice it until later.
>  - it might still be the local cmpxchg, but something else changed its 
>    patterns to actually make it start triggering.
> 
> and in general I don't think we should revert it unless we have 
> stronger indications that it really is the problem (eg somebody finds 
> the actual bug, or a reporter can confirm that it goes away when the 
> local cmpxchg optimization is disabled).

yeah - my revert suggestions were all completely conditional on such 
type of test feedback.

Btw., i did trigger occasional SLUB crashes myself starting at around 
-rc1, on the order of one per 200-300 straight random bootups, and 
yesterday i did a 50-bootups series of a specific .config that crashed, 
to try to reproduce one of them but failed - so bisection was not an 
option and i had nothing concrete and repeatable to report either. I had 
a few complete lockups and only 3 usable backtraces - find them below.

Networking features in all of the backtraces - and so does the VFS. All 
of the crashes are on SMP - and given that 50% of the bootups are UP 
this gives us a 1:8 chance hint that this bug is SMP specific. (All the 
crashes are in distccd - that is what this build cluster does mainly so 
it's the main activity of the box - so they dont necessarily indicate 
anything workload specific.)

Earlier today i turned off local-cmpxchg and havent had a crash or hang 
since then - but at 200 bootups and 4-5 crashes in a week that's not 
conclusive yet. I think others might have workloads that trigger this 
bug more often.
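
As a rough check of the numbers above, under the assumption (not stated in the
thread) that bootups are independent and each one crashes with a fixed
probability p while the bug is present:

```latex
% p = assumed per-bootup crash probability, n = number of bootups
P(\text{no crash in } n \text{ bootups}) = (1 - p)^{n}

% with one crash per 200-300 bootups and n = 200:
(1 - 1/300)^{200} \approx 0.51, \qquad (1 - 1/200)^{200} \approx 0.37

% three crashes all landing on SMP when half the bootups are UP,
% if SMP vs. UP made no difference:
(1/2)^{3} = 1/8
```

So 200 clean bootups would be expected roughly 37% to 51% of the time even with
the bug still there, which fits "not conclusive yet", and the 1:8 figure is the
chance of three SMP-only crashes by coincidence.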

Ingo


mercury login: [  582.671916] Oops:  [#1] SMP DEBUG_PAGEALLOC
[  582.672334] 
[  582.672334] Pid: 3776, comm: distccd Not tainted (2.6.25-rc2 #5)
[  582.672334] EIP: 0060:[c0174fda] EFLAGS: 00010246 CPU: 0
[  582.672334] EIP is at kmem_cache_alloc+0x2a/0x90
[  582.672334] EAX:  EBX: 861c ECX: c069ed1c EDX: 01060002
[  582.672334] ESI: c0aeffc8 EDI: c1d11714 EBP: f6eddcdc ESP: f6eddcc4
[  582.672334]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[  582.672334] Process distccd (pid: 3776, ti=f6edc000 task=f508c000 
task.ti=f6edc000)
[  582.672334] Stack: c06a3d48 f6eddce4 0020 861c 066c c0aeffc8 
f6eddcf8 c069ed1c 
[  582.672334] 0020 861c f7ce6580 f7ce6580 f6eddd18 
c045e7bb  
[  582.672334] f7f683e0 861c f52136c0 f7ce6580 f6eddd58 
c0461de5 f508c000 
[  582.672334] Call Trace:
[  582.672334]  [c06a3d48] ? netif_receive_skb+0x2a8/0x320
[  582.672334]  [c069ed1c] ? __alloc_skb+0x2c/0x110
[  582.672334]  [c045e7bb] ? nv_alloc_rx_optimized+0x10b/0x1a0
[  582.672334]  [c0461de5] ? nv_napi_poll+0x1b5/0x730
[  582.672334]  [c06a62cb] ? net_rx_action+0x16b/0x200
[  582.672334]  [c06a61e8] ? net_rx_action+0x88/0x200
[  582.672334]  [c012d713] ? __do_softirq+0x93/0x120
[  582.672334]  [c012d7f7] ? do_softirq+0x57/0x60
[  582.672334]  [c012dcc9] ? irq_exit+0x69/0x80
[  582.672334]  [c0106325] ? do_IRQ+0x45/0x80
[  582.672334]  [c018a2a2] ? d_instantiate+0x42/0x60
[  582.672334]  [c0103fd8] ? common_interrupt+0x28/0x30
[  582.672334]  [c018a2a2] ? d_instantiate+0x42/0x60
[  582.672334]  [c0149e50] ? lock_release+0xc0/0x1b0
[  582.672334]  [c07d0816] ? _spin_unlock+0x16/0x20
[  582.672334]  [c018a2a2] ? d_instantiate+0x42/0x60
[  582.672334]  [c0202a84] ? ext3_add_nondir+0x34/0x50
[  582.672334]  [c0202fde] ? ext3_create+0x9e/0xe0
[  582.672334]  [c0181498] ? vfs_create+0xb8/0x100
[  582.672334]  [c01838c0] ? open_namei+0x4d0/0x5a0
[  582.672334]  [c0136346] ? in_group_p+0x26/0x30
[  582.672334]  [c020cd40] ? ext3_permission+0x0/0x10
[  582.672334]  [c01770c1] ? do_filp_open+0x31/0x50
[  582.672334]  [c07d081d] ? _spin_unlock+0x1d/0x20
[  582.672334]  [c0176e1b] ? get_unused_fd_flags+0xbb/0xe0
[  582.672334]  [c017712d] ? do_sys_open+0x4d/0xf0
[  582.672334]  [c0327894] ? trace_hardirqs_on_thunk+0xc/0x10
[  582.672334]  [c014869d] ? trace_hardirqs_on_caller+0xbd/0x140
[  582.672334]  [c017720c] ? sys_open+0x1c/0x20
[  582.672334]  [c0102fc6] ? sysenter_past_esp+0x5f/0x99
[  582.672334]  ===
[  582.672334] Code: c3 55 89 e5 57 56 89 c6 53 83 ec 0c 8b 4d 04 89 55 f0 64 
a1 04 40 b7 c0 8b 7c 86 64 90 8d 74 26 00 8b 17 f6 c2 01 75 41 8b 47 0c 8b 1c 
82 89 d0 0f b1 1f 39 d0 89 c3 75 e8 66 83 7d f0 00 79 1f 
[  582.672334] EIP: [c0174fda] kmem_cache_alloc+0x2a/0x90 SS:ESP 0068:f6eddcc4
[  582.672343] Kernel panic - not syncing: Fatal exception in interrupt
[  582.673337] Pid: 3776, comm: distccd Tainted: G  D  2.6.25-rc2 #5
[  582.674342]  [c0128516] panic+0x46/0x120
[  582.676335]  [c0104be4] die+0x134/0x150
[  582.678335]  [c01182a8] do_page_fault+0x188/0x610
[  582.680335]  [c06c6016] ? ip_local_deliver+0xf6/0x1c0
[  582.682335]  [c0118120] ? do_page_fault+0x0/0x610
[  582.685334]  [c07d0f82] error_code+0x72/0x80

Re: Linux 2.6.25-rc2

2008-02-19 Thread Torsten Kaiser

On Feb 19, 2008 5:20 PM, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> So:
>  - it might be something else entirely
>  - it might still be the local cmpxchg, just Torsten didn't happen to
>    notice it until later.

My new hackbench-testcase also killed 2.6.24-rc2-mm1, so I really
noticed too late.

>  - it might still be the local cmpxchg, but something else changed its
>    patterns to actually make it start triggering.
>
> and in general I don't think we should revert it unless we have stronger
> indications that it really is the problem (eg somebody finds the actual
> bug, or a reporter can confirm that it goes away when the local cmpxchg
> optimization is disabled).

I tried the following three patches:

switching the barrier() for a smp_mb() in 2.6.25-rc2-mm1:
- crashed

reverting the FASTPATH-patch in 2.6.25-rc2:
- worked

only removed FAST_CMPXCHG_LOCAL from arch/x86/Kconfig
- worked

So all of these tests seem to confirm that the bug is in the new SLUB fastpath.

Torsten
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Linux 2.6.25-rc2

2008-02-19 Thread Mathieu Desnoyers

* Eric Dumazet ([EMAIL PROTECTED]) wrote:
 On Tue, 19 Feb 2008 09:02:30 -0500
 Mathieu Desnoyers [EMAIL PROTECTED] wrote:
 
  * Pekka Enberg ([EMAIL PROTECTED]) wrote:
   On Feb 19, 2008 8:54 AM, Torsten Kaiser [EMAIL PROTECTED] wrote:
  [ 5282.056415] [ cut here ]
  [ 5282.059757] kernel BUG at lib/list_debug.c:33!
  [ 5282.062055] invalid opcode:  [1] SMP
  [ 5282.062055] CPU 3

 hm. Your crashes do seem to span multiple subsystems, but it always
 seems to be around the SLUB code. Could you try the patch below? The
 SLUB code has a new optimization and i'm not 100% sure about it. [the
 hack below switches the SLUB optimization off by disabling the CPU
 feature it relies on.]

 Ingo

 -
  arch/x86/Kconfig |4 
  1 file changed, 4 deletions(-)

 Index: linux/arch/x86/Kconfig
 ===
 --- linux.orig/arch/x86/Kconfig
 +++ linux/arch/x86/Kconfig
 @@ -59,10 +59,6 @@ config HAVE_LATENCYTOP_SUPPORT
  config SEMAPHORE_SLEEPERS
 def_bool y

 -config FAST_CMPXCHG_LOCAL
 -   bool
 -   default y
 -
  config MMU
 def_bool y

   
$ grep FAST_CMPXCHG_LOCAL */.config
linux-2.6.24-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.24-rc3-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.24-rc3-mm2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.24-rc6-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.24-rc8-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.25-rc1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.25-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.25-rc2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
   
-rc2-mm1 still worked for me.
   
Did you mean the new SLUB_FASTPATH?
$ grep define SLUB_FASTPATH */mm/slub.c
linux-2.6.25-rc1/mm/slub.c:#define SLUB_FASTPATH
linux-2.6.25-rc2-mm1/mm/slub.c:#define SLUB_FASTPATH
linux-2.6.25-rc2/mm/slub.c:#define SLUB_FASTPATH
   
The 2.6.24-rc3+ mm-kernels did crash for me, but don't seem to contain 
this...
   
On the other hand:
From the crash in 2.6.25-rc2-mm1:
[59987.116182] RIP  [8029f83d] kmem_cache_alloc_node+0x6d/0xa0
   
(gdb) list *0x8029f83d
0x8029f83d is in kmem_cache_alloc_node (mm/slub.c:1646).
1641            if (unlikely(is_end(object) || !node_match(c, node))) {
1642                    object = __slab_alloc(s, gfpflags, node, addr, c);
1643                    break;
1644            }
1645            stat(c, ALLOC_FASTPATH);
1646    } while (cmpxchg_local(&c->freelist, object, object[c->offset])
1647                                                            != object);
1648    #else
1649            unsigned long flags;
1650
   
That code is part of SLUB_FASTPATH.
   
I'm willing to test the patch, but don't know how fast I can find the
time to do it, so my answer if your patch helps might be delayed until
the weekend.
   
   Mathieu, Christoph is on vacation and I'm not at all that familiar
   with this cmpxchg_local() optimization, so if you could take a peek at
   this bug report to see if you can spot something obviously wrong with
   it, I would much appreciate that.
  
  Sure,
  
  Initial thoughts :
  
  I'd like to get the complete config causing this bug. I suspect either :
  
  - A race between the lockless algo and an IRQ in a driver allocating
memory.
  - stat(c, ALLOC_FASTPATH); seems to be using a var++, therefore
indicating it is not reentrant if IRQs are disabled. Since those are
only stats, I guess it's ok, but still weird.
  - CPU hotplug problem. 
http://bugzilla.kernel.org/attachment.cgi?id=14877&action=view shows
last sysfs file:
/sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
-- is this linked to a cpu up/down event ?
  
  Since this shows mostly with network card drivers, I think the most
  plausible cause would be an IRQ nesting over kmem_cache_alloc_node and
  calling it.
  
  Will dig further...
 
 I wonder how SLUB_FASTPATH is supposed to work, since it is affected by
 a classical ABA problem of lockless algo.
 
 cmpxchg_local(&c->freelist, object, object[c->offset]) can succeed,
 while an interrupt came (on this cpu), and several allocations were done,
 and one free was performed at the end of this interruption, so 'object'
 was recycled.
 
 c->freelist can then contain the previous value (object), but
 object[c->offset] was changed by IRQ.
 
 We then put back in freelist an already allocated object.
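
The sequence described above can be replayed in a single-threaded toy model. This is only a sketch: the types and helpers (struct obj, cas, pop, push, aba_demo) are made up for illustration and are not the mm/slub.c code, cas() merely stands in for cmpxchg_local(), and the "interrupt" is simulated by ordinary calls placed between the two steps of the fast path.

```c
#include <assert.h>
#include <stddef.h>

/* Toy object: while free, it stores a pointer to the next free object. */
struct obj {
    struct obj *next;   /* only meaningful while the object is free */
    int allocated;      /* demo bookkeeping, not part of any real allocator */
};

struct cpu_slab {
    struct obj *freelist;
};

/* Single-threaded stand-in for cmpxchg_local(): returns the old value. */
static struct obj *cas(struct obj **p, struct obj *old, struct obj *new_val)
{
    struct obj *cur = *p;

    if (cur == old)
        *p = new_val;
    return cur;
}

/* Plain alloc/free as the simulated interrupt handler would perform them. */
static struct obj *pop(struct cpu_slab *c)
{
    struct obj *o = c->freelist;

    if (o) {
        c->freelist = o->next;
        o->allocated = 1;
    }
    return o;
}

static void push(struct cpu_slab *c, struct obj *o)
{
    o->allocated = 0;
    o->next = c->freelist;
    c->freelist = o;
}

/* Returns 1 if the freelist ends up pointing at a still-allocated object. */
int aba_demo(void)
{
    struct obj a = {0}, b = {0}, d = {0};
    struct cpu_slab c;
    struct obj *object, *next, *x, *y;

    /* Initial freelist: a -> b -> d */
    d.next = NULL;
    b.next = &d;
    a.next = &b;
    c.freelist = &a;

    /* Fast path, step 1: read the head and its successor. */
    object = c.freelist;    /* a */
    next = object->next;    /* b, about to become stale */

    /* "Interrupt": two allocations, then the first object is freed. */
    x = pop(&c);            /* takes a */
    y = pop(&c);            /* takes b, which stays in use */
    assert(x == &a && y == &b);
    push(&c, x);            /* a is the head again, but a.next is now d */

    /* Fast path, step 2: the compare succeeds because the head is a again. */
    if (cas(&c.freelist, object, next) == object)
        object->allocated = 1;

    /* The freelist now hands out b although its owner never freed it. */
    return c.freelist == &b && b.allocated;
}
```

In this model aba_demo() returns 1: the compare-and-swap cannot tell "head unchanged" from "head changed and changed back", so the stale successor is installed as the new head.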
 

I think you are right. A way to fix this would use the fact that the
freelist is only useful to point to the first free object in a page. We
could change it to an offset rather than an address.

The freelist would become a counter of type long which increments
until

Re: Linux 2.6.25-rc2

2008-02-19 Thread Mathieu Desnoyers

* Pekka Enberg ([EMAIL PROTECTED]) wrote:
> Hi Mathieu,
>
> On Feb 19, 2008 4:02 PM, Mathieu Desnoyers <[EMAIL PROTECTED]> wrote:
> > - stat(c, ALLOC_FASTPATH); seems to be using a var++, therefore
> >   indicating it is not reentrant if IRQs are disabled. Since those are
> >   only stats, I guess it's ok, but still weird.
>
> What is not re-entrant?
>

incrementing the variable with a ++ when interrupts are not disabled.
It's not an atomic add and it's racy. The code within stat() does
exactly this.
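
A minimal sketch of that race, assuming the increment is carried out as a separate load, add and store (a compiler may instead emit a single read-modify-write instruction, in which case a local interrupt cannot split it). The names are hypothetical, and the interrupt is simulated by an ordinary call placed between the load and the store.

```c
#include <assert.h>

struct counter {
    unsigned long value;
};

/* Simulated interrupt handler that bumps the same statistic. */
static void irq_handler(struct counter *c)
{
    c->value++;
}

/*
 * One increment in the interrupted context plus one in the handler
 * should give 2; the interleaving below yields 1 instead.
 */
unsigned long lost_update_demo(void)
{
    struct counter c = { 0 };
    unsigned long tmp;

    tmp = c.value;          /* load: reads 0 */
    irq_handler(&c);        /* interrupt lands here, value becomes 1 */
    assert(c.value == 1);
    tmp = tmp + 1;          /* add: still based on the stale 0 */
    c.value = tmp;          /* store: overwrites the handler's update */

    return c.value;
}
```

The lost increment only skews a statistic, which matches the remark that it is harmless but odd.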

> On Feb 19, 2008 4:02 PM, Mathieu Desnoyers <[EMAIL PROTECTED]> wrote:
> > Since this shows mostly with network card drivers, I think the most
> > plausible cause would be an IRQ nesting over kmem_cache_alloc_node and
> > calling it.
>
> Yes, this can happen. Are you saying it is not safe to be in the
> lockless path when an IRQ triggers?

It should be safe, but I think Eric pointed out the correct problem in his
reply.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

Re: Linux 2.6.25-rc2

2008-02-19 Thread Zhang, Yanmin

On Tue, 2008-02-19 at 17:52 +0200, Pekka Enberg wrote:
 Ingo Molnar wrote:
  * Pekka Enberg [EMAIL PROTECTED] wrote:
  
  Yes, this can happen. Are you saying it is not safe to be in the 
  lockless path when an IRQ triggers?
  Hmm. The barrier() in slab_free() looks fishy. The comment says it's 
  there to make sure we've retrieved c->freelist before c->page but then 
  it uses a _compiler barrier_ which doesn't affect the CPU and the 
  reads may still be re-ordered... Not sure if that matters here though.
  
  find a fix patch for that below - most systems affected seem to be SMP 
  ones.
  
  If this (or my other patch) indeed solves the problem i'd still favor a 
  full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it looks 
  quite un-cooked and quite un-tested for multiple independent reasons.
  
  Sigh, why do i again have to be the messenger who brings the bad news to 
  SLUB land, and again when poor Christoph went on vacation? :-/
  
  Ingo
  
  --
  Subject: SLUB: barrier fix
  From: Ingo Molnar [EMAIL PROTECTED]
  
  ---
   mm/slub.c |2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)
  
  Index: linux/mm/slub.c
  ===
  --- linux.orig/mm/slub.c
  +++ linux/mm/slub.c
  @@ -1862,7 +1862,7 @@ static __always_inline void slab_free(st
  debug_check_no_locks_freed(object, s->objsize);
  do {
  freelist = c->freelist;
  -   barrier();
  +   smp_mb();
  /*
   * If the compiler would reorder the retrieval of c->page to
   * come before c->freelist then an interrupt could
 
 Torsten/Yamin, does this fix things for you? What about reverting commit 
 1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c (SLUB: Alternate fast paths 
 using cmpxchg_local)?
I'm busy with another issue and will test it ASAP. Sorry.

-yanmin



Re: Linux 2.6.25-rc2

2008-02-19 Thread Zhang, Yanmin

On Wed, 2008-02-20 at 08:36 +0800, Zhang, Yanmin wrote:
 On Tue, 2008-02-19 at 17:52 +0200, Pekka Enberg wrote:
  Ingo Molnar wrote:
   * Pekka Enberg [EMAIL PROTECTED] wrote:
   
   Yes, this can happen. Are you saying it is not safe to be in the 
   lockless path when an IRQ triggers?
   Hmm. The barrier() in slab_free() looks fishy. The comment says it's 
   there to make sure we've retrieved c->freelist before c->page but then 
   it uses a _compiler barrier_ which doesn't affect the CPU and the 
   reads may still be re-ordered... Not sure if that matters here though.
   
   find a fix patch for that below - most systems affected seem to be SMP 
   ones.
   
   If this (or my other patch) indeed solves the problem i'd still favor a 
   full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it looks 
   quite un-cooked and quite un-tested for multiple independent reasons.
   
   Sigh, why do i again have to be the messenger who brings the bad news to 
   SLUB land, and again when poor Christoph went on vacation? :-/
   
 Ingo
   
   --
   Subject: SLUB: barrier fix
   From: Ingo Molnar [EMAIL PROTECTED]
   
   ---
mm/slub.c |2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
   
   Index: linux/mm/slub.c
   ===
   --- linux.orig/mm/slub.c
   +++ linux/mm/slub.c
   @@ -1862,7 +1862,7 @@ static __always_inline void slab_free(st
 debug_check_no_locks_freed(object, s->objsize);
 do {
 freelist = c->freelist;
   - barrier();
   + smp_mb();
 /*
  * If the compiler would reorder the retrieval of c->page to
  * come before c->freelist then an interrupt could
  
  Torsten/Yamin, does this fix things for you? What about reverting commit 
  1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c (SLUB: Alternate fast paths 
  using cmpxchg_local)?
 I'm busy in another issue and will test it ASAP. Sorry.
I tested it on my 3 x86-64 machines. The small fix that replaces barrier()
with smp_mb() in slab_free doesn't work. The kernel still crashed at the same place.
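
For reference, the difference between the two barriers in the patch can be sketched in userspace C. The macro and type names are stand-ins rather than kernel code, and GCC/Clang inline-asm syntax is assumed: compiler_barrier() mirrors the idea of barrier() (no instruction emitted, the compiler just may not move memory accesses across it), while full_fence() uses a C11 fence as a rough equivalent of smp_mb().

```c
#include <assert.h>
#include <stdatomic.h>

/* Compiler-only barrier: constrains the compiler, emits no instruction. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Full hardware fence, used here as a userspace stand-in for smp_mb(). */
#define full_fence() atomic_thread_fence(memory_order_seq_cst)

/* Hypothetical per-CPU structure holding the two fields under discussion. */
struct cpu_slab {
    void *freelist;
    void *page;
};

struct snapshot {
    void *freelist;
    void *page;
};

/* Reads freelist, then page; only the compiler is kept from reordering. */
struct snapshot read_compiler_ordered(const struct cpu_slab *c)
{
    struct snapshot s;

    s.freelist = c->freelist;
    compiler_barrier();
    s.page = c->page;
    return s;
}

/* Same reads, but the CPU is also kept from reordering them. */
struct snapshot read_cpu_ordered(const struct cpu_slab *c)
{
    struct snapshot s;

    s.freelist = c->freelist;
    full_fence();
    s.page = c->page;
    return s;
}

/* Without concurrent writers both variants observe identical values. */
int barriers_agree(void)
{
    int a, b;
    struct cpu_slab c = { &a, &b };
    struct snapshot s1 = read_compiler_ordered(&c);
    struct snapshot s2 = read_cpu_ordered(&c);

    assert(s1.freelist == (void *)&a && s1.page == (void *)&b);
    return s1.freelist == s2.freelist && s1.page == s2.page;
}
```

An interrupt handler running on the same CPU observes that CPU's accesses in program order, so if the race is purely CPU-local a compiler barrier would already be enough, and swapping in a full fence would not be expected to change the outcome. That reading is an inference from the test result, not something stated in the thread.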

I will test the reverting patch.

-yanmin



Re: Linux 2.6.25-rc2

2008-02-19 Thread Zhang, Yanmin

On Wed, 2008-02-20 at 10:08 +0800, Zhang, Yanmin wrote:
 On Wed, 2008-02-20 at 08:36 +0800, Zhang, Yanmin wrote:
  On Tue, 2008-02-19 at 17:52 +0200, Pekka Enberg wrote:
   Ingo Molnar wrote:
* Pekka Enberg [EMAIL PROTECTED] wrote:

Yes, this can happen. Are you saying it is not safe to be in the 
lockless path when an IRQ triggers?
Hmm. The barrier() in slab_free() looks fishy. The comment says it's 
there to make sure we've retrieved c->freelist before c->page but then 
it uses a _compiler barrier_ which doesn't affect the CPU and the 
reads may still be re-ordered... Not sure if that matters here though.

find a fix patch for that below - most systems affected seem to be SMP 
ones.

If this (or my other patch) indeed solves the problem i'd still favor a 
full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it 
looks 
quite un-cooked and quite un-tested for multiple independent reasons.

Sigh, why do i again have to be the messenger who brings the bad news 
to 
SLUB land, and again when poor Christoph went on vacation? :-/

Ingo

--
Subject: SLUB: barrier fix
From: Ingo Molnar [EMAIL PROTECTED]

---
 mm/slub.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/mm/slub.c
===
--- linux.orig/mm/slub.c
+++ linux/mm/slub.c
@@ -1862,7 +1862,7 @@ static __always_inline void slab_free(st
debug_check_no_locks_freed(object, s->objsize);
do {
freelist = c->freelist;
-   barrier();
+   smp_mb();
/*
 * If the compiler would reorder the retrieval of c->page to
 * come before c->freelist then an interrupt could
   
   Torsten/Yamin, does this fix things for you? What about reverting commit 
   1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c (SLUB: Alternate fast paths 
   using cmpxchg_local)?
  I'm busy in another issue and will test it ASAP. Sorry.
 I tested it on my 3 x86-64 machines. The small fix to use smp_mb to replace
 barrier in slab_free doesn't work. Kernel still crashed at the same place.
 
 I will test the reverting patch.
The kernel with the revert patch is OK.
I ran reboot/hackbench more than 10 times on each of my 3 x86-64 machines,
and the kernel didn't crash.

-yanmin



Re: Linux 2.6.25-rc2

2008-02-19 Thread Pekka Enberg


On 2/20/2008, Zhang, Yanmin <[EMAIL PROTECTED]> wrote:
> Kernel with the reverting patch is ok.
> I ran reboot/hackbench for more than 10 times on every one of my 3 x86-64
> machines, and kernel didn't crash.

Great, Linus reverted the patch yesterday. Thanks for testing!

Re: Linux 2.6.25-rc2

2008-02-18 Thread Pekka Enberg

On Feb 19, 2008 8:54 AM, Torsten Kaiser <[EMAIL PROTECTED]> wrote:
> > > [ 5282.056415] [ cut here ]
> > > [ 5282.059757] kernel BUG at lib/list_debug.c:33!
> > > [ 5282.062055] invalid opcode:  [1] SMP
> > > [ 5282.062055] CPU 3
> >
> > hm. Your crashes do seem to span multiple subsystems, but it always
> > seems to be around the SLUB code. Could you try the patch below? The
> > SLUB code has a new optimization and i'm not 100% sure about it. [the
> > hack below switches the SLUB optimization off by disabling the CPU
> > feature it relies on.]
> >
> > Ingo
> >
> > ->
> >  arch/x86/Kconfig |4 
> >  1 file changed, 4 deletions(-)
> >
> > Index: linux/arch/x86/Kconfig
> > ===
> > --- linux.orig/arch/x86/Kconfig
> > +++ linux/arch/x86/Kconfig
> > @@ -59,10 +59,6 @@ config HAVE_LATENCYTOP_SUPPORT
> >  config SEMAPHORE_SLEEPERS
> > def_bool y
> >
> > -config FAST_CMPXCHG_LOCAL
> > -   bool
> > -   default y
> > -
> >  config MMU
> > def_bool y
> >
>
> $ grep FAST_CMPXCHG_LOCAL */.config
> linux-2.6.24-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> linux-2.6.24-rc3-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> linux-2.6.24-rc3-mm2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> linux-2.6.24-rc6-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> linux-2.6.24-rc8-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> linux-2.6.25-rc1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> linux-2.6.25-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
> linux-2.6.25-rc2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
>
> -rc2-mm1 still worked for me.
>
> Did you mean the new SLUB_FASTPATH?
> $ grep "define SLUB_FASTPATH" */mm/slub.c
> linux-2.6.25-rc1/mm/slub.c:#define SLUB_FASTPATH
> linux-2.6.25-rc2-mm1/mm/slub.c:#define SLUB_FASTPATH
> linux-2.6.25-rc2/mm/slub.c:#define SLUB_FASTPATH
>
> The 2.6.24-rc3+ mm-kernels did crash for me, but don't seem to contain this...
>
> On the other hand:
> From the crash in 2.6.25-rc2-mm1:
> [59987.116182] RIP  [] kmem_cache_alloc_node+0x6d/0xa0
>
> (gdb) list *0x8029f83d
> 0x8029f83d is in kmem_cache_alloc_node (mm/slub.c:1646).
> 1641            if (unlikely(is_end(object) || !node_match(c, node))) {
> 1642                    object = __slab_alloc(s, gfpflags, node, addr, c);
> 1643                    break;
> 1644            }
> 1645            stat(c, ALLOC_FASTPATH);
> 1646    } while (cmpxchg_local(&c->freelist, object, object[c->offset])
> 1647                                                            != object);
> 1648    #else
> 1649            unsigned long flags;
> 1650
>
> That code is part of SLUB_FASTPATH.
>
> I'm willing to test the patch, but don't know how fast I can find the
> time to do it, so my answer if your patch helps might be delayed until
> the weekend.

Mathieu, Christoph is on vacation and I'm not at all that familiar
with this cmpxchg_local() optimization, so if you could take a peek at
this bug report to see if you can spot something obviously wrong with
it, I would much appreciate that.

Re: Linux 2.6.25-rc2

2008-02-18 Thread Torsten Kaiser

On Feb 19, 2008 7:11 AM, Ingo Molnar <[EMAIL PROTECTED]> wrote:
> * Torsten Kaiser <[EMAIL PROTECTED]> wrote:
> > On Feb 15, 2008 10:23 PM, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> > >
> > > Ok,
> > >  this kernel is a winner.
> >
> > Sadly not for me:
> > [ 5282.056415] [ cut here ]
> > [ 5282.059757] kernel BUG at lib/list_debug.c:33!
> > [ 5282.062055] invalid opcode:  [1] SMP
> > [ 5282.062055] CPU 3
>
> hm. Your crashes do seem to span multiple subsystems, but it always
> seems to be around the SLUB code. Could you try the patch below? The
> SLUB code has a new optimization and i'm not 100% sure about it. [the
> hack below switches the SLUB optimization off by disabling the CPU
> feature it relies on.]
>
> Ingo
>
> ->
>  arch/x86/Kconfig |4 
>  1 file changed, 4 deletions(-)
>
> Index: linux/arch/x86/Kconfig
> ===
> --- linux.orig/arch/x86/Kconfig
> +++ linux/arch/x86/Kconfig
> @@ -59,10 +59,6 @@ config HAVE_LATENCYTOP_SUPPORT
>  config SEMAPHORE_SLEEPERS
> def_bool y
>
> -config FAST_CMPXCHG_LOCAL
> -   bool
> -   default y
> -
>  config MMU
> def_bool y
>

$ grep FAST_CMPXCHG_LOCAL */.config
linux-2.6.24-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.24-rc3-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.24-rc3-mm2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.24-rc6-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.24-rc8-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.25-rc1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.25-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.25-rc2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y

-rc2-mm1 still worked for me.

Did you mean the new SLUB_FASTPATH?
$ grep "define SLUB_FASTPATH" */mm/slub.c
linux-2.6.25-rc1/mm/slub.c:#define SLUB_FASTPATH
linux-2.6.25-rc2-mm1/mm/slub.c:#define SLUB_FASTPATH
linux-2.6.25-rc2/mm/slub.c:#define SLUB_FASTPATH

The 2.6.24-rc3+ mm-kernels did crash for me, but don't seem to contain this...

On the other hand:
From the crash in 2.6.25-rc2-mm1:
[59987.116182] RIP  [] kmem_cache_alloc_node+0x6d/0xa0

(gdb) list *0x8029f83d
0x8029f83d is in kmem_cache_alloc_node (mm/slub.c:1646).
1641            if (unlikely(is_end(object) || !node_match(c, node))) {
1642                    object = __slab_alloc(s, gfpflags, node, addr, c);
1643                    break;
1644            }
1645            stat(c, ALLOC_FASTPATH);
1646    } while (cmpxchg_local(&c->freelist, object, object[c->offset])
1647                                                            != object);
1648    #else
1649            unsigned long flags;
1650

That code is part of SLUB_FASTPATH.

I'm willing to test the patch, but don't know how fast I can find the
time to do it, so my answer if your patch helps might be delayed until
the weekend.

Torsten

Re: Linux 2.6.25-rc2

2008-02-18 Thread Torsten Kaiser

On Feb 19, 2008 12:54 AM, Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
>
> On Sat, 16 Feb 2008, Torsten Kaiser wrote:
> >
> > [ 5282.056415] [ cut here ]
> > [ 5282.059757] kernel BUG at lib/list_debug.c:33!
>
> Is there any chance that you could try to bisect this, if it's repeatable
> enough for you? Even if you can't bisect it *all* the way, it would be
> really good to do a handful of bisection runs which should already
> hopefully narrow it down a bit more.
>
> Linus
>

It's repeatable, but not in a really reliable way.
So to mark a kernel good I need to compile around 100 KDE packages,
and even then I'm not 100% sure whether it's good or I was just lucky.

But I did a partial bisect against 2.6.24-rc6-mm1:
2.6.24-rc6 + mm-patches up to (including) git.nfsd -> worked
2.6.24-rc6 + mm-patches up to (including) git.xfs -> crashed

I think the only patches added between rc2-mm1 and rc3-mm2 in that range
were the iommu changes that I later ruled out.
That leaves some git trees as suspects:
git-ocfs2.patch
git-selinux.patch
git-s390.patch
git-sched.patch
git-sh.patch
git-scsi-misc.patch
git-unionfs.patch
git-v9fs.patch
git-watchdog.patch
git-wireless.patch
git-ipwireless_cs.patch
git-x86.patch
git-xfs.patch

(see http://marc.info/?l=linux-kernel&m=120276641105256 )
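
Narrowing down an ordered list of suspects like the one above is a plain binary search over "how many patches are applied". The sketch below is only an illustration: bisect_series and the crashes_from callback are hypothetical names, and the callback stands in for building, booting and stress-testing a kernel with the first n patches applied.

```c
#include <assert.h>
#include <stddef.h>

/* Returns nonzero if a kernel with the first n_applied patches crashes. */
typedef int (*crashes_fn)(size_t n_applied, void *ctx);

/*
 * Precondition: 0 patches applied is good, all n applied is bad.
 * Returns the zero-based index of the first patch that makes it crash.
 */
size_t bisect_series(size_t n, crashes_fn crashes, void *ctx)
{
    size_t good = 0;    /* longest prefix known to work */
    size_t bad = n;     /* shortest prefix known to crash */

    assert(n > 0);

    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;

        if (crashes(mid, ctx))
            bad = mid;
        else
            good = mid;
    }
    return bad - 1;
}

/* Example oracle: ctx points at the index of the (pretend) culprit. */
int crashes_from(size_t n_applied, void *ctx)
{
    size_t culprit = *(size_t *)ctx;

    return n_applied > culprit;
}
```

For 13 candidates this needs at most 4 test kernels. The catch described above still applies: when a "good" verdict is only probabilistic, a single false "good" sends the search into the wrong half, so each step needs enough test runs to be trusted.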

Torsten

Re: Linux 2.6.25-rc2

2008-02-18 Thread Ingo Molnar


* Torsten Kaiser <[EMAIL PROTECTED]> wrote:

> On Feb 15, 2008 10:23 PM, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> >
> > Ok,
> >  this kernel is a winner.
> 
> Sadly not for me:
> [ 5282.056415] [ cut here ]
> [ 5282.059757] kernel BUG at lib/list_debug.c:33!
> [ 5282.062055] invalid opcode:  [1] SMP
> [ 5282.062055] CPU 3

hm. Your crashes do seem to span multiple subsystems, but it always 
seems to be around the SLUB code. Could you try the patch below? The 
SLUB code has a new optimization and i'm not 100% sure about it. [the 
hack below switches the SLUB optimization off by disabling the CPU 
feature it relies on.]

Ingo

->
 arch/x86/Kconfig |    4 ----
 1 file changed, 4 deletions(-)

Index: linux/arch/x86/Kconfig
===================================================================
--- linux.orig/arch/x86/Kconfig
+++ linux/arch/x86/Kconfig
@@ -59,10 +59,6 @@ config HAVE_LATENCYTOP_SUPPORT
 config SEMAPHORE_SLEEPERS
 	def_bool y
 
-config FAST_CMPXCHG_LOCAL
-	bool
-	default y
-
 config MMU
 	def_bool y
 


Re: Linux 2.6.25-rc2 regression: LVM cannot find volume group

2008-02-18 Thread Alasdair G Kergon

On Sat, Feb 16, 2008 at 11:37:37PM +0100, Jiri Slaby wrote:
> # CONFIG_SYSFS_DEPRECATED is not set

IMHO That should be *set* by default until everyone has had time to
update their userspace software to cope with the changed sysfs layout.

Alasdair
-- 
[EMAIL PROTECTED]

Re: Linux 2.6.25-rc2

2008-02-18 Thread Linus Torvalds

On Sat, 16 Feb 2008, Torsten Kaiser wrote:
>
> [ 5282.056415] [ cut here ]
> [ 5282.059757] kernel BUG at lib/list_debug.c:33!

Is there any chance that you could try to bisect this, if it's repeatable 
enough for you? Even if you can't bisect it *all* the way, it would be 
really good to do a handful of bisection runs which should already 
hopefully narrow it down a bit more.

Linus

Re: [BUG] Linux 2.6.25-rc2 - Kernel Ooops while running dbench

2008-02-18 Thread Frans Pop

Jeff Garzik wrote:
> Two x86-64 boxes here lock up here on 2.6.25-rc2, shortly after boot.
> One running Fedora 8 + X (GNOME) and one a headless file server.
> configs and lspci attached.  Unable to capture any splatter so far.

Sounds like it may be http://lkml.org/lkml/2008/2/17/78.

Suggest you try reverting that before doing the bisect.

Cheers,
FJP

Re: [BUG] Linux 2.6.25-rc2 - Kernel Ooops while running dbench

2008-02-18 Thread Jeff Garzik

Two x86-64 boxes here lock up here on 2.6.25-rc2, shortly after boot. 
One running Fedora 8 + X (GNOME) and one a headless file server. 
configs and lspci attached.  Unable to capture any splatter so far.


Bisecting...


00:00.0 Host bridge: Intel Corporation 82955X Memory Controller Hub
00:01.0 PCI bridge: Intel Corporation 82955X PCI Express Root Port
00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 
(rev 01)
00:1c.4 PCI bridge: Intel Corporation 82801GR/GH/GHM (ICH7 Family) PCI Express 
Port 5 (rev 01)
00:1c.5 PCI bridge: Intel Corporation 82801GR/GH/GHM (ICH7 Family) PCI Express 
Port 6 (rev 01)
00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI 
Controller #1 (rev 01)
00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI 
Controller #2 (rev 01)
00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI 
Controller #3 (rev 01)
00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI 
Controller #4 (rev 01)
00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI 
Controller (rev 01)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev e1)
00:1f.0 ISA bridge: Intel Corporation 82801GB/GR (ICH7 Family) LPC Interface 
Bridge (rev 01)
00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller 
(rev 01)
00:1f.2 SATA controller: Intel Corporation 82801GR/GH (ICH7 Family) SATA AHCI 
Controller (rev 01)
00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller (rev 01)
01:00.0 VGA compatible controller: nVidia Corporation NV44 [Quadro NVS 285] 
(rev a1)
04:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5751 Gigabit 
Ethernet PCI Express (rev 01)
05:02.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5701 Gigabit 
Ethernet (rev 15)
00:00.0 Host bridge: Intel Corporation 82975X Memory Controller Hub
00:01.0 PCI bridge: Intel Corporation 82975X PCI Express Root Port
00:1b.0 Audio device: Intel Corporation 82801G (ICH7 Family) High Definition 
Audio Controller (rev 01)
00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 
(rev 01)
00:1c.4 PCI bridge: Intel Corporation 82801GR/GH/GHM (ICH7 Family) PCI Express 
Port 5 (rev 01)
00:1c.5 PCI bridge: Intel Corporation 82801GR/GH/GHM (ICH7 Family) PCI Express 
Port 6 (rev 01)
00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI 
Controller #1 (rev 01)
00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI 
Controller #2 (rev 01)
00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI 
Controller #3 (rev 01)
00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI 
Controller #4 (rev 01)
00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI 
Controller (rev 01)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev e1)
00:1f.0 ISA bridge: Intel Corporation 82801GH (ICH7DH) LPC Interface Bridge 
(rev 01)
00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller 
(rev 01)
00:1f.2 SATA controller: Intel Corporation 82801GR/GH (ICH7 Family) SATA AHCI 
Controller (rev 01)
00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller (rev 01)
01:00.0 VGA compatible controller: ATI Technologies Inc R580 [Radeon X1900 XT] 
(Primary)
01:00.1 Display controller: ATI Technologies Inc R580 [Radeon X1900 XT] 
(Secondary)
02:00.0 Multimedia controller: Philips Semiconductors Unknown device 7162
04:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet 
Controller
05:02.0 Network controller: RaLink RT2561/RT61 802.11g PCI
05:04.0 FireWire (IEEE 1394): Texas Instruments TSB43AB23 IEEE-1394a-2000 
Controller (PHY/Link)
05:05.0 RAID bus controller: Silicon Image, Inc. SiI 3114 [SATALink/SATARaid] 
Serial ATA Controller (rev 02)


pretzel.bz2
Description: application/bzip


core.bz2
Description: application/bzip

Re: [BUG] Linux 2.6.25-rc2 - Kernel Ooops while running dbench

2008-02-18 Thread Andrew Morton

On Sat, 16 Feb 2008 11:14:46 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> wrote:

> The 2.6.25-rc2 kernel oopses while running dbench on ext3 filesystem
> mounted with mount -o data=writeback,nobh option on the x86_64 box
> 
> BUG: unable to handle kernel NULL pointer dereference at 
> IP: [] kmem_cache_alloc+0x3a/0x6c
> PGD 1f6860067 PUD 1f5d64067 PMD 0 
> Oops:  [1] SMP 
> CPU 3 
> Modules linked in:
> Pid: 4271, comm: dbench Not tainted 2.6.25-rc2-autotest #1
> RIP: 0010:[]  [] 
> kmem_cache_alloc+0x3a/0x6c
> RSP: :8101fb041dc8  EFLAGS: 00010246
> RAX:  RBX: 810180033c00 RCX: 8027b269
> RDX:  RSI: 80d0 RDI: 80632d70
> RBP: 80d0 R08: 0001 R09: 
> R10: 8101feb36e50 R11: 0190 R12: 0001
> R13:  R14: 8101f8f38000 R15: ff9c
> FS:  () GS:8101fff0f000(0063) knlGS:f7e41460
> CS:  0010 DS: 002b ES: 002b CR0: 80050033
> CR2:  CR3: 0001f562 CR4: 06e0
> DR0:  DR1:  DR2: 
> DR3:  DR6: 0ff0 DR7: 0400
> Process dbench (pid: 4271, threadinfo 8101fb04, task 8101fb18)
> Stack:  0001 8101fb041ea8 0001 8027b269
>  8101fb041ea8 80281fe8 0001 
>  8101fb041ea8 ff9c 000b 0001
> Call Trace:
>  [] get_empty_filp+0x55/0xf9
>  [] __path_lookup_intent_open+0x22/0x8f
>  [] open_namei+0x86/0x5a7
>  [] vfs_stat_fd+0x3c/0x4a
>  [] do_filp_open+0x1c/0x3d
>  [] get_unused_fd_flags+0x79/0x111
>  [] do_sys_open+0x46/0xca
>  [] ia32_sysret+0x0/0xa
> 

Looks to me like we broke slab.  Christoph is offline until the 27th..

Re: Linux 2.6.25-rc2 regression: LVM cannot find volume group

2008-02-18 Thread Tilman Schmidt


On 17.02.2008, Jeff Chua wrote:
> I faced the same problem, but resolved with ...
>
> vgscan
> vgchange -a y

Sorry, I'm not sure what to do with those two commands.
Running them once manually doesn't seem to change anything,
and my initrd already contains them AFAICS.

> Also, ensure you set "write_cache_state = 1" in /etc/lvm.conf before
> running the above.

That was already set by default.

Thanks,
Tilman

--
Tilman Schmidt  E-Mail: [EMAIL PROTECTED]
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)




Re: [BUG] Linux 2.6.25-rc2 - Kernel Ooops while running dbench

2008-02-18 Thread Jeff Garzik

Two x86-64 boxes here lock up here on 2.6.25-rc2, shortly after boot. 
One running Fedora 8 + X (GNOME) and one a headless file server. 
configs and lspci attached.  Unable to capture any splatter so far.


Bisecting...


00:00.0 Host bridge: Intel Corporation 82955X Memory Controller Hub
00:01.0 PCI bridge: Intel Corporation 82955X PCI Express Root Port
00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 
(rev 01)
00:1c.4 PCI bridge: Intel Corporation 82801GR/GH/GHM (ICH7 Family) PCI Express 
Port 5 (rev 01)
00:1c.5 PCI bridge: Intel Corporation 82801GR/GH/GHM (ICH7 Family) PCI Express 
Port 6 (rev 01)
00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI 
Controller #1 (rev 01)
00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI 
Controller #2 (rev 01)
00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI 
Controller #3 (rev 01)
00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI 
Controller #4 (rev 01)
00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI 
Controller (rev 01)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev e1)
00:1f.0 ISA bridge: Intel Corporation 82801GB/GR (ICH7 Family) LPC Interface 
Bridge (rev 01)
00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller 
(rev 01)
00:1f.2 SATA controller: Intel Corporation 82801GR/GH (ICH7 Family) SATA AHCI 
Controller (rev 01)
00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller (rev 01)
01:00.0 VGA compatible controller: nVidia Corporation NV44 [Quadro NVS 285] 
(rev a1)
04:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5751 Gigabit 
Ethernet PCI Express (rev 01)
05:02.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5701 Gigabit 
Ethernet (rev 15)
00:00.0 Host bridge: Intel Corporation 82975X Memory Controller Hub
00:01.0 PCI bridge: Intel Corporation 82975X PCI Express Root Port
00:1b.0 Audio device: Intel Corporation 82801G (ICH7 Family) High Definition 
Audio Controller (rev 01)
00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 
(rev 01)
00:1c.4 PCI bridge: Intel Corporation 82801GR/GH/GHM (ICH7 Family) PCI Express 
Port 5 (rev 01)
00:1c.5 PCI bridge: Intel Corporation 82801GR/GH/GHM (ICH7 Family) PCI Express 
Port 6 (rev 01)
00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI 
Controller #1 (rev 01)
00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI 
Controller #2 (rev 01)
00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI 
Controller #3 (rev 01)
00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI 
Controller #4 (rev 01)
00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI 
Controller (rev 01)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev e1)
00:1f.0 ISA bridge: Intel Corporation 82801GH (ICH7DH) LPC Interface Bridge 
(rev 01)
00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller 
(rev 01)
00:1f.2 SATA controller: Intel Corporation 82801GR/GH (ICH7 Family) SATA AHCI 
Controller (rev 01)
00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller (rev 01)
01:00.0 VGA compatible controller: ATI Technologies Inc R580 [Radeon X1900 XT] 
(Primary)
01:00.1 Display controller: ATI Technologies Inc R580 [Radeon X1900 XT] 
(Secondary)
02:00.0 Multimedia controller: Philips Semiconductors Unknown device 7162
04:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet 
Controller
05:02.0 Network controller: RaLink RT2561/RT61 802.11g PCI
05:04.0 FireWire (IEEE 1394): Texas Instruments TSB43AB23 IEEE-1394a-2000 
Controller (PHY/Link)
05:05.0 RAID bus controller: Silicon Image, Inc. SiI 3114 [SATALink/SATARaid] 
Serial ATA Controller (rev 02)


pretzel.bz2
Description: application/bzip


core.bz2
Description: application/bzip

Re: [BUG] Linux 2.6.25-rc2 - Kernel Ooops while running dbench

2008-02-18 Thread Frans Pop

Jeff Garzik wrote:
> Two x86-64 boxes here lock up here on 2.6.25-rc2, shortly after boot.
> One running Fedora 8 + X (GNOME) and one a headless file server.
> configs and lspci attached.  Unable to capture any splatter so far.

Sounds like it may be http://lkml.org/lkml/2008/2/17/78.

Suggest you try reverting that before doing the bisect.

Cheers,
FJP

Re: Linux 2.6.25-rc2

2008-02-18 Thread Linus Torvalds



On Sat, 16 Feb 2008, Torsten Kaiser wrote:
>
> [ 5282.056415] [ cut here ]
> [ 5282.059757] kernel BUG at lib/list_debug.c:33!

Is there any chance that you could try to bisect this, if it's repeatable 
enough for you? Even if you can't bisect it *all* the way, it would be 
really good to do a handful of bisection runs which should already 
hopefully narrow it down a bit more.

Linus

Re: Linux 2.6.25-rc2 regression: LVM cannot find volume group

2008-02-18 Thread Alasdair G Kergon

On Sat, Feb 16, 2008 at 11:37:37PM +0100, Jiri Slaby wrote:
> # CONFIG_SYSFS_DEPRECATED is not set

IMHO That should be *set* by default until everyone has had time to
update their userspace software to cope with the changed sysfs layout.

Alasdair
-- 
[EMAIL PROTECTED]

Re: Linux 2.6.25-rc2

2008-02-18 Thread Ingo Molnar


* Torsten Kaiser <[EMAIL PROTECTED]> wrote:

> On Feb 15, 2008 10:23 PM, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> >
> > Ok,
> >  this kernel is a winner.
>
> Sadly not for me:
> [ 5282.056415] [ cut here ]
> [ 5282.059757] kernel BUG at lib/list_debug.c:33!
> [ 5282.062055] invalid opcode:  [1] SMP
> [ 5282.062055] CPU 3

hm. Your crashes do seem to span multiple subsystems, but it always 
seems to be around the SLUB code. Could you try the patch below? The 
SLUB code has a new optimization and i'm not 100% sure about it. [the 
hack below switches the SLUB optimization off by disabling the CPU 
feature it relies on.]

Ingo

-
 arch/x86/Kconfig |    4 ----
 1 file changed, 4 deletions(-)

Index: linux/arch/x86/Kconfig
===================================================================
--- linux.orig/arch/x86/Kconfig
+++ linux/arch/x86/Kconfig
@@ -59,10 +59,6 @@ config HAVE_LATENCYTOP_SUPPORT
 config SEMAPHORE_SLEEPERS
 	def_bool y
 
-config FAST_CMPXCHG_LOCAL
-	bool
-	default y
-
 config MMU
 	def_bool y
 


Re: Linux 2.6.25-rc2

2008-02-18 Thread Torsten Kaiser

On Feb 19, 2008 12:54 AM, Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
>
> On Sat, 16 Feb 2008, Torsten Kaiser wrote:
> >
> > [ 5282.056415] [ cut here ]
> > [ 5282.059757] kernel BUG at lib/list_debug.c:33!
>
> Is there any chance that you could try to bisect this, if it's repeatable
> enough for you? Even if you can't bisect it *all* the way, it would be
> really good to do a handful of bisection runs which should already
> hopefully narrow it down a bit more.
>
> Linus


It's repeatable, but not in a really reliable way.
So to mark a kernel good I need to compile around 100 KDE packages,
and even then I'm not 100% sure, if it's good or if I was just lucky.

But I did a partial bisect against 2.6.24-rc6-mm1:
2.6.24-rc6 + mm-patches up to (including) git.nfsd - worked
2.6.24-rc6 + mm-patches up to (including) git.xfs - crashed

I think the only patches added between rc2-mm1 and rc3-mm2 in that range
were the iommu changes, which I later ruled out.
That leaves some git trees as suspects:
git-ocfs2.patch
git-selinux.patch
git-s390.patch
git-sched.patch
git-sh.patch
git-scsi-misc.patch
git-unionfs.patch
git-v9fs.patch
git-watchdog.patch
git-wireless.patch
git-ipwireless_cs.patch
git-x86.patch
git-xfs.patch

(see http://marc.info/?l=linux-kernel&m=120276641105256 )

Torsten
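
The partial bisect described above, over an ordered patch series with a crash that does not reproduce reliably, can be sketched as follows. This is only an illustration under stated assumptions: the series is treated as linear (every prefix of it can be built and booted), and `crash_test_fn`, `build_is_bad`, `bisect_series` and the stand-in callback `crashes_from_patch` are hypothetical names, not part of any kernel or git tooling. Repeating the test several times per step lowers the chance of marking a bad build as good, but cannot rule it out.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical callback: returns nonzero if a kernel built with the first
 * 'patches_applied' patches of the series crashes during one test run.
 */
typedef int (*crash_test_fn)(size_t patches_applied, void *ctx);

/* Deterministic stand-in for a real test: crashes once patch *ctx is applied. */
static int crashes_from_patch(size_t patches_applied, void *ctx)
{
    return patches_applied >= *(const size_t *)ctx;
}

/* A build counts as bad if any of 'runs' independent test runs crashes. */
static int build_is_bad(crash_test_fn test, void *ctx, size_t n, unsigned runs)
{
    for (unsigned i = 0; i < runs; i++)
        if (test(n, ctx))
            return 1;
    return 0;
}

/*
 * Precondition: the build with 'good' patches works, the build with 'bad'
 * patches crashes, and good < bad.
 * Returns the 1-based position of the first patch that makes the build bad.
 */
static size_t bisect_series(crash_test_fn test, void *ctx,
                            size_t good, size_t bad, unsigned runs)
{
    assert(good < bad);
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;

        if (build_is_bad(test, ctx, mid, runs))
            bad = mid;      /* the culprit is at or before mid */
        else
            good = mid;     /* the culprit is after mid */
    }
    return bad;
}
```

With the 13 suspect trees listed above, such a search needs about four build-and-test steps, each of which would still mean a long compile workload before a build could be called good.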

Re: Linux 2.6.25-rc2

2008-02-18 Thread Torsten Kaiser

On Feb 19, 2008 7:11 AM, Ingo Molnar <[EMAIL PROTECTED]> wrote:
> * Torsten Kaiser <[EMAIL PROTECTED]> wrote:
> > On Feb 15, 2008 10:23 PM, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> > >
> > > Ok,
> > >  this kernel is a winner.
> >
> > Sadly not for me:
> > [ 5282.056415] [ cut here ]
> > [ 5282.059757] kernel BUG at lib/list_debug.c:33!
> > [ 5282.062055] invalid opcode:  [1] SMP
> > [ 5282.062055] CPU 3
>
> hm. Your crashes do seem to span multiple subsystems, but it always
> seems to be around the SLUB code. Could you try the patch below? The
> SLUB code has a new optimization and i'm not 100% sure about it. [the
> hack below switches the SLUB optimization off by disabling the CPU
> feature it relies on.]
>
> Ingo
>
> -
>  arch/x86/Kconfig |    4 ----
>  1 file changed, 4 deletions(-)
>
> Index: linux/arch/x86/Kconfig
> ===================================================================
> --- linux.orig/arch/x86/Kconfig
> +++ linux/arch/x86/Kconfig
> @@ -59,10 +59,6 @@ config HAVE_LATENCYTOP_SUPPORT
>  config SEMAPHORE_SLEEPERS
>         def_bool y
>
> -config FAST_CMPXCHG_LOCAL
> -       bool
> -       default y
> -
>  config MMU
>         def_bool y


$ grep FAST_CMPXCHG_LOCAL */.config
linux-2.6.24-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.24-rc3-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.24-rc3-mm2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.24-rc6-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.24-rc8-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.25-rc1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.25-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y
linux-2.6.25-rc2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y

-rc2-mm1 still worked for me.

Did you mean the new SLUB_FASTPATH?
$ grep "define SLUB_FASTPATH" */mm/slub.c
linux-2.6.25-rc1/mm/slub.c:#define SLUB_FASTPATH
linux-2.6.25-rc2-mm1/mm/slub.c:#define SLUB_FASTPATH
linux-2.6.25-rc2/mm/slub.c:#define SLUB_FASTPATH

The 2.6.24-rc3+ mm-kernels did crash for me, but don't seem to contain this...

On the other hand:
From the crash in 2.6.25-rc2-mm1:
[59987.116182] RIP  [8029f83d] kmem_cache_alloc_node+0x6d/0xa0

(gdb) list *0x8029f83d
0x8029f83d is in kmem_cache_alloc_node (mm/slub.c:1646).
1641                    if (unlikely(is_end(object) || !node_match(c, node))) {
1642                            object = __slab_alloc(s, gfpflags, node, addr, c);
1643                            break;
1644                    }
1645                    stat(c, ALLOC_FASTPATH);
1646            } while (cmpxchg_local(&c->freelist, object, object[c->offset])
1647                                                            != object);
1648    #else
1649            unsigned long flags;
1650

That code is part of SLUB_FASTPATH.

I'm willing to test the patch, but don't know how fast I can find the
time to do it, so my answer if your patch helps might be delayed until
the weekend.

Torsten
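
The retry loop in the listing above takes the head of a free list and swaps in its successor with a compare-and-exchange. A minimal userspace sketch of that pattern follows. It is not the SLUB code: `struct cpu_cache`, `fastpath_alloc` and `fastpath_free` are hypothetical names, and C11 atomics are used as a stand-in for the kernel's `cmpxchg_local()`, which is only atomic with respect to the current CPU. The sketch also ignores the ABA problem that a real lockless free list has to consider.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* A free object keeps the pointer to the next free object in its first word. */
struct free_object {
    struct free_object *next;
};

/* Hypothetical per-CPU allocator state: only the head of the free list. */
struct cpu_cache {
    _Atomic(struct free_object *) freelist;
};

/*
 * Pop one object with a compare-and-exchange retry loop.
 * Returns NULL when the list is empty; a real allocator would take its
 * slow path at that point.
 */
static struct free_object *fastpath_alloc(struct cpu_cache *c)
{
    struct free_object *object = atomic_load(&c->freelist);

    while (object != NULL) {
        /* On failure 'object' is reloaded with the current head; retry. */
        if (atomic_compare_exchange_weak(&c->freelist, &object, object->next))
            return object;
    }
    return NULL;
}

/* Push an object back onto the free list with the same retry pattern. */
static void fastpath_free(struct cpu_cache *c, struct free_object *object)
{
    struct free_object *head = atomic_load(&c->freelist);

    do {
        object->next = head;
    } while (!atomic_compare_exchange_weak(&c->freelist, &head, object));
}
```

The exchange only succeeds if the head is still the value that was read, so a free list whose links were corrupted elsewhere would be dereferenced here first, which may be why crashes with unrelated causes show up in this function.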

Re: Linux 2.6.25-rc2 regression: LVM cannot find volume group

2008-02-17 Thread Jeff Chua

On Feb 18, 2008 8:57 AM, Tilman Schmidt <[EMAIL PROTECTED]> wrote:
> Am 16.02.2008 23:37 schrieb Jiri Slaby:
> > On 02/16/2008 09:12 PM, Alan Cox wrote:
> > Try to upgrade to at least lvm 2.02.29 (I guess this is the first version
> > which understands the new sysfs layout).
> I'll have to investigate how to do that without breaking anything.

I faced the same problem, but resolved with ...

vgscan
vgchange -a y

Also, ensure you set "write_cache_state = 1" in /etc/lvm.conf before
running the above.

Let me know if this helps.


Thanks,
Jeff.

Re: Linux 2.6.25-rc2 regression: LVM cannot find volume group

2008-02-17 Thread Tilman Schmidt


On 16.02.2008 23:37, Jiri Slaby wrote:
> On 02/16/2008 09:12 PM, Alan Cox wrote:
> > On Sat, 16 Feb 2008 20:14:30 +0100
> > Tilman Schmidt <[EMAIL PROTECTED]> wrote:
> >
> > > 2.6.25-rc2 fails to bring up my openSUSE 10.3 PC because LVM
> > > cannot find the volume group containing the root file system.
> > > 2.6.25-rc1 has the same problem, 2.6.24 works fine.
> > >
> > > Bisection says:
> > >
> > > edfaa7c36574f1bf09c65ad602412db9da5f96bf is first bad commit
> > > commit edfaa7c36574f1bf09c65ad602412db9da5f96bf
> > > Author: Kay Sievers <[EMAIL PROTECTED]>
> > > Date:   Mon May 21 22:08:01 2007 +0200
> > >
> > >     Driver core: convert block from raw kobjects to core devices
> > >
> > >     This moves the block devices to /sys/class/block. It will create a
> > >     flat list of all block devices, with the disks and partitions in one
> > >     directory. For compatibility /sys/block is created and contains symlinks
> > >     to the disks.
> > >
> > > Apparently, compatibility is in the eye of the beholder - in this
> > > case, LVM.
> >
> > Compile in SCSI disk support. Modular even if loaded in initrd it seems
> > to have broken somewhere.

Setting

CONFIG_SCSI=y
CONFIG_BLK_DEV_SD=y

does not help. The problem persists.

> # CONFIG_SYSFS_DEPRECATED is not set
>
> I would suspect this.

Setting

CONFIG_SYSFS_DEPRECATED=y

does indeed fix the problem and allows me to boot successfully.
Pity, I was so happy getting rid of that a couple of releases ago.

> Try to upgrade to at least lvm 2.02.29 (I guess this is the first version which
> understands the new sysfs layout).

I'll have to investigate how to do that without breaking anything.

HTH
T.
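
The commit message quoted above names two sysfs locations for block devices: the new flat /sys/class/block and the compatibility directory /sys/block. A tool that wants to cope with both layouts could probe them in order, roughly as sketched below. This is an assumption-laden illustration, not how LVM scans devices; `count_dir_entries` and `pick_block_dir` are hypothetical helpers, and only POSIX `opendir`/`readdir`/`closedir` are used.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>

/* Count the non-hidden entries of a directory; -1 if it cannot be opened. */
static int count_dir_entries(const char *path)
{
    DIR *dir = opendir(path);
    struct dirent *de;
    int count = 0;

    if (dir == NULL)
        return -1;
    while ((de = readdir(dir)) != NULL)
        if (de->d_name[0] != '.')
            count++;
    closedir(dir);
    return count;
}

/*
 * Try each candidate directory in order and return the first one that can
 * be opened, storing its entry count in *count. Returns NULL (and leaves
 * *count untouched) if none of the candidates exists.
 */
static const char *pick_block_dir(const char *const candidates[], size_t n,
                                  int *count)
{
    for (size_t i = 0; i < n; i++) {
        int c = count_dir_entries(candidates[i]);

        if (c >= 0) {
            *count = c;
            return candidates[i];
        }
    }
    return NULL;
}
```

A caller would pass { "/sys/class/block", "/sys/block" } as the candidate list. In the flat directory, disks and partitions appear side by side, so code written for the older per-disk hierarchy would have to be adapted rather than just pointed at the new path.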

--
Tilman Schmidt  E-Mail: [EMAIL PROTECTED]
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)




Re: Linux 2.6.25-rc2

2008-02-17 Thread Torsten Kaiser

On Feb 17, 2008 9:25 PM, Rafael J. Wysocki <[EMAIL PROTECTED]> wrote:
> There's the Bugzilla entry for it at
> http://bugzilla.kernel.org/show_bug.cgi?id=9973

Thank you.

> Please update it with the current information.

Crash for 2.6.25-rc2-mm1 added. That one had a complete stacktrace,
but the trace looks like others I already reported, so no real new
information... :-(

Torsten

Re: Linux 2.6.25-rc2

2008-02-17 Thread Rafael J. Wysocki

On Saturday, 16 of February 2008, Torsten Kaiser wrote:
> On Feb 15, 2008 10:23 PM, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> >
> > Ok,
> >  this kernel is a winner.
> 
> Sadly not for me:
> [ 5282.056415] [ cut here ]
> [ 5282.059757] kernel BUG at lib/list_debug.c:33!
> [ 5282.062055] invalid opcode:  [1] SMP
> [ 5282.062055] CPU 3
> [ 5282.062055] Modules linked in: radeon drm w83792d ipv6 tuner
> tea5767 tda8290 tuner_xc2028 tda9887 tuner_simple mt20xx tea5761
> tvaudio msp3400 bttv videodev v4l1_compat ir_common compat_ioctl32
> v4l2_common videobuf_dma_sg videobuf_core btcx_risc tveeprom usbhid
> pata_amd i2c_nforce2 hid sg
> [ 5282.062055] Pid: 12937, comm: sed Not tainted 2.6.25-rc2 #1
> [ 5282.062055] RIP: 0010:[803bffe4]
> -> then the output from the serial console stopped. I was in X, so I
> could not see, if there was anything more on the real console.
> 
> (gdb) list *0x803bffe4
> 0x803bffe4 is in __list_add (lib/list_debug.c:33).
> 28              }
> 29              if (unlikely(prev->next != next)) {
> 30                      printk(KERN_ERR "list_add corruption. prev->next should be "
> 31                              "next (%p), but was %p. (prev=%p).\n",
> 32                              next, prev->next, prev);
> 33                      BUG();
> 34              }
> 35              next->prev = new;
> 36              new->next = next;
> 37              new->prev = prev;
> 
> For more on this problem see 
> http://marc.info/?l=linux-kernel&m=120293042005445

There's the Bugzilla entry for it at
http://bugzilla.kernel.org/show_bug.cgi?id=9973

Please update it with the current information.

Thanks,
Rafael
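
The check that fires in the quoted gdb listing verifies, before linking a new element, that the two neighbours still point at each other. A userspace sketch of the same idea is given below. It is modelled on the quoted lines but is not the kernel source: `struct list_node`, `list_init` and `checked_list_add` are hypothetical names, and the function returns false where the kernel's debug variant calls BUG().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Minimal doubly linked list node, in the style of the kernel's list_head. */
struct list_node {
    struct list_node *next;
    struct list_node *prev;
};

/* An empty list is a head that points at itself in both directions. */
static void list_init(struct list_node *head)
{
    head->next = head;
    head->prev = head;
}

/*
 * Insert new_node between prev and next, but only if the two neighbours
 * are still consistent with each other. Reports the mismatch and returns
 * false instead of halting.
 */
static bool checked_list_add(struct list_node *new_node,
                             struct list_node *prev,
                             struct list_node *next)
{
    if (next->prev != prev) {
        fprintf(stderr, "list_add corruption. next->prev should be "
                "prev (%p), but was %p.\n",
                (void *)prev, (void *)next->prev);
        return false;
    }
    if (prev->next != next) {
        fprintf(stderr, "list_add corruption. prev->next should be "
                "next (%p), but was %p. (prev=%p).\n",
                (void *)next, (void *)prev->next, (void *)prev);
        return false;
    }
    next->prev = new_node;
    new_node->next = next;
    new_node->prev = prev;
    prev->next = new_node;
    return true;
}
```

A failure of this check says that some earlier code damaged the list; the function that trips over it is usually a victim rather than the cause, which fits crashes that seem to span several subsystems.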

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-17 Thread Rafael J. Wysocki

On Saturday, 16 of February 2008, Kamalesh Babulal wrote:
> Hi,

Hi,
 
> The softlockup is seen from 2.6.25-rc1-git{1,3} and is visible in the 
> 2.6.24-rc2 kernel,
> While booting up with the 2.6.25-rc1-git{1,3} and 2.6.25-rc2 kernel(s) on the 
> powerbox

Can you update the Bugzilla entry at:
http://bugzilla.kernel.org/show_bug.cgi?id=9948
with the above information, please?

Rafael


> Loading st.ko module
> BUG: soft lockup - CPU#1 stuck for 61s! [insmod:379]
> NIP: c01b0620 LR: c01a5dcc CTR: 0040
> REGS: c0077caab8a0 TRAP: 0901   Not tainted  (2.6.25-rc2-autotest)
> MSR: 80009032 <EE,ME,IR,DR>  CR: 84004088  XER: 2000
> TASK = c0077cb450a0[379] 'insmod' THREAD: c0077caa8000 CPU: 1
> GPR00: c0077c9d4000 c0077caabb20 c0538a40 000b 
> GPR04: ffc0 c0077e0c 0036 000a 
> GPR08: 0040 c0077c9d4250 c000  
> GPR12: c0077c9d4230 c0481d00 
> NIP [c01b0620] .radix_tree_gang_lookup+0x100/0x1e4
> LR [c01a5dcc] .call_for_each_cic+0x50/0x10c
> Call Trace:
> [c0077caabb20] [c01a5e2c] .call_for_each_cic+0xb0/0x10c 
> (unreliable)
> [c0077caabc60] [c019dba4] .exit_io_context+0xf0/0x110
> [c0077caabcf0] [c0061e38] .do_exit+0x820/0x850
> [c0077caabda0] [c0061f34] .do_group_exit+0xcc/0xe8
> [c0077caabe30] [c000872c] syscall_exit+0x0/0x40
> Instruction dump:
> 7d296214 39290018 e809 7caa2038 39290008 2fa0 409e0018 7caa4215 
> 396b0001 418200cc 424000b8 4bdc <79691f24> 7d296214 e9690018 2fab 
> INFO: task insmod:387 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> insmodD 1000e144 12144   387  1
> Call Trace:
> [c0077cb97600] [c08fae80] 0xc08fae80 (unreliable)
> [c0077cb977d0] [c0010c7c] .__switch_to+0x11c/0x154
> [c0077cb97860] [c0344498] .schedule+0x5d0/0x6b0
> [c0077cb97950] [c03447d8] .schedule_timeout+0x3c/0xe8
> [c0077cb97a20] [c0343d34] .wait_for_common+0x150/0x22c
> [c0077cb97ae0] [c008ef00] .__stop_machine_run+0xbc/0xf0
> [c0077cb97bb0] [c008ef70] .stop_machine_run+0x3c/0x80
> [c0077cb97c50] [c00891f0] .sys_init_module+0x14e4/0x1af4
> [c0077cb97e30] [c000872c] syscall_exit+0x0/0x40
> -- 0:conmux-control -- time-stamp -- Feb/15/08 16:04:12 --
> INFO: task insmod:387 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> insmodD 1000e144 12144   387  1
> Call Trace:
> [c0077cb97600] [c08fae80] 0xc08fae80 (unreliable)
> [c0077cb977d0] [c0010c7c] .__switch_to+0x11c/0x154
> [c0077cb97860] [c0344498] .schedule+0x5d0/0x6b0
> [c0077cb97950] [c03447d8] .schedule_timeout+0x3c/0xe8
> [c0077cb97a20] [c0343d34] .wait_for_common+0x150/0x22c
> [c0077cb97ae0] [c008ef00] .__stop_machine_run+0xbc/0xf0
> [c0077cb97bb0] [c008ef70] .stop_machine_run+0x3c/0x80
> [c0077cb97c50] [c00891f0] .sys_init_module+0x14e4/0x1af4
> [c0077cb97e30] [c000872c] syscall_exit+0x0/0x40
> -- 0:conmux-control -- time-stamp -- Feb/15/08 16:06:21 --



-- 
"Premature optimization is the root of all evil." - Donald Knuth

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-17 Thread Jens Axboe

On Sat, Feb 16 2008, Kamalesh Babulal wrote:
> Hi,
> 
> The softlockup is seen from 2.6.25-rc1-git{1,3} and is visible in the 
> 2.6.24-rc2 kernel,
> While booting up with the 2.6.25-rc1-git{1,3} and 2.6.25-rc2 kernel(s) on the 
> powerbox
> 
> Loading st.ko module
> BUG: soft lockup - CPU#1 stuck for 61s! [insmod:379]
> NIP: c01b0620 LR: c01a5dcc CTR: 0040
> REGS: c0077caab8a0 TRAP: 0901   Not tainted  (2.6.25-rc2-autotest)
> MSR: 80009032 <EE,ME,IR,DR>  CR: 84004088  XER: 2000
> TASK = c0077cb450a0[379] 'insmod' THREAD: c0077caa8000 CPU: 1
> GPR00: c0077c9d4000 c0077caabb20 c0538a40 000b 
> GPR04: ffc0 c0077e0c 0036 000a 
> GPR08: 0040 c0077c9d4250 c000  
> GPR12: c0077c9d4230 c0481d00 
> NIP [c01b0620] .radix_tree_gang_lookup+0x100/0x1e4
> LR [c01a5dcc] .call_for_each_cic+0x50/0x10c
> Call Trace:
> [c0077caabb20] [c01a5e2c] .call_for_each_cic+0xb0/0x10c 
> (unreliable)
> [c0077caabc60] [c019dba4] .exit_io_context+0xf0/0x110
> [c0077caabcf0] [c0061e38] .do_exit+0x820/0x850
> [c0077caabda0] [c0061f34] .do_group_exit+0xcc/0xe8
> [c0077caabe30] [c000872c] syscall_exit+0x0/0x40
> Instruction dump:
> 7d296214 39290018 e809 7caa2038 39290008 2fa0 409e0018 7caa4215 
> 396b0001 418200cc 424000b8 4bdc <79691f24> 7d296214 e9690018 2fab 

It's odd stuff. Could you perhaps try and add some printks to
block/cfq-iosched.c:call_for_each_cic(), like dumping the 'nr' return
from radix_tree_gang_lookup() and the pointer value of cics[i] in the
for() loop after the lookup?

How many SCSI devices are online?

-- 
Jens Axboe
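
The kind of instrumentation asked for above, printing the count returned by each batched lookup and the pointer of every entry in the batch, can be tried out on a toy model first. The sketch below is a userspace stand-in, not the cfq or radix tree code: `struct entry`, `toy_gang_lookup` and `scan_all` are hypothetical. It assumes the scan derives its next start index from the last entry's own key field, and shows one way such a scan could stop advancing if that field were cleared while the entry is still stored. Whether that is what happens in the reported lockup is not established here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define GANG_NR 4   /* batch size, chosen arbitrarily for this sketch */

/* Toy entry: 'slot' is where the table stores it, 'key' is its own claim. */
struct entry {
    unsigned long slot;
    unsigned long key;
};

/*
 * Stand-in for a gang lookup: collect up to 'max' entries whose slot is
 * >= first, in slot order. 'table' must be sorted by slot.
 */
static unsigned int toy_gang_lookup(struct entry *table, size_t n,
                                    struct entry **results,
                                    unsigned long first, unsigned int max)
{
    unsigned int nr = 0;

    for (size_t i = 0; i < n && nr < max; i++)
        if (table[i].slot >= first)
            results[nr++] = &table[i];
    return nr;
}

/*
 * Visit every entry in batches, printing 'nr' and each pointer.
 * Returns the number of visits, or -1 if the start index stops advancing
 * (without that guard the loop would never terminate).
 */
static long scan_all(struct entry *table, size_t n)
{
    struct entry *batch[GANG_NR];
    unsigned long index = 0;
    long visited = 0;
    unsigned int nr;

    do {
        nr = toy_gang_lookup(table, n, batch, index, GANG_NR);
        fprintf(stderr, "lookup from %lu returned nr=%u\n", index, nr);
        for (unsigned int i = 0; i < nr; i++) {
            fprintf(stderr, "  batch[%u]=%p key=%lu\n",
                    i, (void *)batch[i], batch[i]->key);
            visited++;
        }
        if (nr) {
            unsigned long next = batch[nr - 1]->key + 1;

            if (next <= index)  /* same batch would be returned again */
                return -1;
            index = next;
        }
    } while (nr == GANG_NR);
    return visited;
}
```

In a healthy table the printed start index grows with every batch; a repeating start index in the output would point at an entry whose key no longer matches its position.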


Re: Linux 2.6.25-rc2

2008-02-17 Thread Torsten Kaiser

On Feb 17, 2008 9:25 PM, Rafael J. Wysocki <[EMAIL PROTECTED]> wrote:
> There's the Bugzilla entry for it at
> http://bugzilla.kernel.org/show_bug.cgi?id=9973

Thank you.

> Please update it with the current information.

Crash for 2.6.25-rc2-mm1 added. That one had a complete stacktrace,
but the trace looks like others I already reported, so no real new
information... :-(

Torsten

Re: Linux 2.6.25-rc2 regression: LVM cannot find volume group

2008-02-17 Thread Tilman Schmidt


On 16.02.2008 23:37, Jiri Slaby wrote:
> On 02/16/2008 09:12 PM, Alan Cox wrote:
> > On Sat, 16 Feb 2008 20:14:30 +0100
> > Tilman Schmidt <[EMAIL PROTECTED]> wrote:
> >
> > > 2.6.25-rc2 fails to bring up my openSUSE 10.3 PC because LVM
> > > cannot find the volume group containing the root file system.
> > > 2.6.25-rc1 has the same problem, 2.6.24 works fine.


Bisection says:

edfaa7c36574f1bf09c65ad602412db9da5f96bf is first bad commit
commit edfaa7c36574f1bf09c65ad602412db9da5f96bf
Author: Kay Sievers <[EMAIL PROTECTED]>
Date:   Mon May 21 22:08:01 2007 +0200

    Driver core: convert block from raw kobjects to core devices

    This moves the block devices to /sys/class/block. It will create a
    flat list of all block devices, with the disks and partitions in one
    directory. For compatibility /sys/block is created and contains symlinks
    to the disks.

Apparently, compatibility is in the eye of the beholder - in this
case, LVM.


> > Compile in SCSI disk support. Modular even if loaded in initrd it seems
> > to have broken somewhere.


Setting

CONFIG_SCSI=y
CONFIG_BLK_DEV_SD=y

does not help. The problem persists.


> # CONFIG_SYSFS_DEPRECATED is not set
>
> I would suspect this.


Setting

CONFIG_SYSFS_DEPRECATED=y

does indeed fix the problem and allows me to boot successfully.
Pity, I was so happy getting rid of that a couple of releases ago.

> Try to upgrade to at least lvm 2.02.29 (I guess this is the first version which
> understands the new sysfs layout).


I'll have to investigate how to do that without breaking anything.

HTH
T.
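
The layout change described in the commit message above is the kind of thing a sysfs-reading tool can handle defensively. The following is a minimal sketch, assuming the tool only needs to locate the directory that lists block devices: it prefers class/block and falls back to block on kernels that lack the new directory. `find_block_dir` and `is_dir` are hypothetical helpers, not LVM code, and the sysfs root is passed in as a parameter instead of being hard-coded.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/stat.h>

/* Returns nonzero if 'path' exists and is a directory. */
static int is_dir(const char *path)
{
    struct stat st;

    return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}

/*
 * Hypothetical helper: writes the block-device directory found under
 * 'sysfs_root' into 'buf'. The newer flat layout (class/block) is tried
 * first, then the older one (block).
 * Returns 0 on success, -1 if neither directory exists or 'buf' is too
 * small to hold the path.
 */
static int find_block_dir(const char *sysfs_root, char *buf, size_t len)
{
    static const char *const candidates[] = { "class/block", "block" };
    size_t i;

    assert(sysfs_root != NULL && buf != NULL);

    for (i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++) {
        int n = snprintf(buf, len, "%s/%s", sysfs_root, candidates[i]);

        if (n < 0 || (size_t)n >= len)
            return -1;          /* path would have been truncated */
        if (is_dir(buf))
            return 0;           /* first layout that exists wins */
    }
    return -1;
}
```

Calling `find_block_dir("/sys", buf, sizeof buf)` would fill `buf` with whichever directory the running kernel provides. Whether that alone is enough for a given tool depends on how it interprets the entries inside the directory, which differ between the two layouts.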

--
Tilman Schmidt  E-Mail: [EMAIL PROTECTED]
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)


