> Quoting Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: IPoIB caused a kernel: BUG: soft lockup detected on CPU#0!
> 
> >Feb 27 17:47:52 sw169 kernel:  [<ffffffff8053aaf1>] _spin_lock_irqsave+0x15/0x24
> >Feb 27 17:47:52 sw169 kernel:  [<ffffffff88067a23>] :ib_ipoib:ipoib_neigh_destructor+0xc2/0x139
> 
> It looks like this is deadlocking trying to take priv->lock in
> ipoib_neigh_destructor().
> One idea I just had would be to build a kernel with CONFIG_PROVE_LOCKING
> turned on, and then rerun this test.  There's a good chance that this would
> diagnose the deadlock.  (I don't have good access to my test machines right
> now, or else I would do it myself)
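
(For anyone else trying to reproduce this: a rough sketch of the .config
fragment this suggestion amounts to. It assumes a 2.6-era tree in which
PROVE_LOCKING sits under the "Kernel hacking" menu and pulls in its lockdep
dependencies via Kconfig "select"; the exact set of selected options is an
assumption and may differ between kernel versions, so check
lib/Kconfig.debug in your tree.)

```
# Assumed minimal fragment; run "make oldconfig" afterwards so that
# Kconfig fills in whatever PROVE_LOCKING selects on your tree.
CONFIG_DEBUG_KERNEL=y
CONFIG_PROVE_LOCKING=y
```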

OK, I did that. But I get
        [13440.761857] INFO: trying to register non-static key.
        [13440.766903] the code is fine but needs lockdep annotation.
        [13440.772455] turning off the locking correctness validator.
and I am not sure what triggers this, or how to fix it to have the
validator actually do its job.

Ingo, what key does the message refer to?

The stack dump seems to point to drivers/infiniband/ulp/ipoib/ipoib_main.c,
line 829.

Full message below:
        
[13440.761857] INFO: trying to register non-static key.
[13440.766903] the code is fine but needs lockdep annotation.
[13440.772455] turning off the locking correctness validator.
[13440.778008]  [<c023c082>] __lock_acquire+0xae4/0xbb9
[13440.783078]  [<c023c43d>] lock_acquire+0x56/0x71
[13440.787784]  [<f899bff2>] ipoib_neigh_destructor+0xd0/0x132 [ib_ipoib]
[13440.794412]  [<c051ad41>] _spin_lock_irqsave+0x32/0x41
[13440.799649]  [<f899bff2>] ipoib_neigh_destructor+0xd0/0x132 [ib_ipoib]
[13440.806275]  [<f899bff2>] ipoib_neigh_destructor+0xd0/0x132 [ib_ipoib]
[13440.812897]  [<c04a1c1b>] dst_run_gc+0xc/0x118
[13440.817439]  [<c022af6e>] run_timer_softirq+0x37/0x16b
[13440.822673]  [<c04a1c0f>] dst_run_gc+0x0/0x118
[13440.827221]  [<c04a3eab>] neigh_destroy+0xbe/0x104
[13440.832114]  [<c04a1bb1>] dst_destroy+0x4d/0xab
[13440.836751]  [<c04a1c64>] dst_run_gc+0x55/0x118
[13440.841384]  [<c022b03f>] run_timer_softirq+0x108/0x16b
[13440.846711]  [<c0227634>] __do_softirq+0x5a/0xd5
[13440.851427]  [<c023b435>] trace_hardirqs_on+0x106/0x141
[13440.856754]  [<c0227643>] __do_softirq+0x69/0xd5
[13440.861470]  [<c02276e6>] do_softirq+0x37/0x4d
[13440.866016]  [<c02167b0>] smp_apic_timer_interrupt+0x6b/0x77
[13440.871774]  [<c02029ef>] default_idle+0x3b/0x54
[13440.876491]  [<c02029ef>] default_idle+0x3b/0x54
[13440.881211]  [<c0204c33>] apic_timer_interrupt+0x33/0x38
[13440.886624]  [<c02029ef>] default_idle+0x3b/0x54
[13440.891342]  [<c02029f1>] default_idle+0x3d/0x54
[13440.896061]  [<c0202aaa>] cpu_idle+0xa2/0xbb
[13440.900436]  =======================
[13768.711447] BUG: spinlock lockup on CPU#1, swapper/0, c0687880
[13768.717353]  [<c031f919>] _raw_spin_lock+0xda/0xfd
[13768.722247]  [<c051ad48>] _spin_lock_irqsave+0x39/0x41
[13768.727486]  [<f899bff2>] ipoib_neigh_destructor+0xd0/0x132 [ib_ipoib]
[13768.734110]  [<f899bff2>] ipoib_neigh_destructor+0xd0/0x132 [ib_ipoib]
[13768.740735]  [<c04a1c1b>] dst_run_gc+0xc/0x118
[13768.745276]  [<c022af6e>] run_timer_softirq+0x37/0x16b
[13768.750517]  [<c04a1c0f>] dst_run_gc+0x0/0x118
[13768.755061]  [<c04a3eab>] neigh_destroy+0xbe/0x104
[13768.759955]  [<c04a1bb1>] dst_destroy+0x4d/0xab
[13768.764586]  [<c04a1c64>] dst_run_gc+0x55/0x118
[13768.769218]  [<c022b03f>] run_timer_softirq+0x108/0x16b
[13768.774542]  [<c0227634>] __do_softirq+0x5a/0xd5
[13768.779261]  [<c023b435>] trace_hardirqs_on+0x106/0x141
[13768.784588]  [<c0227643>] __do_softirq+0x69/0xd5
[13768.789308]  [<c02276e6>] do_softirq+0x37/0x4d
[13768.793851]  [<c02167b0>] smp_apic_timer_interrupt+0x6b/0x77
[13768.799609]  [<c02029ef>] default_idle+0x3b/0x54
[13768.804326]  [<c02029ef>] default_idle+0x3b/0x54
[13768.809054]  [<c0204c33>] apic_timer_interrupt+0x33/0x38
[13768.814471]  [<c02029ef>] default_idle+0x3b/0x54
[13768.819187]  [<c02029f1>] default_idle+0x3d/0x54
[13768.823903]  [<c0202aaa>] cpu_idle+0xa2/0xbb
[13768.828279]  =======================


-- 
MST