On 07/19/2012 04:22 PM, Darren Hart wrote:
>
>
> On 07/13/2012 11:54 AM, Dave Jones wrote:
>> On Fri, Jul 13, 2012 at 08:47:38PM +0200, Thomas Gleixner wrote:
>> > On Fri, 13 Jul 2012, Dave Jones wrote:
>> >
>> > > Looks like calling futex() with garbage makes things unhappy.
>> >
>> > WARN_ON(!&q.pi_state);
>> > pi_mutex = &q.pi_state->pi_mutex;
>> > ret = rt_mutex_finish_proxy_lock(pi_mutex, to,
>> &rt_waiter, 1);
>> > debug_rt_mutex_free_waiter(&rt_waiter);
>> >
>> > So there is some weird way which causes q.pi_state = NULL. Dave, did
>> > you see the warning before the oops happened ?
>>
>> No, that didn't seem to trigger.
>
> Well I don't have a fix yet, but I can explain this not triggering.
>
> q is on the stack, so the ADDRESS for q.pi_state is never going to be
> NULL. However, properly instrumented, we do see this:
>
> [ 23.621501] ---[ end trace 20bdfb44db182a17 ]---
> [ 23.622425] q.pi_state @ (null)
> [ 23.623272] &q.pi_state @ ffff880185e2dca8
> [ 23.624119] ------------[ cut here ]------------
>
> Duh.
>
> I'll add a fix to that WARN_ON in my futex-fixes branch along with the
> fix for the bug Dan found.
>
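To spell out the WARN_ON point from the quoted text, here is a minimal
userspace sketch, not kernel code. The struct and function names
(pi_state_stub, futex_q_stub, broken_check_fires, fixed_check_fires) are
hypothetical stand-ins; only the address-versus-value distinction is taken
from the thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; only the shape matters. */
struct pi_state_stub {
	int dummy;
};

struct futex_q_stub {
	struct pi_state_stub *pi_state;
};

/*
 * Mirrors WARN_ON(!&q.pi_state): this tests the ADDRESS of the member,
 * which is non-NULL for any real object, so the check cannot fire.
 */
static bool broken_check_fires(const struct futex_q_stub *q)
{
	const void *member_addr = &q->pi_state;

	return member_addr == NULL;
}

/*
 * Mirrors WARN_ON(!q.pi_state): this tests the pointer VALUE stored in
 * the member, which is what the check was presumably meant to catch.
 */
static bool fixed_check_fires(const struct futex_q_stub *q)
{
	return q->pi_state == NULL;
}
```

With a stack-allocated futex_q_stub whose pi_state is NULL, the first
helper stays silent and the second one reports the problem, which matches
the instrumented output quoted above.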
I think I have the root cause. futex_wait_requeue_pi() doesn't like having
uaddr == uaddr2. handle_early_wakeup() doesn't detect a problem because
key2 IS the same as key1, I think.

I've just discovered this and quickly hacked in an
"if (uaddr == uaddr2) return -EINVAL" fix, and the test has now been
running (with just ops 0, 11, 12) for several minutes; it typically fails
within a few seconds. I'll let it run for a few hours and contemplate the
proper fix.

--
Darren Hart
Intel Open Source Technology Center
Yocto Project - Linux Kernel
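For reference, the quick hack mentioned in this message amounts to
something like the sketch below. This is a userspace illustration only:
check_requeue_pi_addrs() is a hypothetical helper name, and where such a
check would sit inside futex_wait_requeue_pi() is an assumption, not the
final fix.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel function: reject a requeue_pi wait
 * whose source and target futex addresses are the same, since the two
 * futex keys would then be identical.
 */
static int check_requeue_pi_addrs(const uint32_t *uaddr,
				  const uint32_t *uaddr2)
{
	if (uaddr == uaddr2)
		return -EINVAL;

	return 0;
}
```

Comparing the raw user addresses is the crudest form of the check; a
proper fix might compare the futex keys instead, which is part of what
still needs contemplating.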