On Mon, Apr 08, 2013 at 07:38:39AM -0700, Linus Torvalds wrote:
> On Mon, Apr 8, 2013 at 5:42 AM, Ingo Molnar <mi...@kernel.org> wrote:
> >
> > AFAICS the main performance trade-off is the following: when the owner
> > CPU unlocks the mutex, we'll poll it via a read first, which turns the
> > cacheline into shared-read MESI state. Then we notice that its content
> > signals 'lock is available', and we attempt the trylock again.
> >
> > This increases lock latency in the few-contended-tasks case slightly -
> > and we'd like to know by precisely how much, not just for a generic
> > '10-100 users' case which does not tell much about the contention level.
>
> We had this problem for *some* lock where we used a "read + cmpxchg"
> in the hotpath and it caused us problems due to two cacheline state
> transitions (first to shared, then to exclusive). It was faster to
> just assume it was unlocked and try to do an immediate cmpxchg.
>
> But iirc it is a non-issue for this case, because this is only about
> the contended slow path.
>
> I forget where we saw the case where we should *not* read the initial
> value, though. Anybody remember?

I think you might be remembering ia64. Fairly early on, I recall there
being a change in the spinlocks where we did not check them before just
trying to acquire.

Thanks,
Robin
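
For readers following along, the two acquire strategies contrasted in the
quoted text can be sketched with userspace C11 atomics. This is only an
illustration under stated assumptions: struct toy_lock and every function
name here are hypothetical, it is not the kernel's mutex or the ia64
spinlock code, and it makes no claim about which variant is faster on any
given machine.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy lock for illustration: 0 = unlocked, 1 = locked. */
struct toy_lock {
    atomic_int val;
};

/*
 * Variant A ("read + cmpxchg"): look at the lock word first and only
 * attempt the atomic update if it appears free. The plain load may bring
 * the cacheline in shared state, after which the cmpxchg has to upgrade
 * it to exclusive.
 */
static inline bool toy_trylock_read_then_cmpxchg(struct toy_lock *l)
{
    int expected = 0;

    if (atomic_load_explicit(&l->val, memory_order_relaxed) != 0)
        return false; /* looks held, skip the atomic update */

    return atomic_compare_exchange_strong_explicit(
        &l->val, &expected, 1,
        memory_order_acquire, memory_order_relaxed);
}

/*
 * Variant B ("immediate cmpxchg"): assume the lock is free and go
 * straight to the atomic update, without a prior read.
 */
static inline bool toy_trylock_cmpxchg_only(struct toy_lock *l)
{
    int expected = 0;

    return atomic_compare_exchange_strong_explicit(
        &l->val, &expected, 1,
        memory_order_acquire, memory_order_relaxed);
}

/* Release the lock; pairs with the acquire ordering above. */
static inline void toy_unlock(struct toy_lock *l)
{
    atomic_store_explicit(&l->val, 0, memory_order_release);
}
```

Both variants give the same answer for the same lock state; the difference
discussed in the thread is purely in cacheline traffic. Variant A is the
shape that costs two state transitions on an uncontended hot path, while
in a contended slow path the initial read is what lets waiters poll
without repeatedly requesting the line exclusively.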