> > A test that acquired and released a lock 2 billion times reported that
> > the custom lock was roughly 20% faster than using the mutex.
> > 26.6 seconds versus 33.0 seconds.
>
> I think you are measuring the fact your call is inlined and pthreads
> has an indirect jump - because internally pthreads implements the same
> thing using a futex instead of a sem_t.

This is what I suspect as well.

> > in releasing a lock.  However, we keep the custom lock based on
> > the results of the direct lock tests that were done.
>
> This does hurt portability though, the GCC extension
> __sync_fetch_and_add is not supported on all targets.

I'll fix that up by falling back to a mutex when __sync_fetch_and_add is
not available.

> > As to the hotspot, the unlock in question occurs during rsend().  The
> > hotspot may simply be the result of processing the send completion.
>
> Are you using a stochastic profiler? It may show as a hot spot simply
> because the unlock is a context switch point.

I'm using Intel's VTune Amplifier XE 2011, using the hotspot analysis
settings.  I had already gone through the trouble of creating the custom
lock before realizing that the hotspot was likely the result of some other
interaction.

- Sean
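
For readers following the thread, here is a rough sketch of the kind of lock being discussed: an inlined fast path built on __sync_fetch_and_add with a sem_t for the contended case, and a plain mutex fallback where the builtin is missing. This is not the code under review. The names sketch_lock_t, sketch_lock_init/acquire/release, sketch_demo and the SKETCH_NO_ATOMICS switch are all hypothetical, and it assumes Linux with unnamed POSIX semaphores and building with -pthread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <pthread.h>
#include <semaphore.h>

/* SKETCH_NO_ATOMICS is a hypothetical build-time switch: define it on
 * targets where __sync_fetch_and_add is not available. */
#ifndef SKETCH_NO_ATOMICS

typedef struct {
	sem_t sem;		/* contended acquirers sleep here */
	volatile int cnt;	/* current holder plus any waiters */
} sketch_lock_t;

static inline void sketch_lock_init(sketch_lock_t *l)
{
	int rc = sem_init(&l->sem, 0, 0);
	assert(rc == 0);
	(void) rc;
	l->cnt = 0;
}

static inline void sketch_lock_acquire(sketch_lock_t *l)
{
	/* An old value of 0 means the lock was free: uncontended fast path. */
	if (__sync_fetch_and_add(&l->cnt, 1) > 0) {
		while (sem_wait(&l->sem) == -1 && errno == EINTR)
			;	/* retry if interrupted by a signal */
	}
}

static inline void sketch_lock_release(sketch_lock_t *l)
{
	/* An old value above 1 means someone is waiting: wake exactly one. */
	if (__sync_fetch_and_add(&l->cnt, -1) > 1)
		sem_post(&l->sem);
}

#else	/* fall back to a plain mutex */

typedef pthread_mutex_t sketch_lock_t;
#define sketch_lock_init(l)	pthread_mutex_init((l), NULL)
#define sketch_lock_acquire(l)	pthread_mutex_lock(l)
#define sketch_lock_release(l)	pthread_mutex_unlock(l)

#endif

static sketch_lock_t demo_lock;
static long demo_counter;

static void *demo_worker(void *arg)
{
	long i, iters = *(long *) arg;

	for (i = 0; i < iters; i++) {
		sketch_lock_acquire(&demo_lock);
		demo_counter++;		/* protected by demo_lock */
		sketch_lock_release(&demo_lock);
	}
	return NULL;
}

/* Runs nthreads workers (at most 16) and returns the final counter. */
static long sketch_demo(int nthreads, long iters)
{
	pthread_t tid[16];
	int i;

	assert(nthreads > 0 && nthreads <= 16);
	sketch_lock_init(&demo_lock);
	demo_counter = 0;
	for (i = 0; i < nthreads; i++)
		pthread_create(&tid[i], NULL, demo_worker, &iters);
	for (i = 0; i < nthreads; i++)
		pthread_join(tid[i], NULL);
	return demo_counter;
}
```

If the lock is correct, sketch_demo(4, 10000) should return 40000 on either path. In the uncontended case the atomic path costs one inlined atomic add per acquire and release, which is consistent with the inlining explanation given above for the measured difference.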
