Re: [PATCH] librdmacm/rsockets: Optimize synchronization to improve performance

Jason Gunthorpe Thu, 10 May 2012 09:45:28 -0700

On Thu, May 10, 2012 at 12:01:03AM +0000, Hefty, Sean wrote:

> A test that acquired and released a lock 2 billion times reported that
> the custom lock was roughly 20% faster than using the mutex.
> 26.6 seconds versus 33.0 seconds.


I think what you are measuring is that your lock is inlined while the
pthreads call goes through an indirect jump. Internally, pthreads
implements the same thing, just with a futex instead of a sem_t.
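
For concreteness, a lock of the kind being compared might look roughly
like the sketch below: an atomic counter handles the uncontended case
inline, and a sem_t is only touched when a second thread shows up. The
names (fastlock_t, fastlock_acquire, demo_run, ...) are illustrative
assumptions, not the code from the patch, and error handling such as
EINTR from sem_wait() is omitted.

```c
#include <pthread.h>
#include <semaphore.h>

/* Hypothetical lock: atomic counter for the fast path, sem_t for waiters. */
typedef struct {
	sem_t sem;
	volatile int cnt;
} fastlock_t;

static inline void fastlock_init(fastlock_t *lock)
{
	sem_init(&lock->sem, 0, 0);
	lock->cnt = 0;
}

static inline void fastlock_destroy(fastlock_t *lock)
{
	sem_destroy(&lock->sem);
}

static inline void fastlock_acquire(fastlock_t *lock)
{
	/* Old value > 0: another thread holds the lock, so sleep. */
	if (__sync_fetch_and_add(&lock->cnt, 1) > 0)
		sem_wait(&lock->sem);
}

static inline void fastlock_release(fastlock_t *lock)
{
	/* Old value > 1: at least one waiter, so wake exactly one. */
	if (__sync_fetch_and_sub(&lock->cnt, 1) > 1)
		sem_post(&lock->sem);
}

/* Illustrative demo: several threads bump a shared counter under the lock. */
struct demo_arg {
	fastlock_t *lock;
	long *counter;
	long iters;
};

static void *demo_worker(void *p)
{
	struct demo_arg *arg = p;
	long i;

	for (i = 0; i < arg->iters; i++) {
		fastlock_acquire(arg->lock);
		(*arg->counter)++;
		fastlock_release(arg->lock);
	}
	return NULL;
}

/* Runs up to 16 threads and returns the final counter value. */
static long demo_run(int nthreads, long iters)
{
	pthread_t tid[16];
	fastlock_t lock;
	long counter = 0;
	struct demo_arg arg = { &lock, &counter, iters };
	int i;

	if (nthreads > 16)
		nthreads = 16;
	fastlock_init(&lock);
	for (i = 0; i < nthreads; i++)
		pthread_create(&tid[i], NULL, demo_worker, &arg);
	for (i = 0; i < nthreads; i++)
		pthread_join(tid[i], NULL);
	fastlock_destroy(&lock);
	return counter;
}
```

In the uncontended case both acquire and release reduce to a single
inlined atomic operation, whereas pthread_mutex_lock() is an
out-of-line library call. That difference in call overhead is what a
single-threaded lock/unlock loop would mostly be timing.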

> in releasing a lock.  However, we keep the custom lock based on
> the results of the direct lock tests that were done.

This does hurt portability, though: the GCC extension
__sync_fetch_and_add is not supported on all targets.
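
One way to limit the portability cost is a compile-time fallback to a
plain mutex, sketched below. The xlock_* names are hypothetical, and
testing GCC's predefined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 macro is
only an assumption about a reasonable guard; a configure-time check
would work as well.

```c
#include <pthread.h>
#include <semaphore.h>

#if defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4)

/* Target has native 4-byte atomics: counter + semaphore lock. */
#define XLOCK_USES_ATOMICS 1

typedef struct {
	sem_t sem;
	volatile int cnt;
} xlock_t;

static inline void xlock_init(xlock_t *lock)
{
	sem_init(&lock->sem, 0, 0);
	lock->cnt = 0;
}

static inline void xlock_destroy(xlock_t *lock)
{
	sem_destroy(&lock->sem);
}

static inline void xlock_acquire(xlock_t *lock)
{
	/* Sleep only if the lock was already held. */
	if (__sync_fetch_and_add(&lock->cnt, 1) > 0)
		sem_wait(&lock->sem);
}

static inline void xlock_release(xlock_t *lock)
{
	/* Wake one waiter only if somebody is queued. */
	if (__sync_fetch_and_sub(&lock->cnt, 1) > 1)
		sem_post(&lock->sem);
}

#else

/* No native atomics: fall back to an ordinary pthread mutex. */
#define XLOCK_USES_ATOMICS 0

typedef pthread_mutex_t xlock_t;

static inline void xlock_init(xlock_t *lock)
{
	pthread_mutex_init(lock, NULL);
}

static inline void xlock_destroy(xlock_t *lock)
{
	pthread_mutex_destroy(lock);
}

static inline void xlock_acquire(xlock_t *lock)
{
	pthread_mutex_lock(lock);
}

static inline void xlock_release(xlock_t *lock)
{
	pthread_mutex_unlock(lock);
}

#endif
```

Callers only ever see xlock_t and the four functions, so the choice of
implementation stays a build-time detail.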

> As to the hotspot, the unlock in question occurs during rsend().  The
> hotspot may simply be the result of processing the send completion.

Are you using a stochastic profiler? It may show as a hot spot simply
because the unlock is a context switch point.
 
Jason