On Tue, 14 Jan 2014, Jason Evans wrote:

On Dec 30, 2013, at 5:23 AM, [email protected] wrote:
I noticed that the OS X version of jemalloc uses spin locks and decided to 
implement support for the pthread spin locks that can be used on Linux. At 
least in my case there is a huge benefit: I run a single thread on a specific 
core that has little other activity, and pthread mutex contention seems to 
always schedule the thread away from execution, so spin locking gives a more 
stable result while adding a bit more CPU load. Most likely this would not be 
wanted in the more general case, because there would be more threads/processes 
running on the same core, and the scheduler should give execution time to 
other important threads, such as the one holding the lock.

What do you think: is this something that could be merged upstream? My patch 
implements a new configure script option, --enable-pthread-spinlock, that 
turns on the pthread spin locks; they are not used by default because of the 
scheduling benefit of mutexes in normal use.

The only reason jemalloc uses spinlocks on OS X is that normal pthread mutex 
creation uses malloc() internally, and that causes bootstrapping issues.  Under 
all the real-world scenarios I've examined, mutex contention is negligible.  
Furthermore, on Linux jemalloc uses adaptive mutexes, which IIRC spin before 
blocking.  Do you have an application for which your patch makes a significant 
difference?

Thanks,
Jason

Hi Jason,

I have a case where a specialized thread runs on its own CPU core and there is not much other activity on that core. I have an internal memory allocator that allows the process to grow up to a certain limit; beyond that predefined limit the application starts doing its own garbage collecting, i.e. the thread cache can be mostly empty while the process size is still allowed to grow. This means that the global arena is accessed a lot, and when there are many CPUs the locks start to matter. When running my application I typically don't see full CPU load on machines with many CPUs, because eventually the mutex will put the process away from execution, and with spin locks that obviously doesn't happen. There is definitely a measurable difference between the locking methods.

In the more general setup, using spin locks is probably not that good, because other execution on the same CPU is then prevented, but in my specialized case the spin locks do matter.

--
Valtteri Rahkonen
[email protected]
http://www.rahkonen.fi
+358 40 5077041
_______________________________________________
jemalloc-discuss mailing list
[email protected]
http://www.canonware.com/mailman/listinfo/jemalloc-discuss
