I just want to reiterate what Jason has said below.  I recently spent several 
months trying to reduce the amount of memory used by one of our applications.  
We were seeing efficiency ratings for the heap in the 50-60% range (measured as 
outstanding buffers used by the app relative to VM use).

In our case it was relatively easy to segregate one of the largest offenders (a 
periodic thread that consumes large amounts of heap and then frees it when 
finished).  This resulted in a very large efficiency gain (now closer to 90%).  
If you are able to segregate long-lived allocations, I don't think it matters 
how many transient arenas you have configured, because over time they'll empty 
themselves.
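
To make the segregation effect concrete, here is a toy model in C.  It is only 
a sketch under stated assumptions and is not how jemalloc works internally: 
each "arena" is a bump allocator over fixed-size pages, a page can be released 
only once nothing lives on it, and freed slots are never reused.  All names 
(toy_arena, pinned_after_burst, SLOTS_PER_PAGE, MAX_PAGES) are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define SLOTS_PER_PAGE 8   /* hypothetical: objects that fit on one page */
#define MAX_PAGES      64  /* hypothetical: capacity of one toy arena */

/* Toy arena: tracks only how many live objects sit on each page. */
struct toy_arena {
    int live[MAX_PAGES];   /* live objects per page */
    int next_slot;         /* bump pointer; freed slots are never reused */
};

/* Place one object; returns its page, or -1 if the arena is full. */
static int toy_alloc(struct toy_arena *a) {
    if (a->next_slot >= MAX_PAGES * SLOTS_PER_PAGE) return -1;
    int page = a->next_slot / SLOTS_PER_PAGE;
    a->next_slot++;
    a->live[page]++;
    return page;
}

static void toy_free(struct toy_arena *a, int page) {
    assert(a->live[page] > 0);
    a->live[page]--;
}

/* Pages that cannot be released because something still lives on them. */
static int toy_pinned_pages(const struct toy_arena *a) {
    int pinned = 0;
    for (int p = 0; p < MAX_PAGES; p++)
        if (a->live[p] > 0) pinned++;
    return pinned;
}

/* Allocate `pairs` (transient, long-lived) object pairs, free every transient
 * object, and return the number of pages still pinned.  With segregate != 0
 * the transient objects go to their own arena. */
int pinned_after_burst(int pairs, int segregate) {
    enum { MAX_PAIRS = MAX_PAGES * SLOTS_PER_PAGE / 2 };
    if (pairs < 0 || pairs > MAX_PAIRS) return -1;

    struct toy_arena shared, transient_only;
    memset(&shared, 0, sizeof shared);
    memset(&transient_only, 0, sizeof transient_only);

    struct toy_arena *transient_arena = segregate ? &transient_only : &shared;
    int transient_page[MAX_PAIRS];

    for (int i = 0; i < pairs; i++) {
        transient_page[i] = toy_alloc(transient_arena); /* freed below */
        toy_alloc(&shared);                              /* stays live */
    }
    for (int i = 0; i < pairs; i++)
        toy_free(transient_arena, transient_page[i]);

    return toy_pinned_pages(&shared) + toy_pinned_pages(&transient_only);
}
```

With 64 pairs, pinned_after_burst(64, 0) leaves 16 pages pinned, while 
pinned_after_burst(64, 1) leaves 8, because the transient arena drains 
completely once its objects are freed.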

Another use for arenas that we are interested in trying but haven't explored 
is fault isolation.  Again, this will depend a bit upon your application, but 
one idea is to assign a problem thread or module its own arena in order to 
pinpoint the source of memory corruption issues.  In reduced-memory 
environments, tools like Valgrind aren't always an option, so something much 
lighter weight, like thread-specific arenas, seems likely to be more viable.

We are using a fairly old version of jemalloc.  I'm happy to see that the newer 
version has official support for this type of segregation.  In the version we 
are using we also had to modify the code that detects when there's contention 
for a specific arena and allows threads to use alternate arenas.  We needed 
complete isolation of the one arena to see the efficiency gains noted above.

I also want to apologize to Jason.  He's clearly spent a great deal of time 
optimizing the performance of jemalloc.  Those of us operating in 
limited-memory environments start off by disabling much of his hard work :)

From: Jason Evans <[email protected]>
Date: Thursday, November 14, 2013 8:20 PM
To: Nikhil Bhatia <[email protected]>
Cc: "[email protected]" <[email protected]>
Subject: Re: jemalloc tuning help

You can potentially mitigate the problem by reducing the number of arenas (only 
helps if per thread memory usage spikes are uncorrelated).  Another possibility 
is to segregate short- and long-lived objects into different arenas, but this 
requires that you have reliable (and ideally stable) knowledge of object 
lifetimes.  In practice, segregation is usually very difficult to maintain.  If 
you choose to go this direction, take a look at the "arenas.extend" mallctl 
(for creating an arena that contains long-lived objects), and the 
ALLOCM_ARENA(a) macro argument to the [r]allocm() functions.
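
The point about uncorrelated spikes can be illustrated with a small 
calculation in C.  This is a sketch under a strong simplifying assumption that 
is not taken from jemalloc itself: each arena is assumed to retain its own 
high-water mark and never shrink.  The function names and the layout of the 
usage table are hypothetical.

```c
#include <stddef.h>

/* usage[t * threads + i] = bytes held by thread i at time step t. */

/* One arena per thread: each arena is assumed to keep its own peak,
 * so the total retained is the sum of the per-thread peaks. */
size_t retained_per_thread_arenas(const size_t *usage, size_t steps,
                                  size_t threads) {
    size_t total = 0;
    for (size_t i = 0; i < threads; i++) {
        size_t peak = 0;
        for (size_t t = 0; t < steps; t++) {
            size_t u = usage[t * threads + i];
            if (u > peak) peak = u;
        }
        total += peak;
    }
    return total;
}

/* One shared arena: it only has to cover the largest combined demand,
 * so the total retained is the peak of the per-step sums. */
size_t retained_shared_arena(const size_t *usage, size_t steps,
                             size_t threads) {
    size_t peak = 0;
    for (size_t t = 0; t < steps; t++) {
        size_t sum = 0;
        for (size_t i = 0; i < threads; i++)
            sum += usage[t * threads + i];
        if (sum > peak) peak = sum;
    }
    return peak;
}
```

For two threads that each idle at 10 units and spike to 100 at different 
times, the per-thread figure is 200 while the shared figure is 110.  If both 
spike at the same step, both figures are 200 and sharing an arena gains 
nothing.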
_______________________________________________
jemalloc-discuss mailing list
[email protected]
http://www.canonware.com/mailman/listinfo/jemalloc-discuss
