On May 1, 2014, at 1:36 AM, Antony Dovgal <antony.dov...@gmail.com> wrote:
> On 05/01/2014 06:43 AM, Jason Evans wrote:
>> Use "thread.tcache.flush" to flush thread caches; "arena.<i>.purge" merely
>> uses madvise(2) to inform the kernel that it can recycle dirty pages that
>> contain unused data.
> 
> According to the docs, "thread.tcache.flush" only flushes the cache of the
> calling thread, and I have a lot of threads running in thread pools, which
> are created at the start and never destroyed.
> Or did you mean to call it periodically from every thread?

Your application can benefit from calling the “thread.tcache.flush” mallctl 
from a thread that is about to go “idle” (i.e. stops using the allocator for a 
while), but there’s little benefit otherwise, because there’s an incremental 
flushing mechanism built in that is driven by continued allocation activity.  
One straightforward way to implement flushing for idle threads in thread pools 
is to have idle threads wake up after a few seconds of inactivity and flush 
before going back to sleep.
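
A rough sketch of that idle-flush pattern follows. Only the
"thread.tcache.flush" mallctl comes from jemalloc; the work_queue struct,
wait_for_work(), and the idle timeout are hypothetical names invented for
illustration, and the weak declaration of mallctl is an assumption (a
GCC/Clang extension) so that the sketch still links where jemalloc is absent
or built with a symbol prefix.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* jemalloc's control entry point. Declared weak here (an assumption, see
 * above); a real build would normally include <jemalloc/jemalloc.h>. */
extern int mallctl(const char *name, void *oldp, size_t *oldlenp,
                   void *newp, size_t newlen) __attribute__((weak));

/* Flush the calling thread's cache. Returns 0 on success, mallctl's error
 * code on failure, or -1 if mallctl is not linked in. */
static int flush_my_tcache(void) {
    if (mallctl == NULL)
        return -1;
    return mallctl("thread.tcache.flush", NULL, NULL, NULL, 0);
}

/* Hypothetical stand-in for a thread pool's job queue. */
struct work_queue {
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    int             pending;   /* number of queued jobs */
    bool            shutdown;
};

/* Wait for a job. If none arrives within idle_secs, flush this thread's
 * cache once, then sleep until signalled. Returns true if a job was taken,
 * false if the queue is shutting down with nothing left to do. */
static bool wait_for_work(struct work_queue *q, int idle_secs) {
    bool flushed = false;
    pthread_mutex_lock(&q->lock);
    while (q->pending == 0 && !q->shutdown) {
        if (flushed) {
            /* Already flushed; nothing more to gain from waking up. */
            pthread_cond_wait(&q->nonempty, &q->lock);
            continue;
        }
        struct timespec deadline;
        clock_gettime(CLOCK_REALTIME, &deadline);
        deadline.tv_sec += idle_secs;
        int rc = pthread_cond_timedwait(&q->nonempty, &q->lock, &deadline);
        if (rc == ETIMEDOUT && q->pending == 0 && !q->shutdown) {
            /* Do not hold the queue lock while flushing. */
            pthread_mutex_unlock(&q->lock);
            flush_my_tcache();
            pthread_mutex_lock(&q->lock);
            flushed = true;
        }
    }
    bool got = q->pending > 0;
    if (got)
        q->pending--;
    pthread_mutex_unlock(&q->lock);
    return got;
}
```

Each worker would call wait_for_work() at the top of its loop. Busy threads
never reach the timeout, so they keep relying on the incremental flushing
that allocation activity already drives.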

>> There are two statistics jemalloc tracks that directly allow you to measure 
>> external fragmentation: "stats.allocated" [1] and "stats.active" [2].
> 
> Right, I've tried using both of them.
> Do I understand it correctly that stats.active decreases only when an entire 
> page is freed?

“stats.active” decreases when an entire page run is freed.  It precisely tracks 
what actually matters in terms of physical memory exhaustion.
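
One way to turn the two statistics into a single number is sketched below.
The ratio (active - allocated) / active is an assumed definition chosen for
illustration, not something jemalloc reports, and external_fragmentation()
is a hypothetical helper name.

```c
#include <stddef.h>

/* Fraction of active memory that is not backing live allocations.
 * "allocated" and "active" are byte counts as read from the
 * "stats.allocated" and "stats.active" mallctls. */
static double external_fragmentation(size_t allocated, size_t active) {
    if (active == 0 || allocated >= active)
        return 0.0;  /* nothing active, or no measurable slack */
    return (double)(active - allocated) / (double)active;
}
```

For example, 75 bytes allocated out of 100 active gives 0.25, i.e. a quarter
of the active pages hold no live data.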

> So far, using Salvatore's method and code I can see about 3% difference
> between RSS and allocated memory when using jemalloc and ~9% difference
> when using Hoard.
> But I expect these values to change since the processes haven't started
> removing outdated records yet.
> 
> I also have a control process without jemalloc (i.e. using plain libc
> malloc()) using the same code to compute fragmentation, and it shows about
> 20% difference (and it's growing).
> 
> What baffles me most is that stats.allocated keeps returning the same value,
> but RSS constantly grows.

This is probably because you aren’t calling the “epoch” mallctl to refresh 
mallctl’s cached statistics.
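
A minimal sketch of the refresh-then-read sequence is below. The mallctl
names "epoch", "stats.allocated", and "stats.active" are jemalloc's;
read_heap_stats() is a hypothetical helper, and mallctl is declared weak (an
assumption, GCC/Clang only) so the sketch links even without jemalloc. The
stats.* entries also require a jemalloc build with statistics enabled.

```c
#include <stddef.h>
#include <stdint.h>

/* Weak declaration is an assumption for portability of this sketch; a real
 * build would normally include <jemalloc/jemalloc.h>. */
extern int mallctl(const char *name, void *oldp, size_t *oldlenp,
                   void *newp, size_t newlen) __attribute__((weak));

/* Refresh jemalloc's cached statistics, then read the two counters.
 * Returns 0 on success, -1 if mallctl is not linked in, or mallctl's
 * error code otherwise. */
static int read_heap_stats(size_t *allocated, size_t *active) {
    if (mallctl == NULL)
        return -1;

    /* Writing any value to "epoch" triggers the refresh. */
    uint64_t epoch = 1;
    size_t len = sizeof(epoch);
    int rc = mallctl("epoch", &epoch, &len, &epoch, len);
    if (rc != 0)
        return rc;

    len = sizeof(*allocated);
    rc = mallctl("stats.allocated", allocated, &len, NULL, 0);
    if (rc != 0)
        return rc;

    len = sizeof(*active);
    return mallctl("stats.active", active, &len, NULL, 0);
}
```

Without the "epoch" write, repeated reads keep returning the snapshot taken
at the last refresh, which matches the symptom of a value that never changes.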

> Could it be because of the number of threads I use?

If your application occasionally recurses deeply, you may be incrementally 
increasing the total amount of memory dedicated to thread execution stacks.  
That could account for several gigabytes of memory usage, but probably isn’t 
the only issue.

> Say, I free memory in one thread and try to allocate in another one, but the
> second thread doesn't have it cached and has to do the actual allocation?

Within limits, this can bloat memory usage.  However, IIRC thread caches 
average ~2.5 MiB per thread under the worst conditions (all threads are purely 
deallocating a broad mix of allocation sizes), so the thread caches probably 
account for less than 1 GiB in your application.
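
The arithmetic behind that estimate can be sketched as follows. The 2.5 MiB
figure is the recalled worst-case average from the paragraph above, not a
documented constant, and tcache_upper_bound() is a hypothetical helper.

```c
#include <stddef.h>

/* ~2.5 MiB per thread: a recalled worst-case average, treated here as an
 * assumption rather than a guaranteed limit. */
#define TCACHE_WORST_CASE_BYTES ((size_t)2560 * 1024)

/* Rough upper bound on memory held in thread caches, in bytes. */
static size_t tcache_upper_bound(size_t nthreads) {
    return nthreads * TCACHE_WORST_CASE_BYTES;
}
```

Under this assumption the total stays below 1 GiB for anything up to 409
threads, since 1 GiB / 2.5 MiB is roughly 409.6.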

Jason
_______________________________________________
jemalloc-discuss mailing list
jemalloc-discuss@canonware.com
http://www.canonware.com/mailman/listinfo/jemalloc-discuss
