Hello Jason,

On 05/01/2014 06:43 AM, Jason Evans wrote:
Use "thread.tcache.flush" to flush thread caches; "arena.<i>.purge" merely uses
madvise(2) to inform the kernel that it can recycle dirty pages that contain
unused data.

According to the docs, "thread.tcache.flush" only flushes the cache of the
calling thread, and I have a lot of threads running in thread pools, which are
created at startup and never destroyed.
Or did you mean that I should call it periodically from every thread?
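
To make the "periodically from every thread" option concrete, here is a
minimal sketch of what each pool thread could run between jobs. The helper
name maybe_flush_tcache() and the job-count interval are hypothetical;
mallctl() and the "thread.tcache.flush" name are jemalloc's. The weak
declaration is a GCC/Clang extension, used here only so the sketch still links
when jemalloc is absent; some builds export the function under a prefix such
as je_mallctl instead.

```c
#include <stddef.h>

/* jemalloc's control interface. Declared weak so the symbol resolves to NULL
 * when jemalloc is not linked in (assumption: GCC or Clang on an ELF system). */
extern int mallctl(const char *name, void *oldp, size_t *oldlenp,
                   void *newp, size_t newlen) __attribute__((weak));

/* Hypothetical helper, meant to be called by a pool thread after each job.
 * Every `interval` jobs it flushes the calling thread's cache.
 * Returns 1 if a flush succeeded, 0 if none was attempted, -1 on error. */
static int maybe_flush_tcache(unsigned long *jobs_done, unsigned long interval)
{
    if (interval == 0)
        return 0;
    if (++*jobs_done % interval != 0)
        return 0;
    if (mallctl == NULL)
        return 0;   /* jemalloc not linked: nothing to flush */
    /* "thread.tcache.flush" takes no input and produces no output. */
    return mallctl("thread.tcache.flush", NULL, NULL, NULL, 0) == 0 ? 1 : -1;
}
```

Each thread would keep its own jobs_done counter, so no locking is needed; the
flush only ever touches the cache of the thread that calls it.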

There are two statistics jemalloc tracks that directly allow you to measure external fragmentation: 
"stats.allocated" [1] and "stats.active" [2].

Right, I've tried using both of them.
Do I understand correctly that stats.active decreases only when an entire
page is freed?
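
For reference, one way to turn the two statistics mentioned above into a
single number is the fraction of active bytes that do not back a live
allocation. This is only a sketch of that arithmetic; the function name is
hypothetical, and the two inputs are assumed to have been read via mallctl()
from "stats.allocated" and "stats.active" after refreshing "epoch".

```c
#include <stddef.h>

/* Fraction of active bytes not occupied by live allocations, in [0, 1).
 * `allocated` and `active` are byte counts as reported by jemalloc.
 * Returns 0 when there is nothing active or the inputs are inconsistent. */
static double external_fragmentation(size_t allocated, size_t active)
{
    if (active == 0 || allocated >= active)
        return 0.0;
    return 1.0 - (double)allocated / (double)active;
}
```

For example, 75 MiB allocated out of 100 MiB active gives 0.25, i.e. a quarter
of the active pages' bytes are currently unused.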

jemalloc's worst case fragmentation behavior is pretty straightforward to 
reason about for small objects.  Each size class [3] can be considered 
independently.  The worst thing that can possibly happen is that after the 
application reaches its maximum usage, it then frees all but one allocated 
region in each page run.  However, your application is presumably reaching a 
stable number of allocations, then replacing old data with new.  If the total 
number of allocated regions for each size class remains stable in the steady 
state, then your application should suffer very little fragmentation.  However, 
if your application maintains the same total memory usage, but shifts from, 
say, mostly 48-byte regions to mostly 64-byte regions, it can end up with 
highly fragmented runs that contain the few remaining 48-byte allocations.
Given 28 small size classes, it's possible for this to be a terrible 
fragmentation situation, but I have yet to see this happen in a real 
application.
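
The worst case described above can be put into numbers with a small sketch.
All names here are hypothetical, and the model is simplified: it assumes a run
holds run_size / region_size regions and ignores any per-run header, so the
real figures for a given jemalloc version will differ somewhat.

```c
#include <stddef.h>

/* Result of the simplified worst-case model for one small size class. */
struct worst_case {
    size_t runs;        /* runs needed at peak usage */
    size_t live_bytes;  /* bytes still allocated: one region per run */
    size_t held_bytes;  /* bytes pinned by those runs */
};

/* Peak usage is `peak_regions` regions of `region_size` bytes; afterwards
 * the application frees all but one region in every run. */
static struct worst_case worst_case_small(size_t region_size, size_t run_size,
                                          size_t peak_regions)
{
    struct worst_case wc = {0, 0, 0};
    size_t per_run;

    if (region_size == 0 || run_size < region_size)
        return wc;
    per_run = run_size / region_size;
    wc.runs = (peak_regions + per_run - 1) / per_run;  /* round up */
    wc.live_bytes = wc.runs * region_size;
    wc.held_bytes = wc.runs * run_size;
    return wc;
}
```

With 48-byte regions, an assumed 4096-byte run, and a peak of 8500 regions,
the model gives 100 runs: 4800 live bytes keep 409600 bytes from being
released.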

So far, using Salvatore's method and code, I see a difference of about 3%
between RSS and allocated memory when using jemalloc, and about 9% when using
Hoard.
But I expect these values to change, since the processes haven't started
removing outdated records yet.

I also have a control process without jemalloc (i.e. using plain libc malloc())
that uses the same code to compute fragmentation, and it shows a difference of
about 20% (and growing).
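
To be explicit about what the percentages above mean, here is a rough sketch
of this kind of measurement. It is not Salvatore's actual code: the function
names are hypothetical, the RSS is read from /proc/self/statm (so it assumes
Linux), and the allocated byte count is assumed to come from the allocator's
own statistics.

```c
#include <stdio.h>
#include <unistd.h>

/* Resident set size of the current process in bytes, or -1 if it cannot be
 * determined. The second field of /proc/self/statm is the number of
 * resident pages. */
static long long read_rss_bytes(void)
{
    long long total_pages, resident_pages;
    long page_size = sysconf(_SC_PAGESIZE);
    int fields;
    FILE *f = fopen("/proc/self/statm", "r");

    if (f == NULL)
        return -1;
    fields = fscanf(f, "%lld %lld", &total_pages, &resident_pages);
    fclose(f);
    if (fields != 2 || page_size <= 0)
        return -1;
    return resident_pages * (long long)page_size;
}

/* How much larger RSS is than the allocated byte count, in percent. */
static double rss_overhead_percent(long long rss, long long allocated)
{
    if (allocated <= 0)
        return 0.0;
    return 100.0 * (double)(rss - allocated) / (double)allocated;
}
```

Note that RSS also covers code, stacks, and allocator metadata, so this number
is an upper bound on fragmentation rather than an exact measure of it.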


What baffles me most is that stats.allocated keeps returning the same value,
but RSS constantly grows.
Could it be because of the number of threads I use?
Say, I free memory in one thread and try to allocate in another one, but the
second thread doesn't have it cached and has to do the actual allocation?

--
Wbr,
Antony Dovgal
---
http://pinba.org - realtime profiling for PHP
_______________________________________________
jemalloc-discuss mailing list
[email protected]
http://www.canonware.com/mailman/listinfo/jemalloc-discuss
