On 06/08/2014 18:12, Jason Evans wrote:
On Aug 6, 2014, at 8:55 AM, Guillaume Holley <ghol...@cebitec.uni-bielefeld.de> 
wrote:
On 06/08/2014 01:10, Jason Evans wrote:
On Aug 5, 2014, at 10:35 AM, ghol...@cebitec.uni-bielefeld.de wrote:
I’m currently working on a data structure allowing the storage of a
dynamic set of short DNA sequences plus annotations.
A few details: the data structure is written in C, tests are currently
run on 64-bit Ubuntu 14.04, everything is single-threaded, and Valgrind
indicates that the program which manipulates the data structure has no
memory leaks.

I’ve started using Jemalloc in an attempt to reduce memory fragmentation
(by using one arena, disabling the thread caching system and using a high
ratio of dirty pages). On small data sets (30 million insertions), the
results are very good compared to Glibc: about 150 MB less with tuned
Jemalloc.

Now I’ve started tests with much bigger data sets (3 to 10 billion
insertions), and I realized that Jemalloc is using more memory than Glibc.
I generated a data set of 200 million entries which I tried to insert
into the data structure; when the memory used reached 1 GB, I stopped the
program and recorded the number of entries inserted.
When using Jemalloc, no matter the tuning parameters (1 or 4 arenas,
tcache enabled or not, lg_dirty = 3, 8 or 16, lg_chunk = 14, 22 or 30),
the number of entries inserted varies between 120 million and 172 million,
whereas with standard Glibc I’m able to insert 187 million entries.
And on billions of entries, Glibc (I don’t have precise numbers,
unfortunately) uses a few gigabytes less than Jemalloc.

So I would like to know if there is an explanation for this, and whether I
can do something to make Jemalloc at least as memory-efficient as Glibc on
my tests. Maybe I’m not using Jemalloc correctly?
There are a few possible issues, mainly related to fragmentation, but I can't 
make many specific guesses because I don't know what the 
allocation/deallocation patterns are in your application.  It sounds like your 
application just does a bunch of allocation, with very little interspersed 
deallocation, in which case I'm surprised by your results unless you happen to 
be allocating lots of objects that are barely larger than the nearest size 
class boundaries (e.g. 17 bytes).  Have you taken a close look at the output of 
malloc_stats_print()?
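For instance, a minimal sketch (assuming you build and link against
jemalloc, with its header installed as <jemalloc/jemalloc.h>) that dumps
the full report to stderr at the end of a run:

    #include <stdlib.h>
    #include <jemalloc/jemalloc.h>

    int main(void) {
        void *p = malloc(1025);  /* ... exercise the allocator ... */
        free(p);
        /* NULL write callback/opaque pointer => report goes to stderr;
           NULL opts string => print all sections. */
        malloc_stats_print(NULL, NULL, NULL);
        return 0;
    }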

Jason
Hi, and thank you for the help.

My application does, as you said, a lot of allocations with very little
deallocation, but the allocated memory is very often reallocated. However,
my application never allocates more than 600 KB in a single allocation, so
no huge objects are involved.
I had a look at the output of malloc_stats_print() (enclosed at the end of
my answer), and I see that for the bins of size 64 I seem to make a huge
number of allocations/reallocations for a small amount of memory
allocated; maybe this generates a lot of fragmentation, but I don't know
in what proportion. Do you think my problem could be linked to this, and
if so, is there something I can do on the Jemalloc side to solve it?

Here are the Jemalloc statistics for 200 million insertions, with Jemalloc
tuned using the following parameters: "narenas:1,tcache:false,lg_dirty_mult:8"
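For completeness, I pass these options through jemalloc's configuration
string; a minimal sketch of the compile-time variant (jemalloc also honors
the MALLOC_CONF environment variable at startup):

    /* jemalloc reads this global at startup; equivalent to running with
       MALLOC_CONF="narenas:1,tcache:false,lg_dirty_mult:8" set in the
       environment. */
    const char *malloc_conf = "narenas:1,tcache:false,lg_dirty_mult:8";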

___ Begin jemalloc statistics ___
[...]
Allocated: 1196728936, active: 1212567552, mapped: 1287651328
The overall external fragmentation, 1 - (allocated/active), is 1.3%, which
is very low.
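Plugging in the numbers above: 1 - 1196728936/1212567552 ≈ 0.013, i.e.
about 1.3%.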

Current active ceiling: 16416505856
chunks: nchunks   highchunks    curchunks
            307          307          307
huge: nmalloc      ndalloc    allocated
            0            0            0

arenas[0]:
assigned threads: 1
dss allocation precedence: disabled
dirty pages: 296037:66 active:dirty, 25210 sweeps, 90046 madvises, 627169 purged
            allocated      nmalloc      ndalloc    nrequests
small:     1116037736    372536523    370598419    372536523
large:       80691200       223900       220617       223900
total:     1196728936    372760423    370819036    372760423
active:    1212567552
mapped:    1283457024
bins:  bin  size  regs  pgs  allocated  nmalloc  ndalloc  nrequests  nfills  nflushes  newruns  reruns  curruns
[...]
        21  1280    51   16  600357120   641800   172771     641800       0         0     9221  184785     9197
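As a consistency check on that size-1280 bin line: nmalloc - ndalloc =
641800 - 172771 = 469029 live regions, and 469029 * 1280 = 600357120
bytes, which matches the allocated column; those regions sit in 9197 runs
of 51 regions each (469047 slots), so the runs of this bin are essentially
full.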
The only aspect of jemalloc that might be causing you problems is size class 
rounding.  IIRC glibc's malloc spaces its small size classes closer together.  
If your application allocates lots of 1025-byte objects, that could cost you 
nearly 20% in terms of memory usage for those allocations (~120 MB in this 
case).
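You can observe the rounding directly; a minimal sketch, assuming
malloc_usable_size() is available (glibc declares it in <malloc.h>, and
jemalloc exports it as well):

    #include <stdio.h>
    #include <stdlib.h>
    #include <malloc.h>  /* malloc_usable_size() */

    int main(void) {
        void *p = malloc(1025);
        /* Under jemalloc 3.x this reports 1280 (the next size class up,
           ~20% internal waste); under glibc it is much closer to 1025. */
        printf("usable size: %zu\n", malloc_usable_size(p));
        free(p);
        return 0;
    }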

Jason
I will try to see if I can solve that, thank you for the help !

Guillaume
_______________________________________________
jemalloc-discuss mailing list
jemalloc-discuss@canonware.com
http://www.canonware.com/mailman/listinfo/jemalloc-discuss
