I didn't know about Jemalloc, but I just downloaded it and ran it with my sim code. This is just a single data point, but in this run Jemalloc was slower than Vmalloc and used noticeably more memory.
The parameters of the run were:

    -a10000000: 10,000,000 allocations.
    -t18: eighteen threads.
    -z10 and -Z1000: block sizes are random in the range [10,1000].
    -l100: each allocated block lives for a number of steps randomly
        picked from [1,100], then may be realloced or freed.
    -e0.01: at each allocation step, there is a 1% chance that an
        ephemeral thread is created and run to allocate/free a small
        number of objects.

The numbers to look at below are "elapse time" and "efficiency". Efficiency is defined as the maximum amount of busy memory at any given time (ie, "usage") divided by the total memory gotten from the OS for allocation (ie, "process"), so 1.0 would mean no overhead at all. The "self" line reports user+sys times and numbers of context switches of the main process.

Vmalloc allows shared memory regions to be allocated concurrently from different processes. If we had done that, there would be more reporting data for subprocesses. But here we only test a single process with multiple threads.

---------
te:g:.../Vmalloc_t$ runsafe -a10000000 -t18 -z10 -Z1000 -l100 -e0.01
t.malloc-vm
    elapse time=6.0339s
    memory[process=390082560, usage=320892840, efficiency=0.82]
    self: time=20.5539s[user=10.3794s, sys=10.1745s], csw=499145[voluntary=494822, forced=4323]
t.malloc-je
    elapse time=9.0062s
    memory[process=562839552, usage=320892840, efficiency=0.57]
    self: time=33.5089s[user=16.7435s, sys=16.7655s], csw=789582[voluntary=783360, forced=6222]
--------

Vmalloc was faster than Jemalloc (about 6.0s versus 9.0s elapsed), but in a real application that difference might be minimal. I guess the alarming part in this test run was that the efficiency of Jemalloc was just 57%, meaning that it was taking roughly 1.75 times the amount of system memory to manage the maximum busy memory that the process needed. Vmalloc's efficiency was 82%. The more threads there are, the more fragmentation will occur, but a good malloc needs to manage this, or else things can turn out badly for large and long-running applications. Jemalloc may need more work there.
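As a quick check on the memory figures, here is a minimal Python sketch that recomputes them from the "process" and "usage" numbers printed in the run. It takes efficiency to be usage divided by process, which is the reading that reproduces the reported 0.82 and 0.57. The dictionary layout and the function names are made up for illustration; they are not part of runsafe or Vmalloc.

```python
# Figures copied from the runsafe output in this message.
# The structure and names below are illustrative assumptions only.
RUNS = {
    "t.malloc-vm": {"process": 390082560, "usage": 320892840, "elapsed": 6.0339},
    "t.malloc-je": {"process": 562839552, "usage": 320892840, "elapsed": 9.0062},
}


def efficiency(process: int, usage: int) -> float:
    """Peak busy memory divided by memory obtained from the OS."""
    return usage / process


def overhead(process: int, usage: int) -> float:
    """How many bytes of OS memory were held per byte of peak busy memory."""
    return process / usage


if __name__ == "__main__":
    for name, run in RUNS.items():
        eff = efficiency(run["process"], run["usage"])
        ovh = overhead(run["process"], run["usage"])
        print(f"{name}: efficiency={eff:.2f} overhead={ovh:.2f}x "
              f"elapsed={run['elapsed']:.4f}s")
```

The overhead figure is just the reciprocal of efficiency, but it is sometimes easier to read: it says directly how much system memory was held for each byte the program had live at its peak.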
Do you know if the test suite for Jemalloc is available publicly?
What was the meaning of the table of numbers in your email?

Phong

> From olga.kryzhanov...@gmail.com Wed Aug 15 16:39:25 2012
> Subject: Fwd: jemalloc
> To: ast-developers@research.att.com, Phong Vo <k...@research.att.com>
>
> Phong, how does the new AST vmalloc compare to jemalloc?
>
> Olga
>
> ---------- Forwarded message ----------
> From: Nicholas Clark <n...@ccl4.org>
> Date: Wed, Aug 15, 2012 at 9:27 PM
> Subject: jemalloc
> To: perl5-port...@perl.org
>
> Artur and Tim Bunce suggested investigating jemalloc, which is a high
> performance malloc implementation now used by (among others) FreeBSD and
> Facebook. Artur also suggested that our use of arenas of memory (for SV bodies)
> is no longer the best idea, given that malloc() implementations have got
> better. Fortunately arenas are easy to disable, by compiling with -DPURIFY.
>
> So here is a comparison of blead (on dromedary, -Os, no threads), default,
> compiled with -DPURIFY, default using an LD_PRELOAD to force the use of
> jemalloc 3.0.0, and finally compiled with -DPURIFY and using jemalloc.
>
> Not having anything fantastically better to hand, this is perlbench, with
> each of the 4 run twice.
> IIRC smaller numbers are better, and anything less than 5% is probably noise:
>
>                         A    B    C    D    E    F    G    H
>                       ---  ---  ---  ---  ---  ---  ---  ---
> arith/mixed           100  101  101   98  102   98  101  101
> arith/trig            100  101  101   99  100   98   99  100
> array/copy            100  101   95  101  101  100  102  100
> array/foreach         100   79  102   76  101   76  101   79
> array/index           100  112  101  105  100  112  101  110
> array/pop             100  103  100  100  102  102  102  102
> array/shift           100  101   97   98  100  100  101  100
> array/sort-num        100  103  100  103  100  103  100  102
> array/sort            100   87   98   84  100   84   97   87
> call/0arg             100  111  100  104  107  104  102  108
> call/1arg             100   99  103   96  106   96  103   97
> call/2arg             100  105   97   99   96  100   95  103
> call/9arg             100  103   98  102  101   94   99  103
> call/empty            100  102   99  102   99   97   96  103
> call/fib              100  100  100   97   97  100  101  101
> call/method           100  106  101  102   97  104  100  105
> call/wantarray        100  109   98  101  100  102   98  110
> hash/copy             100   85  102   81  101   78  104   88
> hash/each             100   94  102   88   85   88  102   93
> hash/foreach-sort     100   97   99   97  100   94  101   96
> hash/foreach          100   96   98   95  103   93  102   94
> hash/get              100  101   98  101  100  102  101  102
> hash/set              100   96  102  102  100  101  102   91
> loop/for-c            100  106  111  105  101  106  109  106
> loop/for-range-const  100   99   99   97   96   97   94   98
> loop/for-range        100  100  101   92   99   98   99   99
> loop/getline          100  104   98  104  100  103  100  104
> loop/while-my         100  103  101   99  100  101   99   99
> loop/while            100   71  100   96   96   98  101   99
> re/const              100   99   99   99  100   97   99   99
> re/w                  100   99  100  101   98  100  101   97
> startup/fewmod        100   98   99   97  100   96   98   98
> startup/lotsofsub     100   98  100   98  100   98  100   98
> startup/noprog        100  101   79   79  100   79   79  100
> string/base64         100  100   99   99  100  100   98   99
> string/htmlparser     100   98  108  105  100  105  107   98
> string/index-const    100  100   98  100  100  101   99  101
> string/index-var      100  100   98   99  100  100  100   99
> string/ipol           100  108  107  107  108  106  108  106
> string/tr             100  101  100  101   99  101  101  102
>
> AVERAGE               100   99  100   98  100   98  100   99
>
> ed2b02642a84b031      A A
> +PURIFY               B B
> +jemalloc             C C
> +PURIFY +jemalloc     D D
>
> It's not much, so I'm not sure if it's noise or "signal". If it's signal,
> it's suggesting that glibc malloc is fractionally better than using arenas,
> and jemalloc fractionally better still. But not much. (And that with arenas,
> malloc doesn't seem to matter)
>
> Would anyone like to pursue this further?
>
> jemalloc is BSD licensed, actively maintained and likely to improve, so
> potentially we could ship it as a replacement for the current malloc.c
>
> However, I'm not sure how easy it would be to integrate. We're not in a
> position to enforce the use of LD_PRELOAD to swap out the libc malloc, so
> just like the current malloc.c we'd have to do a bit more to rename the
> symbols, and to play nicely with the system malloc, particularly if both
> use sbrk().
>
> Nicholas Clark
>
> --
>       ,   _                                    _   ,
>      { \/`o;====-    Olga Kryzhanovska   -====;o`\/ }
> .----'-/`-/     olga.kryzhanov...@gmail.com   \-`\-'----.
>  `'-..-| /       http://twitter.com/fleyta     \ |-..-'`
>       /\/\     Solaris/BSD//C/C++ programmer   /\/\
>       `--`                                      `--`
_______________________________________________
ast-developers mailing list
ast-developers@research.att.com
https://mailman.research.att.com/mailman/listinfo/ast-developers