On 01/08/2012 09:00 PM, Richard W.M. Jones wrote:
On Sun, Jan 08, 2012 at 06:45:05PM +0000, Richard W.M. Jones wrote:
And that brings us to (c): does it even make sense to give back memory
to the OS?
BTW you can try calling malloc_stats(), it should print statistics
on how much total memory malloc() is using, and how much of that is
reclaimable/unreclaimable free memory.
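
A minimal sketch of that check, assuming Linux with glibc (malloc_stats() is a glibc extension declared in <malloc.h> and prints to stderr). The helper name demo_malloc_stats and the allocation pattern are made up for illustration:

```c
#include <malloc.h>   /* glibc-specific: malloc_stats() */
#include <stdlib.h>

/* Allocate `count` blocks of `size` bytes, free all but the last one,
   then print glibc's allocator statistics to stderr.
   Returns 0 on success, -1 on bad input or allocation failure. */
int demo_malloc_stats(size_t count, size_t size)
{
    if (count == 0)
        return -1;

    void **blocks = malloc(count * sizeof *blocks);
    if (blocks == NULL)
        return -1;

    for (size_t i = 0; i < count; i++) {
        blocks[i] = malloc(size);
        if (blocks[i] == NULL) {
            while (i > 0)
                free(blocks[--i]);
            free(blocks);
            return -1;
        }
    }

    /* Free everything except the last block. For small sizes the last
       block typically sits near the top of the heap, so the freed space
       below it stays inside the process. */
    for (size_t i = 0; i + 1 < count; i++)
        free(blocks[i]);

    malloc_stats();   /* compare "system bytes" with "in use bytes" */

    free(blocks[count - 1]);
    free(blocks);
    return 0;
}
```

Calling demo_malloc_stats(1000, 1024) from main() should show "system bytes" noticeably larger than "in use bytes"; the exact numbers depend on the glibc version.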
Sometimes you may have this situation (fragmented memory):
| malloced bytes | .... large range of free bytes ... | malloced bytes |
AFAIK glibc is not able to give back the middle portion to the OS; you'll
have to use your own memory pool allocator that can munmap() the middle bit
when no longer needed.
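
To illustrate what "munmap() the middle bit" means, here is a rough sketch, assuming Linux and an anonymous private mapping that the program manages itself. The function release_middle and its three-part layout are hypothetical; this is not a real pool allocator:

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map three equally sized parts, keep data in the first and last part,
   and return the middle part to the OS with munmap().
   Returns 0 on success, -1 on failure. */
int release_middle(size_t pages_per_part)
{
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0 || pages_per_part == 0)
        return -1;

    size_t part = pages_per_part * (size_t)ps;
    char *base = mmap(NULL, 3 * part, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return -1;

    /* "malloced bytes" at both ends of the range */
    base[0] = 1;
    base[3 * part - 1] = 1;

    /* Give the free middle back; munmap() works on any page-aligned
       sub-range of a mapping. */
    if (munmap(base + part, part) != 0) {
        munmap(base, 3 * part);
        return -1;
    }

    /* Both ends are still mapped and keep their contents. */
    int ok = (base[0] == 1 && base[3 * part - 1] == 1);

    munmap(base, part);
    munmap(base + 2 * part, part);
    return ok ? 0 : -1;
}
```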
A quick way to see if the increased mem usage you see in top is due to malloc()
is to temporarily switch to a different malloc implementation. You can try
linking with jemalloc or tcmalloc.
I forgot to mention one way in which this is more efficient: If you
munmap a piece of memory and later decide you need more memory so you
call mmap, then the kernel has to give you zeroed memory. You
probably didn't want zeroed memory, but you pay the penalty anyway.
Also mmap() and munmap() are quite expensive in threaded apps because they
have to take a process-wide lock in the kernel, and that lock also used
to be held during page-fault I/O. I think that's why glibc "caches"
the mmap arenas. This doesn't really matter for OCaml though, as it already has
a process-wide lock for OCaml threads.
(The converse of this is that if your unused memory is swapped out,
then it has to be written to disk and read back, which is even less
efficient.)
There is an madvise flag "MADV_DONTNEED" which is better than munmap +
mmap, although not as optimal as it could be. See links below.
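
A small sketch of the MADV_DONTNEED approach, assuming Linux semantics for private anonymous mappings (the pages are dropped and read back as zeroes on the next access). The function name discard_pages and the 64-page size are arbitrary choices for the example:

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Dirty an anonymous mapping, then tell the kernel the contents are no
   longer needed. The address range stays mapped.
   Returns 0 if the pages read back as zero afterwards, 1 if the old
   contents are still visible, -1 on error. */
int discard_pages(void)
{
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0)
        return -1;

    size_t len = 64 * (size_t)ps;
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xAA, len);             /* touch every page */

    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    /* On Linux the next access faults in fresh zero-filled pages. */
    int zeroed = (p[0] == 0 && p[len - 1] == 0);

    munmap(p, len);
    return zeroed ? 0 : 1;
}
```

Note that the kernel still has to zero the pages when they are touched again, which is part of why this is not as optimal as it could be.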
You can also try to map fresh anonymous memory over the already mapped
area, which saves a munmap call.
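
Roughly, that looks like the sketch below. It assumes Linux and relies on MAP_FIXED replacing whatever is already mapped at the given address. The function name remap_fresh and the sizes are only illustrative:

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Replace a dirty anonymous mapping with a fresh one at the same
   address using a single mmap(MAP_FIXED) call, with no munmap() first.
   Returns 0 on success, -1 on failure. */
int remap_fresh(void)
{
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0)
        return -1;

    size_t len = 64 * (size_t)ps;
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0x55, len);             /* old contents */

    /* MAP_FIXED discards the old pages and installs a new mapping over
       the same range. */
    void *q = mmap(p, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (q == MAP_FAILED) {
        munmap(p, len);
        return -1;
    }

    /* Same address, and the new pages are zero-filled. */
    int ok = (q == (void *)p && p[0] == 0 && p[len - 1] == 0);

    munmap(p, len);
    return ok ? 0 : -1;
}
```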
http://gcc.gnu.org/ml/gcc-patches/2011-10/msg00733.html
http://www.reddit.com/r/programming/comments/dp5up/implementations_for_many_highlevel_programming/c120n77
Probably the OCaml GC should be setting madvise hints anyway.
It should mmap()/munmap() instead of malloc/realloc/free in that case, right?
Which probably wouldn't be a bad idea, as you don't get the fragmentation issues
as much as you do with malloc.
While we're at it, the GC may be able to cooperate better with the
new(-ish) Transparent Hugepages feature of Linux.
Does it suffice to allocate the major heap in 2MB increments to take advantage
of that?
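
For reference, a sketch of what a 2MB-aligned, 2MB-granular allocation could look like. It does not answer whether this is sufficient for the OCaml GC. It assumes Linux on a platform where the huge page size is 2MB (e.g. x86-64). The name alloc_2mb_aligned is hypothetical, and the MADV_HUGEPAGE hint is only a request that the kernel may ignore or reject:

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define HUGE_2MB ((size_t)2 << 20)

/* Reserve `chunks` * 2MB of anonymous memory aligned to a 2MB boundary
   by over-allocating and trimming the unaligned head and tail.
   Returns NULL on failure. Free with munmap(ptr, chunks * HUGE_2MB).
   Overflow of chunks * HUGE_2MB is not checked in this sketch. */
void *alloc_2mb_aligned(size_t chunks)
{
    if (chunks == 0)
        return NULL;

    size_t len = chunks * HUGE_2MB;
    size_t padded = len + HUGE_2MB;

    char *raw = mmap(NULL, padded, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED)
        return NULL;

    uintptr_t addr = (uintptr_t)raw;
    uintptr_t aligned = (addr + HUGE_2MB - 1) & ~(uintptr_t)(HUGE_2MB - 1);
    size_t head = (size_t)(aligned - addr);
    size_t tail = padded - head - len;

    if (head != 0)
        munmap(raw, head);
    if (tail != 0)
        munmap((char *)aligned + len, tail);

#ifdef MADV_HUGEPAGE
    /* Best-effort hint; fails harmlessly if THP is not available. */
    (void)madvise((void *)aligned, len, MADV_HUGEPAGE);
#endif

    return (void *)aligned;
}
```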
Best regards,
--Edwin
--
Caml-list mailing list. Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs