On 17 May 2014, at 00:33, Roland Haas <[email protected]> wrote:

> Hello Ian,
> 
> > Does anyone else see problems like this?
> I think Jim Healy saw growth in memory quite a while back.
> 
> If you suspect that the memory consumption is due to C++ code (i.e.
> Carpet or other C++ thorns), then you can try the attached memory
> tracer [there is a main() function at the end to show how to use it]
> to tag all calls to new/delete in the code. For Carpet I suspect that
> you may also want to change the malloc() calls in mem<T>'s constructor
> (in CarpetLib/src/mem.cc) and in mempool (same file) to e.g.
> new char[blah] or something similar so that they are also tracked.
> 
> The tracking is not thread safe at this point; you could likely add a
> pthread mutex though if you need that (I'd just add them around the
> new/delete implementations if I can get away with it).
> 
> Tracking is done by calling MemTagger::TagStack::PushTag("foo") which
> will tag all memory allocations until the next PushTag as coming from
> "foo". You can use PopTag("foo") to remove the current tag and return
> to the previous one (same as the hierarchical timers). A way to tag a
> fraction of the Cactus code would be to have the timers push and pop
> the tags.
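
A minimal sketch of the push/pop tagging idea described above. This is not
the attached MemTagger code: the namespace, the class layout and the
RecordAlloc/LiveBytes helpers are hypothetical, and only the PushTag/PopTag
names are taken from the description. It is not thread safe, and it does not
replace operator new; a real tracer would have to call something like
RecordAlloc from its new/delete replacements without allocating there itself.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

namespace sketch {  // hypothetical, for illustration only

class TagStack {
 public:
  // Allocations recorded from now on are attributed to 'tag'.
  void PushTag(const std::string& tag) { tags_.push_back(tag); }

  // Return to the previous tag. Fails (returns false) if 'tag' is not the
  // innermost tag, which catches unbalanced push/pop pairs.
  bool PopTag(const std::string& tag) {
    if (tags_.empty() || tags_.back() != tag) return false;
    tags_.pop_back();
    return true;
  }

  // Attribute 'bytes' to whichever tag is currently innermost.
  void RecordAlloc(std::size_t bytes) {
    const std::string key = tags_.empty() ? "<untagged>" : tags_.back();
    bytes_[key] += bytes;
  }

  // Total bytes recorded under 'tag' so far (0 if the tag was never used).
  std::size_t LiveBytes(const std::string& tag) const {
    std::map<std::string, std::size_t>::const_iterator it = bytes_.find(tag);
    return it == bytes_.end() ? 0 : it->second;
  }

 private:
  std::vector<std::string> tags_;
  std::map<std::string, std::size_t> bytes_;
};

}  // namespace sketch
```

Having the hierarchical timers call PushTag on start and PopTag on stop would
then attribute each allocation to the innermost running timer.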


Thanks for the pointer.  I have some new information.  I had been setting the
parameter CarpetLib::max_memory_size_MB to the memory available to the
process.  This sets the process's maximum address space limit, which, now that
I think about it, is probably the wrong thing to do.  The error I was getting
was caused by this limit being exceeded.  When I removed this limit, the
process maxrss grew to a maximum of 12 GB and then stayed there.  The
simulation continued to run at the same speed, and the swap usage did not
increase.
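
A minimal sketch of what such a limit looks like at the POSIX level, assuming
the parameter ends up as an RLIMIT_AS limit (I have not checked this against
the CarpetLib source).  The function names are made up for illustration.  The
point is that RLIMIT_AS caps the virtual address space, not the resident
memory, so allocations start failing once the mapped address range reaches
the limit, regardless of how much of it is actually in RAM.

```cpp
#include <sys/resource.h>  // getrlimit, setrlimit, RLIMIT_AS

// Current soft limit on the process's virtual address space, in bytes
// (RLIM_INFINITY if unlimited or if the query fails).
rlim_t address_space_limit() {
  struct rlimit lim;
  if (getrlimit(RLIMIT_AS, &lim) != 0) return RLIM_INFINITY;
  return lim.rlim_cur;
}

// Set the soft address-space limit, leaving the hard limit unchanged.
// Returns true on success.
bool set_address_space_limit(rlim_t bytes) {
  struct rlimit lim;
  if (getrlimit(RLIMIT_AS, &lim) != 0) return false;
  lim.rlim_cur = bytes;
  return setrlimit(RLIMIT_AS, &lim) == 0;
}
```

With such a limit in place, malloc returns NULL (and new throws
std::bad_alloc) as soon as the heap or an mmap region cannot be extended,
even if a large part of the existing heap is free.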

So for the moment, it looks like maxrss is not a reliable indicator of when the 
process is going to "run out of memory".  As I understand it, "maxrss" is the 
amount of memory the kernel has allocated to the process and which is currently 
held in RAM, as opposed to having been swapped to disk.  This would include the 
process heap as well as any mmapped data.  It's possible that malloc is 
constantly growing the heap even though blocks have been freed, and only when 
it actually can't grow the heap any more does it try to look for and coalesce 
free blocks (a possibly-slow operation).  With a lot of regridding, Carpet 
might be allocating and deallocating quite a lot, so there might be a large 
amount of freed memory in the malloc heap.
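
For reference, a small sketch of how the peak resident set size can be read
directly, assuming Linux (where ru_maxrss is reported in kilobytes); the
function name is made up.  Note that the value returned by getrusage is a
high-water mark, so it never decreases, even after memory has been freed or
returned to the kernel.

```cpp
#include <sys/resource.h>  // getrusage, struct rusage, RUSAGE_SELF

// Peak resident set size of the calling process so far.
// On Linux the unit is kilobytes.  Returns -1 if getrusage fails.
long peak_rss_kb() {
  struct rusage usage;
  if (getrusage(RUSAGE_SELF, &usage) != 0) return -1;
  return usage.ru_maxrss;
}
```

Logging this value together with the address space limit at each regridding
would show whether the two actually approach each other.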

SystemStatistics outputs the results of the mallinfo call, but I have learned
recently that this is not properly supported on 64-bit systems (the numbers of
bytes are stored in 32-bit integers, which is not enough), and the glibc
developers have no interest in changing this, as they say it doesn't reflect
the internal workings of a modern malloc anyway.  There is a call
(malloc_stats) to print some malloc information to stderr, so I might try
that.  Nonetheless, accounting for truncation to 32 bits, the amount of
"freed" memory reported by mallinfo is fluctuating quite a lot during the
simulation, lending support to the hypothesis that malloc is extending the
heap rather than reusing freed memory.  For large enough blocks, I would
expect it to use mmap, which should avoid this fragmentation issue, but I
don't know if that is happening.
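
A sketch of the glibc calls involved, assuming Linux with glibc; the wrapper
names are made up.  unwrap_32bit reinterprets a wrapped (possibly negative)
int field from mallinfo as an unsigned 32-bit count, which recovers values up
to 4 GiB but cannot recover anything larger.  set_mmap_threshold uses
mallopt(M_MMAP_THRESHOLD, ...) to fix the size above which malloc serves
requests with mmap instead of growing the heap.

```cpp
#include <malloc.h>  // glibc-specific: malloc_stats, mallopt, M_MMAP_THRESHOLD
#include <cstddef>
#include <cstdint>

// Reinterpret a 32-bit int field (as found in struct mallinfo) as an
// unsigned count.  Correct only while the true value is below 4 GiB.
std::uint64_t unwrap_32bit(int field) {
  return static_cast<std::uint32_t>(field);
}

// Print glibc's own per-arena allocation summary to stderr.
void print_malloc_stats() { malloc_stats(); }

// Ask malloc to use mmap for requests of at least 'bytes' bytes.
// mallopt returns 1 on success and 0 on error.
bool set_mmap_threshold(std::size_t bytes) {
  return mallopt(M_MMAP_THRESHOLD, static_cast<int>(bytes)) == 1;
}
```

Blocks obtained via mmap are returned to the kernel when freed, so lowering
the threshold is one way to test whether heap fragmentation is responsible
for the growth.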

-- 
Ian Hinder
http://numrel.aei.mpg.de/people/hinder
