On Aug 24, 2007, at 11:05 PM, Josh Aune wrote:

Hmm.  If you compile Open MPI with no memory manager, then it
*shouldn't* be Open MPI's fault (unless there's a leak in the mvapi
BTL...?).  Verify that you did not actually compile Open MPI with a
memory manager by running "ompi_info | grep ptmalloc2" -- it should
come up empty.

I am sure.  I have multiple builds that I switch between.  One of the
apps doesn't work unless I configure with --without-memory-manager (see
my post to -users about realloc(), with sample code).

Ok.

I noticed that there are a few ./configure --debug type switches, even
some dealing with memory.  Could those be useful for gathering further
data?  What features do those provide and how do I use them?

If you use --enable-mem-debug, it forces all internal calls to malloc(),
free(), and calloc() to go through our own internal functions, but those
mainly just check that we don't pass bad parameters such as NULL, etc.
I suppose you could put in some memory profiling or something, but that
would probably get pretty sticky. :-(
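
To make the idea of checked allocation wrappers concrete, here is a minimal sketch in C. It is not Open MPI's actual internal code: the names dbg_malloc, dbg_free, DBG_MALLOC, DBG_FREE, dbg_live_allocs, and dbg_bad_calls are all hypothetical, invented for illustration. The live-allocation counter is a guess at the simplest form the "memory profiling" mentioned above might take.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical debug wrappers; NOT Open MPI's real internal functions. */
long dbg_live_allocs = 0;  /* successful allocations not yet freed */
long dbg_bad_calls = 0;    /* calls made with suspicious parameters */

/* malloc() wrapper that reports zero-byte requests and counts live blocks. */
void *dbg_malloc(size_t size, const char *file, int line)
{
    if (size == 0) {
        fprintf(stderr, "%s:%d: malloc() of 0 bytes\n", file, line);
        dbg_bad_calls++;
    }
    void *p = malloc(size);
    if (p != NULL) {
        dbg_live_allocs++;
    }
    return p;
}

/* free() wrapper that reports NULL pointers instead of silently accepting them. */
void dbg_free(void *p, const char *file, int line)
{
    if (p == NULL) {
        fprintf(stderr, "%s:%d: free() of NULL pointer\n", file, line);
        dbg_bad_calls++;
        return;
    }
    dbg_live_allocs--;
    free(p);
}

/* Convenience macros that record the call site automatically. */
#define DBG_MALLOC(n) dbg_malloc((n), __FILE__, __LINE__)
#define DBG_FREE(p)   dbg_free((p), __FILE__, __LINE__)
```

With wrappers like these, a value of dbg_live_allocs that keeps growing over a long run would suggest a leak in the code that uses them. It would not catch memory allocated elsewhere, for example inside the network stack.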

The fact that you can run this under TCP without memory leaking would
seem to indicate that it's not the app that's leaking memory, but
rather either the MPI or the network stack.

I should clarify here: this is effectively true.  The app crashes from
a segfault after running over tcp for several hours, but it gets much
farther into the run than the vapi btl does.

Yuck. :-(  I assume there's no easy way to track this down -- do you get
a corefile?  Can you see where the app died -- are there any obvious
array indexes going out of bounds, etc.?  Did it die in MPI or in the
application?
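
If no corefile shows up, one possible cause is that the core size limit is set to zero. The following is a minimal sketch, assuming a POSIX system, of raising the soft limit from inside the application. The function name enable_core_dumps is hypothetical; getrlimit(), setrlimit(), and RLIMIT_CORE are standard POSIX.

```c
#include <sys/resource.h>

/* Hypothetical helper: raise the soft core-file size limit to the hard
 * limit so that a segfault can leave a corefile behind.
 * Returns 0 on success, -1 on failure. */
int enable_core_dumps(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0) {
        return -1;
    }
    /* An unprivileged process may raise its soft limit up to the hard limit. */
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_CORE, &rl);
}
```

Calling this early in main(), before MPI_Init(), would apply it to each process that runs the application. If the hard limit itself is zero this cannot help, and where the corefile lands depends on the system configuration. Once a corefile exists, loading it into a debugger (for example "gdb ./app core", then "bt", where ./app and core are placeholder names) should show whether the crash was in MPI or in the application.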

--
Jeff Squyres
Cisco Systems
