Hi Ian --

Following up on this after a bit of a delay ...

I don't think you're actually leaking that memory in the traditional
sense, though it does sound like you're not able to (re)use it.

The "top" output you report and the Chapel memory leakage information
Brad reports are both right.  They disagree because they're talking
about different things.  "top" shows memory consumption from the OS's
point of view: what memory the OS has handed out to user-level code,
which the latter has not yet returned.  But user-level memory allocators
typically do not ever give memory back to the OS once they've acquired
it, even when their user-program clients free it.  Instead they hang on
to it with the expectation that they will be able to re-use it later,
for other allocations.  All the user-level allocators I'm familiar with
behave this way: glibc, dlmalloc, tcmalloc.  Basically this is the same
model that Chapel's fifo tasking layer implementation uses for pthreads:
it's costly enough to get pthreads from the kernel that once we've got
them, when we don't need them any more we'd rather cache them for future
use (spinning them or hanging them on a cond var) than give them back to
the OS.
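
To make that concrete, here's a tiny sketch in 1.12-era Chapel (the
sizes and the loop are made up, not taken from your benchmarks).  After
the first iteration, the allocator should be able to satisfy each new
allocation out of the memory it cached when the previous one was freed,
so "top" should show a roughly flat footprint even though the program
keeps allocating:

  config const nIters = 1000;

  for i in 1..nIters {
    var A: [1..1000000] real;  // ~8 MB; the first iteration grows the
                               // heap by asking the OS for memory
    A[1] = i;                  // touch the array so it's actually used
  }                            // A is freed here at the Chapel level, but
                               // the allocator keeps the pages and should
                               // reuse them on the next iteration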

The glibc allocator does support a mode in which you can encourage it
(maybe even explicitly tell it?) to return memory to the OS, but even in
that case it reserves the right to refuse to do so.
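
If you want to experiment with that explicit route, and if your Chapel
is built to allocate through the C library (e.g. CHPL_MEM=cstdlib), you
could try calling glibc's malloc_trim() via an extern declaration.  This
is an untested sketch on my part, and even then glibc may decline to
release anything:

  extern proc malloc_trim(pad: uint): int(32);

  // ... after a phase that has freed a lot of Chapel-level memory ...
  if malloc_trim(0) == 1 then
    writeln("glibc returned some memory to the OS");
  else
    writeln("glibc kept its heap as-is");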

However, the Chapel leakage report is still correct: the vast majority
of what Chapel has allocated it has also freed.  But that memory is
being held by the underlying user-level allocator instead of being
given back to the OS.  It should be available to satisfy future
allocation requests, but based on how you describe the program's
behavior, apparently it is not.
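
One way to watch this from inside the program is the Memory module Brad
points to below.  A rough sketch (the report() helper is just something
I made up); run the program with memory tracking enabled, e.g.
"./a.out --memTrack":

  use Memory;

  proc report(tag: string) {
    // memoryUsed() returns the number of bytes the Chapel program
    // currently has allocated, as tracked by Chapel itself
    writeln(tag, ": Chapel-level bytes in use = ", memoryUsed());
  }

  report("before the allocate/delete phase");
  // ... create and destroy lots of matrices here ...
  report("after the allocate/delete phase");

If that number drops back down while the resident size in "top" stays
large, the memory is sitting in the allocator's free lists rather than
being leaked by your code.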

What could be happening is that the memory is ending up fragmented in
the allocator(s), so that it's hard or impossible to reuse.  This could
explain the steady increase in the amount of memory in the "top" output.
Diagnosing this could be tricky.  But one symptom would be profile
information that showed a lot of time being spent in the allocator code,
since allocators suffering from arena fragmentation often end up
spending a lot of time, for each allocation, trying (and failing) to
find memory to reuse, and then acquiring more from the OS.
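
Short of a full profile, even coarse timing can hint at it.  A
hypothetical sketch (Timer comes from the standard Time module; the
inner loop is just a stand-in for one slice of your real matrix
workload):

  use Time;

  config const numPhases = 20,
               allocsPerPhase = 100000;

  class Node { var x: int; }

  for phase in 1..numPhases {
    var t: Timer;
    t.start();
    for i in 1..allocsPerPhase {
      var n = new Node();    // stand-in for creating one matrix object
      delete n;
    }
    t.stop();
    writeln("phase ", phase, ": ", t.elapsed(), " seconds");
  }

If every phase does the same amount of work but the later ones take
noticeably longer, that extra time is probably going into the allocator
searching for (and failing to find) reusable memory.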

You don't mention actual out-of-memory failures.  Are the programs just
getting steadily larger and slower, or do you eventually see them fail
with out-of-memory errors?

thanks,
greg


On Tue, 20 Oct 2015, Brad Chamberlain wrote:

>
> Hi Ian --
>
> First some general background statements:
>
> Yes, any leaks due to arrays and domains (or really, anything other than
> class objects which a user has allocated with 'new') suggest problems on
> the Chapel implementation's end.  Some of the most egregious cases that
> result in leaks today are (a) strings, (b) distributed arrays and domains
> -- these cases (and others) are listed in the $CHPL_HOME/STATUS file.
>
> We've currently got roughly 1/4 - 1/3 of our team working on a concerted
> effort to correct the memory semantics for automatically-managed value
> types (particularly records, strings, domains, arrays) and to plug the
> memory leaks stemming from these types; so I would expect this behavior to
> get better for version 1.13 of the compiler.  The sad fact of the matter
> is that most of the benchmarks we've studied most closely create a small
> number of arrays at the beginning of time (rather than creating and
> destroying domains and arrays), and that this has allowed us to ignore
> cases like "distributed arrays are leaked" for far longer than we'd like.
> But as I say, it's a very active area of development now.
>
> Turning to your specific test cases:
>
> I'm curious what mechanism you're using to measure memory usage/leaks. For
> example, when I use Chapel's built-in mechanisms for measuring memory
> usage and leakage
> (http://chapel.cray.com/docs/1.12/modules/standard/Memory.html), I'm
> getting a very modest number of leaks reported at program exit for the two
> tests you sent (this is using the master branch, though I'd expect the
> result to be similar for 1.12.0):
>
> Here's the output for classes.chpl:
>
> Compiler Command : chpl /users/bradc/tmp/classes.chpl
> Execution Command: ./a.out --memTrack --memLeaksLog=leaks.out
>
> =================
> Memory Statistics
> ==============================================================
> Current Allocated Memory               280
> Maximum Simultaneous Allocated Memory  24001779
> Total Allocated Memory                 24002488
> Total Freed Memory                     24002208
> ==============================================================
>
> ====================
> Leaked Memory Report
> ==============================================================
> Number of leaked allocations
>            Total leaked memory (bytes)
>                       Description of allocation
> ==============================================================
> 7          232        string copy data
> 3          48         io buffer or bytes
> ==============================================================
>
>
>
> And for domains.chpl:
>
> =================
> Memory Statistics
> ==============================================================
> Current Allocated Memory               180
> Maximum Simultaneous Allocated Memory  104002012
> Total Allocated Memory                 200002412
> Total Freed Memory                     200002232
> ==============================================================
>
> ====================
> Leaked Memory Report
> ==============================================================
> Number of leaked allocations
>            Total leaked memory (bytes)
>                       Description of allocation
> ==============================================================
> 3          132        string copy data
> 3          48         io buffer or bytes
> ==============================================================
>
>
>
> There could obviously be a bug or hole in our memory tracking system that
> we're not aware of, or we may be measuring slightly different things --
> which is why I'm curious what you're doing to detect leaks.
>
> Thanks,
> -Brad
>
>
> On Mon, 19 Oct 2015, Ian Bertolacci wrote:
>
>> All,
>> I seem to have hit a barrier when it comes to using Chapel (at least
>> 1.11.0, which we have installed at CSU).
>> The app I'm currently working on is very matrix-heavy, with the matrices
>> backed by arrays and regular domains.
>> There are many millions (maybe even billions, it is a ray tracer after all)
>> of matrix and matrix-like objects being created and destroyed.
>>
>> However, I seem to be encountering a massive memory leak.
>>
>> The problem seems to be twofold:
>>
>>   1. When deleted, only an incredibly small portion of memory is actually
>>   freed.
>>   I'm seeing about 15% of the used memory being recovered in a small
>>   benchmark where the classes only contain a single int field.
>>   2. The domains never get freed.
>>   When classes are deleted, I set my domains to an empty domain (e.g. {1..0}),
>>   which clears out the array; however, the domains for those arrays still
>>   remain, since they cannot be deleted ("delete not allowed on records").
>>   Here I see about 5% of the used memory being recovered.
>>
>> (I observed this both with and without --fast)
>>
>> This leads to instances where the program can leak memory on the order of
>> 95% of the RAM on machines with 30+ GB, where its performance is then
>> obliterated by having to swap memory.
>>
>> I'm not really sure how to remedy this.
>> Is it something that is on the compiler's end?
>>
>> Attached are two benchmarks that I used to see this.
>> To observe memory usage I used top.
>>
>> -Ian J. Bertolacci
>>
>

------------------------------------------------------------------------------
_______________________________________________
Chapel-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/chapel-users
