On Tue, Mar 15, 2016 at 1:45 PM, Yichao Yu <[email protected]> wrote:
>
> On Mar 15, 2016 11:56 AM, "'Bill Hart' via julia-users"
> <[email protected]> wrote:
>>
>> We have been trying to understand the garbage collector's behaviour, since
>> we have some code that makes our machine run out of memory within about
>> an hour.
>>
>> We already realised that Julia isn't responsible for memory we allocate on
>> the C side unless we use jl_gc_counted_malloc, which we now do everywhere.
>> But it still uses masses of memory where we were roughly expecting no growth
>> in memory usage (lots of short-lived objects and nothing much else).
>>
>> The behaviour of the GC on my machine seems to be to allocate objects
>> until 23 MB of memory is allocated, then call jl_gc_collect. However, even
>> after reading as much of the GC's C code as I can, I still can't determine
>> why we are seeing this behaviour.
>>
>> Here is a concrete example whose behaviour I don't understand:
>>
>>      function doit2(n::Int)
>>          s = BigInt(2234567876543456789876545678987654567898765456789876545678)
>>          for i = 1:n
>>              s += i
>>          end
>>          return s
>>      end
>>
>>      doit(10000000000)

Another note: adding finalizers will (currently) extend the lifetime
of an object. https://github.com/JuliaLang/julia/pull/13995 should
solve this problem, but I'm holding it back until we finish some other
GC rework.
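
For anyone unfamiliar with finalizers, here is a minimal sketch of
attaching one. It assumes the Julia 1.x API (finalizer(f, obj) and
GC.gc()); the Handle type and the finalized counter are made up for
illustration. It only counts how many finalizers have run and does not
try to measure the lifetime extension itself.

```julia
# Sketch for Julia 1.x (assumption); older releases used finalizer(obj, f).
mutable struct Handle          # hypothetical wrapper type
    id::Int
end

const finalized = Ref(0)       # counts how many finalizers have run so far

function make_handles(n::Int)
    for i in 1:n
        h = Handle(i)
        # Register a finalizer; it runs some time after h becomes unreachable.
        finalizer(_ -> (finalized[] += 1), h)
    end
    return nothing
end

make_handles(1000)
GC.gc()                        # request a full collection
GC.gc()                        # and a second one, in case some objects survived the first
println("finalizers run so far: ", finalized[])
```

The printed count is not guaranteed to reach 1000; how many objects are
finalized after a given collection depends on the GC.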

>>
>>
>> This is using Julia's BigInt type which is using a GMP bignum. Julia
>> replaces the GMP memory manager functions with jl_gc_counted_malloc, so
>> indeed Julia knows about all the allocations made here.
>>
>> But what I don't understand is that the memory usage of Julia starts at
>> about 124 MB and rises to around 1.5 GB. The growth is initially fast and
>> then gets slower and slower.
>
> I can't really reproduce this behavior.
>
> I assume doit is doit2 and not another function you defined somewhere else.
>
>
>>
>> Can someone explain this behaviour? Shouldn't jl_gc_collect be able to
>> collect every one of those allocations every time it reaches the
>> collect_interval of 23 MB (which remains constant on my machine with this
>> example)?
>>
>> As an alternative experiment, I implemented a kind of bignum type using
>> Julia arrays of UInts which I then pass to GMP low level mpn functions
>> (which don't do any allocations on the C side). I only implemented the +
>> operator, just enough to make this example work.
>>
>> The behaviour in this case is that memory usage is constant at around
>> 124 MB. There is no growth in memory usage over time.
>>
>> Why is the one example using so much memory and the other is not?
>>
>> Note that the bignums do not grow here. They always stay at roughly 3 or 4
>> limbs in both examples.
>>
>> Some other quick questions someone might be able to answer:
>>
>> * Is there any difference in GC behaviour between using & vs Ref in ccall?
>
> Ref is heap allocated. & is not.
>
>>
>> * Does the Julia GC have a copying allocator for the short lived
>> generation(s)?
>
> It has two generations and is not copying.
>
>>
>> * Does the Julia GC do a full mark and sweep every collection?
>
> No
>
>> Even of the long-lived generation(s)? If not, which part of the GC code is
>> responsible for deciding when to do a more involved sweep vs a faster sweep?
>> I am having some trouble orienting myself in the code, and I'd really like
>> to understand it a bit better.
>
> The counters in _jl_gc_collect
>
>
>>
>> * Can someone confirm whether the "pools" mentioned in the GC code refer
>> to pools for different sized allocations? Are there multiple pools for the
>> same sized allocation, or did I misunderstand that?
>
> Pools are for small objects and they are segregated by size.
>
>>
>> Thanks in advance.
>>
>> Bill.
>>
>>
>>
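
To check whether memory really grows in the BigInt loop quoted above,
something along these lines could be used. This is only a sketch,
assuming Julia 1.x (GC.gc() and Sys.maxrss()); watch_memory and its
arguments are hypothetical names. Note that Sys.maxrss() reports the
peak resident set size of the process, not the current usage, so it can
only show whether the peak keeps rising.

```julia
# Sketch for Julia 1.x (assumption); only doit2 comes from the original mail.
function doit2(n::Int)
    s = BigInt(2234567876543456789876545678987654567898765456789876545678)
    for i = 1:n
        s += i          # allocates a fresh BigInt on every iteration
    end
    return s
end

# Run the loop in chunks and print the peak resident set size after each chunk.
function watch_memory(chunks::Int, n::Int)
    for c in 1:chunks
        doit2(n)
        GC.gc()                              # force a full collection before sampling
        rss_mb = Sys.maxrss() / 2^20         # bytes -> MiB
        println("chunk ", c, ": max RSS = ", round(rss_mb, digits=1), " MiB")
    end
    return nothing
end

watch_memory(5, 1_000_000)
```

If the reported peak levels off after the first chunk or two, the
short-lived BigInts are being reclaimed; if it keeps climbing, they are
not.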
