On Mar 15, 2016 11:56 AM, "'Bill Hart' via julia-users" <[email protected]> wrote:
>
> We have been trying to understand the garbage collector behaviour, since
> we had some code for which our machine is running out of memory in a
> matter of an hour.
>
> We already realised that Julia isn't responsible for memory we allocate
> on the C side unless we use jl_gc_counted_malloc, which we now do
> everywhere. But it still uses masses of memory where we were roughly
> expecting no growth in memory usage (lots of short-lived objects and
> nothing much else).
>
> The behaviour of the gc on my machine seems to be to allocate objects
> until 23 MB of memory is allocated, then do a jl_gc_collect. However,
> even after reading as much of the GC code in C as I can, I still can't
> determine why we are observing the behaviour we are seeing.
>
> Here is a concrete example whose behaviour I don't understand:
>
>      function doit2(n::Int)
>          s = BigInt(2234567876543456789876545678987654567898765456789876545678)
>          for i = 1:n
>              s += i
>          end
>          return s
>      end
>
>      doit(10000000000)
>
>
> This is using Julia's BigInt type which is using a GMP bignum. Julia
> replaces the GMP memory manager functions with jl_gc_counted_malloc, so
> indeed Julia knows about all the allocations made here.
>
> But what I don't understand is that the memory usage of Julia starts at
> about 124 MB and rises up to around 1.5 GB. The growth is initially fast
> and it gets slower and slower.

I can't really reproduce this behavior.

I assume doit is doit2 and not another function you defined somewhere else.
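
For anyone who wants to compare numbers, here is a rough sketch of how the
memory use could be measured. It assumes a current Julia release (GC.gc,
Base.gc_live_bytes and Sys.maxrss are the present-day names, not the ones
from the release discussed in this thread), and `report` is just a helper
name made up for this example. The iteration count is kept small so that it
finishes quickly.

```julia
# The function from the original post, unchanged apart from layout.
function doit2(n::Int)
    s = BigInt(2234567876543456789876545678987654567898765456789876545678)
    for i = 1:n
        s += i
    end
    return s
end

# Hypothetical helper: run doit2 and print GC-visible memory around it.
function report(n::Int)
    GC.gc()                          # start from a freshly collected heap
    before = Base.gc_live_bytes()    # bytes the GC currently considers live
    result = doit2(n)
    GC.gc()
    after = Base.gc_live_bytes()
    println("live bytes before: ", before, ", after: ", after)
    println("peak resident set size (bytes): ", Sys.maxrss())
    return result
end

report(1_000_000)
```

If the live byte count stays flat while the peak resident set size keeps
climbing with larger n, the growth is more likely retained free memory than
objects the GC believes are still reachable.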


>
> Can someone explain why there is this behaviour? Shouldn't jl_gc_collect
> be able to collect every one of those allocations every time it reaches
> the collect_interval of 23 MB (which remains constant on my machine with
> this example)?
>
> As an alternative experiment, I implemented a kind of bignum type using
> Julia arrays of UInts which I then pass to GMP low level mpn functions
> (which don't do any allocations on the C side). I only implemented the +
> operator, just enough to make this example work.
>
> The behaviour in this case is that memory usage is constant at around
> 124 MB. There is no growth in memory usage over time.
>
> Why is the one example using so much memory and the other is not?
>
> Note that the bignums do not grow here. They are always essentially 3 or
> 4 limbs or something like that, in both examples.
>
> Some other quick questions someone might be able to answer:
>
> * Is there any difference in GC behaviour between using & vs Ref in ccall?

Ref is heap allocated. & is not.
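
A small sketch of the Ref form, in case it helps. It uses memcpy from the C
library only because that symbol is available everywhere; the variable names
are made up for the example. As far as I know, the `&x` argument form is no
longer accepted by current Julia releases, so only Ref is shown.

```julia
# Two mutable boxes; each Ref is an ordinary GC-managed object.
src = Ref{Int}(42)
dst = Ref{Int}(0)

# Declaring the arguments as Ref{Int} makes ccall pass a pointer to the boxed
# value and keeps both boxes rooted for the duration of the call.
ccall(:memcpy, Ptr{Cvoid}, (Ref{Int}, Ref{Int}, Csize_t), dst, src, sizeof(Int))

println(dst[])   # prints 42
```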

>
> * Does the Julia GC have a copying allocator for the short lived
> generation(s)?

It has two generations and it is not a copying collector.

>
> * Does the Julia GC do a full mark and sweep every collection?

No

> Even of the long lived generation(s)? If not, which part of the GC code
> is responsible for deciding when to do a more involved sweep vs a faster
> sweep? I am having some trouble orienting myself with the code, and I'd
> really like to understand it a bit better.

The counters in _jl_gc_collect decide that.
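
To see the quick versus full distinction from the Julia side, something like
the following sketch can be used. It assumes a current Julia release, where
GC.gc(false) requests an incremental collection and GC.gc(true) a full one.
Base.gc_num() is an internal, undocumented function, and its `pause` and
`full_sweep` fields are assumed here from the current sources, so they may
change. `churn` is a made-up helper that produces short-lived BigInts.

```julia
# Hypothetical workload: allocate many short-lived BigInt temporaries.
function churn(n::Int)
    acc = BigInt(0)
    for i in 1:n
        acc += i
    end
    return acc
end

churn(100_000)

before = Base.gc_num()              # internal snapshot of the GC counters
quick = @elapsed GC.gc(false)       # incremental (young generation) collection
full = @elapsed GC.gc(true)         # full collection
after = Base.gc_num()

println("incremental: ", quick, " s, full: ", full, " s")
println("pauses: ", after.pause - before.pause,
        ", full sweeps: ", after.full_sweep - before.full_sweep)
```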


>
> * Can someone confirm whether the "pools" mentioned in the GC code refer
> to pools for different sized allocations? Are there multiple pools for
> the same sized allocation, or did I misunderstand that?

Pools are for small objects and they are segregated by size.
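
As a toy model of what "segregated by size" means, here is a sketch. The size
classes below are invented for illustration and are not the table the Julia
runtime actually uses; `SIZE_CLASSES` and `classify` are hypothetical names.

```julia
# Hypothetical size classes in bytes (NOT the runtime's real table).
const SIZE_CLASSES = [8, 16, 32, 48, 64, 128, 256, 512, 1024, 2048]

# Map a request to the smallest class that fits it. Requests larger than
# every class do not go to a pool and are treated as "big" objects instead.
function classify(sz::Int)
    idx = searchsortedfirst(SIZE_CLASSES, sz)   # first class with size >= sz
    return idx > length(SIZE_CLASSES) ? :big : SIZE_CLASSES[idx]
end

println(classify(24))     # prints 32
println(classify(4096))   # prints big
```

Every object served from one pool has the same slot size, so a freed slot can
be reused by any later request in that class without searching for a fit.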

>
> Thanks in advance.
>
> Bill.
>
>
>
