Thanks for your answers. Please see my follow-up questions below.

On 15 March 2016 at 18:45, Yichao Yu <[email protected]> wrote:

>
> On Mar 15, 2016 11:56 AM, "'Bill Hart' via julia-users" <
> [email protected]> wrote:
> >
> > We have been trying to understand the garbage collector behaviour, since
> > we had some code for which our machine is running out of memory in a
> > matter of an hour.
> >
> > We already realised that Julia isn't responsible for memory we allocate
> > on the C side unless we use jl_gc_counted_malloc, which we now do
> > everywhere. But it still uses masses of memory where we were roughly
> > expecting no growth in memory usage (lots of short-lived objects and
> > nothing much else).
> >
> > The behaviour of the gc on my machine seems to be to allocate objects
> > until 23mb of memory is allocated, then do a jl_gc_collect. However, even
> > after reading as much of the GC code in C as I can, I still can't
> > determine why we are observing the behaviour we are seeing.
> >
> > Here is a concrete example whose behaviour I don't understand:
> >
> >      function doit2(n::Int)
> >          s = BigInt(2234567876543456789876545678987654567898765456789876545678)
> >          for i = 1:n
> >             s += i
> >          end
> >          return s
> >      end
> >
> >      doit(10000000000)
> >
> >
> > This is using Julia's BigInt type which is using a GMP bignum. Julia
> > replaces the GMP memory manager functions with jl_gc_counted_malloc, so
> > indeed Julia knows about all the allocations made here.
> >
> > But what I don't understand is that the memory usage of Julia starts at
> > about 124mb and rises up to around 1.5gb. The growth is initially fast
> > and it gets slower and slower.
>
> I can't really reproduce this behavior.
>
> I assume doit is doit2 and not another function you defined somewhere else.
>

Yes, I meant doit2 of course.

I'm really amazed you can't reproduce this. I can reproduce it over a wide
variety of Julia versions on a variety of machines!

I'm monitoring memory usage with top, not with Julia itself.
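
For anyone trying to reproduce this, here is a minimal sketch of how the same
numbers could be cross-checked from inside Julia rather than with top. It
assumes a recent Julia 1.x (where the collector is invoked as GC.gc() rather
than gc()), and the report helper is a hypothetical name of my own, not part
of Julia. Note that Sys.maxrss() gives the peak resident set size, whereas
top shows the current one.

```julia
# Sketch only: assumes a recent Julia 1.x. On older releases the collector
# is invoked as gc() and some of these helpers may not be available.

function doit2(n::Int)
    s = BigInt(2234567876543456789876545678987654567898765456789876545678)
    for i = 1:n
        s += i          # each += allocates a fresh BigInt
    end
    return s
end

# Hypothetical helper: run f(), force a collection, then print the peak
# resident set size reported by the OS next to the bytes the GC counts as live.
function report(f)
    result = f()
    GC.gc()
    rss_mb  = Sys.maxrss() / 2^20
    live_mb = Base.gc_live_bytes() / 2^20
    println("peak RSS: ", round(rss_mb, digits=1), " MB, ",
            "GC live: ", round(live_mb, digits=1), " MB")
    return result
end

# A much smaller n than in the original example, so that it finishes quickly.
report(() -> doit2(10^6))
```

If the peak RSS keeps climbing with n while the GC live figure stays flat,
that would suggest the growth is outside what the collector counts as live.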

>
> >
> > Can someone explain why there is this behaviour? Shouldn't jl_gc_collect
> > be able to collect every one of those allocations every time it reaches
> > the collect_interval of 23mb (which remains constant on my machine with
> > this example)?
> >
> > As an alternative experiment, I implemented a kind of bignum type using
> > Julia arrays of UInts which I then pass to GMP low level mpn functions
> > (which don't do any allocations on the C side). I only implemented the +
> > operator, just enough to make this example work.
> >
> > The behaviour in this case is that memory usage is constant at around
> > 124mb. There is no growth in memory usage over time.
> >
> > Why is the one example using so much memory and the other is not?
> >
> > Note that the bignums do not grow here. They are always essentially 3 or
> > 4 limbs or something like that, in both examples.
> >
> > Some other quick questions someone might be able to answer:
> >
> > * Is there any difference in GC behaviour between using & vs Ref in
> > ccall?
>
> Ref is heap allocated. & is not.
>
Do you mean the Ref object itself is heap allocated but the & object is
stack allocated? Or are you referring to a GC "heap" vs a pool or something?

Why was & deprecated? It sounds like & should be faster, no?

> >
> > * Does the Julia GC have a copying allocator for the short lived
> > generation(s)?
>
> It has two generations and does not copy.
>
So do the pools constitute the new generation? Or is the new generation
separate from the pools?

I noticed in the code that it looks like things get promoted if they
survive more than 2 generations (currently). But what happens to the
objects between generations 1 and 2 if they are not copied?

Or do you mean it has 2 generations plus a long term generation?

> >
> > * Does the Julia GC do a full mark and sweep every collection?
>
> No
>
> > Even of the long lived generation(s)? If not, which part of the GC code
> > is responsible for deciding when to do a more involved sweep vs a faster
> > sweep. I am having some trouble orienting myself with the code, and I'd
> > really like to understand it a bit better.
>
> The counters in _jl_gc_collect
>
You mean quick_count?

In my version of Julia in gc.c, quick_count is set to 0 and incremented,
but never used anywhere. How does this affect the behaviour? Is it used
somewhere else in Julia?

Or do you mean inc_count? I see this determines scanned_bytes_goal, which
again is not used anywhere that I can see.

I can't see anywhere else that inc_count is used either.

>
> >
> > * Can someone confirm whether the "pools" mentioned in the GC code refer
> > to pools for different sized allocations. Are there multiple pools for
> > the same sized allocation, or did I misunderstand that?
>
> Pools are for small objects and they are segregated by size.
>
Thanks.


> >
> > Thanks in advance.
> >
> > Bill.
> >
> >
> >
>
