On Mar 15, 2016 11:56 AM, "'Bill Hart' via julia-users" <[email protected]> wrote:
>
> We have been trying to understand the garbage collector behaviour, since we had some code for which our machine is running out of memory in a matter of an hour.
>
> We already realised that Julia isn't responsible for memory we allocate on the C side unless we use jl_gc_counted_malloc, which we now do everywhere. But it still uses masses of memory where we were roughly expecting no growth in memory usage (lots of short-lived objects and nothing much else).
>
> The behaviour of the gc on my machine seems to be to allocate objects until 23mb of memory is allocated, then do a jl_gc_collect. However, even after reading as much of the GC code in C as I can, I still can't determine why we are observing the behaviour we are seeing.
>
> Here is a concrete example whose behaviour I don't understand:
>
> function doit2(n::Int)
>     s = BigInt(2234567876543456789876545678987654567898765456789876545678)
>     for i = 1:n
>         s += i
>     end
>     return s
> end
>
> doit(10000000000)
>
>
> This is using Julia's BigInt type which is using a GMP bignum. Julia replaces the GMP memory manager functions with jl_gc_counted_malloc, so indeed Julia knows about all the allocations made here.
>
> But what I don't understand is that the memory usage of Julia starts at about 124mb and rises up to around 1.5gb. The growth is initially fast and it gets slower and slower.

I can't really reproduce this behavior. I assume doit is doit2 and not another function you defined somewhere else.

>
> Can someone explain why there is this behaviour? Shouldn't jl_gc_collect be able to collect every one of those allocations every time it reaches the collect_interval of 23mb (which remains constant on my machine with this example)?
>
> As an alternative experiment, I implemented a kind of bignum type using Julia arrays of UInts which I then pass to GMP low level mpn functions (which don't do any allocations on the C side). I only implemented the + operator, just enough to make this example work.
>
> The behaviour in this case is that memory usage is constant at around 124mb. There is no growth in memory usage over time.
>
> Why is the one example using so much memory and the other is not?
>
> Note that the bignums do not grow here. They are always essentially 3 or 4 limbs or something like that, in both examples.
>
> Some other quick questions someone might be able to answer:
>
> * Is there any difference in GC behaviour between using & vs Ref in ccall?

Ref is heap allocated. & is not.

>
> * Does the Julia GC have a copying allocator for the short lived generation(s)?

It has two generations and it is not a copying collector.

>
> * Does the Julia GC do a full mark and sweep every collection?

No.

> Even of the long lived generation(s)? If not, which part of the GC code is responsible for deciding when to do a more involved sweep vs a faster sweep. I am having some trouble orienting myself with the code, and I'd really like to understand it a bit better.

That is decided by the counters in _jl_gc_collect.

>
> * Can someone confirm whether the "pools" mentioned in the GC code refer to pools for different sized allocations. Are there multiple pools for the same sized allocation, or did I misunderstand that?

Pools are for small objects and they are segregated by size.

>
> Thanks in advance.
>
> Bill.
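
For anyone who wants to check whether the growth shows up on their machine, here is a minimal sketch that runs the same loop in chunks and prints memory figures in between. It assumes a recent Julia where `Base.gc_live_bytes()` and `Sys.maxrss()` are available. The names `accumulate_bigint`, `report_memory` and `doit2_chunked` are hypothetical helpers made up for this illustration; they are not part of Julia or of Bill's code.

```julia
# Same accumulation as the quoted doit2, but over an explicit range so that
# it can be called repeatedly with a running total.
function accumulate_bigint(s::BigInt, lo::Int, hi::Int)
    for i = lo:hi
        s += i   # each += produces a new BigInt; the previous one becomes garbage
    end
    return s
end

# Hypothetical helper: print the bytes the GC considers live and the peak
# resident set size of the process (both converted to MiB).
function report_memory(label)
    live_mib = Base.gc_live_bytes() / 2^20
    peak_mib = Sys.maxrss() / 2^20
    println(label, ": live = ", round(live_mib, digits=1),
            " MiB, peak RSS = ", round(peak_mib, digits=1), " MiB")
end

# Run the loop in `chunks` pieces and report memory after each piece.
function doit2_chunked(n::Int; chunks::Int = 10)
    s = BigInt(2234567876543456789876545678987654567898765456789876545678)
    step = cld(n, chunks)
    lo = 1
    while lo <= n
        hi = min(lo + step - 1, n)
        s = accumulate_bigint(s, lo, hi)
        report_memory("after i = $hi")
        lo = hi + 1
    end
    return s
end

doit2_chunked(10_000_000)
```

If the live figure stays flat while the peak RSS keeps climbing, that would suggest the memory is being held somewhere other than in objects the GC still considers reachable. Note that `Sys.maxrss()` is a peak value, so it can only show growth, never shrinkage.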
