Steven Schveighoffer wrote:
> On Fri, 02 Jul 2010 01:34:26 -0400, Ali Çehreli <[email protected]> wrote:
>
>> BCS wrote:
>>
>>  >> There is something strange in the array capacity algorithm.
>>  >> [...]
>>  >> Is that intended?
>>
>> > I'm going to take a guess that the gc is (after some point) allocating
>>  > into the smallest hole with "enough" capacity.
>>
>> You may be right. :)
>>
>>  > If you re run the test
>>  > but with something else going on to add some noise, do the spikes
>> move?
>>
>> Yes, the spikes move.
>>
>>  > void makeNoise()
>>  > {
>>  >      static byte[][256] data;
>>  >      data[rand() % $] = new byte[rand() % 512];
>>  > }
>>
>> I am not familiar with the dmd internals; but following the code took
>> me to the function newCapacity() in src/druntime/src/rt/lifetime.d.
>> The capacity growth is more complicated than what I've been assuming
>> so far.
>>
>> And yes, GC is in the picture. Quoting the comments from that file:
>>
>> /*
>>   * Better version by Dave Fladebo:
>> * This uses an inverse logorithmic algorithm to pre-allocate a bit more
>>   * space for larger arrays.
>>   * - Arrays smaller than PAGESIZE bytes are left as-is, so for the most
>>   * common cases, memory allocation is 1 to 1. The small overhead added
>>   * doesn't affect small array perf. (it's virtually the same as
>>   * current).
>>   * - Larger arrays have some space pre-allocated.
>>   * - As the arrays grow, the relative pre-allocated space shrinks.
>>   * - The logorithmic algorithm allocates relatively more space for
>>   * mid-size arrays, making it very fast for medium arrays (for
>> * mid-to-large arrays, this turns out to be quite a bit faster than the
>>   * equivalent realloc() code in C, on Linux at least. Small arrays are
>>   * just as fast as GCC).
>> * - Perhaps most importantly, overall memory usage and stress on the GC
>>   * is decreased significantly for demanding environments.
>>   */
>>
>> So there may not be a bug after all...
>>
>> Ali
>
> If your original code is all that was being run, I think there is a bug.

The original code was it.

> I'll tell you why -- the GC provides a way to append pages to an
> allocated block.  Essentially, you can glue consecutive unused pages
> together to make a larger block, thereby growing your block.
>
> But anything under a page is simply reallocated because anything less
> than a page is inside a pool of blocks of the same size, and those
> cannot be glued together.
>
> So when your array is growing, and it gets larger than a page, it should
> grow a single page at a time.  But I think (and I stress think, I'm not
> sure) that when a block of pages is deallocated, each page becomes
> independently free.  They do not stay glued together.  So for an
> allocating requesting 1 or 2 more pages to grow by 2000 pages is
> suspect. I still have to examine the code, but I think there is a problem.
>
> Back to my original assertion -- to disprove the theory that some other
> "larger hole" is why it gets so big, in one instance your number of
> elements grows from 1,155,070 (4MB) to 3,153,917 (12MB), that's almost a
> 200% increase.  In order for that 12MB hole to exist, something must
> have allocated it before, and then deallocated it.  It can't be your
> loop because your loop has not allocated that much space yet.  I would
> find that very strange to happen somewhere in the runtime without you
> specifically allocating it.  If however, your original post is not the
> entire code, then there may be some possibility to that.
>
> BTW, you can take the newCapacity function out of the loop by simply
> setting the length and then writing the last element, i.e.:
>
> a.length += 1;
> a[$-1] = i;
>
> newCapacity isn't called when setting the length explicitly, because
> it's not considered to be an append.

You are right.

Looks like I was successful in locating the culprit. :) I took liberty to write ++a.length and also outputted the number of elements that are in reserve at each allocation:

import std.stdio;
import std.array;

void main()
{
    int[] a;
    size_t oldCapacity;

    foreach (i; 0 .. 100_000_000) {
        ++a.length;
        a[$-1] = i;

        if (capacity(a) != oldCapacity) {
            writeln("length: ", a.length,
                    " capacity: ", capacity(a),
                    " ratio: ", cast(double)capacity(a) / oldCapacity,
                    " reserve: ", capacity(a) - a.length);
            oldCapacity = capacity(a);
        }
    }
}

Now the spikes are gone, and the allocations asymptote at once-per-1024 elements for an int array:

length: 1 capacity: 3 ratio: inf reserve: 2
length: 4 capacity: 7 ratio: 2.33333 reserve: 3
length: 8 capacity: 15 ratio: 2.14286 reserve: 7
length: 16 capacity: 31 ratio: 2.06667 reserve: 15
length: 32 capacity: 63 ratio: 2.03226 reserve: 31
length: 64 capacity: 127 ratio: 2.01587 reserve: 63
length: 128 capacity: 255 ratio: 2.00787 reserve: 127
length: 256 capacity: 509 ratio: 1.99608 reserve: 253
length: 512 capacity: 1021 ratio: 2.00589 reserve: 509
length: 1022 capacity: 2045 ratio: 2.00294 reserve: 1023
length: 2046 capacity: 3069 ratio: 1.50073 reserve: 1023
length: 3070 capacity: 4093 ratio: 1.33366 reserve: 1023
...

In the long run, there is one allocation per 1024 elements. That still feels like amortized constant, but considering that relocation times would be O(N), I think reserve should still grow with N, but maybe not too much. (I am not sure about this last point at all.)

> -Steve

Thank you very much,
Ali

Reply via email to