On 2009-06-06 17:12:58 +0200, Jarrett Billingsley
<[email protected]> said:
On Sat, Jun 6, 2009 at 8:03 AM, Vladimir
Panteleev <[email protected]> wrote:
// Works for DMD1/Phobos, DMD1/Tango and DMD2/Phobos
version(Tango) import tango.io.Console;
else import std.stdio;

struct S
{
    ubyte[40_000] data;
}

void main()
{
    S[] a;
    a ~= S();
    // QUESTION: How much memory will this program consume upon reaching this point?
    version(Tango) Cin.copyln();
    else readln();
}
There seems to be something wrong with the newCapacity function that
_d_arrayappendcT calls. From an element size of 20000 (I halved it
just to make the allocation faster) and an array length of 1, it
somehow calculates the new size to be 266686600. Hm. That seems a
bit off.
It seems this line:
long mult = 100 + (1000L * size) / log2plus1(newcap);
is to blame. I don't think large value types were taken into account
here. The resulting multiplier is 1,333,433, which is hilariously
large.
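
For reference, here is a minimal self-contained sketch of that arithmetic
(the log2plus1 helper below is my assumed approximation, floor(log2(c)) + 1,
of what the runtime uses; the sizes mirror the example above):

import std.stdio;

// Assumed approximation of the runtime's log2plus1:
// index of the highest set bit plus one, i.e. floor(log2(c)) + 1 for c > 0.
int log2plus1(size_t c)
{
    int i = -1;
    while (c)
    {
        c >>= 1;
        ++i;
    }
    return i + 1;
}

void main()
{
    size_t size   = 20_000;    // element size in bytes (the halved example)
    size_t newcap = 1 * size;  // requested capacity in bytes for length 1
    long mult = 100 + (1000L * size) / log2plus1(newcap);
    size_t newext = cast(size_t)((newcap * mult) / 100);
    writefln("mult = %s, newext = %s bytes", mult, newext);
    // Prints mult = 1333433 and newext = 266686600, i.e. roughly 254 MB
    // reserved for a single 20 kB element.
}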
Indeed, we were discussing this on IRC.
It is actually interesting to note that the continuous function written
as a comment in newCapacity,
double mult2 = 1.0 + (size / log10(pow(newcap * 2.0, 2.0)));
does *not* have that behaviour.
It seems to me that it is generally much better to work on the total
memory rather than on the number of elements.
I would use something like
long mult = 100 + 200L / log2plus2(newcap);
and round up:
newext = cast(size_t)((newcap * mult) / 100);
newext += size - (newext % size);
This is what I am adding to Tango.
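
To make the proposal concrete, here is a self-contained sketch of it; the
log2plus2 helper is assumed to be floor(log2(c)) + 2, by analogy with
log2plus1, and a small guard is added so an already-aligned size is not
bumped by a full extra element:

import std.stdio;

// Assumed analogue of log2plus1: floor(log2(c)) + 2, never below 2,
// so the divisor stays reasonable even for tiny capacities.
int log2plus2(size_t c)
{
    int i = 2;
    while (c >>= 1)
        ++i;
    return i;
}

// Proposed growth: work on the total number of bytes, limit the relative
// growth to 100 + 200/log2plus2(newcap) percent, then round up to a whole
// multiple of the element size.
size_t grow(size_t newcap, size_t size)
{
    long mult = 100 + 200L / log2plus2(newcap);
    size_t newext = cast(size_t)((newcap * mult) / 100);
    if (newext % size)
        newext += size - (newext % size);
    return newext;
}

void main()
{
    writefln("%s", grow(20_000, 20_000)); // 40000: two 20 kB elements, not ~254 MB
    writefln("%s", grow(16, 1));          // 21: modest growth for small strings
}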
One could add something that further favors large sizes, but I don't see the
rationale behind that; I would rather expect that one typically concatenates
strings (size = 1..4), so there is more to gain by making that case faster.
I can also understand if someone wants to use only the number of
elements (rather than the total size), but what was implemented wasn't
that either.
If someone has some insight, or good benchmarks for choosing a better
function, that would be welcome.
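
As a possible starting point, here is a minimal sketch that simply traces how
the reserved capacity grows while appending (it assumes a DMD2/Phobos recent
enough to expose the array .capacity property; the element sizes are only
illustrative):

import std.stdio;

struct S(size_t N)
{
    ubyte[N] data;
}

// Appends `count` elements and prints the reserved capacity, in elements
// and in bytes, every time the runtime grows the array.
void trace(size_t N)(size_t count)
{
    S!N[] a;
    size_t lastCap = 0;
    foreach (i; 0 .. count)
    {
        a ~= S!N();
        if (a.capacity != lastCap)
        {
            writefln("size=%s len=%s capacity=%s elements (%s bytes)",
                     N, a.length, a.capacity, a.capacity * N);
            lastCap = a.capacity;
        }
    }
}

void main()
{
    trace!1(1000);    // string-like element size
    trace!20_000(10); // large value type, the problematic case
}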
Fawzi