Re: D memory consumption/runtime speed problem

2010-01-14 Thread sybrandy

snip

Using a small buffer as suggested by Daniel Keep and Steven 
Schveighoffer significantly improved performance.  Now down to about 5 
seconds.  I ended up using the static array buffer since the 
encodeNumber function will be in its own file in my resulting program, 
so I can keep it private.  Doing something similar to the output buffer 
had a similar effect and it's now processing everything in less than 2 
seconds!


I didn't realize that all those little arrays were created.  Perhaps 
this is something that should be detailed in the arrays documentation 
or, perhaps even better, an optimization guide?  I honestly thought the 
GC would help keep memory in check as I didn't want to assume a 
worst-case scenario, which with my RLE implementation is 2 * input 
buffer size, but I guess I have to.


Well, perhaps my original code will be of use if Walter and the gang 
decide to try to revamp the GC and want some bad code to test it with.


Thanks all!

Btw: below is the updated code

import std.conv;
import std.stdio;
import std.stream;
import std.date;
import std.mmfile;
import std.array;

string filename = enwik8_small;

private immutable uint ONE_BYTE_VAL = (1  6) - 1;
private immutable uint TWO_BYTE_VAL = (1  14) - 1;
private immutable uint THREE_BYTE_VAL = (1  22) - 1;
private immutable uint FOUR_BYTE_VAL = (1  30) - 1;
private immutable uint ONE_BYTE_MASK = (0  6);
private immutable uint TWO_BYTE_MASK = (1  6);
private immutable uint THREE_BYTE_MASK = (2  6);
private immutable uint FOUR_BYTE_MASK = (3  6);
private static ubyte[4] encodeBuff;

ubyte[] encodeNumber(in uint count)
{
if (count = ONE_BYTE_VAL)
{
encodeBuff[0] = cast(ubyte)(ONE_BYTE_MASK | count);
return encodeBuff[0..1];
}
else if (count = TWO_BYTE_VAL)
{
encodeBuff[0] = cast(ubyte)(TWO_BYTE_MASK | (count  8));
encodeBuff[1] = cast(ubyte)(count  0x00ff);
return encodeBuff[0..2];
}
else if (count = THREE_BYTE_VAL)
{
encodeBuff[0] = cast(ubyte)(THREE_BYTE_MASK | (count  16));
encodeBuff[1] = cast(ubyte)((count  8)  0x00ff);
encodeBuff[2] = cast(ubyte)(count  0x00ff);
return encodeBuff[0..3];
}
else if (count = FOUR_BYTE_VAL)
{
encodeBuff[0] = cast(ubyte)(FOUR_BYTE_MASK | (count  24));
encodeBuff[1] = cast(ubyte)((count  16)  0x00ff);
encodeBuff[2] = cast(ubyte)((count  8)  0x00ff);
encodeBuff[3] = cast(ubyte)(count  0x00ff);
return encodeBuff[0..4];
}
else
{
throw new Exception(Invalid count provided!);
}
}

void encode(in ubyte[] buff, ref ubyte[] output)
{
ubyte currByte = buff[0];
uint count = 0;
uint outIdx = 0;
ubyte[] temp;
foreach (byteVal; buff)
{
if (byteVal != currByte  count  0)
{
temp = encodeNumber(count);
foreach (t; temp)
{
output[outIdx++] = t;
}
output[outIdx++] = currByte;
currByte = byteVal;
count = 0;
}
count++;
}
temp = encodeNumber(count);
foreach (t; temp)
{
output[outIdx++] = t;
}
output[outIdx++] = currByte;
}

void benchCode()
{
MmFile buff = new MmFile(filename);
ubyte[] encodedBytes;
encodedBytes.length = cast(size_t)buff.length * 2;
encode(cast(ubyte[])buff[], encodedBytes);
}

void main(string[] args)
{
filename = args[1];
writeln(Benchmark time: , benchmark!(benchCode)(to!(uint)(args[2])));
}



Re: D memory consumption/runtime speed problem

2010-01-14 Thread bearophile
sybrandy:
 private immutable uint ONE_BYTE_VAL = (1  6) - 1;
 private immutable uint TWO_BYTE_VAL = (1  14) - 1;

Use private const in D1 and private enum in D2, there's no need for an 
immutable here.
In your code there are now no useless memory allocations, so the exit(0) trick 
is not needed.

Bye,
bearophile


D memory consumption/runtime speed problem

2010-01-13 Thread sybrandy

Hello,

I've been writing a bit of compression code and I noticed some strange 
behavior that's driving me a bit batty.  I don't know if it's a bug with 
D or something I did.  All I know is I can't figure it out.


Below is the simplified version of the code as a single file.  It takes 
two parameters.  The first is a file to compress and the second is the 
number of times to run the benchmark.  E.g. bugtest foo.txt 2


Now, if I set the second parameter to 1, it runs decently fast.  26 
seconds on my work laptop for a half-sized enwiki8 from the Hutter 
challenge.  If I set it to 2, then it takes about 142 seconds.  In both 
cases a lot of memory is used and I'm not really sure why.  Also, after 
it prints out the results, it takes several seconds for the program to exit.


Am I doing something wrong?  I've tried every trick that I could find by 
reading the documentation.  Btw: The last time I tried this was with the 
latest version of D released at the beginning of the month.


__CODE__

import std.conv;
import std.stdio;
import std.stream;
import std.date;
import std.mmfile;
import std.array;

string filename = enwik8_small;

private immutable uint ONE_BYTE_VAL = (1  6) - 1;
private immutable uint TWO_BYTE_VAL = (1  14) - 1;
private immutable uint THREE_BYTE_VAL = (1  22) - 1;
private immutable uint FOUR_BYTE_VAL = (1  30) - 1;
private immutable uint ONE_BYTE_MASK = (0  6);
private immutable uint TWO_BYTE_MASK = (1  6);
private immutable uint THREE_BYTE_MASK = (2  6);
private immutable uint FOUR_BYTE_MASK = (3  6);

ubyte[] encodeNumber(in uint count)
{
if (count = ONE_BYTE_VAL)
{
return [cast(ubyte)(ONE_BYTE_MASK | count)];
}
else if (count = TWO_BYTE_VAL)
{
return [cast(ubyte)(TWO_BYTE_MASK | (count  8))]
   ~ [cast(ubyte)(count  0x00ff)];
}
else if (count = THREE_BYTE_VAL)
{
return [cast(ubyte)(THREE_BYTE_MASK | (count  16))]
   ~ [cast(ubyte)((count  8)  0x00ff)]
   ~ [cast(ubyte)(count  0x00ff)];
}
else if (count = FOUR_BYTE_VAL)
{
return [cast(ubyte)(FOUR_BYTE_MASK | (count  24))]
   ~ [cast(ubyte)((count  16)  0x00ff)]
   ~ [cast(ubyte)((count  8)  0x00ff)]
   ~ [cast(ubyte)(count  0x00ff)];
}
else
{
throw new Exception(Invalid count provided!);
}
}

void encode(in ubyte[] buff, out ubyte[] output)
{
ubyte currByte = buff[0];
uint count = 0;
auto appOutput = appender(output);
foreach (byteVal; buff)
{
if (byteVal != currByte  count  0)
{
appOutput.put(encodeNumber(count));
appOutput.put(currByte);
currByte = byteVal;
count = 0;
}
count++;
}
appOutput.put(encodeNumber(count));
appOutput.put(currByte);
}

void benchCode()
{
MmFile buff = new MmFile(filename);
ubyte[] encodedBytes;
encode(cast(ubyte[])buff[], encodedBytes);
}

void main(string[] args)
{
filename = args[1];
writeln(Benchmark time: , benchmark!(benchCode)(to!(uint)(args[2])));
}


Re: D memory consumption/runtime speed problem

2010-01-13 Thread BCS

Hello sybrandy,


Hello,

I've been writing a bit of compression code and I noticed some strange
behavior that's driving me a bit batty.  I don't know if it's a bug
with D or something I did.  All I know is I can't figure it out.

Below is the simplified version of the code as a single file.  It
takes two parameters.  The first is a file to compress and the
second is the number of times to run the benchmark.  E.g. bugtest
foo.txt 2

Now, if I set the second parameter to 1, it runs decently fast.  26
seconds on my work laptop for a half-sized enwiki8 from the Hutter
challenge.  If I set it to 2, then it takes about 142 seconds.  In
both cases a lot of memory is used and I'm not really sure why.  Also,
after it prints out the results, it takes several seconds for the
program to exit.

Am I doing something wrong?  I've tried every trick that I could find
by reading the documentation.  Btw: The last time I tried this was
with the latest version of D released at the beginning of the month.



My guess is that you are getting long GC pauses. D's GC is rather unadvanced. 
IIRC there are ways to can make it do better but I don't know what they are.





Re: D memory consumption/runtime speed problem

2010-01-13 Thread bearophile
sybrandy:
 Now, if I set the second parameter to 1, it runs decently fast.  26 
 seconds on my work laptop for a half-sized enwiki8 from the Hutter 
 challenge.  If I set it to 2, then it takes about 142 seconds.  In both 
 cases a lot of memory is used and I'm not really sure why.  Also, after 
 it prints out the results, it takes several seconds for the program to exit.

My suggestions:
- Stop using array joining, those create many small arrays that the GC has to 
manage, and the current D GC is not efficient at all. There are many ways to 
avoid array joinings and avoid most runtime allocations.
- If you can, at the end of the program you can add std.c.stdlib.exit(0);, this 
kills the GC mid-way in a faster way (but probably destructors are not called, 
so be careful).
- Use the latest daily build of the LDC compiler with very good command line 
arguments, and if needed link-time optimizaiton activated too :-)

And then please tell us how much time it takes to run :-)

Bye,
bearophile


Re: D memory consumption/runtime speed problem

2010-01-13 Thread Daniel Keep


sybrandy wrote:
 Hello,
 
 I've been writing a bit of compression code and I noticed some strange
 behavior that's driving me a bit batty.  I don't know if it's a bug with
 D or something I did.  All I know is I can't figure it out.
 
 ...
 
 Am I doing something wrong?  I've tried every trick that I could find by
 reading the documentation.  Btw: The last time I tried this was with the
 latest version of D released at the beginning of the month.

I haven't verified this, but I'd be *deeply* suspicious of encodeNumber.
 I don't usually use array literals but, if I remember correctly, every
time it is called you're performing a heap allocation.  Even worse,
those concatentations might be performing separate allocations, too.

You could eliminate the overhead by using a passed-in buffer design like so:

ubyte[] encodeNumber(in uint count, ref ubyte[4] buffer)
{
if (count = ONE_BYTE_VAL)
{
buffer[0] = cast(ubyte)(ONE_BYTE_MASK | count);
return buffer[0..1];
}
// ...
}

Then, in the calling function:

{
ubyte[4] temp;
foreach( ... )
{
appOutput.put(encodeNumber(count, temp));
}
}

See if that helps.