Re: D memory consumption/runtime speed problem
<snip>

Using a small buffer as suggested by Daniel Keep and Steven Schveighoffer significantly improved performance. Now down to about 5 seconds. I ended up using the static array buffer since the encodeNumber function will be in its own file in my resulting program, so I can keep it private. Doing something similar for the output buffer had a similar effect, and it's now processing everything in less than 2 seconds! I didn't realize that all those little arrays were being created. Perhaps this is something that should be detailed in the array documentation or, perhaps even better, in an optimization guide? I honestly thought the GC would help keep memory in check, as I didn't want to assume a worst-case scenario (which with my RLE implementation is 2 * the input buffer size), but I guess I have to. Well, perhaps my original code will be of use if Walter and the gang decide to try to revamp the GC and want some bad code to test it with. Thanks all!

Btw: below is the updated code.

import std.conv;
import std.stdio;
import std.stream;
import std.date;
import std.mmfile;
import std.array;

string filename = "enwik8_small";

private immutable uint ONE_BYTE_VAL   = (1 << 6) - 1;
private immutable uint TWO_BYTE_VAL   = (1 << 14) - 1;
private immutable uint THREE_BYTE_VAL = (1 << 22) - 1;
private immutable uint FOUR_BYTE_VAL  = (1 << 30) - 1;
private immutable uint ONE_BYTE_MASK   = (0 << 6);
private immutable uint TWO_BYTE_MASK   = (1 << 6);
private immutable uint THREE_BYTE_MASK = (2 << 6);
private immutable uint FOUR_BYTE_MASK  = (3 << 6);

private static ubyte[4] encodeBuff;

ubyte[] encodeNumber(in uint count)
{
    if (count <= ONE_BYTE_VAL)
    {
        encodeBuff[0] = cast(ubyte)(ONE_BYTE_MASK | count);
        return encodeBuff[0..1];
    }
    else if (count <= TWO_BYTE_VAL)
    {
        encodeBuff[0] = cast(ubyte)(TWO_BYTE_MASK | (count >> 8));
        encodeBuff[1] = cast(ubyte)(count & 0x00ff);
        return encodeBuff[0..2];
    }
    else if (count <= THREE_BYTE_VAL)
    {
        encodeBuff[0] = cast(ubyte)(THREE_BYTE_MASK | (count >> 16));
        encodeBuff[1] = cast(ubyte)((count >> 8) & 0x00ff);
        encodeBuff[2] = cast(ubyte)(count & 0x00ff);
        return encodeBuff[0..3];
    }
    else if (count <= FOUR_BYTE_VAL)
    {
        encodeBuff[0] = cast(ubyte)(FOUR_BYTE_MASK | (count >> 24));
        encodeBuff[1] = cast(ubyte)((count >> 16) & 0x00ff);
        encodeBuff[2] = cast(ubyte)((count >> 8) & 0x00ff);
        encodeBuff[3] = cast(ubyte)(count & 0x00ff);
        return encodeBuff[0..4];
    }
    else
    {
        throw new Exception("Invalid count provided!");
    }
}

void encode(in ubyte[] buff, ref ubyte[] output)
{
    ubyte currByte = buff[0];
    uint count = 0;
    uint outIdx = 0;
    ubyte[] temp;
    foreach (byteVal; buff)
    {
        if (byteVal != currByte && count > 0)
        {
            temp = encodeNumber(count);
            foreach (t; temp)
            {
                output[outIdx++] = t;
            }
            output[outIdx++] = currByte;
            currByte = byteVal;
            count = 0;
        }
        count++;
    }
    temp = encodeNumber(count);
    foreach (t; temp)
    {
        output[outIdx++] = t;
    }
    output[outIdx++] = currByte;
}

void benchCode()
{
    MmFile buff = new MmFile(filename);
    ubyte[] encodedBytes;
    encodedBytes.length = cast(size_t)buff.length * 2;
    encode(cast(ubyte[])buff[], encodedBytes);
}

void main(string[] args)
{
    filename = args[1];
    writeln("Benchmark time: ", benchmark!(benchCode)(to!(uint)(args[2])));
}
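For anyone following along, the count encoding above packs a 2-bit length tag into the top two bits of the first byte and stores the value big-endian in the remaining bits (so one byte covers counts up to 63, two bytes up to 16383, and so on up to 2^30 - 1). A minimal sketch of the same scheme, written in C rather than D so it can stand alone, together with a matching decoder to confirm it round-trips; the function names are mine, not from the post:

```c
#include <stdint.h>
#include <stddef.h>

/* Encode `count` into buf: the top two bits of the first byte store
 * (number of bytes - 1); the rest of the bits store the value big-endian.
 * Returns the number of bytes written, or 0 if count >= 2^30. */
static size_t encode_count(uint32_t count, uint8_t buf[4])
{
    if (count <= (1u << 6) - 1) {
        buf[0] = (uint8_t)count;                      /* tag 0: one byte  */
        return 1;
    } else if (count <= (1u << 14) - 1) {
        buf[0] = (uint8_t)((1u << 6) | (count >> 8)); /* tag 1: two bytes */
        buf[1] = (uint8_t)(count & 0xff);
        return 2;
    } else if (count <= (1u << 22) - 1) {
        buf[0] = (uint8_t)((2u << 6) | (count >> 16));
        buf[1] = (uint8_t)((count >> 8) & 0xff);
        buf[2] = (uint8_t)(count & 0xff);
        return 3;
    } else if (count <= (1u << 30) - 1) {
        buf[0] = (uint8_t)((3u << 6) | (count >> 24));
        buf[1] = (uint8_t)((count >> 16) & 0xff);
        buf[2] = (uint8_t)((count >> 8) & 0xff);
        buf[3] = (uint8_t)(count & 0xff);
        return 4;
    }
    return 0; /* count too large for this scheme */
}

/* Decode one count; stores the value in *out and returns bytes consumed. */
static size_t decode_count(const uint8_t *buf, uint32_t *out)
{
    size_t nbytes = (size_t)(buf[0] >> 6) + 1; /* read the 2-bit tag */
    uint32_t v = buf[0] & 0x3f;                /* low 6 bits of first byte */
    for (size_t i = 1; i < nbytes; i++)
        v = (v << 8) | buf[i];
    *out = v;
    return nbytes;
}
```

The decoder is the part the thread never shows; having it makes the tag layout easy to sanity-check at the boundaries (63 vs 64, 16383 vs 16384).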
Re: D memory consumption/runtime speed problem
sybrandy:

> private immutable uint ONE_BYTE_VAL = (1 << 6) - 1;
> private immutable uint TWO_BYTE_VAL = (1 << 14) - 1;

Use private const in D1 and private enum in D2; there's no need for an immutable here. In your code there are now no useless memory allocations, so the exit(0) trick is not needed.

Bye,
bearophile
D memory consumption/runtime speed problem
Hello, I've been writing a bit of compression code and I noticed some strange behavior that's driving me a bit batty. I don't know if it's a bug with D or something I did. All I know is I can't figure it out. Below is the simplified version of the code as a single file. It takes two parameters. The first is a file to compress and the second is the number of times to run the benchmark. E.g. bugtest foo.txt 2

Now, if I set the second parameter to 1, it runs decently fast: 26 seconds on my work laptop for a half-sized enwik8 from the Hutter challenge. If I set it to 2, then it takes about 142 seconds. In both cases a lot of memory is used and I'm not really sure why. Also, after it prints out the results, it takes several seconds for the program to exit. Am I doing something wrong? I've tried every trick that I could find by reading the documentation.

Btw: The last time I tried this was with the latest version of D released at the beginning of the month.

__CODE__

import std.conv;
import std.stdio;
import std.stream;
import std.date;
import std.mmfile;
import std.array;

string filename = "enwik8_small";

private immutable uint ONE_BYTE_VAL   = (1 << 6) - 1;
private immutable uint TWO_BYTE_VAL   = (1 << 14) - 1;
private immutable uint THREE_BYTE_VAL = (1 << 22) - 1;
private immutable uint FOUR_BYTE_VAL  = (1 << 30) - 1;
private immutable uint ONE_BYTE_MASK   = (0 << 6);
private immutable uint TWO_BYTE_MASK   = (1 << 6);
private immutable uint THREE_BYTE_MASK = (2 << 6);
private immutable uint FOUR_BYTE_MASK  = (3 << 6);

ubyte[] encodeNumber(in uint count)
{
    if (count <= ONE_BYTE_VAL)
    {
        return [cast(ubyte)(ONE_BYTE_MASK | count)];
    }
    else if (count <= TWO_BYTE_VAL)
    {
        return [cast(ubyte)(TWO_BYTE_MASK | (count >> 8))]
             ~ [cast(ubyte)(count & 0x00ff)];
    }
    else if (count <= THREE_BYTE_VAL)
    {
        return [cast(ubyte)(THREE_BYTE_MASK | (count >> 16))]
             ~ [cast(ubyte)((count >> 8) & 0x00ff)]
             ~ [cast(ubyte)(count & 0x00ff)];
    }
    else if (count <= FOUR_BYTE_VAL)
    {
        return [cast(ubyte)(FOUR_BYTE_MASK | (count >> 24))]
             ~ [cast(ubyte)((count >> 16) & 0x00ff)]
             ~ [cast(ubyte)((count >> 8) & 0x00ff)]
             ~ [cast(ubyte)(count & 0x00ff)];
    }
    else
    {
        throw new Exception("Invalid count provided!");
    }
}

void encode(in ubyte[] buff, out ubyte[] output)
{
    ubyte currByte = buff[0];
    uint count = 0;
    auto appOutput = appender(output);
    foreach (byteVal; buff)
    {
        if (byteVal != currByte && count > 0)
        {
            appOutput.put(encodeNumber(count));
            appOutput.put(currByte);
            currByte = byteVal;
            count = 0;
        }
        count++;
    }
    appOutput.put(encodeNumber(count));
    appOutput.put(currByte);
}

void benchCode()
{
    MmFile buff = new MmFile(filename);
    ubyte[] encodedBytes;
    encode(cast(ubyte[])buff[], encodedBytes);
}

void main(string[] args)
{
    filename = args[1];
    writeln("Benchmark time: ", benchmark!(benchCode)(to!(uint)(args[2])));
}
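The encode routine above is a plain run-length pass: track the current byte, count repeats, and flush a (count, byte) pair whenever the byte changes, plus once at the end for the final run. In case it helps to see that loop in isolation, here is a small C sketch of just the run-collection part (the Run struct and names are mine, purely for illustration):

```c
#include <stdint.h>
#include <stddef.h>

typedef struct { size_t len; uint8_t val; } Run;

/* Collect (length, value) runs from buf into runs[]; returns the number
 * of runs. Mirrors the structure of the post's encode(): bump a counter,
 * flush when the byte changes, and flush once more at the end. The caller
 * must provide runs[] with room for n entries (the worst case). */
static size_t collect_runs(const uint8_t *buf, size_t n, Run *runs)
{
    if (n == 0)
        return 0;
    size_t nruns = 0, count = 0;
    uint8_t curr = buf[0];
    for (size_t i = 0; i < n; i++) {
        if (buf[i] != curr && count > 0) {
            runs[nruns++] = (Run){count, curr};
            curr = buf[i];
            count = 0;
        }
        count++;
    }
    runs[nruns++] = (Run){count, curr}; /* flush the final run */
    return nruns;
}
```

For example, the input "aaabbbbc" yields the three runs (3, 'a'), (4, 'b'), (1, 'c'); the worst case (no byte repeats) yields one run per input byte, which is where the 2x output-buffer bound mentioned later in the thread comes from.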
Re: D memory consumption/runtime speed problem
Hello sybrandy,

> Hello, I've been writing a bit of compression code and I noticed some strange behavior that's driving me a bit batty. I don't know if it's a bug with D or something I did. All I know is I can't figure it out. Below is the simplified version of the code as a single file. It takes two parameters. The first is a file to compress and the second is the number of times to run the benchmark. E.g. bugtest foo.txt 2
>
> Now, if I set the second parameter to 1, it runs decently fast: 26 seconds on my work laptop for a half-sized enwik8 from the Hutter challenge. If I set it to 2, then it takes about 142 seconds. In both cases a lot of memory is used and I'm not really sure why. Also, after it prints out the results, it takes several seconds for the program to exit. Am I doing something wrong? I've tried every trick that I could find by reading the documentation.
>
> Btw: The last time I tried this was with the latest version of D released at the beginning of the month.

My guess is that you are getting long GC pauses. D's GC is rather unadvanced. IIRC there are ways you can make it do better, but I don't know what they are.
Re: D memory consumption/runtime speed problem
sybrandy:

> Now, if I set the second parameter to 1, it runs decently fast: 26 seconds on my work laptop for a half-sized enwik8 from the Hutter challenge. If I set it to 2, then it takes about 142 seconds. In both cases a lot of memory is used and I'm not really sure why. Also, after it prints out the results, it takes several seconds for the program to exit.

My suggestions:

- Stop using array joining: those create many small arrays that the GC has to manage, and the current D GC is not efficient at all. There are many ways to avoid array joinings and avoid most runtime allocations.
- If you can, at the end of the program you can add std.c.stdlib.exit(0); this kills the GC mid-way in a faster way (but probably destructors are not called, so be careful).
- Use the latest daily build of the LDC compiler with very good command line arguments, and if needed link-time optimization activated too :-)

And then please tell us how much time it takes to run :-)

Bye,
bearophile
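The first suggestion, replacing many small array joins with writes into one reusable buffer, amounts to keeping a single backing allocation that grows geometrically, which is also roughly what an appender does internally. A rough C sketch of that pattern (the ByteBuf type and names are mine, purely illustrative, and error handling is omitted for brevity):

```c
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <stddef.h>

/* A minimal growable byte buffer: one allocation that doubles as needed,
 * instead of a fresh heap allocation per appended chunk. */
typedef struct { uint8_t *data; size_t len, cap; } ByteBuf;

static void bb_put(ByteBuf *b, const uint8_t *src, size_t n)
{
    if (b->len + n > b->cap) {
        size_t newcap = b->cap ? b->cap : 16;
        while (newcap < b->len + n)
            newcap *= 2; /* geometric growth: amortized O(1) per byte */
        b->data = realloc(b->data, newcap); /* NULL check omitted */
        b->cap = newcap;
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
}
```

With doubling, appending N bytes costs O(N) total copies and O(log N) allocations, versus one allocation per chunk when concatenating small arrays, which is the behavior the GC was struggling with in the original code.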
Re: D memory consumption/runtime speed problem
sybrandy wrote:

> Hello, I've been writing a bit of compression code and I noticed some strange behavior that's driving me a bit batty. I don't know if it's a bug with D or something I did. All I know is I can't figure it out.
> ...
> Am I doing something wrong? I've tried every trick that I could find by reading the documentation.
>
> Btw: The last time I tried this was with the latest version of D released at the beginning of the month.

I haven't verified this, but I'd be *deeply* suspicious of encodeNumber. I don't usually use array literals but, if I remember correctly, every time it is called you're performing a heap allocation. Even worse, those concatenations might be performing separate allocations, too. You could eliminate the overhead by using a passed-in buffer design like so:

ubyte[] encodeNumber(in uint count, ref ubyte[4] buffer)
{
    if (count <= ONE_BYTE_VAL)
    {
        buffer[0] = cast(ubyte)(ONE_BYTE_MASK | count);
        return buffer[0..1];
    }
    // ...
}

Then, in the calling function:

{
    ubyte[4] temp;
    foreach ( ... )
    {
        appOutput.put(encodeNumber(count, temp));
    }
}

See if that helps.
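The passed-in buffer design translates to other languages too: the callee fills the caller's scratch array and returns how much of it was used, so no call allocates and one stack buffer serves every iteration of the loop. A hedged C sketch of the same idea, handling only the one- and two-byte cases for brevity (the name put_count is mine):

```c
#include <stdint.h>
#include <stddef.h>

/* Passed-in buffer pattern: fill the caller's 4-byte scratch array and
 * return the number of bytes used. Only counts below 2^14 are handled
 * here, for brevity; wider counts would fill buffer[2..3] the same way. */
static size_t put_count(uint32_t count, uint8_t buffer[4])
{
    if (count <= 63) {                        /* ONE_BYTE_VAL in the post */
        buffer[0] = (uint8_t)count;
        return 1;
    }
    buffer[0] = (uint8_t)(0x40 | (count >> 8)); /* TWO_BYTE_MASK = 1 << 6 */
    buffer[1] = (uint8_t)(count & 0xff);
    return 2;
}
```

The caller declares `uint8_t buffer[4];` once outside its loop and passes it to every call, which is exactly the `ubyte[4] temp;` reuse shown above; the returned length tells the caller how many bytes of the scratch buffer to copy into the output.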