On Tuesday, 26 January 2016 at 22:36:31 UTC, H. S. Teoh wrote:
So the moral of the story is: avoid large numbers of small allocations. If you have to do it, consider consolidating your allocations into a series of allocations of large(ish) buffers instead, and taking slices of the buffers.

Thanks for sharing this, H. S. Teoh. I tried replacing the individual allocations with a Region from std.experimental.allocator (with FreeList and Quantizer layered on top), then deallocating everything in one go once I'm done with the data. It seems to be a little faster, but I haven't had time to measure it properly.
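
Roughly this shape, in case anyone wants to try the same thing. An untested sketch from memory, so the building-block constructor details may be off, and the sizes are made up:

    import std.experimental.allocator : makeArray;
    import std.experimental.allocator.building_blocks.free_list : FreeList;
    import std.experimental.allocator.building_blocks.quantizer : Quantizer;
    import std.experimental.allocator.building_blocks.region : Region;
    import std.experimental.allocator.mallocator : Mallocator;

    void main()
    {
        // Region: bump-pointer arena; deallocations are no-ops and the
        // whole arena is released at once, which is essentially the
        // "consolidate and slice" idea from above.
        // FreeList: recycles blocks up to 128 bytes rather than bumping
        // the Region for every tiny request.
        // Quantizer: rounds request sizes up to 64-byte multiples so
        // near-identical sizes can share free-list entries.
        alias Alloc = Quantizer!(
            FreeList!(Region!Mallocator, 0, 128),
            (size_t n) => (n + 63) & ~size_t(63));

        Alloc alloc;
        alloc.parent.parent = Region!Mallocator(64 * 1024 * 1024); // 64 MiB arena

        auto field = alloc.makeArray!char(40); // one of many small allocations
        // ... parse into field, keep slices of it, repeat ...
    }   // arena freed here in one go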

Just came across this C++ project, which seems to have astonishing performance: 7 minutes to read a terabyte, and 2.5 to 4.5 GB/sec reading a file cold. That's pretty impressive. (Obviously they read in parallel, but I haven't yet read the source to see what the other tricks might be.)
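
Even without reading their source, the parallel-chunk part is straightforward to sketch in D with std.parallelism. Something like the following (hypothetical and untested; a real CSV reader would also have to re-align each chunk to a record boundary and actually parse fields, and for terabyte files you'd stream each chunk in fixed-size blocks rather than one buffer per task):

    import core.atomic : atomicOp;
    import std.algorithm.searching : count;
    import std.parallelism : parallel, totalCPUs;
    import std.range : iota;
    import std.stdio : File, writeln;

    // usage: ./prog file.csv
    void main(string[] args)
    {
        immutable path = args[1];
        immutable fileSize = File(path).size;
        immutable nChunks = totalCPUs;
        immutable chunkLen = (fileSize + nChunks - 1) / nChunks;

        shared size_t newlines;
        foreach (i; parallel(iota(nChunks)))
        {
            auto f = File(path);                      // one handle per task
            f.seek(cast(long)(i * chunkLen));
            auto buf = new ubyte[cast(size_t) chunkLen];
            auto got = f.rawRead(buf);                // short read at EOF is fine
            atomicOp!"+="(newlines, got.count('\n')); // stand-in for real parsing
        }
        writeln(newlines, " lines");
    }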

It would be nice to be able to match that in D, though practically speaking it's probably easiest just to wrap it:

http://www.wise.io/tech/paratext

https://github.com/wiseio/paratext
