On Friday, 11 August 2017 at 21:33:51 UTC, Arun Chandrasekaran wrote:
I've modified the sample from tour.dlang.org to calculate the
md5 digest of the files in a directory using std.parallelism.
When I run this on a dir with a huge number of files, I get:

Memory allocation failed

Since dirEntries returns a range, I thought
std.parallelism.parallel could make use of it without loading
the entire file list into memory.
What am I doing wrong here? Is there a way to achieve what I'm
trying to do?
import std.algorithm, std.array, std.digest.md, std.file,
    std.parallelism, std.stdio;

void printUsage()
{
    writeln("Loops through a given directory and calculates the md5 digest of each file encountered.");
    writeln("Usage: md <dirname>");
}

// Serialize output from concurrent loop iterations.
void safePrint(T...)(T args)
{
    synchronized writeln(args);
}

void main(string[] args)
{
    if (args.length != 2)
    {
        printUsage();
        return;
    }
    foreach (d; parallel(dirEntries(args[1],
            SpanMode.depth).filter!(f => f.isFile), 1))
    {
        auto md5 = new MD5Digest();
        md5.put(cast(const(ubyte)[]) read(d.name));
        auto hash = md5.finish();
        auto t = split(d.name, '/');
        safePrint(toHexString!(LetterCase.lower)(hash), "  ", t[$ - 1]);
    }
}
Just a thought, maybe the GC isn't cleaning up quickly enough?
You are allocating a new MD5 digest on each iteration.

A possible optimization is to use a collection of md5 digests
and reuse them. E.g., pre-allocate 100 (you probably only need
as many as the number of parallel loops going) and then attempt
to reuse them. If all are in use, wait for a free one. Might
require some synchronization.
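The "one digest per parallel worker" idea can be sketched without
explicit locking using std.parallelism's workerLocalStorage, which
gives each worker thread its own pre-allocated digest to reuse
across iterations. This is only a sketch of that approach, adapted
from the loop in the original post, not a tested fix:

```d
import std.algorithm : filter;
import std.digest.md;
import std.file : dirEntries, read, SpanMode;
import std.parallelism : parallel, taskPool;

void main(string[] args)
{
    // One MD5Digest per worker thread, allocated once and then
    // reused, instead of a fresh allocation per file.
    auto perWorker = taskPool.workerLocalStorage(new MD5Digest());

    foreach (d; parallel(dirEntries(args[1],
            SpanMode.depth).filter!(f => f.isFile), 1))
    {
        auto md5 = perWorker.get;   // this thread's private digest
        md5.reset();                // clear state from the last file
        md5.put(cast(const(ubyte)[]) read(d.name));
        auto hash = md5.finish();
        // ... print the hash as before
    }
}
```

Since each worker only ever touches its own copy, no waiting or
synchronization is needed. Note that read() still loads each whole
file into GC memory, so this only removes the per-iteration digest
allocations, not the file-content ones.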