On Tuesday, 12 May 2015 at 17:45:54 UTC, thedeemon wrote:
On Tuesday, 12 May 2015 at 17:02:19 UTC, Gerald Jansen wrote:

About 3.5 million lines read by main(), 0.5 to 2 million lines read and 3.5 million lines written by runTraits (aka runJob).

Each GC allocation in D is a locking operation (and disabling the GC doesn't help here at all), and probably each writeln is too, so when multiple threads try to write millions of lines it is easy to run into this kind of contention. I would look for a way to write those lines without allocations and locking, and also reduce the total number of system calls by buffering the data and making fewer f.writef calls.
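
A minimal sketch of the buffering idea, not the code from this thread: each job formats its lines into one in-memory buffer and writes it with a single call, so there are far fewer allocations and write calls per job. The helper name formatLines, the job count, the line format, and the file names are all made up for illustration.

```d
import std.array : appender;
import std.format : format, formattedWrite;
import std.parallelism : parallel;
import std.range : iota;
import std.stdio : File;

// Hypothetical helper: formats `count` lines into a single buffer.
string formatLines(int jobId, size_t count)
{
    auto buf = appender!string();
    // Reserve space up front so the buffer rarely has to grow
    // (24 bytes per line is only a rough guess for this format).
    buf.reserve(count * 24);
    foreach (i; 0 .. count)
        formattedWrite(buf, "%d\t%d\n", jobId, i);
    return buf.data;
}

void main()
{
    // Each job writes to its own file (hypothetical names), so the
    // threads do not share a File and do not contend for its lock.
    foreach (jobId; parallel(iota(4)))
    {
        auto f = File(format("job_%d.out", jobId), "w");
        // One write call per job instead of one per line.
        f.rawWrite(formatLines(jobId, 1000));
    }
}
```

For outputs too large to hold in memory, the same pattern should work by flushing the buffer every few thousand lines rather than once per job.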

Your advice is appreciated but quite disheartening. I was hoping for something (nearly) as easy to use as Python's multiprocessing.Pool().map(), given that this is essentially an "embarrassingly parallel" problem. Avoiding GC allocation and writing my own buffered IO functions seems a bit much to ask of a newcomer to a language.
