On Monday, 14 September 2015 at 12:50:03 UTC, Fredrik Boulund
wrote:
On Monday, 14 September 2015 at 12:44:22 UTC, Edwin van Leeuwen
wrote:
Sounds like this program is actually IO bound. In that case I
would not really expect an improvement by using D.
What is the CPU usage like when you run this program?
Also, which dmd version are you using? I think there were some
performance improvements for file reading in the latest
version (2.068).
Hi Edwin, thanks for your quick reply!
I'm using v2.068.1; I actually got inspired to try this out
after skimming the changelog :).
Regarding whether it is IO-bound: I actually expected it would be,
but both the Python and the D versions consume 100% CPU while
running, and just copying the file around only takes a few
seconds (cf. 15-20 sec of runtime for the two programs). There's
bound to be some aggressive file caching going on, but I figure
that would rather bring both programs' runtimes down after
running them a few times, and I see nothing indicating that.
Two things that you could try:
First, hitlists.byKey can be expensive (especially if hitlists is
big), since each key then needs a separate lookup to get its
value. Instead use:
foreach( key, value ; hitlists )
Also, filter.array.length is quite expensive, because .array
allocates a new array just to read its length. You could use
count instead:
import std.algorithm : count;
value.count!(h => h.pid >= (max_pid - max_pid_diff));
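
Putting the two suggestions together, here is a minimal sketch of what the loop could look like. The Hit struct, the sample data and the countNearBest helper are made up for illustration (only pid, hitlists, max_pid and max_pid_diff come from your post), so adapt it to your actual types.

```d
import std.algorithm : count, filter;
import std.array : array;
import std.stdio : writeln;

// Hypothetical hit record; only the `pid` field is taken from the thread.
struct Hit
{
    string target;
    double pid;
}

// Counts hits whose pid lies within max_pid_diff of max_pid.
// `count` walks the range once and allocates nothing.
size_t countNearBest(Hit[] hits, double max_pid, double max_pid_diff)
{
    return hits.count!(h => h.pid >= (max_pid - max_pid_diff));
}

void main()
{
    // Made-up sample data standing in for the parsed input file.
    Hit[][string] hitlists = [
        "query1": [Hit("a", 99.0), Hit("b", 95.0), Hit("c", 80.0)],
        "query2": [Hit("d", 90.0)]
    ];
    immutable double max_pid_diff = 5.0;

    // Iterating key and value together avoids a second hash lookup per key.
    foreach (key, value; hitlists)
    {
        // Find the best pid for this key with a plain loop.
        double max_pid = 0.0;
        foreach (h; value)
            if (h.pid > max_pid)
                max_pid = h.pid;

        auto n = countNearBest(value, max_pid, max_pid_diff);

        // Same result as the allocating filter/array/length chain.
        assert(n == value.filter!(h => h.pid >= (max_pid - max_pid_diff)).array.length);

        writeln(key, ": ", n, " hit(s) near the best pid");
    }
}
```

Whether this makes a measurable difference depends on how large the hit lists are, so it is worth timing before and after.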