On 2013-02-01 16:42, Sparsh Mittal wrote:
> When I run it and compare this parallel version with its serial version, I only
> get a speedup of less than 1.3 for 2 threads. When I write the same program in Go,
> scaling is nearly 2.
> Also, in D, on running "top", I see the usage as only 130% CPU and not nearly 200%
> or 180%. So I was wondering whether I am doing it properly. Please help me.

Probably because SolverSlave doesn't have enough work per step to justify
splitting it across threads: the cost of the threads and barriers eats most of
the gain. As Dmitry wrote, std.parallelism may be a better tool for the job.
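
For illustration, here is a rough sketch of how a grid sweep could look with
std.parallelism. It is not your SolverSlave code: relaxOnce, the four-point
averaging stencil, the boundary values and the sweep count are all made up for
the example. Only gridSize comes from your program. The point is that a
parallel foreach hands out rows to the task pool and returns only once every
row is done, so no hand-written barrier is needed between sweeps.

```d
import std.parallelism : parallel;
import std.range : iota;

// Hypothetical Jacobi-style sweep (an assumption, not the original solver):
// each interior cell of dst becomes the average of its four neighbours in src.
void relaxOnce(const double[][] src, double[][] dst)
{
    immutable n = src.length;
    // Rows are independent work items. The parallel foreach returns only
    // after all rows have been processed, which acts as the barrier.
    foreach (i; parallel(iota(1, n - 1)))
    {
        foreach (j; 1 .. n - 1)
            dst[i][j] = 0.25 * (src[i - 1][j] + src[i + 1][j]
                              + src[i][j - 1] + src[i][j + 1]);
    }
}

void main()
{
    import std.algorithm : swap;
    import std.stdio : writeln;

    enum gridSize = 1024;   // same name as in your code
    enum sweeps = 100;      // hypothetical iteration count

    auto a = new double[][](gridSize, gridSize);
    auto b = new double[][](gridSize, gridSize);
    foreach (row; a) row[] = 0.0;
    foreach (row; b) row[] = 0.0;
    a[0][] = 1.0;           // made-up boundary condition on the top edge
    b[0][] = 1.0;

    foreach (s; 0 .. sweeps)
    {
        relaxOnce(a, b);
        swap(a, b);         // the freshly written grid becomes the next input
    }
    writeln(a[gridSize / 2][gridSize / 2]);
}
```

If the rows are short, passing an explicit work unit size as the second
argument to parallel() may help reduce scheduling overhead, but that needs
measuring on your machine.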
I tested your code on 2 cores (with the whole of main() wrapped in a loop).
It uses about 82% of both cores.
After I increased gridSize to 1024, it used 88% of the CPU.