Re: std.parallelism: Request for Review

dsimcha Sun, 27 Feb 2011 06:50:45 -0800

On 2/27/2011 8:03 AM, Russel Winder wrote:

32-bit mode on an 8-core (twin Xeon) Linux box.  That core.cpuid bug
really, really sucks.


I see matrix inversion takes longer with 4 cores than with 1!

Can you please re-run the benchmark to make sure that this isn't just a one-time anomaly? I can't seem to make the parallel matrix inversion run slower than serial on my hardware, even with ridiculous tuning parameters that I was almost sure would bottleneck the thing on the task queue. Also, all the other benchmarks actually look pretty good.

It's possible that machines with multiple physical CPUs are much more likely to bottleneck on the task queue because synchronized blocks cost a few more clock cycles. It's also possible that stack alignment issues are creeping in somewhere I hadn't anticipated, or that using 4 cores instead of two on a fairly fine-grained benchmark is enough to bottleneck on the queue (though I doubt this because this benchmark worked well for others with quad cores).
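
To make the queue-bottleneck hypothesis concrete, here is a minimal sketch, not the actual benchmark code. It assumes a current D compiler with std.parallelism and std.datetime.stopwatch; the function scaleRows, the matrix size, and the two work unit sizes are hypothetical, chosen only for illustration. A work unit size of 1 makes each worker go back to the shared queue after every row, while a larger size amortizes that synchronization over many rows.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.parallelism : taskPool;
import std.stdio : writefln;

// Hypothetical workload: double every element, one row per loop iteration.
// workUnitSize is the number of rows a worker takes from the queue at once.
void scaleRows(double[][] rows, size_t workUnitSize)
{
    foreach (ref row; taskPool.parallel(rows, workUnitSize))
    {
        row[] *= 2.0;
    }
}

void main()
{
    enum n = 512; // assumed matrix dimension, small enough to be fine-grained
    auto rows = new double[][](n, n);
    foreach (r; rows)
        r[] = 1.0;

    // Assumed tuning values: 1 stresses the queue, 64 batches the work.
    size_t[] sizes = [1, 64];
    foreach (workUnitSize; sizes)
    {
        auto sw = StopWatch(AutoStart.yes);
        scaleRows(rows, workUnitSize);
        writefln("work unit size %s: %s usecs",
                 workUnitSize, sw.peek.total!"usecs");
    }
}
```

If the timing for a work unit size of 1 is much worse than for 64 on the twin Xeon box but not on a single-socket machine, that would point toward synchronization cost on the queue rather than the inversion algorithm itself.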
