Re: multithread/concurrency/parallel methods and performance

SrMordred via Digitalmars-d-learn Mon, 19 Feb 2018 06:56:10 -0800

On Monday, 19 February 2018 at 05:49:54 UTC, Nicholas Wilson wrote:

As SIZE=1024*1024 (i.e. not much, possibly well within L2 cache for 32-bit) it may be that dealing with the concurrency overhead adds a significant amount of overhead.


That 'concurrency overhead' is what I'm not getting.

Since the array is big, dividing it across 6 or 7 cores will not thrash L1, since the chunks are very far from each other, right? Or is L2 cache thrashing also a problem in this case?
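
To make the question concrete, here is a minimal sketch of the kind of split being described, assuming a placeholder workload of one cheap write per element (the real loop body is not shown in this message). The names bumpSerial, bumpParallel and CORES are hypothetical; CORES = 6 is only taken from the "6, 7 cores" wording above. Each work unit is a contiguous slice, so two threads should only touch the same cache line near a slice boundary.

```d
import std.algorithm.comparison : max;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.parallelism : parallel;
import std.stdio : writeln;

enum SIZE = 1024 * 1024; // array length quoted in the thread
enum CORES = 6;          // assumption: taken from "6, 7 cores"

// Placeholder workload (hypothetical): one cheap write per element.
void bumpSerial(int[] a)
{
    foreach (ref x; a)
        x += 1;
}

// Same workload, handed to the default task pool in work units of
// roughly a.length / cores contiguous elements each.
void bumpParallel(int[] a, size_t cores)
{
    immutable unit = max(size_t(1), a.length / cores);
    foreach (ref x; parallel(a, unit))
        x += 1;
}

void main()
{
    auto data = new int[SIZE];

    // Time the single-threaded pass.
    auto sw = StopWatch(AutoStart.yes);
    bumpSerial(data);
    writeln("serial:   ", sw.peek);

    // Time the pass split across the task pool.
    sw.reset();
    bumpParallel(data, CORES);
    writeln("parallel: ", sw.peek);
}
```

With a loop body this light, the run may be limited by memory traffic rather than by arithmetic, and the first parallel call also pays for starting the lazily created task pool, so it may be worth timing a second parallel pass as well before drawing conclusions.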

_base : 150 ms, 728 μs, and 5 hnsecs
_parallel : 120 ms, 78 μs, and 5 hnsecs
_concurrency : 134 ms, 787 μs, and 4 hnsecs
_thread : 129 ms, 476 μs, and 2 hnsecs


Yes, on my PC I was using -release.

Yet that is 150 ms on 1 core versus 120-134 ms on X cores.

Shouldn't it be way faster? I'm trying to understand where the overhead is, and whether it is possible to get rid of it (perfect thread scaling).
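
To put a number on "way faster", the timings quoted above can be turned into speedup factors. This is a small sketch; the helper name speedup is hypothetical, and the durations are copied from the figures in this message.

```d
import core.time : Duration, hnsecs, msecs, usecs;
import std.stdio : writefln;

// Speedup of `other` relative to `base`: values above 1 mean `other` is faster.
double speedup(Duration base, Duration other)
{
    return cast(double) base.total!"hnsecs" / other.total!"hnsecs";
}

void main()
{
    // Timings as quoted in the message.
    immutable base        = 150.msecs + 728.usecs + 5.hnsecs;
    immutable parallel    = 120.msecs +  78.usecs + 5.hnsecs;
    immutable concurrency = 134.msecs + 787.usecs + 4.hnsecs;
    immutable thread      = 129.msecs + 476.usecs + 2.hnsecs;

    writefln("_parallel:    %.2fx", speedup(base, parallel));
    writefln("_concurrency: %.2fx", speedup(base, concurrency));
    writefln("_thread:      %.2fx", speedup(base, thread));
}
```

The factors come out at roughly 1.26x, 1.12x and 1.16x. If 6 cores were in use (an assumption; the message only says "X cores"), perfect scaling would be 6x, so 1.26x corresponds to about 21% parallel efficiency.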

Re: multithread/concurrency/parallel methods and performance
