On Sunday, 10 January 2016 at 03:23:14 UTC, Ilya wrote:
I will add significantly faster pairwise summation based on SIMD instructions into the future std.las. --Ilya
Wow! A lot of overhead in the debug build. I checked the computed values are the same. This is on my laptop corei5.
dub -b release-nobounds --force parallel time msec:448 non_parallel msec:767 dub -b debug --force parallel time msec:2465 non_parallel msec:4962 on my corei7 desktop, the release-no bounds parallel time msec:161 non_parallel msec:571
