I've done a very quick & dirty rewrite with OpenMP instead of Nim threads, and 
it runs in 0.5 seconds instead of 8 seconds.

I think there is too much lock contention when lots of cores are present on the 
machine.
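As a rough illustration of why a reduction sidesteps lock contention, here is a minimal C sketch (C because that is the level OpenMP pragmas operate at). It is not taken from pibench2: the `estimate_pi` helper and the midpoint-rule integration of 4/(1+x²) are assumptions chosen for illustration, and the benchmark may well use a different formula. Each thread sums into its own private accumulator, and the partial sums are combined once at the end, so nothing is locked inside the loop.

```c
#include <assert.h>

/* Hypothetical helper, not from pibench2: midpoint-rule estimate of the
   integral of 4/(1+x^2) over [0,1], which equals pi. */
double estimate_pi(long steps) {
    assert(steps > 0);  /* at least one interval is required */
    const double h = 1.0 / (double)steps;
    double sum = 0.0;

    /* reduction(+:sum) gives every thread a private copy of `sum` and
       merges the copies after the loop, so no lock is taken per iteration.
       If the compiler is run without OpenMP support the pragma is ignored
       and the loop simply runs serially. */
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < steps; i++) {
        const double x = ((double)i + 0.5) * h;  /* midpoint of interval i */
        sum += 4.0 / (1.0 + x * x);
    }
    return sum * h;
}
```

Calling `estimate_pi(1000000)` from a `main` and building with something like `gcc -O2 -fopenmp` should be enough to try it. The contrast is with a design where every thread updates one shared total under a lock, which serialises more and more as the core count grows.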

Unfortunately, OpenMP doesn't seem to work with Channels, so the progress bar 
only shows core 0 at the moment. Also, the result I get is 3.141592649989267 
instead of 3.141592653489267, so it still needs some debugging.

In any case, here is the openmp branch: 
[https://github.com/mratsim/pibench2/tree/openmp-version](https://github.com/mratsim/pibench2/tree/openmp-version).
It needs to be compiled with `-d:openmp`. Note that stack traces must be off to 
avoid segfaults (they already are in release mode); otherwise Nim will allocate 
strings for stack traces on threads with no GC initialised, unless you call 
`setupForeignThreadGc()`.
