Thank you, folks, for your hints and suggestions!
Indeed, I rewrote the code, and it is now substantially faster and
parallelizes well.
Instead of parallelizing only the inner loop, I parallelized both
loops. For that I had to convert the 2D index into a 1D index, and
then back to 2D. Essentially I had to
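
For readers following along, here is a minimal sketch of that kind of index flattening. It assumes an n-by-m index space; `flatFill` and the placeholder arithmetic are hypothetical names for illustration, not the poster's actual code.

```D
import std.parallelism : parallel;
import std.range : iota;

// Hypothetical helper: fills an n-by-m grid through one flat parallel loop
// instead of two nested loops.
double[] flatFill(size_t n, size_t m)
{
    auto result = new double[n * m];
    foreach (k; parallel(iota(n * m)))
    {
        immutable i = k / m; // row index recovered from the flat index
        immutable j = k % m; // column index recovered from the flat index
        // Placeholder arithmetic; each iteration writes only its own slot,
        // so no synchronization is needed.
        result[k] = cast(double) i * 10 + j;
    }
    return result;
}

void main()
{
    auto r = flatFill(3, 4);
    assert(r.length == 12);
    assert(r[1 * 4 + 2] == 12.0); // element (i = 1, j = 2)
}
```

With a single flat range, the thread pool gets n * m iterations to distribute rather than only the m iterations of one inner loop at a time.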
On 10/18/22 06:24, Guillaume Piolat wrote:
> To win something with OS threads, you must think of tasks that take on
> the order of milliseconds rather than less than 0.1ms.
> Else you will just pay extra in synchronization costs.
In other words, the OP can adjust work unit size. It is on the
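
As a rough illustration of adjusting the work unit size, `std.parallelism.parallel` accepts an optional second argument that sets how many consecutive elements each task processes. The function name `roots` and the sizes used below are assumptions picked for the sketch, not values from this thread.

```D
import std.math : sqrt;
import std.parallelism : parallel;
import std.range : iota;

// Hypothetical example: each task handles workUnitSize consecutive indices,
// so the per-task cost grows with workUnitSize and the scheduling overhead
// is paid less often.
double[] roots(size_t n, size_t workUnitSize)
{
    auto result = new double[n];
    foreach (i; parallel(iota(n), workUnitSize))
        result[i] = sqrt(cast(double) i);
    return result;
}

void main()
{
    // 100_000 iterations in chunks of 10_000, i.e. about ten work units.
    auto r = roots(100_000, 10_000);
    assert(r[4] == 2.0);
}
```

A suitable work unit size depends on how expensive one iteration is, so it is worth measuring a few values rather than trusting any single number.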
On Tuesday, 18 October 2022 at 11:56:30 UTC, Yura wrote:
```D
// Then for each Sphere, i.e. dot[i],
// I need to do some arithmetic with itself and the other dots.
// I have only parallelized the inner loop; i is fixed.
```
It's usually a much better idea to parallelize the outer loop.
Even OpenMP
On Tuesday, 18 October 2022 at 11:56:30 UTC, Yura wrote:
What am I doing wrong?
The size of your tasks is way too small.
To win something with OS threads, you must think of tasks that
take on the order of milliseconds rather than less than 0.1ms.
Else you will just pay extra in
Dear All,
I am trying to make a simple piece of code run in parallel. The parallel
version works and gives the same result as the serial one, albeit more slowly.
First, the parallel features I am using:
import core.thread : Thread;
import std.range;
import std.parallelism : parallel, taskPool;