On Wednesday, 6 May 2020 at 04:04:14 UTC, Mathias LANG wrote:
On Wednesday, 6 May 2020 at 03:41:11 UTC, data pulverizer wrote:
Yes, that's exactly what I want the actual computation I'm running is much more expensive and much larger. It shouldn't matter if I have like 100_000_000 threads should it? The threads should just be queued until the cpu works on it?

It does matter quite a bit. Each thread has its own resources allocated to it, and some part of the language will need to interact with *all* threads, e.g. the GC. In general, if you want to parallelize something, you should aim to have as many threads as you have cores. Having 100M threads will mean you have to do a lot of context switches. You might want to look up the difference between tasks and threads.

Sorry, I meant 10_000 not 100_000_000 I square the number by mistake because I'm calculating a 10_000 x 10_000 matrix it's only 10_000 tasks, so 1 task does 10_000 calculations. The actual bit of code I'm parallelising is here:

```
auto calculateKernelMatrix(T)(AbstractKernel!(T) K, Matrix!(T) data)
{
  long n = data.ncol;
  auto mat = new Matrix!(T)(n, n);

  foreach(j; taskPool.parallel(iota(n)))
  {
    auto arrj = data.refColumnSelect(j).array;
    for(long i = j; i < n; ++i)
    {
      mat[i, j] = K.kernel(data.refColumnSelect(i).array, arrj);
      mat[j, i] = mat[i, j];
    }
  }
  return mat;
}
```

At the moment this code is running a little bit faster than threaded simd optimised Julia code, but as I said in an earlier reply to Ali when I look at my system monitor, I can see that all the D threads are active and running at ~ 40% usage, meaning that they are mostly doing nothing. The Julia code runs all threads at 100% and is still a tiny bit slower so my (maybe incorrect?) assumption is that I could get more performance from D. The method `refColumnSelect(j).array` is (trying to) reference a column from a matrix (1D array with computed index referencing) which I select from the matrix using:

```
return new Matrix!(T)(data[startIndex..(startIndex + nrow)], [nrow, 1]);
```

If I use the above code, I am I wrong in assuming that the sliced data (T[]) is referenced rather than copied? That so if I do:

```
auto myData = data[5...10];
```

myData is referencing elements [5..10] of data and not creating a new array with elements data[5..10] copied?

Reply via email to