I am trying to parallelize kernel matrix calculations using threadpool:
proc calculateKernelMatrix*(K: AbstractKernel, data: Matrix[F]): Matrix[F] =
  let n = int64(ncol(data));
  var mat = Matrix[F](data: newSeq[F](n*n), dim: @[n, n]);
  for j in 0..<n:
    for i in j..<n:
      var tmp: F;
      spawn kernel(K, data.col(i), data.col(j), tmp.addr);
      mat[i, j] = tmp;
      mat[j, i] = tmp;
  sync();
  return mat;
I'd like each j to run in a new thread, rather than each (i, j) pair.
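
To make the "one task per j" idea concrete, here is a minimal sketch of what I have in mind. It is not my real code: `Matrix` is a simplified, hypothetical stand-in for `Matrix[F]` (column-major `seq` plus `nrow`/`ncol`), `dotKernel` is a placeholder for `kernel(K, ...)`, and `fillColumn` is a helper name I made up. It assumes `std/threadpool` with `--threads:on` (deprecated in Nim 2.0 as far as I know, but still shipped). Each task writes its results straight into the output buffer, so nothing reads `tmp` before `sync()`.

```nim
# Compile with: nim c --threads:on example.nim
import std/threadpool

type
  F = float
  Matrix = object            # hypothetical stand-in for Matrix[F]
    data: seq[F]             # column-major storage
    nrow, ncol: int

proc dotKernel(x, y: ptr UncheckedArray[F], p: int): F =
  ## Placeholder linear kernel; the real kernel(K, ...) would go here.
  for k in 0 ..< p:
    result += x[k] * y[k]

proc fillColumn(data, mat: ptr UncheckedArray[F], p, n, j: int) =
  ## One task per j: computes every (i, j) with i >= j and mirrors it.
  let cj = cast[ptr UncheckedArray[F]](addr data[j * p])
  for i in j ..< n:
    let ci = cast[ptr UncheckedArray[F]](addr data[i * p])
    let v = dotKernel(ci, cj, p)
    mat[j * n + i] = v       # element (i, j)
    mat[i * n + j] = v       # element (j, i)

proc calculateKernelMatrix(data: Matrix): Matrix =
  let (p, n) = (data.nrow, data.ncol)
  result = Matrix(data: newSeq[F](n * n), nrow: n, ncol: n)
  if n == 0 or p == 0: return
  # Raw pointers are passed so the tasks share the buffers instead of copying them.
  let src = cast[ptr UncheckedArray[F]](unsafeAddr data.data[0])
  let dst = cast[ptr UncheckedArray[F]](addr result.data[0])
  for j in 0 ..< n:
    spawn fillColumn(src, dst, p, n, j)
  sync()                     # wait for all n tasks before returning
```

No two tasks write the same element: (a, b) with a >= b is written only by task b, and its mirror only by that same task. The work per task is uneven, though (task j handles n - j entries), so the early columns take longest.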
