Dear all,

This topic follows on from this one: https://groups.google.com/forum/#!topic/julia-users/S86qxRkJ0ao which was not very easy to read, so I am taking the liberty of restating my problem in simpler terms.

I want to compute a matrix row by row. Since the rows can be computed independently, on the advice of Tim Holy I use a SharedArray, where each worker is given a range of rows on which to perform a given operation. The code is:

    @everywhere begin
        m = Int(1e3)
        n = Int(5e3)
        mat_b = rand(m, n)

        function compute_row!(smat_a, mat_b, irange, n)
            for i in irange
                smat_a[i,i:n] = mean(mat_b[:,i] .* mat_b[:,i:n], 1)
            end
        end
    end

    smat_a = SharedArray(Float64, (n,n))

    tic()
    @sync begin
        for p in procs(smat_a)
            @async begin
                irange = p-1:length(procs(smat_a)):n
                remotecall_wait(p, compute_row!, smat_a, mat_b, irange, n)
            end
        end
    end
    toc()

And the problem is that it is always slower than the sequential version... I have no idea why this does not work. Any suggestion appreciated, thanks a lot!
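
To make the row assignment concrete, here is a small sketch of how the striding rule `p-1:length(procs(smat_a)):n` splits the rows. It runs on a single process and assumes four workers with ids 2 to 5 (what `addprocs(4)` would typically give); the names `worker_ids`, `nw`, `ranges`, `assigned` and `work` are only for illustration, and `n` is kept tiny.

```julia
# Hypothetical worker ids (assumption: four workers, ids 2..5).
worker_ids = [2, 3, 4, 5]
n = 20  # small size, for illustration only

# Same striding rule as in the code above: worker p takes rows
# p-1, p-1+nw, p-1+2nw, ...
nw = length(worker_ids)
ranges = [p-1:nw:n for p in worker_ids]

# Check that every row 1:n is assigned to exactly one worker.
assigned = sort(vcat([collect(r) for r in ranges]...))
println(assigned == collect(1:n))

# Row i fills columns i:n, so its cost is roughly proportional to n - i + 1.
# Summing per worker gives a rough idea of how balanced the split is.
work = [sum([n - i + 1 for i in r]) for r in ranges]
println(work)
```

Under these assumptions the rows are covered exactly once and the per-worker work is roughly even, so the partitioning itself does not look like the cause of the slowdown. Note that the rule relies on the ids starting at 2: if process 1 were in the list, `p-1` would start at row 0.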