I'm still having problems understanding the basic concepts of
parallelization in Julia. It seems to me that the examples in the
documentation and those I've found elsewhere on the web don't really
reflect my use case, so I'm wondering whether I'm approaching the problem
from the right angle. I've written a short piece of code that illustrates
what I'm trying to do: basically, it's a large number of small calculations
whose results have to be stored in one large 4-D array.

Here's the example:

addprocs(3)   # start three worker processes

# parameter grids
agrid = linspace(1, 4, 4)
bgrid = linspace(-1.05, 1.05, 30)
cgrid = linspace(-0.1, 0.1, 40)
dgrid = linspace(0.5, 1000, 40)

# 4-D output array shared between the master and all workers
result = SharedArray(Float64, (length(agrid), length(bgrid),
                               length(cgrid), length(dgrid)), pids=procs())

# the toy calculation; @everywhere defines it on the workers as well
@everywhere function calculate(a,b,c,d)
  quadgk(cos, -b*10π, c*10π)[1] + quadgk(sin, -b*10π, c*10π)[1]*d
end

function solveall()
  for a = 1:length(agrid)
    for b = 1:length(bgrid)
      for c = 1:length(cgrid)
        # @sync makes sure all workers have finished writing into the
        # SharedArray before moving on; without it, @parallel returns
        # immediately and the timing below doesn't measure the real work
        @sync @parallel for d = 1:length(dgrid)
          result[a,b,c,d] = calculate(agrid[a], bgrid[b], cgrid[c], dgrid[d])
        end
      end
    end
  end
  return result
end

@time solveall()

Unfortunately, the speedup from parallelizing the inner loop isn't great
(going from ~9s to ~7.5s on my machine), so I'm wondering whether this is
actually the best way of implementing the parallelization. My original
idea was to somehow parallelize the outer loop, so that each worker
returns a 30x40x40 array, but I don't see how I can get the worker
processes to run the inner loops correctly.
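For what it's worth, here is a rough sketch of that outer-loop idea, using
the same grids and calculate definition as above. It replaces @parallel
with pmap, which hands one value of a to each worker at a time; each worker
fills in a complete 30x40x40 slice serially and sends it back, so no
SharedArray is needed. Treat it as a starting point under those
assumptions, not a drop-in replacement:

```julia
addprocs(3)   # start three worker processes

# parameter grids (same as in the post above)
agrid = linspace(1, 4, 4)
bgrid = linspace(-1.05, 1.05, 30)
cgrid = linspace(-0.1, 0.1, 40)
dgrid = linspace(0.5, 1000, 40)

# the toy calculation, defined on the workers as well
@everywhere function calculate(a, b, c, d)
  quadgk(cos, -b*10π, c*10π)[1] + quadgk(sin, -b*10π, c*10π)[1]*d
end

# Each call computes one complete b×c×d slice for a single value of a
# and returns it as an ordinary Array.
@everywhere function solveslice(a, bgrid, cgrid, dgrid)
  slice = Array(Float64, length(bgrid), length(cgrid), length(dgrid))
  for b = 1:length(bgrid), c = 1:length(cgrid), d = 1:length(dgrid)
    slice[b, c, d] = calculate(a, bgrid[b], cgrid[c], dgrid[d])
  end
  return slice
end

# pmap sends one value of a to each worker and collects the slices
slices = pmap(a -> solveslice(a, bgrid, cgrid, dgrid), agrid)

# stitch the per-worker slices into the full 4-D array on the master
combined = Array(Float64, length(agrid), length(bgrid),
                 length(cgrid), length(dgrid))
for i = 1:length(agrid)
  combined[i, :, :, :] = slices[i]
end
```

Because each task here is a whole 30x40x40 slice rather than a single
quadgk call, the communication overhead per unit of work should be much
lower than with the inner-loop @parallel version.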

Any input would be greatly appreciated, as I've been trying to parallelize
this for a while and seem to be at the point where I just get more
confused the harder I try.
