Have you read this part of the manual?
http://julia.readthedocs.org/en/latest/manual/parallel-computing/
There is a paragraph saying: "For example, the following code will not work as intended", which I think is pretty close to your code.

Kind regards,

Andreas Noack

2014-10-21 13:01 GMT-04:00 Nils Gudat <[email protected]>:

> I'm trying to understand parallelization in Julia, but as a former MATLAB
> user who is used to just sprinkling some random "parfor"'s around my code
> I'm not having too much success.
> Here's my test program, that runs in a similar way to the actual program I
> want to parallelize. The basic feature is that I have a number of variables
> stored in arrays and I want to call a function using each possible
> combination of these variables. The function itself consists of an
> integration and a minimization, and takes around 0.005 seconds to compute.
>
> When running the program below, two things stand out:
> (i) The result matrices of all parallelized calculations are empty. I
> seem to be doing something fundamentally wrong in the way I'm structuring
> this.
> (ii) The further I move up along the chain of for loops, the shorter my
> calculation time; when I parallelize the outer-most loop, the calculation
> time drops to 0.02 seconds, which seems to indicate that most computations
> are simply not performed.
>
> As you can tell, I don't really know what I'm doing here so any help would
> be greatly appreciated!
>
> Test program:
>
> addprocs(3)
>
> @everywhere using Distributions
> @everywhere using QuantEcon
>
> @everywhere function f{T<:Float64}(x1::T, x2::T, x3::T)
>
>     distr = LogNormal(x1+2, x2+2)
>
>     function f2(x, x1=x1, x2=x2, x3=x3)
>         (x*x1 + x*x2 - x*x3).*pdf(distr, x)
>     end
>
>     quadrect(f2, 500, x1, x2+2)
> end
>
> X1 = rand(10, 1)
> X2 = rand(10, 1)
> X3 = rand(10, 1)
> Results = zeros(10, 10, 10)
> ResultsInner = zeros(10, 10, 10)
> ResultsMiddle = zeros(10, 10, 10)
> ResultsOuter = zeros(10, 10, 10)
>
> tic()
> for i = 1:10
>     x1 = X1[i]
>     for j = 1:10
>         x2 = X2[j]
>         for k = 1:10
>             x3 = X3[k]
>             Results[i, j, k] = f(x1, x2, x3)
>         end
>     end
> end
> @printf "The one-core loop takes %.2f seconds\n" toq()
>
> tic()
> for i = 1:10
>     x1 = X1[i]
>     for j = 1:10
>         x2 = X2[j]
>         @parallel for k = 1:10
>             x3 = X3[k]
>             ResultsInner[i, j, k] = f(x1, x2, x3)
>         end
>     end
> end
> @printf "The multi-core loop (inner) takes %.2f seconds\n" toq()
>
> tic()
> for i = 1:10
>     x1 = X1[i]
>     @parallel for j = 1:10
>         x2 = X2[j]
>         for k = 1:10
>             x3 = X3[k]
>             ResultsMiddle[i, j, k] = f(x1, x2, x3)
>         end
>     end
> end
> @printf "The multi-core loop (middle) takes %.2f seconds\n" toq()
>
> tic()
> @parallel for i = 1:10
>     x1 = X1[i]
>     for j = 1:10
>         x2 = X2[j]
>         for k = 1:10
>             x3 = X3[k]
>             ResultsOuter[i, j, k] = f(x1, x2, x3)
>         end
>     end
> end
> @printf "The multi-core loop (outer) takes %.2f seconds\n" toq()
>
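
For reference, here is how the manual paragraph seems to apply to the quoted loops: each worker process gets its own copy of the Results arrays, so assignments made inside the parallel loop never reach the array on the master process. As far as I can tell, a parallel loop without a reducer also returns before the workers have finished, which would explain the very short timings. Below is a minimal sketch of one possible fix, not the only one. It assumes a current Julia (1.x), where @parallel is called @distributed and lives in the Distributed standard library, and it uses a SharedArray, which only works when all workers run on the same machine. g is a hypothetical, dependency-free stand-in for f from the quoted program.

```julia
using Distributed
addprocs(3)                       # same worker count as in the quoted program
@everywhere using SharedArrays

# Hypothetical stand-in for the quoted f(x1, x2, x3); assumed here only to
# keep the sketch free of Distributions and QuantEcon.
@everywhere g(x1, x2, x3) = x1 + 2x2 - x3

X1 = rand(10)
X2 = rand(10)
X3 = rand(10)

# A SharedArray is backed by shared memory, so writes made by the workers
# are visible on the master process (single machine only).
ResultsOuter = SharedArray{Float64}((10, 10, 10))

# @sync blocks until every worker has finished its share of the i range.
@sync @distributed for i = 1:10
    for j = 1:10, k = 1:10
        ResultsOuter[i, j, k] = g(X1[i], X2[j], X3[k])
    end
end

# Serial reference computed on the master process, for comparison.
Expected = [g(X1[i], X2[j], X3[k]) for i = 1:10, j = 1:10, k = 1:10]
println(ResultsOuter == Expected)
```

Parallelizing only the outermost loop keeps the per-task work as large as possible, which matters when a single call takes around 0.005 seconds as stated in the quoted message. An alternative that avoids shared memory would be to return the values from the workers (for example with pmap) and assemble the array on the master.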
