Have you read this part of the manual?

http://julia.readthedocs.org/en/latest/manual/parallel-computing/

There is a paragraph saying: "For example, the following code will not work
as intended", which I think is pretty close to your code.
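
For reference, here is a minimal sketch of one way around that problem. A
plain Array is copied to each worker, so writes made inside the parallel loop
never reach the master's copy. A parallel loop without a reduction also
returns immediately unless it is prefixed with @sync, which would be
consistent with the very short timings. The sketch assumes current Julia
(1.x), where the macro is called @distributed and lives in the Distributed
standard library; in the version used in this thread the corresponding macro
is @parallel. The function g is a hypothetical stand-in for f, and
fill_shared is a name made up for illustration.

```julia
using Distributed
addprocs(3)
@everywhere using SharedArrays

# Hypothetical stand-in for the f in the quoted program.
@everywhere g(x1, x2, x3) = x1 + 2 * x2 - x3

function fill_shared(X1, X2, X3)
    # A SharedArray is backed by shared memory, so writes made on the
    # workers are visible on the master process.
    results = SharedArray{Float64}((length(X1), length(X2), length(X3)))
    # @sync blocks until every worker has finished its share of i.
    @sync @distributed for i = 1:length(X1)
        for j = 1:length(X2), k = 1:length(X3)
            results[i, j, k] = g(X1[i], X2[j], X3[k])
        end
    end
    return results
end

X1, X2, X3 = rand(10), rand(10), rand(10)
results = fill_shared(X1, X2, X3)

# Serial reference computed on the master for comparison.
serial = [g(x1, x2, x3) for x1 in X1, x2 in X2, x3 in X3]
println(results ≈ serial)
```

Only the outermost loop is distributed here, so each worker gets a few large
chunks of work rather than many small ones. Whether that pays off for a
function that takes about 0.005 seconds per call is something to measure.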

Kind regards

Andreas Noack

2014-10-21 13:01 GMT-04:00 Nils Gudat <[email protected]>:

> I'm trying to understand parallelization in Julia, but as a former MATLAB
> user who is used to just sprinkling some random "parfor"'s around my code
> I'm not having too much success.
> Here's my test program, that runs in a similar way to the actual program I
> want to parallelize. The basic feature is that I have a number of variables
> stored in arrays and I want to call a function using each possible
> combination of these variables. The function itself consists of an
> integration and a minimization, and takes around 0.005 seconds to compute.
>
> When running the program below, two things stand out:
> (i) The results matrices of all parallelized calculations are empty. I
> seem to be doing something fundamentally wrong in the way I'm structuring
> this.
> (ii) The further I move up along the chain of for loops, the shorter my
> calculation time; when I parallelize the outer-most loop, the calculation
> time drops to 0.02 seconds, which seems to indicate that most computations
> are simply not performed.
>
> As you can tell, I don't really know what I'm doing here so any help would
> be greatly appreciated!
>
> Test program:
>
> addprocs(3)
>
> @everywhere using Distributions
> @everywhere using QuantEcon
>
> @everywhere function f{T<:Float64}(x1::T, x2::T, x3::T)
>
>     distr = LogNormal(x1+2, x2+2)
>
>     function f2(x, x1=x1, x2=x2, x3=x3)
>         (x*x1 + x*x2 - x*x3).*pdf(distr, x)
>     end
>
>     quadrect(f2, 500, x1, x2+2)
> end
>
> X1 = rand(10, 1)
> X2 = rand(10, 1)
> X3 = rand(10, 1)
> Results = zeros(10, 10, 10)
> ResultsInner = zeros(10, 10, 10)
> ResultsMiddle = zeros(10, 10, 10)
> ResultsOuter = zeros(10, 10, 10)
>
> tic()
> for i = 1:10
>     x1 = X1[i]
>     for j = 1:10
>         x2 = X2[j]
>         for k = 1:10
>             x3 = X3[k]
>             Results[i, j, k] = f(x1, x2, x3)
>         end
>     end
> end
> @printf "The one-core loop takes %.2f seconds\n" toq()
>
> tic()
> for i = 1:10
>     x1 = X1[i]
>     for j = 1:10
>         x2 = X2[j]
>         @parallel for k = 1:10
>             x3 = X3[k]
>             ResultsInner[i, j, k] = f(x1, x2, x3)
>         end
>     end
> end
> @printf "The multi-core loop (inner) takes %.2f seconds\n" toq()
>
> tic()
> for i = 1:10
>     x1 = X1[i]
>     @parallel for j = 1:10
>         x2 = X2[j]
>         for k = 1:10
>             x3 = X3[k]
>             ResultsMiddle[i, j, k] = f(x1, x2, x3)
>         end
>     end
> end
> @printf "The multi-core loop (middle) takes %.2f seconds\n" toq()
>
> tic()
> @parallel for i = 1:10
>     x1 = X1[i]
>     for j = 1:10
>         x2 = X2[j]
>         for k = 1:10
>             x3 = X3[k]
>             ResultsOuter[i, j, k] = f(x1, x2, x3)
>         end
>     end
> end
> @printf "The multi-core loop (outer) takes %.2f seconds\n" toq()
>
