Also, you want map(fetch, refs), not pmap.
With that I get a better speedup (still not great, but at least > 2x with 8
processors):
julia> N=1000000;T=1000;A=rand(3,N);@time SimulationSerial(A,N,T)
elapsed time: 1.822478028 seconds (233 kB allocated)
julia> N=1000000;T=1000;dA=drand(3,N);@time SimulationParallel(dA,N,T)
elapsed time: 0.617520182 seconds (573 kB allocated)
8-element Array{Any,1}:
nothing
nothing
nothing
nothing
nothing
nothing
nothing
nothing
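
To make the map(fetch, refs) point concrete, here is a minimal sketch. It is not the code from simulation.jl: `work` is a hypothetical stand-in for the per-worker kernel, and the `using Distributed` line assumes Julia 0.7 or later (on 0.3 these functions are in Base). As far as I understand, pmap would hand each fetch to a worker, which then has to retrieve the value and pass it back, while map simply waits for each reference on the calling process.

```julia
using Distributed   # Julia 0.7 and later; on 0.3 these functions live in Base

# Start two local workers if none are running (similar to `julia -p 2`).
nprocs() == 1 && addprocs(2)

# Hypothetical stand-in for the per-worker simulation kernel.
@everywhere work(n) = sum(abs2, 1:n)

# Launch one task per worker; each call returns a remote reference right away.
refs = [@spawnat(p, work(1000)) for p in workers()]

# Collect the results on the calling process, one fetch per reference.
results = map(fetch, refs)
println(results)
```

The spawning step is the same either way; only the collection step changes.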
On Thursday, April 30, 2015 at 10:50:23 AM UTC-4, Jake Bolewski wrote:
>
> DistributedArray performance is pretty bad. The reason for removing them
> from base was to spur their development. All I can say at this time is
> that we are actively working on making their performance better.
>
> For every parallel program you have implicit serial overhead (this is
> especially true with multiprocessing). The ratio of serial work to
> parallel work determines your potential parallel speedup. The ratio of
> parallel work to serial overhead in this case is really bad, so I don't
> think your observation is really surprising. If this is on a shared memory
> machine I would try using SharedArrays, as the serial communication
> overhead will be lower and the potential parallel speedup much higher.
> DistributedArrays only really make sense if they are in fact distributed
> over multiple machines.
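
The serial-overhead argument above is the usual Amdahl's-law reasoning, and it can be turned around to estimate how much of the run is effectively serial. The sketch below plugs in the two timings from the top of this message. The function names are my own, and it assumes the serial and 8-process runs are directly comparable, which is only roughly true.

```julia
# Amdahl's law: speedup on p workers when a fraction s of the run time is serial.
amdahl_speedup(s, p) = 1 / (s + (1 - s) / p)

# Inverse of the formula above: the serial fraction implied by a measured speedup.
serial_fraction(speedup, p) = (1 / speedup - 1 / p) / (1 - 1 / p)

t_serial   = 1.822478028   # seconds, SimulationSerial timing quoted above
t_parallel = 0.617520182   # seconds, SimulationParallel timing with 8 processors
p = 8

speedup = t_serial / t_parallel
s = serial_fraction(speedup, p)

println("measured speedup: ", speedup)
println("implied serial fraction: ", s)
println("speedup limit with unlimited workers: ", 1 / s)
```

With these numbers the measured speedup is about 2.95x and the implied serial fraction is roughly 0.24, so under this model adding more processes alone would not get much past 4x.
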
>
> On Thursday, April 30, 2015 at 9:18:32 AM UTC-4, Alex wrote:
>>
>> Hi,
>>
>> I can't say anything regarding the performance of Distributed Arrays.
>> However, note that they have been relocated from Base to a separate
>> package: https://github.com/JuliaParallel/DistributedArrays.jl which
>> should work with 0.4-dev.
>>
>> Best,
>>
>> Alex.
>>
>>
>> On Thursday, 30 April 2015 12:10:47 UTC+2, Ángel de Vicente wrote:
>>>
>>> Hello all,
>>>
>>> I'm trying to understand the sort of performance that we can get in
>>> parallel with Julia. DistributedArrays look very tempting, but my first try
>>> gives hopeless performance. I took the test code from the slides
>>> (pages 75-80) at
>>>
>>> http://www.csd.uwo.ca/~moreno/cs2101a_moreno/Parallel_computing_with_Julia.pdf
>>>
>>> The code, which just defines two functions, is available at:
>>> https://bitbucket.org/snippets/angelv/5kb4
>>> and is also attached to this message for convenience.
>>>
>>> When I run it on my 8-core laptop in serial or in parallel (see below
>>> for output of the different runs), I see no better performance in parallel
>>> (though I see a huge increase in the allocated memory in parallel). With
>>> version 0.4-dev drand is not defined (well, actually the whole Distributed
>>> Arrays section is gone from the documentation for 0.4-dev).
>>>
>>> Any pointers on what can be done to improve this would be appreciated (this has
>>> to be the simplest possible parallel program, with no communication at all,
>>> so we should be able to get near perfect scalability here).
>>>
>>> Thanks a lot,
>>> Ángel de Vicente
>>>
>>> ==========
>>>
>>> angelv@pilas:~/mhdsolver-julia/Misc/Julia_Parallel$ julia -q
>>> julia> println(VERSION);require("simulation.jl");N=1000000;T=1000;A=rand(3,N);@time SimulationSerial(A,N,T)
>>> 0.3.7
>>> elapsed time: 2.376680715 seconds (80 bytes allocated)
>>>
>>> ==========
>>>
>>> angelv@pilas:~/mhdsolver-julia/Misc/Julia_Parallel$ julia -p 4 -q
>>> julia> println(VERSION);require("simulation.jl");N=1000000;T=1000;dA=drand(3,N);@time SimulationParallel(dA,N,T)
>>> 0.3.7
>>> elapsed time: 2.510426469 seconds (20011756 bytes allocated)
>>> 4-element Array{Any,1}:
>>> nothing
>>> nothing
>>> nothing
>>> nothing
>>>
>>> ============
>>>
>>> angelv@pilas:~/mhdsolver-julia/Misc/Julia_Parallel$ /home/angelv/JULIA-DEV/julia/julia -q
>>> julia> println(VERSION);require("simulation.jl");N=1000000;T=1000;A=rand(3,N);@time SimulationSerial(A,N,T)
>>> 0.4.0-dev+4572
>>> elapsed time: 2.33095253 seconds (283 kB allocated)
>>>
>>> =============
>>>
>>> angelv@pilas:~/mhdsolver-julia/Misc/Julia_Parallel$ /home/angelv/JULIA-DEV/julia/julia -q -p 4
>>> julia> println(VERSION);require("simulation.jl");N=1000000;T=1000;dA=drand(3,N);@time SimulationParallel(dA,N,T)
>>> 0.4.0-dev+4572
>>> ERROR: UndefVarError: drand not defined
>>>