As you can see, the reduction operator can be omitted if it is not needed. In that case, the loop executes asynchronously, i.e. it spawns independent tasks on all available workers and returns an array of RemoteRef immediately, without waiting for completion. The caller can wait for the RemoteRef completions at a later point by calling fetch() on them, or wait for completion at the end of the loop by prefixing it with @sync, as in @sync @parallel for.
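The two waiting styles described above can be sketched as follows (a minimal sketch in the old Julia 0.4-era syntax used elsewhere in this post; it assumes workers have already been added with addprocs):

```julia
addprocs(2)          # start two worker processes (assumed setup)

a = randn(1000)

# Without a reduction operator, @parallel returns immediately with one
# RemoteRef per worker; the loop body keeps running in the background.
refs = @parallel for i = 1:100000
    a[rand(1:end)]
end

# Option 1: wait later, by fetching each RemoteRef. fetch() blocks until
# that worker's chunk of iterations is done; the value it returns is the
# task's result, not the loop body's values.
map(fetch, refs)

# Option 2: block at the loop itself.
@sync @parallel for i = 1:100000
    a[rand(1:end)]
end
```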
(From the manual.) When I try:

```julia
a = randn(1000)
r = @parallel for i = 1:100000
    a[rand(1:end)]
end
```

I get an array of RemoteRefs, and when I call fetch(r[1]), for example, I get an empty line. What does "completion" mean in this sense? Does each worker run its own loop and complete at the end, or after each iteration? E.g. with 2 workers running from i = 1 to 100: does completion mean that after each i the first worker to finish waits for the other so they can proceed to i+1?

For example, the following test is much faster when I include an @async in front of the execution part and leave the @sync out. Does this parallel program even run in parallel if I leave both @async and @sync out? It seems to be slower than the serial alternative.

```julia
@everywhere function myrange(q::SharedArray)
    idx = indexpids(q)
    if idx == 0
        return 1:0
    else
        nchunks = length(procs(q))
        splits = [round(Int, s) for s in linspace(0, size(q,1), nchunks+1)]
        splits[idx]+1:splits[idx+1]
    end
end

@everywhere function g(q::SharedArray)
    q[myrange(q)].^2
end

function h(q::SharedArray)
    #@sync begin
    res = Array(Float64, size(q,1))
    for p in procs(q)
        @async res[remotecall_fetch(p, myrange, q)] = remotecall_fetch(p, g, q)
    end
    #end
    res
end
```
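For comparison, here is a sketch of the same function with the @sync re-enabled (same Julia 0.4-era syntax; h_sync is a hypothetical name, and the myrange/g definitions above are assumed). Without @async, the loop blocks on each remotecall_fetch in turn, so the workers run one after another; without @sync, h can return before the background tasks have written into res:

```julia
function h_sync(q::SharedArray)
    res = Array(Float64, size(q, 1))
    # @sync waits for every enclosed @async task before falling through,
    # so res is fully populated when it is returned.
    @sync for p in procs(q)
        # Each task blocks on its own remotecall_fetch calls, but the
        # tasks for different workers overlap with one another.
        @async res[remotecall_fetch(p, myrange, q)] = remotecall_fetch(p, g, q)
    end
    res
end
```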