I compared the speed of the parallel loop with the simple loop: with 4 CPUs, the parallel version is 10 times slower :(
@time @sync @parallel for i=1:800000
    c[i] = prod(a[i], b[i])
end
#println(c)
@time for i=1:800000
    c[i] = prod(a[i], b[i])
end

0.728079 seconds (391.50 k allocations: 16.659 MB)
0.091947 seconds (4.80 M allocations: 73.240 MB, 5.67% gc time)
