Something odd is happening in my recently parallelized code. When I run it 
without adding worker processes first, the results are completely off. After 
some investigation I found that assignments were going wrong: the results of 
computations ended up in different Arrays than the intended ones. A minimal 
working example that illustrates the problem:

x1 = linspace(1, 3, 3)
x2 = linspace(1, 3, 3)
x3 = linspace(1, 3, 3)

function getresults(x1::Array, x2::Array, x3::Array)
  result1 = SharedArray(Float64, (3,3,3))
  result2 = similar(result1)
  result3 = similar(result1)
  
  @sync @parallel for a=1:3
    for b=1:3
      for c=1:3
        result1[a,b,c] = x1[a]*x2[b]*x3[c]
        result2[a,b,c] = sqrt(x1[a]*x2[b]*x3[c])
        result3[a,b,c] = (x1[a]*x2[b]*x3[c])^2
      end
    end
  end
  return sdata(result1), sdata(result2), sdata(result3)
end

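# Run 1: only the master process, no workers added yet
(r1,r2,r3) = getresults(x1, x2, x3)

# Run 2: add workers (unless already present), then run again
nprocs()==CPU_CORES || addprocs(CPU_CORES-1)
(r1_par,r2_par,r3_par) = getresults(x1, x2, x3)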


When I run this on my system (Julia v0.3.6), the parallelized version works as 
intended. Running the code without adding workers first gives the expected 
results for r1 and r3, but r2 holds the same results as r3. The behaviour in 
my original problem was similar: the code returns three Arrays, but when I run 
it without additional workers, all three Arrays have the same contents.
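
To pin down which Arrays are off, it may help to compare against a purely 
serial reference that uses ordinary Arrays and no SharedArray, @sync or 
@parallel. The following is only a minimal sketch: getreference and 
ref1, ref2, ref3 are hypothetical names that are not part of my original 
code, and the inputs are written out as plain vectors holding the same 
values as x1, x2 and x3 above.

```julia
# Hypothetical helper: computes the same three products serially,
# writing into ordinary Arrays instead of SharedArrays.
function getreference(x1, x2, x3)
  n1, n2, n3 = length(x1), length(x2), length(x3)
  ref1 = zeros(Float64, n1, n2, n3)
  ref2 = zeros(Float64, n1, n2, n3)
  ref3 = zeros(Float64, n1, n2, n3)
  for a=1:n1, b=1:n2, c=1:n3
    p = x1[a]*x2[b]*x3[c]   # common product used by all three results
    ref1[a,b,c] = p
    ref2[a,b,c] = sqrt(p)
    ref3[a,b,c] = p^2
  end
  return ref1, ref2, ref3
end

# Same values as linspace(1, 3, 3) in the example above
(ref1, ref2, ref3) = getreference([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
```

Comparing r1, r2, r3 (and r1_par, r2_par, r3_par) against ref1, ref2, ref3 
with == should then show which of the returned Arrays differs from what 
the loop was supposed to write.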

Is there something in the @sync or @parallel macros that causes this? How 
should the code be written so that it works both with one core and with 
multiple cores?
