I have a parallel application where I want to modify some objects from
worker processes, and get the modified values back into the main process.
Is there a clean way to do this in Julia?
Here is a contrived example illustrating the problem (run with "julia -p 2
myfile.jl"):

    @everywhere function mutate_arr!(arr, x)
        println(arr)
        push!(arr, x)
        println(arr)
        nothing
    end

    arr = [1, 2, 3]
    println(arr)
    remotecall_fetch(2, mutate_arr!, arr, 4)
    println(arr)

This fails with the following error:

    exception on 2: ERROR: cannot resize array with shared data
     in push! at ./array.jl:459
     in mutate_arr! at .../myfile.jl:3
     in anonymous at multi.jl:855
     in run_work_thunk at multi.jl:621
     in anonymous at task.jl:855

Is it the case that remotecall and @spawn do not allow modifying the
function arguments? If so, this should be documented.
My second attempt uses a RemoteRef to pass the array explicitly:

    @everywhere function mutate_arr!(arr, x)
        println(arr)
        push!(arr, x)
        println(arr)
        nothing
    end

    @everywhere function wrapper_mutate_arr!(ref, x)
        local_arr = take!(ref)
        mutate_arr!(local_arr, x)
        put!(ref, local_arr)
        nothing
    end

    arr = [1, 2, 3]
    println(arr)
    arr_ref = RemoteRef(2)
    put!(arr_ref, arr)
    remotecall_fetch(2, wrapper_mutate_arr!, arr_ref, 4)
    arr = take!(arr_ref)
    println(arr)

This fails with the same error ("cannot resize array with shared data").
What is shared here, and why?
If I replace local_arr = take!(ref) with local_arr = copy(take!(ref)),
it works. But I think this means the array is copied twice (once for the
serialization, and once by the copy() call), which I would like to avoid.
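
For reference, here is a sketch of the copy() variant described above. It uses the same Julia 0.3 API as the rest of this post (RemoteRef and remotecall_fetch with the worker id as first argument), and the only substantive change is the take! line. The println calls inside mutate_arr! are dropped for brevity.

```julia
# Sketch for Julia 0.3.x; run with: julia -p 2 myfile.jl
@everywhere function mutate_arr!(arr, x)
    push!(arr, x)
    nothing
end

@everywhere function wrapper_mutate_arr!(ref, x)
    # copy() gives the worker a fresh array of its own; the array that
    # comes straight out of take!(ref) is the one push! refuses to resize
    local_arr = copy(take!(ref))
    mutate_arr!(local_arr, x)
    put!(ref, local_arr)
    nothing
end

arr = [1, 2, 3]
arr_ref = RemoteRef(2)      # reference owned by worker 2
put!(arr_ref, arr)
remotecall_fetch(2, wrapper_mutate_arr!, arr_ref, 4)
arr = take!(arr_ref)        # fetch the modified array back into the main process
println(arr)
```

The cost I am worried about is the copy() on the worker, which comes on top of whatever copying the serialization already does.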
In summary: If I have a function like mutate_arr! that I want to offload to
a worker process, what is the right way to do it? Obviously the functional
way of thinking is to avoid mutation entirely, but if I already have a
function like that, what should I do? Just use an additional copy() as
above?
(In case it matters, I'm using Julia 0.3.4.)
Thanks,
Constantin