Hello everyone,

Suppose I have a very large array in the main process, and I use remotecall() or pmap() to copy it to worker processes and modify it in parallel (all modifications are wrapped in a function). After the call returns from the worker process, will the copied array be released?
See the following REPL session, run on my laptop (OS X with 8GB of memory):

$ ~/julia/julia -p 1

julia> A = randn(10000*10000);
# According to the system monitor, the main julia process used about 800MB of memory and the worker process used about 80MB.

julia> A[1] = remotecall_fetch(2, x->(x[1] = 1.0), A);
# Now the main process used about 1.6GB of memory and the worker process used about 800MB.

julia> @everywhere gc()
# Now both the main process and the worker process used about 800MB of memory; the copied array in the worker process wasn't released.

julia> A[1] = remotecall_fetch(2, x->(x[1] = 2.0), A);
# If I repeat the computation, the situation gets worse. The worker process now used about 1.6GB.

julia> A[1] = remotecall_fetch(2, x->(x[1] = 3.0), A);
# The worker process used about 2.4GB now.

julia> A[1] = remotecall_fetch(2, x->(x[1] = 4.0), A);
# The worker process used about 3GB.

julia> A[1] = remotecall_fetch(2, x->(x[1] = 5.0), A);
# The worker process used about 3.8GB.

In my real code, the array is even larger and there are more processes. After one or two iterations of pmap(), the computation becomes much slower than in the first iteration. I think this is because the huge memory consumption constantly triggers page swapping.

PS. In fact, I would prefer to use shared memory or multithreading in my project, but I don't know how to share an object of a user-defined type, other than a shared array.

Regards,
Yang Zhixuan
