I have noticed that these remote references can't be fetched: fetch(zeroMatrix.refs[1])
the driver process just waits forever, so I'm thinking that the remotecall_wait() in
https://github.com/JuliaLang/julia/blob/f3c355115ab02868ac644a5561b788fc16738443/base/sharedarray.jl#L96
exits before it should. Any ideas?

On Wednesday, 10 December 2014 13:47:19 UTC-5, benFranklin wrote:
>
> I think you are right about some references not being released yet: if I
> change the while loop to include your way of replacing every reference,
> the put! actually never gets executed, it just waits:
>
> while true
>     zeroMatrix = SharedArray(Float64,(nQ,nQ,3,nQ,nQ,nQ),pids=workers(), init = x->inF(x,nQ))
>     println("ran!")
>
>     for i = 1:length(zeroMatrix.refs)
>         put!(zeroMatrix.refs[i], 1)
>     end
>     @everywhere gc()
> end
> ran!
> ________
>
> It runs once and stalls; after C-c:
>
> ^CERROR: interrupt
>  in process_events at /usr/bin/../lib64/julia/sys.so
>  in wait at /usr/bin/../lib64/julia/sys.so (repeats 2 times)
>  in wait_full at /usr/bin/../lib64/julia/sys.so
> ____________
>
> After C-d:
>
> julia>
> WARNING: Forcibly interrupting busy workers
> error in running finalizer: InterruptException()
> error in running finalizer: InterruptException()
> WARNING: Unable to terminate all workers
> [...]
>
> It seems that after the init function not all workers are "done". I'll see
> if there's something weird with that part, but if the SharedArray is being
> returned, I don't see any reason for this to be so.
>
> On Wednesday, 10 December 2014 05:19:55 UTC-5, Tim Holy wrote:
>>
>> After your gc() it should be able to be unmapped, see
>> https://github.com/JuliaLang/julia/blob/f3c355115ab02868ac644a5561b788fc16738443/base/mmap.jl#L113
>>
>> My guess is that something in the parallel architecture is holding a
>> reference. Have you tried going at this systematically from the internal
>> representation of the SharedArray? For example, I might consider trying
>> to put! new stuff in zeroMatrix.refs:
>>
>> for i = 1:length(zeroMatrix.refs)
>>     put!(zeroMatrix.refs[i], 1)
>> end
>>
>> before calling gc(). I don't know if this will work, but it's where I'd
>> start experimenting.
>>
>> If you can fix this, please do submit a pull request.
>>
>> Best,
>> --Tim
>>
>> On Tuesday, December 09, 2014 08:06:10 PM [email protected] wrote:
>>> On Wednesday, December 10, 2014 12:28:29 PM UTC+10, benFranklin wrote:
>>>> I've made a small example of the memory problems I've been running
>>>> into. I can't find a way to deallocate a SharedArray,
>>>
>>> Someone more expert might find it, but I can't see anywhere that the
>>> mmapped memory is unmapped.
>>>
>>>> If the code below runs once, it means the computer has enough memory
>>>> to run this. If I can properly deallocate the memory I should be able
>>>> to do it again; however, I run out of memory. Am I misunderstanding
>>>> something about garbage collection in Julia?
>>>>
>>>> Thanks for your attention.
>>>>
>>>> Code:
>>>>
>>>> @everywhere nQ = 60
>>>>
>>>> @everywhere function inF(x::SharedArray,nQ::Int64)
>>>>     number = myid()-1
>>>>     targetLength = nQ*nQ*3
>>>>
>>>>     startN = floor((number-1)*targetLength/nworkers()) + 1
>>>>     endN = floor(number*targetLength/nworkers())
>>>>
>>>>     myIndexes = int64(startN:endN)
>>>>     for j in myIndexes
>>>>         inds = ind2sub((nQ,nQ,nQ),j)
>>>>         x[inds[1],inds[2],inds[3],:,:,:] = rand(nQ,nQ,nQ)
>>>>     end
>>>> end
>>>>
>>>> while true
>>>>     zeroMatrix = SharedArray(Float64,(nQ,nQ,3,nQ,nQ,nQ),pids=workers(), init = x->inF(x,nQ))
>>>>     println("ran!")
>>>>     @everywhere zeroMatrix = 1
>>>>     @everywhere gc()
>>>> end
>>>>
>>>> On Monday, 8 December 2014 23:43:03 UTC-5, Isaiah wrote:
>>>>> Hopefully you will get an answer on pmap from someone more familiar
>>>>> with the parallel stuff, but: have you tried splitting the init step?
>>>>> (See the example in the manual for how to init an array in chunks
>>>>> done by different workers.) Just guessing, though: I'm not sure
>>>>> if/how those will be serialized if each worker is contending for the
>>>>> whole array.
>>>>>
>>>>> On Fri, Dec 5, 2014 at 4:23 PM, benFranklin <[email protected]> wrote:
>>>>>> Hi all, I'm trying to figure out how best to initialize a
>>>>>> SharedArray, using a C function that computes a huge matrix in
>>>>>> parts to fill it up, and all comments are appreciated. To
>>>>>> summarise: is A, making an empty shared array, computing the matrix
>>>>>> in parallel using pmap and then filling it up serially, better than
>>>>>> B, computing in parallel and storing in one step by using an init
>>>>>> function in the SharedArray declaration?
>>>>>>
>>>>>> The difference tends to be that B uses a lot more memory, with each
>>>>>> process using the exact same amount of memory. However, it is much
>>>>>> faster than A, as the copy step takes longer than the computation;
>>>>>> in A most of the memory usage is in one process, using less memory
>>>>>> overall.
>>>>>>
>>>>>> Any tips on how to do this better? Also, this pmap is how I'm
>>>>>> handling more complex parallelizations in Julia. Any comments on
>>>>>> that approach?
>>>>>>
>>>>>> Thanks a lot!
>>>>>>
>>>>>> Best,
>>>>>> Ben
>>>>>>
>>>>>> CODE A:
>>>>>>
>>>>>> This makes an empty shared array, computes the matrix in parallel,
>>>>>> and then fills it up serially:
>>>>>>
>>>>>> function findZeroDividends(model::ModelPrivate)
>>>>>>     nW = length(model.vW)
>>>>>>     nZ = length(model.vZ)
>>>>>>     nK = length(model.vK)
>>>>>>     nQ = length(model.vQ)
>>>>>>
>>>>>>     zeroMatrix = SharedArray(Float64,(nW,nZ,nK,nQ,nQ,nQ),pids=workers())
>>>>>>
>>>>>>     input = [stateFindZeroK(w,z,k,model) for w in 1:nW, z in 1:nZ, k in 1:nK];
>>>>>>     results = pmap(findZeroInC,input);
>>>>>>
>>>>>>     for w in 1:nW
>>>>>>         for z in 1:nZ
>>>>>>             for k in 1:nK
>>>>>>                 zeroMatrix[w,z,k,:,:,:] = results[w + nW*((z-1) + nZ*(k-1))]
>>>>>>             end
>>>>>>         end
>>>>>>     end
>>>>>>
>>>>>>     return zeroMatrix
>>>>>> end
>>>>>>
>>>>>> _______________________
>>>>>>
>>>>>> CODE B:
>>>>>>
>>>>>> Or is this better, computing in parallel and storing in one step
>>>>>> via an init function:
>>>>>>
>>>>>> function start(x::SharedArray,nW::Int64,nZ::Int64,nK::Int64,model::ModelPrivate)
>>>>>>     for j in myid()-1:nworkers():(nW*nZ*nK)
>>>>>>         inds = ind2sub((nW,nZ,nK),j)
>>>>>>         x[inds[1],inds[2],inds[3],:,:,:] = findZeroInC(stateFindZeroK(inds[1],inds[2],inds[3],model))
>>>>>>     end
>>>>>>     x
>>>>>> end
>>>>>>
>>>>>> function findZeroDividendsSmart(model::ModelPrivate)
>>>>>>     nW = length(model.vW)
>>>>>>     nZ = length(model.vZ)
>>>>>>     nK = length(model.vK)
>>>>>>     nQ = length(model.vQ)
>>>>>>
>>>>>>     #input = [stateFindZeroK(w,z,k,model) for w in 1:nW, z in 1:nZ, k in 1:nK];
>>>>>>     #results = pmap(findZeroInC,input);
>>>>>>
>>>>>>     zeroMatrix = SharedArray(Float64,(nW,nZ,nK,nQ,nQ,nQ),pids=workers(), init = x->start(x,nW,nZ,nK,model))
>>>>>>
>>>>>>     return zeroMatrix
>>>>>> end
>>>>>>
>>>>>> ________________________
>>>>>>
>>>>>> The C function being called is inside this wrapper and returns the
>>>>>> pointer double *capitalChoices = (double *)malloc(sizeof(double)*nQ*nQ*nQ);
>>>>>>
>>>>>> function findZeroInC(state::stateFindZeroK)
>>>>>>     w = state.wealth
>>>>>>     z = state.z
>>>>>>     k = state.k
>>>>>>     model = state.model
>>>>>>
>>>>>>     # findZeroInC(double wealth, int z, int k, double theta, double delta,
>>>>>>     #             double* vK, int nK, double* vQ, int nQ,
>>>>>>     #             double* transition, double betaGov)
>>>>>>
>>>>>>     nQ = length(model.vQ)
>>>>>>
>>>>>>     t = ccall((:findZeroInC,"findP.so"),
>>>>>>         Ptr{Float64},
>>>>>>         (Float64,Int64,Int64,Float64,Float64,Ptr{Float64},Int64,Ptr{Float64},Int64,Ptr{Float64},Float64),
>>>>>>         model.vW[w],z-1,k-1,model.theta,model.delta,model.vK,length(model.vK),model.vQ,nQ,model.transition,model.betaGov)
>>>>>>
>>>>>>     if t == C_NULL
>>>>>>         error("NULL")
>>>>>>     end
>>>>>>
>>>>>>     return pointer_to_array(t,(nQ,nQ,nQ),true)
>>>>>> end
>>>>>>
>>>>>> <https://lh5.googleusercontent.com/-5rJqYh2oUqQ/VIIiFQUl2rI/AAAAAAAAAvM/gwAXG7N0Gxc/s1600/mem.png>
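A minimal way to check the blocking hypothesis from the top of the thread: fetch() on a RemoteRef that nothing has put! into blocks forever, so if a worker never finished filling zeroMatrix.refs[i] during init, fetch(zeroMatrix.refs[1]) would hang exactly as described. A sketch, assuming the Julia 0.3-era RemoteRef API:

```julia
# Sketch (Julia 0.3-era API): fetch() blocks until a value is put! into the ref.
r = RemoteRef()

isready(r)            # false: no value has been stored yet
# fetch(r)            # would block the caller indefinitely, like the hang above

put!(r, 1)            # store a value; the ref is now full
fetch(r)              # returns 1 immediately
isready(r)            # still true: fetch() does not empty the ref (take! does)
```

Checking isready(zeroMatrix.refs[i]) for each i before fetching would distinguish "this ref was never filled" from "the fetch itself deadlocks".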
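For reference, the chunked-init pattern Isaiah points to (the manual's example of initializing a SharedArray in per-worker chunks) looks roughly like this; the slice computation follows the 0.3 manual's indexpids/procs helpers, and fill_chunk! here is a hypothetical placeholder for the real per-element work:

```julia
# Sketch of per-worker chunked initialization (Julia 0.3-era SharedArray API).
# Each worker fills only its own contiguous slice of the linear index space,
# so workers never contend for the same elements.
@everywhere function myrange(q::SharedArray)
    idx = indexpids(q)          # this process's position among q's pids (0 on the master)
    if idx == 0
        return 1:0              # the master process holds no chunk
    end
    nchunks = length(procs(q))
    splits = [iround(s) for s in linspace(0, length(q), nchunks+1)]
    splits[idx]+1:splits[idx+1]
end

@everywhere function fill_chunk!(q::SharedArray)
    for i in myrange(q)
        q[i] = myid()           # placeholder work; real code would compute the value here
    end
end

S = SharedArray(Float64, (60,60,3), pids=workers(), init=fill_chunk!)
```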
