I've made a small example of the memory problems I've been running into. I 
can't find a way to deallocate a SharedArray. If the code below runs once, 
the computer has enough memory for it, so if I could properly deallocate 
the memory I should be able to run it again. Instead, I run out of memory. 
Am I misunderstanding something about garbage collection in Julia?

Thanks for your attention

Code: 

@everywhere nQ = 60

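@everywhere function inF(x::SharedArray,nQ::Int64)

    # Assumes worker ids are 2, 3, ..., so number runs over 1:nworkers()
    number = myid()-1
    targetLength = nQ*nQ*3

    # Contiguous block of linear indices handled by this worker
    startN = floor((number-1)*targetLength/nworkers()) + 1
    endN = floor(number*targetLength/nworkers())

    myIndexes = int64(startN:endN)
    for j in myIndexes
        # Dimensions match the first three dimensions of x
        inds = ind2sub((nQ,nQ,3),j)
        x[inds[1],inds[2],inds[3],:,:,:] = rand(nQ,nQ,nQ)
    end

end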

while true
zeroMatrix = SharedArray(Float64,(nQ,nQ,3,nQ,nQ,nQ),pids=workers(), init = x->inF(x,nQ))
println("ran!")
@everywhere zeroMatrix = 1
@everywhere gc()
end
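
For reference, the index bookkeeping in inF can be checked on its own, 
without any workers. This is only a minimal sketch: chunk_range and decode3 
are hypothetical helper names that are not part of the code above, the 
worker count of 4 is an assumption, and integer div stands in for floor on 
a floating-point quotient.

```julia
# Hypothetical helper: contiguous block of linear indices for worker `number`
# (1-based, i.e. myid()-1 when the workers have ids 2, 3, ...).
function chunk_range(number::Int, nchunks::Int, total::Int)
    first_index = div((number - 1) * total, nchunks) + 1
    last_index = div(number * total, nchunks)
    return first_index:last_index
end

# Hypothetical helper: column-major linear index -> (i1, i2, i3) subscripts
# for an array whose first two dimensions are d1 and d2.
function decode3(j::Int, d1::Int, d2::Int)
    i1 = rem(j - 1, d1) + 1
    i2 = rem(div(j - 1, d1), d2) + 1
    i3 = div(j - 1, d1 * d2) + 1
    return (i1, i2, i3)
end

nQ = 60
total = nQ * nQ * 3
nchunks = 4   # assumed number of workers, for illustration only
ranges = [chunk_range(n, nchunks, total) for n in 1:nchunks]
```

Each worker gets one range, the ranges together cover 1:total exactly once, 
and the third subscript returned by decode3 stays within 1:3 for every index 
in 1:total.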

On Monday, 8 December 2014 23:43:03 UTC-5, Isaiah wrote:
>
> Hopefully you will get an answer on pmap from someone more familiar with 
> the parallel stuff, but: have you tried splitting the init step? (see the 
> example in the manual for how to init an array in chunks done by different 
> workers). Just guessing though: I'm not sure if/how those will be 
> serialized if each worker is contending for the whole array.
>
> On Fri, Dec 5, 2014 at 4:23 PM, benFranklin wrote:
>
>> Hi all, I'm trying to figure out how to best initialize a SharedArray, 
>> using a C function to fill it up that computes a huge matrix in parts, and 
>> all comments are appreciated. To summarise: Is A, making an empty shared 
>> array, computing the matrix in parallel using pmap and then filling it up 
>> serially, better than using B, computing in parallel and storing in one 
>> step by using an init function in the SharedArray declaration?
>>
>>
>> The difference tends to be that B uses a lot more memory overall, with 
>> each process using exactly the same amount. However, B is much faster 
>> than A, because in A the copy step takes longer than the computation. In 
>> A, most of the memory usage is in one process, and less memory is used 
>> overall.
>>
>> Any tips on how to do this better? Also, this pmap is how I'm handling 
>> more complex parallelizations in Julia. Any comments on that approach?
>>
>> Thanks a lot!
>>
>> Best,
>> Ben
>>
>>
>> CODE A:
>>
>> This version makes an empty shared array, computes the matrix in 
>> parallel, and then fills it up serially:
>>
>> function findZeroDividends(model::ModelPrivate)
>>
>> nW = length(model.vW)
>> nZ = length(model.vZ)
>> nK = length(model.vK)
>> nQ = length(model.vQ)
>>  zeroMatrix = SharedArray(Float64,(nW,nZ,nK,nQ,nQ,nQ),pids=workers())
>>
>> input = [stateFindZeroK(w,z,k,model) for w in 1:nW, z in 1:nZ, k in 1:nK];
>> results = pmap(findZeroInC,input);
>>
>> for w in 1:nW
>> for z in 1:nZ
>> for k in 1:nK
>>
>> zeroMatrix[w,z,k,:,:,:] = results[w + nW*((z-1) + nZ*(k-1))]
>>  end
>> end
>> end
>>
>> return zeroMatrix
>> end
>>
>> _______________________
>>
>> CODE B:
>>
>> This version computes in parallel and stores in one step, using these 
>> two functions:
>>
>> function start(x::SharedArray,nW::Int64,nZ::Int64,nK::Int64,model::ModelPrivate)
>>
>> for j in myid()-1:nworkers():(nW*nZ*nK)
>> inds = ind2sub((nW,nZ,nK),j)
>> x[inds[1],inds[2],inds[3],:,:,:] = findZeroInC(stateFindZeroK(inds[1],inds[2],inds[3],model))
>> end
>>
>> x
>>
>> end
>>
>> function findZeroDividendsSmart(model::ModelPrivate)
>>
>> nW = length(model.vW)
>> nZ = length(model.vZ)
>> nK = length(model.vK)
>> nQ = length(model.vQ)
>>
>> #input = [stateFindZeroK(w,z,k,model) for w in 1:nW, z in 1:nZ, k in 1:nK];
>> #results = pmap(findZeroInC,input);
>>
>> zeroMatrix = SharedArray(Float64,(nW,nZ,nK,nQ,nQ,nQ),pids=workers(), init = x->start(x,nW,nZ,nK,model))
>>
>> return zeroMatrix
>> end
>>
>> ________________________
>>
>> The C function being called is inside the wrapper below. It returns a 
>> pointer to a buffer allocated as
>> double *capitalChoices = (double *)malloc(sizeof(double)*nQ*nQ*nQ);
>>
>> function findZeroInC(state::stateFindZeroK)
>>
>> w = state.wealth
>> z = state.z
>> k = state.k
>> model = state.model
>>
>> #findZeroInC(double wealth, int z, int k, double theta, double delta, double* vK,
>> # int nK, double* vQ, int nQ, double* transition, double betaGov)
>>
>> nQ = length(model.vQ)
>>
>> t = ccall((:findZeroInC,"findP.so"), 
>> Ptr{Float64},(Float64,Int64,Int64,Float64,Float64,Ptr{Float64},Int64,Ptr{Float64},Int64,Ptr{Float64},Float64),
>>
>> model.vW[w],z-1,k-1,model.theta,model.delta,model.vK,length(model.vK),model.vQ,nQ,model.transition,model.betaGov)
>> if t == C_NULL
>> error("NULL")
>> end
>>
>> return pointer_to_array(t,(nQ,nQ,nQ),true)
>>
>> end
>>
>>
>> <https://lh5.googleusercontent.com/-5rJqYh2oUqQ/VIIiFQUl2rI/AAAAAAAAAvM/gwAXG7N0Gxc/s1600/mem.png>
>>
>>
>>
>
