On Wednesday, December 10, 2014 12:28:29 PM UTC+10, benFranklin wrote:
>
> I've made a small example of the memory problems I've been running into. I
> can't find a way to deallocate a SharedArray,
>
Someone more expert might find it, but I can't see anywhere that the
mmapped memory is unmapped.
> If the code below runs once, the computer has enough memory to run it.
> If I could properly deallocate the memory, I should be able to run it
> again; instead, I run out of memory. Am I misunderstanding something about
> garbage collection in Julia?
>
> Thanks for your attention
>
> Code:
>
> @everywhere nQ = 60
>
> @everywhere function inF(x::SharedArray,nQ::Int64)
>
> number = myid()-1;
> targetLength = nQ*nQ*3
>
> startN = floor((number-1)*targetLength/nworkers()) + 1
> endN = floor(number*targetLength/nworkers())
>
> myIndexes = int64(startN:endN)
> for j in myIndexes
> inds = ind2sub((nQ,nQ,3),j) # dims match the first three dimensions of x
> x[inds[1],inds[2],inds[3],:,:,:] = rand(nQ,nQ,nQ)
> end
>
>
> end
>
> while true
> zeroMatrix = SharedArray(Float64,(nQ,nQ,3,nQ,nQ,nQ),pids=workers(), init = x->inF(x,nQ))
> println("ran!")
> @everywhere zeroMatrix = 1
> @everywhere gc()
> end
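For what it's worth, here is a minimal sketch of releasing a SharedArray explicitly between iterations. It assumes a current Julia (1.x) with the Distributed and SharedArrays standard libraries, not the 0.3 syntax quoted above. It also relies on my understanding that calling `finalize` on a SharedArray drops its references to the mapped segment on every process, which is an assumption worth verifying on your version. The name `fill_and_release`, the two local workers, and the array size are made up for illustration.

```julia
using Distributed
addprocs(2)                      # assumption: two local worker processes
@everywhere using SharedArrays

# Hypothetical helper: allocate a shared array, use it, then release it explicitly.
function fill_and_release(n)
    S = SharedArray{Float64}((n, n), pids = workers())
    S .= 1.0
    total = sum(S)
    finalize(S)                  # run the SharedArray's finalizer now instead of waiting for GC
    @everywhere GC.gc()          # let every process collect what it no longer references
    return total
end

# Repeated allocations should not accumulate if the release step works as assumed.
for i in 1:3
    println(fill_and_release(500))
end
```

Whether the operating system reports the memory as free right away can vary by platform, so watching the process sizes over several iterations is a more reliable check than a single run.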
>
> On Monday, 8 December 2014 23:43:03 UTC-5, Isaiah wrote:
>>
>> Hopefully you will get an answer on pmap from someone more familiar with
>> the parallel stuff, but: have you tried splitting the init step? (see the
>> example in the manual for how to init an array in chunks done by different
>> workers). Just guessing though: I'm not sure if/how those will be
>> serialized if each worker is contending for the whole array.
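A minimal sketch of the chunked init suggested here, assuming a current Julia (1.x) with the SharedArrays standard library, where the function is spelled `localindices` (I believe 0.3 spelled it `localindexes`). Each worker writes only to the index range assigned to it, so no two workers touch the same elements. The name `fill_my_chunk!`, the worker count, and the array size are hypothetical.

```julia
using Distributed
addprocs(2)                      # assumption: two local worker processes
@everywhere using SharedArrays

# Each worker fills only the chunk of linear indices assigned to it.
@everywhere function fill_my_chunk!(S::SharedArray)
    for i in localindices(S)
        S[i] = myid()            # record which process wrote this element
    end
    return nothing
end

# init is called once on every process listed in pids.
S = SharedArray{Int}((3, 4), pids = workers(), init = fill_my_chunk!)
println(S)
```

After construction every element holds the id of the worker that filled it, which makes it easy to see how the array was split.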
>>
>> On Fri, Dec 5, 2014 at 4:23 PM, benFranklin wrote:
>>
>>> Hi all, I'm trying to figure out how to best initialize a SharedArray,
>>> using a C function to fill it up that computes a huge matrix in parts, and
>>> all comments are appreciated. To summarise: Is A, making an empty shared
>>> array, computing the matrix in parallel using pmap and then filling it up
>>> serially, better than using B, computing in parallel and storing in one
>>> step by using an init function in the SharedArray declaration?
>>>
>>>
>>> The difference tends to be that B uses a lot more memory, each process
>>> using the exact same amount of memory. However it is much faster than A, as
>>> the copy step takes longer than the computation, but in A most of the
>>> memory usage is in one process, using less memory overall.
>>>
>>> Any tips on how to do this better? Also, this pmap is how I'm handling
>>> more complex parallelizations in Julia. Any comments on that approach?
>>>
>>> Thanks a lot!
>>>
>>> Best,
>>> Ben
>>>
>>>
>>> CODE A:
>>>
>>> Is this, making an empty shared array, computing the matrix in parallel
>>> and then filling it up serially:
>>>
>>> function findZeroDividends(model::ModelPrivate)
>>>
>>> nW = length(model.vW)
>>> nZ = length(model.vZ)
>>> nK = length(model.vK)
>>> nQ = length(model.vQ)
>>> zeroMatrix = SharedArray(Float64,(nW,nZ,nK,nQ,nQ,nQ),pids=workers())
>>>
>>> input = [stateFindZeroK(w,z,k,model) for w in 1:nW, z in 1:nZ, k in 1:nK];
>>> results = pmap(findZeroInC,input);
>>>
>>> for w in 1:nW
>>> for z in 1:nZ
>>> for k in 1:nK
>>>
>>> zeroMatrix[w,z,k,:,:,:] = results[w + nW*((z-1) + nZ*(k-1))]
>>> end
>>> end
>>> end
>>>
>>> return zeroMatrix
>>> end
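The copy loop above indexes `results` with a hand-written column-major formula. As a quick sanity check, here is a small sketch in current Julia (1.x) comparing that formula against `LinearIndices`; the sizes are hypothetical and chosen only to keep the check small.

```julia
# Hypothetical small sizes standing in for length(model.vW), etc.
nW, nZ, nK = 4, 3, 2

# LinearIndices gives Julia's own column-major linear index for (w, z, k).
lin = LinearIndices((nW, nZ, nK))

# Compare it with the formula used in the quoted loop for every index triple.
formula_matches = all(lin[w, z, k] == w + nW * ((z - 1) + nZ * (k - 1))
                      for w in 1:nW, z in 1:nZ, k in 1:nK)
println(formula_matches)
```

So the formula is consistent with how a comprehension over `w, z, k` lays out its elements, as long as the results come back in that same order.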
>>>
>>> _______________________
>>>
>>> CODE B:
>>>
>>> Better than these two:
>>>
>>> function start(x::SharedArray,nW::Int64,nZ::Int64,nK::Int64,model::ModelPrivate)
>>>
>>> for j in myid()-1:nworkers():(nW*nZ*nK)
>>> inds = ind2sub((nW,nZ,nK),j)
>>> x[inds[1],inds[2],inds[3],:,:,:] = findZeroInC(stateFindZeroK(inds[1],inds[2],inds[3],model))
>>> end
>>>
>>> x
>>>
>>> end
>>>
>>> function findZeroDividendsSmart(model::ModelPrivate)
>>>
>>> nW = length(model.vW)
>>> nZ = length(model.vZ)
>>> nK = length(model.vK)
>>> nQ = length(model.vQ)
>>>
>>> #input = [stateFindZeroK(w,z,k,model) for w in 1:nW, z in 1:nZ, k in 1:nK];
>>> #results = pmap(findZeroInC,input);
>>>
>>> zeroMatrix = SharedArray(Float64,(nW,nZ,nK,nQ,nQ,nQ),pids=workers(), init = x->start(x,nW,nZ,nK,model))
>>>
>>> return zeroMatrix
>>> end
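The strided loop in `start` only covers every index exactly once if the worker ids are exactly 2, 3, ..., nworkers()+1. Here is a small sketch in current Julia (1.x) that checks this under that assumption, without starting any workers; `strided_indices` and the sizes are made up for illustration.

```julia
# Indices visited by the process with id `pid`, mirroring the quoted loop
# `myid()-1 : nworkers() : total`.
strided_indices(pid, nw, total) = (pid - 1):nw:total

nw, total = 4, 18
pids = 2:(nw + 1)                # assumption: worker ids are exactly 2, 3, ..., nw+1

# Gather what every worker would visit and compare with the full index range.
visited = sort(vcat([collect(strided_indices(p, nw, total)) for p in pids]...))
covers_all = visited == collect(1:total)
println(covers_all)
```

If workers have been removed and re-added, the ids are no longer contiguous and some indices would be skipped. In current SharedArrays, `indexpids(S)` gives a worker's position in the array's pid list and looks like a safer basis for the stride than `myid()-1`.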
>>>
>>> ________________________
>>>
>>> The C function being called is inside this wrapper. It returns a
>>> pointer to a buffer allocated as double *capitalChoices = (double
>>> *)malloc(sizeof(double)*nQ*nQ*nQ);
>>>
>>> function findZeroInC(state::stateFindZeroK)
>>>
>>> w = state.wealth
>>> z = state.z
>>> k = state.k
>>> model = state.model
>>>
>>> # C signature: findZeroInC(double wealth, int z, int k, double theta, double delta,
>>> #   double* vK, int nK, double* vQ, int nQ, double* transition, double betaGov)
>>>
>>> nQ = length(model.vQ)
>>>
>>> # Cint matches the C `int` parameters in the signature above
>>> t = ccall((:findZeroInC,"findP.so"), Ptr{Float64},
>>> (Float64,Cint,Cint,Float64,Float64,Ptr{Float64},Cint,Ptr{Float64},Cint,Ptr{Float64},Float64),
>>> model.vW[w],z-1,k-1,model.theta,model.delta,model.vK,length(model.vK),model.vQ,nQ,model.transition,model.betaGov)
>>> if t == C_NULL
>>> error("NULL")
>>> end
>>>
>>> return pointer_to_array(t,(nQ,nQ,nQ),true)
>>>
>>> end
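On the ownership question in the wrapper: in current Julia (1.x) the counterpart of `pointer_to_array(t, dims, true)` is `unsafe_wrap(Array, t, dims; own = true)`, which hands the malloc'd buffer to Julia so it is freed when the array is garbage collected. A minimal sketch, using `Libc.malloc` as a stand-in for the C function (the real `findP.so` is not available here) and a hypothetical small `nQ`:

```julia
nQ = 3
n = nQ^3

# Stand-in for the buffer the C function would return from malloc.
raw = Libc.malloc(n * sizeof(Float64))
raw == C_NULL && error("malloc failed")
ptr = convert(Ptr{Float64}, raw)

# Fill the buffer the way C code might, in linear order.
for i in 1:n
    unsafe_store!(ptr, Float64(i), i)
end

# own = true transfers ownership: Julia frees the buffer when A is collected,
# so the C side must not free it as well.
A = unsafe_wrap(Array, ptr, (nQ, nQ, nQ); own = true)
println(size(A))
```

No copy is made, so the wrapped array costs no extra memory beyond the C allocation itself.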
>>>
>>>
>>> <https://lh5.googleusercontent.com/-5rJqYh2oUqQ/VIIiFQUl2rI/AAAAAAAAAvM/gwAXG7N0Gxc/s1600/mem.png>
>>>
>>>
>>>
>>