Thank you for the prompt reply! No, I am not using either open or mmap.
On Wednesday, June 1, 2016 at 12:20:19 PM UTC-4, Stefan Karpinski wrote:
>
> Are you opening files via open or mmap in any of the functions that
> learningExperimentRun calls?
>
> On Wed, Jun 1, 2016 at 11:42 AM, Martha White <[email protected]> wrote:
>
>> I am having difficulty understanding how to use pmap in Julia. I am a
>> reasonably experienced Matlab and C programmer. However, I am new to Julia
>> and to using parallel functions. I am running an experiment with nested
>> for loops, benchmarking different algorithms. In the inner loop, I run the
>> algorithms across multiple trials. I would like to parallelize this inner
>> loop (the outer iterations I can easily run as multiple jobs on a
>> cluster). The code looks like:
>>
>> effNumCores = 3
>> procids = addprocs(effNumCores)
>>
>> # This has to be added so that each worker has access to these
>> # function definitions
>> @everywhere include("experimentUtils.jl")
>>
>> # Initialize array of RMSE
>> fill!(runErrors, 0.0)
>>
>> # Split up runs across the number of cores
>> outerloop = floor(Int, numRuns / effNumCores) + 1
>> r = 1
>> rend = effNumCores
>> for i = 1:outerloop
>>     rend = min(r + effNumCores - 1, numRuns)
>>
>>     # An empty RMSE, Array{Float64}(0,0), is passed, since it is created
>>     # inside learningExperimentRun and returned via pmap_errors
>>     pmap_errors = pmap(r -> learningExperimentRun(mdp, hordeOfD, stepData,
>>         alpha, lambda, beta, numAgents, numSteps, Array{Float64}(0,0), r),
>>         r:rend)
>>     for j = 1:(rend-r+1)
>>         runErrors[:,:,MEAN_IND] += pmap_errors[j]
>>         runErrors[:,:,VAR_IND]  += pmap_errors[j].^2
>>     end
>>     r += effNumCores
>> end
>> rmprocs(procids)
>>
>> The function called above is defined in a separate file,
>> experimentUtils.jl, as
>>
>> function learningExperimentRun(mdp::MDP, hordeOfD::horde,
>>         stepData::transData, alpha::Float64, lambda::Float64,
>>         beta::Float64, numAgents::Int64, numSteps::Int64,
>>         RMSE::Array{Float64,2}, runNum::Int64)
>>     # If RMSE is empty, initialize it; it is empty for the parallel version
>>     if isempty(RMSE)
>>         RMSE = zeros(Float64, numAgents, numSteps)
>>     else
>>         fill!(RMSE, 0.0)
>>     end
>>
>>     srand(runNum)
>>
>>     agentInit(hordeOfD, mdp, alpha, beta, lambda, BETA_ETD)
>>     getLearnerErrors(hordeOfD, mdp, RMSE, 1)
>>     mdpStart(mdp, stepData)
>>     for i = 2:numSteps
>>         mdpStep(mdp, stepData)
>>         updateLearners(stepData, mdp, hordeOfD)
>>         getLearnerErrors(hordeOfD, mdp, RMSE, i)
>>     end
>>
>>     return RMSE
>> end
>>
>> When I try to run this, I get a large number of workers and errors stating
>> that I have too many files open. I believe I must be doing something
>> seriously wrong. If anyone could help me parallelize this code in Julia,
>> that would be fantastic. I am not tied to pmap, but after reading a bit,
>> it seemed to be the right function to use.
>>
>> I should further add that I have an additional loop splitting runs over
>> cores, even though pmap could do that for me. I did this because
>> pmap_errors would otherwise become an array of numRuns entries (which
>> could be in the hundreds). By splitting it into chunks, the returned
>> pmap_errors holds at most one entry per core. I am hoping that this memory
>> then gets reused when starting the next loop over cores.
>>
>> At first I tried to avoid this by using a distributed array for runErrors,
>> but this was not clearly documented, so I abandoned that approach.
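
For anyone reading along: the manual chunking loop in the quoted code can be
dropped, since pmap already schedules one task per worker and hands out the
next run as soon as a worker finishes. Below is a minimal, self-contained
sketch of that pattern in the Julia 0.4-era syntax of this thread; runOnce
and the sizes (numRuns, numAgents, numSteps) are placeholders standing in for
learningExperimentRun and the real experiment, not the poster's actual code.

effNumCores = 3
procids = addprocs(effNumCores)

# Define the per-run work on every worker
@everywhere function runOnce(runNum::Int, numAgents::Int, numSteps::Int)
    srand(runNum)                    # seed per run so each trial is reproducible
    return rand(numAgents, numSteps) # stand-in for the per-run RMSE matrix
end

numRuns, numAgents, numSteps = 12, 4, 100

# pmap schedules the runs over the available workers itself, so no manual
# chunking loop is needed; results is a numRuns-element array of matrices
results = pmap(r -> runOnce(r, numAgents, numSteps), 1:numRuns)

# Accumulate sums and sums of squares for mean/variance estimates
sumErr = zeros(numAgents, numSteps)
sqErr  = zeros(numAgents, numSteps)
for R in results
    sumErr += R
    sqErr  += R.^2
end

rmprocs(procids)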

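On the distributed-array idea mentioned at the end of the quoted message: when
all workers live on a single machine, a SharedArray (the Julia 0.4-era shared-
memory type, a different thing from DistributedArrays) lets every worker write
its run's errors directly into one pre-allocated 3-D array, so nothing large
is returned through pmap at all. A sketch under the same placeholder names as
above:

addprocs(3)

numRuns, numAgents, numSteps = 12, 4, 100

# One pre-allocated 3-D array, indexed by run; every local worker writes
# directly into its own slice
runErrors = SharedArray(Float64, numAgents, numSteps, numRuns)

@sync @parallel for r = 1:numRuns
    srand(r)
    runErrors[:, :, r] = rand(numAgents, numSteps) # stand-in for the real per-run RMSE
end

# Reduce over the run dimension afterwards, e.g. a mean estimate
meanErr = squeeze(sum(sdata(runErrors), 3), 3) / numRuns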