I have been using pmap for simulations and find it very useful and convenient. However, sometimes I want to run a large number of trials where the results are also large. This requires a lot of memory to hold the returned results. If I'm only interested the final, reduced result, and not concerned with the raw individual trial results, then returning entire array seems unnecessary. I want to reduce on the fly, avoiding the need to keep all trial results. I want to run more trials than workers for load balancing. (And possibly because I'm interested in summary results of individual trials, not the entire raw results).
With the help of the simplified version of pmap presented in the docs ( http://julia.readthedocs.org/en/latest/manual/parallel-computing/), I have a tenuous understanding of how pmap works. Although the actual implementation scares me. In any case, I was wondering before I progress further, whether a modified version of pmap could be designed to reduce on-the-fly. Here are some modifications to the simplified, documentation version. Would something like this work? I'm worried about the shared updates to final_result. Will these happen orderly? What else should I consider? * function pmap(f, lst) * function pmap_reduce(f, lst, reduce_fn) # extra argument is reduce function np = nprocs() # determine the number of processes available n = length(lst) * results = cell(n) * results = cell(np) # hold results for currently executing procs only * final_result = cell(1) # holds the final, reduced result i = 1 # function to produce the next work item from the queue. # in this case it's just an index. nextidx() = (idx=i; i+=1; idx) @sync begin for p=1:np if p != myid() || np == 1 @async begin while true idx = nextidx() if idx > n break end * results[idx] = remotecall_fetch(p, f, lst[idx]) * results[p] = remotecall_fetch(p, f, lst[idx]) # return results into array indexed by proc * reduce_fn(final_result, results[p]) # combine results[p] into final_result using reduction function end end end end end * results * final_result # return reduced result end
