this sounds like you may be looking for the `@parallel reduce_fn for itm = lst; f(itm); end` map-reducer construct (described on the same page)?
On Thu, Jul 9, 2015 at 9:23 PM Greg Plowman <[email protected]> wrote: > I have been using pmap for simulations and find it very useful > and convenient. > However, sometimes I want to run a large number of trials where the > results are also large. This requires a lot of memory to hold the returned > results. > If I'm only interested the final, reduced result, and not concerned with > the raw individual trial results, then returning entire array seems > unnecessary. > I want to reduce on the fly, avoiding the need to keep all trial results. > I want to run more trials than workers for load balancing. (And possibly > because I'm interested in summary results of individual trials, not the > entire raw results). > > With the help of the simplified version of pmap presented in the docs ( > http://julia.readthedocs.org/en/latest/manual/parallel-computing/), I > have a tenuous understanding of how pmap works. Although the actual > implementation scares me. > In any case, I was wondering before I progress further, whether a modified > version of pmap could be designed to reduce on-the-fly. > Here are some modifications to the simplified, documentation version. > Would something like this work? I'm worried about the shared updates to > final_result. Will these happen orderly? What else should I consider? > > > * function pmap(f, lst) > > * function pmap_reduce(f, lst, reduce_fn) # extra argument is reduce > function > np = nprocs() # determine the number of processes available > n = length(lst) > > > * results = cell(n) > * results = cell(np) # hold results for currently executing procs only > * final_result = cell(1) # holds the final, reduced result > > i = 1 > # function to produce the next work item from the queue. > # in this case it's just an index. > nextidx() = (idx=i; i+=1; idx) > > @sync begin > for p=1:np > if p != myid() || np == 1 > @async begin > while true > idx = nextidx() > if idx > n > break > end > > * results[idx] = remotecall_fetch(p, f, lst[idx]) > * results[p] = remotecall_fetch(p, f, lst[idx]) # > return results into array indexed by proc > * reduce_fn(final_result, results[p]) # combine > results[p] into final_result using reduction function > end > end > end > end > end > > * results > * final_result # return reduced result > end > > >
