I have been using pmap for simulations and find it very useful 
and convenient.
However, sometimes I want to run a large number of trials where the results 
are also large. This requires a lot of memory to hold the returned results.
If I'm only interested the final, reduced result, and not concerned with 
the raw individual trial results, then returning entire array seems 
unnecessary.
I want to reduce on the fly, avoiding the need to keep all trial results.
I want to run more trials than workers for load balancing. (And possibly 
because I'm interested in summary results of individual trials, not the 
entire raw results).

With the help of the simplified version of pmap presented in the docs (
http://julia.readthedocs.org/en/latest/manual/parallel-computing/), I have 
a tenuous understanding of how pmap works. Although the actual 
implementation scares me.
In any case, I was wondering before I progress further, whether a modified 
version of pmap could be designed to reduce on-the-fly.
Here are some modifications to the simplified, documentation version.
Would something like this work? I'm worried about the shared updates to 
final_result. Will these happen orderly? What else should I consider?


* function pmap(f, lst)

* function pmap_reduce(f, lst, reduce_fn)  # extra argument is reduce 
function 
    np = nprocs()  # determine the number of processes available
    n = length(lst)


*   results = cell(n)
*   results = cell(np)  # hold results for currently executing procs only
*   final_result = cell(1)  # holds the final, reduced result

    i = 1
    # function to produce the next work item from the queue.
    # in this case it's just an index.
    nextidx() = (idx=i; i+=1; idx)

    @sync begin
        for p=1:np
            if p != myid() || np == 1
                @async begin
                    while true
                        idx = nextidx()
                        if idx > n
                            break
                        end

*                       results[idx] = remotecall_fetch(p, f, lst[idx])
*                       results[p] = remotecall_fetch(p, f, lst[idx])  # 
return results into array indexed by proc
*                       reduce_fn(final_result, results[p])  # combine 
results[p] into final_result using reduction function
                    end
                end
            end
        end
    end

*   results
*   final_result  # return reduced result
end


Reply via email to