Re: [julia-users] pmap - version to return reduced result only

Greg Plowman Thu, 16 Jul 2015 22:09:44 -0700

Thanks Jameson.

My error was from the line: trialCounts = MySimulation(trial, numIter)
Error message: trialCounts not defined


This had something to do with the variable name (used later, perhaps if 
statements guarding scope and now local)
In any case, if I change variable name, it works: counts = MySimulation(
trial, numIter)


Thanks again for your help.

Greg



On Friday, July 17, 2015 at 2:48:44 PM UTC+10, Jameson wrote:

> i believe that length(chunks) will be <= nworkers()
>
> the last statement of the for loop should be the "return" value from that 
> iteration. (for example: the variable name `trialCount`).
>
> On Fri, Jul 17, 2015 at 12:12 AM Greg Plowman <[email protected] 
> <javascript:>> wrote:
>
> OK thanks.
> I didn't consider @parallel (probably because I considered it for only 
> large trials of small work units, whereas I considered pmap more suited to 
> relatively small trials of longer running work units)
> In any case, @parallel works fine.
>
> Old pmap code skeleton:
> trialCounts = pmap(MySimulation, [1:numTrials], fill(numIter, numTrials))
> totalCounts = sum(trialCounts)
>
> New @parallel code
> totalCounts = @parallel (+) for trial = 1:numTrials
>     MySimulation(trial, numIter)
> end
>
>
>
> However, I have 2 questions:
>
> 1. When I try to modify @parallel code to assign the result to a variable 
> inside the loop, I get an error.
> I don't understand the @parallel macro, but I'm guessing I can't assign to 
> variable inside loop?
>  
> totalCounts = @parallel (+) for trial = 1:numTrials
>     trialCount = MySimulation(trial, numPlays)
>     print(trialCount) # or some other processing with trialCount
> end
>
>
>
> 2. Again I don't @parallel macro but it seems to call preduce (see below), 
> which seems to collect results in an array of size numTrials / nworkers().
> If this is so, then memory requirement still has a dependency on the 
> number of trials.
> I was trying to limit the results array to the number of workers, 
> independent of number of trials.
> Is my understanding here correct?
>
> function preduce(reducer, f, N::Int)
>     chunks = splitrange(N, nworkers())
>     results = cell(length(chunks))
>     for i in 1:length(chunks)
>         results[i] = @spawn f(first(chunks[i]), last(chunks[i]))
>     end
>     mapreduce(fetch, reducer, results)
> end
>
>
>
> Greg
>
>
>
>
>
>
>
>
>  
>
> On Friday, July 10, 2015 at 12:24:16 PM UTC+10, Jameson wrote:
>
> this sounds like you may be looking for the `@parallel reduce_fn for itm = 
> lst; f(itm); end` map-reducer construct (described on the same page)?
>
> On Thu, Jul 9, 2015 at 9:23 PM Greg Plowman <[email protected]> wrote:
>
> I have been using pmap for simulations and find it very useful 
> and convenient.
> However, sometimes I want to run a large number of trials where the 
> results are also large. This requires a lot of memory to hold the returned 
> results.
> If I'm only interested the final, reduced result, and not concerned with 
> the raw individual trial results, then returning entire array seems 
> unnecessary.
> I want to reduce on the fly, avoiding the need to keep all trial results.
> I want to run more trials than workers for load balancing. (And possibly 
> because I'm interested in summary results of individual trials, not the 
> entire raw results).
>
> With the help of the simplified version of pmap presented in the docs (
> http://julia.readthedocs.org/en/latest/manual/parallel-computing/), I 
> have a tenuous understanding of how pmap works. Although the actual 
> implementation scares me.
> In any case, I was wondering before I progress further, whether a modified 
> version of pmap could be designed to reduce on-the-fly.
> Here are some modifications to the simplified, documentation version.
> Would something like this work? I'm worried about the shared updates to 
> final_result. Will these happen orderly? What else should I consider?
>
>
> * function pmap(f, lst)
>
> * function pmap_reduce(f, lst, reduce_fn)  # extra argument is reduce 
> function 
>     np = nprocs()  # determine the number of processes available
>     n = length(lst)
>
>
> *   results = cell(n)
> *   results = cell(np)  # hold results for currently executing procs only
> *   final_result = cell(1)  # holds the final, reduced result
>
>     i = 1
>     # function to produce the next work item from the queue.
>     # in this case it's just an index.
>     nextidx() = (idx=i; i+=1; idx)
>
>     @sync begin
>         for p=1:np
>             if p != myid() || np == 1
>                 @async begin
>                     while true
>                         idx = nextidx()
>                         if idx > n
>                             break
>                         end
>
> *                       results[idx] = remotecall_fetch(p, f, lst[idx])
> *                       results[p] = remotecall_fetch(
>
> ...

Re: [julia-users] pmap - version to return reduced result only

Reply via email to