OK thanks.
I hadn't considered @parallel, probably because I associated it with a 
large number of trials of small work units, whereas I thought of pmap as 
better suited to a relatively small number of trials of longer-running 
work units.
In any case, @parallel works fine.

Old pmap code skeleton:
trialCounts = pmap(MySimulation, [1:numTrials], fill(numIter, numTrials))
totalCounts = sum(trialCounts)

New @parallel code:
totalCounts = @parallel (+) for trial = 1:numTrials
    MySimulation(trial, numIter)
end



However, I have 2 questions:

1. When I modify the @parallel code to assign the result to a variable 
inside the loop, I get an error.
I don't understand the @parallel macro, but I'm guessing I can't assign to 
a variable inside the loop?
 
totalCounts = @parallel (+) for trial = 1:numTrials
    trialCount = MySimulation(trial, numIter)
    print(trialCount) # or some other processing with trialCount
end
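
One thing that may be relevant here, offered as an assumption rather than a 
confirmed diagnosis: a reducing loop combines the value of the last 
expression in its body, and print returns nothing. The sketch below emulates 
the reduction serially, without @parallel, so it only illustrates that point. 
The names simulate and trial_body are hypothetical stand-ins, not part of the 
code above.

```julia
# Hypothetical stand-in for MySimulation; returns a count for one trial.
simulate(trial, numIter) = trial * numIter

# Loop body written as a function. Its value is its last expression,
# which is what a reducer such as (+) would receive.
function trial_body(trial, numIter)
    trialCount = simulate(trial, numIter)
    # ... print or otherwise process trialCount here ...
    trialCount  # make the count the last expression, not the print call
end

# Serial emulation of the (+) reduction over trials 1:5.
totalCounts = reduce(+, [trial_body(trial, 10) for trial in 1:5])
```

If that assumption holds, ending the loop body with trialCount instead of the 
print call would give the reducer a number to add.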



2. Again, I don't understand the @parallel macro, but it seems to call 
preduce (see below), which seems to collect results in an array of size 
numTrials / nworkers().
If so, the memory requirement still depends on the number of trials.
I was trying to limit the results array to the number of workers, 
independent of the number of trials.
Is my understanding correct?

function preduce(reducer, f, N::Int)
    chunks = splitrange(N, nworkers())
    results = cell(length(chunks))
    for i in 1:length(chunks)
        results[i] = @spawn f(first(chunks[i]), last(chunks[i]))
    end
    mapreduce(fetch, reducer, results)
end
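
To make the question concrete, here is a minimal serial sketch of the pattern 
in the listing above. It assumes that splitrange divides 1:N into one 
contiguous range per worker and that f folds its range into a single value; 
neither is confirmed here. The names split_into_chunks, fold_chunk and 
chunked_reduce are hypothetical, and nothing runs on remote workers.

```julia
# Hypothetical stand-in for the internal splitrange: divide 1:N into at most
# np contiguous ranges of nearly equal length.
function split_into_chunks(N::Int, np::Int)
    each, extras = divrem(N, np)
    nchunks = each == 0 ? extras : np
    chunks = UnitRange{Int}[]
    lo = 1
    for i in 1:nchunks
        hi = lo + each - 1 + (i <= extras ? 1 : 0)
        push!(chunks, lo:hi)
        lo = hi + 1
    end
    return chunks
end

# Assumed behaviour of f(lo, hi): fold the range on the fly, keeping only
# one accumulator instead of an array of per-trial results.
function fold_chunk(reducer, f, lo::Int, hi::Int)
    acc = f(lo)
    for i in (lo + 1):hi
        acc = reducer(acc, f(i))
    end
    return acc
end

# Serial emulation of preduce: one partial result per chunk, then a final
# reduction over the partial results.
function chunked_reduce(reducer, f, N::Int, np::Int)
    chunks = split_into_chunks(N, np)
    partials = [fold_chunk(reducer, f, first(c), last(c)) for c in chunks]
    return reduce(reducer, partials), length(partials)
end

# 1000 trials split across 4 hypothetical workers.
total, npartials = chunked_reduce(+, trial -> trial^2, 1000, 4)
```

Under those assumptions npartials equals the number of chunks rather than the 
number of trials, so the stored results would scale with nworkers().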



Greg








 

On Friday, July 10, 2015 at 12:24:16 PM UTC+10, Jameson wrote:

> this sounds like you may be looking for the `@parallel reduce_fn for itm = 
> lst; f(itm); end` map-reducer construct (described on the same page)?
>
> On Thu, Jul 9, 2015 at 9:23 PM Greg Plowman <[email protected]> wrote:
>
>> I have been using pmap for simulations and find it very useful 
>> and convenient.
>> However, sometimes I want to run a large number of trials where the 
>> results are also large. This requires a lot of memory to hold the returned 
>> results.
>> If I'm only interested in the final, reduced result, and not concerned 
>> with the raw individual trial results, then returning the entire array 
>> seems unnecessary.
>> I want to reduce on the fly, avoiding the need to keep all trial results.
>> I want to run more trials than workers for load balancing. (And possibly 
>> because I'm interested in summary results of individual trials, not the 
>> entire raw results).
>>
>> With the help of the simplified version of pmap presented in the docs (
>> http://julia.readthedocs.org/en/latest/manual/parallel-computing/), I 
>> have a tenuous understanding of how pmap works, although the actual 
>> implementation scares me.
>> In any case, before I go further, I was wondering whether a modified 
>> version of pmap could be designed to reduce on the fly.
>> Here are some modifications to the simplified, documentation version.
>> Would something like this work? I'm worried about the shared updates to 
>> final_result. Will these happen in an orderly way? What else should I 
>> consider?
>>
>>
>> * function pmap(f, lst)
>>
>> * function pmap_reduce(f, lst, reduce_fn)  # extra argument is the reduce function
>>     np = nprocs()  # determine the number of processes available
>>     n = length(lst)
>>
>>
>> *   results = cell(n)
>> *   results = cell(np)  # hold results for currently executing procs only
>> *   final_result = cell(1)  # holds the final, reduced result
>>
>>     i = 1
>>     # function to produce the next work item from the queue.
>>     # in this case it's just an index.
>>     nextidx() = (idx=i; i+=1; idx)
>>
>>     @sync begin
>>         for p=1:np
>>             if p != myid() || np == 1
>>                 @async begin
>>                     while true
>>                         idx = nextidx()
>>                         if idx > n
>>                             break
>>                         end
>>
>> *                       results[idx] = remotecall_fetch(p, f, lst[idx])
>> *                       # return results into array indexed by proc
>> *                       results[p] = remotecall_fetch(p, f, lst[idx])
>> *                       # combine results[p] into final_result using reduction function
>> *                       reduce_fn(final_result, results[p])
>>                     end
>>                 end
>>             end
>>         end
>>     end
>>
>> *   results
>> *   final_result  # return reduced result
>> end
>>
>>
>>
