On 13 Mar 2015, at 16:20, Pieter Barendrecht <[email protected]> wrote:

> Thanks! I tried both approaches you suggested. Some results using 
> SharedArrays (100,000 simulations)
> 
> #workers #time
> 1 ~120s
> 3 ~42s
> 6 ~40s
> 
> Short question. The first print statement after the for-loop is already 
> executed before the for-loop ends. How do I prevent this from happening?
> 
> Some results using the other approach (again 100,000 simulations)
> 
> #workers #time
> 1 ~118s
> 2 ~60s
> 3 ~42s
> 4 ~38s
> 6 ~40s
> 

Could you post a simplified code snippet? Either here or in a gist. It is 
difficult to know what exactly you are doing ;-)

> Couple of questions. My equivalent of "myfunc_pure()" also requires a second 
> argument.

Is that argument changing, or is it there to switch between different 
algorithms etc.?

> In addition, I don't make use of the "startindex" argument in the function. 
> What's the common approach here? Next, there are actually multiple variables 
> that should be returned, not just "result".

You can always return (a,b,c) instead of a, i.e. a tuple. The function you 
provide to reduce then has the signature myreducer(a::Tuple, b::Tuple): 
combine the two tuples and return a tuple again.
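A minimal sketch of that idea, building on the myfunc_pure example further down (the second return value `total` is just an invented placeholder for your additional variables):

```julia
# Each call returns a tuple of several results; the reducer combines
# two such tuples element-wise and returns a tuple again.
@everywhere function myfunc_pure(startindex)
    counts = zeros(Int, 10)
    total  = 0
    for i in startindex:(startindex + 19)   # 20 iterations per block
        counts[mod(i, length(counts)) + 1] += 1
        total += 1
    end
    (counts, total)                         # multiple results as a tuple
end

myreducer(a::Tuple, b::Tuple) = (a[1] + b[1], a[2] + b[2])

counts, total = reduce(myreducer, pmap(myfunc_pure, 1:5))
```

(On recent Julia versions pmap and @everywhere live in the Distributed standard library; on 0.3 they are in Base.)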

> 
> Overall, I'm a bit surprised that using more than 3 or 4 workers does not 
> decrease the running time. Any ideas? I'm using Julia 0.3.6 on a 64bit Arch 
> Linux system, Intel(R) Core(TM) i7-3630QM CPU @ 2.40GHz. 

It can be any number of things: the memory bandwidth could be the limiting 
factor, or the computation is actually nicely sped up and a lot of what 
you see is communication overhead. In that case, work on chunks of data / 
batches of iterations, i.e. don't pmap over millions of things but only a 
couple dozen. Looking at the code might shed some light.
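For illustration, a sketch of such batching (run_batch and the batch count are invented names/numbers here, the loop body stands in for your real simulation):

```julia
# One pmap task per batch of iterations instead of one task per iteration,
# so the per-task communication overhead is paid only a couple dozen times.
@everywhere function run_batch(r::UnitRange)
    result = zeros(Int, 10)
    for i in r                          # stand-in for the real simulation
        result[mod(i, length(result)) + 1] += 1
    end
    result
end

nsim      = 100_000
nbatch    = 25                          # a couple dozen tasks, not 100,000
batchsize = div(nsim, nbatch)
ranges    = [(k - 1) * batchsize + 1 : k * batchsize for k in 1:nbatch]
total     = reduce(+, pmap(run_batch, ranges))
```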

> 
> On Friday, March 13, 2015 at 8:37:19 AM UTC, René Donner wrote:
> Perhaps SharedArrays are what you need here? 
> http://docs.julialang.org/en/release-0.3/stdlib/parallel/?highlight=sharedarray#Base.SharedArray
>  
> 
> Reading from a shared array in workers is fine, but when different workers 
> try to update the same part of that array you will get racy behaviour and 
> most likely not the correct result. 
> 
> Can you somehow re-formulate your problem along these lines, using a map and 
> reduce approach using a pure function? 
> 
>   @everywhere function myfunc_pure(startindex) 
>       result = zeros(Int,10) 
>       for i in startindex + (0:19)  # 20 iterations 
>           result[mod(i,length(result))+1] += 1 
>       end 
>       result 
>   end 
>   reduce(+,pmap(myfunc_pure, 1:5))  # 5 blocks of 20 iterations 
> 
> Like this you don't have a shared mutable state and thus no risk for 
> mess-ups. 
> 
> 
> 
> 
> On 13 Mar 2015, at 00:56, Pieter Barendrecht <[email protected]> wrote: 
> 
> > I'm wondering how to save data/results in a parallel for-loop. Let's assume 
> > there is a single Int64 array, initialised using zeros() before starting 
> > the for-loop. In the for-loop (typically ~100,000 iterations, that's the 
> > reason I'm interested in parallel processing) the entries of this Int64 
> > array should be increased (based on the results of an algorithm that's 
> > invoked in the for-loop).
> > 
> > Everything works fine when using just a single proc, but I'm not sure how 
> > to modify the code such that, when using e.g. addprocs(4), the data/results 
> > stored in the Int64 array can be processed once the for-loop ends. The 
> > algorithm (a separate function) is available to all procs (using the 
> > require() function). Just using the Int64 array in the for-loop (using 
> > @parallel for k=1:100000) does not work as each proc receives its own copy, 
> > so after the for-loop it contains just zeros (as illustrated in a set of 
> > slides on the Julia language). I guess it involves @spawn and fetch() 
> > and/or pmap(). Any suggestions or examples would be much appreciated :).
