Cheers. I uploaded the two scripts —

https://gist.github.com/pjbarendrecht/ee4eff971ec2073bfad6 (using SharedArrays)
https://gist.github.com/pjbarendrecht/617b73a36b4848634eae (using the pmap() function) → use ParSet(10) to run 10,000 simulations.

Pieter


On Friday, March 13, 2015 at 3:29:48 PM UTC, René Donner wrote:
>
>
> Am 13.03.2015 um 16:20 schrieb Pieter Barendrecht <[email protected]>: 
>
> > Thanks! I tried both approaches you suggested. Some results using 
> SharedArrays (100,000 simulations) 
> > 
> > #workers #time 
> > 1 ~120s 
> > 3 ~42s 
> > 6 ~40s 
> > 
> > Short question. The first print statement after the for-loop is already 
> executed before the for-loop ends. How do I prevent this from happening? 
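
A minimal sketch of the usual fix: prefixing the loop with `@sync` blocks until every parallel task inside it has finished, so the print only runs afterwards. On Julia 0.3 the parallel form is `@sync @parallel for ...`; plain `@async` tasks are used below purely to illustrate the waiting behaviour.

```julia
# @sync waits for all tasks started inside it, so the statement after the
# block only executes once the loop is completely done.
results = Int[]
@sync for k in 1:5
    @async push!(results, k^2)   # each task records its result
end
# By this point every task has finished.
println(sort(results))
```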
> > 
> > Some results using the other approach (again 100,000 simulations) 
> > 
> > #workers #time 
> > 1 ~118s 
> > 2 ~60s 
> > 3 ~42s 
> > 4 ~38s 
> > 6 ~40s 
> > 
>
> Could you post a simplified code snippet? Either here or in a gist. It is 
> difficult to know what exactly you are doing ;-) 
>
> > Couple of questions. My equivalent of "myfunc_pure()" also requires a 
> second argument. 
>
> Is that argument changing, or is this there to switch between different 
> algorithms etc? 
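
If the second argument is a fixed parameter, one common pattern (sketched here with illustrative names; `scale` and the function body are not from the thread) is to wrap the call in an anonymous function, so pmap still sees a one-argument function:

```julia
using Distributed  # pmap/@everywhere; these lived in Base on Julia 0.3

# Stand-in for a two-argument simulation function
@everywhere myfunc_pure(startindex, scale) = scale * startindex

scale = 3
# The closure fixes the second argument; pmap only varies the first.
results = pmap(i -> myfunc_pure(i, scale), 1:5)
```

With actual worker processes, the closure and the captured value of `scale` are serialized to each worker automatically.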
>
> > In addition, I don't make use of the "startindex" argument in the 
> function. What's the common approach here? Next, there are actually 
> multiple variables that should be returned, not just "result". 
>
> You can always return (a,b,c) instead of a, i.e. a tuple. The function you 
> provide to reduce then has the following signature: myreducer(a::Tuple, 
> b::Tuple). Combine the tuples, and again return a tuple. 
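
Concretely, a sketch of that tuple-returning pattern, extending the `myfunc_pure` example from earlier in the thread (the second returned value, a running total, is invented here for illustration):

```julia
using Distributed  # pmap/@everywhere; in Base on Julia 0.3

@everywhere function myfunc_pure(startindex)
    counts = zeros(Int, 10)
    total  = 0
    for i in startindex .+ (0:19)      # 20 iterations per block
        counts[mod(i, length(counts)) + 1] += 1
        total += 1
    end
    (counts, total)                    # return a tuple instead of one array
end

# Combine two tuples element-wise; reduce threads this over all results.
myreducer(a::Tuple, b::Tuple) = (a[1] + b[1], a[2] + b[2])

result = reduce(myreducer, pmap(myfunc_pure, 1:5))
```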
>
> > 
> > Overall, I'm a bit surprised that using more than 3 or 4 workers does 
> not decrease the running time. Any ideas? I'm using Julia 0.3.6 on a 64bit 
> Arch Linux system, Intel(R) Core(TM) i7-3630QM CPU @ 2.40GHz. 
>
> It can be any number of things: the memory bandwidth could be the limiting 
> factor, or the computation is actually nicely sped up and a lot of what you 
> see is communication overhead. In that case, work on chunks of data / 
> batches of iterations, i.e. don't pmap over millions of things but only a 
> couple dozen. Looking at the code might shed some light. 
>
> > 
> > On Friday, March 13, 2015 at 8:37:19 AM UTC, René Donner wrote: 
> > Perhaps SharedArrays are what you need here? 
> http://docs.julialang.org/en/release-0.3/stdlib/parallel/?highlight=sharedarray#Base.SharedArray
>  
> > 
> > Reading from a shared array in workers is fine, but when different 
> workers try to update the same part of that array you will get racy 
> behaviour and most likely not the correct result. 
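
A minimal sketch of the safe pattern: each iteration writes only to an index it alone owns, so no two tasks ever touch the same element. Modern spellings are used below; on Julia 0.3 the constructor was `SharedArray(Int, n)` and the loop macro was `@sync @parallel for`.

```julia
using Distributed, SharedArrays  # SharedArray was in Base on Julia 0.3

n = 8
s = SharedArray{Int}(n)          # zero-initialized, visible to all workers
@sync @distributed for k in 1:n  # Julia 0.3 spelling: @sync @parallel
    s[k] = k^2                   # index k is written by exactly one task
end
```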
> > 
> > Can you somehow re-formulate your problem along these lines, using a map 
> and reduce approach using a pure function? 
> > 
> >   @everywhere function myfunc_pure(startindex) 
> >       result = zeros(Int,10) 
> >       for i in startindex + (0:19)  # 20 iterations 
> >           result[mod(i,length(result))+1] += 1 
> >       end 
> >       result 
> >   end 
> >   reduce(+,pmap(myfunc_pure, 1:5))  # 5 blocks of 20 iterations 
> > 
> > Like this you don't have a shared mutable state and thus no risk for 
> mess-ups. 
> > 
> > 
> > 
> > 
> > Am 13.03.2015 um 00:56 schrieb Pieter Barendrecht <[email protected]>: 
>
> > 
> > > I'm wondering how to save data/results in a parallel for-loop. Let's 
> assume there is a single Int64 array, initialised using zeros() before 
> starting the for-loop. In the for-loop (typically ~100,000 iterations, 
> that's the reason I'm interested in parallel processing) the entries of 
> this Int64 array should be increased (based on the results of an algorithm 
> that's invoked in the for-loop). 
> > > 
> > > Everything works fine when using just a single proc, but I'm not sure 
> how to modify the code such that, when using e.g. addprocs(4), the 
> data/results stored in the Int64 array can be processed once the for-loop 
> ends. The algorithm (a separate function) is available to all procs (using 
> the require() function). Just using the Int64 array in the for-loop (using 
> @parallel for k=1:100000) does not work as each proc receives its own copy, 
> so after the for-loop it contains just zeros (as illustrated in a set of 
> slides on the Julia language). I guess it involves @spawn and fetch() 
> and/or pmap(). Any suggestions or examples would be much appreciated :). 
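
For reference, the parallel for-loop also has a reducing form that combines per-iteration results instead of mutating a shared array, which avoids the each-proc-gets-its-own-copy problem described above. The sketch uses the modern `@distributed` macro; on Julia 0.3 it was spelled `@parallel`, and the per-iteration body here is an illustrative stand-in for the real algorithm.

```julia
using Distributed

# Each iteration returns a small array; the (+) reducer sums them across
# workers, so no iteration ever mutates shared state.
totals = @distributed (+) for k in 1:1000
    contrib = zeros(Int, 10)
    contrib[mod(k, 10) + 1] = 1   # stand-in for the algorithm's result
    contrib
end
```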
>
>
