Cheers. I uploaded the two scripts:

https://gist.github.com/pjbarendrecht/ee4eff971ec2073bfad6 (using SharedArrays)
https://gist.github.com/pjbarendrecht/617b73a36b4848634eae (using the pmap() function)

Use ParSet(10) to run 10,000 simulations.
Pieter

On Friday, March 13, 2015 at 3:29:48 PM UTC, René Donner wrote:
>
> On 13.03.2015 at 16:20, Pieter Barendrecht <[email protected]> wrote:
> >
> > Thanks! I tried both approaches you suggested. Some results using
> > SharedArrays (100,000 simulations)
> >
> > #workers  #time
> > 1         ~120s
> > 3         ~42s
> > 6         ~40s
> >
> > Short question. The first print statement after the for-loop is
> > already executed before the for-loop ends. How do I prevent this
> > from happening?
> >
> > Some results using the other approach (again 100,000 simulations)
> >
> > #workers  #time
> > 1         ~118s
> > 2         ~60s
> > 3         ~42s
> > 4         ~38s
> > 6         ~40s
> > 6         ~40s
>
> Could you post a simplified code snippet? Either here or in a gist. It
> is difficult to know what exactly you are doing ;-)
>
> > Couple of questions. My equivalent of "myfunc_pure()" also requires
> > a second argument.
>
> Is that argument changing, or is it there to switch between different
> algorithms etc.?
>
> > In addition, I don't make use of the "startindex" argument in the
> > function. What's the common approach here? Next, there are actually
> > multiple variables that should be returned, not just "result".
>
> You can always return (a,b,c) instead of a, i.e. a tuple. The function
> you provide to reduce then has the following signature:
> myreducer(a::Tuple, b::Tuple). Combine the tuples, and again return a
> tuple.
>
> > Overall, I'm a bit surprised that using more than 3 or 4 workers
> > does not decrease the running time. Any ideas? I'm using Julia 0.3.6
> > on a 64bit Arch Linux system, Intel(R) Core(TM) i7-3630QM CPU @
> > 2.40GHz.
>
> Can be any number of things: the memory bandwidth could be the
> limiting factor, or the computation is actually nicely sped up and a
> lot of what you see is communication overhead. In that case, work on
> chunks of data / batches of iterations, i.e. don't pmap over millions
> of things but only a couple dozen. Looking at the code might shed some
> light.
>
> > On Friday, March 13, 2015 at 8:37:19 AM UTC, René Donner wrote:
> >
> > Perhaps SharedArrays are what you need here?
> > http://docs.julialang.org/en/release-0.3/stdlib/parallel/?highlight=sharedarray#Base.SharedArray
> >
> > Reading from a shared array in workers is fine, but when different
> > workers try to update the same part of that array you will get racy
> > behaviour and most likely not the correct result.
> >
> > Can you somehow re-formulate your problem along these lines, using a
> > map and reduce approach using a pure function?
> >
> > @everywhere function myfunc_pure(startindex)
> >     result = zeros(Int,10)
> >     for i in startindex + (0:19) # 20 iterations
> >         result[mod(i,length(result))+1] += 1
> >     end
> >     result
> > end
> > reduce(+,pmap(myfunc_pure, 1:5)) # 5 blocks of 20 iterations
> >
> > Like this you don't have a shared mutable state and thus no risk for
> > mess-ups.
> >
> > On 13.03.2015 at 00:56, Pieter Barendrecht <[email protected]> wrote:
> > >
> > > I'm wondering how to save data/results in a parallel for-loop.
> > > Let's assume there is a single Int64 array, initialised using
> > > zeros() before starting the for-loop. In the for-loop (typically
> > > ~100,000 iterations, that's the reason I'm interested in parallel
> > > processing) the entries of this Int64 array should be increased
> > > (based on the results of an algorithm that's invoked in the
> > > for-loop).
> > >
> > > Everything works fine when using just a single proc, but I'm not
> > > sure how to modify the code such that, when using e.g.
> > > addprocs(4), the data/results stored in the Int64 array can be
> > > processed once the for-loop ends. The algorithm (a separate
> > > function) is available to all procs (using the require()
> > > function). Just using the Int64 array in the for-loop (using
> > > @parallel for k=1:100000) does not work as each proc receives its
> > > own copy, so after the for-loop it contains just zeros (as
> > > illustrated in a set of slides on the Julia language). I guess it
> > > involves @spawn and fetch() and/or pmap(). Any suggestions or
> > > examples would be much appreciated :).
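
For readers following the thread, here is a minimal sketch that puts two of the suggestions above together: returning a tuple from the worker function and combining tuples in the reducer, and handing pmap() a handful of batches rather than one item per iteration. It assumes Julia 1.x, where pmap() and addprocs() come from the Distributed standard library (in the 0.3 release used in the thread they were part of Base). The names simulate_batch and combine, the batch size of 25, and the per-iteration work are hypothetical placeholders, not taken from the posted gists.

```julia
using Distributed   # assumption: Julia 1.x; not needed on 0.3

# Start two workers if none are running yet (the count is arbitrary).
if nprocs() == 1
    addprocs(2)
end

# Hypothetical stand-in for the real algorithm. It works on a whole batch
# of iteration indices and returns several results at once as a tuple.
@everywhere function simulate_batch(batch::UnitRange{Int})
    counts = zeros(Int, 10)   # local to this call, so no shared mutable state
    total = 0
    for i in batch
        counts[mod(i, length(counts)) + 1] += 1
        total += i
    end
    return (counts, total)
end

# Reducer: takes two tuples, combines them, and returns a tuple again.
combine(a::Tuple, b::Tuple) = (a[1] + b[1], a[2] + b[2])

# Four non-overlapping batches of 25 iterations instead of 100 single items.
batches = [i:min(i + 24, 100) for i in 1:25:100]

counts, total = reduce(combine, pmap(simulate_batch, batches))
println(counts)
println(total)
```

Because pmap() only returns once every batch has finished, the println calls here cannot run early. For a distributed for-loop without a reducer the situation is different: it returns immediately, and prefixing it with @sync makes the caller wait, which may be what the "print statement runs before the loop ends" question above is hitting.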
