Hi Nathan, I posted the code, so you can check whether the two versions do the same thing or not. Each snippet went into a separate cell in Jupyter, nothing more and nothing less; not even a single line I didn't post. And yes, I understand your line of reasoning, which is why I was astonished too. But I can't see what is making this huge difference, and I'd like to know :)
Best, Ferran.

On Thursday, July 21, 2016 at 6:31:57 PM UTC+2, Nathan Smith wrote:
> Hey Ferran,
>
> You should be suspicious when your apparent speed-up surpasses the level of parallelism available on your CPU. It looks like your codes don't actually compute the same thing.
>
> I'm assuming you're trying to compute the matrix power A^1000000000 by repeatedly multiplying A. In your parallel code, each process gets a local copy of 'z' and uses that. This means each process is computing something like A^(1000000000/# of procs). Check out this section of the documentation on parallel map and loops to see what I mean:
> http://docs.julialang.org/en/release-0.4/manual/parallel-computing/#parallel-map-and-loops
>
> That said, that doesn't completely explain your speed-up. You should also make sure that each part of your script is wrapped in a function and that you 'warm up' each function by running it once before comparing.
>
> Cheers,
> Nathan
>
> On Thursday, 21 July 2016 12:00:47 UTC-4, Ferran Mazzanti wrote:
>> Hi,
>>
>> mostly showing my astonishment, but I can't even understand the figures in this stupid parallelization code
>>
>> A = [[1.0 1.0001];[1.0002 1.0003]]
>> z = A
>> tic()
>> for i in 1:1000000000
>>     z *= A
>> end
>> toc()
>> A
>>
>> produces
>>
>> elapsed time: 105.458639263 seconds
>>
>> 2x2 Array{Float64,2}:
>>  1.0     1.0001
>>  1.0002  1.0003
>>
>> But then add @parallel in the for loop
>>
>> A = [[1.0 1.0001];[1.0002 1.0003]]
>> z = A
>> tic()
>> @parallel for i in 1:1000000000
>>     z *= A
>> end
>> toc()
>> A
>>
>> and get
>>
>> elapsed time: 0.008912282 seconds
>>
>> 2x2 Array{Float64,2}:
>>  1.0     1.0001
>>  1.0002  1.0003
>>
>> look at the elapsed time differences! And I'm running this on my Xeon desktop, not even a cluster.
>> Of course A-B reports
>>
>> 2x2 Array{Float64,2}:
>>  0.0  0.0
>>  0.0  0.0
>>
>> So is this what one should expect from this kind of simple parallelization?
>> If so, I'm definitely *in love* with Julia :):):)
>>
>> Best,
>>
>> Ferran.
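For anyone reading along, here is a minimal sketch of the two behaviors Nathan describes, in the Julia 0.4 syntax used in the thread (later versions renamed @parallel to @distributed in the Distributed stdlib; the worker count and the small n below are arbitrary choices for illustration, not from the original posts):

```julia
# Julia 0.4-era sketch; assumes a machine with spare cores.
addprocs(2)                       # start two worker processes

A = [1.0 1.0001; 1.0002 1.0003]
n = 20                            # small n so the entries stay finite

# 1) Plain @parallel: the loop is scheduled onto the workers and the
#    macro returns almost immediately, so tic()/toc() around it times
#    the scheduling, not the work (@sync added here to wait for it).
#    Each worker also operates on its own serialized copy of z, so the
#    caller's z is never updated.
z = A
@sync @parallel for i in 1:n
    z *= A                        # mutates a worker-local copy only
end
println(z == A)                   # the master's z should be unchanged

# 2) Reducing form: each worker computes a partial product and the
#    partial results are combined with *, which is the correct way to
#    parallelize this product.
z_par = @parallel (*) for i in 1:n
    A                             # one factor per iteration
end
println(isapprox(z_par, A^n))     # should match A^n up to roundoff
```

Two details worth noting when comparing this to the original snippets: the serial loop starts from z = A and multiplies n times, so it produces A^(n+1), while the reducing loop contributes one factor per iteration and yields A^n; and both original cells end with the bare expression A, so what Jupyter displays after each run is A itself, not the accumulated z, which is why the two outputs look identical.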
