Hi Nathan,

I posted the code exactly as I ran it, so you can check whether the two
versions do the same thing or not. Each snippet went into its own Jupyter
cell, nothing more and nothing less; there isn't a single line I didn't
post. And yes, I understand your line of reasoning, which is why I was
astonished too. But I can't see what is making this huge difference, and
I'd like to know :)

Best,

Ferran.

On Thursday, July 21, 2016 at 6:31:57 PM UTC+2, Nathan Smith wrote:
>
> Hey Ferran, 
>
> You should be suspicious when your apparent speed-up surpasses the level 
> of parallelism available on your CPU. It looks like your two versions 
> don't actually compute the same thing.
>
> I'm assuming you're trying to compute the matrix power A^1000000000 by 
> repeatedly multiplying by A. In your parallel code, each process gets a 
> local copy of 'z' and updates that. This means each process computes 
> something like A^(1000000000 / number of processes), and the results are 
> never combined. Check out this 
> <http://docs.julialang.org/en/release-0.4/manual/parallel-computing/#parallel-map-and-loops>
>  section 
> of the documentation on parallel map and loops to see what I mean.
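>
> As a minimal sketch (not a drop-in fix; N is kept small here just for 
> illustration), the per-iteration factors can be combined with the (*) 
> reducer form of @parallel from that docs section:

```julia
# Sketch: compute the matrix power A^N with @parallel and a (*) reduction.
# Each process multiplies the factors from its own chunk of 1:N locally,
# and @parallel then combines the partial products with *.
# addprocs(2)  # assumption: uncomment to add workers; runs serially without

A = [1.0 1.0001; 1.0002 1.0003]
N = 100

z = @parallel (*) for i in 1:N
    A   # every iteration contributes one factor of A to the reduction
end
# Unlike the reduction-free loop, z really holds A^N here.
```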
>
> That said, that doesn't fully explain your speed-up. Without a reduction 
> operator, @parallel returns immediately instead of waiting for the loop 
> to finish, so your toc() isn't measuring the actual work. You should also 
> make sure that each part of your script is wrapped in a function and that 
> you warm up each function by running it once before comparing timings.
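>
> A sketch of the wrap-and-warm-up approach (the function name is just 
> illustrative):

```julia
# Sketch: wrap the work in a function so it gets JIT-compiled,
# call it once to warm up, then time a second call.
function matpow(A, n)
    z = eye(A)   # identity matrix of the same size as A
    for i in 1:n
        z *= A
    end
    return z
end

A = [1.0 1.0001; 1.0002 1.0003]
matpow(A, 10)          # warm-up call: triggers compilation
@time matpow(A, 100)   # this measurement excludes compile time
```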
>
> Cheers, 
> Nathan
>
> On Thursday, 21 July 2016 12:00:47 UTC-4, Ferran Mazzanti wrote:
>>
>> Hi,
>>
>> mostly showing my astonishment, but I can't even understand the figures 
>> from this silly parallelization example
>>
>> A = [[1.0 1.0001];[1.0002 1.0003]]
>> z = A
>> tic()
>> for i in 1:1000000000
>>     z *= A
>> end
>> toc()
>> A
>>
>> produces
>>
>> elapsed time: 105.458639263 seconds
>>
>> 2x2 Array{Float64,2}:
>>  1.0     1.0001
>>  1.0002  1.0003
>>
>>
>>
>> But then I add @parallel to the for loop
>>
>> A = [[1.0 1.0001];[1.0002 1.0003]]
>> z = A
>> tic()
>> @parallel for i in 1:1000000000
>>     z *= A
>> end
>> toc()
>> A
>>
>> and get 
>>
>> elapsed time: 0.008912282 seconds
>>
>> 2x2 Array{Float64,2}:
>>  1.0     1.0001
>>  1.0002  1.0003
>>
>>
>> look at the elapsed time differences! And I'm running this on my Xeon 
>> desktop, not even a cluster
>> Of course A-B reports
>>
>> 2x2 Array{Float64,2}:
>>  0.0  0.0
>>  0.0  0.0
>>
>>
>> So is this what one should expect from this kind of simple 
>> parallelization? If so, I'm definitely *in love* with Julia :):):)
>>
>> Best,
>>
>> Ferran.
>>
>>
>>
