In a Jupyter notebook, add worker processes with addprocs(N).
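
For example, run something like this in a cell before any @parallel code (4 workers is just an example; pick N to match your cores):

addprocs(4)   # start 4 worker processes
nprocs()      # now reports 5: the master plus 4 workers

Note also that @parallel without a reduction returns immediately and runs asynchronously, so tic()/toc() around it mostly measures the cost of launching the tasks, not the work itself; wrap the loop in @sync if you want to wait for it to finish.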

On Thursday, 21 July 2016 12:59:02 UTC-4, Nathan Smith wrote:
>
> To be clear, you need to compare the final 'z', not the final 'A', to check 
> whether your calculations are consistent. The matrix A does not change 
> throughout this calculation, but the matrix z does.
> Also, there is no parallelism with the @parallel loop unless you start 
> julia with 'julia -p N', where N is the number of processes you'd like to 
> use.
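>
> For example, from the shell:
>
> julia -p 4
>
> or, in a session that is already running:
>
> addprocs(4)
>
> And the consistency check is on z from each run, not on A; with hypothetical 
> names z_serial and z_parallel for the two final results:
>
> z_serial - z_parallel   # should be (close to) the zero matrix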
>
> On Thursday, 21 July 2016 12:45:17 UTC-4, Ferran Mazzanti wrote:
>>
>> Hi Nathan,
>>
>> I posted the codes, so you can check whether they do the same thing or not. 
>> They went into separate cells in Jupyter, nothing more and nothing less; 
>> I didn't leave out a single line. And yes, I understand your line of 
>> reasoning, which is why I was astonished too.
>> But I can't see what is making this huge difference, and I'd like to know :)
>>
>> Best,
>>
>> Ferran.
>>
>> On Thursday, July 21, 2016 at 6:31:57 PM UTC+2, Nathan Smith wrote:
>>>
>>> Hey Ferran, 
>>>
>>> You should be suspicious when your apparent speed-up surpasses the level 
>>> of parallelism available on your CPU. It looks like your codes don't 
>>> actually compute the same thing.
>>>
>>> I'm assuming you're trying to compute the matrix power A^1000000000 by 
>>> repeatedly multiplying by A. In your parallel code, each process gets a 
>>> local copy of 'z' and uses that, so each process computes something like 
>>> A^(1000000000 / # of procs), and none of those local results ever reach 
>>> the master's z. See the section on parallel map and loops 
>>> <http://docs.julialang.org/en/release-0.4/manual/parallel-computing/#parallel-map-and-loops>
>>> in the documentation to see what I mean; a reduction sketch follows below.
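>>>
>>> For example, a reducing @parallel (a sketch in Julia 0.4 syntax) combines 
>>> the per-process partial products, so the final z matches what the serial 
>>> loop computes, up to floating-point rounding:
>>>
>>> z = @parallel (*) for i in 1:1000000000
>>>     A    # each iteration contributes one factor of A to the product
>>> end
>>>
>>> Matrix multiplication is associative, so each worker can multiply its own 
>>> chunk locally and the partial products are then multiplied together.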
>>>
>>> That said, that doesn't completely explain your speed-up. You should also 
>>> make sure that each part of your script is wrapped in a function, and that 
>>> you 'warm up' each function by running it once before timing, so you are 
>>> not measuring compilation.
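>>>
>>> Something like this, as a sketch (matpow is just a hypothetical name):
>>>
>>> function matpow(A, n)
>>>     z = A
>>>     for i in 1:n
>>>         z *= A    # after n multiplications, z == A^(n+1)
>>>     end
>>>     return z
>>> end
>>>
>>> matpow(A, 10)                 # warm-up call to trigger compilation
>>> @time matpow(A, 1000000000)   # timing now excludes compile time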
>>>
>>> Cheers, 
>>> Nathan
>>>
>>> On Thursday, 21 July 2016 12:00:47 UTC-4, Ferran Mazzanti wrote:
>>>>
>>>> Hi,
>>>>
>>>> mostly showing my astonishment, but I can't even understand the figures 
>>>> in this stupid parallelization code:
>>>>
>>>> A = [[1.0 1.0001];[1.0002 1.0003]]
>>>> z = A
>>>> tic()
>>>> for i in 1:1000000000
>>>>     z *= A
>>>> end
>>>> toc()
>>>> A
>>>>
>>>> produces
>>>>
>>>> elapsed time: 105.458639263 seconds
>>>>
>>>> 2x2 Array{Float64,2}:
>>>>  1.0     1.0001
>>>>  1.0002  1.0003
>>>>
>>>>
>>>>
>>>> But then add @parallel in the for loop
>>>>
>>>> A = [[1.0 1.0001];[1.0002 1.0003]]
>>>> z = A
>>>> tic()
>>>> @parallel for i in 1:1000000000
>>>>     z *= A
>>>> end
>>>> toc()
>>>> A
>>>>
>>>> and get 
>>>>
>>>> elapsed time: 0.008912282 seconds
>>>>
>>>> 2x2 Array{Float64,2}:
>>>>  1.0     1.0001
>>>>  1.0002  1.0003
>>>>
>>>>
>>>> Look at the difference in elapsed times! And I'm running this on my Xeon 
>>>> desktop, not even on a cluster.
>>>> Of course A-B reports
>>>>
>>>> 2x2 Array{Float64,2}:
>>>>  0.0  0.0
>>>>  0.0  0.0
>>>>
>>>>
>>>> So is this what one should expect from this kind of simple 
>>>> parallelization? If so, I'm definitely *in love* with Julia :):):)
>>>>
>>>> Best,
>>>>
>>>> Ferran.
>>>>
>>>>
>>>>
