On Sunday, March 15, 2015 at 4:31:56 PM UTC+10, Kyle Barbary wrote:
>
> Lex is right that the devectorized form [xi^2 for xi in x] suffers from 
> not being in a function. However, the x.^2 form is already simply a 
> function call, so it shouldn’t benefit much from being wrapped in a 
> function. 
>
Yes, the 40X was for `[xi^2 for xi in x]`, but `x.^2` still showed a 5X 
improvement inside a function on 0.3.6. The `.^` function is written in 
Julia, so it still suffers from the cost of accessing globals.

Perhaps I should have made it clear that `n`, `x` and `y` were also moved 
into the function so they are not global.
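For reference, the shape of what I ran was roughly this (a sketch from memory; the function name is arbitrary):

```julia
using TimeIt

function bench()
    # n, x and y are now locals, so the compiler can infer concrete
    # types for them instead of treating them as untyped globals
    n = 10000
    x = linspace(0.0, 1.0, n)
    y = zeros(x)
    @timeit y = x.^2                # vectorized
    @timeit y = [xi^2 for xi in x]  # devectorized comprehension
end

bench()
```

Everything else is the same as Dallas's original script; only the global-vs-local scoping changes.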

Cheers
Lex 

> Note that one of the other performance tips in Julia v0.3 is that you 
> should use x*x instead of x.^2 (I think this will not be the case in 
> v0.4). On Julia v0.3.6:
>
> julia> x = rand(10000);
>
> julia> @timeit x.^2
>
> 1000 loops, best of 3: 100.46 µs per loop
>
> julia> @timeit x.*x
> 10000 loops, best of 3: 30.15 µs per loop
>
> I get about the same timing as the second version when writing a custom 
> function with @simd and @inbounds.
>
> In Python:
>
> In [19]: from numpy.random import rand
>
> In [20]: x = rand(10000);
>
> In [21]: %timeit x**2
> 100000 loops, best of 3: 4.47 µs per loop
>
> So, I’m still seeing a difference of about a factor of 6 between numpy and 
> Julia here (but found that the difference does depend on array size and is 
> generally less). I’m curious what the difference is caused by in this case. 
> Alex’s email shows that a lot of the time is due to allocating the result 
> array (his x2! and x2simd! functions don’t include the allocation), but I 
> think the above is a “fair” comparison, as Python allocates an array to 
> perform x**2 as well.
>
> Kyle
>
> On Sat, Mar 14, 2015 at 9:24 PM, <[email protected]> wrote:
>
>> Read 
>> http://docs.julialang.org/en/latest/manual/performance-tips/?highlight=performance#performance-tips
>>  
>> in particular the first one avoiding global variables.  I get up to 40 
>> times the performance by putting your code in a function.
>>
>> Cheers
>> Lex
>>
>>
>> On Sunday, March 15, 2015 at 2:10:29 PM UTC+10, Dallas Morisette wrote:
>>>
>>> I am very new to Julia. I'm working on adding some features to a fairly 
>>> simple Fortran simulation, and decided to try writing it in Python to make 
>>> it easier to explore variations. After a lot of optimization work I got it 
>>> within about 8x slower than the Fortran code. I had read about Julia and 
>>> had wanted a reason to try it, so I thought I'd see if I could get closer 
>>> to Fortran speeds in Julia. My initial results were depressingly slow 
>>> (140x slower than Fortran and 17x slower than Python) and before trying to 
>>> optimize it I tried some very simple benchmarks to try to understand how to 
>>> get good performance from Julia. One I tried was two different versions of 
>>> squaring each element of a 10,000 element array, one vectorized, and one 
>>> for-loop. I fully expected there to be a large difference in performance of 
>>> Python between the two, but I didn't expect Julia to be slower than Python 
>>> in BOTH cases. I also expected the for-loop and vectorized versions to be 
>>> similar, if not the for loop to be faster, given what I'd read 
>>> about devectorizing Julia code. 
>>>
>>> I'm sure I'm doing something wrong, but can someone point out what?
>>>
>>> Here are the results
>>>
>>> # Python Version
>>> import numpy as np
>>> n = 10000
>>> x = np.linspace(0.0,1.0,n)
>>> y = np.zeros_like(x)
>>> %timeit y = x**2
>>> %timeit y = [xi**2 for xi in x]
>>> 100000 loops, best of 3: 5.42 µs per loop
>>> 100 loops, best of 3: 3.19 ms per loop
>>>
>>>
>>>
>>> # Julia Version
>>> using TimeIt
>>> n = 10000
>>> x = linspace(0.0,1.0,n)
>>> y = zeros(x)
>>> @timeit y = x.^2
>>> @timeit y = [xi^2 for xi in x]
>>> 1000 loops, best of 3: 433.29 µs per loop
>>> 100 loops, best of 3: 8.57 ms per loop
>>>
>>>
>>> Thanks!
>>>
>>> Dallas Morisette
>>>
>>>
>