On Sunday, March 15, 2015 at 4:31:56 PM UTC+10, Kyle Barbary wrote:
>
> Lex is right that the devectorized form [xi^2 for xi in x] suffers from
> not being in a function. However, the x.^2 form is already simply a
> function call, so it shouldn't benefit much from being wrapped in a
> function.
>

Yes, the 40X was for the `[xi^2 for xi in x]` form, but the `x.^2` form still showed a 5X improvement inside a function on 0.3.6. The `.^` operation is itself written in Julia, so it still pays the cost of accessing non-const globals.
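For readers more familiar with Python, here is a rough sketch of the same restructuring (names are illustrative, not from the thread). In Julia v0.3 the argument-passing version is the one that recovers the 5X/40X, because non-const globals defeat type inference; in Python the gap is far smaller, but the shape of the fix is the same:

```python
import numpy as np

n = 10000
x = np.linspace(0.0, 1.0, n)

def square_global():
    # Reads the module-level `x` on every call; the Julia v0.3
    # analogue of this (a non-const global) defeats type inference.
    return x ** 2

def square_arg(arr):
    # Everything the function needs comes in as an argument,
    # mirroring "putting your code in a function" from the thread.
    return arr ** 2

print(np.allclose(square_global(), square_arg(x)))  # True
```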
Perhaps I should have made it clear that `n`, `x`, and `y` were also moved into the function, so they are no longer globals.

Cheers
Lex

> Note that one of the other performance tips in Julia v0.3 is that you
> should use x.*x instead of x.^2 (I think this will not be the case in
> v0.4). On Julia v0.3.6:
>
> julia> x = rand(10000);
>
> julia> @timeit x.^2
> 1000 loops, best of 3: 100.46 µs per loop
>
> julia> @timeit x.*x
> 10000 loops, best of 3: 30.15 µs per loop
>
> I get about the same timing as the second version when writing a custom
> function with @simd and @inbounds.
>
> In Python:
>
> In [19]: from numpy.random import rand
>
> In [20]: x = rand(10000);
>
> In [21]: %timeit x**2
> 100000 loops, best of 3: 4.47 µs per loop
>
> So, I'm still seeing a difference of about a factor of 6 between numpy and
> Julia here (but found that the difference does depend on array size and is
> generally less). I'm curious what the difference is caused by in this case.
> Alex's email shows that a lot of the time is due to allocating the result
> array (his x2! and x2simd! functions don't include the allocation), but I
> think the above is a "fair" comparison, as Python allocates an array to
> perform x**2 as well.
>
> Kyle
>
> On Sat, Mar 14, 2015 at 9:24 PM, <[email protected]> wrote:
>
>> Read
>> http://docs.julialang.org/en/latest/manual/performance-tips/?highlight=performance#performance-tips
>> in particular the first one, avoiding global variables. I get up to 40
>> times the performance by putting your code in a function.
>>
>> Cheers
>> Lex
>>
>> On Sunday, March 15, 2015 at 2:10:29 PM UTC+10, Dallas Morisette wrote:
>>>
>>> I am very new to Julia. I'm working on adding some features to a fairly
>>> simple Fortran simulation, and decided to try writing it in Python to
>>> make it easier to explore variations. After a lot of optimization work I
>>> got it to within about 8x of the Fortran code's speed.
>>> I had read about Julia and had wanted a reason to try it, so I thought
>>> I'd see if I could get closer to Fortran speeds in Julia. My initial
>>> results were depressingly slow (140x slower than Fortran and 17x slower
>>> than Python), and before trying to optimize I ran some very simple
>>> benchmarks to understand how to get good performance from Julia. One was
>>> two different versions of squaring each element of a 10,000-element
>>> array: one vectorized, and one a for loop. I fully expected a large
>>> performance difference between the two in Python, but I didn't expect
>>> Julia to be slower than Python in BOTH cases. I also expected the for
>>> loop and vectorized versions to be similar, if not the for loop faster,
>>> given what I'd read about devectorizing Julia code.
>>>
>>> I'm sure I'm doing something wrong, but can someone point out what?
>>>
>>> Here are the results:
>>>
>>> # Python Version
>>> import numpy as np
>>> n = 10000
>>> x = np.linspace(0.0, 1.0, n)
>>> y = np.zeros_like(x)
>>> %timeit y = x**2
>>> %timeit y = [xi**2 for xi in x]
>>> 100000 loops, best of 3: 5.42 µs per loop
>>> 100 loops, best of 3: 3.19 ms per loop
>>>
>>> # Julia Version
>>> using TimeIt
>>> n = 10000
>>> x = linspace(0.0, 1.0, n)
>>> y = zeros(x)
>>> @timeit y = x.^2
>>> @timeit y = [xi^2 for xi in x]
>>> 1000 loops, best of 3: 433.29 µs per loop
>>> 100 loops, best of 3: 8.57 ms per loop
>>>
>>> Thanks!
>>>
>>> Dallas Morisette
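For anyone reproducing Dallas's benchmark, here is a runnable Python version of the two forms, just confirming they compute the same thing (the timings quoted above already show the comprehension is orders of magnitude slower):

```python
import numpy as np

n = 10000
x = np.linspace(0.0, 1.0, n)

y_vec = x ** 2                             # vectorized: one C-level loop
y_loop = np.array([xi ** 2 for xi in x])   # pure-Python loop over elements

assert np.allclose(y_vec, y_loop)
```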
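On Kyle's allocation point (the in-place x2! and x2simd! versions exclude the cost of allocating the result array), numpy can express the same distinction via the `out=` argument of its ufuncs; a minimal sketch, with illustrative names:

```python
import numpy as np

x = np.random.rand(10000)
y = np.empty_like(x)

out_of_place = x * x        # allocates a fresh result array, like x.^2
np.multiply(x, x, out=y)    # writes into preallocated y, like the x2! versions

assert np.allclose(out_of_place, y)
```

Timing the second form in isolation is the numpy analogue of benchmarking x2! without the allocation.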
