Calm down. It's very easy to match Octave here, by writing the standard library in C. Julia chose not to go that route. Of course everyone would like the concise vectorized syntax to be as fast as the for-loop syntax. If it were easy to do that *without* writing the standard library in C, it would've been done already. In the meantime it isn't easy and hasn't been done yet, so when someone asks how to write code that's faster than Octave, this is the answer.
There have been various LLVM bugs regarding automatic optimizations of converting small integer powers to multiplications. "if y == 2" code really shouldn't be necessary here, LLVM is smarter than that but sometimes fickle and buggy under the demands Julia puts on it. And in this case the answer is clearly a reduction for which Julia has a very good set of tools to express efficiently. If your output is much less data than your input, who cares how fast you can calculate temporaries that you don't need to save? On Monday, September 8, 2014 9:18:02 AM UTC-7, Jeff Waller wrote: > > > > On Monday, September 8, 2014 7:37:52 AM UTC-4, Mohammed El-Beltagy wrote: >> >> Octave seems to be doing a pretty good job for this type of calculation. >> If you want to squeeze a bit more performance out of Julia, you can try to >> use explicit loops (devectorize as in >> http://julialang.org/blog/2013/09/fast-numeric/). Then you might also >> remove bounds checking in a loop for faster performance. >> >> Try this function: >> function doCalc(x::Array{Float64,2}) >> XX=Array(Float64, 7000,7000) >> for j=1:7000,i=1:7000 >> @inbounds XX[i,j]=x[i,j]*x[i,j] >> end >> XX >> end >> >> Followed by >> @time XX=doCalc(X); >> > > > I am totally not cool with this. Just simply making things fast at > whatever the cost is not good enough. You can't tell someone that "Oh sure, > Julia does support that extremely convenient syntax, but because it's 6 > times slower (and it is), you need to essentially write C". There's no > reason that Julia for this should be any less fast than Octave except for > bugs which will eventually be fixed. I think it's perfectly ok to say > devectorize if you want to do even better, but it must be at least equal. > > Here's the implementation of .^ > > .^(x::StridedArray, y::Number) = reshape([ x[i] ^ y for i=1:length(x) ], > size(x)) > how is that made less vectorized? > > Yes R, Matlab, Octave do conflate vectorization for speed and style and > that's a mistake. Julia tries not to do that that's better. But do you > expect people to throw away the Matlab syntax? No way. > > for me: > > *julia> **x = rand(7000,7000);* > > *julia> **@time x.^2;* > > elapsed time: 1.340131226 seconds (392000256 bytes allocated) > > And in Octave.... > > >> x=rand(7000); > > >> tic;y=x.^2;toc > > Elapsed time is 0.201613 seconds. > > Comprehensions might be the culprit, or simply use of the generic x^y > here's an alternate implementation I threw together > > *julia> **function .^{T}(x::Array{T},y::Number)* > > * z = *(size(x)...)* > > * new_x = Array(T,z);* > > * tmp_x = reshape(x,z);* > > * for i in 1:z* > > * @inbounds if y == 2* > > * new_x[i] = tmp_x[i]*tmp_x[i]* > > * else* > > * new_x[i] = tmp_x[i]^y* > > * end* > > * end* > > * reshape(new_x,size(x))* > > *end* > > *.^ (generic function with 24 methods)* > > *julia> **@time x.^2;* > > elapsed time: 0.248816016 seconds (392000360 bytes allocated, 4.73% gc > time) > > Close.... > > >
