Thanks, Josh, for your feedback. I tried to replicate your experiment, but surprisingly this is what I got:

tf1 (vectorized): 0.00967 seconds
tf2 (loop): 0.01178 seconds
tf3 (comprehension): 0.00655 seconds
tf4 (map): 0.07086 seconds

I can understand the overhead introduced by the extra shaping calls in the vectorized operator's implementation, but the time for the plain loop version is just weird. Maybe it's a machine-specific thing?

Regardless, the comprehension is always faster than the other ones, so I suppose this should warrant the use of comprehensions in favour of other styles whenever possible? Though I personally think subtle performance "best practices" like this may be a bad thing for the language's users, especially package authors who work in other domains, because they will have to invest a lot more time in learning the language before they can write optimised Julia code.

On Friday, 19 June 2015 18:24:18 UTC+1, Josh Langsfeld wrote:
>
> My results don't show a significant performance advantage of the
> comprehension. Averaging over 1000 runs of a million-element array, I got:
>
> f1 (vectorized): 10.32 ms
> f2 (loop): 2.07 ms
> f3 (comprehension): 2.05 ms
> f4 (map): 38.09 ms
>
> Also, as you can see here (
> https://github.com/JuliaLang/julia/blob/master/base/arraymath.jl#L57),
> the .^ operator is implemented with a comprehension so I don't see why it
> is measurably slower. It does include a call to reshape, but I believe that
> it shares the data so that should be a negligible extra cost.
>
> On Friday, June 19, 2015 at 10:45:56 AM UTC-4, Xiubo Zhang wrote:
>>
>> Thanks for the reply.
>>
>> I am aware of the @time macro. Just that I thought tic() and toc() are
>> adequate for this case as I am not concerned with the memory side of things
>> at the moment. I have also read the performance section in the manual,
>> which led me to doing the benchmarks with functions rather than writing
>> expressions in the REPL.
>>
>>> Loops are faster than vectorized code in Julia, and a comprehension is
>>> essentially a loop.
>>>
>>
>> This is exactly what I was thinking before asking this question. I learnt
>> that de-vectorized loops should be faster than the vectorized version, but
>> wouldn't the developers of the language simply implement the ".^" in the
>> form of plain for loops to benefit from the better performance? Also, it
>> does not explain why the comprehension version is 3 to 4 times faster than
>> the for loop version.
>>
>> What did I miss?
>>
>> On Friday, 19 June 2015 15:29:15 UTC+1, Mauro wrote:
>>>
>>> Loops are faster than vectorized code in Julia, and a comprehension is
>>> essentially a loop. Also check out the convenient @time macro; it also
>>> reports memory allocation. Last, there is a performance section in the
>>> manual where a lot of this is explained. But do report back if you have
>>> more questions. Mauro
>>>
>>>
>>> On Fri, 2015-06-19 at 15:41, Xiubo Zhang <[email protected]> wrote:
>>> > I am rather new to Julia, so please do remind me if I missed anything
>>> > important.
>>> >
>>> > I was trying to write a function which would operate on the elements in an
>>> > array, and return an array. For the sake of simplicity, let's say
>>> > calculating the squares of an array of real numbers. I designed four
>>> > functions, each implementing the task using a different style:
>>> >
>>> > function tf1{T<:Real}(x::AbstractArray{T}) return r = x .^ 2 end # vectorised power operator
>>> > function tf2{T<:Real}(x::AbstractArray{T}) r = Array(T, length(x)); for i in 1:length(x) r[i] = x[i] ^ 2 end; return r end # plain for loop
>>> > function tf3{T<:Real}(x::AbstractArray{T}) return [i ^ 2 for i in x] end # array comprehension
>>> > function tf4{T<:Real}(x::AbstractArray{T}) return map(x -> x ^ 2, x) end # using the "map" function
>>> >
>>> > And I timed the operations with tic() and toc().
>>> > The results vary from run to run, but the following is a typical set of
>>> > results after warming up:
>>> >
>>> > tic(); tf1( 1:1000000 ); toc()
>>> > elapsed time: 0.011582169 seconds
>>> >
>>> > tic(); tf2( 1:1000000 ); toc()
>>> > elapsed time: 0.016566094 seconds
>>> >
>>> > tic(); tf3( 1:1000000 ); toc()
>>> > elapsed time: 0.004038817 seconds
>>> >
>>> > tic(); tf4( 1:1000000 ); toc()
>>> > elapsed time: 0.065989988 seconds
>>> >
>>> > I understand that the map function should run slower than the rest, but why
>>> > is the comprehension version so much faster than the vectorised "^"
>>> > operator? Does this mean array comprehensions should be used in favour of
>>> > all other styles whenever possible?
>>> >
>>> > P.S. version of Julia:
>>> >                _
>>> >    _       _ _(_)_     |  A fresh approach to technical computing
>>> >   (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
>>> >    _ _   _| |_  __ _   |  Type "help()" for help.
>>> >   | | | | | | |/ _` |  |
>>> >   | | |_| | | | (_| |  |  Version 0.3.9 (2015-05-30 11:24 UTC)
>>> >  _/ |\__'_|_|_|\__'_|  |  Official http://julialang.org/ release
>>> > |__/                   |  x86_64-w64-mingw32
>>> >
>>> > This is on a Windows 7 machine.
>>>
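
For anyone who wants to repeat the comparison, here is a minimal sketch of a timing harness that averages over many runs, in the spirit of the "averaging over 1000 runs" approach quoted above. It is written under assumptions that are not from the thread: it uses Julia 1.x syntax (the 0.3.9 forms `tf1{T<:Real}(...)`, `Array(T, n)` and `tic()`/`toc()` no longer exist there), and `mean_time` is a hypothetical helper name made up for this example. Because `.^` is implemented differently in current Julia, the numbers it prints are not directly comparable to the ones reported in this thread.

```julia
# Sketch for Julia 1.x (an assumption; the thread itself used Julia 0.3.9).

# The four styles from the thread, rewritten with current syntax.
tf1(x::AbstractArray{T}) where {T<:Real} = x .^ 2             # vectorised power operator

function tf2(x::AbstractArray{T}) where {T<:Real}             # plain for loop
    r = Vector{T}(undef, length(x))
    for i in 1:length(x)
        r[i] = x[i]^2
    end
    return r
end

tf3(x::AbstractArray{T}) where {T<:Real} = [i^2 for i in x]   # array comprehension
tf4(x::AbstractArray{T}) where {T<:Real} = map(v -> v^2, x)   # map with an anonymous function

# Hypothetical helper: mean wall-clock seconds per call over `runs` calls.
# One untimed warm-up call comes first so compilation is not counted.
function mean_time(f, x; runs = 1000)
    f(x)
    total = 0.0
    for _ in 1:runs
        total += @elapsed f(x)
    end
    return total / runs
end

x = 1:1_000_000
for (name, f) in (("tf1 (vectorised)", tf1), ("tf2 (loop)", tf2),
                  ("tf3 (comprehension)", tf3), ("tf4 (map)", tf4))
    println(rpad(name, 22), mean_time(f, x; runs = 100), " seconds")
end
```

Averaging like this smooths out the run-to-run variation that a single tic()/toc() pair is exposed to, though garbage collection pauses can still skew the mean; a dedicated benchmarking package would report minimum and median times as well.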
