That seems right:

julia> f(a, p)
781.4987197415827
julia> f(reverse(a), reverse(p))
781.4987197415213

But I'm pretty surprised the effect is that big. (A small sketch of the reordering effect is at the bottom of this message.)

On Thursday, October 13, 2016 at 9:49:00 AM UTC+2, Kristoffer Carlsson wrote:
>
> I think you will not add the numbers in the same order when SIMD is used.
> Floating point addition is not associative, so you get slightly different
> answers.
>
> On Thursday, October 13, 2016 at 9:14:31 AM UTC+2, DNF wrote:
>>
>> This is about twice as fast with @simd:
>>
>> function f2(a, p)
>>     @assert length(a) == length(p)
>>     s = 0.0
>>     @simd for i in eachindex(a)
>>         @inbounds s += abs((a[i] - p[i])/a[i])
>>     end
>>     return 100s/length(a)
>> end
>>
>> julia> @benchmark f(a, p)
>> BenchmarkTools.Trial:
>>   samples:          115
>>   evals/sample:     1
>>   time tolerance:   5.00%
>>   memory tolerance: 1.00%
>>   memory estimate:  144.00 bytes
>>   allocs estimate:  7
>>   minimum time:     41.96 ms (0.00% GC)
>>   median time:      42.53 ms (0.00% GC)
>>   mean time:        43.49 ms (0.00% GC)
>>   maximum time:     52.82 ms (0.00% GC)
>>
>> julia> @benchmark f2(a, p)
>> BenchmarkTools.Trial:
>>   samples:          224
>>   evals/sample:     1
>>   time tolerance:   5.00%
>>   memory tolerance: 1.00%
>>   memory estimate:  0.00 bytes
>>   allocs estimate:  0
>>   minimum time:     21.08 ms (0.00% GC)
>>   median time:      21.86 ms (0.00% GC)
>>   mean time:        22.38 ms (0.00% GC)
>>   maximum time:     27.30 ms (0.00% GC)
>>
>> Weirdly, they give slightly different answers:
>>
>> julia> f(a, p)
>> 781.4987197415827
>>
>> julia> f2(a, p)
>> 781.498719741497
>>
>> I would like to know why that happens.
>>
>> On Friday, October 7, 2016 at 10:29:20 AM UTC+2, Martin Florek wrote:
>>>
>>> Thanks, Andrew, for the answer.
>>> I have also found that eachindex() is slightly faster.
>>> In the Performance tips I found macros such as @simd. Do you have any
>>> experience with them?
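
P.S. For anyone wondering why the last digits move at all: the sketch below is only my guess at what a 4-wide SIMD reduction does, namely summing into several independent accumulators and combining them at the end. It is not what LLVM literally emits, and the data is just rand(10^6), but it shows how re-associating the very same additions changes the result slightly.

# Plain left-to-right summation, the order a scalar loop uses.
function sum_sequential(x)
    s = 0.0
    for v in x
        s += v
    end
    return s
end

# A rough stand-in for a 4-wide SIMD reduction: four independent
# accumulators over the main part of the array, combined at the end,
# plus a scalar loop for any leftover elements.
function sum_chunked(x)
    acc = zeros(4)
    n = length(x) - rem(length(x), 4)
    for i in 1:4:n
        for k in 0:3
            acc[k + 1] += x[i + k]
        end
    end
    s = sum(acc)
    for i in n+1:length(x)
        s += x[i]
    end
    return s
end

x = rand(10^6)
sum_sequential(x) - sum_chunked(x)   # typically small but nonzero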