I think the numbers are not added in the same order when SIMD is used: the 
compiler keeps several partial sums and combines them at the end. Floating 
point addition is not associative, so regrouping the terms gives slightly 
different answers.
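
A minimal sketch of why reassociation changes the result. `chunked_sum` below is a hypothetical helper that just mimics a 4-lane SIMD loop (four running accumulators combined at the end); it is not the actual code `@simd` generates, but it groups the same additions differently:

```julia
# Left-to-right sum: ((v[1] + v[2]) + v[3]) + ...
sequential_sum(v) = foldl(+, v)

# Mimics a 4-lane SIMD loop: four partial sums, combined only at the
# end -- a different grouping of the same additions.
function chunked_sum(v)
    acc = zeros(4)
    for i in eachindex(v)
        @inbounds acc[mod1(i, 4)] += v[i]
    end
    return sum(acc)
end

# Even three terms show the non-associativity:
(0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3)    # false

v = rand(10^6)
sequential_sum(v) - chunked_sum(v)          # tiny, but usually nonzero
```

The difference stays on the order of the rounding error of the summands, which matches the two results above agreeing to about 12 significant digits.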

On Thursday, October 13, 2016 at 9:14:31 AM UTC+2, DNF wrote:
>
> This is about twice as fast with @simd:
>
> function f2(a, p)
>     @assert length(a) == length(p)
>     s = 0.0
>     @simd for i in eachindex(a)
>         @inbounds s += abs((a[i] - p[i])/a[i])
>     end
>     return 100s/length(a)
> end
>
> julia> @benchmark f(a, p)
> BenchmarkTools.Trial:  
>   samples:          115 
>   evals/sample:     1 
>   time tolerance:   5.00% 
>   memory tolerance: 1.00% 
>   memory estimate:  144.00 bytes 
>   allocs estimate:  7 
>   minimum time:     41.96 ms (0.00% GC) 
>   median time:      42.53 ms (0.00% GC) 
>   mean time:        43.49 ms (0.00% GC) 
>   maximum time:     52.82 ms (0.00% GC) 
>
> julia> @benchmark f2(a, p) 
> BenchmarkTools.Trial:  
>   samples:          224 
>   evals/sample:     1 
>   time tolerance:   5.00% 
>   memory tolerance: 1.00% 
>   memory estimate:  0.00 bytes 
>   allocs estimate:  0 
>   minimum time:     21.08 ms (0.00% GC) 
>   median time:      21.86 ms (0.00% GC) 
>   mean time:        22.38 ms (0.00% GC) 
>   maximum time:     27.30 ms (0.00% GC)
>
>
> Weirdly, they give slightly different answers:
>
> julia> f(a, p)
> 781.4987197415827 
>
> julia> f2(a, p) 
> 781.498719741497
>
>
> I would like to know why that happens.
>
> On Friday, October 7, 2016 at 10:29:20 AM UTC+2, Martin Florek wrote:
>>
>> Thanks, Andrew, for the answer. 
>> In my experience eachindex() is also slightly faster. 
>> In the Performance Tips section I found macros such as @simd. Do you 
>> have any experience with them?
>>
