Try with:
x::Array{Float32,1} = rand(Float32,n)
y::Array{Float32,1} = rand(Float32,n)
s::Float64 = zero(Float64)
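For context, here is a self-contained sketch of how those annotations fit into the manual's `inner`/`timeit` example (the function names and loop structure follow the linked performance-tips page; the declarations go inside the function body, since typed declarations on globals are not supported in 0.4):

```julia
# Dot product with the @simd annotation, as in the manual's example.
function inner(x, y)
    s = zero(eltype(x))
    @simd for i = 1:length(x)
        @inbounds s += x[i] * y[i]
    end
    s
end

# Benchmark harness, with the explicit type annotations applied locally.
function timeit(n, reps)
    x::Array{Float32,1} = rand(Float32, n)
    y::Array{Float32,1} = rand(Float32, n)
    s::Float64 = zero(Float64)
    time = @elapsed for j in 1:reps
        s += inner(x, y)
    end
    println("GFlop (SIMD) = ", 2.0 * n * reps / time / 1e9)
end
```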
I believe this fixed a similar issue for me in Julia 0.4. The underlying
problem must have been fixed in 0.5-dev.
@code_typed is also very useful for diagnosing failures to vectorize. Check
for type instability, unexpected type promotion, overflow checks when
converting between number types, and non-inlined calls.
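As a quick check, you can inspect what the compiler generates for the loop; a minimal session might look like this (assuming the `inner` function from the manual's example):

```julia
# The function under inspection, same as the manual's @simd example.
function inner(x, y)
    s = zero(eltype(x))
    @simd for i = 1:length(x)
        @inbounds s += x[i] * y[i]
    end
    s
end

x = rand(Float32, 1000)
y = rand(Float32, 1000)

# Look for abstract types (Any, Union) in the inferred AST: a sign of
# type instability that blocks vectorization.
@code_typed inner(x, y)

# Look for vector types like <4 x float> or <8 x float> in the LLVM IR:
# their presence means the loop actually vectorized.
@code_llvm inner(x, y)
```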
I've been trying to write tests for this, but they keep failing on the
continuous integration machines:
https://github.com/JuliaLang/julia/issues/13686
On Thursday, 5 November 2015 15:12:22 UTC+1, DNF wrote:
>
> I have been looking through the performance tips section of the manual.
> Specifically, I am curious about @simd (
> http://docs.julialang.org/en/release-0.4/manual/performance-tips/#performance-annotations
> ).
>
> When I cut and paste the code demonstrating the @simd macro, I don't get
> substantial speedups. Before updating from OSX Yosemite to El Capitan, I
> saw no speedup whatsoever. After the update, there is a small speedup (I
> ran the example repeatedly):
>
> julia> timeit(1000,1000)
> GFlop = 1.2292170133468385
> GFlop (SIMD) = 1.5351220575547964
>
>
> This contrasts sharply with the example in the documentation, which shows a
> speedup from 1.95 GFlop to 17.6 GFlop.
>
> Does my computer not support SIMD? How can I tell?
>
> This is my versioninfo:
>
> Julia Version 0.4.0
> Commit 0ff703b* (2015-10-08 06:20 UTC)
> Platform Info:
> System: Darwin (x86_64-apple-darwin15.0.0)
> CPU: Intel(R) Core(TM) i7-4850HQ CPU @ 2.30GHz
> WORD_SIZE: 64
> BLAS: libopenblas (DYNAMIC_ARCH NO_AFFINITY Haswell)
> LAPACK: libopenblas
> LIBM: libopenlibm
> LLVM: libLLVM-3.3
>
>