Hi Erik,

that's great, thanks. I may have a hot inner loop where this could be very 
helpful. I'll have a closer look and come back with any questions later on, 
if that's ok. 


On Thursday, 13 October 2016 16:24:03 UTC+2, Erik Schnetter wrote:
> If you want to use the SIMD package, then you need to manually vectorize 
> the code. That is, all (most of) the local variables you're using will have 
> a SIMD `Vec` type. For convenience, your input and output arrays will 
> likely still hold scalar values, and the `vload` and `vstore` functions 
> access scalar arrays, reading/writing SIMD vectors. The function you quote 
> above (from the SIMD examples) does just this.
> What vector length `N` is best depends on the particular machine. Usually, 
> you would look at the CPU instruction set and choose the largest SIMD 
> vector size that the CPU supports, but sometimes twice that size or half 
> that size might also work well. Note that using a larger SIMD vector size 
> roughly corresponds to loop unrolling, which might be beneficial if the 
> compiler isn't clever enough to do this automatically.
> There's an additional complication if the array size is not a multiple of the 
> vector size. In this case, extending the array via dummy elements is often 
> the easiest way to go.
> Note that SIMD vectorization is purely a performance improvement. It does 
> not make sense to make such changes without measuring performance before 
> and after. Given the low-level nature of the changes, looking at the 
> generated assembler code via `@code_native` is usually also insightful.
> I'll be happy to help if you have a specific problem on which you're 
> working.
> -erik
> On Thu, Oct 13, 2016 at 9:51 AM, Florian Oswald <florian...@gmail.com> wrote:
>> ok thanks! and so I should define my SIMD-able function like
>> function vadd!{N,T}(xs::Vector{T}, ys::Vector{T}, ::Type{Vec{N,T}})
>>     @assert length(ys) == length(xs)
>>     @assert length(xs) % N == 0
>>     @inbounds for i in 1:N:length(xs)
>>         xv = vload(Vec{N,T}, xs, i)
>>         yv = vload(Vec{N,T}, ys, i)
>>         xv += yv
>>         vstore(xv, xs, i)
>>     end
>> end
>> i.e. using vload() and vstore() methods?
>> On Thursday, 13 October 2016 15:29:50 UTC+2, Valentin Churavy wrote:
>>> If you want explicit SIMD, the best way right now is the great SIMD.jl 
>>> package https://github.com/eschnett/SIMD.jl, which builds on top of 
>>> VecElement.
>>> In many cases we can perform automatic vectorisation, but you have to 
>>> start Julia with -O3
>>> On Thursday, 13 October 2016 22:15:00 UTC+9, Florian Oswald wrote:
>>>> i see on the docs 
>>>> http://docs.julialang.org/en/release-0.5/stdlib/simd-types/?highlight=SIMD 
>>>> that there is a VecElement that is built for SIMD support. I don't 
>>>> understand if as a user I should construct VecElement arrays and hope for 
>>>> some SIMD optimization? thanks.
> -- 
> Erik Schnetter <schn...@gmail.com>
> http://www.perimeterinstitute.ca/personal/eschnetter/
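
For reference, here is a rough sketch of how the `vadd!` example quoted above could cope with an array whose length is not a multiple of `N`. Note that it does not use the padding approach Erik suggests; instead it finishes the leftover elements with a plain scalar loop, which avoids changing the array. The name `vadd_tail!` is made up for this illustration, the code uses the current `where {N,T}` syntax rather than the 0.5-era `{N,T}` form in the thread, and it assumes the SIMD.jl package is installed. Whether it is actually faster than a plain loop would need to be measured on the machine in question.

```julia
using SIMD  # third-party package: https://github.com/eschnett/SIMD.jl

# Hypothetical variant of the vadd! example from the thread:
# adds ys to xs in place, N lanes at a time, then handles the
# remaining (fewer than N) elements one by one.
function vadd_tail!(xs::Vector{T}, ys::Vector{T}, ::Type{Vec{N,T}}) where {N,T}
    @assert length(ys) == length(xs)
    n = length(xs)
    nvec = n - n % N              # largest multiple of N not exceeding n
    @inbounds for i in 1:N:nvec
        xv = vload(Vec{N,T}, xs, i)   # read N scalars into one SIMD vector
        yv = vload(Vec{N,T}, ys, i)
        vstore(xv + yv, xs, i)        # write the N sums back to xs
    end
    @inbounds for i in (nvec + 1):n   # scalar tail, empty if n % N == 0
        xs[i] += ys[i]
    end
    return xs
end

xs = collect(1.0:10.0)   # length 10 is not a multiple of 4
ys = ones(10)
vadd_tail!(xs, ys, Vec{4,Float64})
```

With `N = 4` and ten elements, the first loop covers indices 1 to 8 in two vector steps and the scalar loop covers indices 9 and 10. Trying a different `N` only requires changing the last argument, e.g. `Vec{8,Float64}`.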
