Thanks, Kristoffer!
I was playing with this code in order to understand some optimizations and
I was not expecting a major improvement. But in the end, we are always
learning something new :0)
Regarding #1: should it be discussed in the Performance Tips section of the manual?
Regarding #2: I will look at @code_native to better understand those SIMD
operations.
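For reference, a minimal sketch of that kind of inspection (the `dotsum` function here is a made-up stand-in with the same reduce-over-products loop shape; the exact output of these macros differs between Julia versions):

```julia
# Hypothetical stand-in with the same loop shape as the inner loop of Test.
function dotsum(x::Vector{Float64}, y::Vector{Float64})
    s = 0.0
    @inbounds @simd for i in eachindex(x)
        s += x[i] * y[i]
    end
    return s
end

# Look for a "vector.body" block in the LLVM IR ...
code_llvm(dotsum, Tuple{Vector{Float64}, Vector{Float64}})
# ... and for packed instructions (e.g. vmulpd/vaddpd on x86) in the assembly.
code_native(dotsum, Tuple{Vector{Float64}, Vector{Float64}})
```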
I also should have said that I am using version 0.4.3 with LLVM 3.3.
Thanks for your assistance,
Eduardo.
On Monday, February 29, 2016 at 2:11:09 PM UTC-3, Kristoffer Carlsson wrote:
>
> Regarding #2.
>
> Looking at @code_llvm Test(A,B,C) we can see that a vector block is being
> generated, and @code_native shows packed/vector instructions, so apparently
> the compiler somehow manages to get some vectorization going...
>
> On Monday, February 29, 2016 at 3:30:14 PM UTC+1, Eduardo Lenz wrote:
>>
>> I have two silly questions about the example code below.
>>
>> 1) There is a small but noticeable speed-up in this code if I define the
>> aliases
>> colptr = A.colptr
>> rowval = A.rowval
>> nzval = A.nzval
>>
>> 2) Also, I would not expect a speed-up from the @simd macro, since the loop
>>
>> @simd for k = colptr[col]:(colptr[col+1]-1)
>>     @inbounds s += B[rowval[k]] * nzval[k]
>> end
>>
>> does not have a unit stride, but the difference is also noticeable.
>>
>> As there is no mention of aliasing in the Performance Tips, I would like to
>> know if there is a logical explanation for this.
>>
>>
>> # Test code: Y[col] = dot(A[:,col], B), i.e. Y = A'*B (equals A*B when A is symmetric)
>> function Test( A::SparseMatrixCSC{Float64,Int64}, B::Array{Float64},
>>                Y::Array{Float64})
>>
>>     # Let's assume it is square
>>     n = size(A,1)
>>
>>     # Local aliases
>>     colptr = A.colptr
>>     rowval = A.rowval
>>     nzval  = A.nzval
>>
>>     # Loops
>>     @inbounds for col = 1:n
>>         s = 0.0
>>         @simd for k = colptr[col]:(colptr[col+1]-1)
>>             @inbounds s += B[rowval[k]] * nzval[k]
>>         end
>>         Y[col] = s
>>     end
>>
>> end
>>
>
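On the aliasing question (#1), one way to see the effect is to time a variant that re-reads the fields through `A` on every access against the aliased version. A hedged sketch follows; the names `TestNoAlias`/`TestAlias`, the matrix size, and the density are made up for illustration, and `@time` is used rather than a proper benchmarking package, so take single runs with a grain of salt:

```julia
using SparseArrays  # needed on Julia >= 0.7; sparse support is built in on 0.4

# Variant without aliases: A.colptr, A.rowval, A.nzval are re-read in the loop.
function TestNoAlias(A::SparseMatrixCSC{Float64,Int64}, B::Vector{Float64},
                     Y::Vector{Float64})
    n = size(A, 1)
    @inbounds for col = 1:n
        s = 0.0
        @simd for k = A.colptr[col]:(A.colptr[col+1]-1)
            s += B[A.rowval[k]] * A.nzval[k]
        end
        Y[col] = s
    end
    return Y
end

# Variant with aliases: hoist the field loads out of the loops once.
function TestAlias(A::SparseMatrixCSC{Float64,Int64}, B::Vector{Float64},
                   Y::Vector{Float64})
    n = size(A, 1)
    colptr, rowval, nzval = A.colptr, A.rowval, A.nzval
    @inbounds for col = 1:n
        s = 0.0
        @simd for k = colptr[col]:(colptr[col+1]-1)
            s += B[rowval[k]] * nzval[k]
        end
        Y[col] = s
    end
    return Y
end

A = sprand(2000, 2000, 0.01)          # random sparse test matrix
B = rand(2000); Y = zeros(2000)
TestNoAlias(A, B, Y); TestAlias(A, B, Y)   # warm up (compile) both variants
@time TestNoAlias(A, B, Y)
@time TestAlias(A, B, Y)
```

Note that both variants compute Y = A'*B (each Y[col] is the dot product of column col of A with B), which coincides with A*B only for symmetric A.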