Regarding #1: unfortunately, the compiler does not hoist field loads from 
mutable (non-immutable) types out of loops. In a tight loop that accesses a 
field, hoisting the load manually can therefore make a significant difference. 
This is done a lot in the Base sparse code.
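
To make the point concrete, here is a minimal sketch of the two variants. The type `CSCLike` and the functions `coldot_fields` and `coldot_hoisted` are hypothetical names made up for this illustration (they are not the Base implementation), and the `mutable struct` syntax assumes Julia 0.6 or later. Both functions compute the same value; whether the hoisted version is measurably faster depends on the Julia version and on what the compiler can prove about the fields.

```julia
# Hypothetical mutable container holding CSC-style arrays (illustration only).
mutable struct CSCLike
    colptr::Vector{Int}
    rowval::Vector{Int}
    nzval::Vector{Float64}
end

# Fields are accessed inside the loop; unless the compiler can prove they do
# not change, A.rowval and A.nzval may be reloaded on every iteration.
function coldot_fields(A::CSCLike, B::Vector{Float64}, col::Int)
    s = 0.0
    @inbounds for k in A.colptr[col]:(A.colptr[col+1] - 1)
        s += B[A.rowval[k]] * A.nzval[k]
    end
    return s
end

# Field loads hoisted by hand into local variables before the loop.
function coldot_hoisted(A::CSCLike, B::Vector{Float64}, col::Int)
    colptr, rowval, nzval = A.colptr, A.rowval, A.nzval
    s = 0.0
    @inbounds for k in colptr[col]:(colptr[col+1] - 1)
        s += B[rowval[k]] * nzval[k]
    end
    return s
end

# The 2x2 matrix [1 0; 2 3] stored column by column.
A = CSCLike([1, 3, 4], [1, 2, 2], [1.0, 2.0, 3.0])
B = [10.0, 20.0]
```

Calling `coldot_fields(A, B, col)` and `coldot_hoisted(A, B, col)` for the same `col` should return identical results; only the placement of the field loads differs.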

On Monday, February 29, 2016 at 3:30:14 PM UTC+1, Eduardo Lenz wrote:
>
> I have two silly questions about the example code below.
>
> 1) There is a small but noticeable speedup in this code if I define the 
> aliases
>     colptr = A.colptr
>     rowval = A.rowval
>     nzval  = A.nzval
>
> 2) Also, I would not expect a speedup from the @simd macro since the loop
>
>  @simd for k=colptr[col]:(colptr[col+1]-1)
>             @inbounds soma += B[rowval[k]] * nzval[k]
>   end
>
> does not have unit stride, but the difference is also noticeable.
>
> As there is no mention of aliases in the Performance Tips, I would like to 
> know if there
> is a logical explanation for this.
>
>
> # Test code (Y = A'*B, A is sparse): Y[col] is the dot product of column col of A with B
> function Test( A::SparseMatrixCSC{Float64,Int64}, B::Array{Float64}, 
> Y::Array{Float64})
>
>     # Let's assume it is square
>     n = size(A,1)
>
>     # Local alias 
>     colptr = A.colptr
>     rowval = A.rowval
>     nzval  = A.nzval
>
>     #Loops
>     @inbounds for col = 1:n
>         s = 0.0   # accumulator is reassigned in the loop, so it must not be const
>         @simd for k=colptr[col]:(colptr[col+1]-1)
>             @inbounds s += B[rowval[k]] * nzval[k]
>         end
>         Y[col] = s
>     end
>
> end
>
