Yes, there have been some hints of regressions in constructing SubArrays (see
https://github.com/JuliaLang/julia/issues/11787), but I haven't had the time
to do a bisect or anything.
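If anyone wants to poke at it in the meantime, something like this isolates
the construction cost (a rough sketch; the matrix size and iteration count
are arbitrary):

    A = rand(1000, 1000)
    function subcost(A, n)
        s = zero(eltype(A))
        for k in 1:n
            s += sub(A, :, 1)[1]  # builds a fresh SubArray every pass
        end
        s
    end
    subcost(A, 1)            # compile first
    @time subcost(A, 10^6)   # mostly measures SubArray construction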
--Tim
On Sunday, June 21, 2015 02:07:29 PM Douglas Bates wrote:
> After profiling some code that I will discuss in a lightning talk at
> JuliaCon, I found a bottleneck in an unexpected place. (I guess I should
> know by now that profiling always shows that the bottleneck is in an
> unexpected place - that's why we do it.)
>
> I have a matrix of matrices, some of which are dense and some sparse, for
> which I wish to create the lower Cholesky factor. This operation is done
> once for each evaluation of the objective function in a nonlinear
> optimization, and I want to be careful about creating copies. The chol!
> fallback method (whose argument is an AbstractMatrix) in
> base/linalg/cholesky.jl has operations like
>
> A[k,k] -= A[k,i]*A[k,i]'
>
> and
>
> A[i,k] -= A[i,j]*A[k,j]'*AkkInv'
>
> which, when the elements of A are matrices, I rewrote as
> downdate!(A[k,k],A[k,i]) etc.
>
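> The dense/dense two-argument case, downdate!(C,A) for C := C - A*A', can
> be little more than a BLAS.syrk! rank-k downdate. A minimal sketch of
> that form (for illustration only, not necessarily my exact method):
>
> function downdate!{T<:FloatingPoint}(C::DenseMatrix{T}, A::DenseMatrix{T})
>     size(C,1) == size(C,2) == size(A,1) || throw(DimensionMismatch(""))
>     BLAS.syrk!('L', 'N', -one(T), A, one(T), C)  # lower triangle only
>     C
> end
>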
> One of these methods is
>
> function downdate!{T<:FloatingPoint}(C::DenseMatrix{T}, A::DenseMatrix{T},
>                                      B::SparseMatrixCSC{T})
>     # computes C -= A*B' in place, iterating over the nonzeros of B
>     ma, na = size(A)
>     mb, nb = size(B)
>     ma == size(C,1) && mb == size(C,2) && na == nb ||
>         throw(DimensionMismatch(""))
>     rvb = rowvals(B); nzb = nonzeros(B)
>     for j in 1:nb
>         aj = sub(A, :, j)        # view of the j'th column of A
>         ib = nzrange(B, j)
>         rvbj = sub(rvb, ib)      # row indices of the nonzeros in B[:, j]
>         nzbj = sub(nzb, ib)      # values of the nonzeros in B[:, j]
>         for k in eachindex(ib)
>             BLAS.axpy!(-nzbj[k], aj, sub(C, :, Int(rvbj[k])))
>         end
>     end
>     C
> end
>
> The surprising bottleneck is the call to sub(C,:,Int(rvbj[k])) inside the
> call to BLAS.axpy!. Fortunately the BLAS.axpy! method signatures are
> flexible enough that I think I can work around this with some pointer
> arithmetic and pointer_to_array. At present the sub operation seems to
> take about 2/3 of the time attributed to the BLAS.axpy! call; that is,
> the linear algebra itself is much faster than figuring out which piece of
> memory to work on.
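>
> Roughly, the inner loop could then become something like this (an
> untested sketch; it leans on the BLAS.axpy! method that accepts raw
> pointers and strides, using pointer() directly so not even a
> pointer_to_array wrapper is needed):
>
> for j in 1:nb
>     pa = pointer(A, (j - 1)*ma + 1)           # first element of A[:, j]
>     for k in nzrange(B, j)
>         pc = pointer(C, (rvb[k] - 1)*ma + 1)  # first element of C[:, rvb[k]]
>         BLAS.axpy!(ma, -nzb[k], pa, 1, pc, 1) # C[:, rvb[k]] -= nzb[k]*A[:, j]
>     end
> end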