Yes, this is bad.  I think this stems from the bad habit to change a
range inside the getindex function into a vector.  Whilst this is ok-ish
for dense matrices its definitely no good for sparse matrices.  In this
case a [1:n] is made and on top of that compared to 1:m!

Redefining the called getindex function in base/sparse/sparsematrix.jl
to

function getindex{Tv,Ti}(A::SparseMatrixCSC{Tv,Ti}, I::AbstractVector, 
J::AbstractVector)
    m = size(A, 1)
    if I == 1:m
        return Base.SparseMatrix.getindex_cols(A, J)
    end
    
    if isa(I, Range) || isa(I, Range1); I = [I]; end

    if issorted(I)
        return getindex_I_sorted(A, I, J)
    else
        return getindex_general(A, I, J)
    end

end

improves speed for me by almost 100x from 2.5s to 0.029s.  I think for
getindex is now proportional to non-zeros.

Note above is just a quick hack.  For instance the issorted call is also
unnecessary if it is a range... probably should tidy this up.  Does
anyone know of a reference paper/implementation for this?

On Mon, 2014-03-17 at 23:57, [email protected] wrote:
> It seems to me that indexing sparse matrices, like
> m[i,:] is way too slow, even if one is retrieving columns.
> In fact, it seems to take time proportional to the dimension
> of the matrix, rather than to the number of non-zeros in the column.
>
> For example, consider the following timings:
>
> n = 10000;
> e = speye(n);
> x = 0;
> tic();
> for i = 1:n; x = x + sum(find(e[:,i])); end
> toc()
>
> elapsed time: 1.388211586 seconds
>
> n = 20000;
> e = speye(n);
> x = 0;
> tic();
> for i = 1:n; x = x + sum(find(e[:,i])); end
> toc()
>
> elapsed time: 6.076672088 seconds
>
> n = 40000;
> e = speye(n);
> x = 0;
> tic();
> for i = 1:n; x = x + sum(find(e[:,i])); end
> toc()
>
> elapsed time: 22.653318711 seconds
>
> Thanks,
>   Dan Spielman

-- 

Reply via email to