On Sunday, November 9, 2014, at 21:17 -0800, Todd Leo wrote:
> Hi fellows,
>
>
>
> I'm currently working on sparse matrices and cosine similarity
> computation, but my routines are running very slowly, at least not as
> fast as I expected. So I wrote some test functions to dig out the
> reason for the inefficiency. To my surprise, the execution times of
> passing two pre-extracted vectors to the test function and passing the
> whole sparse matrix differ greatly: the latter is 80x faster. I am
> wondering why extracting the two vectors from the matrix in each loop
> iteration is dramatically faster, and how to avoid the multi-GB memory
> allocation. Thanks, guys.
>
>
> --
> BEST REGARDS,
> Todd Leo
>
>
> # The sparse matrix
> mat # 2000x15037 SparseMatrixCSC{Float64, Int64}
>
>
> # The two vectors, prepared in advance
> v = mat'[:,1]
> w = mat'[:,2]
>
>
> # Cosine similarity function
> function cosine_vectorized(i::SparseMatrixCSC{Float64, Int64},
>                            j::SparseMatrixCSC{Float64, Int64})
>     return sum(i .* j) / sqrt(sum(i .* i) * sum(j .* j))
> end
I think you'll see a dramatic speed gain if you write the sums as
explicit loops, accessing the elements one by one, taking their product
and adding it immediately to an accumulator. In your current version,
each element-wise product (`i .* j`, `i .* i`, `j .* j`) allocates a new
temporary vector before the sum is computed, which is very costly.
This will also get rid of the difference you report between passing the
whole matrix and passing pre-extracted vectors.
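Here is a minimal sketch of what I mean, assuming your vectors are Nx1
sparse columns (`SparseMatrixCSC`) as in your snippet; the name
`cosine_devectorized` is mine. It walks the stored entries directly
through the `rowval`/`nzval` fields, so no temporaries are allocated:

```julia
# On Julia >= 0.7 the sparse types live in a stdlib module;
# on the 0.3-era version in this thread they are built in.
using SparseArrays

# Cosine similarity computed with explicit loops over stored entries
# only. Assumes v and w are Nx1 sparse columns, e.g. mat'[:,1].
function cosine_devectorized(v::SparseMatrixCSC{Float64,Int},
                             w::SparseMatrixCSC{Float64,Int})
    iv, xv = v.rowval, v.nzval   # row indices and values of v
    iw, xw = w.rowval, w.nzval   # row indices and values of w
    # Squared norms: one pass over each vector's stored values.
    nv = 0.0
    for x in xv
        nv += x * x
    end
    nw = 0.0
    for x in xw
        nw += x * x
    end
    # Dot product: merge-walk the two sorted index lists.
    dotvw = 0.0
    a, b = 1, 1
    while a <= length(iv) && b <= length(iw)
        if iv[a] == iw[b]
            dotvw += xv[a] * xw[b]
            a += 1
            b += 1
        elseif iv[a] < iw[b]
            a += 1
        else
            b += 1
        end
    end
    return dotvw / sqrt(nv * nw)
end
```

Since this only touches the stored entries, it does O(nnz) work and
zero allocation per call, regardless of how the arguments were obtained.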
Regards
> function test1(d)
>     res = 0.
>     for i in 1:10000
>         res = cosine_vectorized(d[:,1], d[:,2])
>     end
> end
>
>
> function test2(_v, _w)
>     res = 0.
>     for i in 1:10000
>         res = cosine_vectorized(_v, _w)
>     end
> end
>
>
> test1(mat)
> test2(v, w)
> gc()
> @time test1(mat)
> gc()
> @time test2(v, w)
>
>
> # elapsed time: 0.054925372 seconds (59360080 bytes allocated, 59.07% gc time)
>
> # elapsed time: 4.204132608 seconds (3684160080 bytes allocated, 65.51% gc time)
>