If they're sparse along dimension 1, you can at least save time computing the dot products between pairs of sparse columns. But yes, the correlation matrix itself will be dense.
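For example, here is a minimal sketch of that idea (written in current SparseArrays syntax rather than the 0.3 syntax used below, and only an illustration, not what Base's cor() does; the helper name sparse_cor is made up for this example): the p-by-p cross-product A'A comes from a sparse-sparse multiply, and the dense correlation matrix is assembled afterwards from it and the column means.

using SparseArrays, LinearAlgebra

# Sketch only: Pearson correlation of the columns of a sparse matrix,
# reusing a sparse-sparse product for the cross terms.  The p-by-p result
# is dense, as noted above.
function sparse_cor(D::SparseMatrixCSC)
    A = Float64.(D)                          # work in Float64 (the example below stores Int8 values)
    n, p = size(A)
    mu = Vector(vec(sum(A, dims=1))) ./ n    # column means
    G  = Matrix(A' * A)                      # p-by-p Gram matrix via a sparse-sparse product
    C  = (G .- n .* (mu * mu')) ./ (n - 1)   # column covariance matrix (dense)
    sd = sqrt.(diag(C))                      # column standard deviations
    return C ./ (sd * sd')                   # correlation matrix, dense as expected
end

On the example below, sparse_cor(D[:,1:30]) should agree with cor(full(D[:,1:30])) up to roundoff, while only ever materializing the 30-by-30 result densely.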
--Tim

On Monday, July 28, 2014 11:23:31 AM Jiahao Chen wrote:
> > I don't think sparse cor() is implemented and is falling back to the dense
> > implementation.
>
> Computing the correlation matrix is much like computing the outer
> product of two sparse vectors. There will be massive fill-in, and I
> don't see how you can preserve sparsity without special knowledge
> about the sparsity pattern.
>
> Thanks,
>
> Jiahao Chen
> Staff Research Scientist
> MIT Computer Science and Artificial Intelligence Laboratory
>
> On Mon, Jul 28, 2014 at 11:12 AM, Stefan Karpinski <[email protected]> wrote:
> > https://github.com/JuliaLang/julia/issues/new
> >
> > On Mon, Jul 28, 2014 at 10:06 AM, paul analyst <[email protected]> wrote:
> >> Issue on github or on julia-dev groups?
> >> Paul
> >>
> >> On Monday, July 28, 2014 at 12:05:27 PM UTC+2, Viral Shah wrote:
> >>> Please file an issue. I don't think sparse cor() is implemented and is
> >>> falling back to the dense implementation.
> >>>
> >>> -viral
> >>>
> >>> On Monday, July 28, 2014 1:41:55 PM UTC+5:30, paul analyst wrote:
> >>>> Correlation on a sparse array is very slow, and on a dense array we run
> >>>> out of memory when we have 30,000 columns. How can it be computed quickly?
> >>>>
> >>>> julia> I=int32((rand(10^7)*9999999).+1);
> >>>>
> >>>> julia> J=int32((rand(10^7)*29999).+1);
> >>>>
> >>>> julia> V=int8((rand(10^7)*9).+1);
> >>>>
> >>>> julia> D=sparse(I,J,V);
> >>>>
> >>>> julia> @time cor(D[:,1:30]);
> >>>> elapsed time: 23.806328476 seconds (2458875228 bytes allocated, 0.14% gc time)
> >>>>
> >>>> julia> @time cor(full(D[:,1:30]));
> >>>> elapsed time: 4.494099126 seconds (2732042496 bytes allocated, 5.31% gc time)
> >>>>
> >>>> julia>
> >>>>
> >>>> Paul
