If they're sparse along dimension 1, you can at least save time computing the 
dot product of the two sparse vectors. But yes, the correlation matrix itself 
will be dense.
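
A rough sketch of that idea, assuming a current Julia 1.x with the SparseArrays,
LinearAlgebra and Statistics standard libraries (not the version used in the
session quoted below). The helper name sparse_cor is made up for illustration
and is not part of Base or any package. It forms the Gram matrix X'X with a
sparse product, then applies the mean and scale corrections to the small dense
p-by-p result, so the n-by-p input is never densified:

```julia
using SparseArrays, LinearAlgebra, Statistics

# Hypothetical helper (not a library function): Pearson correlation of the
# columns of a sparse matrix, computed without densifying the input.
function sparse_cor(X::SparseMatrixCSC)
    Xf = float(X)                          # avoid overflow for Int8 etc. values
    n  = size(Xf, 1)
    mu = vec(sum(Xf, dims=1)) ./ n         # column means
    G  = Matrix(Xf' * Xf)                  # sparse-sparse product; p x p result, stored dense
    C  = (G .- n .* (mu * mu')) ./ (n - 1) # covariance: (X'X - n*mu*mu') / (n - 1)
    s  = sqrt.(diag(C))                    # column standard deviations
    return C ./ (s * s')                   # NaN where a column has zero variance
end

# Quick check against the dense implementation on a small random matrix.
X = sprand(1000, 30, 0.01)
@assert isapprox(sparse_cor(X), cor(Matrix(X)); atol=1e-8)
```

The result is still a dense p-by-p matrix, so this saves time and memory on the
input side only, and subtracting n*mu*mu' can lose some precision when the
means are large relative to the variances.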

--Tim

On Monday, July 28, 2014 11:23:31 AM Jiahao Chen wrote:
> > I don't think sparse cor() is implemented, so it is falling back to the
> > dense implementation.
> Computing the correlation matrix is much like computing the outer
> product of two sparse vectors. There will be massive fill-in and I
> don't see how you can preserve sparsity without special knowledge
> about the sparsity pattern.
> Thanks,
> 
> Jiahao Chen
> Staff Research Scientist
> MIT Computer Science and Artificial Intelligence Laboratory
> 
> On Mon, Jul 28, 2014 at 11:12 AM, Stefan Karpinski <[email protected]> wrote:
> > https://github.com/JuliaLang/julia/issues/new
> > 
> > 
> > On Mon, Jul 28, 2014 at 10:06 AM, paul analyst <[email protected]> wrote:
> >> Should I file the issue on GitHub or on the julia-dev group?
> >> Paul
> >> 
> >> On Monday, July 28, 2014 12:05:27 UTC+2, Viral Shah wrote:
> >>> Please file an issue. I don't think sparse cor() is implemented, so it is
> >>> falling back to the dense implementation.
> >>> 
> >>> -viral
> >>> 
> >>> On Monday, July 28, 2014 1:41:55 PM UTC+5:30, paul analyst wrote:
> >>>> Computing the correlation of a sparse array is very slow, and a dense
> >>>> array runs out of memory with 30,000 columns. How can it be computed quickly?
> >>>> 
> >>>> julia> I=int32((rand(10^7)*9999999).+1);
> >>>> 
> >>>> julia> J=int32((rand(10^7)*29999).+1);
> >>>> 
> >>>> julia> V=int8((rand(10^7)*9).+1);
> >>>> 
> >>>> julia> D=sparse(I,J,V);
> >>>> 
> >>>> julia> @time cor(D[:,1:30]);
> >>>> elapsed time: 23.806328476 seconds (2458875228 bytes allocated, 0.14% gc time)
> >>>> 
> >>>> julia> @time cor(full(D[:,1:30]));
> >>>> elapsed time: 4.494099126 seconds (2732042496 bytes allocated, 5.31% gc time)
> >>>> 
> >>>> julia>
> >>>> 
> >>>> Paul
