2013/8/29 Will Buckner <wbuck...@beatsmusic.com>:
>> the motivation for these lines is that even if X is sparse
>> safe_sparse_dot(W, H)
>> will not be. So you will allocate a matrix of size X but dense which is
>> unacceptable in many cases.
>
> Er, it looks like safe_sparse_dot() returns sparse unless dense_output=True.
> And, I'm confused as to how this would result in more memory. Aren't we
> allocating more in the lines above for the issparse(X) case? I'm stuck right
> now because my 40k x 220k CSR matrix can't make it past computing the
> reconstruction_err without a MemoryError--with 200GB of RAM free. Any ideas
> of how to reduce memory constraints of that calculation?
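
For the reconstruction error specifically, one way to avoid a dense
matrix of the size of X is to expand the squared Frobenius norm,
||X - WH||^2 = ||X||^2 - 2 <X, WH> + ||WH||^2, so that only n x k and
k x k intermediates are formed. A minimal sketch, assuming X is a SciPy
sparse matrix without duplicate entries and W, H are dense NumPy arrays;
the function name is made up for illustration and this is not the code
currently in scikit-learn:

```python
import numpy as np
import scipy.sparse as sp


def sparse_reconstruction_error(X, W, H):
    """Return ||X - W H||_F for sparse X without forming the dense W H.

    Hypothetical helper: X is (n, m) sparse, W is (n, k), H is (k, m).
    """
    X = sp.csr_matrix(X)
    W = np.asarray(W, dtype=float)
    H = np.asarray(H, dtype=float)

    # ||X||_F^2 only needs the stored values (assumes no duplicate entries).
    norm_X_sq = np.dot(X.data, X.data)

    # <X, W H> = sum((X H^T) * W); X H^T is a dense (n, k) array.
    XHt = np.asarray(X @ H.T)
    cross = np.sum(XHt * W)

    # ||W H||_F^2 = sum((W^T W) * (H H^T)); both factors are (k, k).
    norm_WH_sq = np.sum((W.T @ W) * (H @ H.T))

    # Clip at zero: rounding can make the difference slightly negative.
    err_sq = max(norm_X_sq - 2.0 * cross + norm_WH_sq, 0.0)
    return np.sqrt(err_sq)
```

Because the result is a difference of large numbers, it can lose
precision when the fit is very good, so treat small values with care.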

You probably need an online (aka out-of-core, streaming, incremental)
algorithm instead (e.g. SGD on the least squares reconstruction error
with positivity constraints possibly implemented as projections).
Mathieu Blondel knows even better algorithms but AFAIK his paper might
still be pending review so I am not sure he would like to speak about
it in more detail.
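
To make the SGD-with-projections idea concrete, here is a rough sketch.
It is only an illustration under several assumptions: the function name
and all parameters (learning rate, regularization, initialization) are
made up, only the stored entries of X are treated as observed, and the
pure-Python loop is far too slow for a 40k x 220k matrix. After each
gradient step the updated row of W and column of H are projected back
onto the non-negative orthant by clipping at zero.

```python
import numpy as np
import scipy.sparse as sp


def projected_sgd_nmf(X, n_components=10, n_epochs=20, lr=0.01,
                      reg=0.0, seed=0):
    """Hypothetical sketch: SGD on the squared error of stored entries,
    with a projection onto W >= 0, H >= 0 after every update."""
    X = sp.coo_matrix(X)
    rng = np.random.RandomState(seed)
    n_rows, n_cols = X.shape

    # Small non-negative random initialization (an arbitrary choice).
    scale = 1.0 / np.sqrt(n_components)
    W = rng.uniform(0.0, scale, size=(n_rows, n_components))
    H = rng.uniform(0.0, scale, size=(n_components, n_cols))

    for _ in range(n_epochs):
        # Visit the stored entries in a fresh random order each epoch.
        for idx in rng.permutation(X.nnz):
            i, j, x_ij = X.row[idx], X.col[idx], X.data[idx]
            w_i = W[i].copy()
            h_j = H[:, j].copy()
            err = x_ij - np.dot(w_i, h_j)

            # Gradient step on 0.5 * err^2 plus optional L2 penalty.
            w_new = w_i + lr * (err * h_j - reg * w_i)
            h_new = h_j + lr * (err * w_i - reg * h_j)

            # Projection step: clip negative values to zero.
            W[i] = np.maximum(w_new, 0.0)
            H[:, j] = np.maximum(h_new, 0.0)
    return W, H
```

Only one row of W and one column of H are touched per update, so memory
use stays at the size of the two factors plus the sparse data itself.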

Here is some sample code for partially observed (sparse) data that
implements matrix factorization without the positivity constraints:

https://github.com/scikit-learn/scikit-learn/pull/2387
http://code.google.com/p/pyrsvd/

If you have time it would be interesting to experiment with adding
positivity projections and report your results on this mailing list.

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
