Hi Will,

> if not sp.issparse(X):
>     self.reconstruction_err_ = norm(X - np.dot(W, H))
> else:
>     norm2X = np.sum(X.data ** 2)  # Ok because X is CSR
>     normWHT = np.trace(np.dot(np.dot(H.T, np.dot(W.T, W)), H))
>     cross_prod = np.trace(np.dot((X * H.T).T, W))
>     self.reconstruction_err_ = sqrt(norm2X + normWHT
>                                     - 2. * cross_prod)
>
> So, for a dense matrix X, this is relatively straight-forward. For a sparse
> matrix, this is a massively expensive operation, at least from a memory
> standpoint. Is there any reason we can't implement norm() for CSR, and just
> self.reconstruction_err_ = safe_sparse_norm(X - safe_sparse_dot(W, H))?

The motivation for these lines is that even if X is sparse,
safe_sparse_dot(W, H) will not be. You would therefore allocate a dense
matrix with the same shape as X, which is unacceptable in many cases.

> Additionally, is np.sum(X.data ** 2) a typo? Should it be np.sum(X.data *
> 2)? If not a typo, the variable seems misnamed and should be "normSquared",
> or something, not norm2X, right?

It is not a typo: np.sum(X.data ** 2) is the squared norm of X. norm2 is a
common name for the L2 norm. Indeed, I could have added "squared" to the
variable name.

> Surely the current approach could be done
> more memory-efficiently also, but a "sparse safe norm" sounds better...
>
> Additionally, is there any reason to use math.sqrt() as above instead of
> np.sqrt()?

Yes. math.sqrt is faster on scalar floats than np.sqrt, which is only
needed for arrays.

Let me know if you have any questions.

Best,
Alex

> I'm more-than-glad to fix this, but I'm hoping someone more familiar could
> give me a bit of direction :)
>
> Thanks!
>
> -Will
>
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
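For illustration, the sparse branch discussed in this thread can be sketched as a standalone function. It relies on the expansion ||X - WH||^2 = ||X||^2 + ||WH||^2 - 2 <X, WH>, so the dense product W H is never formed. This is a minimal sketch and not the scikit-learn implementation: the name sparse_reconstruction_error is hypothetical, and computing ||WH||^2 as sum((W.T W) * (H H.T)) is an assumed variation on the quoted code, chosen so that only k x k intermediates are built. It also assumes X is a canonical CSR matrix with no duplicate entries.

```python
import math

import numpy as np
import scipy.sparse as sp


def sparse_reconstruction_error(X, W, H):
    """Frobenius norm of X - W.dot(H) for sparse X (hypothetical helper).

    X: scipy.sparse CSR matrix, shape (n_samples, n_features)
    W: ndarray, shape (n_samples, k)
    H: ndarray, shape (k, n_features)
    """
    # ||X||^2: only the stored entries of X contribute.
    norm2_X = np.sum(X.data ** 2)
    # ||WH||^2 = trace(H.T W.T W H) = sum((W.T W) * (H H.T)); both are k x k.
    norm2_WH = np.sum(W.T.dot(W) * H.dot(H.T))
    # <X, WH> = trace((X H.T).T W) = sum((X H.T) * W); X H.T is n_samples x k.
    cross_prod = np.sum(X.dot(H.T) * W)
    # Clamp at zero so rounding error cannot produce a negative argument.
    squared = max(norm2_X + norm2_WH - 2.0 * cross_prod, 0.0)
    # The argument is a scalar, so math.sqrt is sufficient here.
    return math.sqrt(squared)


if __name__ == "__main__":
    rng = np.random.RandomState(0)
    X = sp.random(200, 100, density=0.05, format="csr", random_state=0)
    W = rng.rand(200, 5)
    H = rng.rand(5, 100)
    # Dense reference, affordable only because this example is small.
    dense_err = np.linalg.norm(X.toarray() - W.dot(H))
    print(sparse_reconstruction_error(X, W, H), dense_err)
```

The largest temporaries are X.dot(H.T), of shape (n_samples, k), and two k x k matrices, so memory use scales with the number of components rather than with n_samples * n_features.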