Peter implemented "penalized SVD" with SGD for "Netflix challenge"-style matrix factorization problems:
http://code.google.com/p/pyrsvd/

It should be a pretty good baseline to compare performance against.

As for missing data, I would just use scipy.sparse matrices and treat non-materialized zeros as missing data, for the sake of memory efficiency and API simplicity.

-- 
Olivier

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
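PS: to make the "non-materialized zeros are missing" convention concrete, here is a rough sketch of SGD matrix factorization with an L2 penalty that only ever visits the entries stored in a scipy.sparse matrix. It is not the pyrsvd API. The function name sgd_factorize, its parameters (n_factors, lr, reg, n_epochs, seed) and the toy ratings are all made up for illustration.

```python
import numpy as np
import scipy.sparse as sp


def sgd_factorize(R, n_factors=2, lr=0.02, reg=0.01, n_epochs=500, seed=0):
    """Fit R ~= U @ V.T on the stored entries of a sparse matrix R.

    Hypothetical helper for illustration only; not taken from pyrsvd.
    Entries that are not materialized in R are treated as missing,
    so they never contribute to the loss.
    """
    R = sp.coo_matrix(R)
    rng = np.random.default_rng(seed)
    U = rng.normal(scale=0.1, size=(R.shape[0], n_factors))
    V = rng.normal(scale=0.1, size=(R.shape[1], n_factors))

    for _ in range(n_epochs):
        # Visit the observed (stored) entries in random order.
        for idx in rng.permutation(R.nnz):
            i, j, r = R.row[idx], R.col[idx], R.data[idx]
            err = r - U[i] @ V[j]
            u_old = U[i].copy()
            # Gradient step on squared error plus L2 penalty.
            U[i] += lr * (err * V[j] - reg * U[i])
            V[j] += lr * (err * u_old - reg * V[j])
    return U, V


# Toy 4 x 3 "ratings" matrix with 7 observed entries; the rest are missing.
rows = np.array([0, 0, 1, 1, 2, 2, 3])
cols = np.array([0, 1, 0, 2, 1, 2, 0])
vals = np.array([5.0, 3.0, 4.0, 1.0, 1.0, 5.0, 2.0])
R = sp.coo_matrix((vals, (rows, cols)), shape=(4, 3))

U, V = sgd_factorize(R)

# Training RMSE, computed on the observed entries only.
pred = np.sum(U[R.row] * V[R.col], axis=1)
rmse = np.sqrt(np.mean((R.data - pred) ** 2))
print("train RMSE on observed entries:", rmse)
```

The trade-off of this convention is that a genuinely observed value of zero has to be stored explicitly in the sparse structure, otherwise it is indistinguishable from a missing entry.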
