On Wed, Mar 21, 2012 at 5:23 AM, Gael Varoquaux <[email protected]> wrote:
> My gut feeling about your project is that it is an interesting proposal,
> but ideally a GSOC project should be more ambitious than a single
> algorithm. You could consider a full application problem that the
> algorithm is trying to solve and contribute a few different algorithms.
> This is what Vlad did last year, with different matrix
> factorization/dictionary learning algorithms, and it was very successful.

If the online NMF and SGD-based matrix factorization proposals are merged
as I suggested before, I think it would make a decent GSOC project.
Besides, if two different students were to work on the two proposals in
parallel, I think there would be too much overlap.

One thing I would like to see is an option to choose the loss function. In
the general case, we can use the squared loss, but if the values/ratings
are binary, we can use the hinge loss and obtain maximum-margin matrix
factorization, and if the values/ratings are discrete, we can use ordinal
regression losses. Jason Rennie, who is following this list, did work on
both. [*]

Also, I would like the current SGD module for classification/regression
and the future SGD module for matrix factorization to share as much Cython
code as possible. After all, multivariate regression and multiclass
classification can be seen as matrix factorization problems (the same way
you need to solve multiple Lasso problems to do dictionary learning).

Mathieu

[*] http://people.csail.mit.edu/jrennie/papers/
    http://people.csail.mit.edu/jrennie/writing/
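To make the pluggable-loss idea concrete, here is a minimal NumPy sketch
of SGD matrix factorization where the loss is swapped by passing a
gradient function. This is purely illustrative, not scikit-learn API or
code; all names (sgd_mf, squared_loss_grad, hinge_loss_grad) are made up
for the example, and a real implementation would live in Cython as
discussed above.

```python
import numpy as np

def squared_loss_grad(pred, y):
    # Gradient of 0.5 * (pred - y)^2 with respect to pred.
    return pred - y

def hinge_loss_grad(pred, y):
    # Gradient of max(0, 1 - y * pred) with respect to pred;
    # y is assumed to be in {-1, +1} (binary ratings).
    return -y if y * pred < 1 else 0.0

def sgd_mf(rows, cols, vals, shape, loss_grad,
           n_components=2, n_epochs=300, lr=0.05, reg=0.001, seed=0):
    """Fit R ~= U V^T on the observed entries (rows[k], cols[k]) by SGD."""
    rng = np.random.RandomState(seed)
    U = 0.1 * rng.randn(shape[0], n_components)
    V = 0.1 * rng.randn(shape[1], n_components)
    for _ in range(n_epochs):
        for i, j, y in zip(rows, cols, vals):
            pred = U[i] @ V[j]
            g = loss_grad(pred, y)
            # Update both factors simultaneously from the old values.
            U[i], V[j] = (U[i] - lr * (g * V[j] + reg * U[i]),
                          V[j] - lr * (g * U[i] + reg * V[j]))
    return U, V

# Tiny usage example on a rank-1 matrix with the squared loss; switching
# to hinge_loss_grad on +/-1 values would give a maximum-margin variant.
rows, cols = [0, 0, 1, 1], [0, 1, 0, 1]
vals = [1.0, 2.0, 2.0, 4.0]
U, V = sgd_mf(rows, cols, vals, (2, 2), squared_loss_grad)
R = U @ V.T
```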
