On 06/03/2012 20:23, Andreas wrote:
> On 03/06/2012 08:17 PM, Adrien wrote:
>> On 06/03/2012 19:19, Andreas Mueller wrote:
>>
>>> Hi Adrien.
>>> Thanks for the offer and thanks for converting people from the dark side ;)
>>>
>>> I'm not sure this is the way to go, though.
>>> There is already quite efficient SGD code in sklearn and this should
>>> probably be extended to handle the multi-class case.
>>> If you include a separate implementation, there will be a lot of code
>>> duplication and it will probably be non-trivial to get to the speed of
>>> the current implementation.
>> I agree with you.
>>
>> What I had in mind was just to, first, provide a simple, "stand-alone",
>> batch implementation of MLR for reference. Note that I didn't find any
>> in Python... Maybe someone else has?
> Well, I have several ;)
> Most of them are SGD, too.
>
> There is also Peter's bolt, which might be a good reference implementation.
> http://pprett.github.com/bolt/

I didn't know about that one and Google didn't find it for me (even with
maxent-related keywords). Thanks!

>> Like you mentioned, this batch version will not scale very well. One
>> reason for this is the optimization algorithm used (scipy's BFGS in my
>> case).
>>
>> From then on, however, it will be easy for the SGD masters to make the
>> stochastic version: it will just require re-using the function that
>> computes the negative log-likelihood and its gradient, and replacing
>> BFGS with SGD!
> Making the SGD code handle this case is more or less the only thing that
> requires any real work, in my opinion.

Ah! I didn't think it was actually an engineering problem. My bad ;-P

> Integrating the new loss functions with the current two-class loss
> functions and handling 2D weights is what adding multinomial logistic
> regression is about.
> The rest I can write down in <10 minutes ;)

OK, got it; I didn't have the full picture. Thanks for the clarification.
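For concreteness, here is roughly the kind of stand-alone batch sketch I
have in mind. A minimal, untested sketch: the function names and the L2
penalty "alpha" are placeholders of mine, and I go through
scipy.optimize.minimize with method="BFGS" rather than calling fmin_bfgs
directly.

import numpy as np
from scipy.optimize import minimize

def nll_and_grad(w, X, Y, alpha):
    """Penalized negative log-likelihood of multinomial logistic
    regression, and its gradient w.r.t. the flattened weights w."""
    n_features, n_classes = X.shape[1], Y.shape[1]
    W = w.reshape(n_features, n_classes)
    scores = np.dot(X, W)                        # (n_samples, n_classes)
    scores -= scores.max(axis=1)[:, np.newaxis]  # log-sum-exp shift for stability
    log_proba = scores - np.log(np.exp(scores).sum(axis=1))[:, np.newaxis]
    nll = -np.sum(Y * log_proba) + 0.5 * alpha * np.dot(w, w)
    grad = np.dot(X.T, np.exp(log_proba) - Y) + alpha * W
    return nll, grad.ravel()

def fit_batch_mlr(X, Y, alpha=1e-4):
    """Batch fit; Y is a one-hot (n_samples, n_classes) 0/1 matrix."""
    w0 = np.zeros(X.shape[1] * Y.shape[1])
    res = minimize(nll_and_grad, w0, args=(X, Y, alpha),
                   method="BFGS", jac=True)
    return res.x.reshape(X.shape[1], Y.shape[1])

The stochastic version would then re-use nll_and_grad (per sample or per
mini-batch) and take SGD steps instead of BFGS ones, exactly as described
above; prediction is just an argmax over np.dot(X, W).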
Btw, what approach do you consider regarding the problem of the 2D label
array? It seems tricky to integrate "cleanly" with the existing methods,
which take 1-D target values. This reminds me of the problems with the
precomputed kernel/affinity matrices... except on the "y" side this time.
(I sketch one possible route in the P.S. below.)

Anyway, IMHO, I still think it's worth having a separate module for batch
multinomial logistic regression. It's a popular method, it provides an
inherently multi-class classifier, some users have asked for it, and,
apparently, you already have an implementation, so it should be
straightforward (10 minutes... joking ;-)). Bonus: with kernels (I love
kernels)!

Furthermore, this could be a situation similar to SVMs: hinge loss can be
used with SGD, but liblinear/libsvm are also available in separate
modules. Users could then enjoy batch MLR until the stochastic version is
available, at which point everyone will switch to SGD, of course ;-).

What do you think?

Cheers,
Adrien
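P.S. Regarding the 2D label array question above: one way to keep the
usual 1-D fit(X, y) interface is to binarize the targets internally with
sklearn.preprocessing.LabelBinarizer, so the 2-D indicator matrix never
leaks into the public API. A minimal sketch (the variable names are mine):

import numpy as np
from sklearn.preprocessing import LabelBinarizer

y = np.array([0, 2, 1, 2, 0])     # the usual 1-D targets
lb = LabelBinarizer()
Y = lb.fit_transform(y)           # 2-D (n_samples, n_classes) 0/1 matrix
# Y == [[1, 0, 0],
#       [0, 0, 1],
#       [0, 1, 0],
#       [0, 0, 1],
#       [1, 0, 0]]
y_pred = lb.inverse_transform(Y)  # and back to 1-D labels for predict()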
