On 06/03/2012 20:23, Andreas wrote:
> On 03/06/2012 08:17 PM, Adrien wrote:
>> On 06/03/2012 19:19, Andreas Mueller wrote:
>>
>>> Hi Adrien.
>>> Thanks for the offer and thanks for converting people from the dark side ;)
>>>
>>> I'm not sure this is the way to go, though.
>>> There is already quite efficient SGD code in sklearn and this should
>>> probably be extended to handle the multi-class case.
>>> If you include a separate implementation, there will be a lot of code
>>> duplication and it will probably be non-trivial to get to the speed of
>>> the current implementation.
>> I agree with you.
>>
>> What I had in mind was just to, first, provide a simple, "stand-alone",
>> batch implementation of MLR for reference. Note that I didn't find any
>> in Python... Maybe someone else has?
> Well, I have several ;)
> Most of them are SGD, too.
>
> There is also Peter's bolt, which might be a good reference implementation.
> http://pprett.github.com/bolt/

I didn't know about that one and Google didn't find it for me (even with
maxent-related keywords). Thanks!

>> Like you mentioned, this batch version will not scale very well. One
>> reason for this is the optimization algorithm used (scipy's BFGS in my
>> case).
>>
>> From then on, however, it will be easy for the SGD masters to make the
>> stochastic version: it will just require re-using the function that
>> computes the negative log-likelihood and its gradient, and replacing
>> BFGS with SGD!
> Making the SGD code handle this case is more or less the only thing that
> requires any real work, in my opinion.

Ah! I didn't think it was actually an engineering problem. My bad ;-P

> Integrating the new loss functions with the current two-class loss
> functions and handling 2D weights is what adding multinomial logistic
> regression is about.
> The rest I can write down in <10 minutes ;)

OK, got it; I didn't have the full picture. Thanks for the clarification.
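For concreteness, here is roughly the kind of stand-alone batch sketch I
have in mind. A minimal, untested sketch: the function names and the L2
penalty "alpha" are placeholders of mine, and I go through
scipy.optimize.minimize with method="BFGS" rather than calling fmin_bfgs
directly.

import numpy as np
from scipy.optimize import minimize

def nll_and_grad(w, X, Y, alpha):
    """Penalized negative log-likelihood of multinomial logistic
    regression, and its gradient w.r.t. the flattened weights w."""
    n_features, n_classes = X.shape[1], Y.shape[1]
    W = w.reshape(n_features, n_classes)
    scores = np.dot(X, W)                        # (n_samples, n_classes)
    scores -= scores.max(axis=1)[:, np.newaxis]  # log-sum-exp shift for stability
    log_proba = scores - np.log(np.exp(scores).sum(axis=1))[:, np.newaxis]
    nll = -np.sum(Y * log_proba) + 0.5 * alpha * np.dot(w, w)
    grad = np.dot(X.T, np.exp(log_proba) - Y) + alpha * W
    return nll, grad.ravel()

def fit_batch_mlr(X, Y, alpha=1e-4):
    """Batch fit; Y is a one-hot (n_samples, n_classes) 0/1 matrix."""
    w0 = np.zeros(X.shape[1] * Y.shape[1])
    res = minimize(nll_and_grad, w0, args=(X, Y, alpha),
                   method="BFGS", jac=True)
    return res.x.reshape(X.shape[1], Y.shape[1])

The stochastic version would then re-use nll_and_grad (per sample or per
mini-batch) and take SGD steps instead of BFGS ones, exactly as described
above; prediction is just an argmax over np.dot(X, W).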
Btw, what approach do you consider regarding the problem of the 2D label
array? It seems tricky to integrate "cleanly" with the existing methods,
which take 1-D target values. This reminds me of the problems with the
precomputed kernel/affinity matrices... except on the "y" side this time.
(I sketch one possible route in the P.S. below.)

Anyway, IMHO, I still think it's worth having a separate module for batch
multinomial logistic regression. It's a popular method, it provides an
inherently multi-class classifier, some users have asked for it, and,
apparently, you already have an implementation, so it should be
straightforward (10 minutes... joking ;-)). Bonus: with kernels (I love
kernels)!

Furthermore, this could be a situation similar to SVMs: hinge loss can be
used with SGD, but liblinear/libsvm are also available in separate
modules. Users could then enjoy batch MLR until the stochastic version is
available, at which point everyone will switch to SGD, of course ;-).

What do you think?

Cheers,
Adrien
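P.S. Regarding the 2D label array question above: one way to keep the
usual 1-D fit(X, y) interface is to binarize the targets internally with
sklearn.preprocessing.LabelBinarizer, so the 2-D indicator matrix never
leaks into the public API. A minimal sketch (the variable names are mine):

import numpy as np
from sklearn.preprocessing import LabelBinarizer

y = np.array([0, 2, 1, 2, 0])     # the usual 1-D targets
lb = LabelBinarizer()
Y = lb.fit_transform(y)           # 2-D (n_samples, n_classes) 0/1 matrix
# Y == [[1, 0, 0],
#       [0, 0, 1],
#       [0, 1, 0],
#       [0, 0, 1],
#       [1, 0, 0]]
y_pred = lb.inverse_transform(Y)  # and back to 1-D labels for predict()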
