Re: [Scikit-learn-general] LogisticRegression versus SGDClassifier(loss="log")?

Peter Prettenhofer Sun, 17 Jun 2012 23:49:32 -0700

2012/6/15 Fred Mailhot <[email protected]>:
> Thanks for the prompt reply, Peter. I may be in a situation that will call
> for SGDClassifier, so I have two follow-up questions:
>
> 1) I'd like to compute the class probs; are the probs for the individual OvR
> classifiers (easily) accessible? My intuition is that I can compute these
> from the returned vals from decision_function(), then do the normalization
> afterward...


Correct, you can get the class probability for each OvR classifier via
decision_function (here ``clf`` is the fitted SGDClassifier)::

    P = 1.0 / (1.0 + np.exp(-clf.decision_function(X)))
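
A minimal sketch of the "normalize afterward" step, assuming the scores
come from ``decision_function`` on a multiclass problem, so that they form
an array of shape (n_samples, n_classes) with one column per OvR
classifier. The helper name ``ovr_probabilities`` and the example scores
are made up for illustration; in the binary case ``decision_function``
returns a 1-D array and this row normalization does not apply.

```python
import numpy as np


def ovr_probabilities(scores):
    """Turn OvR decision scores into class probabilities that sum to one.

    scores: array-like of shape (n_samples, n_classes), one column per
    OvR classifier (hypothetical helper, not part of scikit-learn).
    """
    scores = np.asarray(scores, dtype=float)
    # Per-classifier sigmoid, as in the formula above.
    p = 1.0 / (1.0 + np.exp(-scores))
    # Normalize each row so the class probabilities sum to one.
    return p / p.sum(axis=1, keepdims=True)


# Made-up scores for 2 samples and 3 classes.
scores = np.array([[2.0, -1.0, 0.0],
                   [0.0, 0.0, 0.0]])
probs = ovr_probabilities(scores)
```

Each row of ``probs`` sums to one, and equal scores give equal
probabilities.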

>
> 2) How "online" is the SGD implementation? Specifically, would it be
> possible to do something like continuous training from a "neverending"
> stream of data (e.g. coming in over a network socket)?

You can do "online" learning via SGDClassifier.partial_fit. I'm not
really familiar with "practical" online learning; the partial_fit
method mainly targets "sequential learning", which is useful when
your training data does not fit into main memory. The major issue here
is again the learning rate. Currently, partial_fit keeps the
learning rate/schedule from previous calls to partial_fit, which means
that at some point you hardly update your model based on new
examples because the learning rate has become too small. If you need to
"adapt" to new data it might be better to "reset" the learning rate
before calling partial_fit, or to train a new classifier on the new
data and combine the old model (i.e. parameter vector) with the new one
(e.g. an exponential average). Again, I have no practical experience
with online learning, so please take this with a grain of salt.
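
A minimal sketch of the "exponential average" idea, assuming the old and
new models are linear and were trained on the same features and classes,
so that their parameter arrays have the same shape. The helper name
``blend_parameters``, the weight ``alpha`` and the example vectors are
made up for illustration; with scikit-learn one would presumably apply
this to the ``coef_`` and ``intercept_`` arrays of the two fitted
classifiers.

```python
import numpy as np


def blend_parameters(old_w, new_w, alpha=0.3):
    """Exponential average of two parameter arrays.

    alpha weights the new model and (1 - alpha) the old one, so a larger
    alpha adapts faster to recent data (hypothetical helper).
    """
    old_w = np.asarray(old_w, dtype=float)
    new_w = np.asarray(new_w, dtype=float)
    return (1.0 - alpha) * old_w + alpha * new_w


# Made-up parameter vectors standing in for the old and new models.
old_coef = np.array([[1.0, -2.0, 0.5]])
new_coef = np.array([[0.0, 0.0, 1.5]])
blended = blend_parameters(old_coef, new_coef, alpha=0.25)
```

Applying the same update every time a new model is trained makes the
influence of older data decay geometrically.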

A practical note: make sure you buffer the stream before you call
``partial_fit``; calling ``partial_fit`` with a single example at a
time will be rather inefficient (housekeeping and function call
overhead in Python).
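
A minimal sketch of the buffering step, assuming the stream is any Python
iterable of (features, label) pairs. The generator name ``batched``, the
batch size and the toy stream are made up for illustration; each batch
would then be converted to arrays and passed to ``partial_fit``, which
needs the full list of labels via its ``classes`` argument on the first
call.

```python
from itertools import islice


def batched(stream, batch_size):
    """Yield lists of up to batch_size items from an arbitrary iterable."""
    it = iter(stream)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return  # stream exhausted
        yield batch


# Toy stream of (feature, label) pairs standing in for e.g. a socket reader.
stream = ((float(i), i % 2) for i in range(7))
batches = list(batched(stream, 3))

# With a real classifier, each batch would be used roughly like this:
#   X = [[x] for x, _ in batch]; y = [label for _, label in batch]
#   clf.partial_fit(X, y, classes=[0, 1])
```

The last batch may be smaller than ``batch_size`` when the stream ends.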

best,
 Peter

>
> Thanks again,
> Fred.
>
>
>
> On 15 June 2012 16:53, Peter Prettenhofer <[email protected]>
> wrote:
>>
>> Hi Fred,
>>
>> the major difference is the optimization algorithm:
>> Liblinear/Coordinate Descent vs. Stochastic Gradient Descent.
>>
>> If your problem is high dimensional (10K or more) and you have a large
>> number of examples (100K or more) you should choose the latter -
>> otherwise, LogisticRegression should be fine.
>>
>> Neither is a proper multinomial logistic regression model;
>> LogisticRegression does not care and simply computes the probability
>> estimates of each OVR classifier and normalizes them to make sure they
>> sum to one. You could do the same for SGDClassifier(loss='log') but you
>> have to implement it on your own. You should be aware of the fact that
>> SGDClassifier(n_jobs > 1) uses multiple processes; thus, if your
>> dataset (``X``) is too large (more than 50% of your RAM) you'll run
>> into trouble.
>>
>> best,
>>  Peter
>>
>>
>> 2012/6/15 Fred Mailhot <[email protected]>:
>> > Dear all,
>> >
>> > What are the advantages of choosing one of the Subject line classifiers
>> > over
>> > the other? At a quick glance, I see the following:
>> >
>> > - LogisticRegression implements predict_proba for the multiclass case,
>> > while
>> > SGDClassifier doesn't
>> > - SGDClassifier(loss="log") lets you specify multiple CPUs for the OVA
>> > training, while LogisticRegression doesn't
>> >
>> > Are there other obvious differences that might influence this decision?
>> >
>> > Regards,
>> > Fred.
>> >
>> >
>> >
>> > ------------------------------------------------------------------------------
>> > Live Security Virtual Conference
>> > Exclusive live event will cover all the ways today's security and
>> > threat landscape has changed and how IT managers can respond.
>> > Discussions
>> > will include endpoint security, mobile security and the latest in
>> > malware
>> > threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
>> > _______________________________________________
>> > Scikit-learn-general mailing list
>> > [email protected]
>> > https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>> >
>>
>>
>>
>> --
>> Peter Prettenhofer
>>
>>
>
>
>
>



-- 
Peter Prettenhofer

