2014-09-21 10:46 GMT+02:00 Mathieu Blondel <math...@mblondel.org>:
>
>
> On Sun, Sep 21, 2014 at 1:55 AM, Olivier Grisel <olivier.gri...@ensta.org>
> wrote:
>>
>> On a related note, here is an implementation of Logistic Regression
>> applied to one-hot features obtained from leaf membership info of a
>> GBRT model:
>>
>>
>> http://nbviewer.ipython.org/github/ogrisel/notebooks/blob/master/sklearn_demos/Income%20classification.ipynb#Using-the-boosted-trees-to-extract-features-for-a-Logistic-Regression-model
>>
>> This is inspired by this paper from Facebook:
>> https://www.facebook.com/publications/329190253909587/ .
>>
>> It's easy to implement and seems to work quite well.
>
>
> What is the advantage of this method over using GBRT directly?

A significant improvement in F1-score for the positive / minority
class and ROC AUC on this dataset (Adult Census binarized income
prediction with integer encoding of the categorical variables).

Apparently the Facebook ads team reported the same kind of improvement
on their own data.
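
For readers who do not want to open the notebook, here is a minimal
sketch of the idea. It is not the notebook's code. It assumes a
scikit-learn release where GradientBoostingClassifier has an apply()
method, and it uses synthetic data from make_classification as a
stand-in for the Adult Census set. The leaf_indices helper, the split
sizes and the hyperparameters are illustrative choices only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

# Synthetic, imbalanced binary problem (assumption: stand-in for the real data).
X, y = make_classification(n_samples=2000, n_features=20, n_informative=8,
                           weights=[0.8, 0.2], random_state=0)

# Use separate rows for the trees and for the linear model, so that the
# logistic regression is not fit on leaves the trees have already memorized.
X_gbrt, X_rest, y_gbrt, y_rest = train_test_split(
    X, y, test_size=0.6, random_state=0, stratify=y)
X_lr, X_test, y_lr, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0, stratify=y_rest)

gbrt = GradientBoostingClassifier(n_estimators=50, max_depth=3, random_state=0)
gbrt.fit(X_gbrt, y_gbrt)


def leaf_indices(model, data):
    """Return one leaf index per (sample, tree) for a binary GBRT model.

    apply() returns shape (n_samples, n_estimators, n_classes), and
    n_classes is 1 in the binary case, so the last axis is dropped.
    """
    return model.apply(data)[:, :, 0]


# One-hot encode leaf membership: one column per (tree, leaf) pair.
encoder = OneHotEncoder(handle_unknown="ignore")
encoder.fit(leaf_indices(gbrt, X_gbrt))

# Fit the linear model on the sparse leaf-membership features.
lr = LogisticRegression(max_iter=1000)
lr.fit(encoder.transform(leaf_indices(gbrt, X_lr)), y_lr)

# Positive-class probabilities from the combined GBRT + LR model.
proba = lr.predict_proba(encoder.transform(leaf_indices(gbrt, X_test)))[:, 1]
```

To compare against using GBRT directly, score proba and
gbrt.predict_proba(X_test)[:, 1] with the same metric (for example
sklearn.metrics.roc_auc_score). Whether the leaf features help depends
on the dataset, so nothing here should be read as reproducing the
numbers from the notebook.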

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
