2014-09-21 10:46 GMT+02:00 Mathieu Blondel <math...@mblondel.org>:
>
> On Sun, Sep 21, 2014 at 1:55 AM, Olivier Grisel <olivier.gri...@ensta.org>
> wrote:
>>
>> On a related note, here is an implementation of Logistic Regression
>> applied to one-hot features obtained from leaf membership info of a
>> GBRT model:
>>
>> http://nbviewer.ipython.org/github/ogrisel/notebooks/blob/master/sklearn_demos/Income%20classification.ipynb#Using-the-boosted-trees-to-extract-features-for-a-Logistic-Regression-model
>>
>> This is inspired by this paper from Facebook:
>> https://www.facebook.com/publications/329190253909587/ .
>>
>> It's easy to implement and seems to work quite well.
>
> What is the advantage of this method over using GBRT directly?

A significant improvement in F1-score for the positive (minority) class and in ROC AUC on this dataset (Adult Census binarized income prediction with integer encoding of the categorical variables). Apparently the Facebook ad team reported the same kind of improvement on their own data.

--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
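
For readers who want to try the idea discussed in this thread (one-hot encoding the leaf each sample falls into in every tree of a GBRT model, then fitting a Logistic Regression on those features), here is a minimal sketch. It is not the code from the linked notebook: the synthetic dataset from `make_classification`, the hyperparameters, the helper name `leaf_indices`, and the choice to fit the linear model on a separate half of the training data are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

# Synthetic, imbalanced binary problem (an assumption; not the Adult Census data).
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0)

# Assumption: use disjoint halves of the training set for the trees and for
# the linear model, so the linear model does not see leaves that were
# fitted on its own training samples.
X_gb, X_lr, y_gb, y_lr = train_test_split(
    X_train, y_train, test_size=0.5, stratify=y_train, random_state=0)

gbrt = GradientBoostingClassifier(n_estimators=50, max_depth=3,
                                  random_state=0).fit(X_gb, y_gb)


def leaf_indices(model, X):
    """Hypothetical helper: index of the leaf reached in each tree.

    For a binary problem, apply() returns an array of shape
    (n_samples, n_estimators, 1); drop the trailing axis.
    """
    return model.apply(X)[:, :, 0]


# One binary column per (tree, leaf) pair; leaves never seen while fitting
# the encoder are encoded as all zeros instead of raising an error.
encoder = OneHotEncoder(handle_unknown="ignore")
encoder.fit(leaf_indices(gbrt, X_train))

lr = LogisticRegression(max_iter=1000)
lr.fit(encoder.transform(leaf_indices(gbrt, X_lr)), y_lr)

# Compare ranking quality of the two models on held-out data.
auc_gbrt = roc_auc_score(y_test, gbrt.predict_proba(X_test)[:, 1])
auc_lr = roc_auc_score(
    y_test,
    lr.predict_proba(encoder.transform(leaf_indices(gbrt, X_test)))[:, 1])
print("GBRT alone ROC AUC:       %.3f" % auc_gbrt)
print("GBRT leaves + LR ROC AUC: %.3f" % auc_lr)
```

Whether the second score beats the first depends on the data and the hyperparameters; the improvement reported above is for the Adult Census dataset and should not be expected from this toy example.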