2013/12/8 Gael Varoquaux <[email protected]>:
> Hi Magellane,
>
>> I would like to provide an implementation for the Ensemble selection
>> technique as described by the following paper: Ensemble selection from
>> libraries of models by Rich Caruana, Alexandru Niculescu-Mizil, Geoff
>> Crew, Alex Ksikes (
>> www.cs.cornell.edu/~caruana/ctp/ct.papers/caruana.icml04.icdm06long.pdf)
>
> This paper has 200 citations on Google scholar, which is somewhat on the
> low end of what we include in scikit-learn.
>
> Do you believe that it is a major tool that is very useful in general?
> Have you had a lot of success using it?

There are at least 2 R packages used by Kagglers that implement this
ensemble method (and refinements):

http://moderntoolmaking.blogspot.fr/2013/03/new-package-for-ensembling-r-models.html
http://www.kaggle.com/forums/t/3661/medley-a-new-r-package-for-blending-regression-models

There is also a Python project that works with scikit-learn:

https://github.com/dclambert/pyensemble

However, in practice this method is likely to generate a large number of
models and predictions. Keeping them all in memory might not be efficient.
On the other hand, storing temporary data structures (pickled scikit-learn
models and prediction data) on the filesystem might lead to frameworkish
code, which we try to avoid in a library such as scikit-learn.

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
