On Thu, Nov 12, 2015 at 3:41 PM, < scikit-learn-general-requ...@lists.sourceforge.net> wrote:
> https://github.com/scikit-learn/scikit-learn/pull/5805

I wish all my off-hand remarks got such speedy service :-).

To return for a moment to Andreas Mueller's concern over whether an averaging ensemble of regressors is useful, the obvious example is the Netflix Prize. But let me moderate my suggestion to request that VotingClassifier be generalized into a StackingEnsemble.

A StackingEnsemble would take a list of base estimators, a meta-estimator, a partitioning scheme, and a few flags. It would fit by using the partitioning scheme to split X and y, and then train the base estimators on the first split, and then use the base estimators to predict the second split and train the meta-estimator on [y2p] (plus X2 if a flag for including the base features is set).

With classifiers as the base estimators, a Voting meta-estimator and a null partition, this is the VotingClassifier. With a holdout it becomes blending. With something more sophisticated as a meta-estimator it becomes Stacking.

With regressors as the base estimators, a Mean meta-estimator and a null partition, this is the AveragingEnsemble. With a holdout it becomes blending. With something more sophisticated as a meta-estimator it becomes Stacking.

If you use StackingEnsembles as base estimators in another StackingEnsemble, you get multi-level stacking.

If you allow an additional optional input to StackingEnsemble.fit() to pass in meta-features that would be used only by the meta-estimator, you get the sort of ensemble that was effective in the Netflix Competition.

There's probably more thought needed about the design and options, but an approach like this would seem to add a lot of capability without overly complicating sklearn with a lot of individual ensemble estimators.

-- Scott
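
For concreteness, here is a rough sketch of the fit procedure described above. It is only an illustration, not an existing scikit-learn API: the class name StackingEnsemble comes from the proposal, but the parameter names (holdout, include_base_features), the MeanMeta helper, and the reading of a "null partition" as "use all of X for both stages" are assumptions. The holdout split is taken in row order with no shuffling, which a real implementation would probably not do. Any objects with fit(X, y) and predict(X) should work as base or meta estimators.

```python
class MeanMeta:
    """Hypothetical 'Mean' meta-estimator: averages the base predictions."""

    def fit(self, Z, y):
        return self  # nothing to learn

    def predict(self, Z):
        return [sum(row) / len(row) for row in Z]


class StackingEnsemble:
    """Sketch of the proposal; names and parameters are hypothetical."""

    def __init__(self, base_estimators, meta_estimator, holdout=None,
                 include_base_features=False):
        self.base_estimators = base_estimators
        self.meta_estimator = meta_estimator
        # None = null partition (assumed: both stages see all rows);
        # otherwise the fraction of rows reserved for the second split.
        self.holdout = holdout
        self.include_base_features = include_base_features

    def _meta_input(self, X):
        # One column per base estimator, holding its predictions on X.
        preds = [est.predict(X) for est in self.base_estimators]
        Z = [list(row) for row in zip(*preds)]
        if self.include_base_features:
            # Optionally append the original features to each row.
            Z = [z + list(x) for z, x in zip(Z, X)]
        return Z

    def fit(self, X, y):
        if self.holdout is None:
            X1, y1, X2, y2 = X, y, X, y
        else:
            cut = len(X) - int(round(len(X) * self.holdout))
            X1, y1, X2, y2 = X[:cut], y[:cut], X[cut:], y[cut:]
        # Stage 1: base estimators learn from the first split.
        for est in self.base_estimators:
            est.fit(X1, y1)
        # Stage 2: meta-estimator learns from base predictions on split 2.
        self.meta_estimator.fit(self._meta_input(X2), y2)
        return self

    def predict(self, X):
        return self.meta_estimator.predict(self._meta_input(X))
```

With holdout=None and MeanMeta this behaves like the averaging ensemble; setting holdout to a fraction gives blending, and swapping in a learned meta-estimator gives stacking. Because StackingEnsemble itself exposes fit and predict, an instance can be passed as a base estimator to another one for multi-level stacking.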