On Thu, Nov 12, 2015 at 3:41 PM, <
scikit-learn-general-requ...@lists.sourceforge.net> wrote:

> https://github.com/scikit-learn/scikit-learn/pull/5805


I wish all my off-hand remarks got such speedy service :-).

To return for a moment to Andreas Mueller's concern over whether an
averaging ensemble of regressors is useful, the obvious example is the
Netflix Prize.  But let me moderate my suggestion to request that
VotingClassifier be generalized into a StackingEnsemble.

A StackingEnsemble would take a list of base estimators, a meta-estimator,
a partitioning scheme, and a few flags.  To fit, it would use the
partitioning scheme to split X and y into (X1, y1) and (X2, y2), train the
base estimators on the first split, use them to predict on X2, and then
train the meta-estimator to predict y2 from those predictions, y2p (plus
X2 if a flag for including the base features is set).
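As a rough sketch of the fit procedure described above, and not an existing
sklearn class: everything below is hypothetical, including the parameter
names (partition, use_base_features), the holdout helper, and the toy
estimators, which stand in for anything with fit/predict.  It also assumes
that a "null partition" means both stages see all of the data.

```python
import copy


class MeanRegressor:
    """Toy base estimator: always predicts the mean training target."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class FirstFeatureRegressor:
    """Toy base estimator: predicts the first feature unchanged."""

    def fit(self, X, y):
        return self

    def predict(self, X):
        return [row[0] for row in X]


class RowMean:
    """Toy meta-estimator: averages the base predictions in each row."""

    def fit(self, X, y):
        return self

    def predict(self, X):
        return [sum(row) / len(row) for row in X]


def holdout(X, y, fraction=0.5):
    """Hypothetical partitioning scheme: hold out the last `fraction` of rows."""
    cut = int(len(X) * (1 - fraction))
    return (X[:cut], y[:cut]), (X[cut:], y[cut:])


class StackingEnsemble:
    """Hypothetical sketch of the proposal; not part of scikit-learn."""

    def __init__(self, base_estimators, meta_estimator,
                 partition=None, use_base_features=False):
        self.base_estimators = base_estimators
        self.meta_estimator = meta_estimator
        self.partition = partition
        self.use_base_features = use_base_features

    def _meta_input(self, X):
        # One column of predictions per base estimator, turned into rows.
        columns = [est.predict(X) for est in self.base_estimators_]
        rows = [list(preds) for preds in zip(*columns)]
        if self.use_base_features:
            rows = [p + list(x) for p, x in zip(rows, X)]
        return rows

    def fit(self, X, y):
        if self.partition is None:
            # Assumed meaning of a null partition: no split at all.
            (X1, y1), (X2, y2) = (X, y), (X, y)
        else:
            (X1, y1), (X2, y2) = self.partition(X, y)
        # deepcopy keeps the caller's estimators unfitted; with real
        # scikit-learn estimators, sklearn.base.clone would be the usual choice.
        self.base_estimators_ = [copy.deepcopy(est).fit(X1, y1)
                                 for est in self.base_estimators]
        self.meta_estimator_ = copy.deepcopy(self.meta_estimator).fit(
            self._meta_input(X2), y2)
        return self

    def predict(self, X):
        return self.meta_estimator_.predict(self._meta_input(X))


X = [[1.0], [2.0], [3.0], [4.0]]
y = [1.0, 2.0, 3.0, 4.0]
bases = [MeanRegressor(), FirstFeatureRegressor()]

# Holdout partition: base estimators only see the first half of the data.
blended = StackingEnsemble(bases, RowMean(), partition=holdout).fit(X, y)
# Null partition: base estimators see everything.
averaged = StackingEnsemble(bases, RowMean()).fit(X, y)
```

With a meta-estimator that needs no training, as here, the partition only
changes which rows the base estimators are fitted on; a trainable
meta-estimator would additionally be fitted on the held-out predictions.
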

With classifiers as the base estimators, a Voting meta-estimator and a null
partition, this is the VotingClassifier.  With a holdout it becomes
blending.  With something more sophisticated as a meta-estimator it becomes
Stacking.

With regressors as the base estimators, a Mean meta-estimator and a null
partition, this is the AveragingEnsemble.  With a holdout it becomes
blending.  With something more sophisticated as a meta-estimator it becomes
Stacking.
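To make the two trivial meta-estimators concrete, here is a minimal sketch
of what "Voting" and "Mean" might look like.  The class names VotingMeta and
MeanMeta are made up for illustration, and the sketch assumes each input row
holds one prediction per base estimator; neither class needs any training,
which is why a null partition suffices for them.

```python
from collections import Counter


class VotingMeta:
    """Hypothetical meta-estimator: majority vote over base class labels."""

    def fit(self, X, y=None):
        return self  # nothing to learn

    def predict(self, X):
        # Each row holds one predicted label per base classifier.
        return [Counter(row).most_common(1)[0][0] for row in X]


class MeanMeta:
    """Hypothetical meta-estimator: mean of the base regressors' outputs."""

    def fit(self, X, y=None):
        return self  # nothing to learn

    def predict(self, X):
        return [sum(row) / len(row) for row in X]


label_rows = [["cat", "dog", "cat"], ["dog", "dog", "cat"]]
votes = VotingMeta().fit(label_rows).predict(label_rows)

value_rows = [[1.0, 2.0, 3.0], [4.0, 4.0, 7.0]]
means = MeanMeta().fit(value_rows).predict(value_rows)
```
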

If you use StackingEnsembles as base estimators in another
StackingEnsemble, you get multi-level stacking.

If you allow an additional optional input to StackingEnsemble.fit() to pass
in meta-features that would be used only by the meta-estimator, you get the
sort of ensemble that was effective in the Netflix Prize.
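One way the extra input could work, sketched under assumptions: the helper
name build_meta_input and its arguments are hypothetical, and the
meta-feature used in the example (a per-user rating count) is only an
illustration.  The meta-features are appended to the base predictions and
are never shown to the base estimators.

```python
def build_meta_input(base_predictions, meta_features=None):
    """Assemble the rows the meta-estimator would be trained on.

    base_predictions: one list of predictions per base estimator.
    meta_features: optional list of rows, one per sample, seen only by
        the meta-estimator.
    """
    rows = [list(preds) for preds in zip(*base_predictions)]
    if meta_features is not None:
        if len(meta_features) != len(rows):
            raise ValueError("meta_features must have one row per sample")
        rows = [r + list(m) for r, m in zip(rows, meta_features)]
    return rows


# Two base estimators, two samples each.
base_predictions = [[3.5, 4.0], [3.0, 4.5]]
# Illustrative meta-feature: how many ratings each user has made.
rating_counts = [[12], [250]]

meta_input = build_meta_input(base_predictions, rating_counts)
```
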

There's probably more thought needed about the design and options, but an
approach like this would seem to add a lot of capability without overly
complicating sklearn with a lot of individual ensemble estimators.

-- Scott
