On 11/13/2015 12:32 AM, Scott Turner wrote:
On Thu, Nov 12, 2015 at 3:41 PM, <scikit-learn-general-requ...@lists.sourceforge.net> wrote:

    https://github.com/scikit-learn/scikit-learn/pull/5805


I wish all my off-hand remarks got such speedy service :-).

To return for a moment to Andreas Mueller's concern over whether an averaging ensemble of regressors is useful: the obvious example is the Netflix Prize, where the winning entries were exactly such blended ensembles of regressors. But let me moderate my suggestion and instead request that VotingClassifier be generalized into a StackingEnsemble.

A StackingEnsemble would take a list of base estimators, a meta-estimator, a partitioning scheme, and a few flags. To fit, it would use the partitioning scheme to split X and y into (X1, y1) and (X2, y2), train the base estimators on the first split, have them predict the second split (call those predictions y2p), and then train the meta-estimator on y2p (plus X2, if a flag for including the base features is set).
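In rough sklearn idiom that might look like the minimal sketch below, using a plain holdout split as a stand-in for the partitioning scheme (all class and parameter names here are hypothetical, not an existing API):

import numpy as np
from sklearn.base import BaseEstimator, clone
from sklearn.model_selection import train_test_split


class StackingEnsemble(BaseEstimator):
    """Hypothetical sketch of the proposal above; not an sklearn API."""

    def __init__(self, base_estimators, meta_estimator,
                 holdout_size=0.5, use_base_features=False):
        self.base_estimators = base_estimators
        self.meta_estimator = meta_estimator
        self.holdout_size = holdout_size  # stand-in "partitioning scheme"
        self.use_base_features = use_base_features

    def fit(self, X, y):
        # Partition: bases train on (X1, y1); the meta-estimator trains
        # on the bases' predictions for the held-out (X2, y2).
        X1, X2, y1, y2 = train_test_split(X, y, test_size=self.holdout_size)
        self.fitted_bases_ = [clone(e).fit(X1, y1)
                              for e in self.base_estimators]
        Z = self._base_predictions(X2)
        if self.use_base_features:
            Z = np.hstack([Z, X2])
        self.fitted_meta_ = clone(self.meta_estimator).fit(Z, y2)
        return self

    def _base_predictions(self, X):
        # One column of predictions per base estimator (the y2p above).
        return np.column_stack([e.predict(X) for e in self.fitted_bases_])

    def predict(self, X):
        Z = self._base_predictions(X)
        if self.use_base_features:
            Z = np.hstack([Z, X])
        return self.fitted_meta_.predict(Z)

A real partitioning scheme would presumably be pluggable (null split, holdout, K-fold out-of-fold predictions); a single holdout just keeps the sketch short.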

With classifiers as the base estimators, a Voting meta-estimator, and a null partition (no split at all), this is the VotingClassifier. With a holdout partition it becomes blending; with something more sophisticated as the meta-estimator it becomes stacking.

With regressors as the base estimators, a Mean meta-estimator, and a null partition, this is the AveragingEnsemble. With a holdout partition it becomes blending; with something more sophisticated as the meta-estimator it becomes stacking.
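For example, continuing the hypothetical sketch above (MeanRegressor is illustrative only):

from sklearn.base import BaseEstimator
from sklearn.linear_model import Ridge
from sklearn.tree import DecisionTreeRegressor


class MeanRegressor(BaseEstimator):
    """Hypothetical meta-estimator that just averages base predictions."""

    def fit(self, Z, y):
        return self

    def predict(self, Z):
        return Z.mean(axis=1)


# Blended averaging ensemble of regressors: bases fit on one half,
# the meta-estimator sees their predictions on the held-out half.
averager = StackingEnsemble(
    base_estimators=[Ridge(alpha=1.0), DecisionTreeRegressor(max_depth=4)],
    meta_estimator=MeanRegressor(),
    holdout_size=0.5)

# Swap in a learned meta-estimator and it becomes stacking.
stacker = StackingEnsemble(
    base_estimators=[Ridge(alpha=1.0), DecisionTreeRegressor(max_depth=4)],
    meta_estimator=Ridge(),
    holdout_size=0.5)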

If you use StackingEnsembles as base estimators in another StackingEnsemble, you get multi-level stacking.
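With the sketch above, that is just composition (again hypothetical, reusing the estimators imported in the previous snippet):

inner_a = StackingEnsemble([Ridge(), DecisionTreeRegressor()], Ridge())
inner_b = StackingEnsemble([Ridge(alpha=10.0),
                            DecisionTreeRegressor(max_depth=2)], Ridge())
two_level = StackingEnsemble([inner_a, inner_b], Ridge())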

If you allow an additional optional input to StackingEnsemble.fit() to pass in meta-features that would be used only by the meta-estimator, you get the sort of ensemble that was effective in the Netflix Competition.
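That variant might look roughly like this, again extending the hypothetical sketch (meta_features is assumed to be a 2-D array aligned row-for-row with X):

class MetaFeatureStackingEnsemble(StackingEnsemble):
    """Hypothetical variant: extra features seen only by the meta-estimator."""

    def fit(self, X, y, meta_features=None):
        if meta_features is None:
            return super().fit(X, y)
        # Split everything identically; the base estimators never see
        # meta_features, only the meta-estimator does.
        X1, X2, y1, y2, _, m2 = train_test_split(
            X, y, meta_features, test_size=self.holdout_size)
        self.fitted_bases_ = [clone(e).fit(X1, y1)
                              for e in self.base_estimators]
        Z = np.hstack([self._base_predictions(X2), m2])
        self.fitted_meta_ = clone(self.meta_estimator).fit(Z, y2)
        return self

    def predict(self, X, meta_features=None):
        Z = self._base_predictions(X)
        if meta_features is not None:
            Z = np.hstack([Z, meta_features])
        return self.fitted_meta_.predict(Z)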

There's probably more thought needed about the design and options, but an approach like this would seem to add a lot of capability without overly complicating sklearn with a lot of individual ensemble estimators.

-- Scott


------------------------------------------------------------------------------
It's not a good idea, but I'll post it:
It seems it's possible to build a StackingClassifier fairly easily even in the current master. You can compose a VotingClassifier and some meta-estimator into a Pipeline, because VotingClassifier has a transform method that returns the separate predictions of each base estimator before voting, so it can be fitted and used as a transformer inside a pipeline. With a FeatureUnion you can also pass some features directly to the meta-estimator.

Custom partitioning behaviour would require small modifications to the code, though, and you would have to slice your dataset into feature subsets somehow if you want to use a FeatureUnion in the Pipeline.
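Roughly like this (the estimator choices are just for illustration; with voting='hard', VotingClassifier.transform returns an (n_samples, n_classifiers) array of each base classifier's predicted labels):

from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier

stacking = Pipeline([
    # Intermediate step: VotingClassifier used purely as a transformer;
    # its transform() emits each base classifier's predictions.
    ('bases', VotingClassifier(estimators=[
        ('lr', LogisticRegression()),
        ('dt', DecisionTreeClassifier()),
        ('nb', GaussianNB())], voting='hard')),
    # Final step: the meta-estimator trained on those predictions.
    ('meta', LogisticRegression()),
])
# Note: stacking.fit(X, y) fits the bases and the meta-estimator on the
# *same* data -- the "null partition" case; a holdout would need the code
# changes mentioned above.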
