On Wed, Nov 11, 2015 at 6:18 PM, < scikit-learn-general-requ...@lists.sourceforge.net> wrote:
> I am only a little concerned if averaging the results of different
> regressors could potentially help with the predictive performance.

I believe it has been proven that averaging uncorrelated regressors improves performance. (Whether it is easy to find uncorrelated regressors in practice is another question.) The Wikipedia article has some references, although they seem to focus primarily on ensemble averaging in neural networks:

https://en.wikipedia.org/wiki/Ensemble_Averaging

I think Wolpert 1992 is the seminal paper.

But I think the implementation should be general-purpose stacking, permitting a user-specified meta regressor for combining the base regressors (with or without the base features). Averaging (or taking the median) is just a simplified special case. And I think stacking is well-established, e.g.,

http://link.springer.com/article/10.1007%2FBF00117832

although perhaps less used than boosting, etc. these days.

If this does get implemented as a generalization of Voting, it would be good to back-fit stacking to the VotingClassifier as well.

-- Scott
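
To make the averaging-versus-stacking point concrete, here is a toy sketch in plain Python (no scikit-learn calls). It rests on some loud assumptions: the two "base regressors" are only simulated, as the true target plus independent Gaussian noise of different sizes, and the "meta regressor" is a hypothetical two-weight least-squares combiner without intercept, fitted on held-out predictions. The names `fit_linear_combiner`, `p1`, and `p2` are illustrative, not part of any library.

```python
import random


def mse(pred, y):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)


def fit_linear_combiner(p1, p2, y):
    """Least-squares weights (w1, w2) for y ~ w1*p1 + w2*p2, no intercept.

    Solves the 2x2 normal equations directly.
    """
    a11 = sum(a * a for a in p1)
    a12 = sum(a * b for a, b in zip(p1, p2))
    a22 = sum(b * b for b in p2)
    b1 = sum(a * t for a, t in zip(p1, y))
    b2 = sum(b * t for b, t in zip(p2, y))
    det = a11 * a22 - a12 * a12
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det


rng = random.Random(0)
n = 20000
y = [rng.uniform(-5, 5) for _ in range(n)]

# Assumption: simulated held-out predictions of two base regressors whose
# errors are uncorrelated; the second one is noisier than the first.
p1 = [t + rng.gauss(0, 1.0) for t in y]
p2 = [t + rng.gauss(0, 1.5) for t in y]

# First half: fit the meta regressor. Second half: evaluate everything.
half = n // 2
w1, w2 = fit_linear_combiner(p1[:half], p2[:half], y[:half])

y_test = y[half:]
p1_test, p2_test = p1[half:], p2[half:]
avg_test = [0.5 * a + 0.5 * b for a, b in zip(p1_test, p2_test)]
stack_test = [w1 * a + w2 * b for a, b in zip(p1_test, p2_test)]

mse_p1 = mse(p1_test, y_test)
mse_p2 = mse(p2_test, y_test)
mse_avg = mse(avg_test, y_test)      # averaging: fixed weights (0.5, 0.5)
mse_stack = mse(stack_test, y_test)  # stacking: weights learned from data

print("weights:", w1, w2)
print("MSE p1, p2, average, stacked:", mse_p1, mse_p2, mse_avg, mse_stack)
```

In this setup the plain average is simply the combiner with both weights fixed at 0.5, which is the sense in which averaging is a special case of stacking. A learned combiner can put more weight on the less noisy regressor. Real base regressors are rarely this uncorrelated, so the gains here should be read as an upper-end illustration.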