On 8 January 2017 at 08:36, Thomas Evangelidis <teva...@gmail.com> wrote:
> On 7 January 2017 at 21:20, Sebastian Raschka <se.rasc...@gmail.com> wrote:
>
>> Hi, Thomas,
>>
>> sorry, I overlooked the regression part …
>>
>> This would be a bit trickier; I am not sure what a good strategy for
>> averaging regression outputs would be. However, if you just want to
>> compute the (per-sample) average, you could do something like
>>
>>     np.mean(np.asarray([r.predict(X) for r in list_of_your_mlps]), axis=0)
>>
>> However, it may be better to use stacking, and use the output of
>> r.predict(X) as meta-features to train a model based on these.
>
> Like training an SVR to combine the predictions of the top 10% of
> MLPRegressors, using the same data that were used to train the
> MLPRegressors? Wouldn't that lead to overfitting?

You could certainly hold out a different data sample to train the
meta-estimator on, and that might indeed be valuable regularisation, but
it's not obvious to me that this is substantially more prone to overfitting
than just training a handful of MLPRegressors on the same data and having
them vote by other means. There is no problem, in general, with overfitting,
as long as your evaluation of an estimator's performance isn't biased
towards the training set. In that sense, what we've been discussing here
isn't really overfitting at all.
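For concreteness, a minimal sketch of the simple-averaging option (assuming
numpy, and that `mlps` is a list of already-fitted MLPRegressors; the helper
name is mine, not an existing API):

    import numpy as np

    def average_predictions(mlps, X):
        # shape (n_models, n_samples): one row of predictions per model
        all_preds = np.asarray([r.predict(X) for r in mlps])
        # average across models (axis=0), giving one prediction per sample
        return all_preds.mean(axis=0)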
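And if you do go the stacking route, one way to address the leakage worry is
to build the meta-features from out-of-fold predictions, e.g. with
cross_val_predict, so the SVR is never trained on a prediction a base model
made for a sample in its own training fold. A sketch under those assumptions
(`mlps`, the 5-fold CV, and the helper names are all illustrative):

    import numpy as np
    from sklearn.model_selection import cross_val_predict
    from sklearn.svm import SVR

    def fit_stacked_svr(mlps, X, y, cv=5):
        # One column per base model, holding its out-of-fold predictions;
        # cross_val_predict clones each estimator, so the meta-learner only
        # sees predictions made on data the clone was not fitted on.
        meta_features = np.column_stack(
            [cross_val_predict(r, X, y, cv=cv) for r in mlps]
        )
        meta_model = SVR().fit(meta_features, y)
        # Refit the base models on all of the data for prediction time.
        fitted = [r.fit(X, y) for r in mlps]
        return fitted, meta_model

    def stacked_predict(fitted, meta_model, X_new):
        meta_new = np.column_stack([r.predict(X_new) for r in fitted])
        return meta_model.predict(meta_new)

The out-of-fold construction is aimed at exactly your concern: the SVR is
fit only on predictions for samples the corresponding base model had not
seen, which plays the same role as holding out a separate sample, without
giving up any training data.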