Hi Yanir.
I was not aware that GradientBoosting had oob scores.
Is that even possible / sensible? It definitely does not do what it
promises :-/
Peter, any thoughts?
Cheers,
Andy
On 03/22/2013 11:39 AM, Yanir Seroussi wrote:
Hi,
I'm new to the mailing list, so I apologise if this has been asked before.
I want to use the oob_score_ attribute of GradientBoostingRegressor to determine
the optimal number of iterations without relying on an external
validation set, so I set the subsample parameter to 0.5 and trained
the model. However, I've noticed that oob_score_ improves in much the
same way as the in-bag scores (train_score_): the loss drops very fast
and keeps decreasing regardless of the number of iterations, rather
than eventually bottoming out the way a held-out estimate should.
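To make the setup concrete, here's roughly what I'm doing (a minimal
sketch; the toy dataset and parameter values are made up, and it
assumes the oob_score_ array attribute discussed here):

    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.ensemble import GradientBoostingRegressor

    # toy data for illustration only
    X, y = make_regression(n_samples=1000, n_features=10,
                           noise=10.0, random_state=0)

    # subsample < 1.0 turns on stochastic gradient boosting, which is
    # what produces the out-of-bag estimates in the first place
    est = GradientBoostingRegressor(n_estimators=500, subsample=0.5,
                                    random_state=0)
    est.fit(X, y)

    # train_score_[i] is the in-bag deviance at stage i; oob_score_[i]
    # is the deviance on the samples left out of stage i's subsample
    print(est.train_score_[:5])
    print(est.oob_score_[:5])

    # if oob_score_ behaved like a held-out loss, the minimizing stage
    # would suggest a number of iterations
    print("suggested n_estimators:", np.argmin(est.oob_score_) + 1)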
Digging through the code in ensemble/gradient_boosting.py, it seems
like the cause is that oob_score_[i] evaluates the whole ensemble up to
stage i, including earlier trees that were trained on the instances
that are out-of-bag at stage i. Isn't the OOB score supposed to be
calculated for each OOB instance using only trees whose training sample
didn't include that instance (as is done for random forests)?
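For comparison, this is the kind of computation I mean; a rough sketch,
not scikit-learn's actual code, with trees and in_bag_masks as
hypothetical inputs (and I realise the averaging here doesn't carry
over directly to boosting, since the stages are summed sequentially
rather than averaged):

    import numpy as np

    def rf_style_oob_error(trees, in_bag_masks, X, y):
        """Random-forest-style OOB estimate: each sample is predicted
        only by trees whose training subsample excluded it. `trees` is
        a list of fitted regressors and `in_bag_masks[t]` is a boolean
        array marking the samples tree t was trained on (both
        hypothetical inputs)."""
        pred_sum = np.zeros(len(y))
        n_oob = np.zeros(len(y))
        for tree, in_bag in zip(trees, in_bag_masks):
            oob = ~in_bag
            pred_sum[oob] += tree.predict(X[oob])
            n_oob[oob] += 1
        # samples that were in-bag for every tree have no OOB prediction
        valid = n_oob > 0
        oob_pred = pred_sum[valid] / n_oob[valid]
        return np.mean((y[valid] - oob_pred) ** 2)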
Cheers,
Yanir