[Scikit-learn-general] How to optimize a random forest for out of sample prediction

2015-10-07 Thread Raphael C
I have a training set, a validation set and a test set. I build a random forest using RandomForestClassifier on the training set. However, I would like to tune it by scoring on the validation set. I find that the cross-validation score on the training set is a lot better than the score on the
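One way to score candidate settings on a fixed validation set rather than by cross-validation on the training data is scikit-learn's `PredefinedSplit`. The following is only a sketch under assumptions: the data is synthetic, the parameter grid is illustrative, and it uses the current `sklearn.model_selection` module, not the `sklearn.cross_validation` module that was current when this thread was written.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, PredefinedSplit, train_test_split

# Synthetic stand-in for the poster's training and validation sets (assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Stack both sets and mark membership: -1 = always in training, 0 = validation fold.
X_all = np.vstack([X_train, X_val])
y_all = np.concatenate([y_train, y_val])
test_fold = np.concatenate(
    [np.full(len(X_train), -1), np.zeros(len(X_val), dtype=int)]
)
split = PredefinedSplit(test_fold)

# Illustrative grid (assumption); every candidate is fit on the training rows
# and scored once on the validation rows.
search = GridSearchCV(
    RandomForestClassifier(n_estimators=50, random_state=0),
    param_grid={"max_depth": [3, None], "min_samples_leaf": [1, 5]},
    cv=split,
    refit=False,
)
search.fit(X_all, y_all)
best_params = search.best_params_

# Refit on the training set only, using the settings chosen on validation data.
model = RandomForestClassifier(n_estimators=50, random_state=0, **best_params)
model.fit(X_train, y_train)
```

The held-out test set is never touched during the search, so it stays available for a final, unbiased check of `model`.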

Re: [Scikit-learn-general] How to optimize a random forest for out of sample prediction

2015-10-07 Thread Joel Nothman
RFECV selects features based on scores over a number of validation sets, as determined by its cv parameter. Contrary to that StackOverflow query, RFECV should now support RandomForest through its feature_importances_ attribute.

On 7 October 2015 at 18:16, Raphael C wrote:
> I
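As a rough illustration of the RFECV suggestion, here is a minimal sketch assuming a recent scikit-learn release and synthetic data; the estimator settings, `step` and `cv=3` are arbitrary choices for the example, not recommendations from the thread.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV

# Synthetic data with a few informative features (assumption for illustration).
X, y = make_classification(
    n_samples=200, n_features=8, n_informative=3, random_state=0
)

# RFECV ranks features with the forest's feature_importances_ and drops
# one feature per step, scoring each feature count by 3-fold cross-validation.
selector = RFECV(
    RandomForestClassifier(n_estimators=30, random_state=0),
    step=1,
    cv=3,
)
selector.fit(X, y)

n_selected = selector.n_features_   # number of features kept
mask = selector.support_            # boolean mask over the original columns
X_reduced = selector.transform(X)   # data restricted to the kept features
```

Passing a cross-validation splitter object as `cv` in place of the integer should let the same selection be driven by a specific validation split.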