RFECV selects features based on scores over a number of validation sets,
as determined by its cv parameter. In contrast to the situation described
in that StackOverflow question, RFECV should now support
RandomForestClassifier through its feature_importances_ attribute.
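
If you want RFECV to score on your one fixed validation set rather than on
resampled folds, one option is to hand it a PredefinedSplit as cv. The
following is only a rough sketch under some assumptions: the data is
synthetic (make_classification stands in for your training and validation
sets), the variable names are made up for illustration, and the import path
sklearn.model_selection assumes a reasonably recent scikit-learn release.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.model_selection import PredefinedSplit

# Synthetic stand-ins for the real training and validation sets (assumption).
X, y = make_classification(n_samples=300, n_features=12,
                           n_informative=4, random_state=0)
X_train, y_train = X[:200], y[:200]
X_val, y_val = X[200:], y[200:]

# Stack both sets. In test_fold, -1 marks rows that are only ever used for
# fitting, and 0 marks the rows of the single validation fold.
X_all = np.vstack([X_train, X_val])
y_all = np.concatenate([y_train, y_val])
test_fold = np.concatenate([np.full(len(X_train), -1),
                            np.zeros(len(X_val), dtype=int)])
cv = PredefinedSplit(test_fold)

# RFECV ranks features using the forest's feature_importances_ and picks the
# number of features that scores best on the validation fold.
forest = RandomForestClassifier(n_estimators=20, random_state=0)
selector = RFECV(forest, step=1, cv=cv, scoring="accuracy")
selector.fit(X_all, y_all)

print("features kept:", selector.n_features_)
print("mask:", selector.support_)

# Reduce the training data to the selected columns.
X_train_reduced = X_train[:, selector.support_]
```

Note that once the number of features has been chosen, RFECV refits on
everything passed to fit, so the final mask is computed from the training
and validation rows together. The test set should stay out of this
entirely.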
On 7 October 2015 at 18:16, Raphael C <drr...@gmail.com> wrote:
> I have a training set, a validation set and a test set. I build a
> random forest using RandomForestClassifier on the training set.
> However, I would like to tune it by scoring on the validation set.
> I find that the cross-validation score on the training set is a lot
> better than the score on the validation set.
>
> To improve this I would like to use [RFE][1] for feature selection to
> deal with overfitting. I have tried removing features by hand and in
> some cases it does improve the score on the validation set. This
> [question and answer][2] shows how to use RFE with
> RandomForestClassifier but I don't understand how to do this when you
> score on a separate validation set.
>
> Can this sort of feature selection be done using RFE or some other
> scikit-learn method?
>
>
> [1]: http://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.RFE.html
> [2]: https://stackoverflow.com/questions/24123498/recursive-feature-elimination-on-random-forest-using-scikit-learn
>
> Raphael
>
>
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>