Hi all,

I am currently working with RandomForestClassifier and running a RandomizedSearchCV on the training data set. The data contains 106 features and approx. 10,000 observations.
The hyperparameter search returns the following best parameters:

    {'bootstrap': True, 'class_weight': 'balanced', 'criterion': 'entropy',
     'max_depth': 10, 'max_features': 'log2', 'min_samples_leaf': 4,
     'min_samples_split': 3, 'n_estimators': 33}

My question is regarding feature_importances_. When I call it on my RandomForestClassifier (clf), it returns:

    clf.feature_importances_
    Out[140]:
    array([ 0.51036391, 0.03331918, 0.02011316, 0.11259915, 0.17919327,
            0.05119163, 0.01932924, 0.03351345, 0.01557083, 0.02480619])

Calling feature_importances_ on the individual trees returns:

    clf.estimators_[1].feature_importances_
    Out[137]:
    array([ 0.42919509, 0.0524983 , 0.01913177, 0.13067667, 0.20454586,
            0.03236881, 0.06266216, 0.02380507, 0.01972648, 0.02538979])

    clf.estimators_[0].feature_importances_
    Out[138]:
    array([ 0.57415072, 0.02156333, 0.01333293, 0.08907816, 0.20841139,
            0.02695001, 0.03061188, 0.02447627, 0.0064503 , 0.00497501])

Since every tree uses different features, the feature importances of each tree should represent the relative importance of the features used in that tree. However, each tree seems to use 10 features, even though max_features is set to 'log2', which should give log2(106) ~= 7.

So what does clf.feature_importances_ return? Is it the mean of the feature importances of all trees? If so, does that make sense, given that every tree uses a different feature set?

Please let me know if you need more information.

Kind regards
Piotr Bialecki
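P.S. A minimal sketch of how the mean-value guess could be checked empirically. Everything here is an assumption made for illustration: the data is synthetic (make_classification with 500 samples and 20 features stands in for the real data set), and the names X, y, forest, per_tree, mean_importances, matches_mean and n_used are made up for this example. It compares the forest-level importances with the plain average of the per-tree importances, and counts how many distinct features one tree actually splits on, which may help judge whether max_features limits the features per tree or something else.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the real data (assumption: 500 samples, 20 features).
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, random_state=0
)

# A subset of the parameters from the search above, for illustration only.
forest = RandomForestClassifier(
    n_estimators=33, max_features="log2", criterion="entropy", random_state=0
)
forest.fit(X, y)

# Stack the per-tree importances: one row per tree, one column per feature.
per_tree = np.array([tree.feature_importances_ for tree in forest.estimators_])
mean_importances = per_tree.mean(axis=0)

# Does the forest-level vector agree with the plain per-tree average?
matches_mean = np.allclose(mean_importances, forest.feature_importances_)
print("forest importances equal the per-tree mean:", matches_mean)

# Features that appear in at least one split of the first tree
# (leaf nodes carry a negative index in tree_.feature).
split_features = forest.estimators_[0].tree_.feature
n_used = np.unique(split_features[split_features >= 0]).size
print("length of each importance vector:", per_tree.shape[1])
print("distinct features split on by tree 0:", n_used)
```

Running the same comparison on the fitted clf from above (replacing forest with clf) should show whether the averaging guess holds for the real model as well.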
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general