The variable importance in scikit-learn's implementation of random
forests is based on the proportion of samples that are split on the
feature at some point while being evaluated by one of the decision trees.

http://scikit-learn.org/stable/modules/ensemble.html#feature-importance-evaluation
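
For concreteness, here is a minimal sketch of how these importances are
read off a fitted forest. The synthetic dataset and all parameter values
are assumptions chosen for illustration; only the feature_importances_
attribute is the thing being discussed.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data (an assumption for illustration): 8 features, 3 informative.
X, y = make_classification(
    n_samples=500,
    n_features=8,
    n_informative=3,
    n_redundant=0,
    random_state=0,
)

forest = RandomForestClassifier(n_estimators=50, random_state=0)
forest.fit(X, y)

# One non-negative score per feature, averaged over the trees of the forest.
importances = forest.feature_importances_
for index, score in enumerate(importances):
    print("feature %d: %.3f" % (index, score))
```

The scores are computed from the fitted trees alone, so no held-out or
out-of-bag samples are involved at this step.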

This method seems different from the OOB-based method of Breiman (2001,
section 10):

http://www.stat.berkeley.edu/~breiman/randomforest2001.pdf
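
To make the contrast explicit, here is a rough sketch of the permutation
idea behind that OOB-based measure: shuffle one feature and record how
much the accuracy drops. It is a simplification under stated
assumptions: it uses a single hand-written, hypothetical classifier and
one fixed sample instead of the out-of-bag samples of each tree, and the
helper names are made up for this example.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows for which predict(row) matches the label."""
    hits = sum(predict(row) == label for row, label in zip(rows, labels))
    return hits / len(rows)


def permutation_drop(predict, rows, labels, feature, rng):
    """Accuracy lost when the values of one feature are shuffled across rows."""
    baseline = accuracy(predict, rows, labels)
    column = [row[feature] for row in rows]
    rng.shuffle(column)
    permuted = [
        row[:feature] + [value] + row[feature + 1:]
        for row, value in zip(rows, column)
    ]
    return baseline - accuracy(predict, permuted, labels)


rng = random.Random(0)

# Toy data (an assumption): the label depends on feature 0 only.
rows = [[rng.random(), rng.random()] for _ in range(1000)]
labels = [int(row[0] > 0.5) for row in rows]


def predict(row):
    # Hypothetical stand-in for a fitted model; it ignores feature 1.
    return int(row[0] > 0.5)


drop_used = permutation_drop(predict, rows, labels, 0, rng)
drop_unused = permutation_drop(predict, rows, labels, 1, rng)
print("feature 0:", drop_used)
print("feature 1:", drop_unused)
```

Shuffling the feature the model relies on costs accuracy, while
shuffling the ignored one costs nothing. Unlike the scikit-learn scores,
this measure needs samples that were not used to fit the model.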

Is there any reference for the method implemented in scikit-learn?

Here is the original Stack Overflow question:

http://stackoverflow.com/questions/15810339/how-are-feature-importances-in-randomforestclassifier-determined/15811003?noredirect=1#comment22487062_15811003

--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
