On 2007-08-24 21:13, Mathe, Ewy (NIH/NCI) [F] wrote:
> Hello,
>
> I am trying to explore the use of random forests for classification and
> am uncertain about the interpretation of the importance measurements.

In case you haven't already done so, you probably want to read

@ARTICLE{Strobl+Boulesteix+Zeileis+Hothorn:2007,
  author = {Carolin Strobl and Anne-Laure Boulesteix and Achim Zeileis and Torsten Hothorn},
  title = {Bias in Random Forest Variable Importance Measures: Illustrations, Sources and a Solution},
  journal = {{BMC} Bioinformatics},
  year = {2007},
  volume = {8},
  number = {25},
  url = {http://www.biomedcentral.com/1471-2105/8/25/}
}

HTH,
Henric

> When having the option "importance = T" in the randomForest call, the
> resulting 'importance' element matrix has four columns with the
> following headings:
>
> 0 - mean raw importance score of variable x for class 0 (where
> importance is the difference between the permutated data error and the
> original test set error)
>
> 1 - mean raw importance score of variable x for class 1
>
> MeanDecreaseAccuracy : average lowering of the margin across all cases
> (where margin is the proportion of votes for the true class - the
> maximum proportion of votes for the other classes)
>
> MeanDecreaseGini : summation of the gini decreases over all trees in the
> forest
>
> Are these definitions correct? Why is the raw importance score
> calculated for each class? Could one just average the raw importance
> scores for class 0 and 1 to get a composite importance score?
>
> Now, when having the option "importance = F" in the randomForest call,
> the 'importance' element is now a vector. What values are those?
>
> Thank you in advance for any input you may have.
>
> Best,
>
> Ewy
>
> Ewy Mathe, Ph. D.
> Laboratory of Human Carcinogenesis
> National Cancer Institute, NIH
> 37 Convent Drive
> Building 37, Room 3068
> Bethesda, MD 20892-4255
> Tel: 301-496-5835
> Fax: 301-496-0497
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
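
For reference, a minimal sketch in R of how the two forms of the 'importance' element asked about above can be inspected. It assumes the randomForest package is installed. The two-class subset of iris, the 0/1 recoding, the seed, the ntree value, and the object names rf_imp and rf_noimp are illustrative choices, not taken from the original post. As far as the package documentation describes it, the class columns and MeanDecreaseAccuracy are permutation-based decreases in accuracy, and with importance = FALSE only MeanDecreaseGini is stored.

```r
library(randomForest)

set.seed(1)  # arbitrary seed, only so that repeated runs agree

# Illustrative two-class problem: drop one iris species and recode the
# response as a factor with levels "0" and "1" (an assumption made to
# mirror the column headings quoted in the question).
d <- iris[iris$Species != "setosa", ]
x <- d[, 1:4]
y <- factor(as.integer(d$Species == "virginica"))

# importance = TRUE: permutation-based measures are computed as well.
rf_imp <- randomForest(x, y, ntree = 200, importance = TRUE)

# One column per class level, then MeanDecreaseAccuracy and MeanDecreaseGini.
print(colnames(rf_imp$importance))

# importance = FALSE: only the Gini-based measure is kept, stored as a
# one-column matrix rather than a plain vector.
rf_noimp <- randomForest(x, y, ntree = 200, importance = FALSE)
print(colnames(rf_noimp$importance))

# The importance() extractor returns the unscaled permutation measure with
# scale = FALSE; by default it divides by the measure's standard error.
print(importance(rf_imp, type = 1, scale = FALSE))
```

Comparing rowMeans(rf_imp$importance[, c("0", "1")]) with the MeanDecreaseAccuracy column is one way to check empirically whether a plain average of the per-class scores matches the overall score for a given data set.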