I read in Berkeley's documentation [1] that each tree randomly samples features from all the input features, so I am curious whether those randomly sampled features are preserved anywhere.
Thanks for the explanation.

[1] http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm

----- Original Message -----
From: Olivier Grisel <[email protected]>
To: Aaron Jacques <[email protected]>; scikit-learn-general <[email protected]>
Cc:
Sent: Thursday, 29 August 2013, 10:02
Subject: Re: [Scikit-learn-general] sample_weight and features in a single tree

In general, all the features are used by the DT algorithm. The max_features parameter is just a way to control the amount of randomization injected at each stage of the learning process for the trees used by the ExtraTrees* or RandomForest* classes. But on average all features end up selected at one point or another: hence there is no such thing as "selected features".

Ensembles of trees like ExtraTrees* or RandomForest* can, however, tell you which features were the most important. See the examples mentioned in this section of the documentation:

http://scikit-learn.org/dev/modules/ensemble.html#feature-importance-evaluation

Finally, you can output a representation of an individual tree as a graph, see:

http://scikit-learn.org/dev/modules/tree.html#classification
http://scikit-learn.org/stable/modules/generated/sklearn.tree.export_graphviz.html

The source code of that function might be a good example of how to walk down the tree, for instance to mine the frequencies of consecutive decision rules in a forest:

https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/tree/export.py
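For what it's worth, here is a minimal sketch of how one might check this on a fitted forest, assuming a reasonably recent scikit-learn. The toy dataset, the parameter values and the helper features_used are illustrative assumptions, not something taken from the thread. The candidate features are drawn again at every split and are not stored per tree, so the closest thing to "the features of a tree" is the set of features its nodes actually split on, which can be read from the fitted tree_ structure:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Toy data, assumed purely for illustration: 200 samples, 8 features.
X, y = make_classification(n_samples=200, n_features=8, n_informative=4,
                           random_state=0)

# max_features=3: every split considers a fresh random draw of 3 candidates.
forest = RandomForestClassifier(n_estimators=25, max_features=3,
                                random_state=0)
forest.fit(X, y)


def features_used(estimator):
    """Hypothetical helper: indices of the features a fitted tree splits on."""
    split_features = estimator.tree_.feature  # negative entries mark leaves
    return sorted(set(int(f) for f in split_features if f >= 0))


per_tree = [features_used(est) for est in forest.estimators_]
used_anywhere = sorted(set().union(*per_tree))

print("features split on by the first tree:", per_tree[0])
print("features split on somewhere in the forest:", used_anywhere)
print("feature importances:", np.round(forest.feature_importances_, 3))
```

On most datasets each tree ends up splitting on many (often all) of the features, which is the point made above: max_features limits the candidates per split, not the features available to a tree.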
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
