I read that Berkeley's doc [1] states that each tree randomly samples features
from all the input features. So I am curious whether those randomly sampled
features are preserved.

Thanks for the explanation.


[1]. http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm


----- Original Message -----
From: Olivier Grisel <[email protected]>
To: Aaron Jacques <[email protected]>; scikit-learn-general 
<[email protected]>
Cc: 
Sent: Thursday, 29 August 2013, 10:02
Subject: Re: [Scikit-learn-general] sample_weight and features in a single tree

In general, all the features are used by the decision tree (DT)
algorithm. The max_features parameter is just a way to control the
amount of randomization injected at each split during the learning
process of the trees used by the ExtraTrees* or RandomForest* classes.
But on average all features end up selected at one point or another:
hence there is no such thing as "selected features".
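
To make the point above concrete, here is a minimal sketch that lists which
feature indices each fitted tree actually splits on. The synthetic dataset,
the parameter values and the helper name features_used are assumptions for
illustration only; the tree_.feature array is the fitted tree's per-node
split feature (negative for leaves).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data, chosen only for illustration.
X, y = make_classification(n_samples=300, n_features=8, n_informative=4,
                           random_state=0)

# max_features=2: only 2 candidate features are examined at each split.
forest = RandomForestClassifier(n_estimators=20, max_features=2,
                                random_state=0).fit(X, y)


def features_used(estimator):
    """Return the set of feature indices a fitted tree splits on (hypothetical helper)."""
    node_features = estimator.tree_.feature
    # Leaf nodes carry a negative placeholder, so keep only real splits.
    return {int(i) for i in node_features[node_features >= 0]}


per_tree = [features_used(est) for est in forest.estimators_]
used_anywhere = set().union(*per_tree)

print("features used by the first tree:", sorted(per_tree[0]))
print("features used somewhere in the forest:", sorted(used_anywhere))
```

Because the candidate subset is redrawn at every split, a single tree
typically ends up splitting on many different features rather than on one
fixed subset.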

Ensembles of trees like ExtraTrees* or RandomForest* can however tell
you which features were the most important. See the examples mentioned
in this section of the documentation:

http://scikit-learn.org/dev/modules/ensemble.html#feature-importance-evaluation
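
As a rough sketch of what that section describes, the fitted ensemble exposes
a feature_importances_ attribute. The dataset and parameter values below are
assumptions made up for the example, not taken from the documentation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier

# Synthetic data, chosen only for illustration.
X, y = make_classification(n_samples=300, n_features=8, n_informative=3,
                           n_redundant=0, shuffle=False, random_state=0)

forest = ExtraTreesClassifier(n_estimators=50, random_state=0).fit(X, y)

# One importance score per input feature, normalized to sum to 1.
importances = forest.feature_importances_

# Feature indices ordered from most to least important.
ranking = np.argsort(importances)[::-1]
for idx in ranking:
    print("feature %d: %.3f" % (idx, importances[idx]))
```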

Finally, you can output a representation of an individual tree as a graph, see:

http://scikit-learn.org/dev/modules/tree.html#classification
http://scikit-learn.org/stable/modules/generated/sklearn.tree.export_graphviz.html
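
For instance, a small sketch of exporting one tree to Graphviz DOT text. It
assumes a scikit-learn version in which export_graphviz returns the DOT
source as a string when out_file=None; the iris data and max_depth value are
only for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_graphviz

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(iris.data, iris.target)

# With out_file=None the DOT description is returned as a string
# (assumption: a reasonably recent scikit-learn release).
dot_source = export_graphviz(clf, out_file=None,
                             feature_names=iris.feature_names)

# Show the beginning of the DOT text; it can be rendered with Graphviz.
print(dot_source[:200])
```

The same call works on a member of a forest, e.g. one element of the
estimators_ list of a fitted RandomForest* or ExtraTrees* model.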

The source code of that function might be a good example of how to walk
down the tree, for instance to mine the frequencies of consecutive
decision rules in a forest:

https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/tree/export.py
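
One possible way to do such a walk, sketched here under assumptions: it
counts (parent feature, child feature) pairs for consecutive splits, using
the children_left, children_right and feature arrays of each fitted tree_.
The helper name consecutive_feature_pairs, the dataset and the parameter
values are made up for the example; this is not the code from export.py.

```python
from collections import Counter

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data, chosen only for illustration.
X, y = make_classification(n_samples=300, n_features=6, n_informative=3,
                           random_state=0)
forest = RandomForestClassifier(n_estimators=10, max_depth=4,
                                random_state=0).fit(X, y)


def consecutive_feature_pairs(estimator):
    """Count (parent_feature, child_feature) pairs in one tree (hypothetical helper)."""
    t = estimator.tree_
    pairs = Counter()
    stack = [0]  # node 0 is the root
    while stack:
        node = stack.pop()
        left, right = t.children_left[node], t.children_right[node]
        if left == -1:  # leaf node: no split, no children
            continue
        for child in (left, right):
            # Only count the pair when the child is itself a split node.
            if t.children_left[child] != -1:
                pairs[(int(t.feature[node]), int(t.feature[child]))] += 1
            stack.append(child)
    return pairs


pair_counts = Counter()
for est in forest.estimators_:
    pair_counts.update(consecutive_feature_pairs(est))

print(pair_counts.most_common(5))
```

The thresholds of each split are available in the same way through
tree_.threshold if the full rule, not just the feature, is needed.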


_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general