In general, all the features are used by the decision tree algorithm.
The max_features parameter only controls the amount of randomization
injected at each split while growing the trees of the ExtraTrees* or
RandomForest* classes. On average, every feature ends up being
selected at one point or another, hence there is no such thing as
"selected features".
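
To make this concrete, here is a rough sketch that fits a forest with
max_features=1 and collects every feature index that shows up in at
least one split. The iris dataset, the number of trees, and the
variable names are my own choices for illustration, not something from
your setup.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import ExtraTreesClassifier

# Toy data (assumption: any small dataset would do for this check).
X, y = load_iris(return_X_y=True)

# max_features=1: a single randomly drawn feature is considered per split.
forest = ExtraTreesClassifier(n_estimators=50, max_features=1, random_state=0)
forest.fit(X, y)

# tree_.feature stores the feature index tested at each node;
# leaves carry a negative placeholder, so they are filtered out.
used = set()
for est in forest.estimators_:
    used.update(int(f) for f in est.tree_.feature if f >= 0)

print(sorted(used))
```

Even with the strongest randomization, the set of used features should
cover all the columns once enough trees have been grown.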

Ensembles of trees like ExtraTrees* or RandomForest* can, however, tell
you which features were the most important. See the examples mentioned
in this section of the documentation:

http://scikit-learn.org/dev/modules/ensemble.html#feature-importance-evaluation
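
As a minimal sketch of what those examples do, the fitted forest
exposes a feature_importances_ attribute that you can sort. The
dataset and the hyperparameters below are placeholders I picked for
illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Toy data (assumption: replace with your own X, y and feature names).
data = load_iris()
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(data.data, data.target)

# One non-negative score per feature, normalized to sum to 1.
importances = forest.feature_importances_

# Feature indices ordered from most to least important.
ranking = np.argsort(importances)[::-1]
for idx in ranking:
    print("%-20s %.3f" % (data.feature_names[idx], importances[idx]))
```

Keep in mind that this is a ranking, not a selection: low-ranked
features are still used by some of the trees.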

Finally, you can output a representation of an individual tree as a graph, see:

http://scikit-learn.org/dev/modules/tree.html#classification
http://scikit-learn.org/stable/modules/generated/sklearn.tree.export_graphviz.html
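
A short sketch of how that can look, assuming a reasonably recent
scikit-learn release where export_graphviz returns the DOT source as a
string when out_file=None (older releases may require a file name or
file object instead). The dataset and max_depth are arbitrary choices.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_graphviz

# Toy data and a shallow tree so the graph stays readable (assumption).
data = load_iris()
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(data.data, data.target)

# With out_file=None the DOT description is returned as a string
# (assumption: this holds for the installed scikit-learn version).
dot_source = export_graphviz(clf, out_file=None,
                             feature_names=data.feature_names)

# Show the beginning of the DOT description.
print(dot_source[:300])
```

You can save the string to a file such as tree.dot and render it with
the Graphviz command line tool, for example: dot -Tpng tree.dot -o tree.png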

The source code of that function might be a good example of how to walk
down the tree, for instance to mine the frequencies of consecutive
decision rules in a forest:

https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/tree/export.py
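
Here is one possible sketch of such a walk, not taken from export.py
itself. It relies on the low-level tree_ attribute (children_left,
children_right, feature) and counts how often a split on one feature is
directly followed by a split on another. Treating a "decision rule" as
just the feature index, ignoring the threshold, is my simplification.

```python
from collections import Counter

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Toy data and forest size are illustrative assumptions.
X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

def is_leaf(tree, node):
    # Leaves have both children set to the same sentinel value (-1).
    return tree.children_left[node] == tree.children_right[node]

# (parent feature, child feature) -> number of occurrences in the forest.
pair_counts = Counter()
for est in forest.estimators_:
    t = est.tree_
    for node in range(t.node_count):
        if is_leaf(t, node):
            continue
        for child in (t.children_left[node], t.children_right[node]):
            # Only count children that are themselves split nodes.
            if not is_leaf(t, child):
                pair = (int(t.feature[node]), int(t.feature[child]))
                pair_counts[pair] += 1

for (parent, child), n in pair_counts.most_common(5):
    print("feature %d then feature %d: %d times" % (parent, child, n))
```

The same loop can be extended to include t.threshold if you want to
distinguish rules on the same feature with different cut points.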

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
