Hi all. I'm reading through the RandomForest code (in dev, from a
1-week-old checkout). As I understand it (and similar to the previous
post - I have some RF usage experience but nothing fundamental), RF
learns each decision tree on a random sample of the examples *and*
considers only a random subset of features when splitting.
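For concreteness, this is my mental model of the per-split feature
subsampling - a sketch only, with names I made up, not scikit-learn's
actual code:

```python
import math
import random

def candidate_features(n_features, max_features=None, rng=random):
    """Sketch of per-split feature subsampling as I understand RF.

    Many RF descriptions use sqrt(n_features) candidate features per
    split by default; that assumption (and every name here) is mine.
    """
    if max_features is None:
        max_features = max(1, int(math.sqrt(n_features)))
    # Sample feature indices without replacement.
    return rng.sample(range(n_features), max_features)

# e.g. 100 features -> 10 candidates per split under the sqrt default
print(len(candidate_features(100)))  # -> 10
```

Is something along these lines what happens inside the tree builder?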

Does the scikit-learn implementation use a random subset of features?
I've followed the code in forest.py and I can't find where the choice
might be made. I haven't looked at the C code for the DecisionTree.

I'd also like to know the lower bound on the number of random features
that can be chosen.
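To make that question concrete: my assumption (not verified against
the C code) is that the floor is a single feature per split, i.e.
anything lower gets clamped, roughly like this (hypothetical names):

```python
import random

def clamped_feature_subset(n_features, max_features):
    # Clamp to at least one feature and at most all of them;
    # this clamping behaviour is my guess, not scikit-learn's code.
    k = max(1, min(max_features, n_features))
    return random.sample(range(n_features), k)

print(len(clamped_feature_subset(20, 0)))  # -> 1, the floor I'm asking about
```

Is one feature per split the actual lower bound, or is it an error to
ask for fewer?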

I'm also curious where we can restrict the depth of the RandomForest
classifier. All I can see is that in forest.py the constructor accepts
a max_depth argument but appears to ignore it:
class RandomForestClassifier(ForestClassifier):
...
    def __init__(self,
                 n_estimators=10,
                 criterion="gini",
                 max_depth=None,
...
        super(RandomForestClassifier, self).__init__(
            base_estimator=DecisionTreeClassifier(),
...

`_make_estimator` in base.py just clones the existing `base_estimator`,
so I don't see where max_depth reaches the trees. Am I missing
something?
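What I expected to find somewhere is the forest-level arguments being
pushed down onto each cloned tree, roughly like this (all names here
are illustrative stand-ins, not the real scikit-learn API):

```python
import copy

class Estimator:
    """Stand-in for DecisionTreeClassifier; illustrative only."""
    def __init__(self, max_depth=None):
        self.max_depth = max_depth

def make_estimator(base_estimator, **params):
    # Clone the template estimator, then copy the forest-level
    # parameters (e.g. max_depth) onto the clone.
    est = copy.deepcopy(base_estimator)
    for name, value in params.items():
        setattr(est, name, value)
    return est

tree = make_estimator(Estimator(), max_depth=5)
print(tree.max_depth)  # -> 5
```

Does something equivalent to that second step happen anywhere, or is
max_depth genuinely dropped?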

Thanks for listening,
Ian.

--
Ian Ozsvald (A.I. researcher)
i...@ianozsvald.com

http://IanOzsvald.com
http://MorConsulting.com/
http://Annotate.IO
http://SocialTiesApp.com/
http://TheScreencastingHandbook.com
http://FivePoundApp.com/
http://twitter.com/IanOzsvald
http://ShowMeDo.com

_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
