Your intuition is correct. For a decision tree with max_features=None, every feature is considered at each split, and the random_state is used to break ties randomly when several candidate splits are equally good.
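To make the tie-breaking concrete, here is a rough pure-Python sketch of the exhaustive split search from your steps 1 and 2. It is not scikit-learn's actual code: the names `sse` and `best_split` are made up for illustration, and it assumes that features are visited in a seed-dependent random order and that a candidate only replaces the current best when it is strictly better. It minimizes the summed squared error of the two child regions, which gives the same ranking of splits as the MSE of the resulting tree.

```python
import random


def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(X, y, seed):
    """Exhaustively search every feature and cutpoint for the lowest error.

    Hypothetical helper: features are visited in a seed-dependent random
    order, and a candidate wins only if it is strictly better than the
    current best, so ties are decided by the visiting order.
    """
    rng = random.Random(seed)
    features = list(range(len(X[0])))
    rng.shuffle(features)

    best = None  # (error, feature index, threshold)
    for f in features:
        values = sorted(set(row[f] for row in X))
        # Candidate cutpoints: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [t for row, t in zip(X, y) if row[f] <= threshold]
            right = [t for row, t in zip(X, y) if row[f] > threshold]
            error = sse(left) + sse(right)
            if best is None or error < best[0]:
                best = (error, f, threshold)
    return best


# Two identical columns: every split on feature 0 ties with one on feature 1.
X = [[1, 1], [2, 2], [3, 3], [4, 4]]
y = [1.0, 1.0, 5.0, 5.0]

# Collect the feature chosen at the root for a range of seeds.
chosen = {best_split(X, y, seed)[1] for seed in range(20)}
print(chosen)
```

Under these assumptions the search is exhaustive for every seed, so the error and cutpoint of the winning split never change. Only the feature picked among the tied candidates depends on the seed, which is the role random_state plays here.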
Cheers,
Arnaud

> On 14 Oct 2015, at 17:33, Kevin Markham <justmark...@gmail.com> wrote:
>
> Hello,
>
> I'm a data science instructor that uses scikit-learn extensively in the
> classroom. Yesterday I was teaching decision trees, and I summarized the tree
> building process (for regression trees) as follows:
>
> 1. Begin at the top of the tree.
> 2. For every feature, examine every possible cutpoint, and choose the feature
> and cutpoint such that the resulting tree has the lowest possible mean
> squared error (MSE). Make that split.
> 3. Examine the two resulting regions, and again make a single split (in one
> of the regions) to minimize the MSE.
> 4. Keep repeating step 3 until a stopping criterion is met.
>
> One question that came up is why there is a random_state parameter for a
> DecisionTreeRegressor (or a DecisionTreeClassifier). Assuming that an
> exhaustive search is performed before each split (meaning that every possible
> cutpoint is checked for every feature), it is not obvious to me what
> randomness is used during the tree building process, such that a random_state
> is necessary.
>
> My best guesses were that the random_state is used for tiebreaking, or
> perhaps that the search for the best split is not exhaustive and thus
> random_state affects the way in which the search is performed.
>
> In summary, I am asking: Why is a random_state necessary for decision trees?
>
> As a corollary, I am asking: Am I correctly representing how a decision tree
> is built?
>
> Thank you very much!
> Kevin Markham
> ------------------------------------------------------------------------------
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general