Hi Yang, I think I understand it better now, as well. So this is what I think it does:
First of all, I think it only affects the splits on categorical attributes. It works as follows: consider a dataset D that we want to build a decision tree from. Say the tree has been partially built, and we've reached a node where we want to split on a categorical attribute C. As I understand it, when parametrized = false, at that node we might only branch on a subset of the possible values of C (presumably those present in the data that reached the node). When parametrized = true, however, we 'force' branching on all possible values of C from the entire dataset, and fill in the branches that have no data with leaves whose label is computed from the parent data (line 307):

    if (data.getDataset().isNumerical(data.getDataset().getLabelId())) {
      label = sum / data.size();
    } else {
      label = data.majorityLabel(rng);
    }

In other words, if the label is numerical the leaf gets sum / data.size() (the mean, assuming sum is the total of the parent's label values); otherwise it gets the parent's majority label.

I hope this is correct and helps with understanding it better.

Also, I found this <https://issues.apache.org/jira/browse/MAHOUT-840>; it's the Jira issue that introduced the DecisionTreeBuilder. Take a look at the comments, maybe they'll help you as well.

Anca
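
P.S. To make the difference concrete, here is a rough, self-contained sketch of the behaviour as I understand it. It is not Mahout code: the class CategoricalSplitSketch, its methods, and the forceAllValues flag (standing in for the parameter discussed above) are names I made up for illustration. Rows are simply {value of C, label} pairs, every branch is treated as a leaf, and only the categorical-label case (majority label) is covered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Mahout.
public class CategoricalSplitSketch {

    // Most frequent label among the rows (row[1]); ties go to the label seen first.
    // This stands in for data.majorityLabel(rng) in the snippet above.
    static String majorityLabel(List<String[]> rows) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String[] row : rows) {
            counts.merge(row[1], 1, Integer::sum);
        }
        String best = null;
        int bestCount = -1;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }

    // Splits the rows at a node on the categorical attribute C (row[0]) and
    // returns a map from each branch value to the label of its leaf.
    static Map<String, String> split(List<String[]> nodeRows,
                                     List<String> allValues,
                                     boolean forceAllValues) {
        // Group the rows that reached this node by their value of C.
        Map<String, List<String[]>> byValue = new LinkedHashMap<>();
        for (String[] row : nodeRows) {
            byValue.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row);
        }
        // One branch per value that actually occurs at this node.
        Map<String, String> branches = new LinkedHashMap<>();
        for (Map.Entry<String, List<String[]>> e : byValue.entrySet()) {
            branches.put(e.getKey(), majorityLabel(e.getValue()));
        }
        if (forceAllValues) {
            // Values of C with no rows at this node get a leaf labelled
            // from the parent data, as in the snippet above.
            String parentLabel = majorityLabel(nodeRows);
            for (String value : allValues) {
                branches.putIfAbsent(value, parentLabel);
            }
        }
        return branches;
    }

    public static void main(String[] args) {
        List<String[]> nodeRows = Arrays.asList(
            new String[] {"red", "yes"},
            new String[] {"red", "yes"},
            new String[] {"blue", "no"});
        // "green" occurs in the full dataset but not at this node.
        List<String> allValues = Arrays.asList("red", "blue", "green");

        System.out.println(split(nodeRows, allValues, false)); // {red=yes, blue=no}
        System.out.println(split(nodeRows, allValues, true));  // {red=yes, blue=no, green=yes}
    }
}
```

With the flag off, an instance with C = green would find no branch at this node; with it on, it lands on a leaf carrying the parent's majority label.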
