GitHub user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/7294#discussion_r34624195
--- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/Node.scala ---
@@ -209,3 +209,122 @@ private object InternalNode {
}
}
}
+
+/**
+ * Version of a node used in learning. This uses vars so that we can modify nodes as we split the
+ * tree by adding children, etc.
+ *
+ * For now, we use node IDs. These will be kept internal since we hope to remove node IDs
+ * in the future, or at least change the indexing (so that we can support much deeper trees).
+ *
+ * This node can either be:
+ * - a leaf node, with leftChild, rightChild, split set to null, or
+ * - an internal node, with all values set
+ *
+ * @param id We currently use the same indexing as the spark.mllib implementation,
--- End diff --
* Explain how `id` is set, or link to the old implementation. Without that,
it may become hard to trace back later.
* minor: It would be useful to document the other parameters as well, such
as `predictionStats` and `stats`.
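
For reference, a minimal sketch of the kind of indexing the `@param id` doc could spell out, assuming the spark.mllib scheme is the usual 1-based binary-heap layout (root = 1, children of `id` at `2 * id` and `2 * id + 1`). `NodeIdSketch` and its method names are hypothetical, chosen for illustration; they are not taken from this PR.

```scala
// Illustrative only: NodeIdSketch is a hypothetical helper, not a Spark class.
// Assumes 1-based binary-heap indexing of tree nodes.
object NodeIdSketch {
  /** ID of the root node. */
  val RootId: Int = 1

  /** ID of the left child of node `id`. */
  def leftChildId(id: Int): Int = id << 1

  /** ID of the right child of node `id`. */
  def rightChildId(id: Int): Int = (id << 1) + 1

  /** ID of the parent of node `id` (only meaningful for id > 1). */
  def parentId(id: Int): Int = id >> 1

  /** Depth of node `id`, with the root at level 0. */
  def level(id: Int): Int = {
    require(id > 0, s"node IDs start at 1, got $id")
    // Position of the highest set bit gives the level.
    java.lang.Integer.numberOfTrailingZeros(java.lang.Integer.highestOneBit(id))
  }

  /** True if node `id` is the left child of its parent. */
  def isLeftChild(id: Int): Boolean = id > 1 && id % 2 == 0
}
```

Under this assumption, IDs at level `d` span `2^d` to `2^(d+1) - 1`, so an `Int` ID overflows past level 30. That would be the limitation the "much deeper trees" remark in the doc refers to.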