GitHub user sethah commented on a diff in the pull request:
https://github.com/apache/spark/pull/20632#discussion_r170410834
--- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/Node.scala ---
@@ -270,11 +269,21 @@ private[tree] class LearningNode(
* Convert this [[LearningNode]] to a regular [[Node]], and recurse on any children.
*/
def toNode: Node = {
- if (leftChild.nonEmpty) {
- assert(rightChild.nonEmpty && split.nonEmpty && stats != null,
+
+ // convert to an inner node only when:
+ // -) the node is not a leaf, and
+ // -) the subtree rooted at this node cannot be replaced by a single leaf
+ // (i.e., there at least two different leaf predictions appear in the subtree)
--- End diff --
This comment seems out of place now. You might just say
`// when both children make the same prediction, collapse into single leaf`
or something similar below the first case statement.
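
For illustration only, here is a minimal sketch of where such a comment could sit. It is not the code from this PR: `SimpleNode`, `SimpleLeaf`, `SimpleInternal`, and `PruneSketch.prune` are hypothetical stand-ins for the real `LearningNode`/`Node` classes, and reusing the children's shared prediction for the collapsed leaf is an assumption of the sketch.

```scala
// Hypothetical, simplified stand-ins for the tree node types; not Spark's API.
sealed trait SimpleNode { def prediction: Double }
final case class SimpleLeaf(prediction: Double) extends SimpleNode
final case class SimpleInternal(
    prediction: Double,
    left: SimpleNode,
    right: SimpleNode) extends SimpleNode

object PruneSketch {
  /** Convert bottom-up, collapsing any subtree whose leaves all agree. */
  def prune(node: SimpleNode): SimpleNode = node match {
    case leaf: SimpleLeaf => leaf
    case SimpleInternal(pred, left, right) =>
      (prune(left), prune(right)) match {
        // when both children make the same prediction, collapse into single leaf
        case (SimpleLeaf(l), SimpleLeaf(r)) if l == r => SimpleLeaf(l)
        // otherwise keep an inner node over the (already pruned) children
        case (l, r) => SimpleInternal(pred, l, r)
      }
  }
}
```

Because the children are pruned first, a chain of redundant splits collapses all the way up, so the short comment on the first case is enough to explain the behavior.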