GitHub user WeichenXu123 opened a pull request: https://github.com/apache/spark/pull/20758
[SPARK-14681][ML] Provide label/impurity stats for spark.ml decision tree nodes

## What changes were proposed in this pull request?

Provide label/impurity stats for spark.ml decision tree nodes.

API:
```
class TreeClassifierStatInfo
  def getLabelCount(label: Int): Double

class TreeRegressorStatInfo
  def getCount(): Double
  def getSum(): Double
  def getSquareSum(): Double

class Node
  ....
  +++ def statInfo: TreeStatInfo

trait TreeStatInfo
  def asTreeClassifierStatInfo: TreeClassifierStatInfo
  def asTreeRegressorStatInfo: TreeRegressorStatInfo
```

## How was this patch tested?

UT added.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/WeichenXu123/spark tree_stat_api

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20758.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20758

----
commit e57ffaaad1666577d956c1f8f734f97569b93969
Author: WeichenXu <weichen.xu@...>
Date:   2018-03-07T10:37:22Z

    init pr

----
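
For readers who want to see how the API outlined in the description might be used, here is a minimal stand-alone sketch. It is not the code from this pull request and does not depend on Spark. The class and method names follow the outline above, but the constructors, the exception behavior of the `as...` accessors, and the `TreeStatSketch` helper object are assumptions made purely for illustration. It shows one plausible use of the regression stats: recovering the mean and population variance of the labels at a node from the count, sum, and sum of squares.

```scala
// Hypothetical stand-alone model of the API outlined in the PR description.
// It does not depend on Spark and is not the PR's implementation.
sealed trait TreeStatInfo {
  // Narrowing accessors (assumed behavior): fail when the node holds the other kind of stats.
  def asTreeClassifierStatInfo: TreeClassifierStatInfo = this match {
    case c: TreeClassifierStatInfo => c
    case _ => throw new UnsupportedOperationException("node holds regression stats")
  }
  def asTreeRegressorStatInfo: TreeRegressorStatInfo = this match {
    case r: TreeRegressorStatInfo => r
    case _ => throw new UnsupportedOperationException("node holds classification stats")
  }
}

// Assumed constructor: per-label counts, indexed by label.
final class TreeClassifierStatInfo(labelCounts: Array[Double]) extends TreeStatInfo {
  def getLabelCount(label: Int): Double = labelCounts(label)
}

// Assumed constructor: count, sum, and sum of squares of the labels at the node.
final class TreeRegressorStatInfo(count: Double, sum: Double, squareSum: Double)
    extends TreeStatInfo {
  def getCount(): Double = count
  def getSum(): Double = sum
  def getSquareSum(): Double = squareSum
}

// Hypothetical helper object, not part of the proposed API.
object TreeStatSketch {
  // Mean label at a regression node.
  def mean(s: TreeRegressorStatInfo): Double = s.getSum() / s.getCount()

  // Population variance: E[y^2] - (E[y])^2.
  def variance(s: TreeRegressorStatInfo): Double = {
    val m = mean(s)
    s.getSquareSum() / s.getCount() - m * m
  }

  def main(args: Array[String]): Unit = {
    // Labels 1, 2, 3, 4: count = 4, sum = 10, sum of squares = 30.
    val stats: TreeStatInfo = new TreeRegressorStatInfo(4.0, 10.0, 30.0)
    val reg = stats.asTreeRegressorStatInfo
    println(s"mean=${mean(reg)} variance=${variance(reg)}")
  }
}
```

In this sketch a caller who only has a `TreeStatInfo` narrows it with `asTreeRegressorStatInfo` or `asTreeClassifierStatInfo` depending on the model type. Whether the real accessors throw, return an option, or behave differently is not stated in the description.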