[GitHub] spark pull request #20758: [SPARK-14681][ML] Provide label/impurity stats fo...

2018-03-09 Thread WeichenXu123
Github user WeichenXu123 closed the pull request at:

https://github.com/apache/spark/pull/20758


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20758: [SPARK-14681][ML] Provide label/impurity stats fo...

2018-03-07 Thread WeichenXu123
GitHub user WeichenXu123 opened a pull request:

https://github.com/apache/spark/pull/20758

[SPARK-14681][ML] Provide label/impurity stats for spark.ml decision tree nodes

## What changes were proposed in this pull request?

Provide label/impurity stats for spark.ml decision tree nodes.

API:
```
class TreeClassifierStatInfo
   def getLabelCount(label: Int): Double

class TreeRegressorStatInfo
   def getCount(): Double
   def getSum(): Double
   def getSquareSum(): Double

class Node
   def statInfo: TreeStatInfo   // added by this PR

trait TreeStatInfo
   def asTreeClassifierStatInfo: TreeClassifierStatInfo
   def asTreeRegressorStatInfo: TreeRegressorStatInfo
```
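
As a rough illustration of what these accessors would let a caller compute, here is a minimal, self-contained sketch. The classes below are hypothetical stand-ins that only mirror the signatures listed above; they are not the PR's implementation and do not depend on Spark. The constructors, the helper names `gini`, `mean` and `variance`, and the choice to throw on a mismatched `as...` cast are all assumptions made for the example.

```scala
// Hypothetical stand-ins mirroring the API listing; not Spark's classes.
object TreeStatSketch {

  trait TreeStatInfo {
    def asTreeClassifierStatInfo: TreeClassifierStatInfo
    def asTreeRegressorStatInfo: TreeRegressorStatInfo
  }

  // Assumed to hold one (possibly weighted) count per label index.
  final class TreeClassifierStatInfo(labelCounts: Array[Double]) extends TreeStatInfo {
    def getLabelCount(label: Int): Double = labelCounts(label)
    def asTreeClassifierStatInfo: TreeClassifierStatInfo = this
    // Assumption: a mismatched cast fails loudly.
    def asTreeRegressorStatInfo: TreeRegressorStatInfo =
      throw new UnsupportedOperationException("not a regression node")
  }

  // Assumed to hold the count, sum and sum of squares of the labels at a node.
  final class TreeRegressorStatInfo(count: Double, sum: Double, squareSum: Double)
      extends TreeStatInfo {
    def getCount(): Double = count
    def getSum(): Double = sum
    def getSquareSum(): Double = squareSum
    def asTreeClassifierStatInfo: TreeClassifierStatInfo =
      throw new UnsupportedOperationException("not a classification node")
    def asTreeRegressorStatInfo: TreeRegressorStatInfo = this
  }

  // Gini impurity from per-label counts: 1 - sum_i (c_i / total)^2.
  def gini(stats: TreeClassifierStatInfo, numClasses: Int): Double = {
    val counts = (0 until numClasses).map(i => stats.getLabelCount(i))
    val total = counts.sum
    if (total == 0.0) 0.0
    else 1.0 - counts.map(c => (c / total) * (c / total)).sum
  }

  // Mean label at a regression node.
  def mean(stats: TreeRegressorStatInfo): Double =
    stats.getSum() / stats.getCount()

  // Population variance: E[y^2] - (E[y])^2.
  def variance(stats: TreeRegressorStatInfo): Double = {
    val m = mean(stats)
    stats.getSquareSum() / stats.getCount() - m * m
  }
}
```

With these stand-ins, a classification node holding label counts `Array(3.0, 1.0)` has a Gini impurity of 0.375, and a regression node built from the labels 1, 2, 3, 4 (count 4, sum 10, square sum 30) has mean 2.5 and variance 1.25.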

## How was this patch tested?

Unit tests added.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/WeichenXu123/spark tree_stat_api

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20758.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20758


commit e57ffaaad1666577d956c1f8f734f97569b93969
Author: WeichenXu 
Date:   2018-03-07T10:37:22Z

init pr



