Repository: spark
Updated Branches:
  refs/heads/master 663f30d14 -> 1426eea84


[SPARK-21623][ML] fix RF doc

## What changes were proposed in this pull request?

The comment on parentStats in the random forest (RF) code is wrong.
parentStats is not used only on the first iteration; it is used on every
iteration for unordered features.
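
To illustrate why parent stats matter on every iteration for unordered features, here is a minimal sketch. It assumes that, for an unordered categorical feature, only the left-child stats are aggregated per candidate split and the right-child stats are derived by subtracting them from the parent stats. `ParentStatsSketch`, `Stats`, `subtract`, and `gini` are hypothetical names for illustration, not Spark's API, and plain per-class counts stand in for the real sufficient statistics.

```scala
// Hypothetical, simplified sketch; this is NOT Spark's DTStatsAggregator.
object ParentStatsSketch {
  /** Per-class label counts stand in for the sufficient statistics. */
  type Stats = Array[Double]

  /** Element-wise difference: right-child stats = parent stats - left-child stats. */
  def subtract(parent: Stats, left: Stats): Stats = {
    require(parent.length == left.length, "stats arrays must have equal length")
    parent.zip(left).map { case (p, l) => p - l }
  }

  /** Gini impurity computed from class counts (0.0 for an empty node). */
  def gini(stats: Stats): Double = {
    val total = stats.sum
    if (total == 0.0) 0.0
    else 1.0 - stats.map(c => (c / total) * (c / total)).sum
  }

  def main(args: Array[String]): Unit = {
    // Assumed example data: 10 rows at the parent node, 6 of class 0 and 4 of class 1.
    val parentStats: Stats = Array(6.0, 4.0)
    // Stats collected for the left child of one candidate split.
    val leftStats: Stats = Array(5.0, 1.0)
    // The right child is never aggregated directly in this sketch; it is derived.
    val rightStats = subtract(parentStats, leftStats)
    println(s"right child counts: ${rightStats.mkString(", ")}")
    println(s"right child gini: ${gini(rightStats)}")
  }
}
```

In this sketch the right-child stats cannot be recovered without the parent stats, so they have to be available whenever an unordered feature is evaluated, not just on the first iteration.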

## How was this patch tested?

Author: Peng Meng <peng.m...@intel.com>

Closes #18832 from mpjlu/fixRFDoc.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1426eea8
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/1426eea8
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/1426eea8

Branch: refs/heads/master
Commit: 1426eea84c544000273d176514532cb7f7015cea
Parents: 663f30d
Author: Peng Meng <peng.m...@intel.com>
Authored: Mon Aug 7 11:03:07 2017 +0100
Committer: Sean Owen <so...@cloudera.com>
Committed: Mon Aug 7 11:03:07 2017 +0100

----------------------------------------------------------------------
 .../org/apache/spark/ml/tree/impl/DTStatsAggregator.scala      | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/1426eea8/mllib/src/main/scala/org/apache/spark/ml/tree/impl/DTStatsAggregator.scala
----------------------------------------------------------------------
diff --git a/mllib/src/main/scala/org/apache/spark/ml/tree/impl/DTStatsAggregator.scala b/mllib/src/main/scala/org/apache/spark/ml/tree/impl/DTStatsAggregator.scala
index 61091bb..5aeea14 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/tree/impl/DTStatsAggregator.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/tree/impl/DTStatsAggregator.scala
@@ -78,9 +78,9 @@ private[spark] class DTStatsAggregator(
 
   /**
    * Array of parent node sufficient stats.
-   *
-   * Note: this is necessary because stats for the parent node are not available
-   *       on the first iteration of tree learning.
+   * Note: parent stats need to be explicitly tracked in the [[DTStatsAggregator]] for unordered
+   *       categorical features, because the parent [[Node]] object does not have [[ImpurityStats]]
+   *       on the first iteration.
    */
   private val parentStats: Array[Double] = new Array[Double](statsSize)
 

