[GitHub] spark pull request #20701: [SPARK-23528][ML] Add numIter to ClusteringSummar...

mgaido91 Sat, 10 Mar 2018 02:56:44 -0800

Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20701#discussion_r173619104
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/clustering/BisectingKMeansSuite.scala ---
    @@ -127,6 +128,7 @@ class BisectingKMeansSuite
         assert(clusterSizes.length === k)
         assert(clusterSizes.sum === numRows)
         assert(clusterSizes.forall(_ >= 0))
    +    assert(summary.numIter == 2)
    --- End diff --
    
    In `KMeansSuite` the expected value is not `maxIter`, because KMeans performs only 1 iteration in that case. In `BisectingKMeans`, by contrast, `numIter` is always equal to `maxIter`, since the algorithm always performs `maxIter` iterations (see 
https://github.com/apache/spark/blob/b6f837c9d3cb0f76f0a52df37e34aea8944f6867/mllib/src/main/scala/org/apache/spark/mllib/clustering/BisectingKMeans.scala#L192).
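
    To make the difference concrete, here is a minimal plain-Scala sketch of the two loop shapes. It is not Spark's code: `NumIterSketch`, `iterationsWithEarlyStop`, `iterationsFixed` and the `converged` callback are hypothetical names, and it assumes the KMeans run stops early because a convergence check fires.

```scala
// Illustrative sketch only; these names are hypothetical and this is not
// the Spark implementation.
object NumIterSketch {

  // KMeans-style loop (assumed): stops as soon as `converged` returns true,
  // so the reported iteration count can be smaller than maxIter.
  def iterationsWithEarlyStop(maxIter: Int)(converged: Int => Boolean): Int = {
    var iter = 0
    var done = false
    while (iter < maxIter && !done) {
      iter += 1
      done = converged(iter)
    }
    iter
  }

  // BisectingKMeans-style loop as described in the comment: no convergence
  // check, so the count always equals maxIter.
  def iterationsFixed(maxIter: Int): Int = {
    var iter = 0
    while (iter < maxIter) {
      iter += 1
    }
    iter
  }
}
```

    With `maxIter = 2` the fixed loop reports 2, which is what the new assertion in `BisectingKMeansSuite` checks, while the early-stopping loop can report 1 even when `maxIter` is larger.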
    
    Does that answer your comment?



