[GitHub] spark pull request #22756: [SPARK-25758][ML] Deprecate computeCost on Bisect...

dongjoon-hyun Wed, 17 Oct 2018 23:28:16 -0700

Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22756#discussion_r226181011
  
    --- Diff: examples/src/main/java/org/apache/spark/examples/ml/JavaBisectingKMeansExample.java ---
    @@ -50,9 +51,14 @@ public static void main(String[] args) {
         BisectingKMeans bkm = new BisectingKMeans().setK(2).setSeed(1);
         BisectingKMeansModel model = bkm.fit(dataset);
     
    -    // Evaluate clustering.
    -    double cost = model.computeCost(dataset);
    -    System.out.println("Within Set Sum of Squared Errors = " + cost);
    +    // Make predictions
    +    Dataset<Row> predictions = model.transform(dataset);
    +
    +    // Evaluate clustering by computing Silhouette score
    +    ClusteringEvaluator evaluator = new ClusteringEvaluator();
    +
    +    double silhouette = evaluator.evaluate(predictions);
    +    System.out.println("Silhouette with squared euclidean distance = " + silhouette);
    --- End diff --
    
    @mgaido91,
    If we are going to update all the `ml` examples for this deprecation, we should change the following one, too:
    - https://github.com/apache/spark/blob/master/examples/src/main/python/ml/bisecting_k_means_example.py#L45
    ```python
        # Evaluate clustering.
        cost = model.computeCost(dataset)
    ```
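
For context on what the metric swap in the diff means: the removed call printed a "Within Set Sum of Squared Errors", while the added code prints a "Silhouette with squared euclidean distance". The sketch below is plain Python, not Spark code. The helper names `wssse` and `mean_silhouette` and the toy data are hypothetical, and it assumes the textbook silhouette definition, which may differ in detail from what Spark computes.

```python
from collections import defaultdict


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def group_by_label(points, labels):
    """Map each cluster label to the list of points assigned to it."""
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    return clusters


def wssse(points, labels):
    """Sum of squared distances from each point to its cluster centroid."""
    total = 0.0
    for members in group_by_label(points, labels).values():
        centroid = [sum(coord) / len(members) for coord in zip(*members)]
        total += sum(sq_dist(p, centroid) for p in members)
    return total


def mean_silhouette(points, labels):
    """Mean silhouette over all points; requires at least two clusters."""
    clusters = group_by_label(points, labels)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            # Convention: a singleton cluster contributes a score of 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point itself adds 0 to the sum, so divide by len - 1).
        a = sum(sq_dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(sq_dist(point, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Two tight, well-separated clusters (toy data).
    pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    lbl = [0, 0, 1, 1]
    print("WSSSE =", wssse(pts, lbl))
    print("Silhouette =", mean_silhouette(pts, lbl))
```

The two numbers are not interchangeable: the squared-error sum is unbounded and lower is better, whereas the silhouette lies in [-1, 1] and higher is better, which is why the label in the printed message changes together with the call.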



