GitHub user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/22756#discussion_r226181011
--- Diff:
examples/src/main/java/org/apache/spark/examples/ml/JavaBisectingKMeansExample.java
---
@@ -50,9 +51,14 @@ public static void main(String[] args) {
BisectingKMeans bkm = new BisectingKMeans().setK(2).setSeed(1);
BisectingKMeansModel model = bkm.fit(dataset);
- // Evaluate clustering.
- double cost = model.computeCost(dataset);
- System.out.println("Within Set Sum of Squared Errors = " + cost);
+ // Make predictions
+ Dataset<Row> predictions = model.transform(dataset);
+
+ // Evaluate clustering by computing Silhouette score
+ ClusteringEvaluator evaluator = new ClusteringEvaluator();
+
+ double silhouette = evaluator.evaluate(predictions);
+ System.out.println("Silhouette with squared euclidean distance = " + silhouette);
--- End diff --
@mgaido91, if we are going to update all the `ml` examples to account for the
`computeCost` deprecation, we should also update the following Python example:
- https://github.com/apache/spark/blob/master/examples/src/main/python/ml/bisecting_k_means_example.py#L45
```python
# Evaluate clustering.
cost = model.computeCost(dataset)
```
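For reference, the Python example could mirror the Java change above along these lines. This is only a sketch, not the final patch: the helper name `silhouette_score` and its optional `evaluator` parameter are hypothetical and exist purely for illustration, while `ClusteringEvaluator` is the existing class in `pyspark.ml.evaluation`.

```python
def silhouette_score(model, dataset, evaluator=None):
    """Return the Silhouette score of the model's cluster assignments.

    `silhouette_score` and the `evaluator` parameter are hypothetical names
    used for this sketch; they are not part of the Spark examples.
    """
    if evaluator is None:
        # Imported lazily so that a caller can inject its own evaluator.
        from pyspark.ml.evaluation import ClusteringEvaluator
        evaluator = ClusteringEvaluator()

    # Make predictions
    predictions = model.transform(dataset)

    # Evaluate clustering by computing Silhouette score
    return evaluator.evaluate(predictions)
```

In the example script itself this would replace the `computeCost` call, e.g. `print("Silhouette with squared euclidean distance = " + str(silhouette_score(model, dataset)))`, matching the message printed by the Java version.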