[GitHub] spark pull request #19676: [SPARK-14516][FOLLOWUP] Adding ClusteringEvaluato...

srowen Sat, 09 Dec 2017 10:27:19 -0800

Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19676#discussion_r155928871
  
    --- Diff: examples/src/main/java/org/apache/spark/examples/ml/JavaKMeansExample.java ---
    @@ -51,9 +52,14 @@ public static void main(String[] args) {
         KMeans kmeans = new KMeans().setK(2).setSeed(1L);
         KMeansModel model = kmeans.fit(dataset);
     
    -    // Evaluate clustering by computing Within Set Sum of Squared Errors.
    -    double WSSSE = model.computeCost(dataset);
    -    System.out.println("Within Set Sum of Squared Errors = " + WSSSE);
    +    // Make predictions
    +    Dataset<Row> predictions = model.transform(dataset);
    +
    +    // Evaluate clustering by computing Silhouette score
    +    ClusteringEvaluator evaluator = new ClusteringEvaluator();
    +
    +    double silhouette = evaluator.evaluate(predictions);
    +    System.out.println("Silhouette with squared euclidean distance = " + silhouette);
    --- End diff --
    
    euclidean -> Euclidean, but not important to change unless you're touching the code again anyway.


