GitHub user mgaido91 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20396#discussion_r164061850
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/evaluation/ClusteringEvaluatorSuite.scala
---
@@ -66,16 +66,38 @@ class ClusteringEvaluatorSuite
assert(evaluator.evaluate(irisDataset) ~== 0.6564679231 relTol 1e-5)
}
- test("number of clusters must be greater than one") {
- val singleClusterDataset = irisDataset.where($"label" === 0.0)
+ /*
+ Use the following python code to load the data and evaluate it using
scikit-learn package.
--- End diff --
This is the same as for `squaredEuclidean`, where this comment format was
adopted following https://github.com/apache/spark/pull/18538#discussion_r131100309
---