GitHub user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/20396#discussion_r164112428
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/evaluation/ClusteringEvaluatorSuite.scala
---
@@ -66,16 +66,38 @@ class ClusteringEvaluatorSuite
assert(evaluator.evaluate(irisDataset) ~== 0.6564679231 relTol 1e-5)
}
- test("number of clusters must be greater than one") {
- val singleClusterDataset = irisDataset.where($"label" === 0.0)
+ /*
+ Use the following python code to load the data and evaluate it using
scikit-learn package.
--- End diff --
I see, the idea is to make it more copy-pasteable. That's fine.
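
For readers without the full diff: the quoted comment refers to a scikit-learn snippet that is cut off here and is not reproduced. The following is only a minimal, dependency-free sketch of the kind of reference check involved. It assumes the metric under test is the silhouette score with squared Euclidean distance. The function names, the toy data, and the choice of ValueError for the single-cluster case are all hypothetical and are not taken from the pull request.

```python
from collections import defaultdict


def sq_euclidean(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def silhouette_score(points, labels, dist=sq_euclidean):
    """Mean silhouette coefficient over all points (illustrative sketch)."""
    # Group point indices by cluster label.
    clusters = defaultdict(list)
    for idx, label in enumerate(labels):
        clusters[label].append(idx)
    # Silhouette is undefined for a single cluster; raising ValueError
    # is an assumption of this sketch, not Spark's behaviour.
    if len(clusters) < 2:
        raise ValueError("number of clusters must be greater than one")

    total = 0.0
    for i, (p, label) in enumerate(zip(points, labels)):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes a silhouette of 0
        # a: mean distance to the other members of the point's own cluster.
        a = sum(dist(p, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(p, points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Hypothetical toy data: two tight, well-separated clusters.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
toy_labels = [0, 0, 1, 1]
toy_score = silhouette_score(toy_points, toy_labels)
```

If scikit-learn is available, `sklearn.metrics.silhouette_score(X, labels, metric='sqeuclidean')` is the library call for the same quantity and could be used to cross-check a sketch like this one.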
---