Github user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/20396#discussion_r164111264
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala
---
@@ -84,18 +81,39 @@ class ClusteringEvaluator @Since("2.3.0") (@Since("2.3.0") override val uid: Str
@Since("2.3.0")
def setMetricName(value: String): this.type = set(metricName, value)
- setDefault(metricName -> "silhouette")
+ /**
+ * param for distance measure to be used in evaluation
+ * (supports `"squaredEuclidean"` (default), `"cosine"`)
+ * @group param
+ */
+ @Since("2.4.0")
+ val distanceMeasure: Param[String] = {
+ val allowedParams = ParamValidators.inArray(Array("squaredEuclidean", "cosine"))
--- End diff ---
You don't need to change this, but it occurs to me that on many of the
parameters that take discrete values, the error message could reference the
same array of values the validator uses, so the two always stay consistent.
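To illustrate the suggestion, here is a minimal standalone Scala sketch (not the actual Spark ML `Param`/`ParamValidators` API; the object and method names are hypothetical): the allowed values are declared once in a single array, and both the validity check and the error message are derived from that array, so they cannot drift apart.

```scala
// Hypothetical sketch of the reviewer's suggestion -- not Spark's real API.
object DistanceMeasureParam {
  // Single source of truth for the discrete values this param accepts.
  val allowedValues: Array[String] = Array("squaredEuclidean", "cosine")

  // Validator derived from the shared array (analogous to ParamValidators.inArray).
  val isValid: String => Boolean = allowedValues.contains(_)

  // The error message references the same array, so any change to
  // allowedValues automatically updates both the check and the message.
  def validate(value: String): String = {
    require(isValid(value),
      s"distanceMeasure must be one of: ${allowedValues.mkString(", ")}, but got: $value")
    value
  }
}
```

With this layout, adding a new distance measure means editing `allowedValues` in one place; both the validator and the user-facing error message pick it up automatically.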
---