GitHub user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/17086#discussion_r231123729
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/evaluation/MulticlassMetrics.scala
---
@@ -27,10 +27,17 @@ import org.apache.spark.sql.DataFrame
/**
* Evaluator for multiclass classification.
*
- * @param predictionAndLabels an RDD of (prediction, label) pairs.
+ * @param predAndLabelsWithOptWeight an RDD of (prediction, label, weight) or
+ * (prediction, label) pairs.
*/
@Since("1.1.0")
-class MulticlassMetrics @Since("1.1.0") (predictionAndLabels: RDD[(Double, Double)]) {
+class MulticlassMetrics @Since("3.0.0") (predAndLabelsWithOptWeight: RDD[_]) {
--- End diff --
Oh, wait a sec, this changes the constructor signature. I think you have to
retain both: the `RDD[(Double, Double)]` constructor should stay, one way or
the other, and a new `RDD[(Double, Double, Double)]` constructor should be
added, with appropriate `@Since` tags on each.
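
A rough sketch of one way to keep existing `(prediction, label)` callers
working while also accepting weights. Everything here is an assumption for
illustration, not the PR's code: `MulticlassMetricsSketch` and
`weightedAccuracy` are hypothetical names, and a plain `Seq` stands in for
`RDD` so the snippet runs without Spark. One caveat: two constructors that
differ only in tuple arity inside the same generic type erase to the same JVM
signature, so they cannot simply be declared side by side. The sketch therefore
takes a single `Product`-bounded parameter and normalizes both shapes.

```scala
// Hypothetical stand-in for MulticlassMetrics; Seq replaces RDD so this runs without Spark.
class MulticlassMetricsSketch(predAndLabelsWithOptWeight: Seq[_ <: Product]) {

  // Normalize both accepted shapes to (prediction, label, weight).
  // Unweighted pairs get an implicit weight of 1.0.
  val weighted: Seq[(Double, Double, Double)] = predAndLabelsWithOptWeight.map {
    case (p: Double, l: Double, w: Double) => (p, l, w)
    case (p: Double, l: Double) => (p, l, 1.0)
    case other =>
      throw new IllegalArgumentException(
        s"Expected (prediction, label) or (prediction, label, weight), got $other")
  }

  // Weighted fraction of rows whose prediction equals the label.
  def weightedAccuracy: Double = {
    val total = weighted.map(_._3).sum
    val correct = weighted.collect { case (p, l, w) if p == l => w }.sum
    correct / total
  }
}

// Existing call sites that pass pairs still compile unchanged.
val unweighted = new MulticlassMetricsSketch(Seq((1.0, 1.0), (0.0, 1.0)))
// New call sites can pass a third element as the weight.
val withWeights = new MulticlassMetricsSketch(Seq((1.0, 1.0, 3.0), (0.0, 1.0, 1.0)))
```

The trade-off is that a wrongly shaped element is only rejected at runtime,
which is weaker than the compile-time check two distinct constructors would
give.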
Below there's a `DataFrame` constructor, and I'm not sure how to handle
that. It should also handle the case where there's a weight column, but I'm
not sure how to do that cleanly. It could take a second argument like
`hasWeightCol`, but that's starting to feel hacky.
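
One possible alternative to a `hasWeightCol` flag, sketched under loud
assumptions: infer the presence of a weight from the number of columns in each
row. `RowShapeSketch` and `toWeighted` are hypothetical names, and `Seq[Any]`
stands in for a Spark `Row` so the snippet is self-contained.

```scala
// Hypothetical helper; Seq[Any] stands in for a Spark Row.
object RowShapeSketch {
  // Map 2-column rows to weight 1.0 and pass 3-column rows through.
  def toWeighted(rows: Seq[Seq[Any]]): Seq[(Double, Double, Double)] = rows.map {
    case Seq(p: Double, l: Double, w: Double) => (p, l, w)
    case Seq(p: Double, l: Double) => (p, l, 1.0)
    case other =>
      throw new IllegalArgumentException(s"Expected 2 or 3 double columns, got $other")
  }
}

val twoCols = RowShapeSketch.toWeighted(Seq(Seq(1.0, 0.0)))
val threeCols = RowShapeSketch.toWeighted(Seq(Seq(1.0, 0.0, 2.5)))
```

This avoids the extra argument, at the cost of relying on column position
rather than an explicit column name.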
---