Github user imatiach-msft commented on a diff in the pull request:
https://github.com/apache/spark/pull/17086#discussion_r231227296
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/evaluation/MulticlassMetrics.scala
---
@@ -27,10 +27,17 @@ import org.apache.spark.sql.DataFrame
/**
* Evaluator for multiclass classification.
*
- * @param predictionAndLabels an RDD of (prediction, label) pairs.
+ * @param predAndLabelsWithOptWeight an RDD of (prediction, label, weight)
or
+ * (prediction, label) pairs.
*/
@Since("1.1.0")
-class MulticlassMetrics @Since("1.1.0") (predictionAndLabels: RDD[(Double,
Double)]) {
+class MulticlassMetrics @Since("3.0.0") (predAndLabelsWithOptWeight:
RDD[_]) {
--- End diff --
"I am not sure what to do about the DataFrame issue though", ah, I think I
see your concern.
But, isn't this dataframe constructor private anyway, so it can't be used
by anyone outside mllib:
private[mllib] def this(predictionAndLabels: DataFrame) =
this(predictionAndLabels.rdd.map(r => (r.getDouble(0), r.getDouble(1))))
I only modified the RDD part because that is what is used by the ML
evaluator and it is what users outside spark can access. This is to add weight
column for the evaluators.
However, even if we wanted to add weight column support for the private
API, I'm unsure about how to add this. Should I just check if there are 3
columns or two, and if there are 3 use the third one as the weight column? I
guess I am on the fence about this, I could change it but I don't think it is
absolutely necessary, since it's not used anywhere outside spark MLLIB anyway.
Actually, this constructor is a bit weird, it looks like it was added as
part of this PR:
https://github.com/apache/spark/pull/6011/files
It is only used here in the python API:
https://github.com/apache/spark/pull/6011/files#diff-443f766289f8090078531c3e1a1d6027R186
But I don't see why we couldn't just get the rdd there and remove the
private constructor altogether (?)
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]