GitHub user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/21044#discussion_r180920806
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala ---
@@ -195,14 +205,18 @@ final class OneVsRestModel private[ml] (
newDataset.unpersist()
}
- // output the index of the classifier with highest confidence as prediction
- val labelUDF = udf { (predictions: Map[Int, Double]) =>
- predictions.maxBy(_._2)._1.toDouble
+ // output the RawPrediction as vector
+ val rawPredictionUDF = udf { (predictions: Map[Int, Double]) =>
+ Vectors.sparse(numClasses, predictions.toList )
--- End diff --
Also, let's output a dense Vector, since the raw prediction will almost surely be dense.
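
A minimal sketch of what the dense variant could look like, in plain Scala so it runs without Spark on the classpath. The object and helper names (`DenseRawPrediction`, `toDenseArray`) are hypothetical and not from the PR, and defaulting absent classes to 0.0 is an assumption. Inside the UDF, the resulting array would presumably be wrapped with `Vectors.dense(...)`.

```scala
// Sketch only: names here are illustrative, not taken from the PR.
object DenseRawPrediction {

  // Build a dense array of per-class scores from the accumulated map.
  // Assumption: any class index missing from the map gets 0.0.
  def toDenseArray(numClasses: Int, predictions: Map[Int, Double]): Array[Double] = {
    val scores = Array.fill(numClasses)(0.0)
    predictions.foreach { case (classIndex, score) => scores(classIndex) = score }
    scores
  }

  def main(args: Array[String]): Unit = {
    val raw = toDenseArray(3, Map(0 -> 0.2, 1 -> 0.7, 2 -> 0.1))
    // In the UDF this would become: Vectors.dense(raw)
    println(raw.mkString("[", ", ", "]"))
  }
}
```

Since OneVsRest trains one classifier per class, the map normally has an entry for every index, which is why a dense representation avoids the overhead of storing indices alongside the values.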
---