Github user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/19106#discussion_r143445323
--- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/ProbabilisticClassifier.scala ---
@@ -230,21 +230,22 @@ private[ml] object ProbabilisticClassificationModel {
    * Normalize a vector of raw predictions to be a multinomial probability vector, in place.
    *
    * The input raw predictions should be nonnegative.
-   * The output vector sums to 1, unless the input vector is all-0 (in which case the output is
-   * all-0 too).
+   * The output vector sums to 1.
    *
    * NOTE: This is NOT applicable to all models, only ones which effectively use class
    * instance counts for raw predictions.
    */
+  @throws[IllegalArgumentException]("If the input vector is all-0 or including negative values")
   def normalizeToProbabilitiesInPlace(v: DenseVector): Unit = {
+    v.values.foreach(value => require(value >= 0,
+      "The input raw predictions should be nonnegative."))
     val sum = v.values.sum
-    if (sum != 0) {
-      var i = 0
-      val size = v.size
-      while (i < size) {
-        v.values(i) /= sum
-        i += 1
-      }
+    require(sum > 0, "All-0 vector is not allowed normalizing.")
--- End diff ---
Text may be better as "Can't normalize the 0-vector" or similar
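
For context, here is a minimal, self-contained sketch of how the guarded normalization reads with a message along those lines. It mirrors the diff but is not Spark's actual code (the real method is private[ml] on ProbabilisticClassificationModel); the NormalizeSketch object, the demo values, and the exact message wording are assumptions for illustration, and it only needs org.apache.spark.ml.linalg.DenseVector on the classpath.

```scala
import org.apache.spark.ml.linalg.DenseVector

// Illustrative stand-in for the guarded normalization in the diff above; this is
// not Spark's implementation, and the object name and demo values are made up.
object NormalizeSketch {
  def normalizeToProbabilitiesInPlace(v: DenseVector): Unit = {
    // Reject negative raw predictions up front.
    v.values.foreach(value => require(value >= 0,
      "The input raw predictions should be nonnegative."))
    val sum = v.values.sum
    // Wording suggested in this review comment.
    require(sum > 0, "Can't normalize the 0-vector.")
    var i = 0
    while (i < v.size) {
      v.values(i) /= sum
      i += 1
    }
  }

  def main(args: Array[String]): Unit = {
    val raw = new DenseVector(Array(1.0, 3.0, 4.0))
    normalizeToProbabilitiesInPlace(raw)
    println(raw)  // [0.125,0.375,0.5]

    // An all-0 input now fails fast instead of being returned unchanged.
    try {
      normalizeToProbabilitiesInPlace(new DenseVector(Array(0.0, 0.0, 0.0)))
    } catch {
      case e: IllegalArgumentException => println(e.getMessage)
    }
  }
}
```

With an all-0 input the require fails fast with an IllegalArgumentException instead of silently returning the vector unchanged, which is the behavior change the new @throws annotation documents.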