Github user thvasilo commented on a diff in the pull request:
https://github.com/apache/flink/pull/874#discussion_r33663520
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/classification/SVM.scala
---
@@ -242,8 +275,21 @@ object SVM{
}
}
- override def predict(value: T, model: DenseVector): Double = {
- value.asBreeze dot model.asBreeze
+ override def predict(value: T, model: DenseVector, predictParameters: ParameterMap):
+ Double = {
+ val thresholdOption = predictParameters.get(Threshold)
+
+ val rawValue = value.asBreeze dot model.asBreeze
+ // If the Threshold option has been reset, we will get back a Some(None) thresholdOption
+ // causing the exception when we try to get the value. In that case we just return the
+ // raw value
+ try {
+ val thresOptionValue = thresholdOption.get
+ if (rawValue > thresOptionValue) 1.0 else -1.0
+ }
+ catch {
+ case e: java.lang.ClassCastException => rawValue
+ }
--- End diff --
This relates to the previous discussion:
I do believe we want this turned on by default: when you train a binary
classifier, you expect `predict` to return binary labels, not the
decision function values.
So if we have `None` as default, the user could write:
```scala
val svm = SVM().
setBlocks(env.getParallelism)
svm.fit(train)
val eval = svm.evaluate(test)
```
and the eval output would not make sense, since the predictions would be raw decision function values rather than binary labels, but if they wrote
```scala
val svm = SVM().
setBlocks(env.getParallelism).
setThreshold(0.0)
svm.fit(train)
val eval = svm.evaluate(test)
```
it would.
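
To make the difference concrete, here is a minimal sketch that does not depend on Flink. `ThresholdSketch`, its `predict`, and the `accuracy` helper are hypothetical stand-ins for illustration, not the Flink ML API, and exact-match accuracy is only an assumed example of what a user might compute from the evaluation output. It models the threshold as a plain `Option[Double]` and pattern matches on it, which might also be a way to avoid catching `ClassCastException` in the diff above.

```scala
// Hypothetical stand-in for the thresholding logic discussed above; not Flink's API.
object ThresholdSketch {

  /** Returns a binary label when a threshold is set,
    * otherwise the raw decision function value. */
  def predict(rawValue: Double, threshold: Option[Double]): Double =
    threshold match {
      case Some(t) => if (rawValue > t) 1.0 else -1.0
      case None    => rawValue
    }

  /** Fraction of predictions that exactly match the true labels
    * (assumed metric, chosen only for illustration). */
  def accuracy(
      rawValues: Seq[Double],
      labels: Seq[Double],
      threshold: Option[Double]): Double = {
    val hits = rawValues.zip(labels).count {
      case (raw, label) => predict(raw, threshold) == label
    }
    hits.toDouble / labels.size
  }
}

object ThresholdSketchDemo extends App {
  // Made-up decision values and their true labels.
  val rawValues = Seq(2.3, -0.7, 0.4, -1.9)
  val labels    = Seq(1.0, -1.0, 1.0, -1.0)

  // No threshold: raw values never equal the labels exactly.
  println(ThresholdSketch.accuracy(rawValues, labels, None))      // 0.0
  // Threshold at 0.0: every sign matches its label.
  println(ThresholdSketch.accuracy(rawValues, labels, Some(0.0))) // 1.0
}
```

With `None`, every prediction has the correct sign, yet a comparison against the labels reports zero matches. That is the kind of surprising result a default threshold of 0.0 would prevent.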