[ https://issues.apache.org/jira/browse/FLINK-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14611764#comment-14611764 ]
ASF GitHub Bot commented on FLINK-2297:
---------------------------------------
Github user tillrohrmann commented on a diff in the pull request:
https://github.com/apache/flink/pull/874#discussion_r33765000
--- Diff: flink-staging/flink-ml/src/test/scala/org/apache/flink/ml/classification/SVMITSuite.scala ---
@@ -69,19 +70,38 @@ class SVMITSuite extends FlatSpec with Matchers with FlinkTestBase {
svm.fit(trainingDS)
- val threshold = 0.0
-
- val predictionPairs = svm.evaluate(test).map {
- truthPrediction =>
- val truth = truthPrediction._1
- val prediction = truthPrediction._2
- val thresholdedPrediction = if (prediction > threshold) 1.0 else -1.0
- (truth, thresholdedPrediction)
- }
+ val predictionPairs = svm.evaluate(test)
val absoluteErrorSum = predictionPairs.collect().map{
case (truth, prediction) => Math.abs(truth - prediction)}.sum
absoluteErrorSum should be < 15.0
}
+
+ it should "be possible to get the raw decision function values" in {
+ val env = ExecutionEnvironment.getExecutionEnvironment
+
+ val svm = SVM().
+ setBlocks(env.getParallelism).
+ setIterations(100).
+ setLocalIterations(100).
+ setRegularization(0.002).
+ setStepsize(0.1).
+ setSeed(0).
+ clearThreshold()
+
+ val trainingDS = env.fromCollection(Classification.trainingData)
+
+ val test = trainingDS.map(x => x.vector)
+
+ svm.fit(trainingDS)
+
+ val predictions: DataSet[(FlinkVector, Double)] = svm.predict(test)
+
+ val preds = predictions.map(vectorLabel => vectorLabel._2).collect()
+
+ preds.max should be > 1.0
--- End diff --
I think we should resolve this before merging.
> Add threshold setting for SVM binary predictions
> ------------------------------------------------
>
> Key: FLINK-2297
> URL: https://issues.apache.org/jira/browse/FLINK-2297
> Project: Flink
> Issue Type: Improvement
> Components: Machine Learning Library
> Reporter: Theodore Vasiloudis
> Assignee: Theodore Vasiloudis
> Priority: Minor
> Labels: ML
> Fix For: 0.10
>
>
> Currently, SVM's predict function outputs the raw decision function values.
> Instead, we should be able to set a threshold: examples whose decision value
> is above it are labeled positive (1.0), and those below it negative (-1.0).
> The prediction function can then be used directly for evaluation.
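
The thresholding described above can be sketched in plain Scala, independent of Flink. `ThresholdSketch` and `applyThreshold` are hypothetical names used only for illustration and are not part of the Flink ML API. The tie-breaking rule (a value equal to the threshold maps to -1.0) is an assumption taken from the `prediction > threshold` check that the diff removes from the test.

```scala
object ThresholdSketch {
  // Hypothetical helper, not a Flink ML function: maps a raw decision
  // function value to a binary label. Values strictly above the threshold
  // become 1.0; everything else (including ties) becomes -1.0.
  def applyThreshold(rawValue: Double, threshold: Double = 0.0): Double =
    if (rawValue > threshold) 1.0 else -1.0

  def main(args: Array[String]): Unit = {
    // Made-up raw decision values, chosen only to exercise both branches.
    val rawValues = Seq(-1.7, -0.2, 0.0, 0.3, 2.4)
    val labels = rawValues.map(applyThreshold(_, 0.0))
    println(labels.mkString(", ")) // prints: -1.0, -1.0, -1.0, 1.0, 1.0
  }
}
```

With labels restricted to 1.0 and -1.0, each misclassified example contributes exactly 2.0 to a sum of absolute errors, so a bound such as the test's `absoluteErrorSum should be < 15.0` tolerates at most seven misclassifications.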