flink git commit: [FLINK-4850] [ml] FlinkML - SVM predict Operation for Vector and not LabeledVector

trohrmann Wed, 02 Nov 2016 10:14:05 -0700

Repository: flink
Updated Branches:
  refs/heads/master 30a53ef69 -> da991aebb



[FLINK-4850] [ml] FlinkML - SVM predict Operation for Vector and not LabeledVector

This closes #2658.


Project: http://git-wip-us.apache.org/repos/asf/flink/repo
Commit: http://git-wip-us.apache.org/repos/asf/flink/commit/da991aeb
Tree: http://git-wip-us.apache.org/repos/asf/flink/tree/da991aeb
Diff: http://git-wip-us.apache.org/repos/asf/flink/diff/da991aeb

Branch: refs/heads/master
Commit: da991aebb038b13a2d34344cf456c32feb4222dd
Parents: 30a53ef
Author: Theodore Vasiloudis <[email protected]>
Authored: Wed Oct 19 14:09:09 2016 +0200
Committer: Till Rohrmann <[email protected]>
Committed: Wed Nov 2 18:12:53 2016 +0100

----------------------------------------------------------------------
 docs/dev/libs/ml/quickstart.md | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/flink/blob/da991aeb/docs/dev/libs/ml/quickstart.md
----------------------------------------------------------------------
diff --git a/docs/dev/libs/ml/quickstart.md b/docs/dev/libs/ml/quickstart.md
index 50ca08a..e4c6962 100644
--- a/docs/dev/libs/ml/quickstart.md
+++ b/docs/dev/libs/ml/quickstart.md
@@ -135,12 +135,13 @@ We can simply import the dataset then using:
 
 import org.apache.flink.ml.MLUtils
 
-val astroTrain: DataSet[LabeledVector] = MLUtils.readLibSVM("/path/to/svmguide1")
-val astroTest: DataSet[LabeledVector] = MLUtils.readLibSVM("/path/to/svmguide1.t")
+val astroTrain: DataSet[LabeledVector] = MLUtils.readLibSVM(env, "/path/to/svmguide1")
+val astroTest: DataSet[(Vector, Double)] = MLUtils.readLibSVM(env, "/path/to/svmguide1.t")
+      .map(x => (x.vector, x.label))
 
 {% endhighlight %}
 
-This gives us two `DataSet[LabeledVector]` objects that we will use in the following section to
+This gives us two `DataSet` objects that we will use in the following section to
 create a classifier.
 
 ## Classification
@@ -167,11 +168,11 @@ svm.fit(astroTrain)
 
 {% endhighlight %}
 
-We can now make predictions on the test set.
+We can now make predictions on the test set, and use the `evaluate` function to create (truth, prediction) pairs.
 
 {% highlight scala %}
 
-val predictionPairs = svm.predict(astroTest)
+val evaluationPairs: DataSet[(Double, Double)] = svm.evaluate(astroTest)
 
 {% endhighlight %}
 
@@ -210,12 +211,11 @@ make predictions.
 
 scaledSVM.fit(astroTrain)
 
-val predictionPairsScaled: DataSet[(Double, Double)] = scaledSVM.predict(astroTest)
+val evaluationPairsScaled: DataSet[(Double, Double)] = scaledSVM.evaluate(astroTest)
 
 {% endhighlight %}
 
 The scaled inputs should give us better prediction performance.
-The result of the prediction on `LabeledVector`s is a data set of tuples where the first entry denotes the true label value and the second entry is the predicted label value.
 
 ## Where to go from here

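Note on the change above: after this commit the quickstart produces (truth, prediction) pairs via `evaluate` instead of calling `predict` on labeled data. As a rough illustration of what such pairs can be used for, the sketch below computes a simple accuracy figure. It is not part of the commit and does not use the Flink API: a plain Scala `Seq` stands in for the collected contents of `evaluationPairs`, the object and method names are hypothetical, the label values are made up, and it assumes the classifier emits the same label values (here +1.0 / -1.0) that appear in the test set.

```scala
// Hypothetical helper, not part of FlinkML: works on plain Scala collections.
object EvaluationPairsSketch {

  // Fraction of (truth, prediction) pairs whose predicted label equals the true label.
  // Returns 0.0 for an empty input to avoid dividing by zero.
  def accuracy(pairs: Seq[(Double, Double)]): Double =
    if (pairs.isEmpty) 0.0
    else pairs.count { case (truth, prediction) => truth == prediction }.toDouble / pairs.size

  def main(args: Array[String]): Unit = {
    // Made-up stand-in for the collected result of svm.evaluate(astroTest).
    val pairs = Seq((1.0, 1.0), (-1.0, 1.0), (1.0, 1.0), (-1.0, -1.0))
    println(s"accuracy = ${accuracy(pairs)}") // prints "accuracy = 0.75"
  }
}
```

Exact equality on `Double` is only reasonable here because the labels are assumed to be discrete class values; raw decision-function outputs would need a threshold first.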