Repository: flink
Updated Branches:
  refs/heads/master 30a53ef69 -> da991aebb

[FLINK-4850] [ml] FlinkML - SVM predict Operation for Vector and not LabeledVector

This closes #2658.

Project: http://git-wip-us.apache.org/repos/asf/flink/repo
Commit: http://git-wip-us.apache.org/repos/asf/flink/commit/da991aeb
Tree: http://git-wip-us.apache.org/repos/asf/flink/tree/da991aeb
Diff: http://git-wip-us.apache.org/repos/asf/flink/diff/da991aeb

Branch: refs/heads/master
Commit: da991aebb038b13a2d34344cf456c32feb4222dd
Parents: 30a53ef
Author: Theodore Vasiloudis <[email protected]>
Authored: Wed Oct 19 14:09:09 2016 +0200
Committer: Till Rohrmann <[email protected]>
Committed: Wed Nov 2 18:12:53 2016 +0100

----------------------------------------------------------------------
 docs/dev/libs/ml/quickstart.md | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/flink/blob/da991aeb/docs/dev/libs/ml/quickstart.md
----------------------------------------------------------------------
diff --git a/docs/dev/libs/ml/quickstart.md b/docs/dev/libs/ml/quickstart.md
index 50ca08a..e4c6962 100644
--- a/docs/dev/libs/ml/quickstart.md
+++ b/docs/dev/libs/ml/quickstart.md
@@ -135,12 +135,13 @@ We can simply import the dataset then using:
 
 import org.apache.flink.ml.MLUtils
 
-val astroTrain: DataSet[LabeledVector] = MLUtils.readLibSVM("/path/to/svmguide1")
-val astroTest: DataSet[LabeledVector] = MLUtils.readLibSVM("/path/to/svmguide1.t")
+val astroTrain: DataSet[LabeledVector] = MLUtils.readLibSVM(env, "/path/to/svmguide1")
+val astroTest: DataSet[(Vector, Double)] = MLUtils.readLibSVM(env, "/path/to/svmguide1.t")
+      .map(x => (x.vector, x.label))
 
 {% endhighlight %}
 
-This gives us two `DataSet[LabeledVector]` objects that we will use in the following section to
+This gives us two `DataSet` objects that we will use in the following section to
 create a classifier.
 
 ## Classification
@@ -167,11 +168,11 @@ svm.fit(astroTrain)
 
 {% endhighlight %}
 
-We can now make predictions on the test set.
+We can now make predictions on the test set, and use the `evaluate` function to create (truth, prediction) pairs.
 
 {% highlight scala %}
 
-val predictionPairs = svm.predict(astroTest)
+val evaluationPairs: DataSet[(Double, Double)] = svm.evaluate(astroTest)
 
 {% endhighlight %}
 
@@ -210,12 +211,11 @@ make predictions.
 
 scaledSVM.fit(astroTrain)
 
-val predictionPairsScaled: DataSet[(Double, Double)] = scaledSVM.predict(astroTest)
+val evaluationPairsScaled: DataSet[(Double, Double)] = scaledSVM.evaluate(astroTest)
 
 {% endhighlight %}
 
 The scaled inputs should give us better prediction performance.
-The result of the prediction on `LabeledVector`s is a data set of tuples where the first entry denotes the true label value and the second entry is the predicted label value.
 
 ## Where to go from here
 
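
As an illustration of what the (truth, prediction) pairs produced by `evaluate` can be used for, the sketch below computes classification accuracy from such pairs. It is not part of this commit. It assumes a plain Scala `Seq[(Double, Double)]` standing in for the `DataSet[(Double, Double)]` so that it runs without a Flink dependency; the object name `EvaluationPairsSketch`, the `accuracy` helper, and the sample values are all hypothetical.

```scala
// Minimal sketch: accuracy from (truth, prediction) pairs.
// A plain Seq stands in for DataSet[(Double, Double)]; this is an assumption
// made so the example runs without Flink on the classpath.
object EvaluationPairsSketch {

  // Fraction of pairs whose predicted label equals the true label.
  // Returns 0.0 for an empty input to avoid dividing by zero.
  def accuracy(pairs: Seq[(Double, Double)]): Double =
    if (pairs.isEmpty) 0.0
    else pairs.count { case (truth, prediction) => truth == prediction }.toDouble / pairs.size

  def main(args: Array[String]): Unit = {
    // Invented sample values, first entry = true label, second = predicted label.
    val evaluationPairs: Seq[(Double, Double)] = Seq(
      (1.0, 1.0),
      (-1.0, -1.0),
      (1.0, -1.0),
      (-1.0, -1.0)
    )
    println(s"accuracy = ${accuracy(evaluationPairs)}")
  }
}
```

On an actual `DataSet`, the same comparison would presumably be expressed with the usual DataSet transformations rather than collection methods. The pair ordering (true label first, prediction second) follows the description in the diff above.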
