[
https://issues.apache.org/jira/browse/FLINK-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563040#comment-14563040
]
Theodore Vasiloudis commented on FLINK-2102:
--------------------------------------------
My suggestion for an initial version is to add a PredictOperation like the
following to each implemented Predictor:
{code}
implicit def predictLabeledValues = {
  new PredictOperation[PredictorType, LabeledVector, (Double, Double)] {
    ...
  }
}
{code}
That is, for each example in the dataset we make a prediction and return a
tuple of the true value and the predicted value. The resulting
{{DataSet[(Double, Double)]}} could then be passed on to a scoring function.
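To make the idea concrete, here is a rough, self-contained sketch of the pattern. It deliberately avoids the Flink API: a plain {{Seq}} stands in for {{DataSet}}, and {{LabeledVector}}, {{PredictOperation}}, {{LinearModel}} and {{Scoring}} are simplified, hypothetical stand-ins rather than the actual FlinkML definitions. It only shows how an implicit operation could yield (true value, predicted value) pairs that a scoring function then consumes.

```scala
// Simplified stand-ins (assumptions, not the real FlinkML types).
// scala.collection.immutable.Vector replaces FlinkML's Vector, Seq replaces DataSet.
final case class LabeledVector(label: Double, vector: Vector[Double])

// Hypothetical, pared-down type class for a predict operation.
trait PredictOperation[Self, Testing, Prediction] {
  def predict(instance: Self, input: Seq[Testing]): Seq[Prediction]
}

// Toy predictor with fixed weights, used purely for illustration.
final case class LinearModel(weights: Vector[Double], intercept: Double) {
  def score(x: Vector[Double]): Double =
    weights.zip(x).map { case (w, v) => w * v }.sum + intercept
}

object LinearModel {
  // For each labeled example, emit (true value, predicted value).
  implicit val predictLabeledValues
      : PredictOperation[LinearModel, LabeledVector, (Double, Double)] =
    new PredictOperation[LinearModel, LabeledVector, (Double, Double)] {
      def predict(instance: LinearModel,
                  input: Seq[LabeledVector]): Seq[(Double, Double)] =
        input.map(lv => (lv.label, instance.score(lv.vector)))
    }
}

object Scoring {
  // Dispatches to whichever PredictOperation is in implicit scope.
  def predict[M, T, P](model: M, input: Seq[T])(
      implicit op: PredictOperation[M, T, P]): Seq[P] =
    op.predict(model, input)

  // One possible scoring function over (truth, prediction) pairs.
  def meanSquaredError(pairs: Seq[(Double, Double)]): Double =
    pairs.map { case (truth, pred) => math.pow(truth - pred, 2) }.sum / pairs.size
}

object Example {
  val model = LinearModel(Vector(2.0, 1.0), 0.0)
  val test = Seq(
    LabeledVector(4.0, Vector(1.0, 2.0)),
    LabeledVector(5.0, Vector(2.0, 0.0)))

  // No conversion to unlabeled vectors is needed before predicting.
  val pairs: Seq[(Double, Double)] = Scoring.predict(model, test)
  val mse: Double = Scoring.meanSquaredError(pairs)
}
```

In this sketch the test set is never converted to unlabeled vectors; the label simply travels alongside the prediction so that any function over (Double, Double) pairs can serve as the scorer.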
> Add predict operation for LabeledVector
> ---------------------------------------
>
> Key: FLINK-2102
> URL: https://issues.apache.org/jira/browse/FLINK-2102
> Project: Flink
> Issue Type: Improvement
> Components: Machine Learning Library
> Reporter: Theodore Vasiloudis
> Assignee: Theodore Vasiloudis
> Priority: Minor
> Labels: ML
> Fix For: 0.9
>
>
> Currently we can only call predict on DataSet[V <: Vector].
> Often, though, we have a DataSet[LabeledVector] that we split into a
> train and test set.
> We should be able to make predictions on the test DataSet[LabeledVector]
> without having to transform it into a DataSet[Vector].