[ 
https://issues.apache.org/jira/browse/FLINK-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563040#comment-14563040
 ] 

Theodore Vasiloudis commented on FLINK-2102:
--------------------------------------------

My suggestion for an initial version is to add a PredictOperation like the 
following to each implemented Predictor:
{code}
implicit def predictLabeledValues = {
    new PredictOperation[PredictorType, LabeledVector, (Double, Double)] {
      ...
    }
}
{code}
That is, for each example in the dataset we make a prediction and return a 
tuple of the true value and the predicted value. The resulting 
{{DataSet[(Double, Double)]}} could then be passed on to a scoring function.
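
To make the idea concrete, here is a rough sketch that runs without Flink. It 
uses plain Scala collections in place of DataSet, and its own stand-in 
LabeledVector. The names LinearModel, predictLabeled and meanSquaredError, as 
well as the weights and data, are made up for illustration and are not part of 
the FlinkML API:
{code}
// Stand-in for FlinkML's LabeledVector so the sketch needs no Flink dependency.
final case class LabeledVector(label: Double, features: Vector[Double])

// Hypothetical linear model; the weights and intercept are made-up values.
final case class LinearModel(weights: Vector[Double], intercept: Double) {
  def predict(features: Vector[Double]): Double =
    weights.zip(features).map { case (w, x) => w * x }.sum + intercept
}

object PredictLabeledSketch {
  // For each labeled example, emit (true value, predicted value).
  def predictLabeled(model: LinearModel,
                     data: Seq[LabeledVector]): Seq[(Double, Double)] =
    data.map(lv => (lv.label, model.predict(lv.features)))

  // One possible scoring function: mean squared error over the pairs.
  def meanSquaredError(pairs: Seq[(Double, Double)]): Double =
    pairs.map { case (truth, pred) => math.pow(truth - pred, 2) }.sum / pairs.size

  def main(args: Array[String]): Unit = {
    val model = LinearModel(Vector(2.0, 1.0), intercept = 0.5)
    val test = Seq(
      LabeledVector(3.5, Vector(1.0, 1.0)),
      LabeledVector(5.0, Vector(2.0, 1.0))
    )
    val pairs = predictLabeled(model, test)
    println(pairs)
    println(meanSquaredError(pairs))
  }
}
{code}
Keeping the true label next to the prediction means the scoring function only 
needs the pairs, not the original test set or the model.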

> Add predict operation for LabeledVector
> ---------------------------------------
>
>                 Key: FLINK-2102
>                 URL: https://issues.apache.org/jira/browse/FLINK-2102
>             Project: Flink
>          Issue Type: Improvement
>          Components: Machine Learning Library
>            Reporter: Theodore Vasiloudis
>            Assignee: Theodore Vasiloudis
>            Priority: Minor
>              Labels: ML
>             Fix For: 0.9
>
>
> Currently we can only call predict on DataSet[V <: Vector].
> Often, though, we have a DataSet[LabeledVector] that we split into a 
> train and test set.
> We should be able to make predictions on the test DataSet[LabeledVector] 
> without having to transform it into a DataSet[Vector].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
