[ https://issues.apache.org/jira/browse/FLINK-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14572386#comment-14572386 ]
Till Rohrmann commented on FLINK-2116:
--------------------------------------
At the moment, the corresponding PR contains only the {{evaluate}} method, which
returns a {{DataSet}} of tuples {{(true label, predicted label)}}. These tuples
can then be used to calculate accuracy scores, but that part has not been
implemented yet.
With this PR I wanted to get some feedback on the general design of the
pipelines with the {{evaluate}} method and whether it makes sense to use
{{Tuples}} as input instead of {{LabeledVector}}. Maybe there is also some
other way to automatically extract the label value from a parameterized input
type, so that the default {{EvaluateDataSetOperation}} works on
{{LabeledVector}} when you only specify a {{PredictOperation}}.
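One possible way to extract the label generically is a small type class. The following is only a sketch of that idea, not the code in the PR: {{LabelExtractor}}, {{EvaluateSketch}} and the simplified {{LabeledVector}} are hypothetical stand-ins, and a plain {{Seq}} is used in place of a Flink {{DataSet}} so that the example runs without Flink.

```scala
// Simplified stand-in for the FlinkML type (assumption, not the real class).
final case class LabeledVector(label: Double, features: Vector[Double])

// Hypothetical type class: describes how to read the true label from a T.
trait LabelExtractor[T] {
  def label(value: T): Double
}

object LabelExtractor {
  // Instance for the simplified LabeledVector.
  implicit val labeledVectorExtractor: LabelExtractor[LabeledVector] =
    new LabelExtractor[LabeledVector] {
      def label(value: LabeledVector): Double = value.label
    }

  // Instance for tuples of the form (features, label).
  implicit def tupleExtractor[A]: LabelExtractor[(A, Double)] =
    new LabelExtractor[(A, Double)] {
      def label(value: (A, Double)): Double = value._2
    }
}

object EvaluateSketch {
  // Seq stands in for DataSet. Given only a prediction function, this
  // produces (true label, predicted label) pairs for any T with an extractor.
  def evaluate[T](data: Seq[T])(predict: T => Double)(
      implicit extractor: LabelExtractor[T]): Seq[(Double, Double)] =
    data.map(x => (extractor.label(x), predict(x)))
}
```

With this shape, the caller supplies only the prediction function; the label is picked up through the implicit instance, for {{LabeledVector}} as well as for tuples.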
My gut feeling is also that we should keep the calculation of the evaluation
score separate from the actual {{Predictor}}, because if you have a pipeline,
then it's no longer easy to access the members of the {{Predictor}} which are
only defined in the corresponding subclass. Moreover, maybe sometimes you want
to apply different scores to your method depending on the use case.
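As a rough illustration of keeping the score separate from the {{Predictor}}, a score can be modelled as a plain function over the {{(true label, predicted label)}} pairs. The names {{Scores}}, {{accuracy}} and {{meanSquaredError}} are hypothetical, and {{Seq}} again stands in for {{DataSet}}; this is a sketch under those assumptions, not a proposed API.

```scala
object Scores {
  // A score consumes (true label, predicted label) pairs and yields a number.
  type Score = Seq[(Double, Double)] => Double

  // Fraction of pairs whose predicted label equals the true label.
  val accuracy: Score = pairs =>
    if (pairs.isEmpty) 0.0
    else pairs.count { case (truth, predicted) => truth == predicted }.toDouble / pairs.size

  // Mean of the squared differences between true and predicted labels.
  val meanSquaredError: Score = pairs =>
    if (pairs.isEmpty) 0.0
    else pairs.map { case (truth, predicted) => math.pow(truth - predicted, 2) }.sum / pairs.size
}
```

Because a score only sees the pairs, it needs no access to members of a particular {{Predictor}} subclass, and different scores can be applied to the same evaluation result.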
We should definitely open a new JIRA issue for the implementation of an
evaluation framework.
> Make pipeline extension require less coding
> -------------------------------------------
>
> Key: FLINK-2116
> URL: https://issues.apache.org/jira/browse/FLINK-2116
> Project: Flink
> Issue Type: Improvement
> Components: Machine Learning Library
> Reporter: Mikio Braun
> Assignee: Till Rohrmann
> Priority: Minor
>
> Right now, implementing methods from the pipelines for new types, or even
> adding new methods to pipelines requires many steps:
> 1) implementing methods for new types
>    - implement an implicit of the corresponding class encapsulating the
>      operation in the companion object
> 2) adding methods to the pipeline
>    - adding a method
>    - adding a trait for the operation
>    - implementing an implicit in the companion object
> These are all objects which contain many generic parameters, so reducing the
> work would be great.
> The goal should be that you can really focus on the code to add, and have as
> little boilerplate code as possible.
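For readers unfamiliar with the pattern, the steps listed in the issue description roughly correspond to the pieces below. This is a simplified sketch with hypothetical names ({{TransformOperation}}, {{Transformer}}, {{Scaler}}) and {{Seq}} in place of {{DataSet}}; the actual FlinkML traits differ and carry more generic parameters.

```scala
// Trait for the operation: one instance per (algorithm, input, output) triple.
trait TransformOperation[Self, In, Out] {
  def transform(instance: Self, input: Seq[In]): Seq[Out]
}

// The pipeline-facing method; it delegates to whichever implicit
// operation is in scope for the concrete types.
trait Transformer[Self] { this: Self =>
  def transform[In, Out](input: Seq[In])(
      implicit op: TransformOperation[Self, In, Out]): Seq[Out] =
    op.transform(this, input)
}

// A concrete algorithm.
class Scaler(val factor: Double) extends Transformer[Scaler]

object Scaler {
  // Supporting a new input type means adding another implicit like this
  // one to the companion object.
  implicit val scaleDoubles: TransformOperation[Scaler, Double, Double] =
    new TransformOperation[Scaler, Double, Double] {
      def transform(instance: Scaler, input: Seq[Double]): Seq[Double] =
        input.map(_ * instance.factor)
    }
}
```

Calling {{new Scaler(2.0).transform(Seq(1.0, 2.0))}} resolves {{scaleDoubles}} from the companion object; every further input type or pipeline method repeats the same trait-plus-implicit scaffolding, which is the boilerplate the issue refers to.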