Github user thvasilo commented on the issue:
https://github.com/apache/flink/pull/2838
> The problem is not with the `evaluate(test: TestType): DataSet[Double]` but
> rather with `evaluate(test: TestType): DataSet[(Prediction, Prediction)]`.
Completely agree there. I advocated for removing/renaming the `evaluate`
function; we considered using a `score` function for a more sklearn-like
approach before, see e.g. #902. Having _some_ function that returns a
`DataSet[(truth: Prediction, pred: Prediction)]` is useful and probably
necessary, but we should look at alternatives, as the current state is confusing.
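To make the distinction concrete, here is a rough sketch of the split I have in mind. All names (`predictionPairs`, `score`) are placeholders, not actual FlinkML signatures, and the toy `DataSet` is just a stand-in so the snippet compiles on its own:

```scala
// Toy stand-in for Flink's DataSet, just so this sketch is self-contained.
final case class DataSet[T](elems: Seq[T]) {
  def map[U](f: T => U): DataSet[U] = DataSet(elems.map(f))
}

// Hypothetical split of the current evaluate() into two clearly named
// methods; none of these names exist in FlinkML today.
trait Predictor[Testing, Prediction] {
  def predict(x: Testing): Prediction

  // Returns (truth, prediction) pairs, usable by arbitrary metrics.
  def predictionPairs(
      test: DataSet[(Testing, Prediction)]): DataSet[(Prediction, Prediction)] =
    test.map { case (x, truth) => (truth, predict(x)) }

  // sklearn-style: collapse the pairs into a single metric value.
  def score(test: DataSet[(Testing, Prediction)],
            metric: DataSet[(Prediction, Prediction)] => Double): Double =
    metric(predictionPairs(test))
}
```

With something like this, the pair-producing function keeps its use for feeding arbitrary metrics, while `score` gives the single-number sklearn-style entry point, e.g. `model.score(test, mseMetric)`.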
I think I like the approach you are suggesting, so feel free to come up
with an alternative in the WIP PRs.
Getting rid of the Pipeline requirements for recommendation algorithms
would simplify some things. In that case we'll have to re-evaluate whether it
makes sense for them to implement the `Predictor` interface at all; maybe we
introduce a `ChainablePredictor` instead, but I think our hierarchy is deep
enough already.
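For the hierarchy question, the `ChainablePredictor` idea could be as small as an empty marker trait. Again, these are made-up names for illustration, not existing FlinkML types, and the simplified `Double`-only signatures are just to keep the sketch short:

```scala
// Simplified stand-ins for the existing hierarchy (made-up signatures).
trait Estimator {
  def fit(data: Seq[Double]): Unit = ()
}

trait Predictor extends Estimator {
  def predict(x: Double): Double
}

// Empty marker trait: only predictors that can sit inside a Pipeline mix
// it in, while recommenders would stay plain Predictors. It adds one level
// to the hierarchy but no new API surface.
trait ChainablePredictor extends Predictor
```

The cost is exactly the extra level I'm wary of; the benefit is that pipeline chaining becomes an opt-in capability rather than an obligation for every `Predictor`.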