Github user thvasilo commented on the issue:
https://github.com/apache/flink/pull/2838
Hello Gabor,
I like the idea of having a RankingScore; a hierarchy with Score, RankingScore, and PairWiseScore gives us the flexibility we need to bring ranking and supervised learning evaluation under the same umbrella.
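For concreteness, here is roughly how I picture that hierarchy. The trait names come from this discussion, but the signatures are only my assumptions about what they could look like, not what the PR defines:

```scala
import org.apache.flink.api.scala._

// Common parent: anything that maps some evaluation input to a single score.
trait Score[I] {
  def evaluate(input: DataSet[I]): DataSet[Double]
}

// Supervised-learning metrics score (truth, prediction) pairs, matching the
// existing evaluate style.
trait PairWiseScore[P] extends Score[(P, P)]

// Ranking metrics score, per user, a ranked list of recommended items against
// the held-out relevant items (e.g. precision@k, NDCG).
trait RankingScore[U, I] extends Score[(U, Array[I], Array[I])]
```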
I would also encourage you to share any other ideas you've raised that might break the API; this is still very much an evolving project, and there is no need to shoehorn everything into an `evaluate(test: TestType): DataSet[Double]` function if there are better alternatives.
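Just as one illustration of an alternative (nothing the PR proposes, and the `MultiScore` name is made up): an evaluation that emits named metrics, so a single pass over the predictions can produce several scores:

```scala
import org.apache.flink.api.scala._

// Illustrative only: emit (metricName, value) pairs instead of a single
// Double, so callers can compute several metrics in one evaluation pass.
trait MultiScore[I] {
  def evaluate(input: DataSet[I]): DataSet[(String, Double)]
}
```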
One thing we need to consider is how this affects cross-validation and model selection/hyper-parameter tuning. These two aspects of the library are tightly linked, and I think we'll need to work on them in parallel to surface issues that affect both.
I recommend taking a look at the [cross-validation PR](https://github.com/apache/flink/pull/891) I opened a while back and creating a new WIP PR that uses the current one (#2838) as a basis. Since the `Score` interface still exists, it shouldn't require many changes; all that would be added is the CrossValidation class. There are other fundamental issues with the sampling there that we can discuss in due time.
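To sketch what I mean by reusing `Score` there, something along these lines; the fold logic below is a simplification I made up for illustration, not the actual sampling code from #891:

```scala
import org.apache.flink.api.scala._
import org.apache.flink.api.common.typeinfo.TypeInformation
import scala.reflect.ClassTag
import scala.util.Random

object CrossValidation {
  // Tag each element with a random fold id and carve out one fold per split.
  // Note: the fold ids are drawn lazily, so re-executing the plan re-rolls
  // them -- one flavor of the sampling issues mentioned above.
  def kFold[T: TypeInformation: ClassTag](input: DataSet[T], k: Int)
      : Seq[(DataSet[T], DataSet[T])] = {
    val withFold = input.map(x => (Random.nextInt(k), x))
    (0 until k).map { i =>
      (withFold.filter(_._1 != i).map(_._2), // training folds
       withFold.filter(_._1 == i).map(_._2)) // held-out fold
    }
  }
}
```

Each (train, test) pair would then be fed to fit/predict, and the resulting (truth, prediction) pairs handed to a `Score` for evaluation.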
Regarding the RankingPredictor, we should consider the use case for such an interface. Is it only going to be used for recommendation? If so, in what cases could we build a Pipeline with current or future pre-processing steps? Could you give some pipeline examples in a recommendation setting?
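To make that question concrete, this is the kind of pipeline I am wondering about. ALS is the existing recommender; `RatingNormalizer` is a made-up transformer, nothing like it exists in the library today:

```scala
import org.apache.flink.api.scala._
import org.apache.flink.ml.recommendation.ALS

val env = ExecutionEnvironment.getExecutionEnvironment

// (user, item, rating) triples
val ratings: DataSet[(Int, Int, Double)] =
  env.readCsvFile[(Int, Int, Double)]("ratings.csv")

val als = ALS()
  .setNumFactors(10)
  .setIterations(10)

als.fit(ratings)

// Would a RankingPredictor still compose with transformers the way the
// current Predictor does, e.g. with the hypothetical normalizer?
// val pipeline = RatingNormalizer().chainPredictor(als)
// pipeline.fit(ratings)
```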