Github user gaborhermann commented on the issue:
https://github.com/apache/flink/pull/2838
Thanks again for taking a look at our PR!
I've just realized from a developer mailing list thread that the FlinkML
API is not carved in stone until 2.0, and it's nice to hear that :)
The problem is not with the `evaluate(test: TestType): DataSet[Double]` but
rather with `evaluate(test: TestType): DataSet[(Prediction,Prediction)]`. It's
at least confusing to have both, but it might not be worth exposing the one
giving `(Prediction,Prediction)` pairs to the user, as it only *prepares*
evaluation. When introducing the evaluation framework, we could at least rename
it to something like `preparePairwiseEvaluation(test: TestType):
DataSet[(Prediction,Prediction)]`. In the ranking case we might generalize it
to `prepareEvaluation(test: TestType): PreparedTesting`. We basically did this
with the `PrepareDataSetOperation`; we've just left the old `evaluate` as it is
for now. I suggest changing this if we can break the API.
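Just to make the naming idea concrete, here is a rough sketch (the trait and type names are only illustrative, not the actual FlinkML signatures, and the implicit `*Operation` type classes are omitted):

```scala
import org.apache.flink.api.scala.DataSet

// Illustrative only: the current evaluate that yields (truth, prediction) pairs,
// under a name that makes clear it merely *prepares* evaluation.
trait PairwisePreparation[TestType, Prediction] {
  def preparePairwiseEvaluation(test: DataSet[TestType]): DataSet[(Prediction, Prediction)]
}

// Illustrative only: the ranking case generalizes the prepared form,
// since a ranking is not naturally a pair of predictions.
trait RankingPreparation[TestType, PreparedTesting] {
  def prepareEvaluation(test: DataSet[TestType]): DataSet[PreparedTesting]
}
```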
I'll do a rebase on the cross-validation PR. At first glance, it should not
really be a problem to do both cross-validation and hyper-parameter tuning, as
the user has to provide a `Scorer` anyway. A minor issue I see is the user
falling back to a default `score` (e.g. RMSE in case of ALS). This might not be
a problem for recommendation models that give rating predictions besides ranking
predictions, but it is a problem for models that *only* give ranking
predictions, because those do not extend the `Predictor` class. This is not an
issue for now, but it might become a problem when we add more recommendation models.
Should we try to handle this now, or would that be a bit of "overengineering"? I'll
see if any other problems come up after rebasing.
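To show where the default-score fallback breaks down, a hypothetical simplification (none of these types are the actual FlinkML or cross-validation PR classes):

```scala
// A model that predicts ratings can fall back to a default score such as RMSE,
// while a ranking-only model has nothing comparable to fall back to.
trait RatingModel      { def predictRating(user: Int, item: Int): Double }
trait RankingOnlyModel { def recommend(user: Int, k: Int): Seq[Int] }

def defaultScoreName(model: AnyRef): Option[String] = model match {
  case _: RatingModel      => Some("rmse") // e.g. ALS, which predicts ratings
  case _: RankingOnlyModel => None         // no sensible default; a Scorer must be given
  case _                   => None
}
```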
The `RankingPredictor` interface is useful *internally* for the `Score`s.
It serves as a contract between a `RankingScore` and the model. I'm sure it will
only be used for recommendations, but it takes no effort to expose it, so the user
can write code using a general `RankingPredictor` (although I would not think
this is what users would like to do :) ). A better question is whether to use
it in a `Pipeline`. We discussed this with some people, and could not really
find a use case where we would need `Transformer`-like preprocessing for
recommendations. Of course, there could be other preprocessing steps, such as
removing/aggregating duplicates, but those do not have to be `fit` to training
data. Based on this, it's not worth the effort to integrate `RankingPredictor`
with the `Pipeline`, at least for now.
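To clarify what I mean by a contract, a simplified sketch (the real `RankingPredictor` and scores in the PR are built with implicit `*Operation` type classes; the signatures below are only an approximation):

```scala
import org.apache.flink.api.scala.DataSet

// A ranking score only needs the model to produce top-k rankings per user ...
trait SimpleRankingPredictor {
  // (user, item, rank) triples: the k highest-ranked items for each user
  def predictRankings(k: Int, users: DataSet[Int]): DataSet[(Int, Int, Int)]
}

// ... and evaluates them against held-out (user, item) interactions.
trait SimpleRankingScore {
  def score(rankings: DataSet[(Int, Int, Int)], test: DataSet[(Int, Int)]): DataSet[Double]
}
```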