[
https://issues.apache.org/jira/browse/FLINK-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gábor Hermann reassigned FLINK-4713:
------------------------------------
Assignee: Gábor Hermann
> Implementing ranking evaluation scores for recommender systems
> --------------------------------------------------------------
>
> Key: FLINK-4713
> URL: https://issues.apache.org/jira/browse/FLINK-4713
> Project: Flink
> Issue Type: New Feature
> Components: Machine Learning Library
> Reporter: Domokos Miklós Kelen
> Assignee: Gábor Hermann
>
> Follow-up work to [4712|https://issues.apache.org/jira/browse/FLINK-4712]
> includes implementing ranking recommendation evaluation metrics (such as
> precision@k, recall@k, ndcg@k), [similar to Spark's
> implementations|https://spark.apache.org/docs/1.5.0/mllib-evaluation-metrics.html#ranking-systems].
> It would be beneficial if we were able to design the API such that it could
> be included in the proposed evaluation framework (see
> [2157|https://issues.apache.org/jira/browse/FLINK-2157]).
> In its current form, this would mean generalizing the PredictionType type
> parameter of the Score class to allow for {{Array[Int]}} or {{Array[(Int,
> Double)]}}, and outputting the recommendations in the form {{DataSet[(Int,
> Array[Int])]}} or {{DataSet[(Int, Array[(Int,Double)])]}} meaning (user,
> array of items), possibly including the predicted scores as well.
> However, calculating, for example, nDCG for a given user u requires us to be
> able to access all of the (u, item, relevance) records in the test dataset,
> which means we would need to put this information in the second element of
> the {{DataSet[(PredictionType, PredictionType)]}} input of the scorer
> function as PredictionType={{Array[(Int, Double)]}}. This is problematic, as
> this Array could be arbitrarily long.
> Another option is to further rework the proposed evaluation framework to
> allow us to implement this properly, with inputs in the form of
> {{recommendations : DataSet[(Int,Int,Int)]}} (user, item, rank) and {{test :
> DataSet[(Int,Int,Double)]}} (user, item, relevance). This way, the scores
> could be implemented such that they can be calculated in a distributed way.
> The third option is to implement the scorer functions outside the evaluation
> framework.
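
For illustration only, here is a minimal sketch of how the second option's inputs could be scored per user. It uses plain Scala collections as a stand-in for DataSet, so it is not Flink code and not the proposed API. The names RankingMetrics and scoresAtK are hypothetical. It also assumes that ranks are 1-based and contiguous, that an item counts as relevant when its relevance is greater than zero, and that nDCG uses the raw relevance as gain with a log2 discount.

```scala
object RankingMetrics {
  // Hypothetical record types mirroring the shapes in the issue:
  // (user, item, rank) with 1-based ranks, and (user, item, relevance).
  type Rec = (Int, Int, Int)
  type Rel = (Int, Int, Double)

  private def log2(x: Double): Double = math.log(x) / math.log(2.0)

  // DCG over gains listed in rank order: sum of gain_i / log2(i + 1), i starting at 1.
  private def dcg(gains: Seq[Double]): Double =
    gains.zipWithIndex.map { case (g, i) => g / log2(i + 2.0) }.sum

  /** Returns user -> (precision@k, recall@k, nDCG@k) for every user in the test set. */
  def scoresAtK(recs: Seq[Rec], test: Seq[Rel], k: Int): Map[Int, (Double, Double, Double)] = {
    // Grouping both inputs by user stands in for a co-group keyed on the user ID.
    val recsByUser = recs.groupBy(_._1)
    val testByUser = test.groupBy(_._1)

    testByUser.map { case (user, rels) =>
      val relevance: Map[Int, Double] = rels.map { case (_, item, r) => (item, r) }.toMap
      val relevantItems = relevance.filter(_._2 > 0.0).keySet

      // Top-k recommended items for this user, ordered by rank.
      val topK: Seq[Int] = recsByUser.getOrElse(user, Seq.empty)
        .filter(_._3 <= k).sortBy(_._3).map(_._2)

      val hits = topK.count(relevantItems.contains)
      val precision = hits.toDouble / k
      val recall = if (relevantItems.isEmpty) 0.0 else hits.toDouble / relevantItems.size

      // Ideal DCG: the k largest relevance values in descending order.
      val idealDcg = dcg(relevance.values.toSeq.sortWith(_ > _).take(k))
      val actualDcg = dcg(topK.map(item => relevance.getOrElse(item, 0.0)))
      val ndcg = if (idealDcg == 0.0) 0.0 else actualDcg / idealDcg

      (user, (precision, recall, ndcg))
    }
  }
}

// Usage: one user, three ranked recommendations, three relevant test items.
// val recs = Seq((1, 10, 1), (1, 20, 2), (1, 30, 3))
// val test = Seq((1, 10, 1.0), (1, 30, 1.0), (1, 40, 1.0))
// RankingMetrics.scoresAtK(recs, test, k = 2)
```

Each user's score depends only on that user's records, so with these input shapes the computation could presumably be keyed on the user ID and run in a distributed way, without first packing every (item, relevance) pair into a single array.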