[ https://issues.apache.org/jira/browse/FLINK-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gábor Hermann reassigned FLINK-4713:
------------------------------------

    Assignee: Gábor Hermann

> Implementing ranking evaluation scores for recommender systems
> --------------------------------------------------------------
>
>                 Key: FLINK-4713
>                 URL: https://issues.apache.org/jira/browse/FLINK-4713
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Domokos Miklós Kelen
>            Assignee: Gábor Hermann
>
> Follow-up work to [4712|https://issues.apache.org/jira/browse/FLINK-4712] 
> includes implementing ranking recommendation evaluation metrics (such as 
> precision@k, recall@k, ndcg@k), [similar to Spark's 
> implementations|https://spark.apache.org/docs/1.5.0/mllib-evaluation-metrics.html#ranking-systems].
> It would be beneficial to design the API so that the new scores can be 
> included in the proposed evaluation framework (see 
> [2157|https://issues.apache.org/jira/browse/FLINK-2157]).
> In its current form, this would mean generalizing the PredictionType type 
> parameter of the Score class to allow for {{Array[Int]}} or {{Array[(Int, 
> Double)]}}, and outputting the recommendations in the form {{DataSet[(Int, 
> Array[Int])]}} or {{DataSet[(Int, Array[(Int,Double)])]}}, meaning (user, 
> array of items), possibly including the predicted scores as well.
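> Roughly, that generalization might look like the following (a hypothetical 
> sketch only; the Score shape and the class names here are made up, and the 
> actual framework proposed in FLINK-2157 may differ):
> {code:scala}
> import org.apache.flink.api.scala._
> 
> // Hypothetical shape of the generalized Score; the real class proposed in
> // FLINK-2157 may look different.
> abstract class Score[PredictionType] {
>   def evaluate(pairs: DataSet[(PredictionType, PredictionType)]): DataSet[Double]
> }
> 
> // Ranking scores would instantiate PredictionType with the per-user
> // recommendation list, e.g. Array[Int] or Array[(Int, Double)].
> class PrecisionAtKScore(k: Int) extends Score[Array[Int]] {
>   override def evaluate(
>       pairs: DataSet[(Array[Int], Array[Int])]): DataSet[Double] = {
>     // pairs: (recommended items of a user, relevant items of the same user)
>     pairs.map { case (recommended, relevant) =>
>       recommended.take(k).count(relevant.toSet.contains).toDouble / k
>     }
>     // averaging the per-user values over all users is left out of the sketch
>   }
> }
> {code}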
> However, calculating nDCG, for example, for a given user u requires access 
> to all of the (u, item, relevance) records in the test dataset, which means 
> we would need to put this information into the second element of the 
> {{DataSet[(PredictionType, PredictionType)]}} input of the scorer function 
> as PredictionType={{Array[(Int, Double)]}}. This is problematic, as this 
> Array could be arbitrarily long.
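> For illustration, a minimal local sketch of nDCG@k for a single user (plain 
> Scala, names made up for the example) shows why the full relevance 
> information is needed: the ideal DCG in the denominator is computed from all 
> relevant test items of the user, not only from the top-k recommendations.
> {code:scala}
> object NdcgSketch {
>   // Local, single-user sketch of nDCG@k; not part of any existing API.
>   //   recommended: item ids ordered by predicted rank (best first)
>   //   relevance:   relevance of every test item of the user (item -> relevance)
>   def ndcgAtK(recommended: Seq[Int], relevance: Map[Int, Double], k: Int): Double = {
>     def log2(x: Double): Double = math.log(x) / math.log(2)
> 
>     // DCG over the top-k recommended items
>     val dcg = recommended.take(k).zipWithIndex.map { case (item, idx) =>
>       relevance.getOrElse(item, 0.0) / log2(idx + 2)
>     }.sum
> 
>     // Ideal DCG: the best achievable ordering, which needs *all* relevance
>     // values of the user, not just those of the recommended items.
>     val idcg = relevance.values.toSeq.sortBy(-_).take(k).zipWithIndex.map {
>       case (rel, idx) => rel / log2(idx + 2)
>     }.sum
> 
>     if (idcg == 0.0) 0.0 else dcg / idcg
>   }
> }
> {code}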
> Another option is to further rework the proposed evaluation framework so 
> that this can be implemented properly, with inputs in the form of 
> {{recommendations : DataSet[(Int,Int,Int)]}} (user, item, rank) and {{test : 
> DataSet[(Int,Int,Double)]}} (user, item, relevance). This way, the scores 
> could be implemented so that they are calculated in a distributed way.
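> With these inputs, a rough sketch of precision@k (assuming the batch Scala 
> DataSet API; the function and its name are only illustrative) could look 
> like this:
> {code:scala}
> import org.apache.flink.api.scala._
> 
> object RankingScoresSketch {
>   // Sketch only: per-user precision@k from the two proposed inputs.
>   //   recommendations: (user, item, rank)
>   //   test:            (user, item, relevance)
>   def precisionAtK(
>       recommendations: DataSet[(Int, Int, Int)],
>       test: DataSet[(Int, Int, Double)],
>       k: Int): DataSet[(Int, Double)] = {
> 
>     // keep only the top-k recommendations of every user
>     val topK = recommendations.filter(_._3 <= k)
> 
>     // a recommendation is a hit if its (user, item) pair occurs in the
>     // test set with positive relevance
>     topK
>       .join(test.filter(_._3 > 0.0))
>       .where(0, 1).equalTo(0, 1)           // join on (user, item)
>       .map(pair => (pair._1._1, 1))        // (user, 1) for every hit
>       .groupBy(0)
>       .sum(1)                              // (user, hits within the top k)
>       .map(hits => (hits._1, hits._2.toDouble / k))
>   }
> }
> {code}
> Users without any hit fall out of the join and would have to be re-added 
> with precision 0 (e.g. via an outer join), and nDCG would additionally need 
> the per-user ideal DCG computed from the test side, but both fit the same 
> join/groupBy pattern.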
> The third option is to implement the scorer functions outside the evaluation 
> framework.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
