Domokos Miklós Kelen created FLINK-4713:
-------------------------------------------

             Summary: Implementing ranking evaluation scores for recommender 
systems
                 Key: FLINK-4713
                 URL: https://issues.apache.org/jira/browse/FLINK-4713
             Project: Flink
          Issue Type: New Feature
          Components: Machine Learning Library
            Reporter: Domokos Miklós Kelen


Follow-up work to [FLINK-4712|https://issues.apache.org/jira/browse/FLINK-4712] 
includes implementing ranking evaluation metrics for recommendations (such as 
precision@k, recall@k, nDCG@k), [similar to Spark's 
implementations|https://spark.apache.org/docs/1.5.0/mllib-evaluation-metrics.html#ranking-systems].
It would be beneficial to design the API so that it can be included in the 
proposed evaluation framework (see 
[FLINK-2157|https://issues.apache.org/jira/browse/FLINK-2157]).
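
For reference, a minimal sketch of precision@k and recall@k for a single user, written in plain Scala without any Flink dependency. The object and method names are hypothetical and are not part of FLINK-4712 or the proposed evaluation framework. Dividing by k even when fewer than k items are recommended is one common convention, assumed here.

```scala
object RankingMetricsSketch {

  /** Fraction of the top-k recommended items that are relevant. */
  def precisionAtK(recommended: Seq[Int], relevant: Set[Int], k: Int): Double = {
    require(k > 0, "k must be positive")
    val hits = recommended.take(k).count(relevant.contains)
    hits.toDouble / k
  }

  /** Fraction of the relevant items that appear among the top-k recommendations. */
  def recallAtK(recommended: Seq[Int], relevant: Set[Int], k: Int): Double = {
    require(k > 0, "k must be positive")
    if (relevant.isEmpty) 0.0 // assumption: recall is 0 for users without relevant items
    else recommended.take(k).count(relevant.contains).toDouble / relevant.size
  }
}
```

Here {{recommended}} is the ranked item list for one user (best first) and {{relevant}} is the set of that user's relevant items from the test data.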

In its current form, this would mean generalizing the PredictionType type 
parameter of the Score class to allow for {{Array[Int]}} or {{Array[(Int, 
Double)]}}, and outputting the recommendations in the form {{DataSet[(Int, 
Array[Int])]}} or {{DataSet[(Int, Array[(Int,Double)])]}}, that is, (user, 
array of items), with the second form also carrying the predicted scores. 

However, calculating, for example, nDCG for a given user u requires access to 
all of the (u, item, relevance) records in the test dataset, which means we 
would need to put this information in the second element of the 
{{DataSet[(PredictionType, PredictionType)]}} input of the scorer function as 
PredictionType={{Array[(Int, Double)]}}. This is problematic, because the array 
holds every test record of a user and can therefore be arbitrarily long.
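
To illustrate why the full relevance data is needed, here is a minimal per-user nDCG@k sketch in plain Scala. The names are hypothetical. It assumes linear gain and a log2(rank + 1) discount, which is only one of several nDCG variants. The ideal DCG is computed from all of the user's relevance values, including items that were never recommended.

```scala
object NdcgSketch {

  private def log2(x: Double): Double = math.log(x) / math.log(2.0)

  /** DCG of gains given in rank order (rank 1 first): sum of gain / log2(rank + 1). */
  def dcg(gains: Seq[Double]): Double =
    gains.zipWithIndex.map { case (gain, i) => gain / log2(i + 2.0) }.sum

  /**
   * nDCG@k for one user.
   * recommended: ranked item ids, best first.
   * relevance:   ALL (item -> relevance) test records of this user.
   */
  def ndcgAtK(recommended: Seq[Int], relevance: Map[Int, Double], k: Int): Double = {
    require(k > 0, "k must be positive")
    // Items without a test record are treated as having relevance 0 (assumption).
    val gains = recommended.take(k).map(item => relevance.getOrElse(item, 0.0))
    // The ideal ranking needs every relevance value of the user, not only the recommended ones.
    val idealGains = relevance.values.toSeq.sortWith(_ > _).take(k)
    val idealDcg = dcg(idealGains)
    if (idealDcg == 0.0) 0.0 else dcg(gains) / idealDcg
  }
}
```

The {{relevance}} map is exactly the per-user array discussed above: its size is bounded only by the number of test records of that user.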

Another option is to further rework the proposed evaluation framework to allow 
us to implement this properly, with inputs in the form of {{recommendations : 
DataSet[(Int,Int,Int)]}} (user, item, rank) and {{test : 
DataSet[(Int,Int,Double)]}} (user, item, relevance). This way, the scores could 
be implemented such that they can be calculated in a distributed way.
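
A rough sketch of how a score could be computed from such flat records, using plain Scala collections as a stand-in for {{DataSet}}. All names are hypothetical. In Flink, the set lookup would presumably become a join on (user, item) and the grouping a groupBy on user, so no per-user array has to be materialized in a single record. Ranks are assumed to start at 1, and a positive relevance is assumed to mean "relevant".

```scala
object FlatRankingSketch {
  type Rec = (Int, Int, Int)    // (user, item, rank), rank starting at 1
  type Rel = (Int, Int, Double) // (user, item, relevance)

  /** precision@k per user, computed from flat recommendation and test records. */
  def precisionAtKPerUser(recommendations: Seq[Rec], test: Seq[Rel], k: Int): Map[Int, Double] = {
    require(k > 0, "k must be positive")
    // Stand-in for the join side: (user, item) pairs with positive relevance.
    val relevantPairs: Set[(Int, Int)] =
      test.collect { case (user, item, rel) if rel > 0.0 => (user, item) }.toSet

    recommendations
      .filter { case (_, _, rank) => rank <= k } // keep only the top-k rows
      .groupBy { case (user, _, _) => user }     // one group per user
      .map { case (user, recs) =>
        val hits = recs.count { case (u, item, _) => relevantPairs.contains((u, item)) }
        (user, hits.toDouble / k)
      }
  }
}
```

Users without any recommendation in the top k do not appear in the result of this sketch.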

The third option is to implement the scorer functions outside the evaluation 
framework.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
