[GitHub] spark issue #16353: [SPARK-18948][MLlib] Add Mean Percentile Rank metric for...

daniloascione Wed, 21 Dec 2016 03:49:57 -0800

Github user daniloascione commented on the issue:

    https://github.com/apache/spark/pull/16353
  
    I agree that mean percentile rank (MPR) is related to mean reciprocal rank 
(MRR), and both are simple to calculate. MPR is easier to interpret than MRR 
because it is built from percentile ranks, so the resulting MPR value is itself 
a percentile. 
    
    The r_ui value measures the feedback from user _u_ about item _i_. For a 
given user and item, it can be calculated as the number of feedback events from 
the user about the item (e.g. clicks, amount of time) in recommender systems, 
or as the number of times the user queried the item (document) in information 
retrieval systems. 
    Note that r_ui is equal to the number of times item i appears as a label 
for user u in the input dataset, and the sum of r_ui over all (u, i) pairs is 
the size of the input dataset. Thus, summing the percentile-rank (rank_ui in 
the paper) over every (label, predictions) pair in the input dataset is 
equivalent to summing r_ui * rank_ui over the distinct (u, i) pairs, which is 
the numerator of the MPR formula (8) in the paper.
    E.g. if a user u viewed a page i three times (or queried a document i 
three times), then r_ui = 3 and there are three identical (label, predictions) 
pairs in the input dataset: multiplying the percentile-rank of that pair by 
three is equivalent to summing it three times.
    That said, we can assume r_ui = 1 for each pair of the input dataset and 
then remove it from the equation. Also, there is no need to have information 
about the user in the input dataset.
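
    The equivalence argued above can be checked with a small sketch. This is 
not the code from the pull request: the function names, the toy data, and the 
percentile-rank convention (0.0 for the top of the ranked list, 1.0 for the 
bottom or for a missing label) are all assumptions made for illustration.

```python
from collections import Counter


def percentile_rank(label, predictions):
    """Position of `label` in the ranked `predictions`, scaled to [0, 1].

    Assumed convention: 0.0 = top of the list, 1.0 = bottom or missing.
    """
    if label not in predictions:
        return 1.0
    if len(predictions) == 1:
        return 0.0
    return predictions.index(label) / (len(predictions) - 1)


def mpr_unweighted(pairs):
    """Plain mean over (label, predictions) pairs, i.e. r_ui = 1 per pair."""
    return sum(percentile_rank(l, p) for l, p in pairs) / len(pairs)


def mpr_weighted(pairs):
    """Explicit r_ui form: sum(r_ui * rank_ui) / sum(r_ui)."""
    # r_ui is recovered as the number of identical pairs in the input.
    counts = Counter((l, tuple(p)) for l, p in pairs)
    numerator = sum(r * percentile_rank(l, list(p)) for (l, p), r in counts.items())
    return numerator / sum(counts.values())


# Hypothetical input: item "c" was viewed three times, "a" and "e" once each.
ranked = ["a", "b", "c", "d", "e"]
pairs = [("c", ranked)] * 3 + [("a", ranked), ("e", ranked)]

print(mpr_unweighted(pairs), mpr_weighted(pairs))
```

    Both functions return the same value on this input, because repeating a 
pair three times contributes the same amount to the sum as weighting it by 
r_ui = 3.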
    
    Since the MPR metric is based on percentile-rank and the r_ui value depends 
on the size of the input dataset, this metric is general enough to be used with 
any ranking algorithm.



