[ 
https://issues.apache.org/jira/browse/SPARK-26351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pablo J. Villacorta updated SPARK-26351:
----------------------------------------
    Description: 
The documented formula for *precision @ k*, which measures the quality of the 
recommendations:

[https://spark.apache.org/docs/latest/mllib-evaluation-metrics.html#ranking-systems]

says that j goes from 0 to *min(|D|, k)*, but the code takes the bound from the length of the prediction array instead:

[https://github.com/apache/spark/blob/a63e7b2a212bab94d080b00cf1c5f397800a276a/mllib/src/main/scala/org/apache/spark/mllib/evaluation/RankingMetrics.scala#L65]

 
{code:java}
val n = math.min(pred.length, k){code}
 

The Spark documentation defines the notation as follows:

D_i is the set of ground truth relevant documents for user i.

R_i is the set of recommended documents (i.e. predictions) given for user i.

According to the code, the documentation should say that j goes from 0 to 
*min(|R_i|, k)*.
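
To make the difference concrete, here is a minimal single-user sketch of the computation. It is not Spark's implementation: the object and function names are hypothetical, only the min(pred.length, k) bound comes from the linked line, and the division by k is an assumption about the surrounding code.

```scala
object PrecisionAtKSketch {
  // Hypothetical single-user helper, not part of Spark's API.
  def precisionAtK(pred: Array[Int], labels: Set[Int], k: Int): Double = {
    require(k > 0, "ranking position k should be positive")
    if (labels.isEmpty) {
      0.0
    } else {
      // Bound from the linked line: min(|R_i|, k), not min(|D_i|, k).
      val n = math.min(pred.length, k)
      // Count how many of the first n predictions are relevant.
      val hits = pred.take(n).count(labels.contains)
      // Assumption: the hit count is normalised by k.
      hits.toDouble / k
    }
  }

  def main(args: Array[String]): Unit = {
    val pred = Array(1, 2, 3)          // |R_i| = 3
    val labels = Set(1, 2, 3, 4, 5, 6) // |D_i| = 6
    println(precisionAtK(pred, labels, 5))
  }
}
```

With |R_i| = 3, |D_i| = 6 and k = 5, j only visits the three available predictions, giving 3 hits over k = 5, i.e. 0.6. A bound of min(|D_i|, k) = 5 would have j index past the end of the prediction array, which is why the documented bound cannot describe what the code does.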

  was:
The formula of the *precision @ k* for measuring the quality of the 
recommendations:

[https://spark.apache.org/docs/latest/mllib-evaluation-metrics.html#ranking-systems]

says that j goes from 0 to *min(|D|, k)* , but according to the code, 

[https://github.com/apache/spark/blob/a63e7b2a212bab94d080b00cf1c5f397800a276a/mllib/src/main/scala/org/apache/spark/mllib/evaluation/RankingMetrics.scala#L65]

 
{code:java}
val n = math.min(pred.length, k){code}
 

The notation of Spark documentation defines

D~i~ as the set of ground truth relevant documents for user i

R~i~ as the set of recommended documents (i.e. predictions) given for user i .

According to the code, the documentation should say j goes from 0 to *min( 
|R~i~|, k )*


> Documented formula of precision at k does not match the actual code
> -------------------------------------------------------------------
>
>                 Key: SPARK-26351
>                 URL: https://issues.apache.org/jira/browse/SPARK-26351
>             Project: Spark
>          Issue Type: Bug
>          Components: Documentation
>    Affects Versions: 2.4.0
>            Reporter: Pablo J. Villacorta
>            Priority: Major


