[ https://issues.apache.org/jira/browse/SPARK-32904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicholas Brett Marcott reopened SPARK-32904:
--------------------------------------------

Reopening to close again with resolution Invalid, as recommended in the [Spark 
contributing guidelines|https://spark.apache.org/contributing.html].

> pyspark.mllib.evaluation.MulticlassMetrics needs to swap the results of 
> precision() and recall()
> --------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-32904
>                 URL: https://issues.apache.org/jira/browse/SPARK-32904
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib
>    Affects Versions: 3.0.1
>            Reporter: TinaLi
>            Priority: Major
>
> [https://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/evaluation/MulticlassMetrics.html]
> *The values returned by the precision() and recall() methods of this API 
> should be swapped.*
> Following are the example results I got when I ran this API; it prints the 
> confusion matrix followed by precision and recall for label 1:
>
> metrics = MulticlassMetrics(predictionAndLabels)
> print(metrics.confusionMatrix().toArray())
> print("precision: ", metrics.precision(1))
> print("recall: ", metrics.recall(1))
>
> Output:
>
> [[36631.  2845.]
>  [ 3839.  1610.]]
> precision: 0.3613916947250281
> recall: 0.2954670581758121
>  
> predictions.select('prediction').agg({'prediction': 'sum'}).show()
>
> +---------------+
> |sum(prediction)|
> +---------------+
> |         5449.0|
> +---------------+
> As you can see, my model predicted 5449 cases with label=1, and 1610 of 
> those 5449 cases are true positives, so precision should be 
> 1610/5449 = 0.2954670581758121. This API instead assigns that value to the 
> recall() method, so the two results should be swapped.
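
A quick way to sanity-check this arithmetic is to recompute both metrics
directly from the confusion matrix quoted above. The sketch below is plain
Python (no Spark needed) and assumes the layout documented for
confusionMatrix(): true labels in rows, predicted classes in columns,
ordered by ascending label. Under that assumption, both values the API
printed fall out exactly:

    # Recompute precision/recall for label 1 from the quoted confusion
    # matrix, assuming rows = actual labels, columns = predicted labels
    # (the layout documented for MulticlassMetrics.confusionMatrix()).
    cm = [[36631.0, 2845.0],   # row 0: actual label 0
          [3839.0, 1610.0]]    # row 1: actual label 1

    tp = cm[1][1]              # predicted 1, actually 1
    fp = cm[0][1]              # predicted 1, actually 0
    fn = cm[1][0]              # predicted 0, actually 1

    print("precision:", tp / (tp + fp))  # 1610 / 4455 = 0.3613916947250281
    print("recall:   ", tp / (tp + fn))  # 1610 / 5449 = 0.2954670581758121

Under that same layout, 5449 = 3839 + 1610 is the row total for actual
label 1 rather than a count of predicted 1s. Note also that
MulticlassMetrics is documented to take an RDD of (prediction, label)
pairs; building the RDD as (label, prediction) instead would make
precision and recall appear swapped.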



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
