[ 
https://issues.apache.org/jira/browse/SPARK-6332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Dodier reopened SPARK-6332:
----------------------------------

I'm reopening this issue because I have opened a new 
[PR #10666](https://github.com/apache/spark/pull/10666) that addresses the 
comments made on the previous 
[PR #5025](https://github.com/apache/spark/pull/5025). 

> compute calibration curve for binary classifiers
> ------------------------------------------------
>
>                 Key: SPARK-6332
>                 URL: https://issues.apache.org/jira/browse/SPARK-6332
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>            Reporter: Robert Dodier
>            Priority: Minor
>              Labels: classification
>
> For binary classifiers, calibration measures how classifier scores compare to 
> the observed proportion of positive examples among the examples that receive 
> those scores. If the classifier is well-calibrated, the score is 
> approximately equal to that proportion. This is important if the scores are 
> used as probabilities for making decisions via expected cost. Even when they 
> are not, the calibration curve may still be interesting; the proportion of 
> positive examples should at least be a monotonic function of the score.
> I propose that a new method for calibration be added to the class 
> BinaryClassificationMetrics, since calibration seems to fit in with the ROC 
> curve and other classifier assessments. 
> For more about calibration, see: 
> http://en.wikipedia.org/wiki/Calibration_%28statistics%29#In_classification
> References:
> Mahdi Pakdaman Naeini, Gregory F. Cooper, Milos Hauskrecht. "Binary 
> Classifier Calibration: Non-parametric approach." 
> http://arxiv.org/abs/1401.3390
> Alexandru Niculescu-Mizil, Rich Caruana. "Predicting Good Probabilities With 
> Supervised Learning." Appearing in Proceedings of the 22nd International 
> Conference on Machine Learning, Bonn, Germany, 2005. 
> http://www.cs.cornell.edu/~alexn/papers/calibration.icml05.crc.rev3.pdf
> Ira Cohen, Moises Goldszmidt. "Properties and benefits of calibrated 
> classifiers." http://www.hpl.hp.com/techreports/2004/HPL-2004-22R1.pdf
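
To make the idea in the description concrete, here is a minimal sketch of a binned calibration curve in plain Python. It is not the code from either PR and not an existing BinaryClassificationMetrics method; the names `calibration_curve`, `is_monotonic`, and `num_bins` are hypothetical, and equal-width bins over [0, 1] are an assumption chosen for simplicity.

```python
from typing import Iterable, List, Tuple


def calibration_curve(
    scores_and_labels: Iterable[Tuple[float, float]],
    num_bins: int = 10,
) -> List[Tuple[float, float, int]]:
    """Return (mean score, fraction of positives, count) per non-empty bin.

    Assumes scores lie in [0, 1] and labels are 0 or 1. Bins are
    equal-width intervals over [0, 1] (an illustrative choice).
    """
    if num_bins < 1:
        raise ValueError("num_bins must be at least 1")

    score_sums = [0.0] * num_bins
    positives = [0] * num_bins
    counts = [0] * num_bins

    for score, label in scores_and_labels:
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score out of range [0, 1]: {score}")
        # A score of exactly 1.0 goes into the last bin.
        idx = min(int(score * num_bins), num_bins - 1)
        score_sums[idx] += score
        positives[idx] += 1 if label > 0.5 else 0
        counts[idx] += 1

    return [
        (score_sums[i] / counts[i], positives[i] / counts[i], counts[i])
        for i in range(num_bins)
        if counts[i] > 0
    ]


def is_monotonic(curve: List[Tuple[float, float, int]]) -> bool:
    """Check that the fraction of positives never decreases across bins."""
    fractions = [fraction for _, fraction, _ in curve]
    return all(a <= b for a, b in zip(fractions, fractions[1:]))


# Toy data: score 0.25 with 1 of 4 positive, score 0.75 with 3 of 4 positive.
data = [(0.25, 0), (0.25, 0), (0.25, 0), (0.25, 1),
        (0.75, 1), (0.75, 1), (0.75, 1), (0.75, 0)]
curve = calibration_curve(data, num_bins=2)
# curve == [(0.25, 0.25, 4), (0.75, 0.75, 4)]
```

In this toy data the mean score in each bin equals the fraction of positives, which is what a well-calibrated classifier would show; `is_monotonic(curve)` checks the weaker property that the fraction of positives does not decrease as the score increases.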


