[ https://issues.apache.org/jira/browse/SPARK-6332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15089932#comment-15089932 ]
Robert Dodier edited comment on SPARK-6332 at 1/13/16 10:46 PM:
----------------------------------------------------------------

I'm reopening this issue, as I have made a new [PR #10666|https://github.com/apache/spark/pull/10666] to address the comments that were made on the previous [PR #5025|https://github.com/apache/spark/pull/5025].

was (Author: robert_dodier):
I'm reopening this issue, as I have made a new [PR #10666](https://github.com/apache/spark/pull/10666) to address the comments that were made on the previous [PR #5025](https://github.com/apache/spark/pull/5025).

> compute calibration curve for binary classifiers
> ------------------------------------------------
>
>         Key: SPARK-6332
>         URL: https://issues.apache.org/jira/browse/SPARK-6332
>     Project: Spark
>  Issue Type: New Feature
>  Components: MLlib
>    Reporter: Robert Dodier
>    Priority: Minor
>      Labels: classification
>
> For binary classifiers, calibration measures how classifier scores compare to
> the proportion of positive examples. If the classifier is well-calibrated,
> the classifier score is approximately equal to the proportion of positive
> examples. This is important if the scores are used as probabilities for
> making decisions via expected cost. Otherwise, the calibration curve may
> still be interesting; the proportion of positive examples should at least be
> a monotonic function of the score.
>
> I propose that a new method for calibration be added to the class
> BinaryClassificationMetrics, since calibration seems to fit in with the ROC
> curve and other classifier assessments.
>
> For more about calibration, see:
> http://en.wikipedia.org/wiki/Calibration_%28statistics%29#In_classification
>
> References:
>
> Mahdi Pakdaman Naeini, Gregory F. Cooper, Milos Hauskrecht. "Binary
> Classifier Calibration: Non-parametric approach."
> http://arxiv.org/abs/1401.3390
>
> Alexandru Niculescu-Mizil, Rich Caruana. "Predicting Good Probabilities With
> Supervised Learning." Appearing in Proceedings of the 22nd International
> Conference on Machine Learning, Bonn, Germany, 2005.
> http://www.cs.cornell.edu/~alexn/papers/calibration.icml05.crc.rev3.pdf
>
> Ira Cohen, Moises Goldszmidt. "Properties and benefits of calibrated
> classifiers." http://www.hpl.hp.com/techreports/2004/HPL-2004-22R1.pdf
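
To make the idea in the issue description concrete, here is a minimal sketch of a calibration curve in plain Python. It is not the code from either PR and not the BinaryClassificationMetrics API; the function names `calibration_curve` and `is_monotonic` are hypothetical, and the choice of equal-width bins over scores assumed to lie in [0, 1] is an assumption made only for illustration. For each bin it reports the mean score and the proportion of positive examples; for a well-calibrated classifier the two should be roughly equal.

```python
from typing import List, Sequence, Tuple


def calibration_curve(
    scores: Sequence[float],
    labels: Sequence[int],
    num_bins: int = 10,
) -> List[Tuple[float, float, int]]:
    """Return (mean score, proportion positive, count) for each non-empty bin.

    Assumes scores are in [0, 1] and labels are 0 or 1 (illustrative choice).
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if num_bins < 1:
        raise ValueError("num_bins must be at least 1")

    score_sums = [0.0] * num_bins
    positives = [0] * num_bins
    counts = [0] * num_bins

    for score, label in zip(scores, labels):
        if not 0.0 <= score <= 1.0:
            raise ValueError("scores must lie in [0, 1]")
        # Equal-width bins; a score of exactly 1.0 goes into the last bin.
        b = min(int(score * num_bins), num_bins - 1)
        score_sums[b] += score
        positives[b] += label
        counts[b] += 1

    return [
        (score_sums[b] / counts[b], positives[b] / counts[b], counts[b])
        for b in range(num_bins)
        if counts[b] > 0
    ]


def is_monotonic(curve: List[Tuple[float, float, int]]) -> bool:
    """Check that the proportion of positives never decreases across bins."""
    fractions = [fraction for _, fraction, _ in curve]
    return all(a <= b for a, b in zip(fractions, fractions[1:]))
```

Plotting proportion positive against mean score gives the calibration curve; points near the diagonal indicate good calibration, and `is_monotonic` checks the weaker property mentioned in the description.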