On 05/24/2010 02:14 AM, Claudia Beleites wrote:
Dear Changbin,
How do I select the optimal decision threshold from the ROC
curve?
Depends on what optimal means. I think there are a bunch of different
criteria used:
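- point closest to the ideal model (the top-left corner of the ROC plot:
  sensitivity = 1, specificity = 1)
- point furthest from the "guessing" model (the diagonal of the ROC plot)
- these criteria may include costs, i.e. a FP/FN cost ratio != 1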
- ...
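A rough sketch of how the criteria above could be computed in base R. The data are simulated, and the cost values (cost_fp, cost_fn) are made-up numbers for illustration only; the convention "predict positive when score >= threshold" is likewise an assumption.

```r
# Simulated scores and labels (assumption: higher score = more likely positive).
set.seed(1)
labels <- rep(c(0, 1), each = 100)
scores <- c(rnorm(100, mean = 0), rnorm(100, mean = 1.5))

# Candidate thresholds: every observed score.
# Convention assumed here: predict "positive" when score >= threshold.
thr <- sort(unique(scores))
tpr <- sapply(thr, function(t) mean(scores[labels == 1] >= t))
fpr <- sapply(thr, function(t) mean(scores[labels == 0] >= t))

# Criterion 1: point closest to the ideal model (FPR = 0, TPR = 1).
d_ideal <- sqrt(fpr^2 + (1 - tpr)^2)
thr_closest <- thr[which.min(d_ideal)]

# Criterion 2: point furthest from the guessing diagonal (TPR = FPR).
# The vertical distance TPR - FPR is Youden's J.
youden <- tpr - fpr
thr_youden <- thr[which.max(youden)]

# Cost-weighted variant: minimize the total misclassification cost.
# cost_fp and cost_fn are illustrative values, not recommendations.
cost_fp <- 1
cost_fn <- 5
n_pos <- sum(labels == 1)
n_neg <- sum(labels == 0)
cost <- cost_fp * fpr * n_neg + cost_fn * (1 - tpr) * n_pos
thr_cost <- thr[which.min(cost)]

c(closest = thr_closest, youden = thr_youden, cost = thr_cost)
```

The three criteria will in general give different thresholds, which is the reason "optimal" needs to be defined first.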
More practical:
If you use ROCR: the help page for the performance class
(?"performance-class") explains the slots of the object. They hold the
data of the curve, incl. the thresholds.
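For illustration, a minimal sketch of pulling the curve data out of a ROCR performance object. The scores and labels are simulated, and the data frame name roc_data is just a placeholder; the block only runs if ROCR is installed.

```r
# Sketch only: runs when the ROCR package is installed.
if (requireNamespace("ROCR", quietly = TRUE)) {
  set.seed(1)
  labels <- rep(c(0, 1), each = 100)                         # simulated classes
  scores <- c(rnorm(100, mean = 0), rnorm(100, mean = 1.5))  # simulated scores

  pred <- ROCR::prediction(scores, labels)
  roc  <- ROCR::performance(pred, measure = "tpr", x.measure = "fpr")

  # Each slot is a list with one element per run; here there is a single run.
  roc_data <- data.frame(
    threshold = roc@alpha.values[[1]],  # cutoffs
    fpr       = roc@x.values[[1]],      # false positive rate
    tpr       = roc@y.values[[1]]       # true positive rate
  )
  head(roc_data)
}
```

With a single measure, e.g. performance(pred, measure = "acc"), the cutoffs should be in x.values and the measure in y.values instead.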
What threshold will give the highest accuracy?
To find out, optimize the accuracy as a function of the threshold.
Remember: finding the optimal threshold from a ROC curve is a
data-driven optimization. You need to validate the resulting model with
independent test data afterwards.
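A small base R sketch of both steps, under some assumptions: simulated data, a simple random hold-out split, and the convention "predict positive when score >= threshold". The helper accuracy() is made up for this example.

```r
set.seed(42)
n <- 400
labels <- rbinom(n, 1, 0.5)               # simulated classes
scores <- rnorm(n, mean = 1.5 * labels)   # simulated classifier scores

# Assumption: a simple random hold-out split into training and test halves.
train <- sample(n, n / 2)
test  <- setdiff(seq_len(n), train)

# Hypothetical helper: accuracy when predicting "positive" for score >= threshold.
accuracy <- function(threshold, s, y) mean((s >= threshold) == (y == 1))

# Optimize the accuracy as a function of the threshold, on training data only.
thr       <- sort(unique(scores[train]))
acc_train <- sapply(thr, accuracy, s = scores[train], y = labels[train])
best_thr  <- thr[which.max(acc_train)]

# Validate the chosen threshold on the independent test data.
acc_test <- accuracy(best_thr, scores[test], labels[test])

c(threshold = best_thr, train = max(acc_train), test = acc_test)
```

The training accuracy at the selected threshold tends to be optimistic because the threshold was tuned on those same data; the test accuracy is the number to report.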
That point is excellent. In addition, such decision analysis assumes
that (1) a forced yes/no decision is acceptable, i.e., a predicted
probability in the middle is forced to be categorized as "low" or "high"
as opposed to "no decision; get more data", and (2) the
utility/cost/loss function is identical across subjects (which it almost
never is).
Frank
--
Frank E Harrell Jr   Professor and Chairman        School of Medicine
                     Department of Biostatistics   Vanderbilt University