Re: [R] Calculate Specificity and Sensitivity for a given threshold value
Thanks to you both

--
View this message in context: http://www.nabble.com/Calculate-Specificity-and-Sensitivity-for-a-given-threshold-value-tp20481633p20541110.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] Calculate Specificity and Sensitivity for a given threshold value
Hi list,

I'm new to R and I'm currently using the ROCR package. My input data look like this:

DIAGNOSIS  SCORE
1          0.387945
1          0.50405
1          0.435667
1          0.358057
1          0.583512
1          0.387945
1          0.531795
1          0.527148
0          0.526397
0          0.372935
1          0.861097

I run the following simple code:

    library(ROCR)
    d <- read.table(inputFile, header=TRUE)
    pred <- prediction(d$SCORE, d$DIAGNOSIS)
    perf <- performance(pred, "tpr", "fpr")
    plot(perf)

So building the curve works easily. My question is: can I get the specificity and the sensitivity for a given score threshold, 0.5 for example? How do I compute this?

Thank you in advance
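For a single threshold, the two quantities can also be computed directly from the data without ROCR. The following is a minimal sketch in base R, not the method proposed anywhere in this thread: the helper name se_sp_at is hypothetical, and it assumes that a case is called positive when its score is greater than or equal to the threshold.

```r
# The eleven rows from the post: DIAGNOSIS (1 = positive, 0 = negative), SCORE.
d <- data.frame(
  DIAGNOSIS = c(1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1),
  SCORE = c(0.387945, 0.50405, 0.435667, 0.358057, 0.583512, 0.387945,
            0.531795, 0.527148, 0.526397, 0.372935, 0.861097)
)

# Hypothetical helper: sensitivity and specificity at one threshold.
# Assumption: a case is called positive when score >= threshold.
se_sp_at <- function(score, truth, threshold) {
  called_pos <- score >= threshold
  c(sensitivity = sum(called_pos & truth == 1) / sum(truth == 1),
    specificity = sum(!called_pos & truth == 0) / sum(truth == 0))
}

res <- se_sp_at(d$SCORE, d$DIAGNOSIS, 0.5)
print(res)
```

With these eleven rows and a threshold of 0.5, 5 of the 9 positives score at or above the threshold (sensitivity 5/9) and 1 of the 2 negatives scores below it (specificity 1/2). With so few negatives, the specificity estimate is very imprecise.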
Re: [R] Calculate Specificity and Sensitivity for a given threshold value
Kaliss wrote:
> My question is: can I have the specificity and the sensitivity for a
> score threshold = 0.5 (for example)? How do I compute this?

Beware of the utility/loss function you are implicitly assuming with this approach. It is quite oversimplified. In clinical practice the cost of a false positive or false negative (which comes from a cost function and the simple forward probability of a positive diagnosis, e.g., from a basic logistic regression model if you start with a cohort study) varies with the type of patient being diagnosed.

Frank

--
Frank E Harrell Jr
Professor and Chair
School of Medicine
Department of Biostatistics
Vanderbilt University
Re: [R] Calculate Specificity and Sensitivity for a given threshold value
Hi Frank,

Thank you for your answer. In fact, I don't use this for clinical research practice. I am currently testing several scoring methods, and I'd like to know which one is the most effective and which threshold value I should apply to discriminate positives from negatives.

So, any ideas for my problem?

Pierre-Jean
Re: [R] Calculate Specificity and Sensitivity for a given threshold value
Hi Pierre-Jean,

ROCR calculates sensitivity (Se) and specificity (Sp) at each cutoff, and the cutoffs are stored in the x.values slot of the performance object. For example, let's generate the performance objects for Se and Sp:

    sens <- performance(pred, "sens")
    spec <- performance(pred, "spec")

Now you have access to:

    sens@x.values  # (or spec@x.values), which is the list of cutoffs
    sens@y.values  # for the corresponding Se
    spec@y.values  # for the corresponding Sp

You can, for example, sum up this information in a table:

    (SeSp <- cbind(sens@x.values[[1]], sens@y.values[[1]], spec@y.values[[1]]))

You can also write a function to give Se and Sp for a specific cutoff, but you will have to decide what to do for cutoffs not stored in the list. For example, the following function uses the stored cutoff closest to the requested one and returns the corresponding Se and Sp (this is not always the best solution; you may want to define your own way to interpolate):

    se.sp <- function(cutoff, pred) {
      sens <- performance(pred, "sens")
      spec <- performance(pred, "spec")
      # index of the stored cutoff closest to the requested one
      num.cutoff <- which.min(abs(sens@x.values[[1]] - cutoff))
      # returns, in order: the cutoff used, Se, Sp
      return(list(sens@x.values[[1]][num.cutoff],
                  sens@y.values[[1]][num.cutoff],
                  spec@y.values[[1]][num.cutoff]))
    }

    se.sp(.5, pred)

Hope this helps,

Nael
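As an alternative to the nearest-cutoff rule, note that empirical Se and Sp are step functions of the cutoff, so a requested cutoff that falls between two observed scores can be mapped to an exact value rather than an approximate one. The sketch below uses base R only, so it does not depend on ROCR internals. It assumes a case is called positive when score >= cutoff, and the helper name se_sp_step is hypothetical.

```r
# Data from the original post.
d <- data.frame(
  DIAGNOSIS = c(1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1),
  SCORE = c(0.387945, 0.50405, 0.435667, 0.358057, 0.583512, 0.387945,
            0.531795, 0.527148, 0.526397, 0.372935, 0.861097)
)
pos <- d$SCORE[d$DIAGNOSIS == 1]
neg <- d$SCORE[d$DIAGNOSIS == 0]

# Candidate cutoffs: Inf first, then the unique scores in decreasing order.
cutoffs <- c(Inf, sort(unique(d$SCORE), decreasing = TRUE))
SeSp <- cbind(cutoff = cutoffs,
              Se = sapply(cutoffs, function(k) mean(pos >= k)),
              Sp = sapply(cutoffs, function(k) mean(neg < k)))

# Hypothetical helper: exact lookup for any requested cutoff.
# Under the ">=" rule, the classification at a requested cutoff equals the
# classification at the smallest stored cutoff that is >= the request.
se_sp_step <- function(cutoff, tab) {
  ok <- tab[, "cutoff"] >= cutoff   # always TRUE for the Inf row
  tab[max(which(ok)), ]             # rows are in decreasing cutoff order
}

exact   <- se_sp_step(0.45, SeSp)
nearest <- SeSp[which.min(abs(SeSp[, "cutoff"] - 0.45)), ]
print(rbind(exact = exact, nearest = nearest))
```

For a requested cutoff of 0.45, the nearest stored cutoff is 0.435667, which counts one more positive than the exact rule does (Se of 6/9 instead of 5/9), so the two lookups can disagree even on a small data set.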
Re: [R] Calculate Specificity and Sensitivity for a given threshold value
Pierre-Jean wrote:
> I am currently testing several scoring methods and I'd like to know
> which one is the most effective and which threshold value I should
> apply to discriminate positives and negatives. So, any idea for my
> problem ?

The use of thresholds gets in the way of finding a good solution because you will have predictor values in the gray zone. I tend to rank methods by the most sensitive index available, such as the log likelihood in the binary logistic model. You can extend ordinary logistic models to allow for nonlinear effects on the log odds scale using regression splines.

Frank
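The ranking idea can be sketched as follows. This is only an illustration under stated assumptions, not code from the thread: the two scores are simulated, the variable names are hypothetical, and the spline fit uses ns() from the splines package that ships with R as one possible choice of regression spline.

```r
library(splines)  # ships with R; ns() builds a natural cubic spline basis

# Simulated data (assumption): two scoring methods for the same outcome.
set.seed(1)
n <- 200
y <- rbinom(n, 1, 0.5)
score_a <- rnorm(n, mean = y)        # simulated, more informative score
score_b <- rnorm(n, mean = 0.3 * y)  # simulated, weaker score

# Binary logistic models, linear in each score on the log odds scale.
fit_a <- glm(y ~ score_a, family = binomial)
fit_b <- glm(y ~ score_b, family = binomial)

# Same score, but allowing a nonlinear effect through a regression spline.
fit_a_ns <- glm(y ~ ns(score_a, df = 3), family = binomial)

# Log likelihoods: higher (less negative) means a better fit to the data.
ll <- c(a        = as.numeric(logLik(fit_a)),
        b        = as.numeric(logLik(fit_b)),
        a_spline = as.numeric(logLik(fit_a_ns)))
print(ll)
```

Models with the same number of parameters can be compared on log likelihood directly. The spline model has more parameters than the linear one, so its log likelihood cannot be lower on the same data, and a penalized measure such as AIC() is a fairer basis for that comparison.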
Re: [R] Calculate Specificity and Sensitivity for a given threshold value
N. Lapidus wrote:
> You can also write a function to give Se and Sp for a specific
> cutoff, but you will have to define what to do for cutoffs not stored
> in the list. For example, the following function keeps the closest
> stored cutoff to give corresponding Se and Sp [...]

That is a biased procedure (like how stepwise regression results in overfitting). It also uses a strange loss function. The bootstrap would need to be used to penalize for the uncertainty in the cutoff. You are also assuming that a cutoff exists, which is a major assumption.

Frank
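One way to read the bootstrap remark is as an optimism correction: choose the cutoff again in each bootstrap sample and measure how much worse that cutoff does on the original data. The sketch below is an assumption-laden illustration, not Frank's procedure. The data are simulated, the criterion (Youden's index, Se + Sp - 1) is just one possible choice, and the helper names youden and best_cutoff are hypothetical.

```r
# Simulated data (assumption): one score, binary truth.
set.seed(42)
n <- 150
truth <- rbinom(n, 1, 0.5)
score <- rnorm(n, mean = truth)

# Youden's index at cutoff k, calling a case positive when score >= k.
youden <- function(score, truth, k) {
  mean(score[truth == 1] >= k) + mean(score[truth == 0] < k) - 1
}

# Cutoff (among the observed scores) that maximizes Youden's index.
best_cutoff <- function(score, truth) {
  ks <- sort(unique(score))
  ks[which.max(sapply(ks, function(k) youden(score, truth, k)))]
}

k_hat <- best_cutoff(score, truth)
apparent <- youden(score, truth, k_hat)  # optimistic: same data chose k_hat

# Bootstrap: re-select the cutoff in each resample, then compare its
# performance in the resample with its performance on the original data.
B <- 200
boot <- replicate(B, {
  i <- sample(n, replace = TRUE)
  k_b <- best_cutoff(score[i], truth[i])
  c(cutoff = k_b,
    optimism = youden(score[i], truth[i], k_b) - youden(score, truth, k_b))
})

corrected <- apparent - mean(boot["optimism", ])
cutoff_range <- quantile(boot["cutoff", ], c(0.025, 0.975))
print(c(apparent = apparent, corrected = corrected))
print(cutoff_range)
```

The spread of the re-selected cutoffs in cutoff_range gives a rough sense of how unstable the "optimal" threshold is. The sketch assumes every resample contains both positives and negatives, which is very likely at this sample size but not guaranteed in small data sets.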