Hi all -
I am trying to tune an SVM model by optimizing the cross-validation accuracy. Maximizing this value doesn't necessarily seem to minimize the number of misclassifications. Can anyone tell me how the cross-validation accuracy is defined? In the output below, for example, cross-validation accuracy is 92.2%, while the number of correctly classified samples is (1476+170)/(1476+170+4) = 99.7% !?
Thanks for any help.
Regards - Ton
Percent correctly classified is an improper scoring rule. The percentage can be maximized by a model whose predicted values are bogus. In addition, one can add a very important predictor and see the percentage actually decrease.
Frank Harrell
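
One way to see the point about percent correctly classified is to compare it with the Brier score, which is a proper scoring rule. The sketch below is only an illustration: the class counts come from the contingency table in this thread, but both sets of predicted probabilities (`p_constant`, `p_informative`) are hypothetical and were not produced by any model discussed here.

```python
# Class counts taken from the contingency table in the thread.
n_false, n_true = 1476, 174
y = [0] * n_false + [1] * n_true  # 1 means "true"


def accuracy(y, p, threshold=0.5):
    """Fraction of cases whose thresholded prediction matches the label."""
    hits = sum((pi >= threshold) == bool(yi) for yi, pi in zip(y, p))
    return hits / len(y)


def brier(y, p):
    """Mean squared difference between predicted probability and outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)


# Hypothetical model A: ignores the data and always predicts probability 0.
p_constant = [0.0] * len(y)
# Hypothetical model B: separates the classes but never crosses 0.5.
p_informative = [0.05] * n_false + [0.45] * n_true

acc_constant = accuracy(y, p_constant)
acc_informative = accuracy(y, p_informative)
brier_constant = brier(y, p_constant)
brier_informative = brier(y, p_informative)

print("accuracy:", acc_constant, acc_informative)
print("Brier score:", brier_constant, brier_informative)
```

Both hypothetical models classify every case as "false", so their accuracy is identical, yet the Brier score is clearly lower (better) for the one whose probabilities carry information about the class.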
---
Parameters:
SVM-Type: C-classification
SVM-Kernel: radial
cost: 8
gamma: 0.007
Number of Support Vectors: 1015
( 148 867 )
Number of Classes: 2
Levels: false true
5-fold cross-validation on training data:
Total Accuracy: 92.24242
Single Accuracies:
90 93.33333 94.84848 92.72727 90.30303
Contingency Table
           predclasses
origclasses false true
      false  1476    0
      true      4  170
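
The two figures in the question can be reproduced from the output above with a little arithmetic. The sketch below assumes the five folds are of equal size, so that the total accuracy is the plain mean of the single accuracies. It also assumes, which the thread does not confirm, that the contingency table compares predictions on the training data itself rather than on held-out folds. That would explain why it looks so much better than the cross-validated figure.

```python
# Per-fold accuracies (percent) as printed in the summary output.
single_accuracies = [90, 93.33333, 94.84848, 92.72727, 90.30303]

# Assumption: equal-sized folds, so the total is the simple mean.
mean_fold_accuracy = sum(single_accuracies) / len(single_accuracies)

# Contingency table keyed by (original class, predicted class).
table = {
    ("false", "false"): 1476,
    ("false", "true"): 0,
    ("true", "false"): 4,
    ("true", "true"): 170,
}

# Cases on the diagonal are the correctly classified ones.
correct = sum(n for (orig, pred), n in table.items() if orig == pred)
total = sum(table.values())
table_accuracy = 100 * correct / total

print("mean of fold accuracies (%):", round(mean_fold_accuracy, 5))
print("accuracy from the table (%):", round(table_accuracy, 2))
```

The mean of the fold accuracies matches the reported total accuracy of about 92.24%, while the table gives 1646 correct out of 1650, or about 99.76%.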
______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
-- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University