Ton van Daelen wrote:
Hi all -

I am trying to tune an SVM model by optimizing the cross-validation
accuracy. Maximizing this value doesn't necessarily seem to minimize the
number of misclassifications. Can anyone tell me how the
cross-validation accuracy is defined? In the output below, for example,
cross-validation accuracy is 92.2%, while the number of correctly
classified samples is (1476+170)/(1476+170+4) = 99.7% !?

Thanks for any help.

Regards - Ton

Percent correctly classified is an improper scoring rule: it can be maximized by a model whose predicted values are bogus. In addition, one can add a very important predictor and see the percentage actually decrease.


Frank Harrell
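
As a rough illustration of the point about improper scoring rules (not part of the original exchange), the Python sketch below uses the class counts from the contingency table in this thread, 1476 `false` and 174 `true`. The two sets of predicted probabilities, the helper names, and the choice of the Brier score as the comparison are assumptions made up for illustration.

```python
# Class counts taken from the contingency table in this thread.
n_false, n_true = 1476, 174
n = n_false + n_true

# A "bogus" classifier that ignores every predictor and always says "false".
acc_always_false = n_false / n

def accuracy(probs, labels, cutoff=0.5):
    """Fraction of cases where (p >= cutoff) agrees with the 0/1 label."""
    hits = sum((p >= cutoff) == bool(y) for p, y in zip(probs, labels))
    return hits / len(labels)

def brier(probs, labels):
    """Mean squared error of the predicted probabilities (a proper score)."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)

labels = [0] * n_false + [1] * n_true

# Hypothetical model A: confident, well-separated probabilities.
probs_a = [0.05] * n_false + [0.95] * n_true
# Hypothetical model B: probabilities barely on the right side of 0.5.
probs_b = [0.49] * n_false + [0.51] * n_true

print(f"always-'false' accuracy: {acc_always_false:.1%}")
for name, probs in (("A", probs_a), ("B", probs_b)):
    print(f"model {name}: accuracy={accuracy(probs, labels):.3f}, "
          f"Brier={brier(probs, labels):.4f}")
```

With these made-up numbers, the no-information classifier already scores close to 90% because of the class imbalance, and accuracy cannot tell model A from model B, while the Brier score clearly prefers A.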


---
Parameters:
   SVM-Type:  C-classification
 SVM-Kernel:  radial
       cost:  8
      gamma:  0.007


Number of Support Vectors:  1015

 ( 148 867 )

Number of Classes: 2

Levels: false true

5-fold cross-validation on training data:

Total Accuracy: 92.24242
Single Accuracies:
 90 93.33333 94.84848 92.72727 90.30303


Contingency Table
           predclasses
origclasses false true
      false 1476     0
      true     4   170
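
The two figures in the question can be reproduced from the output above. The minimal Python sketch below does the arithmetic. It assumes, and this is not stated in the thread, that the contingency table comes from predicting the same training data the model was fit on, whereas the 92.2% is averaged over held-out folds. If that assumption holds, the two numbers measure different things and need not agree.

```python
# Per-fold accuracies (percent) as printed under "Single Accuracies".
fold_acc = [90, 93.33333, 94.84848, 92.72727, 90.30303]
cv_accuracy = sum(fold_acc) / len(fold_acc)

# Contingency table counts, keyed by (original class, predicted class).
table = {
    ("false", "false"): 1476, ("false", "true"): 0,
    ("true", "false"): 4,     ("true", "true"): 170,
}
correct = sum(count for (orig, pred), count in table.items() if orig == pred)
total = sum(table.values())
table_accuracy = 100 * correct / total

print(f"mean of fold accuracies: {cv_accuracy:.5f}")
print(f"accuracy from table:     {table_accuracy:.2f}")
```

The mean of the five fold accuracies matches the reported Total Accuracy of 92.24242, and the table gives 1646 correct out of 1650, about 99.76%.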

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html



--
Frank E Harrell Jr   Professor and Chair           School of Medicine
                     Department of Biostatistics   Vanderbilt University
