Dear All,

I'm classifying some data with various methods (binary classification). I'm 
interpreting the results via a confusion matrix, from which I calculate the 
sensitivity and the false discovery rate (FDR). The classifiers are trained on 
575 data points and my test set has 50 data points.

I'd like to calculate, for each classifier, p-values for obtaining an FDR at 
least as low and a sensitivity at least as high as the observed ones. I was 
thinking about shuffling/bootstrapping the labels of the test set, classifying 
them and calculating the p-values from the resulting random FDR and 
sensitivity values, which I'd assume to be normally distributed.

The problem is that it's rather slow when running many rounds of 
shuffling/classification (I'd like to do this for many classifiers and 
parameter combinations). In addition, classifying the 50 test data points 
with shuffled labels realistically produces only a very limited number of 
possible FDRs and sensitivities, and I'm wondering whether I can really 
believe these values to be normally distributed.
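To illustrate how few values are possible: if the predictions are held fixed 
and only the labels are shuffled (an assumption on my part), the number of 
true positives follows a hypergeometric distribution, and both sensitivity 
and FDR are determined by that single count. A minimal sketch in Python, with 
hypothetical counts and function names of my own choosing:

```python
from math import comb

def tp_distribution(n_total, n_pos, n_pred_pos):
    """Exact distribution of the true-positive count under label shuffling.

    n_total:    size of the test set
    n_pos:      number of truly positive test points
    n_pred_pos: number of test points the classifier calls positive
    """
    lo = max(0, n_pred_pos - (n_total - n_pos))
    hi = min(n_pos, n_pred_pos)
    denom = comb(n_total, n_pred_pos)
    return {
        k: comb(n_pos, k) * comb(n_total - n_pos, n_pred_pos - k) / denom
        for k in range(lo, hi + 1)
    }

def exact_p_value(n_total, n_pos, n_pred_pos, tp_observed):
    """P(TP >= tp_observed) under shuffled labels.

    With n_pos and n_pred_pos fixed, sensitivity = TP / n_pos rises and
    FDR = 1 - TP / n_pred_pos falls as TP grows, so this one tail
    probability covers both "sensitivity >= observed" and "FDR <= observed".
    """
    dist = tp_distribution(n_total, n_pos, n_pred_pos)
    return sum(p for k, p in dist.items() if k >= tp_observed)

# Hypothetical counts: 50 test points, 20 positives, 19 predicted positive.
dist = tp_distribution(50, 20, 19)
n_possible_values = len(dist)  # number of distinct sensitivity/FDR pairs
p_exact = exact_p_value(50, 20, 19, tp_observed=16)
```

This needs no shuffling rounds at all, but it only applies under the 
fixed-prediction assumption stated above.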

Basically I'm looking for a way to calculate the p-values analytically. I'd be 
happy for any suggestions, web addresses or references.

        kind regards,

        Arne

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html