This is exactly the same as your March 19 post to which I replied on March 22. Did you see my reply? In it I tried to clarify the ROC concept in order to rectify what appeared to be your lack of understanding.
Please read my reply in Google Groups and respond if you disagree or don't understand. Meanwhile, without referring back to my previous reply, in order to give a less biased response, I'll add a few more comments below.

"Koen Vermeer" <[EMAIL PROTECTED]> wrote in message news:<[EMAIL PROTECTED]>...

> I am having conceptual problems with cross-validating an ROC.

Me too. I have never heard of such a concept.

> The thing is that for me the only reason to draw an ROC is to show the
> individual fpr/tpr pairs, so one can choose the optimal setting for a
> specific application (depending on prevalence, cost of FP/FN, etc). So,
> in fact, the ROC just shows various algorithms, and you choose one that
> suits you best.

No. An ROC (Receiver Operating Characteristic) curve is a *single* curve that shows, for *one* algorithm created with a design data set, the variation in performance on an independent validation data set with respect to changes in a single parameter (typically a classification threshold). The result of changing a single parameter setting should *not* be referred to as "another algorithm". (A short numerical sketch of this appears at the end of this post.)

> The thing with validation is that it is supposed to be done on the
> final algorithm, not on some intermediate result.

According to commonly accepted neural network terminology (see the comp.ai.neural-nets FAQ), validation is performed on multiple intermediate designs in order to pick the "best" one. Testing is then performed on the chosen design to quantify its generalization performance on independent test data that was not used for design or validation. (See the second sketch at the end of this post.)

> More detailed:
> Consider algorithm A. It tests a number of algorithms (1..N) and chooses
> the best one (say number i). Even if algorithm A uses cross-validation
> to train and test all N algorithms, we cannot say that the error rate of
> algorithm A is the same as the estimated error rate of algorithm i.

What you have done is introduce a novel concept: estimating the error of an algorithm (A) that designs and validates other algorithms (1..N). I don't know whether you meant to do this or have misinterpreted what is done conventionally. If you meant to do this, you are in uncharted territory. If A chooses i based on validation set performance (say 25%), and testing shows that i also has the best test set performance (say 30%), then the error rate of A is 0. This does not appear to be a very useful concept, and it is probably not what you want.

> So, we cross-validate algorithm A: We use a data set to train it (and
> thus to select i)

This training data set consists of two subsets: a design data set and an independent validation data set. However, it is not clear how you are distinguishing algorithm i from algorithm j.

> Now, if we compare this to the ROC, the ROC is like the outcome of all
> N algorithms.

No. An ROC summarizes the performance of *one* algorithm with respect to varying a single parameter. You are comparing algorithm i with an ROC that comes from where?

> Based on the application, one would choose the best algorithm.

Are you now referring to using "the" ROC to choose the best algorithm, or to choosing the algorithm that yields the best ROC?

> Cross-validation is therefore not possible before this selection has
> been made.

I'm lost.

> On the other hand, one could of course 'cross-validate' the ROC. For
> example, the ROCs of the several folds could be averaged in some way, or
> the individual tpr/fpr pairs could be cross-validated.
>
> I would appreciate any comments on this!
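To make the first point concrete, here is a minimal sketch in Python. It is not from the original discussion; the data (scores, labels) are synthetic stand-ins for the validation-set outputs of one trained classifier. The point is only that every (fpr, tpr) pair on the curve comes from the *same* algorithm with a different threshold.

import numpy as np

# Synthetic stand-in for the validation-set outputs of ONE trained
# classifier: 'scores' are its continuous outputs, 'labels' the true
# classes (1 = positive, 0 = negative).
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=200)
scores = labels + rng.normal(0.0, 0.8, size=200)  # noisy but informative

# Sweep a single parameter -- the classification threshold.
thresholds = np.sort(np.unique(scores))[::-1]
P = np.sum(labels == 1)
N = np.sum(labels == 0)

roc = []
for t in thresholds:
    predicted_pos = scores >= t
    tpr = np.sum(predicted_pos & (labels == 1)) / P  # true positive rate
    fpr = np.sum(predicted_pos & (labels == 0)) / N  # false positive rate
    roc.append((fpr, tpr))

# A few points along the single resulting curve:
for fpr, tpr in roc[::20]:
    print("fpr = %.3f, tpr = %.3f" % (fpr, tpr))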
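And a second sketch, under the same caveats, of the design/validation/test terminology. The candidate designs here are hypothetical (linear discriminants differing only in ridge strength), but any family of intermediate designs would do: validation picks among them, and the untouched test set quantifies the generalization of the one that was picked.

import numpy as np

rng = np.random.default_rng(1)

def make_data(n):
    # Hypothetical two-class data: the class-1 mean is shifted.
    y = rng.integers(0, 2, size=n)
    x = rng.normal(0.0, 1.0, size=(n, 2)) + y[:, None]
    return x, y

# Three independent subsets, per the terminology above.
x_design, y_design = make_data(200)  # used to fit each candidate
x_val,    y_val    = make_data(100)  # used to PICK the best candidate
x_test,   y_test   = make_data(100)  # used ONCE, to quantify generalization

def fit(x, y, ridge):
    # Regularized least-squares linear discriminant (one "design").
    A = np.hstack([x, np.ones((len(x), 1))])
    return np.linalg.solve(A.T @ A + ridge * np.eye(3), A.T @ (2 * y - 1))

def error(w, x, y):
    A = np.hstack([x, np.ones((len(x), 1))])
    return np.mean((A @ w >= 0).astype(int) != y)

# Candidate intermediate designs differ by a hyperparameter.
candidates = [fit(x_design, y_design, r) for r in (0.01, 1.0, 100.0)]

# Validation is performed on multiple intermediate designs ...
best = min(candidates, key=lambda w: error(w, x_val, y_val))

# ... and testing quantifies the chosen design's generalization.
print("test error of chosen design: %.3f" % error(best, x_test, y_test))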
I will comment on the alternative you propose in the next-to-last paragraph in a separate post.

Hope this helps.

Greg
