Hi Jo,

Thanks very much for your reply, which is very helpful! I will look further into the link you sent, try leave-two/three-runs-out cross-validation, and see what I get. I'll keep you updated.
Best,
Meng

> Date: Mon, 26 Nov 2012 11:09:53 -0600
> From: jet...@artsci.wustl.edu
> To: pkg-exppsy-pymvpa@lists.alioth.debian.org
> Subject: Re: [pymvpa] FW: What does a classification accuracy that is
> significantly lower than chance level mean?
>
> Sorting out below-chance accuracy is really vexing. If you haven't seen
> it before, this topic has been discussed on this (and other) mailing
> lists before; see the thread at
> http://comments.gmane.org/gmane.comp.ai.machine-learning.pymvpa/611 .
> Googling "below-chance accuracy" also brings up some useful links.
>
> I have seen this phenomenon (the permutation distribution looks reasonably
> normal and centered near chance, but the true-labeled accuracy falls in
> the left tail) occasionally in my own data.
>
> I don't have a good explanation for this, but tend to think it has to do
> with data that doesn't make a linear-SVM-friendly shape in hyperspace.
> As is typical in MVPA, you don't have a huge number of examples
> (particularly if you have more than a hundred or so voxels in the ROI),
> which can also make the classification results unstable.
>
> If you are reasonably sure that the dataset is good (the examples are
> properly labeled, the ROI masks fit well, etc.), then I would try altering
> the cross-validation scheme to see if you can get the individual
> accuracies at (or above!) chance. For example, I'd try leaving two or
> three runs out instead of just one for the cross-validation. Having a
> small testing set (as you do with leave-one-run-out) can introduce a lot
> of variance across the cross-validation folds (i.e. the accuracies of the
> 6 classifiers going into each person's overall accuracy). Things often
> seem to go better when all the cross-validation folds have fairly similar
> accuracies (0.55, 0.6, 0.59, ...) rather than widely variable ones (0.5,
> 0.75, 0.6, ...).
>
> Good luck, and I'd love to hear if you find a solution.
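[The leave-two/three-runs-out scheme suggested above amounts to holding out every combination of p runs as the test set. A minimal sketch in plain Python, assuming six runs as in the data discussed in this thread; `leave_p_runs_out` is a hypothetical helper for illustration, not PyMVPA's actual API:]

```python
from itertools import combinations

def leave_p_runs_out(runs, p):
    """Yield (train_runs, test_runs) pairs, holding out every
    combination of p runs as the test set."""
    for test in combinations(runs, p):
        train = [r for r in runs if r not in test]
        yield train, list(test)

# Six runs, as in the dataset discussed in this thread.
runs = [1, 2, 3, 4, 5, 6]

# Leave-one-run-out gives 6 folds; leave-two-runs-out gives 15 folds,
# each with a test set twice as large (and so less variable per fold).
folds_1 = list(leave_p_runs_out(runs, 1))
folds_2 = list(leave_p_runs_out(runs, 2))
```

[Note the trade-off: larger test sets reduce per-fold variance, but each classifier is trained on fewer runs.]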
> Jo
>
> On 11/26/2012 7:21 AM, Meng Liang wrote:
> > Dear Yaroslav,
> >
> > I'm still puzzled by the results of classification accuracy lower than
> > chance level. I've provided some details in answer to your questions in
> > my previous email, and I hope you can help me understand this puzzle.
> > Many thanks in advance!
> >
> > Best,
> > Meng
> >
> > ------------------------------------------------------------------------
> > From: meng.li...@hotmail.co.uk
> > To: pkg-exppsy-pymvpa@lists.alioth.debian.org
> > Date: Sat, 10 Nov 2012 19:19:19 +0000
> > Subject: Re: [pymvpa] FW: What does a classification accuracy that is
> > significantly lower than chance level mean?
> >
> > Dear Yaroslav,
> >
> > Thanks very much for your reply! Please see below for details.
> >
> > > > I'm running MVPA on some fMRI data (four different stimuli, say A, B,
> > > > C and D; six runs in each subject) to see whether the BOLD signals
> > > > from a given ROI can successfully predict the type of the stimulus.
> > > > The MVPA (leave-one-run-out cross-validation) was performed on each
> > > > subject for each two-way classification task. In a particular
> > > > classification task (say classification A vs. B), in some subjects,
> > > > the classification accuracy was (almost) significantly LOWER than
> > > > chance level (somewhere between 0.2 and 0.4).
> > >
> > > depending on number of trials/cross-validation scheme even values of 0
> > > could come up by chance ;-) but indeed should not be 'significant'
> > >
> > > > What could be the reason for a significantly-lower-than-chance-level
> > > > accuracy?
> > >
> > > and how significant is this 'significantly LOWER'?
> >
> > The significance level was assessed by the P value obtained from 10,000
> > permutations.
> > Permutations were done within each subject, by randomly
> > assigning stimulus labels to each trial (the number of trials under each
> > label was still balanced; there were 8 trials per condition in each run,
> > and there were six runs in total). The P value was calculated as the
> > proportion of random permutations in which the resulting classification
> > accuracy was higher than the actual classification accuracy obtained
> > from the correct labels (for example, if none of the 10,000 random
> > permutations led to a classification accuracy higher than the actual
> > classification accuracy, the P value would be 0). In this way, in
> > 5 out of 14 subjects, the P values were greater than 0.95. In other
> > words, the actual classification accuracy was located near the end of
> > the left tail of the null distribution in these 5 subjects (the shape of
> > the null distribution is bell-like, centered around 50%). In the other 9
> > subjects, the actual classification accuracies were near or higher than
> > chance level.
> >
> > > details of # trials/cross-validation?
> >
> > There were 8 trials per condition in each run, and there were six runs
> > in total. Leave-one-run-out cross-validation was performed, that is, the
> > classifier (linear SVM) was trained on the data obtained from five runs
> > and tested on the remaining run (the same procedure was repeated six
> > times, each time using a different run as the testing dataset).
> >
> > > > The P value was obtained from 10,000 permutations.
> > >
> > > is that permutations within the subject which at the end showed
> > > significant below-chance accuracy? how were the permutations done?
> >
> > I hope the reply above provides enough details of how the permutations
> > were done. Please let me know if anything is unclear.
> >
> > > > But the accuracies of all other classifications look fine in all
> > > > subjects.
> > >
> > > fine means all above chance or still distributed around chance?
> > By 'fine' I mean the classification accuracy was around chance level
> > (i.e. not far from chance, whether slightly lower or higher) or above
> > chance level. To me, around or above chance level makes more sense than
> > significantly lower than chance level.
> >
> > Thanks,
> > Meng
>
> --
> Joset A. Etzel, Ph.D.
> Research Analyst
> Cognitive Control & Psychopathology Lab
> Washington University in St. Louis
> http://mvpa.blogspot.com/
>
> _______________________________________________
> Pkg-ExpPsy-PyMVPA mailing list
> Pkg-ExpPsy-PyMVPA@lists.alioth.debian.org
> http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-exppsy-pymvpa
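[The p-value calculation described in this thread (the proportion of the 10,000 permutations whose accuracy exceeds the true-label accuracy) can be sketched in plain Python. This is an illustration with a made-up bell-shaped null distribution standing in for real permutation accuracies; `permutation_p_value` and all the numbers are hypothetical:]

```python
import random

def permutation_p_value(true_accuracy, null_accuracies):
    """Proportion of permuted-label accuracies that exceed the
    accuracy obtained with the correct labels."""
    higher = sum(acc > true_accuracy for acc in null_accuracies)
    return higher / len(null_accuracies)

# Toy null distribution: 10,000 accuracies centered near chance (0.5),
# mimicking the bell shape described above.
random.seed(0)
null = [random.gauss(0.5, 0.05) for _ in range(10000)]

# A true accuracy deep in the left tail gives a p value near 1.0,
# i.e. almost every permutation beats the correct labels.
p_left_tail = permutation_p_value(0.35, null)

# A true accuracy in the right tail gives a p value near 0.
p_right_tail = permutation_p_value(0.65, null)
```

[Under this convention, the 5 subjects with p > 0.95 correspond to the first case: the true-labeled accuracy sits at the far left of the null distribution.]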