Sorting out below-chance accuracy is really vexing. If you haven't seen it before, this topic has been discussed on this (and other) mailing lists before; see the thread at http://comments.gmane.org/gmane.comp.ai.machine-learning.pymvpa/611 . Googling "below-chance accuracy" also brings up some useful links.

I have seen this phenomenon (permutation distribution looks reasonably normal and centered near chance, but true-labeled accuracy in the left tail) occasionally in my own data.


I don't have a good explanation for this, but I tend to think it has to do with data that doesn't make a linear-SVM-friendly shape in hyperspace. As is typical in MVPA, you don't have a huge number of examples (particularly if you have more than a hundred or so voxels in the ROI), which can also make the classification results unstable.

If you are reasonably sure that the dataset is good (the examples are properly labeled, the ROI masks fit well, etc.), then I would try altering the cross-validation scheme to see if you can get the individual accuracies at (or above!) chance. For example, I'd try leaving two or three runs out instead of just one for the cross-validation. Having a small testing set (like you do with leave-one-run-out) can introduce a lot of variance into the cross-validation folds (i.e. the accuracy for each of the 6 classifiers going into each person's accuracy). Things often seem to go better when all the cross-validation folds have fairly similar accuracies (0.55, 0.6, 0.59, ...) rather than widely variable ones (0.5, 0.75, 0.6, ...).
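To illustrate what "not a linear-SVM-friendly shape" can mean, here is a toy sketch (not the poster's data, and not PyMVPA code): for XOR-arranged points, no linear decision rule can exceed 75% accuracy, no matter how it is fit. Brute-forcing a grid of linear classifiers makes the ceiling visible:

```python
import itertools
import numpy as np

# XOR-style toy data: the two classes cannot be split by any straight line.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

best = 0.0
# Brute-force a grid of linear decision rules of the form sign(w . x + b).
for w1, w2, b in itertools.product(np.linspace(-2, 2, 41), repeat=3):
    pred = (X @ np.array([w1, w2]) + b > 0).astype(int)
    best = max(best, float((pred == y).mean()))

print(best)  # 0.75 -- no linear rule classifies all four XOR points
```

With few examples and many voxels, a linear SVM fit to geometry like this can easily land at or below chance on held-out data.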
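The scheme change above can be sketched in a few lines (a hypothetical illustration, not the PyMVPA partitioner API): with 6 runs, leaving two out yields C(6, 2) = 15 folds, each with a test set twice the size, and the spread of per-fold accuracies is worth checking directly:

```python
import itertools
from statistics import pstdev

runs = [1, 2, 3, 4, 5, 6]

def leave_n_runs_out(runs, n):
    """Yield (train_runs, test_runs) pairs with n runs held out per fold."""
    for test in itertools.combinations(runs, n):
        train = [r for r in runs if r not in test]
        yield train, list(test)

folds = list(leave_n_runs_out(runs, 2))
print(len(folds))  # 15 folds, each testing on two runs

# Compare the fold-accuracy spread of the two patterns described above.
stable = [0.55, 0.60, 0.59]
erratic = [0.50, 0.75, 0.60]
print(pstdev(stable), pstdev(erratic))  # the erratic folds vary far more
```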

Good luck, and I'd love to hear if you find a solution.

Jo

On 11/26/2012 7:21 AM, Meng Liang wrote:

Dear Yaroslav,

I'm still puzzled by the results of classification accuracy lower than chance level. I've provided some details to your questions in my previous email, and I hope you can help me understand this puzzle. Many thanks in advance!

Best,
Meng

------------------------------------------------------------------------
From: meng.li...@hotmail.co.uk
To: pkg-exppsy-pymvpa@lists.alioth.debian.org
Date: Sat, 10 Nov 2012 19:19:19 +0000
Subject: Re: [pymvpa] FW: What does a classification accuracy that is significantly lower than chancel level mean?

Dear Yaroslav,

Thanks very much for your reply! Please see below for details.

> > I'm running MVPA on some fMRI data (four different stimuli, say A, B, C
> > and D; six runs in each subject) to see whether the BOLD signals from a
> > given ROI can successfully predict the type of the stimulus. The MVPA
> > (leave-one-run-out cross-validation) was performed on each subject for
> > each two-way classification task. In a particular classification task (say
> > classification A vs. B), in some subjects, the classification accuracy was
> > (almost) significantly LOWER than the chance level (somewhere between 0.2
> > and 0.4).
>
> depending on number of trials/cross-validation scheme even values of 0
> could come up by chance ;-) but indeed should not be 'significant'
>
> > What could be the reason for a significantly-lower-than-chance-level
> > accuracy?
>
> and how significant is this 'significantly LOWER'?

The significance level was assessed by the P value obtained from 10,000 permutations. Permutation was done within each subject, by randomly assigning stimulus labels to each trial (the number of trials under each label was still balanced; there were 8 trials per condition in each run, and there were six runs in total). The P value was calculated as the percentage of random permutations in which the resultant classification accuracy was higher than the actual classification accuracy obtained from the correct labels (for example, if none of the 10,000 random permutations led to a classification accuracy that was higher than the actual classification accuracy, the P value would be 0). In this way, in 5 out of 14 subjects, the P values were greater than 0.95. In other words, the actual classification accuracy was located around the end of the left tail of the null distribution in these 5 subjects (the shape of the null distribution is like a bell, centered around 50%). In the other 9 subjects, the actual classification accuracies were near or higher than chance level.

> details of # trials/cross-validation?

There were 8 trials per condition in each run, and there were six runs in total. Leave-one-run-out cross-validation was performed, that is, the classifier (linear SVM) was trained on the data obtained from five runs and tested on the remaining run (repeating the same procedure six times, each time using a different run as the testing dataset).

> > The P value was obtained from 10,000 permutations.
>
> is that permutations within the subject which at the end showed
> significant below 0? how permuations were done?

I hope the reply above provides enough details of how the permutation was done. Please let me know if there is anything unclear.

> > But the
> > accuracies of all other classifications look fine in all subjects.
>
> fine means all above chance or still distributed around chance?

By 'fine' I mean the classification accuracy was around (i.e. not far from the chance level, can be lower or higher than chance level) or above chance level. To me, around or above chance level makes more sense than significantly lower than chance level.

Thanks,
Meng
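The permutation scheme described in the quoted email can be sketched as follows (a hypothetical illustration, not PyMVPA code; `accuracy_fn` stands in for the full cross-validated classification pipeline). A true-label accuracy in the left tail of the null distribution yields a P value near 1, as with the 5 subjects above:

```python
import random

def permutation_p_value(labels, accuracy_fn, n_perm=10_000, seed=0):
    """Fraction of label permutations whose accuracy beats the true labels."""
    rng = random.Random(seed)
    true_acc = accuracy_fn(labels)
    exceed = 0
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)  # shuffling keeps the label counts balanced
        if accuracy_fn(shuffled) > true_acc:
            exceed += 1
    return exceed / n_perm  # 0 if no permutation beats the true labels
```

For example, if the classifier agrees perfectly with the true labels, no permutation can do strictly better, so the P value is 0; conversely, a below-chance true accuracy is beaten by most permutations, pushing the P value toward 1.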

--
Joset A. Etzel, Ph.D.
Research Analyst
Cognitive Control & Psychopathology Lab
Washington University in St. Louis
http://mvpa.blogspot.com/

_______________________________________________
Pkg-ExpPsy-PyMVPA mailing list
Pkg-ExpPsy-PyMVPA@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-exppsy-pymvpa