I believe below-chance accuracy is a natural phenomenon in classification 
theory. It becomes obvious when one looks at the permutation distribution for 
a given problem: in a two-category problem, the distribution has a peak 
around 50% accuracy, so there will always be some (or a lot of) below-chance 
values. This is more likely to happen when the dataset has few samples, 
and probably also when the data are high-dimensional. I am not sure whether 
any procedure to relabel the data, or any other fine-tuned algorithm, would 
be considered 'tweaking the results'. 
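
To give a feel for how wide that spread around 50% can be, here is a minimal 
sketch that simulates a classifier guessing at chance. It assumes independent 
trials and a plain binomial model, which cross-validated fMRI data will not 
satisfy exactly; the function name `chance_accuracies` and the 42% threshold 
are only illustrative choices, not anything from PyMVPA.

```python
import random


def chance_accuracies(n_trials, n_sims=5000, p_chance=0.5, seed=0):
    """Simulate accuracies of a classifier that guesses at chance level.

    Assumption: every trial is an independent coin flip with success
    probability p_chance (a simplification of real cross-validation).
    """
    rng = random.Random(seed)
    accs = []
    for _ in range(n_sims):
        correct = sum(rng.random() < p_chance for _ in range(n_trials))
        accs.append(correct / n_trials)
    return accs


# Fewer trials -> wider null distribution -> more "below-chance" results.
for n in (20, 80, 320):
    accs = chance_accuracies(n)
    frac_low = sum(a <= 0.42 for a in accs) / len(accs)
    print(f"n={n:4d}: fraction of chance runs at or below 42% = {frac_low:.3f}")
```

The fraction of low accuracies should shrink as the number of trials grows, 
which is the point about small datasets made above.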

>My question is this: How much (statistical?) merit would there be in coming 
>up with some sort of index to show how far a given classification accuracy 
>is from absolute chance for this classification?

A p-value from a permutation test is a better candidate for answering this 
question, e.g., p<0.01.
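
As a rough illustration of what I mean, here is a minimal permutation-test 
sketch in plain Python. Everything in it is an assumption made for the 
example: the toy data, the nearest-class-mean classifier, leave-one-out 
cross-validation, and the helper names `loo_accuracy` and 
`permutation_p_value`. It does not use PyMVPA, and for real fMRI data the 
labels should probably be shuffled in a way that respects the run structure.

```python
import random


def loo_accuracy(X, y):
    """Leave-one-out accuracy of a nearest-class-mean classifier."""
    n, n_feat = len(y), len(X[0])
    correct = 0
    for i in range(n):
        # Accumulate per-class feature sums from all samples except i.
        sums, counts = {}, {}
        for j in range(n):
            if j == i:
                continue
            s = sums.setdefault(y[j], [0.0] * n_feat)
            for k in range(n_feat):
                s[k] += X[j][k]
            counts[y[j]] = counts.get(y[j], 0) + 1
        # Predict the class whose mean is closest to the held-out sample.
        best_label, best_dist = None, float("inf")
        for label, s in sums.items():
            dist = sum((X[i][k] - s[k] / counts[label]) ** 2
                       for k in range(n_feat))
            if dist < best_dist:
                best_label, best_dist = label, dist
        correct += best_label == y[i]
    return correct / n


def permutation_p_value(X, y, n_perm=200, seed=0):
    """One-sided p-value: how often shuffled labels do at least as well."""
    rng = random.Random(seed)
    observed = loo_accuracy(X, y)
    labels = list(y)
    null = []
    for _ in range(n_perm):
        rng.shuffle(labels)
        null.append(loo_accuracy(X, labels))
    # Count the observed value itself so p can never be exactly zero.
    p = (1 + sum(a >= observed for a in null)) / (n_perm + 1)
    return observed, null, p


# Toy data (hypothetical): 2 classes, 20 samples each, 5 features,
# with a class difference in the first feature only.
rng = random.Random(1)
y = [0] * 20 + [1] * 20
X = [[rng.gauss(2.5 * label if k == 0 else 0.0, 1.0) for k in range(5)]
     for label in y]

observed, null, p = permutation_p_value(X, y)
print(f"observed accuracy = {observed:.2f}, p = {p:.3f}")
print(f"null accuracies range from {min(null):.2f} to {max(null):.2f}")
```

The list `null` is the permutation distribution mentioned above; it will 
typically contain values on both sides of 50%, and the p-value states how 
unusual the observed accuracy is relative to it.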

Regards,
-Rawi


>________________________________
> From: Jacob Itzhacki <[email protected]>
>To: [email protected] 
>Sent: Thursday, January 31, 2013 9:20 AM
>Subject: [pymvpa] On below-chance classification (Anti-learning, encore)
> 
>
>Dear all,
>
>
>First off, pardon me if anything I say has already been described 
>somewhere else. I've done quite a bit of searching and reading on the subject 
>(e.g., including Dr. Kowalczyk's lecture), but it is always possible to have 
>missed something in this internet age. After reading as much as I could 
>about the problem, I've noticed that the workarounds proposed don't really 
>fix the problem, which I am facing quite a bit, to the point that around 1/3 
>of classifications are below chance level (38-42% for 2-way or 17-19% 
>for 4-way). I would like to have some feedback on an idea I've had to try to 
>still make this data useful.
>
>
>My question is this: How much (statistical?) merit would there be in coming 
>up with some sort of index to show how far a given classification accuracy 
>is from absolute chance for this classification?
>
>
>To elaborate, the index would be the absolute value of the difference 
>between the resulting accuracy and chance level. Say, for a 2-way 
>classification (with 50% chance level) in which you obtain accuracies of 38% 
>and 62% in 2 different instances, the difference from chance for both would 
>be 12%, which would make them equivalent.
>
>
>Please offer as much criticism as you can to this approach.
>
>
>Thanks in advance,
>
>
>Jacob
>
>
>
>
>PS. For completeness' sake, I'll first list the things I've tried.
>
>
>I'm running the classification on fMRI data obtained from a paradigm that 
>gives the following classification opportunities:
>
>
>a. 4 categories, with 40 trials each at its fullest use (160 trials)
>b. 2 categories, with 80 trials each, by merging pairs of the original 
>categories into one.
>c. 2 categories, with 40 trials each, by disregarding 2 of the conditions.
>
>
>I am also using a total of 8 different ROIs.
>
>
>I have tried reordering the trials for one of the subjects; however, this 
>results in above-chance accuracies in one analysis and below-chance in the 
>other for the same ROI, which gets rather frustrating if I want to do some 
>sort of averaging at the end. However, there seems to be some consistency in 
>which classifications move away from chance, which leads me once again to 
>believe that there is in fact some learning even in the below-chance 
>classifications, but the seeming anti-learning baffles me. What does it 
>mean?! (And how is it even possible? O.o)
>
>
>Thanks again.
>_______________________________________________
>Pkg-ExpPsy-PyMVPA mailing list
>[email protected]
>http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-exppsy-pymvpa
>
>
