I believe below-chance accuracy is a natural phenomenon in classification theory. It becomes obvious when one computes the permutation distribution for a problem: in a two-category problem the distribution peaks around 50% accuracy, so there will always be some (possibly many) below-chance values. This is more likely to happen when the dataset has few samples, and probably also when the data are high-dimensional. I am not sure whether a procedure that relabels the data, or any other fine-tuned algorithm, would be considered 'tweaking the results'.
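As a rough illustration of the point above, here is a minimal sketch in plain Python (not PyMVPA code). Everything in it is an assumption made for the example: the toy pure-noise data, the nearest-centroid classifier, the leave-one-pair-out cross-validation, and the function names cv_accuracy and permutation_test. It builds the permutation distribution by shuffling the labels, reports how much of it falls below 50%, and reads one-sided and two-sided p-values off the same distribution.

```python
import random


def cv_accuracy(samples, labels):
    """Leave-one-pair-out accuracy of a nearest-centroid classifier.

    Assumes two balanced classes labelled 0 and 1 (illustrative choice).
    """
    idx0 = [i for i, y in enumerate(labels) if y == 0]
    idx1 = [i for i, y in enumerate(labels) if y == 1]
    correct = 0
    # Each fold holds out one sample per class, so training stays balanced.
    for test0, test1 in zip(idx0, idx1):
        train0 = [samples[i] for i in idx0 if i != test0]
        train1 = [samples[i] for i in idx1 if i != test1]
        centroid0 = [sum(col) / len(train0) for col in zip(*train0)]
        centroid1 = [sum(col) / len(train1) for col in zip(*train1)]
        for test_idx, true_label in ((test0, 0), (test1, 1)):
            x = samples[test_idx]
            d0 = sum((a - b) ** 2 for a, b in zip(x, centroid0))
            d1 = sum((a - b) ** 2 for a, b in zip(x, centroid1))
            predicted = 0 if d0 < d1 else 1
            correct += int(predicted == true_label)
    return correct / (len(idx0) + len(idx1))


def permutation_test(samples, labels, n_permutations=300, chance=0.5, seed=0):
    """Return (observed, null accuracies, one-sided p, two-sided p)."""
    rng = random.Random(seed)
    observed = cv_accuracy(samples, labels)
    shuffled = list(labels)
    null = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any link between data and labels
        null.append(cv_accuracy(samples, shuffled))
    tol = 1e-12  # guards against floating-point ties
    # One-sided: permutations scoring at least as high as the observed value.
    upper = sum(a >= observed - tol for a in null)
    # Two-sided: permutations at least as far from chance, in either direction.
    far = sum(abs(a - chance) >= abs(observed - chance) - tol for a in null)
    p_upper = (upper + 1) / (n_permutations + 1)
    p_two_sided = (far + 1) / (n_permutations + 1)
    return observed, null, p_upper, p_two_sided


if __name__ == "__main__":
    rng = random.Random(42)
    # Pure-noise toy data: 10 trials per class, 20 features, no real signal.
    X = [[rng.gauss(0.0, 1.0) for _ in range(20)] for _ in range(20)]
    y = [0] * 10 + [1] * 10
    observed, null, p_upper, p_two_sided = permutation_test(X, y)
    below = sum(a < 0.5 for a in null) / len(null)
    print("observed accuracy:", observed)
    print("fraction of permutations below 50%:", below)
    print("one-sided p:", p_upper, " two-sided p:", p_two_sided)
```

With so few trials, a sizeable share of the permuted accuracies should land below 50% even though the data contain no signal. The two-sided p-value treats 38% and 62% alike, which is one way to put a number on "distance from chance"; whether a two-sided test is appropriate for a given analysis is a separate judgment call.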
>My question is this: How much (statistical?) merit would it be to come with
>some sort of index to show how much a given classification accuracy is off
>from absolute chance for this classification?

A p-value obtained via permutation testing is the better candidate to answer this question, e.g., p<0.01.

Regards,
-Rawi

>________________________________
> From: Jacob Itzhacki <[email protected]>
>To: [email protected]
>Sent: Thursday, January 31, 2013 9:20 AM
>Subject: [pymvpa] On below-chance classification (Anti-learning, encore)
>
>Dear all,
>
>First off, pardon me if anything of what I say might already be described
>somewhere else, I've done quite a bit of searching and reading on the subject
>(e.g. including Dr. Kowalczyk's lecture) but it is always possible to have
>bypassed something in this internet age. After reading as much as I could
>about the problem I've noticed that the workarounds proposed don't really fix
>the problem, which I am facing quite a bit, to the point that around 1/3 of
>classifications are below chance accuracy (38-42% for 2-way or 17%-19%
>for 4-way). I would like to have some feedback on an idea I've had to try to
>still have this data be useful.
>
>My question is this: How much (statistical?) merit would it be to come with
>some sort of index to show how much a given classification accuracy is off
>from absolute chance for this classification?
>
>Elaborating, it would be displaying the absolute value of the subtraction of
>the resulting accuracy from chance level. Say, for a 2-way classification
>(with 50% chance level), in which you obtain accuracies of 38% and 62% in 2
>different instances the difference from chance for both would be 12% which
>would make them equivalent.
>
>Please offer as much criticism as you can to this approach.
>
>Thanks in advance,
>
>Jacob
>
>PS. For completeness' sake, I'll first list the things I've tried.
>
>I'm running the classification on fMRI data obtained from a paradigm that
>gives the following classification opportunities:
>
>a. 4 categories, with 40 trials each at its fullest use (160 trials)
>b. 2 categories, with 80 trials each, by merging two categories into one.
>c. 2 categories, with 40 trials each, by disregarding 2 of the conditions.
>
>I am also using a total of 8 different ROIs.
>
>I have tried reordering the trials on one of the subjects, however this
>results in above chance accuracies in one analysis and below in the other for
>the same ROI which gets rather frustrating if I wanted to do some sort of
>averaging by the end. However, there seems to be some consistency in which
>classification moves away from chance which leads me once again to believe
>that there is in fact some learning even in the below-chance classifications
>but the seeming anti-learning baffles me. What does it mean?! (And how is it
>even possible? O.o)
>
>Thanks again.
>_______________________________________________
>Pkg-ExpPsy-PyMVPA mailing list
>[email protected]
>http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-exppsy-pymvpa

