testing if classifier accuracy differs significantly

Mark Everingham Sat, 19 Aug 2000 08:21:06 -0700
Dear all,

Help appreciated on this problem:

I have two classifier systems which take as input an image and produce
as output a label for each pixel in the image. For example, the input
might be an outdoor scene, and the labels sky/road/tree etc.

I have a set of images with the correct labels, so I can test how
accurately a classifier performs by calculating for example the mean
number of pixels correctly classified per image or the mean number of
sky pixels correctly classified etc.
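
As a rough illustration of the per-image scores described above, here is a
minimal sketch. It assumes each image is stored as a flat list of label
strings. The helper name fraction_correct and the toy data are hypothetical
and do not come from the original message.

```python
from statistics import mean

def fraction_correct(predicted, truth, label=None):
    """Fraction of pixels labelled correctly in one image.

    If `label` is given, only pixels whose true label equals `label`
    are counted (e.g. accuracy on sky pixels only).
    Returns None when no pixel qualifies.
    """
    pairs = [(p, t) for p, t in zip(predicted, truth)
             if label is None or t == label]
    if not pairs:
        return None
    return sum(p == t for p, t in pairs) / len(pairs)

# Toy data (made up): two tiny "images", each flattened to a list of labels.
truth = [["sky", "sky", "road", "tree"],
         ["sky", "road", "road", "road"]]
pred = [["sky", "road", "road", "tree"],
        ["sky", "road", "tree", "road"]]

# One score per image, then the mean across images.
per_image = [fraction_correct(p, t) for p, t in zip(pred, truth)]
mean_accuracy = mean(per_image)

# The same score restricted to pixels whose true label is "sky".
sky_scores = [fraction_correct(p, t, "sky") for p, t in zip(pred, truth)]
```

The score here is a fraction rather than a raw pixel count, so that images
of different sizes are comparable.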

The problem is this: Given *two* different classifiers, I want to test
if the accuracy achieved by each classifier differs *significantly*. One
way I can think of doing this is:

for classifier 1,2
        for each image
                get % pixels correct
        calculate mean and sd across images
apply t-test

Because the images used for each classifier are the same, I assume I can
use a paired t-test. Assuming the distribution of % correct across
images is approximately normal, this should work fine.
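
The paired t-test proposed above might be sketched as follows. The test is
run on the per-image differences between the two classifiers, not on the two
means and sds separately. The function name paired_t and the accuracy values
are made up for illustration. The sketch only computes the t statistic and
its degrees of freedom; the p-value would still have to be looked up in a
t table or taken from a statistics package.

```python
import math
from statistics import mean, stdev

def paired_t(scores_a, scores_b):
    """Paired t statistic and degrees of freedom for two matched samples."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both classifiers must be scored on the same images")
    # Work on the per-image differences, which is what makes the test paired.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two images")
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1

# Hypothetical fraction-correct scores on the same five images.
acc_1 = [0.80, 0.75, 0.90, 0.85, 0.70]
acc_2 = [0.78, 0.70, 0.88, 0.80, 0.69]

t_stat, dof = paired_t(acc_1, acc_2)
```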

However, I have two nagging objections to this:

 i) the accumulation of statistics across *images* rather than any other
    unit is fairly arbitrary

ii) because the *pixels* in each image are identical as well as the
    images, it seems to me that there may be a stronger statistic I can
    use, rather than just lumping all the pixels of an image together
    and taking the sum of correct pixels. The analogy I am thinking of
    is comparing performance on a pair of exams and looking at
    individual questions rather than just taking the overall number of
    correct responses.

Anyone have any comments/ideas?

Thanks in advance
Mark

________________________________________________________________________

Mark Everingham               Phone: +44 117 9545249
Room 1.15                     Fax:   +44 117 9545208
Merchant Venturers Building   Email: [EMAIL PROTECTED]
University of Bristol         WWW:   http://www.cs.bris.ac.uk/~everingm/
Bristol BS8 1UB, UK

