Two different strategies occur to mind, both of which might, I suppose,
be implemented severally:

1. Treat the image as an array of pixels, so that each pixel may be
thought of as lying at the intersection of a row and
a column of pixels. There are then the individual pixels (all RxC of
them) and three different aggregations: by row, by column, and by the
whole image. Seems to me this would permit an ANOVA-like analysis, using
for dependent variable some suitable error function between the known
label for each pixel and the classifier's label, with sources of
variation representing rows and columns (in neither of which you would
have much interest, I imagine), classifiers (whose main effect is
equivalent to the t-test you mention below), and interactions between
(classifiers and rows) and (classifiers and columns) (these latter two
representing different levels of aggregation than the whole image).
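The ANOVA-like layout described above can be sketched in plain Python
(standard library only). All names and the toy 0/1 error data are
hypothetical; err[k][r][c] is an error indicator (1 = misclassified)
for classifier k at pixel (r, c), and the sketch computes the
sums of squares for the classifier and row main effects and for the
classifier-by-row interaction (the column terms would be analogous):

```python
# Hypothetical toy data: 2 classifiers, 3 rows x 4 columns of 0/1
# per-pixel error indicators (1 = pixel misclassified).
err = [
    [[0, 1, 0, 0],     # classifier 1
     [0, 0, 1, 0],
     [1, 0, 0, 0]],
    [[0, 1, 1, 0],     # classifier 2
     [0, 1, 1, 0],
     [1, 0, 1, 0]],
]

K = len(err)          # number of classifiers
R = len(err[0])       # number of rows
C = len(err[0][0])    # number of columns
N = K * R * C

# Grand mean error rate over all classifiers and pixels.
grand = sum(err[k][r][c]
            for k in range(K) for r in range(R) for c in range(C)) / N

# Marginal means for the classifier and row factors.
mean_k = [sum(err[k][r][c] for r in range(R) for c in range(C)) / (R * C)
          for k in range(K)]
mean_r = [sum(err[k][r][c] for k in range(K) for c in range(C)) / (K * C)
          for r in range(R)]

# Main-effect sums of squares.
ss_classifier = R * C * sum((m - grand) ** 2 for m in mean_k)
ss_row = K * C * sum((m - grand) ** 2 for m in mean_r)

# Classifier x row interaction: cell means minus both main effects.
mean_kr = [[sum(err[k][r]) / C for r in range(R)] for k in range(K)]
ss_cxr = C * sum((mean_kr[k][r] - mean_k[k] - mean_r[r] + grand) ** 2
                 for k in range(K) for r in range(R))

print(ss_classifier, ss_row, ss_cxr)
```

Dividing each sum of squares by its degrees of freedom would then give
the mean squares for the usual F-ratios; with a 0/1 error indicator the
normality assumption is of course only approximate.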
2. Instead of the structural components (rows & columns), ...

On Sat, 19 Aug 2000, Mark Everingham wrote:
> I have two classifier systems which take as input an image and produce
> as output a label for each pixel in the image, for example the input
> might be of an outdoor scene, and the labels sky/road/tree etc.
>
> I have a set of images with the correct labels, so I can test how
> accurately a classifier performs by calculating for example the mean
> number of pixels correctly classified per image or the mean number of
> sky pixels correctly classified etc.
>
> The problem is this: Given *two* different classifiers, I want to test
> if the accuracy achieved by each classifier differs *significantly*. One
> way I can think of doing this is:
>
> for classifier 1,2
>   for each image
>     get % pixels correct
> calculate mean and sd across images
> apply t-test
>
> Because the images used for each classifier are the same, I assume I can
> use a paired t-test. Assuming the distribution of % correct across
> images is approximately normal, this should work fine.
>
> However, I have two nagging objections to this:
>
> i) the accumulation of statistics across *images* rather than any other
> unit is fairly arbitrary
>
> ii) because the *pixels* in each image are identical as well as the
> images, it seems to me that there may be a stronger statistic I can
> use, rather than just lumping all the pixels of an image together and
> taking the sum of correct pixels. The analogy I am thinking of is
> comparing performance on a pair of exams and looking at individual
> questions rather than just taking the overall number of correct
> responses.
>
> Anyone have any comments/ideas?
>
> Thanks in advance
> Mark
>
> ________________________________________________________________________
>
> Mark Everingham Phone: +44 117 9545249
> Room 1.15 Fax: +44 117 9545208
> Merchant Venturers Building Email: [EMAIL PROTECTED]
> University of Bristol WWW: http://www.cs.bris.ac.uk/~everingm/
> Bristol BS8 1UB, UK
>
>
> =================================================================
> Instructions for joining and leaving this list and remarks about
> the problem of INAPPROPRIATE MESSAGES are available at
> http://jse.stat.ncsu.edu/
> =================================================================
>
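The paired t-test sketched in Mark's pseudocode above could be realised,
for instance, as the following plain-Python fragment (standard library
only); the per-image accuracies here are invented toy numbers standing
in for "% pixels correct" per image:

```python
import math

# Hypothetical per-image accuracies ("% pixels correct", as fractions),
# for the same six test images under each classifier.
acc1 = [0.91, 0.88, 0.95, 0.90, 0.86, 0.93]  # classifier 1
acc2 = [0.89, 0.85, 0.94, 0.88, 0.84, 0.90]  # classifier 2

# Paired differences, one per image.
d = [a - b for a, b in zip(acc1, acc2)]
n = len(d)

mean_d = sum(d) / n
var_d = sum((x - mean_d) ** 2 for x in d) / (n - 1)  # sample variance

# Paired t statistic with n - 1 degrees of freedom.
t = mean_d / math.sqrt(var_d / n)
print(round(t, 3))
```

The resulting t would be compared against the t distribution with
n - 1 degrees of freedom; the test assumes the per-image differences
are roughly normal, as Mark notes.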
------------------------------------------------------------------------
Donald F. Burrill [EMAIL PROTECTED]
348 Hyde Hall, Plymouth State College, [EMAIL PROTECTED]
MSC #29, Plymouth, NH 03264 603-535-2597
184 Nashua Road, Bedford, NH 03110 603-471-7128