Paige Miller <[EMAIL PROTECTED]> wrote:
> Ray Koopman wrote:
>> [EMAIL PROTECTED] (Jan) wrote in message
>> news:<[EMAIL PROTECTED]>...
>>
>>>A very basic statistical problem, i fear, but i can't get it solved. I
>>>have collected data on the occurence of pathology at female ovaries,
>>>and graded them according to the severity. Left: 1: 91, 2: 31, 3: 7,
>>>4:3; Right: 1:66, 2:28, 3:6, 4:3; totals: left: 132, right: 103. All
>>>data are collected on females who present with fertility problems
>>>(which could case a certain bias). To the best of my knowledge, no-one
>>>has documented a 50/50 spread between left and right (normally there
>>>should be no pathology! though some asymptomatic women are probably
>>>around). Q: 1/Which test should I use to compare left vs right againsi
>>>grading (if i can due to the difference in spread among grades?!)?
>>>2/Can I say that left is significantly more affected than right?
>>>(which test, based on which presumptions)? Thanks a lot!!
>>
>> The data should be organized in a 5 by 5 contingency table F in which
>> F_ij, i,j = 0...4, is the number of women whose left and right ovaries
>> had severity scores i and j, respectively, where 0 indicates no
>> pathology. It is not clear what the given values
>>        1   2   3   4
>>    L  91  31   7   3
>>    R  66  28   6   3
>> are. Are they the leftmost column F_i0 and the topmost row F_0j,
>> omitting F_00? Or perhaps the row and column sums, F_i+ and F_+j,
>> omitting F_0+ and F_+0?
>>
>> In any case, the question is about the nature of any asymmetry in F.
>
> I cannot see how this question turns into a contingency table unless the
> data given are count data. And yet the original question indicated that
> the numbers represent severity, which to me indicates continuous or
> ratio scaled data. Even if the data are indeed counts of some type of
> pathology within an ovary, I can't see how this fits into a contingency
> table.

I would guess the data were collected like this: given a population of
women (of unknown size), each had each of her ovaries given a score from
0 to 4 based on clinical results. Presumably some of the women examined
had no problem on one side, so the zero scores are missing from the
counts quoted above.

You want to answer the question: suppose I choose, at random, some woman
among those presenting to the clinic. Is she more likely to have a higher
pathology score on her left side than on her right, or not? I'd guess
that would be the most clinically relevant question.

Being a random physicist who only knows about pseudorandom number
generators :) , I would take the L-R difference in score for *each*
*woman*. (I hope those data haven't been discarded!!!!!) That's an
integer in [-4 -3 -2 -1 0 1 2 3 4]. Those differences would be the raw
data, and the test statistic would be the mean of that set.

I'd then simulate surrogate data sets by taking the original L-R data
and flipping the sign of each observation with 50% probability. Take the
mean of the flipped set, and that's one sample from the sampling
distribution under the null hypothesis that left and right are
interchangeable. Simulate many times, then compare this distribution to
the statistic on the "real" data.

Is this the same as the Wilcoxon matched-pairs test?
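
The sign-flipping procedure described above could be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the function name `sign_flip_test`, the two-sided comparison, the add-one correction to the p-value, and above all the `example_diffs` values are hypothetical. The per-woman L-R differences were not posted in this thread, so the numbers below are made up purely to make the sketch runnable.

```python
import random


def sign_flip_test(diffs, n_sim=10000, seed=0):
    """Two-sided sign-flip randomization test on the mean of paired differences.

    diffs: one L-R score difference per woman (integers in -4..4).
    Returns (observed mean, Monte Carlo p-value).
    """
    rng = random.Random(seed)
    n = len(diffs)
    observed_sum = sum(diffs)
    extreme = 0
    for _ in range(n_sim):
        # Under the null, each difference is equally likely to be +d or -d.
        surrogate_sum = sum(d if rng.random() < 0.5 else -d for d in diffs)
        # Comparing integer sums is equivalent to comparing means (same n).
        if abs(surrogate_sum) >= abs(observed_sum):
            extreme += 1
    # Add-one correction (an assumption of this sketch) keeps p above zero.
    p_value = (extreme + 1) / (n_sim + 1)
    return observed_sum / n, p_value


# HYPOTHETICAL differences, one per woman; NOT the data from the thread.
example_diffs = [1, 0, 2, -1, 1, 0, 0, 3, -2, 1, 1, 0, -1, 2, 1]
observed_mean, p = sign_flip_test(example_diffs)
print(f"mean L-R difference = {observed_mean:.3f}, p = {p:.4f}")
```

On the closing question: as far as I know, the Wilcoxon matched-pairs (signed-rank) test rests on the same sign-flipping null, but its statistic is built from the ranks of the absolute differences rather than from their mean. So the sketch above is a close relative of that test, not the same test.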
