Glen Barnett wrote:
>
> But since WMW is completely insensitive to a change in spread without
> a change in location, if either were possible, a rejection would
> imply that there was indeed a location difference of some kind. This
> objection strikes me as strange indeed. Does Johnson not understand
> what WMW is doing? Why on earth does he think that a t-test suffers
> any less from these problems than WMW?
>
> Similarly, a change in shape sufficient to get a rejection of a WMW
> test would imply a change in location (in the sense that the "middle"
> had moved, though the term 'location' becomes somewhat harder to pin
> down precisely in this case). e.g. (use a monospaced font to see this):
>
>   :.                  .:
>   ::.        =>      .::
>   ::::...        ...::::
>   a     b        a     b
>
> would imply a different 'location' in some sense, which WMW will
> pick up. I don't understand the problem - a t-test will also reject
> in this case; it suffers from this drawback as well (i.e. they are
> *both* tests that are sensitive to location differences, insensitive
> to spread differences without a corresponding location change, and
> both pick up a shape change that moves the "middle" of the data).

In fact, it can be shown (I can send details, and a preprint, to
anybody interested) that a weakness - at least in principle - of the
WMW test is that it *fails* to be a test of location, in that it may
exhibit cyclicity between three sets of data, or even consistently
cyclic behaviour between three populations as sample size -> infinity.

(A test is "cyclic" if it can imply A > B > C > A, rejecting the null
hypothesis in each case. This is stronger than "intransitivity", in
which the test implies A > B > C but fails to reject A = C. Student's
t test (with pooled variance) can exhibit the latter behaviour (suppose
n1 = n3 = 2, n2 = 100; xbar1 = -2, xbar2 = 0, xbar3 = 2; and
s1 = s2 = s3 = 1), but not the former, as it can never imply mu1 > mu2
if xbar1 <= xbar2.)
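The intransitivity example for the pooled t test can be checked from the summary statistics alone. The following is only a sketch: the helper names `pooled_t` and `compare` are made up for illustration, and the two-sided 5% critical values are the usual tabulated ones (about 1.984 for 100 df and 4.303 for 2 df), hard-coded so that nothing beyond the standard library is needed.

```python
import math

def pooled_t(n1, xbar1, s1, n2, xbar2, s2):
    """Return (t, df) for Student's two-sample t test with pooled variance."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (xbar2 - xbar1) / se, df

# Two-sided 5% critical values from standard t tables (assumed, not computed).
T_CRIT = {2: 4.303, 100: 1.984}

# (n, xbar, s) for the three samples in the example.
samples = {1: (2, -2.0, 1.0), 2: (100, 0.0, 1.0), 3: (2, 2.0, 1.0)}

def compare(i, j):
    """Return (t, df, reject) for the pooled t test of sample i against sample j."""
    t, df = pooled_t(*samples[i], *samples[j])
    return t, df, abs(t) > T_CRIT[df]

for i, j in [(1, 2), (2, 3), (1, 3)]:
    t, df, reject = compare(i, j)
    print(f"sample {i} vs {j}: t = {t:.3f}, df = {df}, reject = {reject}")
```

Samples 1 and 2, and samples 2 and 3, are separated (t is about 2.8 on 100 df), while samples 1 and 3 give t = 4 on only 2 df, which falls short of the critical value: intransitive, but not cyclic.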
The simplest example of cyclic behaviour for the WMW test uses made-up
(or large) data sets based on Efron's intransitive dice, labelled
{1,1,5,5,5,5}, {3,3,3,4,4,4} and {2,2,2,2,6,6}. Details are left to the
reader. This is Barnett's "change of shape".

Potthoff (1963) showed that WMW is a test for the median/mean between
any two symmetric distributions; and it is clear that it is a test for
the median/mean within any shifted family. However, for a
Behrens-Fisher family of asymmetric distributions, cyclic behaviour is
typically exhibited (Dawson, 1997, unpublished), so that a change of
shape is *not* necessary.

In particular, if f_X(x) is analytic with all moments existing, the WMW
test is a test of location for the Behrens-Fisher family generated by
f_X(x) if and only if f_X(x) is symmetric. For more general
distributions, a necessary and sufficient condition is that if we let
f_X(x) = f1(x) + f2(x), where f1 is nonzero only below the median
(WLOG 0) and f2 only above, and gi(x) = e^x fi(e^x), then g1 and g2
have the same autocorrelation. (Don't ask me why, I just did the
calculus & that's what it said...)

Notwithstanding all of the above, the cyclicity phenomenon is never
very strong. Using a result of Steinhaus and Trybula (1959), we can
show that even three made-up data sets cannot exhibit cyclicity for
two-tailed WMW tests at the 5% significance level unless each sample
size is at least 50. E.g.:

       Sample    1     2     3
    X = 1       19     0     0
        2        0     0    31
        3        0    50     0
        4       31     0     0
        5        0     0    19

but no smaller sample size will work. Using random samples from
populations divided in these proportions we would of course need sample
sizes larger than 50 to have this happen with any great frequency.

As a final example, consider the shifted exponential distributions as a
fairly realistic model of a Behrens-Fisher family.
It can be shown that, for random samples from three member
distributions f_a, f_b, f_c chosen so that the expected values of the
pairwise WMW test statistics imply A>B>C>A for hypothetical "locations"
A,B,C, at least one test will have a power of less than 50% (for
two-sided 5% significance level tests) unless the sample sizes are
greater than about 800. (As n -> infinity, the power of all three tests
goes to 1, of course; but it takes its time doing so!)

Thus, while the phenomenon is in one sense very widespread, it would
seem that there are few naturally occurring triples of independent data
sets for which the WMW is cyclic; and examples for which the
Behrens-Fisher model is plausible may be very few and far between.

-Robert Dawson
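The two made-up cyclic examples earlier in this post (the dice-based sets and the three samples of size 50) can be checked with a few lines of code. This is a rough sketch, not the derivation behind the claims: `prob_greater` and `wmw_z` are hypothetical helper names, and the test statistic uses the large-sample normal approximation to the Mann-Whitney U with the usual tie correction and no continuity correction, which is an assumption rather than an exact test.

```python
import math
from collections import Counter
from fractions import Fraction

def prob_greater(x, y):
    """Exact P(X > Y) for one draw from x and one independent draw from y."""
    wins = sum(1 for a in x for b in y if a > b)
    return Fraction(wins, len(x) * len(y))

def wmw_z(x, y):
    """Normal-approximation z for Mann-Whitney U (tie-corrected variance).

    Positive values mean x tends to exceed y. No continuity correction.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    u = sum((a > b) + 0.5 * (a == b) for a in x for b in y)
    tie_term = sum(t ** 3 - t for t in Counter(x + y).values())
    var = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    return (u - n1 * n2 / 2.0) / math.sqrt(var)

# The three dice-based sets.
dice_a = [1, 1, 5, 5, 5, 5]
dice_b = [3, 3, 3, 4, 4, 4]
dice_c = [2, 2, 2, 2, 6, 6]
for name, x, y in [("A>B", dice_a, dice_b), ("B>C", dice_b, dice_c), ("C>A", dice_c, dice_a)]:
    print(name, prob_greater(x, y))

# The three samples of size 50 from the table.
s1 = [1] * 19 + [4] * 31
s2 = [3] * 50
s3 = [2] * 31 + [5] * 19
for name, x, y in [("1 vs 2", s1, s2), ("2 vs 3", s2, s3), ("3 vs 1", s3, s1)]:
    print(name, round(wmw_z(x, y), 3))
```

For the dice, each set beats the next with probability above 1/2 (2/3, 2/3 and 5/9). For the size-50 samples, all three z values come out above 1.96 under this approximation (roughly 2.26, 2.26 and 2.07), so each pairwise two-sided test rejects at the 5% level, in a cycle. The sketch does not check the claim that no smaller sample size works.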
