I believe the key assumption is stochastic ordering of the two distributions from which the two samples are drawn. Under the alternative hypothesis, this means that the two distribution functions do not cross: one of them must lie entirely to the right of the other. Many asymmetric distributions satisfy this. Under the null hypothesis of no difference in location (e.g., median), the distributions are identical if they do not cross each other.
If two distributions have the same location parameter (e.g., mean) but different dispersion parameters (e.g., variance), then their distribution functions cross each other and the assumption of stochastic ordering is violated. An example would be two normal distributions with equal means but different variances.
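To make the crossing concrete, here is a minimal Python sketch using only the standard library. The two normal distributions (mean 0, standard deviations 1 and 3) and the helper name cdf_gap are illustrative assumptions, not values from this thread. The difference between the two distribution functions is positive below the common mean and negative above it, so neither distribution lies entirely to the right of the other.

```python
from statistics import NormalDist

# Illustrative parameters (assumed): equal means, unequal spreads.
narrow = NormalDist(mu=0.0, sigma=1.0)
wide = NormalDist(mu=0.0, sigma=3.0)


def cdf_gap(x):
    """Return F_wide(x) - F_narrow(x); a sign change means the CDFs cross."""
    return wide.cdf(x) - narrow.cdf(x)


# The gap is positive to the left of the common mean and negative to the right.
for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"x = {x:+.1f}   F_wide - F_narrow = {cdf_gap(x):+.4f}")
```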
I think this is somewhere in Hájek's little book on nonparametrics.
One consequence is that rank tests don't work when the data are normally distributed with different variances, since in that situation the distribution functions cross somewhere, even when the means differ. That is, rank tests do not, in general, solve the problem of unequal variances for normal distributions, or for other symmetric distributions for that matter.
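A short sketch of why a difference in means does not restore the ordering: for two normal distributions with different standard deviations, the distribution functions meet where the standardized values agree, (x - mu1)/sigma1 = (x - mu2)/sigma2. The parameter values and the helper name crossing_point below are assumptions chosen for illustration only.

```python
from statistics import NormalDist


def crossing_point(mu1, sigma1, mu2, sigma2):
    """Solve (x - mu1)/sigma1 == (x - mu2)/sigma2 for x."""
    if sigma1 == sigma2:
        raise ValueError("equal standard deviations: no single crossing point")
    return (mu1 * sigma2 - mu2 * sigma1) / (sigma2 - sigma1)


# Illustrative parameters (assumed): different means AND different spreads.
first = NormalDist(mu=0.0, sigma=1.0)
second = NormalDist(mu=1.0, sigma=3.0)
x_star = crossing_point(first.mean, first.stdev, second.mean, second.stdev)

# Compare the two CDFs one unit to either side of the crossing point.
gap_left = second.cdf(x_star - 1.0) - first.cdf(x_star - 1.0)
gap_right = second.cdf(x_star + 1.0) - first.cdf(x_star + 1.0)
print(x_star, gap_left > 0, gap_right < 0)
```

With these values the functions cross at x = -0.5, and the sign of the gap flips across that point, so the two distributions are not stochastically ordered even though their means differ.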
David Smith
David W. Smith, Ph.D., M.P.H.
Associate Professor of Biometry
School of Public Health - San Antonio Campus
University of Texas
7703 Floyd Curl Dr., Mail Code 7976
San Antonio, TX 78229-3900
voice: (210)567-3560
fax: (210) 567-5942
e-mail: [EMAIL PROTECTED]
-----Original Message-----
From: Donald Burrill [mailto:[EMAIL PROTECTED]]
Sent: Thursday, January 02, 2003 3:42 PM
To: Francois Bergeret; Rich Ulrich
Cc: [EMAIL PROTECTED]
Subject: Re: Question on Wilcoxon Test
I suspect that something like an assumption of symmetry may be needed,
on the following argument:
Assumptions generally deal with the situation if the null hypothesis be
true: in this case, if the two subgroup medians are equal. If the
subgroups were not symmetrical and had the same median, how badly could
the test go wrong?
The worst case I can conjure up on the spur of the moment is this:
suppose the two groups to be equal in size, and that we imagine the
total set of data to be divided into quarters at the overall quartiles;
and suppose Group A consists of the 1st and 3rd quarters while Group B
comprises the 2nd and 4th quarters. A and B will have the same median
(or if not, we can arrange that by swapping at most two observations
between the groups, in what I suppose to be an obvious way), so the null
hypothesis is true in this case.
Now for such a case I rather suspect that the sum of the ranks on which
the test is based might be interestingly far from the value expected
under an assumption of symmetry about the median for each group.
(Haven't tried to demonstrate that with fictitious data, but leave it as
an exercise for an interested reader.)
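
One way to take up that exercise is the sketch below. It works with ranks only, assumes no ties, and uses the usual large-sample normal approximation for the rank sum without a continuity correction. The function names and the choice of 10 observations per quarter are assumptions for illustration. Note that any value lying between the two middle observations of the pooled data splits both groups in half, which is the sense in which the groups share a median; the conventional midpoint medians of the two groups need not coincide.

```python
import math


def burrill_split(m):
    """Split ranks 1..4m so A holds quarters 1 and 3, B holds quarters 2 and 4."""
    ranks = range(1, 4 * m + 1)
    group_a = [r for r in ranks if (r - 1) // m in (0, 2)]
    group_b = [r for r in ranks if (r - 1) // m in (1, 3)]
    return group_a, group_b


def rank_sum_z(group_a, group_b):
    """Rank sum of A, its null expectation, and the normal-approximation z."""
    n_a, n_b = len(group_a), len(group_b)
    n_total = n_a + n_b
    w = sum(group_a)
    expected = n_a * (n_total + 1) / 2
    variance = n_a * n_b * (n_total + 1) / 12  # valid when there are no ties
    z = (w - expected) / math.sqrt(variance)
    return w, expected, z


group_a, group_b = burrill_split(10)  # 40 observations, 20 per group
w, expected, z = rank_sum_z(group_a, group_b)
print(w, expected, round(z, 3))
```

For 10 observations per quarter, the rank sum of Group A is 310 against a null expectation of 410, giving z of roughly -2.7, which is outside the usual two-sided 5% cutoff of 1.96 in absolute value.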
What I cannot tell (on the spur of the moment) is whether this situation
can reasonably be included in the "100*alpha %" of cases for which one
expects to reject the null hypothesis falsely; but surely if one had
reason to expect some such distribution on a systematic basis that
reason would be sufficient to invalidate one's calculation of any
p-value based on the standard assumption(s).
Comments, anyone? -- Don.
-----------------------------------------------------------------------
Donald F. Burrill [EMAIL PROTECTED]
56 Sebbins Pond Drive, Bedford, NH 03110 (603) 626-0816
[was: 184 Nashua Road, Bedford, NH 03110 (603) 471-7128]
On Thu, 2 Jan 2003, Rich Ulrich wrote:
> On Thu, 2 Jan 2003 "Francois Bergeret" <[EMAIL PROTECTED]>
> wrote:
>
> > Hello and happy new year to the group members!
> >
> > I have read in some references that the Wilcoxon test (or the
> > Kruskal-Wallis test for more than 2 subgroups) assumes that the
> > distribution of values is symmetric around the median. I'm not sure
> > this assumption is needed. A book by Saporta describes the test in
> > detail, and I do not see this assumption there.
> >
> > Is the assumption of symmetry needed for the Wilcoxon or KW tests?
>
> The rank-tests are unchanged, in fact, by any monotonic
> transformation (log, etc.), and that is basic about them.
> So, I hope that you are mis-remembering the references -
>
> The usual assumption, as I keep it in mind, is that
> the samples follow the *same* distribution.
> (Maybe someone will give the more technical statement?)
>
> As I think of them, the rank-tests save you the trouble
> of finding the proper transformation:
> a) However, the proper transformation would give the
> most powerful test and the most information; and
> b) When there is no 'proper' transformation available,
> the assumptions of the rank-test are probably violated,
> too (and, as a consequence of *this*, you probably are
> testing something odd, which you will regret).
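
The invariance of rank tests under monotonic transformation, mentioned in the quoted message above, can be checked directly. The sketch below is a minimal illustration: the data values are made up, the helper rank_sum is a hypothetical name, and it assumes positive values with no ties. Taking logs changes every observation but leaves the pooled ordering, and therefore the rank sum, unchanged.

```python
import math


def rank_sum(sample_a, sample_b):
    """Sum of the pooled ranks belonging to sample_a (assumes no ties)."""
    pooled = sorted(sample_a + sample_b)
    rank_of = {value: i + 1 for i, value in enumerate(pooled)}
    return sum(rank_of[value] for value in sample_a)


# Made-up positive data, for illustration only.
a = [1.2, 3.4, 0.7, 8.9, 2.1]
b = [4.5, 0.9, 12.3, 6.7, 5.0]

raw_sum = rank_sum(a, b)
log_sum = rank_sum([math.log(v) for v in a], [math.log(v) for v in b])
print(raw_sum, log_sum)
```

Both rank sums come out to 22 here, so any test statistic built from the ranks is the same before and after the transformation.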
