Re: [R] Bootstrap or Wilcoxons' test?

David Winsemius Fri, 13 Feb 2009 23:10:57 -0800

The Wilcoxon rank sum test is not "plain and simple a test for the equality of distributions". If it were, it would be able to detect differences in variance when locations are similar. For that purpose it would, in point of fact, be useless. Compare these simple situations w.r.t. the WRS:


> x <- rnorm(100)  # mean=0, sd=1
> y <- rnorm(100, mean=0, sd=4)
> wilcox.test(x,y)


        Wilcoxon rank sum test with continuity correction

data:  x and y
W = 4518, p-value = 0.2394
alternative hypothesis: true location shift is not equal to 0

> y <- rnorm(100, mean=.2, sd=0)
>
> wilcox.test(x,y)

        Wilcoxon rank sum test with continuity correction

data:  x and y
W = 3900, p-value = 0.004079
alternative hypothesis: true location shift is not equal to 0
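
Note that sd=0 in the second call makes every value of y exactly 0.2, so the reported W of 3900 is just 100 times the number of x values above 0.2. A minimal sketch of the same contrast with equal spread in both samples is given below, assuming that was the intent; the seed, the sample size of 200 and the shift of 1 are illustrative choices, not values from the transcript.

```r
set.seed(1)                                  # assumed seed, for reproducibility only
x        <- rnorm(200)                       # mean 0, sd 1
y_spread <- rnorm(200, mean = 0, sd = 4)     # same location, larger spread
y_shift  <- rnorm(200, mean = 1, sd = 1)     # same spread, shifted location

p_spread <- wilcox.test(x, y_spread)$p.value # spread differs, location does not
p_shift  <- wilcox.test(x, y_shift)$p.value  # location differs, spread does not
c(spread_only = p_spread, shift_only = p_shift)
```

The point of the comparison is that the rank sum statistic responds to the shift, not to the change in spread.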

The Wilcoxon rank sum test is a test of the equality of location (and the median is a readily understood non-parametric measure of location). The test is derived under the *assumption* that the samples are drawn from the *same* distribution, differing only by a shift. If the distributions were not of the same family, the test would be invalidated. The wilcox.test help page is informative, saying "the null hypothesis is that the distributions of x and y differ by a location shift of mu". An estimate of the location shift (the pseudomedian in the one-sample case) is optionally returned when conf.int is set to TRUE. I also suggest looking at the formula for the statistic. It is available with getAnywhere(wilcox.test.default).
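
The statistic can be reproduced by hand, which makes it plain that only the ranks enter the calculation. The sketch below assumes continuous data without ties; the seed and sample sizes are illustrative.

```r
set.seed(2)                          # assumed seed, illustration only
x <- rnorm(30)
y <- rnorm(40, mean = 0.5)

r   <- rank(c(x, y))                 # joint ranks of the pooled sample
n_x <- length(x)
W_by_hand <- sum(r[seq_along(x)]) - n_x * (n_x + 1) / 2  # rank sum of x, centred

res <- wilcox.test(x, y, conf.int = TRUE)
all.equal(W_by_hand, unname(res$statistic))  # compare with the reported W
res$estimate                                 # estimated difference in location (x minus y)
res$conf.int                                 # confidence interval for that difference
```

Without ties, W is also the number of (x, y) pairs in which the x value is the larger one, so rescaling either sample's spread around a common centre changes it very little.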

If one wants a test for "equality of distribution", one could turn to a more general test (with loss of power but with at least some potential for detecting differences in dispersion) such as the Kolmogorov-Smirnov or Kuiper tests. With x and y as above:


> ks.test(x,y)

        Two-sample Kolmogorov-Smirnov test

data:  x and y
D = 0.61, p-value < 2.2e-16
alternative hypothesis: two-sided

Warning message:
In ks.test(x, y) : cannot compute correct p-values with ties
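
The contrast between the two tests can also be viewed as a rejection rate over repeated samples that differ only in spread. What follows is a rough simulation sketch; the seed, the 500 replications, n = 100 per group and the 0.05 threshold are all assumptions chosen for illustration.

```r
set.seed(3)                                  # assumed seed
n_rep <- 500                                 # assumed number of replications

reject <- replicate(n_rep, {
  x <- rnorm(100)                            # mean 0, sd 1
  y <- rnorm(100, mean = 0, sd = 4)          # same mean, four times the sd
  c(wilcoxon = wilcox.test(x, y)$p.value < 0.05,
    ks       = ks.test(x, y)$p.value < 0.05)
})

rates <- rowMeans(reject)                    # proportion of rejections per test
rates
```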

Returning to the OP's question, rather than worrying about normality in samples, the greater threat to validity in regression methods is unequal variances across groups or across the range of continuous predictors.
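
A quick way to look at the equal-variance concern in a two-group comparison is sketched below. The data are simulated and the names group and log_conc are hypothetical; only the group sizes (36 and 37) come from the original question.

```r
set.seed(4)                                            # assumed seed
group    <- factor(rep(c("A", "B"), times = c(36, 37)))  # hypothetical group labels
log_conc <- c(rnorm(36, mean = 0, sd = 0.5),           # simulated log concentrations,
              rnorm(37, mean = 0, sd = 1.5))           # equal means, unequal spread

sds <- tapply(log_conc, group, sd)                     # spread within each group
sds
fligner.test(log_conc ~ group)                         # rank-based test of equal variances
t.test(log_conc ~ group, var.equal = FALSE)            # Welch test, no equal-variance assumption
```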


--
David Winsemius

On Feb 13, 2009, at 11:12 PM, Murray Cooper wrote:

First of all, sorry for my typing mistakes.

Second, the WRS test is most certainly not a test for unequal medians.
Although under specified models it would be. Just as under specified
models it can be a test for other measures of location. Perhaps I did
not word my explanation correctly, but I did not mean to imply that it
would be a test of equality of variance. It is plain and simple a test
for the equality of distributions. When the results of a properly
applied parametric test do not agree with the WRS, it is usually due
to a difference in the empirical density function of the two samples.

Murray M Cooper, Ph.D.
Richland Statistics
9800 N 24th St
Richland, MI, USA 49083
Mail: richs...@earthlink.net
----- Original Message -----
From: "David Winsemius" <dwinsem...@comcast.net>
To: "Murray Cooper" <myrm...@earthlink.net>
Cc: "Charlotta Rylander" <z...@nilu.no>; <r-help@r-project.org>
Sent: Friday, February 13, 2009 9:19 PM
Subject: Re: [R] Bootstrap or Wilcoxons' test?

I must disagree both with this general characterization of the Wilcoxon test and with the specific example offered. First, we ought to spell the author's name correctly and then clarify that it is the Wilcoxon rank-sum test that is being considered. Next, the WRS test is a test for differences in the location parameter of independent samples, conditional on the samples having been drawn from the same distribution. The WRS test would have no discriminatory power for samples drawn from the same distribution having equal location parameters and differing only in dispersion. Look at the formula, for Pete's sake. It summarizes differences in ranking, so it is in fact designed NOT to be sensitive to the spread of the values in the sample. It would have no power, for instance, to test the variances of two samples, both with a mean of 0, and one having a variance of 1 with the other having a variance of 3. One can think of the WRS as a test for unequal medians.
--
David Winsemius, MD. MPH
Heritage Laboratories


On Feb 13, 2009, at 7:48 PM, Murray Cooper wrote:
Charlotta,

I'm not sure what you mean when you say simple linear
regression. From your description you have two groups
of people, for which you recorded contaminant concentration.
Thus, I would think you would do something like a t-test to
compare the mean concentration level. Where does the
regression part come in? What are you regressing?

As for the Wilcoxnin test, it is often thought of as a
nonparametric t-test equivalent. This is only true if the
observations were drawn, from a population with the
same probability distribution. The null hypothesis of
the Wilcoxin test is actually "the observations were
drawn, from the same probability distribution".
Thus if your two samples had say different variances,
their means could be the same, but since the variances
are different, the Wilcoxin could give you a significant result.

Don't know if this all makes sense, but if you have more
questions, please e-mail your data and a more detailed
description of what analysis you used and I'd be happy
to try and help out.

Murray M Cooper, Ph.D.
Richland Statistics
9800 N 24th St
Richland, MI, USA 49083
Mail: richs...@earthlink.net
----- Original Message -----
From: "Charlotta Rylander" <z...@nilu.no>
To: <r-help@r-project.org>
Sent: Friday, February 13, 2009 3:24 AM
Subject: [R] Bootstrap or Wilcoxons' test?
Hi!
I'm comparing the differences in contaminant concentration between 2 different groups of people (N=36, N=37). When using a simple linear regression model I found no differences between groups, but when evaluating the diagnostic plots of the residuals I found my independent variable to have deviations from normality (even after log transformation). Therefore I have used bootstrap on the regression parameters (R=1000 & R=10000) and this confirms my results, i.e., no differences between groups (and the distribution is log-normal). However, when using Wilcoxon's rank sum test on the same data set I find differences between groups.
Should I trust the results from bootstrapping or from Wilcoxon's test?
Thanks!
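
For concreteness, a case-resampling bootstrap of the group coefficient could look roughly like the sketch below. This is not the poster's code: the data are simulated with no true group difference, the names dat, group and log_conc are hypothetical, and a percentile interval is only one of several possible summaries.

```r
set.seed(5)                                            # assumed seed
group <- factor(rep(c("A", "B"), times = c(36, 37)))   # group sizes from the question
conc  <- rlnorm(73, meanlog = 0, sdlog = 1)            # simulated log-normal concentrations
dat   <- data.frame(group = group, log_conc = log(conc))

boot_diff <- replicate(1000, {
  idx <- sample(nrow(dat), replace = TRUE)             # resample rows with replacement
  coef(lm(log_conc ~ group, data = dat[idx, ]))[["groupB"]]  # group difference, log scale
})

ci <- quantile(boot_diff, c(0.025, 0.975))             # percentile interval
ci
```

An interval that covers 0 would be read as no detectable difference in mean log concentration, which is a different question from the one a rank sum test addresses.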



Regards



Lotta Rylander



______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.