On Wed, 18 Apr 2007, Greg Snow wrote:

> For testing, the permutation test may be preferred to the bootstrap
> (though the bootstrap could be used for a confidence interval).
>
> I remember in grad school doing a project on comparing the efficiency of
> a permutation test on medians compared to the Mann-Whitney test, but I
> don't remember the specifics of when each was better.  Do any of the
> other participants in this discussion have any ideas on how the
> permutation tests compare to what else has been discussed?
>
> The Mann-Whitney test is actually a special case of the permutation test,
> but using the median permutation test is more intuitive to my mind.  The
> permutation test is actually testing the null hypothesis that the 2
> distributions are identical, but makes no assumptions about normality,
> skewness, shift hypotheses, etc.  Though the efficiency of the test
> statistic used would depend somewhat on the nature of the alternatives
> of interest (imagine 2 distributions with the same mean, but different
> medians, or same median, but different mean; a permutation test
> comparing means or medians would differ in the 2 cases).
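A minimal sketch of the median permutation test described above, under the null hypothesis stated there (the two distributions are identical). The thread is about R; Python is used here purely for illustration. The function name, the default number of permutations, and the fixed seed are assumptions of this sketch, not anything proposed in the thread.

```python
import random
import statistics


def median_permutation_test(x, y, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation test on the difference in medians.

    Null hypothesis: x and y are draws from one and the same distribution,
    so the group labels are exchangeable.
    Returns (observed difference in medians, approximate p-value).
    """
    rng = random.Random(seed)
    observed = statistics.median(x) - statistics.median(y)
    pooled = list(x) + list(y)
    n_x = len(x)
    extreme = 0
    for _ in range(n_perm):
        # Reassign group labels at random by shuffling the pooled sample.
        rng.shuffle(pooled)
        diff = statistics.median(pooled[:n_x]) - statistics.median(pooled[n_x:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Count the observed labelling as one permutation, so p is never exactly 0.
    return observed, (extreme + 1) / (n_perm + 1)


if __name__ == "__main__":
    x = [1.2, 3.4, 2.2, 5.1, 4.8, 2.9]
    y = [6.3, 4.4, 7.7, 5.9, 8.1, 6.6]
    print(median_permutation_test(x, y))
```

Swapping `statistics.median` for `statistics.mean` gives the permutation test on means, which is the comparison of test statistics raised at the end of the quoted message.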
I think the point is that one does not want to assume the two
distributions are identical: the null hypothesis is that they have the
same median but possibly different shapes (including spread).

You can set up bootstrap tests (see Davison & Hinkley, for example).
Cody Hamilton's CI is too crude: again see D&H or MASS for less crude
alternatives.  However, bootstrapping a median has its own peculiarities:
see the example in MASS and references there, including to Sheather's book.

> I'll have to try some simulations looking at a permutation test on
> Efron's dice.
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> [EMAIL PROTECTED]
> (801) 408-8111
>
>> -----Original Message-----
>> From: [EMAIL PROTECTED]
>> [mailto:[EMAIL PROTECTED] On Behalf Of
>> [EMAIL PROTECTED]
>> Sent: Wednesday, April 18, 2007 10:06 AM
>> To: [email protected]
>> Subject: Re: [R] help comparing two median with R
>>
>> Has anyone proposed using a bootstrap for Pedro's problem?
>>
>> What about taking a bootstrap sample from x, a bootstrap sample
>> from y, take the difference in the medians for these two
>> bootstrap samples, repeat the process 1,000 times and
>> calculate the 95th percentiles of the 1,000 computed
>> differences?  You would get a CI on the difference between
>> the medians for these two groups, with which you could
>> determine whether the difference was greater/less than zero.
>> Too crude?
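For concreteness, Cody's proposal can be sketched as a plain percentile bootstrap. This is only the crude interval he describes, which the reply above calls too crude; the refinements in Davison & Hinkley or MASS are not attempted here. The sketch reads "95th percentiles" as the 2.5th and 97.5th percentiles of the resampled differences (a two-sided 95% interval), which is an assumption, as are the function name, the simple index rule for the percentiles, and the use of Python rather than R.

```python
import random
import statistics


def bootstrap_median_diff_ci(x, y, n_boot=1000, level=0.95, seed=0):
    """Crude percentile bootstrap CI for median(x) - median(y).

    x and y are resampled independently, with replacement, so the two
    groups are allowed to have different shapes and spreads.
    """
    rng = random.Random(seed)
    x, y = list(x), list(y)
    diffs = []
    for _ in range(n_boot):
        boot_x = rng.choices(x, k=len(x))  # resample x with replacement
        boot_y = rng.choices(y, k=len(y))  # resample y with replacement
        diffs.append(statistics.median(boot_x) - statistics.median(boot_y))
    diffs.sort()
    alpha = 1.0 - level
    # Simple order-statistic rule for the percentiles (no interpolation).
    lo_idx = int((alpha / 2) * (n_boot - 1))
    hi_idx = int((1 - alpha / 2) * (n_boot - 1))
    return diffs[lo_idx], diffs[hi_idx]


if __name__ == "__main__":
    x = [1.2, 3.4, 2.2, 5.1, 4.8, 2.9]
    y = [6.3, 4.4, 7.7, 5.9, 8.1, 6.6]
    print(bootstrap_median_diff_ci(x, y))
```

An interval that excludes zero would be read, as in the proposal, as evidence that the medians differ. With small samples the bootstrap distribution of a median takes only a few distinct values, which is one of the peculiarities of bootstrapping a median.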
>>
>> Regards,
>>    -Cody
>>
>> From:    Frank E Harrell Jr <[EMAIL PROTECTED]>
>> Sent by: [EMAIL PROTECTED]
>> Date:    04/18/2007 05:02 AM
>> To:      Thomas Lumley <[EMAIL PROTECTED]>
>> cc:      [email protected]
>> Subject: Re: [R] help comparing two median with R
>>
>> Thomas Lumley wrote:
>>> On Tue, 17 Apr 2007, Frank E Harrell Jr wrote:
>>>
>>>> The points that Thomas and Brian have made are certainly correct, if
>>>> one is truly interested in testing for differences in medians or
>>>> means.  But the Wilcoxon test provides a valid test of x > y more
>>>> generally.  The test is consonant with the Hodges-Lehmann estimator:
>>>> the median of all possible differences between an X and a Y.
>>>
>>> Yes, but there is no ordering of distributions (taken one at a time)
>>> that agrees with the Wilcoxon two-sample test, only orderings of pairs
>>> of distributions.
>>>
>>> The Wilcoxon test provides a test of x>y if it is known a priori that
>>> the two distributions are stochastically ordered, but not under weaker
>>> assumptions.  Otherwise you can get x>y>z>x.  This is in contrast to
>>> the t-test, which orders distributions (by their mean) whether or not
>>> they are stochastically ordered.
>>>
>>> Now, it is not unreasonable to say that the problems are unlikely to
>>> occur very often and aren't worth worrying too much about.  It does
>>> imply that it cannot possibly be true that there is any summary of a
>>> single distribution that the Wilcoxon test tests for (and the same is
>>> true for other two-sample rank tests, eg the logrank test).
>>>
>>> I know Frank knows this, because I gave a talk on it at Vanderbilt,
>>> but most people don't know it.
>>> (I thought for a long time that the Wilcoxon rank-sum test was a
>>> test for the median pairwise mean, which is actually the R-estimator
>>> corresponding to the *one*-sample Wilcoxon test).
>>>
>>>     -thomas
>>
>> Thanks for your note Thomas.  I do feel that the problems you
>> have rightly listed occur infrequently and that often I only
>> care about two groups.  Rank tests generally are good at
>> relatives, not absolutes.  We have an efficient test
>> (Wilcoxon) for relative shift but for estimating an absolute
>> one-sample quantity (e.g., median) the nonparametric
>> estimator is not very efficient.  Ironically there is an
>> exact nonparametric confidence interval for the median
>> (unrelated to Wilcoxon) but none exists for the mean.
>>
>> Cheers,
>> Frank
>> --
>> Frank E Harrell Jr   Professor and Chair           School of Medicine
>>                      Department of Biostatistics   Vanderbilt University
>>
>> ______________________________________________
>> [email protected] mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>

--
Brian D.
Ripley,                           [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
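Efron's dice, mentioned earlier in the thread, give a concrete instance of the x>y>z>x cycle that Thomas Lumley describes. The sketch below assumes the usual face values for the four dice (they are not given in the thread) and computes P(first die > second die) exactly by enumerating all 36 face pairs. This is the population quantity that the Mann-Whitney U statistic, divided by the product of the sample sizes, estimates when there are no ties. The names `EFRON_DICE` and `prob_beats` are made up for this illustration, and Python stands in for R.

```python
from fractions import Fraction
from itertools import product

# Usual face values for Efron's four nontransitive dice (assumed here).
EFRON_DICE = {
    "A": [4, 4, 4, 4, 0, 0],
    "B": [3, 3, 3, 3, 3, 3],
    "C": [6, 6, 2, 2, 2, 2],
    "D": [5, 5, 5, 1, 1, 1],
}


def prob_beats(first, second):
    """Exact P(first > second) for two independent fair dice."""
    wins = sum(1 for u, v in product(first, second) if u > v)
    return Fraction(wins, len(first) * len(second))


if __name__ == "__main__":
    # Each die beats the next one in the cycle A -> B -> C -> D -> A.
    for a, b in [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]:
        print(f"P({a} > {b}) = {prob_beats(EFRON_DICE[a], EFRON_DICE[b])}")
```

Every pairwise comparison in the cycle favours the first die, so no single-number summary of each die (mean, median, or anything else) can reproduce these pairwise orderings, which is the point made about two-sample rank tests.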
