I tend to use t-tests after examining normal probability plots and, possibly, considering transformation. I believe they would be more powerful than permutation tests but that may be incorrect. Can you describe situations in which you would prefer permutation tests to t-tests?
Here is why I prefer permutation tests, beyond their conceptual simplicity. t-tests are based on a normal sampling model, and my perception is that very few of the data sets to which people apply t-tests actually arise from random samples. By this I mean specifically that the person gathering the data used a genuine random sampling method to select observations from a population. Survey samples would be an exception. I think it is far more often the case that a study takes whatever units/subjects are at hand and separates them into groups, either through random assignment or based on some categorical variable. The permutation test directly answers the question of how the measured difference between group averages might have differed had the groups been formed in a different way. Then, instead of making the objectionable argument that "we will treat the data as if it were a representative sample from the population of interest", using a t-test justified by a false random sampling argument, and making inferences to some larger population, I find it much more justifiable to model the randomness that was truly part of the data gathering (random assignment). Any inference to other populations is then justified on the basis of background information (the groups I am interested in are similar to the groups in the study, so maybe the results there apply here too) and not by random sampling. It is important to describe how the units/subjects were selected and let the reader determine how applicable the results are to other populations.
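To make the re-randomization idea concrete, here is a minimal sketch of a two-sided permutation test for a difference in group means. It is written in Python purely for illustration; the function name permutation_test, the made-up measurements, and the choice of a Monte Carlo approximation (random reshuffles instead of enumerating every possible assignment) are all my assumptions and not part of any particular package.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_resamples=10_000, seed=0):
    """Two-sided Monte Carlo permutation test for a difference in means.

    Hypothetical helper for illustration: it reshuffles the group labels
    at random rather than enumerating every possible assignment.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = mean(group_a) - mean(group_b)

    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # one alternative random assignment to groups
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        # Small tolerance so floating-point noise does not hide exact ties.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1

    # Count the observed assignment itself, so the estimate is never zero.
    p_value = (extreme + 1) / (n_resamples + 1)
    return observed, p_value


# Made-up measurements for two randomly assigned groups (assumption).
treatment = [12.1, 14.3, 13.8, 15.2, 14.9, 13.5]
control = [11.0, 12.4, 11.8, 12.9, 10.7, 12.2]

observed, p_value = permutation_test(treatment, control)
print(f"observed difference in means: {observed:.3f}")
print(f"approximate permutation p-value: {p_value:.4f}")
```

The only randomness modeled here is the assignment of the units at hand to the two groups, which is the randomness that actually occurred in a randomized study. With groups this small, one could instead enumerate all possible assignments and obtain the exact p-value.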
When data are not collected by random sampling, the t-test can still be justified in one of two ways: as an approximation to the permutation test (but, as Doug would say, why approximate when you can use the computer to do the real thing?), or by ASSUMING a normal model for the data outright, rather than deriving it from the central limit theorem and a random sampling process that did not occur.
I would be very interested if readers of this message could send me specific references to the use of a t-test with real data in an introductory textbook in which the individual objects were genuinely sampled at random from populations.
-Bret
