The big problem with the preliminary F test is shared by the preliminary
Levene's test, preliminary tests for normality, and so on. It is this:
#####################################
# #
# They answer the wrong question. #
# #
#####################################
The question that you want the answer to is
"Is/Are the distribution[s] close enough to what we are assuming that
the test will be valid for this sample size?"
In some cases, the answer becomes "yes" for more distributions as n
increases; in others it is unchanged, or at least the incidence of trouble
levels off.
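One can put the "right question" to a simulation directly, for whatever
distribution and sample size are actually at hand. A rough Python sketch
(the exponential distribution, the sample sizes, and the replication count
below are arbitrary illustrative choices, nothing canonical):

# Estimate the actual Type I error rate of the ordinary two-sample t-test
# when both samples come from the same skewed (exponential) population.
# Distribution, sample sizes, and replication count are illustrative only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
reps = 20000

for n in (10, 30, 100):
    rejections = 0
    for _ in range(reps):
        x = rng.exponential(scale=1.0, size=n)  # both samples drawn from
        y = rng.exponential(scale=1.0, size=n)  # the same skewed population
        _, p = stats.ttest_ind(x, y)            # pooled-variance t-test
        rejections += int(p < 0.05)
    print(f"n = {n:4d}: estimated Type I error = {rejections / reps:.3f}")

However far those estimates sit from the nominal 0.05 answers the question
that matters, for the sample size actually in hand.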
I'm unaware of any sensible test that becomes more sensitive to
violations of assumption as n increases. [Tukey's "7-11" (which is based
only on the extreme tails of the samples) does do this. I don't consider
it to be a sensible test; it's a clever party trick.]
The question the preliminary test answers is
"Do the data provide evidence that the distributions fail at all to
satisfy the assumptions?"
And since, in the real world, null hypotheses are never quite true, these
tests say (as Hamlet didn't quite put it):
"Be thou as chaste as ice, as pure as snow, thou shalt not escape
asymptotic calumny: get thee to a nonparametric test, go!"
These tests are insensitive for small samples when the warning would
often be useful; they are too twitchy for large samples when the
violations of assumption often don't matter much. Somewhere, I suppose,
Goldilocks may find a middle-sized sample where it's just right...but
getting it right one time in three is pretty unambitious even for a
frequentist <grin, duck, & run>
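To put a number on "insensitive for small samples, too twitchy for large
ones", here is a companion sketch (again with arbitrary illustrative
choices: normal data, SDs of 1.0 and 1.25, and the sample sizes and
replication count below), estimating how often a preliminary Levene-type
test flags a modest variance difference:

# How often does a preliminary test for equal variances flag two normal
# samples whose SDs differ only modestly (1.0 vs 1.25)?  All numbers here
# are illustrative choices, not recommendations.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
reps = 5000

for n in (10, 50, 200, 1000):
    flags = 0
    for _ in range(reps):
        x = rng.normal(0.0, 1.00, size=n)
        y = rng.normal(0.0, 1.25, size=n)
        _, p = stats.levene(x, y)   # scipy's Levene test (median-centred by default)
        flags += int(p < 0.05)
    print(f"n = {n:5d}: flags unequal variances {flags / reps:.1%} of the time")

The equal-n t-test is barely troubled by a variance difference that modest,
yet the flag rate climbs toward certainty as n grows, even though the
violation itself never changes.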
-Robert Dawson