David Cross wrote:

> There is a standard test for comparing variances from two independent
> samples, and it is discussed in most intro stat texts.  The test statistic
> has an F-distribution, and degrees of freedom are what you would expect
> for the sample variances.


    No! No! Not the F test! _Anything_ but the F test! _Especially_ when -as
Yorgi says- the distributions are far from the same shape.  The F test is
infamous for its nonrobustness against deviations from normality.

    Levene's test, and some other variations on the same theme, give
somewhat more robust comparisons of spread. (Levene's test is a two-sample t
test applied to absolute deviations from the group mean. Brown & Forsythe's
test is similar but done, perhaps more logically, on the absolute deviations
from the group median. [The mean is the point from which total squared
deviation is least, the median that from which total absolute deviation is
least.])
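
    A rough sketch of the two tests just described, for the two-sample case
only. It assumes the ordinary pooled-variance t statistic and uses nothing
beyond the Python standard library; the names pooled_t, abs_dev and spread_t
are made up for illustration. The statistic would then be referred to a t
table with the returned degrees of freedom.

```python
from math import sqrt
from statistics import mean, median, variance


def pooled_t(x, y):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    # Pooled estimate of the common variance of the two groups.
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / sqrt(sp2 * (1 / nx + 1 / ny))
    return t, df


def abs_dev(sample, center):
    """Absolute deviations of a sample from its own center."""
    c = center(sample)
    return [abs(v - c) for v in sample]


def spread_t(x, y, center=mean):
    """t test on absolute deviations.

    center=mean   gives the Levene-style comparison,
    center=median gives the Brown & Forsythe-style comparison.
    """
    return pooled_t(abs_dev(x, center), abs_dev(y, center))


# Made-up data: the second sample is the first one doubled.
a = [1, 2, 3, 4, 5]
b = [2, 4, 6, 8, 10]
t_mean, df_mean = spread_t(a, b, center=mean)
t_median, df_median = spread_t(a, b, center=median)
print(t_mean, df_mean)
print(t_median, df_median)
```

    A negative statistic here means the first sample has the smaller average
absolute deviation.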

    These in turn involve some odd assumptions. In particular, deviations in
general are far from normally distributed. My experience is that, if the
distributions _are_ normal, a square root transformation symmetrizes the
absolute deviations rather nicely;  I've done some informal simulations
suggesting (to me, anyway) that a t test on the root-absolute-deviations
from either the mean or the median may have good properties. Another
interesting option is nonparametric tests; I think Lehmann mentions some
options in _Nonparametrics_ (1975).
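
    The symmetrizing claim is easy to check informally. The sketch below is
not the simulation referred to above, just one possible version of it: it
assumes a single large standard normal sample, a fixed seed, and a simple
moment-based skewness (the helper name skewness is made up). The absolute
deviations should come out clearly right-skewed and their square roots much
closer to symmetric.

```python
import random
from math import sqrt
from statistics import mean


def skewness(values):
    """Moment-based sample skewness: m3 / m2**1.5."""
    m = mean(values)
    m2 = mean([(v - m) ** 2 for v in values])
    m3 = mean([(v - m) ** 3 for v in values])
    return m3 / m2 ** 1.5


rng = random.Random(0)  # fixed seed so the run is repeatable
sample = [rng.gauss(0.0, 1.0) for _ in range(20000)]

centre = mean(sample)
abs_devs = [abs(v - centre) for v in sample]
root_abs_devs = [sqrt(d) for d in abs_devs]

skew_abs = skewness(abs_devs)
skew_root = skewness(root_abs_devs)
print("skewness of |deviation|      :", skew_abs)
print("skewness of sqrt(|deviation|):", skew_root)
```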

    However, when all is said and done, I would suggest that if two
populations have very different kurtoses, there is no canonical measure of
spread, in much the same way that if two populations have very different
skewnesses there is no canonical measure of location.  In each case, a very
tail-sensitive measure [range, midrange] will do one thing, a robust measure
[interquartile range, median] another, and a "sum-of-squares" measure
[standard deviation, mean] something in between.  Therefore, unless one has
a good reason to assign meaning to one measure of spread over another, it
may make sense to say simply "They are different shapes", provide graphics,
and leave it at that.
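
    A small illustration, with made-up numbers, of how the measures can
disagree when the shapes differ. Sample a is flat, sample b is peaked with
two far-out values. The quartiles use the default method of
statistics.quantiles, which is only one of several conventions.

```python
from statistics import quantiles, stdev

a = [-3, -2, -1, 0, 1, 2, 3]            # flat, short tails
b = [-6, -0.5, -0.3, 0, 0.3, 0.5, 6]    # peaked, long tails


def data_range(sample):
    """Tail-sensitive measure: largest minus smallest value."""
    return max(sample) - min(sample)


def iqr(sample):
    """Robust measure: third quartile minus first quartile."""
    q1, _, q3 = quantiles(sample, n=4)
    return q3 - q1


range_a, range_b = data_range(a), data_range(b)
iqr_a, iqr_b = iqr(a), iqr(b)
sd_a, sd_b = stdev(a), stdev(b)

print("range:", range_a, range_b)
print("IQR  :", iqr_a, iqr_b)
print("SD   :", sd_a, sd_b)
```

    On these numbers the range calls b the more spread out and the
interquartile range calls a the more spread out, so "which has more spread"
depends entirely on the measure chosen.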

    Also: Yorgi wrote

>                           However, because the first
> has much larger magnitude, it has larger variance.

    If it really is "because", this suggests a model in which variation
increases with value. Such a model may sometimes be more naturally examined
after an appropriate transformation; and after the transformation there
might be no difference in spread - or, possibly, even in shape. So it might
be worth examining the situation to see whether this is true.
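
    To make that concrete under one particular assumption: if spread is
simply proportional to magnitude, a log transformation removes the
difference. The log is my choice for illustration only; the right
transformation depends on the data. In the made-up example below the second
sample is the first multiplied by 100.

```python
from math import log
from statistics import stdev

x = [1.2, 0.8, 1.5, 1.0, 0.9, 1.3]
y = [100 * v for v in x]   # same shape, much larger magnitude

# On the raw scale the larger-magnitude sample has the larger spread.
sd_ratio_raw = stdev(y) / stdev(x)

# On the log scale the multiplier becomes an additive shift,
# which does not affect the standard deviation.
sd_log_x = stdev([log(v) for v in x])
sd_log_y = stdev([log(v) for v in y])

print("raw SD ratio:", sd_ratio_raw)
print("log SDs     :", sd_log_x, sd_log_y)
```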

    -Robert Dawson



