----- Original Message -----
From: Glen Barnett <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, August 30, 2000 7:45 PM
Subject: Re: Skewness and Kurtosis Questions

In reply to Ronny Richardson's question.

> There's several problems.
> (i) mean-median is measured in the units of the original data.
>  A skewness measure based on standardised third central moment
> (as is commonly used) is unit-free. Double all your numbers in a
> data set and you double "mean-median", but skewness is unchanged.
> (ii) there is not necessarily any relationship between the standardised
> third central moment measure of skewness and a (standardised)
> mean-median measure of skewness (e.g [mean-median]/std.dev).
> It is easy to construct data sets where the third-moment skewness
> measure has one sign while the mean-median skewness measure has
> the opposite sign.
----------------------------------
This is fine as long as we stick to the concept or idea of the existence of
a "standardized third central moment". When you look hard at the concept
from a historical aspect, you really have

 1)   sqrt(b1) = m3 / m2^(3/2)          (Pearson)
 (Here sqrt(b1) denotes the square root of b1.)
 2)   b1 = m3^2 / m2^3                  (Pearson)
 3)   g1 = k3 / k2^(3/2)                (Fisher)

where m2 is the second moment, m3 is the third moment, k2 is the second
semi-invariant and k3 is the third semi-invariant. These are the three basic
accepted concepts of a standardized third central moment.
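
As a rough illustration of the three forms above, here is a minimal Python
sketch. It assumes m2 and m3 are the central sample moments with divisor n,
and that k2 and k3 are Fisher's k-statistics (the unbiased estimates of the
second and third semi-invariants). The function names and the sample are made
up for this example.

```python
def central_moment(data, r):
    """r-th central sample moment with divisor n (the 'biased' form)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** r for x in data) / n

def skewness_forms(data):
    """Return (sqrt_b1, b1, g1) for a sample with n >= 3 and non-zero spread."""
    n = len(data)
    m2 = central_moment(data, 2)
    m3 = central_moment(data, 3)
    # Assumption: k2 and k3 are Fisher's k-statistics.
    k2 = n / (n - 1) * m2
    k3 = n ** 2 / ((n - 1) * (n - 2)) * m3
    sqrt_b1 = m3 / m2 ** 1.5   # form 1
    b1 = m3 ** 2 / m2 ** 3     # form 2
    g1 = k3 / k2 ** 1.5        # form 3
    return sqrt_b1, b1, g1

sample = [1, 2, 2, 3, 3, 3, 4, 10]   # made-up data
sqrt_b1, b1, g1 = skewness_forms(sample)
print(f"sqrt(b1) = {sqrt_b1:.4f}, b1 = {b1:.4f}, g1 = {g1:.4f}")
```

Under these assumptions b1 is simply the square of sqrt(b1), and g1 equals
sqrt(b1) multiplied by sqrt(n(n-1))/(n-2).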

If you now consider implementation, that is, inserting numerical values, you
have nine primary numerical values of skewness as follows:

A. sqrt(b1), biased/biased
B. sqrt(b1), biased/unbiased
C. sqrt(b1), unbiased/biased
D. sqrt(b1), unbiased/unbiased
E. g1, unbiased/unbiased
F. b1, biased/biased
G. b1, biased/unbiased
H. b1, unbiased/biased
I. b1, unbiased/unbiased

A is the correct sqrt(b1) value. E is the correct g1 value. F is the correct b1
value. B is a variation that arises when the software program uses the
unbiased standard deviation to calculate the denominator. These variations
come from use of either the biased or unbiased estimates of the moments.
Fisher was the only one who had his-head-on-right here.

Be aware that most of the canned statistical software programs never tell
you which one of the nine they use to give a skewness value.
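
To see how much the choice can matter, here is a hedged sketch comparing
variant A with variant B. Reading "biased/unbiased" as numerator/denominator
(third moment with divisor n, over the cube of the divisor-n or divisor-(n-1)
standard deviation) is an assumption here, and the helper name and data are
hypothetical.

```python
import statistics

def sqrt_b1_variants(data):
    """Return (A, B): the divisor-n third moment over the cube of the
    divisor-n standard deviation (A) or the divisor-(n-1) one (B)."""
    n = len(data)
    mean = statistics.fmean(data)
    m3 = sum((x - mean) ** 3 for x in data) / n
    sd_biased = statistics.pstdev(data)    # divisor n
    sd_unbiased = statistics.stdev(data)   # divisor n - 1
    return m3 / sd_biased ** 3, m3 / sd_unbiased ** 3

sample = [1, 2, 2, 3, 3, 3, 4, 10]   # made-up data
a, b = sqrt_b1_variants(sample)
print(f"A = {a:.4f}, B = {b:.4f}, B/A = {b / a:.4f}")
```

The two values differ by the factor ((n-1)/n)^(3/2), about 0.82 for n = 8, so
a small textbook example can easily disagree with a program's output.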

Also be aware that most texts and articles in the literature ignore the
biased/unbiased aspects. This is a major stumbling block when someone tries
to compare a textbook example to the output of a canned stat software
program. Note too that it took Jöreskog 7 years to fix his LISREL
package to compute correct skewness values.

Barnett then goes on...

> > Now, if I delete the two 150's on the end of data set #1 and change the
> > ranges on the formulae, I get a mean of 7.28 and I still get a median of 0.
> > Again, the mean is larger than the median so this should be positively
> > skewed but Excel returns a value of -0.370.
>
> It looks like you've constructed just such an example as I mentioned.
>
> > I have verified Excel's calculations manually and they appear to be correct
> > so it would appear that the commonly used statement that:
> >
> > mean > median: positive, or right-skewness
> > mean = median: symmetry, or zero-skewness
> > mean < median: negative, or left-skewness
> >
> > is incorrect, or, am I overlooking something?
>
>It is correct if you measure skewness in terms of mean-median. If you
>measure it some other way, it is no longer true.  Note in particular
>that zero third central moment does not imply symmetry (contrary
>to what some books assert).

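If you use form 1) or form 3), complete symmetry gives a value of zero
(though, as Glen notes, a zero value does not by itself prove symmetry).
If you use form 2), complete symmetry also gives zero; because b1 is a
squared ratio it can never be negative, so any asymmetry shows up as a
positive value and the direction of the skew is lost.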
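
The two situations discussed in the exchange above can be reproduced with
small made-up samples (these are not Ronny's data sets). This is only a
sketch, using the third central moment with divisor n; the function name is
hypothetical.

```python
import statistics

def third_central_moment(data):
    """Third central sample moment with divisor n."""
    mean = statistics.fmean(data)
    return sum((x - mean) ** 3 for x in data) / len(data)

# Made-up sample: mean above median, yet a negative third moment.
left_outlier = [-5, 0, 0, 0, 0, 0, 2, 2, 2, 2]
print(statistics.fmean(left_outlier), statistics.median(left_outlier))
print(third_central_moment(left_outlier))

# Made-up sample: clearly asymmetric, yet the third moment is zero.
lopsided = [-2, -2, -2, -2, 1, 1, 1, 1, 1, 3]
print(statistics.fmean(lopsided), statistics.median(lopsided))
print(third_central_moment(lopsided))
```

In the first sample a single low value dominates the cubed deviations even
though the mean sits above the median. In the second, the cubed deviations
cancel although the sample is not symmetric about its mean.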

On page 370 of Pearson's 1895 article, he says "It seems to me better to
take as the measure of skewness the ratio of the distance between the
maximum ordinate and the centroid to the length of the swing radius of the
curve about the centroid vertical." The quantity d/sqrt(m2) represents his
statement in symbolic form. Given that we have all kinds of interpretations
of what Pearson meant by this, we have (mean-mode)/sigma or
(mean-median)/sigma in the literature referred to as Pearson skewness
coefficients.

 Skewness Coefficient = [3(mean) - mode] / Sample Standard Deviation
 Skewness Coefficient = [3(mean) - median] / Sample Standard Deviation
 Skewness Coefficient = 3[mean - mode] / Sample Standard Deviation
 Skewness Coefficient = 3[mean - median] / Sample Standard Deviation
 Skewness Coefficient = [mean - mode] / Sample Standard Deviation
 Skewness Coefficient = [mean - median] / Sample Standard Deviation

All of these appear as rule-of-thumb measures of a standardized skewness
value. Take your pick. In any case, you would have to use Pearson's form of
the standard deviation, which is the biased one (divisor n) and is referred
to as the "Sample Standard Deviation".
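
For the median-based versions, a minimal sketch might look like the following.
It uses the divisor-n standard deviation as described above; the function
name, the `factor` parameter and the data are made up for illustration.

```python
import statistics

def pearson_median_skewness(data, factor=3):
    """factor * (mean - median) / sigma, with sigma the divisor-n
    ("biased") standard deviation."""
    mean = statistics.fmean(data)
    median = statistics.median(data)
    sigma = statistics.pstdev(data)
    return factor * (mean - median) / sigma

sample = [1, 2, 2, 3, 3, 3, 4, 10]   # made-up data
print(pearson_median_skewness(sample))            # 3(mean - median)/sigma
print(pearson_median_skewness(sample, factor=1))  # (mean - median)/sigma
```

Because the difference is divided by sigma, doubling every value in the sample
leaves the coefficient unchanged, unlike the raw mean-median difference.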

> > Excel, and another reference I looked at, state that "The peakeness of a
> > distribution is measured by its kurtosis. Positive kurtosis indicates a
> > relatively peaked distribution. Negative kurtosis indicates a relatively
> > flat distribution."
>
> These are relative to a normal distribution.
>
> This statement is also wrong (as pointed out in Kendall and Stuart). Kurtosis
> (as measured by standardized fourth central moment, sometimes with 3
> subtracted, as would have been intended by the above reference) is a
> *combination* of peakedness and heavy-tailedness; more specifically it is a
> tendency to vary away from the mean +/- 1 std. deviation.
>
> >
> > If that is the case, what does it mean that data set #1 above has a
> > kurtosis value of zero?
>
> It is supposedly of similar peakedness and heavy-tailedness as a normal
> distribution.
>
> >
> > I appreciate any comments you can supply.
> >
>
> Beware those books! If they get that wrong, what else have they not
> understood?
>
> Fortunately you have had the sense to verify these things for yourself
> rather than just accept what some book tells you.
>
> Kendall and Stuart Vol I may help to clear up some of these issues for you.
> (Advanced Theory of Statistics. Don't be put off by the title - it is quite
> readable; more so than many books with the word "Introduction" or
> "Introductory" in the title!)
>
> Glen
>
>

I won't tackle kurtosis here; perhaps at some other time.

My paper "THE USE AND MISUSE OF SKEWNESS AND KURTOSIS AS MEASURES OF
NON-NORMALITY" discusses these issues and gives the references.

DAH







>
>
> =================================================================
> Instructions for joining and leaving this list and remarks about
> the problem of INAPPROPRIATE MESSAGES are available at
>                   http://jse.stat.ncsu.edu/
> =================================================================



