I have enjoyed the comments I read on this.

I want to point to a couple of additional conclusions that
are possible concerning these summaries of raw data -

> On Sun, 13 Oct 2002, Maleck Kcelam wrote:
> 
> > Dear sir or madam, I was making an experiment and I have a small
> > problem to write my final results. I've counted grain sizes in a metal
> > sample using a specific software and I obtained the following data:

On 13 Oct 2002 14:43:16 -0700, [EMAIL PROTECTED] (Donald Burrill) wrote:
 - I am citing DB, for his improvement of the text -
>   [re-formatted for compactness -- DFB]
> 
>  measure    mean  variance  st.dev.   C.V.   min.   max.     N
> >   1        3.6      22.3      4.7   1.32    1.1   88.5  4376
> >   2        4.3      18.3      4.3   0.98    1.5   96.9  4151
> 

1)  mean = 3.6,  max = 88.5,  SD = 4.72.
  The z-score of the max is 18 -- an extreme I've
seldom seen.   Since the "total variance"  of
z-scores, referring to the "total sum of squares
around the mean",  is equal to the DF = 4375,
the single max value accounts for 18^2 = 324,
or 324/4375 = 7.4%  of the variance.

2)  mean = 4.3,  max = 96.9,  SD = 4.28.
  The z-score of the max is 21.6 -- even more extreme.
Here, z-squared is 468;   and   468/4150 = 11.3%  of
the variance.
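
The arithmetic in 1) and 2) can be checked with a few lines of
Python.  This is only a sketch from the rounded summary figures in the
table; the function name is made up for illustration, and the SD is
taken as the square root of the tabled variance.

```python
import math

def max_share_of_ss(mean, variance, maximum, n):
    """Return (z, share): the z-score of the maximum and the fraction
    of the total sum of squares that this one value accounts for."""
    sd = math.sqrt(variance)
    z = (maximum - mean) / sd
    # The squared z-scores of a sample sum to the degrees of freedom, n - 1.
    share = z ** 2 / (n - 1)
    return z, share

# Summary figures copied from the table: (measure, mean, variance, max, N)
for label, mean, var, mx, n in [(1, 3.6, 22.3, 88.5, 4376),
                                (2, 4.3, 18.3, 96.9, 4151)]:
    z, share = max_share_of_ss(mean, var, mx, n)
    print(f"measure {label}: z = {z:.1f}, share of SS = {share:.1%}")
```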

On the original scales, each sample has at least
*one*  huge outlier.  By the way, there can't be much
more than a dozen scores that extreme (fewer in the
second sample), because the total SS has to add up:
each one uses up 7% to 11% of it.  If the original
scaling is interesting, one question would be:  How small
are the mean and SD  of the rest, once you decide to
trim-and-describe a handful (how many?)  of outliers?
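
Both points can be explored from the summary figures alone.  The
sketch below (function names are hypothetical, and the inputs are the
rounded table values, so the results are approximate) bounds how many
scores could sit at a given z, and recomputes the mean and SD after
dropping just the single maximum.

```python
import math

def max_count_at_z(z, n):
    """Largest number of scores that could sit at |z| before their
    squared z-scores alone exceed the total, n - 1."""
    return math.floor((n - 1) / z ** 2)

def drop_one(mean, variance, n, x):
    """Mean and SD of the remaining n - 1 values after removing x."""
    new_mean = (n * mean - x) / (n - 1)
    ss = (n - 1) * variance                       # SS around the old mean
    new_ss = ss - (x - mean) ** 2 * n / (n - 1)   # SS around the new mean
    return new_mean, math.sqrt(new_ss / (n - 2))

# (measure, mean, variance, max, N, rounded z of the max)
for label, mean, var, mx, n, z in [(1, 3.6, 22.3, 88.5, 4376, 18.0),
                                   (2, 4.3, 18.3, 96.9, 4151, 21.6)]:
    m, sd = drop_one(mean, var, n, mx)
    print(f"measure {label}: at most {max_count_at_z(z, n)} scores at z = {z}; "
          f"without the max, mean = {m:.2f}, SD = {sd:.2f}")
```

Dropping only the one maximum lowers the SDs to roughly 4.5 and 4.0
while the means barely move, so a single value does not explain all of
the spread.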


Further.  The log-transform also will leave big outliers, since
the median is *not*  midway between the min and max
after transformation.  Symmetry would take medians of about
9.0, the geometric mean of 1  and 81 (say).

Since the two means are about 4, and the skews are 
extreme, the medians must be even smaller than 4.
It might be that the medians are small enough that the
*reciprocal*  transformation  will yield symmetry.  
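
As a rough check on which transformation could work, one can ask what
median would sit midway between the min and max on each scale.  The
sketch below does that for the raw, log, and reciprocal scales, using
the tabled min and max; the helper name is made up, and "midway
between the extremes" is only a crude, necessary-not-sufficient
stand-in for symmetry.

```python
import math

def symmetric_midpoint(lo, hi, forward, inverse):
    """Value whose transform lies midway between the transformed
    minimum and maximum, mapped back to the original scale."""
    return inverse((forward(lo) + forward(hi)) / 2)

transforms = {
    "raw": (lambda x: x, lambda y: y),
    "log": (math.log, math.exp),                       # gives the geometric mean
    "reciprocal": (lambda x: 1 / x, lambda y: 1 / y),  # gives the harmonic mean
}

# (measure, min, max) copied from the table
for label, lo, hi in [(1, 1.1, 88.5), (2, 1.5, 96.9)]:
    for name, (f, g) in transforms.items():
        mid = symmetric_midpoint(lo, hi, f, g)
        print(f"measure {label}, {name}: median would need to be near {mid:.1f}")
```

With these figures the required medians come out near 45 and 49 on the
raw scale, about 10 and 12 on the log scale, and about 2.2 and 3.0 on
the reciprocal scale; only the last pair is below the observed means
of about 4.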

Does any transformation make sense?  What is
the purpose?  You can't use least-squares test statistics
while you have huge outliers.  You can't regard
the raw mean as an indicator  of "central tendency",
if that was your intention.  One useful comment on the
distribution might be the difference (whatever it is)
between the mean and the median.

But you can still look at the Mean as a "parameter"  if 
there is a distribution that it might usefully index.  


-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html