Re: help with statistic problem

Donald Burrill Sun, 13 Oct 2002 14:22:08 -0700

On Sun, 13 Oct 2002, Maleck Kcelam wrote:

> Dear sir or madam, I was making an experiment and I have a small
> problem to write my final results. I've counted grain sizes in a metal
> sample using a specific software and I obtained the following data:
  [re-formatted for compactness -- DFB]


 measure    mean  variance  st.dev.   C.V.   min.   max.     N
>  1         3.6    22.3     4.7     1.32    1.1    88.5   4376
>  2         4.3    18.3     4.3     0.98    1.5    96.9   4151

> So, basically, I have three questions:
>
> 1)  Usually I would have written (average +/- standard deviation) but
> the standard deviation is greater than my average!  How can I
> represent my final result, or what should I say, given that I did
> several other measurements and they are much the same?

Evidently the distribution of your measurements is highly skewed to the
right (maximum about 90, minimum about 1 to 1.5, mean about 4).  It
follows that the sample mean cannot be trusted as a measure of a typical
value, and that the sample variance is greatly inflated by the few very
large values;  which is why the st.dev. is about as large as the average
(measure 2) or larger (measure 1).

Display your distributions, either as dotplots or frequency tables (or
possibly stem-&-leaf diagrams, but these are likely to be cumbersome with
such large sample sizes).  Investigate the very large values:  are they
real?  Do they belong with the bulk of your observations?  If "yes" to
both questions, consider
  (a) reporting order statistics instead of average and s.d. (median,
quartiles, possibly 10th and 90th percentiles) so as to show the shape of
the distribution you're trying to summarize.  If you like to report
results in diagrams, a pair of box plots, one for each data set, would be
informative to your readers;
  (b) taking logarithms of your data values, displaying those
distributions, and using them if they're (nearly) symmetrical.  In that
case the mean & s.d. of the log values would be reasonable summary
statistics, and the antilog of the mean logarithm is the geometric mean
of the original data.
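Options (a) and (b) could be sketched in Python roughly as follows.  This
is only an illustration under stated assumptions:  the raw grain sizes are
not given in this thread, so `grain_sizes` is synthetic right-skewed data,
and the function names `order_statistics` and `log_summary` are invented
for the example.

```python
import math
import random
import statistics


def order_statistics(values):
    """Median, quartiles, and 10th/90th percentiles of a sample."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    deciles = statistics.quantiles(values, n=10)  # 9 cut points
    return {"p10": deciles[0], "q1": q1, "median": median,
            "q3": q3, "p90": deciles[-1]}


def log_summary(values):
    """Mean and s.d. of the natural logs, plus the geometric mean."""
    logs = [math.log(v) for v in values]  # requires every value > 0
    mean_log = statistics.fmean(logs)
    return {"mean_log": mean_log,
            "sd_log": statistics.stdev(logs),
            "geometric_mean": math.exp(mean_log)}


# Synthetic stand-in for the real measurements (an assumption,
# not the original poster's data).
random.seed(0)
grain_sizes = [random.lognormvariate(1.0, 0.8) for _ in range(4000)]

print(order_statistics(grain_sizes))
print(log_summary(grain_sizes))
```

Whether the log values are symmetric enough to justify (b) still has to be
judged from a display of their distribution;  the code does not decide that.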

> 2) Can I express my result as (average +/- coefficient of variation)?

This does not appear to make any intuitive sense.  Why would one?

> 3) I need to represent these measurements as one single number, so how
> would I combine the two measurements?  Which formula should I use?

If you report order statistics, all you can do is combine the two data
sets into one (N = 8527) and find the median etc. of the combined data.
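A small sketch of why the raw data must be combined:  the median of the
pooled observations is in general not recoverable from the two separate
medians.  The two short lists below are made-up numbers chosen only to
show the effect, and `combined_median` is a hypothetical helper name.

```python
import statistics


def combined_median(sample_a, sample_b):
    """Median of the two samples merged into one data set."""
    return statistics.median(list(sample_a) + list(sample_b))


# Made-up values, not the grain-size data.
a = [1, 2, 3]
b = [10, 20, 30, 40, 50]

print(combined_median(a, b))                               # 15.0
print((statistics.median(a) + statistics.median(b)) / 2)   # 16.0
```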

If logarithms give you reasonable distributions, the usual "single number"
would be the mean of the combined sample, i.e. the N-weighted mean of the
two averages (of the log values, in that case):
  (N1*(average1) + N2*(average2)) / (N1 + N2).
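The formula above can be sketched in Python as follows.  The function name
`combined_mean` is invented for the example.  The numbers plugged in are
the raw-scale means and Ns from the table, used here only to show the
arithmetic;  if the log transformation is adopted, the averages of the log
values would go in their place.

```python
def combined_mean(n1, mean1, n2, mean2):
    """Mean of two samples merged into one, from their sizes and means."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)


# Raw-scale values from the table, for illustration only.
overall = combined_mean(4376, 3.6, 4151, 4.3)
print(round(overall, 2))  # 3.94
```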

>       (mean average of  MEASURE 1+ MEASURE 2) +/- (????)

The "+/- (????)" does not look to me like a "single number";  it looks
more like two numbers, an average of some kind +/- a measure of
uncertainty.

If you end up using means and s.d.s (either because you cast out the very
high values as not properly belonging to your data, or because you used a
transformation (e.g., log) that made the distribution nearly symmetric),
it might be appropriate to use a pooled standard deviation:

   pooled variance = (var1*(N1-1) + var2*(N2-1)) / (N1 + N2 - 2),
   pooled s.d. = square root of pooled variance.

Not everyone would agree that this is appropriate, however;  and knowing
nothing about the field in which you're operating, I cannot offer a
useful opinion on this point.
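For reference, the pooled variance and pooled s.d. formulas given above
could be computed like this.  The function names are invented for the
example, and the raw-scale variances from the table are plugged in only to
show the arithmetic;  as noted, means and s.d.s on the raw scale may not be
the right summary for these data.

```python
import math


def pooled_variance(n1, var1, n2, var2):
    """Pooled variance of two samples from their sizes and variances."""
    return (var1 * (n1 - 1) + var2 * (n2 - 1)) / (n1 + n2 - 2)


def pooled_sd(n1, var1, n2, var2):
    """Square root of the pooled variance."""
    return math.sqrt(pooled_variance(n1, var1, n2, var2))


# Raw-scale values from the table, for illustration only.
print(round(pooled_variance(4376, 22.3, 4151, 18.3), 2))  # 20.35
print(round(pooled_sd(4376, 22.3, 4151, 18.3), 2))        # 4.51
```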

> Could you possibly help me with this problem? Thank you in advance and
> I hope to hearing from you soon.

 -----------------------------------------------------------------------
 Donald F. Burrill                                            [EMAIL PROTECTED]
 56 Sebbins Pond Drive, Bedford, NH 03110                 (603) 626-0816
 [Old address:  184 Nashua Road, Bedford, NH 03110       (603) 471-7128]

=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at:
.                  http://jse.stat.ncsu.edu/                    .
=================================================================
