On Thu, 6 Apr 2000 [EMAIL PROTECTED] wrote:

> In response to Donald Burrill I would like to clarify what I said in my
> original post.  I have a vector A with 102 elements.  The mean of these
> elements is 50.  These elements all are close to 50.  The variance is
> ~90 which is small given the magnitude of the mean.  The maximum and
> minimum values are also fairly close to 50. 

If A is approximately Gaussian, with N = 102, one would not be surprised 
if the maximum and minimum values were about 2 standard deviations from 
the mean -- that is, about 50 plus or minus about 20, or from 30 to 70. 
(Does this agree with what you mean by "fairly close to 50"?)
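
For anyone who wants to check that rough range, here is a small simulation 
sketch.  It assumes A really is Gaussian with mean 50 and variance 90, which 
the original post does not establish;  the function name, the number of 
trials, and the seed are arbitrary choices made for illustration.

```python
import random
import statistics

def typical_extremes(n=102, mean=50.0, sd=90 ** 0.5, trials=2000, seed=0):
    """Average the sample minimum and maximum over many simulated Gaussian samples."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same numbers
    mins, maxs = [], []
    for _ in range(trials):
        # One hypothetical "vector A": n independent Gaussian draws
        sample = [rng.gauss(mean, sd) for _ in range(n)]
        mins.append(min(sample))
        maxs.append(max(sample))
    return statistics.mean(mins), statistics.mean(maxs)

avg_min, avg_max = typical_extremes()
print(f"average minimum: {avg_min:.1f}, average maximum: {avg_max:.1f}")
```

Under these assumptions the extremes should land roughly two to three 
standard deviations from the mean, so a range of something like 30 to 70 
(or a little wider) would be unremarkable.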

> There are not any clearly outlying observations.  The distribution is 
> symmetric about the mean and tight to the mean as well.  
> Clearly, it is in some sense near to the mean and not very variable. 

This may be clear to you;  it is not so clear to me.  Perhaps this is a 
function of the graphical display you mention below.

> The other vector B has same n=102.  It is from a different population 
> ...  Vector B has mean 0.3.  It has variance that is 9. 
> The distribution is symmetric (more or less) about the mean but the 
> values are very spread out and not tight around the mean. 
> Clearly the _variance_ of vector A is larger than the variance of B 
> BUT I think I am right to say that A is less variable than B.
>    My decision to say this is based on their graphical representations.

This would seem to imply that the two distributions were displayed in 
separate graphical representations, and to two quite different scales. 
If they had been displayed to the same scale (which might be unreasonable 
to attempt without a VERY large piece of paper, given the disparity in 
means and variances) vector B would have appeared much more densely 
packed (i.e., visually less variable) than vector A.  After all, B's 
standard deviation is 3, about one-third that of A. 

On the other hand, in some disciplines it is not uncommon to refer to the 
"coefficient of variation" (CV) as a measure of dispersion.  (I never 
used it much myself, because I could not justify it as intelligible, let 
alone interpretable, in social-science contexts.)  It is defined thus:

        CV = s/m

where s = standard deviation and m = mean.  For your data, 
 vector A has a CV of about 9.5/50, or about 0.2;
 vector B has a CV of 3/0.3, or about 10.

In this sense, it would be entirely appropriate to say that the A data 
are less variable than the B data, and by nearly two orders of magnitude. 
        What I don't know, of course, is whether the CV is an appropriate 
measure of dispersion for the kinds of data you have.  Notice that it is 
a very difficult thing to deal with if the mean of a distribution is in 
any danger of being very near to zero:  as the mean approaches zero the CV 
grows without bound, and if the mean is exactly zero the CV is undefined. 
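
The arithmetic above can be reproduced with a few lines of Python.  This is 
only a sketch working from the summary figures quoted in the post (variance 
90 and mean 50 for A, variance 9 and mean 0.3 for B), not from the raw 
data;  the function name is made up for illustration, and returning None 
for a zero mean is just one possible way of flagging that case.

```python
import math

def coefficient_of_variation(variance, mean):
    """Return s/m, where s = sqrt(variance); None when the mean is zero."""
    if mean == 0:
        return None  # the ratio s/m cannot be computed at a zero mean
    return math.sqrt(variance) / mean

cv_a = coefficient_of_variation(90, 50)   # vector A: about 9.5/50
cv_b = coefficient_of_variation(9, 0.3)   # vector B: 3/0.3
ratio = cv_b / cv_a                       # how many times larger B's CV is

print(f"CV(A) = {cv_a:.2f}, CV(B) = {cv_b:.2f}, ratio = {ratio:.1f}")
print(f"orders of magnitude: {math.log10(ratio):.2f}")
```

The ratio of the two CVs comes out at roughly 50, which is where "nearly 
two orders of magnitude" comes from.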

> Maybe it is wrong to think there is a way to quantify this. 

No:  trust your intuition.  Most things people notice _can_ be 
quantified, but it isn't always clear how best to do so, and one may 
encounter disagreements on the propriety of various ways of quantifying 
something.  That doesn't mean that it shouldn't be done, nor that it 
isn't worth doing:  it may mean that whatever one is trying to quantify 
is something really interesting.
        But one thing you have to address is what it _means_ if the 
variability of data set A is less than that of data set B.  What 
consequences ensue from such a declaration?  And what consequences ensue 
if the declaration should turn out to be wrong?  (As a result of further 
research, perhaps.)  Presumably such a result has _some_ meaning for you, 
or you would not have raised the question in the first place.

> Maybe this isn't very much an issue of statistics.

It's more, I think, an issue in the interface (or intersection) between 
statistics and your substantive discipline.  Most interesting questions 
are like that, I think.
                                -- DFB.
 ------------------------------------------------------------------------
 Donald F. Burrill                                 [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,          [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264                                 603-535-2597
 184 Nashua Road, Bedford, NH 03110                          603-471-7128  



