- I am sorting and compressing the data-listing and question -
On 3 Nov 2000 12:23:07 -0800, [EMAIL PROTECTED] (Richard Lehman)
wrote:

> >A statistics question.
> >
> >Temperatures taken from different portions of a stream:
> >
> >Portion 1
 ( 15.8, 16.9, 17, 17.1, 18, 18.7 ) mean = 17.25, variance = 0.995
> >
> >Portion 2
 (18.3, 18.5 ) mean = 18.4,  variance = 0.02

 - That is not a very *precise*  estimate of the variance; it
is based on a difference of 0.2, which is only 2 units of the
smallest increment of measurement.  More precision might have the
difference 50% smaller or larger, and the variance moves with the
square of that difference.
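
A rough sketch of that point, assuming (my assumption, not something
the tester stated) that each reading was rounded to the nearest 0.1
degree.  For two readings the sample variance is just d^2/2, where d
is their difference, so rounding error in d carries straight through.
The function names are made up for illustration.

```python
def two_point_variance(a, b):
    """Sample variance (n - 1 denominator) of two readings: (a - b)^2 / 2."""
    return (a - b) ** 2 / 2


def variance_range(a, b, step=0.1):
    """Smallest and largest variance consistent with readings rounded to `step`.

    Assumes each true value lies within step/2 of its reading, so the true
    difference lies within `step` of the recorded difference.
    """
    d = abs(a - b)
    d_min = max(d - step, 0.0)
    d_max = d + step
    return d_min ** 2 / 2, d_max ** 2 / 2


recorded = two_point_variance(18.3, 18.5)
low, high = variance_range(18.3, 18.5)
print(f"recorded {recorded:.3f}, plausible range {low:.3f} to {high:.3f}")
```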

> >
> >Do these portions have different temperatures?
> >
> >Obviously the variances are unequal and a 2-sample [unequal variance]
> 
>    t = 2.74 w/ 5 df p = 0.037.
> 
> >No problem.
> 
> 
> >But (and this is where I am perplexed), a pooled [equal variance]
> 
>    t = 1.54 w/ 6 df p = 0.17.
> 
> >Why is the less conservative pooled t giving a lower t-value?  Are the
> >variances so unequal (and the one so close to zero) that the formula is
> >messed up?

Despite what you may have read, there is hardly any difference in 
their conservatism, especially when compared to their main
distinction.  

Which is:  
If the small group has the bigger variance, then the better chance
(Q1, say) of rejecting the null lies with "pooling" the variances;
and if it has the smaller variance, then the better chance (Q2) lies
with not pooling.  Your data are the second case: the group of 2 has
the variance of 0.02, so the unpooled t comes out larger.  In
conditions that I once simulated, where it was appropriate to have a
power-transformation, I found Q1 to be 95% larger than it should be
for the one-tailed, 5% test, whereas Q2 was only 90% too large.  So,
both tests were rotten, but the Satterthwaite test was "more
conservative" as others have said.
For various reasons, I stick with the pooled test, most of the time; 
I try to transform the data if a power transformation works, and 
the pooled test works right with dichotomous or scale data where 
the variances are artifacts of the means.
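
To see where the two t-values come from, here is a minimal sketch
using only the Python standard library and the readings as listed in
the question.  The only thing that differs between the two tests is
the standard error in the denominator (and the df); the function
names are mine, for illustration.

```python
import math
from statistics import mean, variance

portion1 = [15.8, 16.9, 17.0, 17.1, 18.0, 18.7]
portion2 = [18.3, 18.5]


def pooled_t(x, y):
    """Equal-variance t: one pooled variance, weighted by each group's df."""
    n1, n2 = len(x), len(y)
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean(y) - mean(x)) / se, n1 + n2 - 2


def welch_t(x, y):
    """Unequal-variance t with Satterthwaite's approximate df."""
    n1, n2 = len(x), len(y)
    a, b = variance(x) / n1, variance(y) / n2
    se = math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return (mean(y) - mean(x)) / se, df


t_p, df_p = pooled_t(portion1, portion2)
t_w, df_w = welch_t(portion1, portion2)
print(f"pooled: t = {t_p:.2f}, df = {df_p}")
print(f"Welch:  t = {t_w:.2f}, df = {df_w:.1f}")
```

Pooling applies the large variance of the 6-point group to the
2-point group as well, and dividing by n = 2 then inflates the
standard error; without pooling, the 2-point group contributes only
its own tiny variance.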

How to test here?  If there were only two measures taken  *because*
the tester knew that one stream would be much less variable, then the
non-pooled test is okay.  That is, you can use the test if you are
willing to concede, beforehand, that there is a difference in
variability, apart from differences in means.  But if you are going to
use it, that variance estimator should have been based on another
decimal of accuracy.

Another poster mentioned Randomization by Monte Carlo.  I don't 
know whether that should be favored over a complete randomization...
Fisher's randomization of the 8 points into groups of 6+2 results in
only 28 possible outcomes.  So the attainable levels are coarse; the
smallest possible one-tailed p is 1/28, about 3.6%.  Since the highest
measure (18.7)  is in the wrong group, there are two mean-contrasts
larger than the one being tested, so 3/28 implies only the 11% level,
as a one-tailed test.
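
A minimal sketch of the complete enumeration, assuming the readings
exactly as listed in the question.  Values are held as integer tenths
of a degree (my choice, to avoid floating-point ties), and the
function name is made up for illustration.  Because the grand total
is fixed, ranking the splits by the sum of the small group is the
same as ranking them by the difference in means.

```python
from itertools import combinations

# Readings in tenths of a degree, so all comparisons are exact.
portion1 = [158, 169, 170, 171, 180, 187]
portion2 = [183, 185]


def exact_one_tailed(small, large):
    """Count splits whose small-group sum is >= the observed one.

    Returns (number at least as extreme, total number of splits).
    """
    combined = small + large
    observed = sum(small)
    splits = list(combinations(combined, len(small)))
    as_extreme = sum(1 for s in splits if sum(s) >= observed)
    return as_extreme, len(splits)


count, total = exact_one_tailed(portion2, portion1)
print(f"{count} of {total} splits, one-tailed p = {count / total:.3f}")
```

With so few possible splits there is nothing for Monte Carlo sampling
to add; the full list can be counted directly.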

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


