On Thu, 27 Apr 2000, GEORGE PERKINS wrote:
> I got a call the other day from a high school science teacher asking
> about the following:
> She is testing different brands of yogurt for acid neutralization by
> acidophilus bacteria.
O.K. To start with we have some unspecified number b of brands
of yogurt; it follows that either we want to average them all together,
so as to ignore any systematic differences that may exist between brands,
or we want to keep them explicitly separate, so that we can detect (or at
any rate attempt to detect) systematic differences between brands.
If b > 2, already t-tests are to be discarded in favor of analysis of
variance (ANOVA).
> Her students have measured the pH of yogurt then
> poured in a known amount of acid and began measuring pH in intervals of
> 1 minute for 5 minutes. She has six replicates for each of the types of
> yogurt for a total of 12 time series.
If there are only 12 time series, then it appears b = 2. Yes?
Now the manipulation and measuring seem to have been carried out by some
(also unspecified number of) students. Are the six replicates associated
with six students, each of whom carried out one replicate? Or is the
procedure followed rather messier than that? And if the several students
aren't equivalent to the replicates, in what precisely do the replicates
consist?
> She wants to test if the mean concentration of acid is different in
> the two groups by taking the initial pH value - final pH value for each
> replicate getting a total of six differences per group then finds a
> mean of differences for each set.
Only initial vs. final? What was the point of measuring pH at 1-minute
intervals, if one is going to ignore the time-series information
altogether?
> Finally, she wants to take the means from each set of differences and
> do a hypothesis test mu1=mu2 using a t-test but can't figure out the
> degrees of freedom of the test and frankly I am not quite sure either.
Why? That is, why a t-test? Because that's the only form of analysis
she knows how to do? The situation clearly calls for a repeated-measures
ANOVA; and I'd bet that if she actually does treat it as a t-test
(comparing Brand B with Brand X, I'd guess?), which could be equivalent
to the formal test of one of the main effects in the proper ANOVA, she
won't correctly calculate the sampling variance of the two means. If it
be the case that she doesn't know how to do ANOVA, point her gently in
the direction of Bruning & Kintz, Computational Handbook of Statistics,
which must be in a 4th or 5th edition by now. Marvellous cookbook --
leads the naive (or for that matter not so naive) reader through the
necessary arithmetic step by step [rather as though one were writing a
computer program for the computer between one's ears] for a _wide_
variety of formal analyses, and supplies references for those who want to
pursue the matter further.
> Her idea is to take 12-2 degrees but others have said it should be 6-1
> degrees. I wonder if others out there can shed light on three issues:
Well, let's see. If I've sorted this out aright, she has six replicates
(r = 6) of time-series measurements (t = 6) on each of two brands of
yogurt (b = 2). Looks like 72 measurements (2 x 6 x 6) altogether. Presumably the
six time points are conceptually or logically equivalent for all 12 time
series, so replicates (R) are crossed with time (T), and they are
necessarily nested within brand (B); we have therefore a formal design
of the form R(B)xT -- a repeated measures design.
The formal ANOVA table will have the following lines:

  Source                 df    Error term
  Brand                   1    R(B)
  Replicates(Brand)      10    ---
  Time                    5    TR(B)
  Brand x Time            5    TR(B)
  Time x Repl(Brand)     50    ---
  TOTAL                  71

(Another name for "Error term" is "denominator mean square".)
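
For anyone who would rather see the Brand line of that table as arithmetic, here is a minimal Python sketch of the F ratio for Brand, using R(B) as the denominator mean square. The function name brand_f_test, the data layout data[brand][replicate][time], and the placeholder readings are all my own assumptions for illustration; the numbers are made up and are NOT the teacher's data.

```python
from statistics import mean

def brand_f_test(data):
    """Return (F, df_num, df_den) for the Brand main effect.

    data[brand][replicate][time] holds one pH reading per cell;
    the design is assumed balanced (same r and t everywhere).
    """
    b = len(data)
    r = len(data[0])
    t = len(data[0][0])
    # Collapse each time series to its mean over time.
    rep_means = [[mean(series) for series in brand] for brand in data]
    brand_means = [mean(reps) for reps in rep_means]
    # Balanced design, so the mean of the brand means is the grand mean.
    grand = mean(brand_means)
    ss_brand = r * t * sum((m - grand) ** 2 for m in brand_means)
    ss_rep = t * sum(
        (rm - bm) ** 2
        for reps, bm in zip(rep_means, brand_means)
        for rm in reps
    )
    df_brand = b - 1
    df_rep = b * (r - 1)
    f = (ss_brand / df_brand) / (ss_rep / df_rep)
    return f, df_brand, df_rep

# Hypothetical placeholder readings (2 brands x 6 replicates x 6 times).
placeholder = [
    [[4.0 + 0.05 * k - 0.10 * j for j in range(6)] for k in range(6)],
    [[4.3 + 0.04 * k - 0.12 * j for j in range(6)] for k in range(6)],
]

f_brand, df_num, df_den = brand_f_test(placeholder)
print(f"F({df_num}, {df_den}) = {f_brand:.2f}")
```

With only two brands, this F is the square of a pooled two-sample t computed on the twelve per-replicate means, which is why a correctly computed t-test and the ANOVA test of Brand can agree.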
If she decides to discard all the data in the time series except for the
first and last measurements, then there's only 1 d.f. for Time, and only
1 d.f. for the Time-by-Brand interaction, and 10 d.f. for TR(B).
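
The degrees-of-freedom bookkeeping for both versions of the design (all six time points, or first and last only) can be checked with a few lines of Python. This is only a sketch; the function name rm_anova_df and the dictionary labels are mine, chosen to match the table lines.

```python
def rm_anova_df(b, r, t):
    """Degrees of freedom for an R(B) x T design.

    b = brands, r = replicates per brand, t = time points.
    """
    df = {
        "Brand": b - 1,
        "Replicates(Brand)": b * (r - 1),
        "Time": t - 1,
        "Brand x Time": (b - 1) * (t - 1),
        "Time x Repl(Brand)": b * (r - 1) * (t - 1),
    }
    total = b * r * t - 1
    # The component d.f. must add up to (number of measurements - 1).
    assert sum(df.values()) == total
    df["TOTAL"] = total
    return df

full = rm_anova_df(b=2, r=6, t=6)      # all six time points
reduced = rm_anova_df(b=2, r=6, t=2)   # first and last readings only

for name in full:
    print(f"{name:<20} {full[name]:>3} {reduced[name]:>3}")
```

The two columns printed are the d.f. for the full and the reduced design respectively.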
> 1) Is the t-test approach she is using on solid statistical footing,
> and if so how many degrees of freedom is to be used for the t-test?
Well, _I'd_ use ANOVA myself. Error d.f. for Brand are 10.
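
If she does go the t-test route on the six differences per brand, a pooled two-sample t has 6 + 6 - 2 = 10 degrees of freedom, the same count as above. A minimal Python sketch, assuming equal variances in the two groups; the function name pooled_t is mine, and the differences below are invented placeholders, NOT measured values.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(x, y):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    # Pooled variance: the sampling variance of the difference in means
    # is built from BOTH groups' within-group variability.
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, df

# Hypothetical (initial pH - final pH) differences, one per replicate.
diff_brand_1 = [0.52, 0.48, 0.55, 0.50, 0.47, 0.53]
diff_brand_2 = [0.61, 0.58, 0.66, 0.60, 0.57, 0.63]

t_stat, t_df = pooled_t(diff_brand_1, diff_brand_2)
print(f"t = {t_stat:.3f} on {t_df} d.f.")
```

Note that the "6 - 1" suggestion would be right only for a single set of six paired differences, which is not the situation with two separate brands.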
> 2) If the t-test approach is not legitimate what type of statistical
> test can be used to test the mu1=mu2 hypothesis? (keep in mind that
> these are high school students)
Discussed at length above. Get Bruning & Kintz.
> 3) Is there a 'better' way to proceed with the analysis in the future
> for these types of experiments?
Yes.
> If you want to answer could you please forward the response to my
> e-mail address and I can forward them to her.
Done.
> Thanks,
You're welcome.
------------------------------------------------------------------------
Donald F. Burrill [EMAIL PROTECTED]
348 Hyde Hall, Plymouth State College, [EMAIL PROTECTED]
MSC #29, Plymouth, NH 03264 603-535-2597
184 Nashua Road, Bedford, NH 03110 603-471-7128