My favorite book for elementary statistics is David S. Moore's The Basic
Practice of Statistics, now in its fifth edition. For comparing two means
he has good reasons for denigrating both a. and b. below. He recommends
using either a confidence interval

    xbar1 - xbar2 +/- t* sqrt(s1^2/n1 + s2^2/n2)

or a P-value for the test statistic

    t = (xbar1 - xbar2)/sqrt(s1^2/n1 + s2^2/n2)

where xbar1 and xbar2 are the sample means, s1 and s2 the sample standard
deviations, and n1 and n2 the sample sizes. In either case use a t
distribution with degrees of freedom the smaller of n1-1 and n2-1 or, if
you are fussy, degrees of freedom

    (s1^2/n1 + s2^2/n2)^2 / ( (s1^2/n1)^2/(n1-1) + (s2^2/n2)^2/(n2-1) )

In the first case the appropriate t distribution has area C (the
confidence level) between -t* and t*.

Writing in 2003 (third edition), he states that the TI-83 calculator
correctly uses the fussy degrees of freedom. In the third edition all
this is discussed in Chapter 17, Two-Sample Problems.

Of course all the above is nonsense if the sampling is incorrect, say one
interpreter is sampled on a fast machine and the other on a slow machine,
or if the data in one of the samples cannot be considered to have been
randomly chosen.

On 10/20/2011 3:04 PM, Roger Hui wrote:
> This message is addressed to Forum members who are knowledgeable in
> statistics.
>
> The objective is to test whether the same expression is faster,
> slower, or takes the same amount of time, on the two different
> versions of the interpreter. We know that due to vagaries of the
> operating system, the way interpreters are built (in particular the
> memory usage), the phase of the moon, ... the same expression will run
> in different times. Are the times "the same"?
>
> From stat courses taken long ago and from consulting ancient stats
> texts, I get the idea that the following may be applicable:
>
> a. "Large-Sample Test" on the mean running time, with Z=(theta -
> theta0)%s_theta0 as the normally distributed statistic.
>
> b. "Small-Sample Test for Comparing Two Population Means", with T=(Y0
> - Y1) % S * %: (%n0)+(%n1) as the t-distributed statistic.
>
> I believe what I want is a "Large-Sample Test for Comparing Two
> Population Means". (Large-Sample because I can run as many benchmarks
> as I like.)
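
For anyone who wants to try the procedure recommended at the top of this
message (the two-sample t statistic with either choice of degrees of
freedom, not a. or b. from the quoted post), here is a minimal sketch in
Python using only the standard library. The function name two_sample_t
and the timing values are made up for illustration; they are not from
Moore's book or from any real benchmark. The sketch stops at t and the
degrees of freedom; t* or the P-value still has to come from a t table
or from statistical software.

```python
import math
import statistics


def two_sample_t(x1, x2):
    """Two-sample t statistic with both choices of degrees of freedom.

    Hypothetical helper for illustration. Returns (t, se, df_small, df_fussy).
    """
    n1, n2 = len(x1), len(x2)
    # s1^2/n1 and s2^2/n2, using the sample variance (n-1 denominator)
    v1 = statistics.variance(x1) / n1
    v2 = statistics.variance(x2) / n2
    # Standard error of xbar1 - xbar2
    se = math.sqrt(v1 + v2)
    t = (statistics.mean(x1) - statistics.mean(x2)) / se
    # Conservative choice: the smaller of n1-1 and n2-1
    df_small = min(n1 - 1, n2 - 1)
    # "Fussy" choice, as given by the formula in the message
    df_fussy = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, se, df_small, df_fussy


# Made-up running times (seconds) for the same expression on two
# interpreter versions; replace with real, randomly ordered measurements.
old_times = [1.02, 0.98, 1.05, 1.01, 0.99, 1.03]
new_times = [0.95, 0.97, 0.93, 0.99, 0.96, 0.94]

t, se, df_small, df_fussy = two_sample_t(old_times, new_times)
print("t =", t)
print("standard error =", se)
print("df (smaller of n1-1, n2-1) =", df_small)
print("df (fussy) =", df_fussy)
```

If SciPy happens to be installed, scipy.stats.ttest_ind(old_times,
new_times, equal_var=False) runs the same unpooled test and reports a
P-value directly. I believe it uses the fussy degrees of freedom.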
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm