"The Fall 1998 Issue of 'Research in the Schools,' is a special issue on
'Statistical Significance Testing': 6 papers and 3 follow-up comments. The issue is now online at <http://roberts.ed.psu.edu/users/droberts/sigtest.htm>. If you're interested NOW is the time to download and save since the server on which they reside will be taken OFF LINE within the next few days."
In a later post, Roberts (2003b) gave another link to the above special issue, <http://www.personal.psu.edu/users/d/m/dmr/sigtest/sigtest.htm>, but did not indicate whether that link is also soon to disappear.
I note that many education papers [even those in physics-education research (PER)!] continue to employ null-hypothesis testing with its "p" values, while eschewing the more widely accepted "effect size" (d), and (would you believe?) even ignoring the half-century-old "average normalized gain" <g> [Hovland et al. (1949), Gery (1972), Hake (1998a,b; 2002a,b)].
For a discussion of the "p vs d" controversy see Hake (2002a), as well as the papers referred to by Roberts (2003a,b).
In Hake (2002a), I wrote [bracketed by lines "HHHHHHH. . . ."; see that article for the references]:
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
The effect size is commonly used in meta-analyses (e.g., Light et al. 1990, Hunt 1997, Glass 2000), and strongly recommended by many psychologists (B. Thompson 1996, 1998, 2000), and biologists (Johnson 1999, Anderson et al. 2000, W.L. Thompson 2001) as a preferred alternative (or at least addition) to the usually inappropriate (Rozeboom 1960, Carver 1993, Cohen 1994, Kirk 1996) t-tests and p values associated with null-hypothesis testing.
Carver (1993) subjected the Michelson & Morley (1887) data to a simple analysis of variance (ANOVA) and FOUND *STATISTICAL* SIGNIFICANCE ASSOCIATED WITH THE DIRECTION THE LIGHT WAS TRAVELING (P < 0.001)! He writes: "It is interesting to speculate how the course of history might have changed if Michelson and Morley had been trained to use this CORRUPT FORM OF THE SCIENTIFIC METHOD, that is, testing the null hypothesis first. They might have concluded that there was evidence of SIGNIFICANT differences in the speed of light associated with its direction and that therefore there was evidence for the luminiferous ether . . . . Fortunately Michelson and Morley . . .(first). . . .interpreted their data with respect to their research hypothesis." (My CAPS.)
Consistent with the scientific methodology of physical scientists such as Michelson/Morley (see Sec. II-G), Rozeboom (1960) wrote: ". . . THE PRIMARY AIM OF A SCIENTIFIC EXPERIMENT IS NOT TO PRECIPITATE DECISIONS, BUT TO MAKE AN APPROPRIATE ADJUSTMENT IN THE DEGREE TO WHICH ONE ACCEPTS, OR BELIEVES, THE HYPOTHESIS OR HYPOTHESES BEING TESTED." (See also Anderson 1998.)
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
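To make the contrast between effect size and significance-test statistics concrete, here is a minimal sketch. It assumes that "d" means the usual pooled-standard-deviation standardized mean difference (the post itself does not spell out the formula), and the class means, standard deviations, and sample sizes are hypothetical numbers chosen only for illustration. With the means and standard deviations held fixed, d does not change with sample size, while the two-sample t statistic grows roughly as the square root of n.

```python
import math


def pooled_sd(sd_a, n_a, sd_b, n_b):
    """Pooled standard deviation of two groups (assumed definition)."""
    return math.sqrt(
        ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    )


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Effect size d: mean difference in units of the pooled SD."""
    return (mean_a - mean_b) / pooled_sd(sd_a, n_a, sd_b, n_b)


def t_statistic(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Equal-variance two-sample t statistic from summary data."""
    sp = pooled_sd(sd_a, n_a, sd_b, n_b)
    return (mean_a - mean_b) / (sp * math.sqrt(1.0 / n_a + 1.0 / n_b))


# Hypothetical summary data: same means and SDs, only n changes.
for n in (10, 100, 1000):
    d = cohens_d(52.0, 10.0, n, 50.0, 10.0, n)
    t = t_statistic(52.0, 10.0, n, 50.0, 10.0, n)
    print(f"n per group = {n:4d}   d = {d:.2f}   t = {t:.2f}")
```

The same small difference can therefore be pushed past any significance threshold simply by enlarging the sample, which is one reason the authors cited above recommend reporting effect sizes.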
Regarding the half-century-old average normalized gain <g>, in Hake (2003b) I wrote [bracketed by lines "HHHHHHH. . . ."; see that article for the references]:
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
The normalized gain "g" for a treatment is defined [Hovland et al. (1949), Gery (1972), Hake (1998a)] as g = Gain/[Gain (maximum possible)]. Thus, e.g., if a class averaged 40% on the pretest and 60% on the posttest, then the class-average normalized gain <g> = (60% - 40%)/(100% - 40%) = 20%/60% = 0.33. Ever since the work of Hovland et al. (1949) it's been known by pre/post cognoscenti (up until about 1998 probably fewer than 100 people worldwide) that <g> IS A MUCH BETTER INDICATOR OF THE EXTENT TO WHICH A TREATMENT IS EFFECTIVE THAN IS EITHER GAIN OR POSTTEST. For example, if the treatment yields <g> > 0.3 for a mechanics course, then the course could be considered as in the "interactive-engagement zone" (Hake 1998a, Meltzer 2002b).
Regrettably, the psychology/education/psychometric (PEP) community [see e.g., Pellegrino et al. (2001); Shavelson & Towne (2001); Fox & Hackerman (2002); Feuer et al. (2002)] remains largely oblivious of PER and the normalized gain. Paraphrasing Lee Shulman, as quoted by the late Arnold Arons (1986): "it seems that in education, the wheel (more usually the flat tire) must be reinvented every few decades." Unfortunately there seems to be little effort to build a "community map" [Redish (1999), Lagemann (2000), Ziman (2000), Shavelson & Towne (2001), Hake (2002a - "Can Educational Research be *Scientific* Research?")]. Extrapolating the historical record, around 2030 yet another investigator will come up with the idea of g, and fruitlessly attempt to interest the pre/post paranoiac (Hake 2001b) education community. Then around 2060 . . . . . . . . . . . . . . . . .
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
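The arithmetic in the quoted definition can be sketched in a few lines of Python. The function names and the second (70% to 90%) class are hypothetical, added only for illustration. The 100% ceiling and the 0.3 threshold for a mechanics course come from the passage above.

```python
def normalized_gain(pre_percent, post_percent, max_percent=100.0):
    """Class-average normalized gain: g = Gain / (maximum possible Gain)."""
    if pre_percent >= max_percent:
        raise ValueError("pretest average leaves no room for gain")
    return (post_percent - pre_percent) / (max_percent - pre_percent)


def in_interactive_engagement_zone(g, threshold=0.3):
    """True if g exceeds the 0.3 level quoted above for mechanics courses."""
    return g > threshold


# The example from the text: 40% pretest, 60% posttest.
g_text = normalized_gain(40.0, 60.0)
print(round(g_text, 2))  # 0.33

# Hypothetical second class with the same raw gain (20 points)
# but a higher pretest average: its normalized gain is larger.
g_other = normalized_gain(70.0, 90.0)
print(round(g_other, 2))  # 0.67
```

Both classes show the same raw gain of 20 percentage points, yet their normalized gains differ by a factor of two, because g measures the gain against the room each class had left to improve.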
Richard Hake, Emeritus Professor of Physics, Indiana University 24245 Hatteras Street, Woodland Hills, CA 91367 <[EMAIL PROTECTED]> <http://www.physics.indiana.edu/~hake> <http://www.physics.indiana.edu/~sdi>
REFERENCES
Gery, F.W. 1972. "Does mathematics matter?" in A. Welch, ed., Research papers in economic education. Joint Council on Economic Education. pp. 142-157.
Hake, R.R. 1998a. "Interactive-engagement vs traditional methods: A six-thousand-student survey of mechanics test data for introductory physics courses," Am. J. Phys. 66: 64-74; online as ref. 24 at <http://www.physics.indiana.edu/~hake>.
Hake, R.R. 1998b. "Interactive-engagement methods in introductory mechanics courses," online as ref. 25 at <http://www.physics.indiana.edu/~hake>. SUBMITTED on 6/19/98 to the "Physics Education Research Supplement to AJP"(PERS). In this SADLY UNPUBLISHED (Physics Education Research has NO archival journal!) crucial companion paper to Hake (1998a): average pre/post test scores, standard deviations, instructional methods, materials used, institutions, and instructors for each of the survey courses of Hake (1998a) are tabulated and referenced. In addition the paper includes: (a) case histories for the seven IE courses of Hake (1998a) whose effectiveness as gauged by pre-to-post test gains was close to those of T courses, (b) advice for implementing IE methods, and (c) suggestions for further research.
Hake, R.R. 2002a. "Lessons from the physics education reform effort." Conservation Ecology 5(2): 28; online at <http://www.consecol.org/vol5/iss2/art28>. "Conservation Ecology," is a FREE "peer-reviewed journal of integrative science and fundamental policy research" with about 11,000 subscribers in about 108 countries.
Hake, R.R. 2002b. "Assessment of Physics Teaching Methods," Proceedings of the UNESCO-ASPEN Workshop on Active Learning in Physics, Univ. of Peradeniya, Sri Lanka, 2-4 Dec. 2002; also online as ref. 29 at <http://www.physics.indiana.edu/~hake/>.
Hovland, C. I., A. A. Lumsdaine, and F. D. Sheffield. 1949. "A baseline for measurement of percentage change." In C. I. Hovland, A. A. Lumsdaine, and F. D. Sheffield, eds. 1965. "Experiments on mass communication." Wiley (first published in 1949). Reprinted as pages 77-82 in P. F. Lazarsfeld and M. Rosenberg, eds. 1955. "The language of social research: a reader in the methodology of social research." Free Press.
Roberts, D. 2003a. "Significance Testing," post of 14 Oct 2003 14:20:25-0400 to AERA-D, AERA-C, EdStat, EvalTalk, & L-MINITAB-BASICS; online at
<http://lists.asu.edu/cgi-bin/wa?A2=ind0310&L=aera-d&T=0&O=D&P=4067>.
Roberts, D. 2003b. "Sign. Testing," post of 15 Oct 2003 11:47:51-0400 to AERA-D, AERA-C, [EMAIL PROTECTED], EvalTalk, EdStat, & L-MINITAB-BASICS; online at <http://lists.asu.edu/cgi-bin/wa?A2=ind0310&L=aera-d&T=0&O=D&P=5032>.
