ABSTRACT: I discuss the 9/11 attack on my post "Re: How do you gauge how you're doing?" by POD's Ed Nuhfer, pointing out that not all observers take Nuhfer's view that the work of physics education researchers is essentially a duplication of the innovations of others and of little interest to those outside physics. I argue that Nuhfer is wrong in: (1) implying that my contrast of process and product measures is unreasonable, (2) suggesting that I think cognitive and affective factors are independent of one another, (3) presuming that I don't understand the importance of the affective domain and do not recognize that affective factors influence the cognitive impact of a course, (4) claiming that physicists have merely adopted the innovations of others, (5) stating that I am "savaging" education and psychology folk, (6) implying that I think SET's are a waste of time and that affective influences are a nuisance, and (7) presuming that I place exclusive emphasis on pre/post testing.

Those who dislike very long posts (56 kB), references, cross-posting, or academic debate; or who have no interest in a rebuttal of Nuhfer's attack, are urged to hit the DELETE button. And if you reply PLEASE DON'T HIT THE REPLY BUTTON UNLESS YOU PRUNE THE COPY OF THIS POST THAT MAY APPEAR IN YOUR REPLY DOWN TO A FEW RELEVANT LINES, OTHERWISE THE ENTIRE POST MAY BE NEEDLESSLY RESENT TO SUBSCRIBERS.

Ed Nuhfer (2005) responded to my post of 10 September 2005, "Re: How do you gauge how you're doing?" [Hake (2005)], with a vigorous 9/11 attack (so called because his post is dated 11 September 2005) on seven targets that I discuss below.

Nuhfer (2005) wrote:

11111111111111111111111111111111111111111111111
1. "I'm curious about the motivation for contrasting of the multiple measures of educational process . . . [(1) Reformed Teaching Observation Protocol (RTOP), (2) Student Evaluations Of Teaching (SET's), (3) Course Exams or Final Grades, (4) National Survey Of Student Engagement (NSSE), and (5) Student Assessment Of Learning Gains (SALG)] with product (test scores) as measured by a single tool . . . [(6) pre/post testing using (a) valid and consistently reliable tests devised by disciplinary experts, and (b) traditional courses as controls]. The tone of portraying an apparent competition between tests and other measures adds confusion when one reads. . . . [Hake's opinion that Student Evaluations of Teaching (SET's) are NOT valid measures of the cognitive (as opposed to the affective) impact of courses]."

In my opinion, Nuhfer's implication that my contrast of process and product measures is unreasonable is itself unreasonable. I clearly stated the motivation for the contrast: measures 1 - 5 of educational process (if one chooses to call "Course Exams or Final Grades" process measures) are INDIRECT (and therefore problematic) methods of measuring student learning. In sharp contrast, measure "6" is a DIRECT (and therefore less problematic) method of measuring student learning. Such pre/post testing as currently undertaken in undergraduate astronomy, economics, biology, chemistry, computer science, and engineering courses does not meet the U.S. Dept. of Education's (USDE's) pseudo "gold standard" of randomized control trials, but would nevertheless probably pass muster at the USDE's "What Works Clearinghouse" <http://www.w-w-c.org/> as "quasi-experimental studies [Shadish et al. (2002)] of especially strong design" [see <http://www.w-w-c.org/reviewprocess/standards.html>].

For introductory physics courses, pre/post testing has demonstrated that a nearly two-standard-deviation superiority in normalized learning gains for "interactive engagement" courses over traditional courses [Hake (2002a,b)] CAN be attained, thus contributing to the solution of Bloom's (1984) "two-sigma" problem. Such testing has led to marked improvement in many introductory physics courses throughout the nation [most notably at Harvard (Crouch & Mazur 2001), North Carolina State University (Beichner & Saul 2003), and MIT (Dori & Belcher 2004)]. I see no reason that similar results could not eventually be achieved in other disciplines IF their practitioners would undertake the lengthy qualitative and quantitative research [see e.g., Halloun & Hestenes (1985a,b)] required to develop multiple-choice (MC) tests of conceptual understanding that can be given to thousands of students in hundreds of courses under varying conditions.
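
For readers unfamiliar with the normalized-gain metric [Hake (1998a)], here is a minimal sketch of the arithmetic in Python; the pre/post class percentages are hypothetical numbers chosen for illustration, NOT data from any actual course:

  # Minimal sketch of the average normalized gain <g> [Hake (1998a)].
  # The pre/post class percentages below are HYPOTHETICAL illustrations.
  def normalized_gain(pre_pct, post_pct):
      # g = (actual gain) / (maximum possible gain)
      return (post_pct - pre_pct) / (100.0 - pre_pct)

  g_trad = normalized_gain(45.0, 56.0)  # 0.20 - of the order of traditional courses
  g_ie = normalized_gain(45.0, 70.0)    # 0.45 - of the order of IE courses
  print(round(g_trad, 2), round(g_ie, 2))

Note that g is insensitive to the pretest score in a way that a raw gain is not: a class going from 45% to 70% and a class going from 60% to 78% both achieve g = 0.45.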

How can MC tests gauge higher-order thinking skills such as conceptual understanding? Wilson & Bertenthal (2005) write:

"Performance assessment is an approach that offers great potential for assessing complex thinking and learning abilities, but multiple choice items also have their strengths. For example, although many people recognize that multiple-choice items are an efficient and effective way of determining how well students have acquired basic content knowledge, MANY DO NOT RECOGNIZE THAT THEY CAN ALSO BE USED TO MEASURE COMPLEX COGNITIVE PROCESSES. For example, THE FORCE CONCEPT INVENTORY (Hestenes, Wells, and Swackhamer, 1992) IS AN ASSESSMENT THAT USES MULTIPLE-CHOICE ITEMS TO TAP INTO HIGHER LEVEL COGNITIVE PROCESSES."



22222222222222222222222222222222222222222222222
2. "It seems unlikely to me that any cognitive learning can in practice be
isolated from affective influences."

Nuhfer seems to suggest that I think cognitive and affective factors are independent of one another. I think almost everyone would agree that student learning is influenced by affective factors. But so what? Does that mean that attempts to measure student learning directly are not worthwhile?


3333333333333333333333333333333333333333333333333
3. ". . . . The importance of both cognitive and affective domains. . . [Bloom et al. (1956), Krathwohl et al. (1964)]. . . has long been recognized although perhaps not understood by many practitioners. . . . Hake's opinion that student evaluations of teachers arise largely from the affective domain is supported by the thin slices research [Ambady & Rosenthal (1992)]. . . . Likewise, many resources show that the cognitive domain also contributes to evaluations in general but meaningful ways. . . "

Nuhfer is evidently presuming that I'm one of those insensitive technocratic oafs who doesn't understand the importance of the affective domain. If so, Nuhfer's presumption is wrong. In Hake (2002a) I wrote [see that article for the references other than Hake & Swihart (1979)]:

"I think SET's can be 'valid' in the sense that they can be useful for gauging the *affective* impact of a course and for providing diagnostic feedback to *teachers* [see, e.g., Hake & Swihart (1979)] to assist them in making mid-course corrections. However, IMHO, SET's are NOT valid in their widespread use by *administrators* to gauge the cognitive impact of courses [see, e.g., Williams & Ceci (1997); Hake (2000; 2002a,b); Johnson (2002)]. In fact the gross misuse of SET's as gauges of student learning is, in my view, one of the institutional factors that thwarts substantive educational reform (Hake 2002a, Lesson #12)."

That SET ratings are NOT valid measures of the cognitive impact of a course (even though SET's may be *affected* by cognitive factors) is argued in "Re: Problems with Student Evaluations: Is Assessment the Remedy?" [Hake (2002c)]. Therein I wrote [see that article for references other than McKeachie (1987)]:

HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
With regard to the problem of using course performance as a measure of student achievement or learning, Peter Cohen's (1981) oft-quoted meta-analysis of 41 studies on 68 separate multisection courses purportedly showing that:

"the average correlation between an overall instructor rating and student achievement was +0.43; the average correlation between an overall course rating and student achievement was +0.47 . . . the results . . . provide strong support for the validity of student ratings as measures of teaching effectiveness"

was reviewed and reanalyzed by Feldman (1989), who pointed out that McKeachie (1987) has recently reminded educational researchers and practitioners that the achievement tests assessing student learning in the sorts of studies reviewed here. . . (e.g., those by Cohen 1981, 1986, 1987). . . typically measure lower-level educational objectives such as memory of facts and definitions rather than higher-level outcomes such as critical thinking and problem solving. . . [he might have added conceptual understanding]. . . that are usually taken as important in higher education.

Striking back at SET skeptics, Peter Cohen (1990) opined:

"Negative attitudes toward student ratings are especially resistant to change, and it seems that faculty and administrators support their belief in student-rating myths with personal and anecdotal evidence, which (for them) outweighs empirically based research evidence."

However, as far as I know, NEITHER COHEN NOR ANY OTHER SET CHAMPION HAS COUNTERED THE FATAL OBJECTION OF MCKEACHIE (1987) THAT THE EVIDENCE FOR THE VALIDITY OF SET's AS GAUGES OF THE COGNITIVE IMPACT OF COURSES RESTS FOR THE MOST PART ON MEASURES OF STUDENTS' LOWER-LEVEL THINKING AS EXHIBITED IN COURSE GRADES OR EXAMS.

At least in physics it is well-known (see, e.g., Hake 2002a,b) that students in *traditional* mechanics courses can achieve A's through rote memorization and algorithmic problem solving, while achieving *normalized* gains in conceptual understanding of only about 0.2 (i.e., pre-to-post gains that are only about 0.2 of the maximum possible gain).
HHHHHHHHHHHHHHHHHHHHHHHHHHHHH

Regarding McKeachie's objection, Theall (2003) received an email from McKeachie (2003), who stated in response to the above-quoted material: "It seems to me that all we can expect the mean overall rating of the course to do is to correlate with the teachers' assessment of the students achievement." But IF one can measure student learning *directly*, then one can examine the correlation of *student learning* (as opposed to "achievement tests") vs SET ratings. As far as I know there has been no such systematic study, but anecdotal information from physics suggests that the correlation is more apt to be negative than positive; see, e.g., "Hostility to Interactive Engagement Methods" [Hake (2003a)].
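
To make the proposed study concrete: given per-course values of the average normalized gain <g> and of the mean SET rating, the analysis is simply a correlation. A minimal Python sketch follows; ALL course values below are invented for illustration and are NOT real data:

  # Correlation of directly measured learning (<g>) with mean SET ratings.
  # The six per-course values below are INVENTED for illustration only.
  gains = [0.18, 0.22, 0.35, 0.48, 0.52, 0.60]  # average normalized gains <g>
  ratings = [4.3, 4.1, 3.8, 3.5, 3.9, 3.2]      # mean SET ratings, 5-point scale

  def pearson_r(xs, ys):
      # Standard Pearson product-moment correlation coefficient.
      n = len(xs)
      mx, my = sum(xs) / n, sum(ys) / n
      sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
      sxx = sum((x - mx) ** 2 for x in xs)
      syy = sum((y - my) ** 2 for y in ys)
      return sxy / (sxx * syy) ** 0.5

  print(round(pearson_r(gains, ratings), 2))  # a negative r would accord with the anecdotes

A real study would of course require many courses, error estimates, and controls for student population, but the computation itself is this simple.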

Some may object that multiple-choice tests such as those employed in physics diagnostic tests [FLAG (2005), NCSU (2005)] cannot possibly measure higher-order thinking skills. But, as indicated in "1" above, see Wilson & Bertenthal (2005).


44444444444444444444444444444444444444444444444
4. ". . . the active learning and cooperative methods that Hake champions, largely developed and validated by educational researchers, such as the Johnson brothers at University of Minnesota, informed physics teachers before they began widely using them."

In my opinion, the above claim is incorrect. While it is true that the "interactive engagement" (IE) methods used in reform physics courses usually make some use of "Collaborative Peer Instruction" (usually prejudgmentally called "Collaborative Learning"), largely developed and validated by educational researchers such as the Johnson brothers, the same cannot be said for most of the educational methods utilized by physicists.

For example, Mazur's (1997) "Concept Tests" and "Peer Instruction" were claimed by Nuhfer (2004) to be nothing more than a repeat of "Think-Pair-Share" [Lyman (1981)]. But as I point out in a post "Re: Think-Pair-Share citation?" [Hake (2002d)], Mazur's work differs from "Think-Pair-Share" in that:

a. student discussions are not limited to pairs but may include larger groups (say 3 - 5 students) who are seated in close proximity,

b. instructor assessment of student responses is facilitated by an electronic "Classroom Communication System" (CCS),

c. Crouch & Mazur (2001) now employ "Just In Time Teaching" [Novak et al. (1998, 1999)] strategies to encourage students to study reading assignments before coming to class,

d. definitive pre/post testing (Crouch & Mazur 2001) has indicated the relative effectiveness of "Peer Instruction" in promoting student learning. I am unaware of similar evidence for the effectiveness of the "Think-Pair-Share" activity.

More generally, in Hake (2002a) I wrote [see that article for the references]:

HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
For the 48 interactive-engagement (IE) courses of Figs. 1 & 2, the ranking in terms of number of IE courses using each of the more popular methods is as follows:

(1) COLLABORATIVE PEER INSTRUCTION (Johnson et al. 1991; Heller et al. 1992a,b; Slavin 1995; Johnson et al. 2000): 48 (all courses) [CA] - for the meaning of "CA," and similar abbreviations below within the square brackets "[. . . .]", see the paragraph following this list.

(2) MICROCOMPUTER-BASED LABS (Thornton and Sokoloff 1990, 1998): 35 courses [DT].

(3) CONCEPT TESTS (Mazur 1997, Crouch & Mazur 2001): 20 courses [DT]; such tests for physics, biology, and chemistry are available on the web along with a description of the "Peer Instruction" method at the Galileo Project (2001).

(4) MODELING (Halloun & Hestenes 1987; Hestenes 1987, 1992; Wells et al. 1995): 19 courses [DT + CA]; a description is on the web at
<http://modeling.la.asu.edu/>.

(5) ACTIVE LEARNING PROBLEM SETS OR OVERVIEW CASE STUDIES (Van Heuvelen 1991a,b; 1995): 17 courses [CA]; information on these materials is online at
<http://www.physics.ohio-state.edu/~physedu/>.

(6) PHYSICS-EDUCATION-RESEARCH BASED TEXT (referenced in Hake 1998b, Table II) or no text: 13 courses.

(7) SOCRATIC DIALOGUE INDUCING LABS (Hake 1987, 1991, 1992, 2001a; Tobias & Hake 1988): 9 courses [DT + CA]; a description and lab manuals are on the web at the Galileo Project (2001) and <http://www.physics.indiana.edu/~sdi>.

The notations within the square brackets [. . .] follow Heller (1999) in loosely associating the methods with "learning theories" from cognitive science. Here "DT" stands for "Developmental Theory," originating with Piaget (Inhelder & Piaget 1958, Gardner 1985, Inhelder et al. 1987, Phillips & Soltis 1998); and "CA" stands for "Cognitive Apprenticeship" (Collins et al. 1989, Brown et al. 1989). All the methods (save #6) recognize the important role of social interactions in learning (Vygotsky 1978, Lave & Wenger 1991, Dewey 1997, Phillips & Soltis 1998). It should be emphasized that the above rankings are by popularity within the survey, and have no necessary connection with the effectiveness of the methods relative to one another. In fact, it is quite possible that some of the less popular methods used in some survey courses, as listed by Hake (1998b), could be more effective in terms of promoting student understanding than any of the above popular strategies.
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH

As far as I know educational researchers such as the Johnson brothers had little if anything to do with Microcomputer-based Labs, Concept Tests, Modeling, Active Learning Problem Sets, Overview Case Studies, or Socratic Dialogue Inducing Labs.


55555555555555555555555555555555555555555555555
5. "My guess is that had the Johnsons excoriated physics departments the way Hake has been savaging the psychology and education folks, physicists would have resisted adaptation of others' innovations much longer. Excoriating a discipline is more likely to inspire defensiveness in its members rather than inspire them to adopt specific practices."

Nuhfer's uninformed implication that physicists merely *adapted* others' innovations is debunked in section "4" above. Regarding my "savaging the psychology and education folks": I have emphasized the importance of the contributions from psychologists and education specialists in Lesson #4 of the physics education reform effort [Hake (2002a)]:

"Education Research and Development (R&D) by disciplinary experts (DE's), and of the same quality and nature as traditional science/engineering R&D, is needed to develop potentially effective educational methods within each discipline. But the DE's should take advantage of the insights of (a) DE's doing education R&D in other disciplines, (b) COGNITIVE SCIENTISTS. . .[COGNITIVE SCIENCE INCLUDES PSYCHOLOGY]. . ., (c) FACULTY AND GRADUATES OF EDUCATION SCHOOLS, and (d) classroom teachers."

Of course, it's true that I *have* constructively criticized psychologists for not researching the effectiveness of their own courses [Hake (2005)]. But since when has constructive criticism been regarded as "savaging"? The reaction of subscribers to PsychTeacher, TIPS (Teaching In the Psychological Sciences), and TeachingEdPsych to this criticism has ranged from indifference to irritation. The PsychTeacher moderators even kicked me off their list. Moderator Rick Froman emailed me that ". . . we have decided to end this thread. . . due to the fact that the thread is not progressing and has gotten to point where there are few list members participating in it." But not *all* psychologists are indifferent to valid criticism. For example, psychologist David Berliner (2005) wrote: "Thanks for your provocative and educational emails."


6666666666666666666666666666666666666666666
6. "Just because evaluations may reflect affective feelings more than cognitive gains doesn't mean they are not valuable. If one wants to deal with facts and calculations without the 'nuisance' of affective influences, one will be happier programming computers than interacting with people."

Nuhfer is evidently implying that I think SET's are a waste of time and that affective influences are a nuisance. I countered that misconception in "3" above.


777777777777777777777777777777777777777777
7. "Pre-post testing provides information worth gathering. It is now one
generally accepted practice in assessment. . .[please tell that to assessment experts such as Linda Suskie (2004a,b) [Suskie's (2004a) canonical objections to pre/post testing are countered by Hake (2004a) and Scriven (2004)]] . . . However, tests can only measure some limited learning through specific sampling afforded by the tool. I have the same problems with attributing too much value to pre-post testing that I have with the testing mania of "no child left behind." A test is just one measuring tool. Because successful education is far more than courses, tests and grades, its assessment requires multiple tools and multiple measures."

The low-stakes formative pre/post testing that I advocate is the polar opposite of the high-stakes summative testing mandated by the No Child Left Behind act.

And Nuhfer's evident presumption that I place exclusive emphasis on pre/post testing is wrong, as shown by:

a. In Hake (1998b) I discuss the case studies and instructor surveys that I conducted to help validate my survey [Hake (1998a)].

b. The development of Socratic Dialogue Inducing (SDI) Labs [Hake (1992, 2002e)] required extensive *qualitative* research. That research involved the analysis of: (1) videotaped individual interviews probing both cognitive and affective states of introductory physics students, (2) videotaped SDI lab sessions, including discussions both among students and between Socratic dialogists and students, and (3) comments and performance of non-physical-science professors enrolled in the introductory physics course [Tobias & Hake (1988)].

c. In Hake (2002b) I wrote [bracketed by lines "HHHHHH. . . ."; see that article for the references]:

HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Does the class average normalized gain <g>. . .[see, e.g. Hake (1998a,b; 2002a,b)]. . . for the FCI (Force Concept Inventory), MD (Mechanics Diagnostic), or FMCE (Force Motion Concept Evaluation) provide a definitive assessment of the *overall* effectiveness of an introductory physics class? . . [For references to these tests see e.g., Hake (2002b)]. . . .

NO! It assesses "only the attainment of a minimal conceptual understanding of mechanics." . . . . Furthermore, as indicated in . . .[the unjustifiably suppressed]. . . Hake (1998b), among desirable outcomes of the introductory course that <g> does NOT measure directly are students':

(a) satisfaction with and interest in physics;

(b) understanding of the nature, methods, and limitations of science;

(c) understanding of the processes of scientific inquiry such as experimental design, control of variables, dimensional analysis, order-of-magnitude estimation, thought experiments, hypothetical reasoning, graphing, and error analysis;

(d) ability to articulate their knowledge and learning processes;

(e) ability to collaborate and work in groups;

(f) communication skills;

(g) ability to solve real-world problems;

(h) understanding of the history of science and the relationship of science to society and other disciplines;

(i) understanding of, or at least appreciation for, "modern" physics;

(j) ability to participate in authentic research.
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH

Nuhfer's (2005) patronizing attack on me and on the value of physics education research is a continuation of his vitriolic blast "Re: Back to Basics vs. Hands-On Instruction" [Nuhfer (2004) - countered by Hake (2004b)]. It's fortunate that not all observers take the uninformed view of Nuhfer that the work of physics education researchers is essentially a duplication of the innovations of others and of little interest to those outside physics - see, e.g., Stokstad (2001), Wood (2003), Wood & Gentile (2003), Powell (2003), Klymkowsky et al. (2003), Handelsman et al. (2004), Klymkowsky (2005).

Summarizing the above seven-part rebuttal, I think Nuhfer (2005) is WRONG in:

(1) implying that my contrast of process and product measures is unreasonable,

(2) suggesting that I think cognitive and affective factors are independent of one another,

(3) presuming that I don't understand the importance of the affective domain and do not recognize that affective factors influence cognitive factors,

(4) claiming that physicists have merely adopted the innovations of others,

(5) stating that I am "savaging" education and psychology folk,

(6) implying that I think SET's are a waste of time and that affective influences are a nuisance,

(7) presuming that I place exclusive emphasis on pre/post testing.


Richard Hake, Emeritus Professor of Physics, Indiana University
24245 Hatteras Street, Woodland Hills, CA 91367
<[EMAIL PROTECTED]>
<http://www.physics.indiana.edu/~hake>
<http://www.physics.indiana.edu/~sdi>

"Conflict is the gadfly of thought. It stirs us to observation and
memory. It instigates to invention. It shocks us out of sheep-like
passivity, and sets us at noting and contriving. Not that it always
effects this result; but that conflict is a sine qua non of
reflection and ingenuity."
    John Dewey "Morals Are Human," Dewey: Middle Works, Vol.14, p. 207.

REFERENCES
Ambady, N. & R. Rosenthal. 1992. "Thin Slices of Expressive Behavior as Predictors of Interpersonal Consequences: A Meta-analysis," Psychological Bulletin 111: 256-274. For an introduction to the "thin-slice judgment" literature and debate thereon see Hake (2003b,c,d).

Anderson, L.W. & L.A. Sosniak, eds. 1994. "Bloom's Taxonomy: A Forty-Year Retrospective," Ninety-Third Yearbook of the National Society for the Study of Education. Univ. of Chicago Press.

Anderson, L.W. & D. Krathwohl, eds. 2001. "A Taxonomy for Learning, Teaching and Assessing: A Revision of Bloom's Taxonomy of Educational Objectives." Addison Wesley Longman. See also Anderson & Sosniak (1994).

Beichner, R.J. & J.M. Saul. 2003. "Introduction to the SCALE-UP (Student-Centered Activities for Large Enrollment Undergraduate Programs) Project," submitted to the Proceedings of the International School of Physics "Enrico Fermi", Varenna, Italy (July 2003); online at
<http://www.ncsu.edu/per/Articles/Varenna_SCALEUP_Paper.pdf> (1MB).

Berliner, D. 2005. "Re: Teachers: the Archimedian Lever for Elevating Public-Schools," TeachingEdPsych post of 3 Jun 2005 22:24:59-0700; online at
<https://listserv.temple.edu/cgi-bin/wa?A2=ind0506&L=TEACHING_EDPSYCH&P=R102&I=-3&X=1942B662204F6E3D69&Y=rrhake%40earthlink.net>.

Bloom, B.S., ed., M.D. Engelhart, E.J. Furst, W.H. Hill, & D.R. Krathwohl. 1956. "Taxonomy of educational objectives: Handbook I: Cognitive domain." David McKay. For an updated version that incorporates post-1956 advances in cognitive science see Anderson & Krathwohl (2001).

Bloom, B.S. 1984. "The 2 Sigma Problem: The Search for Methods of Group Instruction as Effective as One-to-One Tutoring," Educational Researcher 13(6), 4-16 (1984). Bloom wrote: "Using the standard deviation (sigma) of the control (conventional) class, it was typically found that the average student under tutoring was about two standard deviations above the average of the control class. . . The tutoring process demonstrates that MOST of the students do have the potential to reach this high level of learning. I believe an important task of research and instruction is to seek ways of accomplishing this under more practical and realistic conditions than the one-to-one tutoring, which is too costly for most societies to bear on a large scale. This is the '2 sigma' problem."

Crouch, C.H. & E. Mazur. 2001. "Peer Instruction: Ten years of experience and results," Am. J. Phys. 69: 970-977; online at <http://mazur-www.harvard.edu/library.php>, search "All Education Areas" for author "Crouch" (without the quotes).

Dori, Y.J. & J. Belcher. 2004. "How Does Technology-Enabled Active Learning Affect Undergraduate Students' Understanding of Electromagnetism Concepts?" The Journal of the Learning Sciences 14(2), abstract online at <http://www.cc.gatech.edu/lst/jls/vol14no2.html>. The complete article is online at <http://web.mit.edu/jbelcher/www/TEALref/TEAL_Dori&Belcher_JLS_10_01_2004.pdf> (1 MB).

FLAG. 2005. "Field-tested Learning Assessment Guide," online at <http://www.flaguide.org/>: ". . . offers broadly applicable, self-contained modular classroom assessment techniques (CAT's) and discipline-specific tools for STEM [Science, Technology, Engineering, and Mathematics] instructors interested in new approaches to evaluating student learning, attitudes and performance. Each has been developed, tested, and refined in real college and university classrooms." Assessment tools for physics and astronomy (and other disciplines) are at <http://www.flaguide.org/tools/tools.php>.

Hake, R.R. & J.C. Swihart. 1979. "Diagnostic Student Computerized Evaluation of Multicomponent Courses," Teaching and Learning, Vol. V, No. 3 (Indiana University, January 1979, updated 11/97); online as ref. #4 at <http://www.physics.indiana.edu/~hake/>, or download directly by clicking on <http://www.physics.indiana.edu/~sdi/DISCOE2.pdf> (20 kB).

Hake, R.R. 1992. "Socratic pedagogy in the introductory physics lab." Phys. Teach. 30: 546-552; updated version (4/27/98) online as ref. 23 at <http://www.physics.indiana.edu/~hake>, or simply click on
<http://www.physics.indiana.edu/~sdi/SocPed1.pdf> (88 kB).

Hake, R.R. 1998a. "Interactive-engagement vs traditional methods: A six-thousand-student survey of mechanics test data for introductory physics courses," Am. J. Phys. 66: 64-74; online as ref. 24 at <http://www.physics.indiana.edu/~hake>, or simply click on <http://www.physics.indiana.edu/~sdi/ajpv3i.pdf> (84 kB).

Hake, R.R. 1998b. "Interactive-engagement methods in introductory mechanics courses," online as ref. 25 at <http://www.physics.indiana.edu/~hake>, or simply click on <http://www.physics.indiana.edu/~sdi/IEM-2b.pdf> (108 kB) - a crucial companion paper to Hake (1998a).

Hake, R.R. 2002a. "Lessons from the physics education reform effort," Ecology and Society 5(2): 28; online at
<http://www.ecologyandsociety.org/vol5/iss2/art28/>. Ecology and Society
(formerly Conservation Ecology) is a free online "peer-reviewed journal of integrative science and fundamental policy research" with about 11,000 subscribers in about 108 countries.

Hake, R.R. 2002b. "Assessment of Physics Teaching Methods," Proceedings of the UNESCO-ASPEN Workshop on Active Learning in Physics, Univ. of Peradeniya, Sri Lanka, 2-4 Dec. 2002; also online as ref. 29 at
<http://www.physics.indiana.edu/~hake/>, or download directly by clicking on
<http://www.physics.indiana.edu/~hake/Hake-SriLanka-Assessb.pdf> (84 kB).

Hake, R.R. 2002c. "Re: Problems with Student Evaluations: Is Assessment the Remedy?" online at <http://listserv.nd.edu/cgi-bin/wa?A2=ind0204&L=pod&P=R14535>. Post of 25 Apr 2002 16:54:24-0700 to AERA-D, ASSESS, EvalTalk, Phys-L, PhysLrnR, POD, & STLHE-L. Slightly edited and improved on 16 November 2002 as ref. 18 at <http://www.physics.indiana.edu/~hake> or download directly at <http://www.physics.indiana.edu/~hake/AssessTheRem1.pdf> (72 kB)[This is the best version.]. Also online in HTML at <http://www.stu.ca/~hunt/hake.htm> as one of the many resources in Russ Hunt's annotated bibliography of articles and books on student evaluation of teaching <http://www.stu.ca/~hunt/evalbib.htm>.

Hake, R.R. 2002d. "Re: Think-Pair-Share citation?" misdated by my stupid computer as 1904; online at
<http://listserv.nd.edu/cgi-bin/wa?A2=ind0201&L=pod&P=R1686&I=-3>.

Hake, R.R. 2002e. "Socratic Dialogue Inducing Laboratory Workshop," Proceedings of the UNESCO-ASPEN Workshop on Active Learning in Physics, Univ. of Peradeniya, Sri Lanka, 2-4 Dec. 2002; also online as ref. 28 at
<http://www.physics.indiana.edu/~hake/> or download directly by clicking on
<http://www.physics.indiana.edu/~hake/Hake-SriLanka-SDIb.pdf> (44 KB).

Hake, R.R. 2003a. "Hostility to Interactive Engagement Methods," online at <http://listserv.nd.edu/cgi-bin/wa?A2=ind0310&L=pod&P=R5851&I=-3>. Post of 13 Oct 2003 15:08:37-0700 to Phys-L, PhysLrnR, and POD.

Hake, R.R. 2003b. "Thin-Slice Judgments, End-of-Course Evaluations, Grades, and Student Learning," online at <http://listserv.nd.edu/cgi-bin/wa?A2=ind0303&L=pod&P=R18434>. Post to ASSESS, EvalTalk, PhysLrnR, POD, & STLHE-L of 28 Mar 2003 16:23:25-0800.

Hake, R.R. 2003c. "Thin-Slice Judgments, End-of-Course Evaluations, Grades, and Student Learning - CORRECTIONS," online at <http://listserv.nd.edu/cgi-bin/wa?A2=ind0303&L=pod&P=R19378>. Post to ASSESS, EvalTalk, PhysLrnR, POD, & STLHE-L of 29 Mar 2003 11:45:27-0800.

Hake, R.R. 2003d. "Thin-Slice Judgments, End-of-Course Evaluations, Grades, and Student Learning," online at <http://listserv.nd.edu/cgi-bin/wa?A2=ind0303&L=pod&P=R21469>. Post to ASSESS, EvalTalk, PhysLrnR, POD, & STLHE-L of 31 Mar 2003 12:47:55-0800.

Hake, R.R. 2004a. "Re: pre-post testing in assessment," online at
<http://listserv.nd.edu/cgi-bin/wa?A2=ind0408&L=pod&P=R9135&I=-3>. Post of 19 Aug 2004 13:56:07-0700 to AERA-D, AERA-J, EDSTAT-L,
EVALTALK, PhysLrnR, and POD.

Hake, R.R. 2004b. "Re: Back to Basics vs. Hands-On Instruction," online at
<http://listserv.nd.edu/cgi-bin/wa?A2=ind0402&L=pod&P=R12714&I=-3>.
Post of 21 Feb 2004 21:45:48 -0800 to PhysLrnR and POD.

Hake, R.R. 2005.  "Re: How do you gauge how you're doing?" online at
<http://listserv.nd.edu/cgi-bin/wa?A2=ind0509&L=pod&O=D&P=7968>. Post of
10 Sep 2005 17:27:47-0700 to AERA-D, AERA-GSL, AERA-J, AERA-L, ASSESS, EvalTalk, PhysLrnR, PsychTeacher (rejected), TIPS, & TeachingEdPsych.

Halloun, I. & D. Hestenes. 1985a. "The initial knowledge state of college physics students." Am. J. Phys. 53:1043-1055; online at <http://modeling.asu.edu/R&E/Research.html>. Contains the "Mechanics Diagnostic" test (omitted from the online version), precursor to the "Force Concept Inventory."

Halloun, I. & D. Hestenes. 1985b. "Common sense concepts about motion." Am.
J. Phys. 53:1056-1065; online at <http://modeling.asu.edu/R&E/Research.html>.

Halloun, I., R.R. Hake, E.P. Mosca, & D. Hestenes. 1995. "Force Concept Inventory" (Revised, 1995); online (password protected) at <http://modeling.asu.edu/R&E/Research.html>. Available in English, Spanish, German, Malaysian, Chinese, Finnish, and Russian.

Handelsman, J., D. Ebert-May, R. Beichner, P. Bruns, A. Chang, R. DeHaan, J. Gentile, S. Lauffer, J. Stewart, S.M. Tilghman, W.B. Wood. 2004. "Scientific Teaching," Science 304(5670): 521-522, 23 April; online for free (entire article to "Science" subscribers, abstract to guests) at <http://www.sciencemag.org/search.dtl>, search for Volume 304, First Page 521. Supporting Online Material (SOM) - showing physics contributions - may be freely downloaded at <http://www.sciencemag.org/cgi/data/304/5670/521/DC1/1>. The complete article may be downloaded for free at Handelsman's homepage as a 100 kB pdf <http://www.plantpath.wisc.edu/fac/joh/scientificteaching.pdf>, or as an 88 kB pdf at John Belcher's site <http://web.mit.edu/jbelcher/www/TEALref/scientificteaching.pdf>. See also Wood & Handelsman (2004).

Hestenes, D., M. Wells, & G. Swackhamer. 1992. "Force Concept Inventory." Phys. Teach. 30: 141-158; online (except for the test itself) at <http://modeling.asu.edu/R&E/Research.html>. For the slightly revised 1995 version see Halloun et al. (1995).

Klymkowsky, M.W., K. Garvin-Doxas, & M. Zeilik. 2003. "Bioliteracy and Teaching Efficiency: What Biologists Can Learn from Physicists," Cell Biology Education 2: 155-161; online at <http://www.cellbioed.org/article.cfm?ArticleID=67>. The abstract reads: "The introduction of the Force Concept Inventory (FCI) by Hestenes et al. (1992) produced a remarkable impact within the community of physics teachers. An instrument to measure student comprehension of the Newtonian concept of force, the FCI demonstrates that active learning leads to far superior student conceptual learning than didactic lectures. Compared to a working knowledge of physics, biological literacy and illiteracy have an even more direct, dramatic, and personal impact. They shape public research and reproductive health policies, the acceptance or rejection of technological advances, such as vaccinations, genetically modified foods and gene therapies, and, on the personal front, the reasoned evaluation of product claims and lifestyle choices. While many students take biology courses at both the secondary and the college levels, there is little in the way of reliable and valid assessment of the effectiveness of biological education. This lack has important consequences in terms of general bioliteracy and, in turn, for our society. Here we describe the beginning of a community effort to define what a bioliterate person needs to know and to develop, validate, and disseminate a tiered series of instruments collectively known as the Biology Concept Inventory (BCI), which accurately measures student comprehension of concepts in introductory, genetic, molecular, cell, and developmental biology. The BCI should serve as a lever for moving our current educational system in a direction that delivers a deeper conceptual understanding of the fundamental ideas upon which biology and biomedical sciences are based."

Klymkowsky, M.W. 2005. "Can Nonmajors Courses Lead to Biological Literacy? Do Majors Courses Do Any Better?" Cell Biology Education 4; online at <http://www.cellbioed.org/article.cfm?ArticleID=155>. Klymkowsky wrote "If biologists had assessment instruments analogous to the Force Concept Inventory for basic Newtonian mechanics (Hestenes et al., 1992; Klymkowsky et al., 2003), introductory majors and nonmajors courses would converge toward a common focus on fundamental concepts, critical to communicating in the language of biology. Introductory majors courses will spend more time ensuring that students actually understand the material presented (which is likely to drastically reduce the quantity of material "covered" per credit hour), while nonmajors courses will be forced to cover basic concepts needed to understand biological processes."


Krathwohl, D.R., B.S. Bloom, & B.B. Masia. 1964. "Taxonomy of educational objectives, the Classification of Educational Goals; Handbook II: The affective domain." David McKay. For updates see Krathwohl et al. (1990) and Anderson & Sosniak (1994).

Krathwohl, D.R., B.B. Masia, with B.S. Bloom. 1990. "Taxonomy of Educational Objectives Book 2; Affective Domain." Longman.

Lyman, F. 1981. "The responsive classroom discussion." In A.S. Anderson, ed., "Mainstreaming Digest." College Park, MD: University of Maryland College of Education. See also Millis & Cottell (1998).

Mazur, E. 1997. "Peer instruction: a user's manual." Prentice Hall; online at <http://galileo.harvard.edu/>, click on "Large Group" in the left column.

McKeachie, W.J. 1987. "Instructional evaluation: Current issues and possible improvements." Journal of Higher Education 58(3): 344-350.

McKeachie, W.J. 2003. Email communication of 1 April to Mike Theall as indicated in Theall (2003). McKeachie wrote: "Although I'm not happy with the quality of most classroom examinations, they presumably assess the learning the teachers wanted to achieve. Thus it seems to me that correlations of student achievement with student ratings indicate that the students are able to make valid judgments of whether or not they have learned what they were supposed to. One wishes that more teachers were oriented toward teaching higher level thinking. I think we should ask students how much they gained in critical thinking, but if that's not a goal of the teacher, lack of correlation with achievement would not be evidence of invalidity of the student rating. It seems to me that all we can expect the mean overall rating of the course to do is to correlate with the teachers' assessment of the students achievement."

Millis, B.J. & P.G. Cottell, Jr. 1998. "Cooperative Learning for Higher Education Faculty," American Council on Education Series on Higher Education. Oryx Press, Phoenix, AZ.

NCSU. 2005. "Assessment Instrument Information Page," Physics Education R & D Group, North Carolina State University; online at
<http://www.ncsu.edu/per/TestInfo.html>.

Novak, G.M., E. Patterson, A. Gavrin, and R.C. Enger. 1998. "Just-in-time teaching: active learner pedagogy with the WWW." IASTED International Conference on Computers and Advanced Technology in Education, May 27-30, Cancun, Mexico; online at <http://webphysics.iupui.edu/JITT/ccjitt.html>.

Novak, G. M., E.T. Patterson, A.D. Gavrin, W. Christian. 1999. "Just in time teaching: Blending Active Learning with Web Technology." Prentice Hall; description online at <http://webphysics.iupui.edu/jitt/jitt.html>.

Nuhfer, E. 2004. "Re: Back to Basics vs. Hands-On Instruction," POD Post of 21 Feb 2004 10:39:52-0700; online at <http://listserv.nd.edu/cgi-bin/wa?A2=ind0402&L=pod&O=D&P=16847>. Countered by Hake (2004b).

Nuhfer, E. 2005. "Re: How do you gauge how you're doing?" POD post of 11 Sep 2005 00:38:17-0600; online at
<http://listserv.nd.edu/cgi-bin/wa?A2=ind0509&L=pod&O=D&P=8190>.

Powell, K. 2003. "Spare me the lecture," Nature 425, 18 September 2003, pp. 234-236; online as a 388K pdf at <http://www.nature.com/cgi-taf/DynaPage.taf?file=/nature/journal/v425/n6955/index.html>, scroll down about 1/3 of the page to "News Features": "US research universities, with their enormous classes, have a poor reputation for teaching science. Experts agree that a shake-up is needed, but which strategies work best? Kendall Powell goes back to school." Powell wrote: "Evidence of [the failure of the passive-student lecture] is provided by assessments such as the Force Concept Inventory (FCI), a multiple-choice test designed to examine students' understanding of Newton's laws of mechanics. Developed around a decade ago by David Hestenes, a physicist turned education researcher at Arizona State University in Tempe, the FCI has changed
some researchers' opinions of their teaching techniques."

Scriven, M. 2004. "Re: pre- post testing in assessment," AERA-D post of 15 Sept 2004 19:27:14-0400; online at
<http://lists.asu.edu/cgi-bin/wa?A2=ind0409&L=aera-d&T=0&F=&S=&P=1952>.

Shadish, W.R., T.D. Cook, & D.T. Campbell. 2002. "Experimental and Quasi-Experimental Designs for Generalized Causal Inference." Houghton Mifflin. A goldmine of references to the social-science literature of experimentation.

Stokstad, E. 2001. "Reintroducing the Intro Course." Science 293: 1608-1610, 31 August 2001. This special issue on "Trends in Undergraduate Education" is online to subscribers at <http://www.sciencemag.org/content/vol293/issue5535/index.shtml>. Stokstad wrote: "Physicists are out in front in measuring how well students learn the basics, as science educators incorporate hands-on activities in hopes of making the introductory course a beginning rather than a finale."

Suskie, L. 2004a. "Re: pre- post testing in assessment," ASSESS post 19 Aug 2004 08:19:53-0400; online at <http://lsv.uky.edu/cgi-bin/wa.exe?A2=ind0408&L=assess&D=0&T=0&P=7492&F=P>. Suskie's canonical objections to pre/post testing were countered by Hake (2004a) and Scriven (2004).

Suskie, L. 2004b. "Assessing Student Learning," Anker Publishing.

Theall, M. 2003. "Re: Thin-Slice Judgments, End-of-Course Evaluations, Grades, and Student Learning," POD post of 2 Apr 2003 08:17:16-0500; online at <http://listserv.nd.edu/cgi-bin/wa?A2=ind0304&L=pod&P=R959&I=-3>.

Tobias, S. & R.R. Hake. 1988. "Professors as physics students: What can they teach us?" Am. J. Phys. 56(9): 786-794.

Wilson, M.R. & M.W. Bertenthal, eds. 2005. "Systems for State Science Assessment," Nat. Acad. Press; online at <http://www.nap.edu/catalog.php?record_id=11312>.

Wood, W.B. 2003. "Inquiry-Based Undergraduate Teaching in the Life Sciences at Large Research Universities: A Perspective on the Boyer Commission Report," Cell Biology Education 2: 112-116; online at <http://www.cellbioed.org/article.cfm?ArticleID=57>. Wood wrote: "The ineffectiveness of standard lecture-based curricula has been particularly well documented in physics. In the early 1990s, physicists at Arizona State University developed a test called the Force Concept Inventory (FCI), designed to examine students' understanding of basic concepts in mechanics (Hestenes et al., 1992). This and similar tests have been used to compare the prevalence of common misconceptions before and after taking an introductory physics course or completing a physics major. . . . . Transforming Standard Courses - Again, the physicists took the lead in putting these ideas into practice. Eric Mazur at Harvard pioneered the use of "ConcepTests," posing questions during a lecture to assess student understanding, allowing contiguous groups of students to discuss the answer, and then displaying the distribution of group responses to the class by various means (at first colored index cards, more recently electronic devices). Differences in the responses lead to more discussion as students work toward consensus answers (Mazur, 1997; Crouch and Mazur, 2001). Robert Beichner, at North Carolina State University, has presented evidence on the effects of transforming his physics classes in this manner to an entirely inquiry-based format <http://www.ncsu.edu/per/scaleup/html>, using redesigned, electronically equipped classrooms that facilitate student interaction in small groups and allow them to access the internet during class for help in solving problems (see Figure 1). Using pre- and post-testing with quantitative assessments like the FCI (Hake, 1998), as well as interviews and other qualitative techniques, he can show clearly that the transformed classes are far superior to standard courses in promoting student understanding, as reviewed in a recent issue of CBE (Dancy and Beichner, 2002)."

Wood, W.B., and J.M. Gentile. 2003. "Teaching in a research context," Science 302: 1510, 28 November 2003; online to subscribers only at
<http://www.sciencemag.org/content/vol302/issue5650/index.shtml#policyforum>.
They write [see the article for the references other than Hestenes et al. (1992) and Klymkowsky et al. (2003); my CAPS]: "Unknown to many university faculty in the natural sciences, particularly at large research institutions, is a large body of recent research from educators and cognitive scientists on how people learn [Bransford et al. (2000)]. The results show that MANY STANDARD INSTRUCTIONAL PRACTICES IN UNDERGRADUATE TEACHING, INCLUDING TRADITIONAL LECTURE, LABORATORY, AND RECITATION COURSES, ARE RELATIVELY INEFFECTIVE AT HELPING STUDENTS MASTER AND RETAIN THE IMPORTANT CONCEPTS OF THEIR DISCIPLINES OVER THE LONG TERM. Moreover, these practices do not adequately develop creative thinking, investigative, and collaborative problem-solving skills that employers often seek. PHYSICS EDUCATORS HAVE LED THE WAY in developing and using objective tests [Hestenes et al. (1992), Hake (1998a), NCSU (2005)] to compare student learning gains in different types of courses, and chemists, biologists, and others. . .[BUT EVIDENTLY NOT PSYCHOLOGISTS OR MATHEMATICIANS]. . . are now developing similar instruments [Mulford & Robinson (2002), Klymkowsky et al. (2003), Klymkowsky (2004)]."










