ABSTRACT: I discuss the 9/11 attack on my post "Re: How do you gauge how you're doing?" by POD's Ed Nuhfer, pointing out that not all observers take Nuhfer's view that the work of physics education researchers is essentially a duplication of the innovations of others and of little interest to those outside physics. I argue that Nuhfer is wrong in: (1) implying that my contrast of process and product measures is unreasonable, (2) suggesting that I think cognitive and affective factors are independent of one another, (3) presuming that I don't understand the importance of the affective domain and do not recognize that affective factors influence the cognitive impact of a course, (4) claiming that physicists have merely adopted the innovations of others, (5) stating that I am "savaging" education and psychology folk, (6) implying that I think SET's are a waste of time and that affective influences are a nuisance, and (7) presuming that I place exclusive emphasis on pre/post testing.

Those who dislike very long posts (56 kB), references, cross-posting, or academic debate; or who have no interest in a rebuttal of Nuhfer's attack, are urged to hit the DELETE button. And if you reply PLEASE DON'T HIT THE REPLY BUTTON UNLESS YOU PRUNE THE COPY OF THIS POST THAT MAY APPEAR IN YOUR REPLY DOWN TO A FEW RELEVANT LINES, OTHERWISE THE ENTIRE POST MAY BE NEEDLESSLY RESENT TO SUBSCRIBERS.

Ed Nuhfer (2005) responded to my post of 10 September 2005, "Re: How do you gauge how you're doing?" [Hake (2005)], with a vigorous 9/11 attack (so called because his post is dated 11 September 2005) on seven targets that I discuss below.

Nuhfer (2005) wrote:

11111111111111111111111111111111111111111111111
1. "I'm curious about the motivation for contrasting of the multiple measures of educational process . . . [(1) Reformed Teaching Observation Protocol (RTOP), (2) Student Evaluations Of Teaching (SET's), (3) Course Exams or Final Grades, (4) National Survey Of Student Engagement (NSSE), and (5) Student Assessment Of Learning Gains (SALG)] with product (test scores) as measured by a single tool . . . [(6) pre/post testing using (a) valid and consistently reliable tests devised by disciplinary experts, and (b) traditional courses as controls]. The tone of portraying an apparent competition between tests and other measures adds confusion when one reads. . . . [Hake's opinion that Student Evaluations of Teaching (SET's) are NOT valid measures of the cognitive (as opposed to the affective) impact of courses]."

In my opinion, Nuhfer's implication that my contrast of process and product measures is unreasonable is itself unreasonable. I clearly stated the motivation for the contrast: measures 1 - 5 of educational process (if one chooses to call "Course Exams or Final Grades" process measures) are INDIRECT (and therefore problematic) methods of measuring student learning. In sharp contrast, measure "6" is a DIRECT (and therefore less problematic) method of measuring student learning. Such pre/post testing as currently undertaken in undergraduate astronomy, economics, biology, chemistry, computer science, and engineering courses does not meet the U.S. Dept. of Education's (USDE's) pseudo "gold standard" of randomized control trials, but would nevertheless probably pass muster at the USDE's "What Works Clearinghouse" <http://www.w-w-c.org/> as "quasi-experimental studies [Shadish et al. (2002)] of especially strong design" [see <http://www.w-w-c.org/reviewprocess/standards.html>].

For introductory physics courses, pre/post testing has demonstrated that a nearly two-standard-deviation superiority in normalized learning gains for "interactive engagement" courses over traditional courses [Hake (2002a,b)] CAN be attained, thus contributing to the solution of Bloom's (1984) "two-sigma" problem. Such testing has led to marked improvement in many introductory physics courses throughout the nation [most notably at Harvard (Crouch & Mazur 2001), North Carolina State University (Beichner & Saul 2003), and MIT (Dori & Belcher 2004)]. I see no reason that similar results could not eventually be achieved in other disciplines IF their practitioners would undertake the lengthy qualitative and quantitative research [see e.g., Halloun & Hestenes (1985a,b)] required to develop multiple-choice (MC) tests of conceptual understanding that can be given to thousands of students in hundreds of courses under varying conditions.
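
For readers unfamiliar with the normalized-gain metric [Hake (1998a)], here is a minimal sketch of the arithmetic in Python; the pre/post class percentages are hypothetical numbers chosen for illustration, NOT data from any actual course:

  # Minimal sketch of the average normalized gain <g> [Hake (1998a)].
  # The pre/post class percentages below are HYPOTHETICAL illustrations.
  def normalized_gain(pre_pct, post_pct):
      # g = (actual gain) / (maximum possible gain)
      return (post_pct - pre_pct) / (100.0 - pre_pct)

  g_trad = normalized_gain(45.0, 56.0)  # 0.20 - of the order of traditional courses
  g_ie = normalized_gain(45.0, 70.0)    # 0.45 - of the order of IE courses
  print(round(g_trad, 2), round(g_ie, 2))

Note that g is insensitive to the pretest score in a way that a raw gain is not: a class going from 45% to 70% and a class going from 60% to 78% both achieve g = 0.45.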

How can MC tests gauge higher-order thinking skills such as conceptual understanding? Wilson & Bertenthal (2005) write:

"Performance assessment is an approach that offers great potential for assessing complex thinking and learning abilities, but multiple choice items also have their strengths. For example, although many people recognize that multiple-choice items are an efficient and effective way of determining how well students have acquired basic content knowledge, MANY DO NOT RECOGNIZE THAT THEY CAN ALSO BE USED TO MEASURE COMPLEX COGNITIVE PROCESSES. For example, THE FORCE CONCEPT INVENTORY (Hestenes, Wells, and Swackhamer, 1992) IS AN ASSESSMENT THAT USES MULTIPLE-CHOICE ITEMS TO TAP INTO HIGHER LEVEL COGNITIVE PROCESSES."



22222222222222222222222222222222222222222222222
2. "It seems unlikely to me that any cognitive learning can in practice be
isolated from affective influences."

Nuhfer seems to suggest that I think cognitive and affective factors are independent of one another. I think almost everyone would agree that student learning is influenced by affective factors. But so what? Does that mean that attempts to measure student learning directly are not worthwhile?


3333333333333333333333333333333333333333333333333
3. ". . . . The importance of both cognitive and affective domains. . . [Bloom et al. (1956), Krathwohl et al. (1964)]. . . has long been recognized although perhaps not understood by many practitioners. . . . Hake's opinion that student evaluations of teachers arise largely from the affective domain is supported by the thin slices research [Ambady & Rosenthal (1992)]. . . . Likewise, many resources show that the cognitive domain also contributes to evaluations in general but meaningful ways. . . "

Nuhfer is evidently presuming that I'm one of those insensitive technocratic oafs who doesn't understand the importance of the affective domain. If so, Nuhfer's presumption is wrong. In Hake (2002a) I wrote [see that article for the references other than Hake & Swihart (1979)]:

"I think SET's can be 'valid' in the sense that they can be useful for gauging the *affective* impact of a course and for providing diagnostic feedback to *teachers* [see, e.g., Hake & Swihart (1979)] to assist them in making mid-course corrections. However, IMHO, SET's are NOT valid in their widespread use by *administrators* to gauge the cognitive impact of courses [see, e.g., Williams & Ceci (1997); Hake (2000; 2002a,b); Johnson (2002)]. In fact the gross misuse of SET's as gauges of student learning is, in my view, one of the institutional factors that thwarts substantive educational reform (Hake 2002a, Lesson #12)."

That SET ratings are NOT valid measures of the cognitive impact of a course (even though SET's may be *affected* by cognitive factors) is argued in "Re: Problems with Student Evaluations: Is Assessment the Remedy?" [Hake (2002c)]. Therein I wrote [see that article for references other than McKeachie (1987)]:

HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
With regard to the problem of using course performance as a measure of student achievement or learning, Peter Cohen's (1981) oft-quoted meta-analysis of 41 studies on 68 separate multisection courses purportedly showing that:

"the average correlation between an overall instructor rating and student achievement was +0.43; the average correlation between an overall course rating and student achievement was +0.47 . . . the results . . . provide strong support for the validity of student ratings as measures of teaching effectiveness"

was reviewed and reanalyzed by Feldman (1989), who pointed out that McKeachie (1987) has recently reminded educational researchers and practitioners that the achievement tests assessing student learning in the sorts of studies reviewed here. . . (e.g., those by Cohen 1981, 1986, 1987). . . typically measure lower-level educational objectives such as memory of facts and definitions rather than higher-level outcomes such as critical thinking and problem solving. . . [he might have added conceptual understanding]. . . that are usually taken as important in higher education.

Striking back at SET skeptics, Peter Cohen (1990) opined:

"Negative attitudes toward student ratings are especially resistant to change, and it seems that faculty and administrators support their belief in student-rating myths with personal and anecdotal evidence, which (for them) outweighs empirically based research evidence."

However, as far as I know, NEITHER COHEN NOR ANY OTHER SET CHAMPION HAS COUNTERED THE FATAL OBJECTION OF MCKEACHIE (1987) THAT THE EVIDENCE FOR THE VALIDITY OF SET's AS GAUGES OF THE COGNITIVE IMPACT OF COURSES RESTS FOR THE MOST PART ON MEASURES OF STUDENTS' LOWER-LEVEL THINKING AS EXHIBITED IN COURSE GRADES OR EXAMS.

At least in physics it is well-known (see, e.g., Hake 2002a,b) that students in *traditional* mechanics courses can achieve A's through rote memorization and algorithmic problem solving, while achieving *normalized* gains in conceptual understanding of only about 0.2 (i.e., pre-to-post gains that are only about 0.2 of the maximum possible gain).
HHHHHHHHHHHHHHHHHHHHHHHHHHHHH

Regarding McKeachie's objection, Theall (2003) received an email from McKeachie (2003), who stated in response to the above-quoted material: "It seems to me that all we can expect the mean overall rating of the course to do is to correlate with the teachers' assessment of the students achievement." But IF one can measure student learning *directly*, then one can examine the correlation of *student learning* (as opposed to "achievement tests") vs SET ratings. As far as I know there has been no such systematic study, but anecdotal information from physics suggests that the correlation is more apt to be negative than positive; see, e.g., "Hostility to Interactive Engagement Methods" [Hake (2003a)].
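
To make the proposed study concrete: given per-course values of the average normalized gain <g> and of the mean SET rating, the analysis is simply a correlation. A minimal Python sketch follows; ALL course values below are invented for illustration and are NOT real data:

  # Correlation of directly measured learning (<g>) with mean SET ratings.
  # The six per-course values below are INVENTED for illustration only.
  gains = [0.18, 0.22, 0.35, 0.48, 0.52, 0.60]  # average normalized gains <g>
  ratings = [4.3, 4.1, 3.8, 3.5, 3.9, 3.2]      # mean SET ratings, 5-point scale

  def pearson_r(xs, ys):
      # Standard Pearson product-moment correlation coefficient.
      n = len(xs)
      mx, my = sum(xs) / n, sum(ys) / n
      sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
      sxx = sum((x - mx) ** 2 for x in xs)
      syy = sum((y - my) ** 2 for y in ys)
      return sxy / (sxx * syy) ** 0.5

  print(round(pearson_r(gains, ratings), 2))  # a negative r would accord with the anecdotes

A real study would of course require many courses, error estimates, and controls for student population, but the computation itself is this simple.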

Some may object that multiple-choice tests such as those employed in physics diagnostic tests [FLAG (2005), NCSU (2005)] cannot possibly measure higher-order thinking skills. But, as indicated in "1" above, see Wilson & Bertenthal (2005).


44444444444444444444444444444444444444444444444
4. ". . . the active learning and cooperative methods that Hake champions, largely developed and validated by educational researchers, such as the Johnson brothers at University of Minnesota, informed physics teachers before they began widely using them."

In my opinion, the above claim is incorrect. While it is true that the "interactive engagement" (IE) methods used in reform physics courses usually make some use of "Collaborative Peer Instruction" (usually prejudgmentally called "Collaborative Learning"), largely developed and validated by educational researchers such as the Johnson brothers, the same cannot be said for most of the educational methods utilized by physicists.

For example, Mazur's (1997) "Concept Tests" and "Peer Instruction" were claimed by Nuhfer (2004) to be nothing more than a repeat of "Think-Pair-Share" [Lyman (1981)]. But as I point out in a post "Re: Think-Pair-Share citation?" [Hake (2002d)], Mazur's work differs from "Think-Pair-Share" in that:

a. student discussions are not limited to pairs but may include larger groups (say 3 - 5 students) who are seated in close proximity,

b. instructor assessment of student responses is facilitated by an electronic "Classroom Communication System" (CCS),

c. Crouch & Mazur (2001) now employ "Just In Time Teaching" [Novak et al. (1998, 1999)] strategies to encourage students to study reading assignments before coming to class,

d. definitive pre/post testing (Crouch & Mazur 2001) has indicated the relative effectiveness of "Peer Instruction" in promoting student learning. I am unaware of similar evidence for the effectiveness of the "Think-Pair-Share" activity.

More generally, in Hake (2002a) I wrote [see that article for the references]:

HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
For the 48 interactive-engagement (IE) courses of Figs. 1 & 2, the ranking in terms of number of IE courses using each of the more popular methods is as follows:

(1) COLLABORATIVE PEER INSTRUCTION (Johnson et al. 1991; Heller et al. 1992a,b; Slavin 1995; Johnson et al. 2000): 48 (all courses) [CA] - for the meaning of "CA," and similar abbreviations below within the square brackets "[. . . .]", see the paragraph following this list.

(2) MICROCOMPUTER-BASED LABS (Thornton and Sokoloff 1990, 1998): 35 courses [DT].

(3) CONCEPT TESTS (Mazur 1997, Crouch & Mazur 2001): 20 courses [DT]; such tests for physics, biology, and chemistry are available on the web along with a description of the "Peer Instruction" method at the Galileo Project (2001).

(4) MODELING (Halloun & Hestenes 1987; Hestenes 1987, 1992; Wells et al. 1995): 19 courses [DT + CA]; a description is on the web at
<http://modeling.la.asu.edu/>.

(5) ACTIVE LEARNING PROBLEM SETS OR OVERVIEW CASE STUDIES (Van Heuvelen 1991a,b; 1995): 17 courses [CA]; information on these materials is online at
<http://www.physics.ohio-state.edu/~physedu/>.

(6) PHYSICS-EDUCATION-RESEARCH BASED TEXT (referenced in Hake 1998b, Table II) or no text: 13 courses.

(7) SOCRATIC DIALOGUE INDUCING LABS (Hake 1987, 1991, 1992, 2001a; Tobias & Hake 1988): 9 courses [DT + CA]; a description and lab manuals are on the web at the Galileo Project (2001) and <http://www.physics.indiana.edu/~sdi>.

The notations within the square brackets [. . .] follow Heller (1999) in loosely associating the methods with "learning theories" from cognitive science. Here "DT" stands for "Developmental Theory," originating with Piaget (Inhelder & Piaget 1958, Gardner 1985, Inhelder et al. 1987, Phillips & Soltis 1998); and "CA" stands for "Cognitive Apprenticeship" (Collins et al. 1989, Brown et al. 1989). All the methods (save #6) recognize the important role of social interactions in learning (Vygotsky 1978, Lave & Wenger 1991, Dewey 1997, Phillips & Soltis 1998). It should be emphasized that the above rankings are by popularity within the survey, and have no necessary connection with the effectiveness of the methods relative to one another. In fact, it is quite possible that some of the less popular methods used in some survey courses, as listed by Hake (1998b), could be more effective in terms of promoting student understanding than any of the above popular strategies.
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH

As far as I know educational researchers such as the Johnson brothers had little if anything to do with Microcomputer-based Labs, Concept Tests, Modeling, Active Learning Problem Sets, Overview Case Studies, or Socratic Dialogue Inducing Labs.


55555555555555555555555555555555555555555555555
5. "My guess is that had the Johnsons excoriated physics departments the way Hake has been savaging the psychology and education folks, physicists would have resisted adaptation of others' innovations much longer. Excoriating a discipline is more likely to inspire defensiveness in its members rather than inspire them to adopt specific practices."

Nuhfer's uninformed implication that physicists merely *adapted* others' innovations is debunked in section "4" above. Regarding my "savaging the psychology and education folks": I have emphasized the importance of the contributions from psychologists and education specialists in Lesson #4 of the physics education reform effort [Hake (2002a)]:

"Education Research and Development (R&D) by disciplinary experts (DE's), and of the same quality and nature as traditional science/engineering R&D, is needed to develop potentially effective educational methods within each discipline. But the DE's should take advantage of the insights of (a) DE's doing education R&D in other disciplines, (b) COGNITIVE SCIENTISTS. . .[COGNITIVE SCIENCE INCLUDES PSYCHOLOGY]. . ., (c) FACULTY AND GRADUATES OF EDUCATION SCHOOLS, and (d) classroom teachers."

Of course, it's true that I *have* constructively criticized psychologists for not researching the effectiveness of their own courses [Hake (2005)]. But since when has constructive criticism been regarded as "savaging"? The reaction of subscribers to PsychTeacher, TIPS (Teaching In the Psychological Sciences), and TeachingEdPsych to this criticism has ranged from indifference to irritation. The PsychTeacher moderators even kicked me off their list. Moderator Rick Froman emailed me that ". . . we have decided to end this thread. . . due to the fact that the thread is not progressing and has gotten to point where there are few list members participating in it." But not *all* psychologists are indifferent to valid criticism. For example, psychologist David Berliner (2005) wrote: "Thanks for your provocative and educational emails."


6666666666666666666666666666666666666666666
6. "Just because evaluations may reflect affective feelings more than cognitive gains doesn't mean they are not valuable. If one wants to deal with facts and calculations without the 'nuisance' of affective influences, one will be happier programming computers than interacting with people."

Nuhfer is evidently implying that I think SET's are a waste of time and that affective influences are a nuisance. I countered that misconception in "3" above.


777777777777777777777777777777777777777777
7. "Pre-post testing provides information worth gathering. It is now one
generally accepted practice in assessment. . .[please tell that to assessment experts such as Linda Suskie (2004a,b) [Suskie's (2004a) canonical objections to pre/post testing are countered by Hake (2004a) and Scriven (2004)]] . . . However, tests can only measure some limited learning through specific sampling afforded by the tool. I have the same problems with attributing too much value to pre-post testing that I have with the testing mania of "no child left behind." A test is just one measuring tool. Because successful education is far more than courses, tests and grades, its assessment requires multiple tools and multiple measures."

The low-stakes formative pre/post testing that I advocate is the polar opposite of the high-stakes summative testing mandated by the No Child Left Behind act.

And Nuhfer's evident presumption that I place exclusive emphasis on pre/post testing is wrong, as shown by:

a. In Hake (1998b) I discuss the case studies and instructor surveys that I conducted to help validate my survey [Hake (1998a)].

b. The development of Socratic Dialogue Inducing (SDI) Labs [Hake (1992, 2002e)] required extensive *qualitative* research. That research involved the analysis of: (1) videotaped individual interviews probing both cognitive and affective states of introductory physics students, (2) videotaped SDI lab sessions, including discussions both among students and between Socratic dialogists and students, and (3) comments and performance of non-physical-science professors enrolled in the introductory physics course [Tobias & Hake (1988)].

c. In Hake (2002b) I wrote [bracketed by lines "HHHHHH. . . ."; see that article for the references]:

HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Does the class average normalized gain <g>. . .[see, e.g. Hake (1998a,b; 2002a,b)]. . . for the FCI (Force Concept Inventory), MD (Mechanics Diagnostic), or FMCE (Force Motion Concept Evaluation) provide a definitive assessment of the *overall* effectiveness of an introductory physics class? . . [For references to these tests see e.g., Hake (2002b)]. . . .

NO! It assesses "only the attainment of a minimal conceptual understanding of mechanics." . . . . Furthermore, as indicated in . . .[the unjustifiably suppressed]. . . Hake (1998b), among desirable outcomes of the introductory course that <g> does NOT measure directly are students':

(a) satisfaction with and interest in physics;

(b) understanding of the nature, methods, and limitations of science;

(c) understanding of the processes of scientific inquiry such as experimental design, control of variables, dimensional analysis, order-of-magnitude estimation, thought experiments, hypothetical reasoning, graphing, and error analysis;

(d) ability to articulate their knowledge and learning processes;

(e) ability to collaborate and work in groups;

(f) communication skills;

(g) ability to solve real-world problems;

(h) understanding of the history of science and the relationship of science to society and other disciplines;

(i) understanding of, or at least appreciation for, "modern" physics;

(j) ability to participate in authentic research.
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH

Nuhfer's (2005) patronizing attack on me and on the value of physics education research is a continuation of his vitriolic blast "Re: Back to Basics vs. Hands-On Instruction" [Nuhfer (2004) - countered by Hake (2004b)]. It's fortunate that not all observers take the uninformed view of Nuhfer that the work of physics education researchers is essentially a duplication of the innovations of others and of little interest to those outside physics - see, e.g., Stokstad (2001), Wood (2003), Wood & Gentile (2003), Powell (2003), Klymkowsky et al. (2003), Handelsman et al. (2004), Klymkowsky (2005).

Summarizing the above seven-part rebuttal, I think Nuhfer (2005) is WRONG in:

(1) implying that my contrast of process and product measures is unreasonable,

(2) suggesting that I think cognitive and affective factors are independent of one another,

(3) presuming that I don't understand the importance of the affective domain and do not recognize that affective factors influence cognitive factors,

(4) claiming that physicists have merely adopted the innovations of others,

(5) stating that I am "savaging" education and psychology folk,

(6) implying that I think SET's are a waste of time and that affective influences are a nuisance,

(7) presuming that I place exclusive emphasis on pre/post testing.


Richard Hake, Emeritus Professor of Physics, Indiana University
24245 Hatteras Street, Woodland Hills, CA 91367
<[EMAIL PROTECTED]>
<http://www.physics.indiana.edu/~hake>
<http://www.physics.indiana.edu/~sdi>

"Conflict is the gadfly of thought. It stirs us to observation and
memory. It instigates to invention. It shocks us out of sheep-like
passivity, and sets us at noting and contriving. Not that it always
effects this result; but that conflict is a sine qua non of
reflection and ingenuity."
    John Dewey "Morals Are Human," Dewey: Middle Works, Vol.14, p. 207.

REFERENCES
Ambady, N. & R. Rosenthal. 1992. "Thin Slices of Expressive Behavior as Predictors of Interpersonal Consequences: A Meta-analysis," Psychological Bulletin 111: 256-274. For an introduction to the "thin-slice judgment" literature and debate thereon see Hake (2003b,c,d).

Anderson, L.W. & L.A. Sosniak, eds. 1994. "Bloom's Taxonomy: A Forty-Year Retrospective," Ninety-Third Yearbook of the National Society for the Study of Education. Univ. of Chicago Press.

Anderson, L.W. & D. Krathwohl, eds. 2001. "A Taxonomy for Learning, Teaching and Assessing: A Revision of Bloom's Taxonomy of Educational Objectives." Addison Wesley Longman. See also Anderson & Sosniak (1994).

Beichner, R.J. & J.M. Saul. 2003. "Introduction to the SCALE-UP (Student-Centered Activities for Large Enrollment Undergraduate Programs) Project," submitted to the Proceedings of the International School of Physics "Enrico Fermi", Varenna, Italy (July 2003); online at
<http://www.ncsu.edu/per/Articles/Varenna_SCALEUP_Paper.pdf> (1MB).

Berliner, D. 2005. "Re: Teachers: the Archimedian Lever for Elevating Public-Schools," TeachingEdPsych post of 3 Jun 2005 22:24:59-0700; online at
<https://listserv.temple.edu/cgi-bin/wa?A2=ind0506&L=TEACHING_EDPSYCH&P=R102&I=-3&X=1942B662204F6E3D69&Y=rrhake%40earthlink.net>.

Bloom, B.S., ed., M.D. Engelhart, E.J. Furst, W.H. Hill, & D.R. Krathwohl. 1956. "Taxonomy of educational objectives: Handbook I: Cognitive domain." David McKay. For an updated version that incorporates post-1956 advances in cognitive science see Anderson & Krathwohl (2001).

Bloom, B.S. 1984. "The 2 Sigma Problem: The Search for Methods of Group Instruction as Effective as One-to-One Tutoring," Educational Researcher 13(6), 4-16 (1984). Bloom wrote: "Using the standard deviation (sigma) of the control (conventional) class, it was typically found that the average student under tutoring was about two standard deviations above the average of the control class. . . The tutoring process demonstrates that MOST of the students do have the potential to reach this high level of learning. I believe an important task of research and instruction is to seek ways of accomplishing this under more practical and realistic conditions than the one-to-one tutoring, which is too costly for most societies to bear on a large scale. This is the '2 sigma' problem."

Crouch, C.H. & E. Mazur. 2001. "Peer Instruction: Ten years of experience and results," Am. J. Phys. 69: 970-977; online at <http://mazur-www.harvard.edu/library.php>, search "All Education Areas" for author "Crouch" (without the quotes).

Dori, Y.J. & J. Belcher. 2004. "How Does Technology-Enabled Active Learning Affect Undergraduate Students' Understanding of Electromagnetism Concepts?" The Journal of the Learning Sciences 14(2), abstract online at <http://www.cc.gatech.edu/lst/jls/vol14no2.html>. The complete article is online at <http://web.mit.edu/jbelcher/www/TEALref/TEAL_Dori&Belcher_JLS_10_01_2004.pdf> (1 MB).

FLAG. 2005. "Field-tested Learning Assessment Guide," online at <http://www.flaguide.org/>: ". . . offers broadly applicable, self-contained modular classroom assessment techniques (CAT's) and discipline-specific tools for STEM [Science, Technology, Engineering, and Mathematics] instructors interested in new approaches to evaluating student learning, attitudes and performance. Each has been developed, tested, and refined in real college and university classrooms." Assessment tools for physics and astronomy (and other disciplines) are at <http://www.flaguide.org/tools/tools.php>.

Hake, R.R. & J.C. Swihart. 1979. "Diagnostic Student Computerized Evaluation of Multicomponent Courses," Teaching and Learning, Vol. V, No. 3 (Indiana University, January 1979, updated 11/97); online as ref. #4 at <http://www.physics.indiana.edu/~hake/>, or download directly by clicking on <http://www.physics.indiana.edu/~sdi/DISCOE2.pdf> (20 kB).

Hake, R.R. 1992. "Socratic pedagogy in the introductory physics lab." Phys. Teach. 30: 546-552; updated version (4/27/98) online as ref. 23 at <http://www.physics.indiana.edu/~hake>, or simply click on
<http://www.physics.indiana.edu/~sdi/SocPed1.pdf> (88 kB).

Hake, R.R. 1998a. "Interactive-engagement vs traditional methods: A six-thousand-student survey of mechanics test data for introductory physics courses," Am. J. Phys. 66: 64-74; online as ref. 24 at <http://www.physics.indiana.edu/~hake>, or simply click on <http://www.physics.indiana.edu/~sdi/ajpv3i.pdf> (84 kB).

Hake, R.R. 1998b. "Interactive-engagement methods in introductory mechanics courses," online as ref. 25 at <http://www.physics.indiana.edu/~hake>, or simply click on <http://www.physics.indiana.edu/~sdi/IEM-2b.pdf> (108 kB) - a crucial companion paper to Hake (1998a).

Hake, R.R. 2002a. "Lessons from the physics education reform effort," Ecology and Society 5(2): 28; online at
<http://www.ecologyandsociety.org/vol5/iss2/art28/>. Ecology and Society
(formerly Conservation Ecology) is a free online "peer-reviewed journal of integrative science and fundamental policy research" with about 11,000 subscribers in about 108 countries.

Hake, R.R. 2002b. "Assessment of Physics Teaching Methods," Proceedings of the UNESCO-ASPEN Workshop on Active Learning in Physics, Univ. of Peradeniya, Sri Lanka, 2-4 Dec. 2002; also online as ref. 29 at
<http://www.physics.indiana.edu/~hake/>, or download directly by clicking on
<http://www.physics.indiana.edu/~hake/Hake-SriLanka-Assessb.pdf> (84 kB).

Hake, R.R. 2002c. "Re: Problems with Student Evaluations: Is Assessment the Remedy?" online at <http://listserv.nd.edu/cgi-bin/wa?A2=ind0204&L=pod&P=R14535>. Post of 25 Apr 2002 16:54:24-0700 to AERA-D, ASSESS, EvalTalk, Phys-L, PhysLrnR, POD, & STLHE-L. Slightly edited and improved on 16 November 2002 as ref. 18 at <http://www.physics.indiana.edu/~hake> or download directly at <http://www.physics.indiana.edu/~hake/AssessTheRem1.pdf> (72 kB)[This is the best version.]. Also online in HTML at <http://www.stu.ca/~hunt/hake.htm> as one of the many resources in Russ Hunt's annotated bibliography of articles and books on student evaluation of teaching <http://www.stu.ca/~hunt/evalbib.htm>.

Hake, R.R. 2002d. "Re: Think-Pair-Share citation?" misdated by my stupid computer as 1904; online at
<http://listserv.nd.edu/cgi-bin/wa?A2=ind0201&L=pod&P=R1686&I=-3>.

Hake, R.R. 2002e. "Socratic Dialogue Inducing Laboratory Workshop," Proceedings of the UNESCO-ASPEN Workshop on Active Learning in Physics, Univ. of Peradeniya, Sri Lanka, 2-4 Dec. 2002; also online as ref. 28 at
<http://www.physics.indiana.edu/~hake/> or download directly by clicking on
<http://www.physics.indiana.edu/~hake/Hake-SriLanka-SDIb.pdf> (44 KB).

Hake, R.R. 2003a. "Hostility to Interactive Engagement Methods," online at <http://listserv.nd.edu/cgi-bin/wa?A2=ind0310&L=pod&P=R5851&I=-3>. Post of 13 Oct 2003 15:08:37-0700 to Phys-L, PhysLrnR, and POD.

Hake, R.R. 2003b. "Thin-Slice Judgments, End-of-Course Evaluations, Grades, and Student Learning," online at <http://listserv.nd.edu/cgi-bin/wa?A2=ind0303&L=pod&P=R18434>. Post to ASSESS, EvalTalk, PhysLrnR, POD, & STLHE-L of 28 Mar 2003 16:23:25-0800.

Hake, R.R. 2003c. "Thin-Slice Judgments, End-of-Course Evaluations, Grades, and Student Learning - CORRECTIONS," online at <http://listserv.nd.edu/cgi-bin/wa?A2=ind0303&L=pod&P=R19378>. Post to ASSESS, EvalTalk, PhysLrnR, POD, & STLHE-L of 29 Mar 2003 11:45:27-0800.

Hake, R.R. 2003d. "Thin-Slice Judgments, End-of-Course Evaluations, Grades, and Student Learning," online at <http://listserv.nd.edu/cgi-bin/wa?A2=ind0303&L=pod&P=R21469>. Post to ASSESS, EvalTalk, PhysLrnR, POD, & STLHE-L of 31 Mar 2003 12:47:55-0800.

Hake, R.R. 2004a. "Re: pre-post testing in assessment," online at
<http://listserv.nd.edu/cgi-bin/wa?A2=ind0408&L=pod&P=R9135&I=-3>. Post of 19 Aug 2004 13:56:07-0700 to AERA-D, AERA-J, EDSTAT-L,
EVALTALK, PhysLrnR, and POD.

Hake, R.R. 2004b. "Re: Back to Basics vs. Hands-On Instruction," online at
<http://listserv.nd.edu/cgi-bin/wa?A2=ind0402&L=pod&P=R12714&I=-3>.
Post of 21 Feb 2004 21:45:48 -0800 to PhysLrnR and POD.

Hake, R.R. 2005.  "Re: How do you gauge how you're doing?" online at
<http://listserv.nd.edu/cgi-bin/wa?A2=ind0509&L=pod&O=D&P=7968>. Post of
10 Sep 2005 17:27:47-0700 to AERA-D, AERA-GSL, AERA-J, AERA-L, ASSESS, EvalTalk, PhysLrnR, PsychTeacher (rejected), TIPS, & TeachingEdPsych.

Halloun, I. & D. Hestenes. 1985a. "The initial knowledge state of college physics students." Am. J. Phys. 53:1043-1055; online at <http://modeling.asu.edu/R&E/Research.html>. Contains the "Mechanics Diagnostic" test (omitted from the online version), precursor to the "Force Concept Inventory."

Halloun, I. & D. Hestenes. 1985b. "Common sense concepts about motion." Am.
J. Phys. 53:1056-1065; online at <http://modeling.asu.edu/R&E/Research.html>.

Halloun, I., R.R. Hake, E.P. Mosca, & D. Hestenes. 1995. "Force Concept Inventory" (Revised, 1995); online (password protected) at <http://modeling.asu.edu/R&E/Research.html>. Available in English, Spanish, German, Malaysian, Chinese, Finnish, and Russian.

Handelsman, J., D. Ebert-May, R. Beichner, P. Bruns, A. Chang, R. DeHaan, J. Gentile, S. Lauffer, J. Stewart, S.M. Tilghman, W.B. Wood. 2004. "Scientific Teaching," Science 304(5670): 521-522, 23 April; online for free (entire article to "Science" subscribers, abstract to guests) at <http://www.sciencemag.org/search.dtl>, search for Volume 304, First Page 521. Supporting Online Material (SOM) - showing physics contributions - may be freely downloaded at <http://www.sciencemag.org/cgi/data/304/5670/521/DC1/1>. The complete article may be downloaded for free at Handelsman's homepage as a 100 kB pdf <http://www.plantpath.wisc.edu/fac/joh/scientificteaching.pdf>, or as an 88 kB pdf at John Belcher's site <http://web.mit.edu/jbelcher/www/TEALref/scientificteaching.pdf>. See also Wood & Handelsman (2004).

Hestenes, D., M. Wells, & G. Swackhamer. 1992. "Force Concept Inventory." Phys. Teach. 30: 141-158; online (except for the test itself) at <http://modeling.asu.edu/R&E/Research.html>. For the slightly revised 1995 version see Halloun et al. (1995).

Klymkowsky, M.W., K. Garvin-Doxas, & M. Zeilik. 2003. "Bioliteracy and Teaching Efficiency: What Biologists Can Learn from Physicists," Cell Biology Education 2: 155-161; online at <http://www.cellbioed.org/article.cfm?ArticleID=67>. The abstract reads: "The introduction of the Force Concept Inventory (FCI) by Hestenes et al. (1992) produced a remarkable impact within the community of physics teachers. An instrument to measure student comprehension of the Newtonian concept of force, the FCI demonstrates that active learning leads to far superior student conceptual learning than didactic lectures. Compared to a working knowledge of physics, biological literacy and illiteracy have an even more direct, dramatic, and personal impact. They shape public research and reproductive health policies, the acceptance or rejection of technological advances, such as vaccinations, genetically modified foods and gene therapies, and, on the personal front, the reasoned evaluation of product claims and lifestyle choices. While many students take biology courses at both the secondary and the college levels, there is little in the way of reliable and valid assessment of the effectiveness of biological education. This lack has important consequences in terms of general bioliteracy and, in turn, for our society. Here we describe the beginning of a community effort to define what a bioliterate person needs to know and to develop, validate, and disseminate a tiered series of instruments collectively known as the Biology Concept Inventory (BCI), which accurately measures student comprehension of concepts in introductory, genetic, molecular, cell, and developmental biology. The BCI should serve as a lever for moving our current educational system in a direction that delivers a deeper conceptual understanding of the fundamental ideas upon which biology and biomedical sciences are based."

Klymkowsky, M.W. 2005. "Can Nonmajors Courses Lead to Biological Literacy? Do Majors Courses Do Any Better?" Cell Biology Education 4; online at <http://www.cellbioed.org/article.cfm?ArticleID=155>. Klymkowsky wrote "If biologists had assessment instruments analogous to the Force Concept Inventory for basic Newtonian mechanics (Hestenes et al., 1992; Klymkowsky et al., 2003), introductory majors and nonmajors courses would converge toward a common focus on fundamental concepts, critical to communicating in the language of biology. Introductory majors courses will spend more time ensuring that students actually understand the material presented (which is likely to drastically reduce the quantity of material "covered" per credit hour), while nonmajors courses will be forced to cover basic concepts needed to understand biological processes."


Krathwohl, D.R., B.S. Bloom, & B.B. Masia. 1964. "Taxonomy of educational objectives, the Classification of Educational Goals; Handbook II: The affective domain." David McKay. For updates see Krathwohl et al. (1990) and Anderson & Sosniak (1994).

Krathwohl, D.R., B.B. Masia, with B.S. Bloom. 1990. "Taxonomy of Educational Objectives Book 2; Affective Domain." Longman.

Lyman, F. 1981. "The responsive classroom discussion." In A.S. Anderson, ed., "Mainstreaming Digest." College Park, MD: University of Maryland College of Education. See also Millis & Cottell (1998).

Mazur, E. 1997. "Peer instruction: a user's manual." Prentice Hall; online at <http://galileo.harvard.edu/>, click on "Large Group" in the left column.

McKeachie, W.J. 1987. "Instructional evaluation: Current issues and possible improvements." Journal of Higher Education 58(3): 344-350.

McKeachie, W.J. 2003. Email communication of 1 April to Mike Theall as indicated in Theall (2003). McKeachie wrote: "Although I'm not happy with the quality of most classroom examinations, they presumably assess the learning the teachers wanted to achieve. Thus it seems to me that correlations of student achievement with student ratings indicate that the students are able to make valid judgments of whether or not they have learned what they were supposed to. One wishes that more teachers were oriented toward teaching higher level thinking. I think we should ask students how much they gained in critical thinking, but if that's not a goal of the teacher, lack of correlation with achievement would not be evidence of invalidity of the student rating. It seems to me that all we can expect the mean overall rating of the course to do is to correlate with the teachers' assessment of the students achievement."

Millis, B.J. & P.G. Cottell, Jr. 1998. "Cooperative Learning for Higher Education Faculty," American Council on Education Series on Higher Education. Oryx Press, Phoenix, AZ.

NCSU. 2005. "Assessment Instrument Information Page," Physics Education R & D Group, North Carolina State University; online at
<http://www.ncsu.edu/per/TestInfo.html>.

Novak, G.M., E. Patterson, A. Gavrin, and R.C. Enger. 1998. "Just-in-time teaching: active learner pedagogy with the WWW." IASTED International Conference on Computers and Advanced Technology in Education, May 27-30, Cancun, Mexico; online at <http://webphysics.iupui.edu/JITT/ccjitt.html>.

Novak, G. M., E.T. Patterson, A.D. Gavrin, W. Christian. 1999. "Just in time teaching: Blending Active Learning with Web Technology." Prentice Hall; description online at <http://webphysics.iupui.edu/jitt/jitt.html>.

Nuhfer, E. 2004. "Re: Back to Basics vs. Hands-On Instruction," POD Post of 21 Feb 2004 10:39:52-0700; online at <http://listserv.nd.edu/cgi-bin/wa?A2=ind0402&L=pod&O=D&P=16847>. Countered by Hake (2004b).

Nuhfer, E. 2005. "Re: How do you gauge how you're doing?" POD post of 11 Sep 2005 00:38:17-0600; online at
<http://listserv.nd.edu/cgi-bin/wa?A2=ind0509&L=pod&O=D&P=8190>.

Powell, K. 2003. "Spare me the lecture," Nature 425, 18 September 2003, pp. 234-236; online as a 388K pdf at <http://www.nature.com/cgi-taf/DynaPage.taf?file=/nature/journal/v425/n6955/index.html>, scroll down about 1/3 of the page to "News Features": "US research universities, with their enormous classes, have a poor reputation for teaching science. Experts agree that a shake-up is needed, but which strategies work best? Kendall Powell goes back to school." Powell wrote: "Evidence of [the failure of the passive-student lecture] is provided by assessments such as the Force Concept Inventory (FCI), a multiple-choice test designed to examine students' understanding of Newton's laws of mechanics. Developed around a decade ago by David Hestenes, a physicist turned education researcher at Arizona State University in Tempe, the FCI has changed
some researchers' opinions of their teaching techniques."

Scriven, M. 2004. "Re: pre- post testing in assessment," AERA-D post of 15 Sept 2004 19:27:14-0400; online at
<http://lists.asu.edu/cgi-bin/wa?A2=ind0409&L=aera-d&T=0&F=&S=&P=1952>.

Shadish, W.R., T.D. Cook, & D.T. Campbell. 2002. "Experimental and Quasi-Experimental Designs for Generalized Causal Inference." Houghton Mifflin. A goldmine of references to the social-science literature of experimentation.

Stokstad, E. 2001. "Reintroducing the Intro Course." Science 293: 1608-1610, 31 August 2001. This special issue on "Trends in Undergraduate Education" is online to subscribers at <http://www.sciencemag.org/content/vol293/issue5535/index.shtml>. Stokstad wrote: "Physicists are out in front in measuring how well students learn the basics, as science educators incorporate hands-on activities in hopes of making the introductory course a beginning rather than a finale."

Suskie, L. 2004a. "Re: pre- post testing in assessment," ASSESS post 19 Aug 2004 08:19:53-0400; online at <http://lsv.uky.edu/cgi-bin/wa.exe?A2=ind0408&L=assess&D=0&T=0&P=7492&F=P>. Suskie's canonical objections to pre/post testing were countered by Hake (2004a) and Scriven (2004).

Suskie, L. 2004b. "Assessing Student Learning," Anker Publishing.

Theall, M. 2003. "Re: Thin-Slice Judgments, End-of-Course Evaluations, Grades, and Student Learning," POD post of 2 Apr 2003 08:17:16-0500; online at <http://listserv.nd.edu/cgi-bin/wa?A2=ind0304&L=pod&P=R959&I=-3>.

Tobias, S. & R.R. Hake. 1988. "Professors as physics students: What can they teach us?" Am. J. Phys. 56(9): 786-794.

Wilson, M.R. & M.W. Bertenthal, eds. 2005. "Systems for State Science Assessment," Nat. Acad. Press; online at <http://www.nap.edu/catalog.php?record_id=11312>.

Wood, W.B. 2003. "Inquiry-Based Undergraduate Teaching in the Life Sciences at Large Research Universities: A Perspective on the Boyer Commission Report," Cell Biology Education 2: 112-116; online at <http://www.cellbioed.org/article.cfm?ArticleID=57>. Wood wrote: "The ineffectiveness of standard lecture-based curricula has been particularly well documented in physics. In the early 1990s, physicists at Arizona State University developed a test called the Force Concept Inventory (FCI), designed to examine students' understanding of basic concepts in mechanics (Hestenes et al., 1992). This and similar tests have been used to compare the prevalence of common misconceptions before and after taking an introductory physics course or completing a physics major. . . . . Transforming Standard Courses - Again, the physicists took the lead in putting these ideas into practice. Eric Mazur at Harvard pioneered the use of "ConcepTests," posing questions during a lecture to assess student understanding, allowing contiguous groups of students to discuss the answer, and then displaying the distribution of group responses to the class by various means (at first colored index cards, more recently electronic devices). Differences in the responses lead to more discussion as students work toward consensus answers (Mazur, 1997; Crouch and Mazur, 2001). Robert Beichner, at North Carolina State University, has presented evidence on the effects of transforming his physics classes in this manner to an entirely inquiry-based format <http://www.ncsu.edu/per/scaleup/html>, using redesigned, electronically equipped classrooms that facilitate student interaction in small groups and allow them to access the internet during class for help in solving problems (see Figure 1). Using pre- and post-testing with quantitative assessments like the FCI (Hake, 1998), as well as interviews and other qualitative techniques, he can show clearly that the transformed classes are far superior to standard courses in promoting student understanding, as reviewed in a recent issue of CBE (Dancy and Beichner, 2002)."

Wood, W.B., and J.M. Gentile. 2003. "Teaching in a research context," Science 302: 1510, 28 November 2003; online to subscribers only at
<http://www.sciencemag.org/content/vol302/issue5650/index.shtml#policyforum>.
They write [see the article for the references other than Hestenes et al. (1992) and Klymkowsky et al. (2003); my CAPS]: "Unknown to many university faculty in the natural sciences, particularly at large research institutions, is a large body of recent research from educators and cognitive scientists on how people learn [Bransford et al. (2000)]. The results show that MANY STANDARD INSTRUCTIONAL PRACTICES IN UNDERGRADUATE TEACHING, INCLUDING TRADITIONAL LECTURE, LABORATORY, AND RECITATION COURSES, ARE RELATIVELY INEFFECTIVE AT HELPING STUDENTS MASTER AND RETAIN THE IMPORTANT CONCEPTS OF THEIR DISCIPLINES OVER THE LONG TERM. Moreover, these practices do not adequately develop creative thinking, investigative, and collaborative problem-solving skills that employers often seek. PHYSICS EDUCATORS HAVE LED THE WAY in developing and using objective tests [Hestenes et al. (1992), Hake (1998a), NCSU (2005)] to compare student learning gains in different types of courses, and chemists, biologists, and others. . .[BUT EVIDENTLY NOT PSYCHOLOGISTS OR MATHEMATICIANS]. . . are now developing similar instruments [Mulford & Robinson (2002), Klymkowsky et al. (2003), Klymkowsky (2004)]."










