Dear Mark,
The assessments used are performance-based. In particular, they are very
similar to what you define as "charettes" in your paper (Michael
McCracken, Vicki Almstrum, Danny Diaz, Mark Guzdial, Dianne Hagan, Yifat
Ben-David Kolikant, Cary Laxer, Lynda Thomas, Ian Utting, and Tadeusz Wilusz.
2001. A multi-national, multi-institutional study of assessment of programming
skills of first-year CS students. SIGCSE Bull. 33, 4 (December 2001),
125-180).
One difference is that the assessments in our paper were carried out only once
(as opposed to on a regular basis). Another difference is that students were
given a number of different small tasks (fewer than 10) to implement. The
difficulty of the tasks was incremental, starting from a basic task such as
printing the string "hello world" and moving to more complex ones. The tasks
were designed to test students' familiarity with individual topics (for
example, print your name 25 times, to test their understanding of loops), but
also their ability to handle several of these concepts together (for example,
print a random number only if it is smaller than 896). In this way it has been
possible to measure and grade their proficiency. The pass condition is that the
program must compile and run correctly.
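To make this concrete, here is a minimal Python sketch of the kind of
incremental tasks described above (illustrative only: these are not the exact
tasks set in the assessments, and Python is only one of the two languages
compared):

    import random

    # Basic task: print a fixed string.
    print("hello world")

    # Loops: print your name 25 times.
    for _ in range(25):
        print("Your Name")

    # Combining concepts (conditionals plus library use): print a random
    # number only if it is smaller than 896.
    number = random.randint(0, 1000)
    if number < 896:
        print(number)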
Only one person marked all the assessments (the assumption here is that a
single marker is consistent, even though each cohort was large and each cohort
came from a different academic year).
Since the assessments counted towards the students' grades, the usual
standards and protocols for reviewing and marking exams were followed.
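If a second marker were available, agreement between markers could be
summarised with a statistic such as Cohen's kappa; the sketch below is purely
illustrative and uses made-up grades rather than any data from the study.

    from collections import Counter

    def cohens_kappa(rater_a, rater_b):
        # Chance-corrected agreement between two raters over the same items.
        n = len(rater_a)
        observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
        freq_a, freq_b = Counter(rater_a), Counter(rater_b)
        expected = sum(freq_a[g] * freq_b[g] for g in freq_a) / (n * n)
        return (observed - expected) / (1 - expected)

    # Hypothetical grades given by two markers to the same ten submissions.
    marker_1 = ["A", "B", "B", "C", "A", "F", "B", "C", "A", "B"]
    marker_2 = ["A", "B", "C", "C", "A", "F", "B", "B", "A", "B"]
    print(round(cohens_kappa(marker_1, marker_2), 2))  # 0.71 for this made-up data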
stasha
From: Guzdial, Mark [guzd...@cc.gatech.edu]
Sent: 02 March 2011 15:08
To: Stasha Lauria
Cc: Stefano Federici; PPIG Listserve; Stefano Federici
Subject: Re: evaluation of new tools to teach computer programming
Stasha, it is unclear from your paper (at least to me) what assessment you were using.
You describe the rubric:
The use of loops, conditionals, etc. to achieve the given task was part of the
marking criteria. For example, a grade A requires the correct use of
conditionals, loops, and libraries in the implemented program.
What effort was made to make sure that the assessment was reliable and valid?
For example, did you have multiple raters? Do you have an indication of the
inter-rater reliability?
Thanks!
Mark
On Mar 2, 2011, at 9:39 AM, Stasha Lauria wrote:
We have evaluated the difference between using Python (this could be seen as
the tool) and Java to teach programming to beginners. The evaluation is based
on an analysis of student assessments. The aim of the comparison is to
quantitatively measure students' ability to master basic programming concepts
when Python is used instead of object-oriented Java. Both assessments consisted
of students having to implement a program. For further details, the paper can
be accessed below:
http://www.ics.heacademy.ac.uk/italics/vol10iss1.htm
or
http://www.ics.heacademy.ac.uk/italics/download.php?file=italics/vol10iss1/pdfs/paper10.pdf
I hope this helps.
Regards,
Stasha
From: Stefano Federici [sfeder...@unica.it]
Sent: 01 March 2011 14:29
To: PPIG Listserve
Cc: Stefano Federici
Subject: Re: evaluation of new tools to teach computer programming
Thanks a lot, Thomas and John, for your suggestions.
To clarify my setting: I have two new tools (aimed at teaching two
different topics: the first general programming, the second sorting
algorithms) that I want to compare against NOT using the tools.
Do you have any references to similar evaluations?
Stefano
Quoting John Daughtry:
I would suggest taking a more holistic view of the design space. Rather than
asking which tool is best, you may be better served by seeking to
empirically describe and explain the underlying trade-offs. In what ways does
option 1 help, hinder, or undermine learning? In what ways does option 2 help,
hinder, or undermine learning? In all likelihood there are answers to all
six questions.
John
--
Associate Research Engineer
The Applied Research Laboratory
Penn State University
daugh...@psu.edu
On Tue, Mar 1, 2011 at 7:08 AM, Thomas Green wrote:
Depending on your aims, you might want to measure transfer to other
problems: that is, do participants who used tool A for the sorting task
then do better when tackling a new problem, possibly with a different tool,
than participants who used tool B?
You might also want to look at memory and savings: how do the participants
manage two months later? Occasionally cognitive tasks like yours show no
effect at the time but produce measurable differences when the same people
do the same tasks later.
Pretty hard to create a truly fair test, but things to think about are
controlling for practice and order effects, which should be easy, and
controlling for experime