Dear Mark,

The assessments used were performance-based. In particular, they were 
very similar to what you define as "charettes" in your paper (Michael 
McCracken, Vicki Almstrum, Danny Diaz, Mark Guzdial, Dianne Hagan, Yifat 
Ben-David Kolikant, Cary Laxer, Lynda Thomas, Ian Utting, and Tadeusz Wilusz. 
2001. A multi-national, multi-institutional study of assessment of programming 
skills of first-year CS students. SIGCSE Bull. 33, 4 (December 2001), 
125-180.). 
One difference is that the assessments in our paper were carried out once (as 
opposed to on a regular basis). Another difference is that students were given 
several different small tasks (fewer than 10) to implement. The difficulty of 
the tasks was incremental (that is, starting from a basic task such as printing 
the string "hello world" and moving to more complex ones). Tasks were designed 
so that it was possible to test the students' familiarity with single topics 
(such as printing your name 25 times, to test their understanding of loops), 
and also to test their ability to handle some of these concepts together 
(printing a random number if it is smaller than 896). In this way, it was 
possible to measure/grade their proficiency. The pass condition was that the 
program must compile and run correctly. 
Only one person marked all the assessments (the assumption here is that a 
single marker is consistent, even though each cohort was large and each cohort 
came from a different academic year). 
Since the assessments counted towards the students' grades, the usual 
standards/protocols for reviewing exams and marking were followed.
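
For illustration only, the two example tasks mentioned above might look like 
this in Python. This is a sketch, not the actual assessment material: the 
default name, the upper bound of the random number, and the function names are 
all assumptions.

```python
import random


def print_name_25_times(name="Ada"):
    """Single-topic task: use a loop to print a name 25 times."""
    for _ in range(25):
        print(name)


def print_random_if_small(limit=896, upper=1000, rng=random):
    """Combined task: draw a random number, print it only if below `limit`.

    The range 0..upper is an assumption; the task description gives only
    the threshold 896.
    """
    number = rng.randint(0, upper)
    if number < limit:
        print(number)
    return number


if __name__ == "__main__":
    print_name_25_times()
    print_random_if_small()
```

Under the stated pass condition, a submission of this kind would pass if it 
runs without error and produces the required output.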


stasha
________________________________________
From: Guzdial, Mark [guzd...@cc.gatech.edu]
Sent: 02 March 2011 15:08
To: Stasha Lauria
Cc: Stefano Federici; PPIG Listserve; Stefano Federici
Subject: Re: evalutation of new tools to teach computer programming

Stasha, your paper is unclear (at least to me) about what assessment you were 
using. You describe the rubric:
The use of loops, conditional, etc. to achieve the given task was part of the 
marking criteria. For example, a grade A requires the correct use of 
conditional and the correct use of loops and the correct use of libraries in 
the program implemented.

What effort was made to make sure that the assessment was reliable and valid?  
For example, did you have multiple raters?  Do you have an indication of the 
inter-rater reliability?
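
As an illustration of what an inter-rater reliability figure involves, Cohen's 
kappa for two markers grading the same submissions can be computed as sketched 
below. The function name and the example grades are hypothetical; they are not 
data from the paper.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must grade the same non-empty set")
    n = len(rater_a)
    # Proportion of submissions on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's grade frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[g] * count_b[g] for g in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical grade throughout
    return (observed - expected) / (1 - expected)


# Hypothetical grades given by two markers to the same eight programs.
marker_1 = ["A", "A", "B", "B", "C", "C", "A", "B"]
marker_2 = ["A", "A", "B", "C", "C", "C", "A", "A"]
print(round(cohen_kappa(marker_1, marker_2), 3))  # 0.628
```

A value of 1 means perfect agreement and 0 means agreement no better than 
chance.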

Thanks!
 Mark

On Mar 2, 2011, at 9:39 AM, Stasha Lauria wrote:

We have evaluated the difference between using Python (this could be seen as 
the tool) and Java to teach programming to beginners. The evaluation is based 
on the analysis of student assessments. The aim of the comparison is to 
quantitatively measure a student’s ability to master basic programming concepts 
when Python is used instead of object-oriented Java. Both assessments consisted 
of students having to implement a program. For further details, the paper can 
be accessed below:

http://www.ics.heacademy.ac.uk/italics/vol10iss1.htm
or
http://www.ics.heacademy.ac.uk/italics/download.php?file=italics/vol10iss1/pdfs/paper10.pdf

I hope this helps.

Regards,

Stasha

________________________________________
From: Stefano Federici [sfeder...@unica.it]
Sent: 01 March 2011 14:29
To: PPIG Listserve
Cc: Stefano Federici
Subject: Re: evalutation of new tools to teach computer programming

Thanks a lot Thomas and John for your suggestions.

To better clarify my settings, I have two new tools (aiming at
teaching two different topics: general programming the first, sorting
algorithms the second) that I want to compare against NOT using the
tools.

Do you have any references to similar evaluations?

Stefano


Quoting John Daughtry <j...@daughtryhome.com>:

I would suggest taking a more holistic view of the design space. Rather than
asking which tool is best, you may be better served by seeking to
empirically describe and explain the underlying trade-offs. In what ways does
option1 help, hinder, and undermine learning? In what ways does option2 help,
hinder, and undermine learning? In all likelihood there are answers to all
six questions.

John
--------------------------------------------------
Associate Research Engineer
The Applied Research Laboratory
Penn State University
daugh...@psu.edu



On Tue, Mar 1, 2011 at 7:08 AM, Thomas Green <green...@ntlworld.com> wrote:

Depending on your aims, you might want to measure transfer to other
problems: that is,  do participants who used tool A for the sorting task,
then do better when tackling  a new problem, possibly with a different tool,
than participants who used tool B?

You might also want to look at memory and savings: how do the participants
manage two months later? Occasionally cognitive tasks like yours show no
effect at the time but produce measurable differences when the same people
do the same tasks later.

Pretty hard to create a truly fair test, but things to think about are
controlling for practice and order effects, which should be easy, and
controlling for experimenter expectation effects. The hardest thing to
balance for is sometimes the training period: people using a new tool have
to learn about it, and that gives them practice effects that the controls
might not get. Sometimes people create a dummy task for the control
condition to avoid that problem; or you can compare different versions of
the tools, with differing features.

I suggest you try to avoid the simple A vs B design and instead look for a
design where you can predict a trend: find A, B, C such that your theory says
A > B > C. The statistical power is much better.
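
One way to test a predicted ordering of this kind is the Jonckheere-Terpstra
trend test, sketched below. It is offered only as an example of an
ordered-alternative test, not as the specific analysis meant above; the
function name and the scores are hypothetical, and the normal approximation
used here applies no correction for ties.

```python
import math


def jonckheere_terpstra(groups):
    """Return (J, z) for groups listed from lowest to highest predicted score."""
    if len(groups) < 2:
        raise ValueError("need at least two groups")
    j_stat = 0.0
    # Count pairs (x from an earlier group, y from a later group) with x < y.
    for i in range(len(groups)):
        for k in range(i + 1, len(groups)):
            for x in groups[i]:
                for y in groups[k]:
                    if x < y:
                        j_stat += 1.0
                    elif x == y:
                        j_stat += 0.5
    sizes = [len(g) for g in groups]
    n = sum(sizes)
    # Mean and variance of J under the null hypothesis of no trend.
    mean = (n * n - sum(s * s for s in sizes)) / 4
    var = (n * n * (2 * n + 3) - sum(s * s * (2 * s + 3) for s in sizes)) / 72
    return j_stat, (j_stat - mean) / math.sqrt(var)


# Hypothetical post-test scores; theory predicts A > B > C, so list C first.
scores_c, scores_b, scores_a = [1, 2, 3], [4, 5, 6], [7, 8, 9]
print(jonckheere_terpstra([scores_c, scores_b, scores_a]))  # (27.0, 3.0)
```

A large positive z supports the predicted ordering.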

Don't forget to talk to the people afterwards and get their opinions.
Sometimes you can find they weren't playing the same game that you were.

Good luck

Thomas Green




On 1 Mar 2011, at 11:20, Stefano Federici wrote:

Dear Colleagues,
I need to plan an evaluation of the improvements brought by the usage of
specific software tools when learning the basic concepts of computer
programming (sequence, loop, variables, arrays, etc) and the specific topic
of sorting algorithms.

What are the best practices for the necessary steps? I guess the steps
should be: selection of the test group, test of initial skills, partition of
the test group into smaller homogeneous groups, delivery of learning
materials with or without the tools, test of final skills, comparative
analysis.
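
The partition step above could be sketched as a blocked random assignment:
participants are ranked by their initial-skills score and, within each block
of neighbouring scores, randomly split across conditions. This is only one
possible approach; the function name, condition labels, and scores below are
hypothetical.

```python
import random


def blocked_assignment(pretest_scores, conditions=("tool", "no_tool"), seed=0):
    """Map each participant to a condition, balancing on pretest score.

    pretest_scores: dict of participant id -> initial-skills score.
    """
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    ranked = sorted(pretest_scores, key=pretest_scores.get)
    block_size = len(conditions)
    assignment = {}
    for start in range(0, len(ranked), block_size):
        block = ranked[start:start + block_size]
        shuffled = list(conditions)
        rng.shuffle(shuffled)  # random order of conditions within the block
        for participant, condition in zip(block, shuffled):
            assignment[participant] = condition
    return assignment


# Hypothetical pretest scores for eight participants.
scores = {"p1": 12, "p2": 30, "p3": 18, "p4": 25,
          "p5": 9, "p6": 22, "p7": 15, "p8": 28}
groups = blocked_assignment(scores)
for participant in sorted(groups):
    print(participant, scores[participant], groups[participant])
```

Because every block contributes one participant to each condition, the
resulting groups have similar distributions of initial skill.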

What am I supposed to do to perform a fair test?

Any help or reference is welcome.

Best Regards



Stefano Federici
-------------------------------------------------
Università degli Studi di Cagliari
Facoltà di Scienze della Formazione
Dipartimento di Scienze Pedagogiche e Filosofiche
Via Is Mirrionis 1, 09123 Cagliari, Italia
-------------------------------------------------
Cell: +39 349 818 1955 Tel.: +39 070 675 7815
Fax: +39 070 675 7113



--
The Open University is incorporated by Royal Charter (RC 000391), an exempt 
charity in England & Wales and a charity registered in Scotland (SC 038302).



