> Anyone know of software comprehension experiments which used
> 'programmers' overview descriptions of the system' as a measure of
> comprehension.[..]
> Jim Buckley.
Hi Jim,
First an answer to your question, then some follow-up discussion.
If I understand you correctly, you want to measure a
person's high-level or global understanding of a software
system by "reading" their minds for that knowledge.
The experimental instrument implied is a written statement of
the subjects' knowledge that was developed after experiencing
some experimental condition. Did I understand you OK?
I can't think of any specific papers using the method described
above, but Ric Holt has done work on visualizing high-level
system structure. I believe he has gone around real work
sites asking system experts for drawings of their
understanding of system structures, and has also presented
experts with generated diagrams and asked them how well
the diagrams represent the system. Tim Lethbridge
and Janice Singer also reported instances of asking programmers
to map out their understanding (WESS-98:Singer-Lethbridge).
It seems to me that presenting subjects with pre-drawn system
diagrams as stimulus structures might be a better way of
approaching this type of question if you want to make comparisons
between subjects. While you're at it, you should also look
at the work of Gail Murphy et al. on the Reflexion Model tool.
She sees people writing out high-level program structure
models and then refining them as they gain a better
understanding of the system. That might give you some
ideas. Hope someone else can help out....
Now for the discussion.
Although it might be very interesting to see what subjects
output themselves, I can think of at least three important
questions that the prospect raises:
1. Reliability and reconstruction. It is not possible for
anyone to "dump" their internal representations directly
onto paper or screen. The externalization process itself
frequently changes a person's understanding of the subject.
What you'd be measuring would more than likely be a
post-experiment reconstruction. That's why protocol analysis
uses !concurrent! verbalization (see Ericsson-Simon); it's
also why question-answering times are measured accurately,
so that priming effects can be used to
reconstruct internal representations (see
CogPsy-19:Pennington, or INTERACT-95:Green-Navarro, for
example).
You might also try hunting in the knowledge engineering
literature (for the difficulty of knowledge elicitation
and how to compare expert knowledge bases), or possibly
in the constructivist education literature (for active
learning and how to compare things like concept maps
that express a subject's understanding of some material).
One danger with certain experimental methods is that
it may be difficult to separate the question
"what do they know?" from "and how is it represented?"
2. Open-endedness. The task you'd be asking the subjects to
perform (externalizing current understanding) is
more than likely very open ended for any but the most
trivial programs. It is quite different from the
verbatim reconstruction tasks used before (e.g. see the
Boehm-Davis review in the Handbook of HCI). The variability in the
output is likely to overwhelm anything but very informal
analysis.
3. What is a high-level understanding? Is it structural?
Functional? Domain? Design decisions? Some combination?
Is it expressed in patterns, architectures, or >gasp<
global goal-plan hierarchies (good luck finding subjects
that know how to write these out...)?
Also, are you interested in "meta-knowledge" such as survey
knowledge? Programmers may not be able to recall some
particular bit of information, but might be able to rapidly
find it by using survey knowledge, perhaps retrieving it
from episodic memory (CHI-95:Altmann-Larkin-John).
Recall-based experiments might miss this fact if not
designed to take it into account. Note that survey
knowledge is likely to be quite important if the subjects
take a "just in time comprehension" approach to maintaining
an understanding of their code (WESS-98:Singer-Lethbridge).
One other small note. It might help you that there are a
reasonable number of case studies and experiments in reverse
engineering. The goal of the subjects in such studies is
typically to reconstruct high-level documentation from existing
code. These might generate more naturalistic data sets in which
the user-generated goals are to understand software at
a high level and also to produce representations of that
understanding. Of course, these studies also make it difficult
to answer questions about, for example, internal representations.
Andrew