Thanks everyone for the input! I didn't get this until after the talk
was set up, but there are definitely a number of useful
ideas/links/resources here for future reference.
This is what I ended up doing, in case anyone is looking to do something
similar --- I'd be happy to provide additional information and/or share
the materials I prepared. Basically, the talk had two parts: a first
part with some background/"motivational" material, and a second part
with a demo comparing how I would do a simple task manually vs. with
shell scripting. I emphasised that the second part aimed to provide a
mental model of how to go about the task, rather than to teach a
specific technique or tool.
The task was based on the "molecules" example/data in the Software
Carpentry shell-novice lesson. I introduced it as: "assume that your
supervisor gave you a set of text files (e.g. machine output) and asked
you to find the file with the smallest number of lines."
Then I demonstrated how I would do this by "pointing-and-clicking", i.e.
opening each file individually in a text editor, counting the number of
lines, making a note in a separate file, etc. At each step, I emphasised
where things could go wrong, e.g. mistakenly reading data for the same
file twice, errors in transcribing the line count, etc. I also pointed
out that this approach would not scale beyond the handful of files in
the example ("what if instead of 6 output files, I had 600?"), or cope
with new output files being added at a later stage, and so on.
Next, I completed the task in the shell. The presentation was projected
on two screens. On one screen I had slides with the commands (heavily
commented, so that people could follow along), and on the other I typed
the commands at the terminal. I covered basic operations (cd, ls, more,
head, wc, sort), redirection, and pipes --- all at the simplest level.
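For anyone wanting to reproduce the shell part, the steps can be sketched roughly as below. This is a minimal sketch, not the exact commands from the talk: the scratch directory and the three `.dat` files are hypothetical stand-ins for the lesson data, and `lengths.txt` is an assumed name for the intermediate file.

```shell
# Scratch directory with hypothetical stand-in files (not the lesson data).
workdir=$(mktemp -d)
cd "$workdir"
printf 'x\ny\nz\n' > alpha.dat   # 3 lines
printf 'x\n' > beta.dat          # 1 line
printf 'x\ny\n' > gamma.dat      # 2 lines

ls                                # see which files are there
wc -l *.dat                       # count the lines in every file
wc -l *.dat > lengths.txt         # redirection: save the counts to a file
sort -n lengths.txt               # sort the counts numerically
sort -n lengths.txt | head -n 1   # pipe: keep only the first (smallest) line

# The same result in one pipeline, with no intermediate file.
shortest=$(wc -l *.dat | sort -n | head -n 1)
echo "$shortest"
```

Because `wc -l` also prints a "total" line when given several files, sorting in ascending order keeps that line at the bottom, out of the way of `head -n 1`.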
I concluded by executing the task with a simple script I had prepared
beforehand, which effectively "recapitulated" the commands I had typed
at the terminal. I had also prepared a repository beforehand, with the
data files and the script under version control. The idea here was
simply to show that you can keep track of who did what, when, etc., by
printing the log to screen.
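A script of this kind might look something like the sketch below. The script name `shortest.sh`, the `.dat` extension, and the sample files are all hypothetical, chosen only for illustration; the original script is not reproduced here.

```shell
# Hypothetical sample data in a scratch directory.
workdir=$(mktemp -d)
printf '1\n2\n3\n' > "$workdir/run1.dat"      # 3 lines
printf '1\n2\n' > "$workdir/run2.dat"         # 2 lines
printf '1\n2\n3\n4\n' > "$workdir/run3.dat"   # 4 lines

# Write the script: it takes a directory and reports the shortest .dat file.
cat > "$workdir/shortest.sh" <<'EOF'
#!/usr/bin/env bash
# Usage: bash shortest.sh DIRECTORY
# Prints the line count and name of the .dat file with the fewest lines.
cd "$1" || exit 1
wc -l *.dat | sort -n | head -n 1
EOF

# Run it; the same command works unchanged for 6 files or 600.
result=$(bash "$workdir/shortest.sh" "$workdir")
echo "$result"
```

With the script and data under version control, running the version control system's log command in the repository would then list who changed what, and when.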
The session was quite interactive, and from the feedback I got from the
students, it seemed that they did appreciate the pitfalls of the
GUI-based approach, as well as the potential benefits of the alternative
approach. The demo took ~15 mins. Overall, it seemed like a useful thing
to do. If I were to expand it, I would include a simple visualization,
as suggested by Tracy, Bianca, and others in the thread.
On 29/09/17 09:06, Bianca Peterson wrote:
Hi all,
I agree with Tracy about getting to the visualizations as quickly as
possible. Sorry for the delayed reply Laura, but the following might
be useful for future reference.
I've managed to convince a few people to consider using R (or rather
RStudio), simply by running 4 commands in RStudio (from the DC R
Ecology lesson):
1. download.file("https://ndownloader.figshare.com/files/2292169",
   "data/portal_data_joined.csv")
2. surveys <- read.csv("data/portal_data_joined.csv")
3. summary(surveys)
4. plot(surveys$sex)
I usually emphasize the dimensions (or size) of this data and the
speed with which R executes the commands, and then ask "How many
clicks would it take to get these results in Excel?". They usually
then smile and ask when the next Carpentry workshop will be.
Thanks to everyone for sharing great resources and advice!
Bianca
On 29 Sep 2017 15:51, "Tracy Teal" <[email protected]> wrote:
Hi Laura,
This is a really neat idea, and I'm sorry, it sounds like it's already
too late for more ideas for your presentation. Let
us know how it went! This seems like a generally useful kind of
presentation to have available though, and these ideas have been
great.
A class at UC Davis does an exercise where they have people fill
out a survey on random things, like how many siblings do you have,
what is your favorite color, what kind of shoes are you wearing,
are you a cat person or a dog person? Design the survey so that it
intentionally produces messy data, for instance by leaving number of
siblings as a fill-in-the-blank field rather than a numerical
drop-down.
Then show the data, and show how messy the data is. Then demo how
to clean it up and do some visualizations. In a half hour (if you
knew generally what kind of data was going to be produced), you
could have people fill out the survey, show the data and do a
clean up and visualization with command line and Python or R. You
could maybe get version control in there too to show how you could
change the script. Maybe the messy data part is too much for a
half hour, but you could have a survey that creates cleaner data.
Getting to visualizations in a short amount of time seems to be
the thing that really is exciting to people. Especially when they
don't have a good idea of how they would have approached it in
something like Excel.
Best,
-Tracy
On Wed, Sep 27, 2017 at 3:06 PM, Moore, Nathan T
<[email protected]> wrote:
I haven't tried what you're attempting, but here's an idea.
Describe the computer/lab notebook side of a data intensive
project, estimate the time associated with things like
clicking and dragging and computing by hand, and then show a
brief example in which that time is reduced (substantially).
E.g., tell the story from one of the learner profiles in more
detail, in a context that the MS students would be familiar with.
I assume you've seen learner profiles?
https://software-carpentry.org/audience/
Nathan
------------------------------------------------------------------------
*From:* Discuss <[email protected]> on
behalf of Laura Fortunato <[email protected]>
*Sent:* Thursday, September 21, 2017 8:44:14 AM
*To:* [email protected]
*Subject:* [Discuss] core concepts for novices in 30 mins
Hello list,
I am looking for input on how to introduce core concepts about
reproducibility, effective research computing, etc to complete
novices in a 1/2-hour slot. Any ideas/suggestions/materials
welcome!
The background: I have been asked to give a talk on effective
computing for research reproducibility at the Oxford
Reproducibility School next week. The target audience is a
group of incoming masters-level students in psychology, most
of whom I assume will be complete novices.
Normally, given the format (30-min presentation + 10 mins for
questions) I would give a "motivational" talk, and then point
people to various resources (including Carpentry workshops,
lessons). However, this slot is part of a much longer event,
including "motivational" talks and talks on
discipline-specific tools (e.g. open, reproducible
neuroimaging) by several others.
Looking at the programme, it seems that what will not be
covered are the "basic" tools/skills taught in a standard
Software Carpentry workshop --- shell, version control,
programming.
So, one idea I have been toying with is to do a brief
demonstration of these tools to have the students see them "in
action". However, I am not sure this is possible in a 1/2-hour
slot.
Does anyone have experience doing something similar, or can
anyone point me to resources that do this? If anyone has tried
and failed, it would also be good to know, of course.
Thanks for any input!
Laura
--
*Laura Fortunato* || Associate Professor of Evolutionary
Anthropology | University of Oxford || External Professor |
Santa Fe Institute ||
_______________________________________________
Discuss mailing list
[email protected]
http://lists.software-carpentry.org/listinfo/discuss
--
*Laura Fortunato* || Associate Professor of Evolutionary Anthropology |
University of Oxford || External Professor | Santa Fe Institute ||