Re: [Discuss] core concepts for novices in 30 mins

Laura Fortunato Tue, 23 Jan 2018 13:49:15 -0800

Hello everyone,

For anyone who followed/contributed to this thread and/or may be interested in doing something similar, a write-up of the session is now available [1], including the materials I developed for this purpose [2]. Comments, questions, and the like welcome, of course!


Laura

[1] http://neuroanatody.com/2017/12/oxford-reproducibility-lectures-laura-fortunato/
[2] http://neuroanatody.com/wp-content/uploads/2017/12/Oxford-Reproducibility-School.zip


On 16/10/17 21:15, Laura Fortunato wrote:

Thanks everyone for the input! I didn't get this until after the talk was set up, but there are definitely a number of useful ideas/links/resources here for future reference.
This is what I ended up doing, in case anyone is looking to do something similar --- I'd be happy to provide additional information and/or share the materials I prepared. Basically, the talk included a first part with some background/"motivational" material, and a second part with a demo to illustrate how I would go about doing a simple task manually vs. shell scripting. I emphasised that the second part aimed to provide a mental model of how to go about the task, rather than to teach a specific technique or tool.
The task was based on the "molecules" example/data in the Software Carpentry shell-novice lesson. I introduced it as: "assume that your supervisor gave you a set of text files (e.g. machine output) and asked you to find the file with the smallest number of lines."
Then I demonstrated how I would do this by "pointing-and-clicking", i.e. opening each file individually in a text editor, counting the number of lines, making a note in a separate file, etc. At each step, I emphasised where things could go wrong, e.g. mistakenly reading data for the same file twice, errors in transcribing the line count, etc. I also pointed out how this approach would not scale, e.g. beyond the handful of files in the example ("what if instead of 6 output files, I had 600?"), or if new output files were added at a later stage, and so on.
Next, I completed the task in the shell. The presentation was projected on two screens. On one screen I had slides with the commands (heavily commented, so that people could follow along), and on the other I typed the commands at the terminal. I covered basic operations (cd, ls, more, head, wc, sort), redirection, and pipes --- all at the simplest level.
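
A minimal sketch of what the terminal part of such a demo might look like. The three .pdb files are tiny stand-ins created on the spot (hypothetical contents, not the lesson's actual data), and lengths.txt is an assumed name for the intermediate file.

```shell
# Scratch directory with three stand-in "machine output" files
# (hypothetical names and contents, not the lesson's real data).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'l1\nl2\nl3\n'     > cubane.pdb
printf 'l1\nl2\n'         > ethane.pdb
printf 'l1\nl2\nl3\nl4\n' > octane.pdb

ls                                # which files do we have?
head -n 2 cubane.pdb              # peek at the start of one file
wc -l *.pdb                       # line count per file, plus a "total" line

wc -l *.pdb > lengths.txt         # redirection: save the counts to a file
sort -n lengths.txt | head -n 1   # pipe: sort numerically, keep the first line

# The same steps as a single pipeline, with no intermediate file
smallest=$(wc -l *.pdb | sort -n | head -n 1)
echo "$smallest"
```

Because the "total" line always carries the largest count, it sorts to the bottom and does not interfere with head -n 1.
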
I concluded by executing the task with a simple script I had prepared beforehand, which effectively "recapitulated" the commands I had typed at the terminal. Also beforehand I had prepared a repository with the data files and the script under version control. The idea here was simply to show that you can keep track of who did what, when, etc., by printing the log to screen.
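
As a rough illustration of this last step, the commands can be saved in a script and the whole directory put under version control. The script name smallest.sh, the stand-in data files, and the choice of Git are all assumptions; the message does not say which script or version control system was used.

```shell
# Scratch directory with two stand-in data files (hypothetical contents).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'l1\nl2\nl3\n' > cubane.pdb
printf 'l1\nl2\n'     > ethane.pdb

# smallest.sh (hypothetical name): the commands typed at the terminal, saved to a file.
cat > smallest.sh <<'EOF'
#!/bin/sh
# Print the line count and name of the shortest file among the arguments.
wc -l "$@" | sort -n | head -n 1
EOF

result=$(sh smallest.sh *.pdb)
echo "$result"

# Assumption: Git as the version control system.
if command -v git >/dev/null 2>&1; then
    git init -q
    git add cubane.pdb ethane.pdb smallest.sh
    git -c user.name="Demo User" -c user.email="demo@example.org" \
        commit -q -m "Add data files and line-count script"
    git --no-pager log --stat     # who changed what, and when
fi
```

If new output files arrive later, rerunning the script repeats the whole task in one step, and a further commit records the change in the log.
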
The session was quite interactive, and from the feedback I got from the students, it seemed that they did appreciate the pitfalls of the GUI-based approach, as well as the potential benefits of the alternative approach. The demo took ~15 mins. Overall, it seemed like a useful thing to do. If I were to expand it, I would include a simple visualization, as suggested by Tracy, Bianca, and others in the thread.
On 29/09/17 09:06, Bianca Peterson wrote:
Hi all,
I agree with Tracy about getting to the visualizations as quickly as possible. Sorry for the delayed reply Laura, but the following might be useful for future reference.
I've managed to convince a few people to consider using R (or rather RStudio), simply by running 4 commands in RStudio (from the DC R Ecology lesson):
1. download.file("https://ndownloader.figshare.com/files/2292169", "data/portal_data_joined.csv")
2. surveys <- read.csv('data/portal_data_joined.csv')
3. summary(surveys)
4. plot(surveys$sex)
I usually emphasize the dimensions (or size) of this data and the speed with which R executes the commands, and then ask "How many clicks would it take to get these results in Excel?". They usually then smile and ask when the next Carpentry workshop will be.
Thanks to everyone for sharing great resources and advice!

Bianca
On 29 Sep 2017 15:51, "Tracy Teal" <[email protected]> wrote:
    Hi Laura,

    This is a really neat idea, and I'm sorry, it sounds like it's
    too late already for more ideas for your presentation.
    Let us know how it went! This seems like a generally useful kind
    of presentation to have available though, and these ideas have
    been great.

    A class at UC Davis does an exercise where they have people fill
    out a survey on random things, like how many siblings do you
    have, what is your favorite color, what kind of shoes are you
    wearing, are you a cat person or a dog person? Create the survey
    so it makes intentionally confusing data, for instance leaving
    number of siblings as a fill in the blank rather than as a drop
    down numerical response.

    Then show the data, and show how messy the data is. Then demo how
    to clean it up and do some visualizations. In a half hour (if you
    knew generally what kind of data was going to be produced), you
    could have people fill out the survey, show the data and do a
    clean up and visualization with command line and Python or R. You
    could maybe get version control in there too to show how you
    could change the script. Maybe the messy data part is too much
    for a half hour, but you could have a survey that creates cleaner
    data.

    Getting to visualizations in a short amount of time seems to be
    the thing that really is exciting to people. Especially when they
    don't have a good idea of how they would have approached it in
    something like Excel.

    Best,
    -Tracy

    On Wed, Sep 27, 2017 at 3:06 PM, Moore, Nathan T
    <[email protected]> wrote:

        I haven't tried what you're attempting, but here's an idea.
        Describe the computer/lab notebook side of a data intensive
        project, estimate the time associated with things like
        clicking and dragging and computing by hand, and then show a
        brief example in which that time is reduced (substantially). 
        E.g., tell the story from one of the learner profiles in more
        detail, in a context that the MS students would be familiar
        with.


        I assume you've seen learner profiles?

        https://software-carpentry.org/audience/

        Nathan

        ------------------------------------------------------------------------
        *From:* Discuss <[email protected]> on
        behalf of Laura Fortunato <[email protected]>
        *Sent:* Thursday, September 21, 2017 8:44:14 AM
        *To:* [email protected]
        *Subject:* [Discuss] core concepts for novices in 30 mins

        Hello list,

        I am looking for input on how to introduce core concepts
        about reproducibility, effective research computing, etc to
        complete novices in a 1/2-hour slot. Any
        ideas/suggestions/materials welcome!

        The background: I have been asked to give a talk on effective
        computing for research reproducibility at the Oxford
        Reproducibility School next week. The target audience is a
        group of incoming masters-level students in psychology, most
        of whom I assume will be complete novices.

        Normally, given the format (30-min presentation + 10 mins for
        questions) I would give a "motivational" talk, and then point
        people to various resources (including Carpentry workshops,
        lessons). However, this slot is part of a much longer event,
        including "motivational" talks and talks on
        discipline-specific tools (e.g. open, reproducible
        neuroimaging) by several others.

        Looking at the programme, it seems that what will not be
        covered are the "basic" tools/skills taught in a standard
        Software Carpentry workshop --- shell, version control,
        programming.

        So, one idea I have been toying with is to do a brief
        demonstration of these tools to have the students see them
        "in action". However, I am not sure this is possible in a
        1/2-hour slot.

        Does anyone have experience doing something similar, or can
        anyone point me to resources that do this? If anyone has
        tried and failed, it would also be good to know, of course.

        Thanks for any input!
        Laura
        --
        *Laura Fortunato* || Associate Professor of Evolutionary
        Anthropology | University of Oxford || External Professor |
        Santa Fe Institute ||

_______________________________________________
Discuss mailing list
[email protected]
http://lists.software-carpentry.org/listinfo/discuss
