Hi everyone,

This is a great discussion.  Feel free to continue it.

Would someone be interested in writing a blog post to summarize Joel Grus' 
opinions and the discussion we've had here?


Discuss List Moderator

From: Hao Ye <hao...@weecology.org>
Sent: Wednesday, August 29, 2018 7:36:58 AM
To: discuss
Subject: Re: [discuss] Slide of Joel Grus' JupyterCon Talk "I Don't Like 

I agree with what Simon wrote about hidden state, and I strongly feel that it 
is a lesson/concept that we should emphasize more for workshop attendees, 
especially those that don't have a substantial amount of experience with 

The notion that there even *is* such a thing as a hidden state will be new to a 
lot of folks - think how many come from experience working with Excel where you 
can see the data and the inputs and outputs of all calculations simultaneously.
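
A minimal sketch of what hidden state looks like in practice may help here. It does not use a real notebook: it assumes that a "cell" can be modelled as a string of Python source run with the built-in exec() in one shared dictionary, which stands in for the kernel. The names run_cell, kernel, threshold and passed are hypothetical and exist only for this illustration.

```python
def run_cell(source, namespace):
    """Execute one cell's source in the shared namespace, as a kernel would."""
    exec(source, namespace)


# One long-lived namespace plays the role of the notebook kernel.
kernel = {}
run_cell("threshold = 10", kernel)           # cell 1
run_cell("passed = 12 > threshold", kernel)  # cell 2

# The learner now edits cell 1 on screen to read "threshold = 20" but never
# re-runs it. The kernel still holds the old value, so re-running cell 2
# depends on a number that is no longer visible anywhere in the notebook.
run_cell("passed = 12 > threshold", kernel)
print("live session:", kernel["threshold"], kernel["passed"])

# A fresh kernel that runs the notebook as it now appears disagrees.
fresh = {}
run_cell("threshold = 20", fresh)
run_cell("passed = 12 > threshold", fresh)
print("fresh run:   ", fresh["threshold"], fresh["passed"])
```

The two print lines disagree even though the code on screen is identical in both cases, which is exactly the situation an Excel user has never had to reason about.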

In my experience, teaching REPL first conveys the notion of a hidden state 
slightly better, because you hit enter after every command, and typos often 
produce feedback immediately that something went wrong. That's a very different 
mode of operation than typing code into a Jupyter notebook cell, R markdown 
code chunk, or script file, where you can build up the code over time, 
including bugs and errors, and nothing happens until you try to execute.

I think there's good discussion to be had about workflows involving shell, 
IDEs, notebooks, etc., but not all our workshop attendees are at the stage of 
receiving that information in a useful context yet.

Hao Ye

On Wed, Aug 29, 2018 at 5:06 AM, Waldman, Simon 
<sm...@hw.ac.uk> wrote:
FWIW, when helping in SWC workshops, I’ve often found students getting confused 
in python notebooks due to hidden state.

The hidden state issues of notebooks are, however, no different to how many of 
us work in IDEs with interpreted languages (RStudio, MATLAB),

On Wed, Aug 29, 2018 at 9:25 AM, Bennet Fauber 
<ben...@umich.edu> wrote:

I don't think anyone is saying, "Tell people not to use notebooks."
The questions are about whether they improve the learning experience
for beginners.  There is also the question of whether use of the GUI
somehow defeats the purpose of the shell lesson by contradicting what
is often said there; namely, the command line is a powerful tool, you
should use it.

One respondent said they review ways to run python -- python, ipython,
jupyter -- then go on to use whatever in their workshop.  That goes
some way toward giving the participants choices.  It may not
counteract the message that is still explicit or implicit in the shell

Perhaps the shell lesson should be modified so that the shell is
treated as a data management tool, and notebooks and RStudio are
treated as development environments?  Then the dissonance between
advocating the shell in one lesson and abandoning it in another would
be lessened?

Perhaps all it would take would be a couple of examples of running a
notebook from the command line and telling it to start from scratch
and run all cells.  If a notebook can be run in the same way from a
prompt that a .py file can be, then maybe showing that capability
solves a whole bunch of problems.
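
As a rough sketch of what such an example could look like: the nbconvert tool that ships with Jupyter can execute a notebook from top to bottom in a fresh kernel via "jupyter nbconvert --to notebook --execute". The Python wrapper below only builds and (optionally) runs that command. The file names and the helper names nbconvert_command and execute_notebook are hypothetical, and it assumes Jupyter and nbconvert are installed and on the PATH.

```python
import subprocess


def nbconvert_command(notebook, output):
    """Build the command that runs every cell of `notebook` in a fresh kernel."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",   # write the result as a notebook again
        "--execute",          # run all cells, top to bottom
        "--output", output,   # keep the original file untouched
        notebook,
    ]


def execute_notebook(notebook, output):
    """Run the command; raises CalledProcessError if nbconvert exits with an error."""
    subprocess.run(nbconvert_command(notebook, output), check=True)


# Hypothetical file names, shown only to illustrate the shape of the command.
cmd = nbconvert_command("analysis.ipynb", "analysis-executed.ipynb")
print(" ".join(cmd))
```

The printed line is what a learner would type at the shell prompt, so the same idea can be shown without any Python wrapper at all.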

The out-of-phase evaluation of things that is possible in notebooks
can also lead to irreproducible results, which is not, I think, in
keeping with the goals of the Carpentries.
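
To illustrate the order problem, here is a small sketch under the assumption that a notebook can be modelled as named snippets of source run with exec() in one namespace. The cell names and the helper run_in_order are hypothetical. Running the same three cells in two different orders gives two different answers.

```python
# Three "cells" as they appear on screen, top to bottom.
cells = {
    "load": "values = [3, 1, 2]",
    "clean": "values = sorted(values)",
    "report": "first = values[0]",
}


def run_in_order(order):
    """Run the named cells in the given order in a fresh namespace."""
    namespace = {}
    for name in order:
        exec(cells[name], namespace)
    return namespace["first"]


top_to_bottom = run_in_order(["load", "clean", "report"])
out_of_phase = run_in_order(["load", "report", "clean"])
print(top_to_bottom, out_of_phase)
```

Only the top-to-bottom result can be reproduced by someone who opens the saved notebook and runs all cells.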

Some people will want to keep notebooks, and others will want to
forgo them; there should be a place for both approaches, and the one
that best fits the goal of the particular offering and the expected
audience should be chosen.  I think it would not be a service to
future learners if only one way were available for all circumstances.

Perhaps it would help to consider a full two-day workshop as a bundle,
and pick the lesson components that lead to the most coherent and
clear presentation of the most important points to the targeted
audience?  That would lessen the dissonance between command-line for
shell/git and GUI for R/Python, maybe?  Should there be an option to
do GUI-only workshops, with no shell and a GUI for Git?  And, similarly,
a command-line-only option?  I think that might be worth considering.

On Tue, Aug 28, 2018 at 6:31 PM Carol Willing
<willi...@willingconsulting.com> wrote:
> Hi all,
> There's positive discussion that has been started by Joel's talk. While I 
> liked his talk and there are some good points re: improving support for 
> software engineering best practices in Jupyter and JupyterLab notebooks, I'm 
> a bit concerned about the direction that this conversation is going.
> While all are entitled to their personal opinions and the Carpentries will 
> use notebooks when and if needed, I believe that the Carpentries would be 
> doing its students a disservice by warning people not to use the notebooks or 
> conda.
> The notebooks are a popular and effective tool for scientists and data 
> scientists to have in their toolbox. Project Jupyter won the ACM Software 
> System Award recently, and the ACM stated "These tools, which include 
> IPython, the Jupyter Notebook and JupyterHub, have become a de facto standard 
> for data analysis in research, education, journalism and industry." 
> https://awards.acm.org/software-system
> While it's great for folks to have different personal perspectives, I want to 
> make sure that the Carpentries and its lessons do not recommend that the 
> Jupyter Notebooks, IPython, and JupyterHub should be avoided by scientists 
> and data scientists.
> Thanks,
> Carol Willing
> > On 28 Aug 2018, at 11:38, Maxime Boissonneault 
> > <maxime.boissonnea...@calculquebec.ca> wrote:
> >
> > These kinds of things are rather hard to track over time, because everything 
> > is a moving target (conda and other package managers are constantly 
> > updated, and package versions change too), but here are a few more 
> > details:
> >
> > - The 10x performance difference was with a user's code, which I 
> > unfortunately can't share (nor do I still have a copy of it). It involved 
> > numpy, and the result may or may not have changed now that MKL can be 
> > shipped with Anaconda.
> >
> > - FFTW, 2x performance gain: these slides compare the FFTW provided by Conda 
> > (and by other package managers) with one that was built on an AVX2 
> > cluster. The performance gain is 2x (see slides 28 and 29):
> > https://archive.fosdem.org/2018/schedule/event/installing_software_for_scientists/attachments/slides/2437/export/events/attachments/installing_software_for_scientists/slides/2437/20180204_installing_software_for_scientists.pdf
> >
> >
> > - TensorFlow, 7x gain for the CPU version, slide 28 of this talk:
> > https://archive.fosdem.org/2018/schedule/event/how_to_make_package_managers_cry/attachments/slides/2297/export/events/attachments/how_to_make_package_managers_cry/slides/2297/how_to_make_package_managers_cry.pdf
> >
> >   This one was not comparing Conda itself, but the manylinux Python wheels 
> > provided by the TensorFlow team. No doubt Conda has the same issue if its 
> > packages are built for generic architectures.
> >
> >
> >
> > Basically, any package that is compiled in a portable manner, as Conda 
> > packages and manylinux wheels are, will see some degree of speedup if it is 
> > compiled for the target architecture instead. That work is typically done by 
> > the team of analysts who manage a cluster.
> >
> > Cheers,
> >
> > Maxime
> >
> >
> > On 2018-08-28 2:20 PM, Ashwin Srinath wrote:
> >> I'd be very interested to see these examples. We use and advocate the use
> >> of conda environments, and I'm happy to be convinced otherwise.
> >>
> >> Thanks,
> >> Ashwin
> >>
> >> On Tue, Aug 28, 2018 at 2:17 PM, Maxime Boissonneault
> >> <maxime.boissonnea...@calculquebec.ca> wrote:
> >>> Regarding performance, we have an example of code using Anaconda-provided
> >>> packages that runs 10 times slower than the same code using locally built
> >>> packages, optimized for the cluster architectures. That's not *a bit*
> >>> slower, that's a lot slower.
> >>>
> >>> Regarding "cheating on your partner", that analogy is not mine, but the
> >>> point he is trying to make is that Anaconda basically replaces any
> >>> cluster-provided versions, which HPC center people are working hard to
> >>> optimize. Recent versions of Anaconda are even worse: they package things
> >>> like compilers and linkers, which creates conflicts with cluster-provided
> >>> system libraries and tools, and a lot of debugging problems for users and
> >>> support people alike.
> >>>
> >>> Regards,
> >>>
> >>> Maxime
> >>>
> >>>
> >>> On 2018-08-28 12:48 PM, Rémi Rampin wrote:
> >>>
> >>> 2018-08-28 12:27 EDT, Maxime Boissonneault
> >>> <maxime.boissonnea...@calculquebec.ca>:
> >>>> As a side-discussion, I think we should also be wary of using Anaconda,
> >>>> and tell users not to use it in a cluster environment. For the reasons,
> >>>> see here:
> >>>> https://twitter.com/mboisso/status/1034476890353020928
> >>>
> >>> Hi Maxime,
> >>>
> >>> All I see in this thread is that "it's like cheating on your partner" (!!!)
> >>> and it's "generically optimized software" that might be a bit slower than
> >>> locally-built libs (interesting concern when using Python, an interpreted
> >>> scripting language (and on the slow side too)).
> >>>
> >>> Could you elaborate on those reasons?
> >>>
> >>> Best
> >>> --
> >>> Rémi
> >>>
> >>>
> >>> The Carpentries / discuss / see discussions + participants + delivery
> >>> options Permalink
> >> ------------------------------------------
> >> The Carpentries: discuss
> >> Permalink: 
> >> https://carpentries.topicbox.com/groups/discuss/T1505f74d7f6e32f8-Mad4fadc6a6da6de2b5f2aeb9
> >> Delivery options: 
> >> https://carpentries.topicbox.com/groups/discuss/subscription
> >
> >
> > --
> > ---------------------------------
> > Maxime Boissonneault
> > Analyste de calcul - Calcul Québec, Université Laval
> > Président - Comité de coordination du soutien à la recherche de Calcul 
> > Québec
> > Team lead - Research Support National Team, Compute Canada
> > Instructeur Software Carpentry
> > Ph. D. en physique
> >
