+1 for what Juan said. I think most of the cognitive load of notebooks can be addressed by giving people a crash course in Jupyter, and by narrating what you do, just like SWC suggests that instructors narrate what they do at the command line or in a REPL, e.g. "so I'm going to type print parentheses hello close parentheses in this cell and then execute it by hitting control enter", etc.
I've seen Jupyter-heavy tutorials, for example at SciPy, that give these sorts of quickie intros to notebooks. I can't find an example, but here's something similar I've done:
https://github.com/NickleDave/EWIN-coding-bootcamp/blob/master/Python/bootcamp%20day%201%20%2B%20Python%20preliminaries.ipynb

Seems like a good opportunity to explain that the most common use cases are presenting results/methods, teaching, and scratch coding, **not** writing production code / large code bases. Maybe that will help prevent people getting the wrong impression (and then giving a talk about it).

David Nicholson, Ph.D.
nickledave.github.io
https://github.com/NickleDave
Prinz lab <http://www.biology.emory.edu/research/Prinz/>, Emory University, Atlanta, GA, USA

On Wed, Aug 29, 2018 at 8:54 AM, Maxime Boissonneault <maxime.boissonnea...@calculquebec.ca> wrote:

> Hi Carol,
> I don't think this is where the subthread about Conda is heading. Jupyter notebooks are orthogonal to Anaconda. You can definitely have Jupyter without Conda. From a teaching perspective, both Conda and Jupyter notebooks do a fine job. But just as it would be beneficial to warn users about notebook caveats (hidden states and such), it would also be good to do the same for conda caveats (performance).
>
> Cheers,
>
> Maxime
>
> On 2018-08-28 6:29 PM, Carol Willing wrote:
>
>> Hi all,
>>
>> There's positive discussion that has been started by Joel's talk. While I liked his talk and there are some good points re: improving support for software engineering best practices in Jupyter and JupyterLab notebooks, I'm a bit concerned about the direction that this conversation is going.
>>
>> While all are entitled to their personal opinions and the Carpentries will use notebooks when and if needed, I believe that the Carpentries would be doing its students a disservice by warning people not to use the notebooks or conda.
>>
>> The notebooks are a popular and effective tool for scientists and data scientists to have in their toolbox. Project Jupyter won the ACM Software System Award recently, and the ACM stated "These tools, which include IPython, the Jupyter Notebook and JupyterHub, have become a de facto standard for data analysis in research, education, journalism and industry." https://awards.acm.org/software-system
>>
>> While it's great for folks to have different personal perspectives, I want to make sure that the Carpentries and its lessons do not recommend that the Jupyter Notebooks, IPython, and JupyterHub should be avoided by scientists and data scientists.
>>
>> Thanks,
>>
>> Carol Willing
>>
>> On 28 Aug 2018, at 11:38, Maxime Boissonneault <maxime.boissonnea...@calculquebec.ca> wrote:
>>>
>>> These kinds of things are rather hard to track over time, because everything is a moving target (conda and other package managers constantly get updated, and package versions change too), but here are a few more details:
>>>
>>> - The 10x performance difference was with a user code, which I unfortunately can't share (nor do I still have a copy of it). It was about numpy, which may or may not have changed since MKL can now be shipped with Anaconda.
>>>
>>> - FFTW, 2x performance gain: these slides compare the Conda-provided FFTW (and those provided by other package managers) with one built on an AVX2 cluster; the performance gain is 2x (see slides 28 and 29):
>>> https://archive.fosdem.org/2018/schedule/event/installing_software_for_scientists/attachments/slides/2437/export/events/attachments/installing_software_for_scientists/slides/2437/20180204_installing_software_for_scientists.pdf
>>>
>>> - Tensorflow, 7x gain for CPU version, slide 28 of this talk:
>>> https://archive.fosdem.org/2018/schedule/event/how_to_make_package_managers_cry/attachments/slides/2297/export/events/attachments/how_to_make_package_managers_cry/slides/2297/how_to_make_package_managers_cry.pdf
>>>
>>> This one was not comparing Conda itself, but manylinux Python wheels provided by the Tensorflow team, but no doubt Conda has the same issue if they build for generic architectures.
>>>
>>> Basically, any package that is compiled in a portable manner, such as what Conda and manylinux wheels do, will have some degree of speedup if compiled for the target architecture instead. This is typically achieved by the team of analysts who manage a cluster.
>>>
>>> Cheers,
>>>
>>> Maxime
>>>
>>> On 2018-08-28 2:20 PM, Ashwin Srinath wrote:
>>>
>>>> I'm very interested to see these examples. We use and advocate the use of conda environments and I'm happy to be convinced otherwise.
>>>>
>>>> Thanks,
>>>> Ashwin
>>>>
>>>> On Tue, Aug 28, 2018 at 2:17 PM, Maxime Boissonneault <maxime.boissonnea...@calculquebec.ca> wrote:
>>>>
>>>>> Regarding performance, we have an example of code using Anaconda-provided packages that runs 10 times slower than the same code using locally built packages, optimized for the cluster architectures. That's not *a bit* slower, that's a lot slower.
>>>>>
>>>>> Regarding "cheating on your partner", that analogy is not mine, but the point he is trying to make is that Anaconda basically replaces any cluster-provided versions, which HPC center people are working hard to optimize. Recent versions of Anaconda are even worse, by packaging things like compilers and linkers, creating conflicts with cluster-provided system libraries and tools, and creating a lot of debugging problems for users and support people alike.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Maxime
>>>>>
>>>>> On 2018-08-28 12:48 PM, Rémi Rampin wrote:
>>>>>
>>>>> 2018-08-28 12:27 EDT, Maxime Boissonneault <maxime.boissonnea...@calculquebec.ca>:
>>>>>
>>>>>> As a side-discussion, I think we should also be wary of using Anaconda, and tell users not to use it in a cluster environment. For reasons, see here:
>>>>>> https://twitter.com/mboisso/status/1034476890353020928
>>>>>>
>>>>> Hi Maxime,
>>>>>
>>>>> All I see in this thread is that "it's like cheating on your partner" (!!!) and it's "generically optimized software" that might be a bit slower than locally-built libs (interesting concern when using Python, an interpreted scripting language (and on the slow side too)).
>>>>>
>>>>> Could you elaborate on those reasons?
>>>>>
>>>>> Best
>>>>> --
>>>>> Rémi
>>>>>
>>>> ------------------------------------------
>>>> The Carpentries: discuss
>>>> Permalink: https://carpentries.topicbox.com/groups/discuss/T1505f74d7f6e32f8-Mad4fadc6a6da6de2b5f2aeb9
>>>> Delivery options: https://carpentries.topicbox.com/groups/discuss/subscription
>>>>
>>>
>>> --
>>> ---------------------------------
>>> Maxime Boissonneault
>>> Analyste de calcul - Calcul Québec, Université Laval
>>> Président - Comité de coordination du soutien à la recherche de Calcul Québec
>>> Team lead - Research Support National Team, Compute Canada
>>> Instructeur Software Carpentry
>>> Ph. D. en physique
>>>
>>> ------------------------------------------
>> The Carpentries: discuss
>> Permalink: https://carpentries.topicbox.com/groups/discuss/T1505f74d7f6e32f8-M77e71bf94fc82bac35910927
>> Delivery options: https://carpentries.topicbox.com/groups/discuss/subscription
>>
>
> --
> ---------------------------------
> Maxime Boissonneault
> Analyste de calcul - Calcul Québec, Université Laval
> Président - Comité de coordination du soutien à la recherche de Calcul Québec
> Team lead - Research Support National Team, Compute Canada
> Instructeur Software Carpentry
> Ph. D. en physique
>
> ------------------------------------------
> The Carpentries: discuss
> Permalink: https://carpentries.topicbox.com/groups/discuss/T1505f74d7f6e32f8-Maa170b9124a7aca14bbb63f8
> Delivery options: https://carpentries.topicbox.com/groups/discuss/subscription
>

------------------------------------------
The Carpentries: discuss
Permalink: https://carpentries.topicbox.com/groups/discuss/T1505f74d7f6e32f8-Ma4ef0973120a37d42a9f04b2
Delivery options: https://carpentries.topicbox.com/groups/discuss/subscription