Hi folks,

Variants on this theme have recurred regularly over the years: whether the
tools used in SWC are a means to an end or an end in themselves, and whether
the focus should be on a suite of tools used from the command line.

Maybe from this, instead of a one-size-fits-all-so-no-one-is-happy lesson, the 
Python lesson could be forked into two variants:

* One on Jupyter notebooks, as a way to wean researchers off of Excel and into 
a more programmatic way of doing things.
* One on good programming practice, using Python from the command line and a
text editor. This lesson would be in the spirit of moving researchers onto
command-line-based tools.

Hosts could then decide which might be best for their audiences for a specific 
workshop.

cheers,
mike

________________________________________
From: Maxime Boissonneault <maxime.boissonnea...@calculquebec.ca>
Sent: 29 August 2018 13:54
To: discuss; Carol Willing
Cc: Titus Brown
Subject: Re: [discuss] Slide of Joel Grus' JupyterCon Talk "I Don't Like 
Notebooks"

Hi Carol,
I don't think this is where the subthread about Conda is heading.
Jupyter notebooks are orthogonal to Anaconda. You can definitely have
Jupyter without Conda. From a teaching perspective, both Conda and
Jupyter notebooks do a fine job. But just as it would be beneficial to
warn users about notebook caveats (hidden states and such), it would
also be good to do the same for conda caveats (performance).
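
To make the "hidden state" caveat concrete, here is a minimal sketch in plain
Python. It is not how Jupyter is implemented: `run_cell`, `kernel_state` and
the example cells are hypothetical names made up for illustration. It only
mimics the one relevant property of a notebook kernel, namely that all cells
share a single namespace that outlives edits to, and deletion of, the cells
themselves.

```python
def run_cell(source, state):
    """Execute one 'cell' in a shared namespace (hypothetical stand-in for a kernel)."""
    exec(source, state)


# Interactive session: three cells are run, then the first one is deleted
# from the notebook. Its variable still lives in the kernel's namespace.
kernel_state = {}
run_cell("threshold = 10", kernel_state)  # this cell is later deleted
run_cell("data = [4, 12, 25]", kernel_state)
run_cell("kept = [x for x in data if x > threshold]", kernel_state)
print(kernel_state["kept"])  # [12, 25]

# Restart the kernel and run only the cells that are still visible.
fresh_state = {}
fresh_run_failed = False
try:
    run_cell("data = [4, 12, 25]", fresh_state)
    run_cell("kept = [x for x in data if x > threshold]", fresh_state)
except NameError as err:
    # 'threshold' was only ever defined by the deleted cell.
    fresh_run_failed = True
    print("fresh kernel fails:", err)
```

The same notebook that appeared to work interactively fails after a restart,
which is why restarting the kernel and running all cells from the top is a
useful habit to teach alongside notebooks.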

Cheers,

Maxime




On 2018-08-28 6:29 PM, Carol Willing wrote:
> Hi all,
>
> There's positive discussion that has been started by Joel's talk. While I 
> liked his talk and there are some good points re: improving support for 
> software engineering best practices in Jupyter and JupyterLab notebooks, I'm 
> a bit concerned about the direction that this conversation is going.
>
> While all are entitled to their personal opinions and the Carpentries will 
> use notebooks when and if needed, I believe that the Carpentries would be 
> doing its students a disservice by warning people not to use the notebooks or 
> conda.
>
> The notebooks are a popular and effective tool for scientists and data 
> scientists to have in their toolbox. Project Jupyter won the ACM Software 
> System Award recently, and the ACM stated "These tools, which include 
> IPython, the Jupyter Notebook and JupyterHub, have become a de facto standard 
> for data analysis in research, education, journalism and industry." 
> https://awards.acm.org/software-system
>
> While it's great for folks to have different personal perspectives, I want to 
> make sure that the Carpentries and its lessons do not recommend that the 
> Jupyter Notebooks, IPython, and JupyterHub should be avoided by scientists 
> and data scientists.
>
> Thanks,
>
> Carol Willing
>
>
>> On 28 Aug 2018, at 11:38, Maxime Boissonneault 
>> <maxime.boissonnea...@calculquebec.ca> wrote:
>>
>> These kinds of things are rather hard to track over time, because everything
>> is a moving target (conda and other package managers are constantly updated,
>> and package versions change too), but here are a few more details:
>>
>> - The 10x performance difference was with a user's code, which I
>> unfortunately can't share (nor do I still have a copy of it). It involved
>> numpy, and the result may or may not have changed now that MKL can be
>> shipped with Anaconda.
>>
>> - FFTW, 2x performance gain: these slides compare the FFTW builds provided
>> by Conda (and by other package managers) with one built on an avx2 cluster;
>> the performance gain is 2x (see slides 28 and 29):
>> https://archive.fosdem.org/2018/schedule/event/installing_software_for_scientists/attachments/slides/2437/export/events/attachments/installing_software_for_scientists/slides/2437/20180204_installing_software_for_scientists.pdf
>>
>>
>> - TensorFlow, 7x gain for the CPU version, slide 28 of this talk:
>> https://archive.fosdem.org/2018/schedule/event/how_to_make_package_managers_cry/attachments/slides/2297/export/events/attachments/how_to_make_package_managers_cry/slides/2297/how_to_make_package_managers_cry.pdf
>>
>>    This one did not compare Conda itself, but the manylinux Python wheels
>> provided by the TensorFlow team; no doubt Conda has the same issue if its
>> packages are built for generic architectures.
>>
>>
>>
>> Basically, any package that is compiled in a portable manner, such as what 
>> Conda and manylinux wheels do, will have some degree of speedup if compiled 
>> for the target architecture instead. This is typically achieved by the team 
>> of analysts who manage a cluster.
>>
>> Cheers,
>>
>> Maxime
>>
>>
>> On 2018-08-28 2:20 PM, Ashwin Srinath wrote:
>>> I'm very interested to see these examples. We use and advocate the use
>>> of conda environments and I'm happy to be convinced otherwise.
>>>
>>> Thanks,
>>> Ashwin
>>>
>>> On Tue, Aug 28, 2018 at 2:17 PM, Maxime Boissonneault
>>> <maxime.boissonnea...@calculquebec.ca> wrote:
>>>> Regarding performance, we have an example of code using Anaconda-provided
>>>> packages that runs 10 times slower than the same code using locally built
>>>> packages, optimized for the cluster architectures. That's not *a bit*
>>>> slower, that's a lot slower.
>>>>
>>>> Regarding "cheating on your partner", that analogy is not mine, but the
>>>> point he is trying to make is that Anaconda basically replaces any
>>>> cluster-provided versions, which HPC center people are working hard to
>>>> optimize.
>>>> Recent versions of Anaconda are even worse, by packaging things like
>>>> compilers and linkers, creating conflicts with cluster-provided system
>>>> libraries and tools, and creating a lot of debugging problems for users and
>>>> support people alike.
>>>>
>>>> Regards,
>>>>
>>>> Maxime
>>>>
>>>>
>>>> On 2018-08-28 12:48 PM, Rémi Rampin wrote:
>>>>
>>>> 2018-08-28 12:27 EDT, Maxime Boissonneault
>>>> <maxime.boissonnea...@calculquebec.ca>:
>>>>> As a side-discussion, I think we should also be wary of using Anaconda,
>>>>> and tell users not to use it in a cluster environment. For reasons, see
>>>>> here :
>>>>> https://twitter.com/mboisso/status/1034476890353020928
>>>> Hi Maxime,
>>>>
>>>> All I see in this thread is that "it's like cheating on your partner" (!!!)
>>>> and it's "generically optimized software" that might be a bit slower than
>>>> locally-built libs (interesting concern when using Python, an interpreted
>>>> scripting language (and on the slow side too)).
>>>>
>>>> Could you elaborate on those reasons?
>>>>
>>>> Best
>>>> --
>>>> Rémi
>>>>
>>>>
>>> ------------------------------------------
>>> The Carpentries: discuss
>>> Permalink: 
>>> https://carpentries.topicbox.com/groups/discuss/T1505f74d7f6e32f8-Mad4fadc6a6da6de2b5f2aeb9
>>> Delivery options: 
>>> https://carpentries.topicbox.com/groups/discuss/subscription
>>
>> --
>> ---------------------------------
>> Maxime Boissonneault
>> Computing analyst - Calcul Québec, Université Laval
>> President - Calcul Québec Research Support Coordination Committee
>> Team lead - Research Support National Team, Compute Canada
>> Software Carpentry instructor
>> Ph.D. in physics
>>




-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.


------------------------------------------
The Carpentries: discuss
Permalink: 
https://carpentries.topicbox.com/groups/discuss/T1505f74d7f6e32f8-M197c9bab7f977b741cb7bdef
Delivery options: https://carpentries.topicbox.com/groups/discuss/subscription
