Hi Bennet,

Jake VanderPlas recently re-upped his blog post on this --

https://jakevdp.github.io/blog/2017/03/03/reproducible-data-analysis-in-jupyter/
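
For the questions quoted below about clearing results, exporting to a .py file, and running top to bottom from a fresh state, here is a minimal sketch. It is not taken from the blog post. It assumes the notebook is a plain nbformat v4 JSON document containing only ordinary Python in its code cells (no IPython magics); the helper names and the tiny example notebook are made up for illustration. In practice I believe `jupyter nbconvert --to script notebook.ipynb` and `jupyter nbconvert --to notebook --execute notebook.ipynb` cover the export and command-line execution cases.

```python
import json


def clear_outputs(nb):
    """Return a copy of a v4 notebook dict with outputs and execution counts removed."""
    cleaned = json.loads(json.dumps(nb))  # deep copy via a JSON round-trip
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


def cell_source(cell):
    """Cell source may be stored as a string or as a list of lines."""
    src = cell.get("source", "")
    return "".join(src) if isinstance(src, list) else src


def to_script(nb):
    """Concatenate the code cells, top to bottom, into plain Python source."""
    parts = [cell_source(c) for c in nb.get("cells", [])
             if c.get("cell_type") == "code"]
    return "\n\n".join(parts) + "\n"


def run_top_to_bottom(nb):
    """Execute the code cells in order in a fresh namespace and return it."""
    namespace = {}
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            exec(cell_source(cell), namespace)
    return namespace


# Hypothetical in-memory notebook with stale outputs and out-of-order counts.
example = {
    "nbformat": 4, "nbformat_minor": 2, "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Demo"]},
        {"cell_type": "code", "metadata": {}, "execution_count": 7,
         "source": ["x = 2\n", "y = x * 21"],
         "outputs": [{"output_type": "stream", "name": "stdout",
                      "text": ["stale\n"]}]},
        {"cell_type": "code", "metadata": {}, "execution_count": 3,
         "source": ["z = y + 1"], "outputs": []},
    ],
}

clean = clear_outputs(example)      # what "clear all results" would leave
script = to_script(clean)           # text you could save as a .py file
result = run_top_to_bottom(clean)   # fails loudly if cells depend on hidden state
```

If a cell relied on something defined only in a deleted or later cell, `run_top_to_bottom` would raise a `NameError`, which is roughly the "de novo" check being asked about.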

best,
—titus

> On Aug 30, 2018, at 7:51 AM, Bennet Fauber <[email protected]> wrote:
> 
> Just out of curiosity, how do notebooks play in the `reproducible
> research` space?
> 
> Do Carpentries instructors, as a matter of course, tell students how
> to clear all results and save a notebook with nothing having been
> executed?
> 
> Can such a notebook be executed sequentially, like a .py script can?
> From a command line?
> 
> Can a notebook file replace a .py file?
> 
> Do instructors show attendees how to export from a notebook to a .py
> file so that the contents can be used as a script in a multistep
> workflow or pipeline?
> 
> A notebook must be sequentially executable from a de novo state to
> count as reproducible, mustn't it?
> 
> The context in which most of the people I deal with want to use python
> is as the driver of one step of many in a sequential processing
> stream.  There isn't a single program that can be 'presented' in a
> notebook in a meaningful way.  Each step may also be separately
> executed in isolation.
> 
> Yes, at the very end of the project, there might be something that can
> be encapsulated in one script that prepares some visualizations and
> perhaps some statistical or modelling steps, but that is dependent on
> months of preparatory work.  All of the prior processing, QC,
> validation, etc., is much more fragmented over time, over who does
> which pieces of the work, in some cases redoes the work, etc.
> 
> If those folks don't learn how to use python for the most
> time-consuming, least visual, most fragmented portions of their
> work, then its utility drops proportionally.
> 
> Clearly no one solution is going to satisfy all instructors, nor will
> one solution be appropriate for all audiences.
> 
> I think it is good to get the appropriate circumstances discussed, as
> we are doing here.  Possibly the points, uses, contraindications,
> etc., for both command line and notebooks could be enumerated
> somewhere for reference?  Perhaps workshop organizers can be pointed
> to that information source that will help them to gauge what the needs
> and interests of the intended audience are and to request a workshop
> curriculum that is best suited to that audience and those needs?
> Would that result in a better fit of material to audience?  to better
> clarity of advertising what will really be presented in each workshop?
> to more focused and coordinated approaches among the various lessons
> of a workshop?
> 
> An iPad is a wonderful tool for some things, but it is not a
> replacement for a MacBook Pro.  Different purposes, different tools.
> I certainly think there is a place for both in everyone's work.  I
> also think that if a workshop is going to convincingly
> illustrate the power and utility of the command line, then it should
> be consistent in its use of the command line throughout.  If there is
> something else you want to teach, that is absolutely fine, but then
> the whole workshop should support that something else, both in its
> materials and its tools.
> 
> 
> On Thu, Aug 30, 2018 at 10:25 AM Rémi Rampin <[email protected]> wrote:
>> 
>> 2018-08-24 16:47 EDT, Konrad Förstner <[email protected]>:
>>> 
>>> Besides the fact that this talk is really funny, it raises a lot of
>>> issues that I can confirm from my experience: [...]
>> 
>> 
>> Hi everyone,
>> 
>> I realize there have been a lot of attempts already to solve this "hidden 
>> state" problem at the software level, but I wonder if a "modal" notebook 
>> could help.
>> 
>> It seems to me that those problems arise because notebooks are trying to 
>> support "exploration/playing around" and "presentation" workflows from the 
>> same interface. There is no reason the full history can't be kept, other 
>> than that it makes for a bad presentation; likewise, there is no reason to have 
>> every bit of code in the notebook, other than it is necessary to be able to 
>> run it again.
>> 
>> So maybe having a separate "exploration" mode where all cells are kept in 
>> order since the last kernel reset, and a "presentation" mode where some of 
>> those cells can be selected for presentation and the rest hidden would do 
>> some good?
>> 
>> There would be no need for GitHub and similar services that can render 
>> notebooks to show anything but the "presentation" view. But when I download 
>> and open the notebook, I would be able to get to a chronological, 
>> reproducible view if I choose to.
>> 
>> I do see some problems with this, mainly in that authors might not be aware 
>> of the non-presentation cells they are including (might have private stuff, 
>> or use a lot of space). It seems also a tad more complex (but less so than 
>> using the history magics!). I wonder if something like this has been 
>> attempted or exists in other software.
>> 
>> Cheers
>> --
>> Rémi

------------------------------------------
The Carpentries: discuss
Permalink: 
https://carpentries.topicbox.com/groups/discuss/T1505f74d7f6e32f8-M178a185e5fabd4dbb2ade8d4
Delivery options: https://carpentries.topicbox.com/groups/discuss/subscription
