Hi Bennet,

Jake VanderPlas recently re-upped his blog post on this:

https://jakevdp.github.io/blog/2017/03/03/reproducible-data-analysis-in-jupyter/

best,
--titus

> On Aug 30, 2018, at 7:51 AM, Bennet Fauber <[email protected]> wrote:
>
> Just out of curiosity, how do notebooks play in the `reproducible
> research` space?
>
> Do Carpentries instructors, as a matter of course, tell students how
> to clear all results and save a notebook with nothing having been
> executed?
>
> Can such a notebook be executed sequentially, like a .py script can?
> From a command line?
>
> Can a notebook file replace a .py file?
>
> Do instructors show attendees how to export from a notebook to a .py
> file so that the contents can be used as a script in a multistep
> workflow or pipeline?
>
> A notebook must be sequentially executable from a de novo state to
> count as reproducible, mustn't it?
>
> The context in which most of the people I deal with want to use python
> is as the driver of one step of many in a sequential processing
> stream. There isn't a single program that can be 'presented' in a
> notebook in a meaningful way. Each step may also be separately
> executed in isolation.
>
> Yes, at the very end of the project, there might be something that can
> be encapsulated in one script that prepares some visualizations and
> perhaps some statistical or modelling steps, but that is dependent on
> months of preparatory work. All of the prior processing, QC,
> validation, etc., is much more fragmented over time, over who does
> which pieces of the work, in some cases redoes the work, etc.
>
> If those folks don't learn how to use python for the most
> time-consuming, least visual, most fragmented portions of their
> work, then its utility drops proportionally.
>
> Clearly no one solution is going to satisfy all instructors, nor will
> one solution be appropriate for all audiences.
>
> I think it is good to get the appropriate circumstances discussed, as
> we are doing here.
> Possibly the points, uses, counterindicators,
> etc., for both command line and notebooks could be enumerated
> somewhere for reference? Perhaps workshop organizers can be pointed
> to that information source that will help them to gauge what the needs
> and interests of the intended audience are and to request a workshop
> curriculum that is best suited to that audience and those needs?
> Would that result in a better fit of material to audience? to better
> clarity of advertising what will really be presented in each workshop?
> to more focused and coordinated approaches among the various lessons
> of a workshop?
>
> An iPad is a wonderful tool for some things, but it is not a
> replacement for a MacBook Pro. Different purposes, different tools.
> I certainly think there is a place for both in everyone's work. I
> also think that if a workshop is going to convincingly
> illustrate the power and utility of the command line, then it should
> be consistent in its use of the command line throughout. If there is
> something else you want to teach, that is absolutely fine, but then
> the whole workshop should support that something else, both in its
> materials and its tools.
>
>
> On Thu, Aug 30, 2018 at 10:25 AM Rémi Rampin <[email protected]> wrote:
>>
>> 2018-08-24 16:47 EDT, Konrad Förstner <[email protected]>:
>>>
>>> Beside the fact that this talk is really funny, it raises a lot of
>>> issues that I can confirm from my experience: [...]
>>
>>
>> Hi everyone,
>>
>> I realize there's been a lot of attempts already to solve this "hidden
>> state" problem at the software level, but I wonder if a "modal" notebook
>> could help.
>>
>> It seems to me that those problems arise because notebooks are trying to
>> support "exploration/playing around" and "presentation" workflows from the
>> same interface.
>> There is no reason the full history can't be kept, other
>> than it makes for a bad presentation; likewise, there is no reason to have
>> every bit of code in the notebook, other than it is necessary to be able to
>> run it again.
>>
>> So maybe having a separate "exploration" mode where all cells are kept in
>> order since the last kernel reset, and a "presentation" mode where some of
>> those cells can be selected for presentation and the rest hidden would do
>> some good?
>>
>> There would be no need for GitHub and similar services that can render
>> notebooks to show anything but the "presentation" view. But when I download
>> and open the notebook, I would be able to get to a chronological,
>> reproducible view if I choose to.
>>
>> I do see some problems with this, mainly in that authors might not be aware
>> of the non-presentation cells they are including (might have private stuff,
>> or use a lot of space). It seems also a tad more complex (but less so than
>> using the history magics!). I wonder if something like this has been
>> attempted or exists in other software.
>>
>> Cheers
>> --
>> Rémi

------------------------------------------
The Carpentries: discuss
Permalink: https://carpentries.topicbox.com/groups/discuss/T1505f74d7f6e32f8-M178a185e5fabd4dbb2ade8d4
Delivery options: https://carpentries.topicbox.com/groups/discuss/subscription
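
P.S. On Bennet's questions about clearing results, exporting to a .py file, and running from a de novo state: Jupyter's nbconvert tool covers export and command-line execution (`jupyter nbconvert --to script` and `jupyter nbconvert --to notebook --execute`). The sketch below is not that tool. It is a minimal standard-library illustration, assuming a version 4 .ipynb file is plain JSON with a top-level "cells" list, and assuming the code cells contain ordinary Python with no IPython magics. The function names and the `demo` notebook are hypothetical, made up for this example.

```python
import json


def clear_outputs(nb):
    """Return a copy of a notebook dict with all code-cell results removed."""
    cleared = json.loads(json.dumps(nb))  # deep copy via a JSON round trip
    for cell in cleared.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleared


def cell_source(cell):
    """Cell source may be stored as one string or as a list of lines."""
    src = cell.get("source", "")
    return "".join(src) if isinstance(src, list) else src


def to_script(nb):
    """Concatenate the code cells, top to bottom, into one script string."""
    parts = [cell_source(c) for c in nb.get("cells", [])
             if c.get("cell_type") == "code"]
    return "\n\n".join(parts) + "\n"


def run_from_scratch(nb):
    """Run the code cells in document order in a fresh, empty namespace.

    Raises (e.g. NameError) if the notebook only worked because of hidden
    state left over from out-of-order execution.
    """
    namespace = {"__name__": "__main__"}
    exec(compile(to_script(nb), "<notebook>", "exec"), namespace)
    return namespace


# Hypothetical in-memory notebook; a real one would come from json.load().
demo = {
    "nbformat": 4, "nbformat_minor": 2, "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Demo"]},
        {"cell_type": "code", "metadata": {}, "execution_count": 7,
         "outputs": [{"output_type": "stream", "name": "stdout",
                      "text": ["3\n"]}],
         "source": ["x = 1 + 2\n", "print(x)"]},
        {"cell_type": "code", "metadata": {}, "execution_count": 3,
         "outputs": [], "source": ["y = x * 2"]},
    ],
}

clean = clear_outputs(demo)      # safe to save and commit
script = to_script(demo)         # could be written out as a .py file
result = run_from_scratch(demo)  # fails loudly if cell order matters
```

The out-of-order execution counts in `demo` (7, then 3) are the kind of hidden state discussed in the thread. `run_from_scratch` ignores them and runs strictly top to bottom, which is the "de novo" test Bennet describes.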
