On Thu, Jul 5, 2012 at 9:34 AM, Gael Varoquaux <[email protected]> wrote:
> I am very clearly -1 on this suggestion for several reasons:

You guys should definitely find a policy that works well for sklearn. I just want to provide some info here, not push for using notebooks in your default setup:

> a. I worry very much about leaving the tried and tested notion of a source
>    code file. We have a complete development and maintenance flow that is
>    based upon and that it would break, it particular:
>
>    1. Version control: I don't know how people do version control of
>       notebooks, but I am a bit worried of what the diffs will look
>       like. I think that we currently have a great workflow with git and
>       github.

We've tried to make sure the format is as version control-friendly as possible, within the limits of accepting that it's json. We have the following to help with VC:

- the json keys are always sorted, so there's no noise from dict reordering commit-to-commit

- the code is always broken up into lines in the json (json has no multiline string literals) and reassembled later, so that code diffs are truly line-based and not whole-cell-blobs

- the 'remove all output' option from the menu also strips out input prompts, so that only truly human-typed input remains in the file. This makes it easy to commit notebooks that will not produce any diff at all unless their text/code actually changes.

You're still dealing with diffs of a json file, but at least I think this makes those diffs as human-friendly as we can make them.

>    2. Testing: with the scipy-lecture notes and the NISL tutorial
>       (https://github.com/nisl/tutorial) we really found it hard to
>       make sure that across time the tutorials did not break. We now have
>       a policy that all code must be doctested, and all figures must be
>       generated from an example (pretty much the policy that we have in
>       the scikit-learn). Doctesting can be tricky, I am quite happy to
>       rely on the existing best practices that work for rst and source
>       code files.

As part of the nbconvert work, we're also adding support for 'doctesting' a notebook, which in fact should be a fair bit more robust and convenient in the long run than doctesting from sphinx-pasted blocks. But that's coming, not ready yet, so your caveat on bleeding edge tools very much applies here.

Cheers,

f

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
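
To make the three version-control points above concrete (sorted keys, line-split source, stripped output and prompts), here is a minimal sketch in Python. It is not the IPython implementation: the notebook layout (a top-level "cells" list with "cell_type", "source", "outputs" and "prompt_number" fields) is a simplified assumption for illustration, and the function names `normalize_notebook` and `dump_notebook` are hypothetical.

```python
import json


def normalize_notebook(nb):
    """Return a copy of a notebook-like dict prepared for version control.

    Assumes a simplified, hypothetical layout: nb["cells"] is a list of
    dicts with "cell_type" and "source", plus "outputs" and
    "prompt_number" on code cells.
    """
    cells = []
    for cell in nb.get("cells", []):
        cell = dict(cell)  # shallow copy so the caller's cell is untouched
        source = cell.get("source", "")
        if isinstance(source, str):
            # Store code as a list of lines so diffs are line-based
            # rather than one blob per cell.
            cell["source"] = source.splitlines(keepends=True)
        if cell.get("cell_type") == "code":
            # Drop everything that was not typed by a human.
            cell["outputs"] = []
            cell["prompt_number"] = None
        cells.append(cell)
    out = dict(nb)
    out["cells"] = cells
    return out


def dump_notebook(nb):
    """Serialize with sorted keys so dict ordering never causes a diff."""
    return json.dumps(normalize_notebook(nb), sort_keys=True, indent=1)
```

With this sketch, two notebooks that differ only in key order, stored output, or prompt numbers serialize to the same text, so committing after a re-run produces no diff unless the text or code itself changed.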
