[jupyter] Re: [ANN] nbconvert 5.3.0 — now with tag-based element filtering!

Matt Craig Tue, 05 Sep 2017 18:50:25 -0700

This looks really neat -- I'm starting to try to use notebooks to do 
enrollment projections for a committee I'm on but we'll need to produce 
what would essentially be filtered output of the results for wider 
dissemination.


Matt Craig


On Friday, September 1, 2017 at 11:01:59 PM UTC-5, M Pacer wrote:
>
> We are pleased to announce the release of nbconvert 5.3.0!
>
> It is available via pypi (pip install nbconvert -U) and conda-forge (conda 
> install nbconvert -c conda-forge).
>
> This release has a number of bug fixes as well as docs, testing, and other 
> miscellaneous improvements. 
>
> In addition, we're excited to share with everyone the news that nbconvert 
> now supports *tag-based element filtering*! 
>
> Cell metadata tags allow filtering of cell-level elements. Doing so removes: 
>
>    - cells with tags that match the tags in 
>    TagRemovePreprocessor.remove_cell_tags,
>    - inputs in cells with tags that match the tags in 
>    TagRemovePreprocessor.remove_input_tags, 
>    - all outputs in cells with tags that match 
>    TagRemovePreprocessor.remove_all_outputs_tags. 
>
> To remove individual output elements (leaving others) you cannot use cell 
> metadata tags — instead, you will need to use output metadata tags (see 
> below for more explanation). Doing so removes:
>
>    - Outputs with tags that match the tags in 
>    TagRemovePreprocessor.remove_single_output_tags
>
> These different traitlets can be mixed and matched as desired.
>
> A much more comprehensive explanation as to how to use tag-based element 
> filtering is included at the end of this announcement. 
>
> For more details about the full release, see the changelog 
> <http://nbconvert.readthedocs.io/en/latest/changelog.html#id1>.
>
> We thank the following 9 authors who contributed 176 commits.
>
> * Benjamin Ragan-Kelley
> * Damián Avila
> * M Pacer
> * Matthias Bussonnier
> * Michael Scott Cuthbert
> * mpacer
> * Patricia Hanus
> * tdalseide
> * Thomas Kluyver
>
> Cheers,
> the nbconvert team
>
> ------------------------------
>
>
> *Filtering Notebook Content*
>
>
> Users have long asked for a way to remove content from notebooks when 
> converting them with nbconvert. There are lots of use cases for this kind 
> of feature. 
>
> Sometimes content makes sense in interactive contexts, but not when 
> presenting the work to other people. For example, in a data analytic report 
> for non-experts who don't know Python, it may not make sense to show them 
> code. It could also make sense to show *some* code cells, but not 
> others. For example, you might want to show data analyst colleagues how you 
> processed the data, but not your import statements or plotting commands. Or, 
> if you were showing designer colleagues *how* you plotted your results, 
> you might want to hide your analyses but show your plotting code. 
>
> It would be wasteful to have to recreate the same content to generate one 
> report for each of those cases. It should be possible to easily create each 
> of those reports from the same source notebook. This release of nbconvert 
> addresses these (and many other) use cases. 
>
> *Global Content Filtering*
>
> With 5.2 we introduced global content filtering, which allows you to 
> remove every instance of different kinds of elements. This allows you to 
> remove every input, every output, every code or markdown cell, and all the 
> input or output prompts. That release solved the filtering problem for many 
> use cases.
>
> ***new** Tag-based Element Filtering*
>
> But global content filtering doesn't allow you to remove only some of your 
> code; it removes every input no matter what you do. In order to remove 
> only some of the content, we need a way to specify which content 
> should be removed. 
>
> This release allows you to specify which elements are to be removed 
> through the use of *tags.* 
>
> *What are tags?*
>
> Tags are strings that cannot contain spaces or commas. You can assign tags 
> to cells and they will be accessible in the cell's metadata. See the nbformat 
> cell metadata docs 
> <http://nbformat.readthedocs.io/en/latest/format_description.html#cell-metadata>
> for more information.
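
As a rough illustration of where tags live, here is a minimal sketch of a code cell written as a plain Python dict in the nbformat 4 layout. The helper names `is_valid_tag` and `cell_tags` are hypothetical, made up for this example; they are not part of nbformat or nbconvert.

```python
import json

# A minimal code cell as it might appear in a notebook's JSON.
# The tag names are only examples.
cell = {
    "cell_type": "code",
    "execution_count": None,
    "metadata": {"tags": ["to_remove", "hide_plot_code"]},
    "outputs": [],
    "source": "import numpy as np",
}


def is_valid_tag(tag):
    """Hypothetical check: a tag is a non-empty string with no spaces or commas."""
    return isinstance(tag, str) and tag != "" and " " not in tag and "," not in tag


def cell_tags(cell):
    """Return the cell's tags as a set; a cell without a tags field has none."""
    return set(cell.get("metadata", {}).get("tags", []))


print(json.dumps(cell["metadata"]))
print(sorted(cell_tags(cell)))
```

The tags sit in a list under the cell's `metadata`, so a cell that was never tagged simply has no `tags` key.
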
>
> *Tag Toolbar*
>
> In notebook 5.0, we introduced the tag toolbar, which allows you to add 
> tags to and remove tags from cells in the notebook interface. 
>
> The basic user model for assigning and removing tags on one cell is 
> demonstrated in the gif below:
>
> [gif: assigning and removing tags on a cell with the tag toolbar]
>
>
> *Using tags for filtering cells*
>
> The simplest case of removing elements is cells, so we'll describe how to 
> do that first. 
>
> If you wanted to remove some cells from your notebook RemoveElements.ipynb, 
> you would apply the same tag to each of those cells. Let's say that this 
> tag is called to_remove.
>
> If you wanted to just remove those cells and convert the notebook to a 
> static html page, you would use the following command: 
>
> jupyter nbconvert RemoveElements 
> --TagRemovePreprocessor.remove_cell_tags={\"to_remove\"} 
>
> *NB*: you need to use curly brackets and escape quotes so that the value 
> will be interpreted as a python set. You can avoid this complication by 
> using an external config file and inside it placing the code: 
> c.TagRemovePreprocessor.remove_cell_tags.add("to_remove"). This holds for 
> all of the other traitlet values as well. For more on passing configuration 
> values, see the traitlets documentation on configurable objects 
> <http://traitlets.readthedocs.io/en/stable/config.html>.
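
To make the matching rule concrete, here is a small sketch of cell-level filtering on a notebook held as a plain dict. It only illustrates the idea (drop any cell whose tags overlap the configured set); it is not nbconvert's own code, and `remove_tagged_cells` is a hypothetical name.

```python
def remove_tagged_cells(nb, remove_cell_tags):
    """Return a shallow copy of nb without cells tagged with any of remove_cell_tags."""
    remove_cell_tags = set(remove_cell_tags)
    kept = [
        cell
        for cell in nb["cells"]
        # Keep the cell only if none of its tags are in the removal set.
        if not remove_cell_tags & set(cell.get("metadata", {}).get("tags", []))
    ]
    return {**nb, "cells": kept}


nb = {
    "cells": [
        {"cell_type": "code", "metadata": {"tags": ["to_remove"]},
         "source": "import os", "outputs": []},
        {"cell_type": "markdown", "metadata": {}, "source": "# Results"},
        {"cell_type": "code", "metadata": {"tags": ["keep_me"]},
         "source": "print(1 + 1)", "outputs": []},
    ]
}

filtered = remove_tagged_cells(nb, {"to_remove"})
print([cell["source"] for cell in filtered["cells"]])
```

Because the configured value is a set, a cell is removed if any one of its tags matches, and untagged cells are never affected.
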
>
> *Using tags for filtering inputs*
>
> In the examples we gave, we wanted to remove some inputs while leaving their 
> outputs (for example, to show plots without the plotting code). To remove 
> only the inputs, we change which traitlet holds the tags: instead of adding 
> the tag to remove_cell_tags, we add it to remove_input_tags. Then all of the 
> cells that have tags matching remove_input_tags will have their inputs 
> removed. 
>
> For example, if we wanted to remove the inputs from those cells with the 
> to_remove tag we would set
>
> jupyter nbconvert RemoveElements 
> --TagRemovePreprocessor.remove_input_tags={\"to_remove\"}
>
> *Using tags for filtering all of a cell's outputs *
>
> Cells can have multiple outputs. If we're using cell metadata, we're 
> speaking at the cell level. So when we say we want a cell's outputs 
> removed, we're saying that we want *all* the outputs removed. Again we 
> would use a different traitlet: this time we add the tag to 
> remove_all_outputs_tags instead of adding it to remove_cell_tags.
>
> For example, if we wanted to remove all the outputs of those cells tagged 
> with to_remove we would use:
>
> jupyter nbconvert RemoveElements 
> --TagRemovePreprocessor.remove_all_outputs_tags={\"to_remove\"}
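
The difference between filtering inputs and filtering all outputs can be sketched on a plain-dict notebook as follows. This is an illustration under simplifying assumptions, not nbconvert's implementation: blanking `source` is just a stand-in for hiding the input, and `strip_inputs_and_outputs` is a hypothetical name.

```python
import copy


def strip_inputs_and_outputs(nb, remove_input_tags=(), remove_all_outputs_tags=()):
    """Return a deep copy of nb with inputs and/or outputs of tagged cells removed."""
    input_tags = set(remove_input_tags)
    output_tags = set(remove_all_outputs_tags)
    result = copy.deepcopy(nb)
    for cell in result["cells"]:
        tags = set(cell.get("metadata", {}).get("tags", []))
        if tags & input_tags:
            cell["source"] = ""  # stand-in for "do not render the input"
        if cell["cell_type"] == "code" and tags & output_tags:
            cell["outputs"] = []  # every output of the cell goes
    return result


nb = {
    "cells": [
        {
            "cell_type": "code",
            "metadata": {"tags": ["to_remove"]},
            "source": "plt.plot(x, y)",
            "outputs": [{"output_type": "display_data",
                         "data": {"text/plain": "<Figure>"},
                         "metadata": {}}],
        }
    ]
}

no_inputs = strip_inputs_and_outputs(nb, remove_input_tags={"to_remove"})
no_outputs = strip_inputs_and_outputs(nb, remove_all_outputs_tags={"to_remove"})
print(repr(no_inputs["cells"][0]["source"]), len(no_inputs["cells"][0]["outputs"]))
print(repr(no_outputs["cells"][0]["source"]), len(no_outputs["cells"][0]["outputs"]))
```

The same tag gives different results depending only on which traitlet it is placed in, which is why the choice of traitlet matters more than the tag name.
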
>
> *Using tags for filtering single outputs*
>
> It is possible to remove individual outputs, but that needs to be 
> specified in the individual outputs' metadata, not the cells' metadata. 
>
> So, if we wanted to remove only a single output using the to_remove tag, 
> we would need to give that output's metadata a tags field and set 
> that array to include the to_remove tag. The easiest way to do this is to 
> use IPython's display function; display is a builtin in IPython as of 5.4 
> and 6.1. We use display because it takes an optional metadata argument 
> for setting output metadata. 
>
> Thus if in RemoveElements.ipynb we had a code cell with the following 
> source: 
>
> display("hello", metadata={"tags":["to_remove"]})
> display("goodbye", metadata={})
>
> and we executed it, converted it to markdown and displayed it using 
>
> jupyter nbconvert RemoveElements --to markdown && cat RemoveElements.md
>
> we'd see 
>
> ```python
> display("hello", metadata={"tags":["to_remove"]})
> display("goodbye", metadata={})
> ```
>
>
>     'hello'
>
>
>
>     'goodbye'
>
> versus 
>
> jupyter nbconvert RemoveElements --to 
> markdown --TagRemovePreprocessor.remove_single_output_tags={\"to_remove\"} && 
> cat RemoveElements.md
>
> ```python
> display("hello", metadata={"tags":["to_remove"]})
> display("goodbye", metadata={})
> ```
>
>
>     'goodbye'
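
For a look at what is being matched, here is a sketch of the executed cell's outputs as plain dicts, with a filter over each output's own metadata. The dict layout is my assumption of how the two display calls are stored, and `remove_tagged_outputs` is a hypothetical helper, not nbconvert's code.

```python
def remove_tagged_outputs(cell, remove_single_output_tags):
    """Return a copy of cell without outputs tagged with any of the given tags."""
    remove_tags = set(remove_single_output_tags)
    kept = [
        output
        for output in cell.get("outputs", [])
        # Outputs without metadata (or without tags) are always kept.
        if not remove_tags & set(output.get("metadata", {}).get("tags", []))
    ]
    return {**cell, "outputs": kept}


cell = {
    "cell_type": "code",
    "metadata": {},  # note: no cell-level tags at all
    "source": 'display("hello", metadata={"tags": ["to_remove"]})\n'
              'display("goodbye", metadata={})',
    "outputs": [
        {"output_type": "display_data",
         "data": {"text/plain": "'hello'"},
         "metadata": {"tags": ["to_remove"]}},
        {"output_type": "display_data",
         "data": {"text/plain": "'goodbye'"},
         "metadata": {}},
    ],
}

filtered = remove_tagged_outputs(cell, {"to_remove"})
print([output["data"]["text/plain"] for output in filtered["outputs"]])
```

The cell itself carries no tags here; only the first output does, so the input and the second output survive.
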
>
>
> *Mixing and matching: more complicated filtering workflows *
>
> You can use all of these traitlets at the same time to interesting effect. In 
> some cases, you will want to use different tags for filtering different 
> kinds of information, especially if those kinds of information could 
> conflict. On the other hand, you might want to use the same tags for 
> different kinds of information if they are complementary. Consider the 
> following example.  
>
> Suppose you had a notebook ComplicatedFilteringExample.ipynb capable of 
> producing two completely different kinds of reports depending on which of 
> two possible code paths it takes. One code path includes a collection of 
> cells all of which are tagged with A, and the other code path includes 
> cells all of which are tagged with B. In both code paths you have cells 
> that create beautiful plots with less than beautiful code; those cells 
> are tagged with hide_plot_code (in addition to A or B). Additionally, you 
> have some cells that you need to run to set the data up, but in so doing 
> these cells spit out a lot of logs that you want to remove — those cells 
> are tagged with hide_noisy_logs. Finally, you have some cells at the end 
> that provide results for both code paths A and B (so these *cells* would 
> be tagged with *neither* A nor B), but the individual outputs will be 
> tagged with either A or B. Then you would be able to get two versions of 
> the resulting report by running the following lines of code:
>
> jupyter nbconvert ComplicatedFilteringExample --output report_A --to 
> markdown \
> --TagRemovePreprocessor.remove_cell_tags={\"B\"} \
> --TagRemovePreprocessor.remove_input_tags={\"hide_plot_code\"} \
> --TagRemovePreprocessor.remove_all_outputs_tags={\"hide_noisy_logs\"} \
> --TagRemovePreprocessor.remove_single_output_tags={\"B\"}
>
> jupyter nbconvert ComplicatedFilteringExample --output report_B --to 
> markdown \
> --TagRemovePreprocessor.remove_cell_tags={\"A\"} \
> --TagRemovePreprocessor.remove_input_tags={\"hide_plot_code\"} \
> --TagRemovePreprocessor.remove_all_outputs_tags={\"hide_noisy_logs\"} \
> --TagRemovePreprocessor.remove_single_output_tags={\"A\"}
>
>
> Or if you wanted to use config files, you could have configA.py:
>
> cat configA.py
>
> c.NbConvertApp.output_base = "report_A"
> c.NbConvertApp.export_format = "markdown"
> c.TagRemovePreprocessor.remove_cell_tags.add("B")
> c.TagRemovePreprocessor.remove_input_tags.add("hide_plot_code")
> c.TagRemovePreprocessor.remove_all_outputs_tags.add("hide_noisy_logs")
> c.TagRemovePreprocessor.remove_single_output_tags.add("B")
>
> and configB.py:
>
> cat configB.py
>
> c.NbConvertApp.output_base = "report_B"
> c.NbConvertApp.export_format = "markdown"
> c.TagRemovePreprocessor.remove_cell_tags.add("A")
> c.TagRemovePreprocessor.remove_input_tags.add("hide_plot_code")
> c.TagRemovePreprocessor.remove_all_outputs_tags.add("hide_noisy_logs")
> c.TagRemovePreprocessor.remove_single_output_tags.add("A")
>
> Then you could run much shorter versions of the above commands and produce 
> the same output:
>
> jupyter nbconvert ComplicatedFilteringExample --config configA.py
>
> jupyter nbconvert ComplicatedFilteringExample --config configB.py
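
If you would rather drive both conversions from a script, one possible sketch is to build the argument lists in Python. It uses only the flags shown in this announcement; the helper name `nbconvert_args` is hypothetical, and the script only prints the commands rather than running them.

```python
import shlex


def nbconvert_args(notebook, output, drop_tag):
    """Build the argument list for one report, dropping content tagged drop_tag."""
    prefix = "--TagRemovePreprocessor."

    def as_set(tag):
        # Traitlet values are written as Python set literals, e.g. {"B"}.
        return '{"%s"}' % tag

    return [
        "jupyter", "nbconvert", notebook,
        "--output", output,
        "--to", "markdown",
        prefix + "remove_cell_tags=" + as_set(drop_tag),
        prefix + "remove_input_tags=" + as_set("hide_plot_code"),
        prefix + "remove_all_outputs_tags=" + as_set("hide_noisy_logs"),
        prefix + "remove_single_output_tags=" + as_set(drop_tag),
    ]


commands = {
    output: nbconvert_args("ComplicatedFilteringExample", output, drop_tag)
    for output, drop_tag in (("report_A", "B"), ("report_B", "A"))
}

for args in commands.values():
    # shlex.quote shows how each argument would need quoting in a shell.
    print(" ".join(shlex.quote(arg) for arg in args))
```

Passing one of these lists to `subprocess.run(args, check=True)` bypasses the shell, so the quotes inside the set literal need no backslash escaping.
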
>
> *Wrapping up:*
>
> You can easily set cell metadata using the notebook's cell tag toolbar. 
> The traitlets for filtering at the cell level exactly match strings in the 
> ith cell's nb.cells[i].metadata.tags value.
> The traitlets that filter at the cell level are:
>
>    - TagRemovePreprocessor.remove_cell_tags
>    - TagRemovePreprocessor.remove_input_tags
>    - TagRemovePreprocessor.remove_all_outputs_tags
>
> You can easily set output metadata using IPython's display function.
> The traitlet for filtering at the output level exactly matches strings in the 
> ith cell's jth output's nb.cells[i].outputs[j].metadata.tags value.
> And the traitlet that filters at the output level is
>
>    - TagRemovePreprocessor.remove_single_output_tags
>
> The remove_*_tags traitlets are sets. 
> On the command line you need to use curly brackets {} and escape your 
> quotes. 
> Via a config file, use the .add method. 
>
> For complicated collections of filters, it is usually easier to follow and 
> reproduce if you create a config file. 
>

