Engineering Notebook - Jupytext The Leonine Way

Thomas Passin Sat, 26 Oct 2024 06:03:55 -0700

This is a long post so I'm putting a summary at the start.

1. Leo can provide a very good way for a user to create and edit Jupyter
notebooks.
2. The exploratory work for @jupytext files does not provide a proper 
Leonine experience.
3. The usual Leo approach is weak when it comes to working with
documentation, and documentation mixed with code, in contrast to programs,
itemized lists, and the like where Leo is strong.
4. This weakness can be overcome and there is a discussion of how and why,
and some design suggestions.

There has been a flurry of activity in the last few weeks to add an ability
for working with Jupyter notebooks by means of an intermediate file format.
That format is provided by the Jupytext program. However, in the rush no
one seems to have thought much about what Leo can bring to the table, or
why anyone would want to use Leo for this purpose, compared with, say,
the Jupyter plugin for Visual Studio Code, which seems very good and very
readable to me.

For those who don't know yet, a Jupyter notebook's file format is JSON.
JSON was designed to interchange data, including program structures. It
was not designed for documentation or readability. Jupyter notebooks
contain a sequence of "cells" - basically nodes - that are either Markdown
text or code cells. There is no other structure. Jupytext's contribution is
to flatten the nested JSON data structure to a flat text format with what
amounts to sentinels - each line is commented out, and there are a few
specially-formatted comments. One of these special comment lines marks the
start of a Markdown cell, and another marks the start of a Python cell.
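
To make the flattening concrete, here is a minimal sketch that turns a tiny notebook's JSON into percent-style text. It is not Jupytext's own code: the function name notebook_to_percent is hypothetical, the real program also writes a header and cell metadata that are left out here, and the sketch assumes that only Markdown lines receive a "# " prefix while code lines are written as-is. If your files look different, adjust accordingly.

```python
import json


def notebook_to_percent(nb: dict) -> str:
    """Flatten a notebook dict into percent-style text (illustrative only)."""
    lines = []
    for cell in nb.get("cells", []):
        source = cell.get("source", "")
        # A cell's source may be one string or a list of strings.
        if isinstance(source, list):
            source = "".join(source)
        body = source.splitlines()
        if cell.get("cell_type") == "markdown":
            lines.append("# %% [markdown]")
            # Assumption: Markdown lines are commented out with "# ".
            lines.extend(("# " + s) if s else "#" for s in body)
        elif cell.get("cell_type") == "code":
            lines.append("# %%")
            # Assumption: code lines are written unchanged.
            lines.extend(body)
        else:
            continue  # other cell types are skipped in this sketch
        lines.append("")  # blank line between cells
    return "\n".join(lines)


notebook_json = """
{
  "cells": [
    {"cell_type": "markdown", "metadata": {},
     "source": ["# Title\\n", "Some text."]},
    {"cell_type": "code", "metadata": {}, "outputs": [],
     "execution_count": null, "source": "x = 1\\nprint(x)"}
  ],
  "metadata": {}, "nbformat": 4, "nbformat_minor": 5
}
"""

text = notebook_to_percent(json.loads(notebook_json))
print(text)
```

The point is only that the nested "cells" list collapses into a flat run of lines, with the two special comment lines as the sole remaining structure.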

Leo is very strong in these areas:

1. Structure is indicated by and can be changed using the outline;

2. Only the contents of a single node are visible and editable at one
time. This is excellent for concentrating on programming and itemized
lists of all kinds.

3. Any markup, such as Leo's sentinels, that is needed to support
structure or other Leo features is hidden from the user. There's hardly
any visual clutter. IMO this is a crucial feature that Leo offers. It
makes writing code to save and restore outlines and external files very
complicated, but the user doesn't need to know about that.

Leo is not nearly as capable in supporting writing and documentation.
That's because even though structure is important, it's also important to
be able to see and edit the flow of the document from node to node and how
nearby parts work together. An example is creating documentation using the
rst3 command and Sphinx. The mechanics of this process are excellent but
doing editing beyond the node level requires a lot of mental effort and
trial runs with Sphinx. The Viewrendered3 plugin is designed to help with
this problem by letting the user view an entire subtree.

Jupyter notebooks are serial combinations of documentation and code.
Basically they are a limited form of "literate programming". The code
cells are usually small enough that they don't need to be structured. They
are displayed by a Jupyter-viewing program in a clean, very readable way.

The Jupytext work over the last few weeks presents the user with a direct
view of the Jupytext-formatted file, sentinels, embedded comments, and all.
A user has to hand-edit the file while being distracted by the extra
markup and not being able to see the flow of the parts one into another.
Yes, parts can be moved around and navigated to using the outline, but the
editing experience is inferior. There is also no syntax highlighting for
code nodes - since the code is all commented out - nor can even rudimentary
syntax checks be carried out. The appearance of the document as it would
be viewed in a Jupyter program - well, it is unknown until the file is
saved to the .ipynb form and reloaded into Jupyter.

The Leonine Way
------------------
1. Leo should present a view of a Jupyter notebook file to the user
without sentinels or visual clutter, just like it does with other external
file types.

2. Leo should be able to recreate the file's structure on reloading, or at
least a close approximation to it.

3. Code cells should be syntax-highlighted and, preferably, at least
syntax-checkable.

4. Code execution would be a bonus but not required.

5. The user should need to know a minimum of special forms such as
directives or other special markup features, and if there are any they
should have a form similar to other Leo forms.

6. There should be a way to show a view of adjacent or nearby nodes so that
the user can make sure they work together as intended. Showing a fully
rendered view of nodes' Markdown is a desirable bonus to avoid
round-tripping to Jupyter.

7. There should be an easy way to extend the file handling and rendering
capabilities to use other programming languages besides Python.

8. The process of converting, importing, and exporting the files should be
invisible to the user, just as for any other external file that Leo
currently supports.

Thoughts on Design
-------------------
Leo already has parts that cover some of the items above. The rst3 command gives
a way to maintain structure apart from embedding it into an external file.
With rst3, a Sphinx document becomes an ordinary Leo tree, not an external
file. Running the command creates file(s) in the format that Sphinx needs.
A "jupyter" command would do a similar job, and it would be far simpler.

VR3 (the Viewrendered3 plugin) is designed to handle 4, 5, and 6. For example, a Jupytext file starts
a new markdown cell with this line:

# %% [markdown]

A code cell is started by:

# %%

VR3 can recognize blocks delineated with @language directives:

@language md
This starts a block of markdown.

@language python
# Python code goes here

To get a Jupytext file to render in VR3, just run a text substitution for
these forms and then remove the remaining leading "#" characters. That can
be done during import of a notebook file, and doing it will give the user a
clean view of the contents without visual clutter. If we are going to need
conversion code for VR3 rendering anyway, we might as well run that same
code up front at import time.
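
The substitution could look roughly like the sketch below. It is an illustration, not code from Leo or VR3: percent_to_leo is a hypothetical name, and the sketch assumes that the leading "#" is stripped only inside Markdown cells, so that genuine Python comments in code cells survive. If code lines are also commented out in your files, the same stripping would have to be applied there too.

```python
MARKDOWN_MARKER = "# %% [markdown]"
CODE_MARKER = "# %%"


def percent_to_leo(text: str) -> str:
    """Replace percent-style cell markers with @language directives."""
    out = []
    in_markdown = False
    for line in text.splitlines():
        # Test the longer marker first: it also starts with "# %%".
        if line.startswith(MARKDOWN_MARKER):
            in_markdown = True
            out.append("@language md")
        elif line.startswith(CODE_MARKER):
            in_markdown = False
            out.append("@language python")
        elif in_markdown and line.startswith("#"):
            # Drop the comment prefix: "# " if present, else a bare "#".
            out.append(line[2:] if line.startswith("# ") else line[1:])
        else:
            out.append(line)
    return "\n".join(out)


sample = "\n".join([
    "# %% [markdown]",
    "# ## Results",
    "# The plot is below.",
    "",
    "# %%",
    "# a real Python comment stays",
    "total = 1 + 2",
])

converted = percent_to_leo(sample)
print(converted)
```

Going the other way at export time would be the mirror image: swap the directives back to the marker lines and re-add the comment prefix to Markdown lines.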

The most challenging task will be how to merge changes that someone makes
inside Jupyter with the internal arrangement of nodes inside a Leo outline.
I don't think that will be easy, but it's essentially the same problem that
has already been solved for @clean external files.
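
As a rough picture of what that matching involves, the sketch below lines up an old and a new list of cell bodies with Python's standard difflib module. This is not the algorithm Leo uses for @clean files; it is a swapped-in, much simpler technique, and the function name match_cells is hypothetical. It only shows the kind of question that has to be answered: which cells are unchanged, which were edited, and which are new.

```python
import difflib


def match_cells(old_cells, new_cells):
    """Return (tag, old_range, new_range) tuples describing cell changes."""
    matcher = difflib.SequenceMatcher(a=old_cells, b=new_cells, autojunk=False)
    return [
        (tag, (i1, i2), (j1, j2))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    ]


# Cell bodies as Leo last saw them, and as they came back from Jupyter.
old = ["intro text", "x = 1", "print(x)"]
new = ["intro text", "x = 2", "print(x)", "summary"]

changes = match_cells(old, new)
for tag, old_range, new_range in changes:
    print(tag, old_range, new_range)
```

Cells tagged "equal" can keep their existing nodes and outline position, "replace" means a node body needs updating, and "insert" or "delete" means nodes have to be created or removed somewhere sensible in the outline. Deciding where is the hard part.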
