ENB: Could @leo nodes organize other outlines and help the GC?

Edward K. Ream Wed, 16 Jul 2025 08:15:07 -0700

Recent improvements to @clean allow Leo to update outlines containing 
thousands of @clean nodes. For the first time, it is feasible to use Leo to 
work on huge repos such as Rust's compiler 
<https://github.com/rust-lang/rust/tree/master/compiler>.

Alas, Leo's performance degrades substantially when using huge outlines.
Python's GC (Garbage Collector) probably gets overly stressed by all the
temporary data Leo generates.

This Engineering Notebook post explores a possible solution. As always,
please feel free to ignore it. However, this ENB presents an exciting new
direction for Leo.

*@leo nodes would create a hierarchy of Leo outlines*

The idea is to let *@leo nodes* in a *top-level outline* coordinate
operations in *linked sub-outlines*. For example, *rust_compiler.leo* (in
the rust/compiler directory) would have the following @leo nodes:

@leo rustc/rustc.leo
@leo rustc_abi/rustc_abi.leo
@leo rustc_arena/rustc_arena.leo

And dozens of others. So the top-level outline will be tiny, and each
*sub-outline* will be much smaller than a single outline covering the whole
directory. As discussed below, the performance might not improve enough.
But let's discuss some exciting ideas first.

*Cross-file searches and (maybe??) cross-file clones*

Straightforward extensions to Leo's find commands will allow Leonistas to
search all subsidiary outlines from the top-level outline! Cross-tab (or
inter-process) communication will transfer results from the sub-outlines to
the top-level outline. The details are unclear for now.

The details of cross-file cff commands are more complex. Initially, the
sub-outlines could communicate the *cross-file unls* back to the top-level
outline. The cff becomes a set of unls. Recall that *Leo already supports
cross-file unls.*
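
As a rough illustration of "the cff becomes a set of unls", here is a
minimal sketch of how a top-level outline might combine the results that
each sub-outline sends back. The function name and the shape of its
argument are assumptions made for this example, not part of Leo. Unls are
treated as opaque strings.

```python
def merge_unls(results_by_outline):
    """Merge per-outline lists of unls into one ordered list without duplicates.

    results_by_outline is a hypothetical mapping from a sub-outline's path
    to the list of unl strings that outline reported.
    """
    seen = set()
    merged = []
    # Visit the sub-outlines in a stable (sorted) order.
    for outline in sorted(results_by_outline):
        for unl in results_by_outline[outline]:
            if unl not in seen:
                seen.add(unl)
                merged.append(unl)
    return merged
```

The top-level outline could then show one node per merged unl. How the
lists travel between tabs or processes is left open here, as it is above.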

Later, we might consider true cross-file clones. Changing such a clone in
the top-level outline would change the corresponding clone in the
sub-outline. And vice versa!

But this is not the time to consider how to do this magic. For now, the
conclusion is that cross-file clones *might* make sense, contrary to my
decades-old opinion!

*Helping the GC?*

Now let's turn our attention back to performance issues.

First, let's suppose Leo handles @leo nodes by loading sub-outlines in
separate tabs. Does this help the GC? The answer is "yes and no" :-)

Yes, each tab contains less data, so operations on the tree and body
*might* become more efficient. But no, the GC has the same amount (and a
bit more) to handle. My *guess* is that putting smaller outlines into
separate tabs will have a small (negligible?) effect on performance.
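
One way to check this guess is to measure how much time Python actually
spends in garbage collection. The sketch below uses the standard
gc.callbacks hook. The GCTimer class is a hypothetical helper written for
this example, not something Leo provides.

```python
import gc
import time


class GCTimer:
    """Accumulate the wall-clock time spent in cyclic garbage collections."""

    def __init__(self):
        self.total = 0.0       # Seconds spent in collections so far.
        self.collections = 0   # Number of completed collections.
        self._start = None

    def _callback(self, phase, info):
        # Python calls this with phase "start" before a collection
        # and "stop" after it.
        if phase == "start":
            self._start = time.perf_counter()
        elif phase == "stop" and self._start is not None:
            self.total += time.perf_counter() - self._start
            self.collections += 1
            self._start = None

    def __enter__(self):
        gc.callbacks.append(self._callback)
        return self

    def __exit__(self, *exc_info):
        gc.callbacks.remove(self._callback)
        return False


# Usage: wrap the operation to be measured.
with GCTimer() as timer:
    data = [[i] for i in range(10_000)]  # Stand-in for real work.
    gc.collect()
print(f"{timer.collections} collections, {timer.total:.6f} sec")
```

Wrapping an expensive outline operation this way, once with a huge outline
and once with a small one, would show whether the GC is really the
bottleneck.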

*The first prototype*

Happily, it will be easy to prototype this initial idea. I'll write a
script that:

- Creates @leo nodes for all sub-directories of the rust/compiler directory.
- Creates the corresponding .leo file in each subdirectory.
- Loads (details unclear) each created (subsidiary) .leo file with the
desired @clean nodes.
- Creates a list (suitable for the command-line) of files to be loaded.

So a command line like:

leo rust_compiler.leo <list of sub-outlines>

will load all the desired outlines, placing each sub-outline in its own
tab. It will then be easy to see how much this scheme improves Leo's
performance.
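
Here is a minimal sketch of the directory-scanning part of such a script. It
computes the @leo headlines and the command line, but it does not create
any .leo files or @clean nodes. The function name is hypothetical, and the
sketch assumes one sub-outline per immediate sub-directory, named after
that directory.

```python
from pathlib import Path


def plan_sub_outlines(compiler_dir):
    """Return (headlines, command) for the sub-directories of compiler_dir."""
    root = Path(compiler_dir)
    headlines = []
    outline_paths = []
    for sub in sorted(p for p in root.iterdir() if p.is_dir()):
        # Assumption: each sub-outline lives in its own directory
        # and has the same name as that directory.
        rel = f"{sub.name}/{sub.name}.leo"
        headlines.append(f"@leo {rel}")
        outline_paths.append(rel)
    # The top-level outline comes first, followed by every sub-outline.
    command = ["leo", "rust_compiler.leo", *outline_paths]
    return headlines, command
```

Calling plan_sub_outlines on the rust/compiler directory would yield
headlines such as "@leo rustc_abi/rustc_abi.leo" and the argument list for
the command line shown above.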

*Separate processes instead of separate tabs*

Separate tabs might not help enough. In that case, @leo nodes could load
sub-outlines in separate *processes* instead of separate tabs. This
approach will almost surely solve the performance problems. Operating
systems are very good at running separate processes! Each process will
run a separate copy of Python with its own GC.
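
A minimal sketch of the launching side, assuming that "leo <path>" opens a
single outline as in the command line above. The function names are
hypothetical, and the sketch says nothing about how the processes would
talk to each other.

```python
import subprocess


def outline_commands(sub_outlines, launcher="leo"):
    """Return one command (an argument list) per sub-outline."""
    return [[launcher, path] for path in sub_outlines]


def launch_all(sub_outlines, launcher="leo"):
    """Start one process per sub-outline and return the Popen objects.

    Each process runs its own Python interpreter, so each has its own GC.
    """
    return [subprocess.Popen(cmd) for cmd in outline_commands(sub_outlines, launcher)]
```

The separate outline_commands function makes it possible to inspect the
commands without starting any processes.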

The same general ideas still apply, but now the top-level outline and all
the sub-outlines must communicate via Leo's servers. There will probably be
one server per process. Leo's server architecture will almost surely need
to be extended. Surely such a scheme is feasible, but I have no intuition
about the details.

Happily, *we can ignore inter-process complications for now.* I'll do all
my initial experiments using sub-outlines in separate Leo tabs. It should
be straightforward to extend Leo's find command using inter-*tab*
communication. Who knows, maybe cross-file clones *do* make sense!

*Summary*

*@leo nodes* will create a hierarchical relationship between a (single)
*top-level outline* and several *sub-outlines*. For now, we can assume that
@leo nodes appear only in the top-level outline. We'll reexamine this
question later.

Extensions to Leo's find commands (including the clone-find commands) will
allow sub-outlines to send results back to the top-level outline.
Initially, results will be unls. Eventually, these unls might morph into
cross-file clones. Changing a cross-file clone in the top-level outline
would change the corresponding clone in the sub-outline and *vice versa.*

Communication between tabs is straightforward, but putting sub-outlines in
separate Leo tabs is unlikely to improve Leo's performance enough.

Ultimately, Leo could run each sub-outline in a separate process. This
scheme would require substantial updates to Leo's server. For now, I'll
extend Leo's find commands using inter-*tab* communication. Maybe
cross-file clones *do* make sense!

I welcome all your comments, questions, and suggestions. I am excited by
this project, and I hope you are too.

Edward
