Recent improvements to @clean allow Leo to update outlines containing 
thousands of @clean nodes. For the first time, it is feasible to use Leo to 
work on huge repos such as Rust's compiler 
<https://github.com/rust-lang/rust/tree/master/compiler>.

Alas, Leo's performance degrades substantially when using huge outlines. 
Python's GC (Garbage Collector) probably gets overly stressed by all the 
temporary data Leo generates.

This Engineering Notebook post explores a possible solution. As always, 
please feel free to ignore it. However, this ENB presents an exciting new 
direction for Leo.

*@leo nodes would create a hierarchy of Leo outlines*

The idea is to let *@leo nodes* in a *top-level outline* coordinate 
operations in *linked sub-outlines*. For example: *rust_compiler.leo* (in 
the rust/compiler directory) would have the following @leo nodes:

  @leo rustc/rustc.leo
  @leo rustc_abi/rustc_abi.leo
  @leo rustc_arena/rustc_arena.leo

And dozens of others. So the top-level outline will be tiny and the 
*sub-outlines* will be much smaller. As discussed below, the performance 
might not improve enough. But let's discuss some exciting ideas first.

*Cross-file searches and (maybe??) cross-file clones*

Straightforward extensions to Leo's find commands will allow Leonistas to 
search all subsidiary outlines from the top-level outline! Cross-tab (or 
inter-process) communication will transfer results from the sub-outlines to 
the top-level outline. All details are unclear for now.

The details of cross-file cff commands are more complex. Initially, the 
sub-outlines could communicate the *cross-file unls* back to the top-level 
outline. The cff becomes a set of unls. Recall that *Leo already supports 
cross-file unls.*
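
As a rough illustration of "the cff becomes a set of unls", here is a 
minimal sketch in plain Python. It does not use Leo's API: the Node class, 
find_unls, and cross_file_find are hypothetical stand-ins, and the 
"unl://file#headline-->headline" string is only an assumed approximation of 
a path-style unl.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """Hypothetical stand-in for an outline node (not Leo's position class)."""
    headline: str
    body: str = ""
    children: list = field(default_factory=list)

def find_unls(file_name, root, pattern):
    """Return one unl-like string for each node whose body contains pattern."""
    results = []

    def visit(node, path):
        path = path + [node.headline]
        if pattern in node.body:
            # Assumed format: file name, '#', then headlines joined by '-->'.
            results.append(f"unl://{file_name}#" + "-->".join(path))
        for child in node.children:
            visit(child, path)

    visit(root, [])
    return results

def cross_file_find(sub_outlines, pattern):
    """Collect, in the top-level outline, the unls found in each sub-outline."""
    unls = []
    for file_name, root in sub_outlines.items():
        unls.extend(find_unls(file_name, root, pattern))
    return unls
```

In a real implementation each sub-outline would run its own search and send 
the resulting strings to the top-level outline; here a simple loop plays 
that role.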

Later, we might consider true cross-file clones. Changing such a clone in 
the top-level outline would change the corresponding clone in the 
sub-outline. And vice versa!

But this is not the time to consider how to do this magic. For now, the 
conclusion is that cross-file clones *might* make sense, contrary to my 
decades-old opinion!

*Helping the GC?*

Now let's turn our attention back to performance issues.

First, let's suppose Leo handles @leo nodes by loading sub-outlines in 
separate tabs. Does this help the GC? The answer is "yes and no" :-)

Yes, each tab contains less data, so operations on the tree and body 
*might* become more efficient. But no, the GC has the same amount (and a 
bit more) to handle. My *guess* is that putting smaller outlines into 
separate tabs will have a small (negligible?) effect on performance.
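
One way to replace the guess with a measurement is to time the collector 
directly. The sketch below uses only Python's standard gc.callbacks hook; 
the GCTimer class is a hypothetical helper, not part of Leo, and it reports 
wall-clock time spent inside collections while the "with" block runs.

```python
import gc
import time

class GCTimer:
    """Accumulate the wall-clock time spent in garbage collections."""

    def __init__(self):
        self.total = 0.0       # Seconds spent inside collections.
        self.collections = 0   # Number of completed collections.
        self._start = None

    def _callback(self, phase, info):
        # The gc module calls this with phase "start" or "stop".
        if phase == "start":
            self._start = time.perf_counter()
        elif phase == "stop" and self._start is not None:
            self.total += time.perf_counter() - self._start
            self.collections += 1
            self._start = None

    def __enter__(self):
        gc.callbacks.append(self._callback)
        return self

    def __exit__(self, *exc_info):
        gc.callbacks.remove(self._callback)
        return False

# Usage: wrap any operation whose GC cost is of interest.
with GCTimer() as timer:
    data = [[i] for i in range(100_000)]  # Stand-in for a Leo operation.
    gc.collect()
print(f"{timer.collections} collections, {timer.total:.4f} sec in the GC")
```

Comparing the totals for one huge outline and for several smaller outlines 
in separate tabs would show whether the GC is really the bottleneck.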

*The first prototype*

Happily, it will be easy to prototype this initial idea. I'll write a 
script that:

- Creates @leo nodes for all sub-directories of the rust/compiler directory.
- Creates the corresponding .leo file in each subdirectory.
- Loads (details unclear) each created (subsidiary) .leo file with the 
desired @clean nodes.
- Creates a list (suitable for the command-line) of files to be loaded.

So a command line like:

leo rust_compiler.leo <list of sub-outlines>

will load all the desired outlines, placing each sub-outline in its own 
tab. It will then be easy to see how much this scheme improves Leo's 
performance.
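
The planning part of the script can be sketched with the standard library 
alone. This is only a sketch under assumptions: plan_sub_outlines and 
command_line are hypothetical names, the "<dir>/<dir>.leo" naming follows 
the example @leo nodes above, and creating the actual @leo nodes and .leo 
files (which needs Leo itself) is left out.

```python
from pathlib import Path

def plan_sub_outlines(compiler_dir):
    """Return (headline, leo_path) pairs, one per immediate sub-directory."""
    plan = []
    sub_dirs = sorted(p for p in Path(compiler_dir).iterdir() if p.is_dir())
    for sub_dir in sub_dirs:
        name = sub_dir.name
        headline = f"@leo {name}/{name}.leo"  # Headline of the @leo node.
        leo_path = sub_dir / f"{name}.leo"    # Where the sub-outline would live.
        plan.append((headline, leo_path))
    return plan

def command_line(top_level, plan):
    """Build the argument list that loads the top-level and all sub-outlines."""
    return ["leo", top_level] + [str(path) for _, path in plan]
```

The list returned by command_line could be passed to subprocess.run or 
joined into a single shell command.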

*Separate processes instead of separate tabs*

Separate tabs might not help enough. In that case, @leo nodes could load 
sub-outlines in separate *processes* instead of separate tabs. This 
approach will almost surely solve the performance problems. Operating 
systems are very, very good at running separate processes! Each process 
will run a separate copy of Python with its own GC.
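
The claim that each process gets its own interpreter and its own GC is easy 
to demonstrate with the standard library. The sketch below does not start 
Leo or use Leo's server; CHILD_CODE and run_child are hypothetical names, 
and the child is just a bare Python interpreter that reports its own 
process id and GC statistics.

```python
import json
import os
import subprocess
import sys

# Code run by the child: report its own pid and its own GC counters.
CHILD_CODE = """
import gc, json, os
stats = gc.get_stats()
print(json.dumps({
    "pid": os.getpid(),
    "gc_collections": sum(s["collections"] for s in stats),
}))
"""

def run_child():
    """Run CHILD_CODE in a fresh Python process and return its report."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        capture_output=True, text=True, check=True,
    )
    return json.loads(completed.stdout)

report = run_child()
print("parent pid:", os.getpid(), "child report:", report)
```

The child's counters are independent of the parent's: garbage created in one 
process never adds work for the collector in another.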

The same general ideas still apply, but now the top-level outline and all 
the sub-outlines must communicate via Leo's servers. There will probably be 
one server per process. Leo's server architecture will almost surely need 
to be extended. Surely such a scheme is feasible, but I have no intuition 
about the details.

Happily, *we can ignore inter-process complications for now.* I'll do all 
my initial experiments using sub-outlines in separate Leo tabs. It should 
be straightforward to extend Leo's find command using inter-*tab* 
communication. Who knows, maybe cross-file clones *do* make sense!

*Summary*

*@leo nodes* will create a hierarchical relationship between a (single) 
*top-level outline* and several *sub-outlines*. For now, we can assume that 
@leo nodes appear only in the top-level outline. We'll reexamine this 
question later.

Extensions to Leo's find commands (including the clone-find commands) will 
allow sub-outlines to send results back to the top-level outline. 
Initially, results will be unls. Eventually, these unls might morph into 
cross-file clones. Changing a cross-file clone in the top-level outline 
would change the corresponding clone in the sub-outline and *vice versa.*

Communication between tabs is straightforward, but putting sub-outlines in 
separate Leo tabs is unlikely to improve Leo's performance enough.

Ultimately, Leo could run each sub-outline in a separate process. This 
scheme would require substantial updates to Leo's server. For now, I'll 
extend Leo's find commands using inter-*tab* communication. Maybe 
cross-file clones *do* make sense!

I welcome all your comments, questions, and suggestions. I am excited by 
this project, and I hope you are too.

Edward
