Re: ENB: The way forward: using positions and vnodes in the npyscreen code

Edward K. Ream Wed, 17 May 2017 06:59:10 -0700

On Monday, May 15, 2017 at 10:28:49 AM UTC-5, Edward K. Ream wrote:

In conversation with Rebecca last night at dinner I saw more clearly what 
> the next step is.
>
> *tl;dr:* We need an interface *class *that delivers (wrappers to) Leo's 
> position and/or vnode classes instead of some (as yet unknown) npyscreen 
> *array*.
>


The way is now open to create this class.

This post describes a collapse in complexity in the LeoMLTree and 
LeoTreeData classes. This post also gives a very fast caching scheme for 
redrawing npyscreen outlines.

This post continues the original Engineering Notebook post.  It will be of 
interest only to a few. It will be pre-writing for a post to the npyscreen 
list.

*Two non issues*

What I will say later might be taken as a criticism of the npyscreen code 
base. But any such complaints would be silly. It would be extremely 
dangerous and unwise to make *any* non-local changes to the npyscreen 
sources.  Doing so would almost surely break existing npyscreen apps. I 
have felt free to make minor *local* simplifications to the sources in the 
leo/external/npyscreen folder, but that's all.

The quality of the npyscreen docs is also a non-issue. Sources, not words, 
are the ultimate underlying truth. After brief perusal of the npyscreen 
docs, I have only read the npyscreen sources.

*Collapsing the LeoMLTree class*

I am going to speak plainly. Elsewhere I have described the npyscreen 
sources as a maze.  I have spent days wandering the maze trying to unravel 
the code.  Much of the maze is irritating, useless, C-like, blah, blah blah.

In particular, the base npyscreen code seems deathly afraid of crashing. 
There is no reason for this fear in Python. During development, *crashes 
help the app's developer find problems*. In the end, it's up to the *app's 
developer* to eliminate crashes.

Happily, it's possible for subclasses of MLTree, such as LeoMLTree, to cut 
through this maze without touching MLTree at all. Now, LeoMLTree is dead 
simple. Here is how I did it:

1. *Eliminated weak references*. Weak references infect the npyscreen 
sources like a rash. In fact, there is not the slightest reason to use them 
in the tree code, and many good reasons not to do so. For starters, weak 
references complicate debugging. Imo, removing weak references was an 
essential first step toward navigating the maze.

2. *Simplified the all-important values property*. The actual drawing code 
uses what looks like a values *array.* It took a *long* time to see that 
the values *property *simulates an array whose values may be cached.  Here 
are the over-ridden getter and setter methods, as defined in LeoMLTree:

    def _setValues(self, tree):
        self._myFullValues = tree or LeoTreeData()

    def _getValues(self):
        '''
        Return the (possibly cached) list returned by 
self._myFullValues.get_tree_as_list().
    
        Setting _cached_tree to None invalidates the cache.
        '''
        if getattr(self, '_cached_tree', None):
            return self._cached_tree_as_list
        else:
            self._cached_tree = self._myFullValues
            self._cached_tree_as_list = 
self._myFullValues.get_tree_as_list()
            return self._cached_tree_as_list

This code is *way* simpler than the corresponding code in the base MLTree 
class. But even this simpler code is faux clever, for at least three, 
reasons:

1. Code in MLTree (or LeoMLTree) must explicitly disable the cache by 
setting the _cached_tree ivar to None. This makes the client code aware of 
a nerdy little helper ivar.

2. Instead of setting _cached_tree to None, the client code *might as well 
refresh the cache directly*, say by calling self.refresh_values(). So the 
values property has no real purpose. To anticipate a bit, Leo will need to 
refresh/invalidate the cache in exactly one place, namely in its 
c.frame.tree.redraw method.

3. The values property doesn't even do a very good job of caching. 
Refreshing the cache creates a TreeData node for *every visible node* in 
the tree. This could be a problem for large trees. A better caching scheme, 
given below, creates TreeData (LeoTreeData) nodes only when actually 
referenced.

*Clever caching*

At last I can describe how the big switcheroo will work:

- A values *class *will replace the values *property*.

- The class will have a __getitem__ method that will return only the 
requested LeoTreeData item. Something like this (untested):

    def __getitem__(self, n):
        return self.get_data(n)

    def get_data(self, n)
        '''Return a LeoTreeData for the n'th visible position of the 
outline.'''
        c = self.c
        # Look for n or n-1 in the caches.
        data = self.data_cache.get(n)
        if data:
            return data
        p = self.position_cache.get(max(0,n-1))
        if p:
            p = p.moveToNextVisible()
            if p:
                self.position_cache[n] = p
                self.data_cache[n] = data = LeoTreeData(p)
                return data
            else:
                return None
        # Search the tree, caching the result.
        i, p = 0, c.rootPosition()
        while p:
            if i == n:
                self.position_cache[n] = p
                self.data_cache[n] = data = LeoTreeData(p)
                return data
            else:
                p.moveToNextVisible()
        return None

This is clever code. In practice, it will have close to O(0) running time! 
Indeed, LeoMLTree._redraw (adapted from MultiLine.update) *access lines in 
numeric order.* As a result, a search of the Leo outline will be needed for 
only the first displayed line. Thereafter, n-1 will exist in the position 
cache, so only a single call to p.moveToNextVisible() will needed.

*Summary*

Subclasses of MLTree (and TreeData) can be much simpler than the 
corresponding base classes. The actual LeoMLTree and LeoTreeData classes 
contain many more examples than shown here. I hope the actual code will 
help developers of npyscreen devs navigate the maze.

Significantly better caching of TreeData objects is possible. 
LeoMLTree.values[i] should be faster than MLTree.values[i], even 
(especially) for very large Leo outlines. The caching scheme shown above 
reduces the stress on the GC by greatly reducing the number of allocated 
LeoTreeData objects.

c.frame.tree.redraw will clear the two caches in a new LeoValues class and 
then call LeoMLTree.display().

Edward

-- 
You received this message because you are subscribed to the Google Groups 
"leo-editor" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/leo-editor.
For more options, visit https://groups.google.com/d/optout.

Re: ENB: The way forward: using positions and vnodes in the npyscreen code

Reply via email to