I said that all the work on #1440 has been valuable, even though a simple 
script might use asttokens to do everything that the code in leoAst.py does.

This Engineering Notebook post explains why deep knowledge of the problem 
domain was needed to get to the surprising script. This post also explains 
some parts of the script in detail. As with all ENB posts, feel free to 
ignore it.

At no time was I upset by the surprise. I immediately treated it as *good* 
news. asttokens now provides a valuable point of comparison and context. 
The work I have done has given me deep insight into the subtle, 
behind-the-scenes complications involved.

*Why did I, and black, and fstringify miss this possibility?*

In retrospect, it's clear why the Aha is easy to miss:

1. I didn't know until yesterday what data would be needed. It's impossible 
to see which approaches could work until you know exactly what data the 
code needs. Before that, the problem is simply too confusing.

2. I have been assuming all along that *exact* traversal order would 
(ultimately) be required. But that is not at all true. Indeed, in some 
cases *random* traversal suffices.

The Fstringify code in leoAst.py is an example. The ast.BinOp visitor would 
work if the nodes were visited in *any* order, because potential f-strings 
are disjoint. However, we actually want the BinOp nodes to be visited in 
approximately the order in which those ops appear in the sources, because 
Fstringify produces log messages, and we don't want *those* messages to be 
scrambled ;-)

3. [The big one]. I have been assuming that an exact, 1-to-1, 
correspondence between tokens and ast nodes is needed. Wrong, wrong, wrong! 
We can tolerate many-to-many links between tokens and nodes. That is, many 
nodes might point at a single token, and a single token might point at many 
nodes.

This is what I saw yesterday while discussing links with Rebecca. Iirc, I 
saw that the crucial test in o.colon would work just fine with a 
many-to-many mapping between tokens and nodes. I've shown this crucial code 
before. Here it is again:

def colon(self, val):
    """Handle a colon."""
    node = self.token.node
    self.clean('blank')
    if not isinstance(node, ast.Slice):
        self.add_token('op', val)
        self.blank()
        return
    # A slice.
    [snip]

The Aha: yesterday I saw that the code:

if not isinstance(node, ast.Slice):

could be replaced by:

if not any(isinstance(z, ast.Slice) for z in self.token.node_list):
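
A minimal sketch of how the relaxed test behaves, using only the standard 
library. The helper name is_slice_colon and the hand-built node lists are 
assumptions made for illustration; they stand in for token.node_list and 
are not taken from leoAst.py.

```python
import ast

def is_slice_colon(node_list):
    """Return True if any node linked to a colon token is an ast.Slice."""
    return any(isinstance(z, ast.Slice) for z in node_list)

# Hand-built stand-ins for token.node_list (hypothetical data).
# The colon in a[1:2] is covered by both the Subscript and the Slice.
subscript = ast.parse("a[1:2]", mode="eval").body
slice_colon_nodes = [subscript, subscript.slice]

# The colon in {a: b} is covered only by the Dict node.
dict_node = ast.parse("{a: b}", mode="eval").body
dict_colon_nodes = [dict_node]

print(is_slice_colon(slice_colon_nodes))
print(is_slice_colon(dict_colon_nodes))
```

The test no longer cares which single node "owns" the colon. It only asks 
whether a Slice appears anywhere among the linked nodes.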

Let's see how token.node_list can be computed...

*The asttokens script*

First, we create a list of *mutable* Token objects. asttokens stores its 
tokens as named tuples. Named tuples are immutable, so the script must 
create an auxiliary list. The Token class is simple. No need to show it 
here.

atok = asttokens.ASTTokens(source, parse=True)
tokens = [Token(atok_name(z), atok_value(z)) for z in atok.tokens]
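
For readers who want something concrete anyway, here is one way such a 
class and its two helpers might look. Everything in this sketch is an 
assumption except the node_list attribute and the two-argument constructor 
call shown above: the attribute names kind and value, and the bodies of 
atok_name and atok_value, are guesses, not the code in the actual script. 
To stay runnable without asttokens, the demo feeds the helpers tokens from 
the standard tokenize module, which carry the same type and string fields.

```python
import io
import tokenize
from token import tok_name

class Token:
    """A mutable token: a kind, a value, and a list of linked ast nodes."""

    def __init__(self, kind, value):
        self.kind = kind
        self.value = value
        self.node_list = []  # Filled in later, one entry per covering node.

    def __repr__(self):
        return f"Token({self.kind!r}, {self.value!r})"

def atok_name(tok):
    """Hypothetical helper: a lowercase kind, such as 'name' or 'op'."""
    return tok_name[tok.type].lower()

def atok_value(tok):
    """Hypothetical helper: the token's text."""
    return tok.string

# Stand-in for atok.tokens: tokens from the standard tokenize module.
raw_tokens = tokenize.generate_tokens(io.StringIO("a[1:2]\n").readline)
tokens = [Token(atok_name(z), atok_value(z)) for z in raw_tokens]
```

Each Token gets its own empty node_list, so links can be appended after 
the fact, which a named tuple would not allow.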

Given this list of Token objects, it's a snap to create the token lists:

for node in asttokens.util.walk(atok.tree):
    for ast_token in atok.get_tokens(node, include_extra=True):
        i = ast_token.index
        token = tokens[i]
        token.node_list.append(node)

That's all there is to it. It's also straightforward to inject parent/child 
links into ast nodes. See the actual script for details.
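
The same two ideas can be approximated with the standard library alone, 
which may help readers who don't have asttokens installed. This is a 
sketch under stated assumptions, not the actual script: it swaps 
asttokens for a plain position comparison, it keeps the links in a 
parallel list (node_lists) instead of a Token class, and the attribute 
names parent and children are hypothetical. It assumes Python 3.9 or 
later, where ast.Slice nodes carry positions, and ASCII source, because 
ast columns are byte offsets while tokenize columns count characters.

```python
import ast
import io
import tokenize

# Context and operator nodes may be shared between parents, so skip them.
SHARED = (ast.expr_context, ast.boolop, ast.operator, ast.unaryop, ast.cmpop)

def link(source):
    """Return (tree, tokens, node_lists).

    node_lists[i] holds every positioned ast node whose source range
    covers tokens[i], so the token/node links are many-to-many.
    """
    tree = ast.parse(source)
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    node_lists = [[] for _ in tokens]
    tree.parent = None
    for node in ast.walk(tree):
        if isinstance(node, SHARED):
            continue
        # Inject parent/child links ('parent', 'children' are assumed names).
        node.children = [
            z for z in ast.iter_child_nodes(node) if not isinstance(z, SHARED)
        ]
        for child in node.children:
            child.parent = node
        if getattr(node, 'end_lineno', None) is None:
            continue  # Module and similar nodes carry no position.
        first = (node.lineno, node.col_offset)
        last = (node.end_lineno, node.end_col_offset)
        for i, tok in enumerate(tokens):
            if first <= tok.start and tok.end <= last:
                node_lists[i].append(node)
    return tree, tokens, node_lists

tree, tokens, node_lists = link("x = a[1:2]\n")
i = next(i for i, z in enumerate(tokens) if z.string == ':')
colon_kinds = sorted(type(z).__name__ for z in node_lists[i])
slice_node = next(z for z in node_lists[i] if isinstance(z, ast.Slice))
print(colon_kinds, type(slice_node.parent).__name__)
```

The colon ends up linked to several nodes at once, and the Slice among 
them is all that the test in o.colon needs to find.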

*Summary*

It takes deep insight to realize that asttokens could replace the TOG and 
TOT classes. This is the reason I was happy to see this possibility.

In any event, the TOG and TOT classes are still valuable. They are faster 
and clearer (in most ways) than the asttokens code. Otoh, the asttokens 
code could be said to be more clever. The new insights promise new ways to 
simplify the code in leoAst.py using clever asttokens code.

Edward
