This Engineering Notebook post discusses recent studies and thoughts.

I slept all day yesterday, perhaps with a mild case of covid. I've been 
vaccinated and boosted, so I'm not too worried. I feel better this morning. 
Thank goodness for immune systems! And now I have an excuse if my writing 
sounds like ravings :-)

The goal of my thinking is to find my next project. Leo is complete as it 
is. Nothing urgently needs to be added. But studying other problems might 
suggest additions to Leo.

Creativity requires a juicy problem. Not necessarily a big problem, just a 
problem that seems interesting!

*Type inference*

If I could wave a magic wand, I would wish for a faster pylint. And maybe a 
faster mypy, though mypy is surprisingly fast. Both pylint and mypy attempt 
heavy-duty type inference, so I decided (once again) to try to understand 
Hindley-Milner type inference.

Previous attempts were unsuccessful. The notation is cryptic. This time 
around, google yielded much better results. I struck paydirt with a great 
lecture 
<https://www.youtube.com/watch?v=cQf_6NsLp80&list=PLOV3frUQ3B3GMSEBM4JAnFMR4yh3iYfL2&index=2&t=2372s>
 by Adam Doupe <https://adamdoupe.com/> at Arizona State University. The 
second part (of the same lecture) is here 
<https://www.youtube.com/watch?v=THbNCcWWUdA>.

I enjoyed the way Prof. Doupe would ask questions and patiently wait. It's 
the style of a relaxed, confident teacher. Very rare, in my experience. 
Here's what I learned:

First, the algorithm is just common sense! It simply traverses the parse 
tree (in *any* order), recording inferences along the way. Nothing could be 
simpler.

Second, the algorithm is more of a roadmap than a proper algorithm. The 
possible inferences depend on the language, and the algorithm omits any 
mention of the myriad housekeeping details that form the bulk of the actual 
code! 

Armed with my new confidence about type inference, I briefly reviewed the 
source code for mypy and pylint. Mypy and pylint follow this algorithm in 
very different ways! These programs are large and complex. There is no 
obvious way to speed them up!

*Leonine linting*

If improving pylint is out of the question, perhaps we could find a simple 
way to ensure that Leo's c, g, and p vars always mean what we think they 
should mean.

The simplest prototype of such an approach is to use cff to search for 
assignments. I used four cffs with these regexs:

\bg\b\s*= # g =

\bp[0-9]?\b\s*=\s*[A-Z][_\w]+\s*\( # p = Class name

\b[_\w]\s*=\s*(.*?)\bPosition\b    # var = Position

\b[_\w]\s*=\s*(.*?)\bCommands\b    # var = Commands

The results provide an intuitive view of how Leo defines c, g, and p!

The next step would be to transliterate these *text* patterns into searches 
on the parse trees. The transliteration should be straightforward! Indeed, 
one searches (in *any* order) for Assign, AugAssign, or AnnAssign nodes. 
Having found the assignment, one can traverse the subtree looking for c, g, 
p, Position, etc. Maybe one could even generalize the transliteration, but 
that seems unnecessary.

*A puzzle*

I'm relating my recent journey out of order. The first puzzle I set myself 
was to traverse (parse) trees using neither recursion (including ast.walk) 
nor generators. Yes, this puzzle seems quixotic, but I do like puzzles :-)

I don't remember exactly why I first wanted to solve this puzzle. I 
remember telling Rebecca that all the generators in leoAst.py seemed like 
the Hufflepuff way. I wanted a "Ravenclaw way :-) But this is a bit unfair 
to myself. There are lots of clever tricks in leoAst.py, as I documented 
years ago.

Anyway, after several days (!!) of thought, I saw the answer. Indeed, Leo 
*already* traverses outlines using neither generators nor recursion. The 
key is Leo's Position class.

Maybe the original problem was contrived, but the solution is notable. For 
any traversal, there is a Position class that yields that traversal. For 
example, Leo's c.all_positions generator is:
def all_positions(self, copy=True):
    c = self
    p = c.rootPosition()
    while p:
        yield p.copy() if copy else p
        p.moveToThreadNext()

p.moveToThreadNext defines "outline order" *iteratively, *without 
generators or recursion. Take a look. So for *any *desired traversal, we 
can define a *generalized Position class* with a new moveToThreadNext 
method. Do you see? 

moveToThreadNext might be complex. For example, for ast trees, 
moveToThreadNext would examine the type of each tree node, and traverse the 
tree accordingly. Note: Position's stack eliminates the need for recursion. 
The *generalized stack *must contain enough data to move through the entire 
tree.

moveToThreadNext (roughly) corresponds to all the visitors (generators) in 
leoAst.py. This approach is, at the least, a revealing change in point of 
view.

Aside: TokenOrderTraverser.traverse (in leoAst.py) corresponds exactly to 
p.moveToThreadNext. TOT.traverse manipulates a stack explicitly. So a 
generalized moveToThreadNext can be a function. Defining a generalized 
Position class is optional.

*Summary*

The Hindley-Milner algorithm is common sense dressed up in forbidding 
notation. One traverses the tree (in *any*) order, recording inferences.

For most languages, the Hindley-Milner algorithm is merely a roadmap. 
Bookkeeping is the hard part. Pylint and mypy do the bookkeeping in very 
different ways.

Straightforward scripts (transliterations of regexs) can probably check the 
types of all uses of c, g, and p within Leo's code base.

It is straightforward to define any conceivable tree traversal using 
neither recursion nor generators. Just define a *generalized 
moveToThreadNext function*. Optionally, moveToThreadNext can be a member of 
a *generalized Position class*.

That's all for today. Stay tuned.

Edward

-- 
You received this message because you are subscribed to the Google Groups 
"leo-editor" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/leo-editor/16513f8d-a3be-4c25-b276-a6ff257ab479n%40googlegroups.com.

Reply via email to