This Engineering Notebook post discusses the difficulties that *any*
Python importer must face. To state my conclusions first:

1. Generating the proper whitespace before @others in *all* cases
   requires:
   A: Some form of look-ahead or, equivalently, delayed code generation.
   B: What amounts to a full *parse* of def and class lines.
2. I am willing to let the importer assume 4-space indentation for @others
   in class nodes. In effect, this is what the legacy Py_Importer class does!
*Background*
Vitalije's new importer has trouble importing
mypy/test-data/stdlib-samples/3.2/test/test_textwrap.py. The file *is* imported
perfectly, but many nodes are over-indented due to missing indentation in
`@others` directives in the class nodes.
The relevant code in the mknode function is:

    o = indent('@others\n', ind-l_ind)
    ...
    p.b = f'{b1}{o}{b2}'

Alas, the value ind-l_ind won't work in all cases! Instead, I suggest
using the value 4 for all classes :-) That's exactly what the legacy
importer does!
Yes, this would break the strangely-indented unit tests, but I'm willing to
live with that.
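
The suggestion above might be sketched as follows. This is only an illustration, not the actual mknode code: others_line is a hypothetical helper, computed_width stands in for the value ind-l_ind, and the standard library's textwrap.indent is an assumed substitute for the importer's own indent function.

```python
import textwrap

def others_line(is_class: bool, computed_width: int) -> str:
    """Return the @others line, using a fixed 4-space indent for classes.

    Hypothetical helper: computed_width plays the role of ind-l_ind.
    """
    width = 4 if is_class else computed_width
    # textwrap.indent prefixes each non-blank line with the given string.
    return textwrap.indent('@others\n', ' ' * width)
```

With this rule, class nodes always get four spaces before @others, whatever the computed value happens to be.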
*The heroic alternative*
Generating the correct indentation for @others in *all* cases is much more
difficult. Indeed, the indentation of the @others line must be the
indentation of the *first significant line* following the class or def
line. The first significant line is the first line that is not:
- A blank or a comment.
- In a string.
The legacy Py_Importer class detects this line fairly easily: it is the
first non-blank, non-comment line for which Python_ScanState.in_context
returns False:

    def in_context(self):
        """True if in a special context."""
        return (
            self.context or
            self.curlies > 0 or  # Open curly brackets.
            self.parens > 0 or  # Open parentheses.
            self.squares > 0 or  # Open square brackets.
            self.bs_nl  # In backslash/newline.
        )

Ironically, having gone through all this trouble, my legacy importer *still*
assumes 4-space indentation! In theory, the importer *could* get the
indentation right. In practice, it's dashed difficult to do so!
The split_root function (or its helpers) would *also* have to find the
first significant line of a class! In effect, the new importer would have
to do a full parse of the entire class or def line.
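
To make the cost concrete, here is a rough sketch of such a look-ahead using Python's standard tokenize module, which already tracks strings, brackets, and backslash continuations. It is not code from either importer: first_body_indent is a hypothetical helper, and it assumes the source text tokenizes cleanly.

```python
import io
import tokenize
from typing import Optional

def first_body_indent(source: str) -> Optional[str]:
    """Return the leading whitespace of the first significant line that
    follows the first class or def line in source.

    Return None if there is no class or def, or if its body is on the
    same line as the header (no indented block follows).
    """
    in_header = False    # True once 'class' or 'def' has been seen.
    header_done = False  # True once the header's logical line has ended.
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if not in_header:
            if tok.type == tokenize.NAME and tok.string in ('class', 'def'):
                in_header = True
        elif not header_done:
            # NEWLINE ends a logical line; line breaks inside brackets
            # are reported as NL, so multi-line headers are handled.
            if tok.type == tokenize.NEWLINE:
                header_done = True
        elif tok.type in (tokenize.NL, tokenize.COMMENT):
            continue  # Skip blank lines and comment-only lines.
        elif tok.type == tokenize.INDENT:
            return tok.string  # The whitespace of the first body line.
        else:
            return None
    return None
```

Even this small helper must tokenize everything between the class or def line and the first body line, which is the look-ahead described above.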
*Summary*
The python importer contains analogs of all the phases of an optimizing
compiler. The incoming code must be tokenized and maybe even parsed. Code
generation will never be easy.
In class or def nodes, the leading whitespace of the @others directive
should be the leading whitespace of the first significant line of the class
or def. Finding the first significant line of a class or def requires a full
parse.
Importers can avoid the parse phase only if they assume 4-space
indentation! I am willing to make this concession, and I am willing to
abandon (parts of) the unit tests for strangely-indented code.
Edward