Currently, when you add a new token, you need to change several files:

* Include/token.h
* _PyParser_TokenNames in Parser/tokenizer.c
* PyToken_OneChar(), PyToken_TwoChars() or PyToken_ThreeChars() in Parser/tokenizer.c
* Lib/token.py (generated from Include/token.h)
* EXACT_TOKEN_TYPES in Lib/tokenize.py
* Operator, Bracket or Special in Lib/tokenize.py
* Doc/library/token.rst

It is possible to generate all of this information from a single source. The patch proposed in [1] uses Lib/token.py as the initial source. But perhaps Lib/token.py itself should be generated from some file in a generic format? Some of the information can be derived from Grammar/Grammar, but not all of it. A mapping between token strings ('(' or '>=') and names (LPAR, GREATEREQUAL) is also needed. Could this be added to Grammar/Grammar or to a new file?
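
For illustration, a minimal sketch of what such a generator could look
like, assuming a hypothetical single-source file with one "NAME 'string'"
pair per line (e.g. LPAR '('); the file name, format and function names
below are invented for this sketch, not taken from the patch in [1]:

    def parse_token_definitions(path):
        # Each non-blank, non-comment line is "NAME" or "NAME 'string'",
        # e.g.  LPAR '('  or  GREATEREQUAL '>='  or just  NEWLINE.
        tokens = []
        with open(path) as f:
            for line in f:
                line = line.partition('#')[0].strip()
                if not line:
                    continue
                name, _, string = line.partition(' ')
                tokens.append((name, string.strip().strip("'") or None))
        return tokens

    def gen_token_h(tokens):
        # The #define block for Include/token.h; Lib/token.py and
        # _PyParser_TokenNames could be emitted the same way.
        return '\n'.join('#define %-15s %d' % (name, number)
                         for number, (name, _) in enumerate(tokens))

    def gen_exact_token_types(tokens):
        # The EXACT_TOKEN_TYPES mapping for Lib/tokenize.py.
        return 'EXACT_TOKEN_TYPES = {\n%s,\n}' % ',\n'.join(
            "    %r: %s" % (string, name)
            for name, string in tokens if string is not None)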

There is a related problem: the tokenize module uses three additional tokens (COMMENT, NL and ENCODING) not used by the C tokenizer, and it modifies the content of the token module after importing it, which is not good. [2] One solution is to make a copy of tok_name in tokenize before modifying it, but this doesn't work, because third-party code looks tokenize's constants up in token.tok_name. Another solution is to add the tokenize-specific constants to the token module itself. Is it good to expose, in the token module, tokens that are not used by the C tokenizer?
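
To make the problem concrete, this is roughly (and simplified) what
tokenize does today:

    from token import tok_name, N_TOKENS

    # tok_name is the very dict from the token module, not a copy, so
    # these assignments mutate token.tok_name for everybody:
    COMMENT = N_TOKENS
    tok_name[COMMENT] = 'COMMENT'
    NL = N_TOKENS + 1
    tok_name[NL] = 'NL'
    ENCODING = N_TOKENS + 2
    tok_name[ENCODING] = 'ENCODING'

    # Copying first (tok_name = dict(tok_name)) would leave the token
    # module untouched, but then third-party code that does
    # token.tok_name[tokenize.COMMENT] stops working.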

Non-terminal symbols are already generated automatically: Lib/symbol.py from Include/graminit.h, and Include/graminit.h and Python/graminit.c from Grammar/Grammar by Parser/pgen. Is it worth generating Lib/symbol.py with pgen too? Could pgen be implemented in Python?
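
For comparison, the Include/graminit.h -> Lib/symbol.py step is
essentially a small text transformation. A minimal sketch (not the
actual regeneration code shipped in Lib/symbol.py) might look like:

    import re

    DEFINE_RE = re.compile(r'#define\s+(\w+)\s+(\d+)')

    def gen_symbol_constants(graminit_h='Include/graminit.h'):
        # Turn "#define single_input 256" into "single_input = 256";
        # Lib/symbol.py then builds its reverse sym_name mapping from
        # these module-level constants.
        with open(graminit_h) as f:
            return '\n'.join('%s = %s' % (m.group(1), m.group(2))
                             for m in map(DEFINE_RE.match, f) if m)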

See also the similar issue for opcodes. [3]

[1] https://bugs.python.org/issue30455
[2] https://bugs.python.org/issue25324
[3] https://bugs.python.org/issue17861
