Hi,

As I mentioned, NEWLINE is a token. All uppercase words in the grammar are tokens, so they are produced by the lexer, not the parser. It is not a built-in rule. In particular, that token is produced here:
https://github.com/python/cpython/blob/6777e09166fc384ea0a4b50202c7b0bd7a23330c/Parser/tokenizer.c#L1773

On Wed, 26 Oct 2022 at 20:59, David J W <ward.dav...@gmail.com> wrote:

> Pablo,
>
> Nl and Newline are tokens but I am interested in NEWLINE's behavior in
> the Python grammar, note the casing.
>
> For example in simple_stmts @
> https://github.com/python/cpython/blob/main/Grammar/python.gram#L107
>
> Is that NEWLINE some sort of built in rule to the grammar? In my project
> I am running into problems where the parser crashes any time there is some
> double like NL & N or Newline & NL but I want to nail down NEWLINE's
> behavior in CPython's PEG grammar.
>
> On Wed, Oct 26, 2022 at 12:51 PM Pablo Galindo Salgado <
> pablog...@gmail.com> wrote:
>
>> Hi,
>>
>> I am not sure I understand exactly what you are asking but NEWLINE is a
>> token, not a parser rule. What decides when NEWLINE is emitted is the lexer
>> that has nothing to do with PEG. Normally PEG parsers also acts as
>> tokenizers but the one in cpython does not.
>>
>> Also notice that CPython’s parser uses a version of the tokeniser written
>> in C that doesn’t share code with the exposed version. You will find that
>> the tokenizer module in the standard library actually behaves differently
>> regarding what tokens are emitted in new lines and indentations.
>>
>> The only way to be sure is check the code unfortunately.
>>
>> Hope this helps.
>>
>> Regards from rainy London,
>> Pablo Galindo Salgado
>>
>> > On 26 Oct 2022, at 19:12, David J W <ward.dav...@gmail.com> wrote:
>> >
>> > I am writing a Rust version of Python for fun and I am at the parser
>> > stage of development.
>> >
>> > I copied and modified a PEG grammar ruleset from another open source
>> > project and I've already noticed some problems (ex Newline vs NL) with how
>> > they transcribed things.
>> >
>> > I am suspecting that CPython's grammar NEWLINE is a builtin rule for
>> > the parser that is something like `(Newline+ | NL+ ) {NOP}` but wanted to
>> > sanity check if that is right before I figure out how to hack in a NEWLINE
>> > rule and update my grammar ruleset.
>> > _______________________________________________
>> > Python-Dev mailing list -- python-dev@python.org
>> > To unsubscribe send an email to python-dev-le...@python.org
>> > https://mail.python.org/mailman3/lists/python-dev.python.org/
>> > Message archived at
>> > https://mail.python.org/archives/list/python-dev@python.org/message/NMCMEDMEBKATYKRNZLX2NDGFOB5UHQ5A/
>> > Code of Conduct: http://python.org/psf/codeofconduct/
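
For anyone following the NEWLINE versus NL question in this thread, here is a small sketch of how the two tokens differ. It uses the standard-library tokenize module, which, as noted in the thread, is not guaranteed to match the C tokenizer that feeds CPython's parser, so treat it as an approximation and check tokenizer.c for the real behaviour. The helper name token_names and the sample source are made up for illustration.

```python
import io
import tokenize


def token_names(source: str) -> list[str]:
    """Return the token type names the stdlib tokenizer emits for source."""
    readline = io.StringIO(source).readline
    return [tokenize.tok_name[tok.type] for tok in tokenize.generate_tokens(readline)]


# Hypothetical sample: two logical lines, a blank line, a comment-only line,
# and line breaks inside parentheses.
source = "x = 1\n\n# comment\ny = (\n    2\n)\n"
names = token_names(source)

for name in names:
    print(name)
```

With the stdlib tokenizer, the end of each logical line (after `x = 1` and after the closing parenthesis) comes out as NEWLINE, while the blank line, the comment-only line and the line breaks inside the parentheses come out as NL. That suggests a grammar only needs to see NEWLINE as a plain terminal, with NL handled (or dropped) before parsing rather than by a built-in rule.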
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/5ZV7BZOYHW3DELYIB4GKRWHUNTYW3V4K/
Code of Conduct: http://python.org/psf/codeofconduct/