[Python-Dev] Re: PEP 617 -- New PEG parser for CPython

2020-10-08 Thread Pablo Galindo Salgado
Our experience with automatic testing is that, unfortunately, it is very
difficult to surface real problems with it. We tried some of the new
experimental source generators built on top of Hypothesis (
https://pypi.org/project/hypothesmith/) and sadly they did not catch many of
the important things that parsing existing source or the test suite caught
immediately. This is because the space of possible programs is infinite, so
exploring it for parser failures without any constraint on what to explore,
or on the actual structure of the parser, won't give you much. If you
consider the collection of programs that do not belong to the language, the
space is even larger.
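
(For reference, the property-based checks we tried looked roughly like the
sketch below; it assumes hypothesmith's from_grammar() strategy, and the test
body is illustrative rather than the exact one we ran.)

import ast

from hypothesis import given, settings
from hypothesmith import from_grammar

@settings(max_examples=500)
@given(source=from_grammar())
def test_generated_source_parses(source):
    # Everything the strategy produces is meant to be valid Python,
    # so the parser must neither crash nor raise SyntaxError on it.
    tree = ast.parse(source)
    assert isinstance(tree, ast.Module)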

For example, we had a bug at some point that manifested only when an
f-string had a specific level of nesting, although normal f-strings parsed
just fine. Catching this automatically, without knowing what to look for, is
very unlikely.

As one cannot prove that two parsers accept the same language or that two
grammars are equivalent (the problem is undecidable in general, like the
halting problem), this is normally addressed by testing both parsers against
a large enough corpus of both positive and negative cases. To improve our
confidence in the negative cases, we would first need a good enough corpus of
invalid programs to test against.
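
(Concretely, the harness over such a corpus would not need to be much more
than this sketch; the negative_corpus directory and its layout are
hypothetical.)

import pathlib

def check_rejections(corpus_dir):
    """Return the known-invalid snippets that the parser wrongly accepted."""
    false_accepts = []
    for path in sorted(pathlib.Path(corpus_dir).glob("*.py")):
        source = path.read_text(encoding="utf-8")
        try:
            compile(source, str(path), "exec")
        except SyntaxError:
            continue                    # correctly rejected
        false_accepts.append(path)      # accepted, but should not have been
    return false_accepts

print(len(check_rejections("negative_corpus")), "snippets were wrongly accepted")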

On Thu, 8 Oct 2020, 20:49 Daniel Moisset,  wrote:

> In this case, you can use the old parser as an oracle, at least for python
> 3.8 syntax. The new parser should produce a syntax error if and only if the
> old one does. And when it doesn't, the AST should be the same, I guess (I'm
> not sure whether the AST structure changed).
>
> On Thu, 8 Oct 2020, 03:12 Terry Reedy,  wrote:
>
>> On 10/6/2020 2:02 PM, Guido van Rossum wrote:
>> > That's appreciated, but I think what's needed more is someone who
>> > actually wants to undertake this project. It's not just a matter of
>> > running a small script for hours -- someone will have to come up with a
>> > way to fuzz that is actually useful for this particular situation
>> > (inserting random characters in files isn't going to be very
>> effective).
>>
>> Changes should be by token or broader grammatical construct.  However,
>> the real difficulty in auto testing is the lack of a decision mechanism
>> for correct output (code object versus SyntaxError) other than the tests
>> being either designed or checked by parser experts.
>>
>> Classical fuzzing looks for something clearly wrong -- a crash -- rather than
>> an answer *or* an Exception.  So yes, random fuzzing that does not pass
>> known limits could be done to look for crashes.  But this is different
>> from raising SyntaxError to reject wrong programs.
>>
>> Consider unary prefix operators:
>>
>> *a is a SyntaxError, because the grammar circumscribes the use of '*' as
>> prefix.
>>
>> -'' is not, which might surprise some, but I presume the error not being
>> caught until runtime, as a TypeError, is correct for the grammar as
>> written.  Or did I just discover a parser bug?  Or a possible grammar
>> improvement?
>> (In other words, if I, even with my experience, tried grammar/parser
>> fuzzing, I might be as much a nuisance as a help.)
>>
>> It would not necessarily be a regression if the grammar and parser were
>> changed so that an obvious error like "- " were to be
>> caught as a SyntaxError.
>>
>>
>> > "(One area we have not explored extensively is rejection of all
>> > wrong programs.
>>
>> I consider false rejection to be a bigger sin than false acceptance.
>> Wrong programs, like "-''", are likely to fail at runtime anyway.  So
>> one could test acceptance of randomly generated correct but not
>> ridiculously big programs.  But I guess compiling the stdlib and other
>> packages already pretty well covered this.
>>
>> >  We have unit tests that check for a certain number
>> > of explicit rejections, but more work could be done, e.g. by using a
>> > fuzzer that inserts random subtle bugs into existing code. We're
>> > open to help in this area.)"
>> > https://www.python.org/dev/peps/pep-0617/#validation
>>
>>
>> --
>> Terry Jan Reedy

[Python-Dev] Re: PEP 617 -- New PEG parser for CPython

2020-10-08 Thread Daniel Moisset
In this case, you can use the old parser as an oracle, at least for Python
3.8 syntax. The new parser should produce a syntax error if and only if the
old one does. And when it doesn't, the AST should be the same, I guess (I'm
not sure whether the AST structure changed).
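
(Roughly, and only as a sketch: this assumes a CPython 3.9 build, where the
`-X oldparser` option described in PEP 617 selects the old LL(1) parser, and
the helper names below are made up.)

import subprocess
import sys
import textwrap

# Child program: print the AST dump, or a marker when parsing fails.
PROBE = textwrap.dedent("""
    import ast, sys
    try:
        print(ast.dump(ast.parse(sys.stdin.read())))
    except SyntaxError:
        print("SYNTAX-ERROR")
""")

def parse_result(source, old_parser):
    cmd = [sys.executable]
    if old_parser:
        cmd += ["-X", "oldparser"]       # PEP 617: fall back to the old parser
    cmd += ["-c", PROBE]
    proc = subprocess.run(cmd, input=source, text=True, capture_output=True)
    return proc.stdout.strip()

def parsers_agree(source):
    # Same rejection behaviour and, when accepted, the same AST dump.
    return parse_result(source, True) == parse_result(source, False)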

On Thu, 8 Oct 2020, 03:12 Terry Reedy,  wrote:

> On 10/6/2020 2:02 PM, Guido van Rossum wrote:
> > That's appreciated, but I think what's needed more is someone who
> > actually wants to undertake this project. It's not just a matter of
> > running a small script for hours -- someone will have to come up with a
> > way to fuzz that is actually useful for this particular situation
> > (inserting random characters in files isn't going to be very effective).
>
> Changes should be by token or broader grammatical construct.  However,
> the real difficulty in auto testing is the lack of a decision mechanism
> for correct output (code object versus SyntaxError) other than the tests
> being either designed or checked by parser experts.
>
> Classical fuzzing looks for something clearly wrong -- a crash -- rather than
> an answer *or* an Exception.  So yes, random fuzzing that does not pass
> known limits could be done to look for crashes.  But this is different
> from raising SyntaxError to reject wrong programs.
>
> Consider unary prefix operators:
>
> *a is a SyntaxError, because the grammar circumscribes the use of '*' as
> prefix.
>
> -'' is not, which might surprise some, but I presume the error not being
> caught until runtime, as a TypeError, is correct for the grammar as
> written.  Or did I just discover a parser bug?  Or a possible grammar
> improvement?
> (In other words, if I, even with my experience, tried grammar/parser
> fuzzing, I might be as much a nuisance as a help.)
>
> It would not necessarily be a regression if the grammar and parser were
> changed so that an obvious error like "- " were to be
> caught as a SyntaxError.
>
>
> > "(One area we have not explored extensively is rejection of all
> > wrong programs.
>
> I consider false rejection to be a bigger sin than false acceptance.
> Wrong programs, like "-''", are likely to fail at runtime anyway.  So
> one could test acceptance of randomly generated correct but not
> ridiculously big programs.  But I guess compiling the stdlib and other
> packages already pretty well covered this.
>
> >  We have unit tests that check for a certain number
> > of explicit rejections, but more work could be done, e.g. by using a
> > fuzzer that inserts random subtle bugs into existing code. We're
> > open to help in this area.)"
> > https://www.python.org/dev/peps/pep-0617/#validation
>
>
> --
> Terry Jan Reedy


[Python-Dev] Re: PEP 617 -- New PEG parser for CPython

2020-10-07 Thread Terry Reedy

On 10/6/2020 2:02 PM, Guido van Rossum wrote:
That's appreciated, but I think what's needed more is someone who 
actually wants to undertake this project. It's not just a matter of 
running a small script for hours -- someone will have to come up with a 
way to fuzz that is actually useful for this particular situation 
(inserting random characters in files isn't going to be very effective). 


Changes should be by token or broader grammatical construct.  However, 
the real difficulty in auto testing is the lack of a decision mechanism 
for correct output (code object versus SyntaxError) other than the tests 
being either designed or checked by parser experts.


Classical fuzzing looks for something clearly wrong -- a crash -- rather than 
an answer *or* an Exception.  So yes, random fuzzing that does not pass 
known limits could be done to look for crashes.  But this is different 
from raising SyntaxError to reject wrong programs.
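
(A token-level mutation harness of the crash-hunting kind might look roughly
like the sketch below -- it decides nothing about whether a SyntaxError is the
*correct* answer; it only looks for outcomes other than a code object or a
SyntaxError.)

import io
import random
import tokenize

def mutate_by_token(source):
    """Replace one randomly chosen single-line token with another token's text."""
    toks = [t for t in tokenize.generate_tokens(io.StringIO(source).readline)
            if t.start[0] == t.end[0] and t.string.strip()]
    victim = random.choice(toks)
    donor = random.choice(toks)
    lines = source.splitlines(keepends=True)
    row = victim.start[0] - 1
    line = lines[row]
    lines[row] = line[:victim.start[1]] + donor.string + line[victim.end[1]:]
    return "".join(lines)

def fuzz_once(source):
    try:
        compile(mutate_by_token(source), "<fuzz>", "exec")
    except SyntaxError:
        pass    # rejection is an acceptable outcome here
    # any other exception -- or a hard crash -- is the kind of find we want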


Consider unary prefix operators:

*a is a SyntaxError, because the grammar circumscribes the use of '*' as 
prefix.


-'' is not, which might surprise some, but I presume the error not being 
caught until runtime, as a TypeError, is correct for the grammar as 
written.  Or did I just discover a parser bug?  Or a possible grammar 
improvement?
(In other words, if I, even with my experience, tried grammar/parser 
fuzzing, I might be as much a nuisance as a help.)
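
(For anyone who wants to reproduce the two cases at the interpreter -- the
exact error messages vary a little between versions:)

compile("*a", "<test>", "eval")          # SyntaxError: rejected by the grammar
code = compile("-''", "<test>", "eval")  # accepted by the grammar...
eval(code)                               # ...TypeError only once it runs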


It would not necessarily be a regression if the grammar and parser were 
changed so that an obvious error like "- " were to be 
caught as a SyntaxError.




"(One area we have not explored extensively is rejection of all
wrong programs. 


I consider false rejection to be a bigger sin than false acceptance. 
Wrong programs, like "-''", are likely to fail at runtime anyway.  So 
one could test acceptance of randomly generated correct but not 
ridiculously big programs.  But I guess compiling the stdlib and other 
packages already pretty well covered this.
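
(That check is also cheap to redo at any time; a sketch -- note that a few
files under Lib/test contain deliberately bad syntax and would have to be
skipped or expected to fail.)

import pathlib
import sysconfig

stdlib = pathlib.Path(sysconfig.get_paths()["stdlib"])
failures = []
for path in stdlib.rglob("*.py"):
    source = path.read_text(encoding="utf-8", errors="replace")
    try:
        compile(source, str(path), "exec")
    except SyntaxError as exc:
        failures.append((path, exc))
print(len(failures), "stdlib files failed to compile")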



 We have unit tests that check for a certain number
of explicit rejections, but more work could be done, e.g. by using a
fuzzer that inserts random subtle bugs into existing code. We're
open to help in this area.)"
https://www.python.org/dev/peps/pep-0617/#validation



--
Terry Jan Reedy


[Python-Dev] Re: PEP 617 -- New PEG parser for CPython

2020-10-06 Thread Guido van Rossum
That's appreciated, but I think what's needed more is someone who actually
wants to undertake this project. It's not just a matter of running a small
script for hours -- someone will have to come up with a way to fuzz that is
actually useful for this particular situation (inserting random characters
in files isn't going to be very effective). If someone wants to give that a
try I'm sure they'd be delighted to use up the rest of your AWS budget. :-)

On Tue, Oct 6, 2020 at 7:02 AM Brett Lovgren 
wrote:

> I have access to a small Amazon Web Services credit through the end of
> November 2020. I'd be happy to let your team use that credit in support of
> the fuzzing validation mentioned below.
>
> "(One area we have not explored extensively is rejection of all wrong
> programs. We have unit tests that check for a certain number of explicit
> rejections, but more work could be done, e.g. by using a fuzzer that
> inserts random subtle bugs into existing code. We're open to help in this
> area.)"
> https://www.python.org/dev/peps/pep-0617/#validation
>
> Sincerely,
> Brett


-- 
--Guido van Rossum (python.org/~guido)
Pronouns: he/him (why is my pronoun here?)



[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-05-06 Thread David Mertz
Hi Guido, Pablo & Lysandros,

I'm excited about this improvement to Python, and was interested to hear
about it at the language summit as well.  I happen to be friends with
Alessandro Warth, whom you cited in the PEP as developing the packrat
parsing technique you use (at least in part).  I wrote to him to ask if he
knew he was being cited, and he responded in part with these comments.  The
additional link may perhaps be useful for you:

Alex: (If they had gotten in touch, I would have pointed them at my
> dissertation, which I think had a simpler description of that algorithm.
> There's also the Ohm implementation [https://github.com/harc/ohm], where
> I figured out how to simplify it further.)
>


-- 
The dead increasingly dominate and strangle both the living and the
not-yet born.  Vampiric capital and undead corporate persons abuse
the lives and control the thoughts of homo faber. Ideas, once born,
become abortifacients against new conceptions.


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-21 Thread Gregory P. Smith
On Tue, Apr 21, 2020 at 9:35 PM Gregory P. Smith  wrote:

> Could we go ahead and mark lib2to3 as Pending Deprecation in 3.9 so we can
> get it out of the stdlib by 3.11 or 3.12?
>

I'm going ahead and tracking the idea in https://bugs.python.org/issue40360.


>
> lib2to3 is the basis of all sorts of general source code manipulation
> tooling.  Its name and original raison d'être have moved on.  It is
> actively used to parse and rewrite Python 3 code all the time.  yapf uses
> it, Black uses a fork of it.  Other Python code manipulation tooling uses
> it.  Modernize-like fixers are useful for all sorts of cleanups.
>
> IMNSHO it would be better if lib2to3 were *not* in the stdlib anymore -
> Black already chose to fork lib2to3.  So given that it is
> eventually not going to be able to parse future syntax, the better answer
> seems like deprecation, putting the final version up on PyPI and letting
> any descendants of it live on PyPI where they can get more active care than
> a stdlib module ever does.
>
> -gps
>
>
> On Tue, Apr 21, 2020 at 6:58 PM Guido van Rossum  wrote:
>
>> Great! Please submit a PR to update the [lib]2to3 docs and CC me
>> (@gvanrossum).
>>
>> While perhaps it wouldn't hurt if the PEP mentioned lib2to3, it was just
>> accepted by the Steering Council without such language, and I wouldn't want
>> to imply that the SC agrees with everything I said. So I still think we
>> ought to deal with lib2to3 independently (and no, it won't need its own PEP
>> :-). A reasonable option would be to just deprecate it and recommend people
>> use parso, LibCST or something else (I wouldn't recommend pegen in its
>> current form yet).
>>
>> On Tue, Apr 21, 2020 at 6:21 PM Carl Meyer  wrote:
>>
>>> On Sat, Apr 18, 2020 at 10:38 PM Guido van Rossum 
>>> wrote:
>>> >
>>> > Note that, while there is indeed a docs page about 2to3, the only docs
>>> for lib2to3 in the standard library reference are a link to the source code
>>> and a single "Note: The lib2to3 API should be considered unstable and may
>>> change drastically in the future."
>>> >
>>> > Fortunately,  in order to support the 2to3 application, lib2to3
>>> doesn't need to change, because the syntax of Python 2 is no longer
>>> changing. :-) Choosing to remove 2to3 is an independent decision. And
>>> lib2to3 does not depend in any way on the old parser module. (It doesn't
>>> even use the standard tokenize module, but incorporates its own version
>>> that is slightly tweaked to support Python 2.)
>>>
>>> Indeed! Thanks for clarifying, I now recall that I already knew it
>>> doesn't, but forgot.
>>>
>>> The docs page for 2to3 does currently say "lib2to3 could also be
>>> adapted to custom applications in which Python code needs to be edited
>>> automatically." Perhaps at least this sentence should be removed, and
>>> maybe also replaced with a clearer note that lib2to3 not only has an
>>> unstable API, but also should not necessarily be expected to continue
>>> to parse future Python versions, and thus building tools on top of it
>>> should be discouraged rather than recommended. (Maybe even use the
>>> word "deprecated.") Happy to submit a PR for this if you agree it's
>>> warranted.
>>>
>>> It still seems to me that it wouldn't hurt for PEP 617 itself to also
>>> mention this shift in lib2to3's effective status (from "available but
>>> no API stability guarantee" to "probably will not parse future Python
>>> versions") as one of its indirect effects.
>>>
>>> > You've mentioned a few different tools that already use different
>>> technologies: LibCST depends on parso which has a fork of pgen2, lib2to3
>>> which has the original pgen2. I wonder if this would be an opportunity to
>>> move such parsing support out of the standard library completely. There are
>>> already two versions of pegen, but neither is in the standard library:
>>> there is the original pegen repo which is where things started, and there
>>> is a fork of that code in the CPython Tools directory (not yet in the
>>> upstream repo, but see PR 19503).
>>> >
>>> > The pegen tool has two generators, one generating C code and one
>>> generating Python code. I think that the C generator is really only
>>> relevant for CPython itself: it relies on the builtin tokenizer (the one
>>> written in C, not the stdlib tokenize.py) and the generated C code depends
>>> on many internal APIs. In fact the C generator in the original pegen repo
>>> doesn't work with Python 3.9 because those internal APIs are no longer
>>> exported. (It also doesn't work with Python 3.7 or older because it makes
>>> critical use of the walrus operator. :-) Also, once we started getting
>>> serious about replacing the old parser, we worked exclusively on the C
>>> generator in the CPython Tools directory, so the version in the original
>>> pegen repo is lagging quite a bit behind (as is the Python grammar in that
>>> repo). But as I said you're not gonna need it.
>>> 

[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-21 Thread Gregory P. Smith
Could we go ahead and mark lib2to3 as Pending Deprecation in 3.9 so we can
get it out of the stdlib by 3.11 or 3.12?

lib2to3 is the basis of all sorts of general source code manipulation
tooling.  Its name and original raison d'être have moved on.  It is
actively used to parse and rewrite Python 3 code all the time.  yapf uses
it, Black uses a fork of it.  Other Python code manipulation tooling uses
it.  Modernize-like fixers are useful for all sorts of cleanups.
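
(The core of what those tools get from it is tiny -- a minimal sketch, using
the lib2to3 API as it stands in 3.8, which the docs already call unstable:
parse to a lossless tree and render it back unchanged.)

from lib2to3 import pygram, pytree
from lib2to3.pgen2 import driver

d = driver.Driver(pygram.python_grammar_no_print_statement,
                  convert=pytree.convert)
tree = d.parse_string("x = 1  # comments survive the round trip\n")
assert str(tree) == "x = 1  # comments survive the round trip\n"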

IMNSHO it would be better if lib2to3 were *not* in the stdlib anymore -
Black already chose to fork lib2to3.  So given that it is
eventually not going to be able to parse future syntax, the better answer
seems like deprecation, putting the final version up on PyPI and letting
any descendants of it live on PyPI where they can get more active care than
a stdlib module ever does.

-gps


On Tue, Apr 21, 2020 at 6:58 PM Guido van Rossum  wrote:

> Great! Please submit a PR to update the [lib]2to3 docs and CC me
> (@gvanrossum).
>
> While perhaps it wouldn't hurt if the PEP mentioned lib2to3, it was just
> accepted by the Steering Council without such language, and I wouldn't want
> to imply that the SC agrees with everything I said. So I still think we
> ought to deal with lib2to3 independently (and no, it won't need its own PEP
> :-). A reasonable option would be to just deprecate it and recommend people
> use parso, LibCST or something else (I wouldn't recommend pegen in its
> current form yet).
>
> On Tue, Apr 21, 2020 at 6:21 PM Carl Meyer  wrote:
>
>> On Sat, Apr 18, 2020 at 10:38 PM Guido van Rossum 
>> wrote:
>> >
>> > Note that, while there is indeed a docs page about 2to3, the only docs
>> for lib2to3 in the standard library reference are a link to the source code
>> and a single "Note: The lib2to3 API should be considered unstable and may
>> change drastically in the future."
>> >
>> > Fortunately,  in order to support the 2to3 application, lib2to3 doesn't
>> need to change, because the syntax of Python 2 is no longer changing. :-)
>> Choosing to remove 2to3 is an independent decision. And lib2to3 does not
>> depend in any way on the old parser module. (It doesn't even use the
>> standard tokenize module, but incorporates its own version that is slightly
>> tweaked to support Python 2.)
>>
>> Indeed! Thanks for clarifying, I now recall that I already knew it
>> doesn't, but forgot.
>>
>> The docs page for 2to3 does currently say "lib2to3 could also be
>> adapted to custom applications in which Python code needs to be edited
>> automatically." Perhaps at least this sentence should be removed, and
>> maybe also replaced with a clearer note that lib2to3 not only has an
>> unstable API, but also should not necessarily be expected to continue
>> to parse future Python versions, and thus building tools on top of it
>> should be discouraged rather than recommended. (Maybe even use the
>> word "deprecated.") Happy to submit a PR for this if you agree it's
>> warranted.
>>
>> It still seems to me that it wouldn't hurt for PEP 617 itself to also
>> mention this shift in lib2to3's effective status (from "available but
>> no API stability guarantee" to "probably will not parse future Python
>> versions") as one of its indirect effects.
>>
>> > You've mentioned a few different tools that already use different
>> technologies: LibCST depends on parso which has a fork of pgen2, lib2to3
>> which has the original pgen2. I wonder if this would be an opportunity to
>> move such parsing support out of the standard library completely. There are
>> already two versions of pegen, but neither is in the standard library:
>> there is the original pegen repo which is where things started, and there
>> is a fork of that code in the CPython Tools directory (not yet in the
>> upstream repo, but see PR 19503).
>> >
>> > The pegen tool has two generators, one generating C code and one
>> generating Python code. I think that the C generator is really only
>> relevant for CPython itself: it relies on the builtin tokenizer (the one
>> written in C, not the stdlib tokenize.py) and the generated C code depends
>> on many internal APIs. In fact the C generator in the original pegen repo
>> doesn't work with Python 3.9 because those internal APIs are no longer
>> exported. (It also doesn't work with Python 3.7 or older because it makes
>> critical use of the walrus operator. :-) Also, once we started getting
>> serious about replacing the old parser, we worked exclusively on the C
>> generator in the CPython Tools directory, so the version in the original
>> pegen repo is lagging quite a bit behind (as is the Python grammar in that
>> repo). But as I said you're not gonna need it.
>> >
>> > On the other hand, the Python generator is designed to be flexible, and
>> while it defaults to using the stdlib tokenize.py tokenizer, you can easily
>> hook up your own. Putting this version in the stdlib would be a mistake,
>> because 

[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-21 Thread Guido van Rossum
Great! Please submit a PR to update the [lib]2to3 docs and CC me
(@gvanrossum).

While perhaps it wouldn't hurt if the PEP mentioned lib2to3, it was just
accepted by the Steering Council without such language, and I wouldn't want
to imply that the SC agrees with everything I said. So I still think we
ought to deal with lib2to3 independently (and no, it won't need its own PEP
:-). A reasonable option would be to just deprecate it and recommend people
use parso, LibCST or something else (I wouldn't recommend pegen in its
current form yet).

On Tue, Apr 21, 2020 at 6:21 PM Carl Meyer  wrote:

> On Sat, Apr 18, 2020 at 10:38 PM Guido van Rossum 
> wrote:
> >
> > Note that, while there is indeed a docs page about 2to3, the only docs
> for lib2to3 in the standard library reference are a link to the source code
> and a single "Note: The lib2to3 API should be considered unstable and may
> change drastically in the future."
> >
> > Fortunately,  in order to support the 2to3 application, lib2to3 doesn't
> need to change, because the syntax of Python 2 is no longer changing. :-)
> Choosing to remove 2to3 is an independent decision. And lib2to3 does not
> depend in any way on the old parser module. (It doesn't even use the
> standard tokenize module, but incorporates its own version that is slightly
> tweaked to support Python 2.)
>
> Indeed! Thanks for clarifying, I now recall that I already knew it
> doesn't, but forgot.
>
> The docs page for 2to3 does currently say "lib2to3 could also be
> adapted to custom applications in which Python code needs to be edited
> automatically." Perhaps at least this sentence should be removed, and
> maybe also replaced with a clearer note that lib2to3 not only has an
> unstable API, but also should not necessarily be expected to continue
> to parse future Python versions, and thus building tools on top of it
> should be discouraged rather than recommended. (Maybe even use the
> word "deprecated.") Happy to submit a PR for this if you agree it's
> warranted.
>
> It still seems to me that it wouldn't hurt for PEP 617 itself to also
> mention this shift in lib2to3's effective status (from "available but
> no API stability guarantee" to "probably will not parse future Python
> versions") as one of its indirect effects.
>
> > You've mentioned a few different tools that already use different
> technologies: LibCST depends on parso which has a fork of pgen2, lib2to3
> which has the original pgen2. I wonder if this would be an opportunity to
> move such parsing support out of the standard library completely. There are
> already two versions of pegen, but neither is in the standard library:
> there is the original pegen repo which is where things started, and there
> is a fork of that code in the CPython Tools directory (not yet in the
> upstream repo, but see PR 19503).
> >
> > The pegen tool has two generators, one generating C code and one
> generating Python code. I think that the C generator is really only
> relevant for CPython itself: it relies on the builtin tokenizer (the one
> written in C, not the stdlib tokenize.py) and the generated C code depends
> on many internal APIs. In fact the C generator in the original pegen repo
> doesn't work with Python 3.9 because those internal APIs are no longer
> exported. (It also doesn't work with Python 3.7 or older because it makes
> critical use of the walrus operator. :-) Also, once we started getting
> serious about replacing the old parser, we worked exclusively on the C
> generator in the CPython Tools directory, so the version in the original
> pegen repo is lagging quite a bit behind (as is the Python grammar in that
> repo). But as I said you're not gonna need it.
> >
> > On the other hand, the Python generator is designed to be flexible, and
> while it defaults to using the stdlib tokenize.py tokenizer, you can easily
> hook up your own. Putting this version in the stdlib would be a mistake,
> because the code is pretty immature; it is really waiting for a good home,
> and if parso or LibCST were to decide to incorporate a fork of it and
> develop it into a high quality parser generator for Python-like languages
> that would be great. I wouldn't worry much about the duplication of code --
> the Python generator in the CPython Tools directory is only used for one
> purpose, and that is to produce the meta-parser (the parser for grammars)
> from the meta-grammar. And I would happily stop developing the original
> pegen once a fork is being developed.
>
> Thanks, this is all very clarifying! I hadn't even found the original
> gvanrossum/pegen repo, and was just looking at the CPython PR for PEP
> 617. Clearly I haven't been following this work closely.
>
> > Another option would be to just improve the python generator in the
> original pegen repo to satisfy the needs of parso and LibCST. Reading the
> blurb for parso it looks like it really just parses *Python*, which is less
> ambitious than pegen. But it also seems to support error 

[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-21 Thread Carl Meyer
On Sat, Apr 18, 2020 at 10:38 PM Guido van Rossum  wrote:
>
> Note that, while there is indeed a docs page about 2to3, the only docs for 
> lib2to3 in the standard library reference are a link to the source code and a 
> single "Note: The lib2to3 API should be considered unstable and may change 
> drastically in the future."
>
> Fortunately,  in order to support the 2to3 application, lib2to3 doesn't need 
> to change, because the syntax of Python 2 is no longer changing. :-) Choosing 
> to remove 2to3 is an independent decision. And lib2to3 does not depend in any 
> way on the old parser module. (It doesn't even use the standard tokenize 
> module, but incorporates its own version that is slightly tweaked to support 
> Python 2.)

Indeed! Thanks for clarifying, I now recall that I already knew it
doesn't, but forgot.

The docs page for 2to3 does currently say "lib2to3 could also be
adapted to custom applications in which Python code needs to be edited
automatically." Perhaps at least this sentence should be removed, and
maybe also replaced with a clearer note that lib2to3 not only has an
unstable API, but also should not necessarily be expected to continue
to parse future Python versions, and thus building tools on top of it
should be discouraged rather than recommended. (Maybe even use the
word "deprecated.") Happy to submit a PR for this if you agree it's
warranted.

It still seems to me that it wouldn't hurt for PEP 617 itself to also
mention this shift in lib2to3's effective status (from "available but
no API stability guarantee" to "probably will not parse future Python
versions") as one of its indirect effects.

> You've mentioned a few different tools that already use different 
> technologies: LibCST depends on parso which has a fork of pgen2, lib2to3 
> which has the original pgen2. I wonder if this would be an opportunity to 
> move such parsing support out of the standard library completely. There are 
> already two versions of pegen, but neither is in the standard library: there 
> is the original pegen repo which is where things started, and there is a fork 
> of that code in the CPython Tools directory (not yet in the upstream repo, 
> but see PR 19503).
>
> The pegen tool has two generators, one generating C code and one generating 
> Python code. I think that the C generator is really only relevant for CPython 
> itself: it relies on the builtin tokenizer (the one written in C, not the 
> stdlib tokenize.py) and the generated C code depends on many internal APIs. 
> In fact the C generator in the original pegen repo doesn't work with Python 
> 3.9 because those internal APIs are no longer exported. (It also doesn't work 
> with Python 3.7 or older because it makes critical use of the walrus 
> operator. :-) Also, once we started getting serious about replacing the old 
> parser, we worked exclusively on the C generator in the CPython Tools 
> directory, so the version in the original pegen repo is lagging quite a bit 
> behind (as is the Python grammar in that repo). But as I said you're not 
> gonna need it.
>
> On the other hand, the Python generator is designed to be flexible, and while 
> it defaults to using the stdlib tokenize.py tokenizer, you can easily hook up 
> your own. Putting this version in the stdlib would be a mistake, because the 
> code is pretty immature; it is really waiting for a good home, and if parso 
> or LibCST were to decide to incorporate a fork of it and develop it into a 
> high quality parser generator for Python-like languages that would be great. 
> I wouldn't worry much about the duplication of code -- the Python generator 
> in the CPython Tools directory is only used for one purpose, and that is to 
> produce the meta-parser (the parser for grammars) from the meta-grammar. And 
> I would happily stop developing the original pegen once a fork is being 
> developed.

Thanks, this is all very clarifying! I hadn't even found the original
gvanrossum/pegen repo, and was just looking at the CPython PR for PEP
617. Clearly I haven't been following this work closely.

> Another option would be to just improve the python generator in the original 
> pegen repo to satisfy the needs of parso and LibCST. Reading the blurb for 
> parso it looks like it really just parses *Python*, which is less ambitious 
> than pegen. But it also seems to support error recovery, which currently 
> isn't part of pegen. (However, we've thought about it.) Anyway, regardless of 
> how exactly this is structured someone will probably have to take over 
> development and support. Pegen started out as a hobby project to educate 
> myself about PEG parsers. Then I wrote a bunch of blog posts about my 
> approach, and finally I started working on using it to generate a replacement 
> for the old pgen-based parser. But I never found the time to make it an 
> appealing parser generator tool for other languages, even though that was on 
> my mind as a future possibility. It will take some time 

[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-18 Thread Nam Nguyen
On Sat, Apr 18, 2020 at 9:45 PM Guido van Rossum  wrote:


> But I never found the time to make it an appealing parser generator tool
> for other languages, even though that was on my mind as a future
> possibility.
>

I simply want to +1 on this. A general purpose parser library in the stdlib
would be fantastic for its many parsing needs. I'm so looking forward to
that future.
Thanks,
Nam


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-18 Thread Guido van Rossum
On Sat, Apr 18, 2020 at 4:53 PM Carl Meyer  wrote:

> The PEP is exciting and is very clearly presented, thank you all for
> the hard work!
>
> Considering the comments in the PEP about the new parser not
> preserving a parse tree or CST, I have some questions about the future
> options for Python language-services tooling which requires a CST in
> order to round-trip and modify Python code. Examples in this space
> include auto-formatters, refactoring tools, linters with autofix, etc.
> Today many such tools (e.g. Black, 2to3) are based on lib2to3. Other
> tools already have their own parser (e.g. LibCST -- which I help
> maintain -- and Jedi both use parso, a fork of pgen2).
>

Right, LibCST is very exciting. Note that AFAIK none of the tools you
mention depend on the old parser module. (Though I'm not denying that there
might be tools depending on it -- that's why we're keeping it until 3.10.)


> 1) 2to3 and lib2to3 are not mentioned in the PEP, but are a documented
> part of the standard library used by some very popular tools, and
> currently depend on pgen2. A quick search of the PEP 617 pull request
> does not suggest that it modifies lib2to3. Will lib2to3 also be
> removed in Python 3.10 along with the old parser? It might be good for
> the PEP to address the future of 2to3 and lib2to3 explicitly.
>

Note that, while there is indeed a docs page about 2to3, the only docs for
*lib2to3* in the standard library reference are a link to the source code and a
single "*Note:* The lib2to3 API should be considered unstable and may change
drastically in the future."

Fortunately,  in order to support the 2to3 application, lib2to3 doesn't
need to change, because the syntax of Python 2 is no longer changing. :-)
Choosing to remove 2to3 is an independent decision. And lib2to3 does not
depend in any way on the old parser module. (It doesn't even use the
standard tokenize module, but incorporates its own version that is slightly
tweaked to support Python 2.)


> 2) As these tools make the necessary adaptations to support Python
> 3.10, which may no longer be parsable with an LL(1) parser, will we be
> able to leverage any part of pegen to construct a lossless Python CST,
> or will we likely need to fork pegen outside of CPython or build a
> wholly new parser? It would be neat if an alternate grammar could be
> written in pegen that has access to all tokens (including NL and
> COMMENT) for this purpose; that would save a lot of code duplication
> and potential for inconsistency. I haven't had a chance to fully read
> through the PEP 617 pull request, but it looks like its tokenizer
> wrapper currently discards NL and COMMENT. I understand this is a
> distinct use case with distinct needs and I'm not suggesting that we
> should make significant sacrifices in the performance or
> maintainability of pegen to serve it, but if it's possible to enable
> some sharing by making API choices now before it's merged, that seems
> worth considering.
>

You've mentioned a few different tools that already use different
technologies: LibCST depends on parso which has a fork of pgen2, lib2to3
which has the original pgen2. I wonder if this would be an opportunity to
move such parsing support out of the standard library completely. There are
already two versions of pegen, but neither is in the standard library:
there is the original pegen repo, which is where things started, and there is
a fork of that code in the CPython Tools directory (not yet in the upstream
repo, but see PR 19503).

The pegen tool has two generators, one generating C code and one generating
Python code. I think that the C generator is really only relevant for
CPython itself: it relies on the builtin tokenizer (the one written in C,
not the stdlib tokenize.py) and the generated C code depends on many
internal APIs. In fact the C generator in the original pegen repo doesn't
work with Python 3.9 because those internal APIs are no longer exported.
(It also doesn't work with Python 3.7 or older because it makes critical
use of the walrus operator. :-) Also, once we started getting serious about
replacing the old parser, we worked exclusively on the C generator in the
CPython Tools directory, so the version in the original pegen repo is
lagging quite a bit behind (as is the Python grammar in that repo). But as
I said you're not gonna need it.

On the other hand, the Python generator is designed to be flexible, and
while it defaults to using the stdlib tokenize.py tokenizer, you can easily
hook up your own. Putting this version in the stdlib would be a mistake,
because the code is pretty immature; it is really waiting for a good home,
and if parso or LibCST were to decide to incorporate 

[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-18 Thread Carl Meyer
The PEP is exciting and is very clearly presented, thank you all for
the hard work!

Considering the comments in the PEP about the new parser not
preserving a parse tree or CST, I have some questions about the future
options for Python language-services tooling which requires a CST in
order to round-trip and modify Python code. Examples in this space
include auto-formatters, refactoring tools, linters with autofix, etc.
Today many such tools (e.g. Black, 2to3) are based on lib2to3. Other
tools already have their own parser (e.g. LibCST -- which I help
maintain -- and Jedi both use parso, a fork of pgen2).

1) 2to3 and lib2to3 are not mentioned in the PEP, but are a documented
part of the standard library used by some very popular tools, and
currently depend on pgen2. A quick search of the PEP 617 pull request
does not suggest that it modifies lib2to3. Will lib2to3 also be
removed in Python 3.10 along with the old parser? It might be good for
the PEP to address the future of 2to3 and lib2to3 explicitly.

2) As these tools make the necessary adaptations to support Python
3.10, which may no longer be parsable with an LL(1) parser, will we be
able to leverage any part of pegen to construct a lossless Python CST,
or will we likely need to fork pegen outside of CPython or build a
wholly new parser? It would be neat if an alternate grammar could be
written in pegen that has access to all tokens (including NL and
COMMENT) for this purpose; that would save a lot of code duplication
and potential for inconsistency. I haven't had a chance to fully read
through the PEP 617 pull request, but it looks like its tokenizer
wrapper currently discards NL and COMMENT. I understand this is a
distinct use case with distinct needs and I'm not suggesting that we
should make significant sacrifices in the performance or
maintainability of pegen to serve it, but if it's possible to enable
some sharing by making API choices now before it's merged, that seems
worth considering.
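
(For what it's worth, the stdlib tokenizer already produces the trivia a
lossless CST needs; it is the wrapper that filters it out. A quick
demonstration:)

import io
import tokenize

src = "x = 1  # keep me\n\ny = 2\n"
for tok in tokenize.generate_tokens(io.StringIO(src).readline):
    if tok.type in (tokenize.COMMENT, tokenize.NL):
        print(tokenize.tok_name[tok.type], repr(tok.string))
# COMMENT '# keep me'
# NL '\n'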

Carl


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-16 Thread Pablo Galindo Salgado
After the feedback received at the language summit, we have made a modification
to the proposed migration plan in PEP 617, so the new parser will be the default
in 3.9 alpha 6:

https://github.com/python/peps/pull/1369


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-08 Thread Steven D'Aprano
On Mon, Apr 06, 2020 at 07:03:30PM -0700, Guido van Rossum wrote:
> After 30 years am I not allowed to take new information into account and
> consider a change of heart? :-)

Of course :-)


-- 
Steven


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-06 Thread Greg Ewing

Another point in favour of always-reserved keywords is that they
make life a lot easier for syntax highlighters.

--
Greg


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-06 Thread Greg Ewing

On 7/04/20 6:54 am, Guido van Rossum wrote:

I'm not sure that that was the conclusion. At the time the point was 
that we *wanted* all keywords to be reserved everywhere, and `as` was an 
ugly exception to that rule, which we got rid of as soon as we could -- 
not because it was a bad idea but because it violated a somewhat 
arbitrary rule.


I don't see it as an arbitrary rule, or at least no more arbitrary
than any other language rule. Given that the rule exists, it's the
exception that seems arbitrary. There's little justification for it
other than "we only thought of using it as a keyword later".

To reduce arbitrariness, we would either have to make *all*
keywords context-sensitive, or come up with some principled way
of deciding whether a given keyword should be reserved or not.

--
Greg


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-06 Thread Greg Ewing

On 7/04/20 5:43 am, Guido van Rossum wrote:
The biggest difference is that the `|` operator is no longer 
symmetrical (since if you have alternatives `A | B`, and both match at 
some point in the input, PEG reports A, while the old generator would 
reject the grammar as being ambiguous).


I'm still inclined to think that allowing ambiguous grammars is
more of a bug than a feature. Is there some way the generator
could be made to at least warn if the grammar is genuinely
ambiguous (as opposed to just having overlapping first sets in
alternatives)?
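
(For concreteness, ordered choice in a PEG resolves any overlap silently --
a toy hand-rolled illustration, nothing to do with pegen's actual API:)

def parse_stmt(text):
    # stmt: "ab" | "a"   -- both alternatives can match the same input
    for alt in ("ab", "a"):                # a PEG tries them strictly in order
        if text.startswith(alt):
            return alt, text[len(alt):]    # commit to the first success
    return None

print(parse_stmt("abc"))   # ('ab', 'c') -- the "a" alternative is never considered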


We don't specify how other implementations must parse the language


And this is one of the reasons. If we use a PEG grammar as the
definition of the language, and aren't careful about ambiguities
when we add new syntax, we might accidentally end up with something
that can *only* be parsed with a PEG parser or something equally
powerful.


I'm sure there will be other ways to parse the same language.


That's certainly true now, but can you be sure it will remain
true if additions are made that rely on the full power of PEG?

--
Greg


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-06 Thread Guido van Rossum
After 30 years am I not allowed to take new information into account and
consider a change of heart? :-)

On Mon, Apr 6, 2020 at 6:21 PM Steven D'Aprano  wrote:

> On Mon, Apr 06, 2020 at 11:54:54AM -0700, Guido van Rossum wrote:
>
> > (In an early version of the PEG parser, all keywords were
> > context-sensitive, and there were only very few places in the grammar
> where
> > this required us to insert negative lookaheads to make edge cases parse
> > correctly. The rest was taken care by careful ordering of rules, e.g. the
> > rule for `del_stmt` must be tried before the rule for `expression_stmt`
> > since `del *x` would match the latter.)
>
> I think, on first glance, I'd rather have all keywords context-sensitive
> than just some. But I haven't put a great deal of thought into that
> aspect of it, and I reserve the right to change my mind :-)
>
>
> > > Personally, I would not like to have to explain to newcomers why
> `match`
> > > is a keyword but you can still use it as a function or variable, but
> not
> > > other keywords like `raise`, `in`, `def` etc.
> > >
> > > match expression:
> > > match = True
> > >
> >
> > What kind of newcomers do you have that they even notice that, unless you
> > were to draw attention to it?
>
> It didn't take me 25 years to try using "of" and "if" for "output file"
> and "input file", so I guess my answer to your question is ordinary
> newcomers :-)
>
> "Newcomers" doesn't just including beginners to programming, it can
> include people experienced in one or more other language coming to
> Python for the first time.
>
> But if we're talking about complete beginners, the concept of what is
> and isn't a keyword is not always clear. Why is the first of these legal
> but not the second? Both words are highlighted in my editor:
>
> str = "Hello world"
> class = "wizard"
>
> People are going to learn that `match` is a keyword, and then they are
> going to come across code using it as a variable or method, and while
> the context-sensitive rule might be obvious to us, it won't be obvious
> to them precisely because they are still learning the language rules.
>
> I think that `match` would be an especially interesting case because I
> can easily see someone starting off with a variable `match`, that they
> handle in an `if` statement, and then as the code evolves they shift it
> to a `match` statement:
>
> match match:
>
> and not bother to refactor the name because they are familiar enough
> with it that the meaning is obvious.
>
> On the other hand there are definitely a few keywords that collide with
> useful names. Apart from `if`, I have wanted to use these as variables,
> parameters or functions:
>
> class, raise, in, from, except, while, lambda
>
> (off the top of my head, there may be others). There's at least one
> place in the random module where a parameter is misspelled "lambd"
> because lambda is a keyword. So there is certainly something to be said
> for getting rid of keywords.
>
> On the third hand, keywords don't just make it easier for the
> interpreter, they also make it easier for the human reader. You don't
> need to care about context, `except` is `except` wherever you see it.
> That makes it a dead-simple rule for anyone to learn, because there are
> no exceptions, pun intended.
>
> (I guess inside strings and comments are exceptions, but they are
> well-understood and *simple* exceptions.)
>
> I just can't help feeling at this point that while there are pros and
> cons to making things a keyword, having some keywords be context
> sensitive but not others is going to combine the worst of both and end
> up being confusing and awkward.
>
>
> > I'm serious -- from the kind of questions
> > I've seen in user forums, most newcomers are having a hard enough time
> > learning more fundamental concepts and abstractions than the precise
> rules
> > for reserved words.
>
> That's because the precise rules for reserved words are dead-simple to
> learn. You can't use them anywhere except in the correct context. If we
> start adding exceptions to that, that reserved words are only sometimes
> reserved, I think that will make them harder to learn. If it's only some
> reserved words but not others, that's even harder because we have three
> classes of words:
>
> * words that are never reserved
> * words that are sometimes reserved, depending on what is around them
> * words that are always reserved
>
> I had thought that "no context-sensitive keywords" was a hard rule, so I
> was surprised that you are now re-considering it.
>
>
> --
> Steven

[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-06 Thread Steven D'Aprano
On Mon, Apr 06, 2020 at 11:54:54AM -0700, Guido van Rossum wrote:

> (In an early version of the PEG parser, all keywords were
> context-sensitive, and there were only very few places in the grammar where
> this required us to insert negative lookaheads to make edge cases parse
> correctly. The rest was taken care by careful ordering of rules, e.g. the
> rule for `del_stmt` must be tried before the rule for `expression_stmt`
> since `del *x` would match the latter.)

I think, on first glance, I'd rather have all keywords context-sensitive 
than just some. But I haven't put a great deal of thought into that 
aspect of it, and I reserve the right to change my mind :-)


> > Personally, I would not like to have to explain to newcomers why `match`
> > is a keyword but you can still use it as a function or variable, but not
> > other keywords like `raise`, `in`, `def` etc.
> >
> > match expression:
> > match = True
> >
> 
> What kind of newcomers do you have that they even notice that, unless you
> were to draw attention to it?

It didn't take me 25 years to try using "of" and "if" for "output file" 
and "input file", so I guess my answer to your question is ordinary 
newcomers :-)

"Newcomers" doesn't just including beginners to programming, it can 
include people experienced in one or more other language coming to 
Python for the first time.

But if we're talking about complete beginners, the concept of what is 
and isn't a keyword is not always clear. Why is the first of these legal 
but not the second? Both words are highlighted in my editor:

str = "Hello world"
class = "wizard"

People are going to learn that `match` is a keyword, and then they are 
going to come across code using it as a variable or method, and while 
the context-sensitive rule might be obvious to us, it won't be obvious 
to them precisely because they are still learning the language rules.

I think that `match` would be an especially interesting case because I 
can easily see someone starting off with a variable `match`, that they 
handle in an `if` statement, and then as the code evolves they shift it 
to a `match` statement:

match match:

and not bother to refactor the name because they are familiar enough 
with it that the meaning is obvious.

On the other hand there are definitely a few keywords that collide with 
useful names. Apart from `if`, I have wanted to use these as variables, 
parameters or functions:

class, raise, in, from, except, while, lambda

(off the top of my head, there may be others). There's at least one 
place in the random module where a parameter is misspelled "lambd" 
because lambda is a keyword. So there is certainly something to be said 
for getting rid of keywords.

On the third hand, keywords don't just make it easier for the 
interpreter, they also make it easier for the human reader. You don't 
need to care about context, `except` is `except` wherever you see it. 
That makes it a dead-simple rule for anyone to learn, because there are 
no exceptions, pun intended.

(I guess inside strings and comments are exceptions, but they are 
well-understood and *simple* exceptions.)

I just can't help feeling at this point that while there are pros and 
cons to making things a keyword, having some keywords be context 
sensitive but not others is going to combine the worst of both and end 
up being confusing and awkward.


> I'm serious -- from the kind of questions
> I've seen in user forums, most newcomers are having a hard enough time
> learning more fundamental concepts and abstractions than the precise rules
> for reserved words.

That's because the precise rules for reserved words are dead-simple to 
learn. You can't use them anywhere except in the correct context. If we 
start adding exceptions to that, that reserved words are only sometimes 
reserved, I think that will make them harder to learn. If it's only some 
reserved words but not others, that's even harder because we have three 
classes of words:

* words that are never reserved
* words that are sometimes reserved, depending on what is around them
* words that are always reserved
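
(At least the always-reserved class is easy to enumerate from the
interpreter itself; a quick check with the keyword module:)

import keyword

print(keyword.iskeyword("class"))   # True  -- always reserved
print(keyword.iskeyword("match"))   # False -- not a hard keyword today
print(keyword.kwlist[:5])           # ['False', 'None', 'True', 'and', 'as']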

I had thought that "no context-sensitive keywords" was a hard rule, so I 
was surprised that you are now re-considering it.


-- 
Steven


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-06 Thread Steve Holden
On Mon, Apr 6, 2020 at 8:04 PM Guido van Rossum  wrote:

> On Mon, Apr 6, 2020 at 11:36 AM Steven D'Aprano 
> wrote:
>
>>
>> Personally, I would not like to have to explain to newcomers why `match`
>> is a keyword but you can still use it as a function or variable, but not
>> other keywords like `raise`, `in`, `def` etc.
>>
>> match expression:
>> match = True
>>
>
> What kind of newcomers do you have that they even notice that, unless you
> were to draw attention to it? I'm serious -- from the kind of questions
> I've seen in user forums, most newcomers are having a hard enough time
> learning more fundamental concepts and abstractions than the precise rules
> for reserved words.
>

Absolutely. Beginners can simply be told they are keywords. If they then
come across them in other contexts, hopefully there'll be a sensible
documentation page that a web search for " keyword" would lead to
an explanation that

"Some Python keywords can only ever be used with that meaning. Others can
be used with other meanings where the context makes it clear that the
keyword interpretation does not apply. You are recommended not to use such
keywords as names in your own programs. The feature was implemented to make
porting existing code to future versions of Python simpler."

The tutorial should contain a similar passage.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/S2URQJVKERGOTKGJM5ZQX3EOCA2KNNS2/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-06 Thread Chris Angelico
On Tue, Apr 7, 2020 at 5:03 AM Guido van Rossum  wrote:
>
> On Mon, Apr 6, 2020 at 11:36 AM Steven D'Aprano  wrote:
>> Personally, I would not like to have to explain to newcomers why `match`
>> is a keyword but you can still use it as a function or variable, but not
>> other keywords like `raise`, `in`, `def` etc.
>>
>> match expression:
>> match = True
>
>
> What kind of newcomers do you have that they even notice that, unless you 
> were to draw attention to it? I'm serious -- from the kind of questions I've 
> seen in user forums, most newcomers are having a hard enough time learning 
> more fundamental concepts and abstractions than the precise rules for 
> reserved words.
>

From my experience of teaching a variety of languages, including SQL,
it's usually not something people have a problem with in toy examples
- but it becomes a major nuisance when they're trying to deal with a
problem and some keyword is getting in the way. SQL is *full* of
context-sensitive keywords, and every once in a while, someone uses a
non-reserved word as a column name, and everything works until they
run into some specific context where it doesn't work. (It's a bit
messier than in Python due to multiple abstraction layers, e.g. ORMs, and
sometimes they deal with these issues and sometimes not; but it's
still that much harder to debug specifically _because_ things aren't
always reserved.)

Ultimately it comes down to the number of edge cases that people have
to learn, and how edgy those cases are. Python already has the
possibility to override builtins, so you can say "list = []" without
an error; context-sensitive keywords sit in a space between those and
fully-reserved words. It'll come down to specific words as to whether
it's inevitably going to be a problem down the track, or almost
certainly going to be fine.
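
(To spell out those three tiers with a toy example -- nothing here beyond 
current CPython behaviour; the commented-out line is what you get today:)

list = []          # legal: 'list' is just a builtin, now shadowed in this scope
print(list)        # prints [] -- the builtin list type is hidden here
# class = []       # SyntaxError: 'class' is reserved absolutely everywhere
# A context-sensitive 'match' would sit in between: usable as a name in
# most positions, reserved only in the one construct that introduces it.

That middle tier is the extra distinction people would have to internalise.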

BTW, is the PEG parser going to make it easier to hack on the language
syntax? If so, it'd be that much easier to experiment with these kinds
of ideas in a separate branch/fork, and quickly find out if there's
going to be any major impact. At the moment, editing the grammar is a
bit daunting - too many easy ways to mess it up.

ChrisA
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/OAF2EVT6ZGRFOHMVLZGETZSSG52CGYQV/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-06 Thread Guido van Rossum
On Mon, Apr 6, 2020 at 11:36 AM Steven D'Aprano  wrote:

> On Mon, Apr 06, 2020 at 10:43:11AM -0700, Guido van Rossum wrote:
>
> > I've been toying with the idea of introducing a "match" statement
> > similar to Scala's match expression by making "match" a keyword only when
> > followed by an expression and a colon.)
>
> Didn't we conclude from `as` that having context-sensitive keywords was
> a bad idea?
>

I'm not sure that that was the conclusion. At the time the point was that
we *wanted* all keywords to be reserved everywhere, and `as` was an ugly
exception to that rule, which we got rid of as soon as we could -- not
because it was a bad idea but because it violated a somewhat arbitrary rule.

We went through the same thing with `async` and `await`, and the experience
there was worse: a lot of libraries in the very space `async def` was aimed
at were using `async` as a parameter name, often in APIs, and they had to
scramble to redesign their APIs and get their users to change their
programs.

In retrospect I wish we had just kept `async` as a context-sensitive
keyword, since it was totally doable.

(In an early version of the PEG parser, all keywords were
context-sensitive, and there were only very few places in the grammar where
this required us to insert negative lookaheads to make edge cases parse
correctly. The rest was taken care of by careful ordering of rules, e.g. the
rule for `del_stmt` must be tried before the rule for `expression_stmt`
since `del *x` would match the latter.)
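
(For anyone who hasn't seen PEG before, here is a deliberately tiny sketch 
of that ordering rule -- toy regex-based rules, nothing like the code pegen 
actually emits:)

import re

def parse_del_stmt(text, pos):
    # Toy rule: the word 'del' followed by a (possibly starred) name.
    m = re.match(r"del\s+\*?\w+", text[pos:])
    return (("Del", m.group(0)), pos + m.end()) if m else None

def parse_expression_stmt(text, pos):
    # Toy rule: greedily take the rest of the line as an "expression".
    m = re.match(r"[^\n]+", text[pos:])
    return (("Expr", m.group(0)), pos + m.end()) if m else None

def parse_statement(text, pos=0):
    # PEG ordered choice: alternatives are tried in order and the first
    # success wins, so del_stmt must be listed before expression_stmt;
    # with the order reversed, "del *x" would be swallowed by the
    # greedy expression rule.
    for rule in (parse_del_stmt, parse_expression_stmt):
        result = rule(text, pos)
        if result is not None:
            return result
    return None

print(parse_statement("del *x"))  # (('Del', 'del *x'), 6)
print(parse_statement("x + y"))   # (('Expr', 'x + y'), 5)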


> Personally, I would not like to have to explain to newcomers why `match`
> is a keyword but you can still use it as a function or variable, but not
> other keywords like `raise`, `in`, `def` etc.
>
> match expression:
> match = True
>

What kind of newcomers do you have that they even notice that, unless you
were to draw attention to it? I'm serious -- from the kind of questions
I've seen in user forums, most newcomers are having a hard enough time
learning more fundamental concepts and abstractions than the precise rules
for reserved words.

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/XW6RDCITNUWIUPADTJMSIRMPBGMHFKJS/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-06 Thread Steven D'Aprano
On Mon, Apr 06, 2020 at 10:43:11AM -0700, Guido van Rossum wrote:

> I've been toying with the idea of introducing a "match" statement
> similar to Scala's match expression by making "match" a keyword only when
> followed by an expression and a colon.)

Didn't we conclude from `as` that having context-sensitive keywords was 
a bad idea?

Personally, I would not like to have to explain to newcomers why `match` 
is a keyword but you can still use it as a function or variable, but not 
other keywords like `raise`, `in`, `def` etc.

match expression:
match = True


-- 
Steven
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/2INKROEUGRGCSPUM7L7JY7KI4C7HUP3Z/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-06 Thread Guido van Rossum
On Mon, Apr 6, 2020 at 5:18 AM Jeff Allen  wrote:

> The PEP gives a good exposition of the problem and proposed solution,
> thanks.
>
> If I understand correctly, the proposal is that the PEG grammar should
> become the definitive grammar for Python at some point, probably for Python
> 3.10, so it may evolve without the LL(1) restrictions. I'd like to raise
> some points with respect to that, which perhaps the migration section could
> answer.
>
Thanks, you definitely have a point here.

> When definitive, the grammar would not then just be for CPython, and would
> also appear as user documentation of the language. Whether that change
> leaves Python with a more useful (readable) grammar seems an important test
> of the idea. I'm looking at
> https://github.com/we-like-parsers/cpython/blob/pegen/Grammar/python.gram
> , and assuming that is indicative of a future definitive grammar. That may
> be incorrect, as it has these issues in my view:
>
> 1. It is decorated with actions in C. If a decorated grammar is offered as
> definitive, one with Python actions (operations on the AST) is preferable,
> as implementation neutral, although still hostage to AST changes that are
> not language changes. Maybe one stripped of actions is best.
>
Yes, the plan is to strip actions and a few other embellishments (types,
names, cuts, and probably also lookaheads -- although the latter may be
significant, we only use them for optimization). The parser generator (
https://github.com/we-like-parsers/cpython/tree/pegen/Tools/peg_generator)
prints a stripped representation (though currently preserving lookaheads --
suppressing those would be a simple change to the code).

> 2. It's quite long, and not at first glance more readable than the LL(1)
> grammar. I had understood ugliness in the LL(1) grammar to result from
> skirting limitations that PEG eliminates. The PEG one is twice as long, but
> recognising about half of it is actions, let's just say that as a grammar
> it's no shorter.
>
Indeed. I believe part of this actually comes from the desire to be 100%
compatible with the old parser (an important constraint is that we don't
want to change the AST since we don't want to change the byte code
generator).

Another part of it comes from expressing in the grammar constraints that
the old parser generator cannot express. For example, the old parser
accepts `1 = x` as an assignment, and it is rejected in a later stage. The
new parser expresses this restriction in the grammar. Note that the full
grammar published in the reference manual (
https://docs.python.org/3.8/reference/grammar.html) doesn't say anything
about this; the grammar used later to describe assignment_stmt does (
https://docs.python.org/3.8/reference/simple_stmts.html#grammar-token-assignment-stmt),
but as a result it is not LL(1) -- those grammar sections sprinkled
throughout the reference manual are all written and updated by hand (and
sometimes we forget!).
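
(A quick way to see that particular case -- the exact error message 
differs between versions, so this only checks the exception type:)

import ast

try:
    ast.parse("1 = x")
except SyntaxError as exc:
    # Both the old and the new parser reject this; the difference is purely
    # internal (a post-parse check vs. a restriction encoded in the grammar).
    print(type(exc).__name__, exc.msg)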

> 3. There is some manual guidance by means of &-guards, only necessary (I
> think) as a speed-up or to force out meaningful syntax errors. That would
> be noise to the reader. (This goes away if the PEG parser generator
> generated guards from the first set at a simple "no backtracking" marker.)
>
Yeah, see above. We've thought of generating FIRST sets as a future
enhancement of the generator, and then they can go away. At the moment the
lookaheads we have are all carefully aimed at optimizing the time and space
requirements of the parser.

> 4. In some places, expansive alternatives seem to be motivated by the
> difference between actions, for a start, wherever async pops up. Maybe it
> is also why the definition of lambda is so long. That could go away with
> different support code (e.g. is_async as an argument), but if improvements
> to the support change grammar rules, when the language has not changed,
> that's a danger sign too.
>
Yeah, lambda is complicated by the requirement on the generated AST.
Arguably we have gone too far here (and for 'parameters', which solves
almost the same problem for regular function definitions) and we should put
some of the checks back in the support code. But I note that the old
grammar also has some warts in the area of parameter definitions (though
its lambda is definitely simpler).

> All that I think means that the "operational" grammar from which you build
> the parser is going to be quite unlike the one with which you communicate
> the language. At present ~/Grammar/Grammar both generates the parser (I
> thought) and appears as documentation. I take it to be the ideal that we
> use a single, human-readable definition. For example ANTLR 4 has worked
> hard to facilitate a grammar in which actions are implicit, and the
> generation of an AST from the parse tree/events can be elsewhere. (I'm not
> plugging ANTLR specifically as a solution.)
>
Our cheaper solution is to remove the actions from the display grammar. But
I don't think that Grammar/Grammar should be 

[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-06 Thread Guido van Rossum
On Mon, Apr 6, 2020 at 4:03 AM Fabio Zadrozny  wrote:

> I think using a PEG parser is interesting, but I do have some questions
> related to what's to expect in the future for other people which have to
> follow the Python grammar, so, can you shed some light on this?
>
> Does that mean that the grammar format currently available (which is
> currently specified in https://docs.python.org/3.8/reference/grammar.html)
> will no longer be updated/used?
>

The grammar format used for the PEG parser is nearly the same as the old
grammar, when you remove actions and some embellishments needed for
actions. The biggest difference is that the `|` operator is no longer
symmetrical (since if you have alternatives `A | B`, and both match at some
point in the input, PEG reports A, while the old generator would reject the
grammar as being ambiguous).


> Is it expected that other language implementations/parsers also have to
> move to a PEG parser in the future? -- which would probably be the case if
> the language deviates strongly off LL(1)
>

We don't specify how other implementations must parse the language -- in
fact I have no idea how the parsers of any of the other implementations
work. I'm sure there will be other ways to parse the same language.

But yeah, if there are implementations that currently closely follow
Python's LL(1) parser structure they may have to be changed once we start
introducing new syntax that makes use of the freedom PEG gives us. (For
example, I've been toying with the idea of introducing a "match" statement
similar to Scala's match expression by making "match" a keyword only when
followed by an expression and a colon.)

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/WBGUXFL54OLTYINNLRMAW5UH7KSIX7QX/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-06 Thread Jeff Allen
The PEP gives a good exposition of the problem and proposed solution, 
thanks.


If I understand correctly, the proposal is that the PEG grammar should 
become the definitive grammar for Python at some point, probably for 
Python 3.10, so it may evolve without the LL(1) restrictions. I'd like 
to raise some points with respect to that, which perhaps the migration 
section could answer.


When definitive, the grammar would not then just be for CPython, and 
would also appear as user documentation of the language. Whether that 
change leaves Python with a more useful (readable) grammar seems an 
important test of the idea. I'm looking at 
https://github.com/we-like-parsers/cpython/blob/pegen/Grammar/python.gram 
, and assuming that is indicative of a future definitive grammar. That 
may be incorrect, as it has these issues in my view:


1. It is decorated with actions in C. If a decorated grammar is offered 
as definitive, one with Python actions (operations on the AST) is 
preferable, as implementation neutral, although still hostage to AST 
changes that are not language changes. Maybe one stripped of actions is 
best.


2. It's quite long, and not at first glance more readable than the LL(1) 
grammar. I had understood ugliness in the LL(1) grammar to result from 
skirting limitations that PEG eliminates. The PEG one is twice as long, 
but recognising about half of it is actions, let's just say that as a 
grammar it's no shorter.


3. There is some manual guidance by means of &-guards, only necessary (I 
think) as a speed-up or to force out meaningful syntax errors. That 
would be noise to the reader. (This goes away if the PEG parser 
generator generated guards from the first set at a simple "no 
backtracking" marker.)


4. In some places, expansive alternatives seem to be motivated by the 
difference between actions, for a start, wherever async pops up. Maybe 
it is also why the definition of lambda is so long. That could go away 
with different support code (e.g. is_async as an argument), but if 
improvements to the support change grammar rules, when the language has 
not changed, that's a danger sign too.


All that I think means that the "operational" grammar from which you 
build the parser is going to be quite unlike the one with which you 
communicate the language. At present ~/Grammar/Grammar both generates 
the parser (I thought) and appears as documentation. I take it to be the 
ideal that we use a single, human-readable definition. For example ANTLR 
4 has worked hard to facilitate a grammar in which actions are implicit, 
and the generation of an AST from the parse tree/events can be 
elsewhere. (I'm not plugging ANTLR specifically as a solution.)


Jeff Allen

On 02/04/2020 19:10, Guido van Rossum wrote:
Since last fall's core sprint in London, Pablo Galindo Salgado, 
Lysandros Nikolaou and myself have been working on a new parser for 
CPython. We are now far enough along that we present a PEP we've written:


https://www.python.org/dev/peps/pep-0617/

Hopefully the PEP speaks for itself. We are hoping for a speedy 
resolution so we can land the code we've written before 3.9 beta 1.


If people insist I can post a copy of the entire PEP here on the list, 
but since a lot of it is just background information on the old LL(1) 
and the new PEG parsing algorithms, I figure I'd spare everyone the 
need of reading through that. Below is a copy of the most relevant 
section from the PEP. I'd also like to point out the section on 
performance (which you can find through the above link) -- basically 
performance is on a par with that of the old parser.


==
Migration plan
==

This section describes the migration plan when porting to the new 
PEG-based parser
if this PEP is accepted. The migration will be executed in a series of 
steps that allow

initially to fallback to the previous parser if needed:

1.  Before Python 3.9 beta 1, include the new PEG-based parser 
machinery in CPython
    with a command-line flag and environment variable that allows 
switching between
    the new and the old parsers together with explicit APIs that allow 
invoking the
    new and the old parsers independently. At this step, all Python 
APIs like ``ast.parse``
    and ``compile`` will use the parser set by the flags or the 
environment variable and

    the default parser will be the current parser.

2.  After Python 3.9 Beta 1 the default parser will be the new parser.

3.  Between Python 3.9 and Python 3.10, the old parser and related 
code (like the
    "parser" module) will be kept until a new Python release happens 
(Python 3.10). In
    the meanwhile and until the old parser is removed, **no new Python 
Grammar
    addition will be added that requires the peg parser**. This means 
that the grammar

    will be kept LL(1) until the old parser is removed.

4.  In Python 3.10, remove the old parser, the command-line flag, the 
environment

    variable and the "parser" module and related 

[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-06 Thread Fabio Zadrozny
On Thu, Apr 2, 2020 at 3:16 PM Guido van Rossum  wrote:

> Since last fall's core sprint in London, Pablo Galindo Salgado, Lysandros
> Nikolaou and myself have been working on a new parser for CPython. We are
> now far enough along that we present a PEP we've written:
>
> https://www.python.org/dev/peps/pep-0617/
>
> Hopefully the PEP speaks for itself. We are hoping for a speedy resolution
> so we can land the code we've written before 3.9 beta 1.
>
> If people insist I can post a copy of the entire PEP here on the list, but
> since a lot of it is just background information on the old LL(1) and the
> new PEG parsing algorithms, I figure I'd spare everyone the need of reading
> through that. Below is a copy of the most relevant section from the PEP.
> I'd also like to point out the section on performance (which you can find
> through the above link) -- basically performance is on a par with that of
> the old parser.
>
>
Hi Guido,

I think using a PEG parser is interesting, but I do have some questions
related to what's to expect in the future for other people which have to
follow the Python grammar, so, can you shed some light on this?

Does that mean that the grammar format currently available (which is
currently specified in https://docs.python.org/3.8/reference/grammar.html)
will no longer be updated/used?

Is it expected that other language implementations/parsers also have to
move to a PEG parser in the future? -- which would probably be the case if
the language deviates strongly off LL(1)

Thanks,

Fabio
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZETOUF7L7XBPJ2D2U7UZZBBBDTP727XZ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-05 Thread Guido van Rossum
On Sun, Apr 5, 2020 at 5:16 PM Greg Ewing 
wrote:

> On 6/04/20 4:48 am, Guido van Rossum wrote:
> > There's no need to worry about this: in almost all cases the error
> > indicator points to the same spot in the source code as with the old
> > parser.
>
> I'm curious about how that works. From the description in the PEP,
> it seems that none of the individual parsing functions can report
> an error, because there might be another branch higher up that
> succeeds. Does it keep track of the maximum distance it got through
> the source or something like that?
>

I guess you could call it that. There is a small layer of abstraction
between the actual tokenizer (which cannot go back) and the generated
parser functions. This abstraction buffers tokens. When a parser function
wants a token it calls into this abstraction, and that either satisfies it
from its buffer, or if there is no lookahead in the buffer left, calls the
actual tokenizer. When a parser function fails, it calls into the
abstraction layer to back up to a previous point (which I call the "mark").

(A simplified version of this layer is shown in my blog post,
https://medium.com/@gvanrossum_83706/building-a-peg-parser-d4869b5958fb --
the class Tokenizer.)

When an error bubbles all the way up, we report a SyntaxError pointing to
the farthest token that the abstraction has buffered (self.pos in the blog
post).
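
(Very roughly, and with made-up method names apart from mark/reset, the 
buffering layer amounts to something like this toy sketch:)

class BufferingTokenizer:
    # Toy sketch of the layer between the forward-only tokenizer and the
    # generated parser functions.
    def __init__(self, token_source):
        self._source = iter(token_source)   # the real tokenizer cannot go back
        self._buffer = []
        self._index = 0

    def mark(self):
        return self._index

    def reset(self, mark):
        # Backtracking just moves the index; tokens already read stay
        # buffered, so the underlying tokenizer is never re-run.
        self._index = mark

    def get_token(self):
        if self._index == len(self._buffer):
            self._buffer.append(next(self._source))
        token = self._buffer[self._index]
        self._index += 1
        return token

    def farthest(self):
        # Where a SyntaxError gets reported once every alternative has failed.
        return len(self._buffer) - 1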

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/U47A3SALOBQMWTVBPHDFD5OXCYXF7QSY/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-05 Thread Greg Ewing

On 6/04/20 4:48 am, Guido van Rossum wrote:
There's no need to worry about this: in almost all cases the error 
indicator points to the same spot in the source code as with the old 
parser.


I'm curious about how that works. From the description in the PEP,
it seems that none of the individual parsing functions can report
an error, because there might be another branch higher up that
succeeds. Does it keep track of the maximum distance it got through
the source or something like that?

--
Greg
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/THY5LIBHBSVZTUYMGZUYJDDXRASU4T6Z/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-05 Thread Guido van Rossum
The tl;dr is that actions specified in the grammar are specific to the
target language. So if you want to use the pegen tool to generate both
Python and C code for the same grammar, you would need two grammar files
with the same grammar but different actions. Since our goal here is just to
generate a parser for use in CPython that's not a problem. Other PEG parser
generators make different choices, e.g. TatSu puts semantics actions in a
separate file (https://tatsu.readthedocs.io/en/stable/semantics.html).

On Sun, Apr 5, 2020 at 11:06 AM Pablo Galindo Salgado 
wrote:

> > The only thing I'm missing from the PEP is more detail about how the
> cross-language nature of the parser actions are handled.
>
> Expanded the "actions" section in the PEP here:
> https://github.com/python/peps/pull/1357
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/EJMHASPUOAW7R2BKJCCVI4BGQRLN3ZRX/
> Code of Conduct: http://python.org/psf/codeofconduct/
>


-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/UHGCXQGQYPJIOM34VEEMV74OPNBXNNW5/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-05 Thread Pablo Galindo Salgado
> The only thing I'm missing from the PEP is more detail about how the
cross-language nature of the parser actions are handled.

Expanded the "actions" section in the PEP here: 
https://github.com/python/peps/pull/1357
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/EJMHASPUOAW7R2BKJCCVI4BGQRLN3ZRX/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-05 Thread Guido van Rossum
> On 6/04/20 2:08 am, Jelle Zijlstra wrote:
> > The current CPython parser usually just produces "SyntaxError: invalid
> > syntax" for any error, while other languages that I work with usually
> > say something more precise like 'expected x, got y'. What will the error
> > messages in the PEG parser look like? Making syntax errors more
> > informative can be a nice improvement to usability.
>

Unfortunately they look pretty much the same. We're actually currently
trying to improve the error messages for situations where the old parser
produces something specialized (mostly because the LL(1) grammar can't
express something and the check is done in a later pass).


> On Sun, Apr 5, 2020 at 7:55 AM Greg Ewing 
> wrote:
> And related to that, how precisely will it be able to pinpoint the
> location of the error? The backtracking worries me a bit in that
> regard. I can imagine it trying all possible ways to parse the
> input and then only being able to say "Something is wrong somewhere
> in this file."
>

There's no need to worry about this: in almost all cases the error
indicator points to the same spot in the source code as with the old
parser. I was worried about this too, but it really doesn't seem to be a
problem -- I think this might be different with highly ambiguous grammars,
but since Python's grammar is still *mostly* LL(1), it looks like we're
fine.

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/CSFQGEYL6NVWRWQLJ3B5AXKV3ZAFWA65/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-05 Thread Greg Ewing

On 6/04/20 2:08 am, Jelle Zijlstra wrote:
The current CPython parser usually just produces "SyntaxError: invalid 
syntax" for any error, while other languages that I work with usually 
say something more precise like 'expected x, got y'. What will the error 
messages in the PEG parser look like? Making syntax errors more 
informative can be a nice improvement to usability.


And related to that, how precisely will it be able to pinpoint the
location of the error? The backtracking worries me a bit in that
regard. I can imagine it trying all possible ways to parse the
input and then only being able to say "Something is wrong somewhere
in this file."

--
Greg
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/AW3FIDN6BTD2UANVBPT4LO76PALKVLJN/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-05 Thread Jelle Zijlstra
On Thu, 2 Apr 2020 at 11:19, Guido van Rossum () wrote:

> Since last fall's core sprint in London, Pablo Galindo Salgado, Lysandros
> Nikolaou and myself have been working on a new parser for CPython. We are
> now far enough along that we present a PEP we've written:
>
> https://www.python.org/dev/peps/pep-0617/
>
> Hopefully the PEP speaks for itself. We are hoping for a speedy resolution
> so we can land the code we've written before 3.9 beta 1.
>
> The current CPython parser usually just produces "SyntaxError: invalid
syntax" for any error, while other languages that I work with usually say
something more precise like 'expected x, got y'. What will the error
messages in the PEG parser look like? Making syntax errors more informative
can be a nice improvement to usability.


> If people insist I can post a copy of the entire PEP here on the list, but
> since a lot of it is just background information on the old LL(1) and the
> new PEG parsing algorithms, I figure I'd spare everyone the need of reading
> through that. Below is a copy of the most relevant section from the PEP.
> I'd also like to point out the section on performance (which you can find
> through the above link) -- basically performance is on a par with that of
> the old parser.
>
> ==
> Migration plan
> ==
>
> This section describes the migration plan when porting to the new
> PEG-based parser
> if this PEP is accepted. The migration will be executed in a series of
> steps that allow
> initially to fallback to the previous parser if needed:
>
> 1.  Before Python 3.9 beta 1, include the new PEG-based parser machinery
> in CPython
> with a command-line flag and environment variable that allows
> switching between
> the new and the old parsers together with explicit APIs that allow
> invoking the
> new and the old parsers independently. At this step, all Python APIs
> like ``ast.parse``
> and ``compile`` will use the parser set by the flags or the
> environment variable and
> the default parser will be the current parser.
>
> 2.  After Python 3.9 Beta 1 the default parser will be the new parser.
>
> 3.  Between Python 3.9 and Python 3.10, the old parser and related code
> (like the
> "parser" module) will be kept until a new Python release happens
> (Python 3.10). In
> the meanwhile and until the old parser is removed, **no new Python
> Grammar
> addition will be added that requires the peg parser**. This means that
> the grammar
> will be kept LL(1) until the old parser is removed.
>
> 4.  In Python 3.10, remove the old parser, the command-line flag, the
> environment
> variable and the "parser" module and related code.
>
> --
> --Guido van Rossum (python.org/~guido)
> *Pronouns: he/him **(why is my pronoun here?)*
> 
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/HOZ2RI3FXUEMAT4XAX4UHFN4PKG5J5GR/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/SRVOVLZWEMZX4LSUXFDZCQO6CV27QEGO/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-03 Thread Greg Ewing

On 4/04/20 9:29 am, Brett Cannon wrote:

I think "needs" is a bit strong. It would be nice, though. Regardless, as long 
as this is a net improvement over the status quo I don't see this being rejected on the 
grounds that an LR or LALR parser would be better since we have a working PEG parser 
today. :)


Even if the section only says "We didn't consider any alternatives,
because...", I still think it should be there.

--
Greg
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/URT2FASB34SF2VDC57ZCWEEL2RLMIGN7/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-03 Thread Brett Cannon
Greg Ewing wrote:
> On 3/04/20 7:10 am, Guido van Rossum wrote:
> > Since last fall's core sprint in London, Pablo
> > Galindo Salgado, 
> > Lysandros Nikolaou and myself have been working on a new parser for 
> > CPython. We are now far enough along that we present a PEP we've written:
> > https://www.python.org/dev/peps/pep-0617/
> Was any consideration given to other types of parser, such
> as LR or LALR?
> LR parsers handle left recursion naturally, and don't suffer
> from any of the drawbacks mentioned in the PEP such as taking
> exponential time or requiring all the source to be loaded
> into memory.
> I think there needs to be a section in the PEP justifying the
> choice of PEG over the alternatives.

I think "needs" is a bit strong. It would be nice, though. Regardless, as long 
as this is a net improvement over the status quo I don't see this being 
rejected on the grounds that an LR or LALR parser would be better since we have 
a working PEG parser today. :)
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/NZAFQSEJD6SCXLSFNTNMPEJZM5XJ7TKL/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-03 Thread Pablo Galindo Salgado
>The only thing I'm missing from the PEP is more detail about how the
> cross-language nature of the parser actions are handled. The example covers
> just C, and the description of the actions says they're C expressions. The
> only mention of Python code generation is for alternatives without actions.
> Is the intent that the actions are cross-language, or translated to Python
> somehow, or is the support for generating a Python-based parser merely for
>. debugging, as that action suggests?

Oh, good point. Thanks for pointing that out. We certainly need to explain
that a bit better. The current situation is that actions support both Python 
and C
code. They are basically pieces of code that will be included in the resulting
program, no matter what language it is written in.

For instance, we use the Python generator to generate the code that parses the
grammar for the generator itself. The output is written in Python and the 
metagrammar
uses actions written in Python:

https://github.com/we-like-parsers/cpython/blob/pegen/Tools/peg_generator/pegen/metagrammar.gram

So regarding the usage of Python code generation: it is certainly useful for
debugging, but it is actually used by the generator itself to bootstrap a section of it (the
one that parses
grammars). The feeling of bootstrapping parsers never gets old and is one of the
most fun parts to do :)

I will prepare a PR soon to complement the section about actions in the PEP.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/62FBT5KVUES5W4SXF22WAXXE2E32Y6WT/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-03 Thread Pablo Galindo Salgado
> That paragraph seems rather confused. I think what it might be
> trying to say is that a PEG parser allows you to write productions
> with overlapping first sets (which would be "ambiguous" for an
> LL parser), but still somehow guarantees that a unique parse tree
> is produced. The latter suggests that the grammar as a whole still
> needs to be unambiguous.

We may need to rephrase this to make it a bit more clear, but this is
trying to say that PEG grammars cannot be ambiguous in the same sense
as context-free grammars are normally said to be ambiguous. Notice that
an ambiguous grammar is normally defined (for instance here 
https://en.wikipedia.org/wiki/Ambiguous_grammar)
only for context-free grammars as a grammar with more than one possible
parse tree. In the PEG formalism as Guido explained in the previous email
there is only one possible parse tree because the parser always chooses the 
first
option.

As a consequence of this (and as a particular case of it), and as you mention, the
PEG formalism allows writing productions with overlapping first sets. Also, notice
that first sets are mainly relevant for LL(k) parsers and the like, because those need to
*deduce* which alternative to follow given multiple choices in a production, while
PEG will always try them in order.

In general, the argument is that because of how PEG works, there will only be one parse
tree, and this makes the grammar "not ambiguous" under the typical definition of
ambiguity for context-free grammars (having multiple parse trees).
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/4D3B2NM2JMV2UKIT6EV5Q2A6XK2HXDEH/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-03 Thread Thomas Wouters
Thanks, Guido, Pablo, Lysandros, that's a great PEP. Also thanks to
everyone else working on the PEG parser over the last year, like Emily. I
know it's a lot of work but as someone who's intimately aware of the
headaches caused by the LL(1) parser, I greatly appreciate it :).

The only thing I'm missing from the PEP is more detail about how the
cross-language nature of the parser actions are handled. The example covers
just C, and the description of the actions says they're C expressions. The
only mention of Python code generation is for alternatives without actions.
Is the intent that the actions are cross-language, or translated to Python
somehow, or is the support for generating a Python-based parser merely for
debugging, as that action suggests?

-- 
Thomas Wouters 

Hi! I'm an email virus! Think twice before sending your email to help me
spread!
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/QCLQTDFCUYJWCZOUYKPYTN5DVUGAATFW/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-02 Thread Matt Billenstein via Python-Dev
On Thu, Apr 02, 2020 at 08:57:30PM -0700, Guido van Rossum wrote:
> On Thu, Apr 2, 2020 at 7:55 PM Matt Billenstein  wrote:
> 
> Even just running it in a dev build against the corpus of the top few
> thousand packages on pypi might give enough confidence -- I had a script
> to download the top N packages and run some script over the python files
> contained therein, but I can't seem to find it atm.
> 
> 
> We got that. Check https://github.com/gvanrossum/pegen/tree/master/scripts --
> look at download_pypi_packages.py and test_pypi_packages.py.

Very nice!

m

-- 
Matt Billenstein
m...@vazor.com
http://www.vazor.com/
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/Q5DY4ZNCB7GYGIL5LUWJFJ7GLL5EJMW2/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-02 Thread Guido van Rossum
On Thu, Apr 2, 2020 at 7:55 PM Matt Billenstein  wrote:

> On Thu, Apr 02, 2020 at 05:17:31PM -0700, Guido van Rossum wrote:
> > On Thu, Apr 2, 2020 at 4:20 PM Nathaniel Smith  wrote:
> >
> > If the AST is supposed to be the same, then would it make sense to
> > temporarily – maybe just during the alpha/beta period – always run
> > *both* parsers and confirm that they match?
> >
> >
> > That's not a bad idea!
> https://github.com/we-like-parsers/cpython/issues/33
>
> Even just running it in a dev build against the corpus of the top few
> thousand packages on pypi might give enough confidence -- I had a script
> to download the top N packages and run some script over the python files
> contained therein, but I can't seem to find it atm.
>

We got that. Check https://github.com/gvanrossum/pegen/tree/master/scripts
-- look at download_pypi_packages.py and test_pypi_packages.py.

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/P7CPMHFQBTS3SH2A6TOUJNTZI2F25JFH/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-02 Thread Matt Billenstein via Python-Dev
On Thu, Apr 02, 2020 at 05:17:31PM -0700, Guido van Rossum wrote:
> On Thu, Apr 2, 2020 at 4:20 PM Nathaniel Smith  wrote:
> 
> If the AST is supposed to be the same, then would it make sense to
> temporarily – maybe just during the alpha/beta period – always run
> *both* parsers and confirm that they match?
> 
> 
> That's not a bad idea! https://github.com/we-like-parsers/cpython/issues/33

Even just running it in a dev build against the corpus of the top few
thousand packages on pypi might give enough confidence -- I had a script
to download the top N packages and run some script over the python files
contained therein, but I can't seem to find it atm.

m

-- 
Matt Billenstein
m...@vazor.com
http://www.vazor.com/
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/YCL5HQIV5YIRPZ5VCKV6B7U5XCECFAW2/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-02 Thread Greg Ewing

On 3/04/20 3:22 pm, Guido van Rossum wrote:
This allows more freedom in designing a grammar. 
For example, it would let a language designer solve the "dangling else" 
problem from the Wikipedia page, by writing the form including the 
"else" clause first .


I'm inclined to think that such problems shouldn't be solved at the
parser level, but rather at the language level, i.e. don't design
the language that way in the first place. After all, if it's
confusing to the computer, it's probably going to be confusing
to humans as well.

(I note that all of Wirth's languages after Pascal changed the syntax
so as not to have a dangling else problem.)

Personally I would rather my parser generator *did* complain about
ambiguities, so that I can facepalm myself for designing my language
in such a stupid way.

--
Greg
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/MKMIDLTI3U7WKJ6PPMHNRQBVZINCJJAQ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-02 Thread Greg Ewing

On 3/04/20 2:13 pm, Victor Stinner wrote:

"Unlike LL(1) parsers PEG-based parsers cannot be ambiguous: if a
string parses, it has exactly one valid parse tree. This means that a
PEG-based parser cannot suffer from the ambiguity problems described
in the previous section."


That paragraph seems rather confused. I think what it *might* be
trying to say is that a PEG parser allows you to write productions
with overlapping first sets (which would be "ambiguous" for an
LL parser), but still somehow guarantees that a unique parse tree
is produced. The latter suggests that the grammar as a whole still
needs to be unambiguous.

--
Greg
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/K7LH4VHVOZ5ISIUTJ3I7UWEVM3KGK5Y6/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-02 Thread Guido van Rossum
On Fri, 3 Apr 2020 at 02:58, Greg Ewing  wrote:
> On 3/04/20 10:33 am, Victor Stinner wrote:
> > I also like the fact that PEG is deterministic, whereas
> > LL(1) parsers are not.
>
> Where do you get that LL(1) parsers are not deterministic?
> That's news to me!

On Thu, Apr 2, 2020 at 6:15 PM Victor Stinner  wrote:

> Sorry, I was referring to *ambiguous* grammar rules. Extract of the PEP:
>
> "Unlike LL(1) parsers PEG-based parsers cannot be ambiguous: if a
> string parses, it has exactly one valid parse tree. This means that a
> PEG-based parser cannot suffer from the ambiguity problems described
> in the previous section."
>

Maybe we need to rephrase this a bit. It's more that the LL(1) and PEG
formalisms deal very differently with ambiguous *grammars*. An example of an
ambiguous grammar would be:

start: X | Y
X: expr
Y: expr
expr: NAME | NAME '+' NAME

There are probably better examples of ambiguous grammars (see
https://en.wikipedia.org/wiki/Ambiguous_grammar) but I think this will do
to explain the problem.

This is a fine context-free grammar (it accepts strings like "a" and "a+b")
but the LL(1) formalism will reject it because it sees an overlap in FIRST
sets between X and Y -- not surprising because they have the same RHS.
Also, even a more powerful formalism would have to make a choice whether to
choose X or Y, which may matter if the derivation is used to build a parse
tree (like Python's pgen does).

OTOH a PEG parser generator will always take the X alternative -- it
doesn't care that there's more than one derivation, since its '|' operator
is not symmetrical: X|Y and Y|X are not the same, as they are in LL(1) and
most other formalisms. (In fact, the common notation for PEG uses '/' to
emphasize this, but it looks ugly to me so I changed it to '|'.)

That PEG (by definition) always uses the first matching alternative is
actually a blessing as well as a curse. The downside is that PEG can't tell
you when you have a real ambiguity in your grammar. But the upside is that it
works like a programmer would write a (recursive descent) parser. Thus it
"solves" the problem of ambiguous grammars by choosing the first
alternative. This allows more freedom in designing a grammar. For example,
it would let a language designer solve the "dangling else" problem from the
Wikipedia page, by writing the form including the "else" clause first.
(Python doesn't have that problem due to the use of indentation, but it
might appear in another disguise.)
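
(A toy demonstration of that asymmetry, using the X/Y grammar above -- the
functions below merely stand in for the generated rule code, which looks
nothing like this:)

def x_alt(tokens):
    return ("X", tokens)        # same right-hand side as Y in the toy grammar

def y_alt(tokens):
    return ("Y", tokens)

def parse_start(tokens, alternatives):
    # PEG ordered choice: commit to the first alternative that succeeds.
    for alt in alternatives:
        node = alt(tokens)
        if node is not None:
            return node
    return None

print(parse_start(["a", "+", "b"], [x_alt, y_alt]))  # ('X', ['a', '+', 'b'])
print(parse_start(["a", "+", "b"], [y_alt, x_alt]))  # ('Y', ['a', '+', 'b'])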

I should probably refine this argument and include it in the PEP as one of
the reasons to prefer PEG over LR or LALR (but I need to think more about
that -- it was a very early choice).

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/3LHZNXC2MJJ2RUMDGRH7L2ZOPH5ZE6QK/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-02 Thread Batuhan Taskaya
Wonderful news! I'm really excited to see what is coming alongside
this flexible parser.

On Thu, Apr 2, 2020 at 9:16 PM Guido van Rossum  wrote:
>
> Since last fall's core sprint in London, Pablo Galindo Salgado, Lysandros 
> Nikolaou and myself have been working on a new parser for CPython. We are now 
> far enough along that we present a PEP we've written:
>
> https://www.python.org/dev/peps/pep-0617/
>
> Hopefully the PEP speaks for itself. We are hoping for a speedy resolution so 
> we can land the code we've written before 3.9 beta 1.
>
> If people insist I can post a copy of the entire PEP here on the list, but 
> since a lot of it is just background information on the old LL(1) and the new 
> PEG parsing algorithms, I figure I'd spare everyone the need of reading 
> through that. Below is a copy of the most relevant section from the PEP. I'd 
> also like to point out the section on performance (which you can find through 
> the above link) -- basically performance is on a par with that of the old 
> parser.
>
> ==
> Migration plan
> ==
>
> This section describes the migration plan when porting to the new PEG-based 
> parser
> if this PEP is accepted. The migration will be executed in a series of steps 
> that allow
> initially to fallback to the previous parser if needed:
>
> 1.  Before Python 3.9 beta 1, include the new PEG-based parser machinery in 
> CPython
> with a command-line flag and environment variable that allows switching 
> between
> the new and the old parsers together with explicit APIs that allow 
> invoking the
> new and the old parsers independently. At this step, all Python APIs like 
> ``ast.parse``
> and ``compile`` will use the parser set by the flags or the environment 
> variable and
> the default parser will be the current parser.
>
> 2.  After Python 3.9 Beta 1 the default parser will be the new parser.
>
> 3.  Between Python 3.9 and Python 3.10, the old parser and related code (like 
> the
> "parser" module) will be kept until a new Python release happens (Python 
> 3.10). In
> the meanwhile and until the old parser is removed, **no new Python Grammar
> addition will be added that requires the peg parser**. This means that 
> the grammar
> will be kept LL(1) until the old parser is removed.
>
> 4.  In Python 3.10, remove the old parser, the command-line flag, the 
> environment
> variable and the "parser" module and related code.
>
> --
> --Guido van Rossum (python.org/~guido)
> Pronouns: he/him (why is my pronoun here?)
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/HOZ2RI3FXUEMAT4XAX4UHFN4PKG5J5GR/
> Code of Conduct: http://python.org/psf/codeofconduct/
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/PIR74ZYSBM46TW3OZYGPFEBV4I4BZ5MN/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-02 Thread Victor Stinner
Sorry, I was referring to *ambiguous* grammar rules. Extract of the PEP:

"Unlike LL(1) parsers PEG-based parsers cannot be ambiguous: if a
string parses, it has exactly one valid parse tree. This means that a
PEG-based parser cannot suffer from the ambiguity problems described
in the previous section."

Victor

On Fri, 3 Apr 2020 at 02:58, Greg Ewing  wrote:
>
> On 3/04/20 10:33 am, Victor Stinner wrote:
> > I also like the fact that PEG is deterministic, whereas
> > LL(1) parsers are not.
>
> Where do you get that LL(1) parsers are not deterministic?
> That's news to me!
>
> --
> Greg
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/JHHLCUF7APZ5BV7C5NUNPPWI264L5XSJ/
> Code of Conduct: http://python.org/psf/codeofconduct/



-- 
Night gathers, and now my watch begins. It shall not end until my death.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/KGKPKUOG54AD555K7V65XADXWK7MFY4X/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-02 Thread Greg Ewing

On 3/04/20 10:33 am, Victor Stinner wrote:

I also like the fact that PEG is deterministic, whereas
LL(1) parsers are not.


Where do you get that LL(1) parsers are not deterministic?
That's news to me!

--
Greg
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/JHHLCUF7APZ5BV7C5NUNPPWI264L5XSJ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-02 Thread Greg Ewing

On 3/04/20 7:10 am, Guido van Rossum wrote:
Since last fall's core sprint in London, Pablo Galindo Salgado, 
Lysandros Nikolaou and myself have been working on a new parser for 
CPython. We are now far enough along that we present a PEP we've written:


https://www.python.org/dev/peps/pep-0617/


Was any consideration given to other types of parser, such
as LR or LALR?

LR parsers handle left recursion naturally, and don't suffer
from any of the drawbacks mentioned in the PEP such as taking
exponential time or requiring all the source to be loaded
into memory.
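
(A tiny sketch, not taken from the PEP, of why left recursion is awkward
for naive recursive-descent/PEG parsers, together with the usual manual
rewrite. My understanding is that PEP 617's generator sidesteps this and
supports left-recursive rules directly.)

    def term(s, pos):
        # Parse a single digit; return (value, new_pos) or None.
        if pos < len(s) and s[pos].isdigit():
            return int(s[pos]), pos + 1
        return None

    def expr_naive(s, pos):
        # Literal translation of  expr <- expr "+" term / term : the rule
        # re-enters itself at the same position, so this never terminates.
        # Never call it; it is here only to show the problem.
        return expr_naive(s, pos)

    def expr_rewritten(s, pos):
        # The usual manual rewrite:  expr <- term ("+" term)*
        result = term(s, pos)
        if result is None:
            return None
        value, pos = result
        while pos < len(s) and s[pos] == "+":
            nxt = term(s, pos + 1)
            if nxt is None:
                break
            rhs, pos = nxt
            value += rhs
        return value, pos

    print(expr_rewritten("1+2+3", 0))  # (6, 5)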

I think there needs to be a section in the PEP justifying the
choice of PEG over the alternatives.

--
Greg


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-02 Thread Guido van Rossum
On Thu, Apr 2, 2020 at 4:20 PM Nathaniel Smith  wrote:

> On Thu, Apr 2, 2020 at 2:48 PM Pablo Galindo Salgado
>  wrote:
> >
> > > About the migration, can I ask who is going to (help to) fix projects
> > which rely on the AST?
> >
> > I think you misunderstood: the AST is exactly the same with the old and
> > the new parser. The only thing the new parser does differently is that it
> > does not generate an intermediate CST (Concrete Syntax Tree), which is
> > only half-exposed in the parser module.
>
> If the AST is supposed to be the same, then would it make sense to
> temporarily – maybe just during the alpha/beta period – always run
> *both* parsers and confirm that they match?
>

That's not a bad idea! https://github.com/we-like-parsers/cpython/issues/33
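
A rough sketch of what such a side-by-side check could look like (a
hypothetical harness, not what will end up in that issue; it assumes the
old parser can still be selected with an opt-in switch spelled
``-X oldparser`` -- the PEP only promises "a command-line flag and
environment variable", so adjust if the final spelling differs):

    import ast
    import subprocess
    import sys

    def old_parser_dump(source: str) -> str:
        # Parse the source with the old parser in a subprocess and return
        # its ast.dump() output (assumes the "-X oldparser" opt-in flag).
        one_liner = "import ast, sys; print(ast.dump(ast.parse(sys.stdin.read())))"
        result = subprocess.run(
            [sys.executable, "-X", "oldparser", "-c", one_liner],
            input=source, capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()

    def parsers_agree(source: str) -> bool:
        # The interpreter running this script uses the default (new) parser.
        return ast.dump(ast.parse(source)) == old_parser_dump(source)

    if __name__ == "__main__":
        for path in sys.argv[1:]:
            with open(path, encoding="utf-8") as f:
                src = f.read()
            print("match" if parsers_agree(src) else "MISMATCH", path)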

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*



[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-02 Thread Guido van Rossum
On Thu, Apr 2, 2020 at 2:55 PM Pablo Galindo Salgado 
wrote:

> > About the migration, can I ask who is going to (help
> to) fix projects which rely on the AST?
>
> Whoops, I sent the last email before finishing it by mistake. Here is
> the extended version of the answer:
>
> I think there is a misunderstanding here: The new parser generates the
> same AST as the old parser so
> calling ast.parse() or compile() will yield exactly the same result. We
> have extensive testing around that
> and that was a goal from the beginning. Projects using the ast module will
> not need to do anything special.
>

I think that's only half true. It's true if they already work with Python
3.9 (master/HEAD). But probably some of these packages have not yet started
testing with 3.9 nightly runs or even alphas, so it's at least
*conceivable* that some of the fixes we applied to the AST could require
(small) adjustments. And I think *that* was what Victor was referring to.
(For example, I'm not 100% sure that mypy actually works with the latest
3.9. But there seems to be something else wrong there so I can't even test
it.)


> The difference is that the new parser does not generate a CST (Concrete
> Syntax Tree). The concrete syntax tree is an intermediate structure from
> which the AST is generated. This structure is only partially exposed via
> the "parser" module but otherwise is only used inside the parser itself,
> so it should not be a problem. Moreover, as explained in the PEP, the lack
> of the CST greatly simplifies the AST generation, among other advantages.
>

I just remembered another difference. We haven't really investigated how
good the error reporting is. I'm sure there are cases where the syntax
error points at a *slightly* different position -- sometimes it's a bit
better, sometimes a bit worse. But there could be cases where the PEG
parser reads ahead chasing some alternative that will fail much later, and
then it would be much worse. We should probably explore this.

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*



[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-02 Thread Nathaniel Smith
On Thu, Apr 2, 2020 at 2:48 PM Pablo Galindo Salgado
 wrote:
>
> > About the migration, can I ask who is going to (help to) fix projects
> which rely on the AST?
>
> I think you misunderstood: the AST is exactly the same with the old and the
> new parser. The only thing the new parser does differently is that it does
> not generate an intermediate CST (Concrete Syntax Tree), which is only
> half-exposed in the parser module.

If the AST is supposed to be the same, then would it make sense to
temporarily – maybe just during the alpha/beta period – always run
*both* parsers and confirm that they match?

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-02 Thread Guido van Rossum
On Thu, Apr 2, 2020 at 12:43 PM Paul Moore  wrote:

> On Thu, 2 Apr 2020 at 19:20, Guido van Rossum  wrote:
> >
> > Since last fall's core sprint in London, Pablo Galindo Salgado,
> Lysandros Nikolaou and myself have been working on a new parser for
> CPython. We are now far enough along that we present a PEP we've written:
> >
> > https://www.python.org/dev/peps/pep-0617/
> >
> > Hopefully the PEP speaks for itself. We are hoping for a speedy
> resolution so we can land the code we've written before 3.9 beta 1.
>
> Excellent news! One question - will there be any user-visible change
> as a result of this PEP other than the removal of the "parser" module?
> From my quick reading of the PEP, I didn't see anything, so I assume
> the answer is "no".
>

I suppose it depends on how deep you dig, but the intention is that the
returned AST is identical in each case. (We've "cheated" a bit by making a
few small changes to the code that produces an AST for the old parser,
mostly to fix bugs related to line/column numbers.)

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*



[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-02 Thread Guido van Rossum
On Thu, Apr 2, 2020 at 1:21 PM Barry Warsaw  wrote:

> On Apr 2, 2020, at 13:07, Pablo Galindo Salgado 
> wrote:
> >
> >> Just to clarify, this means that 3.9 will ship with the PEG parser as
> default,
> >> right?  If so, this would be a new feature, post beta.  Since that is
> counter to our
> >> general policy, we would need to get explicit RM approval for such a
> change.
> >
> > The idea is to merge it *before beta* and make it the default *from the
> > beta* (maybe merging the commit that makes it the default the day before
> > the beta, for instance).
>
> Yes, sorry I don’t mean to be pedantic, but it would be better policy-wise
> if the new parser were the default in time for the first beta rather than
> after.
>

That was the intention, i.e. releasing beta 1 with the new parser as the
default. The current wording in the PEP is wrong; we'll fix that.

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*



[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-02 Thread Pablo Galindo Salgado
> About the migration, can I ask who is going to (help
to) fix projects which rely on the AST?

Whoops, I sent the last email before finishing it by mistake. Here is the
extended version of the answer:

I think there is a misunderstanding here: The new parser generates the same AST 
as the old parser so
calling ast.parse() or compile() will yield exactly the same result. We have 
extensive testing around that
and that was a goal from the beginning. Projects using the ast module will not 
need to do anything special.

The difference is that the new parser does not generate a CST (Concrete Syntax
Tree). The concrete syntax tree is an intermediate structure from which the
AST is generated. This structure is only partially exposed via the "parser"
module but otherwise is only used inside the parser itself, so it should not
be a problem. Moreover, as explained in the PEP, the lack of the CST greatly
simplifies the AST generation, among other advantages.
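
As a small, hedged illustration of the difference (the "parser" module below
is the half-exposed CST API that is going away; the ast module is unaffected):

    import ast
    import parser  # deprecated stdlib module exposing the old parser's CST

    source = "x = 1"

    cst = parser.st2list(parser.suite(source))  # nested lists mirroring grammar rules
    tree = ast.parse(source)                    # the AST, identical with either parser

    print(cst[0])          # numeric id of the root grammar symbol
    print(ast.dump(tree))  # Module(body=[Assign(...)], ...)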


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-02 Thread Pablo Galindo Salgado
> About the migration, can I ask who is going to (help to) fix projects
which rely on the AST?

I think you misunderstood: the AST is exactly the same with the old and the
new parser. The only thing the new parser does differently is that it does
not generate an intermediate CST (Concrete Syntax Tree), which is only
half-exposed in the parser module.


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-02 Thread Victor Stinner
Hi,

It's great to see that you finally managed to come up with a PEP; this
work is becoming concrete: congrats!

I started to read the PEP, and it's really well written! I heard that
LL(1) parsers have limits, but this PEP explains very well that the
current Python grammar was already "hacked" to work around these
limitations. I also like the fact that PEG is deterministic, whereas
LL(1) parsers are not.

I like having the new parser as the default; it will ease its adoption
and force users to adapt their code. Otherwise, the migration may take
forever and never complete :-(

--

About the migration, can I ask who is going to (help to) fix projects
which rely on the AST?

I know that the motto has always been "we don't provide any backward
compatibility guarantee on the AST", *but* more and more projects are
using the Python AST.

Examples of projects relying on the AST:

* gast: used by Pythran
* pylint uses astroid
* Chameleon
* Genshi
* Mako
* pyflakes
* (likely others)

I'm not asking to stop making AST changes. I'm following AST changes,
and the AST is becoming better and better with each Python release!

I'm just asking if there are volunteers around to help make these
projects compatible with Python 3.9 before the Python 3.9.0 final
release (to accelerate the adoption of Python 3.9). These volunteers
don't have to be the ones behind PEP 617.

Note: for an example of previous incompatible AST changes (use ast.Constant,
remove the old AST classes) in Python 3.8, see
https://bugs.python.org/issue32892. A compatibility layer was added to ease
the migration from the old AST classes to the new ast.Constant.
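
(A tiny illustration, not taken from that issue, of what that kind of
migration looks like for AST consumers:)

    import ast

    node = ast.parse("x = 42").body[0].value

    # Pre-3.8 style, now deprecated:  isinstance(node, ast.Num) and node.n == 42
    # 3.8+ style using ast.Constant:
    assert isinstance(node, ast.Constant) and node.value == 42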

Victor

On Thu, 2 Apr 2020 at 20:15, Guido van Rossum wrote:
>
> Since last fall's core sprint in London, Pablo Galindo Salgado, Lysandros 
> Nikolaou and myself have been working on a new parser for CPython. We are now 
> far enough along that we present a PEP we've written:
>
> https://www.python.org/dev/peps/pep-0617/
>
> Hopefully the PEP speaks for itself. We are hoping for a speedy resolution so 
> we can land the code we've written before 3.9 beta 1.
>
> If people insist I can post a copy of the entire PEP here on the list, but 
> since a lot of it is just background information on the old LL(1) and the new 
> PEG parsing algorithms, I figure I'd spare everyone the need of reading 
> through that. Below is a copy of the most relevant section from the PEP. I'd 
> also like to point out the section on performance (which you can find through 
> the above link) -- basically performance is on a par with that of the old 
> parser.
>
> ==
> Migration plan
> ==
>
> This section describes the migration plan when porting to the new PEG-based 
> parser
> if this PEP is accepted. The migration will be executed in a series of steps 
> that allow
> initially to fallback to the previous parser if needed:
>
> 1.  Before Python 3.9 beta 1, include the new PEG-based parser machinery in 
> CPython
> with a command-line flag and environment variable that allows switching 
> between
> the new and the old parsers together with explicit APIs that allow 
> invoking the
> new and the old parsers independently. At this step, all Python APIs like 
> ``ast.parse``
> and ``compile`` will use the parser set by the flags or the environment 
> variable and
> the default parser will be the current parser.
>
> 2.  After Python 3.9 Beta 1 the default parser will be the new parser.
>
> 3.  Between Python 3.9 and Python 3.10, the old parser and related code (like 
> the
> "parser" module) will be kept until a new Python release happens (Python 
> 3.10). In
> the meanwhile and until the old parser is removed, **no new Python Grammar
> addition will be added that requires the peg parser**. This means that 
> the grammar
> will be kept LL(1) until the old parser is removed.
>
> 4.  In Python 3.10, remove the old parser, the command-line flag, the 
> environment
> variable and the "parser" module and related code.
>
> --
> --Guido van Rossum (python.org/~guido)
> Pronouns: he/him (why is my pronoun here?)



-- 
Night gathers, and now my watch begins. It shall not end until my death.


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-02 Thread Barry Warsaw
On Apr 2, 2020, at 13:07, Pablo Galindo Salgado  wrote:
> 
>> Just to clarify, this means that 3.9 will ship with the PEG parser as 
>> default,
>> right?  If so, this would be a new feature, post beta.  Since that is 
>> counter to our
>> general policy, we would need to get explicit RM approval for such a change.
> 
> The idea is to merge it *before beta* and make it the default *from the beta*
> (maybe merging the commit that makes it the default the day before the beta,
> for instance).

Yes, sorry I don’t mean to be pedantic, but it would be better policy-wise if 
the new parser were the default in time for the first beta rather than after.

Cheers,
-Barry





[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-02 Thread Pablo Galindo Salgado
> Just to clarify, this means that 3.9 will ship with the PEG parser as default,
> right?  If so, this would be a new feature, post beta.  Since that is counter 
> to our
> general policy, we would need to get explicit RM approval for such a change.

The idea is to merge it *before beta* and make it the default *from the beta*
(maybe merging the commit that makes it the default the day before the beta,
for instance).

Pablo


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-02 Thread Barry Warsaw
Great to see this new work pay off!

On Apr 2, 2020, at 11:10, Guido van Rossum  wrote:

> 2.  After Python 3.9 Beta 1 the default parser will be the new parser.

Just to clarify, this means that 3.9 will ship with the PEG parser as default, 
right?  If so, this would be a new feature, post beta.  Since that is counter 
to our general policy, we would need to get explicit RM approval for such a 
change.

Cheers,
-Barry





[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-02 Thread Paul Moore
On Thu, 2 Apr 2020 at 19:20, Guido van Rossum  wrote:
>
> Since last fall's core sprint in London, Pablo Galindo Salgado, Lysandros 
> Nikolaou and myself have been working on a new parser for CPython. We are now 
> far enough along that we present a PEP we've written:
>
> https://www.python.org/dev/peps/pep-0617/
>
> Hopefully the PEP speaks for itself. We are hoping for a speedy resolution so 
> we can land the code we've written before 3.9 beta 1.

Excellent news! One question - will there be any user-visible change
as a result of this PEP other than the removal of the "parser" module?
From my quick reading of the PEP, I didn't see anything, so I assume
the answer is "no".

Paul


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-02 Thread Ivan Levkivskyi
This is good news. I think the new parser is indeed both simpler and more
flexible - great!

--
Ivan



On Thu, 2 Apr 2020 at 19:19, Guido van Rossum  wrote:

> Since last fall's core sprint in London, Pablo Galindo Salgado, Lysandros
> Nikolaou and myself have been working on a new parser for CPython. We are
> now far enough along that we present a PEP we've written:
>
> https://www.python.org/dev/peps/pep-0617/
>
> Hopefully the PEP speaks for itself. We are hoping for a speedy resolution
> so we can land the code we've written before 3.9 beta 1.
>
> If people insist I can post a copy of the entire PEP here on the list, but
> since a lot of it is just background information on the old LL(1) and the
> new PEG parsing algorithms, I figure I'd spare everyone the need of reading
> through that. Below is a copy of the most relevant section from the PEP.
> I'd also like to point out the section on performance (which you can find
> through the above link) -- basically performance is on a par with that of
> the old parser.
>
> ==
> Migration plan
> ==
>
> This section describes the migration plan when porting to the new
> PEG-based parser
> if this PEP is accepted. The migration will be executed in a series of
> steps that allow
> initially to fallback to the previous parser if needed:
>
> 1.  Before Python 3.9 beta 1, include the new PEG-based parser machinery
> in CPython
> with a command-line flag and environment variable that allows
> switching between
> the new and the old parsers together with explicit APIs that allow
> invoking the
> new and the old parsers independently. At this step, all Python APIs
> like ``ast.parse``
> and ``compile`` will use the parser set by the flags or the
> environment variable and
> the default parser will be the current parser.
>
> 2.  After Python 3.9 Beta 1 the default parser will be the new parser.
>
> 3.  Between Python 3.9 and Python 3.10, the old parser and related code
> (like the
> "parser" module) will be kept until a new Python release happens
> (Python 3.10). In
> the meanwhile and until the old parser is removed, **no new Python
> Grammar
> addition will be added that requires the peg parser**. This means that
> the grammar
> will be kept LL(1) until the old parser is removed.
>
> 4.  In Python 3.10, remove the old parser, the command-line flag, the
> environment
> variable and the "parser" module and related code.
>
> --
> --Guido van Rossum (python.org/~guido)
> *Pronouns: he/him **(why is my pronoun here?)*
> 
>