Our experience with automated testing is that, unfortunately, it is
very difficult to surface real problems with it. We tried some of the
new experimental source generators on top of Hypothesis
(https://pypi.org/project/hypothesmith/) and sadly we could not catch
many of the important things that parsing existing sources or running
the test suite caught immediately. This is because the grammar space
is infinite, so exploring it for parser failures without any
constraint on what to explore, or without knowing the actual structure
of the parser, won't get you very far. If you consider the collection
of programs that do not belong to the language, the space is even
larger.
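
For context, a minimal sketch of the kind of property-based test
hypothesmith enables (from_grammar() is its strategy for generating
syntactically valid source; the test name and example count here are
arbitrary):

    from hypothesis import given, settings
    from hypothesmith import from_grammar

    @settings(max_examples=1000)
    @given(source=from_grammar())
    def test_generated_source_compiles(source):
        # from_grammar() only produces source that is valid by
        # construction, so the parser should accept it without
        # crashing and without raising SyntaxError.
        compile(source, "<fuzz>", "exec")

A test like this only exercises the "accept valid programs" half of
the problem.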

For example, at one point we had a bug that manifested only when an
f-string had a specific level of nesting, even though normal f-strings
parsed just fine. Catching that automatically, without knowing what to
look for, is very unlikely.
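
To make the nesting point concrete, here is a hypothetical
reconstruction (not the actual failing input): each case nests an
f-string inside the replacement field of the previous one, alternating
quote styles so every level is valid syntax.

    import ast

    nested = [
        "f'{1}'",
        'f"{ f\'{1}\' }"',
        "f'''{ f\"{ f'{1}' }\" }'''",
    ]
    for src in nested:
        # A depth-dependent bug parses most of these fine and fails
        # only at one specific depth.
        ast.parse(src)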

As one cannot prove that two parsers parse the same language or that
two grammars are equivalent (the problem is undecidable, much like the
halting problem), this is normally addressed by testing both parsers
against a big enough corpus of both positive and negative cases. To
improve our confidence in the negative cases, we would first need a
good enough corpus of cases to test.
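
For the positive side, a corpus comparison against the old parser can
be sketched like this (a sketch only, assuming a CPython 3.9
interpreter, where the old LL(1) parser can still be selected with
-X oldparser; the "corpus" directory is a placeholder):

    import subprocess
    import sys
    from pathlib import Path

    # Helper program: print whether the source parses and, if so, a
    # dump of the resulting AST, so the two runs can be compared.
    PROBE = (
        "import ast, sys\n"
        "src = sys.stdin.read()\n"
        "try:\n"
        "    print('OK ' + ast.dump(ast.parse(src)))\n"
        "except SyntaxError:\n"
        "    print('SYNTAXERROR')\n"
    )

    def parse_with(extra_args, source):
        return subprocess.run(
            [sys.executable, *extra_args, "-c", PROBE],
            input=source, capture_output=True, text=True,
        ).stdout

    for path in Path("corpus").rglob("*.py"):
        source = path.read_text()
        new = parse_with([], source)                   # PEG parser
        old = parse_with(["-X", "oldparser"], source)  # old parser
        if new != old:
            print(f"parsers disagree on {path}")

The harder part, as said above, is assembling a corpus of negative
cases that is representative enough to be worth running.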

On Thu, 8 Oct 2020, 20:49 Daniel Moisset, <dfmois...@gmail.com> wrote:

> In this case, you can use the old parser as an oracle, at least for Python
> 3.8 syntax. The new parser should produce a syntax error if and only if the
> old one does. And if it doesn't, the AST should be the same, I guess (I'm
> not sure if the AST structure changed).
>
> On Thu, 8 Oct 2020, 03:12 Terry Reedy, <tjre...@udel.edu> wrote:
>
>> On 10/6/2020 2:02 PM, Guido van Rossum wrote:
>> > That's appreciated, but I think what's needed more is someone who
>> > actually wants to undertake this project. It's not just a matter of
>> > running a small script for hours -- someone will have to come up with a
>> > way to fuzz that is actually useful for this particular situation
>> > (inserting random characters in files isn't going to be very
>> effective).
>>
>> Changes should be by token or broader grammatical construct.  However,
>> the real difficulty in auto testing is the lack of a decision mechanism
>> for correct output (code object versus SyntaxError) other than the tests
>> being either designed or checked by parser experts.
>>
>> Classical fuzzing looks for something clearly wrong -- a crash -- rather
>> than an answer *or* an Exception.  So yes, random fuzzing that does not pass
>> known limits could be done to look for crashes.  But this is different
>> from raising SyntaxError to reject wrong programs.
>>
>> Consider unary prefix operators:
>>
>> *a is a SyntaxError, because the grammar circumscribes the use of '*' as
>> prefix.
>>
>> -'' is not, which might surprise some, but I presume the error not being
>> caught until runtime, as a TypeError, is correct for the grammar as
>> written.  Or did I just discover a parser bug?  Or a possible grammar
>> improvement?
>> (In other words, if I, even with my experience, tried grammar/parser
>> fuzzing, I might be as much a nuisance as a help.)
>>
>> It would not necessarily be a regression if the grammar and parser were
>> changed so that an obvious error like "- <string literal>" were to be
>> caught as a SyntaxError.
>>
>>
>> >     "(One area we have not explored extensively is rejection of all
>> >     wrong programs.
>>
>> I consider false rejection to be a bigger sin than false acceptance.
>> Wrong programs, like "-''", are likely to fail at runtime anyway.  So
>> one could test acceptance of randomly generated correct but not
>> ridiculously big programs.  But I guess compiling the stdlib and other
>> packages already pretty well covered this.
>>
>> >  We have unit tests that check for a certain number
>> >     of explicit rejections, but more work could be done, e.g. by using a
>> >     fuzzer that inserts random subtle bugs into existing code. We're
>> >     open to help in this area.)"
>> >     https://www.python.org/dev/peps/pep-0617/#validation
>>
>>
>> --
>> Terry Jan Reedy