Our experience with automatic testing is that, unfortunately, it is very difficult to surface real problems with it. We tried some of the new experimental source generators built on top of Hypothesis (https://pypi.org/project/hypothesmith/) and, sadly, they could not catch many of the important problems that parsing existing source code or test suites exposed immediately. This is because the grammar space is infinite, so exploring it for parser failures without any constraint on what to explore, or without knowledge of the parser's actual structure, won't get you far. If you consider the collection of programs that do not belong to the language, that space is even larger.
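For reference, the shape of the property-based test we tried looks roughly like this (a minimal sketch, not the exact test we ran; the settings and test name are illustrative):

from hypothesis import given, settings
from hypothesmith import from_grammar

# hypothesmith generates source strings that are meant to be valid Python,
# so the check here is simply that the parser/compiler accepts them without
# crashing or raising an unexpected error.
@settings(max_examples=500, deadline=None)
@given(source=from_grammar())
def test_generated_source_compiles(source):
    compile(source, "<string>", "exec")

Run under pytest (or by calling the decorated function directly), this will find crashes, but as said above it rarely wanders into the interesting corners of the grammar, such as the f-string case mentioned below.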
For example, we had a bug at some point that manifested only when an f-string reached a specific nesting depth, even though ordinary f-strings parsed just fine. Catching that automatically, without knowing what to look for, is very unlikely. Since one cannot prove that two parsers accept the same language, or that two grammars are equivalent (the problem is undecidable, like the halting problem), this is normally addressed by testing both parsers against a sufficiently large corpus of both positive and negative cases. To improve our confidence on the negative cases, we would first need a good enough corpus to test against (see the sketch at the end of this message for the kind of corpus-driven check we have in mind).

On Thu, 8 Oct 2020, 20:49 Daniel Moisset, <dfmois...@gmail.com> wrote:

> In this case, you can use the old parser as an oracle, at least for Python
> 3.8 syntax. The new parser should produce a syntax error if and only if the
> old one does. And if it doesn't, the AST should be the same, I guess (I'm
> not sure if the AST structure changed).
>
> On Thu, 8 Oct 2020, 03:12 Terry Reedy, <tjre...@udel.edu> wrote:
>
>> On 10/6/2020 2:02 PM, Guido van Rossum wrote:
>> > That's appreciated, but I think what's needed more is someone who
>> > actually wants to undertake this project. It's not just a matter of
>> > running a small script for hours -- someone will have to come up with a
>> > way to fuzz that is actually useful for this particular situation
>> > (inserting random characters in files isn't going to be very effective).
>>
>> Changes should be by token or broader grammatical construct. However,
>> the real difficulty in auto testing is the lack of a decision mechanism
>> for correct output (code object versus SyntaxError) other than the tests
>> being either designed or checked by parser experts.
>>
>> Classical fuzzing looks for something clearly wrong -- a crash -- rather
>> than an answer *or* an Exception. So yes, random fuzzing that does not
>> pass known limits could be done to look for crashes. But this is different
>> from raising SyntaxError to reject wrong programs.
>>
>> Consider unary prefix operators:
>>
>> *a is a SyntaxError, because the grammar circumscribes the use of '*' as
>> a prefix.
>>
>> -'' is not, which might surprise some, but I presume the error not being
>> caught until runtime, as a TypeError, is correct for the grammar as
>> written. Or did I just discover a parser bug? Or a possible grammar
>> improvement? (In other words, if I, even with my experience, tried
>> grammar/parser fuzzing, I might be as much a nuisance as a help.)
>>
>> It would not necessarily be a regression if the grammar and parser were
>> changed so that an obvious error like "- <string literal>" were to be
>> caught as a SyntaxError.
>>
>> > "(One area we have not explored extensively is rejection of all
>> > wrong programs.
>>
>> I consider false rejection to be a bigger sin than false acceptance.
>> Wrong programs, like "-''", are likely to fail at runtime anyway. So
>> one could test acceptance of randomly generated correct but not
>> ridiculously big programs. But I guess compiling the stdlib and other
>> packages already pretty well covered this.
>>
>> > We have unit tests that check for a certain number
>> > of explicit rejections, but more work could be done, e.g. by using a
>> > fuzzer that inserts random subtle bugs into existing code.
>> > We're open to help in this area.)"
>> > https://www.python.org/dev/peps/pep-0617/#validation
>>
>> --
>> Terry Jan Reedy
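Coming back to the corpus question: the kind of check we would like to run looks roughly like the sketch below. It is a minimal illustration only; the two corpora are tiny stand-ins, the helper name is made up, and in a real setup the old parser (as Daniel suggests) or a pinned reference implementation would supply the expected answers for a much larger body of code.

import ast

# Programs the parser must accept. Note that -'' belongs here: it is
# syntactically valid and only fails at runtime with a TypeError, exactly
# as discussed above. The last entry is a (shallow) nested f-string, the
# area where we hit the bug described earlier.
POSITIVE_CASES = [
    "x = 1",
    "-''",
    "f'{f\"{1}\"}'",
]

# Programs the parser must reject with a SyntaxError, such as a bare
# starred expression.
NEGATIVE_CASES = [
    "*a",
    "def f(:",
]

def check_corpus():
    for src in POSITIVE_CASES:
        ast.parse(src)  # must not raise
    for src in NEGATIVE_CASES:
        try:
            ast.parse(src)
        except SyntaxError:
            continue
        raise AssertionError(f"expected SyntaxError for {src!r}")

if __name__ == "__main__":
    check_corpus()
    print("corpus OK")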