Barry Scott writes:

> I'm not so sure that a "universal parsing library" is possible for
> the stdlib.

That shouldn't be our goal.  (And I don't think Nam is wedded to that
expression of the goal.)

> I think one way you could find out what the requirements are is to
> refactor at least 2 of the existing stdlib modules that you have
> identified as needing a better parser.

I think this is a really good idea.  I'll be sprinting on Mailman at
PyCon, but if Nam and other proponents have time around PyCon (and
haven't done it already :-) I'll be able to make time then.  Feel free
to ping me off-list.  (Meeting at PyCon would be a bonus, but IRC or
SNS messaging/whiteboarding works for me too if other interested folks
can't be there.)

> Did you find that you could use the same parser code for both?

I think it highly likely that "enough" protocols and "little
languages" that are normally written by machines (or skilled
programmers) can be handled by "Dragon Book"[1] parsers to make it
worth adding some parsing library to the stdlib.  Of course, more
general (but still efficient) options have been developed since I last
shaved a yacc, but that's not the point.  Developers who have special
needs (extremely efficient parsing of a relatively simple grammar,
more general grammars) or simply want to continue using a different
module that they've been requiring from the Cheese Shop since it was
called "the Cheese Shop"[2] can (and should) do that.  The point of
the stdlib is to provide standard batteries that serve in common
situations going forward.

I've been using regexps since 1980, and am perfectly comfortable with
rather complex expressions.  Eg, I've written more or less general
implementations of RFC 3986 and its predecessor RFC 2396, which is one
of the examples Nam has tried.  Nevertheless, there are some tricky
aspects (for example, I did *not* try to implement 3986 in one
expression -- as 3986 says:

    These restrictions result in five different ABNF rules for a path
    (Section 3.3), only one of which will match any given URI
    reference.

so I used multiple, mutually exclusive regexps for the different
productions).  There is no question in my mind that the ABNF is easier
to read.  Implementing a set of regexps from the ABNF is easier than
reconstructing the ABNF from the regexps.

That's *my* rationale for including a parsing module in the stdlib:
making common parsing tasks more reliable in implementation and more
maintainable.

To me, the real question is, "Suppose we add a general parsing library
to the stdlib, and refactor some modules to use it.

(1) Will this 'magically' fix some bugs/RFEs?  (Not essential, but
    would be a nice bonus.)

(2) Will the denizens of python-ideas and python-dev find such
    refactored modules readable and more maintainable than a plethora
    of ad hoc recursive descent parsers?"

Obviously people who haven't studied parsers will have to project to a
future self that has become used to reading grammar descriptions, but
I think folks around here are used to that kind of projection.

This would be a good test.

Footnotes:
[1]  "Do I date myself?  Very well then, I date myself."

[2]  See [1].
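
Illustration: a minimal sketch of the "one regexp per production" idea
for the RFC 3986 path rules discussed above.  This is not the
implementation mentioned in this message; the names PCHAR, PATH_RULES
and matching_rules are hypothetical, chosen only for this example.
The character classes are transcribed from the RFC 3986 ABNF.  Note
that, taken as bare strings, a path can match more than one rule; in
the RFC the choice among them is settled by context (for instance
whether an authority component is present), which this sketch does
not model.

```python
import re

# Building blocks transcribed from the RFC 3986 ABNF.
UNRESERVED = r"[A-Za-z0-9\-._~]"
PCT_ENCODED = r"%[0-9A-Fa-f]{2}"
SUB_DELIMS = r"[!$&'()*+,;=]"
# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
PCHAR = rf"(?:{UNRESERVED}|{PCT_ENCODED}|{SUB_DELIMS}|[:@])"
# Same as pchar but without ":" (used by segment-nz-nc).
PCHAR_NC = rf"(?:{UNRESERVED}|{PCT_ENCODED}|{SUB_DELIMS}|@)"

SEGMENT = rf"{PCHAR}*"          # segment       = *pchar
SEGMENT_NZ = rf"{PCHAR}+"       # segment-nz    = 1*pchar
SEGMENT_NZ_NC = rf"{PCHAR_NC}+" # segment-nz-nc = 1*( pchar except ":" )

# One regexp per ABNF production, kept separate on purpose.
PATH_RULES = {
    "path-abempty": rf"(?:/{SEGMENT})*",
    "path-absolute": rf"/(?:{SEGMENT_NZ}(?:/{SEGMENT})*)?",
    "path-noscheme": rf"{SEGMENT_NZ_NC}(?:/{SEGMENT})*",
    "path-rootless": rf"{SEGMENT_NZ}(?:/{SEGMENT})*",
    "path-empty": r"",
}
COMPILED = {name: re.compile(rx) for name, rx in PATH_RULES.items()}


def matching_rules(path):
    """Return the names of the path productions that match all of `path`."""
    return [name for name, rx in COMPILED.items() if rx.fullmatch(path)]


if __name__ == "__main__":
    for sample in ["", "/a/b", "//a", "a:b", "a/b:c", "a b"]:
        print(repr(sample), matching_rules(sample))
```

Each entry in PATH_RULES can be checked against the corresponding
ABNF line by eye, which is the readability argument made above; going
the other way, from a single combined regexp back to the grammar,
would be much harder.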