On Jul 29, 2011, at 7:46 AM, Stefan Behnel wrote:

> Joao S. O. Bueno, 29.07.2011 13:22:
>> On Fri, Jul 29, 2011 at 1:37 AM, Stefan Behnel wrote:
>>> Brett Cannon, 28.07.2011 23:49:
>>>>
>>>> On Thu, Jul 28, 2011 at 11:25, Matt wrote:
>>>>>
>>>>> - What policies are in place for keeping parity with other HTML
>>>>> parsers (such as those in web browsers)?
>>>>
>>>> There aren't any beyond "it would be nice".
>>>> [...]
>>>> It's more of an issue of someone caring enough to do the coding work to
>>>> bring the parser up to spec for HTML5 (or introduce new code to live
>>>> beside the HTML4 parsing code).
>>>
>>> Which, given that html5lib readily exists, would likely be a lot more work
>>> than anyone who is interested in HTML5 handling would want to invest.
>>>
>>> I don't think we need a new HTML5 parsing implementation only to have it in
>>> the stdlib. That's the old sunny Java way of doing it.
>>
>> I disagree.
>> Having proper html parsing out of the box is part of the "batteries
>> included" thing.
>
> Well, you can easily prove me wrong by implementing this.
>
> Stefan

Please don't implement this just to prove Stefan wrong :).

The thing to do, if you want html parsing in the stdlib, is to _incorporate_ html5lib, which is already a perfectly good, thoroughly tested HTML parser, and simply deprecate HTMLParser and friends. Implementing a new parser would serve no purpose I can see.

-glyph