On Tue, Apr 24, 2012 at 14:34, Éric Araujo <mer...@netwok.org> wrote:
> On 24/04/2012 15:02, Georg Brandl wrote:
>> On 24.04.2012 20:34, Benjamin Peterson wrote:
>>> 2012/4/24 Georg Brandl <g.bra...@gmx.net>:
>>>> I think that's misleading: there's no way to "correctly" parse malformed
>>>> HTML.
>>>
>>> There is in the sense that you can follow the HTML5 algorithm, which
>>> can "parse" any junk you throw at it.
>>
>> Ah, good. Then I hope we are following the algorithm here (and are slowly
>> coming to use it for htmllib in general).
>
> Yes, Ezio’s commits on html.parser/HTMLParser in the last months have been
> following the HTML5 spec. Ezio, RDM and I have had some discussion about
> that on some bug reports, IRC and private mail and reached the agreement to
> do the useful thing, that is, follow HTML5 and not pretend that the stdlib
> parser is strict or validating.
>
> Ezio was thinking about a blog.python.org post to advertise this.

Please do this, and I welcome anyone else who wants to write about their
work on the blog to do so. Contact me for info.
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com