Re: [Python-Dev] cpython (2.7): #14538: HTMLParser can now parse correctly start tags that contain a bare /.

Éric Araujo Tue, 24 Apr 2012 12:37:07 -0700

On 24/04/2012 15:02, Georg Brandl wrote:

On 24.04.2012 20:34, Benjamin Peterson wrote:

2012/4/24 Georg Brandl<[email protected]>:

I think that's misleading: there's no way to "correctly" parse malformed HTML.

There is in the sense that you can follow the HTML5 algorithm, which
can "parse" any junk you throw at it.

Ah, good. Then I hope we are following the algorithm here (and are slowly
coming to use it for htmllib in general).

Yes, Ezio’s commits on html.parser/HTMLParser in the last months have
been following the HTML5 spec. Ezio, RDM and I have had some discussion
about that on some bug reports, IRC and private mail and reached the
agreement to do the useful thing, that is follow HTML5 and not pretend
that the stdlib parser is strict or validating.
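
For illustration, here is a minimal sketch of what "not strict" means in
practice, assuming the Python 3 html.parser module. TagCollector and
collect_start_tags are hypothetical names made up for this example, and
the sample markup is invented rather than taken from the bug report. The
start tag contains a bare / and the <p> is never closed, yet the parser
should still report both tags instead of raising an error.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical helper: records the name of every start tag seen."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        # Called for each start tag, including ones with stray slashes.
        self.start_tags.append(tag)


def collect_start_tags(markup):
    """Feed possibly malformed markup to the parser and return tag names."""
    parser = TagCollector()
    parser.feed(markup)
    parser.close()
    return parser.start_tags


# Malformed input: a bare "/" inside the <a> tag and an unclosed <p>.
print(collect_start_tags('<p><a href="x" / title="y">link</a>'))
```

The exact attribute list reported for such a tag may differ between
Python versions, so the sketch only looks at tag names.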


Ezio was thinking about a blog.python.org post to advertise this.

Regards
