Re: [Python-Dev] HTMLParser and HTML5

Glyph Lefkowitz Fri, 29 Jul 2011 11:06:19 -0700

On Jul 29, 2011, at 7:46 AM, Stefan Behnel wrote:

> Joao S. O. Bueno, 29.07.2011 13:22:
>> On Fri, Jul 29, 2011 at 1:37 AM, Stefan Behnel wrote:
>>> Brett Cannon, 28.07.2011 23:49:
>>>> 
>>>> On Thu, Jul 28, 2011 at 11:25, Matt wrote:
>>>>> 
>>>>> - What policies are in place for keeping parity with other HTML
>>>>> parsers (such as those in web browsers)?
>>>> 
>>>> There aren't any beyond "it would be nice".
>>>> [...]
>>>> It's more of an issue of someone caring enough to do the coding work to
>>>> bring the parser up to spec for HTML5 (or introduce new code to live
>>>> beside the HTML4 parsing code).
>>> 
>>> Which, given that html5lib readily exists, would likely be a lot more work
>>> than anyone who is interested in HTML5 handling would want to invest.
>>> 
>>> I don't think we need a new HTML5 parsing implementation only to have it in
>>> the stdlib. That's the old sunny Java way of doing it.
>> 
>> I disagree.
>> Having proper html parsing out of the box is part of the "batteries
>> included" thing.
> 
> Well, you can easily prove me wrong by implementing this.
> 
> Stefan


Please don't implement this just to prove Stefan wrong :).

The thing to do, if you want HTML parsing in the stdlib, is to _incorporate_
html5lib, which is already a perfectly good, thoroughly tested HTML parser, and
simply deprecate HTMLParser and friends.  Implementing a new parser would serve
no purpose I can see.

-glyph
