Re: Beautiful Soup - close tags more promptly?

Tim Delaney Tue, 25 Oct 2022 11:02:12 -0700

On Mon, 24 Oct 2022 at 19:03, Chris Angelico <[email protected]> wrote:


>
> Ah, cool. Thanks. I'm not entirely sure of the various advantages and
> disadvantages of the different parsers; is there a tabulation
> anywhere, or at least a list of recommendations on choosing a suitable
> parser?
>

Coming to this a bit late, but from my experience with BeautifulSoup and
HTML produced by other people ...

lxml is easily the fastest, but also the least forgiving.
html.parser is middling on performance, but as you've seen, it sometimes
makes mistakes.
html5lib is the slowest, but is most forgiving of malformed input and edge
cases.

I use html5lib - it's fast enough for what I do, and because it parses
pages the way a web browser does, it's the most likely to return results
matching what the author saw when they maybe tried it in a single web
browser.
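
If you want to see the differences for yourself, something like the rough
sketch below should do. It feeds the same sloppy markup to each parser and
prints how each one repairs it. MALFORMED and parse_with_each are names I
made up for illustration, and it assumes bs4 is installed; lxml and html5lib
are separate packages, so any that are missing are simply skipped.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Deliberately sloppy markup: unclosed <p> and <li> tags, no <html> or <body>.
# (Hypothetical sample input, not taken from any real page.)
MALFORMED = "<p>First<p>Second<ul><li>one<li>two</ul>"


def parse_with_each(markup, parsers=("lxml", "html.parser", "html5lib")):
    """Return {parser_name: repaired markup} for each parser that is installed."""
    results = {}
    for name in parsers:
        try:
            soup = BeautifulSoup(markup, name)
        except FeatureNotFound:
            # lxml and html5lib are third-party packages and may be absent.
            continue
        results[name] = str(soup)
    return results


if __name__ == "__main__":
    for name, repaired in parse_with_each(MALFORMED).items():
        print(f"{name:12} {repaired}")
```

The repaired trees will generally not be identical, so diffing the output on
a sample of your own input is probably the quickest way to pick a parser.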

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list
