On the other hand, while I love Beautiful Soup and used it for many years,
I have found that lxml is considerably faster for heavy-duty
scraping/extraction, so it may be worth considering depending on what
your particular needs are.

On 11 February 2012 21:00, Mike Orr <[email protected]> wrote:

> On Fri, Feb 10, 2012 at 12:49 PM, Jason <[email protected]> wrote:
> > BeautifulSoup: I use lxml.html for HTML parsing, but I haven't used
> > BeautifulSoup enough to know the difference.
>
> BeautifulSoup can handle bad HTML better. If your HTML files are
> coming from a third party and you can't control their quality, it may
> be better to use BeautifulSoup than get an exception. I have an
> application with an online glossary, and the glossary is maintained by
> another person using some visual tool. So when I get an update, I use
> BeautifulSoup to parse it.
>
> --
> Mike Orr <[email protected]>
>
> --
> You received this message because you are subscribed to the Google Groups
> "pylons-discuss" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/pylons-discuss?hl=en.
>
>

