On the other hand, while I love Beautiful Soup and have used it for many years, I have found that lxml is considerably faster for heavy-duty scraping/extraction, so it may be worth considering, depending on your particular needs.
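
For what it's worth, here is a minimal sketch of one way to combine the two: try lxml.html first for speed, and fall back to Beautiful Soup only if lxml refuses the input. The helper name extract_links is made up for illustration, and the fallback uses Python's built-in "html.parser" backend, which is an assumption on my part rather than anything from this thread. Note that lxml.html is itself fairly forgiving of sloppy markup, so in practice the fallback may only fire on hard failures such as an empty document.

```python
# Sketch only: prefer lxml for speed, fall back to Beautiful Soup on failure.
# extract_links is a hypothetical helper, not part of either library.
import lxml.html
from lxml import etree
from bs4 import BeautifulSoup


def extract_links(markup):
    """Return the href of every <a> element found in an HTML string."""
    try:
        # Fast path: lxml's HTML parser (libxml2 underneath).
        doc = lxml.html.fromstring(markup)
        return [a.get("href") for a in doc.iter("a") if a.get("href")]
    except (etree.ParserError, etree.XMLSyntaxError, ValueError):
        # Slow path: Beautiful Soup with the standard-library parser.
        soup = BeautifulSoup(markup, "html.parser")
        return [a["href"] for a in soup.find_all("a", href=True)]


if __name__ == "__main__":
    # Unclosed tags, as a visual editor might produce.
    print(extract_links('<p><a href="/one">one<a href="/two">two'))
```

Whether the extra dependency is worth it depends on how much you parse: for an occasional glossary update the speed difference probably does not matter, while for bulk extraction it might.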

On 11 February 2012 21:00, Mike Orr <[email protected]> wrote:
> On Fri, Feb 10, 2012 at 12:49 PM, Jason <[email protected]> wrote:
> > BeautifulSoup: I use lxml.html for HTML parsing, but I haven't used
> > BeautifulSoup enough to know the difference.
>
> BeautifulSoup can handle bad HTML better. If your HTML files are
> coming from a third party and you can't control their quality, it may
> be better to use BeautifulSoup than get an exception. I have an
> application with an online glossary, and the glossary is maintained by
> another person using some visual tool. So when I get an update, I use
> BeautifulSoup to parse it.
>
> --
> Mike Orr <[email protected]>
>
> --
> You received this message because you are subscribed to the Google Groups
> "pylons-discuss" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/pylons-discuss?hl=en.
