On Sat, 29 Jan 2011 15:09:14 -0500, Jesse Rosenthal <jrosent...@jhu.edu> wrote:
> So BS is the best I could find for this job

No doubt. I once tried to scrape http://theeconomist.com. Its HTML was so broken that all the other parsers broke down. BeautifulSoup at least made it through and didn't completely fail, so I agree it is the best thing for HTML email, which is surely broken.

Sebastian
_______________________________________________
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch