Hello,

Some columns in a DB contain badly formed HTML, to the point that BeautifulSoup (or lxml?) fails:
=============
from bs4 import BeautifulSoup

#Some records start with 0A</crap>
soup = BeautifulSoup("\n</strong>", 'lxml')
#AttributeError: 'NoneType' object has no attribute 'text'
print(soup.body.text)
=============

What would be a good way to solve the problem? Is there a command to remove wrong tags altogether (e.g. strings that start with </strong>), or should I just catch the error?

Thank you.
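For reference, one possible way to avoid the AttributeError without a try/except is to check whether the parser produced a body at all, and fall back to the soup object itself when it did not. This is only a minimal sketch: the helper name extract_text and its parser argument are made up for illustration, and it assumes that an empty string is an acceptable result for records that contain nothing but stray closing tags.

```python
from bs4 import BeautifulSoup


def extract_text(html, parser="lxml"):
    """Return the text of an HTML fragment, or "" if nothing usable was parsed.

    Hypothetical helper, not part of BeautifulSoup or lxml.
    """
    soup = BeautifulSoup(html, parser)
    # soup.body is None when the parser built no <body> element,
    # e.g. for input that consists only of stray closing tags.
    node = soup.body if soup.body is not None else soup
    # get_text() is safe to call on the soup object itself.
    return node.get_text().strip()
```

Calling extract_text("\n</strong>") should then return an empty string rather than raising. Passing parser="html.parser" uses the parser from the standard library instead of lxml; it never adds a <body> wrapper, so the fallback branch is the one that runs there.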