Paweł Widera <[email protected]> added the comment:
A simple workaround for the BeautifulSoup is the following wrapper. It
sanitize the javascript code before passing it to the parser by joining
the disjoint strings, so that "</scr"+"ipt>" becomes "</script>".
def bs(input):
pattern = re.compile('\"\+\"')
match = lambda x: ""
massage = copy.copy(BeautifulSoup.MARKUP_MASSAGE)
massage.extend([(pattern, match)])
return BeautifulSoup(input, markupMassage=massage)
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue670664>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe:
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com