New submission from Greg Baker <ggba...@sfu.ca>:

I believe what I'm seeing here is somewhat related to issue 670664, but it is easier to handle because of the CDATA structure. Basically, HTMLParser doesn't recognize CDATA sections at all, so their content is incorrectly parsed as if it were ordinary markup.
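To illustrate why the CDATA structure makes this tractable, here is a rough sketch of locating CDATA sections by their fixed delimiters. This is not HTMLParser's code or a proposed patch: `CDATA_RE` and `cdata_sections` are hypothetical names used only for this illustration, and the sketch assumes sections never nest.

```python
import re

# Hypothetical helper, not part of HTMLParser: a CDATA section starts at
# "<![CDATA[" and ends at the first "]]>" that follows, so a non-greedy
# match is enough to find its literal content.
CDATA_RE = re.compile(r"<!\[CDATA\[(.*?)\]\]>", re.DOTALL)


def cdata_sections(text):
    """Return the literal content of each <![CDATA[ ... ]]> section."""
    return [match.group(1) for match in CDATA_RE.finditer(text)]


sample = """<script type="text/javascript">
//<![CDATA[
function foo() { document.write('"></' + 'script>');}
//]]>
</script>"""

sections = cdata_sections(sample)
```

Everything between the two delimiters, including the `'"></'` fragment, comes back as one literal string, which is how a parser that recognized CDATA sections would presumably need to treat it.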
The following is an attempt to parse (a snippet of) valid XHTML, but it raises an HTMLParseError.

data = """<script type="text/javascript">
//<![CDATA[
function foo() { document.write('"></' + 'script>');}
//]]>
</script>"""

from HTMLParser import HTMLParser
parser = HTMLParser()
parser.feed(data)

----------
components: Library (Lib)
messages: 93905
nosy: ggbaker
severity: normal
status: open
title: HTMLParser doesn't handle <![CDATA[ ... ]]>
type: behavior
versions: Python 2.6

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue7114>
_______________________________________