[issue7114] HTMLParser doesn't handle <![CDATA[ ... ]]>

Greg Baker Mon, 12 Oct 2009 14:33:00 -0700

New submission from Greg Baker <ggba...@sfu.ca>:

I believe what I'm seeing here is related to issue 670664, but it should
be easier to handle because CDATA sections are explicitly delimited.
Basically, HTMLParser doesn't recognize CDATA sections at all, so their
content is incorrectly parsed as if it were ordinary HTML.


The following is an attempt to parse (a snippet of) valid XHTML, but it
raises an HTMLParseError.

data = """<script type="text/javascript">
//<![CDATA[
function foo() {
document.write('"></' + 'script>');}
//]]>
</script>"""

from HTMLParser import HTMLParser
parser = HTMLParser()
parser.feed(data)
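
For illustration only, here is a minimal sketch of what recognizing the CDATA delimiters could look like. It is not HTMLParser's code and not a proposed patch; the names CDATA_RE and split_cdata are hypothetical. It separates the input into CDATA content and everything else, so the CDATA content could be passed through untouched instead of being parsed as markup.

```python
import re

# Hypothetical helper, not part of HTMLParser.
# Matches one CDATA section; DOTALL lets the content span several lines.
CDATA_RE = re.compile(r"<!\[CDATA\[(.*?)\]\]>", re.DOTALL)


def split_cdata(text):
    """Yield (kind, chunk) pairs, where kind is 'cdata' or 'markup'."""
    pos = 0
    for match in CDATA_RE.finditer(text):
        if match.start() > pos:
            # Text before this CDATA section is ordinary markup.
            yield ("markup", text[pos:match.start()])
        # Content between the delimiters is literal character data.
        yield ("cdata", match.group(1))
        pos = match.end()
    if pos < len(text):
        yield ("markup", text[pos:])


data = """<script type="text/javascript">
//<![CDATA[
function foo() {
document.write('"></' + 'script>');}
//]]>
</script>"""

parts = list(split_cdata(data))
```

With the snippet from this report, the troublesome '"></' string ends up only in the 'cdata' chunk, so a parser that skipped such chunks would never try to interpret it as a tag.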

----------
components: Library (Lib)
messages: 93905
nosy: ggbaker
severity: normal
status: open
title: HTMLParser doesn't handle <![CDATA[ ... ]]>
type: behavior
versions: Python 2.6

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue7114>
_______________________________________
