New submission from halfjuice <halfju...@gmail.com>:

When parsing HTML that contains the following tag:

...
<!- ie6 doesn't allow empty div. ->
...

SGMLParser stops parsing the content that follows, without any warning. When the tag is removed, everything works fine.
Looking into sgmllib.py, I found the statement below:

    if rawdata.startswith("<!", i):
        # This is some sort of declaration; in "HTML as
        # deployed," this should only be the document type
        # declaration ("<!DOCTYPE html...>").

I think this is why it goes wrong: the tag starts with "<!", so it appears to be handled as a declaration rather than as a comment.

----------
components: Library (Lib)
messages: 118048
nosy: halfjuice
priority: normal
severity: normal
status: open
title: sgmllib fail to parse html containing <!- .... ->
type: behavior
versions: Python 2.6

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10035>
_______________________________________
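As a rough illustration of the check quoted above, here is a minimal sketch of how prefix tests of this kind would sort the two spellings of the tag. It is not the sgmllib code: classify_markup is a hypothetical helper, and the assumption that a "<!--" test runs before the "<!" test is mine. It only shows that "<!-" matches "<!" without matching "<!--".

```python
def classify_markup(rawdata, i):
    """Label the markup that starts at rawdata[i].

    Hypothetical helper for illustration only; it imitates a chain of
    prefix checks and is not taken from sgmllib.py.
    """
    if rawdata.startswith("<!--", i):
        # A well-formed comment opener.
        return "comment"
    if rawdata.startswith("<!", i):
        # The check quoted in the report: anything else beginning with
        # "<!" is assumed to be a declaration such as <!DOCTYPE html...>.
        return "declaration"
    if rawdata.startswith("<", i):
        return "tag"
    return "text"


well_formed = "<!-- ie6 doesn't allow empty div. -->"
malformed = "<!- ie6 doesn't allow empty div. ->"

print(classify_markup(well_formed, 0))  # comment
print(classify_markup(malformed, 0))    # declaration
```

Under these assumptions the malformed tag lands in the declaration branch, which would fit the behaviour described in the report.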