New submission from AbcSxyZ <rossi....@outlook.com>:
This comes from a deprecated feature. Using Python 3.7.3.
Related to, and probably fixed by, https://bugs.python.org/issue31844. Reporting it just in case.

I've got 2 different related problems, the first one causing the second.

Using the linked file and this class:

```
from html.parser import HTMLParser

class LinkParser(HTMLParser):
    """ DOM parser to retrieve href of all <a> elements """

    def parse_links(self, html_content):
        self.links = []
        self.feed(html_content)
        return self.links

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attrs = {key.lower():value for key, *value in attrs}
            urls = attrs.get("href", None)
            if urls and urls[0]:
                self.links.append(urls[0])

    # def error(self, *args, **kwargs):
    #     pass

if __name__ == "__main__":
    with open("error.txt") as File:
        LinkParser().parse_links(File.read())
```

With the error method commented out, it produces:

```
  File "scanner/link.py", line 8, in parse_links
    self.feed(html_content)
  File "/usr/lib/python3.7/html/parser.py", line 111, in feed
    self.goahead(0)
  File "/usr/lib/python3.7/html/parser.py", line 179, in goahead
    k = self.parse_html_declaration(i)
  File "/usr/lib/python3.7/html/parser.py", line 264, in parse_html_declaration
    return self.parse_marked_section(i)
  File "/usr/lib/python3.7/_markupbase.py", line 159, in parse_marked_section
    self.error('unknown status keyword %r in marked section' % rawdata[i+3:j])
  File "/usr/lib/python3.7/_markupbase.py", line 34, in error
    "subclasses of ParserBase must override error()")
NotImplementedError: subclasses of ParserBase must override error()
```

If the error method does not raise anything, using only pass, it produces:

```
  File "/home/simon/Documents/radio-parser/scanner/link.py", line 8, in parse_links
    self.feed(html_content)
  File "/usr/lib/python3.7/html/parser.py", line 111, in feed
    self.goahead(0)
  File "/usr/lib/python3.7/html/parser.py", line 179, in goahead
    k = self.parse_html_declaration(i)
  File "/usr/lib/python3.7/html/parser.py", line 264, in parse_html_declaration
    return self.parse_marked_section(i)
  File "/usr/lib/python3.7/_markupbase.py", line 160, in parse_marked_section
    if not match:
UnboundLocalError: local variable 'match' referenced before assignment
```

Here the `match` variable is never assigned when `self.error` is called, and because `error` does not raise an exception, execution falls through to `if not match:` and raises UnboundLocalError:

```
    def parse_marked_section(self, i, report=1):
        rawdata= self.rawdata
        assert rawdata[i:i+3] == '<![', "unexpected call to parse_marked_section()"
        sectName, j = self._scan_name( i+3, i )
        if j < 0:
            return j
        if sectName in {"temp", "cdata", "ignore", "include", "rcdata"}:
            # look for standard ]]> ending
            match= _markedsectionclose.search(rawdata, i+3)
        elif sectName in {"if", "else", "endif"}:
            # look for MS Office ]> ending
            match= _msmarkedsectionclose.search(rawdata, i+3)
        else:
            self.error('unknown status keyword %r in marked section' % rawdata[i+3:j])
        if not match:
            return -1
        if report:
            j = match.start(0)
            self.unknown_decl(rawdata[i+3: j])
        return match.end(0)
```

----------
files: error.txt
messages: 374899
nosy: AbcSxyZ
priority: normal
severity: normal
status: open
title: HTMLParser : HTMLParser.error creating multiple errors.
type: crash
versions: Python 3.7

Added file: https://bugs.python.org/file49370/error.txt

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue41489>
_______________________________________
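For reference, the fall-through described in this report can be reproduced without html.parser at all. The following is only a minimal sketch: `parse_marked_section_sketch`, `silent_error`, `raising_error` and `outcome` are hypothetical names invented for illustration, and the string values stand in for the regex matches used by the real method. It mirrors the branch structure of `parse_marked_section` and shows that a non-raising `error` leaves `match` unbound.

```python
def parse_marked_section_sketch(sect_name, error):
    """Hypothetical stand-in mirroring the branches of parse_marked_section."""
    if sect_name in {"temp", "cdata", "ignore", "include", "rcdata"}:
        match = "]]>"   # placeholder for the standard-ending regex match
    elif sect_name in {"if", "else", "endif"}:
        match = "]>"    # placeholder for the MS Office-ending regex match
    else:
        # `match` is never bound on this path; only an exception raised by
        # error() stops execution before the next statement.
        error("unknown status keyword %r in marked section" % sect_name)
    if not match:
        return -1
    return match


def silent_error(message):
    """Behaves like an error() override whose body is just `pass`."""


def raising_error(message):
    """Hypothetical override that raises, so the parse problem is visible."""
    raise ValueError(message)


def outcome(sect_name, error):
    """Return the result, or the name of the exception that was raised."""
    try:
        return parse_marked_section_sketch(sect_name, error)
    except Exception as exc:
        return type(exc).__name__
```

With these definitions, `outcome("bogus", silent_error)` reports `UnboundLocalError`, while `outcome("bogus", raising_error)` reports `ValueError`, which suggests that an `error` override should raise rather than pass silently, assuming the caller wants to detect malformed input.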