Luke Kenneth Casson Leighton <l...@lkcl.net> added the comment:
regular expressions are not something i am familiar with or comfortable with (never have been: the patterns are too dense). however REMOVING "Bracket" from the regular expression(s) for PseudoToken "fixes" the problem.

some debug print statements dropped in at around line 640 of tokenize.py show that the match on the "working" code, with r"\(") as input, gives a start/end/spos/epos that is DIFFERENT from when the same code is given just "\(")

    line 'r"\\(")\n'
    pos 0 7 r <_sre.SRE_Match object; span=(0, 5), match='r"\\("'>
    pseudo start/end 0 5 (2, 0) (2, 5)

vs

    line '"\\(")\n'
    pos 0 6 " <_sre.SRE_Match object; span=(0, 4), match='"\\("'>
    pseudo start/end 0 4 (5, 0) (5, 4)

there *may* be a way to "fix" this by taking out the pattern matching on Bracket and prioritising everything else.

        while pos < max:
            pseudomatch = _compile(PseudoToken).match(line, pos)
            print ("pos", pos, max, line[pos], pseudomatch)
            if pseudomatch:                                # scan for tokens
                start, end = pseudomatch.span(1)
                spos, epos, pos = (lnum, start), (lnum, end), end
                print ("pseudo start/end", start, end, spos, epos)
                if start == end:
                    continue

    Bracket = '[][(){}]'
    Special = group(r'\r?\n', r'\.\.\.', r'[:;.,@]')
    # REMOVE Bracket
    Funny = group(Operator, Special)

    PlainToken = group(Number, Funny, String, Name)
    Token = Ignore + PlainToken

    # First (or only) line of ' or " string.
    ContStr = group(StringPrefix + r"'[^\n'\\]*(?:\\.[^\n'\\]*)*" +
                    group("'", r'\\\r?\n'),
                    StringPrefix + r'"[^\n"\\]*(?:\\.[^\n"\\]*)*' +
                    group('"', r'\\\r?\n'))
    PseudoExtras = group(r'\\\r?\n|\Z', Comment, Triple)
    PseudoToken = Whitespace + group(PseudoExtras, Number, Funny, ContStr, Name)

----------
nosy: -serhiy.storchaka

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue34428>
_______________________________________
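
for anyone who wants to poke at the span values without patching tokenize.py, here is a rough standalone sketch. it is NOT the real PseudoToken: StringPrefix is simplified, only the double-quote half of ContStr is kept, and Operator, Number and PseudoExtras are left out. pseudo_token() and first_span() are made-up helper names for illustration, and with_bracket is a made-up switch that mimics leaving Bracket in or taking it out.

```python
import re

def group(*choices):
    # same idea as the group() helper used in the patterns above
    return '(' + '|'.join(choices) + ')'

Whitespace = r'[ \f\t]*'
Name = r'\w+'
Bracket = '[][(){}]'
Special = group(r'\r?\n', r'\.\.\.', r'[:;.,@]')

# ASSUMPTION: cut-down prefix list, not the full set tokenize.py generates
StringPrefix = r'(?:[rRbBuU]|[rR][bB]|[bB][rR])?'

# first (or only) line of a " string (single-quote variant omitted here)
ContStr = (StringPrefix + r'"[^\n"\\]*(?:\\.[^\n"\\]*)*' +
           group('"', r'\\\r?\n'))

def pseudo_token(with_bracket=True):
    # hypothetical helper: build a reduced pseudo-token pattern,
    # optionally leaving Bracket out of the "Funny" alternatives
    funny = group(Bracket, Special) if with_bracket else group(Special)
    return re.compile(Whitespace + group(funny, ContStr, Name))

def first_span(line, pos=0, with_bracket=True):
    # hypothetical helper: same span(1) lookup as the debug prints
    m = pseudo_token(with_bracket).match(line, pos)
    return m.span(1) if m else None

if __name__ == "__main__":
    for line in ('r"\\(")\n', '"\\(")\n'):
        print(repr(line), first_span(line))
```

in this reduced model the two inputs give spans (0, 5) and (0, 4), the same as the debug output. note also that with Bracket taken out, nothing in this reduced pattern matches the ")" at pos 5 of the first line, so first_span() returns None there. whether the full pattern set behaves the same way is not something this sketch can show.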