Luke Kenneth Casson Leighton <l...@lkcl.net> added the comment:

regular expressions are not something i am familiar or comfortable
with (never have been: the patterns are too dense).  however REMOVING
"Bracket" from the regular expression(s) for PseudoToken "fixes"
the problem.

some debug print statements dropped in at around line 640 of
tokenize.py show that the match in the "working" code, given
r"\(") as input, yields start/end/spos/epos values that are DIFFERENT
from those produced when the same code is given just "\("):

line 'r"\\(")\n'
pos 0 7 r <_sre.SRE_Match object; span=(0, 5), match='r"\\("'>
pseudo start/end 0 5 (2, 0) (2, 5)

vs

line '"\\(")\n'
pos 0 6 " <_sre.SRE_Match object; span=(0, 4), match='"\\("'>
pseudo start/end 0 4 (5, 0) (5, 4)
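
the two spans above can be reproduced outside tokenize.py with a cut-down
pattern. this is only a sketch: Whitespace, StringPrefix, ContStr and Name
below are simplified stand-ins (assumptions), not the full definitions from
tokenize.py, and first_token_span is a hypothetical helper added for
illustration.

```python
import re


def group(*choices):
    """Join alternatives into one capturing group."""
    return '(' + '|'.join(choices) + ')'


# Simplified stand-ins (assumptions), NOT the real tokenize.py patterns.
Whitespace = r'[ \f\t]*'
StringPrefix = r'[rR]?'
ContStr = StringPrefix + r'"[^\n"\\]*(?:\\.[^\n"\\]*)*"'
Bracket = '[][(){}]'
Name = r'\w+'
PseudoToken = Whitespace + group(ContStr, Bracket, Name)


def first_token_span(line):
    """Hypothetical helper: span and text of group 1 for a match at pos 0."""
    match = re.compile(PseudoToken).match(line, 0)
    return match.span(1), match.group(1)


raw_span, raw_text = first_token_span('r"\\(")\n')
plain_span, plain_text = first_token_span('"\\(")\n')
print(raw_span, repr(raw_text))      # (0, 5) 'r"\\("'
print(plain_span, repr(plain_text))  # (0, 4) '"\\("'
```

in this cut-down version the end offsets differ by one character, which is
the length of the r prefix. whether that fully accounts for the difference
seen in the real tokenizer is not something this sketch can show.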

there *may* be a way to "fix" this by taking out the pattern
matching on Bracket and prioritising everything else.


        while pos < max:
            pseudomatch = _compile(PseudoToken).match(line, pos)
            # debug: current position, line length, character, raw match
            print("pos", pos, max, line[pos], pseudomatch)
            if pseudomatch:                                # scan for tokens
                start, end = pseudomatch.span(1)
                spos, epos, pos = (lnum, start), (lnum, end), end
                # debug: token span and its (row, col) positions
                print("pseudo start/end", start, end, spos, epos)
                if start == end:
                    continue

 
Bracket = '[][(){}]'
Special = group(r'\r?\n', r'\.\.\.', r'[:;.,@]')
# REMOVE Bracket (stock line: Funny = group(Operator, Bracket, Special))
Funny = group(Operator, Special)

PlainToken = group(Number, Funny, String, Name)
Token = Ignore + PlainToken

# First (or only) line of ' or " string.
ContStr = group(StringPrefix + r"'[^\n'\\]*(?:\\.[^\n'\\]*)*" +
                group("'", r'\\\r?\n'),
                StringPrefix + r'"[^\n"\\]*(?:\\.[^\n"\\]*)*' +
                group('"', r'\\\r?\n'))
PseudoExtras = group(r'\\\r?\n|\Z', Comment, Triple)
PseudoToken = Whitespace + group(PseudoExtras, Number, Funny, ContStr, Name)
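
to see what dropping Bracket from Funny does at the regex level, here is a
minimal sketch. Operator, Whitespace and Name are simplified stand-ins
(assumptions), and pseudo() is a hypothetical helper; only Bracket and
Special are copied from the definitions above.

```python
import re


def group(*choices):
    """Join alternatives into one capturing group."""
    return '(' + '|'.join(choices) + ')'


Whitespace = r'[ \f\t]*'                 # simplified stand-in
Name = r'\w+'                            # simplified stand-in
Operator = r'[+\-*/%&|^=<>]=?'           # simplified stand-in
Bracket = '[][(){}]'                     # as quoted above
Special = group(r'\r?\n', r'\.\.\.', r'[:;.,@]')  # as quoted above

funny_with = group(Operator, Bracket, Special)
funny_without = group(Operator, Special)


def pseudo(funny):
    """Hypothetical helper: build a cut-down PseudoToken-like pattern."""
    return re.compile(Whitespace + group(funny, Name))


line = 'r"\\(")\n'
pos = 5  # index of the trailing ')' that follows the string literal

with_bracket = pseudo(funny_with).match(line, pos)
without_bracket = pseudo(funny_without).match(line, pos)

print(with_bracket.span(1))   # (5, 6)
print(without_bracket)        # None
```

with Bracket removed, nothing in this cut-down pattern matches the bare ")",
so a loop like the one quoted above would fail its "if pseudomatch:" test at
that position. that suggests the "fix" changes how brackets are tokenized
rather than addressing the underlying cause.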

----------
nosy:  -serhiy.storchaka

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue34428>
_______________________________________