Vlastimil Brom <[email protected]> added the comment:
Thanks for the update.
Just a small observation regarding character ranges combined with ignorecase;
it is probably irrelevant, but it is a difference from the current re anyway:
>>> zero2z = u"0123456789:;<=>?...@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz"
>>> re.findall("(?i)[X-d]", zero2z)
[]
>>> regex.findall("(?i)[X-d]", zero2z)
[u'A', u'B', u'C', u'D', u'X', u'Y', u'Z', u'[', u'\\', u']', u'^', u'_', u'`',
u'a', u'b', u'c', u'd', u'x', u'y', u'z']
>>> re.findall("(?i)[B-d]", zero2z)
[u'B', u'C', u'D', u'b', u'c', u'd']
>>> regex.findall("(?i)[B-d]", zero2z)
[u'A', u'B', u'C', u'D', u'E', u'F', u'G', u'H', u'I', u'J', u'K', u'L', u'M',
u'N', u'O', u'P', u'Q', u'R', u'S', u'T', u'U', u'V', u'W', u'X', u'Y', u'Z',
u'[', u'\\', u']', u'^', u'_', u'`', u'a', u'b', u'c', u'd', u'e', u'f', u'g',
u'h', u'i', u'j', u'k', u'l', u'm', u'n', u'o', u'p', u'q', u'r', u's', u't',
u'u', u'v', u'w', u'x', u'y', u'z']
It seems that the re module builds the character set using a case
insensitive "alphabet" in some way: with [B-d] it returns only the letters
B-D in both cases, and with [X-d] it returns nothing at all.
I guess the behaviour of re is buggy here, while regex is OK (tested on
Python 2.7, Win XP).
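
For reference, here is a minimal sketch of the behaviour I would expect from a case-insensitive range. It is only an illustration written for Python 3; the helper name icase_range_matches is made up for this example, and it is not how either re or regex builds its character sets. It expands the range by code point, adds the other-case form of every member, and filters the input against that set. The test string is rebuilt from code points "0" through "z", so it does not contain the "..." from the session above.

```python
def icase_range_matches(lo, hi, text):
    """Return the characters of text matched by [lo-hi] when case is ignored.

    Hypothetical helper for illustration only; uses simple per-character
    upper()/lower() mapping, which is enough for the ASCII range discussed here.
    """
    # Every character whose code point lies inside the literal range.
    members = {chr(cp) for cp in range(ord(lo), ord(hi) + 1)}
    # Add the other-case form of each member, so 'X' also admits 'x', etc.
    folded = members | {c.lower() for c in members} | {c.upper() for c in members}
    # Keep the order in which the characters appear in the input.
    return [ch for ch in text if ch in folded]


# All characters from '0' to 'z' in code point order.
zero2z = "".join(chr(cp) for cp in range(ord("0"), ord("z") + 1))

print(icase_range_matches("X", "d", zero2z))
print(icase_range_matches("B", "d", zero2z))
```

Under these assumptions the two calls give the same characters as the regex results above: the letters A-D and X-Z in both cases plus the punctuation between "Z" and "a" for [X-d], and all letters plus that punctuation for [B-d].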
vbr
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue2636>
_______________________________________