[issue31856] Unexpected behavior of re module when VERBOSE flag is set

2017-10-23 Thread Bob Kline

Bob Kline added the comment:

The light finally comes on. I actually *was* putting a backslash into the 
string value, because of the raw string prefix (which is, of course, what you 
were trying to tell me). Thanks for your patience. :-)
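
To make the point above concrete, here is a minimal sketch (the variable names and the "abc" content are illustrative only, not taken from the attached repro file). It compares a plain and a raw triple-quoted string that both open with a backslash at the end of the first line:

```python
# Plain triple-quoted string: backslash-newline is a line continuation,
# so neither character ends up in the string value.
plain = """\
abc"""

# Raw triple-quoted string: the backslash and the newline are both kept.
raw = r"""\
abc"""

print(repr(plain))  # 'abc'
print(repr(raw))    # '\\\nabc'
print(len(plain), len(raw))  # 3 5
```

In the raw case the string value really does start with a backslash followed by a newline, and that is what the regular expression compiler then sees.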

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue31856] Unexpected behavior of re module when VERBOSE flag is set

2017-10-23 Thread Bob Kline

Bob Kline added the comment:

I had been under the impression that "escaped" in this context meant that an 
escape character (the backslash) was part of the string value for the regular 
expression (there's a little bit of overloading going on with that word). 
Thanks for setting me straight.




[issue31856] Unexpected behavior of re module when VERBOSE flag is set

2017-10-23 Thread Matthew Barnett

Matthew Barnett added the comment:

Your verbose examples put the pattern into raw triple-quoted strings, which is 
OK, but their first character is a backslash, which makes the next character (a 
newline) an escaped literal whitespace character. Escaped whitespace is 
significant in a verbose pattern.
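
The effect described above can be reproduced with a small sketch. The patterns and test strings here are hypothetical stand-ins, not the ones from regex-repro.py, which is not reproduced in this thread:

```python
import re

text = "a == b"

# Compact, non-verbose pattern.
compact = r"=="

# Verbose pattern in a raw triple-quoted string that opens with a backslash.
# The backslash escapes the newline after it, so the compiled pattern
# requires a literal "\n" immediately before "==".
leading_backslash = r"""\
    ==   # equality operator
"""

# The same verbose pattern without the leading backslash.
plain_verbose = r"""
    ==   # equality operator
"""

r_compact = re.findall(compact, text)
r_leading = re.findall(leading_backslash, text, re.VERBOSE)
r_plain = re.findall(plain_verbose, text, re.VERBOSE)
# With a newline in front of "==", the leading-backslash pattern matches.
r_newline = re.findall(leading_backslash, "a\n== b", re.VERBOSE)

print(r_compact)  # ['==']
print(r_leading)  # []
print(r_plain)    # ['==']
print(r_newline)  # ['\n==']
```

Dropping the leading backslash (or using a non-raw string, where backslash-newline is a line continuation) makes the verbose and compact forms agree.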

--
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed




[issue31856] Unexpected behavior of re module when VERBOSE flag is set

2017-10-23 Thread Bob Kline

New submission from Bob Kline:

According to the documentation of the re module, "When this flag [re.VERBOSE] 
has been specified, whitespace within the RE string is ignored, except when the 
whitespace is in a character class or preceded by an unescaped backslash; this 
lets you organize and indent the RE more clearly. This flag also lets you put 
comments within a RE that will be ignored by the engine; comments are marked by 
a '#' that’s neither in a character class [n]or preceded by an unescaped 
backslash." (I'm quoting from the 3.6.3 documentation, but I've tested with 
several versions of Python, as indicated in the issue's `Versions` field, all 
with the same results.)
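
As a quick illustration of the quoted rules, here is a minimal sketch with made-up patterns and inputs (none of them come from the attached repro code):

```python
import re

# Unescaped whitespace and the trailing comment are ignored,
# so this behaves like the pattern r"ab".
ignored = re.findall(r"a b  # comment", "ab a b", re.VERBOSE)

# Whitespace preceded by a backslash is significant.
escaped_space = re.findall(r"a\ b", "ab a b", re.VERBOSE)

# Whitespace inside a character class is significant.
class_space = re.findall(r"a[ ]b", "ab a b", re.VERBOSE)

# A '#' preceded by a backslash is a literal, not the start of a comment.
escaped_hash = re.findall(r"a\#b", "a#b", re.VERBOSE)

print(ignored)        # ['ab']
print(escaped_space)  # ['a b']
print(class_space)    # ['a b']
print(escaped_hash)   # ['a#b']
```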

Given this description, I would have expected the output for each of the pairs 
of calls to findall() in the attached repro code to be the same, but that is 
not what's happening. In the case of the first pair of calls, for example, the 
non-verbose version finds two more matches than the verbose version, even 
though the regular expression is identical for the two calls, ignoring 
whitespace and comments in the expression string. Similar problems appear with 
the other two pairs of calls.

Here's the output from the attached code:

['&', '(', '/Term/SemanticType/@cdr:ref', '==']
['/Term/SemanticType/@cdr:ref', '==']
[' XXX ']
[]
[' XXX ']
[]

It would seem that at least one of the following is true:

 1. the module is not behaving as it should
 2. the documentation is wrong
 3. I have not understood the documentation correctly

I'm happy for it to be #3, as long as someone can explain what I have not 
understood.

--
components: Library (Lib), Regular Expressions
files: regex-repro.py
messages: 304849
nosy: bkline, ezio.melotti, mrabarnett
priority: normal
severity: normal
status: open
title: Unexpected behavior of re module when VERBOSE flag is set
type: behavior
versions: Python 2.7, Python 3.5, Python 3.6
Added file: https://bugs.python.org/file47232/regex-repro.py
