New submission from Vivian D <[email protected]>:
Here are the steps that I went through to test my regular expressions in my
command prompt (a file attachment shows this as well). I am using Windows 11,
version 21H2:
>>> import re
>>> regex = r"(((\w)+\w*\3){2}|(\w)+(?=\w*\4)\w*(?!\4)(\w)\w*\5)\w*"
>>> testString = "Alabama and Mississippi are next to each other"
>>> re.findall(regex,testString,re.IGNORECASE)
[('Mississipp', 'ipp', 'p', '', '')]
>>> testString = "alabama and Mississippi are next to each other"
>>> re.findall(regex,testString,re.IGNORECASE)
[('Mississipp', 'ipp', 'p', '', '')]
>>> regex = r"((\w)+\w*\2(\w)+\w*\3|(\w)+(?=\w*\4)\w*(?!\4)(\w)\w*\5)\w*"
>>> re.findall(regex,testString,re.IGNORECASE)
[('alabama', 'a', 'a', '', ''), ('Mississipp', 's', 'p', '', '')]
>>> testString = "Alabama and Mississippi are next to each other"
>>> re.findall(regex,testString,re.IGNORECASE)
[('Alabama', 'A', 'a', '', ''), ('Mississipp', 's', 'p', '', '')]
I created a regular expression to match any word with two sets of the same
vowel, including words with four of the same vowel, ignoring case. My first
regular expression, "(((\w)+\w*\3){2}|(\w)+(?=\w*\4)\w*(?!\4)(\w)\w*\5)\w*",
matched "Mississippi" but failed to match "Alabama", which it should have. To
rule out a case-sensitivity issue, I retested the regex with "alabama" instead
of "Alabama", but still got no match. Then I replaced the quantifier {2} with
a second copy of the expression it was supposed to repeat, renumbering the
backreferences to match. This gave me the regex
"((\w)+\w*\2(\w)+\w*\3|(\w)+(?=\w*\4)\w*(?!\4)(\w)\w*\5)\w*". This version
matched both "alabama" and "Alabama", as shown above, and continued to match
"Mississippi" as expected. This result seems to contradict my understanding of
regular expressions, because the only change I made was to write out
explicitly the expression that the quantifier was supposed to repeat twice.
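
For anyone who wants to reproduce this outside the interactive prompt, the
session above can be condensed into a short script. This is only a sketch: the
names QUANTIFIED, EXPANDED, matched_words, and compare are mine and are not
part of the re module, and the results for the quantified form may differ
between Python versions, so the script prints them rather than assuming them.

```python
import re

# The two patterns from the report. QUANTIFIED uses {2}; EXPANDED writes the
# repeated sub-expression out twice with renumbered backreferences.
QUANTIFIED = r"(((\w)+\w*\3){2}|(\w)+(?=\w*\4)\w*(?!\4)(\w)\w*\5)\w*"
EXPANDED = r"((\w)+\w*\2(\w)+\w*\3|(\w)+(?=\w*\4)\w*(?!\4)(\w)\w*\5)\w*"


def matched_words(pattern, text):
    """Return the full text of every match of pattern in text, ignoring case."""
    return [m.group(0) for m in re.finditer(pattern, text, re.IGNORECASE)]


def compare(text):
    """Run both patterns on the same text so the results can be compared."""
    return {
        "quantified": matched_words(QUANTIFIED, text),
        "expanded": matched_words(EXPANDED, text),
    }


if __name__ == "__main__":
    for sample in (
        "Alabama and Mississippi are next to each other",
        "alabama and Mississippi are next to each other",
    ):
        result = compare(sample)
        print(sample)
        print("  quantified:", result["quantified"])
        print("  expanded:  ", result["expanded"])
```

If the two patterns were equivalent, the "quantified" and "expanded" lists
would be identical for each sample string.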
----------
components: Library (Lib)
files: ComandPrompt.pdf
messages: 414668
nosy: vmd3.14
priority: normal
severity: normal
status: open
title: Quantifier and Expanded Regex Expression Gives Different Results
type: behavior
versions: Python 3.8
Added file: https://bugs.python.org/file50661/ComandPrompt.pdf
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue46945>
_______________________________________