[issue45956] Add scanf regular expressions to re

2021-12-01 Thread Maxwell Bernstein
Change by Maxwell Bernstein : -- resolution: -> rejected stage: patch review -> resolved status: open -> closed ___ Python tracker ___

[issue45956] Add scanf regular expressions to re

2021-12-01 Thread Eric V. Smith
Eric V. Smith added the comment: Maxwell: thank you for your contribution. I agree that these don’t belong in the re module. I think a personal library or something on PyPI (logically equivalent to more-itertools) would be more appropriate. I suggest closing this as rejected. --

[issue45956] Add scanf regular expressions to re

2021-12-01 Thread Raymond Hettinger
Raymond Hettinger added the comment: The scanf() translation table primarily serves as a way to learn regex syntax for people who only know scanf syntax. It would defeat the educational purpose to immortalize the translation as fixed constants. For the most part, people are better off
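For reference, the hand translation Raymond describes takes only a few lines. This is a minimal sketch, assuming the `%s` to `\S+` and `%d` to `[-+]?\d+` rows of the scanf table in the `re` documentation; the sample input line and the variable names are illustrative only.

```python
import re

# scanf format "%s - %d errors, %d warnings", translated by hand
# using the %s -> \S+ and %d -> [-+]?\d+ rows of the docs table.
pattern = re.compile(r"(\S+) - ([-+]?\d+) errors, ([-+]?\d+) warnings")

line = "/usr/sbin/sendmail - 0 errors, 4 warnings"  # illustrative input
match = pattern.search(line)
if match:
    program = match.group(1)
    errors = int(match.group(2))          # scanf's %d conversion, done explicitly
    warnings_count = int(match.group(3))
```

Unlike scanf(), the regex only extracts text; the int() conversions have to be written out.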

[issue45956] Add scanf regular expressions to re

2021-12-01 Thread Maxwell Bernstein
Change by Maxwell Bernstein : -- keywords: +patch pull_requests: +28111 stage: -> patch review pull_request: https://github.com/python/cpython/pull/29885

[issue45956] Add scanf regular expressions to re

2021-12-01 Thread Maxwell Bernstein
New submission from Maxwell Bernstein : The documentation for the `re` module suggests several regular expressions for use in simulating `scanf()`. Provide these directly in the `re` module. -- components: Library (Lib) messages: 407491 nosy: tekknolagi priority: normal severity

[issue45869] Unicode and acii regular expressions do not agree on ascii space characters

2021-11-23 Thread Oleg Iarygin
Change by Oleg Iarygin : -- nosy: +arhadthedev

[issue45869] Unicode and acii regular expressions do not agree on ascii space characters

2021-11-23 Thread Joran van Apeldoorn
Joran van Apeldoorn added the comment: Hi, I was not suggesting that the documentation literally says they should be the same, but it might be unexpected for users if ASCII characters change properties depending on whether they are considered in a Unicode or pure ASCII setting. The

[issue45869] Unicode and acii regular expressions do not agree on ascii space characters

2021-11-22 Thread Steven D'Aprano
Steven D'Aprano added the comment: In any case, any change to this would have to be limited to Python 3.11. It is not clearly a bug, so this would be an enhancement. -- type: behavior -> enhancement versions: -Python 3.10, Python 3.8, Python 3.9

[issue45869] Unicode and acii regular expressions do not agree on ascii space characters

2021-11-22 Thread Steven D'Aprano
Steven D'Aprano added the comment: Hi Joran, I'm not sure why you think that \s should agree between ASCII and Unicode. That seems like an unjustified assumption to me. You say: "The expectation would be that the re.A (or re.ASCII) flag should not impact the matching behavior of a regular

[issue45869] Unicode and acii regular expressions do not agree on ascii space characters

2021-11-22 Thread Matthew Barnett
Matthew Barnett added the comment: For comparison, the regex module says that 0x1C..0x1F aren't whitespace, and the Unicode property White_Space ("\p{White_Space}" in a pattern, where supported) also says that they aren't whitespace.
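The disagreement discussed in this thread can be reproduced with the standard `re` module. This is a minimal sketch of the behavior as reported in the issue; it uses U+001C as a representative of the 0x1C..0x1F range, and later Python versions could in principle behave differently.

```python
import re

ch = "\x1c"  # FILE SEPARATOR, one of the characters 0x1C..0x1F

# Default mode for str patterns: \s follows Unicode whitespace rules.
unicode_match = re.fullmatch(r"\s", ch)

# re.ASCII restricts \s to [ \t\n\r\f\v].
ascii_match = re.fullmatch(r"\s", ch, re.ASCII)

print(bool(unicode_match), bool(ascii_match), ch.isspace())
```

The same ASCII-range character is therefore whitespace in one mode and not in the other, which is the surprise the report describes.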

[issue45869] Unicode and acii regular expressions do not agree on ascii space characters

2021-11-22 Thread Joran van Apeldoorn
Joran van Apeldoorn added the comment: Small addition, the sre categories CATEGORY_LINEBREAK and CATEGORY_UNI_LINEBREAK also do not agree on ASCII characters. The first is only '\n' while the second also includes for example '\r' and some others. These do not seem to correspond to anything

[issue45869] Unicode and acii regular expressions do not agree on ascii space characters

2021-11-22 Thread Joran van Apeldoorn
Change by Joran van Apeldoorn : -- type: -> behavior

[issue45869] Unicode and acii regular expressions do not agree on ascii space characters

2021-11-22 Thread Joran van Apeldoorn
Expressions files: unicode-ascii-space.py messages: 406773 nosy: control-k, ezio.melotti, mrabarnett priority: normal severity: normal status: open title: Unicode and acii regular expressions do not agree on ascii space characters versions: Python 3.10, Python 3.11, Python 3.8, Python 3.9 Added file

[issue30349] Preparation for advanced set syntax in regular expressions

2021-09-21 Thread Philippe Ombredanne
Philippe Ombredanne added the comment: Sorry, my comment was at best nonsensical gibberish! I meant to say that this warning message should include the actual regex at fault; otherwise it is hard to fix when the regex in question comes from some data structure like a list; then the line

[issue30349] Preparation for advanced set syntax in regular expressions

2021-09-21 Thread Philippe Ombredanne
Philippe Ombredanne added the comment: FWIW, this warning is annoying because it is hard to fix when the regexes are sourced from data: the warning message does not include the regex at fault. It should; otherwise the warning is noisy and ineffective IMHO. -- nosy:

Re: issue with regular expressions

2019-10-22 Thread joseph pareti
Ok, thanks. It works for me. regards, Am Di., 22. Okt. 2019 um 11:29 Uhr schrieb Matt Wheeler : > > > On Tue, 22 Oct 2019, 09:44 joseph pareti, wrote: > >> the following code ends in an exception: >> >> import re >> pattern = 'Sottoscrizione unica soluzione' >> mylines = []

Re: issue with regular expressions

2019-10-22 Thread Matt Wheeler
On Tue, 22 Oct 2019, 09:44 joseph pareti, wrote: > the following code ends in an exception: > > import re > pattern = 'Sottoscrizione unica soluzione' > mylines = []# Declare an empty list. with open ('tmp.txt', 'rt') as myfile: # Open tmp.txt for reading

issue with regular expressions

2019-10-22 Thread joseph pareti
the following code ends in an exception: import re pattern = 'Sottoscrizione unica soluzione' mylines = []# Declare an empty list. with open ('tmp.txt', 'rt') as myfile: # Open tmp.txt for reading text. for myline in myfile: # For each

Re: regular expressions help

2019-09-20 Thread Barry Scott
When I'm debugging a regex I make the regex shorter and shorter to figure out what the problem is. Try starting with re.compile(r'm') and then add the chars one by one seeing what happens as the string gets longer. Barry > On 19 Sep 2019, at 09:41, Pradeep Patra wrote: > > I am using python

Re: regular expressions help

2019-09-19 Thread Chris Angelico
tead of passing my-dog i can pass my-cat or blah blah. I am thinking of > creating a list of probable combinations to search from the list. Anybody > have better ideas? > If you just want to find a string in another string, don't use regular expressions at all! Just ask Python dire
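Chris's suggestion, asking Python directly, can be sketched with plain string operations. The sample strings below are illustrative and not taken from the thread.

```python
haystack = "I took my-dog for a walk"  # illustrative text
needle = "my-dog"

# Membership test: True if needle occurs anywhere in haystack.
found = needle in haystack

# str.find returns the index of the first occurrence, or -1 if absent.
index = haystack.find(needle)
```

Because `needle` is an ordinary variable, the same two lines work for "my-cat" or any other literal string without building a pattern.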

Re: regular expressions help

2019-09-19 Thread Pradeep Patra
Thanks David / Anthony for your help. I figured out the issue myself. I don't need any ^, $ etc. in the regex pattern, and the plain string (for example my-dog) works fine. I am looking at creating a generic method so that instead of passing my-dog I can pass my-cat or blah blah. I am thinking of

Re: regular expressions help

2019-09-19 Thread David
On Thu, 19 Sep 2019 at 19:34, Pradeep Patra wrote: > Thanks David for your quick help. Appreciate it. When I tried on python 2.7.3 > the same thing you did below I got the error after matches.group(0) as > follows: > > AttributeError: NoneType object has no attribute 'group'. > > I tried to

Re: regular expressions help

2019-09-19 Thread Pradeep Patra
Thanks David for your quick help. Appreciate it. When I tried on python 2.7.3 the same thing you did below I got the error after matches.group(0) as follows: AttributeError: NoneType object has no attribute 'group'. I tried to check 'None' for no match for re.search as the documentation says but

Re: regular expressions help

2019-09-19 Thread David
On Thu, 19 Sep 2019 at 18:41, Pradeep Patra wrote: > On Thursday, September 19, 2019, Pradeep Patra > wrote: >> On Thursday, September 19, 2019, David wrote: >>> On Thu, 19 Sep 2019 at 17:51, Pradeep Patra >>> wrote: >>> > pattern=re.compile(r'^my\-dog$') >>> > matches = re.search(mystr)

Re: regular expressions help

2019-09-19 Thread Pradeep Patra
I am using python 2.7.6 but I also tried on python 3.7.3. On Thursday, September 19, 2019, Pradeep Patra wrote: > Beginning of the string. But I tried removing that as well and it still > could not find it. When I tested at www.regex101.com and it matched > successfully whereas I may be wrong.

Re: regular expressions help

2019-09-19 Thread David
On Thu, 19 Sep 2019 at 17:51, Pradeep Patra wrote: > > pattern=re.compile(r'^my\-dog$') > matches = re.search(mystr) > > In the above example both cases(match/not match) the matches returns "None" Hi, do you know what the '^' character does in your pattern? --
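A minimal sketch of what the quoted code was probably meant to do, assuming the goal stated in the original post (find "my-dog" anywhere in a string and return its index). The value of `mystr` is made up for illustration.

```python
import re

mystr = "this is my-dog, honest"  # illustrative input

# With ^ and $ the pattern only matches when the whole string is "my-dog".
anchored = re.compile(r"^my-dog$")
anchored_match = anchored.search(mystr)   # None for this input

# Without anchors the pattern can match anywhere in the string.
pattern = re.compile(r"my-dog")
match = pattern.search(mystr)             # call search() on the compiled pattern

# search() returns None when nothing matches, so check before using the result.
index = match.start() if match else -1
```

Checking for None before calling `.group()` or `.start()` avoids the AttributeError mentioned later in the thread.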

regular expressions help

2019-09-19 Thread Pradeep Patra
Hi all, I was playing around with regular expressions and testing a simple regular expression, and it's not working for some reason. I want to search for "my-dog" anywhere in a string and return the index, but it's not working. I tried both in python 3.7.3 and 2.7.x. Can anyone p

[issue37996] 2to3 introduces unwanted extra backslashes for unicode characters in regular expressions

2019-08-31 Thread Bob Kline
Bob Kline added the comment: In fact, I suppose it's possible that the warning as I worded it is still not restrictive enough, and that there are subtle dependencies between the fixers which would make the action of one of them render the code no longer safely fixable as Python 2 code by

[issue37996] 2to3 introduces unwanted extra backslashes for unicode characters in regular expressions

2019-08-31 Thread Bob Kline
Bob Kline added the comment: Thanks, I understand. However, this highlights something which had slipped under my radar. You get one shot at running a code set through the tool. You can't do what I was doing, which was to run the tool in "don't write" mode, then fix by hand some of the

[issue37996] 2to3 introduces unwanted extra backslashes for unicode characters in regular expressions

2019-08-31 Thread Ned Deily
Change by Ned Deily : -- resolution: -> not a bug stage: -> resolved status: open -> closed

[issue37996] 2to3 introduces unwanted extra backslashes for unicode characters in regular expressions

2019-08-31 Thread Matthew Barnett
Matthew Barnett added the comment: You wrote "the u had already been removed by hand". By removing the u in the _Python 2_ code, you changed that string from a Unicode string to a bytestring. In a bytestring, \u is not an escape; b"\u" == b"\\u". -- nosy: +mrabarnett
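The point about bytestrings can be checked in Python 3 as well. This is a small sketch that uses a raw bytes literal so no invalid-escape warning is triggered; the code point U+0041 is just an example.

```python
# In a str literal, \u0041 is an escape for a single character.
text = "\u0041"

# In a bytes literal, \u is not an escape, so the backslash is kept.
# A raw literal spells that out without relying on the fallback behavior.
raw = rb"\u0041"

same_as_doubled = raw == b"\\u0041"
lengths = (len(text), len(raw))
```

So removing the `u` prefix in Python 2 turned each `\uXXXX` from one character into six bytes, which is what 2to3 then preserved by doubling the backslash.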

[issue37996] 2to3 introduces unwanted extra backslashes for unicode characters in regular expressions

2019-08-31 Thread Bob Kline
Bob Kline added the comment: Ah, this is worse than I first thought. It's not just converting code by adding extra backslashes to regular expression strings, where at least the regular expression engine will do what the original code was asking the Python parser to do (unless user code

[issue37996] 2to3 introduces unwanted extra backslashes for unicode characters in regular expressions

2019-08-31 Thread Bob Kline
Bob Kline added the comment: The original string had u"""...""" and the u had already been removed by hand in preparation for moving to Python 3.

[issue37996] 2to3 introduces unwanted extra backslashes for unicode characters in regular expressions

2019-08-31 Thread Bob Kline
gine to handle. -- components: 2to3 (2.x to 3.x conversion tool) messages: 350922 nosy: bkline priority: normal severity: normal status: open title: 2to3 introduces unwanted extra backslashes for unicode characters in regular expressions type: behavior versions: Python 3.7 _

[issue35824] http.cookies._CookiePattern modifying regular expressions

2019-05-02 Thread daniel hahler
daniel hahler added the comment: It seems like http.cookiejar should be used for clients, which includes more relaxed parsing of cookies. This is mentioned in the docs at https://github.com/python/cpython/blame/443fe5a52a3d6a101795380227ced38b4b5e0a8b/Doc/library/http.cookies.rst#L63-L65.

[issue35824] http.cookies._CookiePattern modifying regular expressions

2019-04-25 Thread Martin Panter
Martin Panter added the comment: Test_http_cookies line 19 has the following test case: {'data': 'keebler="E=mc2; L=\\"Loves\\"; fudge=\\012;"', 'dict': {'keebler' : 'E=mc2; L="Loves"; fudge=\012;'}, 'repr': '', 'output': 'Set-Cookie: keebler="E=mc2; L=\\"Loves\\"; fudge=\\012;"'} This

[issue35824] http.cookies._CookiePattern modifying regular expressions

2019-04-23 Thread SilentGhost
Change by SilentGhost : -- nosy: +martin.panter, xtreak

[issue35824] http.cookies._CookiePattern modifying regular expressions

2019-04-22 Thread MeiK
MeiK added the comment: I found that using http.cookiejar.parse_ns_headers would cause some of the previous tests to fail, and if you think this method is workable, I can follow it to write a new one and pass all the tests. -- nosy: -martin.panter, xtreak

[issue35824] http.cookies._CookiePattern modifying regular expressions

2019-04-22 Thread MeiK
MeiK added the comment: You are right, I saw the agreed way of parsing in RFC6265[1], it seems that you should not use regular expressions. I used http.cookiejar to update the code, but it failed to pass the test: https://github.com/python/cpython/blob/master/Lib/test/test_http_cookies.py

[issue35824] http.cookies._CookiePattern modifying regular expressions

2019-04-22 Thread daniel hahler
daniel hahler added the comment: http.cookiejar parses this correctly, using http2time: >>> import http.cookiejar >>> http.cookiejar.parse_ns_headers(["has_recent_activity=1; path=/; expires=Mon, 22 Apr 2019 23:27:18 -"]) [[('has_recent_activity', '1'), ('path', '/'),
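A self-contained variant of the call shown above, as a hedged sketch. `parse_ns_headers` is an undocumented helper in `http.cookiejar`, so its output format is an implementation detail. The header here uses a plain GMT timezone, an assumption chosen because the timezone in the quoted example is cut off.

```python
import http.cookiejar

# Illustrative Set-Cookie value with a GMT timezone (assumed, see above).
header = "has_recent_activity=1; path=/; expires=Mon, 22 Apr 2019 23:27:18 GMT"

# Returns one list of (key, value) pairs per header.
parsed = http.cookiejar.parse_ns_headers([header])
pairs = parsed[0]
attrs = dict(pairs)

# The expires value is converted to seconds since the epoch,
# or None if the date cannot be parsed.
expires = attrs["expires"]
```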

[issue35824] http.cookies._CookiePattern modifying regular expressions

2019-04-22 Thread daniel hahler
daniel hahler added the comment: Another example of a value that fails to parse is if "-" is used instead of "GMT", which is the case with GitHub: > Set-Cookie: has_recent_activity=1; path=/; expires=Mon, 22 Apr 2019 23:27:18 > - So using a regular expression here to only parse the

[issue28450] Misleading/inaccurate documentation about unknown escape sequences in regular expressions

2019-02-25 Thread Serhiy Storchaka
Change by Serhiy Storchaka : -- resolution: -> fixed stage: patch review -> resolved status: open -> closed

[issue28450] Misleading/inaccurate documentation about unknown escape sequences in regular expressions

2019-02-25 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: New changeset 95fc8e687c487ecf97f4b1b98dfc0c05e3c9cbff by Serhiy Storchaka in branch '3.7': [3.7] bpo-28450: Fix and improve the documentation for unknown escapes in RE. (GH-11920). (GH-12029)

[issue28450] Misleading/inaccurate documentation about unknown escape sequences in regular expressions

2019-02-25 Thread Serhiy Storchaka
Change by Serhiy Storchaka : -- pull_requests: +12060

[issue28450] Misleading/inaccurate documentation about unknown escape sequences in regular expressions

2019-02-25 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: New changeset a180b007d96fe68b32f11dec720fbd0cd5b6758a by Serhiy Storchaka in branch 'master': bpo-28450: Fix and improve the documentation for unknown escapes in RE. (GH-11920)

[issue28450] Misleading/inaccurate documentation about unknown escape sequences in regular expressions

2019-02-18 Thread Serhiy Storchaka
Change by Serhiy Storchaka : -- keywords: +patch pull_requests: +11945 stage: needs patch -> patch review

[issue35824] http.cookies._CookiePattern modifying regular expressions

2019-01-26 Thread Karthikeyan Singaravelan
Karthikeyan Singaravelan added the comment: Yes, sorry I thought it was the format used for parsing too. Thanks for the example Martin. I am linking @MeiK PR to the issue where I asked them to open an issue for this. -- keywords: +patch pull_requests: +11517 stage: -> patch review

[issue35824] http.cookies._CookiePattern modifying regular expressions

2019-01-26 Thread Martin Panter
Martin Panter added the comment: I presume MeiK wants to use BaseCookie to parse the Set-Cookie header field, as in >>> BaseCookie('Hello=World; Expires=Thu, 31 Jan 2019 05:56:00 GMT;') >>> BaseCookie('Hello=World; Expires=Thu,31 Jan 2019 05:56:00 GMT;') Karthikeyan, if you meant the

[issue35824] http.cookies._CookiePattern modifying regular expressions

2019-01-24 Thread Karthikeyan Singaravelan
Karthikeyan Singaravelan added the comment: Thanks for the MDN cookie directive link. I didn't know it links to Date link in the GitHub PR. I don't see space optional in the sane-date format specified for expires attribute. I could be reading the grammar wrong. I will wait for others

[issue35824] http.cookies._CookiePattern modifying regular expressions

2019-01-24 Thread MeiK
New submission from MeiK : http.cookies.BaseCookie[1] can't parse Expires in this format like Expires=Thu,31 Jan 2019 05:56:00 GMT;(Less space after Thu,). I encountered this problem in actual use, Chrome, IE and Firefox can parse this string normally. Many languages, such as JavaScript, can

[issue35824] http.cookies._CookiePattern modifying regular expressions

2019-01-24 Thread MeiK
Change by MeiK : -- components: Extension Modules nosy: MeiK priority: normal severity: normal status: open title: http.cookies._CookiePattern modifying regular expressions type: enhancement

[issue32067] Deprecate accepting unrecognized braces in regular expressions

2018-12-23 Thread Serhiy Storchaka
Change by Serhiy Storchaka : -- resolution: -> rejected stage: patch review -> resolved status: open -> closed

[issue34304] clarification on escaping \d in regular expressions

2018-08-01 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: If you want to replace %d with literal \d, you need to repeat the backslash 4 times: pattern = re.sub('%d', '\\\\d+', pattern) or use a raw string literal and repeat the backslash 2 times: pattern = re.sub('%d', r'\\d+', pattern) Since the
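The two spellings Serhiy describes can be compared directly. This is a minimal sketch; the format string `fmt` is illustrative, and the third variant (a function as the replacement) is an addition that is not mentioned in the thread.

```python
import re

fmt = "%d items"  # illustrative input

# Ordinary literal: four backslashes in source become two in the template,
# which re.sub interprets as one literal backslash.
with_plain = re.sub("%d", "\\\\d+", fmt)

# Raw literal: two backslashes in source, same template as above.
with_raw = re.sub("%d", r"\\d+", fmt)

# Not from the thread: a function replacement is inserted as-is,
# so the template escape processing is skipped entirely.
with_func = re.sub("%d", lambda m: r"\d+", fmt)
```

All three results contain a single backslash followed by `d+`, ready to be compiled as a regular expression.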

[issue34304] clarification on escaping \d in regular expressions

2018-08-01 Thread Karthikeyan Singaravelan
Karthikeyan Singaravelan added the comment: The reported behavior is reproducible in master as well as of ea68d83933 but not on 3.6.0. I couldn't bisect to the exact commit between 3.7.0 and 3.6.0 where this change was introduced though. I can also see some deprecation warnings as below

[issue34304] clarification on escaping \d in regular expressions

2018-08-01 Thread Karthikeyan Singaravelan
Change by Karthikeyan Singaravelan : -- nosy: +xtreak

[issue34304] clarification on escaping \d in regular expressions

2018-07-31 Thread Saba Kauser
s' % this, len(this)) re.error: bad escape \d at position 0 if I change the statement to have 3 backslash like pattern = re.sub('%d', '\\\d+', pattern) I can correctly generate correct regular expression. Can you please comment if this has changed in python 3.7 and we need to escape 'd' i

[issue30349] Preparation for advanced set syntax in regular expressions

2018-02-05 Thread Tim Graham
Tim Graham added the comment: Okay, I created #32775.

[issue30349] Preparation for advanced set syntax in regular expressions

2018-02-05 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Good catch! fnmatch.translate() can produce a pattern which emits a warning when compiled. Could you please open a separate issue for this?

[issue30349] Preparation for advanced set syntax in regular expressions

2018-02-05 Thread Tim Graham
Tim Graham added the comment: It might be worth adding part of the problematic regex to the warning message. For Django's tests, I see an error like "FutureWarning: Possible nested set at position 17 return re.compile(res).match". It took some effort to track down the
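Until the message itself names the pattern, a caller can attach it manually. The sketch below uses a hypothetical helper, `compile_with_report`, which is not part of `re`. It assumes that "[[a]" triggers the "Possible nested set" FutureWarning, and note that a pattern already in the `re` cache will not warn a second time.

```python
import re
import warnings

def compile_with_report(patterns):
    """Hypothetical helper: compile each pattern and pair any warning
    raised during compilation with the pattern that caused it."""
    compiled, problems = [], []
    for pattern in patterns:
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")  # record every warning
            compiled.append(re.compile(pattern))
        problems.extend((pattern, str(w.message)) for w in caught)
    return compiled, problems

# Patterns coming from a data structure, as in the reports above.
compiled, problems = compile_with_report([r"\d+", "[[a]"])
for pattern, message in problems:
    print(f"{pattern!r}: {message}")
```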

[issue32067] Deprecate accepting unrecognized braces in regular expressions

2017-11-24 Thread Jakub Wilk
Change by Jakub Wilk : -- nosy: +jwilk

[issue32067] Deprecate accepting unrecognized braces in regular expressions

2017-11-18 Thread Serhiy Storchaka
Serhiy Storchaka <storchaka+cpyt...@gmail.com> added the comment: Since this will require changing regular expressions in several places in the stdlib I have chosen emitting PendingDeprecationWarning and long deprecation period. But I'm now not sure that this is a good idea. Non-e

[issue32067] Deprecate accepting unrecognized braces in regular expressions

2017-11-18 Thread Serhiy Storchaka
Change by Serhiy Storchaka : -- keywords: +patch pull_requests: +4392 stage: -> patch review

[issue32067] Deprecate accepting unrecognized braces in regular expressions

2017-11-18 Thread Serhiy Storchaka
Change by Serhiy Storchaka <storchaka+cpyt...@gmail.com>: -- title: Deprecate accepting -> Deprecate accepting unrecognized braces in regular expressions

[issue28450] Misleading/inaccurate documentation about unknown escape sequences in regular expressions

2017-11-16 Thread Serhiy Storchaka
Serhiy Storchaka <storchaka+cpyt...@gmail.com> added the comment: Barry, could you please improve the documentation about unknown escape sequences in regular expressions? My skills are not enough for this.

[issue30349] Preparation for advanced set syntax in regular expressions

2017-11-16 Thread Serhiy Storchaka
Change by Serhiy Storchaka : -- resolution: -> fixed stage: patch review -> resolved status: open -> closed

[issue30349] Preparation for advanced set syntax in regular expressions

2017-11-16 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: New changeset 05cb728d68a278d11466f9a6c8258d914135c96c by Serhiy Storchaka in branch 'master': bpo-30349: Raise FutureWarning for nested sets and set operations (#1553)

[issue30349] Preparation for advanced set syntax in regular expressions

2017-10-05 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Made the warning for '[' be emitted only at the start of a set. This significantly decreases the breakage of other code. I think we can get around without implicit union of nested sets, like in [_[0-9][:Latin:]]. This can be

[issue31580] Defer compiling regular expressions

2017-09-27 Thread Barry A. Warsaw
Barry A. Warsaw <ba...@python.org> added the comment: I'm closing this experiment. I'm not convinced that even if we can make start up time faster for module global regular expressions, we'll ever get enough buy-in from the ecosystem to make this worth it, because you'd really want to g

[issue31580] Defer compiling regular expressions

2017-09-26 Thread Ezio Melotti
Ezio Melotti added the comment: What about adding a lazy_compile() function? It will leave the current behavior unchanged, it's explicit, and it's easier to use cross version (if importing re.lazy_compile fails, use re.compile). FWIW I'm -1 on changing re.compile, -1 on adding re.IMMEDIATE,
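One way such an opt-in helper could look, purely as a sketch: `lazy_compile` below is hypothetical and not the API proposed on the tracker. It returns a zero-argument callable that compiles on first use and reuses the result afterwards.

```python
import functools
import re

def lazy_compile(pattern, flags=0):
    """Hypothetical helper: defer re.compile() until the first call."""
    @functools.lru_cache(maxsize=None)
    def get():
        # Runs once; later calls return the cached compiled pattern.
        return re.compile(pattern, flags)
    return get

# Module scope: nothing is compiled at import time.
cre_a = lazy_compile(r"\d+")

# First use triggers compilation.
match = cre_a().search("abc 42")
same_object = cre_a() is cre_a()
```

The trade-off raised elsewhere in the thread still applies: an invalid pattern only raises on first use rather than at import time.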

[issue31580] Defer compiling regular expressions

2017-09-26 Thread Stefan Behnel
Stefan Behnel added the comment: I'm also against changing re.compile() to not compile. And I often write code like this: replace_whitespace = re.compile(r"\s+").sub which is not covered by your current proposed change. -- nosy: +scoder
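The idiom Stefan mentions binds a method of the compiled pattern at module scope, so the pattern object itself is never named. A short sketch with an illustrative input string:

```python
import re

# Compile once and keep only the bound .sub method.
replace_whitespace = re.compile(r"\s+").sub

# Called as sub(repl, string): collapse each whitespace run to one space.
cleaned = replace_whitespace(" ", "a \t b\n\nc")
```

Since only the bound method is stored, a wrapper that defers compilation until an attribute of the pattern is accessed would be triggered right at import time here.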

[issue31580] Defer compiling regular expressions

2017-09-26 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Barry, please test re.search() with your changes. $ ./python -m timeit -s 'import re; s = "a" * 100' 're.search("b", s)' Unpatched: 50 loops, best of 5: 529 nsec per loop Patched:5 loops, best of 5: 7.46 usec per loop --

[issue31580] Defer compiling regular expressions

2017-09-26 Thread Raymond Hettinger
Raymond Hettinger added the comment: I'm flat-out opposed to changing the default behavior. If some API change gets made, it needs to be strictly opt-in. Even then, I don't think this is a great idea. We already have ways to do it if people actually cared about this. FWIW, other languages

[issue31580] Defer compiling regular expressions

2017-09-26 Thread Barry A. Warsaw
Barry A. Warsaw added the comment: On Sep 26, 2017, at 11:27, R. David Murray wrote: > > Precompiling as a compile-time optimization would be cool. I think we are > currently favoring doing that kind of thing as an AST optimization step? I was thinking about that

[issue31580] Defer compiling regular expressions

2017-09-26 Thread R. David Murray
R. David Murray added the comment: Precompiling as a compile-time optimization would be cool. I think we are currently favoring doing that kind of thing as an AST optimization step? I think Raymond and my point was that the current behavior should remain unchanged by default. So a

[issue31580] Defer compiling regular expressions

2017-09-26 Thread Barry A. Warsaw
Barry A. Warsaw added the comment: Let's separate the use of lru_cache from the deferred compilation. I think I'll just revert the change to use lru_cache, although I'll note that the impetus for this was the observation that once MAXCACHE is reached the entire regexp cache is purged. That

[issue31580] Defer compiling regular expressions

2017-09-25 Thread Serhiy Storchaka
). Errors in regular expression now raised only on first use of the pattern. This is a drawback from educational and debugging points of view. The patch also breaks warnings emitted during compiling regular expressions. Now they report wrong source line and use wrong line for caching in case of -Wonce

[issue31580] Defer compiling regular expressions

2017-09-25 Thread R. David Murray
R. David Murray added the comment: I agree with Raymond. It would be strange to have the API that is obviously designed to pre-compile the regex not pre-compile the regex. If the concern is that a non-precompiled regex might get bumped out of the cache but you want a way to only compile a

[issue31580] Defer compiling regular expressions

2017-09-25 Thread Raymond Hettinger
Raymond Hettinger added the comment: ISTM, the whole point is to compile in advance. When I worked in high frequency trading, that was essential to news trading where you *really* didn't want to pay the compilation cost at the time the regex was used. This proposal takes away the user's

[issue31580] Defer compiling regular expressions

2017-09-25 Thread Barry A. Warsaw
Changes by Barry A. Warsaw : -- keywords: +patch pull_requests: +3742 stage: -> patch review

[issue31580] Defer compiling regular expressions

2017-09-25 Thread Barry A. Warsaw
New submission from Barry A. Warsaw: It's a very common pattern to see the following at module scope: cre_a = re.compile('some pattern') cre_b = re.compile('other pattern') and so on. This can cost you at start up time because all those regular expressions are compiled at import time, even

[issue30375] Correct stacklevel of warnings when compile regular expressions

2017-05-18 Thread Serhiy Storchaka
Changes by Serhiy Storchaka : -- resolution: -> fixed stage: patch review -> resolved status: open -> closed

[issue30375] Correct stacklevel of warnings when compile regular expressions

2017-05-18 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: New changeset dfcfc915787def056e225fb22ad5a5ee8da4052f by Serhiy Storchaka in branch '2.7': [2.7] bpo-30375: Correct the stacklevel of regex compiling warnings. (#1595) (#1648)

[issue30375] Correct stacklevel of warnings when compile regular expressions

2017-05-18 Thread Serhiy Storchaka
Changes by Serhiy Storchaka : -- pull_requests: +1743

[issue30375] Correct stacklevel of warnings when compile regular expressions

2017-05-16 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: New changeset 24b5ed230df65f6a1f9d8dd0c4409377576113d9 by Serhiy Storchaka in branch '3.5': [3.5] bpo-30375: Correct the stacklevel of regex compiling warnings. (GH-1595) (#1605)

[issue30375] Correct stacklevel of warnings when compile regular expressions

2017-05-16 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: New changeset 73fb45df0487144765808c5d25914c67232d83fe by Serhiy Storchaka in branch '3.6': [3.6] bpo-30375: Correct the stacklevel of regex compiling warnings. (GH-1595) (#1604)

[issue30375] Correct stacklevel of warnings when compile regular expressions

2017-05-16 Thread Serhiy Storchaka
Changes by Serhiy Storchaka : -- versions: +Python 2.7

[issue30375] Correct stacklevel of warnings when compile regular expressions

2017-05-16 Thread Serhiy Storchaka
Changes by Serhiy Storchaka : -- pull_requests: +1696

[issue30375] Correct stacklevel of warnings when compile regular expressions

2017-05-16 Thread Serhiy Storchaka
Changes by Serhiy Storchaka : -- pull_requests: +1695

[issue30375] Correct stacklevel of warnings when compile regular expressions

2017-05-16 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: New changeset c7ac7280c321b3c1679fe5f657a6be0f86adf173 by Serhiy Storchaka in branch 'master': bpo-30375: Correct the stacklevel of regex compiling warnings. (#1595) https://github.com/python/cpython/commit/c7ac7280c321b3c1679fe5f657a6be0f86adf173

[issue30375] Correct stacklevel of warnings when compile regular expressions

2017-05-15 Thread Serhiy Storchaka
Changes by Serhiy Storchaka : -- pull_requests: +1689

[issue30375] Correct stacklevel of warnings when compile regular expressions

2017-05-15 Thread Serhiy Storchaka
le('(x(?i))', re.IGNORECASE) >>> re.compile('((x(?i)))') __main__:1: DeprecationWarning: Flags not at the start of the expression ((x(?i))) re.compile('((x(?i)))', re.IGNORECASE) -- assignee: serhiy.storchaka components: Library (Lib), Regular Expressions messages: 293736 nosy: ezio.melotti,

[issue30349] Preparation for advanced set syntax in regular expressions

2017-05-12 Thread Serhiy Storchaka
Changes by Serhiy Storchaka : -- pull_requests: +1650

[issue30349] Preparation for advanced set syntax in regular expressions

2017-05-12 Thread Serhiy Storchaka
module and make this syntax enabled by default, this will break some code. It is very unlikely that a regular expression contains duplicated characters ('--', '||', '&&' or '~~'), but nested sets use just '[', and non-escaped '[' occurs in character sets in regular expressions (ev

[issue30285] Optimize case-insensitive regular expressions

2017-05-09 Thread Serhiy Storchaka
Changes by Serhiy Storchaka : -- resolution: -> fixed stage: patch review -> resolved status: open -> closed

[issue30285] Optimize case-insensitive regular expressions

2017-05-09 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: New changeset 6d336a027913327fc042b0d758a16724fea27b9c by Serhiy Storchaka in branch 'master': bpo-30285: Optimize case-insensitive matching and searching (#1482) https://github.com/python/cpython/commit/6d336a027913327fc042b0d758a16724fea27b9c --

[issue30285] Optimize case-insensitive regular expressions

2017-05-06 Thread Raymond Hettinger
Raymond Hettinger added the comment: This seems like a great idea. -- nosy: +rhettinger

[issue30285] Optimize case-insensitive regular expressions

2017-05-05 Thread Serhiy Storchaka
Changes by Serhiy Storchaka : -- pull_requests: +1582

[issue30285] Optimize case-insensitive regular expressions

2017-05-05 Thread Serhiy Storchaka
New submission from Serhiy Storchaka: Matching and searching case-insensitive regular expressions is much slower than matching and searching case-sensitive regular expressions. Case-insensitivity requires converting every character in input string to lower case and disables some optimizations

[issue30277] Speeds up compiling cases-insensitive regular expressions

2017-05-05 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: New changeset 7186cc29be352bed6f1110873283d073fd0643e4 by Serhiy Storchaka in branch 'master': bpo-30277: Replace _sre.getlower() with _sre.ascii_tolower() and _sre.unicode_tolower(). (#1468)

[issue30277] Speeds up compiling cases-insensitive regular expressions

2017-05-05 Thread Serhiy Storchaka
Changes by Serhiy Storchaka : -- pull_requests: +1567
