Re: Regular Expression bug?

2023-03-02 Thread jose isaias cabrera
On Thu, Mar 2, 2023 at 9:56 PM Alan Bawden wrote: > > jose isaias cabrera writes: > >On Thu, Mar 2, 2023 at 2:38 PM Mats Wichmann wrote: > >This re is a bit different than the one I am used. So, I am trying to match >everything after 'pn=': > >import re >s = "pm=jose

Re: Regular Expression bug?

2023-03-02 Thread jose isaias cabrera
ject manager. pn=project name. I needed search() rather than match(). > > >>> s = "pn=jose pn=2017" > ... > >>> s0 = r0.match(s) > >>> s0 > > > > > -Original Message- > From: Python-list On > Behalf Of jose isaias cab

Re: Regular Expression bug?

2023-03-02 Thread jose isaias cabrera
On Thu, Mar 2, 2023 at 8:30 PM Cameron Simpson wrote: > > On 02Mar2023 20:06, jose isaias cabrera wrote: > >This re is a bit different than the one I am used. So, I am trying to > >match > >everything after 'pn=': > > > >import re > >s = "pm=jose pn=2017" > >m0 = r"pn=(.+)" > >r0 =

Re: Regular Expression bug?

2023-03-02 Thread Alan Bawden
jose isaias cabrera writes: On Thu, Mar 2, 2023 at 2:38 PM Mats Wichmann wrote: This re is a bit different than the one I am used. So, I am trying to match everything after 'pn=': import re s = "pm=jose pn=2017" m0 = r"pn=(.+)" r0 = re.compile(m0) s0 = r0.match(s)

Re: Regular Expression bug?

2023-03-02 Thread Cameron Simpson
On 02Mar2023 20:06, jose isaias cabrera wrote: This re is a bit different than the one I am used. So, I am trying to match everything after 'pn=': import re s = "pm=jose pn=2017" m0 = r"pn=(.+)" r0 = re.compile(m0) s0 = r0.match(s) `match()` matches at the start of the string. You want
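The difference between `match()` and `search()` described in this reply can be checked with the poster's own string and pattern. This is a minimal sketch; the variable name `found` is added here for illustration only.

```python
import re

s = "pm=jose pn=2017"
r0 = re.compile(r"pn=(.+)")

# match() only tries the pattern at position 0, where the text is "pm=...".
print(r0.match(s))        # None

# search() scans forward until the pattern matches somewhere in the string.
found = r0.search(s)
print(found.group(1))     # 2017
```

`match()` is not wrong here; it simply anchors at the start of the string, so a pattern that begins with `pn=` cannot succeed against text that begins with `pm=`.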

RE: Regular Expression bug?

2023-03-02 Thread avi.e.gross
;> s0 -Original Message- From: Python-list On Behalf Of jose isaias cabrera Sent: Thursday, March 2, 2023 8:07 PM To: Mats Wichmann Cc: python-list@python.org Subject: Re: Regular Expression bug? On Thu, Mar 2, 2023 at 2:38 PM Mats Wichmann wrote: > > On 3/2/23 12:28

Re: Regular Expression bug?

2023-03-02 Thread jose isaias cabrera
On Thu, Mar 2, 2023 at 2:38 PM Mats Wichmann wrote: > > On 3/2/23 12:28, Chris Angelico wrote: > > On Fri, 3 Mar 2023 at 06:24, jose isaias cabrera wrote: > >> > >> Greetings. > >> > >> For the RegExp Gurus, consider the following python3 code: > >> > >> import re > >> s = "pn=align upgrade

RE: Regular Expression bug?

2023-03-02 Thread avi.e.gross
On Behalf Of jose isaias cabrera Sent: Thursday, March 2, 2023 2:23 PM To: python-list@python.org Subject: Regular Expression bug? Greetings. For the RegExp Gurus, consider the following python3 code: import re s = "pn=align upgrade sd=2023-02-" ro = re.compile(r"pn=(.+) "

Re: Regular Expression bug?

2023-03-02 Thread jose isaias cabrera
n upgrade sd=2023-02-" > > ro = re.compile(r"pn=(.+) ") > > r0=ro.match(s) > > >>> print(r0.group(1)) > > align upgrade > > > > > > This is wrong. It should be 'align' because the group only goes up-to > > the space. Thoughts? Thanks.

Re: Regular Expression bug?

2023-03-02 Thread Mats Wichmann
On 3/2/23 12:28, Chris Angelico wrote: On Fri, 3 Mar 2023 at 06:24, jose isaias cabrera wrote: Greetings. For the RegExp Gurus, consider the following python3 code: import re s = "pn=align upgrade sd=2023-02-" ro = re.compile(r"pn=(.+) ") r0=ro.match(s) print(r0.group(1)) align upgrade

Re: Regular Expression bug?

2023-03-02 Thread 2QdxY4RzWzUUiLuE
) > align upgrade > > > This is wrong. It should be 'align' because the group only goes up-to > the space. Thoughts? Thanks. The bug is in your regular expression; the plus modifier is greedy. If you want to match up to the first space, then you'll need something like [^ ] (i.
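The greedy behaviour described in this reply can be reproduced with the string from the original post. The sketch below compares the original pattern with the negated character class suggested above; the lazy quantifier `+?` is an alternative added here for comparison and is not taken from the truncated reply.

```python
import re

s = "pn=align upgrade sd=2023-02-"

# Greedy: .+ takes as much as it can, then backs off to the LAST space.
greedy = re.match(r"pn=(.+) ", s)
print(greedy.group(1))        # align upgrade

# Negated class: [^ ]+ cannot cross a space, so it stops at the first one.
first_word = re.match(r"pn=([^ ]+) ", s)
print(first_word.group(1))    # align

# Lazy quantifier: .+? takes as little as it can before the next space.
lazy = re.match(r"pn=(.+?) ", s)
print(lazy.group(1))          # align
```

The negated class states the intent ("anything but a space") directly and gives the engine nothing to backtrack over, which is usually the safer choice of the two.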

Re: Regular Expression bug?

2023-03-02 Thread Chris Angelico
On Fri, 3 Mar 2023 at 06:24, jose isaias cabrera wrote: > > Greetings. > > For the RegExp Gurus, consider the following python3 code: > > import re > s = "pn=align upgrade sd=2023-02-" > ro = re.compile(r"pn=(.+) ") > r0=ro.match(s) > >>> print(r0.group(1)) > align upgrade > > > This is wrong.

Regular Expression bug?

2023-03-02 Thread jose isaias cabrera
Greetings. For the RegExp Gurus, consider the following python3 code: import re s = "pn=align upgrade sd=2023-02-" ro = re.compile(r"pn=(.+) ") r0=ro.match(s) >>> print(r0.group(1)) align upgrade This is wrong. It should be 'align' because the group only goes up-to the space. Thoughts? Thanks.

[issue418615] regular expression bug in pipes.py.

2022-04-10 Thread admin
Change by admin : -- github: None -> 34409 ___ Python tracker ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue418613] regular expression bug in pipes.py

2022-04-10 Thread admin
Change by admin : -- github: None -> 34408

[issue416526] Regular expression tests: SEGV on Mac OS

2022-04-10 Thread admin
Change by admin : -- github: None -> 34348

[issue47066] Convert a warning about flags not at the start of the regular expression into error

2022-03-19 Thread Serhiy Storchaka
Change by Serhiy Storchaka : -- resolution: -> fixed stage: patch review -> resolved status: open -> closed ___ Python tracker ___

[issue47066] Convert a warning about flags not at the start of the regular expression into error

2022-03-19 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: New changeset 92a6abf72e7a8274f96edbb5297119d4ff055be7 by Serhiy Storchaka in branch 'main': bpo-47066: Convert a warning about flags not at the start of the regular expression into error (GH-31994) https://github.com/python/cpython/commit

[issue47066] Convert a warning about flags not at the start of the regular expression into error

2022-03-19 Thread Serhiy Storchaka
Change by Serhiy Storchaka : -- keywords: +patch pull_requests: +30084 stage: -> patch review pull_request: https://github.com/python/cpython/pull/31994 ___ Python tracker

[issue47066] Convert a warning about flags not at the start of the regular expression into error

2022-03-19 Thread Serhiy Storchaka
New submission from Serhiy Storchaka : This warning was introduced in 3.6. The reason is that in most other regular expression implementations global inline flags in the middle of the expression have different semantics: they affect only the part of the expression after the flag
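A small sketch of what this change means in practice. The helper `compiles_cleanly` is a hypothetical name introduced here; it treats both the deprecation warning (Python 3.6 to 3.10) and the later hard error as a rejection, so the result should not depend on which of those interpreter versions runs it.

```python
import re
import warnings

def compiles_cleanly(pattern):
    """Return True if the pattern compiles with no error and no DeprecationWarning."""
    with warnings.catch_warnings():
        # Turn the old warning into an exception so both eras look the same.
        warnings.simplefilter("error", DeprecationWarning)
        try:
            re.compile(pattern)
        except (re.error, DeprecationWarning):
            return False
    return True

print(compiles_cleanly(r"abc(?i)"))    # global flag in the middle: rejected
print(compiles_cleanly(r"(?i)abc"))    # global flag at the start: accepted
print(compiles_cleanly(r"ab(?i:c)"))   # scoped flag group: accepted
```

When only part of a pattern should be case-insensitive, the scoped form `(?i:...)` expresses that directly instead of relying on where a global flag happens to sit.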

[issue46825] slow matching on regular expression

2022-02-22 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: The re module does not support features corresponding to std::regex_constants::__polynomial in C++. Rewrite your regular expression or try to use alternative regex implementations (for example wrappers around the re2 library or C++ regex library

[issue46825] slow matching on regular expression

2022-02-22 Thread Matthew Barnett
Matthew Barnett added the comment: The expression is a repeated alternative where the first alternative is a repeat. Repeated repeats can result in a lot of attempts and backtracking and should be avoided. Try this instead: (0|1(01*0)*1)+ --
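The suggested rewrite only replaces the inner `0+` with `0`; since the outer `+` already repeats the alternative, both patterns should accept the same strings. The sketch below checks that by brute force on short inputs. The names `original`, `rewritten`, `accepts` and `mismatches` are introduced here for illustration, and the length limit is kept small so the slower pattern stays quick.

```python
import re
from itertools import product

original = re.compile(r"(0+|1(01*0)*1)+")   # pattern from the report
rewritten = re.compile(r"(0|1(01*0)*1)+")   # pattern suggested in the reply

def accepts(pattern, text):
    """True if the whole text matches the pattern."""
    return pattern.fullmatch(text) is not None

# Compare the two patterns on every binary string of length 1 to 8.
mismatches = [
    "".join(bits)
    for n in range(1, 9)
    for bits in product("01", repeat=n)
    if accepts(original, "".join(bits)) != accepts(rewritten, "".join(bits))
]
print(mismatches)   # []
```

The rewrite removes the "repeat inside a repeat" over runs of zeros, which is the source of the many alternative ways to split the same input that the reply warns about.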

[issue46825] slow matching on regular expression

2022-02-22 Thread Heran Yang
New submission from Heran Yang : I'm using `re.fullmatch` to match a string that only contains 0 and 1. The regular expression is: (0+|1(01*0)*1)+ It runs rather slow with Python 3.7, but when I try using regex in C++, with std::regex_constants::__polynomial, it works well. Would someone

[issue46474] Inefficient regular expression complexity in EntryPoint.pattern

2022-02-14 Thread Łukasz Langa
Łukasz Langa added the comment: New changeset 8a84aef0123bd8c13cf81fbc3b5f6d45f96c2656 by Jason R. Coombs in branch '3.8': [3.8] bpo-46474: Avoid REDoS in EntryPoint.pattern (sync with importlib_metadata 4.10.1) (GH-30803). (#30829)

[issue46474] Inefficient regular expression complexity in EntryPoint.pattern

2022-01-23 Thread Jason R. Coombs
Change by Jason R. Coombs : -- resolution: -> fixed stage: patch review -> resolved status: open -> closed ___ Python tracker ___

[issue46474] Inefficient regular expression complexity in EntryPoint.pattern

2022-01-23 Thread Jason R. Coombs
Jason R. Coombs added the comment: New changeset 1514d1252f96e6a83eb65c439522a6b5443f6a1a by Jason R. Coombs in branch '3.9': [3.9] bpo-46474: Avoid REDoS in EntryPoint.pattern (sync with importlib_metadata 4.10.1) (GH-30803). (GH-30828)

[issue46474] Inefficient regular expression complexity in EntryPoint.pattern

2022-01-23 Thread Jason R. Coombs
Jason R. Coombs added the comment: New changeset a7a4ca4f06c8c31d7f403113702ad2e80bfc326b by Jason R. Coombs in branch '3.10': [3.10] bpo-46474: Avoid REDoS in EntryPoint.pattern (sync with importlib_metadata 4.10.1) (GH-30803) (GH-30827)

[issue46474] Inefficient regular expression complexity in EntryPoint.pattern

2022-01-23 Thread Jason R. Coombs
Change by Jason R. Coombs : -- pull_requests: +29016 pull_request: https://github.com/python/cpython/pull/30829 ___ Python tracker ___

[issue46474] Inefficient regular expression complexity in EntryPoint.pattern

2022-01-23 Thread Jason R. Coombs
Change by Jason R. Coombs : -- pull_requests: +29015 pull_request: https://github.com/python/cpython/pull/30828 ___ Python tracker ___

[issue46474] Inefficient regular expression complexity in EntryPoint.pattern

2022-01-23 Thread Jason R. Coombs
Change by Jason R. Coombs : -- pull_requests: +29014 pull_request: https://github.com/python/cpython/pull/30827 ___ Python tracker ___

[issue46474] Inefficient regular expression complexity in EntryPoint.pattern

2022-01-22 Thread Jason R. Coombs
Jason R. Coombs added the comment: New changeset 51c3e28c8a163e58dc753765e3cc51d5a717e70d by Jason R. Coombs in branch 'main': bpo-46474: Avoid REDoS in EntryPoint.pattern (sync with importlib_metadata 4.10.1) (GH-30803)

[issue46474] Inefficient regular expression complexity in EntryPoint.pattern

2022-01-22 Thread Jason R. Coombs
Jason R. Coombs added the comment: New changeset 443dec6c9a104386ee90165d32fb28d0c5d29043 by Jason R. Coombs in branch 'main': bpo-46474: Apply changes from importlib_metadata 4.10.0 (GH-30802) https://github.com/python/cpython/commit/443dec6c9a104386ee90165d32fb28d0c5d29043 --

[issue46474] Inefficient regular expression complexity in EntryPoint.pattern

2022-01-22 Thread Jason R. Coombs
Change by Jason R. Coombs : -- pull_requests: +28989 pull_request: https://github.com/python/cpython/pull/30803 ___ Python tracker ___

[issue46474] Inefficient regular expression complexity in EntryPoint.pattern

2022-01-22 Thread Jason R. Coombs
Change by Jason R. Coombs : -- keywords: +patch pull_requests: +28987 stage: -> patch review pull_request: https://github.com/python/cpython/pull/30802 ___ Python tracker ___

[issue46474] Inefficient regular expression complexity in EntryPoint.pattern

2022-01-22 Thread Jason R. Coombs
Jason R. Coombs added the comment: Because I want this security issue to be back-portable to older Pythons, I'll first apply importlib_metadata 4.10.0 and then apply the change from 4.10.1 separately. -- ___ Python tracker

[issue46474] Inefficient regular expression complexity in EntryPoint.pattern

2022-01-22 Thread Jason R. Coombs
with importlib_metadata 4.10.1. Let's get that fix incorporated into Python as well. -- assignee: jaraco components: Library (Lib) messages: 411282 nosy: jaraco priority: normal severity: normal status: open title: Inefficient regular expression complexity in EntryPoint.pattern type: security versions

[issue43222] Regular expression split fails on 3.6 and not 2.7 or 3.7+

2021-02-15 Thread Philip
Change by Philip : -- resolution: -> wont fix stage: -> resolved status: open -> closed ___ Python tracker ___ ___

[issue43222] Regular expression split fails on 3.6 and not 2.7 or 3.7+

2021-02-14 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: There was a bug in the regular expression engine which caused re.split() to work incorrectly with zero-width patterns. Note that in your example _DIGIT_BOUNDARY_RE.split("10.0.0") returns ['10.0.0'] on Python 2.7 -- the result which you unlikel
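To illustrate splitting on a zero-width pattern, here is a sketch that assumes Python 3.7 or later. The reporter's `_DIGIT_BOUNDARY_RE` is not shown in the snippet, so the pattern below is a hypothetical stand-in that matches the empty position between a digit and a dot.

```python
import re

# Hypothetical stand-in for the reporter's pattern (not shown in the thread):
# an empty match between a digit and a dot, in either order.
boundary = re.compile(r"(?<=\d)(?=\.)|(?<=\.)(?=\d)")

parts = boundary.split("10.0.0")
print(parts)   # ['10', '.', '0', '.', '0'] on Python 3.7+
```

On interpreters where empty matches are ignored by `split()`, the same call returns the input unsplit, which is the Python 2.7 result mentioned in the comment above.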

[issue43222] Regular expression split fails on 3.6 and not 2.7 or 3.7+

2021-02-14 Thread Philip
ons messages: 386942 nosy: ezio.melotti, mrabarnett, probinso priority: normal severity: normal status: open title: Regular expression split fails on 3.6 and not 2.7 or 3.7+ type: crash versions: Python 3.6 ___ Python track

[issue39949] truncating match in regular expression match objects repr

2020-06-19 Thread Quentin Wenger
Quentin Wenger added the comment: Other pathological case: literal backslashes ``` >>> re.match(".*", r"\\") ``` -- ___ Python tracker ___

[issue39949] truncating match in regular expression match objects repr

2020-06-19 Thread Quentin Wenger
Quentin Wenger added the comment: *off

[issue39949] truncating match in regular expression match objects repr

2020-06-19 Thread Quentin Wenger
Quentin Wenger added the comment: (but those are one-character escapes, so that should be fine - either the escape is complete or the backslash is trailing and can be "peeled of") -- ___ Python tracker

[issue39949] truncating match in regular expression match objects repr

2020-06-19 Thread Quentin Wenger
Quentin Wenger added the comment: And ascii escapes should also not be forgotten. ``` >>> re.match(b".*", b"\t") >>> re.match(".*", "\t") ``` -- ___ Python tracker ___

[issue39949] truncating match in regular expression match objects repr

2020-06-19 Thread Quentin Wenger
Quentin Wenger added the comment: An extraneous difficulty also exists for bytes regexes, because there non-ascii characters are repr'ed using escape sequences. So there's a risk of cutting one in the middle. ``` >>> import re >>> re.match(b".*", b"\xce") ``` --

[issue39949] truncating match in regular expression match objects repr

2020-06-18 Thread Seth Troisi
Seth Troisi added the comment: I was thinking about how to add the end quote and found these weird cases: >>> "asdf'asdf'asdf" "asdf'asdf'asdf" >>> "asdf\"asdf\"asdf" 'asdf"asdf"asdf' >>> "asdf\"asdf'asdf" 'asdf"asdf\'asdf' This means that len(s) +2 (or 3 for bytes) !=

[issue39949] truncating match in regular expression match objects repr

2020-06-16 Thread Quentin Wenger
Quentin Wenger added the comment: File objects are an example of a square-bracket repr with string parameters in the repr, but no truncation is performed (see https://github.com/python/cpython/blob/master/Modules/_io/textio.c#L2912). Various truncations with the same (lack of?) clarity are

[issue39949] truncating match in regular expression match objects repr

2020-06-16 Thread Quentin Wenger
Quentin Wenger added the comment: Oh ok, I was misled by the example in your first message, where you did have both the quote and ellipsis. I don't have a strong opinion. - having the quote is a bit more "clean" - but not having it makes clear that the pattern is truncated (per se, three

[issue39949] truncating match in regular expression match objects repr

2020-06-16 Thread Seth Troisi
Seth Troisi added the comment: @matpi The current behavior is for the right quote to not appear I kept this behavior but happy to consider changing that. See the linked patch for examples -- ___ Python tracker

[issue39949] truncating match in regular expression match objects repr

2020-06-16 Thread Seth Troisi
Change by Seth Troisi : -- keywords: +patch pull_requests: +20100 stage: needs patch -> patch review pull_request: https://github.com/python/cpython/pull/20922 ___ Python tracker

[issue39949] truncating match in regular expression match objects repr

2020-06-16 Thread Quentin Wenger
Quentin Wenger added the comment: @eric.smith thanks, no problem. If I can give any advice on this present issue, I would suggest to have the ellipsis _inside_ the quote, to make clear that the pattern is being truncated, not the match. So instead of ``` <_sre.SRE_Match object; span=(0,

[issue39949] truncating match in regular expression match objects repr

2020-06-16 Thread Eric V. Smith
Eric V. Smith added the comment: Ah, I see. I missed that this issue was only about match objects. I apologize for the confusion. That being the case, I'll re-open the other issue. -- ___ Python tracker

[issue39949] truncating match in regular expression match objects repr

2020-06-16 Thread Quentin Wenger
Quentin Wenger added the comment: For a bit of background, the other issue is about the repr of compiled patterns, not match objects. Please see my argument there about the conformance to repr's doc - merely adding an ellipsis would _not_ solve this case. I have however nothing against the

[issue39949] truncating match in regular expression match objects repr

2020-06-16 Thread Eric V. Smith
Eric V. Smith added the comment: There was a discussion in issue40984 that the repr must be eval-able. I don't feel very strongly about this, mainly because I don't think anyone ever does eval(repr(some_regex)). I'd be slightly sympathetic to wanting the eval to fail if the repr had to

[issue39949] truncating match in regular expression match objects repr

2020-06-16 Thread Seth Troisi
Seth Troisi added the comment: I didn't propose a patch before because I was unsure of the decision. Now that there is a +1 from Raymond I'll work on a patch and some documentation. Expect a patch within the week. -- ___ Python tracker

[issue39949] truncating match in regular expression match objects repr

2020-06-16 Thread Eric V. Smith
Change by Eric V. Smith : -- components: +Regular Expressions -Library (Lib) nosy: +ezio.melotti, mrabarnett resolution: not a bug -> stage: resolved -> needs patch status: closed -> open ___ Python tracker

[issue39949] truncating match in regular expression match objects repr

2020-06-16 Thread Raymond Hettinger
Raymond Hettinger added the comment: +1 for adding an ellipsis. It's a conventional way to indicate that the displayed data is truncated. Concur with Eric that missing close quote is too subtle (and odd, and unexpected). -- nosy: +rhettinger

[issue24880] ctypeslib patch for regular expression for symbols to include

2020-05-31 Thread Serhiy Storchaka
Change by Serhiy Storchaka : -- resolution: -> third party stage: -> resolved status: open -> closed ___ Python tracker ___ ___

[issue38804] Regular Expression Denial of Service in http.cookiejar

2020-05-14 Thread STINNER Victor
STINNER Victor added the comment: The fix landed in all maintained versions, thanks. I close the issue. -- priority: release blocker -> resolution: -> fixed stage: patch review -> resolved status: open -> closed ___ Python tracker

[issue38804] Regular Expression Denial of Service in http.cookiejar

2020-04-02 Thread Larry Hastings
Larry Hastings added the comment: New changeset 55a6a16a46239a71b635584e532feb8b17ae7fdf by Victor Stinner in branch '3.5': bpo-38804: Fix REDoS in http.cookiejar (GH-17157) (#17344) https://github.com/python/cpython/commit/55a6a16a46239a71b635584e532feb8b17ae7fdf --

[issue38804] Regular Expression Denial of Service in http.cookiejar

2020-03-27 Thread Serhiy Storchaka
Change by Serhiy Storchaka : -- nosy: +larry priority: normal -> release blocker versions: -Python 2.7, Python 3.6, Python 3.7, Python 3.8, Python 3.9 ___ Python tracker ___

[issue38826] Regular Expression Denial of Service in urllib.request.AbstractBasicAuthHandler

2020-03-25 Thread STINNER Victor
STINNER Victor added the comment: This issue is a duplicate of bpo-39503 which has a PR. Thanks Ben Caller for the report, I credited you in my fix ;-) -- nosy: +vstinner resolution: -> duplicate stage: -> resolved status: open -> closed superseder: -> [security][CVE-2020-8492]

[issue39949] truncating match in regular expression match objects repr

2020-03-24 Thread Seth Troisi
Change by Seth Troisi : -- resolution: -> not a bug stage: -> resolved status: open -> closed ___ Python tracker ___ ___

[issue39949] truncating match in regular expression match objects repr

2020-03-12 Thread Eric V. Smith
Eric V. Smith added the comment: I think the missing closing quote is supposed to be your visual clue that it's truncated. Although I'll grant you that it's pretty subtle. -- nosy: +eric.smith versions: +Python 3.9 ___ Python tracker
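The truncation under discussion is easy to observe. The sketch below only prints the repr and compares it with the real match; the exact repr text is deliberately not asserted, since it is an implementation detail and the subject of this issue.

```python
import re

m = re.match(r".*", "a" * 100)

text = repr(m)
print(text)                 # the match= field shows only a prefix of the 100 characters
print(len(m.group(0)))      # 100: the match itself is complete
print(("a" * 100) in text)  # False: the repr does not contain the full match
```

The repr is a debugging aid, so code that needs the matched text should read it from `m.group(0)` rather than parse the repr.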

[issue39949] truncating match in regular expression match objects repr

2020-03-12 Thread Seth Troisi
this is a good idea. I couldn't think of other examples (urllib maybe?) in Python of how this is handled but I could potentially look for some if that would help -- components: Library (Lib) messages: 364052 nosy: Seth.Troisi, serhiy.storchaka priority: normal severity: normal statu

[issue38826] Regular Expression Denial of Service in urllib.request.AbstractBasicAuthHandler

2020-03-03 Thread Matthew Barnett
Matthew Barnett added the comment: A smaller change to the regex would be to replace the "(?:.*,)*" with "(?:[^,]*,)*". I'd also suggest using a raw string instead: rx = re.compile(r'''(?:[^,]*,)*[ \t]*([^ \t]+)[ \t]+realm=(["']?)([^"']*)\2''', re.I) -- nosy: +mrabarnett
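A quick sketch of the suggested pattern applied to a single challenge. The header value is a hypothetical example chosen for illustration, and the names `header`, `scheme`, `quote` and `realm` are introduced here.

```python
import re

# Pattern exactly as suggested in the comment above.
rx = re.compile(r'''(?:[^,]*,)*[ \t]*([^ \t]+)[ \t]+realm=(["']?)([^"']*)\2''', re.I)

header = 'Basic realm="example"'   # hypothetical WWW-Authenticate value

m = rx.search(header)
scheme, quote, realm = m.groups()
print(scheme)   # Basic
print(realm)    # example
```

Because `[^,]*` cannot itself consume a comma, each comma in the input can only be matched by the literal `,` that follows it, which removes the overlapping ways of dividing the text that `.*,` allowed.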

[issue38826] Regular Expression Denial of Service in urllib.request.AbstractBasicAuthHandler

2020-03-02 Thread Michał Górny
Change by Michał Górny : -- nosy: +mgorny

[issue38826] Regular Expression Denial of Service in urllib.request.AbstractBasicAuthHandler

2020-02-04 Thread Anselmo Melo
Change by Anselmo Melo : -- nosy: +Anselmo Melo

[issue38804] Regular Expression Denial of Service in http.cookiejar

2019-11-24 Thread STINNER Victor
STINNER Victor added the comment: New changeset e6499033032d5b647e43a3b49da0c1c64b151743 by Victor Stinner in branch '2.7': bpo-38804: Fix REDoS in http.cookiejar (GH-17157) (GH-17345) https://github.com/python/cpython/commit/e6499033032d5b647e43a3b49da0c1c64b151743 --

[issue38804] Regular Expression Denial of Service in http.cookiejar

2019-11-22 Thread Ned Deily
Ned Deily added the comment: New changeset 0716056c49e9505041e30386dad9b2e788f67aaf by Ned Deily (Miss Islington (bot)) in branch '3.6': bpo-38804: Fix REDoS in http.cookiejar (GH-17157) (#17343) https://github.com/python/cpython/commit/0716056c49e9505041e30386dad9b2e788f67aaf --

[issue38804] Regular Expression Denial of Service in http.cookiejar

2019-11-22 Thread STINNER Victor
STINNER Victor added the comment: I'm now tracking this vulnerability at: https://python-security.readthedocs.io/vuln/cookiejar-redos.html -- ___ Python tracker ___

[issue38804] Regular Expression Denial of Service in http.cookiejar

2019-11-22 Thread miss-islington
miss-islington added the comment: New changeset a1e1be4c4969c7c20c8c958e5ab5279ae6a66a16 by Miss Islington (bot) in branch '3.8': bpo-38804: Fix REDoS in http.cookiejar (GH-17157) https://github.com/python/cpython/commit/a1e1be4c4969c7c20c8c958e5ab5279ae6a66a16 -- nosy:

[issue38804] Regular Expression Denial of Service in http.cookiejar

2019-11-22 Thread miss-islington
miss-islington added the comment: New changeset cb6085138a845f8324adc011b65754acc2086cc0 by Miss Islington (bot) in branch '3.7': bpo-38804: Fix REDoS in http.cookiejar (GH-17157) https://github.com/python/cpython/commit/cb6085138a845f8324adc011b65754acc2086cc0 --

[issue38804] Regular Expression Denial of Service in http.cookiejar

2019-11-22 Thread STINNER Victor
Change by STINNER Victor : -- pull_requests: +16829 pull_request: https://github.com/python/cpython/pull/17345 ___ Python tracker ___

[issue38804] Regular Expression Denial of Service in http.cookiejar

2019-11-22 Thread STINNER Victor
Change by STINNER Victor : -- pull_requests: +16828 pull_request: https://github.com/python/cpython/pull/17344 ___ Python tracker ___

[issue38804] Regular Expression Denial of Service in http.cookiejar

2019-11-22 Thread miss-islington
Change by miss-islington : -- pull_requests: +16827 pull_request: https://github.com/python/cpython/pull/17343 ___ Python tracker ___

[issue38804] Regular Expression Denial of Service in http.cookiejar

2019-11-22 Thread miss-islington
Change by miss-islington : -- pull_requests: +16826 pull_request: https://github.com/python/cpython/pull/17342 ___ Python tracker ___

[issue38804] Regular Expression Denial of Service in http.cookiejar

2019-11-22 Thread miss-islington
Change by miss-islington : -- pull_requests: +16825 pull_request: https://github.com/python/cpython/pull/17341 ___ Python tracker ___

[issue38804] Regular Expression Denial of Service in http.cookiejar

2019-11-22 Thread STINNER Victor
STINNER Victor added the comment: New changeset 1b779bfb8593739b11cbb988ef82a883ec9d077e by Victor Stinner (bcaller) in branch 'master': bpo-38804: Fix REDoS in http.cookiejar (GH-17157) https://github.com/python/cpython/commit/1b779bfb8593739b11cbb988ef82a883ec9d077e --

[issue38826] Regular Expression Denial of Service in urllib.request.AbstractBasicAuthHandler

2019-11-17 Thread Ben Caller
Ben Caller added the comment: I have been advised that DoS issues can be added to the public bug tracker since there is no privilege escalation, but should still have the security label. -- ___ Python tracker

[issue38826] Regular Expression Denial of Service in urllib.request.AbstractBasicAuthHandler

2019-11-16 Thread Karthikeyan Singaravelan
Karthikeyan Singaravelan added the comment: Thanks for the report. Please report security issues to secur...@python.org so that the security team can analyze and triage it to be made public. More information at https://www.python.org/news/security/ -- nosy: +xtreak

[issue38826] Regular Expression Denial of Service in urllib.request.AbstractBasicAuthHandler

2019-11-16 Thread Ben Caller
New submission from Ben Caller : The regular expression urllib.request.AbstractBasicAuthHandler.rx is vulnerable to malicious inputs which cause denial of service (REDoS). The regex is: rx = re.compile('(?:.*,)*[ \t]*([^ \t]+)[ \t]+' 'realm=(["\']?)([^"\']*)

[issue38804] Regular Expression Denial of Service in http.cookiejar

2019-11-14 Thread Karthikeyan Singaravelan
Change by Karthikeyan Singaravelan : -- nosy: +xtreak

[issue38804] Regular Expression Denial of Service in http.cookiejar

2019-11-14 Thread Karthikeyan Singaravelan
Change by Karthikeyan Singaravelan : -- nosy: +serhiy.storchaka, vstinner

[issue38804] Regular Expression Denial of Service in http.cookiejar

2019-11-14 Thread Ben Caller
Change by Ben Caller : -- keywords: +patch pull_requests: +1 stage: -> patch review pull_request: https://github.com/python/cpython/pull/17157 ___ Python tracker ___

[issue38804] Regular Expression Denial of Service in http.cookiejar

2019-11-14 Thread Ben Caller
New submission from Ben Caller : The regex http.cookiejar.LOOSE_HTTP_DATE_RE is vulnerable to regular expression denial of service (REDoS). LOOSE_HTTP_DATE_RE.match is called when using http.cookiejar.CookieJar to parse Set-Cookie headers returned by a server. Processing a response from

[issue22491] Support Unicode line boundaries in regular expression

2019-10-27 Thread Lewis Gaul
Lewis Gaul added the comment: Hi there, I'm running 'EnHackathon' in a couple of weeks, and was wondering if this could be a good issue for a small team of first-time contributors with experience in C to work on. Would anyone be able to offer any guidance for where to start in

Re: python3, regular expression and bytes text

2019-10-14 Thread Eko palypse
Am Montag, 14. Oktober 2019 13:56:09 UTC+2 schrieb Chris Angelico: > > (My apologies for saying this in reply to an unrelated post, but I > also don't see those posts, so it's not easy to reply to them.) > > ChrisA Nothing to apologize and thank you for clarification, I was already checking my

Re: python3, regular expression and bytes text

2019-10-14 Thread Chris Angelico
On Mon, Oct 14, 2019 at 10:41 PM Eko palypse wrote: > > Am Sonntag, 13. Oktober 2019 21:20:26 UTC+2 schrieb moi: > > [Do not know why I spent hours with this...] > > > > To answer you question. > > Yes, I confirm. > > It seems that as soon as one works with bytes and when > > a char is encoded in

Re: python3, regular expression and bytes text

2019-10-14 Thread Eko palypse
Am Sonntag, 13. Oktober 2019 21:20:26 UTC+2 schrieb moi: > [Do not know why I spent hours with this...] > > To answer you question. > Yes, I confirm. > It seems that as soon as one works with bytes and when > a char is encoded in more than 1 byte, "re" goes into > troubles. > First, sorry for

Re: python3, regular expression and bytes text

2019-10-12 Thread Eko palypse
of the current buffer to me. The problem is that the buffer can have all possible encodings. cp1251, cp1252, utf8, ucs-2 ... but scintilla informs me about which encoding is currently used. I wanted to realize a regular expression tester with Python3, and mark the text that has been matched by regular

Re: python3, regular expression and bytes text

2019-10-12 Thread MRAB
with re.LOCALE are slow. It may be more efficient to decode text and use Unicode regular expression. Thank you, I guess I'm convinced to always decode everything (re pattern and text) to utf8 internally and then do the re search but then I would need to figure out the correct position, hmm - some

Re: python3, regular expression and bytes text

2019-10-12 Thread Chris Angelico
On Sun, Oct 13, 2019 at 7:16 AM Richard Damon wrote: > > On 10/12/19 3:46 PM, Eko palypse wrote: > > Thank you very much for your answer. > > > >> You have to be able to match bytes, not strings. > > May I ask you to elaborate on this, sorry non-native English speaker. > > The buffer I receive is

Re: python3, regular expression and bytes text

2019-10-12 Thread MRAB
charsets. So even if you set the utf-8 locale, it would not help. Regular expressions with re.LOCALE are slow. It may be more efficient to decode text and use Unicode regular expression. +1 It's best to treat re.LOCALE as being for old legacy encodings that use/used 8 bits per character. Wherever

Re: python3, regular expression and bytes text

2019-10-12 Thread Richard Damon
On 10/12/19 3:46 PM, Eko palypse wrote: > Thank you very much for your answer. > >> You have to be able to match bytes, not strings. > May I ask you to elaborate on this, sorry non-native English speaker. > The buffer I receive is a byte-like buffer. > >> I don't think you'll be able to 100%

Re: python3, regular expression and bytes text

2019-10-12 Thread Chris Angelico
hen you're matching text (the normal way you use a regular expression), every element in the RE matches a character (or emptiness). For instance, the regular expression "^[bc]at$" has these elements: "^" matches emptiness at the start "[bc]" matches either "b

Re: python3, regular expression and bytes text

2019-10-12 Thread Eko palypse
ow. It may be more efficient to > decode text and use Unicode regular expression. Thank you, I guess I'm convinced to always decode everything (re pattern and text) to utf8 internally and then do the re search but then I would need to figure out the correct position, hmm - some ongoing in
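One possible way to handle the position problem mentioned here is to search the decoded text and then convert character offsets back to byte offsets. This is a sketch under the assumption that the buffer is valid UTF-8; `char_to_byte_offset` is a hypothetical helper, not part of `re` or of Scintilla.

```python
import re

data = "Ärger im Paradies".encode("utf-8")   # bytes as received from the editor
text = data.decode("utf-8")                  # decode once, search as str

def char_to_byte_offset(text, index, encoding="utf-8"):
    """Byte offset of character position `index` when `text` is encoded."""
    return len(text[:index].encode(encoding))

m = re.search(r"Paradies", text)
byte_start = char_to_byte_offset(text, m.start())
byte_end = char_to_byte_offset(text, m.end())

print(m.span())                     # (9, 17)   character positions
print(byte_start, byte_end)         # 10 18     byte positions ("Ä" is two bytes)
print(data[byte_start:byte_end])    # b'Paradies'
```

Re-encoding a prefix for every match is linear in the prefix length, so for many matches in a large buffer it may be worth converting offsets incrementally from the previous match instead.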

Re: python3, regular expression and bytes text

2019-10-12 Thread Eko palypse
Thank you very much for your answer. > You have to be able to match bytes, not strings. May I ask you to elaborate on this, sorry non-native English speaker. The buffer I receive is a byte-like buffer. > I don't think you'll be able to 100% reliably match bytes in this way. > You're asking it

Re: python3, regular expression and bytes text

2019-10-12 Thread Serhiy Storchaka
, it would not help. Regular expressions with re.LOCALE are slow. It may be more efficient to decode text and use Unicode regular expression. -- https://mail.python.org/mailman/listinfo/python-list

Re: python3, regular expression and bytes text

2019-10-12 Thread Chris Angelico
On Sun, Oct 13, 2019 at 5:11 AM Eko palypse wrote: > > What needs to be set in order to be able to use a re search within > utf8 encoded bytes? You have to be able to match bytes, not strings. > So how can I make it work with utf8 encoded text? > Note, decoding it to a string isn't preferred as
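"Match bytes, not strings" can be illustrated as follows: the pattern and the subject must be of the same type, so searching a bytes buffer needs a bytes pattern. This is a minimal sketch using the poster's sample text; the names `utf8`, `mixed_ok` and `m` are introduced here.

```python
import re

utf8 = "Ärger im Paradies".encode("utf-8")

# A str pattern cannot be applied to a bytes subject.
try:
    re.search("Paradies", utf8)
    mixed_ok = True
except TypeError:
    mixed_ok = False
print(mixed_ok)     # False

# Encoding the pattern the same way as the buffer makes a literal search work;
# the reported span is in bytes, not characters.
m = re.search("Ärger".encode("utf-8"), utf8)
print(m.span())     # (0, 6)
```

This works for literal text, but character-oriented constructs such as `.`, `\w` or a character class operate on single bytes in a bytes pattern, so a multi-byte character is not treated as one unit.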

python3, regular expression and bytes text

2019-10-12 Thread Eko palypse
What needs to be set in order to be able to use a re search within utf8 encoded bytes? My test, being on a windows PC with cp1252 setup, looks like this import re import locale cp1252 = 'Ärger im Paradies'.encode('cp1252') utf8 = 'Ärger im Paradies'.encode('utf-8') print('cp1252:', cp1252)
