[issue43075] ReDoS in urllib.request
yeting li added the comment:

For a regex that has polynomial worst-case complexity, limiting the maximum input length is indeed a very effective method. As shown below, the matching time drops sharply as the input gets shorter.

    header = '' + ',' * (10 ** 4)    1.617s
    header = '' + ',' * (10 ** 3)    0.014s
    header = '' + ',' * (10 ** 2)    0.00017s

--
Python tracker <https://bugs.python.org/issue43075>

Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
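A minimal sketch of the length-limiting idea, using only the pattern prefix quoted in this thread. The cap `MAX_HEADER_LEN`, the helper names, and the use of `search` are assumptions made for illustration; they are not taken from urllib.request. Timings will differ from the figures above depending on the machine.

```python
import re
from time import perf_counter

# Prefix of the pattern as quoted in this thread (the "current version").
CURRENT = re.compile(r'(?:^|,)[ \t]*([^ \t]+)[ \t]+')

# Hypothetical cap, chosen only for this illustration.
MAX_HEADER_LEN = 1000


def search_capped(header, limit=MAX_HEADER_LEN):
    """Run the regex only on headers no longer than `limit` characters."""
    if len(header) > limit:
        raise ValueError(f"header longer than {limit} characters")
    return CURRENT.search(header)


def seconds_for(n):
    """Time one uncapped search over a header made of n commas."""
    header = ',' * n
    begin = perf_counter()
    CURRENT.search(header)
    return perf_counter() - begin


if __name__ == "__main__":
    # Matching time grows much faster than the input length.
    for n in (10 ** 2, 10 ** 3, 2 * 10 ** 3):
        print(f"{n:>5} commas: {seconds_for(n):.5f}s")
```

With a cap in place the worst case is bounded by the cost of the longest accepted header, whatever the pattern's complexity.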
[issue43075] ReDoS in urllib.request
yeting li added the comment:

Sorry for the delay. I analyzed the performance of the current version '(?:^|,)[ \t]*([^ \t]+)[ \t]+' and the fixed version '(?:^|,)[ \t]*([^ \t,]+)[ \t]+'. I tested the following HTTP header ten times:

    header = '' + ',' * (10 ** 5)

The current version takes about 139.178s-140.946s, while the fixed version takes about 0.006s. You can analyze them with the code below.

    from time import perf_counter

    # Headers, handler, host, req and repeat_10_5_simple are not defined in
    # this message; they are assumed to be set up by the surrounding script.
    for _ in range(0, 10):
        BEGIN = perf_counter()
        header = repeat_10_5_simple
        headers = Headers(header)
        handler.http_error_auth_reqed("WWW-Authenticate", host, req, headers)
        DURATION = perf_counter() - BEGIN
        print(f"took {DURATION} seconds!")

CVE-2020-8492 is a backtracking problem caused by ambiguity during matching, whereas this issue arises because the regex engine keeps retrying the pattern at each successive position of a malicious string that contains no match. Because both vulnerabilities are at the same location, I based this on your code. Thanks for the code ;-)!
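The comparison can also be tried without the handler objects. The sketch below is self-contained and uses only the two pattern prefixes quoted above; applying them with `search`, and the helper names `first_token` and `seconds_for`, are assumptions for illustration rather than what urllib.request does internally. The header is kept much shorter than 10 ** 5 commas so the run finishes quickly.

```python
import re
from time import perf_counter

CURRENT = re.compile(r'(?:^|,)[ \t]*([^ \t]+)[ \t]+')
FIXED = re.compile(r'(?:^|,)[ \t]*([^ \t,]+)[ \t]+')


def first_token(pattern, header):
    """Return group 1 of the first match, or None if nothing matches."""
    match = pattern.search(header)
    return match.group(1) if match else None


def seconds_for(pattern, header):
    """Time a single search of `header` with `pattern`."""
    begin = perf_counter()
    pattern.search(header)
    return perf_counter() - begin


if __name__ == "__main__":
    header = '' + ',' * (2 * 10 ** 3)
    print(f"current: {seconds_for(CURRENT, header):.5f}s")
    print(f"fixed:   {seconds_for(FIXED, header):.5f}s")
    # Neither pattern finds a match in a header made only of commas.
    print(first_token(CURRENT, header), first_token(FIXED, header))
```

With the current pattern, `[^ \t]+` can consume the commas that follow each candidate start before `[ \t]+` fails, so every start position costs work proportional to the remaining input. Excluding `,` from the class makes each attempt fail immediately. One side effect worth knowing: the two patterns capture different tokens when a token itself contains a comma, for example in 'a,b c'.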
[issue43075] ReDoS in request
yeting li added the comment:

Thank you for your quick reply! I agree. Catastrophic backtracking usually refers to regexes with exponential worst-case matching time. Besides those, ReDoS also covers regexes with other super-linear (e.g., quadratic) worst-case time complexity.

Thanks again for your reply; I'm working on a pull request for it.
[issue43075] ReDoS in request
Change by yeting li:

--
keywords: +patch
pull_requests: +23205
stage: needs patch -> patch review
pull_request: https://github.com/python/cpython/pull/24391
[issue43075] ReDoS in request
New submission from yeting li:

Hi,

I found that the regex '(?:^|,)[ \t]*([^ \t]+)[ \t]+' can get stuck on certain inputs.

The vulnerable regex is located at
https://github.com/python/cpython/blob/5c5a938573ce665f00e362c7766912d9b3f3b44e/Lib/urllib/request.py#L946

The ReDoS vulnerability of the regex is mainly due to the sub-pattern ',([^ \t]+)' and can be exploited with the following string:

    attack_str = "," * 1

You can run redos_python.py to reproduce the ReDoS vulnerability.

I suggest replacing '(?:^|,)[ \t]*([^ \t]+)[ \t]+' with '(?:^|,)[ \t]*([^ \t,]+)[ \t]+'.

Looking forward to your response!

Best,
Yeting Li

--
components: Library (Lib)
files: redos_python.py
messages: 385974
nosy: yetingli
priority: normal
severity: normal
status: open
title: ReDoS in request
versions: Python 3.6, Python 3.7, Python 3.8, Python 3.9

Added file: https://bugs.python.org/file49778/redos_python.py
[issue41921] REDoS in parseentities
New submission from yeting li:

Hi,

I found that the regex '' can get stuck on certain inputs.

The vulnerable regex is located at
https://github.com/python/cpython/blob/8d21aa21f2cbc6d50aab3f420bb23be1d081dac4/Tools/scripts/parseentities.py#L18

The ReDoS vulnerability of the regex is mainly due to the sub-pattern ' +((?:.|\n)+?) *' and can be exploited with the following string '

--
Python tracker <https://bugs.python.org/issue41921>
[issue41715] REDoS in c_analyzer
yeting li added the comment:

I'm sorry, there was a typo just now. Replace _\w*[a-zA-Z]\w* with (_\d*)+([a-zA-Z]([_\d])*)+

--
Python tracker <https://bugs.python.org/issue41715>
[issue41715] REDoS in c_analyzer
yeting li added the comment:

You can use the dk.brics.automaton library to verify whether two regexes are equivalent.
[issue41715] REDoS in c_analyzer
yeting li added the comment:

I think we can replace \w*[a-zA-Z]\w* with (_\d*)+([a-zA-Z]([_\d])*)+

This is an equivalent fix and the fixed regex is safe. Does that sound right to you?
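The equivalence claim can be spot-checked in plain Python by brute force. This is only a sketch under stated assumptions: it takes the sub-pattern with its leading underscore, as it appears in the reported regex; it compiles both patterns with `re.ASCII`; and it enumerates strings over a tiny alphabet up to a small length. It is a sanity check, not a proof. The names `OLD`, `NEW` and `disagreements` are made up for this example.

```python
import re
from itertools import product

# re.ASCII restricts \w and \d to ASCII, which this comparison assumes.
OLD = re.compile(r'_\w*[a-zA-Z]\w*', re.ASCII)
NEW = re.compile(r'(_\d*)+([a-zA-Z]([_\d])*)+', re.ASCII)


def disagreements(alphabet="_a1!", max_len=7):
    """Return every string on which exactly one pattern fully matches."""
    diffs = []
    for length in range(max_len + 1):
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            if bool(OLD.fullmatch(candidate)) != bool(NEW.fullmatch(candidate)):
                diffs.append(candidate)
    return diffs


if __name__ == "__main__":
    # An empty list means no counterexample was found in this search space.
    print(disagreements())
```

Without `re.ASCII`, `\w` also matches non-ASCII word characters, so the two patterns would differ on a string such as "_éa", which the old sub-pattern accepts and the replacement rejects.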
[issue41715] REDoS in c_analyzer
Change by yeting li:

--
components: +Library (Lib)
type:  -> security
versions: +Python 3.10
[issue41715] REDoS in c_analyzer
Change by yeting li:

--
title: REDoS inc_analyzer -> REDoS in c_analyzer
[issue41715] REDoS inc_analyzer
New submission from yeting li:

Hi,

I found that the regex "^([a-zA-Z]|_\w*[a-zA-Z]\w*|[a-zA-Z]\w*)$" can get stuck on certain inputs.

The vulnerable regex is located at
https://github.com/python/cpython/blob/54a66ade2067c373d31003ad260e1b7d14c81564/Tools/c-analyzer/c_analyzer/common/info.py#L12

The ReDoS vulnerability of the regex is mainly due to the sub-pattern \w*[a-zA-Z]\w* and can be exploited with the following string:

    "_" + "a" * 5000 + "!"

I think you can limit the input length or fix this regex.

Looking forward to your response!

Best,
Yeting Li

--
files: info.py
messages: 376355
nosy: yetingli
priority: normal
severity: normal
status: open
title: REDoS inc_analyzer

Added file: https://bugs.python.org/file49445/info.py
[issue41712] REDoS in purge
New submission from yeting li:

I found that the regex "(\d+\.\d+\.\d+)(\w+\d+)?$" can get stuck on certain inputs.

The vulnerable regex is located at
https://github.com/python/cpython/blob/54a66ade2067c373d31003ad260e1b7d14c81564/Tools/msi/purge.py#L15

The ReDoS vulnerability of the regex is mainly due to the sub-pattern \w+\d+ and can be exploited with the following string:

    "1.1.1" + "1" * 5000 + "!"

I think you can limit the input length or fix this regex. For example, you can change the sub-pattern \w+\d+ to ([A-Za-z_]*\d)+

Looking forward to your response!

Best,
Yeting Li

--
components: Library (Lib)
files: purge.py
messages: 376343
nosy: yetingli
priority: normal
severity: normal
status: open
title: REDoS in purge
type: security
versions: Python 3.10

Added file: https://bugs.python.org/file49443/purge.py

--
Python tracker <https://bugs.python.org/issue41712>
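Before swapping the sub-pattern, it may help to confirm that ordinary version strings still split the same way. The sketch below is an illustration under assumptions: the pattern is applied with `re.match`, the suggested sub-pattern is wrapped in one extra group so that group 2 still holds the whole suffix, and the sample version strings are made up. It checks behaviour only and makes no claim about running time.

```python
import re

OLD = re.compile(r"(\d+\.\d+\.\d+)(\w+\d+)?$")
# Extra parentheses around the suggested sub-pattern keep the suffix in group 2.
SUGGESTED = re.compile(r"(\d+\.\d+\.\d+)(([A-Za-z_]*\d)+)?$")


def split_version(pattern, text):
    """Return (release, suffix) if `text` matches from the start, else None."""
    match = pattern.match(text)
    return (match.group(1), match.group(2)) if match else None


if __name__ == "__main__":
    for sample in ("3.9.0", "3.9.0rc1", "3.10.0a7", "3.9"):
        print(sample, split_version(OLD, sample), split_version(SUGGESTED, sample))
```

Taken in isolation the two sub-patterns are not identical: ([A-Za-z_]*\d)+ accepts a single digit while \w+\d+ needs at least two characters. Inside the full regex a lone trailing digit is absorbed by the preceding \d+, so the samples above split identically.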