[issue43075] ReDoS in urllib.request

2021-04-07 Thread yeting li


yeting li  added the comment:

For a regex with polynomial worst-case complexity, limiting the maximum input 
length is indeed a very effective mitigation.

As shown below, the matching time drops sharply as the input gets shorter.

header = '' + ',' * (10 ** 4)    1.617s
header = '' + ',' * (10 ** 3)    0.014s
header = '' + ',' * (10 ** 2)    0.00017s
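
The length cap described above can be sketched as a small guard in front of 
the regex. The helper name search_bounded and the 1000-character limit are 
assumptions made up for this illustration; they are not part of urllib. The 
pattern is the one quoted in this issue.

```python
import re
from time import perf_counter

# Pattern quoted in this issue (the scheme-matching part only).
PATTERN = re.compile(r'(?:^|,)[ \t]*([^ \t]+)[ \t]+')

# Hypothetical cap, chosen only for this sketch.
MAX_HEADER_LEN = 1000


def search_bounded(header, max_len=MAX_HEADER_LEN):
    """Refuse over-long input before it reaches the regex engine."""
    if len(header) > max_len:
        raise ValueError(f"header longer than {max_len} characters")
    return PATTERN.search(header)


# Time the guarded search on two inputs that stay within the cap.
for exponent in (2, 3):
    header = ',' * (10 ** exponent)
    begin = perf_counter()
    result = search_bounded(header)
    duration = perf_counter() - begin
    print(f"length {len(header)}: {duration:.5f}s, match={result}")
```

Rejecting the input outright is only one possible policy; truncating it would 
also bound the matching time, but it changes what gets parsed.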

--

___
Python tracker 
<https://bugs.python.org/issue43075>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue43075] ReDoS in urllib.request

2021-03-14 Thread yeting li


yeting li  added the comment:

Sorry for the delay. I compared the performance of the current version 
'(?:^|,)[ \t]*([^ \t]+)[ \t]+' and the fixed version 
'(?:^|,)[ \t]*([^ \t,]+)[ \t]+'. I ran each one ten times against the 
following HTTP header:

header = '' + ',' * (10 ** 5)

The current version takes about 139.178s-140.946s, while the repaired version 
takes about 0.006s.

You can analyze them with the code below.

from time import perf_counter

# handler, host, req, Headers and repeat_10_5_simple are not defined in this
# message; repeat_10_5_simple is presumably the header shown above.
for _ in range(10):
    begin = perf_counter()
    header = repeat_10_5_simple
    headers = Headers(header)
    handler.http_error_auth_reqed("WWW-Authenticate", host, req, headers)
    duration = perf_counter() - begin
    print(f"took {duration} seconds!")
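
A self-contained variant that times only the two patterns, without the urllib 
handler, might look like the sketch below. The helper name time_search is an 
assumption for illustration, and the header is shortened to 10 ** 3 commas so 
that the current pattern finishes quickly.

```python
import re
from time import perf_counter

# The two patterns quoted in this message.
CURRENT = re.compile(r'(?:^|,)[ \t]*([^ \t]+)[ \t]+')
FIXED = re.compile(r'(?:^|,)[ \t]*([^ \t,]+)[ \t]+')


def time_search(pattern, text):
    """Return (seconds, match) for one pattern.search(text) call."""
    begin = perf_counter()
    match = pattern.search(text)
    return perf_counter() - begin, match


# Shorter than the 10 ** 5 header used in the measurement above.
header = ',' * (10 ** 3)
for name, pattern in (("current", CURRENT), ("fixed", FIXED)):
    seconds, match = time_search(pattern, header)
    print(f"{name}: {seconds:.6f}s, match={match}")
```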

CVE-2020-8492 was caused by backtracking that results from ambiguity during 
matching. This issue is different: the regex engine keeps retrying the regex 
at each successive position of a malicious string that contains no match.
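
One way to see that effect is to compare a single anchored attempt with a full 
scan. This is only a sketch: the input length of 2000 is an arbitrary choice, 
and the pattern is the current one quoted in this message.

```python
import re
from time import perf_counter

PATTERN = re.compile(r'(?:^|,)[ \t]*([^ \t]+)[ \t]+')
text = ',' * 2000

# match() makes one attempt at position 0: the engine backtracks over the
# run of commas and then gives up.
begin = perf_counter()
anchored = PATTERN.match(text)
one_attempt = perf_counter() - begin

# search() repeats a similar failing attempt from every start position.
begin = perf_counter()
scanned = PATTERN.search(text)
all_attempts = perf_counter() - begin

print(f"match at 0: {one_attempt:.6f}s, search: {all_attempts:.6f}s")
```

Each attempt costs time proportional to the remaining input, and search() 
makes one attempt per position, which is where the quadratic total comes from.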

Because the two vulnerabilities are in the same location, I based this on 
your code. Thanks for the code ;-)!




[issue43075] ReDoS in request

2021-01-30 Thread yeting li


yeting li  added the comment:

Thank you for your quick reply!

I agree. Catastrophic backtracking usually refers to a regex with exponential 
worst-case matching time. ReDoS is broader: besides regexes with exponential 
worst-case time complexity, it also covers ones with other super-linear 
(e.g., quadratic) worst-case time complexity.
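
The difference between the two classes can be sketched with a small timing 
loop. The pattern '(a+)+$' is a textbook exponential example and does not come 
from this issue; the second pattern is the one reported here. The helper name 
seconds_for and the input sizes are assumptions for illustration.

```python
import re
from time import perf_counter

# Textbook exponential case (not from this issue): nested quantifiers.
EXPONENTIAL = re.compile(r'(a+)+$')
# Super-linear (quadratic) case reported in this issue.
QUADRATIC = re.compile(r'(?:^|,)[ \t]*([^ \t]+)[ \t]+')


def seconds_for(func, text):
    """Return (seconds, result) for one call of func(text)."""
    begin = perf_counter()
    result = func(text)
    return perf_counter() - begin, result


# Exponential: add two characters per step.
for n in (12, 14, 16):
    elapsed, _ = seconds_for(EXPONENTIAL.match, 'a' * n + '!')
    print(f"exponential, n={n}: {elapsed:.5f}s")

# Quadratic: double the input per step.
for n in (250, 500, 1000):
    elapsed, _ = seconds_for(QUADRATIC.search, ',' * n)
    print(f"quadratic, n={n}: {elapsed:.5f}s")
```

On CPython the first loop typically slows down by a roughly constant factor 
for every added character, while the second grows with the square of the 
length.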


Thanks again for your reply. I'm working on a pull request for it.




[issue43075] ReDoS in request

2021-01-30 Thread yeting li


Change by yeting li :


--
keywords: +patch
pull_requests: +23205
stage: needs patch -> patch review
pull_request: https://github.com/python/cpython/pull/24391




[issue43075] ReDoS in request

2021-01-30 Thread yeting li

New submission from yeting li :

Hi,

I found that the regex '(?:^|,)[ \t]*([^ \t]+)[ \t]+' can get stuck on 
certain input.

The vulnerable regex is located in 
https://github.com/python/cpython/blob/5c5a938573ce665f00e362c7766912d9b3f3b44e/Lib/urllib/request.py#L946

The ReDoS vulnerability in the regex is mainly due to the sub-pattern 
',([^ \t]+)' and can be exploited with the following string:
attack_str = "," * 1

You can execute redos_python.py to reproduce the ReDoS vulnerability.


I suggest replacing '(?:^|,)[ \t]*([^ \t]+)[ \t]+' with 
'(?:^|,)[ \t]*([^ \t,]+)[ \t]+'
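
As a quick sanity check of the suggested replacement, the sketch below 
compares which tokens the two patterns capture. The helper name schemes and 
both sample inputs are made up for this illustration.

```python
import re

CURRENT = re.compile(r'(?:^|,)[ \t]*([^ \t]+)[ \t]+')
SUGGESTED = re.compile(r'(?:^|,)[ \t]*([^ \t,]+)[ \t]+')


def schemes(pattern, header):
    """Return every token captured by group 1 in the header."""
    return [m.group(1) for m in pattern.finditer(header)]


# Hypothetical well-formed header value.
header = 'Basic realm="a", Digest realm="b"'
print(schemes(CURRENT, header))
print(schemes(SUGGESTED, header))

# A comma inside the token is where the two patterns can differ.
print(schemes(CURRENT, 'a,b c'))
print(schemes(SUGGESTED, 'a,b c'))
```

The two patterns agree on the well-formed header and capture different tokens 
on the second input, so the replacement is a behavior change for tokens that 
contain a comma.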

Looking forward to your response!

Best,
Yeting Li

--
components: Library (Lib)
files: redos_python.py
messages: 385974
nosy: yetingli
priority: normal
severity: normal
status: open
title: ReDoS in request
versions: Python 3.6, Python 3.7, Python 3.8, Python 3.9
Added file: https://bugs.python.org/file49778/redos_python.py




[issue41921] REDoS in parseentities

2020-10-03 Thread yeting li

New submission from yeting li :

Hi,

I found that the regex '' 
can get stuck on certain input.
The vulnerable regex is located in
https://github.com/python/cpython/blob/8d21aa21f2cbc6d50aab3f420bb23be1d081dac4/Tools/scripts/parseentities.py#L18

The ReDoS vulnerability in the regex is mainly due to the sub-pattern 
' +((?:.|\n)+?) *'
and can be exploited with the following string
'
<https://bugs.python.org/issue41921>
___



[issue41715] REDoS in c_analyzer

2020-09-04 Thread yeting li


yeting li  added the comment:

Sorry, there was a typo in my previous message. It should read:


Replace _\w*[a-zA-Z]\w* with (_\d*)+([a-zA-Z]([_\d])*)+

--

___
Python tracker 
<https://bugs.python.org/issue41715>
___



[issue41715] REDoS in c_analyzer

2020-09-04 Thread yeting li


yeting li  added the comment:

You can use the dk.brics.automaton library to verify whether two regexes are 
equivalent.
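
For a lightweight check without that library, a brute-force comparison over 
short strings can be sketched in Python. It is not a proof of equivalence, 
only a search for counterexamples; the helper name disagreements, the 
four-character alphabet, and the length bound are assumptions for 
illustration. The two patterns are the sub-pattern and the corrected 
replacement quoted in this thread.

```python
import re
from itertools import product

# Sub-pattern from the report and the corrected replacement from this thread.
ORIGINAL = re.compile(r'_\w*[a-zA-Z]\w*')
REPLACEMENT = re.compile(r'(_\d*)+([a-zA-Z]([_\d])*)+')


def disagreements(left, right, alphabet, max_len):
    """List strings up to max_len that only one pattern fully matches."""
    found = []
    for length in range(max_len + 1):
        for chars in product(alphabet, repeat=length):
            text = ''.join(chars)
            if bool(left.fullmatch(text)) != bool(right.fullmatch(text)):
                found.append(text)
    return found


# Small ASCII alphabet: underscore, a letter, a digit, a non-word character.
print(disagreements(ORIGINAL, REPLACEMENT, '_a1!', 5))
```

An empty list only means that no counterexample exists within the chosen 
alphabet and length; non-ASCII word characters are not covered.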




[issue41715] REDoS in c_analyzer

2020-09-04 Thread yeting li


yeting li  added the comment:

I think we can replace \w*[a-zA-Z]\w* with (_\d*)+([a-zA-Z]([_\d])*)+

This is an equivalent fix and the fixed regex is safe.

Does that sound right to you?




[issue41715] REDoS in c_analyzer

2020-09-04 Thread yeting li


Change by yeting li :


--
components: +Library (Lib)
type:  -> security
versions: +Python 3.10




[issue41715] REDoS in c_analyzer

2020-09-04 Thread yeting li


Change by yeting li :


--
title: REDoS inc_analyzer -> REDoS in c_analyzer




[issue41715] REDoS inc_analyzer

2020-09-04 Thread yeting li

New submission from yeting li :

Hi,

I found that the regex "^([a-zA-Z]|_\w*[a-zA-Z]\w*|[a-zA-Z]\w*)$" can get 
stuck on certain input.
The vulnerable regex is located in
https://github.com/python/cpython/blob/54a66ade2067c373d31003ad260e1b7d14c81564/Tools/c-analyzer/c_analyzer/common/info.py#L12

The ReDoS vulnerability in the regex is mainly due to the sub-pattern 
\w*[a-zA-Z]\w*
and can be exploited with the following string:
"_" + "a" * 5000 + "!"


I think you can limit the input length or fix this regex.
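
The slowdown can be observed with a short timing loop. This is a sketch only: 
the input sizes are smaller than the 5000 in the attack string above so that 
the loop stays quick, and the name NAME_RE is chosen for this illustration.

```python
import re
from time import perf_counter

# Regex quoted in this report.
NAME_RE = re.compile(r'^([a-zA-Z]|_\w*[a-zA-Z]\w*|[a-zA-Z]\w*)$')

# Same shape as the attack string, with a growing number of "a" characters.
for n in (500, 1000, 2000):
    attack = '_' + 'a' * n + '!'
    begin = perf_counter()
    result = NAME_RE.match(attack)
    duration = perf_counter() - begin
    print(f"n={n}: {duration:.5f}s, match={result}")
```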


Looking forward to your response!

Best,
Yeting Li

--
files: info.py
messages: 376355
nosy: yetingli
priority: normal
severity: normal
status: open
title: REDoS inc_analyzer
Added file: https://bugs.python.org/file49445/info.py




[issue41712] REDoS in purge

2020-09-04 Thread yeting li

New submission from yeting li :

I found that the regex "(\d+\.\d+\.\d+)(\w+\d+)?$" can get stuck on certain 
input.
The vulnerable regex is located in
https://github.com/python/cpython/blob/54a66ade2067c373d31003ad260e1b7d14c81564/Tools/msi/purge.py#L15

The ReDoS vulnerability in the regex is mainly due to the sub-pattern \w+\d+
and can be exploited with the following string:
"1.1.1"+"1" * 5000 + "!"


I think you can limit the input length or fix this regex.

For example, you can modify the sub-pattern \w+\d+ to ([A-Za-z_]*\d)+
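
To compare the two variants, a timing sketch along these lines could be used. 
It only measures; it does not assume that the modified regex is linear. The 
attack string is much shorter than the one above so that the original regex 
finishes quickly, and the version string in the last lines is a made-up 
sample.

```python
import re
from time import perf_counter

# Regex quoted in this report, and the same regex with the suggested
# sub-pattern substituted for \w+\d+.
ORIGINAL = re.compile(r'(\d+\.\d+\.\d+)(\w+\d+)?$')
SUGGESTED = re.compile(r'(\d+\.\d+\.\d+)(([A-Za-z_]*\d)+)?$')

# Same shape as the attack string above, but with 150 repeated digits.
attack = '1.1.1' + '1' * 150 + '!'
for name, pattern in (("original", ORIGINAL), ("suggested", SUGGESTED)):
    begin = perf_counter()
    result = pattern.match(attack)
    duration = perf_counter() - begin
    print(f"{name}: {duration:.5f}s, match={result}")

# Both variants should still accept an ordinary version string.
sample = '3.9.0rc1'
print(ORIGINAL.match(sample).group(2), SUGGESTED.match(sample).group(2))
```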

Looking forward to your response!

Best,
Yeting Li

--
components: Library (Lib)
files: purge.py
messages: 376343
nosy: yetingli
priority: normal
severity: normal
status: open
title: REDoS in purge
type: security
versions: Python 3.10
Added file: https://bugs.python.org/file49443/purge.py

___
Python tracker 
<https://bugs.python.org/issue41712>
___