>> A regex that's vulnerable to pathological behavior is a DoS attack waiting
>> to happen. Especially when used for parsing log data (which might contain
>> untrusted data). If possible, we should make it harder for people to shoot
>> themselves in the feet.

This is exactly what happened to me. I have a job that automatically parses
logs as they are uploaded. One log contained an unexpected pattern that
triggered pathological behavior in my regex, behavior that never occurred on
the expected input. The import pipeline backed up for many hours before I
noticed and fixed it.


> While definitely not as bad and not as likely as SQL injection, I think
> the possibility of regex DoS is totally missing in the stdlib re docs.
> Should there be something added there about if you need to put user input
> into an expression, best practice is to re.escape it?
>

Unless I am missing something, I don't see how re.escape would have helped
me here. I wasn't trying to treat arbitrary input as a regex, so escaping
the regex characters in it wouldn't have done anything to help me. The
problem is that a regex *that I wrote* had a bug in it that caused
pathological behavior, but it wasn't found during testing because it only
occurred when matching against an unexpected input.
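
A minimal sketch of this kind of failure, under stated assumptions: the
pattern below is hypothetical (it is not the regex from the incident), and
the names VULNERABLE, SAFER, and time_match are made up for illustration.
The untrusted text is only ever the subject string, never part of the
pattern, so re.escape has nothing to act on. The slowdown comes from the
nested quantifier in the hand-written pattern, and it only shows up on input
that fails to match.

```python
import re
import time

# Hypothetical pattern, NOT the regex from the incident described above.
# The nested quantifier (\w+\s?)+ lets the engine split a run of word
# characters in exponentially many ways once the overall match fails.
VULNERABLE = re.compile(r"^(\w+\s?)+$")

# Assumed rewrite, intended to accept the same strings without nesting
# one repetition inside another.
SAFER = re.compile(r"^\w+(\s\w+)*\s?$")


def time_match(pattern, text):
    """Return (match_or_None, elapsed_seconds) for one anchored match."""
    start = time.perf_counter()
    result = pattern.match(text)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Expected input: matches, so no heavy backtracking happens.
    print(time_match(VULNERABLE, "disk usage ok"))

    # Unexpected input: the trailing "!" makes the match fail, and the
    # vulnerable pattern retries every way of splitting the run of "a".
    for n in (10, 14, 18):
        _, slow = time_match(VULNERABLE, "a" * n + "!")
        _, fast = time_match(SAFER, "a" * n + "!")
        print(n, f"vulnerable={slow:.4f}s", f"safer={fast:.6f}s")
```

The sizes are kept small on purpose; each extra character roughly doubles
the work for the vulnerable pattern, so a test suite that only feeds it
well-formed lines would likely never notice.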

-- 
*J.B. Langston*
Tech Support
Tools Wrangler
+1 650 389 6000 | datastax.com <https://www.datastax.com/>


On Mon, Feb 14, 2022 at 3:59 PM Nick Timkovich <prometheus...@gmail.com>
wrote:

>> A regex that's vulnerable to pathological behavior is a DoS attack waiting
>> to happen. Especially when used for parsing log data (which might contain
>> untrusted data). If possible, we should make it harder for people to shoot
>> themselves in the feet.
>
> While definitely not as bad and not as likely as SQL injection, I think
> the possibility of regex DoS is totally missing in the stdlib re docs.
> Should there be something added there about if you need to put user input
> into an expression, best practice is to re.escape it?
>
>

Python-ideas mailing list -- python-ideas@python.org
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/RCK6Z2PRBH6NRFWBSRVZJ2CSTEPKK2VF/