Source: nltk
Version: 3.6.5-1
Severity: important
Tags: security upstream
Forwarded: https://github.com/nltk/nltk/issues/2866
X-Debbugs-Cc: car...@debian.org, Debian Security Team <t...@security.debian.org>

Hi,

The following vulnerability was published for nltk.

CVE-2021-43854[0]:
| NLTK (Natural Language Toolkit) is a suite of open source Python
| modules, data sets, and tutorials supporting research and development
| in Natural Language Processing. Versions prior to 3.6.5 are vulnerable
| to regular expression denial of service (ReDoS) attacks. The
| vulnerability is present in PunktSentenceTokenizer, sent_tokenize and
| word_tokenize. Any users of this class, or these two functions, are
| vulnerable to the ReDoS attack. In short, a specifically crafted long
| input to any of these vulnerable functions will cause them to take a
| significant amount of execution time. If your program relies on any of
| the vulnerable functions for tokenizing unpredictable user input, then
| we would strongly recommend upgrading to a version of NLTK without the
| vulnerability. For users unable to upgrade the execution time can be
| bounded by limiting the maximum length of an input to any of the
| vulnerable functions. Our recommendation is to implement such a limit.

If you fix the vulnerability please also make sure to include the
CVE (Common Vulnerabilities & Exposures) id in your changelog entry.

For further information see:

[0] https://security-tracker.debian.org/tracker/CVE-2021-43854
    https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-43854
[1] https://github.com/nltk/nltk/issues/2866
[2] https://github.com/nltk/nltk/security/advisories/GHSA-f8m6-h2c7-8h9x

Please adjust the affected versions in the BTS as needed.

Regards,
Salvatore
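
Note on the workaround: the advisory's suggestion to bound the input length could be sketched roughly as below for callers who cannot upgrade yet. This is only an illustration and not code from NLTK or from the upstream fix. The names bounded_tokenize and MAX_INPUT_CHARS and the 10,000-character cap are hypothetical, and a suitable limit depends on the application.

```python
from typing import Callable, List

# Hypothetical cap (an assumption, not a value from the advisory);
# choose one that fits the inputs the application legitimately expects.
MAX_INPUT_CHARS = 10_000


def bounded_tokenize(text: str,
                     tokenize: Callable[[str], List[str]],
                     max_chars: int = MAX_INPUT_CHARS) -> List[str]:
    """Reject over-long input before it ever reaches the tokenizer."""
    if len(text) > max_chars:
        raise ValueError(
            f"input of {len(text)} characters exceeds limit of {max_chars}"
        )
    return tokenize(text)
```

With NLTK installed, a caller would pass one of the affected functions as the tokenizer, for example bounded_tokenize(user_text, nltk.tokenize.sent_tokenize). Rejecting the input outright, rather than truncating it, avoids silently tokenizing a cut-off sentence, though truncation may suit some applications better.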