Bug#926392: licensecheck chokes on long lines
On Thu, 06 Jun 2019 11:26:21 +0200, Jonas Smedegaard wrote:

> This bug was introduced in upstream git commit 26bc59e by changing
> \W*\S\W* to \W*\S+\W* - and this commit was first introduced in upstream
> release v3.1.90.
>
> In other words, this does _not_ affect Buster.

Great, thanks!

So we can celebrate that at least the perl side of the archive is
ready for release :)

Cheers,
gregor

--
 .''`.  https://info.comodo.priv.at -- Debian Developer https://www.debian.org
 : :' : OpenPGP fingerprint D1E1 316E 93A7 60A8 104D 85FA BB3A 6801 8649 AA06
 `. `'  Member VIBE!AT & SPI Inc. -- Supporter Free Software Foundation Europe
   `-
Bug#926392: licensecheck chokes on long lines
Control: found -1 3.1.92-1

Quoting Jonas Smedegaard (2019-06-05 23:17:36)
> Quoting gregor herrmann (2019-06-05 21:46:36)
> > AFAICS this is the only buster-relevant RC bug we have.
> >
> >
> > Jonas, my hope is that you have a chance to look into this issue, as
> > you are also the upstream maintainer of this module :)
>
> Yes, I will sure look into this.
>
> It was not high on my list, however - I was under the impression that
> this does not affect Buster.
>
> I will prioritize at least verifying that detail.

This bug was introduced in upstream git commit 26bc59e by changing
\W*\S\W* to \W*\S+\W* - and this commit was first introduced in upstream
release v3.1.90.

In other words, this does _not_ affect Buster.

 - Jonas

--
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136 Website: http://dr.jones.dk/

 [x] quote me freely [ ] ask before reusing [ ] keep private
Bug#926392: licensecheck chokes on long lines
Quoting gregor herrmann (2019-06-05 21:46:36)
> On Wed, 17 Apr 2019 07:08:00 +, Niels Thykier wrote:
>
> > On Thu, 04 Apr 2019 18:13:43 +0200 Jonas Smedegaard wrote:
> > > Quoting Sandro Mani (2019-04-04 13:36:28)
> > > > $ wget
> > > > https://files.pythonhosted.org/packages/source/x/xonsh/xonsh-0.8.12.tar.gz
> > > > $ tar xf xonsh-0.8.12.tar.gz
> > > > $ licensecheck xonsh-0.8.12/xonsh/parser_table.py
> > > >
> > > > => Licensecheck hangs eating cpu cycles (the file has lines with
> > > > 33k and 71k characters).
> > >
> > > Indeed. Thanks for reporting!
>
> > I have been digging in the code (admittedly using the master branch
> > of the libregexp-pattern-license-perl and licensecheck rather than
> > the packages) and basically, it is a DOS from suboptimal regex.
>
> Thanks for your investigation, Niels!

Agreed, thanks a lot for your investigation, Niels: I was _very_ happy
when you posted it, but then got distracted by other business before
getting around to replying back then - sorry!

> AFAICS this is the only buster-relevant RC bug we have.
>
>
> Jonas, my hope is that you have a chance to look into this issue, as
> you are also the upstream maintainer of this module :)

Yes, I will sure look into this.

It was not high on my list, however - I was under the impression that
this does not affect Buster.

I will prioritize at least verifying that detail.

 - Jonas

--
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136 Website: http://dr.jones.dk/

 [x] quote me freely [ ] ask before reusing [ ] keep private
Bug#926392: licensecheck chokes on long lines
On Wed, 17 Apr 2019 07:08:00 +, Niels Thykier wrote:

> On Thu, 04 Apr 2019 18:13:43 +0200 Jonas Smedegaard wrote:
> > Quoting Sandro Mani (2019-04-04 13:36:28)
> > > $ wget
> > > https://files.pythonhosted.org/packages/source/x/xonsh/xonsh-0.8.12.tar.gz
> > > $ tar xf xonsh-0.8.12.tar.gz
> > > $ licensecheck xonsh-0.8.12/xonsh/parser_table.py
> > >
> > > => Licensecheck hangs eating cpu cycles (the file has lines with 33k and
> > > 71k characters).
> >
> > Indeed. Thanks for reporting!

> I have been digging in the code (admittedly using the master branch of
> the libregexp-pattern-license-perl and licensecheck rather than the
> packages) and basically, it is a DOS from suboptimal regex.

Thanks for your investigation, Niels!

AFAICS this is the only buster-relevant RC bug we have.


Jonas, my hope is that you have a chance to look into this issue, as
you are also the upstream maintainer of this module :)

Cheers,
gregor

--
 .''`.  https://info.comodo.priv.at -- Debian Developer https://www.debian.org
 : :' : OpenPGP fingerprint D1E1 316E 93A7 60A8 104D 85FA BB3A 6801 8649 AA06
 `. `'  Member VIBE!AT & SPI Inc. -- Supporter Free Software Foundation Europe
   `-
Bug#926392: licensecheck chokes on long lines
On Thu, 04 Apr 2019 18:13:43 +0200 Jonas Smedegaard wrote:
> control: tag -1 confirmed
>
> Quoting Sandro Mani (2019-04-04 13:36:28)
> > $ wget
> > https://files.pythonhosted.org/packages/source/x/xonsh/xonsh-0.8.12.tar.gz
> > $ tar xf xonsh-0.8.12.tar.gz
> > $ licensecheck xonsh-0.8.12/xonsh/parser_table.py
> >
> > => Licensecheck hangs eating cpu cycles (the file has lines with 33k and
> > 71k characters).
>
> Indeed. Thanks for reporting!
>
> - Jonas
>
> --
> * Jonas Smedegaard - idealist & Internet-arkitekt
> * Tlf.: +45 40843136 Website: http://dr.jones.dk/
>
> [x] quote me freely [ ] ask before reusing [ ] keep private

Hi,

I have been digging in the code (admittedly using the master branch of
the libregexp-pattern-license-perl and licensecheck rather than the
packages) and basically, it is a DOS from suboptimal regex.

I traced it down to getting stuck on the python_2 "grant_license". This
regex expands to (manually reformatted with /x for readability):

"""
m!
  (?^:
    (?:
      (?:
        (?:[Ll]icensed|[Rr]eleased) [ ] under
        |(?:according [ ] to|as [ ] governed [ ] by|under) [ ] the [ ] (?:conditions|terms) [ ] of
      )
      (?:
        (?:[Tt]he [ ] )?Python-2.0
        | (?:[Tt]he [ ])?Python(?: [ ] [Ll]icense)? [ ] 2.0
        | (?:[Tt]he [ ])?Python-2.0
        | (?:[Tt]he [ ])?Python [ ] Software [ ] Foundation(?: [ ] [Ll]icense)? [ ] version [ ] 2
        | (?:[Tt]he [ ])?python2
        | (?:[Tt]he [ ])?Python-2
        | (?:[Tt]he [ ])?PSF-2
        | (?:[Tt]he [ ])?Python(?: [ ] [Ll]icense)? [ ] Version [ ] 2
        | (?:[Tt]he [ ])?PYTHON [ ] SOFTWARE [ ] FOUNDATION [ ] LICENSE [ ] VERSION [ ] 2
        | (?:[Tt]he [ ])?python-license-2.0
      )
      | (?:\W*\S+\W*)PSF [ ] is [ ] making [ ] Python [ ] available [ ] to [ ] Licensee
    )
  )
!x
"""

The problem is the *last* alternative, namely:

"""
(?:\W*\S+\W*)PSF [ ] is [ ] making [ ] [...]
"""

That \W*\S+\W* (known as ${BB} in the libregexp-pattern-license-perl
code) is stirring up hell.

Basically, perl wants to find the *longest* match and will spend a stupid
amount of time in this "trivial" regex enumerating exponentially many
"non-matches" ([1] strikes again). Simply removing ${BB} will make the
code continue past the python_2 test relatively fast.

For the python_2 case, I think that the phrase "PSF is making Python
available to Licensee" would be sufficient to consider it a match
(i.e. ${BB} is redundant) - though it will change behaviour on an
anchored match (I hope this is not a problem).

However, it then gets stuck in the next regex, "cube" (from
@L_type_unversioned), and that is as far down the rabbit hole as I
ventured in terms of regexes getting stuck (note that "cube" indirectly
uses the ${BB} regex too).

Thanks,
~Niels

[1] https://swtch.com/~rsc/regexp/regexp1.html
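
The effect of the ${BB} prefix described above can be reproduced outside
licensecheck. The following is only a rough sketch, not the licensecheck
code: it uses Python's re module (also a backtracking matcher) instead of
Perl, it tests only the last alternative of the python_2 pattern, and the
names PREFIXES, build, timed_search and long_line are made up for
illustration. The generated test line is an assumed stand-in for the long
lines in parser_table.py. Absolute timings will differ from Perl's; the
point is only how the time grows with line length for each prefix.

```python
import re
import time

PHRASE = "PSF is making Python available to Licensee"

# Prefix variants discussed in the thread (the dictionary keys are our labels).
PREFIXES = {
    "old": r"(?:\W*\S\W*)",   # before upstream commit 26bc59e
    "new": r"(?:\W*\S+\W*)",  # after the commit, i.e. ${BB}
    "none": "",               # phrase only, as suggested for python_2
}


def build(prefix):
    """Compile prefix + literal phrase (last alternative of the python_2 grant)."""
    return re.compile(prefix + re.escape(PHRASE))


def timed_search(pattern, text):
    """Return (matched, seconds) for a single search over text."""
    start = time.perf_counter()
    matched = pattern.search(text) is not None
    return matched, time.perf_counter() - start


def long_line(length):
    """Assumed stand-in for a generated table line: no spaces, no licence text."""
    unit = "'a':([1,2],[3,4]),"
    return (unit * (length // len(unit) + 1))[:length]


if __name__ == "__main__":
    for length in (250, 500, 1000):
        line = long_line(length)
        timings = []
        for label, prefix in PREFIXES.items():
            _, seconds = timed_search(build(prefix), line)
            timings.append(f"{label} {seconds:.4f}s")
        print(f"{length:>5} chars: " + "  ".join(timings))
```

In this sketch the "new" variant is the one whose time should grow faster
than linearly as the line gets longer. Note also that the "none" variant
matches the bare phrase even when nothing precedes it, while the prefixed
variants need at least one non-space character before "PSF".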
Bug#926392: licensecheck chokes on long lines
control: tag -1 confirmed

Quoting Sandro Mani (2019-04-04 13:36:28)
> $ wget
> https://files.pythonhosted.org/packages/source/x/xonsh/xonsh-0.8.12.tar.gz
> $ tar xf xonsh-0.8.12.tar.gz
> $ licensecheck xonsh-0.8.12/xonsh/parser_table.py
>
> => Licensecheck hangs eating cpu cycles (the file has lines with 33k and
> 71k characters).

Indeed. Thanks for reporting!

 - Jonas

--
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136 Website: http://dr.jones.dk/

 [x] quote me freely [ ] ask before reusing [ ] keep private
Bug#926392: licensecheck chokes on long lines
Package: licensecheck
Version: 3.0.36

As reported downstream at [1]:

$ wget https://files.pythonhosted.org/packages/source/x/xonsh/xonsh-0.8.12.tar.gz
$ tar xf xonsh-0.8.12.tar.gz
$ licensecheck xonsh-0.8.12/xonsh/parser_table.py

=> Licensecheck hangs eating cpu cycles (the file has lines with 33k and
71k characters).

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1695680
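
Since the report ties the hang to very long lines, it can help to check
whether a file contains such lines before running licensecheck on it. A
minimal sketch, assuming a 10000-character threshold chosen purely for
illustration; the function name long_lines is hypothetical and not part of
licensecheck:

```python
def long_lines(path, threshold=10_000):
    """Return (line_number, length) for every line longer than threshold."""
    hits = []
    # errors="replace" keeps the scan going on files that are not valid UTF-8.
    with open(path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            length = len(line.rstrip("\n"))
            if length > threshold:
                hits.append((number, length))
    return hits


if __name__ == "__main__":
    import sys

    for filename in sys.argv[1:]:
        for number, length in long_lines(filename):
            print(f"{filename}:{number}: {length} characters")
```

Run against xonsh-0.8.12/xonsh/parser_table.py, this should list the lines
the report refers to, provided they exceed the chosen threshold.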