Bug#926392: licensecheck chokes on long lines

2019-06-06 Thread gregor herrmann
On Thu, 06 Jun 2019 11:26:21 +0200, Jonas Smedegaard wrote:

> This bug was introduced in upstream git commit 26bc59e by changing 
> \W*\S\W* to \W*\S+\W* - and this commit was first introduced in upstream 
> release v3.1.90.
> 
> In other words, this does _not_ affect Buster.

Great, thanks!

So we can celebrate that at least the perl side of the archive is
ready for release :)


Cheers,
gregor

-- 
 .''`.  https://info.comodo.priv.at -- Debian Developer https://www.debian.org
 : :' : OpenPGP fingerprint D1E1 316E 93A7 60A8 104D  85FA BB3A 6801 8649 AA06
 `. `'  Member VIBE!AT & SPI Inc. -- Supporter Free Software Foundation Europe
   `-   




Bug#926392: licensecheck chokes on long lines

2019-06-06 Thread Jonas Smedegaard
Control: found -1 3.1.92-1

Quoting Jonas Smedegaard (2019-06-05 23:17:36)
> Quoting gregor herrmann (2019-06-05 21:46:36)
> > AFAICS this is the only buster-relevant RC bug we have.
> >  
> > 
> > Jonas, my hope is that you have a chance to look into this issue, as 
> > you are also the upstream maintainer of this module :)
> 
> Yes, I will sure look into this.
> 
> It was not high on my list, however - I was under the impression that 
> this does not affect Buster.
> 
> I will prioritize at least verifying that detail.

This bug was introduced in upstream git commit 26bc59e by changing 
\W*\S\W* to \W*\S+\W* - and this commit was first introduced in upstream 
release v3.1.90.

In other words, this does _not_ affect Buster.
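
A rough way to see why that one-character change matters is to count how many ways a non-matching line can be split between the three parts of the prefix. The sketch below is only a model of a naive backtracking matcher written in Python; it is not the Perl engine and not code from licensecheck, and the names run_lengths and count_prefix_splits are made up for this illustration. Real engines apply optimisations that this model ignores.

```python
import re


def run_lengths(line, pattern):
    # runs[i] = number of consecutive characters, starting at index i,
    # that each match the single-character pattern.
    runs = [0] * (len(line) + 1)
    for i in range(len(line) - 1, -1, -1):
        runs[i] = runs[i + 1] + 1 if re.match(pattern, line[i]) else 0
    return runs


def count_prefix_splits(line, plus):
    # Hypothetical model: count the splits of \W* \S \W* (plus=False) or
    # \W* \S+ \W* (plus=True) that a naive backtracking matcher could try
    # on a line where the text after the prefix never matches.
    nonword = run_lengths(line, r"\W")
    nonspace = run_lengths(line, r"\S")
    total = 0
    for start in range(len(line)):
        for lead in range(nonword[start] + 1):       # choices for leading \W*
            pos = start + lead
            longest = nonspace[pos]                  # longest run \S+ could take
            limit = longest if plus else min(longest, 1)
            for width in range(1, limit + 1):        # choices for \S or \S+
                total += nonword[pos + width] + 1    # choices for trailing \W*
    return total


for n in (100, 200, 400):
    line = "a" * n
    print(n, count_prefix_splits(line, plus=False), count_prefix_splits(line, plus=True))
```

In this model a line of n word characters gives n splits for the old prefix and n(n+1)/2 for the new one, so doubling the line length roughly quadruples the work. Lines that mix punctuation and word characters grow faster still, because \W* and \S+ can both claim the same punctuation.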


 - Jonas

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/

 [x] quote me freely  [ ] ask before reusing  [ ] keep private




Bug#926392: licensecheck chokes on long lines

2019-06-05 Thread Jonas Smedegaard
Quoting gregor herrmann (2019-06-05 21:46:36)
> On Wed, 17 Apr 2019 07:08:00 +, Niels Thykier wrote:
> 
> > On Thu, 04 Apr 2019 18:13:43 +0200 Jonas Smedegaard wrote:
> > > Quoting Sandro Mani (2019-04-04 13:36:28)
> > > > $ wget 
> > > > https://files.pythonhosted.org/packages/source/x/xonsh/xonsh-0.8.12.tar.gz
> > > > $ tar xf xonsh-0.8.12.tar.gz
> > > > $ licensecheck xonsh-0.8.12/xonsh/parser_table.py
> > > > 
> > > > => Licensecheck hangs eating cpu cycles (the file has lines with 
> > > > 33k and 71k characters).
> > > 
> > > Indeed. Thanks for reporting!
> 
> > I have been digging in the code (admittedly using the master branch 
> > of the libregexp-pattern-license-perl and licensecheck rather than 
> > the packages) and basically, it is a DOS from suboptimal regex.
> 
> Thanks for your investigation, Niels!

Agreed, thanks a lot for your investigation, Niels: I was _very_ happy 
when you posted it, but then got distracted by other business before 
getting around to replying back then - sorry!


> AFAICS this is the only buster-relevant RC bug we have.
>  
> 
> Jonas, my hope is that you have a chance to look into this issue, as
> you are also the upstream maintainer of this module :)

Yes, I will sure look into this.

It was not high on my list, however - I was under the impression that 
this does not affect Buster.

I will prioritize at least verifying that detail.


 - Jonas

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/

 [x] quote me freely  [ ] ask before reusing  [ ] keep private




Bug#926392: licensecheck chokes on long lines

2019-06-05 Thread gregor herrmann
On Wed, 17 Apr 2019 07:08:00 +, Niels Thykier wrote:

> On Thu, 04 Apr 2019 18:13:43 +0200 Jonas Smedegaard wrote:
> > Quoting Sandro Mani (2019-04-04 13:36:28)
> > > $ wget 
> > > https://files.pythonhosted.org/packages/source/x/xonsh/xonsh-0.8.12.tar.gz
> > > $ tar xf xonsh-0.8.12.tar.gz
> > > $ licensecheck xonsh-0.8.12/xonsh/parser_table.py
> > > 
> > > => Licensecheck hangs eating cpu cycles (the file has lines with 33k and 
> > > 71k characters).
> > 
> > Indeed. Thanks for reporting!

> I have been digging in the code (admittedly using the master branch of
> the libregexp-pattern-license-perl and licensecheck rather than the
> packages) and basically, it is a DOS from suboptimal regex.

Thanks for your investigation, Niels!

AFAICS this is the only buster-relevant RC bug we have.
 

Jonas, my hope is that you have a chance to look into this issue, as
you are also the upstream maintainer of this module :)


Cheers,
gregor

-- 
 .''`.  https://info.comodo.priv.at -- Debian Developer https://www.debian.org
 : :' : OpenPGP fingerprint D1E1 316E 93A7 60A8 104D  85FA BB3A 6801 8649 AA06
 `. `'  Member VIBE!AT & SPI Inc. -- Supporter Free Software Foundation Europe
   `-   




Bug#926392: licensecheck chokes on long lines

2019-04-17 Thread Niels Thykier
On Thu, 04 Apr 2019 18:13:43 +0200 Jonas Smedegaard wrote:
> control: tag -1 confirmed
> 
> Quoting Sandro Mani (2019-04-04 13:36:28)
> > $ wget 
> > https://files.pythonhosted.org/packages/source/x/xonsh/xonsh-0.8.12.tar.gz
> > $ tar xf xonsh-0.8.12.tar.gz
> > $ licensecheck xonsh-0.8.12/xonsh/parser_table.py
> > 
> > => Licensecheck hangs eating cpu cycles (the file has lines with 33k and 
> > 71k characters).
> 
> Indeed. Thanks for reporting!
> 
>  - Jonas
> 
> -- 
>  * Jonas Smedegaard - idealist & Internet-arkitekt
>  * Tlf.: +45 40843136  Website: http://dr.jones.dk/
> 
>  [x] quote me freely  [ ] ask before reusing  [ ] keep private

Hi,

I have been digging in the code (admittedly using the master branch of
the libregexp-pattern-license-perl and licensecheck rather than the
packages) and basically, it is a DOS from suboptimal regex.

I traced it down to getting stuck on the python_2 "grant_license".  This
regex expands to (manually reformatted with /x for readability):

"""
m!
(?^:
  (?:
    (?:
        (?:[Ll]icensed|[Rr]eleased) [ ] under
      | (?:according [ ] to|as [ ] governed [ ] by|under)
        [ ] the [ ] (?:conditions|terms) [ ] of
    )
    (?:
        (?:[Tt]he [ ] )?Python-2.0
      | (?:[Tt]he [ ])?Python(?: [ ] [Ll]icense)? [ ] 2.0
      | (?:[Tt]he [ ])?Python-2.0
      | (?:[Tt]he [ ])?Python [ ] Software [ ] Foundation(?: [ ] [Ll]icense)?
        [ ] version [ ] 2
      | (?:[Tt]he [ ])?python2
      | (?:[Tt]he [ ])?Python-2
      | (?:[Tt]he [ ])?PSF-2
      | (?:[Tt]he [ ])?Python(?: [ ] [Ll]icense)? [ ] Version [ ] 2
      | (?:[Tt]he [ ])?PYTHON [ ] SOFTWARE [ ] FOUNDATION [ ] LICENSE
        [ ] VERSION [ ] 2
      | (?:[Tt]he [ ])?python-license-2.0
    )
  | (?:\W*\S+\W*)PSF [ ] is [ ] making [ ] Python [ ] available
    [ ] to [ ] Licensee
  )
)
!x
"""

The problem is the *last* alternative, namely:

"""
  (?:\W*\S+\W*)PSF [ ] is [ ] making [ ] [...]
"""


That \W*\S+\W* (known as ${BB} in the libregexp-pattern-license-perl
code) is stirring up hell. Basically, the greedy quantifiers make perl
try the longest candidates first, and it will spend a stupid amount of
time in this "trivial" regex backtracking through a vast number of
"non-matches" ([1] strikes again).

Simply removing ${BB} will make the code continue past the python_2 test
relatively fast. For the python_2 case, I think that the phrase "PSF
is making Python available to Licensee" would be sufficient to
consider it a match (i.e. ${BB} is redundant) - though it will change
behaviour on an anchored match (I hope this is not a problem).
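
The behaviour change mentioned above can be sketched with Python's re module, which is also a backtracking engine. This is an illustration only: the pattern is a simplified stand-in (plain spaces instead of the [ ] classes, and only the last alternative), and the names with_bb, without_bb and compare are hypothetical, not taken from libregexp-pattern-license-perl.

```python
import re

PHRASE = r"PSF is making Python available to Licensee"
with_bb = re.compile(r"(?:\W*\S+\W*)" + PHRASE)   # prefix kept, as quoted above
without_bb = re.compile(PHRASE)                   # prefix dropped

numbered = "1. PSF is making Python available to Licensee"
bare = "PSF is making Python available to Licensee"


def compare(text):
    # search() scans the whole text; match() is anchored at the start.
    return {
        "search_with": with_bb.search(text) is not None,
        "search_without": without_bb.search(text) is not None,
        "match_with": with_bb.match(text) is not None,
        "match_without": without_bb.match(text) is not None,
    }


for sample in (numbered, bare):
    print(sample[:20], compare(sample))
```

For the numbered sample both variants are found by an unanchored search, but only the variant with the prefix matches when anchored at the start. For the bare sample the prefix variant finds nothing at all, since it requires at least one non-space character before the phrase, while the variant without the prefix matches.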


It then gets stuck in the next regex, "cube" (from
@L_type_unversioned), though, and that is as far down the rabbit hole as
I ventured in terms of regexes getting stuck (note that "cube" indirectly
uses the ${BB} regex too).

Thanks,
~Niels

[1] https://swtch.com/~rsc/regexp/regexp1.html



Bug#926392: licensecheck chokes on long lines

2019-04-04 Thread Jonas Smedegaard
control: tag -1 confirmed

Quoting Sandro Mani (2019-04-04 13:36:28)
> $ wget 
> https://files.pythonhosted.org/packages/source/x/xonsh/xonsh-0.8.12.tar.gz
> $ tar xf xonsh-0.8.12.tar.gz
> $ licensecheck xonsh-0.8.12/xonsh/parser_table.py
> 
> => Licensecheck hangs eating cpu cycles (the file has lines with 33k and 
> 71k characters).

Indeed. Thanks for reporting!

 - Jonas

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/

 [x] quote me freely  [ ] ask before reusing  [ ] keep private




Bug#926392: licensecheck chokes on long lines

2019-04-04 Thread Sandro Mani

Package: licensecheck
Version: 3.0.36

As reported downstream at [1]:


$ wget 
https://files.pythonhosted.org/packages/source/x/xonsh/xonsh-0.8.12.tar.gz

$ tar xf xonsh-0.8.12.tar.gz
$ licensecheck xonsh-0.8.12/xonsh/parser_table.py

=> Licensecheck hangs eating cpu cycles (the file has lines with 33k and 
71k characters).
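
To check whether some other file has similarly long lines before running licensecheck on it, a small helper along these lines could be used. It is a sketch in Python, not part of licensecheck; the function name longest_lines is made up, and the 33000 and 71000 character lines in the self-check are round stand-ins for the "33k and 71k" figures above, not measured values.

```python
import os
import tempfile


def longest_lines(path, top=2):
    # Return (line_number, length) pairs for the `top` longest lines,
    # longest first. Undecodable bytes are replaced rather than raising.
    lengths = []
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            lengths.append((number, len(line.rstrip("\n"))))
    return sorted(lengths, key=lambda item: item[1], reverse=True)[:top]


# Self-check on a throwaway file with two very long lines.
with tempfile.NamedTemporaryFile(
    "w", suffix=".py", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("short\n" + "x" * 33000 + "\n" + "y" * 71000 + "\n")
demo = longest_lines(tmp.name)
os.unlink(tmp.name)
print(demo)
```

After unpacking the tarball, calling longest_lines("xonsh-0.8.12/xonsh/parser_table.py") would list the two longest lines of the file from the report.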



[1] https://bugzilla.redhat.com/show_bug.cgi?id=1695680