Bug#738342: lintian: checks/cruft - GFDL check is slow
On 9 Feb 2014 at 13:54, "Niels Thykier" wrote:
>
> Package: lintian
> Version: 2.5.21
> Severity: normal
>
> A quick benchmark suggests that lintian spends nearly 2 minutes on the
> Linux source package (I tested with linux/3.10~rc7-1~exp1). Profiling
> Lintian with perl -d:NYTProf suggests that the vast majority of the time
> is spent in:
>
> """
>     if ($cleanedblock =~ $gfdlpattern) {
> """
>
> Where $gfdlpattern is one of:
>
> """
> # classical gfdl matching pattern
> my $normalgfdlpattern = qr/
>  (?'contextbefore'(?:
>     (?:(?!a \s+ copy \s+ of \s+ the \s+ license \s+ is).){1024}|
>     (?:\s+ copy \s+ of \s+ the \s+ license \s+ is.{0,1024}?)))
>  gnu \s+ free \s+ documentation \s+ license
>  (?'rawgfdlsections'(?:(?!gnu \s+ free \s+ documentation \s+ license).){0,1024}?)
>  a \s+ copy \s+ of \s+ the \s+ license \s+ is
> /xsmo;
>
> # for first block we get context from the beginning
> my $firstblockgfdlpattern = qr/
>  (?'rawcontextbefore'(?:
>     (?:(?!a \s+ copy \s+ of \s+ the \s+ license \s+ is).){1024}|
>     \A(?:(?!a \s+ copy \s+ of \s+ the \s+ license \s+ is).){0,1024}|
>     (?:\s+ copy \s+ of \s+ the \s+ license \s+ is.{0,1024}?)
>   )
>  )
>  gnu \s+ free \s+ documentation \s+ license
>  (?'rawgfdlsections'(?:(?!gnu \s+ free \s+ documentation \s+ license).){0,1024}?)
>  a \s+ copy \s+ of \s+ the \s+ license \s+ is
> /xsmo;
> """
>
>
> The profiler suggests that 60% of the runtime is spent in the
> "CORE:match" operations inside "license_check" from c/cruft. The
> regex appears to be hit "only" 2452 times, but it spends an average of
> 55.9ms per time totalling 137s.
>
> Bastian, do you have any ideas for reducing the cost of the regex?

Yes, I have. Use these regexps only if we can first match
"gnu free documentation license".

Bastien

>
> ~Niels
>
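The suggestion in the reply above can be sketched as a cheap guard that runs before the expensive pattern. This is only an illustration, not lintian's actual fix: the names `$gfdl_phrase` and `gfdl_sections`, the shortened stand-in for `$gfdlpattern`, and the sample strings are all assumptions made for the example.

```perl
use strict;
use warnings;

# Cheap pre-filter: the phrase that every GFDL pattern requires.
my $gfdl_phrase = qr/gnu \s+ free \s+ documentation \s+ license/xs;

# Shortened stand-in for the expensive patterns quoted in the report
# (hypothetical; the real patterns also capture the preceding context).
my $gfdlpattern = qr/
    gnu \s+ free \s+ documentation \s+ license
    (?'rawgfdlsections'(?:(?!gnu \s+ free \s+ documentation \s+ license).){0,1024}?)
    a \s+ copy \s+ of \s+ the \s+ license \s+ is
/xs;

# Hypothetical helper: returns the captured sections, or undef if the
# block is not a GFDL notice.
sub gfdl_sections {
    my ($cleanedblock) = @_;

    # Blocks that never mention the license skip the costly match.
    return unless $cleanedblock =~ $gfdl_phrase;

    return unless $cleanedblock =~ $gfdlpattern;
    return $+{rawgfdlsections};
}

# Sample inputs (made up for this sketch).
my $plain = 'int main(void) { return 0; } ' x 100;
my $gfdl  = 'under the terms of the gnu free documentation license, '
          . 'version 1.3; with no invariant sections. '
          . 'a copy of the license is included';

print defined gfdl_sections($plain) ? "match\n" : "no match\n";
print defined gfdl_sections($gfdl)  ? "match\n" : "no match\n";
```

With the guard in place, the costly pattern is only tried on blocks that contain the license name at all, which on a source tree like Linux is presumably a small fraction of the blocks scanned.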
Bug#738342: lintian: checks/cruft - GFDL check is slow
Package: lintian
Version: 2.5.21
Severity: normal

A quick benchmark suggests that lintian spends nearly 2 minutes on the
Linux source package (I tested with linux/3.10~rc7-1~exp1). Profiling
Lintian with perl -d:NYTProf suggests that the vast majority of the time
is spent in:

"""
    if ($cleanedblock =~ $gfdlpattern) {
"""

Where $gfdlpattern is one of:

"""
# classical gfdl matching pattern
my $normalgfdlpattern = qr/
 (?'contextbefore'(?:
    (?:(?!a \s+ copy \s+ of \s+ the \s+ license \s+ is).){1024}|
    (?:\s+ copy \s+ of \s+ the \s+ license \s+ is.{0,1024}?)))
 gnu \s+ free \s+ documentation \s+ license
 (?'rawgfdlsections'(?:(?!gnu \s+ free \s+ documentation \s+ license).){0,1024}?)
 a \s+ copy \s+ of \s+ the \s+ license \s+ is
/xsmo;

# for first block we get context from the beginning
my $firstblockgfdlpattern = qr/
 (?'rawcontextbefore'(?:
    (?:(?!a \s+ copy \s+ of \s+ the \s+ license \s+ is).){1024}|
    \A(?:(?!a \s+ copy \s+ of \s+ the \s+ license \s+ is).){0,1024}|
    (?:\s+ copy \s+ of \s+ the \s+ license \s+ is.{0,1024}?)
  )
 )
 gnu \s+ free \s+ documentation \s+ license
 (?'rawgfdlsections'(?:(?!gnu \s+ free \s+ documentation \s+ license).){0,1024}?)
 a \s+ copy \s+ of \s+ the \s+ license \s+ is
/xsmo;
"""


The profiler suggests that 60% of the runtime is spent in the
"CORE:match" operations inside "license_check" from c/cruft. The
regex appears to be hit "only" 2452 times, but it spends an average of
55.9ms per time totalling 137s.

Bastian, do you have any ideas for reducing the cost of the regex?

~Niels
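To reproduce the per-match cost outside lintian, a small timing harness along the following lines may help. It is a rough sketch under stated assumptions: the helper `seconds_per_match`, the synthetic 4 KiB block and the run count are invented for the example, and the timing it prints will depend on the machine and Perl version, so it need not resemble the 55.9ms reported above. The last line only re-checks the arithmetic of the reported total.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# The "normal" pattern as quoted in the report (without the /o flag).
my $normalgfdlpattern = qr/
 (?'contextbefore'(?:
    (?:(?!a \s+ copy \s+ of \s+ the \s+ license \s+ is).){1024}|
    (?:\s+ copy \s+ of \s+ the \s+ license \s+ is.{0,1024}?)))
 gnu \s+ free \s+ documentation \s+ license
 (?'rawgfdlsections'(?:(?!gnu \s+ free \s+ documentation \s+ license).){0,1024}?)
 a \s+ copy \s+ of \s+ the \s+ license \s+ is
/xsm;

# Hypothetical helper: average wall-clock seconds per match attempt.
sub seconds_per_match {
    my ($pattern, $block, $runs) = @_;
    my $start = [gettimeofday];
    for (1 .. $runs) {
        my $matched = ($block =~ $pattern);
    }
    return tv_interval($start) / $runs;
}

# Synthetic 4 KiB block (an assumption): it mentions "documentation"
# but is not a GFDL notice, so the pattern cannot match it.
my $block = substr('see the documentation for more details. ' x 200, 0, 4096);

my $avg = seconds_per_match($normalgfdlpattern, $block, 5);
printf "average per match: %.3f ms\n", $avg * 1000;

# Arithmetic check of the reported total: 2452 matches at 55.9 ms each.
printf "2452 matches at 55.9 ms: %.1f s\n", 2452 * 55.9 / 1000;
```

Running the same harness with a guarded variant of the match should give a direct before/after comparison on the same block.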