Control: reassign -1 libstring-copyright-perl
Control: retitle -1 libstring-copyright-perl: incorrectly parses multi-line copyright notices
Control: found -1 0.003005-1

Hi Ximin,

Quoting Ximin Luo (2017-07-05 17:45:17)
> For
> https://sources.debian.net/src/sagemath/7.6-2/sage/src/sage/misc/edit_module.py/
>
> $ licensecheck --copyright src/sage/misc/edit_module.py
> src/sage/misc/edit_module.py: GPL
>   [Copyright: 2007 Nils Bruin <[email protected]> and]
>
> This is wrong, but I can work around it with the following sed script:
>
> $ cat src/sage/misc/edit_module.py | tr '\n' '\t' | sed -e
> 's/\(,\|\band\)\s*\t#\?\s*/\1 /g' | tr '\t' '\n' > fixed.py
> $ licensecheck --copyright fixed.py
> fixed.py: GPL
>   [Copyright: 2007 Nils Bruin <[email protected]> and William Stein
> <[email protected]>]
>
> It would be good if this logic were incorporated into licensecheck
> itself. I'd help, but my perl is really bad.
>
> (Also perhaps the # in the regex should be a (?:#|//|/*) or something
> like that)

I agree (unsurprisingly) that this is wrong.

Unfortunately it is not as simple as throwing a regex at it: One of my
reasons for taking over and working on licensecheck was a remark once
on d-devel@ that it was far too slow to be usable for Chromium, and I
wanted to (silently, so as to not make too much of a fool of myself)
take the challenge of optimizing it.

Unlike in its days living in devscripts, the licensecheck routines to
match copyright holders have been separated into a new library,
String::Copyright (libstring-copyright-perl in Debian), and the code
has been refactored to use a single large RE2-compatible regex to match
each copyright statement, in the hope of some day switching to the RE2
engine and becoming faster...

My first brief look at this has revealed a few bugs:

In the next release of licensecheck the leading # is stripped _before_
handing over to String::Copyright code (as was intended for years).

Have a look (if interested) at /usr/share/perl5/String/Copyright.pm and
in particular the (huge when expanded) $signs_and_more_re at line 138.
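
To make the effect of stripping the leading # and of the choice of
separator concrete, here is a rough sketch. It is Python, not the Perl
in String::Copyright, and everything in it is an assumption made for
illustration: the sample header is only modeled on the one in the
report (the e-mail addresses are placeholders), the patterns are far
simpler than the real $owners_re, and the third separator merely
transcribes the idea behind the quoted sed workaround.

```python
import re

# Hypothetical sample modeled on the header quoted in the report;
# the e-mail addresses are placeholders.
HEADER = """\
#       Copyright (C) 2007 Nils Bruin <nils@example.org> and
#                          William Stein <william@example.org>
#
#  Distributed under the terms of the GNU General Public License (GPL)
"""

# Candidate separators between the words of an owner string
# (simplified stand-ins, not the patterns used by String::Copyright).
BLANK = r"[ \t]+"            # blanks only, never a line break
BLANK_OR_BREAK = r"\s+"      # blanks or any run of line breaks
# Blanks, or one line break directly after "and" or ",":
BLANK_OR_JOINED_BREAK = r"(?:[ \t]+|(?:(?<=\band)|(?<=,))[ \t]*\n[ \t]*)"


def strip_comment_prefix(text):
    """Drop a leading '#' (and the blanks around it) from every line."""
    return "\n".join(re.sub(r"^\s*#+\s*", "", line)
                     for line in text.splitlines())


def owners(text, sep):
    """Return the owner string after the year, whitespace-normalized."""
    pattern = re.compile(r"Copyright \(C\) \d{4} (\S+(?:" + sep + r"\S+)*)")
    match = pattern.search(text)
    return re.sub(r"\s+", " ", match.group(1)) if match else None


if __name__ == "__main__":
    stripped = strip_comment_prefix(HEADER)
    for name, sep in [("blank", BLANK),
                      ("blank or break", BLANK_OR_BREAK),
                      ("break after and/comma", BLANK_OR_JOINED_BREAK)]:
        print(name + ":", owners(stripped, sep))
```

In this toy setup the blank-only separator stops at the dangling "and",
the blank-or-break separator runs on into the license line, and only
the variant that requires a preceding "and" or "," stops after the
second holder. Whether such a rule fits into a single RE2-compatible
regex is a separate question, since RE2 does not support lookbehind.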

Replacing $blank_re with $blank_or_break_re in $owners_re (line 136)
succeeds in detecting the second copyright holder, but then also
bogusly includes the license statement as a copyright holder.

> X

That is the most elegant signature I have seen. Ever! It beats my
primary school teacher who used "kh" to mean both her initials and an
abbreviation of the Danish equivalent of "kind regards".

 - Jonas

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/

 [x] quote me freely  [ ] ask before reusing  [ ] keep private

