Bug#906284: lintian: check for incomplete-creative-commons-license gives false positives: the "not a law firm" is a preamble, not a license

2018-09-20 Thread Chris Lamb
tags 906284 + pending
thanks

This is now fixed in Git, pending upload:

  
https://salsa.debian.org/lintian/lintian/commit/b0ee727b5f3abe977e5c5f57eedecfd4486cf127


Regards,

-- 
  ,''`.
 : :'  : Chris Lamb
 `. `'`  la...@debian.org / chris-lamb.co.uk
   `-




2018-09-04 Thread Chris Lamb
Jonathan Dowland wrote:

> you wanted a corpus of good and bad texts to test against. Is that
> still the case?

Anyone who implements the Lintian change will need to update the
testsuite, so yes.

> > I like how this implies that Lintian, too, is a hacky script...
> 
> Sorry if it can be interpreted that way, that is not what I meant.

(Ah, shame...)


Regards,

-- 
  ,''`.
 : :'  : Chris Lamb
 `. `'`  la...@debian.org / chris-lamb.co.uk
   `-




2018-09-04 Thread Jonathan Dowland

Hi!

On Mon, Sep 03, 2018 at 09:16:27AM +0100, Chris Lamb wrote:

> (I like how this implies that Lintian, too, is a hacky script...)


Sorry if it can be interpreted that way, that is not what I meant.


> Do let me know when you are happy with the output so we can update
> Lintian, etc.


I think I've proved the concept that Julien suggested, but I have not
attempted to write a lintian patch.

Who is proposing to do what? Looking back over the bug history, Chris,
you seemed keen to make the lintian change, but you wanted a corpus of
good and bad texts to test against. Is that still the case?


Thanks,

--

⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁ Jonathan Dowland
⢿⡄⠘⠷⠚⠋⠀ https://jmtd.net
⠈⠳⣄




2018-09-03 Thread Chris Lamb
Hi,

> I attempted to simulate this change in Lintian with a totally
> separate hacky script

(I like how this implies that Lintian, too, is a hacky script...)

Do let me know when you are happy with the output so we can update
Lintian, etc.


Regards,

-- 
  ,''`.
 : :'  : Chris Lamb
 `. `'`  la...@debian.org / chris-lamb.co.uk
   `-




2018-08-28 Thread Jonathan Dowland

Interestingly, 307 is roughly half of all packages using a CC license,
based on the numbers I counted in #795402.

--

⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁ Jonathan Dowland
⢿⡄⠘⠷⠚⠋⠀ https://jmtd.net
⠈⠳⣄




2018-08-25 Thread Chris Lamb
forcemerge 906284 907272
thanks

Hi,

This looks like #906284 - let's at least centralise the on-going
discussion there (and vice versa).


Regards,

-- 
  ,''`.
 : :'  : Chris Lamb
 `. `'`  la...@debian.org / chris-lamb.co.uk
   `-




2018-08-23 Thread Chris Lamb
Hi Julian & Jonathan,

> How about the following?  In the parse_license function, where each
> license paragraph is parsed, something like the following:
> 
>     if ($full_license and $short_license =~ m/cc-/) {
>         if ($full_license !~ /definitions/i) {
>             tag 'incomplete-creative-commons-license';
>         }
>     }

Jonathan, any input on this?


Regards,

-- 
  ,''`.
 : :'  : Chris Lamb
 `. `'`  la...@debian.org / chris-lamb.co.uk
   `-




2018-08-19 Thread Julian Gilbey
On Thu, Aug 16, 2018 at 04:32:08PM +0100, Chris Lamb wrote:
> Hi Julian,
> 
> > The test for the human-readable rather than legal text of the Creative
> > Commons licenses seems to fail, because the preamble about Creative
> > Commons not being a law firm is not part of the license text, and
> > neither is the postamble about Creative Commons not being a party to
> > the license agreement; they instead form the terms and conditions
> > between Creative Commons and the person using a CC license.  So I
> > cannot see why these parts should necessarily be included in the
> > Debian copyright file.  Has there been a policy decision to require
> > this, perhaps?
> > 
> > Also, it seems that this check would be better in the parse_license
> > function when checking each license block rather than the run
> > function, as there might be more than one CC license in a copyright
> > file, and it is feasible that one is correct and one not.
> 
> CC'ing Jonathan Dowland who filed the original request for this
> in #903470. Could you folks come to some agreement on a good/reliable
> check?

Hi Chris and Jonathan,

How about the following?  In the parse_license function, where each
license paragraph is parsed, something like the following:

    if ($full_license and $short_license =~ m/cc-/) {
        if ($full_license !~ /definitions/i) {
            tag 'incomplete-creative-commons-license';
        }
    }

All of the full legal texts contain "Section 1. Definitions", whereas
the human-readable summaries don't.

This also means that you are not searching the entire copyright file,
but rather just the paragraph with the full Creative Commons text.
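
To make the idea concrete, here is a minimal standalone sketch of the
per-paragraph check. It is for illustration only and is not Lintian's
actual code: the sub name cc_license_incomplete and the sample texts are
made up, and matching the short name case-insensitively is an assumption
(the m/cc-/ above presumably relies on the name already being
lower-cased by the time it is tested).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of Lintian): returns 1 when a license
# paragraph names a Creative Commons license but its full text lacks the
# "Definitions" section found in the legal code, and 0 otherwise.
sub cc_license_incomplete {
    my ($short_license, $full_license) = @_;

    # Nothing to check without a license body, or for non-CC licenses.
    return 0 unless defined $full_license && length $full_license;
    return 0 unless $short_license =~ m/cc-/i;

    return $full_license !~ m/definitions/i ? 1 : 0;
}

# Made-up sample paragraphs, keyed by short license name.
my %paragraphs = (
    'CC-BY-SA-3.0' => "Section 1. Definitions\n\"Adaptation\" means ...",
    'CC-BY-4.0'    => "You are free to share and adapt the material.",
    'GPL-2+'       => "This program is free software ...",
);

for my $name (sort keys %paragraphs) {
    my $status = cc_license_incomplete($name, $paragraphs{$name})
        ? 'incomplete-creative-commons-license'
        : 'ok';
    print "$name: $status\n";
}
```

Because each paragraph is tested on its own, a copyright file holding
one complete and one summarised CC license gets the tag only for the
summarised one.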

Best wishes,

   Julian




2018-08-16 Thread Chris Lamb
Hi Julian,

> The test for the human-readable rather than legal text of the Creative
> Commons licenses seems to fail, because the preamble about Creative
> Commons not being a law firm is not part of the license text, and
> neither is the postamble about Creative Commons not being a party to
> the license agreement; they instead form the terms and conditions
> between Creative Commons and the person using a CC license.  So I
> cannot see why these parts should necessarily be included in the
> Debian copyright file.  Has there been a policy decision to require
> this, perhaps?
> 
> Also, it seems that this check would be better in the parse_license
> function when checking each license block rather than the run
> function, as there might be more than one CC license in a copyright
> file, and it is feasible that one is correct and one not.

CC'ing Jonathan Dowland who filed the original request for this
in #903470. Could you folks come to some agreement on a good/reliable
check?


Regards,

-- 
  ,''`.
 : :'  : Chris Lamb
 `. `'`  la...@debian.org / chris-lamb.co.uk
   `-




2018-08-16 Thread Julian Gilbey
Package: lintian
Version: 2.5.96
Severity: normal

The test for the human-readable rather than legal text of the Creative
Commons licenses seems to fail, because the preamble about Creative
Commons not being a law firm is not part of the license text, and
neither is the postamble about Creative Commons not being a party to
the license agreement; they instead form the terms and conditions
between Creative Commons and the person using a CC license.  So I
cannot see why these parts should necessarily be included in the
Debian copyright file.  Has there been a policy decision to require
this, perhaps?

Also, it seems that this check would be better in the parse_license
function when checking each license block rather than the run
function, as there might be more than one CC license in a copyright
file, and it is feasible that one is correct and one not.

Best wishes,

   Julian

-- System Information:
Debian Release: buster/sid
  APT prefers stretch
  APT policy: (500, 'stretch'), (500, 'testing'), (500, 'stable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.14.0-3-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_GB.utf8, LC_CTYPE=en_GB.utf8 (charmap=UTF-8) (ignored: LC_ALL set to en_GB.UTF-8), LANGUAGE=en_GB.utf8 (charmap=UTF-8) (ignored: LC_ALL set to en_GB.UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages lintian depends on:
ii  binutils                       2.31.1-2
ii  bzip2                          1.0.6-8.1
ii  diffstat                       1.61-1+b1
ii  dpkg                           1.19.0.5+b1
ii  file                           1:5.34-2
ii  gettext                        0.19.8.1-6+b1
ii  intltool-debian                0.35.0+20060710.4
ii  libapt-pkg-perl                0.1.34
ii  libarchive-zip-perl            1.60-1
ii  libclass-accessor-perl         0.51-1
ii  libclone-perl                  0.39-1
ii  libdpkg-perl                   1.19.0.5
ii  libemail-valid-perl            1.202-1
ii  libfile-basedir-perl           0.08-1
ii  libipc-run-perl                20180523.0-1
ii  liblist-moreutils-perl         0.416-1+b3
ii  libparse-debianchangelog-perl  1.2.0-12
ii  libtext-levenshtein-perl       0.13-1
ii  libtimedate-perl               2.3000-2
ii  liburi-perl                    1.74-1
ii  libxml-simple-perl             2.25-1
ii  libyaml-libyaml-perl           0.72+repack-1
ii  man-db                         2.8.4-2
ii  patchutils                     0.3.4-2
ii  perl [libdigest-sha-perl]      5.26.2-7
ii  t1utils                        1.41-2
ii  xz-utils                       5.2.2-1.3

Versions of packages lintian recommends:
ii  libperlio-gzip-perl  0.19-1+b4

Versions of packages lintian suggests:
pn  binutils-multiarch     <none>
ii  dpkg-dev               1.19.0.5
ii  libhtml-parser-perl    3.72-3+b2
ii  libtext-template-perl  1.53-1

-- no debconf information