Re: What licenses should be included in /usr/share/common-licenses?

2023-10-09 Thread Sean Whitton
Hello Russ,

Thank you for working on this.

On Sat 09 Sep 2023 at 08:35pm -07, Russ Allbery wrote:

> In order to structure the discussion and prod people into thinking about
> the implications, I will make the following straw man proposal.  This is
> what I would do if the decision was entirely up to me:
>
> Licenses will be included in common-licenses if they meet all of the
> following criteria:
>
> * The license is DFSG-free.
> * Exactly the same license wording is used by all works covered by it.
> * The license applies to at least 100 source packages in Debian.
> * The license text is longer than 25 lines.

Something that hasn't been brought up yet is the effect on NEW review.
I would like to expand the idea of the same license wording being used
by all works, to include the additional requirement that there aren't
any very similar licenses that are easily confused with the license.

For, if it's a license with small variations of any kind, including
variations that are not project-specific things like the names of
copyright holders, then NEW review is much easier if all the text is
right there in d/copyright.
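
One way to make "easily confused" concrete is to compare a candidate text
against the texts already shipped and flag anything that is nearly, but
not exactly, identical.  The following is only a rough sketch: the
function names and the 0.9 threshold are assumptions made for
illustration, not anything proposed in this thread.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Collapse whitespace and case so that re-wrapping is not a difference."""
    return " ".join(text.lower().split())


def similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1]; 1.0 means identical after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b), autojunk=False).ratio()


def easily_confused(candidate: str, others: dict[str, str],
                    threshold: float = 0.9) -> list[str]:
    """Names of other licenses that are near-identical but not identical."""
    hits = []
    for name, text in others.items():
        score = similarity(candidate, text)
        if threshold <= score < 1.0:
            hits.append(name)
    return hits
```

A license with any hit would, under the extra requirement suggested
above, be better kept in full in d/copyright.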

I would be in favour of the 25 lines criterion.  The main problem with
manipulating d/copyright is only the really long licenses, IME.

-- 
Sean Whitton




Re: What licenses should be included in /usr/share/common-licenses?

2023-09-13 Thread Benjamin Drung
On Sat, 2023-09-09 at 20:35 -0700, Russ Allbery wrote:
> Licenses will be included in common-licenses if they meet all of the
> following criteria:
> 
> * The license is DFSG-free.
> * Exactly the same license wording is used by all works covered by it.
> * The license applies to at least 100 source packages in Debian.
> * The license text is longer than 25 lines.

That is a good starting point. The third rule could be relaxed to also
allow including licences that will save disk space for common
installations (todo: define what is common). Example: More than 2 (or 3)
source packages use this license and produce binary packages that are
part of the desktop and/or server seed.

-- 
Benjamin Drung
Debian & Ubuntu Developer



Re: What licenses should be included in /usr/share/common-licenses?

2023-09-10 Thread Jonas Smedegaard
Quoting Russ Allbery (2023-09-10 21:41:59)
> Jeremy Stanley  writes:
> 
> > I'm surprised, for example, by the absence of the ISC license given that
> > not only ISC's software but much of that originating from the OpenBSD
> > ecosystem uses it. My personal software projects also use the ISC
> > license. Are you aggregating the "License:" field in copyright files
> > too, or is it really simply a hard-coded list of matching patterns?
> 
> It's only a hard-coded list of matching patterns, and it doesn't match any
> of the short licenses because historically I wasn't considering them (with
> the exception of common-licenses references to the BSD license, which I
> kind of would like to make an RC bug and clean up so that we could remove
> the BSD license from common-licenses on the grounds that it's specific to
> only the University of California and confuses people).  If we go with any
> sort of threshold, the script will need serious improvements.
> 
> That was something else I wanted to ask: I've invested all of a couple of
> hours in this script, and would be happy to throw it away in favor of
> something that tries to do a more proper job of classifying the licenses
> referenced in debian/copyright.  Has someone already done this (Jonas,
> perhaps)?

I have so far worked the most on identifying and grouping source data,
putting only little attention (yet - but do dream big...) towards
parsing and processing debian/copyright files e.g. to compare and assess
how well aligned the file is with the content it is supposed to cover.

So if I understand your question correctly and you are not looking for
the output of `licensecheck --list-licenses`, then unfortunately I have
nothing exciting to offer.


 - Jonas

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/
 * Sponsorship: https://ko-fi.com/drjones

 [x] quote me freely  [ ] ask before reusing  [ ] keep private



Re: What licenses should be included in /usr/share/common-licenses?

2023-09-10 Thread Timo Röhling

* Russ Allbery  [2023-09-10 09:16]:

> In order to structure the discussion and prod people into thinking about
> the implications, I will make the following straw man proposal.  This is
> what I would do if the decision was entirely up to me:
>
> Licenses will be included in common-licenses if they meet all of the
> following criteria:
>
> * The license is DFSG-free.
> * Exactly the same license wording is used by all works covered by it.
> * The license applies to at least 100 source packages in Debian.
> * The license text is longer than 25 lines.
>
> In the thread so far, there's been a bit of early convergence around my
> threshold of 100 packages above.  I want to make sure people realize that
> this is a very conservative threshold that would mean saying no to most
> new license inclusion requests.
>
> My guess is that with the threshold set at 100, we will probably add
> around eight new licenses with the 25 line threshold (AGPL-2,
> Artistic-2.0, CC-BY 3.0, CC-BY 4.0, CC-BY-SA 3.0, CC-BY-SA 4.0, and
> OFL-1.1, and I'm not sure about some of those because the CC licenses have
> variants that would each have to reach the threshold independently; my
> current ad hoc script does not distinguish between the variants), and
> maybe 10 to 12 total without that threshold (adding Expat, zlib, some of
> the BSD licenses).  This would essentially be continuing current practice
> except with more transparent and consistent criteria.  It would mean not
> including a lot of long legal license texts that people have complained
> about having to duplicate, such as the CDDL, CeCILL licenses, probably the
> EPL, the Unicode license, etc.
>
> If that's what people want, that's what we'll do; as I said, that's what I
> would do if the choice were left entirely up to me.  But I want to make
> sure I give the folks who want a much more relaxed standard a chance to
> speak up.


For me, this outcome would already be an improvement over the current
situation and alleviate my biggest pain point (CC licenses).
Still, I'd like to be significantly more relaxed.

I propose the following three criteria must be satisfied for
inclusion in /usr/share/common-licenses:

 * The license is DFSG-free.
 * Exactly the same license wording is used by all works covered by it.
 * The license is in the SPDX list of common licenses
   (https://spdx.org/licenses/)
   OR
   The license applies to at least 100 source packages in Debian.


I am not committed to the 100 source packages threshold, it is
mostly intended as fallback for a hypothetical future license which
is super popular but for some reason does not make it to the SPDX
list in a timely manner.
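
Expressed as a predicate, the three criteria above could look like the
sketch below.  The License record and its field names are hypothetical,
invented only for this illustration; the DFSG and "same wording"
judgements are assumed to have been made by a human beforehand.

```python
from dataclasses import dataclass


@dataclass
class License:
    name: str
    dfsg_free: bool
    identical_wording: bool  # same text used by all works covered by it
    in_spdx_list: bool
    source_packages: int     # number of Debian source packages using it


def qualifies(lic: License, min_packages: int = 100) -> bool:
    """Apply the three criteria; the third one is an either/or."""
    popular_enough = lic.in_spdx_list or lic.source_packages >= min_packages
    return lic.dfsg_free and lic.identical_wording and popular_enough
```

With this shape, the package threshold only matters for licenses that
are missing from the SPDX list, which matches its intended role as a
fallback.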

One very intentional side effect of my proposal is a nudge towards
using SPDX License Identifiers in d/copyright files.


Cheers
Timo

--
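⢀⣴⠾⠻⢶⣦⠀   ╭────────────────────────────────────────────────────╮
⣾⠁⢠⠒⠀⣿⡁   │ Timo Röhling                                       │
⢿⡄⠘⠷⠚⠋⠀   │ 9B03 EBB9 8300 DF97 C2B1  23BF CC8C 6BDD 1403 F4CA │
⠈⠳⣄       ╰────────────────────────────────────────────────────╯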




Re: What licenses should be included in /usr/share/common-licenses?

2023-09-10 Thread Russ Allbery
Jeremy Stanley  writes:

> I'm surprised, for example, by the absence of the ISC license given that
> not only ISC's software but much of that originating from the OpenBSD
> ecosystem uses it. My personal software projects also use the ISC
> license. Are you aggregating the "License:" field in copyright files
> too, or is it really simply a hard-coded list of matching patterns?

It's only a hard-coded list of matching patterns, and it doesn't match any
of the short licenses because historically I wasn't considering them (with
the exception of common-licenses references to the BSD license, which I
kind of would like to make an RC bug and clean up so that we could remove
the BSD license from common-licenses on the grounds that it's specific to
only the University of California and confuses people).  If we go with any
sort of threshold, the script will need serious improvements.

That was something else I wanted to ask: I've invested all of a couple of
hours in this script, and would be happy to throw it away in favor of
something that tries to do a more proper job of classifying the licenses
referenced in debian/copyright.  Has someone already done this (Jonas,
perhaps)?
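
For illustration only, aggregating the "License:" fields instead of
matching hard-coded patterns might start out like the sketch below.  It
is not the script discussed here; it assumes machine-readable
debian/copyright files, and it splits license expressions naively on
"or", "and" and commas, so it ignores "with ... exception" clauses.

```python
import re
from collections import Counter

# A License field whose first line carries a short name or an expression.
LICENSE_FIELD = re.compile(r"^License:[ \t]*(\S.*)$", re.MULTILINE)
# Naive split of expressions such as "Apache-2.0 or Expat".
OPERATORS = re.compile(r"\s+(?:or|and)\s+|,\s*", re.IGNORECASE)


def short_names(copyright_text: str) -> set[str]:
    """Return the distinct license short names mentioned in one file."""
    names = set()
    for match in LICENSE_FIELD.finditer(copyright_text):
        for name in OPERATORS.split(match.group(1).strip()):
            if name:
                names.add(name)
    return names


def count_packages(copyright_files: dict[str, str]) -> Counter:
    """Count, per short name, how many source packages mention it."""
    counts = Counter()
    for text in copyright_files.values():
        counts.update(short_names(text))
    return counts
```

Short names are only a convention, so a count like this says nothing
about whether the license texts behind the same name are identical.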

-- 
Russ Allbery (r...@debian.org)  



Re: What licenses should be included in /usr/share/common-licenses?

2023-09-10 Thread Jeremy Stanley
On 2023-09-09 20:35:27 -0700 (-0700), Russ Allbery wrote:
[...]
> Finally, as promised, here is the count of source packages in
> unstable that use the set of licenses that I taught my script to
> look for.  This is likely not accurate; the script uses a bunch of
> heuristics and guesswork.
[...]

I'm surprised, for example, by the absence of the ISC license given
that not only ISC's software but much of that originating from the
OpenBSD ecosystem uses it. My personal software projects also use
the ISC license. Are you aggregating the "License:" field in
copyright files too, or is it really simply a hard-coded list of
matching patterns?

Regardless, this is great work, thanks for kicking off the
reevaluation!
-- 
Jeremy Stanley




Re: What licenses should be included in /usr/share/common-licenses?

2023-09-10 Thread Jonas Smedegaard
Quoting Russ Allbery (2023-09-10 18:16:07)
> Russ Allbery  writes:
> 
> > In order to structure the discussion and prod people into thinking about
> > the implications, I will make the following straw man proposal.  This is
> > what I would do if the decision was entirely up to me:
> 
> > Licenses will be included in common-licenses if they meet all of the
> > following criteria:
> 
> > * The license is DFSG-free.
> > * Exactly the same license wording is used by all works covered by it.
> > * The license applies to at least 100 source packages in Debian.
> > * The license text is longer than 25 lines.
> 
> In the thread so far, there's been a bit of early convergence around my
> threshold of 100 packages above.  I want to make sure people realize that
> this is a very conservative threshold that would mean saying no to most
> new license inclusion requests.
> 
> My guess is that with the threshold set at 100, we will probably add
> around eight new licenses with the 25 line threshold (AGPL-2,
> Artistic-2.0, CC-BY 3.0, CC-BY 4.0, CC-BY-SA 3.0, CC-BY-SA 4.0, and
> OFL-1.1, and I'm not sure about some of those because the CC licenses have
> variants that would each have to reach the threshold independently; my
> current ad hoc script does not distinguish between the variants), and
> maybe 10 to 12 total without that threshold (adding Expat, zlib, some of
> the BSD licenses).  This would essentially be continuing current practice
> except with more transparent and consistent criteria.  It would mean not
> including a lot of long legal license texts that people have complained
> about having to duplicate, such as the CDDL, CeCILL licenses, probably the
> EPL, the Unicode license, etc.
> 
> If that's what people want, that's what we'll do; as I said, that's what I
> would do if the choice were left entirely up to me.  But I want to make
> sure I give the folks who want a much more relaxed standard a chance to
> speak up.

Good point.

Another way of reading the responses is that there was some interest in
including even more licenses.

I would also prefer inclusion of more licenses.  I simply had the
impression that we could do that step by step, and my habit of writing
copyright files (and other texts) using [semantic linebreaks] made me
forget that the Expat license is arguably only 3 lines long (whereas in
my style of writing it is 24-25 lines long).

If "include all SPDX licenses" is for some reason (space in minimal
systems?) problematic, then let me propose a threshold of 1000
characters - as that just about covers Expat ;-)
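
The reason a character threshold is more robust than a line threshold
can be shown with a small sketch: the line count of a text depends on
how it is wrapped, while the character count after collapsing
whitespace does not.  The helper names are made up for this
illustration.

```python
import textwrap


def line_count(text: str, width: int) -> int:
    """Number of lines after re-wrapping the text to the given width."""
    return len(textwrap.wrap(" ".join(text.split()), width=width))


def char_count(text: str) -> int:
    """Number of characters after collapsing all whitespace runs."""
    return len(" ".join(text.split()))
```

Re-wrapping the same license text to 40 or to 80 columns changes
line_count() but leaves char_count() untouched, so only the latter
gives the same answer for every packaging style.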


 - Jonas


[semantic linebreaks]: https://sembr.org/

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/
 * Sponsorship: https://ko-fi.com/drjones

 [x] quote me freely  [ ] ask before reusing  [ ] keep private



Re: What licenses should be included in /usr/share/common-licenses?

2023-09-10 Thread Russ Allbery
Russ Allbery  writes:

> In order to structure the discussion and prod people into thinking about
> the implications, I will make the following straw man proposal.  This is
> what I would do if the decision was entirely up to me:

> Licenses will be included in common-licenses if they meet all of the
> following criteria:

> * The license is DFSG-free.
> * Exactly the same license wording is used by all works covered by it.
> * The license applies to at least 100 source packages in Debian.
> * The license text is longer than 25 lines.

In the thread so far, there's been a bit of early convergence around my
threshold of 100 packages above.  I want to make sure people realize that
this is a very conservative threshold that would mean saying no to most
new license inclusion requests.

My guess is that with the threshold set at 100, we will probably add
around eight new licenses with the 25 line threshold (AGPL-2,
Artistic-2.0, CC-BY 3.0, CC-BY 4.0, CC-BY-SA 3.0, CC-BY-SA 4.0, and
OFL-1.1, and I'm not sure about some of those because the CC licenses have
variants that would each have to reach the threshold independently; my
current ad hoc script does not distinguish between the variants), and
maybe 10 to 12 total without that threshold (adding Expat, zlib, some of
the BSD licenses).  This would essentially be continuing current practice
except with more transparent and consistent criteria.  It would mean not
including a lot of long legal license texts that people have complained
about having to duplicate, such as the CDDL, CeCILL licenses, probably the
EPL, the Unicode license, etc.

If that's what people want, that's what we'll do; as I said, that's what I
would do if the choice were left entirely up to me.  But I want to make
sure I give the folks who want a much more relaxed standard a chance to
speak up.

-- 
Russ Allbery (r...@debian.org)  



Re: What licenses should be included in /usr/share/common-licenses?

2023-09-10 Thread Russ Allbery
Jonas Smedegaard  writes:
> Quoting Hideki Yamane (2023-09-10 11:00:07)

>>  Hmm, how about providing license-common package and that depends on
>>  "license-common-list", and ISO image provides both, then? It would be
>>  no regressions.

I do wonder why we've never done this.  Does anyone know?  common-licenses
is in an essential package so it doesn't require a dependency and is
always present, and we've leaned on that in the past in justifying not
including those licenses in the binary packages themselves, but I'm not
sure why a package dependency wouldn't be legally equivalent.  We allow
symlinking the /usr/share/doc directory in some cases where there is a
dependency, so we don't strictly require every binary package have a
copyright file.

>>  I expect license-common-list data as below
>> 
>>  license-short-name: URL
>>  GPL-2: file:///usr/share/common-licenses/GPL-2
>>  Boost-1.0: https://spdx.org/licenses/BSL-1.0.html

> Ah, so what you propose is to use file URIs.

> I guess Russ' response above was a concern over using http(s) URIs
> towards a non-local resource.

Yes, I think the https URL is an essential part of the first proposal,
since it avoids needing to ship a copy of all of the licenses.  But I'm
dubious that would pass legal muster.

The alternative proposal as I understand it would be to have a
license-common package that includes full copies of all the licenses with
some more relaxed threshold requirement and have packages that use one of
those licenses depend on that package.  (This would obviously require a
maintainer be found for the license-common package.)

> License: Apache-2.0
> Reference: /usr/share/common-licenses/Apache-2.0

This is separate from this particular bug, but I would love to see the
pointer to common-licenses turned into a formal field of this type in the
copyright format, rather than being an ad hoc comment.

-- 
Russ Allbery (r...@debian.org)  



Re: What licenses should be included in /usr/share/common-licenses?

2023-09-10 Thread Luca Boccassi
On Sun, 10 Sept 2023 at 04:36, Russ Allbery  wrote:
> Licenses will be included in common-licenses if they meet all of the
> following criteria:
>
> * The license is DFSG-free.
> * Exactly the same license wording is used by all works covered by it.
> * The license applies to at least 100 source packages in Debian.
> * The license text is longer than 25 lines.

+1, great work and great starting point.

I also agree with Enrico and I'd like lower limits too, but any
progress is good progress on this matter for me.



Re: What licenses should be included in /usr/share/common-licenses?

2023-09-10 Thread Jonas Smedegaard
Quoting Hideki Yamane (2023-09-10 11:00:07)
> On Sat, 09 Sep 2023 22:41:48 -0700
> Russ Allbery  wrote:
> > >  How about just pointing SPDX licenses URL for whole license text and
> > >  lists DFSG-free licenses from that? (but yes, we should adjust short
> > >  name of licenses for DEP-5 and SPDX for it).
> > 
> > Can we do this legally?  If we can, it certainly has substantial merits,
> > but I'm not sure that this satisfies the requirement in a lot of licenses
> > to distribute a copy of the license along with the work.  Some licenses
> > may allow that to be provided as a URL, but I don't think they all do
> > (which makes sense since people may receive Debian on physical media and
> > not have Internet access).
> 
>  Hmm, how about providing license-common package and that depends on
>  "license-common-list", and ISO image provides both, then? It would be
>  no regressions.
> 
> 
>  I expect license-common-list data as below
> 
>  license-short-name: URL
>  GPL-2: file:///usr/share/common-licenses/GPL-2
>  Boost-1.0: https://spdx.org/licenses/BSL-1.0.html

Ah, so what you propose is to use file URIs.

I guess Russ' response above was a concern over using http(s) URIs
towards a non-local resource.

What I practice since some years is the following syntax:

Files: foo/bar
Copyright:
  2022  Someone
License: Apache-2.0 or Expat

License: Apache-2.0
Reference: /usr/share/common-licenses/Apache-2.0

License: Expat
 [the full contents of the Expat license]

That syntax introduces a new field "Reference" (our copyright file
format permits new fields, despite lintian complaining about it).
Related discussion is at https://bugs.debian.org/786450
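
As a rough sketch of how a tool could consume that syntax, the
following reads the stand-alone License paragraphs and collects their
"Reference" paths.  The field is the non-standard one described above;
the parser is deliberately minimal and assumes paragraphs are separated
by truly empty lines.

```python
def parse_paragraphs(text: str) -> list[dict[str, str]]:
    """Split a copyright file into paragraphs of field -> value."""
    paragraphs = []
    for block in text.strip().split("\n\n"):
        fields: dict[str, str] = {}
        current = None
        for line in block.splitlines():
            if line[:1] in (" ", "\t") and current:
                # Continuation line: append to the field being read.
                fields[current] += "\n" + line.strip()
            elif ":" in line:
                current, _, value = line.partition(":")
                fields[current] = value.strip()
        paragraphs.append(fields)
    return paragraphs


def license_references(text: str) -> dict[str, str]:
    """Map short names to the path in a stand-alone paragraph's Reference."""
    refs = {}
    for para in parse_paragraphs(text):
        if "Files" not in para and "License" in para and "Reference" in para:
            refs[para["License"].split("\n", 1)[0]] = para["Reference"]
    return refs
```

For the example above this yields a single entry for Apache-2.0, while
Expat, whose text is given in full, has no reference.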


 - Jonas

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/
 * Sponsorship: https://ko-fi.com/drjones

 [x] quote me freely  [ ] ask before reusing  [ ] keep private



Re: What licenses should be included in /usr/share/common-licenses?

2023-09-10 Thread Marco d'Itri
On Sep 10, Enrico Zini  wrote:

> I like this. I'd say that even if a license is shorter than 25 lines I'd
> appreciate to be able to link to it instead of copypasting it.
Me too.

> I like to be able to fill the license field with a value, after checking
> that the upstream license didn't diverge from what it looks like. I'd
> love to use SPDX IDs there, for example. In an ideal world, I'd like to
> autofill debian/copyright with SPDX IDs from upstream metadata. Having a
> link to a file goes closer to having a declarative license ID.
Agreed.

-- 
ciao,
Marco




Re: What licenses should be included in /usr/share/common-licenses?

2023-09-10 Thread Hideki Yamane
On Sat, 09 Sep 2023 22:41:48 -0700
Russ Allbery  wrote:
> >  How about just pointing SPDX licenses URL for whole license text and
> >  lists DFSG-free licenses from that? (but yes, we should adjust short
> >  name of licenses for DEP-5 and SPDX for it).
> 
> Can we do this legally?  If we can, it certainly has substantial merits,
> but I'm not sure that this satisfies the requirement in a lot of licenses
> to distribute a copy of the license along with the work.  Some licenses
> may allow that to be provided as a URL, but I don't think they all do
> (which makes sense since people may receive Debian on physical media and
> not have Internet access).

 Hmm, how about providing license-common package and that depends on
 "license-common-list", and ISO image provides both, then? It would be
 no regressions.


 I expect license-common-list data as below

 license-short-name: URL
 GPL-2: file:///usr/share/common-licenses/GPL-2
 Boost-1.0: https://spdx.org/licenses/BSL-1.0.html
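
 A minimal sketch of how such a list could be read, assuming the
 "short-name: URL" layout shown above (the function names are invented
 for this illustration).  It also separates entries that are available
 on the local system from those that need network access:

```python
from urllib.parse import urlparse


def parse_license_list(text: str) -> dict[str, str]:
    """Parse 'short-name: URL' lines into a mapping."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, url = line.partition(": ")
        if sep:
            entries[name] = url
    return entries


def available_offline(url: str) -> bool:
    """True for file: URLs, which can be read without Internet access."""
    return urlparse(url).scheme == "file"
```

 Only the file: entries would be readable by someone who received
 Debian on physical media and has no Internet access.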

-- 
Hideki Yamane 



Re: What licenses should be included in /usr/share/common-licenses?

2023-09-09 Thread Russ Allbery
Hideki Yamane  writes:
> Russ Allbery  wrote:

>> Licenses will be included in common-licenses if they meet all of the
>> following criteria:

>  How about just pointing SPDX licenses URL for whole license text and
>  lists DFSG-free licenses from that? (but yes, we should adjust short
>  name of licenses for DEP-5 and SPDX for it).

Can we do this legally?  If we can, it certainly has substantial merits,
but I'm not sure that this satisfies the requirement in a lot of licenses
to distribute a copy of the license along with the work.  Some licenses
may allow that to be provided as a URL, but I don't think they all do
(which makes sense since people may receive Debian on physical media and
not have Internet access).

-- 
Russ Allbery (r...@debian.org)  



Re: What licenses should be included in /usr/share/common-licenses?

2023-09-09 Thread Enrico Zini
On Sat, Sep 09, 2023 at 08:35:27PM -0700, Russ Allbery wrote:

> Licenses will be included in common-licenses if they meet all of the
> following criteria:
> 
> * The license is DFSG-free.
> * Exactly the same license wording is used by all works covered by it.
> * The license applies to at least 100 source packages in Debian.
> * The license text is longer than 25 lines.

I like this. I'd say that even if a license is shorter than 25 lines I'd
appreciate to be able to link to it instead of copypasting it.

I like to be able to fill the license field with a value, after checking
that the upstream license didn't diverge from what it looks like. I'd
love to use SPDX IDs there, for example. In an ideal world, I'd like to
autofill debian/copyright with SPDX IDs from upstream metadata. Having a
link to a file goes closer to having a declarative license ID.

In general the fewer bytes I have to maintain in debian/* the happier I
am, and as a personal aesthetic sense I feel like the fewer bytes we all
have to maintain in debian/* the lighter our collective maintenance
burden.


Enrico

-- 
GPG key: 4096R/634F4BD1E7AD5568 2009-05-08 Enrico Zini 



Re: What licenses should be included in /usr/share/common-licenses?

2023-09-09 Thread Hideki Yamane
On Sat, 09 Sep 2023 20:35:27 -0700
Russ Allbery  wrote:
> Licenses will be included in common-licenses if they meet all of the
> following criteria:

 How about just pointing SPDX licenses URL for whole license text and
 lists DFSG-free licenses from that? (but yes, we should adjust short
 name of licenses for DEP-5 and SPDX for it).


-- 
Hideki Yamane 



Re: What licenses should be included in /usr/share/common-licenses?

2023-09-09 Thread Jonas Smedegaard
Quoting Russ Allbery (2023-09-10 05:35:27)
> In order to structure the discussion and prod people into thinking about
> the implications, I will make the following straw man proposal.  This is
> what I would do if the decision was entirely up to me:
> 
> Licenses will be included in common-licenses if they meet all of the
> following criteria:
> 
> * The license is DFSG-free.
> * Exactly the same license wording is used by all works covered by it.
> * The license applies to at least 100 source packages in Debian.
> * The license text is longer than 25 lines.

I fully support the above proposed criteria, and appreciate your
initiative to have this conversation.


 - Jonas

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/
 * Sponsorship: https://ko-fi.com/drjones

 [x] quote me freely  [ ] ask before reusing  [ ] keep private
