Hello everyone, I come seeking your opinions. Please cc 885...@bugs.debian.org on replies so that we can accumulate this discussion in a Debian Policy bug.
One of the responsibilities of the Policy Editors is to determine which licenses should be included in /usr/share/common-licenses, and thus do not have to be reproduced in the copyright file of every package that uses them. We have never had clear criteria for this. We need them, so that we can advertise a clear and transparent policy for inclusion without having the conversation from first principles for each new license.

I was the one who made the last few decisions, and I based them largely on the number of binary packages in Debian using the license. When I was doing this, I set a fairly high threshold (more packages than the least popular license currently in /usr/share/common-licenses, which historically has been GFDL-1.3 although it now appears to be MPL-1.1).

No one was entirely satisfied with those criteria, including me.

I have the following questions:

1. What criteria (besides the obvious one of being a DFSG-free license) should we apply when deciding what licenses to include? Number of packages? Length? How positive we feel towards the license? Some combination of these things? Please be specific.

2. If we use number of packages as a criterion, what should the threshold be? I have appended to the bottom of this message the current output of my ad-hoc license-count tool run against the current archive so that you have a feeling for how many packages use various licenses.

3. If we use number of packages, should that be source packages or binary packages? Source packages represent maintainer effort; binary packages represent disk clutter.

4. Should there be a length cutoff for licenses, such that we do not include in /usr/share/common-licenses any license shorter than some number of lines or bytes? The justification would be that telling people to go look elsewhere for the license has some inherent overhead and annoyance when they discover that the license is all of ten lines and could have just been included in the copyright file.

5. Should we exclude licenses that contain text that all or most users of the license customize when they use it? For example, the existing /usr/share/common-licenses/BSD contains the clause:

       3. Neither the name of the University nor the names of its
          contributors may be used to endorse or promote products
          derived from this software without specific prior written
          permission.

   which users of this specific license usually change to instead include the name of their organization, or their name, or something else.

   Full disclosure: it will be very hard to convince me that licenses used this way should be included in common-licenses, since I believe it is technically incorrect to omit a license and point to the common-licenses version when the provisions of the common-licenses version are different in detail due to naming different people or requiring or prohibiting mentioning of different names as endorsements.

Here are various concerns that people have had in this area in the past. I'm neither indicating agreement nor disagreement with any of these points, only listing them to provoke thought about some of the things people have raised before.

* Including long legal texts in debian/copyright, particularly if one wants to format them for copyright-format, is tedious and annoying and doesn't benefit our users in any significant way, and therefore we should include as many licenses as possible in common-licenses to spare people that work.

* common-licenses consumes disk space on every installed Debian system of any size, and therefore should be kept small to avoid wasting system resources.

* Every approved DFSG license should be included in common-licenses so that it serves as a repository of licenses the project has approved.
* Including a license in common-licenses implies that the project approves of that license, and therefore licenses such as the LaTeX Project Public License 1.0, which requires renaming derived works, should not be included even though DFSG #4 grudgingly allows for this type of license term.

* All licenses explicitly mentioned in the Debian Free Software Guidelines should be present in common-licenses (as justification for including the BSD license even though the current text is specific to the Regents of the University of California).

In order to structure the discussion and prod people into thinking about the implications, I will make the following straw man proposal. This is what I would do if the decision were entirely up to me:

    Licenses will be included in common-licenses if they meet all of the
    following criteria:

    * The license is DFSG-free.
    * Exactly the same license wording is used by all works covered by it.
    * The license applies to at least 100 source packages in Debian.
    * The license text is longer than 25 lines.

I will attempt to guide and summarize discussion on this topic. No decision will be made immediately; I will summarize what I've heard first and be transparent about what direction I think the discussion is converging towards (if any).

Finally, as promised, here is the count of source packages in unstable that use the set of licenses that I taught my script to look for. This is likely not accurate; the script uses a bunch of heuristics and guesswork.

AGPL 3                     277
Apache 2.0                5274
Artistic                  4187
Artistic 2.0               337
BSD (common-licenses)       42
CC-BY 1.0                    3
CC-BY 2.0                   15
CC-BY 2.5                   13
CC-BY 3.0                  240
CC-BY 4.0                  159
CC-BY-SA 1.0                 8
CC-BY-SA 2.0                48
CC-BY-SA 2.5                16
CC-BY-SA 3.0               425
CC-BY-SA 4.0               237
CC0-1.0                   1069
CDDL                        67
CeCILL                      30
CeCILL-B                    13
CeCILL-C                     9
GFDL (any)                 569
GFDL (symlink)              55
GFDL 1.2                   289
GFDL 1.3                   231
GPL (any)                20006
GPL (symlink)             1331
GPL 1                     4033
GPL 2                    10466
GPL 3                     6783
LGPL (any)                5019
LGPL (symlink)             265
LGPL 2                    3850
LGPL 2.1                  2926
LGPL 3                    1526
LaTeX PPL                   46
LaTeX PPL (any)             40
LaTeX PPL 1.3c              32
MPL 1.1                    165
MPL 2.0                    361
SIL OFL 1.0                 11
SIL OFL 1.1                258

-- 
Russ Allbery (r...@debian.org)              <https://www.eyrie.org/~eagle/>