Steve, Nick:

On Sun, Feb 23, 2020 at 10:04 AM Steve Jorgensen <ste...@stevej.name> wrote:
>
> Nick Coghlan wrote:
> > On Sun, 23 Feb 2020 at 08:40, Ian Stapleton Cordasco
> > graffatcolmin...@gmail.com wrote:
> > >
> > > Forgive me if I'm missing something but doesn't
> > > license-file provides this functionality (see
> > > https://stackoverflow.com/a/48691876) for
> > > an example.
> > > I surmise not enough people use it although it's readily available?
> >
> > This is likely to be the case, as license-file[s] is a setuptools
> > feature aimed at ensuring the license file ends up in the sdist/wheel
> > archive, rather than a published metadata field aimed at allowing
> > other tools to find that license file within the sdist/wheel
> > archive.
> >
> > There's a pre-draft PEP in discussion at
> > https://github.com/pombredanne/spdx-pypi-pep/pull/2
> > and
> > https://discuss.python.org/t/improving-license-clarity-with-better-package-m...
> > that looks at clarifying licensing metadata through the use of SPDX
> > classifiers. That draft PEP also formalises the "License-File" field.
> >
> > The approach I'm currently taking to this problem is to combine
> > https://github.com/nexB/scancode-toolkit/blob/develop/README.rst
> > for finding component licenses with
> > https://github.com/nexB/aboutcode-toolkit
> > to generate an open source attribution bundle for those components.
> > The one key caveat on that approach is that the initial scancode
> > output requires some non-trivial cleanup before you can feed it into
> > the aboutcode ABOUT file generator when first applying it to a
> > project:
> > https://github.com/nexB/aboutcode-toolkit/issues/416
> >
> > Cheers,
> > Nick.
> >
> > P.S. As with a lot of distribution related issues, the key challenge
> > with making improvements in this space is that developers really need
> > tools that work today to meet their open source attribution
> > obligations (such as nexB's scancode & aboutcode toolkits), while
> > metadata level improvements (like Philippe's draft PEP) will take
> > years to cover a significant proportion of published packages (and
> > there's a long tail of rarely updated projects that may never catch
> > up).
>
> Thanks for your very informative & useful reply. :)

Nick:
What you are doing with the scancode and aboutcode toolkits seems super
yummy and would likely be super useful elsewhere! If you think there is
something we could extract and make part of the tools, I am game to
help. And I need to submit that draft PEP, BTW :]

Steve:
That PEP eventually documents the de facto, currently undocumented
behavior that includes license_file(s) in built wheels. The field
already exists and is already supported, so it can be used today.

To Nick's point, it is going to take a long while to fix it all in the
actual packages. That said, I am also involved in an initiative to help
along the way, which will hopefully make it take only 100 years instead
of the original thousand years needed to fix the problem (see
https://clearlydefined.io ). There we are:

1. Scanning ALL the packages with scancode (Python + everything else,
   if there is such a thing ;) ).
2. "Scoring" licensing data quality with this approach:
   https://github.com/clearlydefined/license-score/blob/master/ClearlyLicensedMetrics.md
   The license scoring includes whether the full license text is
   present in the package (which is your original concern).
3. Having volunteers review that data for accuracy and correctness,
   and fix it if needed.
4. Eventually pushing fixes back upstream.

There is also a Google Summer of Code project idea
https://github.com/nexB/aboutcode/wiki/Project-Ideas-Improve-License-Detection-Accuracy
to do some large-scale analysis of the 10M scans we have on hand.

Do not hesitate to reach out on or off list.

--
Cordially
Philippe Ombredanne
p...@nexb.com (scancode-toolkit maintainer)
--
Distutils-SIG mailing list -- distutils-sig@python.org
To unsubscribe send an email to distutils-sig-le...@python.org
https://mail.python.org/mailman3/lists/distutils-sig.python.org/
Message archived at https://mail.python.org/archives/list/distutils-sig@python.org/message/NF7V7Z2LOBEQNAZ6WQBSXDAQFU6RDG7C/
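
A footnote on the point that license_file(s) already get the license text into built wheels: one rough way to check a given wheel is to open it as a zip archive and look inside its `.dist-info` directory. The sketch below is only an illustration, not how scancode or ClearlyDefined do it. The function name `find_license_files` and the list of filename prefixes are made up for the example, and it assumes the draft "License-File" field, when present, appears as a header in the wheel's METADATA file.

```python
import zipfile
from email.parser import HeaderParser
from pathlib import PurePosixPath

# Hypothetical filename prefixes treated as "license-like"; adjust as needed.
LICENSE_PREFIXES = ("license", "licence", "copying", "notice")


def find_license_files(wheel_path):
    """Return (found, declared) for a wheel archive.

    found    -- archive members under *.dist-info whose file name looks
                like a license file
    declared -- values of any License-File headers in METADATA
                (assumes the draft field is written there)
    """
    found, declared = [], []
    with zipfile.ZipFile(wheel_path) as wheel:
        for name in wheel.namelist():
            if name.endswith("/"):
                continue  # skip explicit directory entries
            path = PurePosixPath(name)
            in_dist_info = any(
                part.endswith(".dist-info") for part in path.parts[:-1]
            )
            if not in_dist_info:
                continue
            if path.name.lower().startswith(LICENSE_PREFIXES):
                found.append(name)
            if path.name == "METADATA" and len(path.parts) == 2:
                text = wheel.read(name).decode("utf-8")
                headers = HeaderParser().parsestr(text)
                declared = headers.get_all("License-File") or []
    return found, declared
```

Calling `find_license_files("some_package-1.0-py3-none-any.whl")` (a placeholder file name) returns the two lists; an empty `found` list suggests the full license text did not make it into the wheel, which is the case the scoring above is meant to flag.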