On 04.02.2022 at 14:41, Richard Purdie wrote:
On Fri, 2022-02-04 at 10:05 +0100, Stefan Herbrechtsmeier wrote:
On 03.02.2022 at 22:24, Richard Purdie via lists.openembedded.org wrote:
On Thu, 2022-02-03 at 09:07 -0800, Saul Wold wrote:
When a file cannot be identified by checksum and it contains an SPDX
License-Identifier tag, use it as the found license.

[YOCTO #14529]

Tested with LICENSE files that contain 1 or more SPDX-License-Identifier tags

Signed-off-by: Saul Wold <saul.w...@windriver.com>
---
   scripts/lib/recipetool/create.py | 16 +++++++++++-----
   1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/scripts/lib/recipetool/create.py b/scripts/lib/recipetool/create.py
index 507a230511..9149c2d94f 100644
--- a/scripts/lib/recipetool/create.py
+++ b/scripts/lib/recipetool/create.py
@@ -1221,14 +1221,20 @@ def guess_license(srctree, d):
       for licfile in sorted(licfiles):
           md5value = bb.utils.md5_file(licfile)
           license = md5sums.get(md5value, None)
+        license_list = []
           if not license:
               license, crunched_md5, lictext = crunch_license(licfile)
               if lictext and not license:
-                license = 'Unknown'
-                logger.info("Please add the following line for '%s' to a 'lib/recipetool/licenses.csv' " \
-                    "and replace `Unknown` with the license:\n" \
-                    "%s,Unknown" % (os.path.relpath(licfile, srctree), md5value))
-        if license:
+                spdx_re = re.compile('SPDX-License-Identifier:\s+([-A-Za-z\d. ]+)[ |\n|\r\n]*?')
+                license_list = re.findall(spdx_re, "\n".join(lictext))
+                if not license_list:
+                    license_list.append('Unknown')
+                    logger.info("Please add the following line for '%s' to a 'lib/recipetool/licenses.csv' " \
+                        "and replace `Unknown` with the license:\n" \
+                        "%s,Unknown" % (os.path.relpath(licfile, srctree), md5value))
+        else:
+            license_list.append(license)
+        for license in license_list:
               licenses.append((license, os.path.relpath(licfile, srctree), md5value))
# FIXME should we grab at least one source file with a license header and add that too?

I think to close this bug the code may need to go one step further and
effectively grep over the source tree.

Please keep in mind that we need the full license text, not only the
license name, for license compliance. The current function only searches
for license files that contain license text.

We'd probably want to list the value of any SPDX-License-Identifier: header
found in any of the source files for the user to then decide upon?
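
A minimal sketch of what such a scan could look like. The helper name
`collect_spdx_identifiers`, the `max_lines` cut-off and the skipped `.git`
directory are assumptions made for illustration; none of this is existing
recipetool code.

```python
import os
import re
from collections import defaultdict

# Captures the expression after the tag; stops at comment closers such as "*/".
SPDX_RE = re.compile(r'SPDX-License-Identifier:[ \t]*([-A-Za-z0-9.+() ]+)')

def collect_spdx_identifiers(srctree, max_lines=20):
    """Hypothetical helper: map each SPDX expression to the files declaring it.

    Only the first `max_lines` lines of each file are inspected, on the
    assumption that the tag lives in the file header.
    """
    found = defaultdict(list)
    for root, dirs, files in os.walk(srctree):
        # Do not descend into version control metadata.
        dirs[:] = [d for d in dirs if d != '.git']
        for name in sorted(files):
            path = os.path.join(root, name)
            try:
                with open(path, 'r', errors='ignore') as f:
                    for _, line in zip(range(max_lines), f):
                        m = SPDX_RE.search(line)
                        if m:
                            expr = m.group(1).strip()
                            found[expr].append(os.path.relpath(path, srctree))
                            break
            except OSError:
                # Unreadable files are skipped rather than aborting the scan.
                continue
    return dict(found)
```

The result would only be shown to the user as a hint; it says nothing about
whether the matching license text is shipped in the tree.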

I think this is a separate feature, more like a license checker, because if
you have an SPDX-License-Identifier without a license text you have a license
violation.

This brings us to the problem that this code will interpret a file with
only an SPDX-License-Identifier as a license file with license text.

As I understand it, the tool is there to help write a recipe, so filling out
LICENSE and highlighting a missing full license text would be a valid approach
for the tool and helpful to the user?

Yes, but we should distinguish between license files, which are guessed via a hash of their content, and an SPDX-License-Identifier, which labels the source code’s license. In this case the SPDX-License-Identifier is non-material text in a license file and should be filtered out inside the crunch_license function.
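
A rough sketch of that filtering idea, kept separate from the real crunch_license implementation. The names `strip_spdx_lines` and `normalized_md5` are hypothetical, and the whitespace normalization here is a simplification, not what recipetool actually does.

```python
import hashlib
import re

# A tag line, optionally preceded by comment characters such as "#", "//" or "/*".
SPDX_LINE_RE = re.compile(r'^\W*SPDX-License-Identifier:.*$')

def strip_spdx_lines(lines):
    """Hypothetical helper: drop SPDX tag lines, which are not license text."""
    return [line for line in lines if not SPDX_LINE_RE.match(line)]

def normalized_md5(lines):
    """Hash the remaining text with whitespace collapsed (simplified normalization)."""
    text = ' '.join(' '.join(strip_spdx_lines(lines)).split())
    return hashlib.md5(text.encode('utf-8')).hexdigest()
```

With this, a license file hashes to the same value whether or not someone added a tag line on top, and a file that contains only the tag normalizes to empty text instead of being treated as a license.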

Collecting all used licenses via SPDX-License-Identifier is an additional feature, and we need a warning if an SPDX-License-Identifier exists without a license file.
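
One possible shape for that check, as a sketch only. It assumes the `(license, path, md5)` tuples from the patch above, assumes the names in licenses.csv match SPDX identifiers, and uses hypothetical helpers `license_ids` and `missing_license_texts`. The expression splitting is deliberately naive and is not a full SPDX expression parser.

```python
import re

def license_ids(expression):
    """Hypothetical helper: split an SPDX expression into license identifiers.

    Handles AND/OR and parentheses; the exception after WITH is dropped.
    """
    ids = set()
    for part in re.split(r'\s+(?:AND|OR)\s+|[()]', expression):
        part = part.strip()
        if part:
            ids.add(part.split(' WITH ')[0].strip())
    return ids

def missing_license_texts(spdx_expressions, licenses):
    """Return identifiers used in source headers that no license file covers.

    `licenses` is a list of (license, path, md5) tuples; 'Unknown' entries
    do not count as covering anything.
    """
    have = {lic for lic, _path, _md5 in licenses if lic != 'Unknown'}
    needed = set()
    for expr in spdx_expressions:
        needed |= license_ids(expr)
    return sorted(needed - have)
```

Each returned identifier would be a candidate for a warning that the source declares a license for which no full license text was found.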

It certainly isn't intended as full validation, just intended to assist the
creation of a recipe.

But this patch is a regression because it doesn't distinguish between a license file with a known hash and a mostly empty file with an SPDX-License-Identifier.

Regards
  Stefan
