On 2/1/22 19:21, Peter Kjellerstedt wrote:
-----Original Message-----
From: [email protected]
<[email protected]> On Behalf Of Saul Wold
Sent: den 2 februari 2022 01:02
To: [email protected]; [email protected]
Cc: Saul Wold <[email protected]>
Subject: [OE-core] [PATCH] create-spdx: Get SPDX-License-Identifier from source
This patch will read the beginning of source files and try to find
the SPDX-License-Identifier to populate the licenseInfoInFiles
field for each source file. This does not populate licenseConculed
I assume that should be "licenseConcluded".
Well, that depends on whether "we" want to take some "ownership" of the
conclusion as the "preparer". As a simple example: how would we handle
the case of two SPDX-License-Identifier tags in one file? Is it an "AND"
or an "OR"?
The description of licenseConcluded is:
"License expression for licenseConcluded. The licensing that the
preparer of this SPDX document has concluded, based on the evidence,
actually applies to the package."
At some point we might be able to fill in that field, but for now I think
we should leave it as NOASSERTION.
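
To illustrate the distinction drawn above, here is a minimal sketch of what a file entry could look like when two identifiers are found and no conclusion is drawn. The field names follow the SPDX 2.2 JSON schema; the SPDXID, file name, and license values are hypothetical and not taken from the patch.

```python
import json

# Hypothetical result for a source file carrying two
# SPDX-License-Identifier tags. The tags are recorded as found;
# nothing here decides whether they combine as AND or OR.
declared = ["GPL-2.0-only", "MIT"]

spdx_file = {
    "SPDXID": "SPDXRef-SourceFile-example",  # hypothetical ID
    "fileName": "src/example.c",             # hypothetical path
    "licenseInfoInFiles": declared,          # evidence found in the file
    "licenseConcluded": "NOASSERTION",       # the preparer draws no conclusion
}

print(json.dumps(spdx_file, indent=2))
```

Recording the tags as a list keeps the evidence intact without committing the preparer to a combined license expression.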
Sau!
at this time, nor rolls it up to package level.
We read as binary to since some source code seem to have some
to -> too
binary characters, the license is then converted to ascii strings.
Signed-off-by: Saul Wold <[email protected]>
---
Merge after Joshua's patch (spdx: Add set helper for list properties)
merges
meta/classes/create-spdx.bbclass | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
diff --git a/meta/classes/create-spdx.bbclass b/meta/classes/create-spdx.bbclass
index 8b4203fdb5d..588489cc2b0 100644
--- a/meta/classes/create-spdx.bbclass
+++ b/meta/classes/create-spdx.bbclass
@@ -37,6 +37,24 @@ SPDX_SUPPLIER[doc] = "The SPDX PackageSupplier field for SPDX packages created f
do_image_complete[depends] = "virtual/kernel:do_create_spdx"
+def extract_licenses(filename):
+    import re
+    import oe.spdx
You do not use oe.spdx in this function.
+
+    lic_regex = re.compile(b'SPDX-License-Identifier:\s+([-A-Za-z\d. ]+)[ |\n|\r\n]*?')
I assume you meant:
lic_regex = re.compile(b'SPDX-License-Identifier:\s+([-A-Za-z\d. ]+)(?: |\n|\r\n)*?')
Not that it really matters though, as it will yield the same result as:
lic_regex = re.compile(b'SPDX-License-Identifier:\s+([-A-Za-z\d. ]+)')
However, neither of the expressions above will correctly match all the
SPDX-License-Identifier examples at https://spdx.dev/ids/#how.
Use this instead:
lic_regex = re.compile(b'^\W*SPDX-License-Identifier:\s*([ \w\d.()+-]+?)(?:\s+\W*)?$', re.MULTILINE)
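
As a quick check of the difference, the following sketch runs both the simplified expression from the patch and the suggested one against three sample header lines. The sample lines and the variable names are assumptions made for illustration, and the patterns are written as raw bytes literals only to avoid escape warnings.

```python
import re

# Simplified form of the expression from the patch.
OLD_REGEX = re.compile(rb'SPDX-License-Identifier:\s+([-A-Za-z\d. ]+)')

# Expression suggested in the review.
NEW_REGEX = re.compile(
    rb'^\W*SPDX-License-Identifier:\s*([ \w\d.()+-]+?)(?:\s+\W*)?$',
    re.MULTILINE,
)

# Hypothetical header lines in three comment styles.
SAMPLE = (
    b"// SPDX-License-Identifier: MIT\n"
    b"/* SPDX-License-Identifier: GPL-2.0-or-later WITH Bison-exception-2.2 */\n"
    b"# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)\n"
)

old = [m.decode("ascii") for m in OLD_REGEX.findall(SAMPLE)]
new = [m.decode("ascii") for m in NEW_REGEX.findall(SAMPLE)]

print(old)
print(new)
```

On this input the patch's expression keeps a trailing space before the closing `*/` and skips the parenthesised expression entirely, while the suggested one returns all three expressions without surrounding comment characters.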
+
+    try:
+        with open(filename, 'rb') as f:
+            size = min(15000, os.stat(filename).st_size)
+            txt = f.read(size)
+            licenses = re.findall(lic_regex, txt)
+            if licenses:
+                ascii_licenses = [lic.decode('ascii') for lic in licenses]
+                return ascii_licenses
+    except Exception as e:
+        bb.warn(f"Exception reading {filename}: {e}")
+    return None
+
 def get_doc_namespace(d, doc):
     import uuid
     namespace_uuid = uuid.uuid5(uuid.NAMESPACE_DNS, d.getVar("SPDX_UUID_NAMESPACE"))
@@ -232,6 +250,11 @@ def add_package_files(d, doc, spdx_pkg, topdir, get_spdxid, get_types, *, archiv
checksumValue=bb.utils.sha256_file(filepath),
))
+ if "SOURCE" in spdx_file.fileTypes:
+ extracted_lics = extract_licenses(filepath)
+ if extracted_lics:
+ spdx_file.licenseInfoInFiles = extracted_lics
+
doc.files.append(spdx_file)
doc.add_relationship(spdx_pkg, "CONTAINS", spdx_file)
spdx_pkg.hasFiles.append(spdx_file.SPDXID)
--
2.31.1
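
For trying the helper outside BitBake, a standalone sketch along the lines of the patch might look like the following. It assumes the expression suggested in the review, swaps `bb.warn` for the standard `logging` module, and uses the hypothetical names `extract_licenses_standalone` and `limit`. The temporary file only stands in for a source file that contains non-text bytes.

```python
import logging
import os
import re
import tempfile

LIC_REGEX = re.compile(
    rb'^\W*SPDX-License-Identifier:\s*([ \w\d.()+-]+?)(?:\s+\W*)?$',
    re.MULTILINE,
)


def extract_licenses_standalone(filename, limit=15000):
    """Return the SPDX identifiers found in the first `limit` bytes, or None."""
    try:
        # Binary mode avoids decode errors on files with stray non-text bytes.
        with open(filename, "rb") as f:
            txt = f.read(limit)  # read() returns fewer bytes for short files
        licenses = LIC_REGEX.findall(txt)
        if licenses:
            return [lic.decode("ascii") for lic in licenses]
    except Exception as e:
        logging.warning("Exception reading %s: %s", filename, e)
    return None


# Hypothetical source file: a license tag followed by non-ASCII bytes.
with tempfile.NamedTemporaryFile("wb", suffix=".c", delete=False) as tmp:
    tmp.write(b"/* SPDX-License-Identifier: MIT */\n")
    tmp.write(b"static const char blob[] = \"\xff\xfe\";\n")
    path = tmp.name

result = extract_licenses_standalone(path)
os.unlink(path)
print(result)
```

Because `read(limit)` already stops at end of file, this sketch does not need the separate `os.stat()` call to cap the read size.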
//Peter
--
Sau!