> -----Original Message-----
> From: Saul Wold <[email protected]>
> Sent: 2 February 2022 05:07
> To: Peter Kjellerstedt <[email protected]>; openembedded-[email protected]; [email protected]
> Subject: Re: [OE-core] [PATCH] create-spdx: Get SPDX-License-Identifier from source
>
> On 2/1/22 19:21, Peter Kjellerstedt wrote:
> >> -----Original Message-----
> >> From: [email protected] <openembedded-[email protected]> On Behalf Of Saul Wold
> >> Sent: 2 February 2022 01:02
> >> To: [email protected]; [email protected]
> >> Cc: Saul Wold <[email protected]>
> >> Subject: [OE-core] [PATCH] create-spdx: Get SPDX-License-Identifier from source
> >>
> >> This patch will read the begining of source files and try to find
> >> the SPDX-License-Identifier to populate the licenseInfoInFiles
> >> field for each source file. This does not populate licenseConculed
> >
> > I assume that should be "licenseConcluded".
>
> Well that depends on if "we" want to take some "ownership" of the
> conclusion as the "preparer". How would we handle the case of 2
> SPDX-License-Identifiers tags in a file, is it an "AND" or an "OR"?
> Simple example.
>
> The description of licenseConcluded is:
>
> "License expression for licenseConcluded. The licensing that the
> preparer of this SPDX document has concluded, based on the evidence,
> actually applies to the package."
>
> At somepoint we might be able to fill in that field, but for now I think
> we leave it as NOASSERTION.
>
> Sau!

Sorry, you misunderstood. Since I do not know the specification, I could
only assume that the field you intended to refer to was actually named
"licenseConcluded" rather than "licenseConculed".

//Peter

> >> at this time, nor rolls it up to package level.
> >>
> >> We read as binary to since some source code seem to have some
> >
> > to -> too
> >
> >> binary characters, the license is then converted to ascii strings.
> >>
> >> Signed-off-by: Saul Wold <[email protected]>
> >> ---
> >> Merge after Joshua's patch (spdx: Add set helper for list properties)
> >> merges
> >>
> >>  meta/classes/create-spdx.bbclass | 23 +++++++++++++++++++++++
> >>  1 file changed, 23 insertions(+)
> >>
> >> diff --git a/meta/classes/create-spdx.bbclass b/meta/classes/create-spdx.bbclass
> >> index 8b4203fdb5d..588489cc2b0 100644
> >> --- a/meta/classes/create-spdx.bbclass
> >> +++ b/meta/classes/create-spdx.bbclass
> >> @@ -37,6 +37,24 @@ SPDX_SUPPLIER[doc] = "The SPDX PackageSupplier field for SPDX packages created f
> >>
> >>  do_image_complete[depends] = "virtual/kernel:do_create_spdx"
> >>
> >> +def extract_licenses(filename):
> >> +    import re
> >> +    import oe.spdx
> >
> > You do not use oe.spdx in this function.
> >
> >> +
> >> +    lic_regex = re.compile(b'SPDX-License-Identifier:\s+([-A-Za-z\d. ]+)[ |\n|\r\n]*?')
> >
> > I assume you meant:
> >
> >   lic_regex = re.compile(b'SPDX-License-Identifier:\s+([-A-Za-z\d. ]+)(?: |\n|\r\n)*?')
> >
> > Not that it really matters though, as it will yield the same result as:
> >
> >   lic_regex = re.compile(b'SPDX-License-Identifier:\s+([-A-Za-z\d. ]+)')
> >
> > However, neither of the expressions above will correctly match all the
> > SPDX-License-Identifier examples at https://spdx.dev/ids/#how.
> >
> > Use this instead:
> >
> >   lic_regex = re.compile(b'^\W*SPDX-License-Identifier:\s*([ \w\d.()+-]+?)(?:\s+\W*)?$', re.MULTILINE)
> >
> >> +
> >> +    try:
> >> +        with open(filename, 'rb') as f:
> >> +            size = min(15000, os.stat(filename).st_size)
> >> +            txt = f.read(size)
> >> +            licenses = re.findall(lic_regex, txt)
> >> +            if licenses:
> >> +                ascii_licenses = [lic.decode('ascii') for lic in licenses]
> >> +                return ascii_licenses
> >> +    except Exception as e:
> >> +        bb.warn(f"Exception reading {filename}: {e}")
> >> +    return None
> >> +
> >>  def get_doc_namespace(d, doc):
> >>      import uuid
> >>      namespace_uuid = uuid.uuid5(uuid.NAMESPACE_DNS, d.getVar("SPDX_UUID_NAMESPACE"))
> >> @@ -232,6 +250,11 @@ def add_package_files(d, doc, spdx_pkg, topdir, get_spdxid, get_types, *, archiv
> >>                      checksumValue=bb.utils.sha256_file(filepath),
> >>                  ))
> >>
> >> +            if "SOURCE" in spdx_file.fileTypes:
> >> +                extracted_lics = extract_licenses(filepath)
> >> +                if extracted_lics:
> >> +                    spdx_file.licenseInfoInFiles = extracted_lics
> >> +
> >>              doc.files.append(spdx_file)
> >>              doc.add_relationship(spdx_pkg, "CONTAINS", spdx_file)
> >>              spdx_pkg.hasFiles.append(spdx_file.SPDXID)
> >> --
> >> 2.31.1
> >
> > //Peter
> >
>
> --
> Sau!
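
For reference, here is a minimal sketch of how the two patterns quoted above
behave on a few made-up comment lines. The sample inputs and the helper name
find_ids are illustrative assumptions and not part of the patch. The patterns
are written as raw bytes literals (rb'...') only to avoid invalid-escape
warnings; the expressions themselves are unchanged. The first pattern is the
shortened form that the review says yields the same result as the original.

```python
import re

# Pattern from the patch, in the shortened equivalent form given in the review.
ORIGINAL = re.compile(rb'SPDX-License-Identifier:\s+([-A-Za-z\d. ]+)')

# Pattern proposed in the review.
PROPOSED = re.compile(
    rb'^\W*SPDX-License-Identifier:\s*([ \w\d.()+-]+?)(?:\s+\W*)?$',
    re.MULTILINE,
)


def find_ids(pattern, data):
    """Return every captured license expression in data as an ASCII string.

    Hypothetical helper for this comparison only.
    """
    return [match.decode('ascii') for match in pattern.findall(data)]


# Made-up sample lines in the style of common source file headers.
SAMPLES = [
    b"// SPDX-License-Identifier: GPL-2.0-only\n",
    b"# SPDX-License-Identifier: GPL-2.0+\n",
    b"/* SPDX-License-Identifier: MIT */\n",
    b"/* SPDX-License-Identifier: (GPL-2.0+ OR MIT) */\n",
]

if __name__ == "__main__":
    for sample in SAMPLES:
        print(sample)
        print("  original:", find_ids(ORIGINAL, sample))
        print("  proposed:", find_ids(PROPOSED, sample))
```

On these samples the first pattern drops the trailing "+", keeps the space
before a closing "*/", and finds nothing when the expression starts with a
parenthesis, while the proposed pattern captures the whole expression.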
