On 1/15/24 7:50 AM, Jasper Orschulko via lists.openembedded.org wrote:
Hi Alex,

Okay, I've read the README file in that repo, and if I understood it
right, the process is:
- run fossology
- have a human inspect the output, and correct it on a file by file
basis (tremendous waste of time and limited developer resources even
when done the 'open source way' if you ask me but whatevs)
- place the corrected output into the above repository

Correct.

Do you really really need the 'human corrected' part of all this?

Unfortunately we do need the "human corrected" part (I wish we
didn't). We have industry customers (and in turn our legal
department) that demand a license compliance report and clearing on a
"per-file" basis. Currently available scanning tools are unfortunately
way too unreliable for usage without any human interaction (believe me,
we tried).

Will this be a stupid amount of work? Yes. Is this compliance obligations
gone mad? IMO, absolutely yes.

But unfortunately this seems to be the way the industry is heading. We
as a supplier company are getting more and more requests of this sort
from our clients (big players in the public transport and automotive
industry) who won't "play" with us unless these obligations are
fulfilled. I've also heard similar stories from other companies.
(That's what you get when you let company lawyers go wild 😉️)

So I'm not happy with the situation but doing the best I can, given the
requirements. In our case that means sharing our license clearings
(which we have to do in any case) with the open source community, in
the hope that other companies have similar challenges and that we can
get some crowd-sourcing going.

It will never possibly cover all of the packages you need to ship and
match all their versions.

So what we are currently striving for is covering all target-relevant
packages (aka without any special suffixes) of a "basic Linux build"
(aka core-image-minimal) for LTS releases (starting with kirkstone).
Additionally, meta-osselot will have logic for re-use of other package
versions (so does osselot as a whole), limiting curation to the diff.
I also have some ideas on dealing with openembedded patches added to
point releases (currently another pain point).

One note, when I worked on this a few years ago (pre-2020) each source file was correlated to an entry in our SPDX database. The licenses were manually reviewed and that was all attached to a file's hash.

This way it didn't matter if you were looking at an upstream tarball, an upstream git repository, or any other mode of getting the source code.

Tracking effectively worked like this: the "human reviewed" SPDX was compared against the patched file manifest. Any sources that didn't match the checksum were flagged for review. A master database of checksums was also available to see if the file was duplicated in other repositories. This also allowed comparison of license/copyright and possibly patent data between different projects (and ALSO allowed exceptions for generated m4, configure.in, etc. files!).
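
To make the checksum idea concrete, here is a rough Python sketch of that kind of lookup. It is not the tooling we actually used. The names (`classify`, `reviewed`, `GENERATED_PATTERNS`), the choice of SHA-256, and the in-memory dicts standing in for the SPDX database and the patched file manifest are all assumptions made for illustration.

```python
import hashlib
from fnmatch import fnmatch

# Assumed exception list for generated files (hypothetical patterns).
GENERATED_PATTERNS = ("*.m4", "configure.in")


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a file's content."""
    return hashlib.sha256(data).hexdigest()


def classify(manifest, reviewed, generated_patterns=GENERATED_PATTERNS):
    """Split a file manifest into cleared, exempt and needs-review files.

    manifest: dict mapping file path -> file content (bytes)
    reviewed: dict mapping content hash -> human-reviewed license string
    """
    cleared, exempt, needs_review = {}, [], []
    for path, data in sorted(manifest.items()):
        digest = sha256_of(data)
        name = path.rsplit("/", 1)[-1]
        if digest in reviewed:
            # Same content was already reviewed, wherever it came from.
            cleared[path] = reviewed[digest]
        elif any(fnmatch(name, pattern) for pattern in generated_patterns):
            # Generated files are excepted from per-file review.
            exempt.append(path)
        else:
            # Unknown checksum (new or patched file): flag for a human.
            needs_review.append(path)
    return cleared, exempt, needs_review


# Hypothetical data: one reviewed file, one patched file, one generated file.
reviewed = {sha256_of(b"int main(void) { return 0; }\n"): "MIT"}
manifest = {
    "src/main.c": b"int main(void) { return 0; }\n",
    "src/util.c": b"/* locally patched */\n",
    "aclocal.m4": b"# generated by aclocal\n",
}
cleared, exempt, needs_review = classify(manifest, reviewed)
print(cleared, exempt, needs_review)
```

Because the key is the content hash rather than the path or the package version, a file that is unchanged between a tarball, a git checkout, or two releases only has to be reviewed once.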

This manual review work absolutely needs to be done for some parts of the world and some company requirements... but be careful not to make the work so complex it can't be completed.

--Mark

Best,
Jasper




