On Wed, Sep 14, 2022 at 9:16 AM Marta Rybczynska <[email protected]> wrote:
>
> Dear all,
> (cross-posting to oe-core and *-architecture)
> In the last months, we have worked in Oniro on using the create-spdx
> class for both IP compliance and security.
>
> During this work, Alberto Pianon has found that some information is
> missing from the SBOM and it does not contain enough for Software
> Composition Analysis. The main missing point is the relation between
> the actual upstream sources and the final binaries (create-spdx uses
> composite sources).

I believe we map the binaries to the source code from the -dbg
packages; is the premise that this is insufficient? Can you elaborate
on why that is? I don't quite understand. The debug sources are
(basically) what we actually compiled (e.g. post-do_patch) to produce
the binary, and you can in turn follow these back to the upstream
sources with the downloadLocation property.

>
> Alberto has worked on how to obtain the missing data and now has a
> POC. This POC provides full source-to-binary tracking of Yocto builds
> through a couple of scripts (intended to be transformed into a new
> bbclass at a later stage). The goal is to add the missing pieces of
> information in order to get a "real" SBOM from Yocto, which should, at
> a minimum:

Please be a little careful with the wording; SBoMs have a lot of uses,
and many of them we can satisfy with what we currently generate. It
may not cover the exact use case you are looking for, but that doesn't
mean it's not a "real" SBoM :)

>
> - carefully describe what is found in a final image (i.e. binary files
> and their dependencies), since that is what is actually distributed
> and goes into the final product;
> - describe how such binary files have been generated and where they
> come from (i.e. upstream sources, including patches and other stuff
> added from meta-layers); provenance is important for a number of
> reasons related to IP Compliance and security.
>
> The aim is to become able to:
>
> - map binaries to their corresponding upstream source packages (and
> not to the "internal" source packages created by recipes by combining
> multiple upstream sources and patches)
> - map binaries to the source files that have been actually used to
> build them - which usually are a small subset of the whole source
> package
>
> With respect to IP compliance, this would allow to, among other things:
>
> - get the real license text for each binary file, by getting the
> license of the specific source files it has been generated from
> (provided by Fossology, for instance), - and not the main license
> stated in the corresponding recipe (which may be as confusing as
> GPL-2.0-or-later & LGPL-2.1-or-later & BSD-3-Clause & BSD-4-Clause, or
> even worse)

IIUC this is the difference between the "Declared" license and the
"Concluded" license. You can report both, and I think
create-spdx.bbclass can currently do this with its rudimentary source
license scanning. You really do want both and it's a great way to make
sure that the "Declared" license (that is the license in the recipe)
reflects the reality of the source code.

> - automatically check license incompatibilities at the binary file level.
>
> Other possible interesting things could be done also on the security side.
>
> This work intends to add a way to provide additional data that can be
> used by create-spdx, not to replace create-spdx in any way.
>
> The sources with a long README are available at
> https://gitlab.eclipse.org/eclipse/oniro-compliancetoolchain/toolchain/tinfoilhat/-/tree/srctracker/srctracker
>
> What do you think of this work? Would it be of interest to integrate
> into YP at some point? Shall we discuss this?

This seems promising as something that could potentially move into
core.
I have a few points:

- The extraction of the sources to a dedicated directory is something
  that Richard has been toying around with for quite a while, and I
  think it would greatly simplify that part of your process. I would
  very much encourage you to look at the work he's done, and work on
  that to get it pushed across the finish line as it's a really good
  improvement that would benefit not just your source scanning.
- I would encourage you to not wait to turn this into a bbclass and/or
  library functions. You should be able to do this in a new layer, and
  that would make it much clearer as to what the path to being
  included in OE-core would look like. It also would (IMHO) be nicer
  to the users :)

>
> Marta and Alberto
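
To make the Declared vs. Concluded point above a bit more concrete,
here is a rough sketch of the kind of check I mean. It assumes an SPDX
2.x JSON document with the standard "packages", "licenseDeclared" and
"licenseConcluded" fields; the function name and the sample data are
made up for illustration, and this is not how create-spdx.bbclass
itself works. The comparison is a plain string match, so equivalent
expressions written differently would still be flagged.

```python
# Values that mean "no usable license information" in SPDX 2.x.
NO_INFO = {"NOASSERTION", "NONE", None}


def license_mismatches(doc):
    """Return (name, declared, concluded) for packages where both
    licenses are known and the two strings differ.

    `doc` is an already-parsed SPDX 2.x JSON document (a dict).
    """
    mismatches = []
    for pkg in doc.get("packages", []):
        declared = pkg.get("licenseDeclared")
        concluded = pkg.get("licenseConcluded")
        # Skip packages where either side carries no information.
        if declared in NO_INFO or concluded in NO_INFO:
            continue
        if declared != concluded:
            mismatches.append((pkg.get("name"), declared, concluded))
    return mismatches


# Hypothetical sample data, not taken from a real build.
sample = {
    "packages": [
        {"name": "foo",
         "licenseDeclared": "GPL-2.0-or-later",
         "licenseConcluded": "GPL-2.0-only"},
        {"name": "bar",
         "licenseDeclared": "MIT",
         "licenseConcluded": "MIT"},
        {"name": "baz",
         "licenseDeclared": "BSD-3-Clause",
         "licenseConcluded": "NOASSERTION"},
    ]
}

for name, declared, concluded in license_mismatches(sample):
    print(f"{name}: declared {declared}, concluded {concluded}")
```

Anything this reports is a recipe whose LICENSE is worth a second look
against what the source scan found.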
