On Wed, Sep 14, 2022 at 9:16 AM Marta Rybczynska <[email protected]> wrote:
>
> Dear all,
> (cross-posting to oe-core and *-architecture)
> In the last months, we have worked in Oniro on using the create-spdx
> class for both IP compliance and security.
>
> During this work, Alberto Pianon has found that some information is
> missing from the SBOM and it does not contain enough for Software
> Composition Analysis. The main missing point is the relation between
> the actual upstream sources and the final binaries (create-spdx uses
> composite sources).

I believe we map the binaries to the source code from the -dbg
packages; is the premise that this is insufficient? Can you elaborate
on why that is? I don't quite understand. The debug sources are
(basically) what we actually compiled (e.g. post-do_patch) to produce
the binary, and you can in turn follow these back to the upstream
sources with the downloadLocation property.
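
To make that chain concrete (binary -> debug sources -> package ->
downloadLocation), here is a minimal sketch against SPDX 2.2 JSON. It is
not what create-spdx itself does. It assumes the documents have already
been merged into one dictionary, that binaries point at their sources
with GENERATED_FROM relationships, and that packages list their files
with CONTAINS relationships. The SPDXIDs, file names and URL in
EXAMPLE_DOC are made up for illustration.

```python
# Hypothetical, hand-written SPDX-2.2-style document. Real create-spdx output
# is split across several files linked by external document references.
EXAMPLE_DOC = {
    "packages": [
        {
            "SPDXID": "SPDXRef-Recipe-foo",
            "name": "foo",
            "downloadLocation": "https://example.com/foo-1.0.tar.gz",
        }
    ],
    "files": [
        {"SPDXID": "SPDXRef-Bin-foo", "fileName": "usr/bin/foo"},
        {"SPDXID": "SPDXRef-Src-main", "fileName": "src/main.c"},
    ],
    "relationships": [
        {
            "spdxElementId": "SPDXRef-Bin-foo",
            "relationshipType": "GENERATED_FROM",
            "relatedSpdxElement": "SPDXRef-Src-main",
        },
        {
            "spdxElementId": "SPDXRef-Recipe-foo",
            "relationshipType": "CONTAINS",
            "relatedSpdxElement": "SPDXRef-Src-main",
        },
    ],
}


def related(doc, element_id, rel_type):
    """SPDXIDs that element_id points at through relationships of rel_type."""
    return [
        rel["relatedSpdxElement"]
        for rel in doc.get("relationships", [])
        if rel["spdxElementId"] == element_id
        and rel["relationshipType"] == rel_type
    ]


def upstream_locations(doc, binary_id):
    """Follow binary -> source files -> containing packages -> downloadLocation."""
    sources = set(related(doc, binary_id, "GENERATED_FROM"))
    locations = set()
    for pkg in doc.get("packages", []):
        contained = set(related(doc, pkg["SPDXID"], "CONTAINS"))
        if sources & contained:
            locations.add(pkg.get("downloadLocation", "NOASSERTION"))
    return sorted(locations)


print(upstream_locations(EXAMPLE_DOC, "SPDXRef-Bin-foo"))
```

If a recipe combines several upstream sources, this only gets you as far
as the package-level downloadLocation, which I assume is the gap you are
describing.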

>
> Alberto has worked on how to obtain the missing data and now has a
> POC. This POC provides full source-to-binary tracking of Yocto builds
> through a couple of scripts (intended to be transformed into a new
> bbclass at a later stage). The goal is to add the missing pieces of
> information in order to get a "real" SBOM from Yocto, which should, at
> a minimum:

Please be a little careful with the wording; SBoMs have a lot of uses,
and many of them we can satisfy with what we currently generate; it
may not cover the exact use case you are looking for, but that doesn't
mean it's not a "real" SBoM :)

>
> - carefully describe what is found in a final image (i.e. binary files
> and their dependencies), since that is what is actually distributed
> and goes into the final product;
> - describe how such binary files have been generated and where they
> come from (i.e. upstream sources, including patches and other stuff
> added from meta-layers); provenance is important for a number of
> reasons related to IP Compliance and security.
>
> The aim is to be able to:
>
> - map binaries to their corresponding upstream source packages (and
> not to the "internal" source packages created by recipes by combining
> multiple upstream sources and patches)
> - map binaries to the source files that have been actually used to
> build them - which usually are a small subset of the whole source
> package
>
> With respect to IP compliance, this would allow us to, among other things:
>
> - get the real license text for each binary file, by getting the
> license of the specific source files it has been generated from
> (provided by Fossology, for instance), and not the main license
> stated in the corresponding recipe (which may be as confusing as
> GPL-2.0-or-later & LGPL-2.1-or-later & BSD-3-Clause & BSD-4-Clause, or
> even worse)

IIUC this is the difference between the "Declared" license and the
"Concluded" license. You can report both, and I think
create-spdx.bbclass can currently do this with its rudimentary source
license scanning. You really do want both and it's a great way to make
sure that the "Declared" license (that is the license in the recipe)
reflects the reality of the source code.
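
As a rough illustration of using the two fields together, the sketch
below flags packages whose licenseDeclared and licenseConcluded differ.
Both field names come from the SPDX 2.2 package schema; the sample
package entries are invented, and the plain string comparison is a
simplification (a real check would need to parse and normalise the
license expressions).

```python
# Hypothetical package entries in SPDX 2.2 JSON form, for illustration only.
EXAMPLE_PACKAGES = [
    {
        "name": "foo",
        "licenseDeclared": "GPL-2.0-or-later",
        "licenseConcluded": "GPL-2.0-or-later AND BSD-3-Clause",
    },
    {"name": "bar", "licenseDeclared": "MIT", "licenseConcluded": "MIT"},
    {"name": "baz", "licenseDeclared": "MIT", "licenseConcluded": "NOASSERTION"},
]


def license_mismatches(packages):
    """Return (name, declared, concluded) for packages where the two disagree."""
    mismatches = []
    for pkg in packages:
        declared = pkg.get("licenseDeclared", "NOASSERTION")
        concluded = pkg.get("licenseConcluded", "NOASSERTION")
        # Nothing to compare if either side makes no assertion.
        if "NOASSERTION" in (declared, concluded):
            continue
        # Naive comparison: equivalent expressions written differently
        # would be reported as mismatches.
        if declared != concluded:
            mismatches.append((pkg["name"], declared, concluded))
    return mismatches


for name, declared, concluded in license_mismatches(EXAMPLE_PACKAGES):
    print(f"{name}: declared {declared!r}, concluded {concluded!r}")
```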

> - automatically check license incompatibilities at the binary file level.
>
> Other possible interesting things could be done also on the security side.
>
> This work intends to add a way to provide additional data that can be
> used by create-spdx, not to replace create-spdx in any way.
>
> The sources with a long README are available at
> https://gitlab.eclipse.org/eclipse/oniro-compliancetoolchain/toolchain/tinfoilhat/-/tree/srctracker/srctracker
>
> What do you think of this work? Would it be of interest to integrate
> into YP at some point? Shall we discuss this?

This seems promising as something that could potentially move into
core. I have a few points:
 - The extraction of the sources to a dedicated directory is something
that Richard has been toying around with for quite a while, and I
think it would greatly simplify that part of your process. I would
very much encourage you to look at the work he's done, and work on
that to get it pushed across the finish line as it's a really good
improvement that would benefit not just your source scanning.
 - I would encourage you to not wait to turn this into a bbclass
and/or library functions. You should be able to do this in a new
layer, and that would make it much clearer as to what the path to
being included in OE-core would look like. It also would (IMHO) be
nicer to the users :)

>
> Marta and Alberto