Hey Brandon and Nisha,
There are people at Microsoft who would probably be interested in
participating in this discussion. Are you considering holding some brainstorming
sessions?
In my mind I’ve always imagined the pedigree information (which build would be
a part of) as defining both new element types and potentially new
relationship types (although I think we might have most of these covered
already). I’d also want to ensure that anything we do here can integrate with
in-toto/SLSA attestations.
If I take an example in SPDX today (bar-0.1 is a static library consumed to
build the foo-1.0 package):
(File:foo.c, File:foo.h, File:bar.lib, File:bar.h)--[:GENERATES]-->File:foo
(File:bar.c, File:bar.h)--[:GENERATES]-->File:bar.lib
(File:bar.lib, File:bar.h)--[:CONTAINED_IN]-->File:bar-0.1.tgz
Package:bar-0.1--[:DISTRIBUTION_ARTIFACT]-->File:bar-0.1.tgz
Package:bar-0.1--[:BUILD_DEPENDENCY_OF]-->Package:foo-1.0
(File:foo)--[:CONTAINED_IN]-->File:foo-1.0.tgz
Package:foo-1.0--[:DISTRIBUTION_ARTIFACT]-->File:foo-1.0.tgz
I can trace the integrity of all of the build artifacts (based on their hashes)
but I don’t know anything about the build environment or build environments
that produced them. I don’t know if bar and foo were built in the same build
process or by different build processes and, if the builds were fully
reproducible, I may not even know which instance of a build produced them
(because the input and output hashes would be the same). I may be able to
assume some of this information based on the pedigree of the SPDX document
(such as who created it and signed it) but that’s an inference and still lacks
interesting detail.
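As a rough illustration of that limitation, the relationships above can be written down as plain triples and walked backwards from File:foo. This is only a sketch in Python; the tuple layout and the `inputs_of` and `relationship_types` helpers are assumptions made for illustration and are not part of SPDX.

```python
# Hypothetical encoding of the example graph as (sources, relationship, target).
RELATIONSHIPS = [
    (("File:foo.c", "File:foo.h", "File:bar.lib", "File:bar.h"), "GENERATES", "File:foo"),
    (("File:bar.c", "File:bar.h"), "GENERATES", "File:bar.lib"),
    (("File:bar.lib", "File:bar.h"), "CONTAINED_IN", "File:bar-0.1.tgz"),
    (("Package:bar-0.1",), "DISTRIBUTION_ARTIFACT", "File:bar-0.1.tgz"),
    (("Package:bar-0.1",), "BUILD_DEPENDENCY_OF", "Package:foo-1.0"),
    (("File:foo",), "CONTAINED_IN", "File:foo-1.0.tgz"),
    (("Package:foo-1.0",), "DISTRIBUTION_ARTIFACT", "File:foo-1.0.tgz"),
]


def inputs_of(target, relationships=RELATIONSHIPS):
    """Return every element that transitively GENERATES the given target."""
    found = set()
    pending = [target]
    while pending:
        current = pending.pop()
        for sources, rel_type, rel_target in relationships:
            if rel_type == "GENERATES" and rel_target == current:
                for source in sources:
                    if source not in found:
                        found.add(source)
                        pending.append(source)
    return found


def relationship_types(relationships=RELATIONSHIPS):
    """List the distinct relationship types present in the graph."""
    return sorted({rel_type for _, rel_type, _ in relationships})
```

The walk recovers every input of File:foo, but none of the relationship types in the graph says which build process, or which run of it, performed either GENERATES step.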
In a SLSA attestation there is a statement (made up of a subject and a predicate)
and an envelope (made up of the statement and a signature). The predicate is a
claim that the signer is making about the subject, such as “this artifact was
built by a specific instance of a build process that has these attributes and
used these inputs”. This model maps quite nicely onto SPDX, where subjects are
references to an SPDX element (typically something derived from Artifact), the
predicate is a subclass of Element that describes the claim being made, and we
have a signature over the document (or, in the future, individual elements). We
also have the ability to track creator independently of signer using the
“createdBy” from Element to Identity.
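The statement/envelope shape could be sketched roughly as below. The class names, the field names, and the use of an HMAC as a stand-in for a signature are all assumptions made for illustration; real in-toto/SLSA attestations define their own schema and signing scheme, and this is not their API.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Statement:
    subject: str      # reference to an SPDX element, e.g. "File:foo-1.0.tgz"
    predicate: dict   # the claim being made about the subject


@dataclass(frozen=True)
class Envelope:
    statement: Statement
    signature: str


def _payload(statement):
    # Sort keys so that signer and verifier serialise to the same bytes.
    return json.dumps(asdict(statement), sort_keys=True).encode("utf-8")


def sign(statement, key):
    """Wrap a statement in an envelope. HMAC is only a stand-in for a real signature."""
    digest = hmac.new(key, _payload(statement), hashlib.sha256).hexdigest()
    return Envelope(statement=statement, signature=digest)


def verify(envelope, key):
    """Check that the signature matches the statement carried in the envelope."""
    expected = hmac.new(key, _payload(envelope.statement), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, envelope.signature)
```

In this sketch the signer is implied by whoever holds the key, while the creator of the underlying element would be tracked separately, which mirrors the creator/signer split described above.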
Some of the content in SPDX is already an attestation (or, more abstractly, all
of it is). For example, an Annotation is a predicate containing a type and a
textual statement, and it is linked to a subject by the “subject” property.
Similarly, license and vulnerability information are attestations about
artifacts. At a more meta level, relationships are also attestations (this is
one of the reasons I wanted them to inherit from Element): they are a predicate
that describes the type of relationship and what the relationship is to (the
From is the subject of the statement in this case).
So going back to the example above what we want is a predicate that describes
an instance of a build, so we can define a new BuildRun or BuildInstance class,
that inherits from Element (or possibly Artifact, I’d have to think about that
some more – somewhere that Sean’s definitions would help 😊). That would then
let us extend the graph above:
(File:foo.c, File:foo.h, File:bar.lib, File:bar.h)--[:GENERATES]-->File:foo
(File:bar.c, File:bar.h)--[:GENERATES]-->File:bar.lib
(File:bar.lib, File:bar.h)--[:CONTAINED_IN]-->File:bar-0.1.tgz
Package:bar-0.1--[:DISTRIBUTION_ARTIFACT]-->File:bar-0.1.tgz
Package:bar-0.1--[:BUILD_DEPENDENCY_OF]-->Package:foo-1.0
(File:foo)--[:CONTAINED_IN]-->File:foo-1.0.tgz
Package:foo-1.0--[:DISTRIBUTION_ARTIFACT]-->File:foo-1.0.tgz
# I chose to include both the Package and the package’s distribution artifact
to establish a stronger link to the physical files consumed and produced, but
there are other ways this could be modeled. For example, if this were consuming
a git repository containing foo.c and foo.h then the commit could be modeled as
a Package which the build DEPENDS_ON.
BuildRun:run_123--[:DEPENDS_ON]-->(Package:bar-0.1, File:bar-0.1.tgz,
File:foo.c, File:foo.h)
BuildRun:run_123--[:GENERATES]-->(Package:foo-1.0, File:foo-1.0.tgz)
Package:gcc-9.4.0--[:BUILD_TOOL_OF]-->BuildRun:run_123
# We could add properties to BuildRun to capture any necessary information
(this needs to be modeled to have the right level of abstraction and
flexibility)
BuildRun:
  environment: Map<String, String>
  command_line: String
  stdout: String
  stderr: String
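As one possible concrete shape for this, the properties could be expressed as a Python dataclass. The four BuildRun fields follow the list above; the minimal `Element` base class and the example values are hypothetical and only there to make the sketch self-contained.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Element:
    # Hypothetical minimal base class; a real SPDX Element carries more.
    id: str
    created_by: Optional[str] = None


@dataclass
class BuildRun(Element):
    environment: Dict[str, str] = field(default_factory=dict)
    command_line: str = ""
    stdout: str = ""
    stderr: str = ""


run = BuildRun(
    id="BuildRun:run_123",
    environment={"CC": "gcc"},   # example value only
    command_line="make all",     # example value only
)
```

Because BuildRun inherits from Element here, it can be the From or To of a relationship like any other element; whether it should instead derive from Artifact is left open, as noted above.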
In this example we can see that BuildRun:run_123 consumed a pre-built bar
package and used gcc 9.4.0, so we have additional context we didn’t have before.
If bar was built from source from the repo in the same build as foo, we’d see a
graph more like this:
BuildRun:run_123--[:DEPENDS_ON]-->(File:bar.c, File:bar.h, File:foo.c,
File:foo.h)
BuildRun:run_123--[:GENERATES]-->(Package:foo-1.0, File:foo-1.0.tgz)
Package:gcc-9.4.0--[:BUILD_TOOL_OF]-->BuildRun:run_123
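With a BuildRun in the graph, questions such as "what was produced with this compiler?" become simple lookups. The sketch below encodes the first BuildRun example (the one consuming the pre-built bar package) as triples; the triple layout and both helper functions are assumptions made for illustration, not an SPDX API.

```python
# Hypothetical (from, relationship, to) triples for the pre-built bar example.
EDGES = [
    ("BuildRun:run_123", "DEPENDS_ON", "Package:bar-0.1"),
    ("BuildRun:run_123", "DEPENDS_ON", "File:bar-0.1.tgz"),
    ("BuildRun:run_123", "DEPENDS_ON", "File:foo.c"),
    ("BuildRun:run_123", "DEPENDS_ON", "File:foo.h"),
    ("BuildRun:run_123", "GENERATES", "Package:foo-1.0"),
    ("BuildRun:run_123", "GENERATES", "File:foo-1.0.tgz"),
    ("Package:gcc-9.4.0", "BUILD_TOOL_OF", "BuildRun:run_123"),
]


def outputs_built_with(tool, edges=EDGES):
    """Return everything generated by build runs that used the given tool."""
    runs = {dst for src, rel, dst in edges if rel == "BUILD_TOOL_OF" and src == tool}
    return sorted(dst for src, rel, dst in edges if rel == "GENERATES" and src in runs)


def built_bar_from_source(run, edges=EDGES):
    """True if the run depended on bar's source rather than a pre-built package."""
    deps = {dst for src, rel, dst in edges if rel == "DEPENDS_ON" and src == run}
    return "File:bar.c" in deps
```

For the graph in which bar is built from source in the same run, the DEPENDS_ON edges would list File:bar.c and File:bar.h instead of the bar package, and `built_bar_from_source` would return True.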
Regards,
William Bartholomew (he/him)
Principal Security Strategist
Global Cybersecurity Policy – Microsoft
From: [email protected] <[email protected]> On Behalf Of Brandon
Lum via lists.spdx.org
Sent: Saturday, April 2, 2022 12:49 PM
To: Nisha Kumar <[email protected]>
Cc: [email protected]
Subject: [EXTERNAL] Re: [spdx-tech] Adding Build SBOM relationships for S3C
resiliency
Hey Nisha,
Yes - exactly!! Curious to hear what some ideas are around a "build profile"!
Would this be along the lines of another element/document that would be
referenced? Or maybe kind of like the defects vulnerability ref documents?
Another aspect that I'm hoping to explore is being able to put together SBOM
documents which are not directly linked to each other. For example, in the
situation where there is a known unknown, such as a build that was using
Package ABC with hash XYZ, would it be possible to fill in the gaps by finding
the SBOM document with the binary hash XYZ and adding references to that
document (or composing the documents)?
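One way this kind of gap-filling might work is an index from hash to the documents that describe an element with that hash. The sketch below is hypothetical throughout: the dictionary layout, the `describes`/`unresolved` keys, the document names, and the placeholder digests (including "sha256:XYZ") are invented for illustration and do not reflect the SPDX document format.

```python
from collections import defaultdict

# Hypothetical, heavily simplified SBOM documents: a name plus element hashes.
DOCUMENTS = [
    {
        "name": "foo-build.spdx",
        "describes": {"Package:foo-1.0": "sha256:aaa"},
        "unresolved": {"Package:ABC": "sha256:XYZ"},  # the known unknown
    },
    {
        "name": "abc.spdx",
        "describes": {"Package:ABC": "sha256:XYZ"},
        "unresolved": {},
    },
]


def index_by_hash(documents):
    """Map each digest to the (document, element) pairs that describe it."""
    index = defaultdict(list)
    for doc in documents:
        for element, digest in doc["describes"].items():
            index[digest].append((doc["name"], element))
    return index


def resolve(document, index):
    """Map each unresolved element to the documents describing the same hash."""
    return {
        element: index.get(digest, [])
        for element, digest in document["unresolved"].items()
    }
```

The result of `resolve` could then be turned into references between the documents, or used to decide which documents to compose.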
Cheers
Brandon
On Tue, Mar 29, 2022 at 11:18 AM Nisha Kumar
<[email protected]> wrote:
Hi Brandon,
Sorry for getting back to you so late. I've been thinking of an SPDX 3.0
profile that would contain software build information like what you have
described in 1., but it seems to me from previous conversations that the
information could be covered using relationships such as BUILD_TOOL_OF and
GENERATED_FROM. However, things like "build environment" (like VMs and
containers) and build flags are not part of relationships. I think it would be
useful to define some new relationships based on these considerations as part
of a "build profile".
Thoughts?
-Nisha
On 3/17/22 07:41, Brandon Lum via lists.spdx.org wrote:
Hi All,
I've been exploring ideas in the build provenance realm, and I think there are
some ideas there that could be useful to incorporate into SPDX. I wanted to get
a sense if folks are interested, and would love to work on something for this!
Here are some of the ideas from build provenance (I'm going to frame them around
the security use case since that's what I'm most familiar with). These are mostly
orthogonal concepts to those of the SLSA framework<https://slsa.dev/>:
1. What is the toolchain used to build this binary/artifact (in the event where
a compromised compiler, build container, etc. is detected)?
2. What/who is the builder that was used to build this binary/artifact (in the
event where a build system gets compromised, e.g. a CI/CD service like GitHub
Actions, Travis CI or CircleCI), so that it is possible to respond to a breach?
3. (Already part of SPDX relationship between
elements<https://spdx.github.io/spdx-spec/relationships-between-SPDX-elements/>)
What are the materials that were used to build this binary/artifact
4. (Already covered by proposed canonicalisation committee) Integrity
validation/provenance of claims of binary/artifact
I think there could potentially be a place to define some of these in SPDX,
maybe through adding more relationships to
https://spdx.github.io/spdx-spec/relationships-between-SPDX-elements/,
or otherwise.
Would like to hear thoughts/interest from folks!
On a side note: I am also interested in getting more into the tooling side of
Build SBOMs (and distribution/resolution of). Would love to chat with anyone
that's working on it - I'm hoping to define some projects around this!
Cheers
Brandon
View/Reply Online (#4437): https://lists.spdx.org/g/Spdx-tech/message/4437