Hey Brandon and Nisha,

There are people from Microsoft who would probably be interested in 
participating in this discussion. Are you considering doing some brainstorming 
sessions?

In my mind I’ve always imagined the pedigree information (which build would be 
a part of) as defining both new element types and potentially new 
relationship types (although I think we might have most of these covered 
already). I’d also want to ensure that anything we do here can integrate with 
in-toto/SLSA attestations.

If I take an example in SPDX today (bar-0.1 is a static library consumed to 
build the foo-1.0 package):
(File:foo.c, File:foo.h, File:bar.lib, File:bar.h)--[:GENERATES]-->File:foo
(File:bar.c, File:bar.h)--[:GENERATES]-->File:bar.lib
(File:bar.lib, File:bar.h)--[:CONTAINED_IN]-->File:bar-0.1.tgz
Package:bar-0.1--[:DISTRIBUTION_ARTIFACT]-->File:bar-0.1.tgz
Package:bar-0.1--[:BUILD_DEPENDENCY_OF]-->Package:foo-1.0
(File:foo)--[:CONTAINED_IN]-->File:foo-1.0.tgz
Package:foo-1.0--[:DISTRIBUTION_ARTIFACT]-->File:foo-1.0.tgz

I can trace the integrity of all of the build artifacts (based on their hashes) 
but I don’t know anything about the build environment or build environments 
that produced them. I don’t know if bar and foo were built in the same build 
process or by different build processes and, if the builds were fully 
reproducible, I may not even know which instance of a build produced them 
(because the input and output hashes would be the same). I may be able to 
assume some of this information based on the pedigree of the SPDX document 
(such as who created it and signed it) but that’s an inference and still lacks 
interesting detail.
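
To make the gap concrete, here is a minimal Python sketch of the graph above as plain (from, type, to) triples. The `EDGES` list and the `inputs_of` helper are illustrative names I made up for this example; they are not part of SPDX or any SPDX tooling.

```python
# Hypothetical in-memory form of the example graph: (from, relationship, to).
EDGES = [
    ("File:foo.c", "GENERATES", "File:foo"),
    ("File:foo.h", "GENERATES", "File:foo"),
    ("File:bar.lib", "GENERATES", "File:foo"),
    ("File:bar.h", "GENERATES", "File:foo"),
    ("File:bar.c", "GENERATES", "File:bar.lib"),
    ("File:bar.h", "GENERATES", "File:bar.lib"),
    ("File:foo", "CONTAINED_IN", "File:foo-1.0.tgz"),
]


def inputs_of(target, edges=EDGES):
    """Return every element that transitively GENERATES `target`."""
    found = set()
    stack = [target]
    while stack:
        node = stack.pop()
        for src, rel, dst in edges:
            if rel == "GENERATES" and dst == node and src not in found:
                found.add(src)
                stack.append(src)
    return found


foo_inputs = inputs_of("File:foo")
relationship_types = {rel for _, rel, _ in EDGES}
```

`foo_inputs` contains the source and library files behind foo, but nothing in `relationship_types` names a build, a tool, or an environment, which is the missing piece described above.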

In SLSA attestation there is a statement (made up of a subject and a predicate) 
and an envelope (made up of the statement and a signature). The predicate is a 
claim that the signer is making about the subject, such as “this artifact was 
built by a specific instance of a build process that has these attributes and 
used these inputs”. This model maps quite nicely onto SPDX where subjects are 
references to an SPDX element (typically something derived from Artifact), the 
predicate is a subclass of Element that describes the claim being made, and we 
have a signature over the document (or in the future individual elements). We 
also have the ability to track creator independently of signer using the 
“createdBy” from Element to Identity.
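
A rough Python sketch of the statement/envelope shape described above may help. This is a simplified model, not the actual in-toto or SLSA schema: the `Statement` and `Envelope` classes, the field names inside `subject` and `predicate`, and the demo key are all assumptions, and HMAC-SHA256 stands in for a real signature scheme.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass


@dataclass
class Statement:
    subject: dict    # what the claim is about (here: a name and a digest)
    predicate: dict  # the claim the signer makes about the subject


@dataclass
class Envelope:
    statement: Statement
    signature: str


def sign(statement: Statement, key: bytes) -> Envelope:
    # HMAC-SHA256 is only a stand-in for a real digital signature.
    payload = json.dumps(asdict(statement), sort_keys=True).encode()
    return Envelope(statement, hmac.new(key, payload, hashlib.sha256).hexdigest())


def verify(envelope: Envelope, key: bytes) -> bool:
    expected = sign(envelope.statement, key).signature
    return hmac.compare_digest(expected, envelope.signature)


statement = Statement(
    subject={
        "name": "foo-1.0.tgz",
        "digest": {"sha256": hashlib.sha256(b"example bytes").hexdigest()},
    },
    predicate={"builder": "run_123", "inputs": ["foo.c", "foo.h", "bar-0.1.tgz"]},
)
envelope = sign(statement, key=b"demo-key")
```

Changing anything in the subject or predicate after signing makes `verify` return `False`, which is the property the envelope is there to provide.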

Some of the content in SPDX is already an attestation (or, more meta, it all 
is). For example, an Annotation is a predicate containing a type and a textual 
statement, and it is linked to a subject by the “subject” property. Similarly, 
license and vulnerability information are attestations about artifacts. More 
meta, relationships are also attestations (this is one of the reasons I wanted 
them to inherit from Element): each is a predicate that describes the type of 
relationship and what the relationship is to (the From is the subject of the 
statement in this case).
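
As a small illustration of reading a relationship as a statement, here is a Python sketch. The `Relationship` class and `as_statement` function are hypothetical names for this example only, not SPDX model classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    from_element: str
    relationship_type: str
    to_element: str


def as_statement(rel: Relationship) -> dict:
    """View a relationship as subject + predicate: From is the subject."""
    return {
        "subject": rel.from_element,
        "predicate": {"type": rel.relationship_type, "to": rel.to_element},
    }


rel = Relationship("Package:bar-0.1", "BUILD_DEPENDENCY_OF", "Package:foo-1.0")
stmt = as_statement(rel)
```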

So going back to the example above what we want is a predicate that describes 
an instance of a build, so we can define a new BuildRun or BuildInstance class, 
that inherits from Element (or possibly Artifact, I’d have to think about that 
some more – somewhere that Sean’s definitions would help 😊). That would then 
let us extend the graph above:
(File:foo.c, File:foo.h, File:bar.lib, File:bar.h)--[:GENERATES]-->File:foo
(File:bar.c, File:bar.h)--[:GENERATES]-->File:bar.lib
(File:bar.lib, File:bar.h)--[:CONTAINED_IN]-->File:bar-0.1.tgz
Package:bar-0.1--[:DISTRIBUTION_ARTIFACT]-->File:bar-0.1.tgz
Package:bar-0.1--[:BUILD_DEPENDENCY_OF]-->Package:foo-1.0
(File:foo)--[:CONTAINED_IN]-->File:foo-1.0.tgz
Package:foo-1.0--[:DISTRIBUTION_ARTIFACT]-->File:foo-1.0.tgz

# I chose to include both the Package and the package’s distribution artifact 
# to establish a stronger link to the physical files consumed and produced, but 
# there are other ways this could be modeled. For example, if this were 
# consuming a git repository containing foo.c and foo.h then the commit could 
# be modeled as a Package which the build DEPENDS_ON.
BuildRun:run_123--[:DEPENDS_ON]-->(Package:bar-0.1, File:bar-0.1.tgz, 
File:foo.c, File:foo.h)
BuildRun:run_123--[:GENERATES]-->(Package:foo-1.0, File:foo-1.0.tgz)
Package:gcc-9.4.0--[:BUILD_TOOL_OF]-->BuildRun:run_123

# We could add properties to BuildRun to capture any necessary information 
# (this needs to be modeled to have the right level of abstraction and 
# flexibility)
BuildRun:
      environment: Map<String, String>
      command_line: String
      stdout: String
      stderr: String
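
The property list above could be written as a Python dataclass along these lines. This is only a sketch: the `id` field and the example values (`CC`, `make all`) are assumptions added for illustration, and the real class would need the modeling work mentioned above.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class BuildRun:
    id: str  # assumed identifier field, e.g. "run_123"
    environment: Dict[str, str] = field(default_factory=dict)
    command_line: str = ""
    stdout: str = ""
    stderr: str = ""


# Illustrative values only; nothing here comes from a real build.
run = BuildRun(
    id="run_123",
    environment={"CC": "gcc"},
    command_line="make all",
)
```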

In this example we can see that BuildRun:run_123 consumed a pre-built bar 
package and used gcc 9.4.0, so we have additional context we didn’t have 
before. If bar had been built from source from the repo in the same build as 
foo, we’d see a graph more like this:
BuildRun:run_123--[:DEPENDS_ON]-->(File:bar.c, File:bar.h, File:foo.c, 
File:foo.h)
BuildRun:run_123--[:GENERATES]-->(Package:foo-1.0, File:foo-1.0.tgz)
Package:gcc-9.4.0--[:BUILD_TOOL_OF]-->BuildRun:run_123
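
To show how a consumer might tell the two cases apart, here is a minimal Python sketch over both graphs as sets of triples. The set names and helper functions are hypothetical and exist only for this example.

```python
# Hypothetical triple sets for the two scenarios: (from, relationship, to).
COMMON = {
    ("BuildRun:run_123", "DEPENDS_ON", "File:foo.c"),
    ("BuildRun:run_123", "DEPENDS_ON", "File:foo.h"),
    ("BuildRun:run_123", "GENERATES", "Package:foo-1.0"),
    ("BuildRun:run_123", "GENERATES", "File:foo-1.0.tgz"),
    ("Package:gcc-9.4.0", "BUILD_TOOL_OF", "BuildRun:run_123"),
}
PREBUILT = COMMON | {
    ("BuildRun:run_123", "DEPENDS_ON", "Package:bar-0.1"),
    ("BuildRun:run_123", "DEPENDS_ON", "File:bar-0.1.tgz"),
}
FROM_SOURCE = COMMON | {
    ("BuildRun:run_123", "DEPENDS_ON", "File:bar.c"),
    ("BuildRun:run_123", "DEPENDS_ON", "File:bar.h"),
}


def dependencies(run, edges):
    """Everything the build run DEPENDS_ON."""
    return {dst for src, rel, dst in edges if src == run and rel == "DEPENDS_ON"}


def tools(run, edges):
    """Every element that is a BUILD_TOOL_OF the build run."""
    return {src for src, rel, dst in edges if dst == run and rel == "BUILD_TOOL_OF"}


def consumed_prebuilt(run, package, edges):
    """True if the run depended on the packaged artifact, not its sources."""
    return package in dependencies(run, edges)
```

With these, `consumed_prebuilt("BuildRun:run_123", "Package:bar-0.1", PREBUILT)` is true and the same call against `FROM_SOURCE` is false, while `tools` reports gcc 9.4.0 in both cases.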


Regards,

William Bartholomew (he/him)
Principal Security Strategist
Global Cybersecurity Policy – Microsoft


From: [email protected] <[email protected]> On Behalf Of Brandon 
Lum via lists.spdx.org
Sent: Saturday, April 2, 2022 12:49 PM
To: Nisha Kumar <[email protected]>
Cc: [email protected]
Subject: [EXTERNAL] Re: [spdx-tech] Adding Build SBOM relationships for S3C 
resiliency

Hey Nisha,

Yes - exactly!! Curious to hear what some ideas are around a "build profile"! 
Would this be along the lines of another element/document that would be 
referenced? Or maybe kind of like the defects vulnerability ref documents?

Another aspect that I'm hoping to explore is being able to put together SBOM 
documents which are not directly linked to each other. For example, in the 
situation where there is a known unknown (a build was using Package ABC with 
hash XYZ), would it be possible to fill in the gaps by finding the SBOM 
document with the binary hash XYZ, and adding references to the document (or 
composing the documents)?
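
One way to picture the hash-matching idea is the Python sketch below. The document layout (a `name` plus a list of `artifacts` with a `sha256` field) is a made-up stand-in, not the SPDX format, and "XYZ" is the placeholder hash from the paragraph above.

```python
def index_by_hash(documents):
    """Map each artifact hash to the names of the documents describing it."""
    index = {}
    for doc in documents:
        for artifact in doc["artifacts"]:
            index.setdefault(artifact["sha256"], []).append(doc["name"])
    return index


def resolve(unresolved_hashes, documents):
    """For each known-unknown hash, list candidate documents (maybe none)."""
    index = index_by_hash(documents)
    return {h: index.get(h, []) for h in unresolved_hashes}


# Hypothetical documents; "XYZ" and "QRS" are placeholder hashes.
documents = [
    {"name": "sbom-abc", "artifacts": [{"name": "Package ABC", "sha256": "XYZ"}]},
    {"name": "sbom-other", "artifacts": [{"name": "Package DEF", "sha256": "QRS"}]},
]
matches = resolve(["XYZ", "MISSING"], documents)
```

A hash with no match stays a known unknown (an empty list), and more than one match would need a policy for choosing which document to trust.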

Cheers
Brandon

On Tue, Mar 29, 2022 at 11:18 AM Nisha Kumar <[email protected]> wrote:

Hi Brandon,

Sorry for getting back to you so late. I've been thinking of an SPDX 3.0 
profile that would contain software build information like what you have 
described in 1., but it seems to me from previous conversations that the 
information could be covered using relationships such as BUILD_TOOL_OF and 
GENERATED_FROM. However, things like "build environment" (like VMs and 
containers) and build flags are not part of relationships. I think it would be 
useful to define some new relationships based on these considerations as part 
of a "build profile".

Thoughts?

-Nisha
On 3/17/22 07:41, Brandon Lum via lists.spdx.org wrote:
Hi All,

I've been exploring ideas in the build provenance realm, and I think there are 
some ideas there that could be useful to incorporate into SPDX. I wanted to get 
a sense if folks are interested, and would love to work on something for this!

Some of the ideas from build provenance (I'm going to frame it around the 
security use case since that's what I'm most familiar with). These are mostly 
orthogonal concepts to those of the SLSA framework <https://slsa.dev/>:
1. What is the toolchain used to build this binary/artifact (in the event where 
a compromised compiler, build container, etc. is detected)
2. What/who is the builder that was used to build this binary/artifact (in the 
event where a build system gets compromised - e.g. CI/CD like github actions, 
travis, circle CI is compromised), with the ability to respond to breach.
3. (Already part of SPDX relationship between elements 
<https://spdx.github.io/spdx-spec/relationships-between-SPDX-elements/>) 
What are the materials that were used to build this binary/artifact
4. (Already covered by proposed canonicalisation committee) Integrity 
validation/provenance of claims of binary/artifact
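
The toolchain question in item 1 can be sketched as a graph query in Python. This assumes a build-run element linked by BUILD_TOOL_OF and GENERATES, as discussed elsewhere in this thread; the `EDGES` data (including the clang package, run_456 and baz) and the `affected_by_tool` helper are invented purely for illustration.

```python
# Hypothetical edges: (from, relationship, to).
EDGES = [
    ("Package:gcc-9.4.0", "BUILD_TOOL_OF", "BuildRun:run_123"),
    ("BuildRun:run_123", "GENERATES", "Package:foo-1.0"),
    ("BuildRun:run_123", "GENERATES", "File:foo-1.0.tgz"),
    ("Package:clang-13.0.0", "BUILD_TOOL_OF", "BuildRun:run_456"),
    ("BuildRun:run_456", "GENERATES", "File:baz-2.0.tgz"),
]


def affected_by_tool(tool, edges):
    """Artifacts generated by any build run that used `tool`."""
    runs = {dst for src, rel, dst in edges if rel == "BUILD_TOOL_OF" and src == tool}
    return {dst for src, rel, dst in edges if rel == "GENERATES" and src in runs}


affected = affected_by_tool("Package:gcc-9.4.0", EDGES)
```

If gcc 9.4.0 were found to be compromised, `affected` would list the foo outputs and leave the artifact built with the other compiler alone.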

I think there could potentially be a place to define some of these in SPDX, 
maybe through adding more relationships to 
https://spdx.github.io/spdx-spec/relationships-between-SPDX-elements/, 
or otherwise.

Would like to hear thoughts/interest from folks!

On a side note: I am also interested in getting more into the tooling side of 
Build SBOMs (and distribution/resolution of). Would love to chat with anyone 
that's working on it - I'm hoping to define some projects around this!

Cheers
Brandon



View/Reply Online (#4437): https://lists.spdx.org/g/Spdx-tech/message/4437

