Brandon,
There is already an SPDX 3.0 working session scheduled for the 15th of April (11am-2pm eastern).
Bob
Robert (Bob) Martin Sr. Software and Supply Chain Assurance Principal Eng. Cross Cutting Solutions and Innovation Dept Cyber Solutions Innovation Center MITRE Labs MITRE Corporation 781-271-3001o 781-424-4095c
Hi William,
We don't have a brainstorming session planned yet - but we should!! I believe Steve and Kate have been working through some of this as well. Let me find some time for us to get together and put something together! How are folks for 15 April (Friday) for a discussion?
I am in agreement with you 100% on making sure this is aligned with SLSA and in-toto! (I am actually writing a blog post on some synergies as we speak!)
I like the BuildRun idea (and maybe that's also orthogonal to the build profiles that Nisha mentioned). Also, I do appreciate your specificity in laying out the idea concretely :).
One thing that I was thinking to add is the Builder information/URI (or maybe that's already encodable in the env/metadata). I think it's worth calling out since that's the identifier for attestations. Since the BuildRun document may not necessarily be generated by the organization that asked for the build (e.g. when calling out to GitHub Actions or Travis CI), it would also allow us to reason about compromises.
Another thought that was brought up in some conversations is that we could also point to/reference certain documents where appropriate. This is still very half-baked, but the thinking is to leverage existing SLSA/in-toto efforts rather than duplicate them.
On Mon, Apr 4, 2022 at 1:53 PM William Bartholomew (CELA) <[email protected]> wrote:
Hey Brandon and Nisha,
There are people from Microsoft that would probably be interested in participating in this discussion, are you considering doing some brainstorming sessions?
In my mind I’ve always imagined the pedigree information (which build would be a part of) as defining both new element types and potentially new relationship types (although I think we might have most of these covered already). I’d also want to ensure that anything we do here can integrate with in-toto/SLSA attestations.
If I take an example in SPDX today (bar-0.1 is a static library consumed to build the foo-1.0 package):
(File:foo.c, File:foo.h, File:bar.lib, File:bar.h)--[:GENERATES]-->File:foo
(File:bar.c, File:bar.h)--[:GENERATES]-->File:bar.lib
(File:bar.lib, File:bar.h)--[:CONTAINED_IN]-->File:bar-0.1.tgz
Package:bar-0.1--[:DISTRIBUTION_ARTIFACT]-->File:bar-0.1.tgz
Package:bar-0.1--[:BUILD_DEPENDENCY_OF]-->Package:foo-1.0
(File:foo)--[:CONTAINED_IN]-->File:foo-1.0.tgz
Package:foo-1.0--[:DISTRIBUTION_ARTIFACT]-->File:foo-1.0.tgz
I can trace the integrity of all of the build artifacts (based on their hashes) but I don’t know anything about the build environment or build environments that produced them. I don’t know if bar and foo were built in the same build process or by different build processes and, if the builds were fully reproducible, I may not even know which instance of a build produced them (because the input and output hashes would be the same). I may be able to assume some of this information based on the pedigree of the SPDX document (such as who created it and signed it) but that’s an inference and still lacks interesting detail.
In a SLSA attestation there is a statement (made up of a subject and a predicate) and an envelope (made up of the statement and a signature). The predicate is a claim that the signer is making about the subject, such as: this artifact was built by a specific instance of a build process that has these attributes and used these inputs. This model maps quite nicely onto SPDX, where subjects are references to an SPDX element (typically something derived from Artifact), the predicate is a subclass of Element that describes the claim being made, and we have a signature over the document (or, in the future, individual elements). We also have the ability to track the creator independently of the signer using the “createdBy” from Element to Identity.
Some of the content in SPDX is already an attestation (or, more meta, all of it is). For example, an Annotation is a predicate containing a type and a textual statement, and it is linked to a subject by the “subject” property. Similarly, license and vulnerability information are attestations about artifacts. More meta still, relationships are also attestations (this is one of the reasons I wanted them to inherit from Element): each is a predicate that describes the type of relationship and what the relationship is to (the From is the subject of the statement in this case).
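A minimal Python sketch of the statement/envelope shape described here may help. It is an illustration only: the class and field names are hypothetical (they are neither the in-toto nor the SPDX schema), and an HMAC stands in for a real signature scheme.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass


@dataclass
class Subject:
    # Reference to the thing the claim is about (hypothetical fields).
    name: str
    sha256: str


@dataclass
class Statement:
    # A subject plus the claim (predicate) being made about it.
    subject: Subject
    predicate_type: str  # hypothetical label, e.g. "BuildRun"
    predicate: dict


@dataclass
class Envelope:
    # The statement together with a signature over it.
    statement: Statement
    signature: str


def _payload(statement: Statement) -> bytes:
    # Serialize deterministically so signer and verifier hash the same bytes.
    return json.dumps(asdict(statement), sort_keys=True).encode("utf-8")


def sign(statement: Statement, key: bytes) -> Envelope:
    # ASSUMPTION: HMAC-SHA256 is a placeholder for a real signing scheme.
    digest = hmac.new(key, _payload(statement), hashlib.sha256).hexdigest()
    return Envelope(statement, digest)


def verify(envelope: Envelope, key: bytes) -> bool:
    expected = hmac.new(key, _payload(envelope.statement), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, envelope.signature)
```

In this sketch the signer is whoever holds the key, while the creator of the claim could be recorded separately inside the predicate, mirroring the creator/signer split mentioned above.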
So going back to the example above what we want is a predicate that describes an instance of a build, so we can define a new BuildRun or BuildInstance class, that inherits from Element (or possibly Artifact, I’d have to think about that some more – somewhere that Sean’s definitions would help 😊). That would then let us extend the graph above:
(File:foo.c, File:foo.h, File:bar.lib, File:bar.h)--[:GENERATES]-->File:foo
(File:bar.c, File:bar.h)--[:GENERATES]-->File:bar.lib
(File:bar.lib, File:bar.h)--[:CONTAINED_IN]-->File:bar-0.1.tgz
Package:bar-0.1--[:DISTRIBUTION_ARTIFACT]-->File:bar-0.1.tgz
Package:bar-0.1--[:BUILD_DEPENDENCY_OF]-->Package:foo-1.0
(File:foo)--[:CONTAINED_IN]-->File:foo-1.0.tgz
Package:foo-1.0--[:DISTRIBUTION_ARTIFACT]-->File:foo-1.0.tgz
# I chose to include both the Package and the package’s distribution artifact to establish a stronger link to the physical files consumed and produced, but there are other ways this could be modeled. For example, if this was consuming a git repository containing foo.c and foo.h then the commit can be modeled as a Package which the build DEPENDS_ON.
BuildRun:run_123--[:DEPENDS_ON]-->(Package:bar-0.1, File:bar-0.1.tgz, File:foo.c, File:foo.h)
BuildRun:run_123--[:GENERATES]-->(Package:foo-1.0, File:foo-1.0.tgz)
Package:gcc-9.4.0--[:BUILD_TOOL_OF]-->BuildRun:run_123
# We could add properties to BuildRun to capture any necessary information (this needs to be modeled to have the right level of abstraction and flexibility)
BuildRun:
    environment: Map<String, String>
    command_line: String
    stdout: String
    stderr: String
In this example we can see that BuildRun:run_123 consumed a pre-built bar package and used gcc 9.4.0, so we have additional context we didn’t have before. If bar was built from source from the repo in the same build as foo, we’d see a graph more like this:
BuildRun:run_123--[:DEPENDS_ON]-->(File:bar.c, File:bar.h, File:foo.c, File:foo.h)
BuildRun:run_123--[:GENERATES]-->(Package:foo-1.0, File:foo-1.0.tgz)
Package:gcc-9.4.0--[:BUILD_TOOL_OF]-->BuildRun:run_123
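To show what the extra context buys us, here is a minimal Python sketch, assuming the edges of the first BuildRun graph (the one consuming the pre-built bar package) are held as plain (from, relationship, to) tuples. The function names are hypothetical and this is not an SPDX tooling API; it only illustrates answering "which outputs came from builds that used this tool?".

```python
# Edges transcribed from the first BuildRun graph as (from, relationship, to).
EDGES = [
    ("BuildRun:run_123", "DEPENDS_ON", "Package:bar-0.1"),
    ("BuildRun:run_123", "DEPENDS_ON", "File:bar-0.1.tgz"),
    ("BuildRun:run_123", "DEPENDS_ON", "File:foo.c"),
    ("BuildRun:run_123", "DEPENDS_ON", "File:foo.h"),
    ("BuildRun:run_123", "GENERATES", "Package:foo-1.0"),
    ("BuildRun:run_123", "GENERATES", "File:foo-1.0.tgz"),
    ("Package:gcc-9.4.0", "BUILD_TOOL_OF", "BuildRun:run_123"),
]


def runs_using_tool(tool, edges):
    """Return the set of build runs that the given tool was a BUILD_TOOL_OF."""
    return {dst for src, rel, dst in edges if rel == "BUILD_TOOL_OF" and src == tool}


def outputs_of_runs(runs, edges):
    """Return, sorted, everything the given build runs GENERATES."""
    return sorted(dst for src, rel, dst in edges if rel == "GENERATES" and src in runs)


def affected_by_tool(tool, edges):
    """Artifacts produced by any build run that used the given tool."""
    return outputs_of_runs(runs_using_tool(tool, edges), edges)


print(affected_by_tool("Package:gcc-9.4.0", EDGES))
```

Without the BuildRun element the same question could not be answered from the graph, because the tool was never connected to the artifacts it helped produce.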
Regards,
William Bartholomew (he/him) – Let’s chat
Principal Security Strategist
Global Cybersecurity Policy – Microsoft
My working day may not be your working day. Please don’t feel obliged to reply to this e-mail outside of your normal working hours.
From: [email protected] <[email protected]> On Behalf Of Brandon Lum via lists.spdx.org
Sent: Saturday, April 2, 2022 12:49 PM
To: Nisha Kumar <[email protected]>
Cc: [email protected]
Subject: [EXTERNAL] Re: [spdx-tech] Adding Build SBOM relationships for S3C resiliency
Hey Nisha,
Yes - exactly!! Curious to hear what some ideas are around a "build profile"! Would this be along the lines of another element/document that would be referenced? Or maybe kind of like the defects/vulnerability ref documents?
Another aspect that I'm hoping to explore is being able to put together SBOM documents which are not directly linked to each other. For example, in the situation where there is a known unknown (a build was using Package ABC with hash XYZ), would it be possible to fill in the gaps by finding the SBOM document with the binary hash XYZ, and adding references to that document (or composing the documents)?
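A rough Python sketch of that gap-filling idea, under heavy assumptions: each SBOM document is a plain dict with a hypothetical "id", a list of "artifacts" carrying a "sha256", and (for the build document) a list of "unresolved" dependencies. None of these keys come from the SPDX schema.

```python
def index_by_hash(documents):
    """Map each artifact hash to the ids of the documents that describe it."""
    index = {}
    for doc in documents:
        for artifact in doc.get("artifacts", []):
            index.setdefault(artifact["sha256"], []).append(doc["id"])
    return index


def resolve_unknowns(build_doc, index):
    """For each unresolved dependency, list candidate documents with a matching hash."""
    return {
        dep["name"]: index.get(dep["sha256"], [])
        for dep in build_doc.get("unresolved", [])
    }


# Hypothetical example data: "XYZ" stands in for a real digest.
known_docs = [
    {"id": "doc-abc", "artifacts": [{"name": "ABC", "sha256": "XYZ"}]},
    {"id": "doc-other", "artifacts": [{"name": "other", "sha256": "QRS"}]},
]
build_doc = {"id": "doc-build", "unresolved": [{"name": "ABC", "sha256": "XYZ"}]}

print(resolve_unknowns(build_doc, index_by_hash(known_docs)))
```

A hash match only suggests a candidate document; whether to trust it and reference or compose it is a separate decision.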
Cheers
Brandon
On Tue, Mar 29, 2022 at 11:18 AM Nisha Kumar <[email protected]> wrote:
Hi Brandon,
Sorry for getting back to you so late. I've been thinking of an SPDX 3.0 profile that would contain software build information like what you have described in 1., but it seems to me from previous conversations that the information could be covered using relationships such as BUILD_TOOL_OF and GENERATED_FROM. However, things like "build environment" (like VMs and containers) and build flags are not part of relationships. I think it would be useful to define some new relationships based on these considerations as part of a "build profile".
Thoughts?
-Nisha
On 3/17/22 07:41, Brandon Lum via lists.spdx.org wrote:
Hi All,
I've been exploring ideas in the build provenance realm, and I think there are some ideas there that could be useful to incorporate into SPDX. I wanted to get a sense if folks are interested, and would love to work on something for this!
Here are some of the ideas from build provenance (I'm going to frame them around the security use case since that's what I'm most familiar with). These are mostly orthogonal concepts to those of the SLSA framework:
1. What is the toolchain used to build this binary/artifact? (Useful in the event that a compromised compiler, build container, etc. is detected.)
2. What/who is the builder that was used to build this binary/artifact? (Useful in the event that a build system such as GitHub Actions, Travis CI, or CircleCI is compromised, so that we can respond to the breach.)
3. (Already part of SPDX relationships between elements) What are the materials that were used to build this binary/artifact?
4. (Already covered by the proposed canonicalisation committee) Integrity validation/provenance of claims about the binary/artifact.
I think there could potentially be a place to define some of these in SPDX, maybe through adding more relationships to https://spdx.github.io/spdx-spec/relationships-between-SPDX-elements/, or otherwise.
Would like to hear thoughts/interest from folks!
On a side note: I am also interested in getting more into the tooling side of Build SBOMs (and distribution/resolution of). Would love to chat with anyone that's working on it - I'm hoping to define some projects around this!
Cheers
Brandon
