Hi David,

If I've interpreted this correctly (and I admit I may not have), this sounds
like the approach SPDX 2.x uses, where each file (document) has a namespace
and all the identifiers within that file are relative to that namespace.
Those identifiers can be meaningful or meaningless because they're opaque to
SPDX (although SPDX 2.x does impose some structural requirements on the
identifier).

In the SPDX 3.x model you can still produce files with this same structure if
you want. You have complete control over the namespaces and identifiers you
use, so a namespace per file with unique identifiers within that namespace is
perfectly fine.
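
As a rough illustration of the namespace-per-file structure, here is a minimal
sketch. The namespace URI and the `element_iri` helper are made up for the
example, and the `SPDXRef-` pattern reflects my reading of the SPDX 2.x
identifier rules rather than a normative check.

```python
import re

# SPDX 2.x-style local identifier: "SPDXRef-" followed by letters, digits, "." or "-".
LOCAL_ID = re.compile(r"^SPDXRef-[A-Za-z0-9.\-]+$")


def element_iri(namespace: str, local_id: str) -> str:
    """Combine a per-file namespace with a local identifier (hypothetical helper)."""
    if "#" in namespace:
        raise ValueError("namespace must not already contain a fragment")
    if not LOCAL_ID.match(local_id):
        raise ValueError(f"not a valid local identifier: {local_id!r}")
    return f"{namespace}#{local_id}"


# Assumed namespace, one per serialized file.
ns = "https://example.com/spdxdocs/demo-1234"
print(element_iri(ns, "SPDXRef-Package1"))
```

The local part stays opaque: the helper only checks its shape, not its meaning.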

However, over the last 18 months participants have expressed a few
requirements that led us to make the model more flexible while still
supporting that use case. Specifically:

  *   Elements not being tied to documents. This was to support use cases where 
you may not be transferring the elements and just want to load up a graph of 
elements and traverse them.
  *   Being able to include an element that was originally created in one 
document into another document (for example, to create an aggregate document to 
transfer into an air-gapped environment).
  *   Encouraging re-use of elements when feasible. This becomes interesting
when we have more and more "canonical" SBOMs being produced (i.e. the SBOM
being produced by the original software creator rather than a third party).
Re-using the canonical elements is preferable to creating new elements.

These approaches aren't mutually exclusive, and I believe both are supported
today.
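
To make the three points above a little more concrete, here is a minimal
sketch of elements that live independently of documents. The `Element` and
`Document` classes, the `aggregate` helper, and all IRIs are hypothetical
names for illustration, not the SPDX 3.x model itself.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Element:
    iri: str   # globally unique, not tied to any document
    kind: str  # e.g. "Package" or "File"
    name: str


@dataclass
class Document:
    iri: str
    members: set = field(default_factory=set)  # IRIs of the included Elements


def aggregate(iri, *docs):
    """Build a new Document that re-uses the Elements of the given Documents."""
    combined = Document(iri)
    for doc in docs:
        combined.members |= doc.members  # copy references, not Elements
    return combined


pkg = Element("https://vendor.example/elements#pkg", "Package", "libfoo")
src = Element("https://vendor.example/elements#src", "File", "foo.c")
cfg = Element("https://me.example/elements#cfg", "File", "foo.conf")
store = {e.iri: e for e in (pkg, src, cfg)}  # graph of Elements, no document needed

vendor_doc = Document("https://vendor.example/doc", {pkg.iri, src.iri})
local_doc = Document("https://me.example/doc", {src.iri, cfg.iri})
bundle = aggregate("https://me.example/bundle", vendor_doc, local_doc)
print(sorted(store[i].name for i in bundle.members))
```

The aggregate document only references the vendor's elements by IRI, so the
canonical elements are re-used rather than re-created.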

Regards,

William Bartholomew (he/him)
Principal Security Strategist
Cybersecurity Policy - Digital Diplomacy

From: [email protected] <[email protected]> On Behalf Of David 
Kemp via lists.spdx.org
Sent: Friday, November 12, 2021 1:04 PM
To: SPDX-list <[email protected]>
Subject: [EXTERNAL] [spdx-tech] Abstraction, Collections and Graphs

All,

We had a good discussion of Element IDs and Collections at the last tech 
meeting.  Here are a couple of slides illustrating an abstract view of 
collections.

1) We have used small hand-tailored examples, and full SPDX files in various 
formats for discussion.  Here's a third approach: I generated a random small 
graph using a page Google found at 
http://bl.ocks.org/erkal/9746513,
 then used that as a basis for an SPDX Element graph.  The reason to start with 
random input is to avoid cherry-picking simple examples. The intent is for the 
same process to work correctly for any random input.  The undirected input 
graph had 15 anonymous nodes, and the process is to create SPDX collection and 
leaf Elements to represent each of them.

2) Sean correctly points out that a Collection is simply a set of element 
references, and everyone supported the KISS principle at the meeting.  There is 
nothing simpler than an unlabeled graph, and the slides illustrate the process 
of assigning SPDX Element types to an unlabeled graph.  There are many 
different ways to do that assignment, but all of them must result in a directed 
graph with edges from Collection Elements to members of each Collection.  Leaf 
Elements (Annotation, Relationship, Identity and Artifact) are not collections 
and thus have no outgoing edges to member Elements.
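
The constraint in 2) can be sketched as a small check. This is only an
illustration under assumptions: the element ids, the "Collection" type label
and the `leaf_violations` helper are invented for the example, and edges here
mean collection membership only.

```python
# Leaf Element types named above; they must not have member edges.
LEAF_TYPES = {"Annotation", "Relationship", "Identity", "Artifact"}


def leaf_violations(types, members):
    """Return the ids of leaf Elements that wrongly list member Elements."""
    return sorted(
        elem for elem, kids in members.items()
        if kids and types[elem] in LEAF_TYPES
    )


# Hypothetical four-node graph: two collections and two leaf artifacts.
types = {
    "sbom": "Collection",
    "pkg1": "Collection",
    "file1": "Artifact",
    "file2": "Artifact",
}
members = {
    "sbom": ["pkg1", "file1"],
    "pkg1": ["file2"],
    "file1": [],
    "file2": [],
}
print(leaf_violations(types, members))

# Giving a leaf a member breaks the rule and is reported.
broken = dict(members, file1=["file2"])
print(leaf_violations(types, broken))
```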

3) In the logical model each Element exists on its own. But in the information 
model Elements are serialized into data files.  The creator/serializer can 
decide how many Elements to include in a single serialized file.  The slides 
show three examples: 15 serialized files (1 per Element), 1 serialized file 
with 15 Elements, and 2 serialized files with the 15 Elements divided among 
them and references from one to the other.  The only difference between those 
files is how Element ids are assigned; the information model allows any mixture 
of large and small files.
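
The three layouts in 3) can be sketched as follows. This is a rough
illustration only: the base IRIs, the placeholder element names and the
`assign_ids` helper are assumptions, and the `#n` local-id scheme is just one
of many an author could choose.

```python
def assign_ids(partition):
    """Map each Element name to an IRI derived from its file's base IRI.

    `partition` maps a file's base IRI to the Element names serialized in it.
    """
    ids = {}
    for base, names in partition.items():
        for n, name in enumerate(names, start=1):
            ids[name] = f"{base}#{n}"  # short local id within the file
    return ids


elements = [f"e{i}" for i in range(1, 16)]  # 15 placeholder Elements

one_per_file = {f"https://example.com/{e}": [e] for e in elements}
single_file = {"https://example.com/all": elements}
two_files = {
    "https://example.com/a": elements[:8],
    "https://example.com/b": elements[8:],
}

for layout in (one_per_file, single_file, two_files):
    ids = assign_ids(layout)
    print(len(layout), "base IRIs chosen,", len(set(ids.values())), "unique Element ids")
```

In every layout all 15 Elements end up with distinct ids; what changes is how
many base IRIs the author has to pick by hand.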

4) The bottom line is that the more Elements an author chooses to include in a 
file:

  1.  the less work the author needs to do in choosing unique IRIs - there is 
only one per file, and all of the others are derived from it (in whatever 
manner the author chooses), and
  2.  the more efficient the serialization can be, assuming the author chooses 
to use small local ids.

I hope these are self-explanatory.  The data for the Element graph is below; it
can be pasted into http://sketchviz.com to view.

Regards,
Dave

```
digraph G {
  node [fontname=arial; style=filled; fillcolor=lightskyblue1];

n1 [label="SBOM"];
  n1 -> n2;
  n1 -> n4;
  n1 -> n7;
  n1 -> n8;
  n1 -> n9;
  n1 -> n10;
n2 [label="Package1"];
  n2 -> n3;
  n2 -> n4;
  n2 -> n6;
  n2 -> n10;
n3 [label="Package2"];
  n3 -> n10;
  n3 -> n5;
n4 [label="Package3"];
  n4 -> n8;
  n4 -> n12;
n5 [label="Package4"];
  n5 -> n6;
n6 [label="Package5"];
  n6 -> n7;
n7 [label="file1"; style=filled; fillcolor=palegreen];
n8 [label="file2"; style=filled; fillcolor=palegreen];
n9 [label="Package6"];
  n9 -> n13;
  n9 -> n14;
n10 [label="file3"; style=filled; fillcolor=palegreen];
n11 [label="Anno1"; style=filled; fillcolor=palegreen];
  n11 -> n3;
n12 [label="file5"; style=filled; fillcolor=palegreen];
n13 [label="Package7"];
  n13 -> n15;
n14 [label="file6"; style=filled; fillcolor=palegreen];
n15 [label="file7"; style=filled; fillcolor=palegreen];
}
```




