Compare serialized examples for three use cases:

1) I want to transfer three Actors.
2) SBOM C uses SBOM B, which uses SBOM A.
3) I want the canonical hash of a Relationship.

Regards,
Dave

On Tue, Jul 19, 2022 at 12:47 AM William Bartholomew (CELA) via
lists.spdx.org <[email protected]> wrote:

> CIL
>
> ------------------------------
> *From:* [email protected] <[email protected]> on behalf of
> David Kemp via lists.spdx.org <[email protected]>
> *Sent:* Monday, July 18, 2022 6:18 PM
> *To:* SPDX-list <[email protected]>
> *Subject:* [EXTERNAL] Re: [spdx-tech] V3 serialization
>
> One principle is that the goal of serialization is to put Elements into
> physical format, NOT to create new elements that didn't exist prior to
> serialization.  If you have 6 elements going into serialization, you should
> have 6 elements coming out, not 7.
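
The "same elements in, same elements out" principle above can be checked mechanically. The following is only a sketch: the plain "id" and "type" keys and the "elements" wrapper are assumptions made for illustration, not part of any agreed serialization.

```python
import json

def serialize(elements):
    """Write a list of element dicts to a JSON string (hypothetical wrapper format)."""
    return json.dumps({"elements": elements}, sort_keys=True)

def deserialize(text):
    """Read the element dicts back out of the JSON string."""
    return json.loads(text)["elements"]

def same_elements(before, after):
    """True when both lists hold the same ids with the same property values."""
    def index(items):
        return {e["id"]: e for e in items}
    return len(before) == len(after) and index(before) == index(after)

elements = [
    {"id": "urn:acme.dev:identities:fred", "type": "actor"},
    {"id": "urn:acme.dev:artifacts:gnu-coreutils/v9.1", "type": "package",
     "name": "GNU Coreutils"},
]
restored = deserialize(serialize(elements))
print(same_elements(elements, restored))
```

A serializer that invented an extra element, or dropped one, would make same_elements return False.
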
>
> *[William] *Agreed, does my example violate that? It would be difficult
> for a serialization to "generate" elements because of the id and other
> required properties so I had not considered this a possibility.
>
> The second principle is that logical elements should be independent: the
> value of one element does not depend on the value of any other element.
>
> *[William] *I think it depends on your definition of "depends on" (pun
> intended). Elements may have properties that are references to other
> elements and serializers may choose to use that information for more
> compact serialization but since this would get unwound on deserialization
> that's immaterial.
>
> I believe that those two principles are worth adopting as design
> requirements.
>
> It is ugly to put something into serialization and get something else back
> out,
>
> *[William] *Agreed, though a lot of serializers/deserializers end up
> making minor changes as a result of normalization and other processes. Not
> ideal but that's an implementation detail within each
> serializer/deserializer.
>
> and it's really ugly to stuff one element's value inside another
>
> *[William] *I don't agree with this, at least for "collection" elements.
> Also, the serialization model for collection elements could support either
> element references or the element itself so if you think it's ugly then you
> would have the option of not doing nesting.
>
> not least because you can wind up with infinite recursion with documents
> inside documents inside documents inside documents
>
> *[William] *This is avoidable and using references instead of nesting
> doesn't prevent this problem. In fact, if you only use nesting then it's
> impossible to have infinite recursion, it's only when you use references
> that becomes possible.
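
The point about references making cycles possible can be illustrated with a small check. This is a sketch under assumptions: "collections" is a hypothetical dict mapping a collection's id to the ids it references, which is not a structure defined anywhere in the model.

```python
def find_cycle(collections, start):
    """Depth-first walk over id references.

    Returns the first cycle reachable from start as a list of ids
    (first id repeated at the end), or None if there is no cycle.
    """
    path = []        # ids on the current walk, in order
    on_path = set()  # same ids, for fast membership tests
    done = set()     # ids already fully explored without finding a cycle

    def visit(node):
        if node in on_path:
            return path[path.index(node):] + [node]
        if node in done:
            return None
        on_path.add(node)
        path.append(node)
        for child in collections.get(node, []):
            cycle = visit(child)
            if cycle:
                return cycle
        on_path.discard(node)
        path.pop()
        done.add(node)
        return None

    return visit(start)

# SBOM C uses SBOM B, which uses SBOM A: a chain, no cycle.
chain = {"sboms:c": ["sboms:b"], "sboms:b": ["sboms:a"], "sboms:a": []}
# Two documents that reference each other: only expressible with references.
loop = {"docs:a": ["docs:b"], "docs:b": ["docs:a"]}

print(find_cycle(chain, "sboms:c"))
print(find_cycle(loop, "docs:a"))
```

With purely nested data no such check is needed, because a finite nested structure is a tree.
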
>
>  Even two levels of element nesting makes things quite difficult to
> disentangle.
>
> *[William] *I don't agree, for collections the nesting makes it obvious
> which collection an element is part of without having to follow the id
> references. Since the serialization model could support either approach I
> don't see this being a blocker.
>
> The fundamental principle is that a file containing data is not an
> element.  A Transfer Unit is defined by a data schema, just like the
> content of any XML file or JSON file or ASN.1 file.  If the logical model
> has a Document element that describes an X.509 certificate, that element
> has interesting facts about the certificate but does not define its
> content.  It is essential to remember the difference between the bytes in a
> file and the properties of a File or Document element - the difference
> between a thing and metadata about that thing.
>
> *[William] *We've had this discussion a number of times, the Collection
> element (and its subclasses) aren't metadata about collections, document,
> SBOM, etc. they are the collection, document, SBOM, etc. There is no
> "physical" thing outside of the SPDX document that is the collection,
> document, SBOM, etc., they only exist in the SPDX graph. You could take
> that SBOM, serialize it to disk, and then have a File element that talks
> about the physical serialization of the SBOM, but that's different to the
> SBOM SPDX element.
>
> * defaults:
> I created a separate defaults property to hold the five defaultable
> properties in order to distinguish them from non-defaultable properties.
> Gary and I like the idea, but I'm not wedded to it.  The transfer unit
> schema could have "defaultCreatedBy", "defaultCreated", etc properties at
> the top level, to highlight that they are defaults, unlike name,
> description, comments, etc.  Whatever the mechanism, there must be a way to
> ensure that "name" doesn't take an inappropriate default value if it isn't
> populated, while the default for "profiles" is appropriate.
>
> *[William] *I'm struggling with multiple properties that have the same
> definition having different names and different locations on the objects,
> it feels like a lot to explain. We could flag certain properties as
> inheritable in the schema, and this only applies to collection elements so
> I think the scope is quite narrow.
>
> * array vs map
> I used map as a conversation starter, because it fits the "unique"
> semantics of element ids, and because mapping types are ubiquitous now,
> XML schema had it in 2005
> (https://www.w3.org/2005/07/xml-schema-patterns.html#Maps),
> and it's a built-in part of JSON.
>
> *[William] *While it is built-in to XML and JSON my experience is that
> it's not been supported well by schema languages and
> serializers/deserializers. I know I've had situations where I had to
> duplicate the id property in the class to ensure that other things work
> correctly (and to maintain the independence of the class). Also, in most
> object oriented languages there is not a way to get the key from the object
> so you end up having to track the key independently of the object which is
> a pain.
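
To make the array-versus-map trade-off concrete, here is a sketch of converting between the two shapes. The "id" property name and both function names are assumptions for illustration only.

```python
def map_to_array(element_map):
    """Turn {id: properties} into [{"id": id, ...}], so each object carries its own id."""
    return [{"id": key, **props} for key, props in element_map.items()]

def array_to_map(element_array):
    """Turn [{"id": ...}, ...] into {id: properties}.

    An array can hold the same id twice, which a map cannot, so duplicates are rejected.
    """
    result = {}
    for element in element_array:
        props = dict(element)   # copy so the caller's object is not modified
        key = props.pop("id")
        if key in result:
            raise ValueError(f"duplicate element id: {key}")
        result[key] = props
    return result

as_map = {
    "identities:fred": {"type": "actor"},
    "artifacts:gnu-coreutils/v9.1": {"type": "package", "name": "GNU Coreutils"},
}
as_array = map_to_array(as_map)
print(array_to_map(as_array) == as_map)
```

The map form enforces uniqueness by construction, while the array form keeps each object self-contained and leaves the uniqueness check to the reader of the data.
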
>
> JSON-LD even treats ID differently from other properties by giving it a
> reserved @id keyword, and SQL databases have primary keys with the special
> characteristic that they uniquely identify the record rather than being
> just another column.  Autogenerated ids are often hidden because they are
> ubiquitous.
>
> *[William] *In both JSON-LD and SQL the properties are still normal
> properties, in JSON-LD it's still a property on the object it just has a
> special name, in SQL it's still a column in the table it just has special
> metadata attached to it. Even autogenerated ids are typically normal
> columns they're just system generated and you can't change their definition.
>
>   And finally, you introduced Map to the logical model for Extensions.  If
> it's OK for extensions, it's OK for Elements :-).
>
> *[William] *Not the same 🙂. The map for extensions is a map of
> "extension type" to value, not of "id" to value. It is a consequence of us
> deciding that each type can only be assigned once that it can also be used
> as an id, but it is primarily a type, not an id. If we changed that design
> decision it would no longer function as an id.
>
> Seriously though, I'm not wedded to Map.  Treating Id as any other
> property but having some prose saying that it can be used as a primary key
> / unique identifier is OK, it's just kind of loose given that references
> from foreign to primary keys is a universal concept.
>
> *[William] *In SQL Server (others are similar) a foreign key takes the
> form FOREIGN KEY (ChildCol1, ChildCol2) REFERENCES parentTable (ParentCol1,
> ParentCol2, ...), they're still just columns, nothing magical about them,
> not even their names.
>
> * type property
> Since JSON does not have types it's good practice to ensure that "type:
> identity" cannot collide with a property named "identity".  At the core
> profile all type and property names are defined and don't collide, but if
> "type" goes away we'll need to ensure that properties defined in any
> profile cannot collide with types defined in any profile.  Again JSON-LD
> treats @type as a reserved property:
> https://w3c.github.io/json-ld-syntax/#typed-values
>
> *[William] *Agreed, and type isn't in the logical model, a JSON-LD
> serializer would use @type, an XML one would use XML namespaces and element
> names, a ProtoBuf one would use message types. Since my examples were
> "plain" JSON which does not have a built-in way of declaring types I used a
> "plain" property to capture the type, I agree that the name of this
> property should avoid potential conflict (e.g. by prefixing with an _).
>
> * document root
> A transfer unit file is not an Element and not a logical type or a class.
> The bytes in SPDX documents are not defined by the logical model, they just
> have to be able to be de-serialized into element instances.
>
> *[William] *Same disagreement as above.
>
> Data schemas (for JSON, XML, ASN.1, ...) explicitly do not define classes,
> they define only data types.
>
> *[William] *I'm not sure what definition of "class" you're using here,
> but the boxes on the diagram could be represented in an OO language as
> classes or interfaces, for our purposes I don't think the distinction
> between class and data type is meaningful.
>
> Regards,
> Dave
>
>
> On Mon, Jul 18, 2022 at 7:08 PM William Bartholomew (CELA) via
> lists.spdx.org <[email protected]> wrote:
>
> There are some “proposed” examples at the bottom of the model diagram
> (note that I intended these to be representative until we define the exact
> serialization for each data format):
>
> https://github.com/spdx/spdx-3-model/blob/main/model.png
>
>
>
> Some of the key differences (with no implied support for either choice, I
> have included my reasoning for reference only):
>
>    - Defaults being represented as the original properties on a
>    collection element *vs* being in their own “defaults” property.
>       - I was thinking about this as a traditional inheritance/overrides
>       structure. If a property doesn’t have a value you can walk the tree up
>       looking for the same property.
>    - Array of elements *vs* map of elements.
>       - In the past I have found schema languages don’t have good support
>       for one of the properties of an object being outside of the object
>       (i.e. a key on the collection outside). Having a completely contained
>       object makes canonicalization etc. easier at the risk of the array
>       having multiple instances of the same element (which can be solved in
>       other ways).
>    - Type being a string property *vs* an object property containing the
>    type.
>       - I mainly followed the JSON-LD style and it has one less level of
>       nesting.
>    - Document root being an element *vs* a custom class.
>       - Tried to minimize custom classes by having everything as either
>       an element or a value type.
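
The remark in the list above that a completely contained object makes canonicalization easier can be sketched as follows. This is an illustration under assumptions, not a proposed canonical form for SPDX: it simply sorts object keys and removes insignificant whitespace before hashing, and the function name and sample property values are made up.

```python
import hashlib
import json

def canonical_hash(element):
    """SHA-256 over a key-sorted, whitespace-free JSON rendering of one element.

    Array order is preserved, so two "to" lists in different orders hash differently.
    """
    text = json.dumps(element, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

relationship = {
    "id": "urn:acme.dev:relationships:gnu-coreutils/v9.1",
    "relationshipType": "CONTAINS",
    "from": "urn:acme.dev:artifacts:gnu-coreutils/v9.1",
    "to": ["urn:acme.dev:artifacts:gnu-coreutils/v9.1/src/du.c"],
}
# Same properties written in a different key order.
reordered = {
    "to": ["urn:acme.dev:artifacts:gnu-coreutils/v9.1/src/du.c"],
    "from": "urn:acme.dev:artifacts:gnu-coreutils/v9.1",
    "relationshipType": "CONTAINS",
    "id": "urn:acme.dev:relationships:gnu-coreutils/v9.1",
}
print(canonical_hash(relationship) == canonical_hash(reordered))
```

Because the id travels inside the object, the hash input is the object alone; with a map, the key would have to be folded back in before hashing.
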
>
>
>
>
>
> Regards,
>
>
>
> William Bartholomew (he/him)
>
> Principal Security Strategist
>
> Global Cybersecurity Policy – Microsoft
>
>
>
> *My working day may not be your working day. Please don’t feel obliged to
> reply to this e-mail outside of your normal working hours.*
>
>
>
> *From:* [email protected] <[email protected]> *On Behalf Of
> *David Kemp via lists.spdx.org
> *Sent:* Monday, July 18, 2022 1:56 PM
> *To:* SPDX-list <[email protected]>
> *Subject:* [EXTERNAL] [spdx-tech] V3 serialization
>
>
>
> Last week I took an action item to describe what serialized data for the
> v3 logical model could look like, in order to clarify discussion of the
> types shown in the model.
>
> The thing to remember about v3 is that it is knowledge graph centric, not
> document centric. Element instances from the knowledge graph can be
> serialized into data instances, but the data definition is controlled by
> the logical model, not vice versa.  Data examples in various formats can
> illustrate the logical model for readers of the v3 spec, but they do not
> define it as they do in SPDX v2.
>
> A collection of independent element values is shown in
> "logical-elements".  JSON data is used to visualize the element values, but
> it is important to remember that the logical value itself is the ability to
> answer questions:
> * what is the id of this element?
> * what is the type of this element?
> * who created this element?
> etc.  The element is a class with getters that allow each property of an
> instance to be retrieved, and those property values are independent of
> serialization format.
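
One way to picture "the logical value is independent of serialization format" is to build the same element from two differently shaped JSON objects and compare the results. The class, both loader functions, and the "flat" shape are assumptions for illustration; only the map-entry shape follows the example in this message.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Element:
    """Logical element: answers "what is the id / type / creator?" via attributes."""
    id: str
    type: str
    created_by: tuple

def from_flat(obj):
    """Flat form: id and type are ordinary properties on the object."""
    return Element(obj["id"], obj["type"], tuple(obj["createdBy"]))

def from_map_entry(key, value):
    """Map form: id is the map key and the type name is the single key under "type"."""
    (type_name,) = value["type"].keys()
    return Element(key, type_name, tuple(value["createdBy"]))

fred = "urn:acme.dev:identities:fred"
flat = {"id": fred, "type": "actor", "createdBy": [fred]}
entry = {"type": {"actor": {}}, "createdBy": [fred]}

a = from_flat(flat)
b = from_map_entry(fred, entry)
print(a == b)
print(a.id, a.type)
```

Both loaders produce equal Element values, so anything that asks the element for its id, type, or creator gets the same answer whichever serialization it came from.
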
>
> That collection of elements can be serialized into a transfer unit file as
> shown in "transfer units".
>
> A Document element describes the contents of a transfer unit, but does not
> need to be present in the transfer unit.  The example transfer unit
> containing six elements (an SBOM, a Package, two Files, a Relationship, and
> an Actor that created them) is:
>
> {
>   "namespace": "urn:acme.dev:",
>   "defaults": {
>     "createdBy": ["identities:fred"],
>     "created": "2022-04-05T22:00:00Z",
>     "specVersion": "3.0",
>     "profiles": ["Core", "Software"],
>     "dataLicense": "CC0-1.0"
>   },
>   "elementValues": {
>     "artifacts:gnu-coreutils/v9.1/src/du.c": {
>       "type": {
>         "file": {
>           "filePurpose": ["APPLICATION", "SOURCE"]
>         }
>       }
>     },
>     "artifacts:gnu-coreutils/v9.1/src/echo.c": {
>       "type": {
>         "file": {
>           "filePurpose": ["APPLICATION", "SOURCE"]
>         }
>       }
>     },
>     "artifacts:gnu-coreutils/v9.1": {
>       "type": {
>         "package": {
>           "packagePurpose": ["APPLICATION", "SOURCE"],
>           "downloadLocation": "http://mirror.rit.edu/gnu/coreutils/coreutils-9.1.tar.gz",
>           "homePage": "https://www.gnu.org/software/coreutils/"
>         }
>       },
>       "name": "GNU Coreutils"
>     },
>     "relationships:gnu-coreutils/v9.1": {
>       "type": {
>         "relationship": {
>           "relationshipType": "CONTAINS",
>           "from": "urn:acme.dev:artifacts:gnu-coreutils/v9.1",
>           "to": [
>             "artifacts:gnu-coreutils/v9.1/src/du.c",
>             "artifacts:gnu-coreutils/v9.1/src/echo.c"
>           ]
>         }
>       }
>     },
>     "identities:fred": {
>       "type": {
>         "actor": {}
>       },
>       "identifiedBy": [{"email": "[email protected]"}]
>     },
>     "sboms:gnu-coreutils/v9.1": {
>       "type": {
>         "sbom": {
>           "elements": [
>             "artifacts:gnu-coreutils/v9.1/src/du.c",
>             "artifacts:gnu-coreutils/v9.1/src/echo.c",
>             "artifacts:gnu-coreutils/v9.1",
>             "relationships:gnu-coreutils/v9.1",
>             "identities:fred"
>           ]
>         }
>       }
>     }
>   }
> }
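
A reader of such a transfer unit would need to turn it back into independent elements. The sketch below assumes two rules that the example suggests but this message does not state: the full id is the namespace concatenated with the map key, and a default applies only when the element does not carry that property itself. References inside property values are left untouched here.

```python
def expand(unit):
    """Return {full_id: properties} with the namespace prepended and defaults filled in."""
    namespace = unit.get("namespace", "")
    defaults = unit.get("defaults", {})
    elements = {}
    for local_id, props in unit["elementValues"].items():
        # The element's own values win over the transfer unit defaults.
        elements[namespace + local_id] = {**defaults, **props}
    return elements

unit = {
    "namespace": "urn:acme.dev:",
    "defaults": {"specVersion": "3.0", "created": "2022-04-05T22:00:00Z"},
    "elementValues": {
        "identities:fred": {"type": {"actor": {}}},
        "artifacts:gnu-coreutils/v9.1": {
            "type": {"package": {}},
            "name": "GNU Coreutils",
            "created": "2022-04-06T09:00:00Z",  # hypothetical override of the default
        },
    },
}
expanded = expand(unit)
for full_id, props in expanded.items():
    print(full_id, props.get("created"), props.get("name"))
```

Only properties listed under "defaults" are ever inherited, so "name" stays absent on the Actor rather than picking up a value from elsewhere.
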
>
> The element examples, transfer unit examples, and the SPDX v3 schema
> derived from the logical model are available in
> https://github.com/davaya/spdx-3-elements
>
> The intent is for these to assist in refining the logical model and its
> serializations together.
>
> Regards,
> Dave
>
> 
>
>


-=-=-=-=-=-=-=-=-=-=-=-
View/Reply Online (#4661): https://lists.spdx.org/g/Spdx-tech/message/4661
-=-=-=-=-=-=-=-=-=-=-=-