100% agree with Alexios: for compatibility across all serialization formats, today’s and tomorrow’s, canonicalization should not use any information that’s not in the logical model. Ideally, canonicalization would define how to convert each type into a series of bytes, and then how those series of bytes get combined and hashed.
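As a rough illustration of "convert each type into a series of bytes, then combine and hash", here is a minimal Python sketch. Everything in it is an assumption made for illustration and not something the SPDX canonicalization work has defined: the one-byte type tags, the 4-byte length prefix, the choice of SHA-256, and the function names `frame`, `to_bytes` and `canonical_hash`. It covers only String, Array and Map values.

```python
import hashlib
import struct


def frame(tag: bytes, payload: bytes) -> bytes:
    """Prefix a payload with a one-byte type tag and a 4-byte big-endian length."""
    return tag + struct.pack(">I", len(payload)) + payload


def to_bytes(value) -> bytes:
    """Convert a logical-model value to bytes (hypothetical encoding)."""
    if isinstance(value, str):
        return frame(b"S", value.encode("utf-8"))
    if isinstance(value, list):
        # Arrays are ordered, so elements are encoded in the order given.
        return frame(b"A", b"".join(to_bytes(v) for v in value))
    if isinstance(value, dict):
        # Maps are unordered, so entries are sorted by their encoded key
        # to make the result independent of insertion order.
        items = sorted((to_bytes(k), to_bytes(v)) for k, v in value.items())
        return frame(b"M", b"".join(k + v for k, v in items))
    raise TypeError(f"no canonical encoding sketched for {type(value).__name__}")


def canonical_hash(value) -> str:
    """Hash the canonical byte form of a value (SHA-256 is an assumption)."""
    return hashlib.sha256(to_bytes(value)).hexdigest()
```

Because the bytes are derived only from the logical values, two documents that differ only in serialization details (for example, key order in a JSON object) would hash identically, while the length prefixes keep `["ab"]` and `["a", "b"]` distinct.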
The types that exist in the model today are:

* IRI
* URL
* String
* Enum (these are simple enums that just have an identifier, not an identifier and a value)
* DateTime (ISO-8601)
* SemVer (SemVer 2.0.0)
* MediaType (RFC 2046)
* Array (ordered list of values, each value in an array will be the same type, though this type may be a base class in which case any subclass of that would be acceptable)
* Map<Key, Value> (dictionary of unique keys with an associated value, all keys in a dictionary will have the same type, all values in a dictionary will have the same type, though this type may be a base class in which case any subclass of that would be acceptable)
* Element (referenced by SPDXID and equality based on SPDXID, has a type and a set of properties, each property has a name, type, and value)
* Struct (cannot be referenced across documents, references within a document are serialization-specific and opaque to canonicalization, has a type and a set of properties, each property has a name, type, and value)

We don’t have any today, but I’m sure we’ll need Integer 😊.

Regards,

William Bartholomew (he/him)
Principal Security Strategist
Global Cybersecurity Policy – Microsoft

From: Alexios Zavras via lists.spdx.org
Sent: Tuesday, August 9, 2022 8:15 AM
To: David Kemp
Cc: SPDX Technical Mailing List
Subject: Re: [spdx-tech] Canonicalisation Committee: please vote on the CPE external reference type format!

Between the alternatives of Object and Array as presented, I’m also for Array, due to conciseness.
However, I would be extremely reluctant to have the Canonical Serialization operate on anything other than the regular SPDX model we are discussing in all our tech calls. What I mean is that, in order to move from a String ("cpe:2.3:a:debian:php5-common:5.3.2-1:*:*:*:*:*:*:*") to an Object or an Array, the deconstruction should be reflected in the SPDX model itself: the value of a CPE reference should not be a String (nor even a CPEString), but a list of 11 attributes (part, product, etc.). Otherwise, we essentially state that Canonical Serialization would be using its own, special model, which I believe is wrong.

-- zvr

From: David Kemp
Sent: Saturday, 6 August, 2022 03:37
To: Sebastian Crane
Cc: SPDX Technical Mailing List
Subject: Re: [spdx-tech] Canonicalisation Committee: please vote on the CPE external reference type format!

The characteristic that makes canonicalization interesting is that, by definition, any canonical format can be converted to any other without loss. If any of the alternatives in this issue or any other issue can't be converted, then they aren't viable candidates. So picking among equivalent alternatives will always be a matter of personal preference, not capability.

My personal preference is for 1) explicit structure and 2) compactness, so I vote for the Array format. But I believe each of these alternatives is equivalent to my favorite, so meh.

Note that the string format includes the type information "cpe:2.3". This information must be known when using any of the alternatives, but is normally communicated by the property name of the value, so the same information in the string is redundant.
{"cpe-2.3": {,,,Object...}}
{"cpe-2.3": [,,,Array...]}
{"cpe-2.3": "cpe:2.3:...String..."}

Regards,
David

On Fri, Aug 5, 2022 at 7:21 PM Sebastian Crane wrote:

Dear all,

We had a very productive Canonicalisation Committee meeting today, in which we explored the many aspects of CPEs and discussed their representation in the SPDX Canonical Serialisation.

There are two forms of CPE which we support as External Reference Types in SPDX: CPE version 2.2 from 2009, and CPE version 2.3 from 2011. They differ a little from each other in terms of their usual string representation, but share the same basic constructs.

During the meeting, we came up with three different JSON representations that would be suitable for encoding CPE information unambiguously. As all three could be used in the Canonical Serialisation, we decided to call a vote as to which was preferred by the SPDX community, including tooling developers intending to support the Canonical Serialisation for SPDX 3.0.

Please feel free to vote, and make sure to cast it by means of a reply on this mailing list! :)

--------------------------------------------------------------------------------

'Object' option:
{"part":"a","product":"php5-common","vendor":"debian","version":"5.3.2-1"}

'Array' option:
["a","debian","php5-common","5.3.2-1"]

'String' option:
"cpe:2.3:a:debian:php5-common:5.3.2-1:*:*:*:*:*:*:*"

--------------------------------------------------------------------------------

Please note that the Object option has key-value pairs sorted alphabetically by their key's name, as will be the case for all objects in the SPDX 3.0 Canonical Serialisation. Versions 2.2 and 2.3 of the CPE specification are similar enough that we'll use the same format for both of them in the Canonical Serialisation.

An important factor that we will discuss later (and which we're not voting on here) is how null (wildcard) attributes are represented.
In these examples, I've omitted the null attributes for the Object and Array options, leaving out the entire key-value pair (for Object) and the null array elements (for Array). We could just as easily define that the String option must also have trailing null attributes removed, and conversely we could specify that all attributes are present for the Object and Array options, requiring a special value (such as NULL or the empty string) for each null attribute.

I'll count the votes on Thursday the 11th of August. Feel free to discuss the relative merits of the options here on the mailing list, and I look forward to seeing which option wins! :)

Best wishes,

Sebastian

View/Reply Online (#4729): https://lists.spdx.org/g/Spdx-tech/message/4729
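To make the three CPE options from this thread concrete, the following Python sketch converts the String form into the Array and Object forms and back again. It is only an illustration under stated assumptions: the helper names (`split_cpe23`, `to_array`, `to_object`, `from_array`) are hypothetical, the attribute order is the one used by the CPE 2.3 formatted string, "*" is treated as the null (wildcard) value, only trailing nulls are dropped from the Array so that positions stay meaningful, and escaped colons inside attribute values are not handled.

```python
import json

# Attribute order of a CPE 2.3 formatted string (11 attributes).
CPE23_ATTRIBUTES = [
    "part", "vendor", "product", "version", "update", "edition",
    "language", "sw_edition", "target_sw", "target_hw", "other",
]
PREFIX = "cpe:2.3:"


def split_cpe23(cpe):
    """Split a formatted string into its 11 attribute values (no escape handling)."""
    if not cpe.startswith(PREFIX):
        raise ValueError("not a CPE 2.3 formatted string")
    fields = cpe[len(PREFIX):].split(":")
    if len(fields) != len(CPE23_ATTRIBUTES):
        raise ValueError("expected 11 attributes")
    return fields


def to_array(cpe):
    """Array option: positional values with trailing wildcards removed."""
    fields = split_cpe23(cpe)
    while fields and fields[-1] == "*":
        fields.pop()
    return fields


def to_object(cpe):
    """Object option: named values with wildcard attributes omitted."""
    return {k: v for k, v in zip(CPE23_ATTRIBUTES, split_cpe23(cpe)) if v != "*"}


def from_array(fields):
    """Rebuild the String option by padding the array back out with wildcards."""
    padded = list(fields) + ["*"] * (len(CPE23_ATTRIBUTES) - len(fields))
    return PREFIX + ":".join(padded)


cpe = "cpe:2.3:a:debian:php5-common:5.3.2-1:*:*:*:*:*:*:*"
print(json.dumps(to_array(cpe), separators=(",", ":")))
print(json.dumps(to_object(cpe), sort_keys=True, separators=(",", ":")))
```

For the example string, the two printed lines match the 'Array' and 'Object' options quoted in the vote, and `from_array(to_array(cpe))` returns the original string, which is the lossless round trip David describes.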
