I was thinking about this last week as I was putting together some SPDX 3.0 
samples. Does canonicalization, whose purpose is to let us hash the content, 
need to understand the semantics at all? Or is it sufficient to deserialize to 
the logical model, strictly serialize that to a defined format, and hash the 
result? That serialization would only need to understand fundamental data 
types and, since it is per element, would not need to understand the 
relationships between elements.
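
As a rough sketch of what I mean (Python; the element fields, the use of 
sorted-key JSON as the "defined format", and SHA-256 are all assumptions for 
illustration, not anything the specification defines):

```python
import hashlib
import json


def canonical_bytes(element: dict) -> bytes:
    """Strictly serialize one element: sorted keys, no extra whitespace, UTF-8."""
    return json.dumps(
        element, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def element_hash(element: dict) -> str:
    """Hash the strict serialization; only basic data types are involved."""
    return hashlib.sha256(canonical_bytes(element)).hexdigest()


# Two serializations of the same logical element (hypothetical fields),
# differing only in key order and whitespace.
a = json.loads('{"name": "example", "specVersion": "3.0.0"}')
b = json.loads('{ "specVersion": "3.0.0",\n  "name": "example" }')
assert element_hash(a) == element_hash(b)
```

Nothing here looks at what the fields mean or at other elements; it only 
needs to know how to write strings, numbers, lists and maps consistently.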


Regards,

William Bartholomew (he/him) - Let's chat
Principal Security Strategist
Global Cybersecurity Policy - Microsoft

My working day may not be your working day. Please don't feel obliged to reply 
to this e-mail outside of your normal working hours.

-----Original Message-----
From: [email protected] <[email protected]> On Behalf Of 
Sebastian Crane via lists.spdx.org
Sent: Monday, May 16, 2022 10:12 AM
To: SPDX Technical Mailing List <[email protected]>
Subject: [EXTERNAL] [spdx-tech] Agenda topic for tomorrow's meeting: 
SpecVersion property

Dear all,

I would like to propose an agenda topic for tomorrow's SPDX Tech Team meeting:
the precise format of the SpecVersion property on each Element.

During last week's Canonicalisation Committee meeting we discussed the factors 
for a canonical representation of this property's data type. However, it became 
apparent that many aspects of the discussion were out of scope for the 
Canonicalisation Committee, and should be brought up with the wider Tech Team.

On the draft SPDX 3.0 model diagram, each Element has a SpecVersion property 
indicating which version of the SPDX Specification the Element is conformant 
to. The data type of SpecVersion refers to the 'Semantic Versioning' scheme 
described at https://semver.org/ - known simply as 'SemVer'.

At the Canonicalisation Committee meeting we came up with three options for the 
SpecVersion property's data type:

1: A structured data type composed of integers corresponding to the major, 
minor and patch levels

2: A plain string

3: Enumeration (enum) type of specification versions published by SPDX

It's worth noting that No.1 can only express a subset of the full Semantic 
Versioning specification, which supports extra tags like 'release candidate' 
and 'alpha'.

The main criterion that came up in the meeting was the ability for tooling to 
ignore or reject SPDX data that is of a version not supported by that tool.

If the types of changes represented by the major, minor and patch levels are 
rigorously defined and consistent, tools would be able to determine 
compatibility automatically with No.1. This is more nuanced when considering the 
Canonical Serialisation, since this could make feature releases (usually minor 
changes) breaking, major changes. Clearly, a tool cannot implement the 
canonical representation of a value introduced in the future! Yet, a tool merely 
performing analysis on the data fields it understands can just ignore the newly 
added fields.
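
To make the distinction concrete, here is a minimal sketch of No.1 in Python. 
The names `SpecVersion`, `can_analyse` and `can_canonicalise`, and the 
compatibility rules themselves, are assumptions for illustration only; nothing 
here has been agreed by the Tech Team.

```python
from typing import NamedTuple


class SpecVersion(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_spec_version(text: str) -> SpecVersion:
    """Parse 'MAJOR.MINOR.PATCH'; pre-release tags such as '-rc1' are rejected."""
    parts = text.split(".")
    if len(parts) != 3 or not all(p.isascii() and p.isdigit() for p in parts):
        raise ValueError(f"not a plain MAJOR.MINOR.PATCH version: {text!r}")
    return SpecVersion(*(int(p) for p in parts))


def can_analyse(tool: SpecVersion, data: SpecVersion) -> bool:
    # Assumed rule: same major version; fields added by a newer minor
    # release are simply ignored by the analysing tool.
    return tool.major == data.major


def can_canonicalise(tool: SpecVersion, data: SpecVersion) -> bool:
    # Assumed rule: the tool must know every field in order to serialise it,
    # so the data must not be newer than the tool.
    return tool.major == data.major and (data.minor, data.patch) <= (
        tool.minor,
        tool.patch,
    )
```

Under these assumed rules, a tool built for 3.0.0 could analyse 3.1.0 data but 
could not produce its canonical serialisation.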

No.3 would also allow for automated compatibility determination, but only for 
SPDX specification versions that it is hard-coded to understand, due to the 
'semantic' elements of the version specifier being opaque to the tool.
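
A minimal sketch of No.3 in Python follows. The enum members are a 
hypothetical list of versions a tool was built against, not a list of 
published SPDX releases.

```python
from enum import Enum


class KnownSpecVersion(Enum):
    # Hypothetical entries; a real tool would hard-code the versions it supports.
    V3_0_0 = "3.0.0"
    V3_0_1 = "3.0.1"


def is_supported(spec_version: str) -> bool:
    """True only for versions hard-coded above; anything else is opaque."""
    try:
        KnownSpecVersion(spec_version)
    except ValueError:
        return False
    return True
```

A version such as "3.1.0" is rejected here even if it happens to be 
compatible, because the tool has no way to reason about values outside the 
enumeration.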

Looking forward to hearing everyone's views on this!

Best wishes,

Sebastian






