One strong vote for strings.  The spec already clearly defines the string 
serializations, and introducing numbers adds unnecessary complexity for some 
tooling.

 

Gary

 

From: [email protected] <[email protected]> On Behalf Of David Kemp
Sent: Friday, June 24, 2022 7:37 AM
To: SPDX-list <[email protected]>
Subject: [spdx-tech] Canonicalization - enumerations

 

Sebastian called a vote on whether "the" canonical representation of enumerated 
lists such as hash algorithms and relationship types should be strings or 
numbers.

My vote is "doesn't matter".  I lean toward efficient serializations because 
they are more likely to be rigorously correct, but the critical requirement is 
that the model defines the equivalence tables for all enumerations:

Hash Algorithms:
 1 SHA1
 2 SHA224
 3 SHA256

Software Purposes:
 1 APPLICATION
 2 FRAMEWORK
 3 LIBRARY
 ...
etc.
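
A minimal sketch of such equivalence tables in Python, assuming the numeric 
codes listed above (they are illustrative values from this thread, not codes 
assigned by the SPDX spec).  The helper names to_code, to_name and is_lossless 
are hypothetical.

```python
from enum import IntEnum


class HashAlgorithm(IntEnum):
    # Illustrative codes only; not assigned by any specification.
    SHA1 = 1
    SHA224 = 2
    SHA256 = 3


class SoftwarePurpose(IntEnum):
    # Illustrative codes only; not assigned by any specification.
    APPLICATION = 1
    FRAMEWORK = 2
    LIBRARY = 3


def to_code(enum_cls, name):
    """Translate a human-readable name to its concise numeric code."""
    return enum_cls[name].value


def to_name(enum_cls, code):
    """Translate a concise numeric code back to its human-readable name."""
    return enum_cls(code).name


def is_lossless(enum_cls):
    """Check that every member survives a name -> code -> name round trip."""
    return all(
        to_name(enum_cls, to_code(enum_cls, member.name)) == member.name
        for member in enum_cls
    )
```

Because each table is a one-to-one mapping, either form can be recovered from 
the other, which is the property the rest of this message relies on.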
We can say today that the canonical serialization will use human-readable 
values, then work through the details of translating to and from concise 
serializations.  At that point, when all translations are guaranteed to be 
lossless, our work is done.

We could then throw the switch (using Sebastian's analogy) and say the 
canonical hash is computed over CBOR data and everything would still work 
perfectly, because any format can be converted into any other.
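
As a rough illustration of why the choice does not affect correctness, the 
sketch below hashes a record after normalizing its enumeration to the string 
form, so a readable input and a concise input yield the same digest.  This is 
not the SPDX canonicalization algorithm: the field names "algorithm" and 
"value", the numeric codes, and the use of sorted compact JSON as the canonical 
byte form are all assumptions made for the example.

```python
import hashlib
import json

# Hypothetical equivalence table (illustrative codes, not from any spec).
HASH_ALGORITHM_NAMES = {1: "SHA1", 2: "SHA224", 3: "SHA256"}


def normalize(record):
    """Return a copy of the record with the enumeration in string form."""
    algorithm = record["algorithm"]
    if isinstance(algorithm, int):
        algorithm = HASH_ALGORITHM_NAMES[algorithm]
    return {**record, "algorithm": algorithm}


def canonical_hash(record):
    """Hash the normalized record over a deterministic JSON encoding."""
    canonical = json.dumps(
        normalize(record), sort_keys=True, separators=(",", ":")
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# The same logical record in human-readable and concise form.
readable = {"algorithm": "SHA256", "value": "abc123"}
concise = {"value": "abc123", "algorithm": 3}

same_digest = canonical_hash(readable) == canonical_hash(concise)
```

Swapping the canonical form to the numeric one would only change the direction 
of the normalize step; both inputs would still hash identically.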

Routers are designed to parse IP packets in an optimized format 
(https://datatracker.ietf.org/doc/html/rfc791#section-3.1), but optimized data 
can be displayed to humans by tools like Wireshark.  Routers could instead be 
designed to process data in a human-readable format.  They would be much less 
efficient, but they would work correctly as long as the semantic equivalence 
between efficient and human-readable data is precisely defined.

If SBOMs become as ubiquitous in machine-to-machine operations as IP, they will 
surely be processed in an efficient format, and humans will use tools like 
Wireshark to display and debug them.  But for now, for convenience, we can 
design canonical hashes to hash over the string "TCP" instead of the number 6 
(https://en.wikipedia.org/wiki/List_of_IP_protocol_numbers).

Dave








