Hey,

my vote would be to define names for the hashes, like "sha256", for the JSON
serialization and use them as strings in the canonical serialization. My goal
would be to keep it as similar to "normal" JSON as possible.

Best
Max

----- Original Message -----
From: "David Kemp" <[email protected]>
To: "SPDX-list" <[email protected]>
Sent: Friday, 24 June 2022 16:36:46
Subject: [spdx-tech] Canonicalization - enumerations

Sebastian called a vote on whether "the" canonical representation of
enumerated lists such as hash algorithms and relationship types should be
strings or numbers.

My vote is "doesn't matter".  I lean toward efficient serializations
because they are more likely to be rigorously correct, but the critical
requirement is that the model defines the equivalence tables for all
enumerations:

Hash Algorithms:
 1 SHA1
 2 SHA224
 3 SHA256

Software Purposes:
 1 APPLICATION
 2 FRAMEWORK
 3 LIBRARY
  .
etc.
We can say today that the canonical serialization will use human-readable
values, then work through the details of translating to and from concise
serializations.  At that point, when all translations are guaranteed to be
lossless, our work is done.
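
As a minimal sketch of what such an equivalence table could look like in
code, the Python below maps the hash algorithm names to the numbers listed
above and checks that translation in both directions is lossless. The
function names to_concise and to_readable are hypothetical and only serve
the illustration; nothing here is taken from the SPDX model itself.

```python
from enum import IntEnum


class HashAlgorithm(IntEnum):
    """Equivalence table copied from the example list in this message."""
    SHA1 = 1
    SHA224 = 2
    SHA256 = 3


def to_concise(name: str) -> int:
    """Translate a human-readable name into its numeric value (hypothetical helper)."""
    return HashAlgorithm[name].value


def to_readable(number: int) -> str:
    """Translate a numeric value back into its human-readable name (hypothetical helper)."""
    return HashAlgorithm(number).name


# Lossless means every entry survives a round trip in both directions.
for member in HashAlgorithm:
    assert to_readable(to_concise(member.name)) == member.name
    assert to_concise(to_readable(member.value)) == member.value
```

An unknown name or number raises an error (KeyError or ValueError) instead
of being passed through silently, which is the behavior a strict table
would presumably want.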

We could then throw the switch (using Sebastian's analogy) and say the
canonical hash is computed over CBOR data and everything would still work
perfectly, because any format can be converted into any other.
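
To illustrate why the choice of canonical form does not affect
correctness, here is a small sketch in Python. It assumes a hypothetical
record with the fields "algorithm" and "hashValue", and it uses sorted,
whitespace-free JSON as a stand-in canonical form (not the actual SPDX
canonicalization, and not CBOR). Both the readable and the concise input
are normalized through the equivalence table before hashing, so they yield
the same digest.

```python
import hashlib
import json

# Equivalence table from the example list in this message.
HASH_ALGORITHMS = {1: "SHA1", 2: "SHA224", 3: "SHA256"}


def canonical_bytes(record: dict) -> bytes:
    """Normalize a record to the string form and serialize it deterministically.

    The field name "algorithm" is a hypothetical placeholder.
    """
    normalized = dict(record)
    algorithm = normalized["algorithm"]
    if isinstance(algorithm, int):
        # Concise input: translate the number to its name via the table.
        normalized["algorithm"] = HASH_ALGORITHMS[algorithm]
    # Sorted keys and fixed separators stand in for a real canonical form.
    text = json.dumps(normalized, sort_keys=True, separators=(",", ":"))
    return text.encode("utf-8")


def canonical_hash(record: dict) -> str:
    """SHA-256 digest computed over the canonical bytes."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


readable = {"algorithm": "SHA256", "hashValue": "abc123"}
concise = {"hashValue": "abc123", "algorithm": 3}

# Same semantic content, different surface form, identical canonical hash.
assert canonical_hash(readable) == canonical_hash(concise)
```

Switching the canonical form to the numeric one would only change the
direction of the table lookup inside canonical_bytes; the equality check
at the end would still hold.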

Routers are designed to parse IP packets in optimized format (
https://datatracker.ietf.org/doc/html/rfc791#section-3.1), but optimized
data can be displayed to humans by tools like Wireshark.  Routers could be
designed to process data in human-readable format.  They would be much less
efficient, but they would work correctly as long as the semantic
equivalence between efficient and human-readable data is precisely defined.

If SBOMs become as ubiquitous in machine-to-machine operations as IP, they
will surely be processed in an efficient format, and humans will use tools
like Wireshark to display/debug them.  But for now, we can design canonical
hashes to hash over the string "TCP" instead of the number 6 (
https://en.wikipedia.org/wiki/List_of_IP_protocol_numbers) for convenience.

Dave



-- 
Maximilian Huber * [email protected] * +49-174-3410223
TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
Managing Directors: Henrik Klagges, Dr. Robert Dahlke, Thomas Endres
Registered office: Unterföhring * Amtsgericht München * HRB 135082

