Canonical Hash Subgroup folks,

Attached is a diagram illustrating the canonicalization process.  As you
know, our goal is to be able to take an SPDX document in any format and
produce a hash value that is independent of that format.  If the same SPDX
information is serialized in RDF and JSON, then the hash of those documents
must be the same.  If the SPDX information in two documents is different,
then their hashes must be different.
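
To make the goal concrete, here is a minimal sketch in Python. It assumes the SPDX information from each serialization has already been parsed into plain dictionaries; the field names are hypothetical and the sorted-key JSON encoding is only a stand-in for whatever canonical form the subgroup eventually agrees on (it is not a full canonicalization scheme such as RFC 8785).

```python
import hashlib
import json


def canonical_bytes(data):
    """Serialize parsed data deterministically: sorted keys, no extra whitespace."""
    text = json.dumps(data, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def canonical_hash(data):
    """SHA-256 over the deterministic serialization (algorithm choice is an assumption)."""
    return hashlib.sha256(canonical_bytes(data)).hexdigest()


# Hypothetical element, as it might look after parsing two different formats.
doc_from_rdf = {"spdxId": "SPDXRef-Package", "name": "example", "version": "1.0"}
doc_from_json = {"version": "1.0", "name": "example", "spdxId": "SPDXRef-Package"}
doc_changed = {"spdxId": "SPDXRef-Package", "name": "example", "version": "1.1"}

print(canonical_hash(doc_from_rdf) == canonical_hash(doc_from_json))
print(canonical_hash(doc_from_rdf) == canonical_hash(doc_changed))
```

The first comparison holds because key order is erased before hashing; the second fails because the content itself differs.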

Thus far we have discussed:
* canonical data format (agreed to consider JSON and CBOR)
* canonicalization tool programming languages
* directly hashing a canonical data format vs. constructing a Merkle hash
tree from an AST of that data format
* normalizing URL strings
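
For the third item, the Merkle-tree alternative to hashing the serialized bytes directly could look roughly like the Python sketch below. The tag bytes, the choice of SHA-256, and the rule that object members are ordered by key are all assumptions made for illustration, not anything the subgroup has decided.

```python
import hashlib
import json


def _h(tag: bytes, payload: bytes) -> bytes:
    """Hash a payload with a one-byte type tag so node kinds cannot be confused."""
    return hashlib.sha256(tag + payload).digest()


def merkle_hash(node) -> bytes:
    """Compute a hash for a JSON-like tree from the hashes of its children."""
    if isinstance(node, dict):
        # Members are ordered by key, so the input key order does not matter.
        # Each key is hashed to a fixed length to keep the concatenation unambiguous.
        parts = b"".join(
            _h(b"K", key.encode("utf-8")) + merkle_hash(node[key])
            for key in sorted(node)
        )
        return _h(b"D", parts)
    if isinstance(node, list):
        # List order is treated as significant.
        return _h(b"L", b"".join(merkle_hash(item) for item in node))
    # Scalars (string, number, boolean, null) are hashed via their JSON text.
    return _h(b"S", json.dumps(node, ensure_ascii=False).encode("utf-8"))


tree_a = {"name": "example", "files": ["a.c", "b.c"]}
tree_b = {"files": ["a.c", "b.c"], "name": "example"}
print(merkle_hash(tree_a).hex() == merkle_hash(tree_b).hex())
```

One practical difference from direct hashing is that every subtree gets its own hash, so two documents can be compared element by element rather than only as a whole.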

We have not yet discussed defining the SPDX Abstract Syntax Tree, which is
similar to, but more strictly defined than, the SPDX logical model.
Although we discussed JSON ASTs in the context of producing hash trees, a
JSON AST has no knowledge of SPDX and thus doesn't help when processing
SPDX documents in other formats.  This diagram illustrates some of the
topics to be addressed, and hopefully can guide and focus future discussion.

v/r,
David

