Identifiers vs. locators

Several problems were mentioned on today's call that have the same root
cause: failing to make a clear and unambiguous distinction between
identifiers and locators.  An id uniquely, well, identifies something; if
the thing doesn't change, then neither does its id.  A locator helps people
or machines find things.  The same thing can have multiple copies in
multiple places or, as Nisha discussed with respect to Docker containers, a
single instance of a thing can be found by multiple paths, some of which
may form loops.  And the same thing may have multiple human-readable names.

The WHATWG URL standard <https://url.spec.whatwg.org/> encourages that
confusion:

> Standardize on the term URL. URI and IRI are just confusing. In practice a
> single algorithm is used for both so keeping them distinct is not helping
> anyone. URL also easily wins the search result popularity contest.


Perhaps the problem is not the terms, but the fact that the authors are
easily confused. Standards writers should have higher standards.

Recommendations:
* An SPDX Artifact MUST have an id attribute.  It doesn't matter whether it
is a hash value or an SPDXID or something else; the only requirement is
that it be unique and immutable given the content of the package or file or
snippet.  Its form can be a URN, or an unregistered URN-like URI if
necessary, or an unprefixed raw value if the attribute name unambiguously
defines the method by which it is assigned.

* An SPDX Artifact MAY have a multivalued set of locator attributes such as
filenames or URLs.

* The algorithm for computing the ID of a package or manifest MUST exclude
all locator attributes from the computation.
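
To make the three recommendations concrete, here is a minimal Python
sketch.  The Artifact class, its field names, and the "sha256:" prefix are
all hypothetical, chosen only for illustration; nothing here is taken from
the SPDX spec.  The point is just that the id is derived from content
alone, so two copies found at different locations get the same id.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Artifact:
    """Hypothetical artifact record: content plus zero or more locators."""
    content: bytes
    # Locators (filenames, URLs) say where a copy lives; they are NOT
    # part of the identity and never feed into the id computation.
    locators: set = field(default_factory=set)

    @property
    def id(self) -> str:
        # Assumed scheme: SHA-256 of the content with an illustrative
        # "sha256:" prefix.  Any content-derived, immutable value would do.
        return "sha256:" + hashlib.sha256(self.content).hexdigest()


# The same content found in two different places.
local_copy = Artifact(b"hello\n", {"src/hello.txt"})
mirror_copy = Artifact(
    b"hello\n",
    {"https://example.org/mirror/hello.txt", "backup/hello.txt"},
)

# Same content, different locators: the ids must match.
assert local_copy.id == mirror_copy.id

# Different content: the ids must differ.
assert Artifact(b"goodbye\n").id != local_copy.id
```

Adding or removing a locator never changes the id, which is what lets the
same thing be tracked across mirrors, renames, and multiple paths.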

Solving the Docker problem may require digging down into the content of zip
files and tarballs to understand and extract just the information of
interest when computing the ID of an SPDX manifest.  Maybe a software
distribution can be treated as a unit below which pathnames are constant
and unique WRT the hash of their content, but that assumption must be
documented to highlight the fact that results are undefined if the
assumption is wrong.
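
One possible reading of that idea, sketched in Python under stated
assumptions: treat a tarball as the unit, and compute its id from the
sorted (pathname, content hash) pairs of its regular files, ignoring
archive metadata such as timestamps and member order.  The function names
and the exact hashing scheme are hypothetical.  The sketch raises an error
when a pathname repeats inside the archive, since that is exactly the case
where the assumption fails.

```python
import hashlib
import io
import tarfile


def distribution_id(archive_bytes: bytes) -> str:
    """Content-derived id for a tarball (illustrative scheme only)."""
    entries = {}
    with tarfile.open(fileobj=io.BytesIO(archive_bytes)) as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue  # skip directories, links, devices
            if member.name in entries:
                # Documented assumption: pathnames are unique within
                # the distribution.  Refuse rather than guess.
                raise ValueError(f"duplicate pathname: {member.name}")
            data = tar.extractfile(member).read()
            entries[member.name] = hashlib.sha256(data).hexdigest()
    digest = hashlib.sha256()
    for path in sorted(entries):
        line = f"{path}\0{entries[path]}\n"
        digest.update(line.encode("utf-8", errors="surrogateescape"))
    return digest.hexdigest()


def make_tar(files: dict, mtime: int) -> bytes:
    """Helper for the demo: build an in-memory tarball."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mtime = mtime
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


# Same files, different member order and timestamps.
tar_a = make_tar({"a.txt": b"1", "b.txt": b"2"}, mtime=0)
tar_b = make_tar({"b.txt": b"2", "a.txt": b"1"}, mtime=1000)

assert tar_a != tar_b  # the raw archives differ byte for byte
assert distribution_id(tar_a) == distribution_id(tar_b)
```

Hashing the raw archive bytes would give two different ids here even
though the content of interest is identical, which is why the computation
has to look inside the container.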

Looking forward to next week.

v/r,
David
