Re: [spdx-tech] Specific element identification/referencing questions

David Kemp Wed, 22 Sep 2021 07:33:20 -0700

On Tue, Sep 21, 2021 at 1:49 PM William Bartholomew via lists.spdx.org
<[email protected]> wrote:


> Here is a list of the open questions I captured from today’s meeting. If
> you have additional open questions that need to be answered please reply to
> this email with them. Let’s *not* try to answer the questions on this
> thread, if you want to propose a solution for a specific question start a
> thread for that specific question, otherwise we’ll work through these in
> the next meeting(s). I’ve included a copy of the model below for reference
> (unchanged since the last one I sent to the mailing list).
>
>
>
> * Adding an abstract Collection class as the superclass of Document and
> ContextualCollection simplified the model, are we comfortable keeping that?
>

We need to answer: "what is the difference between Document and
ContextualCollection?"  In SPDXv2 an SBOM was a Document.  In SPDXv3 an SBOM
is a ContextualCollection.

Semantically there are two types of Collection: contextual and
non-contextual.
* Contextual is a set of related elements - the collection is created as a
unit and elements contained in the collection are always part of the
collection even when referenced by individual IRI.  The collection Element
and its contained Elements are created at the same instant by the same
creator, regardless of whether any creation properties are serialized in
the Elements.
* Non-contextual is a collection that does not define any Elements. The
Elements were created before the collection, are copied into the
collection, and are unchanged whether the collection exists or not.

It would be much clearer if Document were called something like Bundle
since it no longer has the semantics of a v2 Document.  Keeping the same
name and changing the meaning is not just confusing, it's misleading.

Syntactically it would be possible to have a single collection type that
has all of 1) defined elements, 2) referenced elements, and 3) copied
elements, but that sounds like a nightmare to try to explain.  It's easier
to keep separate names for the two collection types.

> * Adding a NamespaceMap property on Collection allows us to roundtrip
> prefix to namespace mappings across formats, any strong opinions on not
> having this?
>
>                 * Do we want to enable a default namespace (e.g. no
> prefix)?
>

No objection to NamespaceMap.
Yes, a ContextualCollection MUST have a default namespace / base IRI.  It
defines the difference between contained and referenced Elements.

The purpose of the default namespace isn't to save a few prefix
characters.  The default namespace is the Collection IRI - the base IRI
shared by all contained Elements when the Collection is created. Every
Element in a ContextualCollection that doesn't have the default namespace
is known to be an external Element, referenced from the collection, created
before the collection and validated using an IntegrityMethod.
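
As a minimal sketch of that rule, assuming (hypothetically, the model does not
fix this) that a contained Element's IRI is the Collection IRI followed by "#"
and a local id, the contained/referenced split is a prefix test.  The function
names are illustrative only:

```python
def is_contained(element_iri: str, collection_iri: str) -> bool:
    """True if the element IRI is in the collection's default namespace.

    Assumption: contained IRIs look like "<collection_iri>#<local-id>".
    """
    return element_iri.startswith(collection_iri + "#")


def partition(element_iris, collection_iri):
    """Split IRIs into (contained, referenced) lists, preserving order."""
    contained = [i for i in element_iris if is_contained(i, collection_iri)]
    referenced = [i for i in element_iris if not is_contained(i, collection_iri)]
    return contained, referenced
```

Anything that lands in the second list is an external Element and would need an
IntegrityMethod to be validated.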

* If an Element is contained in a Collection can it omit properties (such
> as createdAt) and inherit them from the Collection?
>

The Collection Element has creation information that applies to all
Elements contained in the collection.  The base IRI and creation info of
every contained Element MUST be identical to those of the container Element,
because otherwise the element would be referenced (external), not contained
(internal).  When serializing the Collection Element, the creation info of
its internal elements SHOULD be omitted; if it is not omitted, deserializers
MUST validate that it equals the Collection's creation info.
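
A deserializer check along these lines might look like the following sketch.
CreationInfo and its two fields are hypothetical stand-ins for whatever
creation properties the model ends up with:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CreationInfo:
    # Hypothetical fields; the real property set is defined by the model.
    created_at: str
    created_by: str


def resolve_creation_info(collection_info: CreationInfo,
                          element_info: Optional[CreationInfo]) -> CreationInfo:
    """Return the creation info of a contained Element.

    Omitted info is inherited from the Collection; serialized info must
    match the Collection's info exactly.
    """
    if element_info is None:
        return collection_info
    if element_info != collection_info:
        raise ValueError("contained Element creation info differs from its Collection")
    return element_info
```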

When serializing an element contained in a collection separately from the
collection, the creation info must be known.  It can be known by either
serializing it in the Element or obtaining the collection Element.  The
collection element can be obtained by either accompanying the contained
Element with the explicit IRI of its container or deriving the container
IRI from the Element IRI.
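
Deriving the container IRI could be as simple as the sketch below, again
assuming the hypothetical "<collection IRI>#<local id>" convention; a different
IRI layout would need a different split rule:

```python
from urllib.parse import urldefrag


def derive_container_iri(element_iri: str) -> str:
    """Derive the containing Collection's IRI by dropping the fragment.

    Assumption: contained Elements are named "<collection_iri>#<local-id>".
    """
    base, fragment = urldefrag(element_iri)
    if not fragment:
        raise ValueError("Element IRI has no local id to strip: " + element_iri)
    return base
```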


>                 * If a Collection is in another Collection can it inherit
> from higher levels?
>
I recommend No.  Each collection is created at a specific instant.  It's
technically possible to create an entire collection tree at a single
instant, but the difficulty of dealing with that special case is not
justified by any articulated benefit. Adopt a rule that lower level
collections must be created before a higher level collection can contain
them.  An application can of course propagate down a collection tree and
know all of the properties of all levels in the tree. But a higher
collection does not change the creation info of any Elements it contains,
there are no dependencies among collection IRIs, and the relationship of
each Element to its immediate parent collection is preserved.


>                 * This presumes an Element can only be in a single
> Collection (this makes sense since Collections represent containment not
> aggregation)
>
Yes.  Every Element can be contained in at most one Collection but can be
referenced from arbitrarily many Collections.
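
To illustrate the constraint, here is a small sketch of an index that enforces
"at most one container, any number of references".  The class and method names
are made up for the example:

```python
class ContainmentIndex:
    """Tracks which Collection contains or references each Element IRI."""

    def __init__(self):
        self._container_of = {}   # element IRI -> its single containing Collection IRI
        self._referenced_by = {}  # element IRI -> set of referencing Collection IRIs

    def contain(self, element_iri: str, collection_iri: str) -> None:
        # setdefault keeps the first container; a different one is an error.
        current = self._container_of.setdefault(element_iri, collection_iri)
        if current != collection_iri:
            raise ValueError(element_iri + " is already contained in " + current)

    def reference(self, element_iri: str, collection_iri: str) -> None:
        # References are unrestricted: any number of Collections may point here.
        self._referenced_by.setdefault(element_iri, set()).add(collection_iri)

    def container_of(self, element_iri: str):
        return self._container_of.get(element_iri)

    def referenced_by(self, element_iri: str):
        return set(self._referenced_by.get(element_iri, ()))
```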


> * Can an element be copied from one document (lowercase-d) to another?
>
>                 * If Documents are Collections and Collections are
> containment this implies an element can't be copied.
>
An element is contained in at most one Collection, therefore if it is
copied it is an external Element.  A reference to an external element may
in theory be accompanied by "copiedProperties", but I agree that that
should be prohibited to eliminate the possibility of copy/paste errors (as
well as wasted space.)


>                 * What about use cases where we want to serialize a graph
> of elements from different documents into a single physical stream (e.g.
> wanting to put in a single JSON API response, or collapse into a single
> physical JSON file for ease of transfer into an air-gapped environment,
> etc.)?
>

That is a use case for Bundle (what is currently called "Document") Elements.


> * Depending on the answers above, what is the impact to ExternalMap?
>
>                 * One thing ExternalMap was being used for was that in
> SPDX 2.2 ExternalDocumentRef the integrity was over the document, not
> individual elements, ExternalMap allowed us to store the integrity
> information for a referenced document in the referencing document without
> needing to compute per element integrity.
>

When computing integrity it is necessary to canonicalize the content.  One
step in canonicalizing is converting all element IDs to full IRIs.
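
A sketch of that one step, assuming a NamespaceMap of prefix to namespace and a
default namespace that already ends in its delimiter.  The "prefix:local" ID
syntax is an assumption, and the sketch only recognizes full IRIs that contain
"://", so it would misread schemes such as "urn:":

```python
def expand_id(element_id: str, namespace_map: dict, default_namespace: str) -> str:
    """Expand a short element ID to a full IRI.

    Assumptions: IDs are "prefix:local" or a bare local id, and both the
    namespace_map values and default_namespace end in their delimiter.
    """
    if "://" in element_id:
        return element_id  # already a full IRI
    prefix, sep, local = element_id.partition(":")
    if not sep:
        return default_namespace + element_id  # no prefix: default namespace
    if prefix not in namespace_map:
        raise KeyError("unknown namespace prefix: " + prefix)
    return namespace_map[prefix] + local
```

Whatever the serialization used as prefixes, the integrity value is then
computed over the same full IRIs.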

Dave


