Let me split this question into a few questions for the community:
1. Do we agree that Set, Bag, etc. are specialized forms of Collection?
2. Are all subclasses of Collection, now and forever, going to have the same
   specialized semantics, or could some subclasses be sets and others be bags?
   * If all the subclasses are the same now and we can't imagine any
     possible scenario where that won't be true or desired, then Collection
     should be the specialized form (Set or Bag or ...).
   * If we know some will be different, or we don't want to commit to this
     always being true, then Collection should be the generalized form
     (Collection).
3. Even if Collection is a specialized form of collection, is the general
term Collection more approachable to the broader community (less technical
readers and non-native English speakers)? The specification text would need to
be specific either way.
Regards,
William Bartholomew (he/him)
Principal Security Strategist
Cybersecurity Policy - Digital Diplomacy
From: David Kemp <[email protected]>
Sent: Tuesday, November 23, 2021 5:59 PM
To: William Bartholomew (CELA) <[email protected]>
Cc: [email protected]
Subject: Re: [EXTERNAL] Re: [spdx-tech] ContextualCollection and CONTAINS
Relationship
William,
I agree that:
* Collection is a grouping of Elements
* Package is a grouping of artifacts
* Contains describes one of two physical relationships between artifacts
(the other is "references")
Elements are metadata about artifacts. Artifacts are data, and data can
contain or reference other data. (A paper can have references, an html file
can have links.)
The question raised today is: Is a Collection a Set or a Bag
(https://en.wikipedia.org/wiki/Multiset)?
In other words, does a Collection of Elements count the number of copies of a
file that exist in an artifact, or just record the fact that the file exists?
The members of a grouping can be ordered or unordered, and unique or
non-unique. I'm assuming a Collection is unordered, but the members of an
unordered Collection are either unique (Set) or non-unique (Bag). I'm also
assuming that the members of a Collection of Elements are unique, i.e. the
Collection is a Set. Is that correct?
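To make the Set-versus-Bag question concrete, here is a minimal Python sketch. The element identifiers are made up for illustration and nothing here is taken from the SPDX model; it only shows what each choice would record.

```python
from collections import Counter

# Hypothetical element identifiers; "file-a" occurs twice in the artifact.
observed = ["file-a", "file-b", "file-a"]

as_set = set(observed)      # unique members: records only that a file exists
as_bag = Counter(observed)  # non-unique members: records how many copies exist

print(sorted(as_set))                     # ['file-a', 'file-b']
print(as_bag["file-a"])                   # 2
print(len(as_set), sum(as_bag.values()))  # 2 3
```

If a Collection is a Set, the second copy of file-a is simply not represented; if it is a Bag, the count is part of the Collection's meaning.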
You wrote: "Then you get the benefits of grouping of elements (being able to
refer to a set of elements so you can re-use them) but you avoid the multiple
methods of describing artifacts contained within another artifact."
That is one use case. Another use case is an anonymous grouping of elements
that can't be referred to or re-used. That is the "ferry" example from
physical artifacts - the cars on a ferry are an ephemeral grouping, once they
leave the ferry they are no longer a grouping that can be referred to. That is
also the non-Collection example - a Collection of Elements (not artifacts) can
be referred to, but a non-Collection of Elements is an anonymous ephemeral
grouping of Elements that exists only in the serialized data containing that
grouping.
There are two reasons for non-Collection groupings:
1) Applications that need N random Elements to fully perform their function
don't need an artificial N+1th element - the grouping is meaningless and
doesn't need to be referred to or re-used, it's used only to import N Elements
into the Application.
2) Gödel's incompleteness theorem
(https://plato.stanford.edu/entries/goedel-incompleteness/)
talks about formal systems and proofs within them, but an analogy would treat
the Universe of Elements as a system, in which case that Universe cannot be
described as an Element, because as soon as you did so, the Universe would now
have N+1 Elements, a Collection of N+1 Elements would become a Universe of N+2,
and so on. The use case is again to serialize N Elements and wind up with the
same N Elements after deserialization.
Persistent groupings (Collections) are absolutely a requirement. Ephemeral
groupings (non-Collections / Bundles / Sets) are also a requirement. Both are
supported in the information model, and as I noted, a Bundle / Set is not a
"non-contextual Collection" because it is not a Collection at all. The N+1th
Element does not exist in a Bundle / Set.
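A rough Python sketch of the difference, assuming elements are plain dictionaries with an "id" field. The function names, the "urn:example:" identifiers, and the JSON layout are all hypothetical and not part of the information model.

```python
import json

def serialize_bundle(elements):
    """Ephemeral grouping: just the N elements, with no identity of its own."""
    return json.dumps(sorted(elements, key=lambda e: e["id"]))

def deserialize_bundle(text):
    return json.loads(text)

def make_collection(collection_id, elements):
    """Persistent grouping: an (N+1)th element that refers to its members by id."""
    return {
        "id": collection_id,
        "type": "Collection",
        "members": sorted(e["id"] for e in elements),
    }

elements = [
    {"id": "urn:example:file-a", "type": "File"},
    {"id": "urn:example:file-b", "type": "File"},
]

# Round trip through a bundle: N elements in, the same N elements out.
restored = deserialize_bundle(serialize_bundle(elements))
assert restored == elements

# Describing the grouping as a Collection adds one more element to the universe.
universe = elements + [make_collection("urn:example:collection-1", elements)]
print(len(restored), len(universe))  # 2 3
```

The bundle has no identifier, so nothing can refer to it after deserialization; the Collection can be referred to, at the cost of being one more Element.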
Dave
P.S.: I agree with you and Sean that composition
(https://www.uml-diagrams.org/composition.html)
is the wrong relationship between a logical Collection and its members,
because the members don't existentially depend on the Collection. Normal
association (filled arrow) is the appropriate relationship in the logical
model. This reinforces that while an Artifact (data) can contain other
Artifacts (data), a Collection Element describing a grouping of Artifacts does
not "contain" other Elements. Destruction of the Collection does not destroy
the Elements it references.
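As an illustration only, a short Python sketch of association rather than composition. The class names, the member_ids attribute, and the dictionary used as an element store are assumptions made for the example.

```python
class Element:
    def __init__(self, element_id):
        self.id = element_id

class Collection(Element):
    """A grouping that refers to its members by id instead of owning them."""
    def __init__(self, element_id, member_ids):
        super().__init__(element_id)
        self.member_ids = frozenset(member_ids)

# Hypothetical element store, keyed by id.
store = {}
for element in (Element("file-a"), Element("file-b")):
    store[element.id] = element
store["group-1"] = Collection("group-1", ["file-a", "file-b"])

# Destroying the Collection leaves the referenced Elements in place.
del store["group-1"]
print(sorted(store))  # ['file-a', 'file-b']
```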
On Tue, Nov 23, 2021 at 11:56 AM William Bartholomew (CELA) via
lists.spdx.org <[email protected]> wrote:
The "ah ha" moment for me out of the last meeting was that ContextualCollection
and Package were trying to do double duty, representing both a grouping of
elements (metadata about artifacts) and describing the artifacts contained
within another artifact. This also overlapped with the purpose of the CONTAINS
relationship which is used to describe the artifacts contained within another
artifact.
If we split these purposes and say that:
1. ContextualCollection is a grouping of elements
2. Package is a grouping of artifacts
3. CONTAINS relationship is the only method to describe the artifacts
contained within another artifact
Then you get the benefits of grouping of elements (being able to refer to a set
of elements so you can re-use them) but you avoid the multiple methods of
describing artifacts contained within another artifact.
A couple of examples:
* These are logically equivalent:
  * PackageA (artifact) CONTAINS (relationship) FileA (artifact) and FileB
    (artifact)
  * PackageA (artifact) CONTAINS (relationship) PackageAContents
    (contextualcollection) which includes FileA (artifact) and FileB (artifact)
* So are these:
  * PackageA (artifact) DEPENDS_ON (relationship) PackageB (artifact) and
    PackageC (artifact)
  * PackageA (artifact) DEPENDS_ON (relationship) PackageADependencies
    (contextualcollection) which includes PackageB (artifact) and PackageC
    (artifact)
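One way to picture the equivalence in the first example is the sketch below. It assumes relationships are (source, type, target) triples and that a collection is a named set of members; the expand function is hypothetical and not an SPDX API.

```python
def expand(relationships, collections):
    """Flatten triples, replacing a collection target by its members."""
    flat = set()
    for source, rel_type, target in relationships:
        # A target that is not a known collection stands for itself.
        for member in collections.get(target, {target}):
            flat.add((source, rel_type, member))
    return flat

collections = {"PackageAContents": {"FileA", "FileB"}}

direct = [
    ("PackageA", "CONTAINS", "FileA"),
    ("PackageA", "CONTAINS", "FileB"),
]
via_collection = [("PackageA", "CONTAINS", "PackageAContents")]

# Both forms describe the same two CONTAINS facts once expanded.
assert expand(direct, collections) == expand(via_collection, collections)
```

The DEPENDS_ON example works the same way with a PackageADependencies entry in the collections mapping.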
Another way of thinking about it is that ContextualCollection has meaning
inside the SPDX realm whereas Relationships have meaning in the "real world".
Regards,
William Bartholomew (he/him)
Principal Security Strategist
Cybersecurity Policy - Digital Diplomacy
-=-=-=-=-=-=-=-=-=-=-=-
View/Reply Online (#4267): https://lists.spdx.org/g/Spdx-tech/message/4267
-=-=-=-=-=-=-=-=-=-=-=-