In either representation, the amount of data in the graph is determined by
the use cases. A Package with an elements property can list zero, some, or
many File and Package IRIs in that property. An empty list is equivalent to
creating zero CONTAINS relationship elements. In either representation, the
individual File and Package elements have to be created the first time they
are needed.

If you later look inside the package artifact and find 100 files, you then
create 100 File elements. Then you either create 100 new Relationship
elements or you create 1 new Package element with 100 IRIs in the
"elements" property. So for pros and cons: both the initial-creation and
the bulk-update use cases are more efficient using the property.
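
To make that count concrete, here is a minimal sketch in Python. The
function names are made up for illustration and are not part of any SPDX
tooling, and the model only counts elements created, not bytes.

```python
def new_elements_with_relationships(n_files: int) -> int:
    """n File elements plus one CONTAINS Relationship element per file."""
    return n_files + n_files


def new_elements_with_property(n_files: int) -> int:
    """n File elements plus one new Package element listing all n IRIs."""
    return n_files + 1


n = 100
print(new_elements_with_relationships(n))  # 200
print(new_elements_with_property(n))       # 101
```
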
Now assume the artifact hasn't changed but you discover that you made a
mistake and it really contains 102 files. You have to create two new File
elements, then either two new Relationship elements or one new Package
element with 102 IRIs. Depending on how broad and deep the artifact is
(thousands of files?) and how many times you make mistakes and have to
create new Package elements for a single unchanged artifact, the IRI lists
could take more space than the extra Relationship elements would. The two
approaches are like having two manifests, "Package_version_rev0" and
"Package_version_rev1", versus a single manifest "Package_version" plus a
separate "patch 1". The pros of the property approach include having a
name ('rev1') and a unique IRI for the latest description of the package.
I think that's a significant advantage.
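
The correction trade-off can be put in rough numbers. The sketch below is
a simplified model under stated assumptions, not a measurement: the
function names are hypothetical, corrections are assumed to only add
files, and it counts list entries and relationship elements rather than
bytes.

```python
def iris_listed_with_property(file_counts):
    """Total IRIs written across all Package revisions (rev0, rev1, ...)."""
    return sum(file_counts)


def relationships_with_patches(file_counts):
    """Total CONTAINS Relationship elements: the initial set plus one per
    file added by each later correction (assumes corrections only add files)."""
    total = file_counts[0]
    for previous, current in zip(file_counts, file_counts[1:]):
        total += current - previous
    return total


counts = [100, 102]  # rev0 listed 100 files; the corrected rev1 lists 102
print(iris_listed_with_property(counts))   # 202
print(relationships_with_patches(counts))  # 102
```

A Relationship element carries more than a single IRI (at least its own
identity plus the two endpoints), so where the break-even point falls
depends on the serialization and on how many revisions are made.
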
So if the poll alternatives are:

1) CONTAINS relationship, no elements property
2) elements property, deprecate CONTAINS relationship
3) allow both elements and CONTAINS

I vote for 2. I strongly object to alternative 1 because the initial
creation of a large package would require hundreds or thousands of
relationship elements in addition to the package element.

There are two reasons to vote for 3:

a) There are some use cases where patching a package list with CONTAINS
relationships saves bytes in the element graph. This allows the syntactic
sugar concept, where the elements property can be converted to CONTAINS
relationships and vice versa.
b) There are some use cases where patching a package list accomplishes
something that cannot be expressed by creating a new revision of the
package list. This use case means the syntactic sugar concept is invalid.

My concern with 3 is that I believe there are no use cases where 3b is true
(that is, where the use case graph can't be represented using just the
elements property), yet it is easy to mistakenly create incorrect or
inconsistent CONTAINS relationships. With option 2, tools could always
expand the property into relationships for processing or viewing graphs,
while not being exposed to the mistakes that option 3 enables.
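
As a rough sketch of what that expansion could look like, assuming
elements are plain dictionaries: the field names ("id", "elements",
"from", "to", "relationshipType") and the example IRIs are hypothetical,
chosen only for illustration, and do not reflect any agreed SPDX
serialization.

```python
def expand_elements(package: dict) -> list[dict]:
    """Derive one CONTAINS relationship per IRI in the package's elements list."""
    return [
        {
            "type": "Relationship",
            "relationshipType": "CONTAINS",
            "from": package["id"],
            "to": iri,
        }
        for iri in package.get("elements", [])
    ]


def collapse_relationships(package_id: str, relationships: list[dict]) -> dict:
    """Rebuild a package's elements list from its CONTAINS relationships."""
    elements = [
        r["to"]
        for r in relationships
        if r["relationshipType"] == "CONTAINS" and r["from"] == package_id
    ]
    return {"type": "Package", "id": package_id, "elements": elements}


pkg = {
    "type": "Package",
    "id": "urn:example:pkg-rev1",  # hypothetical IRI
    "elements": ["urn:example:file-a", "urn:example:file-b"],
}
rels = expand_elements(pkg)
# The round trip only holds if no independently authored CONTAINS
# relationships exist, which is the situation option 2 guarantees.
assert collapse_relationships(pkg["id"], rels) == pkg
```
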
Dave
On Tue, Jan 25, 2022 at 10:32 AM William Bartholomew (CELA) <
[email protected]> wrote:
> The challenge is that SPDX doesn’t *require* you to describe the contents
> of the package unless it’s needed for your use cases. I’ve worked in
> several scenarios where the package-level information is sufficient and
> calculating, knowing, and transporting around package content information
> would be unnecessary. When you have broad and deep dependency trees
> (*cough* npm *cough*) forcing the package contents to be part of the
> package element pulls in an immense amount of information which may be
> completely unnecessary. The NTIA’s minimum SBOM elements do not even
> require file-level information, only package-level information.
>
> Additionally, we need to separate the metadata about the package from the
> package itself in this discussion. Yes, if a package’s contents change it
> is a new package. But if we learn new metadata about a package’s contents,
> does that require new package metadata (not new package contents)? I could
> make arguments either way, but given the amount of information that we
> expect will be attached to element ids, I lean towards them not being
> versioned if relationship metadata (including contains) changes. Your
> comment about dependencies focuses on incoming dependencies; outgoing
> dependencies are very similar to files, they are just “delayed”
> resolution files.
>
> Regards,
>
> William Bartholomew (he/him)
> Principal Security Strategist
> Global Cybersecurity Policy – Microsoft
>
> *From:* [email protected] <[email protected]> *On Behalf Of
> *David Kemp via lists.spdx.org
> *Sent:* Tuesday, January 25, 2022 3:31 AM
> *To:* SPDX-list <[email protected]>
> *Subject:* [EXTERNAL] [spdx-tech] Is "contains" special?
>
> The difference between "contains" and every other type of relationship is
> that it is the minimum essential requirement for some types to exist. A
> package cannot be a package without having contents. Its "packageness" is
> defined by the fact that it has contents. The same cannot be said for all
> of the other relationship types - a Package and a BOM can exist without
> patches, variants, ancestors, dependencies, examples, etc.
>
> If any of those other relationship types were essential for a Package or
> BOM to exist, then the model would include "dependency_element",
> "patch_element" properties in addition to the contents ("element")
> property, and the version of the Package would change whenever the
> properties change. The reason dependency is not a property is that a
> Package and its version don't change every time some other Package
> references / uses / becomes dependent on it.
>
> Contains is special and different from all other relationships because if
> the content of a Package changes, it is a different version of the Package.
>
> Dave
>
>
>
View/Reply Online (#4345): https://lists.spdx.org/g/Spdx-tech/message/4345