> Unfortunately, that one is a two-edged sword. If you don't know the type
(e.g. you're trying to look something up by ID) then you need to search
through all the types to find the ID. Conversely, if you want to find
everything of a certain type then grouping by type is beneficial.

 

Good point if you're using the serialization format to represent your
internal storage of the graph.  In all my SPDX software, I use a different
internal representation of the SPDX graph than what is represented in the
serialization format, so this particular situation never comes up.  This
brings up another meta-issue: should we be optimizing the serialization
format for use as an internal storage format, or optimizing it for
deserialization and reserialization?  If the latter, then having arrays of
types is much easier IMHO.  If you go the type property route, all the
deserializers I'm familiar with would require writing custom deserialization
code, whereas the arrays can use off-the-shelf libraries as they are.  I'm
happy to be proven wrong on this point if anyone knows of a deserializer for
JSON (not JSON-LD) that can understand the type property.
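
A minimal Python sketch of the trade-off, for illustration only.  The key
names "files", "packages" and "elements", the two classes, and the plain
string "type" value are all assumptions made up for this example; they are
not taken from any SPDX schema.  Python's json module returns plain
dictionaries either way, so the dataclasses stand in for the typed binding
that an object-mapping library would perform.

```python
import json
from dataclasses import dataclass


@dataclass
class File:
    id: str
    filePurpose: list


@dataclass
class Package:
    id: str
    name: str


# Layout A: one array per type.  Each array maps onto exactly one class,
# so the binding needs no knowledge of a type discriminator.
by_type_doc = json.loads("""
{"files": [{"id": "artifacts:du.c", "filePurpose": ["SOURCE"]}],
 "packages": [{"id": "artifacts:coreutils", "name": "GNU Coreutils"}]}
""")
files = [File(**f) for f in by_type_doc["files"]]
packages = [Package(**p) for p in by_type_doc["packages"]]

# Layout B: one mixed array with a "type" property.  The reader has to
# supply a registry and dispatch on the property by hand.
TYPE_REGISTRY = {"file": File, "package": Package}


def decode_element(raw):
    """Pick the class named by the hypothetical "type" property."""
    fields = dict(raw)
    cls = TYPE_REGISTRY[fields.pop("type")]
    return cls(**fields)


typed_doc = json.loads("""
{"elements": [
  {"type": "file", "id": "artifacts:du.c", "filePurpose": ["SOURCE"]},
  {"type": "package", "id": "artifacts:coreutils", "name": "GNU Coreutils"}]}
""")
elements = [decode_element(e) for e in typed_doc["elements"]]
```

Both layouts end up with the same objects; the difference is only in who
writes the dispatch step.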

 

To your second meta-issue, below are my thoughts based on past experience
maintaining some of the SPDX tooling:

 

*       If we ONLY support JSON-LD, a number of issues go away and the
tooling is vastly simplified.
*       Supporting JSON-LD and the RDF dialects is only slightly more
complicated for the tooling, since JSON-LD can be viewed as another dialect
of RDF.
*       Supporting YAML and/or XML introduces some of the same issues as
supporting a simplified JSON format.  If we support one of these, we might
as well support all of them IMHO.
*       Tag/Value has its own set of (rather large) complexities.
*       Spreadsheets have a similar set of complexities to Tag/Value, but
they are distinct enough that there isn't much leverage in solving both
at the same time.  I will be using spreadsheets myself, so I'll probably
continue to support some type of spreadsheet format in 3.0 if it is at all
feasible.

 

Gary

 

 

From: [email protected] <[email protected]> On Behalf Of
William Bartholomew (CELA) via lists.spdx.org
Sent: Thursday, July 21, 2022 12:20 PM
To: William Bartholomew (CELA) <[email protected]>;
[email protected]; [email protected]; David Kemp
<[email protected]>
Subject: Re: [spdx-tech] Captain of the Ship

 

There's a meta-question here that we need to answer related to JSON
serialization: would SPDX 3.0 support JSON and JSON-LD, just JSON, or just
JSON-LD? I'd lean towards JSON-LD as long as we have a purely mechanical
upgrade process from SPDX 2.x JSON to SPDX 3.x JSON-LD. If we adopt JSON-LD
then a number of serialization design questions already have answers, and it
is still parseable as JSON.

 

 

Regards,

 

William Bartholomew (he/him)

Principal Security Strategist

Global Cybersecurity Policy - Microsoft

 

My working day may not be your working day. Please don't feel obliged to
reply to this e-mail outside of your normal working hours.

 

From: [email protected] <[email protected]> On Behalf Of
William Bartholomew (CELA) via lists.spdx.org
Sent: Thursday, July 21, 2022 12:16 PM
To: [email protected]; [email protected]; David Kemp
<[email protected]>
Subject: [EXTERNAL] Re: [spdx-tech] Captain of the Ship

 

Unfortunately, that one is a two-edged sword. If you don't know the type
(e.g. you're trying to look something up by ID) then you need to search
through all the types to find the ID. Conversely, if you want to find
everything of a certain type then grouping by type is beneficial.

 

I'd lean towards not grouping by type because you can always create a
type->id mapping when deserializing. Given that we'll have more types with
profiles, I think grouping by type will have more downsides than upsides.
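
As a rough sketch of that mapping (Python, illustrative only): the element
shape follows David's array example quoted further down the thread, where
"type" is an object with a single key naming the type.  The function name
build_indexes is hypothetical.

```python
from collections import defaultdict

elements = [
    {"id": "artifacts:gnu-coreutils/v9.1/src/du.c", "type": {"file": {}}},
    {"id": "artifacts:gnu-coreutils/v9.1/src/echo.c", "type": {"file": {}}},
    {"id": "artifacts:gnu-coreutils/v9.1", "type": {"package": {}}},
    {"id": "sboms:gnu-coreutils/v9.1", "type": {"sbom": {}}},
]


def build_indexes(elements):
    """One pass over an ungrouped array yields both lookups."""
    by_id = {}
    ids_by_type = defaultdict(list)
    for element in elements:
        by_id[element["id"]] = element
        # Iterating the "type" object yields its key(s), i.e. the type name.
        for type_name in element["type"]:
            ids_by_type[type_name].append(element["id"])
    return by_id, dict(ids_by_type)


by_id, ids_by_type = build_indexes(elements)
```

After this single pass, lookup by ID and lookup by type are both dictionary
accesses, regardless of how the elements were ordered in the file.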

 

William

 

From: [email protected] <[email protected]> On Behalf Of
Gary O'Neall via lists.spdx.org
Sent: Wednesday, July 20, 2022 10:26 AM
To: [email protected]; David Kemp <[email protected]>; SPDX-list
<[email protected]>
Subject: [EXTERNAL] Re: [spdx-tech] Captain of the Ship

 

One additional consideration that came up in the 2.X discussion was how to
handle the type for the elements. 

In David's example, the type is one of the properties. For 2.X, we
implemented separate arrays for each type. For some of the JSON
serialization libraries, this affords a significant convenience when
deserializing into objects of the same type.

Note that this isn't an issue for JSON-LD or RDF serialization formats which
natively handle types.

Gary

On July 20, 2022 11:57:01 AM CDT, David Kemp <[email protected]> wrote:

We discussed whether elements should be serialized as maps or arrays, and I
provided an example map serialization for discussion.  The two serialization
formats are equivalent, in that they deserialize to identical logical nodes.
But the discussion highlighted some practical distinctions:

1) Members of a map are pre-indexed by IRI, while an array must be searched
member by member to find the element with a specified IRI.  Because looking
up element references is a common operation, the first step after receiving
an array of elements would be to build an index from IRI to element position
in the array.

2) In order to find the captain of a ship with 1000 rooms, you'd need to
search each room to look for someone wearing a captain's uniform.  Or in
order to find an SBOM element in an array of 1000 elements, you'd need to
examine all elements to determine which one(s) are the SBOM type.  That's
true whether the 1000 elements are serialized as a map or an array.  BUT, if
the 1000 elements were serialized as a map AND a rootElements property
existed to list the SBOM IRI(s), no searching is required, the map points
directly to the captain.

Conclusion: serialization as a map doesn't help finding the captain if the
captain's ID isn't specified along with the map.  But if the captain's ID is
specified, map serialization is hugely more efficient than having to search
1000 elements in an array to find that ID.
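
The two cases above can be sketched in Python as follows.  This is an
illustration under assumptions: the element IDs are invented, and the map
layout (IRI as key, element body as value, under "elementValues") is a guess
at the shape of the map form, not a quote from the earlier example.

```python
# 999 file elements plus one SBOM element, mirroring the 1000-room ship.
elements = [
    {"id": f"artifacts:file{n}", "type": {"file": {}}} for n in range(999)
]
elements.append({"id": "sboms:example", "type": {"sbom": {}}})

# Point 1, array form: build an index from IRI to array position once,
# so later reference lookups do not rescan the array.
position_by_id = {e["id"]: pos for pos, e in enumerate(elements)}


# Point 2 without rootElements: every element has to be examined,
# whether the container is a map or an array.
def find_by_type(elements, type_name):
    return [e["id"] for e in elements if type_name in e["type"]]


# Point 2 with a map plus rootElements: the listed IRIs are direct keys.
document = {
    "rootElements": ["sboms:example"],
    "elementValues": {e["id"]: {"type": e["type"]} for e in elements},
}
roots = [document["elementValues"][iri] for iri in document["rootElements"]]
```

The scan in find_by_type touches all 1000 elements, while the rootElements
lookup touches only the entries it names.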

In any case, here is the JSON-serialized array equivalent of the previous
map example, along with listing the 5 default properties at the top level
instead of nested in a "defaults" property:

{
  "namespace": "urn:acme.dev:",
  "createdBy": ["identities:fred"],
  "created": "2022-04-05T22:00:00Z",
  "specVersion": "3.0",
  "profiles": ["Core", "Software"],
  "dataLicense": "CC0-1.0",
  "elementValues": [
    {
      "id": "artifacts:gnu-coreutils/v9.1/src/du.c",
      "type": {
        "file": {
          "filePurpose": ["APPLICATION", "SOURCE"]
        }
      }
    },
    {
      "id": "artifacts:gnu-coreutils/v9.1/src/echo.c",
      "type": {
        "file": {
          "filePurpose": ["APPLICATION", "SOURCE"]
        }
      }
    },
    {
      "id": "artifacts:gnu-coreutils/v9.1",
      "type": {
        "package": {
          "packagePurpose": ["APPLICATION", "SOURCE"],
          "downloadLocation": "http://mirror.rit.edu/gnu/coreutils/coreutils-9.1.tar.gz",
          "homePage": "https://www.gnu.org/software/coreutils/"
        }
      },
      "name": "GNU Coreutils"
    },
    {
      "id": "relationships:gnu-coreutils/v9.1",
      "type": {
        "relationship": {
          "relationshipType": "CONTAINS",
          "from": "urn:acme.dev:artifacts:gnu-coreutils/v9.1",
          "to": [
            "artifacts:gnu-coreutils/v9.1/src/du.c",
            "artifacts:gnu-coreutils/v9.1/src/echo.c"
          ]
        }
      }
    },
    {
      "id": "identities:fred",
      "type": {
        "actor": {}
      },
      "identifiedBy": [{"email": "[email protected]"}]
    },
    {
      "id": "sboms:gnu-coreutils/v9.1",
      "type": {
        "sbom": {
          "elements": [
            "artifacts:gnu-coreutils/v9.1/src/du.c",
            "artifacts:gnu-coreutils/v9.1/src/echo.c",
            "artifacts:gnu-coreutils/v9.1",
            "relationships:gnu-coreutils/v9.1",
            "identities:fred"
          ]
        }
      }
    }
  ]
}

Regards,
David

-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.





-=-=-=-=-=-=-=-=-=-=-=-
View/Reply Online (#4672): https://lists.spdx.org/g/Spdx-tech/message/4672
-=-=-=-=-=-=-=-=-=-=-=-

