Re: [spdx-tech] SPDX data model

Jeremiah C. Foster Thu, 23 Apr 2020 09:33:34 -0700

Would it make sense to create HTML entity documents in the same fashion as the 
W3C does and publish them at https://schema.org/?



On Thu, 2020-04-23 at 08:32 -0700, Gary O'Neall wrote:
… or produce the data model from the OWL document.

The OWL document currently has the text, relationships, and cardinality.

Here’s a view of the information in the OWL document nicely rendered in HTML: 
https://spdx.org/rdf/terms/

Either way, I very much agree on having a single source.  That would reduce a 
lot of copy/paste errors (and save me quite a bit of time during SPDX releases).

Gary

From: [email protected] <[email protected]> On Behalf Of David 
Kemp
Sent: Thursday, April 23, 2020 8:22 AM
To: Zavras, Alexios <[email protected]>
Cc: [email protected]
Subject: Re: [spdx-tech] SPDX data model

I agree wholeheartedly and enthusiastically on the benefits of a single source 
of truth.

Ultimately it would be nice to have a machine-readable single source of truth 
(the data model) for both text information and OWL diagrams in the spec. That 
would require tooling to extract what is needed from the data model to produce 
the OWL.

On Thu, Apr 23, 2020 at 10:55 AM Zavras, Alexios <[email protected]> wrote:
Thanks for this, Dave.

You are absolutely correct that information on multiplicity, etc. is crucial 
and necessary for our data model. Rest assured that all this information is 
already in the specification, where each field has information describing 
whether it’s mandatory, its cardinality, value restrictions that might exist, 
etc.

What I was trying to say (unclearly, definitely) is that this information is 
not shown *in this diagram*, which only shows the different OWL classes and 
basic relationships between them. There is no intent for this diagram to be the 
complete source of truth for the whole SPDX spec.

Do you think cardinality needs to be shown in the diagram? (I agree it would 
be useful.)
More generally, do you think it’s necessary to include all the information of 
the specification? I was not planning to add information like “checksum 
values must be lower-case hex strings”; I only used “string” in the diagram.
And then we have to think how to represent conditions like “this field is 
mandatory if filesAnalyzed value is True, but must not be present if 
filesAnalyzed is False”…
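
To make the two kinds of rule mentioned above concrete, here is a minimal
Python sketch. It is only an illustration under stated assumptions: the
function names and the parameter `dependent_value` are hypothetical stand-ins
(the message does not say which field depends on filesAnalyzed), and this is
not how the specification or any SPDX tool expresses these constraints.

```python
import re
from typing import List, Optional

# Assumption: "lower-case hex string" means one or more of 0-9 and a-f.
_LOWER_HEX = re.compile(r"[0-9a-f]+")


def is_lower_hex(value: str) -> bool:
    """Return True if value consists only of lower-case hex digits."""
    return _LOWER_HEX.fullmatch(value) is not None


def check_conditional(files_analyzed: bool,
                      dependent_value: Optional[str]) -> List[str]:
    """Check a field that is mandatory when files_analyzed is True
    and must be absent when files_analyzed is False.

    dependent_value is a hypothetical placeholder for "this field".
    Returns a list of human-readable violations (empty if valid).
    """
    problems: List[str] = []
    if files_analyzed and dependent_value is None:
        problems.append("field is mandatory when filesAnalyzed is True")
    if not files_analyzed and dependent_value is not None:
        problems.append("field must not be present when filesAnalyzed is False")
    return problems
```

A rule of the first kind is a restriction on a single value, while the second
kind relates two fields, which is why it is harder to show in a class diagram.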

-- zvr

From: [email protected] <[email protected]> On Behalf Of David Kemp
Sent: Thursday, 23 April, 2020 16:33
To: Zavras, Alexios <[email protected]>
Cc: [email protected]
Subject: Re: [spdx-tech] SPDX data model

Hi Alexios,

I don't have an ontology background, so forgive me if this is a silly question, 
but "Classes and Attributes" sounds like they belong in the domain of software 
design, while "Data" does not depend on how the software used to write and read 
the data is implemented. Software could be written in C or Java or Python, 
using classes or not using them, and as long as the data creator and the data 
consumer use the same data they are interoperable.

So I'm particularly confused by "There is no information like “1:N” on purpose; 
it’s a little confusing to have a class diagram also look like an 
entity-relationship one."

When defining a data structure, it makes a big difference whether there is a 
single required element (1..1), a single optional element (0..1), or an array 
of elements (1..*) - multiplicity is a critical part of the data model that 
cannot be left out. There might be a place for class diagrams and 
entity-relationship diagrams somewhere in SPDX, but the data model is the part 
that must be defined unambiguously in order for writers and readers to agree on 
the structure of an SPDX document.
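
As an illustration of why the three multiplicities lead to different data
structures, here is a small Python sketch. The class `Record` and its field
names are hypothetical and are not taken from the SPDX specification; the
point is only that 1..1, 0..1 and 1..* each map to a different type and a
different validity check.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    """Hypothetical record showing three multiplicities."""
    name: str                      # 1..1: exactly one value, required
    items: List[str]               # 1..*: a list with at least one element
    comment: Optional[str] = None  # 0..1: a single value that may be absent

    def __post_init__(self) -> None:
        # The type List[str] alone would also allow an empty list (0..*),
        # so the lower bound of 1..* has to be checked explicitly.
        if not self.items:
            raise ValueError("items has multiplicity 1..*: it must not be empty")


ok = Record(name="example", items=["a", "b"])
```

A writer and a reader that disagree on any of these three choices would
produce or expect differently shaped data, even if both use the same field
names.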

Regards,
Dave




On Wed, Apr 22, 2020 at 5:14 AM Alexios Zavras <[email protected]> wrote:
Hi all,

As some of you may know, we have been using an ontology (expressed in OWL) to 
represent the SPDX classes and attributes; we also have a graphical 
representation of the SPDX data model. These reside, respectively, in the 
directories “ontology” and “model” of the spdx-spec repository.
Although these essentially represent the same thing, for historical reasons 
they were actually independent “sources of truth”.

We all know the disadvantages of such an approach (duplicated maintenance 
effort, diverging data, …).
I worked, therefore, on producing a graphical representation from the OWL 
ontology. I actually generate PlantUML, which then generates the diagram using 
GraphViz.
Gary was kind enough to review the initial results and thought we might 
consider it for the future.

I attach a couple of generated files.

Quick notes:

  *   The attached data is a DRAFT of an interim state. Do not use for any 
serious purpose. Consider it only Proof of Concept.
  *   The generation is still a hack and not a very polished solution. So, the 
distinction between “class” and “enumeration” (noted by “C” or “E” in a circle 
in the diagram) is not very robust. Unfortunately, in OWL everything is a class…
  *   The only arrows (relationships) depicted are:

     *   Solid lines with white triangle: the typical way to represent 
“subclass of”
     *   Dotted lines with arrows: a class points to the class of an element

  *   There is no information like “1:N” on purpose; it’s a little confusing to 
have a class diagram also look like an entity-relationship one.
  *   You can see the (deprecated) “Review” class in the upper right of the 
diagram, not connected to anything. This will be fixed in the ontology.
  *   You may also notice a brand new “LicenseExpressionDRAFT” class, 
abstracting away all details about licenses, operators, license list or 
LicenseRef entries, etc. I felt this level of detail inside license expressions 
confused the “SPDX-info” side of things. We probably can have a separate 
diagram about license expressions, without confusing the “core” classes and 
data model.
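
The generation step and the two arrow kinds listed above can be sketched
roughly as follows. This is not the actual generator: the function
`to_plantuml` is hypothetical, the input is a plain list of name pairs rather
than a parsed OWL document, and the class names in the example (Element,
Package, Checksum) are illustrative only. The sketch assumes PlantUML's
`<|--` notation for "subclass of" and `..>` for a dotted arrow.

```python
from typing import Iterable, List, Tuple


def to_plantuml(subclass_of: Iterable[Tuple[str, str]],
                refers_to: Iterable[Tuple[str, str]]) -> str:
    """Build PlantUML class-diagram text from two kinds of relationship.

    subclass_of: (child, parent) pairs, drawn as a solid line with a
                 white triangle at the parent.
    refers_to:   (source, target) pairs, drawn as a dotted line with an
                 arrow pointing at the class of the element.
    """
    lines: List[str] = ["@startuml"]
    for child, parent in subclass_of:
        lines.append(f"{parent} <|-- {child}")
    for source, target in refers_to:
        lines.append(f"{source} ..> {target}")
    lines.append("@enduml")
    return "\n".join(lines)


# Illustrative input only; these pairs are not read from the ontology.
diagram = to_plantuml(
    subclass_of=[("Package", "Element")],
    refers_to=[("Package", "Checksum")],
)
print(diagram)
```

The resulting text could then be passed to PlantUML, which uses GraphViz for
layout, to render the diagram.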

May I please ask you to review the graphic, tell me whether it’s useful and 
point out other information you would like to see included.
If there are no major disadvantages, we will consider retiring the 
(independent) “model” graphic.

-- zvr


Intel Deutschland GmbH
Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de<http://www.intel.de>
Managing Directors: Christin Eisenschmid, Gary Kershaw
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928




View/Reply Online (#3872): https://lists.spdx.org/g/Spdx-tech/message/3872
