Would it make sense to create HTML entity documents in the same fashion as the W3C does and publish them at https://schema.org/?
On Thu, 2020-04-23 at 08:32 -0700, Gary O'Neall wrote:

… or produce the data model from the OWL document. The OWL document currently has the text, relationships, and cardinality.

Here’s a view of the information in the OWL document nicely rendered in HTML: https://spdx.org/rdf/terms/

Either way, I very much agree on having a single source. That would reduce a lot of copy/paste errors (and save me quite a bit of time during SPDX releases).

Gary

From: [email protected] <[email protected]> On Behalf Of David Kemp
Sent: Thursday, April 23, 2020 8:22 AM
To: Zavras, Alexios <[email protected]>
Cc: [email protected]
Subject: Re: [spdx-tech] SPDX data model

I agree wholeheartedly and enthusiastically on the benefits of a single source of truth. Ultimately it would be nice to have a machine-readable single source of truth (the data model) for both text information and OWL diagrams in the spec. That would require tooling to extract what is needed from the data model to produce the OWL.

On Thu, Apr 23, 2020 at 10:55 AM Zavras, Alexios <[email protected]> wrote:

Thanks for this, Dave.

You are absolutely correct that information on multiplicity, etc. is crucial and necessary for our data model. Rest assured that all this information is already in the specification, where each field has information describing whether it’s mandatory, its cardinality, value restrictions that might exist, etc.

What I was trying to say (unclearly, definitely) is that this information is not shown *in this diagram*, which only shows the different OWL classes and basic relationships between them. There is no intent for this diagram to be the complete source of truth for the whole SPDX spec.

Do you think that cardinality is necessary to be shown in the diagram? (I agree it would be useful). More generally, do you think it’s necessary to have all the information of the specification included?
I was not planning to add information like “checksum values must be in lower-case hex strings”; I only used “string” in the diagram. And then we have to think how to represent conditions like “this field is mandatory if filesAnalyzed value is True, but must not be present if filesAnalyzed is False”…

-- zvr

From: [email protected] <[email protected]> On Behalf Of David Kemp
Sent: Thursday, 23 April, 2020 16:33
To: Zavras, Alexios <[email protected]>
Cc: [email protected]
Subject: Re: [spdx-tech] SPDX data model

Hi Alexios,

I don't have an ontology background, so forgive me if this is a silly question, but "Classes and Attributes" sounds like they belong in the domain of software design, while "Data" does not depend on how software used to write and read the data is implemented. Software could be written in C or Java or Python, using classes or not using them, and as long as the data creator and the data consumer use the same data they are interoperable.

So I'm particularly confused by "There is no information like “1:N” on purpose; it’s a little confusing to have a class diagram also look like an entity-relationship one." When defining a data structure, it makes a big difference whether there is a single required element (1..1), a single optional element (0..1), or an array of elements (1..*) - multiplicity is a critical part of the data model that cannot be left out.

There might be a place for class diagrams and entity-relationship diagrams somewhere in SPDX, but the data model is the part that must be defined unambiguously in order for writers and readers to agree on the structure of an SPDX document.
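
The three kinds of constraint raised in this exchange (multiplicity such as 1..1, 0..1 and 1..*, a value restriction such as lower-case hex checksums, and presence conditioned on filesAnalyzed) can all be written down as checkable data. The following is only a minimal Python sketch under stated assumptions: the field table, the `validate` helper and the name `conditionalField` are hypothetical and are not taken from the SPDX specification or its tooling.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Multiplicity:
    """Inclusive bounds on how many values a field may carry."""
    lower: int
    upper: Optional[int]  # None means unbounded, i.e. "*"

    def allows(self, count: int) -> bool:
        return count >= self.lower and (self.upper is None or count <= self.upper)


REQUIRED = Multiplicity(1, 1)        # 1..1
OPTIONAL = Multiplicity(0, 1)        # 0..1
ONE_OR_MORE = Multiplicity(1, None)  # 1..*

# Hypothetical field table, for illustration only (not the SPDX schema).
FIELDS: Dict[str, Multiplicity] = {
    "name": REQUIRED,
    "homepage": OPTIONAL,
    "checksums": ONE_OR_MORE,
}

LOWER_HEX = re.compile(r"[0-9a-f]+")


def validate(record: dict) -> List[str]:
    """Return a list of human-readable violations; empty means valid."""
    errors = []
    # Multiplicity: every field is stored as a list of values.
    for field, mult in FIELDS.items():
        count = len(record.get(field, []))
        if not mult.allows(count):
            errors.append(f"{field}: {count} value(s) outside allowed range")
    # Value restriction: checksum values must be lower-case hex strings.
    for value in record.get("checksums", []):
        if not LOWER_HEX.fullmatch(value):
            errors.append(f"checksums: {value!r} is not lower-case hex")
    # Conditional presence keyed on filesAnalyzed. "conditionalField" is a
    # placeholder name, and defaulting filesAnalyzed to True is an assumption.
    analyzed = record.get("filesAnalyzed", True)
    present = bool(record.get("conditionalField"))
    if analyzed and not present:
        errors.append("conditionalField: mandatory when filesAnalyzed is True")
    if not analyzed and present:
        errors.append("conditionalField: must be absent when filesAnalyzed is False")
    return errors
```

Calling `validate` on a record returns an empty list when every rule holds. Keeping the rules in a table like `FIELDS` is one way a single machine-readable source could feed both the text of a specification and a diagram.
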
Regards,
Dave

On Wed, Apr 22, 2020 at 5:14 AM Alexios Zavras <[email protected]> wrote:

Hi all,

As some of you may know, we have been using an ontology (expressed in OWL) to represent the SPDX classes and attributes; we also have a graphical representation of the SPDX data model. These reside, respectively, in the directories “ontology” and “model” of the spdx-spec repository.

Although these essentially represent the same thing, for historical reasons they were actually independent “sources of truth”. We all know the disadvantages of such an approach (duplicate effort in maintaining, diverging data, …).

I worked, therefore, on producing a graphical representation from the OWL ontology. I actually generate PlantUML, which then generates the diagram using GraphViz. Gary was kind enough to review the initial results and thought we might consider it for the future.

I attach a couple of generated files.

Quick notes:

* The attached data is a DRAFT of an interim state. Do not use for any serious purpose. Consider it only Proof of Concept.
* The generation is still a hack and not a very polished solution. So, the distinction between “class” and “enumeration” (noted by “C” or “E” in a circle in the diagram) is not very robust. Unfortunately, in OWL everything is classes…
* The only arrows (relationships) depicted are:
  * Solid lines with white triangle: the typical way to represent “subclass of”
  * Dotted lines with arrows: a class points to the class of an element
* There is no information like “1:N” on purpose; it’s a little confusing to have a class diagram also look like an entity-relationship one.
* You can see the (deprecated) “Review” class in the upper right of the diagram, not connected to anything. This will be fixed in the ontology.
* You may also notice a brand new “LicenseExpressionDRAFT” class, abstracting away all details about licenses, operators, license list or LicenseRef entries, etc.
I felt this level of detail inside license expressions confused the “SPDX-info” side of things. We probably can have a separate diagram about license expressions, without confusing the “core” classes and data model.

May I please ask you to review the graphic, tell me whether it’s useful and point out other information you would like to see included. If there are no major disadvantages, we will consider retiring the (independent) “model” graphic.

-- zvr

Intel Deutschland GmbH
Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de
Managing Directors: Christin Eisenschmid, Gary Kershaw
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928
View/Reply Online (#3872): https://lists.spdx.org/g/Spdx-tech/message/3872
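
The generation step described in the original message (ontology in, PlantUML text out, with “subclass of” drawn as a solid line with a white triangle and class references drawn as dotted arrows) can be illustrated with a small sketch. This is not the script used in the thread: the `to_plantuml` function is hypothetical, the class names in the usage note are placeholders, and extracting the relationships from the OWL file is assumed to have happened already.

```python
from typing import Iterable, Tuple


def to_plantuml(
    classes: Iterable[str],
    enumerations: Iterable[str],
    subclass_of: Iterable[Tuple[str, str]],
    references: Iterable[Tuple[str, str, str]],
) -> str:
    """Render already-extracted ontology relationships as PlantUML source.

    subclass_of holds (child, parent) pairs.
    references holds (source_class, property_name, target_class) triples.
    """
    lines = ["@startuml"]
    # PlantUML marks classes with a circled "C" and enums with a circled "E".
    for name in sorted(classes):
        lines.append(f"class {name}")
    for name in sorted(enumerations):
        lines.append(f"enum {name}")
    # "<|--" draws a solid line ending in a white triangle at the parent.
    for child, parent in subclass_of:
        lines.append(f"{parent} <|-- {child}")
    # "..>" draws a dotted arrow from a class to the class of its element.
    for source, prop, target in references:
        lines.append(f"{source} ..> {target} : {prop}")
    lines.append("@enduml")
    return "\n".join(lines)


# Placeholder names for illustration; not read from the SPDX ontology.
example = to_plantuml(
    classes=["Element", "Package", "Checksum"],
    enumerations=["ChecksumAlgorithm"],
    subclass_of=[("Package", "Element")],
    references=[("Package", "checksum", "Checksum"),
                ("Checksum", "algorithm", "ChecksumAlgorithm")],
)
```

The resulting text can be passed to PlantUML, which uses GraphViz for layout as the message notes. Cardinality is deliberately absent here, matching the draft diagram; PlantUML also accepts quoted multiplicity labels on relationship ends if that information is wanted later.
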
