Thanks Alexios!

 

I just updated the document with a few additional thoughts.  I added them in
suggestion mode.

 

For the call tomorrow, I was thinking of the following structure:

 

*       Agree on the problem statement
*       Discuss alternatives
*       Agree on the criteria we want to use to choose
*       Compare the alternatives based on the criteria

 

Looking forward to the discussion.

 

Gary

 

From: [email protected] <[email protected]> On Behalf Of
Alexios Zavras
Sent: Monday, December 7, 2020 8:24 AM
To: [email protected]
Subject: [spdx-tech] modeling SPDX

 

Hi all,

 

In tomorrow's SPDX Tech call, we said we would discuss how to "model" the
SPDX information - especially looking towards the next version of the
specification.

To save us all some time in introductory discussion, I wrote down some
thoughts about the problem and solutions:
https://docs.google.com/document/d/13PojpaFPdoKZ9Gyh_DEY-Rp7lldyMbSiGE3vCRQhR9M/edit?usp=sharing

 

Please read, comment, and edit/update!

Looking forward to our call tomorrow to discuss all this.

 

I paste a current snapshot:

 

Modeling SPDX 

What is the problem?

There are multiple sources for what is defined to be "SPDX". Right now,
there are:

*       The specification text
*       The object model
*       The reference Java code
*       Code for Python and Go libraries

These sources are not always consistent with one another.

What do we want?

The ideal outcome would be a "single source of truth". This could
subsequently be used to automagically generate all the required versions.

What will be modeled?

Based on the current specification, examples of information that should be
modeled may include:

1.      FileName is a field in the file information section
2.      FileName in Tag-Value is Property spdx:fileName in class spdx:File
in RDF
3.      FileName value is a string
4.      FilesAnalyzed is a boolean value
5.      Created has format YYYY-MM-DDThh:mm:ssZ
6.      FileName is a mandatory field and can only appear once
7.      PackageVerificationCode has to appear once if FilesAnalyzed is true
but must not appear if FilesAnalyzed is false
8.      ExternalRef has format "category type locator" where category is one
of SECURITY, PACKAGE-MANAGER, PERSISTENT-ID, OTHER and type is one of ...
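
To make the list above concrete, here is a minimal sketch in Python of how items 5, 7 and 8 might be written as machine-checkable rules. The function names are hypothetical and chosen only for illustration; they are not taken from any existing SPDX library. The list of valid ExternalRef types is left open in the text above, so the sketch only checks the category.

```python
import re
from typing import Optional

# Item 5: Created has format YYYY-MM-DDThh:mm:ssZ
CREATED_RE = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z")

# Item 8: the categories named in the list above
EXTERNAL_REF_CATEGORIES = {"SECURITY", "PACKAGE-MANAGER", "PERSISTENT-ID", "OTHER"}


def check_created(value: str) -> bool:
    """True if the value matches the Created timestamp shape (digits only, no range checks)."""
    return CREATED_RE.fullmatch(value) is not None


def check_external_ref(value: str) -> bool:
    """True if the value is 'category type locator' with a known category.

    The type is not validated here, because the list of types is not given above.
    """
    parts = value.split()
    return len(parts) == 3 and parts[0] in EXTERNAL_REF_CATEGORIES


def check_verification_code(files_analyzed: bool, code: Optional[str]) -> bool:
    """Item 7: the code must be present iff FilesAnalyzed is true."""
    if files_analyzed:
        return code is not None
    return code is None
```

Whatever modeling framework is chosen would need to express at least this kind of rule: a value format, an enumeration, and a constraint that spans two fields.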

 

What are the alternatives?

We're looking for something to "model" SPDX. 

Wikipedia: A data model is an abstract model that organizes elements of data
and standardizes how they relate to one another and to the properties of
real-world entities. Related but not equivalent concepts: information model,
object model.

Unsurprisingly, there are lots of existing modeling frameworks already.

English text

Plain prose, with or without added structure. It should be noted that the
current specification is a mix of English text, some structure (sections for
Cardinality, Format, etc.), some computer-like notation (e.g.
pseudo-grammar-like constructs), some example code, some tables, and
probably more.

OWL

The Web Ontology Language (OWL 2) is a knowledge representation language for
authoring ontologies. Ontologies are a formal way to describe taxonomies and
classification networks, essentially defining the structure of knowledge for
various domains: the nouns representing classes of objects and the verbs
representing relations between the objects. 

UML

Unified Modeling Language (UML) is a standardized general-purpose modeling
language in the field of software engineering. It is a graphical language
for visualizing, specifying, constructing, and documenting the artifacts of
a software-intensive system. UML offers a mix of functional models, data
models, and database models. UML has been approved as an ISO standard.

Class definition code

One can also start from actual computer code, written in an object-oriented
style, that implements classes and objects.
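
As a rough illustration of this alternative, the sketch below uses a Python dataclass as the model and attaches the Tag-Value name and the RDF property to each field, so that both representations are derived from one definition. The class layout, the metadata keys ("tag", "rdf") and the helper functions are assumptions made for this example, not part of any existing SPDX code.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class File:
    # Mandatory string field; metadata keys "tag" and "rdf" are hypothetical.
    file_name: str = field(metadata={"tag": "FileName", "rdf": "spdx:fileName"})


def to_tag_value(obj) -> str:
    """Serialize a model object as Tag-Value lines using the field metadata."""
    return "\n".join(
        f"{f.metadata['tag']}: {getattr(obj, f.name)}" for f in fields(obj)
    )


def rdf_properties(cls) -> List[str]:
    """List the RDF property names declared on a model class."""
    return [f.metadata["rdf"] for f in fields(cls)]


if __name__ == "__main__":
    print(to_tag_value(File("./src/main.c")))
    print(rdf_properties(File))
```

The same metadata could in principle drive generation of documentation or of bindings in other languages, which is the "single source of truth" idea described earlier.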

Pros and cons

In random order, for now:

*       UML is widely used and is an ISO standard
*       UML is a graphical language, so not ideal for git-based team
collaboration
*       UML diagrams can be generated from textual descriptions with PlantUML
*       Computer code in any language (Python, Java, ...) is not readable by
everyone
*       OWL is not so well known
*       UML can be edited by Open Source tools
*       OWL can be edited by Open Source tools
*       OWL can express a superset of what UML can express
*       Python is better known than UML or OWL
*       English is understood by everyone
*       English natural language text may not be rigorous enough
*       OWL can combine information from different sources/definitions
(e.g., URI type)

 

 

-- zvr

 
