Hi David and SPDX tech team,

 

Thanks for the information and recommendations.

 

I’m halfway through reading the OASIS document, but I thought I would share 
some preliminary thoughts prior to our weekly Tuesday call (see 
https://spdx.dev/participate/tech/ for meeting coordinates).

 

JADN looks quite promising.  I’m not sure if it will capture the semantic 
relationships that OWL provides, but that may not be necessary for the SPDX use 
case.  I think it would be an interesting exercise to see how JADN could be 
applied to SPDX.  I see that the JADN tools 
<https://github.com/davaya/jadn-pypkg> already support documentation use 
cases and conversion to JSON schemas, with XSD planned.  Adding an OWL 
conversion should be possible.

 

I agree with the recommendation to use UML for the datatype constraints.

 

I also think we should consider JADN as one of the alternatives.

 

Looking forward to today’s call.


Cheers,
Gary

 

From: [email protected] <[email protected]> On Behalf Of David 
Kemp
Sent: Monday, December 14, 2020 10:16 AM
To: SPDX-list <[email protected]>; Considine, Toby 
<[email protected]>; Duncan Sparrell <[email protected]>
Subject: Re: [spdx-tech] modeling SPDX

 

The modeling requirement is to have a "single source of truth".

1) That implies that that truth must be translatable faithfully and reliably 
into multiple implementation formats.
2) #1 implies that there must be algorithms to perform those translations.
3) #2 implies that tooling is used to validate the translation algorithms and 
perform the translations.

I have been working on an Information Modeling language based on UML and tools 
for translating IMs into JSON Schema format, as well as text and table 
representations useful for writing standards such as SPDX.  Additional 
translations, particularly to and from OWL, would demonstrate feasibility of 
that approach.  That way either OWL or a UML-based IM could be used as the 
single source of truth, and Python code and SPDX documents in multiple formats 
(XML, JSON, tag-value) could be generated from it.
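
A minimal sketch of the "single source, multiple translations" idea in Python. 
It assumes a hypothetical dictionary-based model, which is not JADN syntax and 
not any SPDX format. The MODEL layout, the function names, and the fileComment 
entry are illustrative assumptions; only fileName is taken from this thread.

```python
import json

# Hypothetical single-source model: field name -> (type, required, description).
# Illustration only; this is neither JADN nor an SPDX serialization.
MODEL = {
    "fileName": ("string", True, "Name of the file"),
    "fileComment": ("string", False, "Free-form comment (assumed field)"),
}

def to_json_schema(model):
    """Translate the model into a JSON Schema object definition."""
    return {
        "type": "object",
        "properties": {name: {"type": typ} for name, (typ, _, _) in model.items()},
        "required": [name for name, (_, req, _) in model.items() if req],
    }

def to_text_table(model):
    """Translate the same model into a plain-text table for a specification."""
    rows = ["Field | Type | Required | Description"]
    for name, (typ, req, desc) in model.items():
        rows.append(f"{name} | {typ} | {'yes' if req else 'no'} | {desc}")
    return "\n".join(rows)

schema = to_json_schema(MODEL)
table = to_text_table(MODEL)
print(json.dumps(schema, indent=2))
print(table)
```

Both outputs are derived from MODEL, so a change to the model shows up in the 
schema and in the documentation table without editing either by hand.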

I've been thinking about using SPDX as a test case for that approach, but 
haven't done much detailed modeling yet. The first section of a paper 
describing the approach is available at 
https://docs.google.com/document/d/169L5VQDiPVNREbVuVIh03UVX0KMCi-I9Dz6YtI4fhg8,
 and a first working draft of JADN, an IM language implementing the UML-based 
approach, is available from OASIS at 
https://github.com/oasis-tcs/openc2-jadn/blob/working/jadn-v1.0-wd01.md.  A 
public review of that WD is coming up; if you'd like to participate in the 
review please let me know.

So my recommendation is:
* The single source of truth should capture the datatype constraints defined by 
UML
* UML does not have a unique file format, but any file format that can capture 
those constraints can be considered as the source (which rules out things like 
English prose and Python code - those are derived representations, not the 
source.)

Dave

 

 

On Tue, Dec 8, 2020 at 6:46 PM Alberto Pianon <[email protected]> wrote:

I've never tried it in languages other than Python, but I see that JSON Schema 
is supported in most programming languages.

On 2020-12-09 00:41, Nisha Kumar wrote:

 

Hi Alberto,

 

I think specifying conditional keys in the markdown document (XML, RDF, JSON, 
YAML, etc) sounds like a neat idea! Of course, now we end up having to 
implement interpreters for any scheme we come up with. 

 

-n

 

From: <[email protected]> on behalf of "Alberto Pianon via lists.spdx.org" 
<[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, December 8, 2020 at 2:32 PM
To: "[email protected]" <[email protected]>
Subject: Re: [spdx-tech] modeling SPDX

 

Hi all,

have you ever considered using JSON schema?

It has some nice features such as conditional dependencies, and you can also 
use an if-then-else syntax, which could be useful to model FilesAnalyzed etc.

See some examples here: https://stackoverflow.com/a/38781027
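
As a rough illustration, here is how the FilesAnalyzed rule might look with the 
JSON Schema "if"/"then"/"else" keywords, written as a Python dict. The 
lower-camel property names are assumptions, and is_valid() is a toy checker 
written only for this sketch: it handles just the handful of keywords used 
here. In practice a complete JSON Schema validator would be used instead.

```python
# Assumed schema: packageVerificationCode must be absent when filesAnalyzed
# is false, and must be present otherwise (true or omitted).
SCHEMA = {
    "type": "object",
    "properties": {
        "filesAnalyzed": {"type": "boolean"},
        "packageVerificationCode": {"type": "string"},
    },
    "if": {
        "properties": {"filesAnalyzed": {"const": False}},
        "required": ["filesAnalyzed"],
    },
    "then": {"not": {"required": ["packageVerificationCode"]}},
    "else": {"required": ["packageVerificationCode"]},
}

TYPES = {"object": dict, "string": str, "boolean": bool}

def is_valid(instance, schema):
    """Toy evaluator for type, const, required, properties, not, if/then/else."""
    if "type" in schema and not isinstance(instance, TYPES[schema["type"]]):
        return False
    if "const" in schema and instance != schema["const"]:
        return False
    if isinstance(instance, dict):
        if any(key not in instance for key in schema.get("required", [])):
            return False
        # "properties" only constrains keys that are actually present.
        for key, sub in schema.get("properties", {}).items():
            if key in instance and not is_valid(instance[key], sub):
                return False
    if "not" in schema and is_valid(instance, schema["not"]):
        return False
    if "if" in schema:
        branch = "then" if is_valid(instance, schema["if"]) else "else"
        if branch in schema and not is_valid(instance, schema[branch]):
            return False
    return True

print(is_valid({"filesAnalyzed": True, "packageVerificationCode": "abc"}, SCHEMA))
print(is_valid({"filesAnalyzed": False, "packageVerificationCode": "abc"}, SCHEMA))
```

The "required" inside "if" matters: without it, a document that omits 
filesAnalyzed would also satisfy the condition, because "properties" ignores 
missing keys.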

BTW, IANAE so please don't blast me if this suggestion does not make any sense 
from a technical standpoint :)

Regards,

Alberto

On 2020-12-07 17:24, Alexios Zavras wrote:

 

Hi all,

 

In tomorrow's SPDX Tech call, we said we would discuss how to "model" the SPDX 
information – especially looking towards the next version of the specification.

To save us all some time in introductory discussion, I wrote down some thoughts 
about the problem and solutions:  
https://docs.google.com/document/d/13PojpaFPdoKZ9Gyh_DEY-Rp7lldyMbSiGE3vCRQhR9M/edit?usp=sharing

 

Please read, comment, and edit/update!

Looking forward to our call tomorrow to discuss all this.

 

I paste a current snapshot:

 

Modeling SPDX 

What is the problem?

There are multiple sources for what is defined to be "SPDX". Right now, there 
are:

*         The specification text

*         The object model

*         The reference Java code

*         Code for Python and Go libraries

These sources are not always consistent with one another...

What do we want?

The ideal outcome would be a "single source of truth". This could subsequently 
be used to automagically generate all the required versions.

What will be modeled?

Based on the current specification, examples of information that should be 
modeled may include:

1.       FileName is a field in the file information section

2.       FileName in Tag-Value is Property spdx:fileName in class spdx:File in 
RDF

3.       FileName value is a string

4.       FilesAnalyzed is a boolean value

5.       Created has format YYYY-MM-DDThh:mm:ssZ

6.       FileName is a mandatory field and can only appear once

7.       PackageVerificationCode has to appear once if FilesAnalyzed is true 
but must not appear if FilesAnalyzed is false

8.       ExternalRef has format "category type locator" where category is one 
of SECURITY, PACKAGE-MANAGER, PERSISTENT-ID, OTHER and type is one of ...
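
To make items 5, 6 and 8 concrete, here is a small Python sketch of the checks 
a model would need to express. The function names are hypothetical, and the 
list of allowed ExternalRef types is left out because it is elided above; only 
the category is checked.

```python
import re

# Item 5: Created has the shape YYYY-MM-DDThh:mm:ssZ.
# Only the shape is checked, not whether the date itself is a real one.
CREATED_RE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z")

# Item 8: the categories listed in the text above.
CATEGORIES = {"SECURITY", "PACKAGE-MANAGER", "PERSISTENT-ID", "OTHER"}

def check_created(value):
    """Return True if value has the Created timestamp shape."""
    return CREATED_RE.fullmatch(value) is not None

def check_file_names(values):
    """Item 6: FileName is mandatory and appears exactly once, as a string."""
    return len(values) == 1 and isinstance(values[0], str)

def check_external_ref(value):
    """Item 8: 'category type locator', with category from the known set."""
    parts = value.split(maxsplit=2)
    return len(parts) == 3 and parts[0] in CATEGORIES

print(check_created("2020-12-08T18:46:00Z"))
print(check_file_names(["./src/main.c"]))
print(check_external_ref("OTHER example-type example-locator"))
```

Whatever modeling language is chosen would need to capture each of these rule 
kinds (format, cardinality, enumeration) so that checks like the ones above 
can be generated rather than written by hand.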

 

What are the alternatives?

We're looking for something to "model" SPDX. 

Wikipedia: A data model is an abstract model that organizes elements of data 
and standardizes how they relate to one another and to the properties of 
real-world entities. Related but not equivalent concepts: information model, 
object model.

Unsurprisingly, there are lots of existing modeling frameworks already.

English text

Pure text or not. It should be noted that the current specification is a mix of 
English text, some structure (sections for Cardinality, Format, etc.), some 
computer-like notation (e.g. pseudo-grammar-like constructs), some example 
code, some tables, and probably some more...

OWL

The Web Ontology Language 2 (OWL) is a knowledge representation language for 
authoring ontologies. Ontologies are a formal way to describe taxonomies and 
classification networks, essentially defining the structure of knowledge for 
various domains: the nouns representing classes of objects and the verbs 
representing relations between the objects. 

UML

Unified Modeling Language (UML) is a standardized general-purpose modeling 
language in the field of software engineering. It is a graphical language for 
visualizing, specifying, constructing, and documenting the artifacts of a 
software-intensive system. UML offers a mix of functional models, data models, 
and database models. UML has been approved as an ISO standard.

Class definition code

One can also start from actual computer code written in an object-oriented way, 
that implements classes and objects.

Pros and cons

In random order, for now:

*         UML is widely used and is an ISO standard

*         UML is a graphical language, so not ideal for git-based team 
collaboration

*         UML diagrams can be generated from textual descriptions by PlantUML

*         Computer code in any language (Python, Java, ...) is not readable by 
everyone

*         OWL is not so well known

*         UML can be edited by Open Source tools

*         OWL can be edited by Open Source tools

*         OWL can express a superset of what UML can express

*         Python is better known than UML or OWL

*         English is understood by everyone

*         English natural language text may not be rigorous enough

*         OWL can combine information from different sources/definitions (e.g., 
URI type)
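
On the PlantUML point, a short Python sketch of what "UML as text" could look 
like: a class description is turned into PlantUML source, which can live in 
git like any other text file. The to_plantuml() helper and the input dict 
layout are assumptions made for this example; the File class with a fileName 
attribute comes from the examples above.

```python
def to_plantuml(classes):
    """Render {class name: {attribute: type}} as PlantUML class-diagram text."""
    lines = ["@startuml"]
    for class_name, fields in classes.items():
        lines.append(f"class {class_name} {{")
        for field_name, field_type in fields.items():
            lines.append(f"  {field_name} : {field_type}")
        lines.append("}")
    lines.append("@enduml")
    return "\n".join(lines)

source = to_plantuml({"File": {"fileName": "String"}})
print(source)
```

The resulting text would then be passed to PlantUML to draw the diagram.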

 

 

-- zvr

 

Intel Deutschland GmbH
Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de> 
Managing Directors: Christin Eisenschmid, Gary Kershaw
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928

 






