Just adding a bit of historical context and personal experience to Alexios' 
description below - which I largely agree with.

The XML format is actually the 3rd iteration of formats the legal team has used 
to capture license information.

Iteration 1: spreadsheets (open office format)
Iteration 2: spreadsheets with separate text files with a very proprietary 
format for denoting how to format the files in HTML (e.g. if a line starts with 
3 spaces, it is a bullet and should be indented).
Iteration 3: XML

Iteration 2 came out of limitations in the spreadsheet (length of text in a 
cell) and the inability to format the text for good HTML readability.

Iteration 3 came out of frustration trying to maintain iteration 2.  I wasn't 
the driver of the change, but from my own personal experience in iteration 2, 
we found ourselves re-inventing HTML in the proprietary text formats - 
moving to XML solved that problem.  Having a single spreadsheet with all the 
metadata didn't lend itself well to multiple collaborators - separate files for 
each license's metadata made collaboration much easier.  The move to XML was 
large and painful and involved a lot of effort, but IMO it resulted in a text 
format that is much easier to maintain and was worth the effort overall.
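
As a rough illustration of why the iteration 2 convention amounted to 
re-inventing HTML, here is a minimal Python sketch. It is not the tooling we 
actually used: the function name and the exact rules (three leading spaces mark 
a bullet, any other non-blank line is a paragraph) are assumptions made up for 
this example.

```python
import html


def indented_text_to_html(text: str) -> str:
    """Convert a hypothetical '3 leading spaces = bullet' text format to HTML."""
    out = []
    in_list = False
    for line in text.splitlines():
        if line.startswith("   ") and line.strip():
            # A bullet line: open the list on the first bullet of a run.
            if not in_list:
                out.append("<ul>")
                in_list = True
            out.append(f"  <li>{html.escape(line.strip())}</li>")
        else:
            # Any non-bullet line ends the current run of bullets.
            if in_list:
                out.append("</ul>")
                in_list = False
            if line.strip():
                out.append(f"<p>{html.escape(line.strip())}</p>")
    if in_list:
        out.append("</ul>")
    return "\n".join(out)
```

Every new formatting need (nested bullets, optional text, and so on) would mean 
another ad hoc rule like the ones above, which is the maintenance burden the 
move to XML removed.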

There are several text formatting alternatives (full HTML, LaTeX, SGML, and 
Markdown, to name just a few).  Based on my past experience, I would not want to 
go back to a proprietary text format for the text portion of the license data.
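
To make the nesting requirement concrete (Alexios' example below of an 
<optional> that includes a <list> that includes an <item> that includes an 
<alt>), here is a small sketch using Python's standard xml.etree.ElementTree. 
The sample text and the name/match attribute values are hypothetical and are 
not copied from any license file.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: optional text containing a list item with an alternative.
SAMPLE = (
    "<optional><list><item>Redistributions in "
    '<alt name="form" match="source|binary">source</alt>'
    " form</item></list></optional>"
)


def describe(elem, depth=0):
    """Return an indented outline of the element tree, one tag per line."""
    lines = ["  " * depth + elem.tag]
    for child in elem:
        lines.extend(describe(child, depth + 1))
    return lines


root = ET.fromstring(SAMPLE)
outline = describe(root)                      # nesting of the markup
plain_text = "".join(root.itertext())         # license text with markup stripped
alt_pattern = root.find("list/item/alt").get("match")
```

An off-the-shelf parser recovers the nesting, the plain text, and the 
per-element attributes without any custom parsing code, which is the property a 
replacement text format would need to match.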

For the metadata, there are several alternatives, but we would need to somehow 
link them to the text format.  Since moving to a different metadata format 
would involve some effort, I would like to see a strong enough benefit to 
justify the effort AND volunteers to help with necessary changes to the tooling.

So far, I have not seen an alternative to XML with enough benefit to go through 
the significant effort of changing - but I'm willing to listen and discuss.

Gary

> -----Original Message-----
> From: Spdx-legal@lists.spdx.org <Spdx-legal@lists.spdx.org> On Behalf Of
> Alexios Zavras
> Sent: Monday, January 15, 2024 7:07 AM
> To: Jonas Smedegaard <d...@jones.dk>; Richard Fontana
> <rfont...@redhat.com>
> Cc: SPDX-legal <spdx-legal@lists.spdx.org>
> Subject: Re: XML format is unsatisfactory
> 
> Richard, interesting discussion, but I think your reasoning is backwards, from
> a historical perspective.
> Keep in mind that license-list-XML keeps data and metadata about licenses.
> 
> The metadata is easy: stuff like short identifier, when it was added, whether
> it's OSI approved, links for the original text, etc. etc.
> I agree that these can be represented in any format (XML, YAML, JSON, TOML,
> text in key-value pairs, ...).
> 
> Then we have the data: it started out as pure text, then we wanted to have
> some structure (split into paragraphs, or bullet lists), then we had to
> represent that some parts of the text are optional (they could be present or
> not), then there were also alternatives (something must be there, but maybe
> the content is not arbitrary).
> XML is the best format for describing text with markups -- the billions of web
> pages in HTML (an XML with specific set of tags) attest to that.
> 
> A "simple {{mustache-style}} curly braces markup when needed" will simply
> not do -- you want tags to contain other elements (an <optional> that includes
> <list> that includes <item> that includes <alt>). There are two ways to
> achieve this: by marking the start and end or by nesting structures. XML uses
> start/end tags; other markup systems use either one (or both). I have long
> experience in every conceivable setup.
> 
> Based on the requirements to describe the text with markups, XML was
> selected. And since we had it for the data, it made sense to have it for
> metadata as well.
> I don't think that "makes it easier to generate usable HTML" was high in the
> criteria ever.
> 
> 
> So, is this a discussion on how to represent the metadata, how to represent
> the data, or both? If people are uncomfortable with having the metadata
> expressed in XML, I would not object to splitting the metadata into a different
> format (YAML, TOML, whatever). But finding an acceptable alternative for the
> actual text... I would be *very* interested to see a practical proposal.
> 
> --
> zvr
> -----Original Message-----
> From: Spdx-legal@lists.spdx.org <Spdx-legal@lists.spdx.org> On Behalf Of
> Jonas Smedegaard
> Sent: Monday, 15 January, 2024 06:29
> To: Richard Fontana <rfont...@redhat.com>
> Cc: SPDX-legal <spdx-legal@lists.spdx.org>
> Subject: Re: XML format is unsatisfactory
> 
> Quoting Richard Fontana (2024-01-15 06:22:47)
> > On Mon, Jan 15, 2024 at 12:01 AM Jonas Smedegaard <d...@jones.dk> wrote:
> > >
> > > Quoting Richard Fontana (2024-01-14 23:41:55)
> > > > On Sun, Jan 14, 2024 at 2:47 PM Jonas Smedegaard <d...@jones.dk> wrote:
> > > >
> > > > > The XML files are a representation of RDF.
> > > > >
> > > > > Another more human readable and editable RDF representation
> > > > > exists which is a *lossless* conversion: Turtle.
> > > > >
> > > > > On a Debian-based system, you can see how the MIT license looks as
> > > > > Turtle by installing the package raptor2-utils, and then running this
> > > > > command:
> > > > >
> > > > >   rapper -i rdfxml -O http://spdx.org/licenses/ -o turtle MIT.rdf
> > > >
> > > > This sounds interesting but:
> > > >
> > > > [ref@charlie ~]$ rapper -i rdfxml -O http://spdx.org/licenses/ -o turtle MIT.rdf
> > > > rapper: Parsing URI MIT.rdf with parser rdfxml
> > > > rapper: Serializing with serializer turtle and base URI http://spdx.org/licenses/
> > > > @base <http://spdx.org/licenses/> .
> > > > rapper: Error - URI MIT.rdf - Resolving URI failed: Could not resolve host: MIT.rdf
> > > > rapper: Failed to parse URI MIT.rdf rdfxml content
> > > > @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
> > > >
> > > > rapper: Parsing returned 0 triples
> > >
> > > Sorry, I thought it was obvious but realize now that I should have been
> > > explicit: You need to `cd` to the path where the MIT.rdf is located
> > > - i.e. you need to check out the git repository for the sources first.
> >
> > Ah, I see. So:
> >
> > [ref@charlie rdfxml]$ rapper -i rdfxml -O http://spdx.org/licenses/ -o turtle MIT.rdf
> > rapper: Parsing URI file:///home/ref/license-list-data/rdfxml/MIT.rdf with parser rdfxml
> > rapper: Serializing with serializer turtle and base URI http://spdx.org/licenses/
> > @base <http://spdx.org/licenses/> .
> > @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
> > @prefix spdx: <../rdf/terms#> .
> > @prefix doap: <http://usefulinc.com/ns/doap#> .
> > @prefix ptr: <http://www.w3.org/2009/pointers#> .
> > @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
> >
> > <MIT>
> >     spdx:crossRef [
> >         spdx:isLive false ;
> >         spdx:isValid true ;
> >         spdx:isWayBackLink false ;
> >         spdx:match "N/A" ;
> >         spdx:order "0"^^<http://www.w3.org/2001/XMLSchema#int> ;
> >         spdx:timestamp "2024-01-05T20:12:47Z" ;
> >         spdx:url "https://opensource.org/licenses/MIT" ;
> >         a spdx:CrossRef
> >     ] ;
> >     spdx:isDeprecatedLicenseId false ;
> >     spdx:isFsfLibre true ;
> >     spdx:isOsiApproved true ;
> >     spdx:licenseId "MIT" ;
> >     spdx:licenseText """MIT License
> >
> > Copyright (c) <year> <copyright holders>
> >
> > Permission is hereby granted, free of charge, to any person obtaining
> > a copy of this software and associated documentation files (the
> > \"Software\"), to deal in the Software without restriction, including
> > without limitation the rights to use, copy, modify, merge, publish,
> > distribute, sublicense, and/or sell copies of the Software, and to
> > permit persons to whom the Software is furnished to do so, subject to
> > the following conditions:
> >
> > The above copyright notice and this permission notice shall be
> > included in all copies or substantial portions of the Software.
> >
> > THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND,
> > EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> > MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
> > IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
> > CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
> > TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
> > SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
> > """ ;
> >     spdx:licenseTextHtml """
> >       <div class=\"optional-license-text\">
> >          <p>MIT License</p>
> >
> >       </div>
> >       <div class=\"replaceable-license-text\">
> >          <p>Copyright (c) &lt;year&gt; &lt;copyright holders&gt;
> >          </p>
> >
> >       </div>
> >
> >       <p>Permission is hereby granted, free of charge, to any person
> > obtaining a copy of <var class=\"replaceable-license-text\"> this
> > software and
> >          associated documentation files</var> (the
> > &quot;Software&quot;), to deal in the Software without restriction,
> >          including without limitation the rights to use, copy, modify,
> > merge, publish, distribute, sublicense,
> >          and/or sell copies of the Software, and to permit persons to
> > whom the Software is furnished to do so,
> >          subject to the following conditions:</p>
> >
> >       <p>The above copyright notice and this permission notice
> >          <var class=\"optional-license-text\"> (including the next
> > paragraph)</var>
> >          shall be included in all copies or substantial
> >          portions of the Software.</p>
> >
> >       <p>THE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY
> > OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT
> >          LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
> > PARTICULAR PURPOSE AND NONINFRINGEMENT. IN
> >          NO EVENT SHALL <var class=\"replaceable-license-text\"> THE
> > AUTHORS OR COPYRIGHT HOLDERS</var> BE LIABLE FOR ANY CLAIM, DAMAGES OR
> > OTHER LIABILITY,
> >          WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> > FROM, OUT OF OR IN CONNECTION WITH THE
> >          SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.</p>
> >
> >     """ ;
> >     spdx:name "MIT License" ;
> >     spdx:standardLicenseTemplate """<<beginOptional>>MIT License
> >
> > <<endOptional>> <<var;name=\"copyright\";original=\"Copyright (c)
> > <year> <copyright holders>  \";match=\".{0,5000}\">>
> >
> > Permission is hereby granted, free of charge, to any person obtaining
> > a copy of <<var;name=\"software\";original=\"this software and
> > associated documentation
> > files\";match=\"this\\s+software\\s+and\\s+associated\\s+documentation
> > \\s+files|this\\s+source\\s+file\">>
> > (the \"Software\"), to deal in the Software without restriction,
> > including without limitation the rights to use, copy, modify, merge,
> > publish, distribute, sublicense, and/or sell copies of the Software,
> > and to permit persons to whom the Software is furnished to do so,
> > subject to the following conditions:
> >
> > The above copyright notice and this permission notice<<beginOptional>>
> > (including the next paragraph)<<endOptional>> shall be included in all
> > copies or substantial portions of the Software.
> >
> > THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND,
> > EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> > MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
> > IN NO EVENT SHALL <<var;name=\"copyrightHolder\";original=\"THE
> > AUTHORS OR COPYRIGHT HOLDERS\";match=\".+\">> BE LIABLE FOR ANY CLAIM,
> > DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
> > OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR
> > THE USE OR OTHER DEALINGS IN THE SOFTWARE.
> >
> > """ ;
> >     a spdx:ListedLicense ;
> >     rdfs:seeAlso "https://opensource.org/licenses/MIT" .
> >
> > rapper: Parsing returned 19 triples
> >
> > This doesn't really seem to be any better than (and actually seems
> > worse than) the license-list-XML file "MIT.xml" for purposes of
> > readability and maintainability.
> 
> How so?
> 
> Because it doesn't look like YAML?
> 
> 
>  - Jonas
> 
> --
>  * Jonas Smedegaard - idealist & Internet-arkitekt
>  * Tlf.: +45 40843136  Website: http://dr.jones.dk/
>  * Sponsorship: https://ko-fi.com/drjones
> 
>  [x] quote me freely  [ ] ask before reusing  [ ] keep private
> 
> 
> 
> 
> 
> 
> 
> 
> 





