Re: debian/copyright format and SPDX

2023-09-25 Thread Stephan Lachnit
On Mon, Sep 25, 2023 at 7:15 AM Steve Langasek  wrote:
>
> So can you tell me where in that specification this "flat text file" format
> is actually described?  The specification is not on the page that includes
> this quote.  The text does not link to the place in the spec where this
> format is described.  The site's search page (because it's reasonable for a
> spec to require a server-side search engine in order for users to be able to
> find information in it...) doesn't return any results for the string
> 'tag:value', and a search for 'tag value' points first to
> https://spdx.github.io/spdx-spec/v2.3/file-tags/ which
> describes embedding tags in a source file, not constructing a tag:value
> file.
>
> Frankly, the fact that the SPDX spec itself is as bad as it is should be
> disqualifying for using any file format specified within.  But I would still
> be willing to give it a fair shake, if I could actually find it.

E.g. here under examples?
https://spdx.github.io/spdx-spec/v2.3/document-creation-information/

Cheers,
Stephan



Re: debian/copyright format and SPDX

2023-09-24 Thread Steve Langasek
On Fri, Sep 22, 2023 at 12:58:10PM +0200, Stephan Lachnit wrote:
> On Fri, Sep 22, 2023 at 11:11 AM Steve Langasek  wrote:

> > SPDX defines an xml format only.  They lost before they'd even started.

> > debian/copyright is supposed to be human-readable first and foremost.  XML
> > need not apply.

> Not true. From [1]:

> > Shall be in a human readable form.
> > [...]
> > Multiple serialization formats may be used to represent the information 
> > being exchanged. Current supported formats include:
> > [...]
> > tag:value flat text file as described in this specification


So can you tell me where in that specification this "flat text file" format
is actually described?  The specification is not on the page that includes
this quote.  The text does not link to the place in the spec where this
format is described.  The site's search page (because it's reasonable for a
spec to require a server-side search engine in order for users to be able to
find information in it...) doesn't return any results for the string
'tag:value', and a search for 'tag value' points first to
https://spdx.github.io/spdx-spec/v2.3/file-tags/ which
describes embedding tags in a source file, not constructing a tag:value
file.

Frankly, the fact that the SPDX spec itself is as bad as it is should be
disqualifying for using any file format specified within.  But I would still
be willing to give it a fair shake, if I could actually find it.

-- 
Steve Langasek   Give me a lever long enough and a Free OS
Debian Developer   to set it on, and I can move the world.
Ubuntu Developer   https://www.debian.org/
slanga...@ubuntu.com vor...@debian.org




Re: debian/copyright format and SPDX

2023-09-22 Thread Russ Allbery
Sune Vuorela  writes:

> I do think that this is another point of "we should kill our babies if
> they don't take off". And preferably faster if/when "we lost" the race.

> We carried around the debian menu for a decade or so after we failed to
> gain traction and people centered on desktop files.

> We failed to gain traction on the structure of the copyright file, and
> spdx is the one who has won here.

I generally agree with everything you're saying, but I don't think it
applies to the structure of the copyright file.  Last I checked, SPDX even
recommends that people use our format for complicated copyright summaries
that their native format can't represent.

It is hampered by being in a language that no one has a readily-available
parser for, and I wish I'd supported the push for it to be in YAML at the
time since YAML has been incredibly successful in the format wars due to
the wild success of Kubernetes (which is heavily based on YAML at the UI
layer although it uses JSON on the wire), but it's still one of the best
if not the best format available for its purpose.

(Yes, I know, the YAML spec is a massive mess, etc.  It's also better than
any other structured file format I've used among those with readily
available parsers in every programming language, and you can use a very
stripped-down version of it without object references and the like.  TOML
unfortunately failed miserably on nested tables in a way that makes it
mostly unusable for a lot of applications YAML does well on.)

-- 
Russ Allbery (r...@debian.org)  



Re: debian/copyright format and SPDX

2023-09-22 Thread Stephan Lachnit
On Fri, Sep 22, 2023 at 11:11 AM Steve Langasek  wrote:
>
>
> SPDX defines an xml format only.  They lost before they'd even started.
>
> debian/copyright is supposed to be human-readable first and foremost.  XML
> need not apply.

Not true. From [1]:

> Shall be in a human readable form.
> [...]
> Multiple serialization formats may be used to represent the information being 
> exchanged. Current supported formats include:
> [...]
> tag:value flat text file as described in this specification

It's the same format that REUSE [2] uses. It is very much human
readable, or at least as human readable as DEP5 (I don't think any
human will ever read a DEP5 file with 200 lines of CC-BY-4.0 but
that's beside the point).
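
To make the "tag:value flat text file" concrete, here is a minimal sketch of reading such a document in Python. The tag names (SPDXVersion, DataLicense, SPDXID, DocumentName, Creator) are taken from the SPDX 2.3 document creation section; the sample values, the "hypothetical-tool" name and the parse_tag_value helper are made up for illustration. It assumes one tag per line and does not handle multi-line <text>...</text> values.

```python
from collections import defaultdict

# Illustrative sample only; values are invented, not from a real package.
SAMPLE = """\
SPDXVersion: SPDX-2.3
DataLicense: CC0-1.0
SPDXID: SPDXRef-DOCUMENT
DocumentName: hello-example
Creator: Person: Jane Doe
Creator: Tool: hypothetical-tool-1.0
"""


def parse_tag_value(text):
    """Return {tag: [values...]} for single-line tag:value pairs.

    Assumption: lines starting with '#' are comments, and multi-line
    <text>...</text> values are out of scope for this sketch.
    """
    fields = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so "Person: Jane Doe" stays intact.
        tag, sep, value = line.partition(":")
        if not sep:
            continue  # not a tag:value line; skipped in this sketch
        fields[tag.strip()].append(value.strip())
    return dict(fields)


doc = parse_tag_value(SAMPLE)
print(doc["SPDXVersion"])
print(doc["Creator"])
```

Repeated tags such as Creator are collected into a list rather than overwritten, which is the main structural difference from a plain key/value file.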

Cheers,
Stephan

[1]: https://spdx.github.io/spdx-spec/v2.3/conformance/#44-standard-data-format-requirements
[2]: https://reuse.software/



Re: debian/copyright format and SPDX

2023-09-22 Thread G. Branden Robinson
At 2023-09-22T02:11:15-0700, Steve Langasek wrote:
> SPDX defines an xml format only.  They lost before they'd even
> started.
> 
> debian/copyright is supposed to be human-readable first and foremost.
> XML need not apply.

Very much +1 on everything quoted.

That said, SPDX's license list and the (plain text) tagging convention
that has grown up around it are of great value, and not at all
incompatible with Debian's documentary conventions.

Regards,
Branden




Re: debian/copyright format and SPDX

2023-09-22 Thread Steve Langasek
On Fri, Sep 22, 2023 at 08:43:25AM -, Sune Vuorela wrote:
> On 2023-09-08, Jeremy Stanley  wrote:
> > Since Debian's machine-readable format has been around longer than
> > either of the newer formats you mentioned, it seems like it would
> > make more sense for the tools to incorporate a parser for it rather

> I do think that this is another point of "we should kill our babies if
> they don't take off". And preferably faster if/when "we lost" the race.

> We carried around the debian menu for a decade or so after we failed to
> gain traction and people centered on desktop files.

> We failed to gain traction on the structure of the copyright file, and
> spdx is the one who has won here.

SPDX defines an xml format only.  They lost before they'd even started.

debian/copyright is supposed to be human-readable first and foremost.  XML
need not apply.

-- 
Steve Langasek   Give me a lever long enough and a Free OS
Debian Developer   to set it on, and I can move the world.
Ubuntu Developer   https://www.debian.org/
slanga...@ubuntu.com vor...@debian.org




+1 (Re: debian/copyright format and SPDX)

2023-09-22 Thread Holger Levsen
On Fri, Sep 22, 2023 at 08:43:25AM -, Sune Vuorela wrote:
> I do think that this is another point of "we should kill our babies if
> they don't take off". And preferably faster if/when "we lost" the race.
> 
> We carried around the debian menu for a decade or so after we failed to
> gain traction and people centered on desktop files.
> 
> We failed to gain traction on the structure of the copyright file, and
> spdx is the one who has won here.
> 
> This is just going to be more useless work. Things can do more or less the
> same, so let's go with the one where we get the least work requirements
> in the long run, and not put more resources into the current version.

very much +1 on everything quoted.


-- 
cheers,
Holger

 ⢀⣴⠾⠻⢶⣦⠀
 ⣾⠁⢠⠒⠀⣿⡁  holger@(debian|reproducible-builds|layer-acht).org
 ⢿⡄⠘⠷⠚⠋⠀  OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
 ⠈⠳⣄

War is peace. Freedom is slavery. Covid is like the flu.




Re: debian/copyright format and SPDX

2023-09-22 Thread Sune Vuorela
On 2023-09-08, Jeremy Stanley  wrote:
> Since Debian's machine-readable format has been around longer than
> either of the newer formats you mentioned, it seems like it would
> make more sense for the tools to incorporate a parser for it rather

I do think that this is another point of "we should kill our babies if
they don't take off". And preferably faster if/when "we lost" the race.

We carried around the debian menu for a decade or so after we failed to
gain traction and people centered on desktop files.

We failed to gain traction on the structure of the copyright file, and
spdx is the one who has won here.

This is just going to be more useless work. Things can do more or less the
same, so let's go with the one where we get the least work requirements
in the long run, and not put more resources into the current version.

/Sune



Re: debian/copyright format and SPDX

2023-09-09 Thread Jonas Smedegaard
Quoting Paul Wise (2023-09-09 09:18:59)
> On Fri, 2023-09-08 at 12:09 +0530, Hideki Yamane wrote:
> 
> > Making an appropriate debian/copyright file is a hard and boring task, IMHO
> 
> Using scancode-toolkit/etc can probably automate most of that work.
> 
> https://wiki.debian.org/CopyrightReviewTools

Yeah, although that page is aiming at debian/copyright format, not SPDX.

As an example, it does not cover (and I think it would be confusing to
add to it, at least as currently structured) how licensecheck also
supports producing SPDX shortnames e.g. like this (which generates a
slightly **broken** debian/copyright file due to the MIT/Expat difference,
so beware):

  licensecheck --check '.*' --recursive --deb-machine --lines 0 \
    --shortname-scheme spdx -- *


 - Jonas

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/
 * Sponsorship: https://ko-fi.com/drjones

 [x] quote me freely  [ ] ask before reusing  [ ] keep private



Re: debian/copyright format and SPDX

2023-09-09 Thread Paul Wise
On Fri, 2023-09-08 at 12:09 +0530, Hideki Yamane wrote:

> Making an appropriate debian/copyright file is a hard and boring task, IMHO

Using scancode-toolkit/etc can probably automate most of that work.

https://wiki.debian.org/CopyrightReviewTools

-- 
bye,
pabs

https://wiki.debian.org/PaulWise




Re: debian/copyright format and SPDX

2023-09-08 Thread Hideki Yamane
Hi,

On Fri, 08 Sep 2023 07:34:43 -0700
Russ Allbery  wrote:
> The really interesting part of SPDX is the license list and the canonical
> name assignment, which is *way* more active and *way* more mature at this
> point than the equivalent in Debian.  They have a much larger license
> list, which is currently being bolstered by Fedora, and the new licenses
> and rules for deduplicating them are reviewed by lawyers as part of their
> maintenance process.  Their identifiers are also increasingly used in
> upstream software in SPDX-License-Identifier pseudo-headers.

 Yes, so we don't need to spend our limited resources on maintaining a
 license list if we adopt it, IMHO.


-- 
Hideki Yamane 



Re: debian/copyright format and SPDX

2023-09-08 Thread Jeremy Stanley
On 2023-09-08 13:31:43 + (+), Jeremy Stanley wrote:
> On 2023-09-08 12:09:09 +0530 (+0530), Hideki Yamane wrote:
> [...]
> >  SPDX is led by the Linux foundation project, OpenChain for license
> >  compliance.
> [...]
> 
> Unless I'm misreading, OpenChain follows the REUSE specification
> which acknowledges the sufficiency of "DEP5" formatted license info
[...]

Apologies for the confusion on my part. I see from the responses now
that I misunderstood the suggestion. If it's to normalize some of
the contents in copyright files in order to match SPDX identifiers
rather than adopt an entire new format, then I agree that makes a
bit more sense (though does seem like a daunting undertaking). FWIW,
there does already seem to be some attempt at alignment in the
current specification when it comes to handling things like
versioned licenses, and it refers back to the SPDX registry for
license texts.
-- 
Jeremy Stanley




Re: debian/copyright format and SPDX

2023-09-08 Thread Mattia Rizzolo
On Fri, Sep 08, 2023 at 07:34:43AM -0700, Russ Allbery wrote:
> I don't think the file format is the most interesting part of SPDX.  They
> don't really have a competing format equivalent to the functionality of
> our copyright files (at least that I've seen; I vaguely follow their
> lists).  Last time I looked, they were doing a lot with XML, which I don't
> think anyone would adopt for new formats these days.  (YAML or TOML or
> something like that is now a lot more popular.)

Formally, SPDX is only a data model, which supports several
serialization formats.  The XML one is, I believe, the most common one
for some technically good reasons, but it does support YAML
serialization, as well as some lossy ones (like CSV, plaintext,
etc...).

-- 
regards,
Mattia Rizzolo

GPG Key: 66AE 2B4A FCCF 3F52 DA18  4D18 4B04 3FCD B944 4540  .''`.
More about me:  https://mapreri.org : :'  :
Launchpad user: https://launchpad.net/~mapreri  `. `'`
Debian QA page: https://qa.debian.org/developer.php?login=mattia  `-




Re: debian/copyright format and SPDX

2023-09-08 Thread Russ Allbery
Jonas Smedegaard  writes:

> Only issue I am aware of is that SPDX shortname "MIT" equals Debian
> shortname "Expat".

There was also some sort of weirdly ideological argument with the FSF
about what identifiers to use for the GPL and related licenses, which
resulted in SPDX using an "-only" and "-or-later" syntax in the identifier
at the insistence of the FSF rather than a separate generic syntax the way
that we do.

https://spdx.org/licenses/ is the current license list and assigned short
identifiers.
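
The two naming differences mentioned so far (Expat vs. MIT, and the trailing "+" vs. "-only"/"-or-later") can be sketched as a small best-effort translation. This is only an illustration: the debian_to_spdx helper and the RENAMED table are hypothetical names, the table covers just the pair named in this thread, and anything not recognised is passed through unchanged rather than guessed at.

```python
import re

# Names that differ outright between the two schemes.  Only the pair
# mentioned in this thread is listed; this is not an exhaustive table.
RENAMED = {"Expat": "MIT"}

# Debian-style GNU license short names such as "GPL-2+" or "LGPL-2.1".
GNU_NAME = re.compile(r"(L?GPL|AGPL)-(\d+(?:\.\d+)?)(\+?)")


def debian_to_spdx(shortname):
    """Best-effort translation of a Debian short name to an SPDX identifier."""
    if shortname in RENAMED:
        return RENAMED[shortname]
    m = GNU_NAME.fullmatch(shortname)
    if m:
        family, version, plus = m.groups()
        if "." not in version:
            version += ".0"  # SPDX writes "2.0" where Debian writes "2"
        suffix = "-or-later" if plus else "-only"
        return f"{family}-{version}{suffix}"
    # Unknown to this sketch: return unchanged instead of guessing.
    return shortname


for name in ("Expat", "GPL-2+", "GPL-3", "LGPL-2.1+", "Apache-2.0"):
    print(name, "->", debian_to_spdx(name))
```

A real transition would need a reviewed table for every short name in the archive; the pass-through default here only keeps the sketch from inventing identifiers.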

-- 
Russ Allbery (r...@debian.org)  



Re: debian/copyright format and SPDX

2023-09-08 Thread Russ Allbery
Jeremy Stanley  writes:

> Since Debian's machine-readable format has been around longer than
> either of the newer formats you mentioned, it seems like it would make
> more sense for the tools to incorporate a parser for it rather than
> create needless churn in the package archive just to transform an
> established standard into whatever the format-du-jour happens to be (and
> then halfway through another new format gains popularity, and the
> process starts all over again).

I don't think the file format is the most interesting part of SPDX.  They
don't really have a competing format equivalent to the functionality of
our copyright files (at least that I've seen; I vaguely follow their
lists).  Last time I looked, they were doing a lot with XML, which I don't
think anyone would adopt for new formats these days.  (YAML or TOML or
something like that is now a lot more popular.)  In terms of file formats,
writing a lossy converter from Debian copyright files to whatever format
is of interest for BOMs would probably do most of the job.
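
A minimal sketch of what such a lossy converter could look like, using only the Python standard library.  It assumes the machine-readable copyright format 1.0 layout (blank-line-separated paragraphs, "Field: value" lines, continuation lines starting with whitespace) and keeps only the Files paragraphs; license texts, comments and stand-alone License paragraphs are dropped.  The sample input, the JSON output shape and the helper names are invented for illustration and are not any particular BOM format.

```python
import json

# Invented sample; names and years are placeholders.
SAMPLE = """\
Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
Upstream-Name: hello-example

Files: *
Copyright: 2023 Jane Doe
License: Expat

Files: debian/*
Copyright: 2023 John Roe
 2022 Jane Doe
License: GPL-2+
"""


def parse_paragraphs(text):
    """Split deb822-style text into a list of {field: value} dicts."""
    paragraphs, current, last = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current paragraph.
            if current:
                paragraphs.append(current)
            current, last = {}, None
        elif line[0] in " \t" and last:
            # Continuation line: append to the previous field.
            current[last] += "\n" + line.strip()
        else:
            name, _, value = line.partition(":")
            last = name.strip()
            current[last] = value.strip()
    if current:
        paragraphs.append(current)
    return paragraphs


def to_records(text):
    """Keep only Files paragraphs and the license short name (lossy)."""
    records = []
    for para in parse_paragraphs(text):
        if "Files" not in para:
            continue  # header and stand-alone License paragraphs are dropped
        license_field = para.get("License", "")
        records.append({
            "files": para["Files"].split(),
            "copyright": para.get("Copyright", "").splitlines(),
            "license": license_field.splitlines()[0] if license_field else "",
        })
    return records


print(json.dumps(to_records(SAMPLE), indent=2))
```

Only the first line of each License field is kept, so the short name survives and the full license text does not; that is where most of the loss happens.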

The really interesting part of SPDX is the license list and the canonical
name assignment, which is *way* more active and *way* more mature at this
point than the equivalent in Debian.  They have a much larger license
list, which is currently being bolstered by Fedora, and the new licenses
and rules for deduplicating them are reviewed by lawyers as part of their
maintenance process.  Their identifiers are also increasingly used in
upstream software in SPDX-License-Identifier pseudo-headers.
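
For readers who have not seen these pseudo-headers, here is a small sketch of picking one out of a source file.  The "SPDX-License-Identifier:" tag itself is the real convention; the find_identifier helper, the 20-line limit and the handling of trailing "*/" or "-->" comment closers are assumptions made for this example.

```python
import re

# Matches e.g. "// SPDX-License-Identifier: GPL-2.0-or-later" and drops a
# trailing "*/" or "-->" comment closer if one follows the expression.
TAG = re.compile(r"SPDX-License-Identifier:\s*(.+?)\s*(?:\*/|-->)?\s*$")


def find_identifier(text, max_lines=20):
    """Return the license expression from the first tag near the top, or None."""
    for line in text.splitlines()[:max_lines]:
        m = TAG.search(line)
        if m:
            return m.group(1)
    return None


print(find_identifier("/* SPDX-License-Identifier: MIT */\nint main(void);\n"))
print(find_identifier("# SPDX-License-Identifier: GPL-2.0-or-later\n"))
print(find_identifier("no tag in this file\n"))
```

The returned string is an SPDX license expression, so it may contain operators such as OR and AND rather than a single identifier.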

I have no idea how to do a transition, but I do think Debian would benefit
from adopting the SPDX license identifiers where one exists, and possibly
from joining forces with Fedora to submit and get identifiers assigned to
the licenses that we see that are not yet registered.

-- 
Russ Allbery (r...@debian.org)  



Re: debian/copyright format and SPDX

2023-09-08 Thread Jeremy Stanley
On 2023-09-08 12:09:09 +0530 (+0530), Hideki Yamane wrote:
[...]
>  SPDX is led by the Linux foundation project, OpenChain for license
>  compliance.
[...]

Unless I'm misreading, OpenChain follows the REUSE specification
which acknowledges the sufficiency of "DEP5" formatted license info:

https://github.com/OpenChain-Project/Reference-Material/blob/master/General-Compliance-Support-Material/REUSE.software/en/REUSE.software-3.0.md

Since Debian's machine-readable format has been around longer than
either of the newer formats you mentioned, it seems like it would
make more sense for the tools to incorporate a parser for it rather
than create needless churn in the package archive just to transform
an established standard into whatever the format-du-jour happens to
be (and then halfway through another new format gains popularity,
and the process starts all over again).

Sorry to come across as skeptical, but there are organizations out
there churning out redundant "standards" rather than reusing
suitable existing formats, and while I'd like to assume that it's
simply because they did insufficient research to be aware of prior
art, it seems like all too often it's in pursuit of signing on more
and more donors at the expense of distracting active free/libre open
source software communities from what they would normally focus on
achieving.
-- 
Jeremy Stanley




Re: debian/copyright format and SPDX

2023-09-08 Thread Jonas Smedegaard
Hi Hideki,

Quoting Hideki Yamane (2023-09-08 08:39:09)
>
>  tl;dr: How about considering updating debian/copyright format to have
> more compatibility with SPDX format
> 
> 
>  SBOM is expected to be used widely and several tools support it as a trend
>  now, since US government asks to use it. There are two formats for it,
>  Software Package Data Exchange (SPDX) and CycloneDX.
> 
>  SPDX is led by the Linux foundation project, OpenChain for license
>  compliance. And CycloneDX is developed by the Open Web Application Security
>  Project (OWASP), so it is intended to be used to track vulnerabilities, IMO.
> 
> 
>  Well, as I said above, several tools support SPDX and CycloneDX now and
>  continue to be expanded, so I consider it'd be better if debian/copyright
>  has more compatibilities with them, especially SPDX. It would be easier
>  to handle debian/copyright data with tools that are outside of Debian.
> 
> 
>  Making an appropriate debian/copyright file is a hard and boring task, IMHO,
>  but if non-Debian people also can help it, it would be easier to fix it.

Only issue I am aware of is that SPDX shortname "MIT" equals Debian
shortname "Expat".  It sounds like you are referring to more and larger
incompatibilities than that.  Do you mean e.g. support for checksums of
files (which will blow up size and kill readability of the file)?

Perhaps as a start compile a list of incompatibilities on a wiki page -
or point to it if one already exists.  Then we can add pros and cons at
that page, as the discussion here progresses.


 - Jonas

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/
 * Sponsorship: https://ko-fi.com/drjones

 [x] quote me freely  [ ] ask before reusing  [ ] keep private
