This was the approach I took in clj-spdx's normalise function [1].  By
default this library normalises SPDX expressions in these ways:

   - All SPDX listed identifiers are normalised to their listed case (e.g.
   "aPaCHe-2.0" -> "Apache-2.0")
   - Operators are uppercased (e.g. "aNd" -> "AND")
   - Order of operations is made explicit via grouping parens (e.g.
   "Apache-2.0 OR MIT AND BSD-3-Clause" -> "Apache-2.0 OR (MIT AND
   BSD-3-Clause)")
      - It could be argued that this is overkill, but in my opinion this
      makes longer expressions easier to comprehend
   - The "historical oddity" deprecated GPL ids are normalised to their
   non-deprecated equivalents (e.g. "GPL-2.0" -> "GPL-2.0-only", "GPL-2.0+" ->
   "GPL-2.0-or-later", etc.)
      - This includes cursed (but valid) expressions such as
      "GPL-2.0-with-GCC-exception WITH Classpath-exception-2.0" being converted
      to "GPL-2.0-only WITH GCC-exception-2.0 AND GPL-2.0-only WITH
      Classpath-exception-2.0"
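
To make the rules above concrete, here is a rough sketch in Python. It is
not how clj-spdx is implemented, and every name in it (LISTED_IDS,
DEPRECATED, normalise, and so on) is made up for illustration. It assumes a
tiny hand-picked subset of the license list, leaves exception ids untouched,
and does not attempt the compound "-with-...-exception" rewrites mentioned
above.

```python
import re

# Hypothetical, tiny subset of the SPDX license list (illustration only).
LISTED_IDS = ["Apache-2.0", "MIT", "BSD-3-Clause",
              "GPL-2.0-only", "GPL-2.0-or-later"]
CANONICAL = {i.lower(): i for i in LISTED_IDS}

# Deprecated GPL ids mapped to their current equivalents (subset).
DEPRECATED = {"gpl-2.0": "GPL-2.0-only", "gpl-2.0+": "GPL-2.0-or-later"}

OPERATORS = {"AND", "OR", "WITH"}
TOKEN = re.compile(r"\(|\)|[^\s()]+")


def is_operand(tok):
    """True for anything that is not a paren or an operator keyword."""
    return tok not in ("(", ")") and tok.upper() not in OPERATORS


def normalise_id(tok):
    """Case-insensitive lookup; unknown ids pass through unchanged."""
    low = tok.lower()
    if low in DEPRECATED:
        return DEPRECATED[low]
    return CANONICAL.get(low, tok)


def parse_atom(toks, pos):
    if pos >= len(toks):
        raise ValueError("unexpected end of expression")
    tok = toks[pos]
    if tok == "(":
        node, pos = parse_or(toks, pos + 1)
        if pos >= len(toks) or toks[pos] != ")":
            raise ValueError("missing closing parenthesis")
        return node, pos + 1
    if not is_operand(tok):
        raise ValueError(f"unexpected token: {tok}")
    text = normalise_id(tok)
    pos += 1
    # WITH binds tightest, so fold it into the license it modifies.
    if pos < len(toks) and toks[pos].upper() == "WITH":
        if pos + 1 >= len(toks) or not is_operand(toks[pos + 1]):
            raise ValueError("missing exception id after WITH")
        text += " WITH " + toks[pos + 1]
        pos += 2
    return text, pos


def parse_and(toks, pos):
    node, pos = parse_atom(toks, pos)
    terms = [node]
    while pos < len(toks) and toks[pos].upper() == "AND":
        node, pos = parse_atom(toks, pos + 1)
        terms.append(node)
    return (("AND", terms) if len(terms) > 1 else terms[0]), pos


def parse_or(toks, pos):
    # OR binds loosest, so it sits at the top of the grammar.
    node, pos = parse_and(toks, pos)
    terms = [node]
    while pos < len(toks) and toks[pos].upper() == "OR":
        node, pos = parse_and(toks, pos + 1)
        terms.append(node)
    return (("OR", terms) if len(terms) > 1 else terms[0]), pos


def render(node, top=True):
    """Emit uppercase operators and parenthesise every nested group."""
    if isinstance(node, str):
        return node
    op, terms = node
    body = f" {op} ".join(render(t, top=False) for t in terms)
    return body if top else f"({body})"


def normalise(expr):
    toks = TOKEN.findall(expr)
    node, pos = parse_or(toks, 0)
    if pos != len(toks):
        raise ValueError(f"unexpected token: {toks[pos]}")
    return render(node)
```

With that, normalise("Apache-2.0 OR MIT AND BSD-3-Clause") gives
"Apache-2.0 OR (MIT AND BSD-3-Clause)", and normalise("aPaCHe-2.0 aNd mit")
gives "Apache-2.0 AND MIT".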

Cheers,
Peter

[1] https://pmonks.github.io/clj-spdx/spdx.expressions.html#var-normalise





On Wed, Nov 22, 2023 at 1:14 AM Steve Kilbane <[email protected]>
wrote:

> I suppose this argues for the existence of SPDX “beautifiers” – being
> lenient in accepting input, and canonically strict in the output they
> produce.
>
>
>
> Philippe said:
>
> > A good practice for tools is to be flexible and lenient when accepting
> > inputs
>
>
>
> Well, sort of: certainly, that’s been accepted wisdom from UNIX days, but
> it *has* meant that implementations of long-lived standards have to be
> ridiculously complicated because they have to cope with inputs that are
> fundamentally broken, but still accepted by a widely-used tool which
> “proves” the broken input is “correct”.
>
>
>
> Perhaps a better practice is “accept but complain” (with an option to turn
> the complaining off) so that it’s clear the tool *is* being lenient, and
> that the input *is* deficient?
>
>
>
> steve
>
>
>
> *From: *[email protected] <[email protected]> on behalf of McCoy
> Smith <[email protected]>
> *Date: *Tuesday, 21 November 2023 at 18:07
> *To: *[email protected] <[email protected]>
> *Subject: *Re: [spdx] Allowing lowercase operands ("and"/"or"/"with")
>
> > -----Original Message-----
> > From: [email protected] <[email protected]> On Behalf Of Philippe
> > Ombredanne
> > Sent: Tuesday, November 21, 2023 9:25 AM
> > To: [email protected]
> > Subject: Re: [spdx] Allowing lowercase operands ("and"/"or"/"with")
> >
> > A good practice for tools is to be flexible and lenient when accepting
> > inputs in general, and optionally be strict (and report warnings or
> > errors only then). For instance, there is no easily discernible pattern
> > of why an SPDX license identifier is all upper case, all lower case or
> > mixed case, so it is hard for a human to avoid mistakes and there are
> > many such minor case errors in the wild wrt. the case of SPDX license
> > identifiers. Yet all SPDX identifiers are unique, ignoring the case. So
> > in practice, most tools would ignore the case when parsing and output an
> > expression using the specified "canonical" case of identifiers and
> > operators.
>
> I'm with Philippe on this one. Not everyone is doing this mechanically,
> and anything that's going to reduce errors in mechanical scans is better.
> I sort of doubt anyone is doing "aNd" "oR" etc. except via a typo, and
> that should be accounted for (even if it scans poorly when human read)
>
>
>
>
>
> 
>
>

