Hi Peter,
You raise a good point about metadata that supports tools development being a distinct category. I also agree that a real-time conversation would be a good approach. Count me in as interested in a follow-up discussion.

Cheers,
Gary

From: [email protected] <[email protected]> On Behalf Of Peter Monks via lists.spdx.org
Sent: Friday, December 12, 2025 1:47 PM
To: [email protected]
Cc: [email protected]; [email protected]; [email protected]; SPDX-legal <[email protected]>; [email protected]
Subject: Re: [spdx-tech] Expression of interest: technical metadata that embellishes the license lists

G'day Gary,

There are indeed metadata fields present in the lists that aren't in the 3.0.x models - the three fields I mentioned in my previous email are all taken from v3.27.0 of the license list <https://github.com/spdx/license-list-data/blob/v3.27.0/json/licenses.json> [1] and are not represented in the SPDX 3.0.1 model <https://spdx.github.io/spdx-spec/v3.0.1/model/ExpandedLicensing/Classes/ListedLicense/> [2].

More generally, I think the question of whether any given piece of metadata should be added to the SPDX model is more complex and nuanced, and the answer is far from being "unequivocally yes". I personally think in terms of categories defined via an NxN matrix where the individual "per dimension" discriminators might include questions such as:

* Does the metadata have legal significance?
* Does the metadata belong in an SBOM, or is it truly metadata? i.e. tools use it as an input to the process of constructing data that might eventually appear in an SBOM, but the metadata itself has no direct relevance, significance, or meaning in an SBOM (and therefore has no need to be represented in the SPDX model)

This second dimension is where my primary interest lies, and I'd argue that there are already special-cased examples of this occurring in SPDX today (the matching templates, equivalent words file, etc.).
The hypothetical metadata examples I gave earlier (version series, license publisher information) also fall into this same bucket, as do several other examples from my own work. It's obvious to me that there's a gap in how SPDX supports tool implementers with this kind of "true" metadata.

Taking a step back again, and not to sound like a broken record, I think there's some recognition here that this might be a "real thing" that's worthy of further investigation, but I'm concerned that we keep falling into the trap of trying to do the work here on the mailing list (which IMO is a poor vehicle for that). I'd prefer to make this an informal and probably temporary team of some kind, whose goal is to deliver a proposal that can then be socialised with the broader SPDX team & community. I'm very open to suggestions about how this work should be structured, if creating an informal "working group" is not how the SPDX project typically organises tasks like this.

Cheers,
Peter

[1] https://github.com/spdx/license-list-data/blob/v3.27.0/json/licenses.json
[2] https://spdx.github.io/spdx-spec/v3.0.1/model/ExpandedLicensing/Classes/ListedLicense/

On Fri, Dec 12, 2025 at 11:15 AM Gary O'Neall via lists.spdx.org <[email protected]> wrote:

Greetings all,

A couple of thoughts on this.

On the topic of metadata elements in the license list but not in the model:

With SPDX 3.0, we’ve tried to bring the license list fields into alignment with the model itself. The issue Peter raised is valid for SPDX 2.X, but if there are any fields in the license list not in the model, we should add those as issues and fix it in 3.1 or 3.2.

A bit of history on this – it was always the intent to keep the model and license list metadata in sync. The legal team desired to have some fields supported before a given release of the SPDX spec, which caused things to get a bit out of sync in the past.
If you look through the history of closed issues in the SPDX spec repo, you’ll find a record of us bringing things back into alignment.

On the original topic of adding more properties of classes to the license data:

I would suggest separating out the work of creating a field to store the data from the process of maintaining the data that is stored there. I have not seen much resistance to adding fields to the license model if there was a valid use case to support it. I’ve seen a lot of resistance to having the SPDX legal team maintain the data – especially if the data could be considered opinionated.

There are two reasons I’ve heard raised for not adding data fields:

- Effort to maintain the data – this is especially true if the data can change over time
- Legal opinions vary and we don’t want to encode one particular opinion in the SPDX maintained metadata

Caveat: These are my recollections of past meetings and don’t represent my own opinion. I also may not have accurately represented their opinions above, but hopefully I came close.

I tend to think of additional license metadata in 4 categories:

1. Metadata maintained by the legal team for the license list (e.g. license name, identifier, text) – I’ve found adding more of these to meet with some resistance.
2. Metadata maintained by a well identified organization outside the SPDX community (e.g. fsfLibre, OSI approved) – These seem to be relatively easy to add if the organization is well established. I don’t recall any examples where one of these fields was rejected. The MOF would fall into this category.
3. Metadata deterministically generated from existing metadata – currently no examples
4. Metadata augmented by a producer of SPDX data – currently no examples

Peter – If you agree with my above categories, which categories would your additional metadata fall into?
We’ve never implemented 3 or 4, but I have some ideas on how to approach them and would be interested to discuss, and I agree it is primarily a technical discussion – although I would want the legal team to be involved.

For 1) and 2), we should probably work with the legal team to make sure they can support the effort and that the external data sources used are valid. I would personally be opposed to creating a separate subgroup to maintain metadata on the listed licenses even if it is considered more technical than legal.

Best regards,
Gary

From: [email protected] <[email protected]> On Behalf Of Peter Monks via lists.spdx.org
Sent: Friday, December 12, 2025 8:34 AM
To: [email protected]
Cc: [email protected]; [email protected]; SPDX-legal <[email protected]>; [email protected]
Subject: Re: [spdx-tech] Expression of interest: technical metadata that embellishes the license lists

There are several metadata elements included in the published license and exception lists that aren't in the SPDX model, so that first sentence isn't accurate (referenceNumber, reference, detailsURL, etc.).

The second sentence is something that I'd consider to be "in scope" for discussion by this group, probably coupled to the discussion of whether this proposal should push for these additional metadata elements to be included in the license lists directly (vs being managed as separate artifacts).

But we're getting ahead of ourselves - are you expressing interest in being part of a group that would discuss these (and other) topics related to an expanded set of listed license and exception metadata?
Cheers,
Peter

On Fri, Dec 12, 2025 at 12:35 AM Alexios Zavras via lists.spdx.org <[email protected]> wrote:

The SPDX model defines the characteristics of licenses in the License List:
https://spdx.github.io/spdx-spec/v3.0.1/model/ExpandedLicensing/Classes/ListedLicense/

If we want to add more properties there, that's where they should go. Although these would probably be better added to one of the superclasses, since licenses not in the License List might also benefit from such extra properties.

-- zvr --

From: [email protected] <[email protected]> on behalf of Peter Monks via lists.spdx.org <[email protected]>
Sent: Friday, December 12, 2025 03:22
To: [email protected] <[email protected]>
Cc: [email protected] <[email protected]>; SPDX-legal <[email protected]>; [email protected] <[email protected]>
Subject: Re: [spdx-tech] Expression of interest: technical metadata that embellishes the license lists

G’day Karen,

I may be wrong, but I don’t think there is an SPDX affordance that exists today that really supports embellishing the listed licenses / exceptions with this kind of metadata.

(and yes this MOF use case sounds like another great example of the kind of thing I’m thinking of - thanks for sharing it!)

Cheers,
Peter

On Dec 11, 2025, at 4:12 PM, Karen Bennet via lists.spdx.org <[email protected]> wrote:

I had a call with LF-AI/Data MOF today and they are open to using SPDX fields/terminology instead of having their own metadata. They just asked that we put together what needs to change in their MOF tool to be more aligned with SPDX.
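To make the "separate artifacts" option from this thread concrete, here is a minimal Python sketch of a sidecar file keyed by SPDX license identifier that a tool merges with the published list at load time. Everything in it is an assumption for illustration: the inline sample only stands in for licenses.json, the "ContentType" values are made up rather than taken from MOF, and merge_sidecar is a hypothetical helper, not part of any SPDX or MOF tool.

```python
import json

# Tiny stand-in for json/licenses.json from spdx/license-list-data
# (only two entries and two fields, for illustration).
LICENSE_LIST = json.loads("""
{
  "licenses": [
    {"licenseId": "Apache-2.0", "name": "Apache License 2.0"},
    {"licenseId": "CC-BY-4.0", "name": "Creative Commons Attribution 4.0 International"}
  ]
}
""")

# Hypothetical sidecar maintained outside the license list. The field name
# "ContentType" comes from the MOF example in this thread; the values are
# invented for this sketch.
SIDECAR = {
    "Apache-2.0": {"ContentType": ["code"]},
    "CC-BY-4.0": {"ContentType": ["data", "documentation"]},
}


def merge_sidecar(license_list, sidecar):
    """Return new license entries with sidecar fields added.

    Fields from the published list win on any name clash, so the sidecar
    can only add metadata, never override it.
    """
    merged = []
    for entry in license_list["licenses"]:
        extra = sidecar.get(entry["licenseId"], {})
        merged.append({**extra, **entry})
    return merged


if __name__ == "__main__":
    for entry in merge_sidecar(LICENSE_LIST, SIDECAR):
        print(entry["licenseId"], entry.get("ContentType"))
```

Keeping the extra fields in a separate file leaves the published list untouched, which is one way to sidestep the question of who maintains the additional data.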
On Thursday, December 11, 2025 at 05:14:26 p.m. EST, Arthit Suriyawongkul via lists.spdx.org <[email protected]> wrote:

Another non-legal use case from the Model Openness Framework (another Linux Foundation project).

MOF has a use case of wanting to know if a license is meant for code, data, or documentation. This will be used to evaluate the appropriateness of the license.

MOF's current implementation extends licenses.json from spdx/license-list-data with a custom "ContentType" field.
https://github.com/spdx/spdx-3-model/issues/1181#issuecomment-3643835381

Cheers,
Art

The job of a citizen is to keep his mouth open. -- Günter Grass

On Fri 12 Dec 2025, 06:02 Peter Monks, <[email protected]> wrote:

Yep that's right Art - that's one example of the kind of thing I'm thinking about, though I'm trying not to bog the conversation down in specifics yet, because I'm very open to the idea that there are other forms of technical metadata that may benefit other implementers too, and IMO the focus should be on a general solution first, and then a set of consensus-identified common metadata elements second, rather than getting too caught up in the specific technical metadata I happen to need in my own tools (which may very well be a corner case when compared to other implementers' needs).

But at the risk of the conversation going down a rabbit-hole critiquing specific examples (which is premature), the specific example I gave on the implementer's call was the notion of "version series", which is basically a way to group SPDX identifiers that form a linear sequence of versions. Some examples:

* Apache-1.0, Apache-1.1, and Apache-2.0
* GPL-1.0, GPL-2.0, and GPL-3.0
* LPPL-1.0, LPPL-1.1, LPPL-1.2, LPPL-1.3a, and LPPL-1.3c
* Spencer-86, Spencer-94, and Spencer-99
* ...and more, though not all, SPDX identifiers...
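As a rough illustration of the "version series" idea, the examples above could be held as ordered lists with a reverse index, roughly as in the Python sketch below. The series labels ("Apache", "GPL", and so on) and the helper names series_of and later_versions are invented for this sketch; nothing here is an SPDX-defined field or API, and the identifiers are simply the ones quoted in the list above.

```python
# Ordered members of each series, oldest first. The identifiers are the
# examples from this thread; the series labels are made up for illustration.
VERSION_SERIES = {
    "Apache": ["Apache-1.0", "Apache-1.1", "Apache-2.0"],
    "GPL": ["GPL-1.0", "GPL-2.0", "GPL-3.0"],
    "LPPL": ["LPPL-1.0", "LPPL-1.1", "LPPL-1.2", "LPPL-1.3a", "LPPL-1.3c"],
    "Spencer": ["Spencer-86", "Spencer-94", "Spencer-99"],
}

# Reverse index: SPDX identifier -> (series label, position in the series).
SERIES_INDEX = {
    spdx_id: (series, position)
    for series, members in VERSION_SERIES.items()
    for position, spdx_id in enumerate(members)
}


def series_of(spdx_id):
    """Return the series label for an identifier, or None if it has no series."""
    hit = SERIES_INDEX.get(spdx_id)
    return hit[0] if hit else None


def later_versions(spdx_id):
    """Return the identifiers that follow spdx_id in its series (empty if none)."""
    hit = SERIES_INDEX.get(spdx_id)
    if hit is None:
        return []
    series, position = hit
    return VERSION_SERIES[series][position + 1:]


if __name__ == "__main__":
    print(series_of("LPPL-1.2"))
    print(later_versions("Apache-1.1"))
```

An explicit table is used here rather than parsing identifiers, since not every SPDX identifier belongs to a series and the ordering cannot always be inferred from the identifier string alone.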
Another example I didn't mention on the call but that I've also thought about is metadata that identifies the publisher of a license. For example:

* Apache Software Foundation (Apache licenses)
* Free Software Foundation (GPL, LGPL, AGPL, GFDL, etc. licenses, plus various exceptions)
* LaTeX Project
* Henry Spencer
* ...and more...

Additional metadata elements related to those publishers (URLs, other contact information, etc.) may also prove useful.

To reiterate - these are just some random examples I've personally run into, and I really don't want this conversation to get sidetracked into nitpicking the specific merits (or not) of these specific examples. My primary goal is to get a sense of whether other implementers also have unmet needs around additional license / exception metadata that doesn't exist in the lists today, and if so, form a group to only then start working through the nitty gritty details of what those elements are, and if/how SPDX might better support them.

I hope this also answers your concern Alexios about why the legal team may be reluctant to consider adding these kinds of things to the lists. The examples I provided have no legal relevance (and I imagine there will be other examples in the same bucket), and so it would be no surprise if the legal team were to express disinterest in adding them. The key insight is that having little to no legal significance does NOT mean that such metadata isn't valuable (i.e. to implementers).

Cheers,
Peter

On Thu, Dec 11, 2025 at 7:03 AM Arthit Suriyawongkul <[email protected]> wrote:

Peter,

During the Implementers call on 2025-12-10 you discussed a license matching optimization technique that leverages the knowledge that a couple of licenses may be related or may be members of the same series.
I can't recall it entirely, but I think that use case is less about legal (as from a legal perspective, each of them is a different license) and more about technical implementation (reducing the search space).

Would you mind repeating it here to help us better understand it?

Cheers,
Art

The job of a citizen is to keep his mouth open. -- Günter Grass
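The thread does not spell out the matching optimization itself, so the following Python sketch only shows one plausible shape of "reducing the search space" with series metadata: a cheap first pass picks a series, and the expensive text comparison then runs only against that series' members. The license identifiers and texts are placeholders, and SERIES, candidates_for, and best_match are hypothetical names; this is not a description of any existing tool's algorithm.

```python
from difflib import SequenceMatcher

# Placeholder identifiers and texts; a real tool would load SPDX license texts.
TEXTS = {
    "Example-1.0": "Example License version 1.0. You may copy this work.",
    "Example-2.0": "Example License version 2.0. You may copy and modify this work.",
    "Other-1.0": "Other License. No copying without permission.",
}

# Hypothetical series metadata: series label -> member identifiers.
SERIES = {"Example": ["Example-1.0", "Example-2.0"]}


def candidates_for(text):
    """Cheap first pass: if a series label appears in the text, keep only its members."""
    lowered = text.lower()
    for label, members in SERIES.items():
        if label.lower() in lowered:
            return members
    return list(TEXTS)  # no series hint found: fall back to the full list


def best_match(text):
    """Expensive pass: fuzzy-compare the text against the narrowed candidates only."""
    return max(
        candidates_for(text),
        key=lambda spdx_id: SequenceMatcher(None, text, TEXTS[spdx_id]).ratio(),
    )


if __name__ == "__main__":
    sample = "Example License version 2.0. You may copy and modify this work."
    print(candidates_for(sample), best_match(sample))
```

The saving comes from the second pass: with a series hint, the fuzzy comparison runs against two texts instead of the whole list.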
