[ 
https://issues.apache.org/jira/browse/MPIR-382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16843539#comment-16843539
 ] 

Gary O'Neall commented on MPIR-382:
-----------------------------------

A couple of suggestions from a member of the SPDX team: SPDX license IDs give 
you an unambiguous identifier for each license.  There are a few ways to 
determine whether a license matches a particular license ID.  The name is one 
of the less reliable ones, so I would avoid using the license name from the POM 
file.

By way of background, the SPDX organization maintains a [curated list of open 
source licenses|[https://spdx.org/licenses/]] along with [matching 
guidelines|[https://spdx.org/spdx-license-list/matching-guidelines]] and [tool 
libraries|[https://github.com/spdx/tools]] to solve issues like normalizing 
license names, as discussed in this issue.

The most reliable method would be to match against the actual text using the 
matching guidelines.  There is a [Java library 
class|[https://github.com/spdx/tools/blob/master/src/org/spdx/compare/LicenseCompareHelper.java]]
 which can help with the matching, but it is not a high-performance 
utility and it requires the actual text (e.g. any HTML would need to be scraped 
and the actual license text extracted).
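
To give a feel for what text matching involves, here is a minimal sketch that compares two license texts after ignoring case, whitespace and quote/dash variants.  It is a hypothetical, heavily simplified helper (SimpleLicenseTextMatcher is an invented name), not the LicenseCompareHelper class, and it covers only a small subset of the matching guidelines.

```java
import java.util.Locale;

/** Hypothetical, heavily simplified comparison; this is NOT LicenseCompareHelper. */
final class SimpleLicenseTextMatcher {

    /** Lower-cases the text, unifies quote and dash variants, and collapses whitespace runs. */
    static String normalize(String text) {
        String s = text.toLowerCase(Locale.ROOT);
        // Treat typographic quotes, double quotes and backticks as a plain single quote.
        s = s.replaceAll("[\\u2018\\u2019\\u201C\\u201D\"`]", "'");
        // Treat the Unicode hyphen/dash variants as a plain hyphen.
        s = s.replaceAll("[\\u2010-\\u2015]", "-");
        // Line breaks and repeated spaces are not significant.
        s = s.replaceAll("\\s+", " ");
        return s.trim();
    }

    /** Returns true if both texts are equal after the simplified normalization above. */
    static boolean sameText(String a, String b) {
        return normalize(a).equals(normalize(b));
    }
}
```

The full guidelines also deal with things like copyright notices and replaceable text, which is why using the library is preferable to a hand-rolled comparison like this one.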

The approach we took in the [SPDX Maven 
Plugin|[https://github.com/spdx/spdx-maven-plugin]] was to use the license 
URLs for matching.  The code for this can be found in the [LicenseManager 
class|[https://github.com/spdx/spdx-maven-plugin/blob/master/src/main/java/org/spdx/maven/LicenseManager.java]].
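
As a rough illustration of URL-based matching, the sketch below keeps a map from normalized license URLs to SPDX license IDs and looks up the URL found in a POM.  The class name UrlLicenseMatcher and the normalization rules are assumptions made for this example and do not reflect the actual LicenseManager code; the known URLs could come, for instance, from the reference URLs (the seeAlso field) published with each entry of the SPDX license list.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper, not the LicenseManager API: maps POM license URLs to SPDX IDs. */
final class UrlLicenseMatcher {
    private final Map<String, String> idByUrl = new HashMap<>();

    /** Registers a known URL for an SPDX license ID. */
    void register(String url, String spdxId) {
        idByUrl.put(normalize(url), spdxId);
    }

    /** Looks up the SPDX ID for a license URL taken from a POM, if it is known. */
    Optional<String> match(String pomUrl) {
        if (pomUrl == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(idByUrl.get(normalize(pomUrl)));
    }

    /**
     * Assumed normalization: trim, lower-case, then drop the scheme, a leading
     * "www.", trailing slashes and a trailing ".txt" or ".html" extension.
     */
    static String normalize(String url) {
        String s = url.trim().toLowerCase(Locale.ROOT);
        s = s.replaceFirst("^https?://", "");
        s = s.replaceFirst("^www\\.", "");
        s = s.replaceFirst("/+$", "");
        s = s.replaceFirst("\\.(txt|html)$", "");
        return s;
    }
}
```

With https://www.apache.org/licenses/LICENSE-2.0 registered as Apache-2.0, a POM that declares http://www.apache.org/licenses/LICENSE-2.0.txt resolves to the same ID, whatever license name it uses.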

Feel free to leverage code from the libraries.  The code itself uses the 
Apache-2.0 license, and there are some dependencies which use Apache-friendly 
licenses.

You can also post questions to the [SPDX tech mailing 
list|[https://lists.spdx.org/g/spdx-tech]] or submit issues against the license 
list XML or tools libraries in the [SPDX git 
repository|[https://github.com/spdx]].

> Group different license spellings together
> ------------------------------------------
>
>                 Key: MPIR-382
>                 URL: https://issues.apache.org/jira/browse/MPIR-382
>             Project: Maven Project Info Reports Plugin
>          Issue Type: Improvement
>          Components: licenses
>    Affects Versions: 3.0.0
>            Reporter: Vincent Privat
>            Priority: Major
>
> The licenses report is not aware of the different possible spellings we can 
> find in all Maven dependencies published on Maven Central. This results in 
> any given license (for example, Apache 2.0) being listed several times. For 
> example, take a look at the seven variants listed in [maven-site-plugin 
> dependencies|http://maven.apache.org/plugins/maven-site-plugin/dependencies.html#Licenses]:
>  * Apache Public License 2.0
>  * Apache License Version 2
>  * Apache License, Version 2.0
>  * Apache Software License - Version 2.0
>  * Apache License 2.0
>  * Apache License Version 2.0
>  * The Apache Software License, Version 2.0
> On a large project, this makes the license report unreadable. The plugin 
> should detect that all these variants denote the same license and group them 
> together.



