Hi Matt,

>> We are working on creation of SPDX docs from parallel scans by FOSSology and
>> Ninka. One of the things that we would like to do is identify which scanner
>> identified which license at the file level. And we would like to mark if the
>> licenses were identified by one, both, or if there was a conflict.
>>
>> Has there been discussion of this into the new spec? Is there a way for us to
>> do it now?

One of the important underlying objectives of the SPDX spec is that it is
tool independent (analogous to PDF). Although the spec does include a field
to specify the reviewer, which may include a tool name, this is largely
intended to establish credibility and trust. SPDX doc consumers generally
expect definitive, valid SPDX values (even if the value is NOASSERTION) and
should not have to decipher license conflicts between two or more tools (or
humans). The expectation is that SPDX producer organizations build trust with
SPDX doc consumers by producing quality SPDX data. It is left to the producer
organization to decide which tool (or tools) and/or procedures to use in
generating the data. Although credibility and trust can also be established
for a given tool, it is typically helpful to know whether the data was
generated by an upstream project as opposed to a tool.

Alternatively, it makes sense to create a new tool that:

i)   compares the output of two or more license analysis tools and reports on
     matches and conflicts; or

ii)  generates an SPDX doc by

     1. comparing the results of two or more existing license analysis tools
        (agents);

     2. then using that information to deduce a single valid SPDX license
        expression value for the LicenseConcluded field for each file.
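
As a rough illustration of option (i), here is a minimal sketch. It assumes
each scanner's output has already been reduced to a mapping from file path to
a single license identifier string. The function name, the status labels, and
the sample data are hypothetical and are not part of FOSSology, Ninka, or the
SPDX spec.

```python
def compare_scans(scan_a, scan_b):
    """Compare two per-file license mappings (path -> license string).

    Returns a dict mapping each path to (status, license_a, license_b),
    where status is 'match', 'conflict', 'only_a', or 'only_b'.
    """
    report = {}
    for path in sorted(set(scan_a) | set(scan_b)):
        lic_a = scan_a.get(path)
        lic_b = scan_b.get(path)
        if lic_a is not None and lic_b is not None:
            # Both scanners reported something for this file.
            status = "match" if lic_a == lic_b else "conflict"
        elif lic_a is not None:
            status = "only_a"
        else:
            status = "only_b"
        report[path] = (status, lic_a, lic_b)
    return report


# Hypothetical sample data standing in for two scanners' results.
scanner_a = {"src/main.c": "GPL-2.0", "src/util.c": "MIT"}
scanner_b = {"src/main.c": "GPL-2.0", "src/util.c": "BSD-3-Clause",
             "README": "MIT"}

for path, (status, a, b) in compare_scans(scanner_a, scanner_b).items():
    print(path, status, a, b)
```

A report like this lives outside the SPDX doc itself, so the producer can
resolve the conflicts before publishing a single value per file.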


For example, Sameer Ahmed presented an instance of the latter case (ii) at the
Linux Foundation Collaboration Summit in April 2013. A cloud service was
presented that used up to six different analysis agents to identify the license
of each file. Although multiple license analysis agents (tools) were employed
to analyze each file, a weighted voting algorithm was used to deduce which
single license expression was the best choice for a given file.
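
I do not have the details of the algorithm used in that service, so the
following is only a sketch of one way a weighted vote could work. The agent
names, the weights, the default weight of 1.0, and the alphabetical
tie-breaking rule are all assumptions made for illustration.

```python
from collections import defaultdict


def weighted_vote(findings, weights, default="NOASSERTION"):
    """Pick one license expression for a file from several agents' findings.

    findings: agent name -> license expression (or None if no finding)
    weights:  agent name -> numeric weight (hypothetical trust score)
    """
    totals = defaultdict(float)
    for agent, license_expr in findings.items():
        if license_expr is None:
            continue  # this agent found nothing for the file
        # Agents without a configured weight count as 1.0 (an assumption).
        totals[license_expr] += weights.get(agent, 1.0)
    if not totals:
        return default
    # Highest total wins; ties are broken alphabetically for determinism.
    return min(totals, key=lambda lic: (-totals[lic], lic))


# Hypothetical agents and weights.
weights = {"agent_a": 1.0, "agent_b": 0.4, "agent_c": 0.4}
findings = {"agent_a": "MIT", "agent_b": "GPL-2.0", "agent_c": "GPL-2.0"}
print(weighted_vote(findings, weights))
```

The result would go in LicenseConcluded, so the consumer still sees exactly
one value per file regardless of how many agents were consulted.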

All in all, it is my opinion the spec should remain tool independent as much as 
reasonably possible and the SPDX working groups should encourage the 
development of new tools that could compare the results of different license 
analysis tools (and/or humans) and report matches and conflicts.

best,
- Mark

Mark Gisi | Wind River | Senior Intellectual Property Manager
Tel (510) 749-2016 | Fax (510) 749-4552



From: [email protected] 
[mailto:[email protected]] On Behalf Of Matt Germonprez
Sent: Monday, February 03, 2014 6:58 AM
To: [email protected]
Subject: Identified By

Hi everyone,

We are working on creation of SPDX docs from parallel scans by FOSSology and 
Ninka. One of the things that we would like to do is identify which scanner 
identified which license at the file level. And we would like to mark if the
licenses were identified by one, both, or if there was a conflict.

Has there been discussion of this into the new spec? Is there a way for us to 
do it now?

Thanks,
Matt

--
Mutual of Omaha Associate Professor of Information Systems
University of Nebraska at Omaha
Vita<http://myweb.unomaha.edu/~mgermonprez/>
Open Communities Lab<http://ocrl.unomaha.edu/>
NSF Grant on Open Communities<http://1.usa.gov/17mbd1Z>
_______________________________________________
Spdx-tech mailing list
[email protected]
https://lists.spdx.org/mailman/listinfo/spdx-tech
