Hi Matt,
>> We are working on creation of SPDX docs from parallel scans by FOSSology and
>> Ninka. One of the things that we would like to do is identify which scanner
>> identified which license at the file level. And we would like to mark if the
>> licenses were identified by one, both, or if there was a conflict.
>> Has there been discussion of this into the new spec? Is there a way for us to
>> do it now?
One of the important underlying objectives of the SPDX spec is that it is
"tool" independent (analogous to PDF). Although the spec does include a field
to specify the reviewer which may include a tool name, this is largely intended
to establish credibility and trust. SPDX doc consumers generally expect
definitive valid SPDX values (even if it is NOASSERTION) and should not have to
decipher license conflicts between two or more tools (or humans). The
expectation is that SPDX producer organizations build trust with SPDX doc
consumers by producing quality SPDX data. It is left to the producer
organization to decide which tool (or tools) and/or procedures to use in
generating the data. Although credibility and trust can also be established
for a given tool, it is typically helpful to know whether the data was
generated by an upstream project as opposed to a tool.
Alternatively, it makes sense to create a new tool that:
i) compares the output of two or more license analysis tools and reports on
   matches and conflicts; or
ii) generates an SPDX doc by
   1. performing analysis by comparing the results of two or more existing
      license analysis tools (agents);
   2. then using that information to deduce a single valid SPDX license
      expression value for the LicenseConcluded field for each file.
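As a rough illustration of option (i), here is a minimal Python sketch that
compares per-file results from two scanners and reports matches and conflicts.
Everything in it is an assumption made for illustration: the function name
compare_scans, the input shape (a dict mapping file path to a license string),
and the policy of falling back to NOASSERTION on a conflict. It does not
reflect the actual output formats of FOSSology or Ninka, and a real tool would
first have to normalize each scanner's license names to SPDX identifiers.

```python
from typing import Dict, Optional


def compare_scans(
    fossology: Dict[str, Optional[str]],
    ninka: Dict[str, Optional[str]],
) -> Dict[str, Dict[str, str]]:
    """Compare two hypothetical {path: license} maps, one per scanner.

    For each file, record which scanner(s) identified a license and a single
    concluded value. The NOASSERTION-on-conflict policy is an assumption.
    """
    report: Dict[str, Dict[str, str]] = {}
    for path in sorted(set(fossology) | set(ninka)):
        a = fossology.get(path)
        b = ninka.get(path)
        if a and b:
            if a == b:
                identified_by, concluded = "both", a
            else:
                identified_by, concluded = "conflict", "NOASSERTION"
        elif a:
            identified_by, concluded = "fossology", a
        elif b:
            identified_by, concluded = "ninka", b
        else:
            identified_by, concluded = "none", "NOASSERTION"
        report[path] = {"identified_by": identified_by, "concluded": concluded}
    return report


if __name__ == "__main__":
    # Made-up sample data, not real scanner output.
    result = compare_scans(
        {"a.c": "MIT", "b.c": "GPL-2.0"},
        {"a.c": "MIT", "b.c": "BSD-3-Clause", "c.c": "Apache-2.0"},
    )
    for path, entry in result.items():
        print(path, entry["identified_by"], entry["concluded"])
```

The per-scanner detail stays in the tool's own report, while the value destined
for the SPDX doc remains a single, definitive one per file.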
For example, Sameer Ahmed presented the latter case (ii) at the Linux
Foundation Collab summit back in April 2013: a cloud service that used up to
six different analysis agents to identify the license of each file. Although
multiple license analysis agents (tools) were employed to analyze each file, a
weighted voting algorithm was used to deduce which single license expression
was the best choice for a given file.
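To make the weighted voting idea concrete, here is a generic Python sketch.
It is not the algorithm from that presentation, whose details I am not
describing here; the agent names, the weights, the function name
weighted_vote, and the choice to return NOASSERTION on a tie are all
hypothetical.

```python
from collections import defaultdict
from typing import Dict, Optional


def weighted_vote(
    findings: Dict[str, Optional[str]],
    weights: Dict[str, float],
    default_weight: float = 1.0,
) -> str:
    """Pick one license expression for a file from several agents' findings.

    findings maps a (hypothetical) agent name to the license it reported,
    or None if it reported nothing. Each agent's vote counts with its weight.
    A tie for the top score yields NOASSERTION (an assumed policy).
    """
    totals: Dict[str, float] = defaultdict(float)
    for agent, license_expr in findings.items():
        if license_expr is None:
            continue  # this agent found nothing for the file
        totals[license_expr] += weights.get(agent, default_weight)

    if not totals:
        return "NOASSERTION"

    best_score = max(totals.values())
    winners = [lic for lic, score in totals.items() if score == best_score]
    return winners[0] if len(winners) == 1 else "NOASSERTION"


if __name__ == "__main__":
    # Made-up agents and weights for illustration only.
    weights = {"agent_a": 1.0, "agent_b": 1.5, "agent_c": 1.0}
    findings = {"agent_a": "MIT", "agent_b": "GPL-2.0", "agent_c": "MIT"}
    print(weighted_vote(findings, weights))
```

How the weights are chosen is the producer's decision, for instance based on
how reliable each agent has proven on previously reviewed files.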
All in all, it is my opinion that the spec should remain tool independent as
much as reasonably possible, and that the SPDX working groups should encourage
the development of new tools that compare the results of different license
analysis tools (and/or humans) and report matches and conflicts.
best,
- Mark
Mark Gisi | Wind River | Senior Intellectual Property Manager
Tel (510) 749-2016 | Fax (510) 749-4552
From: [email protected]
[mailto:[email protected]] On Behalf Of Matt Germonprez
Sent: Monday, February 03, 2014 6:58 AM
To: [email protected]
Subject: Identified By
Hi everyone,
We are working on creation of SPDX docs from parallel scans by FOSSology and
Ninka. One of the things that we would like to do is identify which scanner
identified which license at the file level. And we would like to mark if the
licenses were identified by one, both, or if there was a conflict.
Has there been discussion of this into the new spec? Is there a way for us to
do it now?
Thanks,
Matt
--
Mutual of Omaha Associate Professor of Information Systems
University of Nebraska at Omaha
Vita<http://myweb.unomaha.edu/~mgermonprez/>
Open Communities Lab<http://ocrl.unomaha.edu/>
NSF Grant on Open Communities<http://1.usa.gov/17mbd1Z>