Adding to Philippe's comment on using the standard headers. Parsing the optional and variable tags is relatively straightforward if you would like to convert those standard headers into their original text form.
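
As a rough sketch of that conversion, the Python below applies the two rules spelled out in the rest of this message. The function name header_to_text is made up for illustration, and the code assumes that the original= text contains no ";" and that no field contains ">>"; it is not taken from any SPDX tool.

```python
import re

# Optional-text markers. Attributes after the tag name are tolerated
# (e.g. <<beginOptional;name=...>>), which is an assumption about the input.
_OPTIONAL_TAG = re.compile(r"<<(?:beginOptional|endOptional)[^>]*>>")

# A var tag: <<var;name=...;original=...;match=...>>
_VAR_TAG = re.compile(r"<<var;(.*?)>>", re.DOTALL)


def _original_of(match):
    """Return the original= value of one var tag, or the tag itself if absent."""
    for field in match.group(1).split(";"):
        key, _, value = field.partition("=")
        if key.strip() == "original":
            return value
    return match.group(0)


def header_to_text(header):
    """Drop optional markers and replace each var tag with its original text."""
    without_optional = _OPTIONAL_TAG.sub("", header)
    return _VAR_TAG.sub(_original_of, without_optional)
```

For example, header_to_text('<<var;name=orgClause3;original=the copyright holder;match=.+>>') returns "the copyright holder". The text between the optional markers is kept; only the markers themselves are removed.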
Just remove the <<beginOptional>> and <<endOptional>> tags. For the <<var ...>> tag, replace the <<var ...>> with the text in the original= field (e.g. "<<var;name=orgClause3;original=the copyright holder;match=.+>>" would be replaced by "the copyright holder"). The intention of these <<optional>> and <<var>> tags is to assist in matching (see Appendix 2 of the Spec for more information: https://spdx.org/spdx-specification-21-web-version#h.2mjng0vqrghe). That being said, since they are easy to parse out, you should be able to use the standard headers to help with your documentation.

> -----Original Message-----
> From: [email protected] [mailto:spdx-tech-
> [email protected]] On Behalf Of Philippe Ombredanne
> Sent: Friday, March 3, 2017 3:24 AM
> To: SPDX Tech
> Subject: Re: [spdx-tech] Retrieving standard license headers
>
> On Fri, Mar 3, 2017 at 10:58 AM, Didier Verna <[email protected]> wrote:
> > Hello,
> > I'm the author of an automatic documentation system for libraries
> > written in the Common Lisp programming language.
>
> Welcome!
> https://github.com/didierverna/declt looks neat!
>
> > Lisp has a couple of
> > de-facto standard tools for describing and managing packages, and I'm
> > trying to push those to using SPDX format where relevant.
>
> This is great. Thank you for trying to make the world a better place wrt
> license clarity. Be assured that will help in any way we can!
>
> > I see that SPDX makes license texts accessible on github, but what
> > about standard license headers ? I couldn't find those anywhere,
> > although I know you've got them somewhere (they are provided in the HTML
> > pages).
> >
> > It would be convenient for me to have programmatic access to the
> > standard license headers, because then, given an SPDX license
> > identifier, I could then retrieve those and incorporate them directly
> > in the generated documentation files.
>
> A couple things:
>
> You could use the structured data available (such as JSON) available in
> https://github.com/spdx/license-list-data/tree/master/json
>
> When available the standard header would be in the standardLicenseHeader
> attribute for a license.
>
> See this for example:
>
> https://gist.github.com/pombredanne/50b04d5d4f37070cca9921bc5cbb9018#file-gpl-1-0-json-L9
>
> You will find that these "standardLicenseHeader" texts when present have a few
> issues, but nothing too bad:
>
> - they contain a copyright template of sorts which is entirely not needed and
> make these useless for re-using in any clean documentation or attribution
> notice.
>
> You could remove these types of string:
> "<<var;name=copyright;original=19xx name of author;match=.+>>"
> or use that somehow and replace it with a copyright but the results are likely
> to be rather ugly in many case. They also assume that you have captured a
> clean and structured copyright statement beforehand.
>
> - headers may not be available at all. Since the "+" "or later" ids have been
> deprecated, there is no way that I know of where to get a clean "GPL 2.0 or
> later notice" for instance. Some license may not be up to date too, but this
> is minor and easy to fix over time.
>
> - in several cases, using a standard template may be grossly incorrect and the
> actual exact notice that was in the original package should be what is reused
> if you want crisp and clear documentation and compliance. For this you would
> need to actually capture these real notices and copyright from the code
> proper.
>
> You could try my scancode-toolkit for this with the --license-text option
> though that part while stable and decent is not 100% there to reassemble a
> *perfect* notice capture. We are working on it though.
> In any case you would get from this a JSON with copyrights and licenses texts
> and notices alright.
> I confess that scancode-toolkit was mistakenly written in Python and not Lisp
> as it should have. We are working slowly to implement a messy, incomplete
> subset of an imperfect Lisp engine as all serious software projects tend to do
> eventually.
>
> > Lisp, Jazz, Aïkido: http://www.didierverna.info
>
> A definitely eclectic mix!
> And I like the order of your priorities: Think, sing, dance.
>
> --
> Cordially
> Philippe Ombredanne
> _______________________________________________
> Spdx-tech mailing list
> [email protected]
> https://lists.spdx.org/mailman/listinfo/spdx-tech
