Hi Eric,

Gary O’Neall wrote a paper describing the various ways one can access the SPDX License List besides scraping. That paper is here: http://wiki.spdx.org/images/SPDX-TR-2014-2.v1.0.pdf
I’m also copying your email to the SPDX tech team, as that is a better forum for discussing this kind of thing.

Thanks,
Jilayne

SPDX Legal Team co-lead
[email protected]

> On Nov 20, 2015, at 7:51 AM, Eric S. Raymond <[email protected]> wrote:
>
> I'm thinking about writing a codewalker that would scan a source tree
> for license inclusions and replace them with SPDX tags.
>
> The hard part of this wouldn't be the code, it would be scraping
> copies of all the canonical license texts and SPDX names.
>
> For this, and other related reasons, I request that you make the
> license list available in a machine-parseable form. What I'd like to
> be able to do is write a code generator that massages that form into
> Python structures that then drive the source transformation.
>
> Here's a possible format:
>
> -----------------------------------------------------
>
> license-identifier-1 <<EOF
> text of
> license 1
> EOF
>
> license-identifier-2 <<EOF
> text of
> license 2
> EOF
>
> <much more>
> -----------------------------------------------------
>
> The details of the format don't matter much as long as it's
> self-describing, textual, and easy to parse. JSON would do,
> in which case the above would look like this:
>
> -----------------------------------------------------
> {"spdx-licenses":[
>
> {"license-identifier-1":"
> text of
> license 1
> "},
>
> {"license-identifier-2":"
> text of
> license 2
> "},
>
> <much more>
>
> ];
> -----------------------------------------------------
>
>
>
> --
> <a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
>
_______________________________________________
Spdx-tech mailing list
[email protected]
https://lists.spdx.org/mailman/listinfo/spdx-tech
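For illustration, the heredoc-style format proposed in the quoted message could be turned into a Python dictionary roughly as follows. This is only a minimal sketch under stated assumptions: the function name `parse_license_list`, the `HEADER` pattern, and the sample data are hypothetical, the `<<EOF` / `EOF` markers come from the proposal rather than from any published SPDX format, and a license text that itself contains a line consisting solely of `EOF` would not survive this scheme.

```python
import re

# Hypothetical parser for the heredoc-style format proposed in the email.
# Assumption: each entry starts with "<identifier> <<EOF" and ends with a
# line containing only "EOF". This is not an official SPDX file format.
HEADER = re.compile(r"^(\S+)\s+<<EOF\s*$")


def parse_license_list(text):
    """Return a dict mapping each license identifier to its text."""
    licenses = {}
    current_id = None
    body = []
    for line in text.splitlines():
        if current_id is None:
            match = HEADER.match(line)
            if match:
                # Start collecting the text of a new entry.
                current_id = match.group(1)
                body = []
            # Anything else outside an entry (e.g. blank lines) is ignored.
        elif line.strip() == "EOF":
            # Terminator reached: store the collected text.
            licenses[current_id] = "\n".join(body)
            current_id = None
        else:
            body.append(line)
    if current_id is not None:
        raise ValueError(f"unterminated entry for {current_id!r}")
    return licenses


sample = """\
license-identifier-1 <<EOF
text of
license 1
EOF

license-identifier-2 <<EOF
text of
license 2
EOF
"""

table = parse_license_list(sample)
```

A code generator of the kind described in the message could then iterate over `table.items()` to emit whatever Python structures drive the source transformation.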
