(Also, I'm sorry for the mess that is this code... it started off as a
one-off hack of its own, so I didn't do much to make it maintainable or
clean)

On Tue, Feb 7, 2017, at 18:34, Kris Reeves wrote:
> In Node, I would probably use... cheerio to do quick one-off structural
> adjustments, and maybe a sax parser like sax-js to do streaming
> adjustments of tag names and stuff, then pipe it all through a
> (hopefully fixed) formatting thing like the wrap-xml script to do the
> line-wrapping and pretty printing. When we're ready for that, I hope I
> can help out! It will be difficult until all the files are in one place
> to do it in one pass, though.
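A minimal sketch of the tag-renaming step described above. To stay dependency-free it uses a plain regex in place of sax-js or cheerio, so it is an illustration rather than the pipeline itself. The names in `RENAMES` are hypothetical; the real old-to-new mapping is not given in this thread.

```javascript
// Hypothetical old -> new element names (assumption, not the real SPDX mapping).
const RENAMES = new Map([
  ['oldName', 'newName'],
  ['body', 'text'],
]);

// Rename element names in opening, closing, and self-closing tags.
// The regex is a stand-in for a real streaming parser such as sax-js:
// it does not understand comments or CDATA sections.
function renameTags(xml, renames) {
  return xml.replace(/<(\/?)([A-Za-z_][\w.-]*)/g, (match, slash, name) =>
    renames.has(name) ? `<${slash}${renames.get(name)}` : match
  );
}
```

Calling `renameTags(xml, RENAMES)` on a file's contents returns the renamed XML as a string, which could then be piped through a formatter such as `wrap-xml`.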
> 
> Kris
> 
> On Tue, Feb 7, 2017, at 10:58, [email protected] wrote:
> > Thanks Kris for the update and pointers to the code.
> > 
> > I'll give it a try. There looks to be a good amount of detail here, and
> > with access to the source, I don't expect any problems. I'm also warming
> > up to Node as a decent infrastructure for the tooling.
> > 
> > Do you have any tools or ideas for transforming the current XML
> > attributes and property names to the new names? If not, I can write
> > something to do the conversion (although it will be in Java ;)
> > 
> > Gary
> > 
> > 
> > > -----Original Message-----
> > > From: [email protected] [mailto:spdx-tech-
> > > [email protected]] On Behalf Of J Lovejoy
> > > Sent: Monday, February 6, 2017 11:09 PM
> > > To: Kris Reeves
> > > Cc: [email protected]; SPDX-legal
> > > Subject: Re: Update
> > > 
> > > Thanks Kris!
> > > 
> > > The legal team will get cracking on the exceptions and new licenses -
> > > good to know everything left to review is all in there.
> > > 
> > > I’ve copied the tech team, as I’m hoping someone more code-savvy than I
> > > can parse the instructions on the tool below and then see how we can use
> > > that going forward.
> > > 
> > > We’ll miss you in Tahoe this year, in any case!
> > > 
> > > Jilayne
> > > 
> > > SPDX Legal Team co-lead
> > > [email protected]
> > > 
> > > 
> > > > On Feb 5, 2017, at 3:13 PM, Kris Reeves <[email protected]> wrote:
> > > >
> > > > Hi, folks!
> > > >
> > > > I've got some updates for you, though I imagine those of you
> > > > subscribed to notifications on the spdx/license-list-XML repo have
> > > > probably got a bunch of notifications, for which I apologize!
> > > >
> > > > First up, I've sent PRs for all the exceptions and the new licenses.
> > > > Some of these may still have the kinds of problems we had before, but
> > > > I hope not too many. Perfectionism has been getting in the way of me
> > > > getting things done, so I figure something is better than nothing here.
> > > >
> > > > Next, the conversion tool I've been using, which has been updated to
> > > > deal with exceptions from the XLS:
> > > > https://github.com/myndzi/license-tool
> > > >
> > > > I'm sure if I did the wrong thing license wise with that repo, someone
> > > > will tell me ;)
> > > >
> > > > A number of notes are required for explaining how to use this, which
> > > > I'll enumerate here:
> > > >
> > > > Installing:
> > > > - You need a recent-ish version of Node (at least one that supports
> > > > arrow functions), which I believe is >=4. Various package managers
> > > > include Node, but it's generally considered best by the Node community
> > > > to install the latest package from the website here:
> > > > https://nodejs.org/en/download/current/ (I typically build from source).
> > > > For convenience, you might take advantage of `n`:
> > > > https://github.com/mklement0/n-install (I recommend auditing any shell
> > > > > scripts rather than just blindly running them!) -- `n` can be installed
> > > > this way without having Node, and then you can simply execute `n
> > > > latest` to get the latest build.
> > > > - Clone the repository
> > > > - From inside the cloned folder, `npm install`
> > > >
> > > > Using:
> > > > `node convert` or `node convert exceptions` in the project directory
> > > >
> > > > Since this tool was written to batch-process a bunch of files, I never
> > > > really gave it a one-off mode. It looks for an SPDX spreadsheet in
> > > > ./license-list and attempts to run the process for every license (or
> > > > exception) it finds that *does not exist* in ./src/licenses or
> > > > ./src/exceptions
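The "skip what already exists" behaviour described above could be sketched roughly as follows. This is not the tool's actual code: it assumes the identifiers have already been read from the spreadsheet into an array, and that each converted file is named `<id>.xml`, neither of which this thread confirms.

```javascript
const fs = require('fs');
const path = require('path');

// Return the identifiers that have no <id>.xml file in outDir yet.
// The <id>.xml naming scheme is an assumption made for this sketch.
function pendingIds(ids, outDir) {
  const done = new Set(
    fs.existsSync(outDir)
      ? fs.readdirSync(outDir).map((f) => path.basename(f, '.xml'))
      : []
  );
  return ids.filter((id) => !done.has(id));
}
```

Because the check is made against the output directory on each run, re-running after a crash or abort would naturally resume with whatever has not been written yet.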
> > > >
> > > > There is a branch (`git checkout current`) on the license-tool
> > > > repository that has all the XML files I have previously converted
> > > > checked in, so for future batches one should be able to update the
> > > > license-list subrepo and pull the new files, then run the batch
> > > > converter (`node convert`)
> > > >
> > > > (For this latest batch, I copied the XML files from my previous work
> > > > into ./src/licenses and ran the script; then, I checked out master,
> > > > which left the un-added files dangling, copied them to my
> > > > license-list-XML fork, and ran a little bash script to check each one
> > > > into its own branch individually and push it up to github. I created
> > > > the PRs manually this time.)
> > > >
> > > > The "user interface":
> > > > The conversion tool presents you with a UI for each file. You are able
> > > > to mark sections of the text in one of four modes, and optionally
> > > > toggle the "review" flag.
> > > >
> > > > Keys:
> > > > 1 - title mode
> > > > 2 - copyright mode
> > > > 3 - license mode
> > > > 4 - optional mode
> > > > "`", "~" - toggle 'review'
> > > > esc, q - abort/quit
> > > > > enter, tab - write file, proceed to next
> > > > > up, down - extend/reduce current block by one chunk
> > > > > page up, page down - extend/reduce current block by one page
> > > >
> > > > You *must* have marked *all* the license body before continuing,
> > > > otherwise the program will just crash when you hit enter (low priority
> > > > bug for me). I usually hit pgdn a few times at the end to make sure of
> > > > this and gobble up any blank trailing lines.
> > > >
> > > > If it crashes, don't worry -- it'll pick up where it left off when you
> > > > run it again.
> > > >
> > > > You'll notice that SPDX markup and bullet points are highlighted in
> > > > the license body when using the conversion tool; you can't change
> > > > this, it's only there to display to you what it has identified and
> > > > will perform special actions on.
> > > >
> > > > There is one other utility included in here, `wrap-xml`; this can be
> > > > used to reformat an XML file by wrapping it to a given width.
> > > > Recommended for heavily-edited XML files to keep them nice. It will
> > > > rewrite the indentation and so on. It is, I think, the culprit of the
> > > > over-escaped problem in some of the existing licenses (all those
> > > > unnecessary &quot; entities and stuff). I'll reserve fixing this for
> > > > future work (or anyone who wants to send a PR!). With some changes, it
> > > > should be usable to fix all those instances in batch. This script in
> > > > particular operates on stdin and stdout, so to reformat an xml file
> > > > you would do something like `cat file.xml | node wrap-xml >
> > > > file-new.xml`
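One possible shape for a batch fix of the over-escaping mentioned above, as a sketch only. It relies on the fact that `"` and `'` do not need escaping in XML text content, and it leaves tags (and the attribute values inside them) untouched. The function name is hypothetical, and the tag-splitting regex is a simplification that does not handle comments, CDATA, or a literal `>` inside an attribute value.

```javascript
// Replace &quot; and &apos; with literal quotes in text content only.
function unescapeQuotesInText(xml) {
  return xml
    .split(/(<[^>]*>)/) // the capture group keeps the tags in the result array
    .map((part) =>
      part.startsWith('<')
        ? part // a tag: leave as-is so attribute values stay escaped
        : part.replace(/&quot;/g, '"').replace(/&apos;/g, "'")
    )
    .join('');
}
```

Other entities such as `&amp;` and `&lt;` are deliberately left alone, since those are required in text content.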
> > > >
> > > > > One last caveat: the list detection is based on the assumption that
> > > > > the input text files have been formatted in a very specific way (it
> > > > > counts spaces), which is part of the reason why it's broken. This will
> > > > > probably need to be adjusted before it's suitable for use on arbitrary
> > > > > input text files (new license candidates).
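To illustrate why space-counting is fragile, here is a rough sketch of that style of list detection. The tool's actual rules are not described in this thread, so the indent width (`INDENT`) and the marker pattern (`MARKER`) are assumptions chosen for the example.

```javascript
// Assumed convention (hypothetical): each nesting level is indented by
// INDENT spaces, and an item starts with a marker like "1.", "(a)", "a)" or "-".
const INDENT = 4;
const MARKER = /^(\d+\.|\([a-z0-9]+\)|[a-z]\)|[-*])\s+/i;

// Return { depth, marker, text } for a list item, or null for any other line.
function detectListItem(line) {
  const rest = line.trimStart();
  const leading = line.length - rest.length; // count of leading whitespace chars
  const m = MARKER.exec(rest);
  if (!m) return null;
  return {
    depth: Math.floor(leading / INDENT),
    marker: m[1],
    text: rest.slice(m[0].length),
  };
}
```

Any input that indents with tabs, or with a different number of spaces per level, would get the wrong `depth` here, which is the kind of adjustment arbitrary input files would need.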
> > > >
> > > > Sorry for dragging my feet for so long, and I hope this gets us caught
> > > > up!
> > > >
> > > > Kris
> > > > _______________________________________________
> > > > Spdx-legal mailing list
> > > > [email protected]
> > > > https://lists.spdx.org/mailman/listinfo/spdx-legal
> > > 
> > > _______________________________________________
> > > Spdx-tech mailing list
> > > [email protected]
> > > https://lists.spdx.org/mailman/listinfo/spdx-tech
> > 
