I have put on GitHub a thing I call the EEBO-TCP Workset Browser. [1] From the README file:
The EEBO-TCP Workset Browser is a suite of software designed to support "distant reading" against the Early English Books Online - Text Creation Partnership (EEBO-TCP) corpus. Using the Browser it is possible to: 1) search a "catalog" of the corpus's metadata, 2) create a list of identifiers representing a subset of content for study, and 3) feed the identifiers to a set of files which will mirror the content locally, index it, and do some rudimentary analysis, outputting a set of HTML files, structured data, and graphs. The reader is then expected to examine the output more "closely" (all puns intended) using their favorite Web browser, text editor, spreadsheet, database, or statistical application. The purpose and functionality of this suite are very similar to those of the HathiTrust Research Center Workset Browser.

[1] EEBO-TCP Workset Browser - https://github.com/ericleasemorgan/EEBO-TCP-Workset-Browser

Eric Lease Morgan, Librarian
University of Notre Dame