On May 14, 2013, at 11:56 AM, Donna Campbell <[email protected]> wrote:

> Cambridge Journals encourages new uses for journal data by releasing its API…
> http://journals.cambridge.org/action/stream?pageId=9048&level=2


Speaking of APIs for journals, I have been revisiting JSTOR's Data For 
Research (DFR) site -- http://dfr.jstor.org. It is interesting because it 
allows you to download datasets describing JSTOR search results. Here's how:

  1. go to dfr.jstor.org
  2. sign in
  3. search the (entire) JSTOR collection
  4. refine, refine, and refine your search results
  5. request a dataset
  6. wait for email message telling you dataset is ready
  7. download dataset
  8. munge the dataset to do cool things

At most, each dataset will contain a file of citation information, a list 
of ngrams (bigrams, trigrams, and quadgrams) from each article, a list of 
statistically significant keywords from each article, and a list of the most 
frequently used words from each article. 

From this data all sorts of things can be created:

  * a tag/word cloud of each article or of the entire corpus
  * a Simile timeline of published articles
  * various citation formats
  * exports into other databases
  * after automatically downloading PDF versions of the articles,
    concordances can be created
  * services such as "find more like this one" can be implemented
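
To make a couple of those ideas concrete, here is a rough sketch in Python. 
It assumes the per-article word counts have already been parsed into memory; 
the article identifiers, the sample counts, and the function names are all 
made up for illustration, and nothing here reflects the actual layout of a 
DFR dataset. Summed counts could feed a corpus-wide tag cloud, and cosine 
similarity over the same counts is one simple way to approximate "find more 
like this one":

```python
import math
from collections import Counter

# Hypothetical stand-in for word counts parsed from a downloaded dataset.
# The identifiers and numbers are invented; the real file format is not
# described in this message.
articles = {
    "article-a": Counter({"whale": 12, "ship": 7, "sea": 5}),
    "article-b": Counter({"whale": 3, "sea": 9, "storm": 4}),
    "article-c": Counter({"garden": 8, "rose": 6, "storm": 1}),
}


def corpus_counts(articles):
    """Sum per-article counts into one Counter, e.g. for a corpus tag cloud."""
    total = Counter()
    for counts in articles.values():
        total.update(counts)
    return total


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def more_like_this(article_id, articles, n=5):
    """Rank the other articles by similarity to the given one."""
    target = articles[article_id]
    scored = [
        (other_id, cosine(target, counts))
        for other_id, counts in articles.items()
        if other_id != article_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


if __name__ == "__main__":
    print(corpus_counts(articles).most_common(3))
    print(more_like_this("article-a", articles))
```

Raw counts favor long articles and common words, so something like TF-IDF 
weighting would probably be a better basis for a real service.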

Unfortunately, DFR's search API (an SRU interface) has been discontinued, 
but the whole thing is still pretty cool.

--
Eric Lease Morgan, Digital Initiatives Librarian
University of Notre Dame

574/631-8604
