Re: [CODE4LIB] internet archive api

Murray, Gregory Tue, 19 Sep 2017 06:59:55 -0700

Eric,

I see your question has been answered by referring you to the Python
tool, but just FYI a quick and dirty option is simply to use the advanced
search form (https://archive.org/advancedsearch.php) to choose the
collection you want, choose which fields you want returned (e.g.
"identifier"), and choose your output format (JSON, XML, CSV, etc.).
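
If you would rather script it than click through the form, the same
request can be sketched in a few lines of Python. This is only a rough
sketch, not an official client: the query parameters (q, fl[], rows,
page, output) are the ones the form appears to put in its URL, and the
response layout (a "response" object holding a "docs" list) is an
assumption you should check against what the form actually returns.
The function names are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint behind the advanced search form mentioned above.
BASE_URL = "https://archive.org/advancedsearch.php"


def build_search_url(collection, fields=("identifier",), rows=100, page=1):
    """Build a search URL for one page of a collection's items.

    Parameter names are assumed to match what the form generates.
    """
    params = [("q", f"collection:{collection}")]
    params += [("fl[]", field) for field in fields]
    params += [("rows", rows), ("page", page), ("output", "json")]
    return f"{BASE_URL}?{urlencode(params)}"


def extract_identifiers(payload):
    """Pull identifiers out of a decoded JSON response.

    Assumes the layout {"response": {"docs": [{"identifier": ...}]}}.
    Records without an "identifier" key are skipped.
    """
    docs = payload.get("response", {}).get("docs", [])
    return [doc["identifier"] for doc in docs if "identifier" in doc]


def fetch_identifiers(collection, rows=100, page=1):
    """Fetch one page of identifiers for a collection over HTTP."""
    url = build_search_url(collection, rows=rows, page=page)
    with urlopen(url, timeout=30) as resp:
        return extract_identifiers(json.load(resp))


if __name__ == "__main__":
    # "bplsceep" is the collection from Eric's message.
    for identifier in fetch_identifiers("bplsceep"):
        print(identifier)
```

For a collection with more items than fit in one request, you would
call fetch_identifiers again with page=2, 3, and so on until it
returns an empty list.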


Greg


On 9/18/17, 3:37 PM, "Code for Libraries on behalf of Eric Lease Morgan"
<[email protected] on behalf of [email protected]> wrote:

>Is there an Internet Archive API that will allow me to get the contents
>of a collection as a stream of data and not as a stream of HTML?
>
>A cool collection of early English print materials is available at the
>following URL:
>
>  https://archive.org/details/bplsceep
>
>Each item is associated with an Internet Archive identifier. If I were
>able to easily extract these identifiers, then I would be more easily
>able to provide services based on the collection. But I'm lazy. I don't
>want to read the HTML and scrape it accordingly. Ick! I'd rather be given
>the list of bibliographics in a more computer-friendly way.
>
>Again, can I programmatically read the contents of an Internet Archive
>collection?
>
>--
>Eric Morgan
