On Mon, Oct 7, 2013 at 5:04 AM, Stefan Wurzinger < [email protected]> wrote:
> I'm new to Open Library and I want to ask if there is a Java source
> code available to implement a tool which allows to download books from
> the Open Library platform.

I don't believe there's a specific Java client library for OpenLibrary. It's pretty simple though. As Karen pointed out, it's basically just a matter of constructing the right URLs and parsing the JSON that is returned. Here's a simple example I found via Google (although it's doing other stuff with the results):

http://www.nuxeo.com/blog/development/2013/02/qa-friday-external-web-service-call-included-automation-chain/

As you can see, it's just a few lines of code to get the basic results.

As Karen mentioned, OpenLibrary only stores metadata, not text. You'll need to get the text from Internet Archive in your desired format. If you were starting with, for example, authors, you'd want to follow the chain Author->Works->Editions->Text and filter to only include editions which contain Internet Archive identifiers. Alternatively, you could use the search API to search directly for subjects, etc. If you're using the search API, has_fulltext=true is a good way to achieve that filtering.

Here's a sequence of URLs showing the progression from the search API through to Internet Archive:

http://openlibrary.org/search?q=volleyball&has_fulltext=true
http://openlibrary.org/books/ia:passsetcrushvoll00luca/Pass_set_crush_volleyball_illustrated
http://openlibrary.org/books/ia:passsetcrushvoll00luca.json
http://archive.org/details/passsetcrushvoll00luca
  (using the value of the key "ocaid" from the JSON result above)
http://www.archive.org/download/passsetcrushvoll00luca/
  (switching to the download directory from the HTML view; this will redirect to the actual storage location)

The last URL is where you can see (what would be) the available download formats. Available formats include raw OCR XML, PDF, text, DjVu XML, etc. In this particular case, the book is in copyright, so you can't download the files.
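For the Java side, the URL progression above could be sketched roughly as follows. This is only an illustration under assumptions: the class name OpenLibraryUrls and its method names are made up for this example (they are not part of any OpenLibrary client), the URL patterns are copied from the sequence above, and the regex lookup of "ocaid" is a shortcut standing in for a real JSON parser. It makes no network calls; fetching is left to whatever HTTP client you prefer.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: builds the URLs from the sequence above.
final class OpenLibraryUrls {
    // Shortcut for the sketch: matches "ocaid": "<value>" in raw JSON text.
    // A real tool should use a proper JSON parser instead.
    private static final Pattern OCAID =
            Pattern.compile("\"ocaid\"\\s*:\\s*\"([^\"]+)\"");

    private OpenLibraryUrls() {}

    // Search restricted to records with full text (has_fulltext=true).
    static String searchUrl(String query) {
        String q = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return "http://openlibrary.org/search?q=" + q + "&has_fulltext=true";
    }

    // JSON metadata for an edition addressed by its Internet Archive id.
    static String editionJsonUrl(String iaId) {
        return "http://openlibrary.org/books/ia:" + iaId + ".json";
    }

    // Pull the "ocaid" value out of the edition JSON, if present.
    static Optional<String> extractOcaid(String editionJson) {
        Matcher m = OCAID.matcher(editionJson);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // HTML details page on Internet Archive.
    static String detailsUrl(String ocaid) {
        return "http://archive.org/details/" + ocaid;
    }

    // Download directory listing the available formats.
    static String downloadUrl(String ocaid) {
        return "http://www.archive.org/download/" + ocaid + "/";
    }

    public static void main(String[] args) {
        String id = "passsetcrushvoll00luca";
        System.out.println(searchUrl("volleyball"));
        System.out.println(editionJsonUrl(id));
        System.out.println(detailsUrl(id));
        System.out.println(downloadUrl(id));
    }
}
```

You would fetch the search URL, pick an edition, fetch its .json URL, pass the response body to extractOcaid, and then build the download URL from the result. An empty Optional means the edition has no Internet Archive identifier and can be skipped.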
That in-copyright example points to another thing you'll need to deal with: the ebooks/fulltext search will also return hits on DAISY-format books for blind people, which are only readable on specially registered devices. I don't know off the top of my head how to filter them out.

Hope that helps. Feel free to ask follow-up questions if you get stuck.

Tom
_______________________________________________
Ol-tech mailing list
[email protected]
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
Archives: http://www.mail-archive.com/[email protected]/
To unsubscribe from this mailing list, send email to [email protected]
