Re: [CODE4LIB] Internet Archive collection codes?
Peter,

I've seen no official information or documentation from the Internet Archive either. I've actually been quite frustrated by several issues for a while now. For example: If you go to http://www.archive.org/details/nonexistentidentifier you'll get a human-readable web page stating that the item cannot be found. That page, however, is served up with an HTTP status of 200 OK, not 404 NOT FOUND. In addition, I've noticed that when certain requests fail due to system load and other issues, I get back an HTML page saying something like the system is experiencing slowness, but again with a 200 OK instead of a 503 SERVICE UNAVAILABLE (ideally with a Retry-After header).

These things alone make it extremely difficult to automate any large-scale metadata retrieval from the Internet Archive, and that's without any attempt to download content.

I'm working on a post documenting some of the techniques and strategies that have worked for us, but it's not quite ready for human consumption yet.

Michael

--
Michael B. Klein
Digital Initiatives Technology Librarian
Boston Public Library
[EMAIL PROTECTED]

From: Binkley, Peter [EMAIL PROTECTED]
Reply-To: Code for Libraries CODE4LIB@LISTSERV.ND.EDU
Date: Thu, 5 Jun 2008 13:08:13 -0600
To: CODE4LIB@LISTSERV.ND.EDU
Conversation: [CODE4LIB] Internet Archive collection codes?
Subject: Re: [CODE4LIB] Internet Archive collection codes?

While we're on the subject, are there any more up-to-date instructions for harvesting from Internet Archive than these?

http://biodiversitylibrary.blogspot.com/2008/03/harvesting-process-from-internet_14.html

And does IA provide guidelines for harvesting (traffic limits etc.)? I clicked around the site a bit and didn't find them, but could easily have missed them.

Peter
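For anyone scripting against a server that returns error pages with 200 OK, as Michael describes, one possible workaround is to classify each response by more than its status code. The following is only a rough sketch in Java, not a technique taken from Michael's forthcoming post. It assumes the harvester asked for a non-HTML resource (for example an XML metadata file), so that any HTML body signals an error page. The class name SoftErrorCheck, the Outcome values, and the marker phrase are all hypothetical; the marker in particular should be replaced with text actually observed on the real error page.

```java
import java.util.Locale;

class SoftErrorCheck {
    enum Outcome { OK, NOT_FOUND, RETRY_LATER }

    // Hypothetical marker phrase: replace with text actually observed on the
    // "item cannot be found" page.
    static final String NOT_FOUND_MARKER = "cannot be found";

    static Outcome classify(int status, String contentType, String body) {
        // Trust real error statuses when the server does send them.
        if (status == 404) return Outcome.NOT_FOUND;
        // Simplification: every other non-200 status is treated as retryable.
        if (status != 200) return Outcome.RETRY_LATER;

        // Assumption: a non-HTML resource was requested, so an HTML body
        // means an error page was served with 200 OK.
        boolean isHtml = contentType != null
                && contentType.toLowerCase(Locale.ROOT).startsWith("text/html");
        if (!isHtml) return Outcome.OK;

        String text = body == null ? "" : body.toLowerCase(Locale.ROOT);
        if (text.contains(NOT_FOUND_MARKER)) return Outcome.NOT_FOUND;

        // Any other HTML page (for example a "slowness" notice) is treated
        // as a temporary failure to be retried after a delay.
        return Outcome.RETRY_LATER;
    }

    public static void main(String[] args) {
        System.out.println(classify(200, "text/xml", "<metadata/>"));
        System.out.println(classify(200, "text/html", "<p>Item cannot be found</p>"));
        System.out.println(classify(200, "text/html", "<p>Please try again later</p>"));
    }
}
```

Since the server sends no Retry-After header in the cases described, the caller has to choose its own delay for the RETRY_LATER case, for instance an increasing wait between attempts.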
[CODE4LIB] Open Library API
Inspired by a thread on this list yesterday, I started playing with the Open Library API. In order to query through the API, you must pass a query as a JSON serialized object. That's good, and it could be great, given that for Java and PHP (at least) there already exists the ability to serialize a native data type into and out of JSON.

The problem I'm noticing is that, at least in the querying process, the naming conventions used by the API complicate this. For instance, in order to do a pattern search of the key field, one must pass the name of that field with a tilde (~) appended, so that a query would read like this:

    {"key~": "\/about\/*"}

The problem is that in the two programming languages I use, Java and PHP, the variable names key~ and $key~ are illegal, and I believe that is the case for most programming languages. Thus, this PHP class (and its Java analog) would fail at compile / parse time:

    class OpenLibraryQuery {
        public $key~;   // parse error: '~' is not allowed in a property name

        public function __construct($keyValue) {
            $this->key~ = $keyValue;
        }
    }

This is a problem because, ideally, I would like to be able to do essentially this:

    $query = json_encode(new OpenLibraryQuery('\/about\/*'));

which, if the above class did parse, would automatically assign $query a valid JSON string similar to the one above. Instead, I either have to rename my variable or use string manipulation to make the string work. Note that

    $query = json_encode(array('key~' => '\/about\/*'));

is not parsed by the API and results in an error message.

This leaves me with three questions:

1. Is there an easy way around this, other than string manipulation, that I am missing? Does the solution work for most or all programming languages?
2. Does this strike readers as a significant enough issue to raise with the API developers?
3. Given that Open Library runs on Infogami and has other dependencies, does this strike readers as something that can be remedied?
- David

---
David Cloutman [EMAIL PROTECTED]
Electronic Services Librarian
Marin County Free Library

Email Disclaimer: http://www.co.marin.ca.us/nav/misc/EmailDisclaimer.cfm
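On David's first question, one possible way around the naming problem is to hold the query in a map rather than in named fields, since a map key is an ordinary string and "key~" is a legal string even though it is not a legal identifier. The Java sketch below illustrates the idea under stated assumptions. The class OpenLibraryQuerySketch and its small toJson encoder are hypothetical and exist only to keep the example dependency-free; in practice a JSON library that serializes maps (Gson and Jackson both do) would take the place of the hand-written encoder. The value is written as the plain string /about/* because in JSON \/ is only an optional escape for /. Whether the Open Library API accepts the resulting string has not been verified here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class OpenLibraryQuerySketch {

    // Escapes a Java string as a JSON string literal (minimal, for illustration).
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    // Serializes a flat string-to-string map as a JSON object.
    static String toJson(Map<String, String> query) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : query.entrySet()) {
            if (!first) sb.append(",");
            sb.append(quote(e.getKey())).append(":").append(quote(e.getValue()));
            first = false;
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        // The field name is data, not an identifier, so the tilde is legal here.
        Map<String, String> query = new LinkedHashMap<>();
        query.put("key~", "/about/*");
        System.out.println(toJson(query)); // prints {"key~":"/about/*"}
    }
}
```

If a typed query class is preferred over a map, Gson's @SerializedName and Jackson's @JsonProperty annotations let a legally named field be written out under a different JSON name such as "key~".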
Re: [CODE4LIB] Open Library API
I see no reason not to send this along to the developers. I don't know if key has some special significance or if something else could be easily substituted. The API is very new and hasn't been used much, so it's good to surface things of this nature. (Note: I consult on the bib data aspect of the OL, but also took a stab at expanding some of the text in the API document because the first version was terser than terse.)

Do you have someone in mind to send it to? If not, Alexis can probably forward it to the right people.

kc

--
---
Karen Coyle / Digital Library Consultant
[EMAIL PROTECTED] http://www.kcoyle.net
ph.: 510-540-7596 skype: kcoylenet
fx.: 510-848-3913
mo.: 510-435-8234
[CODE4LIB] Sorry
Disregard my last post. I replied to the wrong email.

---
David Cloutman [EMAIL PROTECTED]
Electronic Services Librarian
Marin County Free Library

-----Original Message-----
From: Code for Libraries [mailto:[EMAIL PROTECTED]] On Behalf Of Jonathan Rochkind
Sent: Thursday, June 05, 2008 3:16 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] refworks developer documentation?

Does anyone know where, if anywhere, I find documentation on the ways to send references to RefWorks for importing? Not having any luck on their website. I know I've seen it before though. I remember there were a variety of formats and methods you could send things to RefWorks for an import. Must be documentation somewhere? I bet some code4libber has done this before.

Jonathan

--
Jonathan Rochkind
Digital Services Software Engineer
The Sheridan Libraries
Johns Hopkins University
410.516.8886
rochkind (at) jhu.edu

Email Disclaimer: http://www.co.marin.ca.us/nav/misc/EmailDisclaimer.cfm