Re: [CODE4LIB] Latest OpenLibrary.org release
Dr R. Sanderson wrote:

> I shouldn't respond to such blatant trolling, but heh...
>
> On Wed, 7 May 2008, Casey Durfee wrote:
>
>> SRU is crap, in my opinion -- overengineered and under-thought, incomprehensible to non-librarians and burdened by the weight of history.
>
> What is so incomprehensible about it? Is it the fact it uses XML? No? Is it the REST-like interface? No? Ahh... the extremely familiar but not hideously over-complicated and inappropriate (such as SQL, SPARQL or XQuery) query language? That you can just put the URLs into your web browser and use XSLT to display the results, rather than requiring M2M interfaces?
>
>> The notion that it was designed to be used by all kinds of clients on all kinds of data is irrelevant in my book. Nobody in the *library world* uses it, much less non-libraries.
>
> Except for, you know, small projects like The European Library (which is the template for the nascent European Digital Library), the Library of Congress, DSpace, most digital library systems, etc etc etc. And IndexData have interfaces to many sources of data via SRU, for when it's not natively implemented.
>
>> APIs are for use. You don't get any points for ideological correctness. A non-librarian could look at that API document, understand it all, and start working with it right away. There is no way you can say that about SRU.
>
> I will say it, right now. I've had non-librarian students look at the document and start working with it straight away. Multiple times. My apologies if you don't have similar experiences.

One of my favorite tricks when explaining networked information retrieval is still to type SRU queries into a browser and walk through the XML response. It's the simplest way I know to get the point across that it is absolutely dead easy to get started using networked IR using a standard protocol, no matter what programming environment you're in.

I have yet to meet a programmer/librarian/manager who didn't think that was a cool trick, and who didn't go away with their imagination stimulated to go do something with it... most people buy into the notion that this stuff is somehow 'hard', and come away mildly surprised when they realize that it isn't.

Btw, that practice led to this document: http://www.loc.gov/standards/sru/simple.html which tries to get the same point across.

--Seb

>> Kudos to the OpenLibrary team, whatever the reason was, for coming up with something better that people outside the library world might actually be willing to use.
>
> It's totally arbitrary JSON with a very small fraction of the functionality and at least as much complexity. If people are willing to use it then that's great, certainly.
>
> Rob

--
Sebastian Hammer, Index Data
[EMAIL PROTECTED]
www.indexdata.com
Ph: (603) 209-6853 Fax: (866) 383-4485
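The browser trick described above amounts to building a single URL. As a rough sketch of what such a URL looks like: the parameter names (operation, version, query, startRecord, maximumRecords) come from SRU 1.1, while the base URL and the helper name are hypothetical and stand in for whatever server is being demonstrated.

```python
from urllib.parse import urlencode

# Hypothetical SRU endpoint, used only for illustration.
BASE_URL = "http://sru.example.org/catalog"


def sru_search_url(cql_query, start_record=1, maximum_records=10, base_url=BASE_URL):
    """Build an SRU 1.1 searchRetrieve URL that can be pasted into a browser."""
    params = {
        "operation": "searchRetrieve",    # the SRU search operation
        "version": "1.1",
        "query": cql_query,               # the search expressed in CQL
        "startRecord": start_record,      # 1-based position of the first record
        "maximumRecords": maximum_records,
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # A simple CQL title search; paste the printed URL into a browser.
    print(sru_search_url('dc.title = "open library"'))
```

The response to such a request is plain XML, which is why it can be read directly in a browser or styled with XSLT.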
Re: [CODE4LIB] Latest OpenLibrary.org release
On Thu, 2008-05-08 at 11:41 -0400, Godmar Back wrote:

> On Thu, May 8, 2008 at 11:25 AM, Dr R. Sanderson [EMAIL PROTECTED] wrote:
>
>> Like what? The current API seems to be concerned with search. Search is what SRU does well. If it was concerned with harvest, I (and I'm sure many others) would have instead suggested OAI-PMH.
>
> No, the API presented does not support search.

Well, it only doesn't support search because of the way that the API has been described without using the word 'search'!

To quote the documentation in the API:

--
Infogami provides an API to query the database for objects matching particular criteria ...

To find objects matching a particular query, send a GET request to http://openlibrary.org/api/things with query as parameter. In this documentation we use curl as a simple command line query client; any software that supports http GET can be used. ...

The API supports querying for objects based on string matching.
--

And so on. There's a query, which can have its results sorted, be limited in terms of the number of results returned, and have the beginning of that result list start at an offset.

Sounds a lot like a search?

Rob
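To make the "query with sort, limit and offset" shape concrete, here is a minimal sketch of building such a request. Only the endpoint and the parameter name `query` are taken from the documentation quoted above; the JSON encoding of the query, the key names `sort`, `limit` and `offset`, and the `/type/edition` value are assumptions made for illustration and may not match the real API.

```python
import json
from urllib.parse import urlencode

# Endpoint as quoted from the API documentation.
THINGS_URL = "http://openlibrary.org/api/things"


def things_query_url(criteria, sort=None, limit=None, offset=None):
    """Build a GET URL for a 'things' query.

    Assumption: the query is sent as a JSON object in the `query`
    parameter, with sort/limit/offset as extra keys of that object.
    """
    query = dict(criteria)
    for key, value in (("sort", sort), ("limit", limit), ("offset", offset)):
        if value is not None:
            query[key] = value
    return THINGS_URL + "?" + urlencode({"query": json.dumps(query)})


if __name__ == "__main__":
    # Hypothetical criteria: editions with a given title, 5 results from offset 10.
    print(things_query_url({"type": "/type/edition", "title": "Moby Dick"},
                           limit=5, offset=10))
```

Seen this way, the request has the same three ingredients as any search interface: a matching expression, an ordering, and a result window.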
Re: [CODE4LIB] Latest OpenLibrary.org release
In general, I think we all agree that standards should be used where possible---a proliferation of APIs that our client software needs to talk to leads to much harder-to-maintain client software than re-using APIs. However, if the standards are truly too hard to work with, sometimes the 'correct' decision is indeed to design a new one.

However, I guess some of our concern is that, from observation, it kind of looks like the OpenLibrary team didn't even consider SRU: it wasn't considered and dismissed based on technical evaluation, but rather ignored from the start. This is rather distressing, especially for a project with goals as ambitious as OpenLibrary's. For the project to be successful (which most of us observers want very much), it's very important to understand the existing technology landscape and how to fit into it.

Now, to be sure, it may be that the existing _library_ technology landscape is not particularly of interest to OpenLibrary, that they are more interested in connecting with the larger technology world and see library technology infrastructure as a tiny and irrelevant backwater. :) It's up to them to decide this sort of strategy, which may be valid, and would justify simply ignoring technology which is _mainly_ (but not exclusively) adopted by the library world---but would of course be distressing to us in the library technology world hoping we can connect to the OpenLibrary project.

It's also nice to say "it's open source, if you want it you can add it." And it DOES matter that it's open source, and this is HUGELY good. But most of us are already working on multiple open source projects, many of which _we_ are trying to recruit people to also. It's a small pond of library developers, and a big need for open source library software. If OpenLibrary becomes as successful as we all hope, then it will attract open source contributions, from library developers and others. If the OpenLibrary team wants to _get_ it to that successful point, then in my view it would be wise to spend OpenLibrary resources on components that will make it easy for library and other developers to interface with it. I'm sure they agree--which is why it has an API at all, rather than just leaving it API-less and saying "hey, it's open source, if you want one add one!" That would obviously be counter-productive to the goals.

Again, I don't know enough about SRU to know if it's suitable. I've certainly encountered other library 'standards' that are over-engineered and hard-to-adopt enough to justify abandoning them and creating something new. All I'd hope is that the OpenLibrary team actually _knew about_ SRU and gave it a serious evaluation---it is adopted enough to justify that.

It's also worth saying that when you DO abandon an existing standard to write something new---it's a lot more productive for building a sustainable tech infrastructure if you try to make your 'something new' at least a potential de facto standard, rather than a custom thing only for your product.

Jonathan

Walker, David wrote:

>> Nobody in the *library world* uses it, much less non-libraries.
>
> Ironically, I was just checking email in between using the WorldCat SRU server. In addition to the systems Rob mentioned, there are also article databases like JSTOR and Springerlink that implement SRU, and every metasearch system in use in libraries today consumes SRU web services.
>
> But I think the folks at OpenLibrary should implement an OpenSearch interface. I mean come on! OpenLibrary, OpenSearch. A match made in heaven! ;-)
>
> --Dave
>
> ---
> David Walker
> Library Web Services Manager
> California State University
> http://xerxes.calstate.edu
>
> From: Code for Libraries on behalf of Casey Durfee
> Sent: Wed 5/7/2008 1:12 PM
> To: CODE4LIB@LISTSERV.ND.EDU
> Subject: Re: [CODE4LIB] Latest OpenLibrary.org release
>
> SRU is crap, in my opinion -- overengineered and under-thought, incomprehensible to non-librarians and burdened by the weight of history. The notion that it was designed to be used by all kinds of clients on all kinds of data is irrelevant in my book. Nobody in the *library world* uses it, much less non-libraries. APIs are for use. You don't get any points for ideological correctness. A non-librarian could look at that API document, understand it all, and start working with it right away. There is no way you can say that about SRU.
>
> Kudos to the OpenLibrary team, whatever the reason was, for coming up with something better that people outside the library world might actually be willing to use.
>
> On Wed, May 7, 2008 at 12:55 PM, Dr R. Sanderson [EMAIL PROTECTED] wrote:
>
>>> I'm the only non-techie on the team, so I don't know that much about SRU. (Our head programmer lives in India, and is presumably asleep at the moment, otherwise I'd ask him!) Is it an interface that is used primarily by libraries? We are definitely hoping that our API will be used by all kinds, so perhaps that's
Re: [CODE4LIB] Latest OpenLibrary.org release
I guess I don't understand why you'd prefer SRU to an API. The ideal (in my mind) is that by having the API available, you have your cake and eat it too. Can a SRU (or OpenSearch or OpenURL or whatever) service not be built on top of the API?

However, if there was no API available, only a SRU service, wouldn't you complain about something else that SRU didn't do?

I don't know, personally I don't think it's the OL or IA's job to build interfaces that they don't personally need. I am much more interested in them building functionality into the OL itself.

If a SRU interface is important enough to somebody, they'll build one (or pay IndexData to do it). If the foundation is there, the standards can be built upon it.

-Ross.

On Thu, May 8, 2008 at 10:59 AM, Jonathan Rochkind [EMAIL PROTECTED] wrote:

> In general, I think we all agree that standards should be used where possible---a proliferation of APIs that our client software needs to talk to leads to much harder-to-maintain client software than re-using APIs. However, if the standards are truly too hard to work with, sometimes the 'correct' decision is indeed to design a new one.
>
> However, I guess some of our concern is that, from observation, it kind of looks like the OpenLibrary team didn't even consider SRU: it wasn't considered and dismissed based on technical evaluation, but rather ignored from the start. This is rather distressing, especially for a project with goals as ambitious as OpenLibrary's. For the project to be successful (which most of us observers want very much), it's very important to understand the existing technology landscape and how to fit into it.
>
> Now, to be sure, it may be that the existing _library_ technology landscape is not particularly of interest to OpenLibrary, that they are more interested in connecting with the larger technology world and see library technology infrastructure as a tiny and irrelevant backwater.
Re: [CODE4LIB] Latest OpenLibrary.org release
> I guess I don't understand why you'd prefer SRU to an API. The ideal

Except SRU _is_ an API?

> However, if there was no API available, only a SRU service, wouldn't you complain about something else that SRU didn't do?

Like what? The current API seems to be concerned with search. Search is what SRU does well. If it was concerned with harvest, I (and I'm sure many others) would have instead suggested OAI-PMH.

If the purpose is to have full control over things (in terms of submitting/modifying data, changing options, etc etc) then no, SRU wouldn't be appropriate. However, that's not what the current API seems to be designed to do, nor intended to become.

Rob
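For contrast with search, a harvest request under OAI-PMH asks for records in bulk rather than for matches to a query. The sketch below uses the request arguments defined by OAI-PMH 2.0 (verb, metadataPrefix, from, set); the base URL and the helper name are hypothetical, since no OAI-PMH endpoint is mentioned in this thread.

```python
from urllib.parse import urlencode

# Hypothetical OAI-PMH repository, used only for illustration.
BASE_URL = "http://oai.example.org/oai"


def oai_list_records_url(metadata_prefix="oai_dc", from_date=None, set_spec=None,
                         base_url=BASE_URL):
    """Build an OAI-PMH ListRecords URL for harvesting records in bulk."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        params["from"] = from_date      # harvest only records changed since this date
    if set_spec is not None:
        params["set"] = set_spec        # restrict the harvest to one set
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Incremental harvest: everything changed since 1 May 2008, as Dublin Core.
    print(oai_list_records_url(from_date="2008-05-01"))
```

There is no query expression at all here, only a date window and an optional set, which is the practical difference between harvest and search.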
Re: [CODE4LIB] Latest OpenLibrary.org release
On May 8, 2008, at 11:25 AM, Dr R. Sanderson wrote:

>> However, if there was no API available, only a SRU service, wouldn't you complain about something else that SRU didn't do?
>
> Like what? The current API seems to be concerned with search. Search is what SRU does well. If it was concerned with harvest, I (and I'm sure many others) would have instead suggested OAI-PMH.

BTW, at an OpenLibrary meeting that took place a couple of months ago I mentioned and advocated for the use of both SRU and OAI-PMH. From my travel log:

    The afternoon was given up to a number of break-out sessions to discuss specific issues. I participated in the discussion surrounding APIs to integrate Wikipedia with Open Library. The desire is to allow people to cite books in Wikipedia by: 1) searching for title or key, 2) returning a list of results, 3) selecting an item, 4) having the system create a citation for the item, and 5) inserting the citation into a Wikipedia article. Then, on a regular basis, these citations will be checked and updated ensuring their integrity.

    Everybody in the group supported the concept of REST-ful Web Service computing to accomplish the task, but not everybody's definition of REST-ful computing was congruent, and a bit of a religious war ensued. I tried to advocate for the use of SRU as the search protocol, but ultimately people leaned towards OpenSearch. SRU is too complicated. Regarding the citation checking the discussion surrounded the requesting of one or more identifiers from Open Library and returning a stream of metadata. Here I tried to advocate for another existing, well-established standard -- OAI-PMH -- but again I was shot down. Too complicated.

    In reality I think two things worked against the adoption of SRU and OAI. First my description of their functionality was not as eloquent as it could have been, and second, the Open Library personnel had never heard of nor knew anything about either protocol. This is another example of library standards being too library-centric. Think Z39.50. [1]

After (re-)reading the OpenLibrary API [2], I don't think it really supports search, and at this time I don't think something like SRU or OpenSearch could be implemented on top of it, unless the queries were very specific, such as control number searches.

And while I wish OpenLibrary would support SRU and/or OAI-PMH, I am really glad there is at least an API. It is REST-ful. Nice. Returning JSON is okay. XML would be nice too. In short, I think the API is a step in the right direction. Here are a number of more specific observations:

* In a sentence, define what Open Library is and what its goals are. Given such a definition and scope statement developers will have an idea whether or not they are stealing information. I think they won't be, but I do think it is important to be explicit. Remove any doubt.

* Define Infogami. "Open Library is driven by an underlying system/database we call 'Infogami', and this documentation describes a REST-like API for reading content from it."

* List and/or enumerate a number of ways the API can be used. "Given an ISBN number, Infogami can return a link to an item in any one of a number of formats as well as additional metadata such as author names, titles, abstracts, reviews, and cover art. Other functionality includes..."

* Distinguish to a greater degree the stylistic differences between links and code in the documentation. There were many times when I thought styled text (such as get, query, type/template) were hotlinks.

* Enumerate the different types of objects that can be retrieved from the system and maybe organize them into logical groups. Do the same thing for the types. I think I found this to be the most confusing aspect of the documentation.

* For each type, provide a normative example complete with output. Ideally, each type would be additionally elaborated upon with a typical use case. "Use the /type/author/title to retrieve the title of an author. Here's a sample query, and here is the output..."

* Regarding the curl examples, enclose the URLs in double quote marks since many shells will interpret the ampersands incorrectly.

* Finish off the documentation with one or more example applications that a developer can cut and paste into their editor. One application might be client-based -- Javascript in an HTML head element. Another might be server-based.

[1] http://www.library.nd.edu/daiad/morgan/travel/open-library/
[2] http://openlibrary.org/dev/docs/api

--
Eric Lease Morgan
Hesburgh Libraries, University of Notre Dame
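The suggestion above about quoting URLs in the curl examples can be illustrated briefly. An unquoted `&` tells most shells to run the preceding command in the background, so everything after it is cut off from the URL. The sketch below uses Python's standard `shlex.quote` to produce a safely quoted command line; the URL's query string is a made-up placeholder, and note that `shlex.quote` emits single quotes rather than the double quotes suggested above, which protects the ampersand equally well.

```python
import shlex


def curl_command(url):
    """Return a curl command line with the URL quoted for a POSIX shell."""
    return "curl " + shlex.quote(url)


if __name__ == "__main__":
    # Placeholder query string; the parameter names are not from the real API.
    url = "http://openlibrary.org/api/things?query=example&other=1"
    print(curl_command(url))
```

Generating documentation examples this way, instead of typing them by hand, keeps every sample command safe to copy into a terminal.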
[CODE4LIB] Latest OpenLibrary.org release
The OpenLibrary.org <http://www.openlibrary.org> team has just finished its latest release on the long path towards one web page for every book ever published.

What's new?

* added another 6 million book records (13.4 million total) with 18 million more records waiting to be integrated
* built an API <http://www.openlibrary.org/dev/docs/api> to the data which allows you to query the database for objects matching particular criteria or to GET an object from the database
* added internationalization support <http://www.openlibrary.org/i18n> - we have already started on Spanish, Italian and a few other languages, but users are now able to translate the site into any language
* search the full text of 230,000 scanned books from the advanced search <http://www.openlibrary.org/advanced> page
* started merging library MARC records and non-library book data crawled from the web (still some kinks to be worked out!)

OpenLibrary is a work in progress, so please help us build it! The site, the code and the documentation are all open, so if you're interested in helping as a librarian or a programmer, join us - there's lots left to do! You can join the OL mailing list at: http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss

And I'd especially like to thank our awesome team:

* Edward Betts
* Anand Chitipothu
* Karen Coyle
* Rebecca Malamud
* Paul Rubin
* Aaron Swartz

Thanks,
Alexis Rossi
Internet Archive

___
Openlibrary mailing list
[EMAIL PROTECTED]
http://mail.archive.org/cgi-bin/mailman/listinfo/openlibrary

___
Ol-tech mailing list
[EMAIL PROTECTED]
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
Re: [CODE4LIB] Latest OpenLibrary.org release
> * built an API http://www.openlibrary.org/dev/docs/api to the data which allows you to query the database for objects matching particular criteria or to GET an object from the database

Not SRU? Any reasons why you rolled your own?

Rob
Re: [CODE4LIB] Latest OpenLibrary.org release
I'm the only non-techie on the team, so I don't know that much about SRU. (Our head programmer lives in India, and is presumably asleep at the moment, otherwise I'd ask him!)

Is it an interface that is used primarily by libraries? We are definitely hoping that our API will be used by all kinds, so perhaps that's the reasoning.

But this is an Open Source project, so if anyone would like to volunteer to build an SRU interface... you can! Please do! :-)

Alexis

Dr R. Sanderson wrote:

>> * built an API http://www.openlibrary.org/dev/docs/api to the data which allows you to query the database for objects matching particular criteria or to GET an object from the database
>
> Not SRU? Any reasons why you rolled your own?
>
> Rob
Re: [CODE4LIB] Latest OpenLibrary.org release
> I'm the only non-techie on the team, so I don't know that much about SRU. (Our head programmer lives in India, and is presumably asleep at the moment, otherwise I'd ask him!) Is it an interface that is used primarily by libraries? We are definitely hoping that our API will be used by all kinds, so perhaps that's the reasoning.

It's designed to be used by all kinds of clients on all kinds of data, but is from the library world so perhaps the most well defined use cases are in this arena. Have a look at: http://www.loc.gov/standards/sru/

> But this is an Open Source project, so if anyone would like to volunteer to build an SRU interface... you can! Please do! :-)

I feel a student project coming on. :)

Rob
Re: [CODE4LIB] Latest OpenLibrary.org release
SRU is crap, in my opinion -- overengineered and under-thought, incomprehensible to non-librarians and burdened by the weight of history. The notion that it was designed to be used by all kinds of clients on all kinds of data is irrelevant in my book. Nobody in the *library world* uses it, much less non-libraries. APIs are for use. You don't get any points for ideological correctness. A non-librarian could look at that API document, understand it all, and start working with it right away. There is no way you can say that about SRU.

Kudos to the OpenLibrary team, whatever the reason was, for coming up with something better that people outside the library world might actually be willing to use.

On Wed, May 7, 2008 at 12:55 PM, Dr R. Sanderson [EMAIL PROTECTED] wrote:

>> I'm the only non-techie on the team, so I don't know that much about SRU. (Our head programmer lives in India, and is presumably asleep at the moment, otherwise I'd ask him!) Is it an interface that is used primarily by libraries? We are definitely hoping that our API will be used by all kinds, so perhaps that's the reasoning.
>
> It's designed to be used by all kinds of clients on all kinds of data, but is from the library world so perhaps the most well defined use cases are in this arena. Have a look at: http://www.loc.gov/standards/sru/
>
>> But this is an Open Source project, so if anyone would like to volunteer to build an SRU interface... you can! Please do! :-)
>
> I feel a student project coming on. :)
>
> Rob
Re: [CODE4LIB] Latest OpenLibrary.org release
I shouldn't respond to such blatant trolling, but heh...

On Wed, 7 May 2008, Casey Durfee wrote:

> SRU is crap, in my opinion -- overengineered and under-thought, incomprehensible to non-librarians and burdened by the weight of history.

What is so incomprehensible about it? Is it the fact it uses XML? No? Is it the REST-like interface? No? Ahh... the extremely familiar but not hideously over-complicated and inappropriate (such as SQL, SPARQL or XQuery) query language? That you can just put the URLs into your web browser and use XSLT to display the results, rather than requiring M2M interfaces?

> The notion that it was designed to be used by all kinds of clients on all kinds of data is irrelevant in my book. Nobody in the *library world* uses it, much less non-libraries.

Except for, you know, small projects like The European Library (which is the template for the nascent European Digital Library), the Library of Congress, DSpace, most digital library systems, etc etc etc. And IndexData have interfaces to many sources of data via SRU, for when it's not natively implemented.

> APIs are for use. You don't get any points for ideological correctness. A non-librarian could look at that API document, understand it all, and start working with it right away. There is no way you can say that about SRU.

I will say it, right now. I've had non-librarian students look at the document and start working with it straight away. Multiple times. My apologies if you don't have similar experiences.

> Kudos to the OpenLibrary team, whatever the reason was, for coming up with something better that people outside the library world might actually be willing to use.

It's totally arbitrary JSON with a very small fraction of the functionality and at least as much complexity. If people are willing to use it then that's great, certainly.

Rob
Re: [CODE4LIB] Latest OpenLibrary.org release
> Nobody in the *library world* uses it, much less non-libraries.

Ironically, I was just checking email in between using the WorldCat SRU server. In addition to the systems Rob mentioned, there are also article databases like JSTOR and Springerlink that implement SRU, and every metasearch system in use in libraries today consumes SRU web services.

But I think the folks at OpenLibrary should implement an OpenSearch interface. I mean come on! OpenLibrary, OpenSearch. A match made in heaven! ;-)

--Dave

---
David Walker
Library Web Services Manager
California State University
http://xerxes.calstate.edu

From: Code for Libraries on behalf of Casey Durfee
Sent: Wed 5/7/2008 1:12 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Latest OpenLibrary.org release

SRU is crap, in my opinion -- overengineered and under-thought, incomprehensible to non-librarians and burdened by the weight of history. The notion that it was designed to be used by all kinds of clients on all kinds of data is irrelevant in my book. Nobody in the *library world* uses it, much less non-libraries. APIs are for use. You don't get any points for ideological correctness. A non-librarian could look at that API document, understand it all, and start working with it right away. There is no way you can say that about SRU.

Kudos to the OpenLibrary team, whatever the reason was, for coming up with something better that people outside the library world might actually be willing to use.

On Wed, May 7, 2008 at 12:55 PM, Dr R. Sanderson [EMAIL PROTECTED] wrote:

>> I'm the only non-techie on the team, so I don't know that much about SRU. (Our head programmer lives in India, and is presumably asleep at the moment, otherwise I'd ask him!) Is it an interface that is used primarily by libraries? We are definitely hoping that our API will be used by all kinds, so perhaps that's the reasoning.
>
> It's designed to be used by all kinds of clients on all kinds of data, but is from the library world so perhaps the most well defined use cases are in this arena. Have a look at: http://www.loc.gov/standards/sru/
>
>> But this is an Open Source project, so if anyone would like to volunteer to build an SRU interface... you can! Please do! :-)
>
> I feel a student project coming on. :)
>
> Rob