Re: [CODE4LIB] worldcat discovery versus metadata apis
Eric,

The WorldCat Metadata API provides read and write access to the data in WorldCat: bibliographic records, local bibliographic data, and basic holdings. The WorldCat Discovery API provides search access to WorldCat and OCLC's Central Index of metadata across a diverse set of indexes; data is returned as a Linked Data graph. This API is still in beta.

For more detailed information on each API you can look at the documentation:

WorldCat Metadata API - http://www.oclc.org/developer/develop/web-services/worldcat-metadata-api.en.html
WorldCat Discovery API - https://www.oclc.org/developer/develop/web-services/worldcat-discovery-api.en.html

Given the use case you describe, your best bet is probably the WorldCat Metadata API, especially if you need to write any data back to WorldCat. However, you can perform the same task with the WorldCat Search API, and if you are only reading data this might be a better fit because of its simpler authentication method. You're welcome to send an email to dev...@oclc.org if you have further detailed support questions.

Karen

On Tue, Mar 22, 2016 at 6:15 AM, Eric Lease Morgan wrote:
> I’m curious. What is the difference between the WorldCat Discovery and
> WorldCat Metadata APIs?
>
> Given an OCLC number, I want to programmatically search WorldCat and get
> in return a full bibliographic record complete with authoritative subject
> headings and names. Which API should I be using?
>
> —
> Eric Morgan
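For Eric's read-only case (OCLC number in, full record out), the request boils down to a keyed GET. A minimal sketch follows; the endpoint path and the WSKey value are placeholders/assumptions for illustration, not the documented request format, so check the Metadata API documentation linked above before relying on them.

```python
# Sketch: build a read-by-OCLC-number request URL for the Metadata API.
# The "/bib/data/" path and the WSKey below are assumptions -- consult
# the Metadata API documentation for the real endpoint and auth details.
from urllib.parse import urlencode

def build_bib_url(oclc_number, wskey):
    """Return a hypothetical URL for fetching one bib record by OCLC number."""
    base = "https://worldcat.org/bib/data/" + oclc_number  # assumed path
    return base + "?" + urlencode({"wskey": wskey})

url = build_bib_url("41266045", "YOUR_WSKEY")  # placeholder key
print(url)
```

Note that the Metadata API's real authentication is heavier than a bare key parameter, which is part of why the simpler Search API may be the better fit for read-only use.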
Re: [CODE4LIB] WorldCat API - myTags
Arash,

I don't believe this functionality currently exists, but I've passed on your desire to those in a position to do something about it.

Thanks,
Roy

On Thu, Jun 6, 2013 at 9:59 AM, Arash.Joorabchi arash.joorab...@ul.ie wrote:

Hi all, When viewing a work's metadata on the WorldCat.org website, in the tag section of the page you are given the option to add new tags after logging in with your (free) account. I was wondering if there is a WorldCat API to do this from within my Java code. Thanks, Arash
Re: [CODE4LIB] WorldCat Implements Content-Negotiation for Linked Data
+1 On Mon, Jun 3, 2013 at 3:00 PM, Richard Wallis richard.wal...@dataliberate.com wrote: The Linked Data for the millions of resources in WorldCat.org is now available as RDF/XML, JSON-LD, Turtle, and Triples via content-negotiation. Details: http://dataliberate.com/2013/06/content-negotiation-for-worldcat/ ~Richard.
Re: [CODE4LIB] WorldCat Implements Content-Negotiation for Linked Data
Probably something I'm doing wrong, since I'm just copying and pasting, but the command from the blog post:

curl -L -H “Accept: text/turtle” http://www.worldcat.org/oclc/41266045

gets me:

curl: (6) Could not resolve host: text; nodename nor servname provided, or not known

kc

On 6/3/13 12:00 PM, Richard Wallis wrote: The Linked Data for the millions of resources in WorldCat.org is now available as RDF/XML, JSON-LD, Turtle, and Triples via content-negotiation. Details: http://dataliberate.com/2013/06/content-negotiation-for-worldcat/ ~Richard.

--
Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [CODE4LIB] WorldCat Implements Content-Negotiation for Linked Data
I also get a good response from that, Karen. I've seen this error in the past when DNS doesn't resolve. Possibly you're having connectivity issues.

On Mon, Jun 3, 2013 at 3:42 PM, Kyle Banerjee kyle.baner...@gmail.com wrote:

What you've provided looks like it will work. My money is that the quotes and/or hyphens aren't legit due to the copy/paste operation. Manually typing at the prompt should work just fine. kyle

On Mon, Jun 3, 2013 at 3:21 PM, Karen Coyle li...@kcoyle.net wrote:

Probably something I'm doing wrong, since I'm just copying and pasting, but the command from the blog post:

curl -L -H “Accept: text/turtle” http://www.worldcat.org/oclc/41266045

gets me:

curl: (6) Could not resolve host: text; nodename nor servname provided, or not known

kc

On 6/3/13 12:00 PM, Richard Wallis wrote: The Linked Data for the millions of resources in WorldCat.org is now available as RDF/XML, JSON-LD, Turtle, and Triples via content-negotiation. Details: http://dataliberate.com/2013/06/content-negotiation-for-worldcat/ ~Richard.

--
Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [CODE4LIB] WorldCat Implements Content-Negotiation for Linked Data
Ta da! That did it, Kyle. Why on earth do we call them smart quotes?!

kc

On 6/3/13 4:07 PM, Kyle Banerjee wrote:

Just for the heck of it, I tried copying and pasting and got the same error. There were smart quotes on the web page. Turn those into regular single or double quotes and it works fine. kyle

On Mon, Jun 3, 2013 at 3:21 PM, Karen Coyle li...@kcoyle.net wrote:

Probably something I'm doing wrong, since I'm just copying and pasting, but the command from the blog post:

curl -L -H “Accept: text/turtle” http://www.worldcat.org/oclc/41266045

gets me:

curl: (6) Could not resolve host: text; nodename nor servname provided, or not known

kc

On 6/3/13 12:00 PM, Richard Wallis wrote: The Linked Data for the millions of resources in WorldCat.org is now available as RDF/XML, JSON-LD, Turtle, and Triples via content-negotiation. Details: http://dataliberate.com/2013/06/content-negotiation-for-worldcat/ ~Richard.

--
Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
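The failure in this thread comes from typographic ("smart") quotes pasted into the shell: the shell does not treat curly quotes as quoting characters, so curl sees "text/turtle”" as a URL and fails on host "text". Doing the same content-negotiation from code sidesteps shell quoting entirely. This is a sketch assuming the worldcat.org URIs still honor an Accept header of text/turtle, as the blog post describes.

```python
# Content-negotiation without shell quoting hazards: set the Accept
# header programmatically instead of on a copy/pasted command line.
import urllib.request

def turtle_request(uri):
    """Build a request asking for the Turtle representation of a URI."""
    return urllib.request.Request(uri, headers={"Accept": "text/turtle"})

req = turtle_request("http://www.worldcat.org/oclc/41266045")
print(req.get_header("Accept"))  # prints: text/turtle
# body = urllib.request.urlopen(req).read()  # actual fetch, not run here
```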
Re: [CODE4LIB] WorldCat Implements Content-Negotiation for Linked Data
On 04/06/13 11:18, Karen Coyle wrote:

Ta da! That did it, Kyle. Why on earth do we call them smart quotes?!

Because they look damn sexy when printed on pulp-of-murdered-tree, which we all know is the authoritative form of any communication.

cheers
stuart

--
Stuart Yeates
Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] WorldCat Implements Content-Negotiation for Linked Data
Those are smart words! Can I quote them? :P

Regards,
Ben

On 4-6-2013 1:40, stuart yeates wrote:

On 04/06/13 11:18, Karen Coyle wrote: Ta da! That did it, Kyle. Why on earth do we call them smart quotes?!

Because they look damn sexy when printed on pulp-of-murdered-tree, which we all know is the authoritative form of any communication. cheers stuart
Re: [CODE4LIB] Worldcat schema.org search API
Karen,

Your output looks like it comes from the old 2007 RDFa 1.0 parser:

http://www.w3.org/2007/08/pyRdfa/extract?uri=http%3A%2F%2Fwww.worldcat.org%2Foclc%2F527725&format=pretty-xml&warnings=false&parser=lax&space-preserve=true

The new 2012 RDFa 1.1 parser does a better job:

http://www.w3.org/2012/pyRdfa/extract?uri=http%3A%2F%2Fwww.worldcat.org%2Foclc%2F527725&format=xml&rdfagraph=output&vocab_expansion=false&rdfa_lite=false&embedded_rdf=true&space_preserve=true&vocab_cache=true&vocab_cache_report=false&vocab_cache_refresh=false

Note the comment on the old interface page, http://www.w3.org/2007/08/pyRdfa/ : "Users are advised to migrate to RDFa 1.1 in general, including the RDFa 1.1 distiller." RDFa 1.1 is still pretty new and getting more tools to support it will help.

Jeff

-----Original Message-----
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Karen Coyle
Sent: Thursday, July 12, 2012 6:16 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Worldcat schema.org search API

Ross, it might not be yahoo, but that doesn't mean I know what it is. The pyRDFa utility returns garbage for RDF/XML and TTL, but not for JSON. It's only in the JSON output that I am getting any bibliographic data. The other two send me back a bunch of links to css files. I guess this is good news for folks who prefer JSON.

Also, I see the OCLC number in the JSON, but not the URI, although the URI appears in the div with the RDFa:

<div itemid="http://www.worldcat.org/oclc/527725" itemscope itemtype="http://schema.org/Book" resource="http://www.worldcat.org/oclc/527725" typeof="http://schema.org/Book"><a href="http://www.worldcat.org/oclc/527725">http://www.worldcat.org/oclc/527725</a>

I must say I wonder a bit about those double but what do I know?
Anyway, here's what I get from pyRDFa:

RDF/XML:

<rdf:RDF>
  <_4:Book rdf:about="http://schema.org/Book"/>
  <rdf:Description rdf:about="http://www.worldcat.org/title/selection-of-early-statistical-papers-of-j-neyman/oclc/527725">
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/css/loginpopup.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/html/masthead.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/css/alerts.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/css/modals_jquery.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/html/layered_divs.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/cssj/N245213502/bundles/print-min.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/css/cr_print.css"/>
    <xhv:stylesheet rdf:resource="http://static.weread.com/css/booksiread/relbookswidget.css?0:5"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/css/itemformat.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/cssj/N1807112156/bundles/screen-min.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/html/record.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/html/yui/build/reset-fonts-grids/reset-fonts-grids.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/html/new_wcorg.css"/>
  </rdf:Description>
</rdf:RDF>

JSON:

{
  "@context": {
    "library": "http://purl.org/library/",
    "oclc": "http://www.worldcat.org/oclc/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "madsrdf": "http://www.loc.gov/mads/rdf/v1#",
    "schema": "http://schema.org/",
    "http://purl.org/library/placeOfPublication": { "@type": "@id" },
    "http://schema.org/about": { "@type": "@id" },
    "http://schema.org/publisher": { "@type": "@id" },
    "http://schema.org/author": { "@type": "@id" },
    "http://www.w3.org/2004/02/skos/core#inScheme": { "@type": "@id" },
    "http://www.loc.gov/mads/rdf/v1#isIdentifiedByAuthority": { "@type": "@id" }
  },
  "@id": "oclc:527725",
  "@type": "schema:Book",
  "schema:inLanguage": { "@value": "en", "@language": "en" },
  "library:holdingsCount": { "@value": "285", "@language": "en" },
  "schema:author": {
    "@id": "http://viaf.org/viaf/24666861",
    "@type": "schema:Person",
    "madsrdf:isIdentifiedByAuthority": "http://id.loc.gov/authorities/names/n50066374",
    "schema:name": { "@value": "Neyman, Jerzy, 1894-1981.", "@language": "en" }
  },
  "schema:name": { "@value": "A selection of early statistical papers of J. Neyman.", "@language": "en" },
  "schema:datePublished": { "@value": "1967.", "@language": "en" },
  "schema:numberOfPages": { "@value": "429", "@language": "en" },
  "library:oclcnum": { "@value": "527725", "@language": "en" },
  "schema:about": [
    {
      "@type": "skos:Concept",
      "madsrdf:isIdentifiedByAuthority": "http://id.loc.gov/authorities/subjects/sh85082133",
      "schema:name": { "@value": "Mathematical statistics.", "@language": "en" }
    },
    {
      "@id": "http://dewey.info/class/519/",
      "@type": "skos:Concept",
      "skos:inScheme": "http://dewey.info/scheme/"
    },
    {
      "@type": "skos:Concept",
      "schema:name": { "@value": "Statistique mathématique.", "@language": "en" }
    },
    { "@id": "http
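Karen's point that the JSON output is the usable one holds up in practice: once JSON-LD like the above is in hand, the bibliographic literals can be pulled out with nothing but the json module. The document below is a pared-down subset of the JSON shown above, inlined so the sketch is self-contained.

```python
# Unwrap JSON-LD value objects ({"@value": ..., "@language": ...}) into
# plain strings, using a small subset of the WorldCat JSON shown above.
import json

doc = json.loads("""
{
  "@id": "oclc:527725",
  "@type": "schema:Book",
  "schema:name": { "@value": "A selection of early statistical papers of J. Neyman.", "@language": "en" },
  "library:oclcnum": { "@value": "527725", "@language": "en" }
}
""")

def literal(node):
    """Return the @value of a JSON-LD value object, or the node itself."""
    if isinstance(node, dict) and "@value" in node:
        return node["@value"]
    return node

title = literal(doc["schema:name"])
oclcnum = literal(doc["library:oclcnum"])
print(oclcnum, "-", title)
```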
Re: [CODE4LIB] Worldcat schema.org search API
AHA! Thank you Jeff. I will re-bookmark and try again.

kc

On 7/13/12 6:31 AM, Young,Jeff (OR) wrote:

Karen, Your output looks like it comes from the old 2007 RDFa 1.0 parser:

http://www.w3.org/2007/08/pyRdfa/extract?uri=http%3A%2F%2Fwww.worldcat.org%2Foclc%2F527725&format=pretty-xml&warnings=false&parser=lax&space-preserve=true

The new 2012 RDFa 1.1 parser does a better job:

http://www.w3.org/2012/pyRdfa/extract?uri=http%3A%2F%2Fwww.worldcat.org%2Foclc%2F527725&format=xml&rdfagraph=output&vocab_expansion=false&rdfa_lite=false&embedded_rdf=true&space_preserve=true&vocab_cache=true&vocab_cache_report=false&vocab_cache_refresh=false

Note the comment on the old interface page, http://www.w3.org/2007/08/pyRdfa/ : "Users are advised to migrate to RDFa 1.1 in general, including the RDFa 1.1 distiller." RDFa 1.1 is still pretty new and getting more tools to support it will help.

Jeff
Re: [CODE4LIB] Worldcat schema.org search API
On 7/10/12 5:07 PM, Karen Coyle wrote:

On 7/10/12 4:02 PM, Richard Wallis wrote:

But is it available to everyone, and is the data retrieved also usable as ODC-BY by any member of the Web public?

Yes it is, and at this stage it is only available from within an html page.

The "it" I was referring to was the API. Roy is telling me that people should use the API, as if that is an obvious option that I am overlooking. I am asking if the general web public can use the API to get this data. I believe that should be a yes/no question/answer.

Since no one here from OCLC had the integrity to answer this question, I went ahead and applied for a Worldcat API key, and here is the reply:

* Hello, Thank you for your interest in the WorldCat Search API, however at this time the web service is only available to institutions, primarily libraries, that have a specific relationship with OCLC and then only for work related to that library's services. The specific relationship is explained further here, http://oclc.org/developer/documentation/worldcat-search-api/who-can-use. However, there are other OCLC services that are available for individuals' non-commercial use. Looking at the list of services available on http://www.worldcat.org/wcpa/content/affiliate/ you'll see that the WorldCat search box and WorldCat links with embedded searches are available to anyone. You may also be interested in checking out the WorldCat Registry, or low-volume use of the xISBN and xISSN services. If you have questions about the service, please contact the product manager, Dawn Hendricks at hendr...@oclc.org. *

There is nothing wrong with having a proprietary API; but pretending that it isn't (either directly or through omission), or being afraid to say it, is the kind of thing that has caused me to lose respect for OCLC. Nothing should be declared open that isn't available to all, not just members. And advertisements for WC API classes should state "members only". That would be honest.

And telling folks on a wide-open list that they should use the Worldcat API (without mentioning if you are in a member institution and using this for library services) is at best deceiving, at worst dishonest. I, for one, am tired of OCLC's lies, and I'm not afraid to say it. Fortunately for me, retirement is looming and I don't need to care who likes what I say. This is a relief, to say the least.

kc

This experiment is the first step in a process to make linked data about WorldCat resources available. As it will evolve over time, other areas such as API access, content-negotiation, search and other query methods, additional RDF data vocabularies, etc., etc., will be considered in concert with community feedback (such as this thread) as to the way forward. Karen, I know you are eager to work with and demonstrate the benefits of this way of publishing data. But these things take time and effort, so please be a little patient, and keep firing off these use cases and issues; they are all valuable input. ~Richard.

kc

Roy

On Tue, Jul 10, 2012 at 2:08 PM, Kevin Ford k...@3windmills.com wrote:

The use case clarifies perfectly. Totally feasible. Well, I should say totally feasible with the caveat that I've never used the Worldcat Search API. Not letting that stop me, so long as it is what I imagine it is, then a developer should be able to perform a search, retrieve the response, and, by integrating one of the tools advertised on the schema.org website into his/her code, then retrieve the microdata for each resource returned from the search (and save it as RDF or whatever). If someone has created something like this, do speak up.

Yours, Kevin

On 07/10/2012 04:48 PM, Karen Coyle wrote:

Kevin, if you misunderstand then I undoubtedly haven't been clear (let's at least share the confusion :-)). Here's the use case: PersonA wants to create a comprehensive bibliography of works by AuthorB. The goal is to do a search on AuthorB in WorldCat and extract the RDFa data from those pages in order to populate the bibliography. Apart from all of the issues of getting a perfect match on authors and of manifestation duplicates (there would need to be editing of the results after retrieval at the user's end), how feasible is this? Assume that the author is prolific enough that one wouldn't want to look up all of the records by hand.

kc

On 7/10/12 1:43 PM, Kevin Ford wrote:

As for someone who might want to do this programmatically, he/she should take a look at the Programming languages section of the second link I sent along: http://schema.rdfs.org/tools.html There one can find Ruby, Python, and Java extractors and parsers capable of outputting RDF. A developer can take one of these and programmatically get at the data. Apologies if I am misunderstanding your intent.

Yours, Kevin

On 07/10/2012 04:34
Re: [CODE4LIB] Worldcat schema.org search API
Well, I got the same email today when I apparently clicked on the wrong link (in the wrong account) while looking for my existing WC Basic API WSKEY (seriously, OCLC, the developer site is *terrible* with regards to usability).

That said, here are the steps to get a WC Basic API WSKEY:

Log in (or create an account) here: https://worldcat.org/config/SignIn.do

On the left should be a menu that reads:
WorldCat Registry
WorldCat Basic API Key
Find A Library API Key
Web Service Keys

Click on WorldCat Basic API Key, then Request a WorldCat Basic API Key. Then you should be able to use the Basic API (which will return results in RSS or Atom). From the search results, you can follow the links to the Worldcat pages and grab either the schema.org microdata or RDFa (or both, obviously).

-Ross.

On Thu, Jul 12, 2012 at 12:33 PM, Karen Coyle li...@kcoyle.net wrote:
Re: [CODE4LIB] Worldcat schema.org search API
Karen,

Unfortunately it looks like you requested a key for the WorldCat Search API, which does have specific eligibility criteria. The WorldCat Basic API which Ross mentions is available to anyone - http://www.oclc.org/developer/services/worldcat-basic-api

It allows you to do an OpenSearch keyword query of WorldCat and get back basic metadata, including the link to the worldcat.org page for each record returned. The easiest way to get a key is to go to http://worldcat.org/config/ and log in with a WorldCat username/password. You should see a link that says WorldCat Basic API Key which you can use to get a key.

I apologize for the confusion between the two APIs (WorldCat Search and WorldCat Basic). The difference is something we've tried to make clearer in our documentation, but unfortunately given your experience it is still an issue.

Karen

On Thu, Jul 12, 2012 at 11:33 AM, Karen Coyle li...@kcoyle.net wrote:
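Since the Basic API is an OpenSearch service, a query is just a keyed URL. The sketch below builds one; the endpoint path and parameter names follow the common OpenSearch convention but are assumptions here, and the WSKey is a placeholder, so consult the Basic API documentation linked above for the real request format.

```python
# Sketch: build an OpenSearch keyword-query URL for the WorldCat Basic
# API. Endpoint path and parameter names are assumptions; the WSKey is
# a placeholder obtained via the steps Ross describes.
from urllib.parse import urlencode

def basic_search_url(query, wskey, fmt="atom"):
    """Return a search URL requesting Atom (or RSS) results."""
    base = "http://www.worldcat.org/webservices/catalog/search/opensearch"
    return base + "?" + urlencode({"q": query, "format": fmt, "wskey": wskey})

url = basic_search_url('au:"Neyman, Jerzy"', "YOUR_WSKEY")
print(url)
```

The Atom/RSS entries returned by such a query carry links to the worldcat.org pages, from which the schema.org microdata or RDFa can then be scraped, matching the bibliography use case discussed earlier in the thread.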
Re: [CODE4LIB] Worldcat schema.org search API
It isn't unfortunate, it was deliberate. I have a key for the Basic API, but I was being advised that I had overlooked the obvious answer of the WorldCat Search API. I have no confusion between the two, except for the confusion that seems to be promulgated by OCLC itself.

kc

On 7/12/12 9:46 AM, Karen Coombs wrote:

Karen, Unfortunately it looks like you requested a key for the WorldCat Search API which does have specific eligibility criteria. The WorldCat Basic API which Ross mentions is available to anyone - http://www.oclc.org/developer/services/worldcat-basic-api It allows you to do an OpenSearch keyword query of WorldCat and get back basic metadata including the link to the worldcat.org page for each record returned. The easiest way to get a key is to go to http://worldcat.org/config/ and login with a WorldCat username/password. You should see a link that says WorldCat Basic API Key which you can use to get a key. I apologize for the confusion between the two APIs (WorldCat Search and WorldCat Basic). The difference is something we've tried to make clearer in our documentation but unfortunately given your experience it is still an issue.

Karen

On Thu, Jul 12, 2012 at 11:33 AM, Karen Coyle li...@kcoyle.net wrote:

On 7/10/12 5:07 PM, Karen Coyle wrote: On 7/10/12 4:02 PM, Richard Wallis wrote: But is it available to everyone, and is the data retrieved also usable as ODC-BY by any member of the Web public? Yes it is, and at this stage it is only available from within an html page. The "it" I was referring to was the API. Roy is telling me that people should use the API, as if that is an obvious option that I am overlooking. I am asking if the general web public can use the API to get this data. I believe that should be a yes/no question/answer.
Since no one here from OCLC had the integrity to answer this question, I went ahead and applied for a Worldcat API key, and here is the reply: * Hello, Thank you for your interest in the WorldCat Search API, however at this time the web service is only available to institutions, primarily libraries, that have a specific relationship with OCLC and then only for work related to that library's services. The specific relationship is explained further here, http://oclc.org/developer/documentation/worldcat-search-api/who-can-use. However, there are other OCLC services that are available for individuals' non-commercial use. Looking at the list of services available on http://www.worldcat.org/wcpa/content/affiliate/ you'll see that the WorldCat search box and WorldCat links with embedded searches are available to anyone. You may also be interested in checking out the WorldCat Registry, or low-volume use of the xISBN and xISSN services. If you have questions about the service, please contact the product manager, Dawn Hendricks at hendr...@oclc.org. * There is nothing wrong with having a proprietary API; but pretending that it isn't (either directly or through omission), or being afraid to say it, is the kind of thing that has caused me to lose respect for OCLC. Nothing should be declared open that isn't available to all, not just members. And advertisements for WC API classes should state members only. That would be honest. And telling folks on a wide-open list that they should use the Worldcat API (without mentioning if you are in a member institution and using this for library services) is at best deceiving, at worst dishonest. I, for one, am tired of OCLC's lies, and I'm not afraid to say it. Fortunately for me, retirement is looming and I don't need to care who likes what I say. This is a relief, to say the least. kc This experiment is the first step in a process to make linked data about WorldCat resources available. 
As it will evolve over time other areas such as API access, content-negotiation, search other query methods, additional RDF data vocabularies, etc., etc., will be considered in concert with community feedback (such as this thread) as to the way forward. Karen I know you are eager to work with and demonstrate the benefits of this way of publishing data. But these things take time and effort, so please be a little patient, and keep firing off these use cases and issues they are all valuable input. ~Richard. kc
Re: [CODE4LIB] Worldcat schema.org search API
Ok, the Pipe didn't quite work as planned. Yahoo! is stripping out all of the relevant html attributes when it's converting the WC microdata html to a string, which renders the whole thing useless. If I don't convert it to a string, it maintains all of the necessary attributes in the JSON output, but it strips them from the RSS and html outputs. I mean, it's hard to complain about a free thing not handling my niche problem, but when has that ever stopped me? Anyway, it's there for somebody to clone and poke around with. Maybe somebody more familiar with Pipes can figure a way around this problem. -Ross. On Thu, Jul 12, 2012 at 3:03 PM, Ross Singer rossfsin...@gmail.com wrote: I made a Yahoo Pipe that merges the WorldCat Basic OpenSearch RSS result with the microdata div in the Worldcat pages referred to in the search results: http://pipes.yahoo.com/pipes/pipe.info?_id=05ae2a7bc180f3abe36b11bcaf1adc52 You'll need to enter your wskey for it to work. You can get the output as RSS (which will require the item/description to be unescaped to use) or JSON (which wouldn't require unescaping). It's not terribly fast, but it at least should help somebody get started. -Ross.
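The first half of the Pipe's job — pulling the record links out of a WorldCat Basic OpenSearch RSS response — can be sketched in a few lines of standard-library Python. The function name and the canned feed below are illustrative, not the API's actual output; a live feed would come from the Basic API's OpenSearch endpoint with your own wskey.

```python
# Sketch: extract worldcat.org record links from an OpenSearch-style
# RSS 2.0 response. The feed layout (<item><link> elements) is an
# assumption; check the Basic API docs for the real response shape.
import xml.etree.ElementTree as ET

def record_links(rss_text):
    """Return the <link> URL of every <item> in an RSS 2.0 feed."""
    root = ET.fromstring(rss_text)
    return [item.findtext("link") for item in root.iter("item")]

# Stand-in feed; a real one would be fetched from the Basic API
# (OpenSearch query plus wskey parameter, per its documentation).
sample = """<rss version="2.0"><channel>
  <item><title>Example work</title>
    <link>http://worldcat.org/oclc/12345</link></item>
  <item><title>Another work</title>
    <link>http://worldcat.org/oclc/67890</link></item>
</channel></rss>"""

print(record_links(sample))
```

From there, each link can be fetched and its embedded microdata handed to any of the extractors discussed in this thread.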
Re: [CODE4LIB] Worldcat schema.org search API
That only returns a short citation but nothing says how short that citation is, nor if it is formatted. I assume that citation means citation format, which isn't useful. kc On 7/10/12 7:32 PM, Ross Singer wrote: Worldcat does have the basic API, which is more open (assuming your situation qualifies). At any rate, it's free and open to (non-commercial) non-subscribers. http://oclc.org/developer/documentation/worldcat-basic-api/using-api Searching isn't terribly sophisticated, but might suit your need. And the schema.org data will be much richer than what you'd normally get back from the Basic API. -Ross. -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [CODE4LIB] Worldcat schema.org search API
Every entry has a link href=http://worldcat.org/oclc/{oclcnumber}/ that will take you to the schema.org data. -Ross. On Wed, Jul 11, 2012 at 9:08 AM, Karen Coyle li...@kcoyle.net wrote: That only returns a short citation but nothing says how short that citation is, nor if it is formatted. I assume that citation means citation format, which isn't useful. kc
Re: [CODE4LIB] Worldcat schema.org search API
Also, my colleague wishes me to point out that the email address and phone number of any OCLC staff member is only two clicks away from our home page. Go to Contact us which is an option along the top on every page, then Contact OCLC Staff which is in the sidebar, as well as a link on the page to search for a specific person. Roy On Tue, Jul 10, 2012 at 11:42 AM, Karen Coyle li...@kcoyle.net wrote: I have demonstrated the schema.org/RDFa microdata in the WC database to various folks and the question always is: how do I get access to this? (The only source I have is the Facebook API, me being a user rather than a maker.) The microdata is CC-BY once you get a Worldcat URI, but is there an open search to get one to the desired records in the first place? I'm poorly-versed in WC APIs so I'm hoping others have a better grasp. @rjw: the OCLC website does a thorough job of hiding email addresses or I would have asked this directly. Then again, a discussion here could have added value. Thanks, kc -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [CODE4LIB] Worldcat schema.org search API
Hi Karen At this stage there is no specific api as such to get at the embedded RDFa data in WorldCat - you can use the normal UI of WorldCat itself or one of the WorldCat Search API options such as OpenSearch (http://oclc.org/developer/documentation/worldcat-search-api/opensearch). This experimental first step at exposing WorldCat data as linked data will evolve. As more development and discussion guides us, more data and ways to get at it will appear. You can get at the raw RDF from the embedded RDFa in a couple of ways. The W3C RDFa 1.1 Distiller (http://www.w3.org/2012/pyRdfa/) is one. Another is using the ARC2 PHP Library (https://github.com/semsol/arc2/wiki/Getting-started-with-ARC2) for those that want to write some simple code. Bruce Washburn has published a post (http://www.oclc.org/developer/news/linked-data-now-worldcat-facebook-app) sharing how he used ARC2 in the enhanced WorldCat Facebook App to extract the RDF from WorldCat and process it to link on and use the same technique on Viaf and FAST. He includes code snippets and a link to the full source for those that are interested. A minor point on licensing - the linked data is licensed under ODC-BY (http://opendatacommons.org/licenses/by/), not CC-BY. ODC-BY is a data oriented license, as against CC which is more creative work oriented. Sorry my email was hard to find - it is richard.wal...@oclc.org. Also if you have questions or comments about OCLC linked data formatting or publishing you can drop an email to d...@oclc.org. ~Richard. On 10 July 2012 19:42, Karen Coyle li...@kcoyle.net wrote: I have demonstrated the schema.org/RDFa microdata in the WC database to various folks and the question always is: how do I get access to this? (The only source I have is the Facebook API, me being a user rather than a maker.) The microdata is CC-BY once you get a Worldcat URI, but is there an open search to get one to the desired records in the first place? 
I'm poorly-versed in WC APIs so I'm hoping others have a better grasp. @rjw: the OCLC website does a thorough job of hiding email addresses or I would have asked this directly. Then again, a discussion here could have added value. Thanks, kc -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet -- Richard Wallis Founder, Data Liberate http://dataliberate.com Tel: +44 (0)7767 886 005 Linkedin: http://www.linkedin.com/in/richardwallis Skype: richard.wallis1 Twitter: @rjw IM: rjw3...@hotmail.com
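For a quick look at the embedded data without pulling in ARC2 or the W3C distiller, the same extraction can be roughed out with only Python's standard library: walk the HTML and collect RDFa Lite property/value pairs. This is a deliberate simplification — a real RDFa processor handles nesting, typeof, prefixes, and datatypes — and the class name and sample markup are illustrative, not WorldCat's actual page structure.

```python
# Rough sketch of collecting schema.org property/value pairs from
# RDFa Lite (or microdata itemprop) markup, stdlib only. A real
# parser (ARC2, pyRdfa, rdflib) does far more.
from html.parser import HTMLParser

class PropertyCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.pairs = []   # (property, value) tuples found so far
        self._open = []   # properties whose value is text content

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        prop = a.get("property") or a.get("itemprop")
        if not prop:
            return
        # Value may live in an attribute (meta/link tags) or in text.
        for value_attr in ("content", "href", "src"):
            if value_attr in a:
                self.pairs.append((prop, a[value_attr]))
                return
        self._open.append(prop)

    def handle_data(self, data):
        if self._open and data.strip():
            self.pairs.append((self._open.pop(), data.strip()))

def extract_properties(html_text):
    collector = PropertyCollector()
    collector.feed(html_text)
    return collector.pairs

sample = ('<div vocab="http://schema.org/" typeof="Book">'
          '<span property="name">Example Title</span>'
          '<a property="author" href="http://example.org/person/1">'
          'A. Writer</a></div>')
print(extract_properties(sample))
```

The pairs could then be serialized to any RDF notation you like, with the page's own URI as subject.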
Re: [CODE4LIB] Worldcat schema.org search API
Thanks, Roy. I obviously never got there, but will visit in the future. kc On 7/10/12 12:57 PM, Roy Tennant wrote: Also, my colleague wishes me to point out that the email address and phone number of any OCLC staff member is only two clicks away from our home page. Go to Contact us which is an option along the top on every page, then Contact OCLC Staff which is in the sidebar, as well as a link on the page to search for a specific person. Roy -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [CODE4LIB] Worldcat schema.org search API
As for someone who might want to do this programmatically, he/she should take a look at the Programming languages section of the second link I sent along: http://schema.rdfs.org/tools.html There one can find Ruby, Python, and Java extractors and parsers capable of outputting RDF. A developer can take one of these and programmatically get at the data. Apologies if I am misunderstanding your intent. Yours, Kevin On 07/10/2012 04:34 PM, Karen Coyle wrote: Thanks, Kevin! And Richard! I'm thinking we need a good web site with links to tools. I had already been introduced to http://www.w3.org/2012/pyRdfa/ where you can paste a URI and get ttl or rdf/xml. These are all good resources. But what about someone who wants to do this programmatically, not through a web site? Richard's message indicates that this isn't yet available, so perhaps we should be gathering use cases to support the need? And have a place to post various solutions, even ones that are not OCLC-specific? (Because I am hoping that the use of microformats will increase in general.) kc On 7/10/12 12:12 PM, Kevin Ford wrote: is there an open search to get one to the desired records in the first place? -- I'm not certain this will fully address your question, but try these two sites: Website: http://www.google.com/webmasters/tools/richsnippets Example: http://tinyurl.com/dx3h5bg Website: http://linter.structured-data.org/ Example: http://tinyurl.com/bmm8bbc These sites will extract the data, but I don't think you get your choice of serialization. The data are extracted and displayed on the resulting page in the HTML, but at least you can *see* the data. Additionally, there are a number of tools to help with microdata extraction here: http://schema.rdfs.org/tools.html Some of these will allow you to output specific (RDF) serializations. 
HTH, Kevin On 07/10/2012 02:42 PM, Karen Coyle wrote: I have demonstrated the schema.org/RDFa microdata in the WC database to various folks and the question always is: how do I get access to this? (The only source I have is the Facebook API, me being a user rather than a maker.) The microdata is CC-BY once you get a Worldcat URI, but is there an open search to get one to the desired records in the first place? I'm poorly-versed in WC APIs so I'm hoping others have a better grasp. @rjw: the OCLC website does a thorough job of hiding email addresses or I would have asked this directly. Then again, a discussion here could have added value. Thanks, kc
Re: [CODE4LIB] Worldcat schema.org search API
Kevin, if you misunderstand then I undoubtedly haven't been clear (let's at least share the confusion :-)). Here's the use case: PersonA wants to create a comprehensive bibliography of works by AuthorB. The goal is to do a search on AuthorB in WorldCat and extract the RDFa data from those pages in order to populate the bibliography. Apart from all of the issues of getting a perfect match on authors and of manifestation duplicates (there would need to be editing of the results after retrieval at the user's end), how feasible is this? Assume that the author is prolific enough that one wouldn't want to look up all of the records by hand. kc On 7/10/12 1:43 PM, Kevin Ford wrote: As for someone who might want to do this programmatically, he/she should take a look at the Programming languages section of the second link I sent along: http://schema.rdfs.org/tools.html There one can find Ruby, Python, and Java extractors and parsers capable of outputting RDF. A developer can take one of these and programmatically get at the data. Apologies if I am misunderstanding your intent. Yours, Kevin On 07/10/2012 04:34 PM, Karen Coyle wrote: Thanks, Kevin! And Richard! I'm thinking we need a good web site with links to tools. I had already been introduced to http://www.w3.org/2012/pyRdfa/ where you can paste a URI and get ttl or rdf/xml. These are all good resources. But what about someone who wants to do this programmatically, not through a web site? Richard's message indicates that this isn't yet available, so perhaps we should be gathering use cases to support the need? And have a place to post various solutions, even ones that are not OCLC-specific? (Because I am hoping that the use of microformats will increase in general.) kc On 7/10/12 12:12 PM, Kevin Ford wrote: is there an open search to get one to the desired records in the first place? 
-- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [CODE4LIB] Worldcat schema.org search API
Karen, RDFa and the basic schema.org vocabulary, plus the intention of the proposed library extension, are not OCLC specific - they are generic tools and techniques applicable across many domains. I would therefore avoid library focussed tool sites, which would run the risk of not keeping up with wider developments. RDFa.info seems to be shaping up as a good resource. Schema.org itself also is a good resource. On the point of how to gain the best from linked data, many especially in the library community, immediately look towards search as the default *paradise* for dealing with data. Many of the benefits of linked data emerge not from search, but from identifying relationships and following links. I heard this described the other day as 'facets on steroids' - not entirely accurate, but it conjures up the right kind of image ;-) I am not saying ignore search, far from it, just suggesting that innovation with linked data often comes from what you can do once you have found (often by traditional methods) a thing. ~Richard. On 10 July 2012 21:34, Karen Coyle li...@kcoyle.net wrote: Thanks, Kevin! And Richard! I'm thinking we need a good web site with links to tools. I had already been introduced to http://www.w3.org/2012/pyRdfa/ where you can paste a URI and get ttl or rdf/xml. These are all good resources. But what about someone who wants to do this programmatically, not through a web site? Richard's message indicates that this isn't yet available, so perhaps we should be gathering use cases to support the need? And have a place to post various solutions, even ones that are not OCLC-specific? (Because I am hoping that the use of microformats will increase in general.) kc On 7/10/12 12:12 PM, Kevin Ford wrote: is there an open search to get one to the desired records in the first place? 
-- Richard Wallis Founder, Data Liberate http://dataliberate.com Tel: +44 (0)7767 886 005 Linkedin: http://www.linkedin.com/in/richardwallis Skype: richard.wallis1 Twitter: @rjw IM: rjw3...@hotmail.com
Re: [CODE4LIB] Worldcat schema.org search API
The use case clarifies perfectly. Totally feasible. Well, I should say totally feasible with the caveat that I've never used the Worldcat Search API. Not letting that stop me, so long as it is what I imagine it is, then a developer should be able to perform a search, retrieve the response, and, by integrating one of the tools advertised on the schema.org website into his/her code, then retrieve the microdata for each resource returned from the search (and save it as RDF or whatever). If someone has created something like this, do speak up. Yours, Kevin On 07/10/2012 04:48 PM, Karen Coyle wrote: Kevin, if you misunderstand then I undoubtedly haven't been clear (let's at least share the confusion :-)). Here's the use case: PersonA wants to create a comprehensive bibliography of works by AuthorB. The goal is to do a search on AuthorB in WorldCat and extract the RDFa data from those pages in order to populate the bibliography. Apart from all of the issues of getting a perfect match on authors and of manifestation duplicates (there would need to be editing of the results after retrieval at the user's end), how feasible is this? Assume that the author is prolific enough that one wouldn't want to look up all of the records by hand. kc On 7/10/12 1:43 PM, Kevin Ford wrote: As for someone who might want to do this programmatically, he/she should take a look at the Programming languages section of the second link I sent along: http://schema.rdfs.org/tools.html There one can find Ruby, Python, and Java extractors and parsers capable of outputting RDF. A developer can take one of these and programmatically get at the data. Apologies if I am misunderstanding your intent. Yours, Kevin On 07/10/2012 04:34 PM, Karen Coyle wrote: Thanks, Kevin! And Richard! I'm thinking we need a good web site with links to tools. I had already been introduced to http://www.w3.org/2012/pyRdfa/ where you can paste a URI and get ttl or rdf/xml. These are all good resources.
But what about someone who wants to do this programmatically, not through a web site? Richard's message indicates that this isn't yet available, so perhaps we should be gathering use cases to support the need? And have a place to post various solutions, even ones that are not OCLC-specific? (Because I am hoping that the use of microformats will increase in general.) kc On 7/10/12 12:12 PM, Kevin Ford wrote: is there an open search to get one to the desired records in the first place? -- I'm not certain this will fully address your question, but try these two sites: Website: http://www.google.com/webmasters/tools/richsnippets Example: http://tinyurl.com/dx3h5bg Website: http://linter.structured-data.org/ Example: http://tinyurl.com/bmm8bbc These sites will extract the data, but I don't think you get your choice of serialization. The data are extracted and displayed on the resulting page in the HTML, but at least you can *see* the data. Additionally, there are a number of tools to help with microdata extraction here: http://schema.rdfs.org/tools.html Some of these will allow you to output specific (RDF) serializations. HTH, Kevin On 07/10/2012 02:42 PM, Karen Coyle wrote: I have demonstrated the schema.org/RDFa microdata in the WC database to various folks and the question always is: how do I get access to this? (The only source I have is the Facebook API, me being a user rather than a maker.) The microdata is CC-BY once you get a Worldcat URI, but is there an open search to get one to the desired records in the first place? I'm poorly-versed in WC APIs so I'm hoping others have a better grasp. @rjw: the OCLC website does a thorough job of hiding email addresses or I would have asked this directly. Then again, a discussion here could have added value. Thanks, kc
Re: [CODE4LIB] Worldcat schema.org search API
Uh...what? For the given use case you would be much better off simply using the WorldCat Search API response. Using it only to retrieve an identifier and then going and scraping the Linked Data out of a WorldCat.org page is, at best, redundant. As Richard pointed out, some use cases -- like the one Karen provided -- are not really a good use case for linked data. It's a better use case for an API, which has been available for years. Roy On Tue, Jul 10, 2012 at 2:08 PM, Kevin Ford k...@3windmills.com wrote: The use case clarifies perfectly. Totally feasible. Well, I should say totally feasible with the caveat that I've never used the Worldcat Search API. Not letting that stop me, so long as it is what I imagine it is, then a developer should be able to perform a search, retrieve the response, and, by integrating one of the tools advertised on the schema.org website into his/her code, then retrieve the microdata for each resource returned from the search (and save it as RDF or whatever). If someone has created something like this, do speak up. Yours, Kevin On 07/10/2012 04:48 PM, Karen Coyle wrote: Kevin, if you misunderstand then I undoubtedly haven't been clear (let's at least share the confusion :-)). Here's the use case: PersonA wants to create a comprehensive bibliography of works by AuthorB. The goal is to do a search on AuthorB in WorldCat and extract the RDFa data from those pages in order to populate the bibliography. Apart from all of the issues of getting a perfect match on authors and of manifestation duplicates (there would need to be editing of the results after retrieval at the user's end), how feasible is this? Assume that the author is prolific enough that one wouldn't want to look up all of the records by hand. 
kc On 7/10/12 1:43 PM, Kevin Ford wrote: As for someone who might want to do this programmatically, he/she should take a look at the Programming languages section of the second link I sent along: http://schema.rdfs.org/tools.html There one can find Ruby, Python, and Java extractors and parsers capable of outputting RDF. A developer can take one of these and programmatically get at the data. Apologies if I am misunderstanding your intent. Yours, Kevin On 07/10/2012 04:34 PM, Karen Coyle wrote: Thanks, Kevin! And Richard! I'm thinking we need a good web site with links to tools. I had already been introduced to http://www.w3.org/2012/pyRdfa/ where you can past a URI and get ttl or rdf/xml. These are all good resources. But what about someone who wants to do this programmatically, not through a web site? Richard's message indicates that this isn't yet available, so perhaps we should be gathering use cases to support the need? And have a place to post various solutions, even ones that are not OCLC-specific? (Because I am hoping that the use of microformats will increase in general.) kc On 7/10/12 12:12 PM, Kevin Ford wrote: is there an open search to get one to the desired records in the first place? -- I'm not certain this will fully address your question, but try these two sites: Website: http://www.google.com/webmasters/tools/richsnippets Example: http://tinyurl.com/dx3h5bg Website: http://linter.structured-data.org/ Example: http://tinyurl.com/bmm8bbc These sites will extract the data, but I don't think you get your choice of serialization. The data are extracted and displayed on the resulting page in the HTML, but at least you can *see* the data. Additionally, there are a number of tools to help with microdata extraction here: http://schema.rdfs.org/tools.html Some of these will allow you to output specific (RDF) serializations. 
HTH, Kevin On 07/10/2012 02:42 PM, Karen Coyle wrote: I have demonstrated the schema.org/RDFa microdata in the WC database to various folks and the question always is: how do I get access to this? (The only source I have is the Facebook API, me being a user rather than a maker.) The microdata is CC-BY once you get a Worldcat URI, but is there an open search to get one to the desired records in the first place? I'm poorly-versed in WC APIs so I'm hoping others have a better grasp. @rjw: the OCLC website does a thorough job of hiding email addresses or I would have asked this directly. Then again, a discussion here could have added value. Thanks, kc
Re: [CODE4LIB] Worldcat schema.org search API
Does the worldcat search api return the data as described with the schema.org and OCLC extension vocabularies? The use case mentioned extracting the RDFa data from those pages. Without knowing the answer to the leading question above, the mock solution addressed that condition. If one simply wanted to create a comprehensive bibliography of works by a particular author, then, yes, the search response would suffice. Kevin On 07/10/2012 05:10 PM, Roy Tennant wrote: Uh...what? For the given use case you would be much better off simply using the WorldCat Search API response. Using it only to retrieve an identifier and then going and scraping the Linked Data out of a WorldCat.org page is, at best, redundant. As Richard pointed out, some use cases -- like the one Karen provided -- are not really a good use case for linked data. It's a better use case for an API, which has been available for years. Roy On Tue, Jul 10, 2012 at 2:08 PM, Kevin Ford k...@3windmills.com wrote: The use case clarifies perfectly. Totally feasible. Well, I should say totally feasible with the caveat that I've never used the Worldcat Search API. Not letting that stop me, so long as it is what I imagine it is, then a developer should be able to perform a search, retrieve the response, and, by integrating one of the tools advertised on the schema.org website into his/her code, then retrieve the microdata for each resource returned from the search (and save it as RDF or whatever). If someone has created something like this, do speak up. Yours, Kevin On 07/10/2012 04:48 PM, Karen Coyle wrote: Kevin, if you misunderstand then I undoubtedly haven't been clear (let's at least share the confusion :-)). Here's the use case: PersonA wants to create a comprehensive bibliography of works by AuthorB. The goal is to do a search on AuthorB in WorldCat and extract the RDFa data from those pages in order to populate the bibliography. 
Apart from all of the issues of getting a perfect match on authors and of manifestation duplicates (there would need to be editing of the results after retrieval at the user's end), how feasible is this? Assume that the author is prolific enough that one wouldn't want to look up all of the records by hand. kc On 7/10/12 1:43 PM, Kevin Ford wrote: As for someone who might want to do this programmatically, he/she should take a look at the Programming languages section of the second link I sent along: http://schema.rdfs.org/tools.html There one can find Ruby, Python, and Java extractors and parsers capable of outputting RDF. A developer can take one of these and programmatically get at the data. Apologies if I am misunderstanding your intent. Yours, Kevin On 07/10/2012 04:34 PM, Karen Coyle wrote: Thanks, Kevin! And Richard! I'm thinking we need a good web site with links to tools. I had already been introduced to http://www.w3.org/2012/pyRdfa/ where you can past a URI and get ttl or rdf/xml. These are all good resources. But what about someone who wants to do this programmatically, not through a web site? Richard's message indicates that this isn't yet available, so perhaps we should be gathering use cases to support the need? And have a place to post various solutions, even ones that are not OCLC-specific? (Because I am hoping that the use of microformats will increase in general.) kc On 7/10/12 12:12 PM, Kevin Ford wrote: is there an open search to get one to the desired records in the first place? -- I'm not certain this will fully address your question, but try these two sites: Website: http://www.google.com/webmasters/tools/richsnippets Example: http://tinyurl.com/dx3h5bg Website: http://linter.structured-data.org/ Example: http://tinyurl.com/bmm8bbc These sites will extract the data, but I don't think you get your choice of serialization. The data are extracted and displayed on the resulting page in the HTML, but at least you can *see* the data. 
Additionally, there are a number of tools to help with microdata extraction here: http://schema.rdfs.org/tools.html Some of these will allow you to output specific (RDF) serializations. HTH, Kevin On 07/10/2012 02:42 PM, Karen Coyle wrote: I have demonstrated the schema.org/RDFa microdata in the WC database to various folks and the question always is: how do I get access to this? (The only source I have is the Facebook API, me being a user rather than a maker.) The microdata is CC-BY once you get a Worldcat URI, but is there an open search to get one to the desired records in the first place? I'm poorly-versed in WC APIs so I'm hoping others have a better grasp. @rjw: the OCLC website does a thorough job of hiding email addresses or I would have asked this directly. Then again, a discussion here could have added value. Thanks, kc
Re: [CODE4LIB] Worldcat schema.org search API
I think we have a catch-22 here. You need an OCLC developer license to use WC to discover WC URIs using an application; you need WC URIs (or other URIs that are not very diffuse on the Web) to make use of the OCLC linked data. The OCLC linked data is ODC-BY for anyone wishing to use the data, but, if I'm not mistaken, the APIs are not publicly open to the Web public. Thus the schema.org data is ODC-BY but most applications on the web will have little opportunity to discover the OCLC-specific URIs. So the gatekeeper is the API access, that is, the ability to search WC for URI discovery (e.g. with an author's name). So you can link, but you can't easily discover the linking URIs. I suppose that one could discover publications as linked data using the topical access of LCSH, the VIAF links in Wikipedia, or by going through databases like Open Library, which has some OCLC numbers associated with bibliographic data. All of these are accessible via open APIs, I believe, and are linked to DBpedia. I understand that linking is linking but unless we are developing data for SkyNet, somewhere along the way the user needs to begin with a human-understandable query. Searching and linked data are not in conflict with each other; they give each other mutual support. It only makes sense that URIs will be discovered through searching at some point in the process of access, as applications like Wikipedia illustrate. (As does the Facebook API, which is a search.) I've tried to find a clear statement of who can get access to the OCLC APIs, but I'm afraid that I can't find a page that clarifies that. I guess one is expected to apply for a developer key in order to find out if they qualify. I'll pass that information along. kc On 7/10/12 2:32 PM, Kevin Ford wrote: Does the worldcat search api return the data as described with the schema.org and OCLC extension vocabularies? The use case mentioned extracting the RDFa data from those pages.
Without knowing the answer to the leading question above, the mock solution addressed that condition. If one simply wanted to create a comprehensive bibliography of works by a particular author, then, yes, the search response would suffice. Kevin On 07/10/2012 05:10 PM, Roy Tennant wrote: Uh...what? For the given use case you would be much better off simply using the WorldCat Search API response. Using it only to retrieve an identifier and then going and scraping the Linked Data out of a WorldCat.org page is, at best, redundant. As Richard pointed out, some use cases -- like the one Karen provided -- are not really a good use case for linked data. It's a better use case for an API, which has been available for years. Roy On Tue, Jul 10, 2012 at 2:08 PM, Kevin Ford k...@3windmills.com wrote: The use case clarifies perfectly. Totally feasible. Well, I should say totally feasible with the caveat that I've never used the Worldcat Search API. Not letting that stop me, so long as it is what I imagine it is, then a developer should be able to perform a search, retrieve the response, and, by integrating one of the tools advertised on the schema.org website into his/her code, then retrieve the microdata for each resource returned from the search (and save it as RDF or whatever). If someone has created something like this, do speak up. Yours, Kevin On 07/10/2012 04:48 PM, Karen Coyle wrote: Kevin, if you misunderstand then I undoubtedly haven't been clear (let's at least share the confusion :-)). Here's the use case: PersonA wants to create a comprehensive bibliography of works by AuthorB. The goal is to do a search on AuthorB in WorldCat and extract the RDFa data from those pages in order to populate the bibliography. Apart from all of the issues of getting a perfect match on authors and of manifestation duplicates (there would need to be editing of the results after retrieval at the user's end), how feasible is this? 
Assume that the author is prolific enough that one wouldn't want to look up all of the records by hand. kc On 7/10/12 1:43 PM, Kevin Ford wrote: As for someone who might want to do this programmatically, he/she should take a look at the Programming languages section of the second link I sent along: http://schema.rdfs.org/tools.html There one can find Ruby, Python, and Java extractors and parsers capable of outputting RDF. A developer can take one of these and programmatically get at the data. Apologies if I am misunderstanding your intent. Yours, Kevin On 07/10/2012 04:34 PM, Karen Coyle wrote: Thanks, Kevin! And Richard! I'm thinking we need a good web site with links to tools. I had already been introduced to http://www.w3.org/2012/pyRdfa/ where you can past a URI and get ttl or rdf/xml. These are all good resources. But what about someone who wants to do this programmatically, not through a web site? Richard's message indicates that this isn't yet available, so perhaps we should be gathering use
Re: [CODE4LIB] Worldcat schema.org search API
On 7/10/12 2:10 PM, Roy Tennant wrote: Uh...what? For the given use case you would be much better off simply using the WorldCat Search API response. Using it only to retrieve an identifier and then going and scraping the Linked Data out of a WorldCat.org page is, at best, redundant. I do not consider using linked data to be scraping by any meaning of that term. Machine-actionable data is returned in formats like RDF/XML or ttl or JSON. And I'm curious that linked data is somehow not considered to be usable as data and that microformat data is not considered to be searchable -- in fact, its raison d'etre is search optimization. As Richard pointed out, some use cases -- like the one Karen provided -- are not really a good use case for linked data. It's a better use case for an API, which has been available for years. But is it available to everyone, and is the data retrieved also usable as ODC-BY by any member of the Web public? kc Roy On Tue, Jul 10, 2012 at 2:08 PM, Kevin Ford k...@3windmills.com wrote: The use case clarifies perfectly. Totally feasible. Well, I should say totally feasible with the caveat that I've never used the Worldcat Search API. Not letting that stop me, so long as it is what I imagine it is, then a developer should be able to perform a search, retrieve the response, and, by integrating one of the tools advertised on the schema.org website into his/her code, then retrieve the microdata for each resource returned from the search (and save it as RDF or whatever). If someone has created something like this, do speak up. Yours, Kevin On 07/10/2012 04:48 PM, Karen Coyle wrote: Kevin, if you misunderstand then I undoubtedly haven't been clear (let's at least share the confusion :-)). Here's the use case: PersonA wants to create a comprehensive bibliography of works by AuthorB. The goal is to do a search on AuthorB in WorldCat and extract the RDFa data from those pages in order to populate the bibliography. 
Apart from all of the issues of getting a perfect match on authors and of manifestation duplicates (there would need to be editing of the results after retrieval at the user's end), how feasible is this? Assume that the author is prolific enough that one wouldn't want to look up all of the records by hand. kc On 7/10/12 1:43 PM, Kevin Ford wrote: As for someone who might want to do this programmatically, he/she should take a look at the Programming languages section of the second link I sent along: http://schema.rdfs.org/tools.html There one can find Ruby, Python, and Java extractors and parsers capable of outputting RDF. A developer can take one of these and programmatically get at the data. Apologies if I am misunderstanding your intent. Yours, Kevin On 07/10/2012 04:34 PM, Karen Coyle wrote: Thanks, Kevin! And Richard! I'm thinking we need a good web site with links to tools. I had already been introduced to http://www.w3.org/2012/pyRdfa/ where you can past a URI and get ttl or rdf/xml. These are all good resources. But what about someone who wants to do this programmatically, not through a web site? Richard's message indicates that this isn't yet available, so perhaps we should be gathering use cases to support the need? And have a place to post various solutions, even ones that are not OCLC-specific? (Because I am hoping that the use of microformats will increase in general.) kc On 7/10/12 12:12 PM, Kevin Ford wrote: is there an open search to get one to the desired records in the first place? -- I'm not certain this will fully address your question, but try these two sites: Website: http://www.google.com/webmasters/tools/richsnippets Example: http://tinyurl.com/dx3h5bg Website: http://linter.structured-data.org/ Example: http://tinyurl.com/bmm8bbc These sites will extract the data, but I don't think you get your choice of serialization. The data are extracted and displayed on the resulting page in the HTML, but at least you can *see* the data. 
Additionally, there are a number of tools to help with microdata extraction here: http://schema.rdfs.org/tools.html Some of these will allow you to output specific (RDF) serializations. HTH, Kevin On 07/10/2012 02:42 PM, Karen Coyle wrote: I have demonstrated the schema.org/RDFa microdata in the WC database to various folks and the question always is: how do I get access to this? (The only source I have is the Facebook API, me being a user rather than a maker.) The microdata is CC-BY once you get a Worldcat URI, but is there an open search to get one to the desired records in the first place? I'm poorly-versed in WC APIs so I'm hoping others have a better grasp. @rjw: the OCLC website does a thorough job of hiding email addresses or I would have asked this directly. Then again, a discussion here could have added value. Thanks, kc -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [CODE4LIB] Worldcat schema.org search API
On 10 July 2012 23:13, Karen Coyle li...@kcoyle.net wrote: On 7/10/12 2:10 PM, Roy Tennant wrote: Uh...what? For the given use case you would be much better off simply using the WorldCat Search API response. Using it only to retrieve an identifier and then going and scraping the Linked Data out of a WorldCat.org page is, at best, redundant. I do not consider using linked data to be scraping by any meaning of that term. The tools and code libraries that extract the RDF wrapped in RDFa markup in HTML are doing just that - scraping it out of the page markup. However, because it is embedded in there in a structured form, that process can be considered far more reliable than is normally expected of a scraping process, which is easily upset by visual changes. Machine-actionable data is returned in formats like RDF/XML or ttl or JSON. From the code and tools that interpret the RDFa in the page, yes. And I'm curious that linked data is somehow not considered to be usable as data and that microformat data is not considered to be searchable Of course it is usable as data - I think what Roy was getting at is that you could have satisfied your use case with tools that were available before the embedding of linked data into WorldCat detail pages. -- in fact, its raison d'etre is search optimization. Yes, one of the reasons for embedding structured data and identifiers, as well as text [as Google puts it 'things not strings'] is SEO. I'm sure that the search engines are already using it for that now. However, SEO is not the only reason for linked data - [as a linked data enthusiast] I would suggest that better SEO is a nice side benefit of something much more powerful. <evangelism off> As Richard pointed out, some use cases -- like the one Karen provided -- are not really a good use case for linked data. It's a better use case for an API, which has been available for years.
But is it available to everyone, and is the data retrieved also usable as ODC-BY by any member of the Web public? Yes it is, and at this stage it is only available from within a html page. This experiment is the first step in a process to make linked data about WorldCat resources available. As it will evolve over time, other areas such as API access, content-negotiation, search and other query methods, additional RDF data vocabularies, etc., will be considered in concert with community feedback (such as this thread) as to the way forward. Karen, I know you are eager to work with and demonstrate the benefits of this way of publishing data. But these things take time and effort, so please be a little patient, and keep firing off these use cases and issues; they are all valuable input. ~Richard. kc Roy On Tue, Jul 10, 2012 at 2:08 PM, Kevin Ford k...@3windmills.com wrote: The use case clarifies perfectly. Totally feasible. Well, I should say totally feasible with the caveat that I've never used the Worldcat Search API. Not letting that stop me, so long as it is what I imagine it is, then a developer should be able to perform a search, retrieve the response, and, by integrating one of the tools advertised on the schema.org website into his/her code, then retrieve the microdata for each resource returned from the search (and save it as RDF or whatever). If someone has created something like this, do speak up. Yours, Kevin On 07/10/2012 04:48 PM, Karen Coyle wrote: Kevin, if you misunderstand then I undoubtedly haven't been clear (let's at least share the confusion :-)). Here's the use case: PersonA wants to create a comprehensive bibliography of works by AuthorB. The goal is to do a search on AuthorB in WorldCat and extract the RDFa data from those pages in order to populate the bibliography.
Apart from all of the issues of getting a perfect match on authors and of manifestation duplicates (there would need to be editing of the results after retrieval at the user's end), how feasible is this? Assume that the author is prolific enough that one wouldn't want to look up all of the records by hand. kc On 7/10/12 1:43 PM, Kevin Ford wrote: As for someone who might want to do this programmatically, he/she should take a look at the Programming languages section of the second link I sent along: http://schema.rdfs.org/tools.html There one can find Ruby, Python, and Java extractors and parsers capable of outputting RDF. A developer can take one of these and programmatically get at the data. Apologies if I am misunderstanding your intent. Yours, Kevin On 07/10/2012 04:34 PM, Karen Coyle wrote: Thanks, Kevin! And Richard! I'm thinking we need a good web site with links to tools. I had already been introduced to http://www.w3.org/2012/pyRdfa/ where you can paste a URI and get ttl or rdf/xml. These are all good resources. But what about someone who wants to do this programmatically, not through a
Re: [CODE4LIB] Worldcat schema.org search API
On 7/10/12 4:02 PM, Richard Wallis wrote: But is it available to everyone, and is the data retrieved also usable as ODC-BY by any member of the Web public? Yes it is, and at this stage it is only available from within a html page. The "it" I was referring to was the API. Roy is telling me that people should use the API, as if that is an obvious option that I am overlooking. I am asking if the general web public can use the API to get this data. I believe that should be a yes/no question/answer. kc This experiment is the first step in a process to make linked data about WorldCat resources available. As it will evolve over time other areas such as API access, content-negotiation, search other query methods, additional RDF data vocabularies, etc., etc., will be considered in concert with community feedback (such as this thread) as to the way forward. Karen I know you are eager to work with and demonstrate the benefits of this way of publishing data. But these things take time and effort, so please be a little patient, and keep firing off these use cases and issues they are all valuable input. ~Richard. kc Roy On Tue, Jul 10, 2012 at 2:08 PM, Kevin Ford k...@3windmills.com wrote: The use case clarifies perfectly. Totally feasible. Well, I should say totally feasible with the caveat that I've never used the Worldcat Search API. Not letting that stop me, so long as it is what I imagine it is, then a developer should be able to perform a search, retrieve the response, and, by integrating one of the tools advertised on the schema.org website into his/her code, then retrieve the microdata for each resource returned from the search (and save it as RDF or whatever). If someone has created something like this, do speak up. Yours, Kevin On 07/10/2012 04:48 PM, Karen Coyle wrote: Kevin, if you misunderstand then I undoubtedly haven't been clear (let's at least share the confusion :-)). Here's the use case: PersonA wants to create a comprehensive bibliography of works by AuthorB.
The goal is to do a search on AuthorB in WorldCat and extract the RDFa data from those pages in order to populate the bibliography. Apart from all of the issues of getting a perfect match on authors and of manifestation duplicates (there would need to be editing of the results after retrieval at the user's end), how feasible is this? Assume that the author is prolific enough that one wouldn't want to look up all of the records by hand. kc On 7/10/12 1:43 PM, Kevin Ford wrote: As for someone who might want to do this programmatically, he/she should take a look at the Programming languages section of the second link I sent along: http://schema.rdfs.org/tools.html There one can find Ruby, Python, and Java extractors and parsers capable of outputting RDF. A developer can take one of these and programmatically get at the data. Apologies if I am misunderstanding your intent. Yours, Kevin On 07/10/2012 04:34 PM, Karen Coyle wrote: Thanks, Kevin! And Richard! I'm thinking we need a good web site with links to tools. I had already been introduced to http://www.w3.org/2012/pyRdfa/ where you can paste a URI and get ttl or rdf/xml. These are all good resources. But what about someone who wants to do this programmatically, not through a web site? Richard's message indicates that this isn't yet available, so perhaps we should be gathering use cases to support the need? And have a place to post various solutions, even ones that are not OCLC-specific? (Because I am hoping that the use of microformats will increase in general.) kc On 7/10/12 12:12 PM, Kevin Ford wrote: is there an open search to get one to the desired records in the first place?
-- I'm not certain this will fully address your question, but try these two sites: Website: http://www.google.com/webmasters/tools/richsnippets Example: http://tinyurl.com/dx3h5bg Website: http://linter.structured-data.org/ Example: http://tinyurl.com/bmm8bbc These sites will extract the data, but I don't think you get your choice of serialization. The data are extracted and displayed on the resulting page in the HTML, but at least you can *see* the data. Additionally, there are a number of tools to help with microdata extraction here: http://schema.rdfs.org/tools.html Some of these will allow you to output specific (RDF) serializations. HTH, Kevin On 07/10/2012 02:42 PM, Karen Coyle wrote: I have demonstrated the schema.org/RDFa microdata in the WC database to various folks and the question always is: how do I get access to this? (The only source I have is the Facebook API, me being a user rather than a maker.) The microdata is CC-BY once you get a Worldcat URI, but is there an open search to get one to the desired records in the first place? I'm poorly-versed in WC APIs so I'm
Re: [CODE4LIB] Worldcat schema.org search API
Worldcat does have the Basic API, which is more open (assuming your situation qualifies). At any rate, it's free and open to (non-commercial) non-subscribers. http://oclc.org/developer/documentation/worldcat-basic-api/using-api Searching isn't terribly sophisticated, but might suit your need. And the schema.org data will be much richer than what you'd normally get back from the Basic API. -Ross.

On Tuesday, July 10, 2012, Karen Coyle wrote: On 7/10/12 4:02 PM, Richard Wallis wrote: But is it available to everyone, and is the data retrieved also usable as ODC-BY by any member of the Web public? Yes it is, and at this stage it is only available from within an HTML page. The "it" I was referring to was the API. Roy is telling me that people should use the API, as if that is an obvious option that I am overlooking. I am asking if the general web public can use the API to get this data. I believe that should be a yes/no question/answer. kc

This experiment is the first step in a process to make linked data about WorldCat resources available. As it evolves over time, other areas such as API access, content negotiation, search and other query methods, additional RDF data vocabularies, etc., will be considered in concert with community feedback (such as this thread) as to the way forward. Karen, I know you are eager to work with and demonstrate the benefits of this way of publishing data. But these things take time and effort, so please be a little patient, and keep firing off these use cases and issues; they are all valuable input. ~Richard. kc

Roy On Tue, Jul 10, 2012 at 2:08 PM, Kevin Ford k...@3windmills.com wrote: The use case clarifies perfectly. Totally feasible. Well, I should say totally feasible with the caveat that I've never used the Worldcat Search API.
Not letting that stop me, so long as it is what I imagine it is, then a developer should be able to perform a search, retrieve the response, and, by integrating one of the tools advertised on the schema.org website into his/her code, retrieve the microdata for each resource returned from the search (and save it as RDF or whatever). If someone has created something like this, do speak up. Yours, Kevin

On 07/10/2012 04:48 PM, Karen Coyle wrote: Kevin, if you misunderstand then I undoubtedly haven't been clear (let's at least share the confusion :-)). Here's the use case: PersonA wants to create a comprehensive bibliography of works by AuthorB. The goal is to do a search on AuthorB in WorldCat and extract the RDFa data from those pages in order to populate the bibliography. Apart from all of the issues of getting a perfect match on authors and of manifestation duplicates (there would need to be editing of the results after retrieval at the user's end), how feasible is this? Assume that the author is prolific enough that one wouldn't want to look up all of the records by hand. kc

On 7/10/12 1:43 PM, Kevin Ford wrote: As for someone who might want to do this programmatically, he/she should take a look at the Programming languages section of the second link I sent along: http://schema.rdfs.org/tools.html There one can find Ruby, Python, and Java extractors and parsers capable of outputting RDF. A developer can take one of these and programmatically get at the data. Apologies if I am misunderstanding your intent. Yours, Kevin

On 07/10/2012 04:34 PM, Karen Coyle wrote: Thanks, Kevin! And Richard! I'm thinking we need a good web site with links to tools. I had already been introduced to http://www.w3.org/2012/pyRdfa/ where you can paste a URI and get ttl or rdf/xml. These are all good resources.
But what about someone who wants to do this programmatically, not through a web site? Richard's message indicates that this isn't yet available, so perhaps we should be gathering use cases to support the need? And have a place to post various solutions, even ones that are not OCLC-specific? (Because I am hoping that the use of microformats will increase in general.) kc

On 7/10/12 12:12 PM, Kevin Ford wrote: is there an open search to get one to the desired records in the first place? -- I'm not certain this will fully address your question, but try these two sites: Website: http://www.google.com/webmasters/tools/richsnippets
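[Editor's note] The extraction step Karen and Kevin discuss can be sketched with nothing but Python's standard-library HTML parser. This is only a minimal illustration, not one of the tools listed at http://schema.rdfs.org/tools.html: it picks up schema.org properties carried in content/href attributes only, and the sample HTML below is invented, not real WorldCat markup.

```python
from html.parser import HTMLParser

class MicrodataCollector(HTMLParser):
    """Collect (itemprop, value) pairs from schema.org microdata markup.

    Only handles properties expressed via content/href attributes; a real
    extractor also walks element text content and item nesting.
    """
    def __init__(self):
        super().__init__()
        self.properties = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "itemprop" in a:
            value = a.get("content") or a.get("href")
            if value is not None:
                self.properties.append((a["itemprop"], value))

# Invented sample markup standing in for a WorldCat record page.
html = """
<div itemscope itemtype="http://schema.org/Book">
  <meta itemprop="name" content="Example Title"/>
  <link itemprop="author" href="http://example.org/authorB"/>
</div>
"""

collector = MicrodataCollector()
collector.feed(html)
print(collector.properties)
```

Looping this over the result pages of an author search would give PersonA the raw property/value pairs to post-process into a bibliography, duplicates and all.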
Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set
Arash - you might not want to use a straight dump of worldcat catalog records - at least not without the associated holdings information.* There are a lot of quasi-duplicate records that are sufficiently broken that the worldcat de-duplication algorithm refuses to merge them. These records will usually only be used by a handful of institutions; the better records will tend to have more associated holdings. The holdings count should be used to weight the strength of association between class numbers and features. Also, since classification/categorization is something that is usually considered to be a property of works, rather than manifestations, one might get better results by using Work sets for training. I would suggest, er, contacting Thom Hickey. Simon

* Well, not precisely holdings - you just need the number of distinct institutions with at least one copy. I call them 'hasings'.

On Sat, May 19, 2012 at 8:42 PM, Roy Tennant roytenn...@gmail.com wrote: Arash, Yes, we have made WorldCat available to researchers under a special license agreement. I suggest contacting Thom Hickey hic...@oclc.org about such an arrangement. Thanks, Roy

On Fri, May 18, 2012 at 3:46 AM, Arash.Joorabchi arash.joorab...@ul.ie wrote: Dear Karen, I am conducting a research experiment on automatic text classification and I am trying to retrieve top matching bib records (which include DDC fields) for a set of keyphrases extracted from a given document. So, I suppose this is a rather exceptional use case. In fact, the right approach for this experiment is to process the full dump of the WorldCat database directly rather than sending a limited number of queries via the API. I read here: http://dltj.org/article/worldcat-lld-may-become-available-under-odc-by/ that WorldCat might become available as open linked data in future, which would solve my problem and help similar text mining projects.
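[Editor's note] Simon's weighting suggestion can be sketched in a few lines of Python. The record tuples here are hypothetical stand-ins for what a WorldCat dump joined with per-record counts of holding institutions might yield; the point is only that a broken near-duplicate record with three holders barely registers next to a well-held record.

```python
from collections import defaultdict

# Hypothetical training records: (ddc_class, feature_terms, holding_count).
records = [
    ("006.3", ["machine", "learning"], 250),
    ("006.3", ["machine", "intelligence"], 3),   # near-duplicate, few holders
    ("025.4", ["classification", "libraries"], 120),
]

# Weight each (class, feature) association by the number of distinct
# holding institutions, per Simon's suggestion.
weights = defaultdict(float)
for ddc, features, holdings in records:
    for feature in features:
        weights[(ddc, feature)] += holdings

print(weights[("006.3", "machine")])   # 253.0, dominated by the well-held record
```

Grouping the records into Work sets before counting, as Simon also suggests, would simply mean summing holdings over all manifestations of a work first.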
However, I wonder if it is currently available to researchers under a research/non-commercial use license agreement. Regards, Arash

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Karen Coombs Sent: 17 May 2012 08:37 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

I forwarded this thread to the Product Manager for the WorldCat Search API. She responded back that unfortunately this query is not possible using the API at this time. FYI, the SRU interface to the WorldCat Search API doesn't currently support any scan-type searches either. Is there a particular use case you're trying to support? Knowing that would help us document this as a possible enhancement. Karen

Karen Coombs Senior Product Analyst Web Services OCLC coom...@oclc.org

On Wed, May 16, 2012 at 9:49 PM, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi Andy, I am an SRU newbie myself, so I don't know how this could be achieved using scan operations and could not find much info on the SRU website (http://www.loc.gov/standards/sru/). As for the wildcards, according to this guide: http://www.oclc.org/support/documentation/worldcat/searching/refcard/searchworldcatquickreference.pdf the symbols should be preceded by at least 3 characters, and therefore clauses like:

... AND srw.dd=*
... AND srw.dd=?.*
... AND srw/dd=###.*
... AND srw/dd=?3.*

do not work and result in the following error: Diagnostics Identifier: info:srw/diagnostic/1/9 Meaning: Details: Message: Not enough chars in truncated term: Truncated words too short(9) Thanks, Arash

From: Houghton,Andrew [mailto:hough...@oclc.org] Sent: 16 May 2012 11:58 To: Arash.Joorabchi Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

I'm not an SRU guru, but is it possible to do a scan and look for postings of zero? Andy.
On May 16, 2012, at 6:39, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi Mark, Srw.dd=* does not work either: Identifier: info:srw/diagnostic/1/27 Meaning: Details: srw.dd Message: The index [srw.dd] did not include a searchable value I suppose the only option left is to retrieve everything and filter the results on the client side. Thanks for your quick reply. Arash

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Mike Taylor Sent: 16 May 2012 10:43 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

There is no standard way in CQL to express field X
Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set
Thank you Roy and Simon for the info. As for your second point, I suppose one advantage of using the WorldCat API at this experimental stage is that the returned bib records are already FRBR-ized. Ross - Thanks for the link to the Open Library data dump. The WorldCat collection is 2 orders of magnitude larger than Open Library, which makes a significant difference considering the skewness and sparsity of bib records classified according to library taxonomies, e.g., DDC, LCC (for more info, see: http://cdm15003.contentdm.oclc.org/cdm/singleitem/collection/p267701coll27/id/277/rec/28) Thanks, Arash

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Simon Spero Sent: 22 May 2012 19:47 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

Arash - you might not want to use a straight dump of worldcat catalog records - at least not without the associated holdings information.* There are a lot of quasi-duplicate records that are sufficiently broken that the worldcat de-duplication algorithm refuses to merge them. These records will usually only be used by a handful of institutions; the better records will tend to have more associated holdings. The holdings count should be used to weight the strength of association between class numbers and features. Also, since classification/categorization is something that is usually considered to be a property of works, rather than manifestations, one might get better results by using Work sets for training. I would suggest, er, contacting Thom Hickey. Simon

* Well, not precisely holdings - you just need the number of distinct institutions with at least one copy. I call them 'hasings'.

On Sat, May 19, 2012 at 8:42 PM, Roy Tennant roytenn...@gmail.com wrote: Arash, Yes, we have made WorldCat available to researchers under a special license agreement. I suggest contacting Thom Hickey hic...@oclc.org about such an arrangement.
Thanks, Roy

On Fri, May 18, 2012 at 3:46 AM, Arash.Joorabchi arash.joorab...@ul.ie wrote: Dear Karen, I am conducting a research experiment on automatic text classification and I am trying to retrieve top matching bib records (which include DDC fields) for a set of keyphrases extracted from a given document. So, I suppose this is a rather exceptional use case. In fact, the right approach for this experiment is to process the full dump of the WorldCat database directly rather than sending a limited number of queries via the API. I read here: http://dltj.org/article/worldcat-lld-may-become-available-under-odc-by/ that WorldCat might become available as open linked data in future, which would solve my problem and help similar text mining projects. However, I wonder if it is currently available to researchers under a research/non-commercial use license agreement. Regards, Arash

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Karen Coombs Sent: 17 May 2012 08:37 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

I forwarded this thread to the Product Manager for the WorldCat Search API. She responded back that unfortunately this query is not possible using the API at this time. FYI, the SRU interface to the WorldCat Search API doesn't currently support any scan-type searches either. Is there a particular use case you're trying to support? Knowing that would help us document this as a possible enhancement. Karen

Karen Coombs Senior Product Analyst Web Services OCLC coom...@oclc.org

On Wed, May 16, 2012 at 9:49 PM, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi Andy, I am an SRU newbie myself, so I don't know how this could be achieved using scan operations and could not find much info on the SRU website (http://www.loc.gov/standards/sru/).
As for the wildcards, according to this guide: http://www.oclc.org/support/documentation/worldcat/searching/refcard/searchworldcatquickreference.pdf the symbols should be preceded by at least 3 characters, and therefore clauses like:

... AND srw.dd=*
... AND srw.dd=?.*
... AND srw/dd=###.*
... AND srw/dd=?3.*

do not work and result in the following error: Diagnostics Identifier: info:srw/diagnostic/1/9 Meaning: Details: Message: Not enough chars in truncated term: Truncated words too short(9) Thanks, Arash

From: Houghton,Andrew [mailto:hough...@oclc.org] Sent: 16 May 2012 11:58 To: Arash.Joorabchi Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

I'm not an SRU guru, but is it possible to do a scan and look for postings of zero? Andy. On May
Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set
Arash, Yes, we have made WorldCat available to researchers under a special license agreement. I suggest contacting Thom Hickey hic...@oclc.org about such an arrangement. Thanks, Roy

On Fri, May 18, 2012 at 3:46 AM, Arash.Joorabchi arash.joorab...@ul.ie wrote: Dear Karen, I am conducting a research experiment on automatic text classification and I am trying to retrieve top matching bib records (which include DDC fields) for a set of keyphrases extracted from a given document. So, I suppose this is a rather exceptional use case. In fact, the right approach for this experiment is to process the full dump of the WorldCat database directly rather than sending a limited number of queries via the API. I read here: http://dltj.org/article/worldcat-lld-may-become-available-under-odc-by/ that WorldCat might become available as open linked data in future, which would solve my problem and help similar text mining projects. However, I wonder if it is currently available to researchers under a research/non-commercial use license agreement. Regards, Arash

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Karen Coombs Sent: 17 May 2012 08:37 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

I forwarded this thread to the Product Manager for the WorldCat Search API. She responded back that unfortunately this query is not possible using the API at this time. FYI, the SRU interface to the WorldCat Search API doesn't currently support any scan-type searches either. Is there a particular use case you're trying to support? Knowing that would help us document this as a possible enhancement.
Karen

Karen Coombs Senior Product Analyst Web Services OCLC coom...@oclc.org

On Wed, May 16, 2012 at 9:49 PM, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi Andy, I am an SRU newbie myself, so I don't know how this could be achieved using scan operations and could not find much info on the SRU website (http://www.loc.gov/standards/sru/). As for the wildcards, according to this guide: http://www.oclc.org/support/documentation/worldcat/searching/refcard/searchworldcatquickreference.pdf the symbols should be preceded by at least 3 characters, and therefore clauses like:

... AND srw.dd=*
... AND srw.dd=?.*
... AND srw/dd=###.*
... AND srw/dd=?3.*

do not work and result in the following error: Diagnostics Identifier: info:srw/diagnostic/1/9 Meaning: Details: Message: Not enough chars in truncated term: Truncated words too short(9) Thanks, Arash

From: Houghton,Andrew [mailto:hough...@oclc.org] Sent: 16 May 2012 11:58 To: Arash.Joorabchi Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

I'm not an SRU guru, but is it possible to do a scan and look for postings of zero? Andy.

On May 16, 2012, at 6:39, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi Mark, Srw.dd=* does not work either: Identifier: info:srw/diagnostic/1/27 Meaning: Details: srw.dd Message: The index [srw.dd] did not include a searchable value I suppose the only option left is to retrieve everything and filter the results on the client side. Thanks for your quick reply. Arash

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Mike Taylor Sent: 16 May 2012 10:43 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

There is no standard way in CQL to express "field X is not empty". Depending on implementations, NOT srw.dd= might work (but evidently doesn't in this case).
Another possibility is srw.dd=*, but again that may or may not work, and might be appallingly inefficient if it does. NOT srw.dd=null will definitely not work: null is not a special word in CQL. -- Mike.

On 16 May 2012 10:32, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi all, I am sending SRU queries to WorldCat in the following form:

String host = "http://worldcat.org/webservices/catalog/search/";
String query = "sru?query=srw.kw=\"" + keyword + "\""
    + " AND srw.ln exact \"eng\""
    + " AND srw.mt all \"bks\""
    + " AND srw.nt=\"" + keyword + "\""
    + "&servicelevel=full"
    + "&maximumRecords=100"
    + "&sortKeys=relevance,,0"
Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set
Dear Karen, I am conducting a research experiment on automatic text classification and I am trying to retrieve top matching bib records (which include DDC fields) for a set of keyphrases extracted from a given document. So, I suppose this is a rather exceptional use case. In fact, the right approach for this experiment is to process the full dump of the WorldCat database directly rather than sending a limited number of queries via the API. I read here: http://dltj.org/article/worldcat-lld-may-become-available-under-odc-by/ that WorldCat might become available as open linked data in future, which would solve my problem and help similar text mining projects. However, I wonder if it is currently available to researchers under a research/non-commercial use license agreement. Regards, Arash

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Karen Coombs Sent: 17 May 2012 08:37 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

I forwarded this thread to the Product Manager for the WorldCat Search API. She responded back that unfortunately this query is not possible using the API at this time. FYI, the SRU interface to the WorldCat Search API doesn't currently support any scan-type searches either. Is there a particular use case you're trying to support? Knowing that would help us document this as a possible enhancement. Karen

Karen Coombs Senior Product Analyst Web Services OCLC coom...@oclc.org

On Wed, May 16, 2012 at 9:49 PM, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi Andy, I am an SRU newbie myself, so I don't know how this could be achieved using scan operations and could not find much info on the SRU website (http://www.loc.gov/standards/sru/).
As for the wildcards, according to this guide: http://www.oclc.org/support/documentation/worldcat/searching/refcard/searchworldcatquickreference.pdf the symbols should be preceded by at least 3 characters, and therefore clauses like:

... AND srw.dd=*
... AND srw.dd=?.*
... AND srw/dd=###.*
... AND srw/dd=?3.*

do not work and result in the following error: Diagnostics Identifier: info:srw/diagnostic/1/9 Meaning: Details: Message: Not enough chars in truncated term: Truncated words too short(9) Thanks, Arash

From: Houghton,Andrew [mailto:hough...@oclc.org] Sent: 16 May 2012 11:58 To: Arash.Joorabchi Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

I'm not an SRU guru, but is it possible to do a scan and look for postings of zero? Andy.

On May 16, 2012, at 6:39, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi Mark, Srw.dd=* does not work either: Identifier: info:srw/diagnostic/1/27 Meaning: Details: srw.dd Message: The index [srw.dd] did not include a searchable value I suppose the only option left is to retrieve everything and filter the results on the client side. Thanks for your quick reply. Arash

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Mike Taylor Sent: 16 May 2012 10:43 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

There is no standard way in CQL to express "field X is not empty". Depending on implementations, NOT srw.dd= might work (but evidently doesn't in this case). Another possibility is srw.dd=*, but again that may or may not work, and might be appallingly inefficient if it does. NOT srw.dd=null will definitely not work: null is not a special word in CQL. -- Mike.
On 16 May 2012 10:32, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi all, I am sending SRU queries to WorldCat in the following form:

String host = "http://worldcat.org/webservices/catalog/search/";
String query = "sru?query=srw.kw=\"" + keyword + "\""
    + " AND srw.ln exact \"eng\""
    + " AND srw.mt all \"bks\""
    + " AND srw.nt=\"" + keyword + "\""
    + "&servicelevel=full"
    + "&maximumRecords=100"
    + "&sortKeys=relevance,,0"
    + "&wskey=[wskey]";

And it is working fine; however, I'd like to limit the results to those records that have a DDC number assigned to them, but I don't know the right way to specify this limit in the query
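[Editor's note] The Java snippet above concatenates the CQL clause into the URL without escaping. For comparison, here is a hedged Python sketch of the same request assembly; build_sru_query is an illustrative helper (not part of any OCLC SDK), the parameter names are taken from the snippet, and the [wskey] placeholder is kept from the original posting. Percent-encoding the CQL clause as a whole keeps the embedded quotes and spaces intact.

```python
from urllib.parse import quote

def build_sru_query(keyword, wskey):
    """Assemble a WorldCat SRU search URL equivalent to the Java snippet.

    The CQL clause is percent-encoded as a unit so that the embedded
    double quotes and spaces survive transport.
    """
    host = "http://worldcat.org/webservices/catalog/search/"
    cql = (
        'srw.kw="%s"'
        ' AND srw.ln exact "eng"'
        ' AND srw.mt all "bks"'
        ' AND srw.nt="%s"'
    ) % (keyword, keyword)
    return (
        host + "sru?query=" + quote(cql)
        + "&servicelevel=full"
        + "&maximumRecords=100"
        + "&sortKeys=relevance,,0"
        + "&wskey=" + quote(wskey)
    )

# Build (but do not send) a query; the key placeholder is unchanged.
url = build_sru_query("text classification", "[wskey]")
print(url)
```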
Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set
On May 18, 2012, at 6:46 AM, Arash.Joorabchi wrote: Dear Karen, I am conducting a research experiment on automatic text classification and I am trying to retrieve top matching bib records (which include DDC fields) for a set of keyphrases extracted from a given document. So, I suppose this is a rather exceptional use case. In fact, the right approach for this experiment is to process the full dump of the WorldCat database directly rather than sending a limited number of queries via the API. I read here: http://dltj.org/article/worldcat-lld-may-become-available-under-odc-by/ that WorldCat might become available as open linked data in future, which would solve my problem and help similar text mining projects. However, I wonder if it is currently available to researchers under a research/non-commercial use license agreement.

Why not use Open Library's dataset (which is freely available with no restrictions)? http://openlibrary.org/developers/dumps -Ross.

Regards, Arash

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Karen Coombs Sent: 17 May 2012 08:37 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

I forwarded this thread to the Product Manager for the WorldCat Search API. She responded back that unfortunately this query is not possible using the API at this time. FYI, the SRU interface to the WorldCat Search API doesn't currently support any scan-type searches either. Is there a particular use case you're trying to support? Knowing that would help us document this as a possible enhancement. Karen

Karen Coombs Senior Product Analyst Web Services OCLC coom...@oclc.org

On Wed, May 16, 2012 at 9:49 PM, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi Andy, I am an SRU newbie myself, so I don't know how this could be achieved using scan operations and could not find much info on the SRU website (http://www.loc.gov/standards/sru/).
As for the wildcards, according to this guide: http://www.oclc.org/support/documentation/worldcat/searching/refcard/searchworldcatquickreference.pdf the symbols should be preceded by at least 3 characters, and therefore clauses like:

... AND srw.dd=*
... AND srw.dd=?.*
... AND srw/dd=###.*
... AND srw/dd=?3.*

do not work and result in the following error: Diagnostics Identifier: info:srw/diagnostic/1/9 Meaning: Details: Message: Not enough chars in truncated term: Truncated words too short(9) Thanks, Arash

From: Houghton,Andrew [mailto:hough...@oclc.org] Sent: 16 May 2012 11:58 To: Arash.Joorabchi Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

I'm not an SRU guru, but is it possible to do a scan and look for postings of zero? Andy.

On May 16, 2012, at 6:39, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi Mark, Srw.dd=* does not work either: Identifier: info:srw/diagnostic/1/27 Meaning: Details: srw.dd Message: The index [srw.dd] did not include a searchable value I suppose the only option left is to retrieve everything and filter the results on the client side. Thanks for your quick reply. Arash

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Mike Taylor Sent: 16 May 2012 10:43 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

There is no standard way in CQL to express "field X is not empty". Depending on implementations, NOT srw.dd= might work (but evidently doesn't in this case). Another possibility is srw.dd=*, but again that may or may not work, and might be appallingly inefficient if it does. NOT srw.dd=null will definitely not work: null is not a special word in CQL. -- Mike.
On 16 May 2012 10:32, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi all, I am sending SRU queries to WorldCat in the following form:

String host = "http://worldcat.org/webservices/catalog/search/";
String query = "sru?query=srw.kw=\"" + keyword + "\""
    + " AND srw.ln exact \"eng\""
    + " AND srw.mt all \"bks\""
    + " AND srw.nt=\"" + keyword + "\""
    + "&servicelevel=full"
    + "&maximumRecords=100"
    + "&sortKeys=relevance,,0"
Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set
I forwarded this thread to the Product Manager for the WorldCat Search API. She responded back that unfortunately this query is not possible using the API at this time. FYI, the SRU interface to the WorldCat Search API doesn't currently support any scan-type searches either. Is there a particular use case you're trying to support? Knowing that would help us document this as a possible enhancement. Karen

Karen Coombs Senior Product Analyst Web Services OCLC coom...@oclc.org

On Wed, May 16, 2012 at 9:49 PM, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi Andy, I am an SRU newbie myself, so I don't know how this could be achieved using scan operations and could not find much info on the SRU website (http://www.loc.gov/standards/sru/). As for the wildcards, according to this guide: http://www.oclc.org/support/documentation/worldcat/searching/refcard/searchworldcatquickreference.pdf the symbols should be preceded by at least 3 characters, and therefore clauses like:

... AND srw.dd=*
... AND srw.dd=?.*
... AND srw/dd=###.*
... AND srw/dd=?3.*

do not work and result in the following error: Diagnostics Identifier: info:srw/diagnostic/1/9 Meaning: Details: Message: Not enough chars in truncated term: Truncated words too short(9) Thanks, Arash

From: Houghton,Andrew [mailto:hough...@oclc.org] Sent: 16 May 2012 11:58 To: Arash.Joorabchi Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

I'm not an SRU guru, but is it possible to do a scan and look for postings of zero? Andy.

On May 16, 2012, at 6:39, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi Mark, Srw.dd=* does not work either: Identifier: info:srw/diagnostic/1/27 Meaning: Details: srw.dd Message: The index [srw.dd] did not include a searchable value I suppose the only option left is to retrieve everything and filter the results on the client side. Thanks for your quick reply.
Arash

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Mike Taylor Sent: 16 May 2012 10:43 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

There is no standard way in CQL to express "field X is not empty". Depending on implementations, NOT srw.dd= might work (but evidently doesn't in this case). Another possibility is srw.dd=*, but again that may or may not work, and might be appallingly inefficient if it does. NOT srw.dd=null will definitely not work: null is not a special word in CQL. -- Mike.

On 16 May 2012 10:32, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi all, I am sending SRU queries to WorldCat in the following form:

String host = "http://worldcat.org/webservices/catalog/search/";
String query = "sru?query=srw.kw=\"" + keyword + "\""
    + " AND srw.ln exact \"eng\""
    + " AND srw.mt all \"bks\""
    + " AND srw.nt=\"" + keyword + "\""
    + "&servicelevel=full"
    + "&maximumRecords=100"
    + "&sortKeys=relevance,,0"
    + "&wskey=[wskey]";

And it is working fine; however, I'd like to limit the results to those records that have a DDC number assigned to them, but I don't know the right way to specify this limit in the query.

NOT srw.dd=
NOT srw.dd=null

Neither of the above works. Thanks, Arash

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Chad Benjamin Nelson Sent: 15 May 2012 21:54 To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] Atlanta Digital Libraries meetup - May 23rd

The first / next Atlanta Digital Libraries meetup is coming up soon: Wednesday, May 23rd 7pm Manuel's Tavern http://www.manuelstavern.com/location.php 602 N Highland Avenue Northeast Atlanta, GA 30307 North Avenue Room We have two scheduled talks, and are still looking for others interested in presenting. It's informal, so even if it is just a short topic you want to get some feedback on, we'd love to hear it. So, come along if you are interested and in the area. Chad

Chad Nelson Web Services Programmer University Library Georgia State University e: cnelso...@gsu.edu t: 404 413 2771 My Calendar http://bit.ly/qybPLJ

No virus found in this message. Checked by AVG - www.avg.com Version: 2012.0.2176 / Virus Database: 2425/5001 - Release Date: 05/15/12
Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set
There is no standard way in CQL to express "field X is not empty". Depending on implementations, NOT srw.dd= might work (but evidently doesn't in this case). Another possibility is srw.dd=*, but again that may or may not work, and might be appallingly inefficient if it does. NOT srw.dd=null will definitely not work: null is not a special word in CQL. -- Mike.

On 16 May 2012 10:32, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi all, I am sending SRU queries to WorldCat in the following form:

String host = "http://worldcat.org/webservices/catalog/search/";
String query = "sru?query=srw.kw=\"" + keyword + "\""
    + " AND srw.ln exact \"eng\""
    + " AND srw.mt all \"bks\""
    + " AND srw.nt=\"" + keyword + "\""
    + "&servicelevel=full"
    + "&maximumRecords=100"
    + "&sortKeys=relevance,,0"
    + "&wskey=[wskey]";

And it is working fine; however, I'd like to limit the results to those records that have a DDC number assigned to them, but I don't know the right way to specify this limit in the query.

NOT srw.dd=
NOT srw.dd=null

Neither of the above works. Thanks, Arash

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Chad Benjamin Nelson Sent: 15 May 2012 21:54 To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] Atlanta Digital Libraries meetup - May 23rd

The first / next Atlanta Digital Libraries meetup is coming up soon: Wednesday, May 23rd 7pm Manuel's Tavern http://www.manuelstavern.com/location.php 602 N Highland Avenue Northeast Atlanta, GA 30307 North Avenue Room We have two scheduled talks, and are still looking for others interested in presenting. It's informal, so even if it is just a short topic you want to get some feedback on, we'd love to hear it. So, come along if you are interested and in the area. Chad

Chad Nelson Web Services Programmer University Library Georgia State University e: cnelso...@gsu.edu t: 404 413 2771 My Calendar http://bit.ly/qybPLJ
Checked by AVG - www.avg.com Version: 2012.0.2176 / Virus Database: 2425/5000 - Release Date: 05/15/12
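The query construction in Arash's snippet can be sketched with explicit quoting and URL encoding of the CQL clause. This is a hedged illustration, not OCLC's recommended client: the endpoint path and parameter names are taken from the message above, the class name is invented, and [wskey] remains a placeholder for a real WorldCat API key.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SruQueryBuilder {
    // Build an SRU search URL against the WorldCat Search API.
    // The CQL portion is URL-encoded so spaces and quote characters
    // in the keyword survive transport intact.
    static String buildUrl(String keyword) {
        String cql = "srw.kw=\"" + keyword + "\""
                + " AND srw.ln exact \"eng\""
                + " AND srw.mt all \"bks\"";
        return "http://worldcat.org/webservices/catalog/search/sru"
                + "?query=" + URLEncoder.encode(cql, StandardCharsets.UTF_8)
                + "&servicelevel=full"
                + "&maximumRecords=100"
                + "&sortKeys=relevance,,0"
                + "&wskey=[wskey]"; // placeholder for a real API key
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("information retrieval"));
    }
}
```

Encoding the whole CQL clause in one step also avoids the escaped-backslash bookkeeping in the original concatenation.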
Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set
Hi Mike, srw.dd=* does not work either: Diagnostic Identifier: info:srw/diagnostic/1/27 Details: srw.dd Message: The index [srw.dd] did not include a searchable value. I suppose the only option left is to retrieve everything and filter the results on the client side. Thanks for your quick reply. Arash -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Mike Taylor Sent: 16 May 2012 10:43 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set There is no standard way in CQL to express "field X is not empty". Depending on implementations, NOT srw.dd= might work (but evidently doesn't in this case). Another possibility is srw.dd=*, but again that may or may not work, and might be appallingly inefficient if it does. NOT srw.dd=null will definitely not work: null is not a special word in CQL. -- Mike.
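Filtering on the client side, as Arash suggests, could look something like the sketch below: it parses a response and keeps only records carrying an 082 (Dewey) field. The element and attribute names follow MARCXML conventions, but the sample data and class name are invented for illustration; real SRU responses are namespaced and much larger.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

public class DdcFilter {
    // Invented sample standing in for an SRU response.
    static final String SAMPLE = "<records>"
        + "<record><datafield tag=\"082\"><subfield code=\"a\">006.3</subfield></datafield></record>"
        + "<record><datafield tag=\"245\"><subfield code=\"a\">No DDC here</subfield></datafield></record>"
        + "</records>";

    // Count records that carry a Dewey (082) field; returns -1 on parse errors.
    static int countWithDdc(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList records = doc.getElementsByTagName("record");
            int kept = 0;
            for (int i = 0; i < records.getLength(); i++) {
                NodeList fields = ((Element) records.item(i)).getElementsByTagName("datafield");
                for (int j = 0; j < fields.getLength(); j++) {
                    if ("082".equals(((Element) fields.item(j)).getAttribute("tag"))) {
                        kept++;
                        break;
                    }
                }
            }
            return kept;
        } catch (Exception e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(countWithDdc(SAMPLE) + " record(s) with a DDC number");
    }
}
```

The obvious cost is bandwidth: every record must be fetched before it can be discarded, which is why a server-side limit would have been preferable.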
Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set
Hi Andy, I am an SRU newbie myself, so I don't know how this could be achieved using scan operations, and I could not find much info on the SRU website (http://www.loc.gov/standards/sru/). As for the wildcards, according to this guide: http://www.oclc.org/support/documentation/worldcat/searching/refcard/searchworldcatquickreference.pdf the symbols should be preceded by at least 3 characters, and therefore clauses like: ... AND srw.dd=* ... AND srw.dd=?.* ... AND srw.dd=###.* ... AND srw.dd=?3.* do not work and result in the following error: Diagnostic Identifier: info:srw/diagnostic/1/9 Message: Not enough chars in truncated term: Truncated words too short (9) Thanks, Arash From: Houghton,Andrew [mailto:hough...@oclc.org] Sent: 16 May 2012 11:58 To: Arash.Joorabchi Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set I'm not an SRU guru, but is it possible to do a scan and look for a postings count of zero? Andy.
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
We have been trying to enumerate serials holdings as explicitly as possible. E.g., this microfiche supplement to a journal, http://summit.syr.edu/cgi-bin/Pwebrecon.cgi?BBID=274291 shows apparently missing issues. However, there are two pieces of inferred information here: 1) every print issue had a corresponding microfiche supplement (they didn't, so most of these are complete even with the gaps); 2) that volumes, at least up until 1991, had only 26 issues (that is probably true, but it is not certain), and there is no way to be certain how many issues per volume were published from 1992 on (28? 52?) v.95:no.3 (1973)-v.95:no.8 (1973) v.95:no.10 (1973)-v.95:no.26 (1973) v.96 (1974)-v.97 (1975) v.98:no.1 (1976)-v.98:no.14 (1976) v.98:no.16 (1976)-v.98:no.26 (1976) v.99:no.1 (1977)-v.99:no.25 (1977) v.100 (1978)-v.108 (1986) v.109:no.1 (1987)-v.109:no.19 (1987) v.109:no.21 (1987)-v.109:no.26 (1987) v.110 (1988)-v.111 (1989) v.112:no.1 (1990)-v.112:no.26 (1990) v.113 (1991) v.114:no.1 (1992)-v.114:no.21 (1992) v.114:no.23 (1992)-v.114:no.27 (1992) v.115 (1993)-v.119 (1997) v.120:no.2 (1998:Jan.21)-v.120:no.51 (1998:Dec.30) On Tue, Jun 15, 2010 at 9:56 PM, Bill Dueber b...@dueber.com wrote: On Tue, Jun 15, 2010 at 5:49 PM, Kyle Banerjee baner...@uoregon.edu wrote: No, but parsing holding statements for something that just gets cut off early or which starts late should be easy unless entry is insanely inconsistent. And there it is. :-) We're really dealing with a few problems here: - Inconsistent entry by catalogers (probably the least of our worries) - Inconsistent publishing schedules (e.g., the Jan 1942 issue was just plain never printed) - Inconsistent use of volume/number/year/month/whatever throughout a serial's run. So, for example, http://mirlyn.lib.umich.edu/Record/45417/Holdings#1 There are six holdings: 1919-1920 incompl 1920 incompl. 
1922 v.4 no.49 v.6 1921 jul-dec v.6 1921jan-jun We have no way of knowing what year volume 4 was printed in, which issues are incomplete in the two volumes that cover 1920, whether volume numbers are associated with earlier (or later) issues, etc. We, as humans, could try to make some guesses, but they'd just be guesses. It's easy to find examples where month ranges overlap (or leave gaps), where month names and issue numbers are sometimes used interchangeably, where volume numbers suddenly change in the middle of a run because of a merge with another serial (or where the first volume isn't 1 because the serial broke off from a parent), etc. etc. etc. I don't mean to overstate the problem. For many (most?) serials whose existence only goes back a few decades, a relatively simple approach will likely work much of the time -- although even that relatively simple approach will have to take into account a solid dozen or so different ways that enumcron data may have been entered. But to be able to say, with some confidence, that we have the full run? Or a particular issue as labeled by a month name? Much, much harder in the general case. -Bill- -- Bill Dueber Library Systems Programmer University of Michigan Library
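Bill's point about the dozen-odd enumcron entry patterns shows up in even a minimal parser. The sketch below handles only the single "v.95:no.3 (1973)" pattern from the Syracuse example; every other variant in the thread (bare years, "jul-dec" ranges, "incompl" notes) would need its own rule, which is exactly the difficulty being described. The class name and regex are illustrative assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EnumcronParser {
    // Matches statements like "v.95:no.3 (1973)"; issue and year are optional.
    static final Pattern P =
        Pattern.compile("v\\.(\\d+)(?::no\\.(\\d+))?(?:\\s*\\((\\d{4})[^)]*\\))?");

    // Returns "volume=.. issue=.. year=.." or null when nothing matches.
    static String parse(String enumcron) {
        Matcher m = P.matcher(enumcron);
        if (!m.find()) return null;
        return "volume=" + m.group(1) + " issue=" + m.group(2) + " year=" + m.group(3);
    }

    public static void main(String[] args) {
        System.out.println(parse("v.95:no.3 (1973)"));
        System.out.println(parse("1921 jul-dec")); // null: pattern not covered
    }
}
```

The second call falling through to null is the general-case problem in miniature: the data is human-readable, but each idiosyncratic form defeats a parser written for another.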
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
Don't forget inconsistent data from the person sending the OpenURL. Rosalyn
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
Regarding the data in OCLC, my understanding (as a former serials cataloger) is that there is detailed information for at least some institutions in the interlibrary loan portion of the OCLC database, but this is not available via WorldCat. I know our ILL department added detailed information for commonly requested titles years ago. I also know we are in the process of getting our detailed holdings loaded into OCLC (possibly just on the ILL side; I'm not sure about this) and maintaining our holdings through batch updates. Many of our current titles use summary holdings, but not all do. I believe the summary holdings work much more effectively with ILL as well, so our serials catalogers have been working for years to improve our local data. As part of our move to summary holdings, we also reduced some of the detail in our holdings, so now we show only gaps of entire volumes, but not specific missing issues, in our coded holdings (the missing issues are included in notes in our item-specific records). If there is better data available to ILL staff, this may be an avenue you could pursue. Wendy Robertson Digital Resources Librarian . The University of Iowa Libraries 1015 Main Library . Iowa City, Iowa 52242 wendy-robert...@uiowa.edu 319-335-5821
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
On Mon, Jun 14, 2010 at 3:47 PM, Jonathan Rochkind rochk...@jhu.edu wrote: The trick here is that traditional library metadata practices make it _very hard_ to tell if a _specific volume/issue_ is held by a given library. And those are the most common use cases for OpenURL. Yep. That's true even for individual libraries with link resolvers. OCLC is not going to be able to solve that particular issue until the local libraries do. If you just want to get to the title level (for a journal or a book), you can easily write your own thing that takes an OpenURL, and either just redirects straight to worldcat.org on isbn/lccn/oclcnum, or actually does a WorldCat API lookup to ensure the record exists first and/or looks up on author/title/etc. too. I was mainly thinking of sources that use COinS. If you have a rarely held book, for instance, then OpenURLs resolved against random institutional endpoints are going to be mostly unproductive. However, a union catalog such as OCLC already has the information about which libraries in the system own it. It seems like the more productive path if the goal of a user is simply to locate a copy, wherever it is held. Umlaut already includes the 'naive' just link to worldcat.org based on isbn, oclcnum, or lccn approach, functionality that was written before the WorldCat API existed. That is, Umlaut takes an incoming OpenURL, and provides the user with a link to a WorldCat record based on isbn, oclcnum, or lccn. Many institutions have chosen to do this. MPOW, however, is a counter-example and does not link out to OCLC. Tom
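The "naive" redirect Tom describes can be sketched in a few lines. This assumes simplified identifier keys ("oclcnum", "isbn") already extracted from the incoming OpenURL; mapping real OpenURL KEV parameters to these keys is left out, and the class name is invented. The worldcat.org /oclc/ and /isbn/ permalink paths are the commonly used ones.

```java
import java.util.Map;

public class WorldCatRedirect {
    // Choose a worldcat.org target from pre-extracted identifiers,
    // preferring the OCLC number as the most precise match.
    static String target(Map<String, String> ids) {
        if (ids.containsKey("oclcnum")) {
            return "https://www.worldcat.org/oclc/" + ids.get("oclcnum");
        }
        if (ids.containsKey("isbn")) {
            return "https://www.worldcat.org/isbn/" + ids.get("isbn");
        }
        // Other identifiers (e.g. lccn) would need their own mapping,
        // or a WorldCat API lookup on author/title as Tom suggests.
        return null;
    }

    public static void main(String[] args) {
        System.out.println(target(Map.of("oclcnum", "45417")));
    }
}
```

A fuller version would do the API lookup first to confirm the record exists before redirecting, rather than sending the user to a dead page.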
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
It seems like the more productive path if the goal of a user is simply to locate a copy, wherever it is held. But I don't think users have *locating a copy* as their goal. Rather, I think their goal is to *get their hands on the book*. If I discover a book via COinS, and you drop me off at WorldCat.org, that allows me to see which libraries own the book. But unless I happen to be affiliated with those institutions, that's kinda useless information. I have no real way of actually getting the book itself. If, instead, you drop me off at your institution's link resolver menu, and provide me an ILL option in the event you don't have the book, the library can get the book for me, which is really my *goal*. That seems like the more productive path, IMO. --Dave == David Walker Library Web Services Manager California State University http://xerxes.calstate.edu
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
IF the user is coming from a recognized on-campus IP, you can configure WorldCat to give the user an ILL link to your library too. At least if you use ILLiad; maybe if you use something else (esp. if your ILL software can accept OpenURLs too!). I haven't yet found any good way to do this if the user is off-campus (ezproxy not a good solution, how do we 'force' the user to use ezproxy for worldcat.org anyway?). But in any event, I agree with Dave that worldcat.org isn't a great interface even if you DO get it to have an ILL link in an odd place. I think we can do better. Which is really the whole purpose of Umlaut as an institutional link resolver: giving the user a better screen for "I found this citation somewhere else; library, what can you do to get it in my hands asap?" Still wondering why Umlaut hasn't gotten more interest from people, heh. But we're using it here at JHU, and NYU and the New School are also using it. Jonathan
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
The trick here is that traditional library metadata practices make it _very hard_ to tell if a _specific volume/issue_ is held by a given library. And those are the most common use cases for OpenURL. Yep. That's true even for individual library's with link resolvers. OCLC is not going to be able to solve that particular issue until the local libraries do. This might not be as bad as people think. The normal argument is that holdings are in free text and there's no way staff will ever have enough time to record volume level holdings. However, significant chunks of the problem can be addressed using relatively simple methods. For example, if you can identify complete runs, you know that a library has all holdings and can start automating things. With this in mind, the first step is to identify incomplete holdings. The mere presence of lingo like missing, lost, incomplete, scattered, wanting, etc. is a dead giveaway. So are bracketed fields that contain enumeration or temporal data (though you'll get false hits using this method when catalogers supply enumeration). Commas in any field that contains enumeration or temporal data also indicate incomplete holdings. I suspect that the mere presence of a note is a great indicator that holdings are incomplete since what kind of yutz writes a note saying all the holdings are here just like you'd expect? Having said that, I need to crawl through a lot more data before being comfortable with that statement. Regexp matches can be used to search for closed date ranges in open serials or close dates within 866 that don't correspond to close dates within fixed fields. That's the first pass. The second pass would be to search for the most common patterns that occur within incomplete holdings. Wash, rinse, repeat. 
After a while, you'll get to all the cornball schemes that don't lend themselves to automation, but hopefully that group of materials is down to a more manageable size where throwing labor at the metadata makes some sense. Possibly guessing whether a volume is available based on timeframe is a good way to go. Worst-case scenario, if the program can't handle it, you deflect the request to the next institution, and that already happens all the time for a variety of reasons. While my comments are mostly concerned with journal holdings, similar logic can be used with monographic series as well. kyle
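Kyle's first-pass heuristics could be sketched as a pair of regular expressions. The word list and the comma-in-enumeration rule come from his message; the exact patterns and class name here are assumptions and would need tuning against real 866 data.

```java
import java.util.regex.Pattern;

public class HoldingsHeuristics {
    // Words that signal incomplete holdings, per the heuristics above.
    static final Pattern GAP_LINGO =
        Pattern.compile("\\b(missing|lost|incomplete|scattered|wanting)\\b",
                        Pattern.CASE_INSENSITIVE);
    // A comma inside an enumeration usually means a broken run, e.g. "v.1-3,5-9".
    static final Pattern ENUM_COMMA = Pattern.compile("v\\.\\s*[\\d-]+,");

    static boolean looksIncomplete(String holdings) {
        return GAP_LINGO.matcher(holdings).find()
            || ENUM_COMMA.matcher(holdings).find();
    }

    public static void main(String[] args) {
        System.out.println(looksIncomplete("v.1-20 (1950-1970)"));         // false
        System.out.println(looksIncomplete("v.1-3,5-9 scattered issues")); // true
    }
}
```

This only flags likely gaps; the later passes Kyle describes (closed-range checks against fixed fields, common incomplete-holdings patterns) would layer on top.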
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
When I've tried to do this, it's been much harder than your story, I'm afraid. My library's data is very inconsistent in the way it expresses its holdings. Even _without_ missing items, the holdings are expressed in human-readable narrative form which is very difficult to parse reliably. Theoretically, the holdings are expressed according to, I forget the name of the Z. standard, but some standard for expressing human-readable holdings with certain punctuation and such. Even if they really WERE all exactly according to this standard, this standard is not very easy to parse consistently and reliably. But in fact, since nothing validates these tags against the standard when they are entered -- and at different times in history the cataloging staff entering them in various libraries had various ideas about how strictly they should follow this local policy -- our holdings are not even reliably according to that standard. But if you think it's easy, please, give it a try and get back to us. :) Maybe your library's data is cleaner than mine. I think it's kind of a crime that our ILS (and many other ILSs) doesn't provide a way for holdings to be efficiently entered (or guessed from prediction patterns etc.) AND converted to an internal structured format that actually contains the semantic info we want. Offering catalogers the option to manually enter an MFHD is not a solution. Jonathan
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
Kyle Banerjee schrieb: This might not be as bad as people think. The normal argument is that holdings are in free text and there's no way staff will ever have enough time to record volume-level holdings. However, significant chunks of the problem can be addressed using relatively simple methods. For example, if you can identify complete runs, you know that a library has all holdings and can start automating things. That's what we've done for journal holdings (only) in https://sourceforge.net/projects/doctor-doc/ It works perfectly in combination with an EZB account (rzblx1.uni-regensburg.de/ezeit) as a link resolver, and may be as exact as issue level. The tool is being used by around 100 libraries in Germany, Switzerland and Austria. If you check this one out: don't expect the perfect OS system. It has been developed by me (head of library and no IT professional) and a colleague (an IT professional). I learned a lot through this one. There is plenty of room for improvement in it: some things implemented not yet so nicely, other things done quite nicely ;-) If you want to discuss, use or contribute: https://sourceforge.net/projects/doctor-doc/support Very welcome! Markus Fischer
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
I think my perspective of the user's goal is actually the same (or close enough to the same) as David's, just stated differently. The user wants the most local copy or, failing that, a way to order it from another source. However, I have plenty of examples of faculty and occasional grad students who are willing to make the trek to a nearby library -- even out-of-town libraries -- rather than do ILL. This doesn't encompass every use case or even a typical use case (are there typical cases?), but it does no harm to have information even if you can't always act on it. The problem with OpenURL tied to a particular institution is a) the person may not have (or know they have) an affiliation with a given institution, b) they may be coming from outside their institution's IP range, so that even the OCLC Registry redirect trick will fail to get them to a (let alone the correct) link resolver, c) there may not be any recourse to find an item if the institution does not own it (MPOW does not provide a link to WorldCat). Tom
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
I do provide the user with the proxied WorldCat URL for just the reasons Jonathan cites. But, no, being an otherwise open web resource, you can't force a user to use it. On Tue, Jun 15, 2010 at 12:22 PM, Jonathan Rochkind rochk...@jhu.eduwrote: I haven't yet found any good way to do this if the user is off-campus (ezproxy not a good solution, how do we 'force' the user to use ezproxy for worldcat.org anyway?).
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
I'm not sure what you mean by complete holdings? The library holds the entire run of the journal from the first issue printed to the last/current? Or just holdings that don't include missing statements? Perhaps other institutions have more easily parseable holdings data (or even holdings data stored in structured form in the ILS) than mine. For mine, even holdings that don't include missing statements cannot feasibly be parsed reliably; I've tried. Jonathan Kyle Banerjee wrote: But if you think it's easy, please, give it a try and get back to us. :) Maybe your library's data is cleaner than mine. I don't think it's easy, but I think detecting *complete* holdings is a big part of the picture and that can be done fairly well. Cleanliness of data will vary from one institution to another, and quite a bit of it will be parsable. Even if you can only get half, you're still way ahead of where you'd otherwise be. I think it's kind of a crime that our ILS (and many other ILSs) doesn't provide a way for holdings to be efficiently entered (or guessed from prediction patterns, etc.) AND converted to an internal structured format that actually contains the semantic info we want. There's too much variation in what people want to do. Even going with manual MFHD, it's still pretty easy to generate stuff that's pretty hard to parse. kyle
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
Oh you really do mean complete like complete publication run? Very few of our journal holdings are complete in that sense; they are definitely in the minority. We start getting something after issue 1, or stop getting it before the last issue. Or stop and then start again. Is this really unusual? If all you've figured out is the complete publication run of a journal, and are assuming your library holds it... wait, how is this something you need for any actual use case? My use case is trying to figure out IF we have a particular volume/issue, and ideally, if so, what shelf it is located on. If I'm just going to deal with journals we have the complete publication history of, I don't have a problem anymore, because the answer will always be yes; that's a very simple algorithm, print yes, heh. So, yes, if you assume only holdings of complete publication histories, the problem does get very easy. Incidentally, if anyone is looking for a schema and transmission format for actual _structured_ holdings information, that's flexible enough for idiosyncratic publication histories and holdings, but still structured enough to actually be machine-actionable... I still can't recommend ONIX Serial Holdings highly enough! I don't think it gets much use, probably because most of our systems simply don't _have_ this structured information, most of our staff interfaces don't provide reasonably efficient interfaces for entering it, etc. But if you can get the other pieces and just need a schema and representation format, ONIX Serial Holdings is nice! Jonathan Kyle Banerjee wrote: On Tue, Jun 15, 2010 at 10:13 AM, Jonathan Rochkind rochk...@jhu.edu wrote: I'm not sure what you mean by complete holdings? The library holds the entire run of the journal from the first issue printed to the last/current? Or just holdings that don't include missing statements? Obviously, there has to be some sort of holdings statement -- I'm presuming that something reasonably accurate is available. 
If there is no summary holdings statement, items aren't inventoried, and holdings are believed to be incomplete, there's not much to work with. As far as retrospectively getting data up to scratch in hopeless situations, there are paths that make sense. For instance, retrospectively inventorying serials may be insane. However, from circ and ILL data, you should know which titles are actually consulted the most. Get those in shape first and work backwards. In a major academic library, it may be the case that some titles are *never* handled, but that doesn't cause problems if no one wants them. For low-use resources, it can make more sense to just handle things manually. Perhaps other institutions have more easily parseable holdings data (or even holdings data stored in structured form in the ILS) than mine. For mine, even holdings that don't include missing statements cannot feasibly be parsed reliably; I've tried. Note that you can get structured holdings data from sources other than the library catalog -- if you know what's missing. Sounds like your situation is particularly challenging. But there are gains worth chasing. Service issues aside, problems like these raise existential questions. If we do an inadequate job of providing access, patrons will just turn to subscription databases and no one will care about what we do or even whether we're still around. Most major academic libraries never got their entire card collection into the online catalog. Patrons don't use that stuff anymore, and almost no one cares (even among librarians). It would be a mistake to think this can't happen again. kyle
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
Oh you really do mean complete like complete publication run? Very few of our journal holdings are complete in that sense; they are definitely in the minority. We start getting something after issue 1, or stop getting it before the last issue. Or stop and then start again. Is this really unusual? No, but parsing holdings statements for something that just gets cut off early or which starts late should be easy unless entry is insanely inconsistent. If staff enter info even close to standard practices, you should still be able to read a lot of it even when there are breaks. This is when anal-retentive behavior in the tech services dept saves your bacon. This process will be lossy, but sometimes that's all you can do. Some situations may be such that there's no reasonable fix that would significantly improve things. But in that case, it makes sense to move on to other problems. Otherwise, we wind up spending all our time futzing with fringe use cases while people actually get what they need elsewhere. kyle
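The "lossy but better than nothing" parsing Kyle describes can be sketched for the clean, single-range case. The statement format here (e.g. "v.1(1990)-v.12(2001)", with a trailing hyphen for an open-ended run) is a made-up example, not any institution's real convention; per Jonathan's caveats, anything that doesn't match just falls through as unparsable.

```python
import re

# Hypothetical summary-holdings pattern: "v.1(1990)-v.12(2001)" for a
# closed range, "v.5(1994)-" for an open-ended run. Real statements vary
# wildly between institutions; this handles only the clean single-range case.
RANGE = re.compile(
    r"v\.(?P<first>\d+)\((?P<fy>\d{4})\)-(?:v\.(?P<last>\d+)\((?P<ly>\d{4})\))?\s*$"
)

def parse_holdings(statement):
    """Return (first_vol, last_vol) for a simple range; last_vol is None
    for an open-ended run. Returns None when the statement doesn't match."""
    m = RANGE.search(statement.strip())
    if not m:
        return None
    last = int(m.group("last")) if m.group("last") else None
    return (int(m.group("first")), last)

def holds_volume(statement, vol):
    """Lossy check: does the parsed range cover the requested volume?
    Returns None (punt) for unparsable statements, per the thread."""
    parsed = parse_holdings(statement)
    if parsed is None:
        return None
    first, last = parsed
    return first <= vol and (last is None or vol <= last)
```

The point of the three-way return (True / False / None) is exactly the trade-off discussed above: answer confidently where the data is clean, and fall back to a human or a link resolver menu where it isn't.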
Re: [CODE4LIB] WorldCat Terminologies
I bet it's got an SRU api. Aye, Ralph can confirm, but I'm pretty sure what you see on your screen is actually an XML SRU (API) response to which your browser has applied the suggested XSL stylesheet, thus rendering the API result in a more human-friendly manner. A view-source on the page should show the SRU. Ian. On 21 March 2010 04:19, Jonathan Rochkind rochk...@jhu.edu wrote: Yeah, the statement that it's a static copy from 2006 would have stopped me in my tracks if I had somehow happened across the page, which I probably wouldn't have, but now I've bookmarked it so I might find it again -- but will probably forget that it's REALLY up to date even though it says 2006 on it. Nice catch Karen. Karen, that looks to me like an HTML front-end for an SRU service, I bet it's got an SRU api. Which one of these days I'll get around to figuring out how to write code for. From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Karen Coyle [li...@kcoyle.net] Sent: Saturday, March 20, 2010 11:29 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Quoting LeVan,Ralph le...@oclc.org: I hate to muddy the waters, but I can't resist here. Research also exposes a copy of the LC NAF at http://alcme.oclc.org/srw/search/lcnaf It gets updated every Tuesday night. Unfortunately, that page states right up front: A static copy of LC's Name Authority File from February of 2006 That might confuse visitors. Maybe a quick revision is in order? :-) Also, API access? kc This is something I've been maintaining for years and is what Identities points at when you ask to see the NAF record associated with an Identities record. This particular service has none of the linked-data-type bells and whistles I'm putting into VIAF and Identities, but easily could, if there was interest. I believe I've made the indexing on it consistent with what I do in Identities. 
Looking at the configuration file for the load of this database, I am omitting records with 100$k, 100$t, 100$v, 100$x or any 130 fields. I'm sure Ya'aqov (or another similarly expert Authority Librarian) could tell you why I am omitting them, because I can't off the top of my head. This service is actually running as a long-established model of how similar services should run in Research. While it is not running on a machine operated by our production staff, it is automatically monitored by them, they have restart procedures in place for when the service becomes unresponsive, and problems are escalated by email when the restart fails to fix the problem. (Those emails come to me, where they get treated appropriately.) Let me know if there are questions about any of this. Ralph -Original Message- From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Ya'aqov Ziso Sent: Friday, March 19, 2010 3:29 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Jonathan, thank you, in full accord. Yes, the crux of the matter is Names (NAF being the more expensive library subscription and the one not available for free like http://id.loc.gov). At http://orlabs.oclc.org/Identities/ I searched 'clapton, eric' and wonder: * are all 49 retrieved WorldCat+NAF headings there helpful? do we need 'clayton, cecile' (etc.) in the list? * Names that haven't made it into an authority record are definitely helpful, but can we suggest a way to sort and rank them more usefully (for the user) on the page? Your thoughts? Ya'aqov On 3/19/10 2:35 PM, Jonathan Rochkind rochk...@jhu.edu wrote: I don't think the inclusion of non-NAF headings in Identities is a flaw, it's a benefit to the purpose of Identities not to be held back by the somewhat glacial pace of change in NAF. 
But you're right, the right tool for the job, I don't know that any of the existing OCLC free (or included with other OCLC membership/services) services are the right tool to replace any existing purchased authorities tools or sources. It depends on what you're using them for, of course. I agree that the brochure statement was potentially misleading, but these (Identities, Terminologies, Research Terminologies) are still very interesting and useful services. From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Ya'aqov Ziso [z...@rowan.edu] Sent: Friday, March 19, 2010 2:14 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Karen, Seems like pulling teeth was worth it. Thank you for these updates and for making them available for all interested. Essentially, given your 6-month latency compared to http://id.loc.gov and the inclusion of NAF and non-NAF headings in Identities, both Terminologies and Identities are not yet at a level of databases that libraries can go for, forgoing what
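Since the page is just an SRU response with a stylesheet applied, querying the API directly is a matter of building a standard SRU searchRetrieve URL. A minimal sketch: the base URL is the one from the thread, the parameter names are standard SRU 1.1, but whether the endpoint is still live and which CQL indexes it supports would need to be confirmed against its explain record.

```python
from urllib.parse import urlencode

# Endpoint mentioned in the thread; availability and supported indexes
# would need to be verified against the service's SRU explain record.
LCNAF_BASE = "http://alcme.oclc.org/srw/search/lcnaf"

def sru_search_url(query, base=LCNAF_BASE, max_records=10):
    """Build an SRU 1.1 searchRetrieve URL. The 'query' argument is a CQL
    expression; a bare term does a default-index keyword search."""
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "query": query,
        "maximumRecords": max_records,
    }
    return "%s?%s" % (base, urlencode(params))
```

Fetching that URL returns the raw XML searchRetrieveResponse; dropping the query string and loading the base URL in a browser is what produces the stylesheet-rendered view Ian describes.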
Re: [CODE4LIB] WorldCat Terminologies
I'm certain that, as Ralph indicated, this file has been kept up to date weekly. The HTML page header will, eventually, be fixed as well to accurately reflect the file's last update and its SRU searchability. The fact remains that for all of Terminologies/Identities/xISSN/xISBN, WC-DEVNET is the customer support and quality control. We have no other address for maintenance, and possibly OCLC Research's dedicated staff lack such an address as well. Yes, these experimental services reside on OCLC servers. Unfortunately, given this customer support model, OCLC Research will be constantly put in a defensive position, and all we can do is flag problems and maintain this loop. (Unless any of you has an idea for a loophole -- and please, bring it on!) Ya'aqov -Original Message- From: Code for Libraries on behalf of Jonathan Rochkind Sent: Sun 3/21/2010 12:19 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Yeah, the statement that it's a static copy from 2006 would have stopped me in my tracks if I had somehow happened across the page, which I probably wouldn't have, but now I've bookmarked it so I might find it again -- but will probably forget that it's REALLY up to date even though it says 2006 on it. Nice catch Karen. Karen, that looks to me like an HTML front-end for an SRU service, I bet it's got an SRU api. Which one of these days I'll get around to figuring out how to write code for.
Re: [CODE4LIB] WorldCat Terminologies
Actually, that's a standard XML interface with a stylesheet rendering the HTML. The static message is probably coming out of my database configuration. Ralph -Original Message- From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Jonathan Rochkind Sent: Sunday, March 21, 2010 12:19 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Yeah, the statement that it's a static copy from 2006 would have stopped me in my tracks if I had somehow happened across the page, which I probably wouldn't have, but now I've bookmarked it so I might find it again -- but will probably forget that it's REALLY up to date even though it says 2006 on it. Nice catch Karen. Karen, that looks to me like an HTML front-end for an SRU service, I bet it's got an SRU api. Which one of these days I'll get around to figuring out how to write code for.
Re: [CODE4LIB] WorldCat Terminologies
I'm open to suggestions, Ya'aqov. I've been talking up the idea of some sort of dashboard for our services: display uptime and response time. It will be tougher to automatically detect a database update and report it. I'll give that some thought for stuff running over my software stack. This seems like the right forum to solicit other suggestions. Has anyone done this before? It seems like there ought to be some lists lying around somewhere of information that would be helpful to service consumers. Ralph -Original Message- From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Ziso, Ya'aqov Sent: Sunday, March 21, 2010 1:09 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies I'm certain that, as Ralph indicated, this file has been kept up to date weekly. The HTML page header will, eventually, be fixed as well to accurately reflect the file's last update and its SRU searchability. The fact remains that for all of Terminologies/Identities/xISSN/xISBN, WC-DEVNET is the customer support and quality control. We have no other address for maintenance, and possibly OCLC Research's dedicated staff lack such an address as well. Yes, these experimental services reside on OCLC servers. Unfortunately, given this customer support model, OCLC Research will be constantly put in a defensive position, and all we can do is flag problems and maintain this loop. (Unless any of you has an idea for a loophole -- and please, bring it on!) Ya'aqov
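At the sampling level, the uptime/response-time dashboard Ralph floats reduces to a periodic probe like the sketch below. This is a hypothetical illustration, not anything OCLC runs: a real monitor would schedule these samples and persist them, and detecting database *updates* (the harder problem Ralph mentions) would need a content-based check, such as comparing a known record's response over time.

```python
import time
import urllib.request

def probe(url, timeout=10):
    """One health-check sample for a service dashboard: HTTP status code
    (None on any failure) and response time in seconds."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
    except Exception:
        status = None  # DNS failure, timeout, HTTP error, etc.
    return {"url": url, "status": status, "seconds": time.monotonic() - start}
```

Charting the stored samples per service would give exactly the uptime and response-time view suggested, without any cooperation needed from the service itself.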
Re: [CODE4LIB] WorldCat Terminologies
Quoting LeVan,Ralph le...@oclc.org: I hate to muddy the waters, but I can't resist here. Research also exposes a copy of the LC NAF at http://alcme.oclc.org/srw/search/lcnaf It gets updated every Tuesday night. Unfortunately, that page states right up front: A static copy of LC's Name Authority File from February of 2006 That might confuse visitors. Maybe a quick revision is in order? :-) Also, API access? kc This is something I've been maintaining for years and is what Identities points at when you ask to see the NAF record associated with an Identities record. This particular service has none of the linked-data-type bells and whistles I'm putting into VIAF and Identities, but easily could, if there was interest. I believe I've made the indexing on it consistent with what I do in Identities. Looking at the configuration file for the load of this database, I am omitting records with 100$k, 100$t, 100$v, 100$x or any 130 fields. I'm sure Ya'aqov (or another similarly expert Authority Librarian) could tell you why I am omitting them, because I can't off the top of my head. This service is actually running as a long-established model of how similar services should run in Research. While it is not running on a machine operated by our production staff, it is automatically monitored by them, they have restart procedures in place for when the service becomes unresponsive, and problems are escalated by email when the restart fails to fix the problem. (Those emails come to me, where they get treated appropriately.) Let me know if there are questions about any of this. Ralph -Original Message- From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Ya'aqov Ziso Sent: Friday, March 19, 2010 3:29 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Jonathan, thank you, in full accord. 
Yes, the crux of the matter is Names (NAF being the more expensive library subscription and the one not available for free like http://id.loc.gov). At http://orlabs.oclc.org/Identities/ I searched 'clapton, eric' and wonder: * are all 49 retrieved WorldCat+NAF headings there helpful? do we need 'clayton, cecile' (etc.) in the list? * Names that haven't made it into an authority record are definitely helpful, but can we suggest a way to sort and rank them more usefully (for the user) on the page? Your thoughts? Ya'aqov On 3/19/10 2:35 PM, Jonathan Rochkind rochk...@jhu.edu wrote: I don't think the inclusion of non-NAF headings in Identities is a flaw, it's a benefit to the purpose of Identities not to be held back by the somewhat glacial pace of change in NAF. But you're right, the right tool for the job, I don't know that any of the existing OCLC free (or included with other OCLC membership/services) services are the right tool to replace any existing purchased authorities tools or sources. It depends on what you're using them for, of course. I agree that the brochure statement was potentially misleading, but these (Identities, Terminologies, Research Terminologies) are still very interesting and useful services. From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Ya'aqov Ziso [z...@rowan.edu] Sent: Friday, March 19, 2010 2:14 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Karen, Seems like pulling teeth was worth it. Thank you for these updates and for making them available for all interested. Essentially, given your 6-month latency compared to http://id.loc.gov and the inclusion of NAF and non-NAF headings in Identities, both Terminologies and Identities are not yet at a level of databases that libraries can go for, forgoing what they currently need to buy, load, and maintain locally. Your withdrawal of the brochure statement is justified and apology accepted. 
Ya'aqov On 3/19/10 1:35 PM, Karen Coombs coom...@oclc.org wrote: Ya'aqov, Identities is not based on a name authority file; it is based on name data in WorldCat. These two are not the same thing. Names within Identities come from several different fields within the WorldCat MARC records, including 1xx, 6xx, and 7xx fields. This is why Identities contains names which have not made their way into a name authority file (http://www.worldcat.org/identities/np-levan,%20ralph%20r). Also, Identities doesn't contain authority records; it contains Identity records. Identity records contain different information from Authority records. There are name authority projects going on within OCLC Research. The most active is VIAF http://www.viaf.org/ . This service contains name authority information from several national libraries, not just the Library of Congress, and right now not all of NACO has been loaded, only differentiated personal names. More
Re: [CODE4LIB] WorldCat Terminologies
Yeah, the statement that it's a static copy from 2006 would have stopped me in my tracks if I had somehow happened accross the page, which I probably wouldn't have, but now I've bookmarked it so I might find it again -- but will probably forget that it's REALLY up to date even though it says 2006 on it. Nice catch Karen. Karen, that looks to me like an HTML front-end for an SRU service, I bet it's got an SRU api. Which one of these days I'll get around to figuring out how to write code for. From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Karen Coyle [li...@kcoyle.net] Sent: Saturday, March 20, 2010 11:29 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Quoting LeVan,Ralph le...@oclc.org: I hate to muddy the waters, but I can't resist here. Research also exposes a copy of the LC NAF at http://alcme.oclc.org/srw/search/lcnaf It gets updated every Tuesday night. Unfortunately, that page states right up front: A static copy of LC's Name Authority File from February of 2006 That might confuse visitors. Maybe a quick revision is in order? :-) Also, API access? kc This is something I've been maintaining for years and is what Identities points at when you ask to see the NAF record associated with an Identities record. This particular service has none of the linked-data-type bells and whistles I'm putting into VIAF and Identities, but easily could, if there was interest. I believe I've made the indexing on it consistent with what I do in Identities. Looking at the configuration file for the load of this database, I am omitting records with 100$k, 100$t, 100$v, 100$x or any 130 fields. I'm sure Ya'aqov (or other similarly expert Authority Librarian) could tell you why I am omitting them, because I can't off the top of my head. This service is actually running as a long established model of how similar services should run in Research. 
While it is not running on a machine operated by our production staff, it is automatically monitored by them, they have restart procedures in place when the service becomes unresponsive, and problems are escalated by email when the restart fails to fix the problem. (Those emails come to me, where they get treated appropriately.) Let me know if there are questions about any of this. Ralph -Original Message- From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Ya'aqov Ziso Sent: Friday, March 19, 2010 3:29 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Jonathan, thank you, in full accord. Yes, the crux of the matter is Names (NAF being the more expensive library subscription and the one not available for free like http://id.loc.gov). At http://orlabs.oclc.org/Identities/ I searched 'clapton, eric' and wonder: * are all WorldCat+NAF 49 retrieved headings there helpful? do we need in the list 'clayton, cecile' (etc.) * Names that haven't made it into an authority record are definitely helpful, but can we suggest a way to sort and rank them more usefully (for the user) on the page? Your thoughts? Ya'aqov On 3/19/10 2:35 PM, Jonathan Rochkind rochk...@jhu.edu wrote: I don't think the inclusion of non-NAF headings in Identities is a flaw, it's a benefit to the purpose of Identities not to be held back by the somewhat glacial pace of change in NAF. But you're right, the right tool for the job: I don't know that any of the existing OCLC free (or included with other OCLC membership/services) services are the right tool to replace any existing purchased authorities tools or sources. It depends on what you're using them for, of course. I agree that the brochure statement was potentially misleading, but these (Identities, Terminologies, Research Terminologies) are still very interesting and useful services. 
From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Ya'aqov Ziso [z...@rowan.edu] Sent: Friday, March 19, 2010 2:14 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Karen, Seems like pulling teeth was worth it. Thank you for these updates and for making them available for all interested. Essentially, given your 6-month latency (compared to http://id.loc.gov) and the inclusion of NAF and non-NAF headings in Identities, both Terminologies and Identities are not yet at a level of databases that libraries can go for, forgoing what they currently need to buy, load, and maintain locally. Your withdrawal of the brochure statement is justified and apology accepted. Ya'aqov On 3/19/10 1:35 PM, Karen Coombs coom...@oclc.org wrote: Ya'aqov, Identities is not based on a name authority file; it is based on name data in WorldCat. These two are not the same thing. Names within Identities come from
Re: [CODE4LIB] WorldCat Terminologies
PS: Whatever I search for under the cql.any column, I get 0 hits. Maybe cql.any isn't actually supported? Ah, the perils of trying to figure out SRU. The SRU explain document (which is actually there if you view source: that page in fact just IS an SRU explain document with an XSLT transform to HTML) suggests that cql.any is indeed supported. It might be lying. Whenever I've tried to use an SRU service, it hasn't been nearly as transparently self-explaining as SRU aims to be. Jonathan From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Karen Coyle [li...@kcoyle.net] Sent: Saturday, March 20, 2010 11:29 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Quoting LeVan,Ralph le...@oclc.org: I hate to muddy the waters, but I can't resist here. Research also exposes a copy of the LC NAF at http://alcme.oclc.org/srw/search/lcnaf It gets updated every Tuesday night. Unfortunately, that page states right up front: A static copy of LC's Name Authority File from February of 2006 That might confuse visitors. Maybe a quick revision is in order? :-) Also, API access? kc This is something I've been maintaining for years and is what Identities points at when you ask to see the NAF record associated with an Identities record. This particular service has none of the linked-data-type bells and whistles I'm putting into VIAF and Identities, but easily could, if there was interest. I believe I've made the indexing on it consistent with what I do in Identities. Looking at the configuration file for the load of this database, I am omitting records with 100$k, 100$t, 100$v, 100$x or any 130 fields. I'm sure Ya'aqov (or other similarly expert Authority Librarian) could tell you why I am omitting them, because I can't off the top of my head. This service is actually running as a long-established model of how similar services should run in Research. 
While it is not running on a machine operated by our production staff, it is automatically monitored by them, they have restart procedures in place when the service becomes unresponsive, and problems are escalated by email when the restart fails to fix the problem. (Those emails come to me, where they get treated appropriately.) Let me know if there are questions about any of this. Ralph -Original Message- From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Ya'aqov Ziso Sent: Friday, March 19, 2010 3:29 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Jonathan, thank you, in full accord. Yes, the crux of the matter is Names (NAF being the more expensive library subscription and the one not available for free like http://id.loc.gov). At http://orlabs.oclc.org/Identities/ I searched 'clapton, eric' and wonder: * are all WorldCat+NAF 49 retrieved headings there helpful? do we need in the list 'clayton, cecile' (etc.) * Names that haven't made it into an authority record are definitely helpful, but can we suggest a way to sort and rank them more usefully (for the user) on the page? Your thoughts? Ya'aqov On 3/19/10 2:35 PM, Jonathan Rochkind rochk...@jhu.edu wrote: I don't think the inclusion of non-NAF headings in Identities is a flaw, it's a benefit to the purpose of Identities not to be held back by the somewhat glacial pace of change in NAF. But you're right, the right tool for the job: I don't know that any of the existing OCLC free (or included with other OCLC membership/services) services are the right tool to replace any existing purchased authorities tools or sources. It depends on what you're using them for, of course. I agree that the brochure statement was potentially misleading, but these (Identities, Terminologies, Research Terminologies) are still very interesting and useful services. 
From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Ya'aqov Ziso [z...@rowan.edu] Sent: Friday, March 19, 2010 2:14 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Karen, Seems like pulling teeth was worth it. Thank you for these updates and for making them available for all interested. Essentially, given your 6-month latency (compared to http://id.loc.gov) and the inclusion of NAF and non-NAF headings in Identities, both Terminologies and Identities are not yet at a level of databases that libraries can go for, forgoing what they currently need to buy, load, and maintain locally. Your withdrawal of the brochure statement is justified and apology accepted. Ya'aqov On 3/19/10 1:35 PM, Karen Coombs coom...@oclc.org wrote: Ya'aqov, Identities is not based on a name authority file; it is based on name data in WorldCat. These two are not the same thing. Names within Identities come from
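[Editor's note: a minimal sketch of querying the lcnaf SRU endpoint discussed above. The endpoint URL and the cql.any index come from the thread itself; the request parameters (operation, version, query, maximumRecords) are standard SRU 1.1, and the search term "clapton" is just an illustrative value. The service was experimental then and may no longer respond, so this only builds the request URL rather than assuming a live server.]

```python
from urllib.parse import urlencode

# Endpoint from Ralph LeVan's message; historical, may no longer be live.
SRU_BASE = "http://alcme.oclc.org/srw/search/lcnaf"

def build_sru_url(query: str, max_records: int = 10) -> str:
    """Build a standard SRU 1.1 searchRetrieve URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": query,            # CQL, e.g. against the cql.any index
        "maximumRecords": str(max_records),
    }
    return SRU_BASE + "?" + urlencode(params)

# The cql.any index Jonathan was testing (the explain document claims
# it is supported, even if it returned 0 hits at the time):
url = build_sru_url('cql.any = "clapton"')
print(url)
```

Fetching `url` and parsing the returned `searchRetrieveResponse` XML would be the next step; the explain document at the endpoint root lists which indexes the server actually supports.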
Re: [CODE4LIB] WorldCat Terminologies
Ya'aqov, We have decided that it is wiser to withdraw the statement from our brochure, with our apologies, rather than attempt to defend it. We're sorry if this has caused you any trouble. As for your questions regarding frequency of update and any guaranteed level of service, we have already answered those questions in venues to which you subscribe. For example: http://tinyurl.com/yaldzcw . It is common practice, from Google to your local startup, for services to be put out in "experimental" or "beta" mode to judge interest and potential uses before investing to support them at production level. Thanks for your understanding, Karen Coombs OCLC Developer Network Manager -- Forwarded Message From: Ya'aqov Ziso z...@rowan.edu Reply-To: WorldCat Developer Network Discussion List wc-devne...@oclc.org Date: Thu, 18 Mar 2010 16:32:20 -0400 To: wc-devne...@oclc.org Subject: Re: [WC-DEVNET-L] WorldCat Terminologies Karen, At the CODE4LIB Wednesday in Asheville breakout session on API queries http://wiki.code4lib.org/index.php/2010_Breakout_Sessions#Wednesday we questioned the level of maintenance for NAF, LCSH (by OCLC Research for Identities and Terminologies). I also added a question (below) per your distributed brochure. What is the status of these questions? Are we to deal with dirtier data (compared to NAF/LCSH in CONNEXION) for now? Note: without a WC-DEVNET tracking system, some questions get lost, by chance or by intent. Ya'aqov On 3/3/10 2:30 PM, Ya'aqov Ziso z...@rowan.edu wrote: Karen Coombs, Hi, "Terminologies ... all those terminologies databases that you used to have to buy, load, and maintain locally now available remotely for free ... 
(from the blurb OCLC distributed at CODE4LIB in Asheville, 2/21-25/2010) Could you please elaborate: how can Terminologies Services substitute for what libraries upkeep and pay for currently, given the other statement on that blurb's page, "WorldCat Terminologies is still an experiment research service with no service assurances"? Kind thanks, Ya'aqov --- Posted on: WorldCat Developer Network discussion list To post: email to wc-devne...@oclc.org To subscribe, go to https://www3.oclc.org/app/listserv/ To unsubscribe, change options, change to digest mode, or view archive, go to: http://listserv.oclc.org/scripts/wa.exe?A0=WC-DEVNET-L list owners: Roy Tennant, Don Hamparian -- End of Forwarded Message
Re: [CODE4LIB] WorldCat Terminologies
Hello Karen, Since upkeep done to Terminologies and Identities involves all the WorldCat membership copied on your note, it would be helpful if OCLC Research would post: a URL specifying upkeep done to Terminologies; a URL specifying upkeep done to Identities (is the NAF used via CONNEXION the same as the one used for Identities? If not, specify). The library community has by now been sufficiently exposed to successful attempts and/or baubles, and learned the difference between "beta" and "experimental". As we all go through this learning experience, in this case partnering with OCLC Research, we wish to grow successful. Ya'aqov Ziso, Electronic Resource Management Librarian, Rowan University 856 256 4804 On 3/19/10 10:00 AM, Karen Coombs coom...@oclc.org wrote: Ya'aqov, We have decided that it is wiser to withdraw the statement from our brochure, with our apologies, rather than attempt to defend it. We're sorry if this has caused you any trouble. As for your questions regarding frequency of update and any guaranteed level of service, we have already answered those questions in venues to which you subscribe. For example: http://tinyurl.com/yaldzcw . It is common practice, from Google to your local startup, for services to be put out in "experimental" or "beta" mode to judge interest and potential uses before investing to support them at production level. Thanks for your understanding, Karen Coombs OCLC Developer Network Manager -- Forwarded Message From: Ya'aqov Ziso z...@rowan.edu Reply-To: WorldCat Developer Network Discussion List wc-devne...@oclc.org Date: Thu, 18 Mar 2010 16:32:20 -0400 To: wc-devne...@oclc.org Subject: Re: [WC-DEVNET-L] WorldCat Terminologies Karen, At the CODE4LIB Wednesday in Asheville breakout session on API queries http://wiki.code4lib.org/index.php/2010_Breakout_Sessions#Wednesday we questioned the level of maintenance for NAF, LCSH (by OCLC Research for Identities and Terminologies). I also added a question (below) per your distributed brochure. 
What is the status of these questions? Are we to deal with dirtier data (compared to NAF/LCSH in CONNEXION) for now? Note: without a WC-DEVNET tracking system, some questions get lost, by chance or by intent. Ya'aqov On 3/3/10 2:30 PM, Ya'aqov Ziso z...@rowan.edu wrote: Karen Coombs, Hi, "Terminologies ... all those terminologies databases that you used to have to buy, load, and maintain locally now available remotely for free ..." (from the blurb OCLC distributed at CODE4LIB in Asheville, 2/21-25/2010) Could you please elaborate: how can Terminologies Services substitute for what libraries upkeep and pay for currently, given the other statement on that blurb's page, "WorldCat Terminologies is still an experiment research service with no service assurances"? Kind thanks, Ya'aqov --- Posted on: WorldCat Developer Network discussion list To post: email to wc-devne...@oclc.org To subscribe, go to https://www3.oclc.org/app/listserv/ To unsubscribe, change options, change to digest mode, or view archive, go to: http://listserv.oclc.org/scripts/wa.exe?A0=WC-DEVNET-L list owners: Roy Tennant, Don Hamparian -- End of Forwarded Message
Re: [CODE4LIB] WorldCat Terminologies
Ya'aqov, Identities is not based on a name authority file; it is based on name data in WorldCat. These two are not the same thing. Names within Identities come from several different fields within the WorldCat MARC records, including 1xx, 6xx, and 7xx fields. This is why Identities contains names which have not made their way into a name authority file (http://www.worldcat.org/identities/np-levan,%20ralph%20r). Also, Identities doesn't contain authority records; it contains Identity records. Identity records contain different information from Authority records. There are name authority projects going on within OCLC Research. The most active is VIAF http://www.viaf.org/ . This service contains name authority information from several national libraries, not just the Library of Congress, and right now not all of NACO has been loaded, only differentiated personal names. More information on the VIAF project is available at http://www.oclc.org/research/activities/viaf/ Regarding the up-to-dateness of Terminologies: as I said in my previous message, a schedule for updates was posted to the Developer Network listserv and the Developer Network wiki (http://worldcat.org/devnet/wiki/Terminologies_Updates). Andrew Houghton has also informed you that information about how recently a given terminology was updated is located on the Experimental Terminologies page ( http://tspilot.oclc.org/resources/index.html). Karen -- Karen A. Coombs Product Manager OCLC Developer Network coom...@oclc.org 281-886-0882 Skype:librarywebchic On 3/19/10 10:29 AM, Ya'aqov Ziso z...@rowan.edu wrote: Hello Karen, Since upkeep done to Terminologies and Identities involves all the WorldCat membership copied on your note, it would be helpful if OCLC Research would post: a URL specifying upkeep done to Terminologies; a URL specifying upkeep done to Identities (is the NAF used via CONNEXION the same as the one used for Identities? 
If not, specify). The library community has by now been sufficiently exposed to successful attempts and/or baubles, and learned the difference between "beta" and "experimental". As we all go through this learning experience, in this case partnering with OCLC Research, we wish to grow successful. Ya'aqov Ziso, Electronic Resource Management Librarian, Rowan University 856 256 4804 On 3/19/10 10:00 AM, Karen Coombs coom...@oclc.org wrote: Ya'aqov, We have decided that it is wiser to withdraw the statement from our brochure, with our apologies, rather than attempt to defend it. We're sorry if this has caused you any trouble. As for your questions regarding frequency of update and any guaranteed level of service, we have already answered those questions in venues to which you subscribe. For example: http://tinyurl.com/yaldzcw . It is common practice, from Google to your local startup, for services to be put out in "experimental" or "beta" mode to judge interest and potential uses before investing to support them at production level. Thanks for your understanding, Karen Coombs OCLC Developer Network Manager -- Forwarded Message From: Ya'aqov Ziso z...@rowan.edu Reply-To: WorldCat Developer Network Discussion List wc-devne...@oclc.org Date: Thu, 18 Mar 2010 16:32:20 -0400 To: wc-devne...@oclc.org Subject: Re: [WC-DEVNET-L] WorldCat Terminologies Karen, At the CODE4LIB Wednesday in Asheville breakout session on API queries http://wiki.code4lib.org/index.php/2010_Breakout_Sessions#Wednesday we questioned the level of maintenance for NAF, LCSH (by OCLC Research for Identities and Terminologies). I also added a question (below) per your distributed brochure. What is the status of these questions? Are we to deal with dirtier data (compared to NAF/LCSH in CONNEXION) for now? Note: without a WC-DEVNET tracking system, some questions get lost, by chance or by intent. Ya'aqov On 3/3/10 2:30 PM, Ya'aqov Ziso z...@rowan.edu wrote: Karen Coombs, Hi, "Terminologies ... 
all those terminologies databases that you used to have to buy, load, and maintain locally now available remotely for free ..." (from the blurb OCLC distributed at CODE4LIB in Asheville, 2/21-25/2010) Could you please elaborate: how can Terminologies Services substitute for what libraries upkeep and pay for currently, given the other statement on that blurb's page, "WorldCat Terminologies is still an experiment research service with no service assurances"? Kind thanks, Ya'aqov --- Posted on: WorldCat Developer Network discussion list To post: email to wc-devne...@oclc.org To subscribe, go to https://www3.oclc.org/app/listserv/ To unsubscribe, change options, change to digest mode, or view archive, go to: http://listserv.oclc.org/scripts/wa.exe?A0=WC-DEVNET-L list owners: Roy Tennant, Don Hamparian
Re: [CODE4LIB] WorldCat Terminologies
Karen, Seems like pulling teeth was worth it. Thank you for these updates and for making them available for all interested. Essentially, given your 6-month latency (compared to http://id.loc.gov) and the inclusion of NAF and non-NAF headings in Identities, both Terminologies and Identities are not yet at a level of databases that libraries can go for, forgoing what they currently need to buy, load, and maintain locally. Your withdrawal of the brochure statement is justified and apology accepted. Ya'aqov On 3/19/10 1:35 PM, Karen Coombs coom...@oclc.org wrote: Ya'aqov, Identities is not based on a name authority file; it is based on name data in WorldCat. These two are not the same thing. Names within Identities come from several different fields within the WorldCat MARC records, including 1xx, 6xx, and 7xx fields. This is why Identities contains names which have not made their way into a name authority file (http://www.worldcat.org/identities/np-levan,%20ralph%20r). Also, Identities doesn't contain authority records; it contains Identity records. Identity records contain different information from Authority records. There are name authority projects going on within OCLC Research. The most active is VIAF http://www.viaf.org/ . This service contains name authority information from several national libraries, not just the Library of Congress, and right now not all of NACO has been loaded, only differentiated personal names. More information on the VIAF project is available at http://www.oclc.org/research/activities/viaf/ Regarding the up-to-dateness of Terminologies: as I said in my previous message, a schedule for updates was posted to the Developer Network listserv and the Developer Network wiki (http://worldcat.org/devnet/wiki/Terminologies_Updates). Andrew Houghton has also informed you that information about how recently a given terminology was updated is located on the Experimental Terminologies page ( http://tspilot.oclc.org/resources/index.html). Karen
Re: [CODE4LIB] WorldCat Terminologies
I don't think the inclusion of non-NAF headings in Identities is a flaw, it's a benefit to the purpose of Identities not to be held back by the somewhat glacial pace of change in NAF. But you're right, the right tool for the job: I don't know that any of the existing OCLC free (or included with other OCLC membership/services) services are the right tool to replace any existing purchased authorities tools or sources. It depends on what you're using them for, of course. I agree that the brochure statement was potentially misleading, but these (Identities, Terminologies, Research Terminologies) are still very interesting and useful services. From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Ya'aqov Ziso [z...@rowan.edu] Sent: Friday, March 19, 2010 2:14 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Karen, Seems like pulling teeth was worth it. Thank you for these updates and for making them available for all interested. Essentially, given your 6-month latency (compared to http://id.loc.gov) and the inclusion of NAF and non-NAF headings in Identities, both Terminologies and Identities are not yet at a level of databases that libraries can go for, forgoing what they currently need to buy, load, and maintain locally. Your withdrawal of the brochure statement is justified and apology accepted. Ya'aqov On 3/19/10 1:35 PM, Karen Coombs coom...@oclc.org wrote: Ya'aqov, Identities is not based on a name authority file; it is based on name data in WorldCat. These two are not the same thing. Names within Identities come from several different fields within the WorldCat MARC records, including 1xx, 6xx, and 7xx fields. This is why Identities contains names which have not made their way into a name authority file (http://www.worldcat.org/identities/np-levan,%20ralph%20r). Also, Identities doesn't contain authority records; it contains Identity records. Identity records contain different information from Authority records. 
There are name authority projects going on within OCLC Research. The most active is VIAF http://www.viaf.org/ . This service contains name authority information from several national libraries, not just the Library of Congress, and right now not all of NACO has been loaded, only differentiated personal names. More information on the VIAF project is available at http://www.oclc.org/research/activities/viaf/ Regarding the up-to-dateness of Terminologies: as I said in my previous message, a schedule for updates was posted to the Developer Network listserv and the Developer Network wiki (http://worldcat.org/devnet/wiki/Terminologies_Updates). Andrew Houghton has also informed you that information about how recently a given terminology was updated is located on the Experimental Terminologies page ( http://tspilot.oclc.org/resources/index.html). Karen
Re: [CODE4LIB] WorldCat Terminologies
Jonathan, thank you, in full accord. Yes, the crux of the matter is Names (NAF being the more expensive library subscription and the one not available for free like http://id.loc.gov). At http://orlabs.oclc.org/Identities/ I searched 'clapton, eric' and wonder: * are all WorldCat+NAF 49 retrieved headings there helpful? do we need in the list 'clayton, cecile' (etc.) * Names that haven't made it into an authority record are definitely helpful, but can we suggest a way to sort and rank them more usefully (for the user) on the page? Your thoughts? Ya'aqov On 3/19/10 2:35 PM, Jonathan Rochkind rochk...@jhu.edu wrote: I don't think the inclusion of non-NAF headings in Identities is a flaw, it's a benefit to the purpose of Identities not to be held back by the somewhat glacial pace of change in NAF. But you're right, the right tool for the job: I don't know that any of the existing OCLC free (or included with other OCLC membership/services) services are the right tool to replace any existing purchased authorities tools or sources. It depends on what you're using them for, of course. I agree that the brochure statement was potentially misleading, but these (Identities, Terminologies, Research Terminologies) are still very interesting and useful services. From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Ya'aqov Ziso [z...@rowan.edu] Sent: Friday, March 19, 2010 2:14 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Karen, Seems like pulling teeth was worth it. Thank you for these updates and for making them available for all interested. Essentially, given your 6-month latency (compared to http://id.loc.gov) and the inclusion of NAF and non-NAF headings in Identities, both Terminologies and Identities are not yet at a level of databases that libraries can go for, forgoing what they currently need to buy, load, and maintain locally. Your withdrawal of the brochure statement is justified and apology accepted. 
Ya'aqov On 3/19/10 1:35 PM, Karen Coombs coom...@oclc.org wrote: Ya'aqov, Identities is not based on a name authority file; it is based on name data in WorldCat. These two are not the same thing. Names within Identities come from several different fields within the WorldCat MARC records, including 1xx, 6xx, and 7xx fields. This is why Identities contains names which have not made their way into a name authority file (http://www.worldcat.org/identities/np-levan,%20ralph%20r). Also, Identities doesn't contain authority records; it contains Identity records. Identity records contain different information from Authority records. There are name authority projects going on within OCLC Research. The most active is VIAF http://www.viaf.org/ . This service contains name authority information from several national libraries, not just the Library of Congress, and right now not all of NACO has been loaded, only differentiated personal names. More information on the VIAF project is available at http://www.oclc.org/research/activities/viaf/ Regarding the up-to-dateness of Terminologies: as I said in my previous message, a schedule for updates was posted to the Developer Network listserv and the Developer Network wiki (http://worldcat.org/devnet/wiki/Terminologies_Updates). Andrew Houghton has also informed you that information about how recently a given terminology was updated is located on the Experimental Terminologies page ( http://tspilot.oclc.org/resources/index.html). Karen
Re: [CODE4LIB] WorldCat Terminologies
I hate to muddy the waters, but I can't resist here. Research also exposes a copy of the LC NAF at http://alcme.oclc.org/srw/search/lcnaf It gets updated every Tuesday night. This is something I've been maintaining for years and is what Identities points at when you ask to see the NAF record associated with an Identities record. This particular service has none of the linked-data-type bells and whistles I'm putting into VIAF and Identities, but easily could, if there was interest. I believe I've made the indexing on it consistent with what I do in Identities. Looking at the configuration file for the load of this database, I am omitting records with 100$k, 100$t, 100$v, 100$x or any 130 fields. I'm sure Ya'aqov (or other similarly expert Authority Librarian) could tell you why I am omitting them, because I can't off the top of my head. This service is actually running as a long-established model of how similar services should run in Research. While it is not running on a machine operated by our production staff, it is automatically monitored by them, they have restart procedures in place when the service becomes unresponsive, and problems are escalated by email when the restart fails to fix the problem. (Those emails come to me, where they get treated appropriately.) Let me know if there are questions about any of this. Ralph -Original Message- From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Ya'aqov Ziso Sent: Friday, March 19, 2010 3:29 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Jonathan, thank you, in full accord. Yes, the crux of the matter is Names (NAF being the more expensive library subscription and the one not available for free like http://id.loc.gov). At http://orlabs.oclc.org/Identities/ I searched 'clapton, eric' and wonder: * are all WorldCat+NAF 49 retrieved headings there helpful? do we need in the list 'clayton, cecile' (etc.) 
* Names that haven't made it into an authority record are definitely helpful, but can we suggest a way to sort and rank them more usefully (for the user) on the page? Your thoughts? Ya'aqov On 3/19/10 2:35 PM, Jonathan Rochkind rochk...@jhu.edu wrote: I don't think the inclusion of non-NAF headings in Identities is a flaw, it's a benefit to the purpose of Identities not to be held back by the somewhat glacial pace of change in NAF. But you're right, the right tool for the job: I don't know that any of the existing OCLC free (or included with other OCLC membership/services) services are the right tool to replace any existing purchased authorities tools or sources. It depends on what you're using them for, of course. I agree that the brochure statement was potentially misleading, but these (Identities, Terminologies, Research Terminologies) are still very interesting and useful services. From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Ya'aqov Ziso [z...@rowan.edu] Sent: Friday, March 19, 2010 2:14 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Karen, Seems like pulling teeth was worth it. Thank you for these updates and for making them available for all interested. Essentially, given your 6-month latency (compared to http://id.loc.gov) and the inclusion of NAF and non-NAF headings in Identities, both Terminologies and Identities are not yet at a level of databases that libraries can go for, forgoing what they currently need to buy, load, and maintain locally. Your withdrawal of the brochure statement is justified and apology accepted. Ya'aqov On 3/19/10 1:35 PM, Karen Coombs coom...@oclc.org wrote: Ya'aqov, Identities is not based on a name authority file; it is based on name data in WorldCat. These two are not the same thing. Names within Identities come from several different fields within the WorldCat MARC records, including 1xx, 6xx, and 7xx fields. 
This is why Identities contains names which have not made their way into a name authority file (http://www.worldcat.org/identities/np-levan,%20ralph%20r). Also, Identities doesn't contain authority records; it contains Identity records. Identity records contain different information from Authority records. There are name authority projects going on within OCLC Research. The most active is VIAF http://www.viaf.org/ . This service contains name authority information from several national libraries, not just the Library of Congress, and right now not all of NACO has been loaded, only differentiated personal names. More information on the VIAF project is available at http://www.oclc.org/research/activities/viaf/ Regarding the up-to-dateness of Terminologies: as I said in my previous message, a schedule for updates was posted to the Developer Network listserv
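[Editor's note: a minimal sketch of the record-omission rule Ralph describes for the lcnaf load: drop any authority record carrying a 100 $k, $t, $v, or $x subfield, or any 130 field. The dict-of-lists record layout below is purely illustrative, not a real MARC serialization; a production version would use a MARC library instead.]

```python
# Subfields of the 100 field that cause a record to be omitted, per
# Ralph's load configuration.
OMIT_100_SUBFIELDS = {"k", "t", "v", "x"}

def should_omit(record: dict) -> bool:
    """Return True if the record matches Ralph's omission rule.

    `record` maps a MARC tag ("100", "130", ...) to a list of fields,
    each field a dict of subfield code -> value (illustrative layout).
    """
    if record.get("130"):  # any 130 (uniform title heading) at all
        return True
    for field in record.get("100", []):
        # set & dict_keys: nonempty intersection means a banned subfield
        if OMIT_100_SUBFIELDS & field.keys():
            return True
    return False

keep = {"100": [{"a": "Clapton, Eric."}]}          # plain personal name
drop = {"100": [{"a": "Bible.", "t": "O.T."}]}     # name/title: has $t
print(should_omit(keep), should_omit(drop))
```

The effect is to keep straightforward personal-name headings and skip name/title and uniform-title records, which matches the "differentiated personal names" focus discussed elsewhere in the thread.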
Re: [CODE4LIB] WorldCat API account
The WorldCat API is not yet in general release. It is presently being beta tested by invited developers, many of whom (if not all) are on this list. Thus the confusion. Sorry, but stay tuned. A good way to do that is to sign up on the WorldCat Developer's Network listserv. A link to the signup page can be found on this page: http://worldcat.org/devnet/ Thanks, Roy On 6/25/08 8:40 AM, Yitzchak Schaffer [EMAIL PROTECTED] wrote: Greetings CODE4LIBers: Does anyone know how to get a test account for the WorldCat API? The wiki, last I checked, instructed to contact OCLC, but my e-mail to their generic address yielded no response. Thanks, --
Re: [CODE4LIB] worldcat
Not really, although we talk about it a lot around here at OCLC. --Th -Original Message- From: Code for Libraries [mailto:[EMAIL PROTECTED] On Behalf Of Eric Lease Morgan Sent: Monday, May 21, 2007 9:34 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] worldcat We here at Notre Dame subscribe to (license?) WorldCat, and I'm wondering, does it have a Web Services interface/API? -- Eric Lease Morgan University Libraries of Notre Dame (574) 631-8604
Re: [CODE4LIB] worldcat
From: Code for Libraries [mailto:[EMAIL PROTECTED] On Behalf Of Eric Lease Morgan Sent: 21 May, 2007 09:34 To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] worldcat We here at Notre Dame subscribe to (license?) WorldCat, and I'm wondering, does it have a Web Services interface/API? I guess it depends on what you consider a Web Service interface and API. Today you create URLs to retrieve XHTML documents. The details are here: http://www.worldcat.org/links/default.jsp It's not pretty, since you have to scrape the XHTML document for the information you want, and they don't use id attributes on content to make it easy for you to pull information out of the XHTML. However, looking at the roadmap for WorldCat.org, they have always planned a proper Web Service API for it. It's still beta, so some functionality has yet to be delivered. Unfortunately, I cannot say when a Web Service API will be delivered, since I'm not on the WorldCat.org team and do not know what their current development priorities are. Also, if you have some specific use cases in mind for how you would like to interact with WorldCat.org using Web services, I'm sure the WorldCat.org folks would like to see them. You can send feedback here: http://www.worldcat.org/oclc/?page=feedback Andy.
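Andy's point about the lack of id attributes is the crux of the scraping problem: you have to anchor on class attributes instead. A minimal sketch of that approach using only the standard library is below; the class names and the sample markup are illustrative assumptions, not WorldCat's actual page structure.

```python
from html.parser import HTMLParser

class ResultScraper(HTMLParser):
    """Collect the text of cells marked class="result" (no ids to anchor on)."""
    def __init__(self):
        super().__init__()
        self.in_result = 0   # depth counter for class="result" elements
        self.titles = []
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if dict(attrs).get("class") == "result":
            self.in_result += 1

    def handle_endtag(self, tag):
        if self.in_result and tag == "td":
            self.titles.append("".join(self._buf).strip())
            self._buf = []
            self.in_result -= 1

    def handle_data(self, data):
        if self.in_result:
            self._buf.append(data)

# Illustrative sample, not real WorldCat output.
sample = """
<table class="tableResults">
  <tr><td class="result">Database Design for Mere Mortals</td></tr>
  <tr><td class="result">SQL and Relational Theory</td></tr>
</table>
"""
scraper = ResultScraper()
scraper.feed(sample)
print(scraper.titles)
```

The same idea works with lxml or BeautifulSoup and a class-based XPath or CSS selector; the fragility is identical either way, which is why a proper API beats scraping.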
Re: [CODE4LIB] worldcat
On May 21, 2007, at 9:52 AM, Houghton,Andrew wrote: We here at Notre Dame subscribe to (license?) WorldCat, and I'm wondering, does it have a Web Services interface/API? I guess it depends on what you consider a Web Service interface and API. Today you create URL's to retrieve XHTML documents. The details are here: http://www.worldcat.org/links/default.jsp Thank you for the prompt replies. This is what I thought, and I was just checking to see if I had missed something. (Bummer.) -- Eric Lease Morgan
Re: [CODE4LIB] worldcat
From: Code for Libraries [mailto:[EMAIL PROTECTED] On Behalf Of Eric Lease Morgan Sent: 22 August, 2006 16:24 To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] worldcat Is there a public Z39.50/SRU/SRW/Web Services interface to WorldCat or OpenWorldCat? I would like to create a simple search engine to query others' books, and *Cat seems like a great candidate. Inquiring minds would like to know. I'm not sure about a Z39.50/SRU/SRW interface to WorldCat, but you can access WorldCat via URL queries, and it appears that the data comes back as an XHTML document. So... you could hack something together with a little XSLT to simulate an SRU interface. Since this isn't documented anywhere, at least that I could find, with a little digging and hacking I came up with the following: URL query: http://worldcat.org/search?q={your+query+goes+here}, e.g. http://worldcat.org/search?q=database+design Results XPath: /html/body//table[@class='tableResults'] and /html/body//table[@class='tableResults']/tr/td[@class='result'] Next set of results: http://worldcat.org/search?q=database+design&start={next+number+in+result+set+goes+here}&qt=next_page, e.g. http://worldcat.org/search?q=database+design&start=11&qt=next_page Andy.
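The URL patterns Andy dug up can be wrapped in a small helper, sketched below. The `q`, `start`, and `qt=next_page` parameters come straight from his message; they describe WorldCat.org as it was in 2006 and may well have changed since.

```python
from urllib.parse import urlencode

# Base search URL from Andy's undocumented findings (circa 2006).
BASE = "http://worldcat.org/search"

def search_url(query, start=None):
    """Build a WorldCat.org search URL; pass start= for subsequent pages."""
    params = [("q", query)]
    if start is not None:
        # Paging uses a start offset plus qt=next_page, per the examples above.
        params.append(("start", str(start)))
        params.append(("qt", "next_page"))
    return BASE + "?" + urlencode(params)

print(search_url("database design"))            # first page of results
print(search_url("database design", start=11))  # next page of results
```

Each URL returns an XHTML page that would then be scraped with the XPaths given above; there is no structured response format to parse.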
Re: [CODE4LIB] worldcat
I don't think they have a public one, but there is one if your institution has FirstSearch. http://www.oclc.org/support/documentation/firstsearch/z3950/fs_z39_config_guide/default.htm The production server provides access to all the databases available and requires a valid FirstSearch authorization. http://www.oclc.org/support/documentation/firstsearch/z3950/z3950_databases/specs/worldcat.htm Ryan Eby On 8/22/06, Eric Lease Morgan [EMAIL PROTECTED] wrote: Is there a public Z39.50/SRU/SRW/Web Services interface to WorldCat or OpenWorldCat? I would like to create a simple search engine to query others' books, and *Cat seems like a great candidate. Inquiring minds would like to know. -- Eric Morgan