Re: [CODE4LIB] worldcat discovery versus metadata apis
Eric,

The WorldCat Metadata API provides read and write access to the data in WorldCat: bibliographic records, local bibliographic data, and basic holdings. The WorldCat Discovery API provides search access to WorldCat and OCLC's Central Index of metadata across a diverse set of indexes; data is returned as a Linked Data graph. This API is still in beta.

For more detailed information on each API you can look at the documentation:

WorldCat Metadata API - http://www.oclc.org/developer/develop/web-services/worldcat-metadata-api.en.html
WorldCat Discovery API - https://www.oclc.org/developer/develop/web-services/worldcat-discovery-api.en.html

Given the use case you describe, your best bet is probably the WorldCat Metadata API, especially if you need to write any data back to WorldCat. However, you can perform the same task with the WorldCat Search API, and if you are only reading data this might be a better fit because of its simpler authentication method. You're welcome to send an email to dev...@oclc.org if you have further detailed support questions.

Karen

On Tue, Mar 22, 2016 at 6:15 AM, Eric Lease Morgan wrote:
> I’m curious. What is the difference between the WorldCat Discovery and
> WorldCat Metadata APIs?
>
> Given an OCLC number, I want to programmatically search WorldCat and get
> in return a full bibliographic record complete with authoritative subject
> headings and names. Which API should I be using?
>
> —
> Eric Morgan
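For Eric's read-only case (OCLC number in, full record out), the request boils down to a keyed GET. A minimal sketch follows; the endpoint path and the WSKey value are placeholders/assumptions for illustration, not the documented request format, so check the Metadata API documentation linked above before relying on them.

```python
# Sketch: build a read-by-OCLC-number request URL for the Metadata API.
# The "/bib/data/" path and the WSKey below are assumptions -- consult
# the Metadata API documentation for the real endpoint and auth details.
from urllib.parse import urlencode

def build_bib_url(oclc_number, wskey):
    """Return a hypothetical URL for fetching one bib record by OCLC number."""
    base = "https://worldcat.org/bib/data/" + oclc_number  # assumed path
    return base + "?" + urlencode({"wskey": wskey})

url = build_bib_url("41266045", "YOUR_WSKEY")  # placeholder key
print(url)
```

Note that the Metadata API's real authentication is heavier than a bare key parameter, which is part of why the simpler Search API may be the better fit for read-only use.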
Re: [CODE4LIB] WorldCat API - myTags
Arash,

I don't believe this functionality currently exists, but I've passed on your desire to those in a position to do something about it.

Thanks,
Roy

On Thu, Jun 6, 2013 at 9:59 AM, Arash.Joorabchi arash.joorab...@ul.ie wrote:

Hi all, When viewing a work's metadata on the WorldCat.org website, in the tag section of the page you are given the option to add new tags after logging in with your (free) account. I was wondering if there is a WorldCat API to do this from within my Java code. Thanks, Arash
Re: [CODE4LIB] WorldCat Implements Content-Negotiation for Linked Data
+1 On Mon, Jun 3, 2013 at 3:00 PM, Richard Wallis richard.wal...@dataliberate.com wrote: The Linked Data for the millions of resources in WorldCat.org is now available as RDF/XML, JSON-LD, Turtle, and Triples via content-negotiation. Details: http://dataliberate.com/2013/06/content-negotiation-for-worldcat/ ~Richard.
Re: [CODE4LIB] WorldCat Implements Content-Negotiation for Linked Data
Probably something I'm doing wrong, since I'm just copying and pasting, but the command from the blog post:

curl -L -H “Accept: text/turtle” http://www.worldcat.org/oclc/41266045

gets me:

curl: (6) Could not resolve host: text; nodename nor servname provided, or not known

kc

On 6/3/13 12:00 PM, Richard Wallis wrote: The Linked Data for the millions of resources in WorldCat.org is now available as RDF/XML, JSON-LD, Turtle, and Triples via content-negotiation. Details: http://dataliberate.com/2013/06/content-negotiation-for-worldcat/ ~Richard.

--
Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [CODE4LIB] WorldCat Implements Content-Negotiation for Linked Data
I also get a good response from that, Karen. I've seen this error in the past when DNS doesn't resolve. Possibly you're having connectivity issues.

On Mon, Jun 3, 2013 at 3:42 PM, Kyle Banerjee kyle.baner...@gmail.com wrote:

What you've provided looks like it will work. My money is that the quotes and/or hyphens aren't legit due to the copy/paste operation. Manually typing at the prompt should work just fine. kyle

On Mon, Jun 3, 2013 at 3:21 PM, Karen Coyle li...@kcoyle.net wrote:

Probably something I'm doing wrong, since I'm just copying and pasting, but the command from the blog post:

curl -L -H “Accept: text/turtle” http://www.worldcat.org/oclc/41266045

gets me:

curl: (6) Could not resolve host: text; nodename nor servname provided, or not known

kc

On 6/3/13 12:00 PM, Richard Wallis wrote: The Linked Data for the millions of resources in WorldCat.org is now available as RDF/XML, JSON-LD, Turtle, and Triples via content-negotiation. Details: http://dataliberate.com/2013/06/content-negotiation-for-worldcat/ ~Richard.

--
Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [CODE4LIB] WorldCat Implements Content-Negotiation for Linked Data
Ta da! That did it, Kyle. Why on earth do we call them smart quotes?!

kc

On 6/3/13 4:07 PM, Kyle Banerjee wrote:

Just for the heck of it, I tried copying and pasting and got the same error. There were smart quotes on the web page. Turn those into regular single or double quotes and it works fine. kyle

On Mon, Jun 3, 2013 at 3:21 PM, Karen Coyle li...@kcoyle.net wrote:

Probably something I'm doing wrong, since I'm just copying and pasting, but the command from the blog post:

curl -L -H “Accept: text/turtle” http://www.worldcat.org/oclc/41266045

gets me:

curl: (6) Could not resolve host: text; nodename nor servname provided, or not known

kc

On 6/3/13 12:00 PM, Richard Wallis wrote: The Linked Data for the millions of resources in WorldCat.org is now available as RDF/XML, JSON-LD, Turtle, and Triples via content-negotiation. Details: http://dataliberate.com/2013/06/content-negotiation-for-worldcat/ ~Richard.

--
Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
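The failure in this thread comes from typographic ("smart") quotes pasted into the shell: the shell does not treat curly quotes as quoting characters, so curl sees "text/turtle”" as a URL and fails on host "text". Doing the same content-negotiation from code sidesteps shell quoting entirely. This is a sketch assuming the worldcat.org URIs still honor an Accept header of text/turtle, as the blog post describes.

```python
# Content-negotiation without shell quoting hazards: set the Accept
# header programmatically instead of on a copy/pasted command line.
import urllib.request

def turtle_request(uri):
    """Build a request asking for the Turtle representation of a URI."""
    return urllib.request.Request(uri, headers={"Accept": "text/turtle"})

req = turtle_request("http://www.worldcat.org/oclc/41266045")
print(req.get_header("Accept"))  # prints: text/turtle
# body = urllib.request.urlopen(req).read()  # actual fetch, not run here
```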
Re: [CODE4LIB] WorldCat Implements Content-Negotiation for Linked Data
On 04/06/13 11:18, Karen Coyle wrote:

Ta da! That did it, Kyle. Why on earth do we call them smart quotes?!

Because they look damn sexy when printed on pulp-of-murdered-tree, which we all know is the authoritative form of any communication.

cheers
stuart

--
Stuart Yeates
Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] WorldCat Implements Content-Negotiation for Linked Data
Those are smart words! Can I quote them? :P

Regards,
Ben

On 4-6-2013 1:40, stuart yeates wrote:

On 04/06/13 11:18, Karen Coyle wrote: Ta da! That did it, Kyle. Why on earth do we call them smart quotes?!

Because they look damn sexy when printed on pulp-of-murdered-tree, which we all know is the authoritative form of any communication. cheers stuart
Re: [CODE4LIB] Worldcat schema.org search API
Karen,

Your output looks like it comes from the old 2007 RDFa 1.0 parser:

http://www.w3.org/2007/08/pyRdfa/extract?uri=http%3A%2F%2Fwww.worldcat.org%2Foclc%2F527725&format=pretty-xml&warnings=false&parser=lax&space-preserve=true

The new 2012 RDFa 1.1 parser does a better job:

http://www.w3.org/2012/pyRdfa/extract?uri=http%3A%2F%2Fwww.worldcat.org%2Foclc%2F527725&format=xml&rdfagraph=output&vocab_expansion=false&rdfa_lite=false&embedded_rdf=true&space_preserve=true&vocab_cache=true&vocab_cache_report=false&vocab_cache_refresh=false

Note the comment on the old interface page, http://www.w3.org/2007/08/pyRdfa/ : "Users are advised to migrate to RDFa 1.1 in general, including the RDFa 1.1 distiller." RDFa 1.1 is still pretty new and getting more tools to support it will help.

Jeff

-----Original Message-----
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Karen Coyle
Sent: Thursday, July 12, 2012 6:16 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Worldcat schema.org search API

Ross, it might not be yahoo, but that doesn't mean I know what it is. The pyRDFa utility returns garbage for RDF/XML and TTL, but not for JSON. It's only in the JSON output that I am getting any bibliographic data. The other two send me back a bunch of links to css files. I guess this is good news for folks who prefer JSON.

Also, I see the OCLC number in the JSON, but not the URI, although the URI appears in the div with the RDFa:

<div itemid="http://www.worldcat.org/oclc/527725" itemscope itemtype="http://schema.org/Book" resource="http://www.worldcat.org/oclc/527725" typeof="http://schema.org/Book"><a href="http://www.worldcat.org/oclc/527725">http://www.worldcat.org/oclc/527725</a>

I must say I wonder a bit about those double but what do I know?
Anyway, here's what I get from pyRDFa:

RDF/XML:

<rdf:RDF>
  <_4:Book rdf:about="http://schema.org/Book"/>
  <rdf:Description rdf:about="http://www.worldcat.org/title/selection-of-early-statistical-papers-of-j-neyman/oclc/527725">
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/css/loginpopup.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/html/masthead.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/css/alerts.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/css/modals_jquery.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/html/layered_divs.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/cssj/N245213502/bundles/print-min.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/css/cr_print.css"/>
    <xhv:stylesheet rdf:resource="http://static.weread.com/css/booksiread/relbookswidget.css?0:5"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/css/itemformat.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/cssj/N1807112156/bundles/screen-min.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/html/record.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/html/yui/build/reset-fonts-grids/reset-fonts-grids.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/html/new_wcorg.css"/>
  </rdf:Description>
</rdf:RDF>

JSON:

{
  "@context": {
    "library": "http://purl.org/library/",
    "oclc": "http://www.worldcat.org/oclc/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "madsrdf": "http://www.loc.gov/mads/rdf/v1#",
    "schema": "http://schema.org/",
    "http://purl.org/library/placeOfPublication": { "@type": "@id" },
    "http://schema.org/about": { "@type": "@id" },
    "http://schema.org/publisher": { "@type": "@id" },
    "http://schema.org/author": { "@type": "@id" },
    "http://www.w3.org/2004/02/skos/core#inScheme": { "@type": "@id" },
    "http://www.loc.gov/mads/rdf/v1#isIdentifiedByAuthority": { "@type": "@id" }
  },
  "@id": "oclc:527725",
  "@type": "schema:Book",
  "schema:inLanguage": { "@value": "en", "@language": "en" },
  "library:holdingsCount": { "@value": "285", "@language": "en" },
  "schema:author": {
    "@id": "http://viaf.org/viaf/24666861",
    "@type": "schema:Person",
    "madsrdf:isIdentifiedByAuthority": "http://id.loc.gov/authorities/names/n50066374",
    "schema:name": { "@value": "Neyman, Jerzy, 1894-1981.", "@language": "en" }
  },
  "schema:name": { "@value": "A selection of early statistical papers of J. Neyman.", "@language": "en" },
  "schema:datePublished": { "@value": "1967.", "@language": "en" },
  "schema:numberOfPages": { "@value": "429", "@language": "en" },
  "library:oclcnum": { "@value": "527725", "@language": "en" },
  "schema:about": [
    {
      "@type": "skos:Concept",
      "madsrdf:isIdentifiedByAuthority": "http://id.loc.gov/authorities/subjects/sh85082133",
      "schema:name": { "@value": "Mathematical statistics.", "@language": "en" }
    },
    {
      "@id": "http://dewey.info/class/519/",
      "@type": "skos:Concept",
      "skos:inScheme": "http://dewey.info/scheme/"
    },
    {
      "@type": "skos:Concept",
      "schema:name": { "@value": "Statistique mathématique.", "@language": "en" }
    },
    { "@id": "http
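Karen's point that the JSON output is the usable one holds up in practice: once JSON-LD like the above is in hand, the bibliographic literals can be pulled out with nothing but the json module. The document below is a pared-down subset of the JSON shown above, inlined so the sketch is self-contained.

```python
# Unwrap JSON-LD value objects ({"@value": ..., "@language": ...}) into
# plain strings, using a small subset of the WorldCat JSON shown above.
import json

doc = json.loads("""
{
  "@id": "oclc:527725",
  "@type": "schema:Book",
  "schema:name": { "@value": "A selection of early statistical papers of J. Neyman.", "@language": "en" },
  "library:oclcnum": { "@value": "527725", "@language": "en" }
}
""")

def literal(node):
    """Return the @value of a JSON-LD value object, or the node itself."""
    if isinstance(node, dict) and "@value" in node:
        return node["@value"]
    return node

title = literal(doc["schema:name"])
oclcnum = literal(doc["library:oclcnum"])
print(oclcnum, "-", title)
```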
Re: [CODE4LIB] Worldcat schema.org search API
AHA! Thank you Jeff. I will re-bookmark and try again.

kc

On 7/13/12 6:31 AM, Young,Jeff (OR) wrote:

Karen, Your output looks like it comes from the old 2007 RDFa 1.0 parser:

http://www.w3.org/2007/08/pyRdfa/extract?uri=http%3A%2F%2Fwww.worldcat.org%2Foclc%2F527725&format=pretty-xml&warnings=false&parser=lax&space-preserve=true

The new 2012 RDFa 1.1 parser does a better job:

http://www.w3.org/2012/pyRdfa/extract?uri=http%3A%2F%2Fwww.worldcat.org%2Foclc%2F527725&format=xml&rdfagraph=output&vocab_expansion=false&rdfa_lite=false&embedded_rdf=true&space_preserve=true&vocab_cache=true&vocab_cache_report=false&vocab_cache_refresh=false

Note the comment on the old interface page, http://www.w3.org/2007/08/pyRdfa/ : "Users are advised to migrate to RDFa 1.1 in general, including the RDFa 1.1 distiller." RDFa 1.1 is still pretty new and getting more tools to support it will help.

Jeff
Re: [CODE4LIB] Worldcat schema.org search API
On 7/10/12 5:07 PM, Karen Coyle wrote:

On 7/10/12 4:02 PM, Richard Wallis wrote:

But is it available to everyone, and is the data retrieved also usable as ODC-BY by any member of the Web public?

Yes it is, and at this stage it is only available from within an html page.

The "it" I was referring to was the API. Roy is telling me that people should use the API, as if that is an obvious option that I am overlooking. I am asking if the general web public can use the API to get this data. I believe that should be a yes/no question/answer.

Since no one here from OCLC had the integrity to answer this question, I went ahead and applied for a Worldcat API key, and here is the reply:

* Hello, Thank you for your interest in the WorldCat Search API, however at this time the web service is only available to institutions, primarily libraries, that have a specific relationship with OCLC and then only for work related to that library's services. The specific relationship is explained further here, http://oclc.org/developer/documentation/worldcat-search-api/who-can-use. However, there are other OCLC services that are available for individuals' non-commercial use. Looking at the list of services available on http://www.worldcat.org/wcpa/content/affiliate/ you'll see that the WorldCat search box and WorldCat links with embedded searches are available to anyone. You may also be interested in checking out the WorldCat Registry, or low-volume use of the xISBN and xISSN services. If you have questions about the service, please contact the product manager, Dawn Hendricks at hendr...@oclc.org. *

There is nothing wrong with having a proprietary API; but pretending that it isn't (either directly or through omission), or being afraid to say it, is the kind of thing that has caused me to lose respect for OCLC. Nothing should be declared open that isn't available to all, not just members. And advertisements for WC API classes should state "members only". That would be honest.

And telling folks on a wide-open list that they should use the Worldcat API (without mentioning if you are in a member institution and using this for library services) is at best deceiving, at worst dishonest. I, for one, am tired of OCLC's lies, and I'm not afraid to say it. Fortunately for me, retirement is looming and I don't need to care who likes what I say. This is a relief, to say the least.

kc

This experiment is the first step in a process to make linked data about WorldCat resources available. As it will evolve over time, other areas such as API access, content-negotiation, search and other query methods, additional RDF data vocabularies, etc., etc., will be considered in concert with community feedback (such as this thread) as to the way forward. Karen, I know you are eager to work with and demonstrate the benefits of this way of publishing data. But these things take time and effort, so please be a little patient, and keep firing off these use cases and issues; they are all valuable input. ~Richard.

kc

Roy

On Tue, Jul 10, 2012 at 2:08 PM, Kevin Ford k...@3windmills.com wrote:

The use case clarifies perfectly. Totally feasible. Well, I should say totally feasible with the caveat that I've never used the Worldcat Search API. Not letting that stop me, so long as it is what I imagine it is, then a developer should be able to perform a search, retrieve the response, and, by integrating one of the tools advertised on the schema.org website into his/her code, then retrieve the microdata for each resource returned from the search (and save it as RDF or whatever). If someone has created something like this, do speak up.

Yours, Kevin

On 07/10/2012 04:48 PM, Karen Coyle wrote:

Kevin, if you misunderstand then I undoubtedly haven't been clear (let's at least share the confusion :-)). Here's the use case: PersonA wants to create a comprehensive bibliography of works by AuthorB. The goal is to do a search on AuthorB in WorldCat and extract the RDFa data from those pages in order to populate the bibliography. Apart from all of the issues of getting a perfect match on authors and of manifestation duplicates (there would need to be editing of the results after retrieval at the user's end), how feasible is this? Assume that the author is prolific enough that one wouldn't want to look up all of the records by hand.

kc

On 7/10/12 1:43 PM, Kevin Ford wrote:

As for someone who might want to do this programmatically, he/she should take a look at the Programming languages section of the second link I sent along: http://schema.rdfs.org/tools.html There one can find Ruby, Python, and Java extractors and parsers capable of outputting RDF. A developer can take one of these and programmatically get at the data. Apologies if I am misunderstanding your intent.

Yours, Kevin

On 07/10/2012 04:34
Re: [CODE4LIB] Worldcat schema.org search API
Well, I got the same email today when I apparently clicked on the wrong link (in the wrong account) while looking for my existing WC Basic API WSKEY (seriously, OCLC, the developer site is *terrible* with regards to usability).

That said, here are the steps to get a WC Basic API WSKEY:

Log in (or create an account) here: https://worldcat.org/config/SignIn.do

On the left should be a menu that reads:
WorldCat Registry
WorldCat Basic API Key
Find A Library API Key
Web Service Keys

Click on WorldCat Basic API Key, then Request a WorldCat Basic API Key. Then you should be able to use the Basic API (which will return results in RSS or Atom). From the search results, you can follow the links to the Worldcat pages and grab either the schema.org microdata or RDFa (or both, obviously).

-Ross.

On Thu, Jul 12, 2012 at 12:33 PM, Karen Coyle li...@kcoyle.net wrote:
Re: [CODE4LIB] Worldcat schema.org search API
Karen,

Unfortunately it looks like you requested a key for the WorldCat Search API, which does have specific eligibility criteria. The WorldCat Basic API which Ross mentions is available to anyone - http://www.oclc.org/developer/services/worldcat-basic-api

It allows you to do an OpenSearch keyword query of WorldCat and get back basic metadata, including the link to the worldcat.org page for each record returned. The easiest way to get a key is to go to http://worldcat.org/config/ and log in with a WorldCat username/password. You should see a link that says WorldCat Basic API Key which you can use to get a key.

I apologize for the confusion between the two APIs (WorldCat Search and WorldCat Basic). The difference is something we've tried to make clearer in our documentation, but unfortunately given your experience it is still an issue.

Karen

On Thu, Jul 12, 2012 at 11:33 AM, Karen Coyle li...@kcoyle.net wrote:
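Since the Basic API is an OpenSearch service, a query is just a keyed URL. The sketch below builds one; the endpoint path and parameter names follow the common OpenSearch convention but are assumptions here, and the WSKey is a placeholder, so consult the Basic API documentation linked above for the real request format.

```python
# Sketch: build an OpenSearch keyword-query URL for the WorldCat Basic
# API. Endpoint path and parameter names are assumptions; the WSKey is
# a placeholder obtained via the steps Ross describes.
from urllib.parse import urlencode

def basic_search_url(query, wskey, fmt="atom"):
    """Return a search URL requesting Atom (or RSS) results."""
    base = "http://www.worldcat.org/webservices/catalog/search/opensearch"
    return base + "?" + urlencode({"q": query, "format": fmt, "wskey": wskey})

url = basic_search_url('au:"Neyman, Jerzy"', "YOUR_WSKEY")
print(url)
```

The Atom/RSS entries returned by such a query carry links to the worldcat.org pages, from which the schema.org microdata or RDFa can then be scraped, matching the bibliography use case discussed earlier in the thread.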
Re: [CODE4LIB] Worldcat schema.org search API
It isn't unfortunate, it was deliberate. I have a key for the Basic API, but I was being advised that I had overlooked the obvious answer of the WorldCat Search API. I have no confusion between the two, except for the confusion that seems to be promulgated by OCLC itself.

kc

On 7/12/12 9:46 AM, Karen Coombs wrote:

Karen, Unfortunately it looks like you requested a key for the WorldCat Search API which does have specific eligibility criteria. The WorldCat Basic API which Ross mentions is available to anyone - http://www.oclc.org/developer/services/worldcat-basic-api It allows you to do an OpenSearch keyword query of WorldCat and get back basic metadata including the link to the worldcat.org page for each record returned. The easiest way to get a key is to go to http://worldcat.org/config/ and login with a WorldCat username/password. You should see a link that says WorldCat Basic API Key which you can use to get a key. I apologize for the confusion between the two APIs (WorldCat Search and WorldCat Basic). The difference is something we've tried to make clearer in our documentation but unfortunately given your experience it is still an issue.

Karen

On Thu, Jul 12, 2012 at 11:33 AM, Karen Coyle li...@kcoyle.net wrote:

On 7/10/12 5:07 PM, Karen Coyle wrote: On 7/10/12 4:02 PM, Richard Wallis wrote: But is it available to everyone, and is the data retrieved also usable as ODC-BY by any member of the Web public? Yes it is, and at this stage it is only available from within an html page. The "it" I was referring to was the API. Roy is telling me that people should use the API, as if that is an obvious option that I am overlooking. I am asking if the general web public can use the API to get this data. I believe that should be a yes/no question/answer.
Since no one here from OCLC had the integrity to answer this question, I went ahead and applied for a Worldcat API key, and here is the reply: * Hello, Thank you for your interest in the WorldCat Search API, however at this time the web service is only available to institutions, primarily libraries, that have a specific relationship with OCLC and then only for work related to that library's services. The specific relationship is explained further here, http://oclc.org/developer/documentation/worldcat-search-api/who-can-use. However, there are other OCLC services that are available for individuals' non-commercial use. Looking at the list of services available on http://www.worldcat.org/wcpa/content/affiliate/ you'll see that the WorldCat search box and WorldCat links with embedded searches are available to anyone. You may also be interested in checking out the WorldCat Registry, or low-volume use of the xISBN and xISSN services. If you have questions about the service, please contact the product manager, Dawn Hendricks at hendr...@oclc.org. * There is nothing wrong with having a proprietary API; but pretending that it isn't (either directly or through omission), or being afraid to say it, is the kind of thing that has caused me to lose respect for OCLC. Nothing should be declared open that isn't available to all, not just members. And advertisements for WC API classes should state members only. That would be honest. And telling folks on a wide-open list that they should use the Worldcat API (without mentioning if you are in a member institution and using this for library services) is at best deceiving, at worst dishonest. I, for one, am tired of OCLC's lies, and I'm not afraid to say it. Fortunately for me, retirement is looming and I don't need to care who likes what I say. This is a relief, to say the least. kc This experiment is the first step in a process to make linked data about WorldCat resources available. 
As it will evolve over time other areas such as API access, content-negotiation, search other query methods, additional RDF data vocabularies, etc., etc., will be considered in concert with community feedback (such as this thread) as to the way forward. Karen I know you are eager to work with and demonstrate the benefits of this way of publishing data. But these things take time and effort, so please be a little patient, and keep firing off these use cases and issues they are all valuable input. ~Richard. kc
Re: [CODE4LIB] Worldcat schema.org search API
Ok, the Pipe didn't quite work as planned. Yahoo! is stripping out all of the relevant html attributes when it's converting the WC microdata html to a string, which renders the whole thing useless. If I don't convert it to a string, it maintains all of the necessary attributes in the JSON output, but it strips them from the RSS and html outputs. I mean, it's hard to complain about a free thing not handling my niche problem, but when has that ever stopped me? Anyway, it's there for somebody to clone and poke around with. Maybe somebody more familiar with Pipes can figure a way around this problem. -Ross. On Thu, Jul 12, 2012 at 3:03 PM, Ross Singer rossfsin...@gmail.com wrote: I made a Yahoo Pipe that merges the WorldCat Basic OpenSearch RSS result with the microdata div in the Worldcat pages referred to in the search results: http://pipes.yahoo.com/pipes/pipe.info?_id=05ae2a7bc180f3abe36b11bcaf1adc52 You'll need to enter your wskey for it to work. You can get the output as RSS (which will require the item/description to be unescaped to use) or JSON (which wouldn't require unescaping). It's not terribly fast, but it at least should help somebody get started. -Ross.
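The first half of the Pipe's job — pulling the record links out of a WorldCat Basic OpenSearch RSS response — can be sketched in a few lines of standard-library Python. The function name and the canned feed below are illustrative, not the API's actual output; a live feed would come from the Basic API's OpenSearch endpoint with your own wskey.

```python
# Sketch: extract worldcat.org record links from an OpenSearch-style
# RSS 2.0 response. The feed layout (<item><link> elements) is an
# assumption; check the Basic API docs for the real response shape.
import xml.etree.ElementTree as ET

def record_links(rss_text):
    """Return the <link> URL of every <item> in an RSS 2.0 feed."""
    root = ET.fromstring(rss_text)
    return [item.findtext("link") for item in root.iter("item")]

# Stand-in feed; a real one would be fetched from the Basic API
# (OpenSearch query plus wskey parameter, per its documentation).
sample = """<rss version="2.0"><channel>
  <item><title>Example work</title>
    <link>http://worldcat.org/oclc/12345</link></item>
  <item><title>Another work</title>
    <link>http://worldcat.org/oclc/67890</link></item>
</channel></rss>"""

print(record_links(sample))
```

From there, each link can be fetched and its embedded microdata handed to any of the extractors discussed in this thread.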
Re: [CODE4LIB] Worldcat schema.org search API
That only returns a short citation but nothing says how short that citation is, nor if it is formatted. I assume that citation means citation format, which isn't useful. kc On 7/10/12 7:32 PM, Ross Singer wrote: Worldcat does have the basic API, which is more open (assuming your situation qualifies). At any rate, it's free and open to (non-commercial) non-subscribers. http://oclc.org/developer/documentation/worldcat-basic-api/using-api Searching isn't terribly sophisticated, but might suit your need. And the schema.org data will be much richer than what you'd normally get back from the Basic API. -Ross. -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [CODE4LIB] Worldcat schema.org search API
Every entry has a link href=http://worldcat.org/oclc/{oclcnumber}/ that will take you to the schema.org data. -Ross. On Wed, Jul 11, 2012 at 9:08 AM, Karen Coyle li...@kcoyle.net wrote: That only returns a short citation but nothing says how short that citation is, nor if it is formatted. I assume that citation means citation format, which isn't useful. kc
Re: [CODE4LIB] Worldcat schema.org search API
Also, my colleague wishes me to point out that the email address and phone number of any OCLC staff member is only two clicks away from our home page. Go to Contact us which is an option along the top on every page, then Contact OCLC Staff which is in the sidebar, as well as a link on the page to search for a specific person. Roy On Tue, Jul 10, 2012 at 11:42 AM, Karen Coyle li...@kcoyle.net wrote: I have demonstrated the schema.org/RDFa microdata in the WC database to various folks and the question always is: how do I get access to this? (The only source I have is the Facebook API, me being a user rather than a maker.) The microdata is CC-BY once you get a Worldcat URI, but is there an open search to get one to the desired records in the first place? I'm poorly-versed in WC APIs so I'm hoping others have a better grasp. @rjw: the OCLC website does a thorough job of hiding email addresses or I would have asked this directly. Then again, a discussion here could have added value. Thanks, kc -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [CODE4LIB] Worldcat schema.org search API
Hi Karen At this stage there is no specific api as such to get at the embedded RDFa data in WorldCat - you can use the normal UI of WorldCat itself or one of the WorldCat Search API options such as OpenSearch (http://oclc.org/developer/documentation/worldcat-search-api/opensearch). This experimental first step at exposing WorldCat data as linked data will evolve. As more development and discussion guides us, more data and ways to get at it will appear. You can get at the raw RDF from the embedded RDFa in a couple of ways. The W3C RDFa 1.1 Distiller (http://www.w3.org/2012/pyRdfa/) is one. Another is using the ARC2 PHP Library (https://github.com/semsol/arc2/wiki/Getting-started-with-ARC2) for those that want to write some simple code. Bruce Washburn has published a post (http://www.oclc.org/developer/news/linked-data-now-worldcat-facebook-app) sharing how he used ARC2 in the enhanced WorldCat Facebook App to extract the RDF from WorldCat and process it to link on and use the same technique on Viaf and FAST. He includes code snippets and a link to the full source for those that are interested. A minor point on licensing - the linked data is licensed under ODC-BY (http://opendatacommons.org/licenses/by/), not CC-BY. ODC-BY is a data oriented license, as against CC which is more creative work oriented. Sorry my email was hard to find - it is richard.wal...@oclc.org. Also if you have questions or comments about OCLC linked data formatting or publishing you can drop an email to d...@oclc.org. ~Richard. On 10 July 2012 19:42, Karen Coyle li...@kcoyle.net wrote: I have demonstrated the schema.org/RDFa microdata in the WC database to various folks and the question always is: how do I get access to this? (The only source I have is the Facebook API, me being a user rather than a maker.) The microdata is CC-BY once you get a Worldcat URI, but is there an open search to get one to the desired records in the first place? 
I'm poorly-versed in WC APIs so I'm hoping others have a better grasp. @rjw: the OCLC website does a thorough job of hiding email addresses or I would have asked this directly. Then again, a discussion here could have added value. Thanks, kc -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet -- Richard Wallis Founder, Data Liberate http://dataliberate.com Tel: +44 (0)7767 886 005 Linkedin: http://www.linkedin.com/in/richardwallis Skype: richard.wallis1 Twitter: @rjw IM: rjw3...@hotmail.com
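For a quick look at the embedded data without pulling in ARC2 or the W3C distiller, the same extraction can be roughed out with only Python's standard library: walk the HTML and collect RDFa Lite property/value pairs. This is a deliberate simplification — a real RDFa processor handles nesting, typeof, prefixes, and datatypes — and the class name and sample markup are illustrative, not WorldCat's actual page structure.

```python
# Rough sketch of collecting schema.org property/value pairs from
# RDFa Lite (or microdata itemprop) markup, stdlib only. A real
# parser (ARC2, pyRdfa, rdflib) does far more.
from html.parser import HTMLParser

class PropertyCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.pairs = []   # (property, value) tuples found so far
        self._open = []   # properties whose value is text content

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        prop = a.get("property") or a.get("itemprop")
        if not prop:
            return
        # Value may live in an attribute (meta/link tags) or in text.
        for value_attr in ("content", "href", "src"):
            if value_attr in a:
                self.pairs.append((prop, a[value_attr]))
                return
        self._open.append(prop)

    def handle_data(self, data):
        if self._open and data.strip():
            self.pairs.append((self._open.pop(), data.strip()))

def extract_properties(html_text):
    collector = PropertyCollector()
    collector.feed(html_text)
    return collector.pairs

sample = ('<div vocab="http://schema.org/" typeof="Book">'
          '<span property="name">Example Title</span>'
          '<a property="author" href="http://example.org/person/1">'
          'A. Writer</a></div>')
print(extract_properties(sample))
```

The pairs could then be serialized to any RDF notation you like, with the page's own URI as subject.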
Re: [CODE4LIB] Worldcat schema.org search API
Thanks, Roy. I obviously never got there, but will visit in the future. kc On 7/10/12 12:57 PM, Roy Tennant wrote: Also, my colleague wishes me to point out that the email address and phone number of any OCLC staff member is only two clicks away from our home page. Go to Contact us which is an option along the top on every page, then Contact OCLC Staff which is in the sidebar, as well as a link on the page to search for a specific person. Roy -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [CODE4LIB] Worldcat schema.org search API
As for someone who might want to do this programmatically, he/she should take a look at the Programming languages section of the second link I sent along: http://schema.rdfs.org/tools.html There one can find Ruby, Python, and Java extractors and parsers capable of outputting RDF. A developer can take one of these and programmatically get at the data. Apologies if I am misunderstanding your intent. Yours, Kevin On 07/10/2012 04:34 PM, Karen Coyle wrote: Thanks, Kevin! And Richard! I'm thinking we need a good web site with links to tools. I had already been introduced to http://www.w3.org/2012/pyRdfa/ where you can paste a URI and get ttl or rdf/xml. These are all good resources. But what about someone who wants to do this programmatically, not through a web site? Richard's message indicates that this isn't yet available, so perhaps we should be gathering use cases to support the need? And have a place to post various solutions, even ones that are not OCLC-specific? (Because I am hoping that the use of microformats will increase in general.) kc On 7/10/12 12:12 PM, Kevin Ford wrote: is there an open search to get one to the desired records in the first place? -- I'm not certain this will fully address your question, but try these two sites: Website: http://www.google.com/webmasters/tools/richsnippets Example: http://tinyurl.com/dx3h5bg Website: http://linter.structured-data.org/ Example: http://tinyurl.com/bmm8bbc These sites will extract the data, but I don't think you get your choice of serialization. The data are extracted and displayed on the resulting page in the HTML, but at least you can *see* the data. Additionally, there are a number of tools to help with microdata extraction here: http://schema.rdfs.org/tools.html Some of these will allow you to output specific (RDF) serializations. 
HTH, Kevin On 07/10/2012 02:42 PM, Karen Coyle wrote: I have demonstrated the schema.org/RDFa microdata in the WC database to various folks and the question always is: how do I get access to this? (The only source I have is the Facebook API, me being a user rather than a maker.) The microdata is CC-BY once you get a Worldcat URI, but is there an open search to get one to the desired records in the first place? I'm poorly-versed in WC APIs so I'm hoping others have a better grasp. @rjw: the OCLC website does a thorough job of hiding email addresses or I would have asked this directly. Then again, a discussion here could have added value. Thanks, kc
Re: [CODE4LIB] Worldcat schema.org search API
Kevin, if you misunderstand then I undoubtedly haven't been clear (let's at least share the confusion :-)). Here's the use case: PersonA wants to create a comprehensive bibliography of works by AuthorB. The goal is to do a search on AuthorB in WorldCat and extract the RDFa data from those pages in order to populate the bibliography. Apart from all of the issues of getting a perfect match on authors and of manifestation duplicates (there would need to be editing of the results after retrieval at the user's end), how feasible is this? Assume that the author is prolific enough that one wouldn't want to look up all of the records by hand. kc On 7/10/12 1:43 PM, Kevin Ford wrote: As for someone who might want to do this programmatically, he/she should take a look at the Programming languages section of the second link I sent along: http://schema.rdfs.org/tools.html There one can find Ruby, Python, and Java extractors and parsers capable of outputting RDF. A developer can take one of these and programmatically get at the data. Apologies if I am misunderstanding your intent. Yours, Kevin On 07/10/2012 04:34 PM, Karen Coyle wrote: Thanks, Kevin! And Richard! I'm thinking we need a good web site with links to tools. I had already been introduced to http://www.w3.org/2012/pyRdfa/ where you can paste a URI and get ttl or rdf/xml. These are all good resources. But what about someone who wants to do this programmatically, not through a web site? Richard's message indicates that this isn't yet available, so perhaps we should be gathering use cases to support the need? And have a place to post various solutions, even ones that are not OCLC-specific? (Because I am hoping that the use of microformats will increase in general.) kc On 7/10/12 12:12 PM, Kevin Ford wrote: is there an open search to get one to the desired records in the first place? 
-- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [CODE4LIB] Worldcat schema.org search API
Karen, RDFa and the basic schema.org vocabulary, plus the intention of the proposed library extension, are not OCLC specific - they are generic tools and techniques applicable across many domains. I would therefore avoid library focussed tool sites, which would run the risk of not keeping up with wider developments. RDFa.info seems to be shaping up as a good resource. Schema.org itself also is a good resource. On the point of how to gain the best from linked data, many especially in the library community, immediately look towards search as the default *paradise* for dealing with data. Many of the benefits of linked data emerge not from search, but from identifying relationships and following links. I heard this described the other day as 'facets on steroids' - not entirely accurate, but it conjures up the right kind of image ;-) I am not saying ignore search, far from it, just suggesting that innovation with linked data often comes from what you can do once you have found (often by traditional methods) a thing. ~Richard. On 10 July 2012 21:34, Karen Coyle li...@kcoyle.net wrote: Thanks, Kevin! And Richard! I'm thinking we need a good web site with links to tools. I had already been introduced to http://www.w3.org/2012/pyRdfa/ where you can paste a URI and get ttl or rdf/xml. These are all good resources. But what about someone who wants to do this programmatically, not through a web site? Richard's message indicates that this isn't yet available, so perhaps we should be gathering use cases to support the need? And have a place to post various solutions, even ones that are not OCLC-specific? (Because I am hoping that the use of microformats will increase in general.) kc On 7/10/12 12:12 PM, Kevin Ford wrote: is there an open search to get one to the desired records in the first place? 
-- Richard Wallis Founder, Data Liberate http://dataliberate.com Tel: +44 (0)7767 886 005 Linkedin: http://www.linkedin.com/in/richardwallis Skype: richard.wallis1 Twitter: @rjw IM: rjw3...@hotmail.com
Re: [CODE4LIB] Worldcat schema.org search API
The use case clarifies perfectly. Totally feasible. Well, I should say totally feasible with the caveat that I've never used the Worldcat Search API. Not letting that stop me, so long as it is what I imagine it is, then a developer should be able to perform a search, retrieve the response, and, by integrating one of the tools advertised on the schema.org website into his/her code, then retrieve the microdata for each resource returned from the search (and save it as RDF or whatever). If someone has created something like this, do speak up. Yours, Kevin On 07/10/2012 04:48 PM, Karen Coyle wrote: Kevin, if you misunderstand then I undoubtedly haven't been clear (let's at least share the confusion :-)). Here's the use case: PersonA wants to create a comprehensive bibliography of works by AuthorB. The goal is to do a search on AuthorB in WorldCat and extract the RDFa data from those pages in order to populate the bibliography. Apart from all of the issues of getting a perfect match on authors and of manifestation duplicates (there would need to be editing of the results after retrieval at the user's end), how feasible is this? Assume that the author is prolific enough that one wouldn't want to look up all of the records by hand. kc On 7/10/12 1:43 PM, Kevin Ford wrote: As for someone who might want to do this programmatically, he/she should take a look at the Programming languages section of the second link I sent along: http://schema.rdfs.org/tools.html There one can find Ruby, Python, and Java extractors and parsers capable of outputting RDF. A developer can take one of these and programmatically get at the data. Apologies if I am misunderstanding your intent. Yours, Kevin On 07/10/2012 04:34 PM, Karen Coyle wrote: Thanks, Kevin! And Richard! I'm thinking we need a good web site with links to tools. I had already been introduced to http://www.w3.org/2012/pyRdfa/ where you can paste a URI and get ttl or rdf/xml. These are all good resources.
But what about someone who wants to do this programmatically, not through a web site? Richard's message indicates that this isn't yet available, so perhaps we should be gathering use cases to support the need? And have a place to post various solutions, even ones that are not OCLC-specific? (Because I am hoping that the use of microformats will increase in general.) kc On 7/10/12 12:12 PM, Kevin Ford wrote: is there an open search to get one to the desired records in the first place? -- I'm not certain this will fully address your question, but try these two sites: Website: http://www.google.com/webmasters/tools/richsnippets Example: http://tinyurl.com/dx3h5bg Website: http://linter.structured-data.org/ Example: http://tinyurl.com/bmm8bbc These sites will extract the data, but I don't think you get your choice of serialization. The data are extracted and displayed on the resulting page in the HTML, but at least you can *see* the data. Additionally, there are a number of tools to help with microdata extraction here: http://schema.rdfs.org/tools.html Some of these will allow you to output specific (RDF) serializations. HTH, Kevin On 07/10/2012 02:42 PM, Karen Coyle wrote: I have demonstrated the schema.org/RDFa microdata in the WC database to various folks and the question always is: how do I get access to this? (The only source I have is the Facebook API, me being a user rather than a maker.) The microdata is CC-BY once you get a Worldcat URI, but is there an open search to get one to the desired records in the first place? I'm poorly-versed in WC APIs so I'm hoping others have a better grasp. @rjw: the OCLC website does a thorough job of hiding email addresses or I would have asked this directly. Then again, a discussion here could have added value. Thanks, kc
Re: [CODE4LIB] Worldcat schema.org search API
Uh...what? For the given use case you would be much better off simply using the WorldCat Search API response. Using it only to retrieve an identifier and then going and scraping the Linked Data out of a WorldCat.org page is, at best, redundant. As Richard pointed out, some use cases -- like the one Karen provided -- are not really a good use case for linked data. It's a better use case for an API, which has been available for years. Roy On Tue, Jul 10, 2012 at 2:08 PM, Kevin Ford k...@3windmills.com wrote: The use case clarifies perfectly. Totally feasible. Well, I should say totally feasible with the caveat that I've never used the Worldcat Search API. Not letting that stop me, so long as it is what I imagine it is, then a developer should be able to perform a search, retrieve the response, and, by integrating one of the tools advertised on the schema.org website into his/her code, then retrieve the microdata for each resource returned from the search (and save it as RDF or whatever). If someone has created something like this, do speak up. Yours, Kevin On 07/10/2012 04:48 PM, Karen Coyle wrote: Kevin, if you misunderstand then I undoubtedly haven't been clear (let's at least share the confusion :-)). Here's the use case: PersonA wants to create a comprehensive bibliography of works by AuthorB. The goal is to do a search on AuthorB in WorldCat and extract the RDFa data from those pages in order to populate the bibliography. Apart from all of the issues of getting a perfect match on authors and of manifestation duplicates (there would need to be editing of the results after retrieval at the user's end), how feasible is this? Assume that the author is prolific enough that one wouldn't want to look up all of the records by hand. 
kc On 7/10/12 1:43 PM, Kevin Ford wrote: As for someone who might want to do this programmatically, he/she should take a look at the Programming languages section of the second link I sent along: http://schema.rdfs.org/tools.html There one can find Ruby, Python, and Java extractors and parsers capable of outputting RDF. A developer can take one of these and programmatically get at the data. Apologies if I am misunderstanding your intent. Yours, Kevin On 07/10/2012 04:34 PM, Karen Coyle wrote: Thanks, Kevin! And Richard! I'm thinking we need a good web site with links to tools. I had already been introduced to http://www.w3.org/2012/pyRdfa/ where you can past a URI and get ttl or rdf/xml. These are all good resources. But what about someone who wants to do this programmatically, not through a web site? Richard's message indicates that this isn't yet available, so perhaps we should be gathering use cases to support the need? And have a place to post various solutions, even ones that are not OCLC-specific? (Because I am hoping that the use of microformats will increase in general.) kc On 7/10/12 12:12 PM, Kevin Ford wrote: is there an open search to get one to the desired records in the first place? -- I'm not certain this will fully address your question, but try these two sites: Website: http://www.google.com/webmasters/tools/richsnippets Example: http://tinyurl.com/dx3h5bg Website: http://linter.structured-data.org/ Example: http://tinyurl.com/bmm8bbc These sites will extract the data, but I don't think you get your choice of serialization. The data are extracted and displayed on the resulting page in the HTML, but at least you can *see* the data. Additionally, there are a number of tools to help with microdata extraction here: http://schema.rdfs.org/tools.html Some of these will allow you to output specific (RDF) serializations. 
HTH, Kevin On 07/10/2012 02:42 PM, Karen Coyle wrote: I have demonstrated the schema.org/RDFa microdata in the WC database to various folks and the question always is: how do I get access to this? (The only source I have is the Facebook API, me being a user rather than a maker.) The microdata is CC-BY once you get a Worldcat URI, but is there an open search to get one to the desired records in the first place? I'm poorly-versed in WC APIs so I'm hoping others have a better grasp. @rjw: the OCLC website does a thorough job of hiding email addresses or I would have asked this directly. Then again, a discussion here could have added value. Thanks, kc
Re: [CODE4LIB] Worldcat schema.org search API
Does the worldcat search api return the data as described with the schema.org and OCLC extension vocabularies? The use case mentioned extracting the RDFa data from those pages. Without knowing the answer to the leading question above, the mock solution addressed that condition. If one simply wanted to create a comprehensive bibliography of works by a particular author, then, yes, the search response would suffice. Kevin On 07/10/2012 05:10 PM, Roy Tennant wrote: Uh...what? For the given use case you would be much better off simply using the WorldCat Search API response. Using it only to retrieve an identifier and then going and scraping the Linked Data out of a WorldCat.org page is, at best, redundant. As Richard pointed out, some use cases -- like the one Karen provided -- are not really a good use case for linked data. It's a better use case for an API, which has been available for years. Roy On Tue, Jul 10, 2012 at 2:08 PM, Kevin Ford k...@3windmills.com wrote: The use case clarifies perfectly. Totally feasible. Well, I should say totally feasible with the caveat that I've never used the Worldcat Search API. Not letting that stop me, so long as it is what I imagine it is, then a developer should be able to perform a search, retrieve the response, and, by integrating one of the tools advertised on the schema.org website into his/her code, then retrieve the microdata for each resource returned from the search (and save it as RDF or whatever). If someone has created something like this, do speak up. Yours, Kevin On 07/10/2012 04:48 PM, Karen Coyle wrote: Kevin, if you misunderstand then I undoubtedly haven't been clear (let's at least share the confusion :-)). Here's the use case: PersonA wants to create a comprehensive bibliography of works by AuthorB. The goal is to do a search on AuthorB in WorldCat and extract the RDFa data from those pages in order to populate the bibliography. 
Apart from all of the issues of getting a perfect match on authors and of manifestation duplicates (there would need to be editing of the results after retrieval at the user's end), how feasible is this? Assume that the author is prolific enough that one wouldn't want to look up all of the records by hand. kc On 7/10/12 1:43 PM, Kevin Ford wrote: As for someone who might want to do this programmatically, he/she should take a look at the Programming languages section of the second link I sent along: http://schema.rdfs.org/tools.html There one can find Ruby, Python, and Java extractors and parsers capable of outputting RDF. A developer can take one of these and programmatically get at the data. Apologies if I am misunderstanding your intent. Yours, Kevin On 07/10/2012 04:34 PM, Karen Coyle wrote: Thanks, Kevin! And Richard! I'm thinking we need a good web site with links to tools. I had already been introduced to http://www.w3.org/2012/pyRdfa/ where you can past a URI and get ttl or rdf/xml. These are all good resources. But what about someone who wants to do this programmatically, not through a web site? Richard's message indicates that this isn't yet available, so perhaps we should be gathering use cases to support the need? And have a place to post various solutions, even ones that are not OCLC-specific? (Because I am hoping that the use of microformats will increase in general.) kc On 7/10/12 12:12 PM, Kevin Ford wrote: is there an open search to get one to the desired records in the first place? -- I'm not certain this will fully address your question, but try these two sites: Website: http://www.google.com/webmasters/tools/richsnippets Example: http://tinyurl.com/dx3h5bg Website: http://linter.structured-data.org/ Example: http://tinyurl.com/bmm8bbc These sites will extract the data, but I don't think you get your choice of serialization. The data are extracted and displayed on the resulting page in the HTML, but at least you can *see* the data. 
Additionally, there are a number of tools to help with microdata extraction here: http://schema.rdfs.org/tools.html Some of these will allow you to output specific (RDF) serializations. HTH, Kevin On 07/10/2012 02:42 PM, Karen Coyle wrote: I have demonstrated the schema.org/RDFa microdata in the WC database to various folks and the question always is: how do I get access to this? (The only source I have is the Facebook API, me being a user rather than a maker.) The microdata is CC-BY once you get a Worldcat URI, but is there an open search to get one to the desired records in the first place? I'm poorly-versed in WC APIs so I'm hoping others have a better grasp. @rjw: the OCLC website does a thorough job of hiding email addresses or I would have asked this directly. Then again, a discussion here could have added value. Thanks, kc
Re: [CODE4LIB] Worldcat schema.org search API
I think we have a catch-22 here. You need an OCLC developer license to use WC to discover WC URIs using an application; you need WC URIs (or other URIs that are not very diffuse on the Web) to make use of the OCLC linked data. The OCLC linked data is ODC-BY for anyone wishing to use the data, but, if I'm not mistaken, the APIs are not publicly open to the Web public. Thus the schema.org data is ODC-BY but most applications on the web will have little opportunity to discover the OCLC-specific URIs. So the gatekeeper is the API access, that is, the ability to search WC for URI discovery (e.g. with an author's name). So you can link, but you can't easily discover the linking URIs. I suppose that one could discover publications as linked data using the topical access of LCSH, the VIAF links in Wikipedia, or by going through databases like Open Library, which has some OCLC numbers associated with bibliographic data. All of these are accessible via open APIs, I believe, and are linked to DBpedia. I understand that linking is linking but unless we are developing data for SkyNet, somewhere along the way the user needs to begin with a human-understandable query. Searching and linked data are not in conflict with each other; they give each other mutual support. It only makes sense that URIs will be discovered through searching at some point in the process of access, as applications like Wikipedia illustrate. (As does the Facebook API, which is a search.) I've tried to find a clear statement of who can get access to the OCLC APIs, but I'm afraid that I can't find a page that clarifies that. I guess one is expected to apply for a developer key in order to find out if they qualify. I'll pass that information along. kc On 7/10/12 2:32 PM, Kevin Ford wrote: Does the worldcat search api return the data as described with the schema.org and OCLC extension vocabularies? The use case mentioned extracting the RDFa data from those pages.
Without knowing the answer to the leading question above, the mock solution addressed that condition. If one simply wanted to create a comprehensive bibliography of works by a particular author, then, yes, the search response would suffice. Kevin On 07/10/2012 05:10 PM, Roy Tennant wrote: Uh...what? For the given use case you would be much better off simply using the WorldCat Search API response. Using it only to retrieve an identifier and then going and scraping the Linked Data out of a WorldCat.org page is, at best, redundant. As Richard pointed out, some use cases -- like the one Karen provided -- are not really a good use case for linked data. It's a better use case for an API, which has been available for years. Roy On Tue, Jul 10, 2012 at 2:08 PM, Kevin Ford k...@3windmills.com wrote: The use case clarifies perfectly. Totally feasible. Well, I should say totally feasible with the caveat that I've never used the Worldcat Search API. Not letting that stop me, so long as it is what I imagine it is, then a developer should be able to perform a search, retrieve the response, and, by integrating one of the tools advertised on the schema.org website into his/her code, then retrieve the microdata for each resource returned from the search (and save it as RDF or whatever). If someone has created something like this, do speak up. Yours, Kevin On 07/10/2012 04:48 PM, Karen Coyle wrote: Kevin, if you misunderstand then I undoubtedly haven't been clear (let's at least share the confusion :-)). Here's the use case: PersonA wants to create a comprehensive bibliography of works by AuthorB. The goal is to do a search on AuthorB in WorldCat and extract the RDFa data from those pages in order to populate the bibliography. Apart from all of the issues of getting a perfect match on authors and of manifestation duplicates (there would need to be editing of the results after retrieval at the user's end), how feasible is this? 
Assume that the author is prolific enough that one wouldn't want to look up all of the records by hand. kc On 7/10/12 1:43 PM, Kevin Ford wrote: As for someone who might want to do this programmatically, he/she should take a look at the Programming languages section of the second link I sent along: http://schema.rdfs.org/tools.html There one can find Ruby, Python, and Java extractors and parsers capable of outputting RDF. A developer can take one of these and programmatically get at the data. Apologies if I am misunderstanding your intent. Yours, Kevin On 07/10/2012 04:34 PM, Karen Coyle wrote: Thanks, Kevin! And Richard! I'm thinking we need a good web site with links to tools. I had already been introduced to http://www.w3.org/2012/pyRdfa/ where you can past a URI and get ttl or rdf/xml. These are all good resources. But what about someone who wants to do this programmatically, not through a web site? Richard's message indicates that this isn't yet available, so perhaps we should be gathering use
Re: [CODE4LIB] Worldcat schema.org search API
On 7/10/12 2:10 PM, Roy Tennant wrote: Uh...what? For the given use case you would be much better off simply using the WorldCat Search API response. Using it only to retrieve an identifier and then going and scraping the Linked Data out of a WorldCat.org page is, at best, redundant. I do not consider using linked data to be scraping by any meaning of that term. Machine-actionable data is returned in formats like RDF/XML or ttl or JSON. And I'm curious that linked data is somehow not considered to be usable as data and that microformat data is not considered to be searchable -- in fact, its raison d'etre is search optimization. As Richard pointed out, some use cases -- like the one Karen provided -- are not really a good use case for linked data. It's a better use case for an API, which has been available for years. But is it available to everyone, and is the data retrieved also usable as ODC-BY by any member of the Web public? kc Roy On Tue, Jul 10, 2012 at 2:08 PM, Kevin Ford k...@3windmills.com wrote: The use case clarifies perfectly. Totally feasible. Well, I should say totally feasible with the caveat that I've never used the Worldcat Search API. Not letting that stop me, so long as it is what I imagine it is, then a developer should be able to perform a search, retrieve the response, and, by integrating one of the tools advertised on the schema.org website into his/her code, then retrieve the microdata for each resource returned from the search (and save it as RDF or whatever). If someone has created something like this, do speak up. Yours, Kevin On 07/10/2012 04:48 PM, Karen Coyle wrote: Kevin, if you misunderstand then I undoubtedly haven't been clear (let's at least share the confusion :-)). Here's the use case: PersonA wants to create a comprehensive bibliography of works by AuthorB. The goal is to do a search on AuthorB in WorldCat and extract the RDFa data from those pages in order to populate the bibliography. 
Apart from all of the issues of getting a perfect match on authors and of manifestation duplicates (there would need to be editing of the results after retrieval at the user's end), how feasible is this? Assume that the author is prolific enough that one wouldn't want to look up all of the records by hand. kc On 7/10/12 1:43 PM, Kevin Ford wrote: As for someone who might want to do this programmatically, he/she should take a look at the Programming languages section of the second link I sent along: http://schema.rdfs.org/tools.html There one can find Ruby, Python, and Java extractors and parsers capable of outputting RDF. A developer can take one of these and programmatically get at the data. Apologies if I am misunderstanding your intent. Yours, Kevin On 07/10/2012 04:34 PM, Karen Coyle wrote: Thanks, Kevin! And Richard! I'm thinking we need a good web site with links to tools. I had already been introduced to http://www.w3.org/2012/pyRdfa/ where you can past a URI and get ttl or rdf/xml. These are all good resources. But what about someone who wants to do this programmatically, not through a web site? Richard's message indicates that this isn't yet available, so perhaps we should be gathering use cases to support the need? And have a place to post various solutions, even ones that are not OCLC-specific? (Because I am hoping that the use of microformats will increase in general.) kc On 7/10/12 12:12 PM, Kevin Ford wrote: is there an open search to get one to the desired records in the first place? -- I'm not certain this will fully address your question, but try these two sites: Website: http://www.google.com/webmasters/tools/richsnippets Example: http://tinyurl.com/dx3h5bg Website: http://linter.structured-data.org/ Example: http://tinyurl.com/bmm8bbc These sites will extract the data, but I don't think you get your choice of serialization. The data are extracted and displayed on the resulting page in the HTML, but at least you can *see* the data. 
Additionally, there are a number of tools to help with microdata extraction here: http://schema.rdfs.org/tools.html Some of these will allow you to output specific (RDF) serializations. HTH, Kevin On 07/10/2012 02:42 PM, Karen Coyle wrote: I have demonstrated the schema.org/RDFa microdata in the WC database to various folks and the question always is: how do I get access to this? (The only source I have is the Facebook API, me being a user rather than a maker.) The microdata is CC-BY once you get a Worldcat URI, but is there an open search to get one to the desired records in the first place? I'm poorly-versed in WC APIs so I'm hoping others have a better grasp. @rjw: the OCLC website does a thorough job of hiding email addresses or I would have asked this directly. Then again, a discussion here could have added value. Thanks, kc -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [CODE4LIB] Worldcat schema.org search API
On 10 July 2012 23:13, Karen Coyle li...@kcoyle.net wrote: On 7/10/12 2:10 PM, Roy Tennant wrote: Uh...what? For the given use case you would be much better off simply using the WorldCat Search API response. Using it only to retrieve an identifier and then going and scraping the Linked Data out of a WorldCat.org page is, at best, redundant. I do not consider using linked data to be scraping by any meaning of that term. The tools and code libraries that extract the RDF wrapped in RDFa markup in HTML are doing just that - scraping it out of the page markup. However, because it is embedded in there in a structured form, that process can be considered far more reliable than is normally expected of a scraping process, which is easily upset by visual changes. Machine-actionable data is returned in formats like RDF/XML or ttl or JSON. From the code and tools that interpret the RDFa in the page, yes. And I'm curious that linked data is somehow not considered to be usable as data and that microformat data is not considered to be searchable Of course it is usable as data - I think what Roy was getting at is that you could have satisfied your use case with tools that were available before the embedding of linked data into WorldCat detail pages. -- in fact, its raison d'etre is search optimization. Yes, one of the reasons for embedding structured data and identifiers, as well as text [as Google puts it 'things not strings'] is SEO. I'm sure that the search engines are already using it for that now. However, SEO is not the only reason for linked data - [as a linked data enthusiast] I would suggest that better SEO is a nice side benefit of something much more powerful. <evangelism off> As Richard pointed out, some use cases -- like the one Karen provided -- are not really a good use case for linked data. It's a better use case for an API, which has been available for years.
But is it available to everyone, and is the data retrieved also usable as ODC-BY by any member of the Web public? Yes it is, and at this stage it is only available from within a html page. This experiment is the first step in a process to make linked data about WorldCat resources available. As it will evolve over time, other areas such as API access, content-negotiation, search and other query methods, additional RDF data vocabularies, etc., will be considered in concert with community feedback (such as this thread) as to the way forward. Karen, I know you are eager to work with and demonstrate the benefits of this way of publishing data. But these things take time and effort, so please be a little patient, and keep firing off these use cases and issues; they are all valuable input. ~Richard. kc Roy On Tue, Jul 10, 2012 at 2:08 PM, Kevin Ford k...@3windmills.com wrote: The use case clarifies perfectly. Totally feasible. Well, I should say totally feasible with the caveat that I've never used the Worldcat Search API. Not letting that stop me, so long as it is what I imagine it is, then a developer should be able to perform a search, retrieve the response, and, by integrating one of the tools advertised on the schema.org website into his/her code, then retrieve the microdata for each resource returned from the search (and save it as RDF or whatever). If someone has created something like this, do speak up. Yours, Kevin On 07/10/2012 04:48 PM, Karen Coyle wrote: Kevin, if you misunderstand then I undoubtedly haven't been clear (let's at least share the confusion :-)). Here's the use case: PersonA wants to create a comprehensive bibliography of works by AuthorB. The goal is to do a search on AuthorB in WorldCat and extract the RDFa data from those pages in order to populate the bibliography.
Apart from all of the issues of getting a perfect match on authors and of manifestation duplicates (there would need to be editing of the results after retrieval at the user's end), how feasible is this? Assume that the author is prolific enough that one wouldn't want to look up all of the records by hand. kc On 7/10/12 1:43 PM, Kevin Ford wrote: As for someone who might want to do this programmatically, he/she should take a look at the Programming languages section of the second link I sent along: http://schema.rdfs.org/tools.html There one can find Ruby, Python, and Java extractors and parsers capable of outputting RDF. A developer can take one of these and programmatically get at the data. Apologies if I am misunderstanding your intent. Yours, Kevin On 07/10/2012 04:34 PM, Karen Coyle wrote: Thanks, Kevin! And Richard! I'm thinking we need a good web site with links to tools. I had already been introduced to http://www.w3.org/2012/pyRdfa/ where you can paste a URI and get ttl or rdf/xml. These are all good resources. But what about someone who wants to do this programmatically, not through a
Re: [CODE4LIB] Worldcat schema.org search API
On 7/10/12 4:02 PM, Richard Wallis wrote: But is it available to everyone, and is the data retrieved also usable as ODC-BY by any member of the Web public? Yes it is, and at this stage it is only available from within a html page. The "it" I was referring to was the API. Roy is telling me that people should use the API, as if that is an obvious option that I am overlooking. I am asking if the general web public can use the API to get this data. I believe that should be a yes/no question/answer. kc This experiment is the first step in a process to make linked data about WorldCat resources available. As it will evolve over time other areas such as API access, content-negotiation, search other query methods, additional RDF data vocabularies, etc., etc., will be considered in concert with community feedback (such as this thread) as to the way forward. Karen I know you are eager to work with and demonstrate the benefits of this way of publishing data. But these things take time and effort, so please be a little patient, and keep firing off these use cases and issues they are all valuable input. ~Richard. kc Roy On Tue, Jul 10, 2012 at 2:08 PM, Kevin Ford k...@3windmills.com wrote: The use case clarifies perfectly. Totally feasible. Well, I should say totally feasible with the caveat that I've never used the Worldcat Search API. Not letting that stop me, so long as it is what I imagine it is, then a developer should be able to perform a search, retrieve the response, and, by integrating one of the tools advertised on the schema.org website into his/her code, then retrieve the microdata for each resource returned from the search (and save it as RDF or whatever). If someone has created something like this, do speak up. Yours, Kevin On 07/10/2012 04:48 PM, Karen Coyle wrote: Kevin, if you misunderstand then I undoubtedly haven't been clear (let's at least share the confusion :-)). Here's the use case: PersonA wants to create a comprehensive bibliography of works by AuthorB.
The goal is to do a search on AuthorB in WorldCat and extract the RDFa data from those pages in order to populate the bibliography. Apart from all of the issues of getting a perfect match on authors and of manifestation duplicates (there would need to be editing of the results after retrieval at the user's end), how feasible is this? Assume that the author is prolific enough that one wouldn't want to look up all of the records by hand. kc On 7/10/12 1:43 PM, Kevin Ford wrote: As for someone who might want to do this programmatically, he/she should take a look at the Programming languages section of the second link I sent along: http://schema.rdfs.org/tools.html There one can find Ruby, Python, and Java extractors and parsers capable of outputting RDF. A developer can take one of these and programmatically get at the data. Apologies if I am misunderstanding your intent. Yours, Kevin On 07/10/2012 04:34 PM, Karen Coyle wrote: Thanks, Kevin! And Richard! I'm thinking we need a good web site with links to tools. I had already been introduced to http://www.w3.org/2012/pyRdfa/ where you can paste a URI and get ttl or rdf/xml. These are all good resources. But what about someone who wants to do this programmatically, not through a web site? Richard's message indicates that this isn't yet available, so perhaps we should be gathering use cases to support the need? And have a place to post various solutions, even ones that are not OCLC-specific? (Because I am hoping that the use of microformats will increase in general.) kc On 7/10/12 12:12 PM, Kevin Ford wrote: is there an open search to get one to the desired records in the first place?
-- I'm not certain this will fully address your question, but try these two sites: Website: http://www.google.com/webmasters/tools/richsnippets Example: http://tinyurl.com/dx3h5bg Website: http://linter.structured-data.org/ Example: http://tinyurl.com/bmm8bbc These sites will extract the data, but I don't think you get your choice of serialization. The data are extracted and displayed on the resulting page in the HTML, but at least you can *see* the data. Additionally, there are a number of tools to help with microdata extraction here: http://schema.rdfs.org/tools.html Some of these will allow you to output specific (RDF) serializations. HTH, Kevin On 07/10/2012 02:42 PM, Karen Coyle wrote: I have demonstrated the schema.org/RDFa microdata in the WC database to various folks and the question always is: how do I get access to this? (The only source I have is the Facebook API, me being a user rather than a maker.) The microdata is CC-BY once you get a Worldcat URI, but is there an open search to get one to the desired records in the first place? I'm poorly-versed in WC APIs so I'm
Re: [CODE4LIB] Worldcat schema.org search API
Worldcat does have the Basic API, which is more open (assuming your situation qualifies). At any rate, it's free and open to (non-commercial) non-subscribers. http://oclc.org/developer/documentation/worldcat-basic-api/using-api Searching isn't terribly sophisticated, but might suit your need. And the schema.org data will be much richer than what you'd normally get back from the Basic API. -Ross.

On Tuesday, July 10, 2012, Karen Coyle wrote: On 7/10/12 4:02 PM, Richard Wallis wrote: But is it available to everyone, and is the data retrieved also usable as ODC-BY by any member of the Web public? Yes it is, and at this stage it is only available from within an HTML page. The "it" I was referring to was the API. Roy is telling me that people should use the API, as if that is an obvious option that I am overlooking. I am asking if the general web public can use the API to get this data. I believe that should be a yes/no question/answer. kc

This experiment is the first step in a process to make linked data about WorldCat resources available. As it evolves over time, other areas such as API access, content negotiation, search and other query methods, additional RDF data vocabularies, etc., will be considered in concert with community feedback (such as this thread) as to the way forward. Karen, I know you are eager to work with and demonstrate the benefits of this way of publishing data. But these things take time and effort, so please be a little patient, and keep firing off these use cases and issues; they are all valuable input. ~Richard. kc

Roy On Tue, Jul 10, 2012 at 2:08 PM, Kevin Ford k...@3windmills.com wrote: The use case clarifies perfectly. Totally feasible. Well, I should say totally feasible with the caveat that I've never used the Worldcat Search API.
Not letting that stop me, so long as it is what I imagine it is, then a developer should be able to perform a search, retrieve the response, and, by integrating one of the tools advertised on the schema.org website into his/her code, retrieve the microdata for each resource returned from the search (and save it as RDF or whatever). If someone has created something like this, do speak up. Yours, Kevin

On 07/10/2012 04:48 PM, Karen Coyle wrote: Kevin, if you misunderstand then I undoubtedly haven't been clear (let's at least share the confusion :-)). Here's the use case: PersonA wants to create a comprehensive bibliography of works by AuthorB. The goal is to do a search on AuthorB in WorldCat and extract the RDFa data from those pages in order to populate the bibliography. Apart from all of the issues of getting a perfect match on authors and of manifestation duplicates (there would need to be editing of the results after retrieval at the user's end), how feasible is this? Assume that the author is prolific enough that one wouldn't want to look up all of the records by hand. kc

On 7/10/12 1:43 PM, Kevin Ford wrote: As for someone who might want to do this programmatically, he/she should take a look at the Programming languages section of the second link I sent along: http://schema.rdfs.org/tools.html There one can find Ruby, Python, and Java extractors and parsers capable of outputting RDF. A developer can take one of these and programmatically get at the data. Apologies if I am misunderstanding your intent. Yours, Kevin

On 07/10/2012 04:34 PM, Karen Coyle wrote: Thanks, Kevin! And Richard! I'm thinking we need a good web site with links to tools. I had already been introduced to http://www.w3.org/2012/pyRdfa/ where you can paste a URI and get ttl or rdf/xml. These are all good resources.
But what about someone who wants to do this programmatically, not through a web site? Richard's message indicates that this isn't yet available, so perhaps we should be gathering use cases to support the need? And have a place to post various solutions, even ones that are not OCLC-specific? (Because I am hoping that the use of microformats will increase in general.) kc

On 7/10/12 12:12 PM, Kevin Ford wrote: is there an open search to get one to the desired records in the first place? -- I'm not certain this will fully address your question, but try these two sites: Website: http://www.google.com/webmasters/tools/richsnippets
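[Editor's note] The extraction step Karen and Kevin discuss can be sketched with nothing but Python's standard-library HTML parser. This is only a minimal illustration, not one of the tools listed at http://schema.rdfs.org/tools.html: it picks up schema.org properties carried in content/href attributes only, and the sample HTML below is invented, not real WorldCat markup.

```python
from html.parser import HTMLParser

class MicrodataCollector(HTMLParser):
    """Collect (itemprop, value) pairs from schema.org microdata markup.

    Only handles properties expressed via content/href attributes; a real
    extractor also walks element text content and item nesting.
    """
    def __init__(self):
        super().__init__()
        self.properties = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "itemprop" in a:
            value = a.get("content") or a.get("href")
            if value is not None:
                self.properties.append((a["itemprop"], value))

# Invented sample markup standing in for a WorldCat record page.
html = """
<div itemscope itemtype="http://schema.org/Book">
  <meta itemprop="name" content="Example Title"/>
  <link itemprop="author" href="http://example.org/authorB"/>
</div>
"""

collector = MicrodataCollector()
collector.feed(html)
print(collector.properties)
```

Looping this over the result pages of an author search would give PersonA the raw property/value pairs to post-process into a bibliography, duplicates and all.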
Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set
Arash - you might not want to use a straight dump of worldcat catalog records - at least not without the associated holdings information.* There are a lot of quasi-duplicate records that are sufficiently broken that the worldcat de-duplication algorithm refuses to merge them. These records will usually only be used by a handful of institutions; the better records will tend to have more associated holdings. The holdings count should be used to weight the strength of association between class numbers and features. Also, since classification/categorization is something that is usually considered to be a property of works, rather than manifestations, one might get better results by using Work sets for training. I would suggest, er, contacting Thom Hickey. Simon

* Well, not precisely holdings - you just need the number of distinct institutions with at least one copy. I call them 'hasings'.

On Sat, May 19, 2012 at 8:42 PM, Roy Tennant roytenn...@gmail.com wrote: Arash, Yes, we have made WorldCat available to researchers under a special license agreement. I suggest contacting Thom Hickey hic...@oclc.org about such an arrangement. Thanks, Roy

On Fri, May 18, 2012 at 3:46 AM, Arash.Joorabchi arash.joorab...@ul.ie wrote: Dear Karen, I am conducting a research experiment on automatic text classification and I am trying to retrieve top matching bib records (which include DDC fields) for a set of keyphrases extracted from a given document. So, I suppose this is a rather exceptional use case. In fact, the right approach for this experiment is to process the full dump of the WorldCat database directly rather than sending a limited number of queries via the API. I read here: http://dltj.org/article/worldcat-lld-may-become-available-under-odc-by/ that WorldCat might become available as open linked data in future, which would solve my problem and help similar text mining projects.
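[Editor's note] Simon's weighting suggestion can be sketched in a few lines of Python. The record tuples here are hypothetical stand-ins for what a WorldCat dump joined with per-record counts of holding institutions might yield; the point is only that a broken near-duplicate record with three holders barely registers next to a well-held record.

```python
from collections import defaultdict

# Hypothetical training records: (ddc_class, feature_terms, holding_count).
records = [
    ("006.3", ["machine", "learning"], 250),
    ("006.3", ["machine", "intelligence"], 3),   # near-duplicate, few holders
    ("025.4", ["classification", "libraries"], 120),
]

# Weight each (class, feature) association by the number of distinct
# holding institutions, per Simon's suggestion.
weights = defaultdict(float)
for ddc, features, holdings in records:
    for feature in features:
        weights[(ddc, feature)] += holdings

print(weights[("006.3", "machine")])   # 253.0, dominated by the well-held record
```

Grouping the records into Work sets before counting, as Simon also suggests, would simply mean summing holdings over all manifestations of a work first.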
However, I wonder if it is currently available to researchers under a research/non-commercial use license agreement. Regards, Arash

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Karen Coombs Sent: 17 May 2012 08:37 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

I forwarded this thread to the Product Manager for the WorldCat Search API. She responded back that unfortunately this query is not possible using the API at this time. FYI, the SRU interface to the WorldCat Search API doesn't currently support any scan-type searches either. Is there a particular use case you're trying to support? Knowing that would help us document this as a possible enhancement. Karen

Karen Coombs Senior Product Analyst Web Services OCLC coom...@oclc.org

On Wed, May 16, 2012 at 9:49 PM, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi Andy, I am an SRU newbie myself, so I don't know how this could be achieved using scan operations and could not find much info on the SRU website (http://www.loc.gov/standards/sru/). As for the wildcards, according to this guide: http://www.oclc.org/support/documentation/worldcat/searching/refcard/searchworldcatquickreference.pdf the symbols should be preceded by at least 3 characters, and therefore clauses like:

... AND srw.dd=*
... AND srw.dd=?.*
... AND srw/dd=###.*
... AND srw/dd=?3.*

do not work and result in the following error: Diagnostics Identifier: info:srw/diagnostic/1/9 Meaning: Details: Message: Not enough chars in truncated term: Truncated words too short(9) Thanks, Arash

From: Houghton,Andrew [mailto:hough...@oclc.org] Sent: 16 May 2012 11:58 To: Arash.Joorabchi Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

I'm not an SRU guru, but is it possible to do a scan and look for postings of zero? Andy.
On May 16, 2012, at 6:39, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi Mark, Srw.dd=* does not work either: Identifier: info:srw/diagnostic/1/27 Meaning: Details: srw.dd Message: The index [srw.dd] did not include a searchable value I suppose the only option left is to retrieve everything and filter the results on the client side. Thanks for your quick reply. Arash

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Mike Taylor Sent: 16 May 2012 10:43 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

There is no standard way in CQL to express field X
Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set
Thank you Roy and Simon for the info. As for your second point, I suppose one advantage of using the WorldCat API at this experimental stage is that the returned bib records are already FRBR-ized. Ross - Thanks for the link to the Open Library data dump. The WorldCat collection is 2 orders of magnitude larger than Open Library, which makes a significant difference considering the skewness and sparsity of bib records classified according to library taxonomies, e.g., DDC, LCC (for more info, see: http://cdm15003.contentdm.oclc.org/cdm/singleitem/collection/p267701coll27/id/277/rec/28) Thanks, Arash

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Simon Spero Sent: 22 May 2012 19:47 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

Arash - you might not want to use a straight dump of worldcat catalog records - at least not without the associated holdings information.* There are a lot of quasi-duplicate records that are sufficiently broken that the worldcat de-duplication algorithm refuses to merge them. These records will usually only be used by a handful of institutions; the better records will tend to have more associated holdings. The holdings count should be used to weight the strength of association between class numbers and features. Also, since classification/categorization is something that is usually considered to be a property of works, rather than manifestations, one might get better results by using Work sets for training. I would suggest, er, contacting Thom Hickey. Simon

* Well, not precisely holdings - you just need the number of distinct institutions with at least one copy. I call them 'hasings'.

On Sat, May 19, 2012 at 8:42 PM, Roy Tennant roytenn...@gmail.com wrote: Arash, Yes, we have made WorldCat available to researchers under a special license agreement. I suggest contacting Thom Hickey hic...@oclc.org about such an arrangement.
Thanks, Roy

On Fri, May 18, 2012 at 3:46 AM, Arash.Joorabchi arash.joorab...@ul.ie wrote: Dear Karen, I am conducting a research experiment on automatic text classification and I am trying to retrieve top matching bib records (which include DDC fields) for a set of keyphrases extracted from a given document. So, I suppose this is a rather exceptional use case. In fact, the right approach for this experiment is to process the full dump of the WorldCat database directly rather than sending a limited number of queries via the API. I read here: http://dltj.org/article/worldcat-lld-may-become-available-under-odc-by/ that WorldCat might become available as open linked data in future, which would solve my problem and help similar text mining projects. However, I wonder if it is currently available to researchers under a research/non-commercial use license agreement. Regards, Arash

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Karen Coombs Sent: 17 May 2012 08:37 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

I forwarded this thread to the Product Manager for the WorldCat Search API. She responded back that unfortunately this query is not possible using the API at this time. FYI, the SRU interface to the WorldCat Search API doesn't currently support any scan-type searches either. Is there a particular use case you're trying to support? Knowing that would help us document this as a possible enhancement. Karen

Karen Coombs Senior Product Analyst Web Services OCLC coom...@oclc.org

On Wed, May 16, 2012 at 9:49 PM, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi Andy, I am an SRU newbie myself, so I don't know how this could be achieved using scan operations and could not find much info on the SRU website (http://www.loc.gov/standards/sru/).
As for the wildcards, according to this guide: http://www.oclc.org/support/documentation/worldcat/searching/refcard/searchworldcatquickreference.pdf the symbols should be preceded by at least 3 characters, and therefore clauses like:

... AND srw.dd=*
... AND srw.dd=?.*
... AND srw/dd=###.*
... AND srw/dd=?3.*

do not work and result in the following error: Diagnostics Identifier: info:srw/diagnostic/1/9 Meaning: Details: Message: Not enough chars in truncated term: Truncated words too short(9) Thanks, Arash

From: Houghton,Andrew [mailto:hough...@oclc.org] Sent: 16 May 2012 11:58 To: Arash.Joorabchi Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

I'm not an SRU guru, but is it possible to do a scan and look for postings of zero? Andy. On May
Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set
Arash, Yes, we have made WorldCat available to researchers under a special license agreement. I suggest contacting Thom Hickey hic...@oclc.org about such an arrangement. Thanks, Roy

On Fri, May 18, 2012 at 3:46 AM, Arash.Joorabchi arash.joorab...@ul.ie wrote: Dear Karen, I am conducting a research experiment on automatic text classification and I am trying to retrieve top matching bib records (which include DDC fields) for a set of keyphrases extracted from a given document. So, I suppose this is a rather exceptional use case. In fact, the right approach for this experiment is to process the full dump of the WorldCat database directly rather than sending a limited number of queries via the API. I read here: http://dltj.org/article/worldcat-lld-may-become-available-under-odc-by/ that WorldCat might become available as open linked data in future, which would solve my problem and help similar text mining projects. However, I wonder if it is currently available to researchers under a research/non-commercial use license agreement. Regards, Arash

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Karen Coombs Sent: 17 May 2012 08:37 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

I forwarded this thread to the Product Manager for the WorldCat Search API. She responded back that unfortunately this query is not possible using the API at this time. FYI, the SRU interface to the WorldCat Search API doesn't currently support any scan-type searches either. Is there a particular use case you're trying to support? Knowing that would help us document this as a possible enhancement.
Karen

Karen Coombs Senior Product Analyst Web Services OCLC coom...@oclc.org

On Wed, May 16, 2012 at 9:49 PM, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi Andy, I am an SRU newbie myself, so I don't know how this could be achieved using scan operations and could not find much info on the SRU website (http://www.loc.gov/standards/sru/). As for the wildcards, according to this guide: http://www.oclc.org/support/documentation/worldcat/searching/refcard/searchworldcatquickreference.pdf the symbols should be preceded by at least 3 characters, and therefore clauses like:

... AND srw.dd=*
... AND srw.dd=?.*
... AND srw/dd=###.*
... AND srw/dd=?3.*

do not work and result in the following error: Diagnostics Identifier: info:srw/diagnostic/1/9 Meaning: Details: Message: Not enough chars in truncated term: Truncated words too short(9) Thanks, Arash

From: Houghton,Andrew [mailto:hough...@oclc.org] Sent: 16 May 2012 11:58 To: Arash.Joorabchi Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

I'm not an SRU guru, but is it possible to do a scan and look for postings of zero? Andy.

On May 16, 2012, at 6:39, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi Mark, Srw.dd=* does not work either: Identifier: info:srw/diagnostic/1/27 Meaning: Details: srw.dd Message: The index [srw.dd] did not include a searchable value I suppose the only option left is to retrieve everything and filter the results on the client side. Thanks for your quick reply. Arash

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Mike Taylor Sent: 16 May 2012 10:43 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

There is no standard way in CQL to express "field X is not empty". Depending on implementations, NOT srw.dd= might work (but evidently doesn't in this case).
Another possibility is srw.dd=*, but again that may or may not work, and might be appallingly inefficient if it does. NOT srw.dd=null will definitely not work: null is not a special word in CQL. -- Mike.

On 16 May 2012 10:32, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi all, I am sending SRU queries to WorldCat in the following form:

String host = "http://worldcat.org/webservices/catalog/search/";
String query = "sru?query=srw.kw=\"" + keyword + "\""
    + " AND srw.ln exact \"eng\""
    + " AND srw.mt all \"bks\""
    + " AND srw.nt=\"" + keyword + "\""
    + "&servicelevel=full"
    + "&maximumRecords=100"
    + "&sortKeys=relevance,,0"
Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set
Dear Karen, I am conducting a research experiment on automatic text classification and I am trying to retrieve top matching bib records (which include DDC fields) for a set of keyphrases extracted from a given document. So, I suppose this is a rather exceptional use case. In fact, the right approach for this experiment is to process the full dump of the WorldCat database directly rather than sending a limited number of queries via the API. I read here: http://dltj.org/article/worldcat-lld-may-become-available-under-odc-by/ that WorldCat might become available as open linked data in future, which would solve my problem and help similar text mining projects. However, I wonder if it is currently available to researchers under a research/non-commercial use license agreement. Regards, Arash

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Karen Coombs Sent: 17 May 2012 08:37 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

I forwarded this thread to the Product Manager for the WorldCat Search API. She responded back that unfortunately this query is not possible using the API at this time. FYI, the SRU interface to the WorldCat Search API doesn't currently support any scan-type searches either. Is there a particular use case you're trying to support? Knowing that would help us document this as a possible enhancement. Karen

Karen Coombs Senior Product Analyst Web Services OCLC coom...@oclc.org

On Wed, May 16, 2012 at 9:49 PM, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi Andy, I am an SRU newbie myself, so I don't know how this could be achieved using scan operations and could not find much info on the SRU website (http://www.loc.gov/standards/sru/).
As for the wildcards, according to this guide: http://www.oclc.org/support/documentation/worldcat/searching/refcard/searchworldcatquickreference.pdf the symbols should be preceded by at least 3 characters, and therefore clauses like:

... AND srw.dd=*
... AND srw.dd=?.*
... AND srw/dd=###.*
... AND srw/dd=?3.*

do not work and result in the following error: Diagnostics Identifier: info:srw/diagnostic/1/9 Meaning: Details: Message: Not enough chars in truncated term: Truncated words too short(9) Thanks, Arash

From: Houghton,Andrew [mailto:hough...@oclc.org] Sent: 16 May 2012 11:58 To: Arash.Joorabchi Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

I'm not an SRU guru, but is it possible to do a scan and look for postings of zero? Andy.

On May 16, 2012, at 6:39, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi Mark, Srw.dd=* does not work either: Identifier: info:srw/diagnostic/1/27 Meaning: Details: srw.dd Message: The index [srw.dd] did not include a searchable value I suppose the only option left is to retrieve everything and filter the results on the client side. Thanks for your quick reply. Arash

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Mike Taylor Sent: 16 May 2012 10:43 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

There is no standard way in CQL to express "field X is not empty". Depending on implementations, NOT srw.dd= might work (but evidently doesn't in this case). Another possibility is srw.dd=*, but again that may or may not work, and might be appallingly inefficient if it does. NOT srw.dd=null will definitely not work: null is not a special word in CQL. -- Mike.
On 16 May 2012 10:32, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi all, I am sending SRU queries to WorldCat in the following form:

String host = "http://worldcat.org/webservices/catalog/search/";
String query = "sru?query=srw.kw=\"" + keyword + "\""
    + " AND srw.ln exact \"eng\""
    + " AND srw.mt all \"bks\""
    + " AND srw.nt=\"" + keyword + "\""
    + "&servicelevel=full"
    + "&maximumRecords=100"
    + "&sortKeys=relevance,,0"
    + "&wskey=[wskey]";

And it is working fine; however, I'd like to limit the results to those records that have a DDC number assigned to them, but I don't know the right way to specify this limit in the query
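[Editor's note] The Java snippet above concatenates the CQL clause into the URL without escaping. For comparison, here is a hedged Python sketch of the same request assembly; build_sru_query is an illustrative helper (not part of any OCLC SDK), the parameter names are taken from the snippet, and the [wskey] placeholder is kept from the original posting. Percent-encoding the CQL clause as a whole keeps the embedded quotes and spaces intact.

```python
from urllib.parse import quote

def build_sru_query(keyword, wskey):
    """Assemble a WorldCat SRU search URL equivalent to the Java snippet.

    The CQL clause is percent-encoded as a unit so that the embedded
    double quotes and spaces survive transport.
    """
    host = "http://worldcat.org/webservices/catalog/search/"
    cql = (
        'srw.kw="%s"'
        ' AND srw.ln exact "eng"'
        ' AND srw.mt all "bks"'
        ' AND srw.nt="%s"'
    ) % (keyword, keyword)
    return (
        host + "sru?query=" + quote(cql)
        + "&servicelevel=full"
        + "&maximumRecords=100"
        + "&sortKeys=relevance,,0"
        + "&wskey=" + quote(wskey)
    )

# Build (but do not send) a query; the key placeholder is unchanged.
url = build_sru_query("text classification", "[wskey]")
print(url)
```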
Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set
On May 18, 2012, at 6:46 AM, Arash.Joorabchi wrote: Dear Karen, I am conducting a research experiment on automatic text classification and I am trying to retrieve top matching bib records (which include DDC fields) for a set of keyphrases extracted from a given document. So, I suppose this is a rather exceptional use case. In fact, the right approach for this experiment is to process the full dump of the WorldCat database directly rather than sending a limited number of queries via the API. I read here: http://dltj.org/article/worldcat-lld-may-become-available-under-odc-by/ that WorldCat might become available as open linked data in future, which would solve my problem and help similar text mining projects. However, I wonder if it is currently available to researchers under a research/non-commercial use license agreement.

Why not use Open Library's dataset (which is freely available with no restrictions)? http://openlibrary.org/developers/dumps -Ross.

Regards, Arash

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Karen Coombs Sent: 17 May 2012 08:37 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

I forwarded this thread to the Product Manager for the WorldCat Search API. She responded back that unfortunately this query is not possible using the API at this time. FYI, the SRU interface to the WorldCat Search API doesn't currently support any scan-type searches either. Is there a particular use case you're trying to support? Knowing that would help us document this as a possible enhancement. Karen

Karen Coombs Senior Product Analyst Web Services OCLC coom...@oclc.org

On Wed, May 16, 2012 at 9:49 PM, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi Andy, I am an SRU newbie myself, so I don't know how this could be achieved using scan operations and could not find much info on the SRU website (http://www.loc.gov/standards/sru/).
As for the wildcards, according to this guide: http://www.oclc.org/support/documentation/worldcat/searching/refcard/searchworldcatquickreference.pdf the symbols should be preceded by at least 3 characters, and therefore clauses like:

... AND srw.dd=*
... AND srw.dd=?.*
... AND srw/dd=###.*
... AND srw/dd=?3.*

do not work and result in the following error: Diagnostics Identifier: info:srw/diagnostic/1/9 Meaning: Details: Message: Not enough chars in truncated term: Truncated words too short(9) Thanks, Arash

From: Houghton,Andrew [mailto:hough...@oclc.org] Sent: 16 May 2012 11:58 To: Arash.Joorabchi Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

I'm not an SRU guru, but is it possible to do a scan and look for postings of zero? Andy.

On May 16, 2012, at 6:39, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi Mark, Srw.dd=* does not work either: Identifier: info:srw/diagnostic/1/27 Meaning: Details: srw.dd Message: The index [srw.dd] did not include a searchable value I suppose the only option left is to retrieve everything and filter the results on the client side. Thanks for your quick reply. Arash

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Mike Taylor Sent: 16 May 2012 10:43 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

There is no standard way in CQL to express "field X is not empty". Depending on implementations, NOT srw.dd= might work (but evidently doesn't in this case). Another possibility is srw.dd=*, but again that may or may not work, and might be appallingly inefficient if it does. NOT srw.dd=null will definitely not work: null is not a special word in CQL. -- Mike.
On 16 May 2012 10:32, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi all, I am sending SRU queries to WorldCat in the following form:

String host = "http://worldcat.org/webservices/catalog/search/";
String query = "sru?query=srw.kw=\"" + keyword + "\""
    + " AND srw.ln exact \"eng\""
    + " AND srw.mt all \"bks\""
    + " AND srw.nt=\"" + keyword + "\""
    + "&servicelevel=full"
    + "&maximumRecords=100"
    + "&sortKeys=relevance,,0"
Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set
I forwarded this thread to the Product Manager for the WorldCat Search API. She responded back that unfortunately this query is not possible using the API at this time. FYI, the SRU interface to the WorldCat Search API doesn't currently support any scan-type searches either. Is there a particular use case you're trying to support? Knowing that would help us document this as a possible enhancement. Karen

Karen Coombs Senior Product Analyst Web Services OCLC coom...@oclc.org

On Wed, May 16, 2012 at 9:49 PM, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi Andy, I am an SRU newbie myself, so I don't know how this could be achieved using scan operations and could not find much info on the SRU website (http://www.loc.gov/standards/sru/). As for the wildcards, according to this guide: http://www.oclc.org/support/documentation/worldcat/searching/refcard/searchworldcatquickreference.pdf the symbols should be preceded by at least 3 characters, and therefore clauses like:

... AND srw.dd=*
... AND srw.dd=?.*
... AND srw/dd=###.*
... AND srw/dd=?3.*

do not work and result in the following error: Diagnostics Identifier: info:srw/diagnostic/1/9 Meaning: Details: Message: Not enough chars in truncated term: Truncated words too short(9) Thanks, Arash

From: Houghton,Andrew [mailto:hough...@oclc.org] Sent: 16 May 2012 11:58 To: Arash.Joorabchi Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

I'm not an SRU guru, but is it possible to do a scan and look for postings of zero? Andy.

On May 16, 2012, at 6:39, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi Mark, Srw.dd=* does not work either: Identifier: info:srw/diagnostic/1/27 Meaning: Details: srw.dd Message: The index [srw.dd] did not include a searchable value I suppose the only option left is to retrieve everything and filter the results on the client side. Thanks for your quick reply.
Arash

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Mike Taylor Sent: 16 May 2012 10:43 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

There is no standard way in CQL to express "field X is not empty". Depending on implementations, NOT srw.dd= might work (but evidently doesn't in this case). Another possibility is srw.dd=*, but again that may or may not work, and might be appallingly inefficient if it does. NOT srw.dd=null will definitely not work: null is not a special word in CQL. -- Mike.

On 16 May 2012 10:32, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi all, I am sending SRU queries to WorldCat in the following form:

String host = "http://worldcat.org/webservices/catalog/search/";
String query = "sru?query=srw.kw=\"" + keyword + "\""
    + " AND srw.ln exact \"eng\""
    + " AND srw.mt all \"bks\""
    + " AND srw.nt=\"" + keyword + "\""
    + "&servicelevel=full"
    + "&maximumRecords=100"
    + "&sortKeys=relevance,,0"
    + "&wskey=[wskey]";

And it is working fine; however, I'd like to limit the results to those records that have a DDC number assigned to them, but I don't know the right way to specify this limit in the query.

NOT srw.dd=
NOT srw.dd=null

Neither of the above works. Thanks, Arash

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Chad Benjamin Nelson Sent: 15 May 2012 21:54 To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] Atlanta Digital Libraries meetup - May 23rd

The first / next Atlanta Digital Libraries meetup is coming up soon: Wednesday, May 23rd 7pm Manuel's Tavern http://www.manuelstavern.com/location.php 602 N Highland Avenue Northeast Atlanta, GA 30307 North Avenue Room We have two scheduled talks, and are still looking for others interested in presenting. It's informal, so even if it is just a short topic you want to get some feedback on, we'd love to hear it. So, come along if you are interested and in the area. Chad

Chad Nelson Web Services Programmer University Library Georgia State University e: cnelso...@gsu.edu t: 404 413 2771 My Calendar http://bit.ly/qybPLJ

No virus found in this message. Checked by AVG - www.avg.com Version: 2012.0.2176 / Virus Database: 2425/5001 - Release Date: 05/15/12
Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set
There is no standard way in CQL to express "field X is not empty". Depending on implementations, NOT srw.dd= might work (but evidently doesn't in this case). Another possibility is srw.dd=*, but again that may or may not work, and might be appallingly inefficient if it does. NOT srw.dd=null will definitely not work: null is not a special word in CQL. -- Mike.

On 16 May 2012 10:32, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi all, I am sending SRU queries to WorldCat in the following form:

String host = "http://worldcat.org/webservices/catalog/search/";
String query = "sru?query=srw.kw=\"" + keyword + "\""
    + " AND srw.ln exact \"eng\""
    + " AND srw.mt all \"bks\""
    + " AND srw.nt=\"" + keyword + "\""
    + "&servicelevel=full"
    + "&maximumRecords=100"
    + "&sortKeys=relevance,,0"
    + "&wskey=[wskey]";

And it is working fine; however, I'd like to limit the results to those records that have a DDC number assigned to them, but I don't know the right way to specify this limit in the query.

NOT srw.dd=
NOT srw.dd=null

Neither of the above works. Thanks, Arash

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Chad Benjamin Nelson Sent: 15 May 2012 21:54 To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] Atlanta Digital Libraries meetup - May 23rd

The first / next Atlanta Digital Libraries meetup is coming up soon: Wednesday, May 23rd 7pm Manuel's Tavern http://www.manuelstavern.com/location.php 602 N Highland Avenue Northeast Atlanta, GA 30307 North Avenue Room We have two scheduled talks, and are still looking for others interested in presenting. It's informal, so even if it is just a short topic you want to get some feedback on, we'd love to hear it. So, come along if you are interested and in the area. Chad

Chad Nelson Web Services Programmer University Library Georgia State University e: cnelso...@gsu.edu t: 404 413 2771 My Calendar http://bit.ly/qybPLJ
Checked by AVG - www.avg.com Version: 2012.0.2176 / Virus Database: 2425/5000 - Release Date: 05/15/12
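The query construction in Arash's snippet can be sketched with explicit quoting and URL encoding of the CQL clause. This is a hedged illustration, not OCLC's recommended client: the endpoint path and parameter names are taken from the message above, the class name is invented, and [wskey] remains a placeholder for a real WorldCat API key.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SruQueryBuilder {
    // Build an SRU search URL against the WorldCat Search API.
    // The CQL portion is URL-encoded so spaces and quote characters
    // in the keyword survive transport intact.
    static String buildUrl(String keyword) {
        String cql = "srw.kw=\"" + keyword + "\""
                + " AND srw.ln exact \"eng\""
                + " AND srw.mt all \"bks\"";
        return "http://worldcat.org/webservices/catalog/search/sru"
                + "?query=" + URLEncoder.encode(cql, StandardCharsets.UTF_8)
                + "&servicelevel=full"
                + "&maximumRecords=100"
                + "&sortKeys=relevance,,0"
                + "&wskey=[wskey]"; // placeholder for a real API key
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("information retrieval"));
    }
}
```

Encoding the whole CQL clause in one step also avoids the escaped-backslash bookkeeping in the original concatenation.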
Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set
Hi Mike, srw.dd=* does not work either: Diagnostic Identifier: info:srw/diagnostic/1/27 Details: srw.dd Message: The index [srw.dd] did not include a searchable value. I suppose the only option left is to retrieve everything and filter the results on the client side. Thanks for your quick reply. Arash -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Mike Taylor Sent: 16 May 2012 10:43 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set There is no standard way in CQL to express "field X is not empty". Depending on implementations, NOT srw.dd= might work (but evidently doesn't in this case). Another possibility is srw.dd=*, but again that may or may not work, and might be appallingly inefficient if it does. NOT srw.dd=null will definitely not work: null is not a special word in CQL. -- Mike.
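Filtering on the client side, as Arash suggests, could look something like the sketch below: it parses a response and keeps only records carrying an 082 (Dewey) field. The element and attribute names follow MARCXML conventions, but the sample data and class name are invented for illustration; real SRU responses are namespaced and much larger.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

public class DdcFilter {
    // Invented sample standing in for an SRU response.
    static final String SAMPLE = "<records>"
        + "<record><datafield tag=\"082\"><subfield code=\"a\">006.3</subfield></datafield></record>"
        + "<record><datafield tag=\"245\"><subfield code=\"a\">No DDC here</subfield></datafield></record>"
        + "</records>";

    // Count records that carry a Dewey (082) field; returns -1 on parse errors.
    static int countWithDdc(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList records = doc.getElementsByTagName("record");
            int kept = 0;
            for (int i = 0; i < records.getLength(); i++) {
                NodeList fields = ((Element) records.item(i)).getElementsByTagName("datafield");
                for (int j = 0; j < fields.getLength(); j++) {
                    if ("082".equals(((Element) fields.item(j)).getAttribute("tag"))) {
                        kept++;
                        break;
                    }
                }
            }
            return kept;
        } catch (Exception e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(countWithDdc(SAMPLE) + " record(s) with a DDC number");
    }
}
```

The obvious cost is bandwidth: every record must be fetched before it can be discarded, which is why a server-side limit would have been preferable.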
Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set
Hi Andy, I am an SRU newbie myself, so I don't know how this could be achieved using scan operations, and I could not find much info on the SRU website (http://www.loc.gov/standards/sru/). As for the wildcards, according to this guide: http://www.oclc.org/support/documentation/worldcat/searching/refcard/searchworldcatquickreference.pdf the symbols should be preceded by at least 3 characters, and therefore clauses like: ... AND srw.dd=* ... AND srw.dd=?.* ... AND srw.dd=###.* ... AND srw.dd=?3.* do not work and result in the following error: Diagnostic Identifier: info:srw/diagnostic/1/9 Message: Not enough chars in truncated term: Truncated words too short (9) Thanks, Arash From: Houghton,Andrew [mailto:hough...@oclc.org] Sent: 16 May 2012 11:58 To: Arash.Joorabchi Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set I'm not an SRU guru, but is it possible to do a scan and look for a postings count of zero? Andy.
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
We have been trying to enumerate serials holdings as explicitly as possible. E.g., this microfiche supplement to a journal, http://summit.syr.edu/cgi-bin/Pwebrecon.cgi?BBID=274291 shows apparently missing issues. However, there are two pieces of inferred information here: 1) every print issue had a corresponding microfiche supplement (they didn't, so most of these are complete even with the gaps); 2) that volumes, at least up until 1991, had only 26 issues (that is probably true, but it is not certain), and there is no way to be certain how many issues per volume were published from 1992 on (28? 52?) v.95:no.3 (1973)-v.95:no.8 (1973) v.95:no.10 (1973)-v.95:no.26 (1973) v.96 (1974)-v.97 (1975) v.98:no.1 (1976)-v.98:no.14 (1976) v.98:no.16 (1976)-v.98:no.26 (1976) v.99:no.1 (1977)-v.99:no.25 (1977) v.100 (1978)-v.108 (1986) v.109:no.1 (1987)-v.109:no.19 (1987) v.109:no.21 (1987)-v.109:no.26 (1987) v.110 (1988)-v.111 (1989) v.112:no.1 (1990)-v.112:no.26 (1990) v.113 (1991) v.114:no.1 (1992)-v.114:no.21 (1992) v.114:no.23 (1992)-v.114:no.27 (1992) v.115 (1993)-v.119 (1997) v.120:no.2 (1998:Jan.21)-v.120:no.51 (1998:Dec.30) On Tue, Jun 15, 2010 at 9:56 PM, Bill Dueber b...@dueber.com wrote: On Tue, Jun 15, 2010 at 5:49 PM, Kyle Banerjee baner...@uoregon.edu wrote: No, but parsing holding statements for something that just gets cut off early or which starts late should be easy unless entry is insanely inconsistent. And there it is. :-) We're really dealing with a few problems here: - Inconsistent entry by catalogers (probably the least of our worries) - Inconsistent publishing schedules (e.g., the Jan 1942 issue was just plain never printed) - Inconsistent use of volume/number/year/month/whatever throughout a serial's run. So, for example, http://mirlyn.lib.umich.edu/Record/45417/Holdings#1 There are six holdings: 1919-1920 incompl 1920 incompl. 
1922 v.4 no.49 v.6 1921 jul-dec v.6 1921jan-jun We have no way of knowing what year volume 4 was printed in, which issues are incomplete in the two volumes that cover 1920, whether volume numbers are associated with earlier (or later) issues, etc. We, as humans, could try to make some guesses, but they'd just be guesses. It's easy to find examples where month ranges overlap (or leave gaps), where month names and issue numbers are sometimes used interchangeably, where volume numbers suddenly change in the middle of a run because of a merge with another serial (or where the first volume isn't 1 because the serial broke off from a parent), etc. etc. etc. I don't mean to overstate the problem. For many (most?) serials whose existence only goes back a few decades, a relatively simple approach will likely work much of the time -- although even that relatively simple approach will have to take into account a solid dozen or so different ways that enumcron data may have been entered. But to be able to say, with some confidence, that we have the full run? Or a particular issue as labeled by a month name? Much, much harder in the general case. -Bill- -- Bill Dueber Library Systems Programmer University of Michigan Library
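Bill's point about the dozen-odd enumcron entry patterns shows up in even a minimal parser. The sketch below handles only the single "v.95:no.3 (1973)" pattern from the Syracuse example; every other variant in the thread (bare years, "jul-dec" ranges, "incompl" notes) would need its own rule, which is exactly the difficulty being described. The class name and regex are illustrative assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EnumcronParser {
    // Matches statements like "v.95:no.3 (1973)"; issue and year are optional.
    static final Pattern P =
        Pattern.compile("v\\.(\\d+)(?::no\\.(\\d+))?(?:\\s*\\((\\d{4})[^)]*\\))?");

    // Returns "volume=.. issue=.. year=.." or null when nothing matches.
    static String parse(String enumcron) {
        Matcher m = P.matcher(enumcron);
        if (!m.find()) return null;
        return "volume=" + m.group(1) + " issue=" + m.group(2) + " year=" + m.group(3);
    }

    public static void main(String[] args) {
        System.out.println(parse("v.95:no.3 (1973)"));
        System.out.println(parse("1921 jul-dec")); // null: pattern not covered
    }
}
```

The second call falling through to null is the general-case problem in miniature: the data is human-readable, but each idiosyncratic form defeats a parser written for another.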
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
Don't forget inconsistent data from the person sending the OpenURL. Rosalyn
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
Regarding the data in OCLC, my understanding (as a former serials cataloger) is that there is detailed information for at least some institutions in the interlibrary loan portion of the OCLC database, but this is not available via WorldCat. I know our ILL department added detailed information for commonly requested titles years ago. I also know we are in the process of getting our detailed holdings loaded into OCLC (possibly just on the ILL side; I'm not sure about this) and maintaining our holdings through batch updates. Many of our current titles use summary holdings, but not all do. I believe the summary holdings work much more effectively with ILL as well, so our serials catalogers have been working for years to improve our local data. As part of our move to summary holdings, we also reduced some of the detail in our holdings, so now we show only gaps of entire volumes, but not specific missing issues, in our coded holdings (the missing issues are included in notes in our item-specific records). If there is better data available to ILL staff, this may be an avenue you could pursue. Wendy Robertson Digital Resources Librarian . The University of Iowa Libraries 1015 Main Library . Iowa City, Iowa 52242 wendy-robert...@uiowa.edu 319-335-5821
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
On Mon, Jun 14, 2010 at 3:47 PM, Jonathan Rochkind rochk...@jhu.edu wrote: The trick here is that traditional library metadata practices make it _very hard_ to tell if a _specific volume/issue_ is held by a given library. And those are the most common use cases for OpenURL. Yep. That's true even for individual libraries with link resolvers. OCLC is not going to be able to solve that particular issue until the local libraries do. If you just want to get to the title level (for a journal or a book), you can easily write your own thing that takes an OpenURL, and either just redirects straight to worldcat.org on isbn/lccn/oclcnum, or actually does a WorldCat API lookup to ensure the record exists first and/or looks up on author/title/etc. too. I was mainly thinking of sources that use COinS. If you have a rarely held book, for instance, then OpenURLs resolved against random institutional endpoints are going to be mostly unproductive. However, a union catalog such as OCLC already has the information about which libraries in the system own it. It seems like the more productive path if the goal of a user is simply to locate a copy, wherever it is held. Umlaut already includes the 'naive' just link to worldcat.org based on isbn, oclcnum, or lccn approach, functionality that was written before the WorldCat API existed. That is, Umlaut takes an incoming OpenURL, and provides the user with a link to a WorldCat record based on isbn, oclcnum, or lccn. Many institutions have chosen to do this. MPOW, however, is a counter-example and does not link out to OCLC. Tom
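The "naive" redirect Tom describes can be sketched in a few lines. This assumes simplified identifier keys ("oclcnum", "isbn") already extracted from the incoming OpenURL; mapping real OpenURL KEV parameters to these keys is left out, and the class name is invented. The worldcat.org /oclc/ and /isbn/ permalink paths are the commonly used ones.

```java
import java.util.Map;

public class WorldCatRedirect {
    // Choose a worldcat.org target from pre-extracted identifiers,
    // preferring the OCLC number as the most precise match.
    static String target(Map<String, String> ids) {
        if (ids.containsKey("oclcnum")) {
            return "https://www.worldcat.org/oclc/" + ids.get("oclcnum");
        }
        if (ids.containsKey("isbn")) {
            return "https://www.worldcat.org/isbn/" + ids.get("isbn");
        }
        // Other identifiers (e.g. lccn) would need their own mapping,
        // or a WorldCat API lookup on author/title as Tom suggests.
        return null;
    }

    public static void main(String[] args) {
        System.out.println(target(Map.of("oclcnum", "45417")));
    }
}
```

A fuller version would do the API lookup first to confirm the record exists before redirecting, rather than sending the user to a dead page.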
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
It seems like the more productive path if the goal of a user is simply to locate a copy, wherever it is held. But I don't think users have *locating a copy* as their goal. Rather, I think their goal is to *get their hands on the book*. If I discover a book via COinS, and you drop me off at WorldCat.org, that allows me to see which libraries own the book. But unless I happen to be affiliated with those institutions, that's kinda useless information. I have no real way of actually getting the book itself. If, instead, you drop me off at your institution's link resolver menu, and provide me an ILL option in the event you don't have the book, the library can get the book for me, which is really my *goal*. That seems like the more productive path, IMO. --Dave == David Walker Library Web Services Manager California State University http://xerxes.calstate.edu
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
IF the user is coming from a recognized on-campus IP, you can configure WorldCat to give the user an ILL link to your library too. At least if you use ILLiad; maybe if you use something else (esp. if your ILL software can accept OpenURLs too!). I haven't yet found any good way to do this if the user is off-campus (ezproxy not a good solution, how do we 'force' the user to use ezproxy for worldcat.org anyway?). But in any event, I agree with Dave that worldcat.org isn't a great interface even if you DO get it to have an ILL link in an odd place. I think we can do better. Which is really the whole purpose of Umlaut as an institutional link resolver: giving the user a better screen for "I found this citation somewhere else; library, what can you do to get it in my hands asap?" Still wondering why Umlaut hasn't gotten more interest from people, heh. But we're using it here at JHU, and NYU and the New School are also using it. Jonathan
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
The trick here is that traditional library metadata practices make it _very hard_ to tell if a _specific volume/issue_ is held by a given library. And those are the most common use cases for OpenURL. Yep. That's true even for individual library's with link resolvers. OCLC is not going to be able to solve that particular issue until the local libraries do. This might not be as bad as people think. The normal argument is that holdings are in free text and there's no way staff will ever have enough time to record volume level holdings. However, significant chunks of the problem can be addressed using relatively simple methods. For example, if you can identify complete runs, you know that a library has all holdings and can start automating things. With this in mind, the first step is to identify incomplete holdings. The mere presence of lingo like missing, lost, incomplete, scattered, wanting, etc. is a dead giveaway. So are bracketed fields that contain enumeration or temporal data (though you'll get false hits using this method when catalogers supply enumeration). Commas in any field that contains enumeration or temporal data also indicate incomplete holdings. I suspect that the mere presence of a note is a great indicator that holdings are incomplete since what kind of yutz writes a note saying all the holdings are here just like you'd expect? Having said that, I need to crawl through a lot more data before being comfortable with that statement. Regexp matches can be used to search for closed date ranges in open serials or close dates within 866 that don't correspond to close dates within fixed fields. That's the first pass. The second pass would be to search for the most common patterns that occur within incomplete holdings. Wash, rinse, repeat. 
After a while, you'll get to all the cornball schemes that don't lend themselves to automation, but hopefully that group of materials is down to a more manageable size where throwing labor at the metadata makes some sense. Possibly guessing whether a volume is available based on timeframe is a good way to go. Worst-case scenario, if the program can't handle it, you deflect the request to the next institution, and that already happens all the time for a variety of reasons. While my comments are mostly concerned with journal holdings, similar logic can be used with monographic series as well. kyle
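Kyle's first-pass heuristics could be sketched as a pair of regular expressions. The word list and the comma-in-enumeration rule come from his message; the exact patterns and class name here are assumptions and would need tuning against real 866 data.

```java
import java.util.regex.Pattern;

public class HoldingsHeuristics {
    // Words that signal incomplete holdings, per the heuristics above.
    static final Pattern GAP_LINGO =
        Pattern.compile("\\b(missing|lost|incomplete|scattered|wanting)\\b",
                        Pattern.CASE_INSENSITIVE);
    // A comma inside an enumeration usually means a broken run, e.g. "v.1-3,5-9".
    static final Pattern ENUM_COMMA = Pattern.compile("v\\.\\s*[\\d-]+,");

    static boolean looksIncomplete(String holdings) {
        return GAP_LINGO.matcher(holdings).find()
            || ENUM_COMMA.matcher(holdings).find();
    }

    public static void main(String[] args) {
        System.out.println(looksIncomplete("v.1-20 (1950-1970)"));         // false
        System.out.println(looksIncomplete("v.1-3,5-9 scattered issues")); // true
    }
}
```

This only flags likely gaps; the later passes Kyle describes (closed-range checks against fixed fields, common incomplete-holdings patterns) would layer on top.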
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
When I've tried to do this, it's been much harder than your story, I'm afraid. My library's data is very inconsistent in the way it expresses its holdings. Even _without_ missing items, the holdings are expressed in human-readable narrative form which is very difficult to parse reliably. Theoretically, the holdings are expressed according to, I forget the name of the Z. standard, but some standard for expressing human-readable holdings with certain punctuation and such. Even if they really WERE all exactly according to this standard, this standard is not very easy to parse consistently and reliably. But in fact, since nothing validates these tags against the standard when they are entered -- and at different times in history the cataloging staff entering them in various libraries had various ideas about how strictly they should follow this local policy -- our holdings are not even reliably according to that standard. But if you think it's easy, please, give it a try and get back to us. :) Maybe your library's data is cleaner than mine. I think it's kind of a crime that our ILS (and many other ILSs) doesn't provide a way for holdings to be efficiently entered (or guessed from prediction patterns etc.) AND converted to an internal structured format that actually contains the semantic info we want. Offering catalogers the option to manually enter an MFHD is not a solution. Jonathan
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
Kyle Banerjee schrieb: This might not be as bad as people think. The normal argument is that holdings are in free text and there's no way staff will ever have enough time to record volume-level holdings. However, significant chunks of the problem can be addressed using relatively simple methods. For example, if you can identify complete runs, you know that a library has all holdings and can start automating things. That's what we've done for journal holdings (only) in https://sourceforge.net/projects/doctor-doc/ It works perfectly in combination with an EZB account (rzblx1.uni-regensburg.de/ezeit) as a link resolver, and may be as exact as issue level. The tool is being used by around 100 libraries in Germany, Switzerland and Austria. If you check this one out: don't expect the perfect OS system. It has been developed by me (head of library and no IT professional) and a colleague (an IT professional). I learned a lot through this one. There is plenty of room for improvement in it: some things implemented not yet so nicely, other things done quite nicely ;-) If you want to discuss, use or contribute: https://sourceforge.net/projects/doctor-doc/support Very welcome! Markus Fischer
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
I think my perspective of the user's goal is actually the same (or close enough to the same) as David's, just stated differently. The user wants the most local copy or, failing that, a way to order it from another source. However, I have plenty of examples of faculty and occasional grad students who are willing to make the trek to a nearby library -- even out-of-town libraries -- rather than do ILL. This doesn't encompass every use case or even a typical use case (are there typical cases?), but it does no harm to have information even if you can't always act on it. The problem with OpenURL tied to a particular institution is a) the person may not have (or know they have) an affiliation with a given institution, b) they may be coming from outside their institution's IP range, so that even the OCLC Registry redirect trick will fail to get them to a (let alone the correct) link resolver, c) there may not be any recourse to find an item if the institution does not own it (MPOW does not provide a link to WorldCat). Tom
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
I do provide the user with the proxied WorldCat URL for just the reasons Jonathan cites. But, no, being an otherwise open web resource, you can't force a user to use it. On Tue, Jun 15, 2010 at 12:22 PM, Jonathan Rochkind rochk...@jhu.eduwrote: I haven't yet found any good way to do this if the user is off-campus (ezproxy not a good solution, how do we 'force' the user to use ezproxy for worldcat.org anyway?).
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
I'm not sure what you mean by complete holdings? The library holds the entire run of the journal from the first issue printed to the last/current? Or just holdings that don't include missing statements? Perhaps other institutions have more easily parseable holdings data (or even holdings data stored in structured form in the ILS) than mine. For mine, even holdings that don't include missing statements cannot feasibly be parsed reliably; I've tried. Jonathan Kyle Banerjee wrote: But if you think it's easy, please, give it a try and get back to us. :) Maybe your library's data is cleaner than mine. I don't think it's easy, but I think detecting *complete* holdings is a big part of the picture and that can be done fairly well. Cleanliness of data will vary from one institution to another, and quite a bit of it will be parsable. Even if you can only get half, you're still way ahead of where you'd otherwise be. I think it's kind of a crime that our ILS (and many other ILSs) doesn't provide a way for holdings to be efficiently entered (or guessed from prediction patterns, etc.) AND converted to an internal structured format that actually contains the semantic info we want. There's too much variation in what people want to do. Even going with manual MFHD, it's still pretty easy to generate stuff that's pretty hard to parse. kyle
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
Oh you really do mean complete like complete publication run? Very few of our journal holdings are complete in that sense; they are definitely in the minority. We start getting something after issue 1, or stop getting it before the last issue. Or stop and then start again. Is this really unusual? If all you've figured out is the complete publication run of a journal, and are assuming your library holds it... wait, how is this something you need for any actual use case? My use case is trying to figure out IF we have a particular volume/issue, and ideally, if so, what shelf it is located on. If I'm just going to deal with journals we have the complete publication history of, I don't have a problem anymore, because the answer will always be yes; that's a very simple algorithm, print yes, heh. So, yes, if you assume only holdings of complete publication histories, the problem does get very easy. Incidentally, if anyone is looking for a schema and transmission format for actual _structured_ holdings information, that's flexible enough for idiosyncratic publication histories and holdings, but still structured enough to actually be machine-actionable... I still can't recommend ONIX Serial Holdings highly enough! I don't think it gets much use, probably because most of our systems simply don't _have_ this structured information, most of our staff interfaces don't provide reasonably efficient interfaces for entering it, etc. But if you can get the other pieces and just need a schema and representation format, ONIX Serial Holdings is nice! Jonathan Kyle Banerjee wrote: On Tue, Jun 15, 2010 at 10:13 AM, Jonathan Rochkind rochk...@jhu.edu wrote: I'm not sure what you mean by complete holdings? The library holds the entire run of the journal from the first issue printed to the last/current? Or just holdings that don't include missing statements? Obviously, there has to be some sort of holdings statement -- I'm presuming that something reasonably accurate is available. 
If there is no summary holdings statement, items aren't inventoried, and holdings are believed to be incomplete, there's not much to work with. As far as retrospectively getting data up to scratch in hopeless situations, there are paths that make sense. For instance, retrospectively inventorying serials may be insane. However, from circ and ILL data, you should know which titles are actually consulted the most. Get those in shape first and work backwards. In a major academic library, it may be the case that some titles are *never* handled, but that doesn't cause problems if no one wants them. For low-use resources, it can make more sense to just handle things manually. Perhaps other institutions have more easily parseable holdings data (or even holdings data stored in structured form in the ILS) than mine. For mine, even holdings that don't include missing statements cannot feasibly be parsed reliably; I've tried. Note that you can get structured holdings data from sources other than the library catalog -- if you know what's missing. Sounds like your situation is particularly challenging. But there are gains worth chasing. Service issues aside, problems like these raise existential questions. If we do an inadequate job of providing access, patrons will just turn to subscription databases and no one will care about what we do or even whether we're still around. Most major academic libraries never got their entire card collection into the online catalog. Patrons don't use that stuff anymore, and almost no one cares (even among librarians). It would be a mistake to think this can't happen again. kyle
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
Oh you really do mean complete like complete publication run? Very few of our journal holdings are complete in that sense; they are definitely in the minority. We start getting something after issue 1, or stop getting it before the last issue. Or stop and then start again. Is this really unusual? No, but parsing holdings statements for something that just gets cut off early or which starts late should be easy unless entry is insanely inconsistent. If staff enter info even close to standard practices, you should still be able to read a lot of it even when there are breaks. This is when anal-retentive behavior in the tech services dept saves your bacon. This process will be lossy, but sometimes that's all you can do. Some situations may be such that there's no reasonable fix that would significantly improve things. But in that case, it makes sense to move on to other problems. Otherwise, we wind up spending all our time futzing with fringe use cases while people actually get what they need elsewhere. kyle
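The "lossy but better than nothing" parsing Kyle describes can be sketched for the clean, single-range case. The statement format here (e.g. "v.1(1990)-v.12(2001)", with a trailing hyphen for an open-ended run) is a made-up example, not any institution's real convention; per Jonathan's caveats, anything that doesn't match just falls through as unparsable.

```python
import re

# Hypothetical summary-holdings pattern: "v.1(1990)-v.12(2001)" for a
# closed range, "v.5(1994)-" for an open-ended run. Real statements vary
# wildly between institutions; this handles only the clean single-range case.
RANGE = re.compile(
    r"v\.(?P<first>\d+)\((?P<fy>\d{4})\)-(?:v\.(?P<last>\d+)\((?P<ly>\d{4})\))?\s*$"
)

def parse_holdings(statement):
    """Return (first_vol, last_vol) for a simple range; last_vol is None
    for an open-ended run. Returns None when the statement doesn't match."""
    m = RANGE.search(statement.strip())
    if not m:
        return None
    last = int(m.group("last")) if m.group("last") else None
    return (int(m.group("first")), last)

def holds_volume(statement, vol):
    """Lossy check: does the parsed range cover the requested volume?
    Returns None (punt) for unparsable statements, per the thread."""
    parsed = parse_holdings(statement)
    if parsed is None:
        return None
    first, last = parsed
    return first <= vol and (last is None or vol <= last)
```

The point of the three-way return (True / False / None) is exactly the trade-off discussed above: answer confidently where the data is clean, and fall back to a human or a link resolver menu where it isn't.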
Re: [CODE4LIB] WorldCat Terminologies
I bet it's got an SRU api. Aye, Ralph can confirm, but I'm pretty sure what you see on your screen is actually an XML SRU (API) response to which your browser has applied the suggested XSL stylesheet, thus rendering the API result in a more human-friendly manner. A view-source on the page should show the SRU. Ian. On 21 March 2010 04:19, Jonathan Rochkind rochk...@jhu.edu wrote: Yeah, the statement that it's a static copy from 2006 would have stopped me in my tracks if I had somehow happened across the page, which I probably wouldn't have, but now I've bookmarked it so I might find it again -- but will probably forget that it's REALLY up to date even though it says 2006 on it. Nice catch Karen. Karen, that looks to me like an HTML front-end for an SRU service, I bet it's got an SRU api. Which one of these days I'll get around to figuring out how to write code for. From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Karen Coyle [li...@kcoyle.net] Sent: Saturday, March 20, 2010 11:29 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Quoting LeVan,Ralph le...@oclc.org: I hate to muddy the waters, but I can't resist here. Research also exposes a copy of the LC NAF at http://alcme.oclc.org/srw/search/lcnaf It gets updated every Tuesday night. Unfortunately, that page states right up front: A static copy of LC's Name Authority File from February of 2006 That might confuse visitors. Maybe a quick revision is in order? :-) Also, API access? kc This is something I've been maintaining for years and is what Identities points at when you ask to see the NAF record associated with an Identities record. This particular service has none of the linked-data-type bells and whistles I'm putting into VIAF and Identities, but easily could, if there was interest. I believe I've made the indexing on it consistent with what I do in Identities. 
Looking at the configuration file for the load of this database, I am omitting records with 100$k, 100$t, 100$v, 100$x or any 130 fields. I'm sure Ya'aqov (or another similarly expert Authority Librarian) could tell you why I am omitting them, because I can't off the top of my head. This service is actually running as a long-established model of how similar services should run in Research. While it is not running on a machine operated by our production staff, it is automatically monitored by them, they have restart procedures in place for when the service becomes unresponsive, and problems are escalated by email when the restart fails to fix the problem. (Those emails come to me, where they get treated appropriately.) Let me know if there are questions about any of this. Ralph -Original Message- From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Ya'aqov Ziso Sent: Friday, March 19, 2010 3:29 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Jonathan, thank you, in full accord. Yes, the crux of the matter is Names (NAF being the more expensive library subscription and the one not available for free like http://id.loc.gov). At http://orlabs.oclc.org/Identities/ I searched 'clapton, eric' and wonder: * are all 49 retrieved WorldCat+NAF headings there helpful? do we need 'clayton, cecile' (etc.) in the list? * Names that haven't made it into an authority record are definitely helpful, but can we suggest a way to sort and rank them more usefully (for the user) on the page? Your thoughts? Ya'aqov On 3/19/10 2:35 PM, Jonathan Rochkind rochk...@jhu.edu wrote: I don't think the inclusion of non-NAF headings in Identities is a flaw, it's a benefit to the purpose of Identities not to be held back by the somewhat glacial pace of change in NAF. 
But you're right, the right tool for the job, I don't know that any of the existing OCLC free (or included with other OCLC membership/services) services are the right tool to replace any existing purchased authorities tools or sources. It depends on what you're using them for, of course. I agree that the brochure statement was potentially misleading, but these (Identities, Terminologies, Research Terminologies) are still very interesting and useful services. From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Ya'aqov Ziso [z...@rowan.edu] Sent: Friday, March 19, 2010 2:14 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Karen, Seems like pulling teeth was worth it. Thank you for these updates and for making them available for all interested. Essentially, given your 6-month latency compared to http://id.loc.gov and the inclusion of NAF and non-NAF headings in Identities, both Terminologies and Identities are not yet at a level of databases that libraries can go for, forgoing what
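Since the page is just an SRU response with a stylesheet applied, querying the API directly is a matter of building a standard SRU searchRetrieve URL. A minimal sketch: the base URL is the one from the thread, the parameter names are standard SRU 1.1, but whether the endpoint is still live and which CQL indexes it supports would need to be confirmed against its explain record.

```python
from urllib.parse import urlencode

# Endpoint mentioned in the thread; availability and supported indexes
# would need to be verified against the service's SRU explain record.
LCNAF_BASE = "http://alcme.oclc.org/srw/search/lcnaf"

def sru_search_url(query, base=LCNAF_BASE, max_records=10):
    """Build an SRU 1.1 searchRetrieve URL. The 'query' argument is a CQL
    expression; a bare term does a default-index keyword search."""
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "query": query,
        "maximumRecords": max_records,
    }
    return "%s?%s" % (base, urlencode(params))
```

Fetching that URL returns the raw XML searchRetrieveResponse; dropping the query string and loading the base URL in a browser is what produces the stylesheet-rendered view Ian describes.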
Re: [CODE4LIB] WorldCat Terminologies
I'm certain that, as Ralph indicated, this file has been kept up to date weekly. The HTML page header will, eventually, be fixed as well to accurately reflect the file's last update and its SRU searchability. The fact remains that for all of Terminologies/Identities/xISSN/xISBN, WC-DEVNET is the customer support and quality control. We have no other address for maintenance, and possibly OCLC Research's dedicated staff lack such an address as well. Yes, these experimental services reside on OCLC servers. Unfortunately, given this customer support model, OCLC Research will be constantly put in a defensive position, and all we can do is flag problems and maintain this loop. (Unless any of you has an idea for a loophole -- and please, bring it on!) Ya'aqov -Original Message- From: Code for Libraries on behalf of Jonathan Rochkind Sent: Sun 3/21/2010 12:19 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Yeah, the statement that it's a static copy from 2006 would have stopped me in my tracks if I had somehow happened across the page, which I probably wouldn't have, but now I've bookmarked it so I might find it again -- but will probably forget that it's REALLY up to date even though it says 2006 on it. Nice catch Karen. Karen, that looks to me like an HTML front-end for an SRU service, I bet it's got an SRU api. Which one of these days I'll get around to figuring out how to write code for.
Re: [CODE4LIB] WorldCat Terminologies
Actually, that's a standard XML interface with a stylesheet rendering the HTML. The static message is probably coming out of my database configuration. Ralph -Original Message- From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Jonathan Rochkind Sent: Sunday, March 21, 2010 12:19 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Yeah, the statement that it's a static copy from 2006 would have stopped me in my tracks if I had somehow happened across the page, which I probably wouldn't have, but now I've bookmarked it so I might find it again -- but will probably forget that it's REALLY up to date even though it says 2006 on it. Nice catch Karen. Karen, that looks to me like an HTML front-end for an SRU service, I bet it's got an SRU api. Which one of these days I'll get around to figuring out how to write code for.
Re: [CODE4LIB] WorldCat Terminologies
I'm open to suggestions, Ya'aqov. I've been talking up the idea of some sort of dashboard for our services: display uptime and response time. It will be tougher to automatically detect a database update and report it. I'll give that some thought for stuff running over my software stack. This seems like the right forum to solicit other suggestions. Has anyone done this before? It seems like there ought to be some lists lying around somewhere of information that would be helpful to service consumers. Ralph -Original Message- From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Ziso, Ya'aqov Sent: Sunday, March 21, 2010 1:09 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies I'm certain that, as Ralph indicated, this file has been kept up to date weekly. The HTML page header will, eventually, be fixed as well to accurately reflect the file's last update and its SRU searchability. The fact remains that for all of Terminologies/Identities/xISSN/xISBN, WC-DEVNET is the customer support and quality control. We have no other address for maintenance, and possibly OCLC Research's dedicated staff lack such an address as well. Yes, these experimental services reside on OCLC servers. Unfortunately, given this customer support model, OCLC Research will be constantly put in a defensive position, and all we can do is flag problems and maintain this loop. (Unless any of you has an idea for a loophole -- and please, bring it on!) Ya'aqov
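At the sampling level, the uptime/response-time dashboard Ralph floats reduces to a periodic probe like the sketch below. This is a hypothetical illustration, not anything OCLC runs: a real monitor would schedule these samples and persist them, and detecting database *updates* (the harder problem Ralph mentions) would need a content-based check, such as comparing a known record's response over time.

```python
import time
import urllib.request

def probe(url, timeout=10):
    """One health-check sample for a service dashboard: HTTP status code
    (None on any failure) and response time in seconds."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
    except Exception:
        status = None  # DNS failure, timeout, HTTP error, etc.
    return {"url": url, "status": status, "seconds": time.monotonic() - start}
```

Charting the stored samples per service would give exactly the uptime and response-time view suggested, without any cooperation needed from the service itself.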
Re: [CODE4LIB] WorldCat Terminologies
Quoting LeVan,Ralph le...@oclc.org: I hate to muddy the waters, but I can't resist here. Research also exposes a copy of the LC NAF at http://alcme.oclc.org/srw/search/lcnaf It gets updated every Tuesday night. Unfortunately, that page states right up front: A static copy of LC's Name Authority File from February of 2006 That might confuse visitors. Maybe a quick revision is in order? :-) Also, API access? kc This is something I've been maintaining for years and is what Identities points at when you ask to see the NAF record associated with an Identities record. This particular service has none of the linked-data-type bells and whistles I'm putting into VIAF and Identities, but easily could, if there was interest. I believe I've made the indexing on it consistent with what I do in Identities. Looking at the configuration file for the load of this database, I am omitting records with 100$k, 100$t, 100$v, 100$x or any 130 fields. I'm sure Ya'aqov (or another similarly expert Authority Librarian) could tell you why I am omitting them, because I can't off the top of my head. This service is actually running as a long-established model of how similar services should run in Research. While it is not running on a machine operated by our production staff, it is automatically monitored by them, they have restart procedures in place for when the service becomes unresponsive, and problems are escalated by email when the restart fails to fix the problem. (Those emails come to me, where they get treated appropriately.) Let me know if there are questions about any of this. Ralph -Original Message- From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Ya'aqov Ziso Sent: Friday, March 19, 2010 3:29 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Jonathan, thank you, in full accord. 
Yes, the crux of the matter is Names (NAF being the more expensive library subscription and the one not available for free like http://id.loc.gov). At http://orlabs.oclc.org/Identities/ I searched 'clapton, eric' and wonder: * are all 49 retrieved WorldCat+NAF headings there helpful? do we need 'clayton, cecile' (etc.) in the list? * Names that haven't made it into an authority record are definitely helpful, but can we suggest a way to sort and rank them more usefully (for the user) on the page? Your thoughts? Ya'aqov On 3/19/10 2:35 PM, Jonathan Rochkind rochk...@jhu.edu wrote: I don't think the inclusion of non-NAF headings in Identities is a flaw, it's a benefit to the purpose of Identities not to be held back by the somewhat glacial pace of change in NAF. But you're right, the right tool for the job, I don't know that any of the existing OCLC free (or included with other OCLC membership/services) services are the right tool to replace any existing purchased authorities tools or sources. It depends on what you're using them for, of course. I agree that the brochure statement was potentially misleading, but these (Identities, Terminologies, Research Terminologies) are still very interesting and useful services. From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Ya'aqov Ziso [z...@rowan.edu] Sent: Friday, March 19, 2010 2:14 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Karen, Seems like pulling teeth was worth it. Thank you for these updates and for making them available for all interested. Essentially, given your 6-month latency compared to http://id.loc.gov and the inclusion of NAF and non-NAF headings in Identities, both Terminologies and Identities are not yet at a level of databases that libraries can go for, forgoing what they currently need to buy, load, and maintain locally. Your withdrawal of the brochure statement is justified and apology accepted. 
Ya'aqov On 3/19/10 1:35 PM, Karen Coombs coom...@oclc.org wrote: Ya'aqov, Identities is not based on a name authority file; it is based on name data in WorldCat. These two are not the same thing. Names within Identities come from several different fields within the WorldCat MARC records, including 1xx, 6xx, and 7xx fields. This is why Identities contains names which have not made their way into a name authority file (http://www.worldcat.org/identities/np-levan,%20ralph%20r). Also, Identities doesn't contain authority records; it contains Identity records. Identity records contain different information from Authority records. There are name authority projects going on within OCLC Research. The most active is VIAF http://www.viaf.org/ . This service contains name authority information from several national libraries, not just the Library of Congress, and right now not all of NACO has been loaded, only differentiated personal names. More
Re: [CODE4LIB] WorldCat Terminologies
Yeah, the statement that it's a static copy from 2006 would have stopped me in my tracks if I had somehow happened accross the page, which I probably wouldn't have, but now I've bookmarked it so I might find it again -- but will probably forget that it's REALLY up to date even though it says 2006 on it. Nice catch Karen. Karen, that looks to me like an HTML front-end for an SRU service, I bet it's got an SRU api. Which one of these days I'll get around to figuring out how to write code for. From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Karen Coyle [li...@kcoyle.net] Sent: Saturday, March 20, 2010 11:29 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Quoting LeVan,Ralph le...@oclc.org: I hate to muddy the waters, but I can't resist here. Research also exposes a copy of the LC NAF at http://alcme.oclc.org/srw/search/lcnaf It gets updated every Tuesday night. Unfortunately, that page states right up front: A static copy of LC's Name Authority File from February of 2006 That might confuse visitors. Maybe a quick revision is in order? :-) Also, API access? kc This is something I've been maintaining for years and is what Identities points at when you ask to see the NAF record associated with an Identities record. This particular service has none of the linked-data-type bells and whistles I'm putting into VIAF and Identities, but easily could, if there was interest. I believe I've made the indexing on it consistent with what I do in Identities. Looking at the configuration file for the load of this database, I am omitting records with 100$k, 100$t, 100$v, 100$x or any 130 fields. I'm sure Ya'aqov (or other similarly expert Authority Librarian) could tell you why I am omitting them, because I can't off the top of my head. This service is actually running as a long established model of how similar services should run in Research. 
While it is not running on a machine operated by our production staff, it is automatically monitored by them, they have restart procedures in place when the service becomes unresponsive, and problems are escalated by email when the restart fails to fix the problem. (Those emails come to me, where they get treated appropriately.) Let me know if there are questions about any of this. Ralph -Original Message- From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Ya'aqov Ziso Sent: Friday, March 19, 2010 3:29 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Jonathan, thank you, in full accord. Yes, the crux of the matter is Names (NAF being the more expensive library subscription and the one not available for free like http://id.loc.gov). At http://orlabs.oclc.org/Identities/ I searched 'clapton, eric' and wonder: * are all WorldCat+NAF 49 retrieved headings there helpful? do we need in the list 'clayton, cecile' (etc.) * Names that haven't made it into an authority record are definitely helpful, but can we suggest a way to sort and rank them more usefully (for the user) on the page? Your thoughts? Ya'aqov On 3/19/10 2:35 PM, Jonathan Rochkind rochk...@jhu.edu wrote: I don't think the inclusion of non-NAF headings in Identities is a flaw, it's a benefit to the purpose of Identities not to be held back by the somewhat glacial pace of change in NAF. But you're right, the right tool for the job: I don't know that any of the existing OCLC free (or included with other OCLC membership/services) services are the right tool to replace any existing purchased authorities tools or sources. It depends on what you're using them for, of course. I agree that the brochure statement was potentially misleading, but these (Identities, Terminologies, Research Terminologies) are still very interesting and useful services. 
From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Ya'aqov Ziso [z...@rowan.edu] Sent: Friday, March 19, 2010 2:14 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Karen, Seems like pulling teeth was worth it. Thank you for these updates and for making them available for all interested. Essentially, given your 6-month latency (compared to http://id.loc.gov) and the inclusion of NAF and non-NAF headings in Identities, both Terminologies and Identities are not yet at a level of databases that libraries can go for, forgoing what they currently need to buy, load, and maintain locally. Your withdrawal of the brochure statement is justified and apology accepted. Ya'aqov On 3/19/10 1:35 PM, Karen Coombs coom...@oclc.org wrote: Ya'aqov, Identities is not based on a name authority file; it is based on name data in WorldCat. These two are not the same thing. Names within Identities come from
Re: [CODE4LIB] WorldCat Terminologies
PS: Whatever I search for under the cql.any column, I get 0 hits. Maybe cql.any isn't actually supported? Ah, the perils of trying to figure out SRU. The SRU explain document (which is actually there if you view source: that page in fact just IS an SRU explain document with an XSLT transform to HTML) suggests that cql.any is indeed supported. It might be lying. Whenever I've tried to use an SRU service, it hasn't been nearly as transparently self-explaining as SRU aims to be. Jonathan From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Karen Coyle [li...@kcoyle.net] Sent: Saturday, March 20, 2010 11:29 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Quoting LeVan,Ralph le...@oclc.org: I hate to muddy the waters, but I can't resist here. Research also exposes a copy of the LC NAF at http://alcme.oclc.org/srw/search/lcnaf It gets updated every Tuesday night. Unfortunately, that page states right up front: A static copy of LC's Name Authority File from February of 2006 That might confuse visitors. Maybe a quick revision is in order? :-) Also, API access? kc This is something I've been maintaining for years and is what Identities points at when you ask to see the NAF record associated with an Identities record. This particular service has none of the linked-data-type bells and whistles I'm putting into VIAF and Identities, but easily could, if there was interest. I believe I've made the indexing on it consistent with what I do in Identities. Looking at the configuration file for the load of this database, I am omitting records with 100$k, 100$t, 100$v, 100$x or any 130 fields. I'm sure Ya'aqov (or other similarly expert Authority Librarian) could tell you why I am omitting them, because I can't off the top of my head. This service is actually running as a long-established model of how similar services should run in Research. 
While it is not running on a machine operated by our production staff, it is automatically monitored by them, they have restart procedures in place when the service becomes unresponsive, and problems are escalated by email when the restart fails to fix the problem. (Those emails come to me, where they get treated appropriately.) Let me know if there are questions about any of this. Ralph -Original Message- From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Ya'aqov Ziso Sent: Friday, March 19, 2010 3:29 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Jonathan, thank you, in full accord. Yes, the crux of the matter is Names (NAF being the more expensive library subscription and the one not available for free like http://id.loc.gov). At http://orlabs.oclc.org/Identities/ I searched 'clapton, eric' and wonder: * are all WorldCat+NAF 49 retrieved headings there helpful? do we need in the list 'clayton, cecile' (etc.) * Names that haven't made it into an authority record are definitely helpful, but can we suggest a way to sort and rank them more usefully (for the user) on the page? Your thoughts? Ya'aqov On 3/19/10 2:35 PM, Jonathan Rochkind rochk...@jhu.edu wrote: I don't think the inclusion of non-NAF headings in Identities is a flaw, it's a benefit to the purpose of Identities not to be held back by the somewhat glacial pace of change in NAF. But you're right, the right tool for the job: I don't know that any of the existing OCLC free (or included with other OCLC membership/services) services are the right tool to replace any existing purchased authorities tools or sources. It depends on what you're using them for, of course. I agree that the brochure statement was potentially misleading, but these (Identities, Terminologies, Research Terminologies) are still very interesting and useful services. 
From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Ya'aqov Ziso [z...@rowan.edu] Sent: Friday, March 19, 2010 2:14 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Karen, Seems like pulling teeth was worth it. Thank you for these updates and for making them available for all interested. Essentially, given your 6-month latency (compared to http://id.loc.gov) and the inclusion of NAF and non-NAF headings in Identities, both Terminologies and Identities are not yet at a level of databases that libraries can go for, forgoing what they currently need to buy, load, and maintain locally. Your withdrawal of the brochure statement is justified and apology accepted. Ya'aqov On 3/19/10 1:35 PM, Karen Coombs coom...@oclc.org wrote: Ya'aqov, Identities is not based on a name authority file; it is based on name data in WorldCat. These two are not the same thing. Names within Identities come from
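[Editor's note: a minimal sketch of querying the lcnaf SRU endpoint discussed above. The endpoint URL and the cql.any index come from the thread itself; the request parameters (operation, version, query, maximumRecords) are standard SRU 1.1, and the search term "clapton" is just an illustrative value. The service was experimental then and may no longer respond, so this only builds the request URL rather than assuming a live server.]

```python
from urllib.parse import urlencode

# Endpoint from Ralph LeVan's message; historical, may no longer be live.
SRU_BASE = "http://alcme.oclc.org/srw/search/lcnaf"

def build_sru_url(query: str, max_records: int = 10) -> str:
    """Build a standard SRU 1.1 searchRetrieve URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": query,            # CQL, e.g. against the cql.any index
        "maximumRecords": str(max_records),
    }
    return SRU_BASE + "?" + urlencode(params)

# The cql.any index Jonathan was testing (the explain document claims
# it is supported, even if it returned 0 hits at the time):
url = build_sru_url('cql.any = "clapton"')
print(url)
```

Fetching `url` and parsing the returned `searchRetrieveResponse` XML would be the next step; the explain document at the endpoint root lists which indexes the server actually supports.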
Re: [CODE4LIB] WorldCat Terminologies
Ya'aqov, We have decided that it is wiser to withdraw the statement from our brochure, with our apologies, rather than attempt to defend it. We're sorry if this has caused you any trouble. As for your questions regarding frequency of update and any guaranteed level of service, we have already answered those questions in venues to which you subscribe. For example: http://tinyurl.com/yaldzcw . It is common practice, from Google to your local startup, for services to be put out in "experimental" or "beta" mode to judge interest and potential uses before investing to support them at production level. Thanks for your understanding, Karen Coombs OCLC Developer Network Manager -- Forwarded Message From: Ya'aqov Ziso z...@rowan.edu Reply-To: WorldCat Developer Network Discussion List wc-devne...@oclc.org Date: Thu, 18 Mar 2010 16:32:20 -0400 To: wc-devne...@oclc.org Subject: Re: [WC-DEVNET-L] WorldCat Terminologies Karen, At the CODE4LIB Wednesday in Asheville breakout session on API queries http://wiki.code4lib.org/index.php/2010_Breakout_Sessions#Wednesday we questioned the level of maintenance for NAF, LCSH (by OCLC Research for Identities and Terminologies). I also added a question (below) per your distributed brochure. What is the status of these questions? Are we to deal with dirtier data (compared to NAF/LCSH in CONNEXION) for now? Note: without a WC-DEVNET tracking system, some questions get lost, by chance or by intent. Ya'aqov On 3/3/10 2:30 PM, Ya'aqov Ziso z...@rowan.edu wrote: Karen Coombs, Hi, "Terminologies ... all those terminologies databases that you used to have to buy, load, and maintain locally now available remotely for free ... 
(from the blurb OCLC distributed at CODE4LIB in Asheville, 2/21-25/2010) Could you please elaborate: how can Terminologies Services substitute for what libraries upkeep and pay for currently, given the other statement on that blurb's page, "WorldCat Terminologies is still an experiment research service with no service assurances"? Kind thanks, Ya'aqov --- Posted on: WorldCat Developer Network discussion list To post: email to wc-devne...@oclc.org To subscribe, go to https://www3.oclc.org/app/listserv/ To unsubscribe, change options, change to digest mode, or view archive, go to: http://listserv.oclc.org/scripts/wa.exe?A0=WC-DEVNET-L list owners: Roy Tennant, Don Hamparian -- End of Forwarded Message
Re: [CODE4LIB] WorldCat Terminologies
Hello Karen, Since upkeep done to Terminologies and Identities involves all the WorldCat membership copied on your note, it would be helpful if OCLC Research would post: a URL specifying upkeep done to Terminologies; a URL specifying upkeep done to Identities (is the NAF used via CONNEXION the same as the one used for Identities? If not, specify). The library community has by now been sufficiently exposed to successful attempts and/or baubles, and learned the difference between "beta" and "experimental". As we all go through this learning experience, in this case partnering with OCLC Research, we wish to grow successful. Ya'aqov Ziso, Electronic Resource Management Librarian, Rowan University 856 256 4804 On 3/19/10 10:00 AM, Karen Coombs coom...@oclc.org wrote: Ya'aqov, We have decided that it is wiser to withdraw the statement from our brochure, with our apologies, rather than attempt to defend it. We're sorry if this has caused you any trouble. As for your questions regarding frequency of update and any guaranteed level of service, we have already answered those questions in venues to which you subscribe. For example: http://tinyurl.com/yaldzcw . It is common practice, from Google to your local startup, for services to be put out in "experimental" or "beta" mode to judge interest and potential uses before investing to support them at production level. Thanks for your understanding, Karen Coombs OCLC Developer Network Manager -- Forwarded Message From: Ya'aqov Ziso z...@rowan.edu Reply-To: WorldCat Developer Network Discussion List wc-devne...@oclc.org Date: Thu, 18 Mar 2010 16:32:20 -0400 To: wc-devne...@oclc.org Subject: Re: [WC-DEVNET-L] WorldCat Terminologies Karen, At the CODE4LIB Wednesday in Asheville breakout session on API queries http://wiki.code4lib.org/index.php/2010_Breakout_Sessions#Wednesday we questioned the level of maintenance for NAF, LCSH (by OCLC Research for Identities and Terminologies). I also added a question (below) per your distributed brochure. 
What is the status of these questions? Are we to deal with dirtier data (compared to NAF/LCSH in CONNEXION) for now? Note: without a WC-DEVNET tracking system, some questions get lost, by chance or by intent. Ya'aqov On 3/3/10 2:30 PM, Ya'aqov Ziso z...@rowan.edu wrote: Karen Coombs, Hi, "Terminologies ... all those terminologies databases that you used to have to buy, load, and maintain locally now available remotely for free ..." (from the blurb OCLC distributed at CODE4LIB in Asheville, 2/21-25/2010) Could you please elaborate: how can Terminologies Services substitute for what libraries upkeep and pay for currently, given the other statement on that blurb's page, "WorldCat Terminologies is still an experiment research service with no service assurances"? Kind thanks, Ya'aqov --- Posted on: WorldCat Developer Network discussion list To post: email to wc-devne...@oclc.org To subscribe, go to https://www3.oclc.org/app/listserv/ To unsubscribe, change options, change to digest mode, or view archive, go to: http://listserv.oclc.org/scripts/wa.exe?A0=WC-DEVNET-L list owners: Roy Tennant, Don Hamparian -- End of Forwarded Message
Re: [CODE4LIB] WorldCat Terminologies
Ya'aqov, Identities is not based on a name authority file; it is based on name data in WorldCat. These two are not the same thing. Names within Identities come from several different fields within the WorldCat MARC records, including 1xx, 6xx, and 7xx fields. This is why Identities contains names which have not made their way into a name authority file (http://www.worldcat.org/identities/np-levan,%20ralph%20r). Also, Identities doesn't contain authority records; it contains Identity records. Identity records contain different information from Authority records. There are name authority projects going on within OCLC Research. The most active is VIAF http://www.viaf.org/ . This service contains name authority information from several national libraries, not just the Library of Congress, and right now not all of NACO has been loaded, only differentiated personal names. More information on the VIAF project is available at http://www.oclc.org/research/activities/viaf/ Regarding the up-to-dateness of Terminologies: as I said in my previous message, a schedule for updates was posted to the Developer Network listserv and the Developer Network wiki (http://worldcat.org/devnet/wiki/Terminologies_Updates). Andrew Houghton has also informed you that information about how recently a given terminology was updated is located on the Experimental Terminologies page ( http://tspilot.oclc.org/resources/index.html). Karen -- Karen A. Coombs Product Manager OCLC Developer Network coom...@oclc.org 281-886-0882 Skype:librarywebchic On 3/19/10 10:29 AM, Ya'aqov Ziso z...@rowan.edu wrote: Hello Karen, Since upkeep done to Terminologies and Identities involves all the WorldCat membership copied on your note, it would be helpful if OCLC Research would post: a URL specifying upkeep done to Terminologies; a URL specifying upkeep done to Identities (is the NAF used via CONNEXION the same as the one used for Identities? 
If not, specify). The library community has by now been sufficiently exposed to successful attempts and/or baubles, and learned the difference between "beta" and "experimental". As we all go through this learning experience, in this case partnering with OCLC Research, we wish to grow successful. Ya'aqov Ziso, Electronic Resource Management Librarian, Rowan University 856 256 4804 On 3/19/10 10:00 AM, Karen Coombs coom...@oclc.org wrote: Ya'aqov, We have decided that it is wiser to withdraw the statement from our brochure, with our apologies, rather than attempt to defend it. We're sorry if this has caused you any trouble. As for your questions regarding frequency of update and any guaranteed level of service, we have already answered those questions in venues to which you subscribe. For example: http://tinyurl.com/yaldzcw . It is common practice, from Google to your local startup, for services to be put out in "experimental" or "beta" mode to judge interest and potential uses before investing to support them at production level. Thanks for your understanding, Karen Coombs OCLC Developer Network Manager -- Forwarded Message From: Ya'aqov Ziso z...@rowan.edu Reply-To: WorldCat Developer Network Discussion List wc-devne...@oclc.org Date: Thu, 18 Mar 2010 16:32:20 -0400 To: wc-devne...@oclc.org Subject: Re: [WC-DEVNET-L] WorldCat Terminologies Karen, At the CODE4LIB Wednesday in Asheville breakout session on API queries http://wiki.code4lib.org/index.php/2010_Breakout_Sessions#Wednesday we questioned the level of maintenance for NAF, LCSH (by OCLC Research for Identities and Terminologies). I also added a question (below) per your distributed brochure. What is the status of these questions? Are we to deal with dirtier data (compared to NAF/LCSH in CONNEXION) for now? Note: without a WC-DEVNET tracking system, some questions get lost, by chance or by intent. Ya'aqov On 3/3/10 2:30 PM, Ya'aqov Ziso z...@rowan.edu wrote: Karen Coombs, Hi, "Terminologies ... 
all those terminologies databases that you used to have to buy, load, and maintain locally now available remotely for free ..." (from the blurb OCLC distributed at CODE4LIB in Asheville, 2/21-25/2010) Could you please elaborate: how can Terminologies Services substitute for what libraries upkeep and pay for currently, given the other statement on that blurb's page, "WorldCat Terminologies is still an experiment research service with no service assurances"? Kind thanks, Ya'aqov --- Posted on: WorldCat Developer Network discussion list To post: email to wc-devne...@oclc.org To subscribe, go to https://www3.oclc.org/app/listserv/ To unsubscribe, change options, change to digest mode, or view archive, go to: http://listserv.oclc.org/scripts/wa.exe?A0=WC-DEVNET-L list owners: Roy Tennant, Don Hamparian
Re: [CODE4LIB] WorldCat Terminologies
Karen, Seems like pulling teeth was worth it. Thank you for these updates and for making them available for all interested. Essentially, given your 6-month latency (compared to http://id.loc.gov) and the inclusion of NAF and non-NAF headings in Identities, both Terminologies and Identities are not yet at a level of databases that libraries can go for, forgoing what they currently need to buy, load, and maintain locally. Your withdrawal of the brochure statement is justified and apology accepted. Ya'aqov On 3/19/10 1:35 PM, Karen Coombs coom...@oclc.org wrote: Ya'aqov, Identities is not based on a name authority file; it is based on name data in WorldCat. These two are not the same thing. Names within Identities come from several different fields within the WorldCat MARC records, including 1xx, 6xx, and 7xx fields. This is why Identities contains names which have not made their way into a name authority file (http://www.worldcat.org/identities/np-levan,%20ralph%20r). Also, Identities doesn't contain authority records; it contains Identity records. Identity records contain different information from Authority records. There are name authority projects going on within OCLC Research. The most active is VIAF http://www.viaf.org/ . This service contains name authority information from several national libraries, not just the Library of Congress, and right now not all of NACO has been loaded, only differentiated personal names. More information on the VIAF project is available at http://www.oclc.org/research/activities/viaf/ Regarding the up-to-dateness of Terminologies: as I said in my previous message, a schedule for updates was posted to the Developer Network listserv and the Developer Network wiki (http://worldcat.org/devnet/wiki/Terminologies_Updates). Andrew Houghton has also informed you that information about how recently a given terminology was updated is located on the Experimental Terminologies page ( http://tspilot.oclc.org/resources/index.html). Karen
Re: [CODE4LIB] WorldCat Terminologies
I don't think the inclusion of non-NAF headings in Identities is a flaw, it's a benefit to the purpose of Identities not to be held back by the somewhat glacial pace of change in NAF. But you're right, the right tool for the job: I don't know that any of the existing OCLC free (or included with other OCLC membership/services) services are the right tool to replace any existing purchased authorities tools or sources. It depends on what you're using them for, of course. I agree that the brochure statement was potentially misleading, but these (Identities, Terminologies, Research Terminologies) are still very interesting and useful services. From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Ya'aqov Ziso [z...@rowan.edu] Sent: Friday, March 19, 2010 2:14 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Karen, Seems like pulling teeth was worth it. Thank you for these updates and for making them available for all interested. Essentially, given your 6-month latency (compared to http://id.loc.gov) and the inclusion of NAF and non-NAF headings in Identities, both Terminologies and Identities are not yet at a level of databases that libraries can go for, forgoing what they currently need to buy, load, and maintain locally. Your withdrawal of the brochure statement is justified and apology accepted. Ya'aqov On 3/19/10 1:35 PM, Karen Coombs coom...@oclc.org wrote: Ya'aqov, Identities is not based on a name authority file; it is based on name data in WorldCat. These two are not the same thing. Names within Identities come from several different fields within the WorldCat MARC records, including 1xx, 6xx, and 7xx fields. This is why Identities contains names which have not made their way into a name authority file (http://www.worldcat.org/identities/np-levan,%20ralph%20r). Also, Identities doesn't contain authority records; it contains Identity records. Identity records contain different information from Authority records. 
There are name authority projects going on within OCLC Research. The most active is VIAF http://www.viaf.org/ . This service contains name authority information from several national libraries, not just the Library of Congress, and right now not all of NACO has been loaded, only differentiated personal names. More information on the VIAF project is available at http://www.oclc.org/research/activities/viaf/ Regarding the up-to-dateness of Terminologies: as I said in my previous message, a schedule for updates was posted to the Developer Network listserv and the Developer Network wiki (http://worldcat.org/devnet/wiki/Terminologies_Updates). Andrew Houghton has also informed you that information about how recently a given terminology was updated is located on the Experimental Terminologies page ( http://tspilot.oclc.org/resources/index.html). Karen
Re: [CODE4LIB] WorldCat Terminologies
Jonathan, thank you, in full accord. Yes, the crux of the matter is Names (NAF being the more expensive library subscription and the one not available for free like http://id.loc.gov). At http://orlabs.oclc.org/Identities/ I searched 'clapton, eric' and wonder: * are all WorldCat+NAF 49 retrieved headings there helpful? do we need in the list 'clayton, cecile' (etc.) * Names that haven't made it into an authority record are definitely helpful, but can we suggest a way to sort and rank them more usefully (for the user) on the page? Your thoughts? Ya'aqov On 3/19/10 2:35 PM, Jonathan Rochkind rochk...@jhu.edu wrote: I don't think the inclusion of non-NAF headings in Identities is a flaw, it's a benefit to the purpose of Identities not to be held back by the somewhat glacial pace of change in NAF. But you're right, the right tool for the job: I don't know that any of the existing OCLC free (or included with other OCLC membership/services) services are the right tool to replace any existing purchased authorities tools or sources. It depends on what you're using them for, of course. I agree that the brochure statement was potentially misleading, but these (Identities, Terminologies, Research Terminologies) are still very interesting and useful services. From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Ya'aqov Ziso [z...@rowan.edu] Sent: Friday, March 19, 2010 2:14 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Karen, Seems like pulling teeth was worth it. Thank you for these updates and for making them available for all interested. Essentially, given your 6-month latency (compared to http://id.loc.gov) and the inclusion of NAF and non-NAF headings in Identities, both Terminologies and Identities are not yet at a level of databases that libraries can go for, forgoing what they currently need to buy, load, and maintain locally. Your withdrawal of the brochure statement is justified and apology accepted. 
Ya'aqov On 3/19/10 1:35 PM, Karen Coombs coom...@oclc.org wrote: Ya'aqov, Identities is not based on a name authority file; it is based on name data in WorldCat. These two are not the same thing. Names within Identities come from several different fields within the WorldCat MARC records, including 1xx, 6xx, and 7xx fields. This is why Identities contains names which have not made their way into a name authority file (http://www.worldcat.org/identities/np-levan,%20ralph%20r). Also, Identities doesn't contain authority records; it contains Identity records. Identity records contain different information from Authority records. There are name authority projects going on within OCLC Research. The most active is VIAF http://www.viaf.org/ . This service contains name authority information from several national libraries, not just the Library of Congress, and right now not all of NACO has been loaded, only differentiated personal names. More information on the VIAF project is available at http://www.oclc.org/research/activities/viaf/ Regarding the up-to-dateness of Terminologies: as I said in my previous message, a schedule for updates was posted to the Developer Network listserv and the Developer Network wiki (http://worldcat.org/devnet/wiki/Terminologies_Updates). Andrew Houghton has also informed you that information about how recently a given terminology was updated is located on the Experimental Terminologies page ( http://tspilot.oclc.org/resources/index.html). Karen
Re: [CODE4LIB] WorldCat Terminologies
I hate to muddy the waters, but I can't resist here. Research also exposes a copy of the LC NAF at http://alcme.oclc.org/srw/search/lcnaf It gets updated every Tuesday night. This is something I've been maintaining for years and is what Identities points at when you ask to see the NAF record associated with an Identities record. This particular service has none of the linked-data-type bells and whistles I'm putting into VIAF and Identities, but easily could, if there was interest. I believe I've made the indexing on it consistent with what I do in Identities. Looking at the configuration file for the load of this database, I am omitting records with 100$k, 100$t, 100$v, 100$x or any 130 fields. I'm sure Ya'aqov (or other similarly expert Authority Librarian) could tell you why I am omitting them, because I can't off the top of my head. This service is actually running as a long-established model of how similar services should run in Research. While it is not running on a machine operated by our production staff, it is automatically monitored by them, they have restart procedures in place when the service becomes unresponsive, and problems are escalated by email when the restart fails to fix the problem. (Those emails come to me, where they get treated appropriately.) Let me know if there are questions about any of this. Ralph -Original Message- From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Ya'aqov Ziso Sent: Friday, March 19, 2010 3:29 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Jonathan, thank you, in full accord. Yes, the crux of the matter is Names (NAF being the more expensive library subscription and the one not available for free like http://id.loc.gov). At http://orlabs.oclc.org/Identities/ I searched 'clapton, eric' and wonder: * are all WorldCat+NAF 49 retrieved headings there helpful? do we need in the list 'clayton, cecile' (etc.) 
* Names that haven't made it into an authority record are definitely helpful, but can we suggest a way to sort and rank them more usefully (for the user) on the page? Your thoughts? Ya'aqov On 3/19/10 2:35 PM, Jonathan Rochkind rochk...@jhu.edu wrote: I don't think the inclusion of non-NAF headings in Identities is a flaw, it's a benefit to the purpose of Identities not to be held back by the somewhat glacial pace of change in NAF. But you're right, the right tool for the job: I don't know that any of the existing OCLC free (or included with other OCLC membership/services) services are the right tool to replace any existing purchased authorities tools or sources. It depends on what you're using them for, of course. I agree that the brochure statement was potentially misleading, but these (Identities, Terminologies, Research Terminologies) are still very interesting and useful services. From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Ya'aqov Ziso [z...@rowan.edu] Sent: Friday, March 19, 2010 2:14 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] WorldCat Terminologies Karen, Seems like pulling teeth was worth it. Thank you for these updates and for making them available for all interested. Essentially, given your 6-month latency (compared to http://id.loc.gov) and the inclusion of NAF and non-NAF headings in Identities, both Terminologies and Identities are not yet at a level of databases that libraries can go for, forgoing what they currently need to buy, load, and maintain locally. Your withdrawal of the brochure statement is justified and apology accepted. Ya'aqov On 3/19/10 1:35 PM, Karen Coombs coom...@oclc.org wrote: Ya'aqov, Identities is not based on a name authority file; it is based on name data in WorldCat. These two are not the same thing. Names within Identities come from several different fields within the WorldCat MARC records, including 1xx, 6xx, and 7xx fields. 
This is why Identities contains names which have not made their way into a name authority file (http://www.worldcat.org/identities/np-levan,%20ralph%20r). Also, Identities doesn't contain authority records; it contains Identity records. Identity records contain different information from Authority records. There are name authority projects going on within OCLC Research. The most active is VIAF http://www.viaf.org/ . This service contains name authority information from several national libraries, not just the Library of Congress, and right now not all of NACO has been loaded, only differentiated personal names. More information on the VIAF project is available at http://www.oclc.org/research/activities/viaf/ Regarding the up-to-dateness of Terminologies: as I said in my previous message, a schedule for updates was posted to the Developer Network listserv
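[Editor's note: a minimal sketch of the record-omission rule Ralph describes for the lcnaf load: drop any authority record carrying a 100 $k, $t, $v, or $x subfield, or any 130 field. The dict-of-lists record layout below is purely illustrative, not a real MARC serialization; a production version would use a MARC library instead.]

```python
# Subfields of the 100 field that cause a record to be omitted, per
# Ralph's load configuration.
OMIT_100_SUBFIELDS = {"k", "t", "v", "x"}

def should_omit(record: dict) -> bool:
    """Return True if the record matches Ralph's omission rule.

    `record` maps a MARC tag ("100", "130", ...) to a list of fields,
    each field a dict of subfield code -> value (illustrative layout).
    """
    if record.get("130"):  # any 130 (uniform title heading) at all
        return True
    for field in record.get("100", []):
        # set & dict_keys: nonempty intersection means a banned subfield
        if OMIT_100_SUBFIELDS & field.keys():
            return True
    return False

keep = {"100": [{"a": "Clapton, Eric."}]}          # plain personal name
drop = {"100": [{"a": "Bible.", "t": "O.T."}]}     # name/title: has $t
print(should_omit(keep), should_omit(drop))
```

The effect is to keep straightforward personal-name headings and skip name/title and uniform-title records, which matches the "differentiated personal names" focus discussed elsewhere in the thread.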
Re: [CODE4LIB] WorldCat API account
The WorldCat API is not yet in general release. It is presently being beta tested by invited developers, many of whom (if not all) are on this list. Thus the confusion. Sorry, but stay tuned. A good way to do that is to sign up on the WorldCat Developer's Network listserv. A link to the signup page can be found on this page: http://worldcat.org/devnet/ Thanks, Roy On 6/25/08 8:40 AM, Yitzchak Schaffer [EMAIL PROTECTED] wrote: Greetings CODE4LIBers: Does anyone know how to get a test account for the WorldCat API? The wiki, last I checked, instructed to contact OCLC, but my e-mail to their generic address yielded no response. Thanks, --
Re: [CODE4LIB] worldcat
Not really, although we talk about it a lot around here at OCLC. --Th -Original Message- From: Code for Libraries [mailto:[EMAIL PROTECTED] On Behalf Of Eric Lease Morgan Sent: Monday, May 21, 2007 9:34 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] worldcat We here at Notre Dame subscribe to (license?) WorldCat, and I'm wondering, does it have a Web Services interface/API? -- Eric Lease Morgan University Libraries of Notre Dame (574) 631-8604
Re: [CODE4LIB] worldcat
From: Code for Libraries [mailto:[EMAIL PROTECTED] On Behalf Of Eric Lease Morgan Sent: 21 May, 2007 09:34 To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] worldcat We here at Notre Dame subscribe to (license?) WorldCat, and I'm wondering, does it have a Web Services interface/API? I guess it depends on what you consider a Web Service interface and API. Today you create URLs to retrieve XHTML documents. The details are here: http://www.worldcat.org/links/default.jsp It's not pretty, since you have to scrape the XHTML document for the information you want, and they don't use id attributes on content to make it easy for you to pull information out of the XHTML. However, looking at the roadmap for WorldCat.org, they have always planned a proper Web Service API for it. It's still beta, so some functionality has yet to be delivered. Unfortunately, I cannot say when a Web Service API will be delivered, since I'm not on the WorldCat.org team and do not know what their current development priorities are. Also, if you have some specific use cases in mind for how you would like to interact with WorldCat.org using Web services, I'm sure the WorldCat.org folks would like to see them. You can send feedback here: http://www.worldcat.org/oclc/?page=feedback Andy.
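Andy's point about the lack of id attributes is the crux of the scraping problem: you have to anchor on class attributes instead. A minimal sketch of that approach using only the standard library is below; the class names and the sample markup are illustrative assumptions, not WorldCat's actual page structure.

```python
from html.parser import HTMLParser

class ResultScraper(HTMLParser):
    """Collect the text of cells marked class="result" (no ids to anchor on)."""
    def __init__(self):
        super().__init__()
        self.in_result = 0   # depth counter for class="result" elements
        self.titles = []
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if dict(attrs).get("class") == "result":
            self.in_result += 1

    def handle_endtag(self, tag):
        if self.in_result and tag == "td":
            self.titles.append("".join(self._buf).strip())
            self._buf = []
            self.in_result -= 1

    def handle_data(self, data):
        if self.in_result:
            self._buf.append(data)

# Illustrative sample, not real WorldCat output.
sample = """
<table class="tableResults">
  <tr><td class="result">Database Design for Mere Mortals</td></tr>
  <tr><td class="result">SQL and Relational Theory</td></tr>
</table>
"""
scraper = ResultScraper()
scraper.feed(sample)
print(scraper.titles)
```

The same idea works with lxml or BeautifulSoup and a class-based XPath or CSS selector; the fragility is identical either way, which is why a proper API beats scraping.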
Re: [CODE4LIB] worldcat
On May 21, 2007, at 9:52 AM, Houghton,Andrew wrote: We here at Notre Dame subscribe to (license?) WorldCat, and I'm wondering, does it have a Web Services interface/API? I guess it depends on what you consider a Web Service interface and API. Today you create URL's to retrieve XHTML documents. The details are here: http://www.worldcat.org/links/default.jsp Thank you for the prompt replies. This is what I thought, and I was just checking to see if I had missed something. (Bummer.) -- Eric Lease Morgan
Re: [CODE4LIB] worldcat
From: Code for Libraries [mailto:[EMAIL PROTECTED] On Behalf Of Eric Lease Morgan Sent: 22 August, 2006 16:24 To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] worldcat Is there a public Z39.50/SRU/SRW/Web Services interface to WorldCat or OpenWorldCat? I would like to create a simple search engine to query others' books, and *Cat seems like a great candidate. Inquiring minds would like to know. I'm not sure about a Z39.50/SRU/SRW interface to WorldCat, but you can access WorldCat via URL queries, and it appears that the data comes back as an XHTML document. So... you could hack something together with a little XSLT to simulate an SRU interface. Since this isn't documented anywhere, at least that I could find, with a little digging and hacking I came up with the following: URL query: http://worldcat.org/search?q={your+query+goes+here}, e.g. http://worldcat.org/search?q=database+design Results XPath: /html/body//table[@class='tableResults'] and /html/body//table[@class='tableResults']/tr/td[@class='result'] Next set of results: http://worldcat.org/search?q=database+design&start={next+number+in+result+set+goes+here}&qt=next_page, e.g. http://worldcat.org/search?q=database+design&start=11&qt=next_page Andy.
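The URL patterns Andy dug up can be wrapped in a small helper, sketched below. The `q`, `start`, and `qt=next_page` parameters come straight from his message; they describe WorldCat.org as it was in 2006 and may well have changed since.

```python
from urllib.parse import urlencode

# Base search URL from Andy's undocumented findings (circa 2006).
BASE = "http://worldcat.org/search"

def search_url(query, start=None):
    """Build a WorldCat.org search URL; pass start= for subsequent pages."""
    params = [("q", query)]
    if start is not None:
        # Paging uses a start offset plus qt=next_page, per the examples above.
        params.append(("start", str(start)))
        params.append(("qt", "next_page"))
    return BASE + "?" + urlencode(params)

print(search_url("database design"))            # first page of results
print(search_url("database design", start=11))  # next page of results
```

Each URL returns an XHTML page that would then be scraped with the XPaths given above; there is no structured response format to parse.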
Re: [CODE4LIB] worldcat
I don't think they have a public one, but there is one if your institution has FirstSearch. http://www.oclc.org/support/documentation/firstsearch/z3950/fs_z39_config_guide/default.htm The production server provides access to all the databases available and requires a valid FirstSearch authorization. http://www.oclc.org/support/documentation/firstsearch/z3950/z3950_databases/specs/worldcat.htm Ryan Eby On 8/22/06, Eric Lease Morgan [EMAIL PROTECTED] wrote: Is there a public Z39.50/SRU/SRW/Web Services interface to WorldCat or OpenWorldCat? I would like to create a simple search engine to query others' books, and *Cat seems like a great candidate. Inquiring minds would like to know. -- Eric Morgan