Re: [CODE4LIB] Worldcat schema.org & search API

Karen Coyle Tue, 10 Jul 2012 15:11:54 -0700

I think we have a catch-22 here. You need an OCLC developer license to use WC to "discover" WC URIs using an application; you need WC URIs (or other URIs that are not very diffuse on the Web) to make use of the OCLC linked data. The OCLC linked data is ODC-BY for anyone wishing to use the data, but, if I'm not mistaken, the APIs are not publicly open to the Web public. Thus the schema.org data is ODC-BY but most applications on the web will have little opportunity to discover the OCLC-specific URIs. So the gatekeeper is the API access, that is, the ability to search WC for URI discovery (e.g. with an author's name). So you can link, but you can't easily discover the linking URIs.

I suppose that one could discover publications as linked data using the topical access of LCSH, the VIAF links in Wikipedia, or by going through databases like Open Library, which has some OCLC numbers associated with bibliographic data. All of these are accessible via open APIs, I believe, and are linked to DBPedia. I understand that "linking is linking" but unless we are developing data for SkyNet, somewhere along the way the user needs to begin with a human-understandable query. Searching and linked data are not in conflict with each other; they give each other mutual support. It only makes sense that URIs will be discovered through searching at some point in the process of access, as applications like Wikipedia illustrate. (As does the Facebook API, which is a search.)
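
A minimal sketch of the Open Library route, for illustration only. It assumes the bibliographic records have already been fetched and parsed into Python dicts, that OCLC numbers sit in a list field named "oclc_numbers" (an assumption about the record layout), and that a WorldCat URI can be formed as http://www.worldcat.org/oclc/ followed by the number. The function name worldcat_uris is hypothetical.

```python
WORLDCAT_PREFIX = "http://www.worldcat.org/oclc/"  # assumed URI pattern


def worldcat_uris(records):
    """Build de-duplicated WorldCat URIs from already-parsed records.

    Each record is a dict; OCLC numbers are assumed to live in a list
    under the key "oclc_numbers". Records without that key are skipped.
    """
    seen = set()
    uris = []
    for record in records:
        for number in record.get("oclc_numbers", []):
            number = str(number).strip()
            # Keep only purely numeric identifiers, once each.
            if number.isdigit() and number not in seen:
                seen.add(number)
                uris.append(WORLDCAT_PREFIX + number)
    return uris


# Made-up sample records, not real bibliographic data.
SAMPLE_RECORDS = [
    {"title": "First sample", "oclc_numbers": ["123", "456"]},
    {"title": "No OCLC number here"},
    {"title": "Second sample", "oclc_numbers": ["123", "x9"]},
]
```

This only turns known OCLC numbers into URIs; it does not replace a search over WorldCat itself, which is the gap described above.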

I've tried to find a clear statement of who can get access to the OCLC APIs, but I'm afraid that I can't find a page that clarifies that. I guess one is expected to apply for a developer key in order to find out if they qualify. I'll pass that information along.


kc


On 7/10/12 2:32 PM, Kevin Ford wrote:

Does the worldcat search api return the data as described with the schema.org and OCLC extension vocabularies?
The use case mentioned extracting the RDFa data from those pages. Without knowing the answer to the leading question above, the mock solution addressed that condition. If one simply wanted "to create a comprehensive bibliography of works" by a particular author, then, yes, the search response would suffice.
Kevin


On 07/10/2012 05:10 PM, Roy Tennant wrote:
Uh...what? For the given use case you would be much better off simply
using the WorldCat Search API response. Using it only to retrieve an
identifier and then going and scraping the Linked Data out of a
WorldCat.org page is, at best, redundant.

As Richard pointed out, some use cases -- like the one Karen provided
-- are not really a good use case for linked data. It's a better use
case for an API, which has been available for years.
Roy

On Tue, Jul 10, 2012 at 2:08 PM, Kevin Ford <[email protected]> wrote:
The use case clarifies perfectly.
Totally feasible. Well, I should say "totally feasible" with the caveat
that I've never used the Worldcat Search API. Not letting that stop me, so
long as it is what I imagine it is, then a developer should be able to
perform a search, retrieve the response, and, by integrating one of the
tools advertised on the schema.org website into his/her code, then
retrieve the microdata for each resource returned from the search (and
save it as RDF or whatever).
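
A rough sketch of the extraction step only, using nothing but the Python standard library. It is not one of the tools listed on the schema.org site and it is not a conforming RDFa or microdata parser: it just collects the values attached to `property` and `itemprop` attributes, and it ignores nesting, prefixes, and typing. The names PropertyExtractor and extract_properties, and the sample markup, are made up for illustration; the search call itself is left out because it depends on the API.

```python
from html.parser import HTMLParser


class PropertyExtractor(HTMLParser):
    """Collect (name, value) pairs from `property` / `itemprop` attributes."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._pending = None  # property name still waiting for its text
        self._text = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        name = attributes.get("property") or attributes.get("itemprop")
        if name is None:
            return
        # An explicit attribute value wins over the element's text.
        for key in ("content", "href", "resource", "src"):
            if attributes.get(key):
                self.pairs.append((name, attributes[key]))
                return
        self._pending = name
        self._text = []

    def handle_data(self, data):
        if self._pending is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Simplification: the first closing tag ends the pending property.
        if self._pending is not None:
            self.pairs.append((self._pending, "".join(self._text).strip()))
            self._pending = None


def extract_properties(html):
    """Return the list of (name, value) pairs found in an HTML string."""
    parser = PropertyExtractor()
    parser.feed(html)
    parser.close()
    return parser.pairs


# Made-up markup in the style of schema.org RDFa; not a real WorldCat page.
SAMPLE_HTML = """
<div vocab="http://schema.org/" typeof="Book">
  <span property="name">Example Title</span>
  <a property="author" href="http://example.org/person/1">Example Author</a>
</div>
"""
```

In a real pipeline one would feed each page returned by the search through extract_properties (or, better, through a proper RDFa library) and serialize the result as RDF.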

If someone has created something like this, do speak up.

Yours,

Kevin





On 07/10/2012 04:48 PM, Karen Coyle wrote:
Kevin, if you misunderstand then I undoubtedly haven't been clear (let's
at least share the confusion :-)). Here's the use case:

PersonA wants to create a comprehensive bibliography of works by
AuthorB. The goal is to do a search on AuthorB in WorldCat and extract
the RDFa data from those pages in order to populate the bibliography.

Apart from all of the issues of getting a perfect match on authors and
of manifestation duplicates (there would need to be editing of the
results after retrieval at the user's end), how feasible is this? Assume
that the author is prolific enough that one wouldn't want to look up all
of the records by hand.

kc

On 7/10/12 1:43 PM, Kevin Ford wrote:
As for someone who might want to do this programmatically, he/she
should take a look at the "Programming languages" section of the
second link I sent along:

http://schema.rdfs.org/tools.html

There one can find Ruby, Python, and Java extractors and parsers
capable of outputting RDF.  A developer can take one of these and
programmatically get at the data.

Apologies if I am misunderstanding your intent.

Yours,

Kevin



On 07/10/2012 04:34 PM, Karen Coyle wrote:
Thanks, Kevin! And Richard!
I'm thinking we need a good web site with links to tools. I had already
been introduced to

http://www.w3.org/2012/pyRdfa/

where you can paste a URI and get ttl or rdf/xml. These are all good
resources. But what about someone who wants to do this programmatically,
not through a web site? Richard's message indicates that this isn't yet
available, so perhaps we should be gathering use cases to support the
need? And have a place to post various solutions, even ones that are not
OCLC-specific? (Because I am hoping that the use of microformats will
increase in general.)

kc


On 7/10/12 12:12 PM, Kevin Ford wrote:
is there an open search to get one to the desired records in the first
place?
-- I'm not certain this will fully address your question, but try
these two sites:

Website: http://www.google.com/webmasters/tools/richsnippets
Example: http://tinyurl.com/dx3h5bg

Website: http://linter.structured-data.org/
Example: http://tinyurl.com/bmm8bbc

These sites will extract the data, but I don't think you get your
choice of serialization. The data are extracted and displayed on the
resulting page in the HTML, but at least you can *see* the data.

Additionally, there are a number of "tools" to help with microdata
extraction here:

http://schema.rdfs.org/tools.html
Some of these will allow you to output specific (RDF) serializations.
HTH,

Kevin


On 07/10/2012 02:42 PM, Karen Coyle wrote:
I have demonstrated the schema.org/RDFa microdata in the WC database to
various folks and the question always is: how do I get access to this?
(The only source I have is the Facebook API, me being a "user" rather
than a "maker".) The microdata is CC-BY once you get a Worldcat URI, but
is there an open search to get one to the desired records in the first
place? I'm poorly-versed in WC APIs so I'm hoping others have a better
grasp.

@rjw: the OCLC website does a thorough job of hiding email addresses or
I would have asked this directly. Then again, a discussion here could
have added value.

Thanks,
kc


--
Karen Coyle
[email protected] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
