Re: [CODE4LIB] Worldcat schema.org & search API

Karen Coyle Thu, 12 Jul 2012 15:18:54 -0700

Ross, it might not be yahoo, but that doesn't mean I know what it is. The pyRDFa utility returns garbage for RDF/XML and TTL, but not for JSON. It's only in the JSON output that I am getting any bibliographic data. The other two send me back a bunch of links to css files. I guess this is good news for folks who prefer JSON. Also, I see the OCLC number in the JSON, but not the URI, although the URI appears in the div with the RDFa:

<div itemid="http://www.worldcat.org/oclc/527725" itemscope="" itemtype="http://schema.org/Book" resource="http://www.worldcat.org/oclc/527725" typeof="http://schema.org/Book"><<a href="http://www.worldcat.org/oclc/527725">http://www.worldcat.org/oclc/527725</a>>

I must say I wonder a bit about those double "<<>>" but what do I know? Anyway, here's what I get from pyRDFa:


RDF/XML:

<rdf:RDF>
  <_4:Book rdf:about="http://schema.org/Book"/>
  <rdf:Description rdf:about="http://www.worldcat.org/title/selection-of-early-statistical-papers-of-j-neyman/oclc/527725">
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/css/loginpopup.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/html/masthead.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/css/alerts.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/css/modals_jquery.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/html/layered_divs.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/cssj/N245213502/bundles/print-min.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/css/cr_print.css"/>
    <xhv:stylesheet rdf:resource="http://static.weread.com/css/booksiread/relbookswidget.css?0:5"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/css/itemformat.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/cssj/N1807112156/bundles/screen-min.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/html/record.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/html/yui/build/reset-fonts-grids/reset-fonts-grids.css"/>
    <xhv:stylesheet rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/html/new_wcorg.css"/>
  </rdf:Description>
</rdf:RDF>


JSON:

{
"@context": {
"library": "http://purl.org/library/",
"oclc": "http://www.worldcat.org/oclc/",
"skos": "http://www.w3.org/2004/02/skos/core#",
"madsrdf": "http://www.loc.gov/mads/rdf/v1#",
"schema": "http://schema.org/",
"http://purl.org/library/placeOfPublication": {
"@type": "@id"
},
"http://schema.org/about": {
"@type": "@id"
},
"http://schema.org/publisher": {
"@type": "@id"
},
"http://schema.org/author": {
"@type": "@id"
},
"http://www.w3.org/2004/02/skos/core#inScheme": {
"@type": "@id"
},
"http://www.loc.gov/mads/rdf/v1#isIdentifiedByAuthority": {
"@type": "@id"
}
},
"@id": "oclc:527725",
"@type": "schema:Book",
"schema:inLanguage": {
"@value": "en",
"@language": "en"
},
"library:holdingsCount": {
"@value": "285",
"@language": "en"
},
"schema:author": {
"@id": "http://viaf.org/viaf/24666861",
"@type": "schema:Person",
"madsrdf:isIdentifiedByAuthority": "http://id.loc.gov/authorities/names/n50066374",
"schema:name": {
"@value": "Neyman, Jerzy, 1894-1981.",
"@language": "en"
}
},
"schema:name": {
"@value": "A selection of early statistical papers of J. Neyman.",
"@language": "en"
},
"schema:datePublished": {
"@value": "1967.",
"@language": "en"
},
"schema:numberOfPages": {
"@value": "429",
"@language": "en"
},
"library:oclcnum": {
"@value": "527725",
"@language": "en"
},
"schema:about": [
{
"@type": "skos:Concept",
"madsrdf:isIdentifiedByAuthority": "http://id.loc.gov/authorities/subjects/sh85082133",
"schema:name": {
"@value": "Mathematical statistics.",
"@language": "en"
}
},
{
"@id": "http://dewey.info/class/519/",
"@type": "skos:Concept",
"skos:inScheme": "http://dewey.info/scheme/"
},
{
"@type": "skos:Concept",
"schema:name": {
"@value": "Statistique mathématique.",
"@language": "en"
}
},
{
"@id": "http://id.worldcat.org/fast/1012127",
"@type": "skos:Concept",
"schema:name": {
"@value": "Mathematical statistics‍",
"@language": "en"
}
}
],
"schema:publisher": {
"@type": "schema:Organization",
"schema:name": {
"@value": "University of California Press",
"@language": "en"
}
},
"library:placeOfPublication": {
"@type": "schema:Place",
"schema:name": {
"@value": "Berkeley,",
"@language": "en"
}
}
}
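
On the missing URI: the "@id" value "oclc:527725" looks like a JSON-LD compact IRI, so the full URI should be recoverable by joining it with the "oclc" prefix declared in "@context". A minimal sketch of that expansion, assuming the output is JSON-LD and using a trimmed copy of the record above (the expand_curie helper is a hypothetical name, not part of pyRDFa or any library):

```python
import json

# Trimmed copy of the JSON output above; only the keys used below are kept.
record_text = """
{
  "@context": {
    "library": "http://purl.org/library/",
    "oclc": "http://www.worldcat.org/oclc/",
    "schema": "http://schema.org/"
  },
  "@id": "oclc:527725",
  "@type": "schema:Book",
  "schema:name": {
    "@value": "A selection of early statistical papers of J. Neyman.",
    "@language": "en"
  }
}
"""


def expand_curie(value, context):
    """Expand prefix:suffix using the string-valued prefixes in @context."""
    prefix, sep, suffix = value.partition(":")
    base = context.get(prefix)
    # Values without a declared prefix (e.g. full http:// URIs) pass through.
    if sep and isinstance(base, str) and not suffix.startswith("//"):
        return base + suffix
    return value


record = json.loads(record_text)
context = record["@context"]

record_uri = expand_curie(record["@id"], context)
record_type = expand_curie(record["@type"], context)
title = record["schema:name"]["@value"]

print(record_uri)
print(record_type)
print(title)
```

This only handles simple prefix declarations; a full JSON-LD processor would also deal with nested contexts and term definitions.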

kc

On 7/12/12 2:13 PM, Ross Singer wrote:

Ok, the Pipe didn't quite work as planned.  Yahoo! is stripping out
all of the relevant html attributes when it's converting the WC
microdata html to a string, which renders the whole thing useless.

If I don't convert it to a string, it maintains all of the necessary
attributes in the JSON output, but it strips them from the RSS and
html outputs.

I mean, it's hard to complain about "free thing doesn't handle my
niche problem", but when has that ever stopped me?

Anyway, it's there for somebody to clone and poke around with.  Maybe
somebody more familiar with Pipes can figure a way around this
problem.

-Ross.

On Thu, Jul 12, 2012 at 3:03 PM, Ross Singer <[email protected]> wrote:

I made a Yahoo Pipe that merges the WorldCat Basic OpenSearch RSS
result with the microdata div in the Worldcat pages referred to in the
search results:

http://pipes.yahoo.com/pipes/pipe.info?_id=05ae2a7bc180f3abe36b11bcaf1adc52

You'll need to enter your wskey for it to work.

You can get the output as RSS (which will require the item/description
to be unescaped to use) or JSON (which wouldn't require unescaping).
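
For the RSS case, the unescaping step could look roughly like this. It is a sketch only: the escaped string is a made-up stand-in for an item/description value, not actual Pipe output.

```python
import html

# Hypothetical escaped description, as it might appear in the raw RSS text.
escaped = (
    '&lt;div itemscope="" itemtype="http://schema.org/Book"&gt;'
    '&lt;span itemprop="name"&gt;A selection of early statistical papers'
    '&lt;/span&gt;&lt;/div&gt;'
)

# html.unescape turns &lt; and &gt; back into angle brackets.
markup = html.unescape(escaped)
print(markup)
```

If the feed is read with an XML parser rather than as raw text, the parser undoes one level of escaping on its own, so this step is only needed for raw strings or doubly escaped content.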

It's not terribly fast, but it at least should help somebody get started.

-Ross.

On Thu, Jul 12, 2012 at 1:09 PM, Karen Coyle <[email protected]> wrote:

It isn't unfortunate, it was deliberate. I have a key for the basic api, but
I was being advised that I had overlooked the obvious answer of the worldcat
search API. I have no confusion between the two, except for the confusion
that seems to be promulgated by OCLC itself.

kc



On 7/12/12 9:46 AM, Karen Coombs wrote:

Karen,

Unfortunately it looks like you requested a key for the WorldCat
Search API which does have specific eligibility criteria. The WorldCat
Basic API which Ross mentions is available to anyone -
http://www.oclc.org/developer/services/worldcat-basic-api

It allows you to do an OpenSearch keyword query of WorldCat and get
back basic metadata including the link to the worldcat.org page for
each record returned.
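
A rough sketch of that flow in Python, with no network call. The endpoint URL and the "q" and "wskey" parameter names are assumptions to be checked against the Basic API documentation, and the sample feed is invented for illustration:

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Assumed endpoint; verify against the WorldCat Basic API documentation.
BASE_URL = "http://www.worldcat.org/webservices/catalog/search/opensearch"


def build_search_url(query, wskey):
    """Build a keyword search URL (parameter names are assumptions)."""
    return BASE_URL + "?" + urlencode({"q": query, "wskey": wskey})


def record_links(rss_text):
    """Return the <link> of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    links = []
    for item in root.iter("item"):
        link = item.findtext("link")
        if link:
            links.append(link.strip())
    return links


# Invented sample response, shaped like a minimal RSS 2.0 feed.
sample = """<rss version="2.0"><channel><title>results</title>
<item>
  <title>A selection of early statistical papers of J. Neyman.</title>
  <link>http://www.worldcat.org/oclc/527725</link>
</item>
</channel></rss>"""

search_url = build_search_url("Neyman, Jerzy", "YOUR_WSKEY")
print(search_url)
print(record_links(sample))
```

Each returned link is a worldcat.org page that can then be fetched and run through a microdata or RDFa extractor.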

The easiest way to get a key is to go to http://worldcat.org/config/
and login with a WorldCat username/password. You should see a link
that says WorldCat Basic API Key which you can use to get a key.

I apologize for the confusion between the two APIs (WorldCat Search
and WorldCat Basic). The difference is something we've tried to make
clearer in our documentation but unfortunately given your experience
it is still an issue.

Karen


On Thu, Jul 12, 2012 at 11:33 AM, Karen Coyle <[email protected]> wrote:

On 7/10/12 5:07 PM, Karen Coyle wrote:

On 7/10/12 4:02 PM, Richard Wallis wrote:


But is it available to everyone, and is the data retrieved also usable
as ODC-BY by any member of the Web public?

Yes it is, and at this stage it is only available from within a html
page.


The "it" I was referring to was the API. Roy is telling me that people
should use the API, as if that is an obvious option that I am
overlooking. I am asking if the general web public can use the API to
get this data. I believe that should be a yes/no question/answer.


Since no one here from OCLC had the integrity to answer this question, I
went ahead and applied for a Worldcat API key, and here is the reply:

*****

Hello,

Thank you for your interest in the WorldCat Search API, however at this
time the web service is only available to institutions, primarily
libraries, that have a specific relationship with OCLC and then only for
work related to that library's services. The specific relationship is
explained further here,
http://oclc.org/developer/documentation/worldcat-search-api/who-can-use.

However, there are other OCLC services that are available to individual's
non-commercial use. Looking at the list of services available on
http://www.worldcat.org/wcpa/content/affiliate/ you'll see that the
WorldCat search box and WorldCat links with embedded searches are
available to anyone. You may also be interested in checking out the
WorldCat Registry, or low-volume use of the xISBN and xISSN services.

If you have questions about the service, please contact the product
manager, Dawn Hendricks at [email protected] <mailto:[email protected]>.

*****

There is nothing wrong with having a proprietary API; but pretending that
it isn't (either directly or through omission), or being afraid to say
it, is the kind of thing that has caused me to lose respect for OCLC.
Nothing should be declared "open" that isn't available to all, not just
members. And advertisements for WC API classes should state "members
only." That would be honest. And telling folks on a wide-open list that
they should use the Worldcat API (without mentioning "if you are in a
member institution and using this for library services") is at best
deceiving, at worst dishonest.

I, for one, am tired of OCLC's lies, and I'm not afraid to say it.
Fortunately for me, retirement is looming and I don't need to care who
likes what I say. This is a relief, to say the least.

kc

kc

This experiment is the first step in a process to make linked data about
WorldCat resources available. As it will evolve over time other areas
such as API access, content-negotiation, search & other query methods,
additional RDF data vocabularies, etc., etc., will be considered in
concert with community feedback (such as this thread) as to the way
forward.

Karen I know you are eager to work with and demonstrate the benefits of
this way of publishing data. But these things take time and effort, so
please be a little patient, and keep firing off these use cases and
issues; they are all valuable input.

~Richard.

kc


    Roy

On Tue, Jul 10, 2012 at 2:08 PM, Kevin Ford <[email protected]> wrote:

The use case clarifies perfectly.

Totally feasible.  Well, I should say "totally feasible" with the caveat
that I've never used the Worldcat Search API.  Not letting that stop me,
so long as it is what I imagine it is, then a developer should be able to
perform a search, retrieve the response, and, by integrating one of the
tools advertised on the schema.org website into his/her code, then
retrieve the microdata for each resource returned from the search (and
save it as RDF or whatever).
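
To make the extraction step concrete, here is a minimal sketch using only the Python standard library instead of one of the listed tools. The ItempropCollector class is a hypothetical helper, the sample markup is invented (real WorldCat pages will differ), and the value rules are a simplification of the microdata algorithm:

```python
from html.parser import HTMLParser


class ItempropCollector(HTMLParser):
    """Collect itemid values and flat (itemprop, value) pairs from a page."""

    def __init__(self):
        super().__init__()
        self.item_ids = []
        self.properties = []
        self._current = None  # itemprop whose text is being buffered
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("itemid"):
            self.item_ids.append(attrs["itemid"])
        prop = attrs.get("itemprop")
        if not prop:
            return
        # Simplification: use href/content when present, else the text.
        value = attrs.get("href") or attrs.get("content")
        if value:
            self.properties.append((prop, value))
        else:
            self._current = prop
            self._text = []

    def handle_data(self, data):
        if self._current is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Simplification: the first closing tag ends the buffered property,
        # so markup nested inside a property value is not handled.
        if self._current is not None:
            self.properties.append((self._current, "".join(self._text).strip()))
            self._current = None


# Invented sample markup for illustration only.
page = """
<div itemid="http://www.worldcat.org/oclc/527725" itemscope="" itemtype="http://schema.org/Book">
  <h1 itemprop="name">A selection of early statistical papers of J. Neyman.</h1>
  <a itemprop="author" href="http://viaf.org/viaf/24666861">Neyman, Jerzy, 1894-1981.</a>
  <span itemprop="datePublished">1967.</span>
</div>
"""

collector = ItempropCollector()
collector.feed(page)
print(collector.item_ids)
for prop, value in collector.properties:
    print(prop, "=", value)
```

For anything beyond a quick look, one of the dedicated extractors is likely the safer choice, since nested items and the full value rules are not covered here.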

If someone has created something like this, do speak up.

Yours,

Kevin





On 07/10/2012 04:48 PM, Karen Coyle wrote:

Kevin, if you misunderstand then I undoubtedly haven't been clear (let's
at least share the confusion :-)). Here's the use case:

PersonA wants to create a comprehensive bibliography of works by
AuthorB. The goal is to do a search on AuthorB in WorldCat and extract
the RDFa data from those pages in order to populate the bibliography.

Apart from all of the issues of getting a perfect match on authors and
of manifestation duplicates (there would need to be editing of the
results after retrieval at the user's end), how feasible is this? Assume
that the author is prolific enough that one wouldn't want to look up all
of the records by hand.

kc

On 7/10/12 1:43 PM, Kevin Ford wrote:

As for someone who might want to do this programmatically, he/she
should take a look at the "Programming languages" section of the
second link I sent along:



http://schema.rdfs.org/tools.html

There one can find Ruby, Python, and Java extractors and parsers
capable of outputting RDF.  A developer can take one of these and
programmatically get at the data.

Apologies if I am misunderstanding your intent.

Yours,

Kevin



On 07/10/2012 04:34 PM, Karen Coyle wrote:

Thanks, Kevin! And Richard!

I'm thinking we need a good web site with links to tools. I had already
been introduced to

http://www.w3.org/2012/pyRdfa/

where you can paste a URI and get ttl or rdf/xml. These are all good
resources. But what about someone who wants to do this programmatically,
not through a web site? Richard's message indicates that this isn't yet
available, so perhaps we should be gathering use cases to support the
need? And have a place to post various solutions, even ones that are not
OCLC-specific? (Because I am hoping that the use of microformats will
increase in general.)

kc


On 7/10/12 12:12 PM, Kevin Ford wrote:

is there an open search to get one to the desired records in the first
place?

-- I'm not certain this will fully address your question, but try these
two sites:

Website: http://www.google.com/webmasters/tools/richsnippets
Example: http://tinyurl.com/dx3h5bg

Website: http://linter.structured-data.org/
Example: http://tinyurl.com/bmm8bbc

These sites will extract the data, but I don't think you get your choice
of serialization.  The data are extracted and displayed on the resulting
page in the HTML, but at least you can *see* the data.

Additionally, there are a number of "tools" to help with microdata
extraction here:

http://schema.rdfs.org/tools.html

Some of these will allow you to output specific (RDF) serializations.


HTH,

Kevin


On 07/10/2012 02:42 PM, Karen Coyle wrote:

I have demonstrated the schema.org/RDFa microdata in the WC database to
various folks and the question always is: how do I get access to this?
(The only source I have is the Facebook API, me being a "user" rather
than a "maker".) The microdata is CC-BY once you get a Worldcat URI, but
is there an open search to get one to the desired records in the first
place? I'm poorly-versed in WC APIs so I'm hoping others have a better
grasp.

@rjw: the OCLC website does a thorough job of hiding email addresses or
I would have asked this directly. Then again, a discussion here could
have added value.

Thanks,
kc

--
Karen Coyle
[email protected] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet

