Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?

2010-06-16 Thread Tom Keays
We have been trying to enumerate serials holdings as explicitly as possible.
E.g., this microfiche supplement to a journal,
http://summit.syr.edu/cgi-bin/Pwebrecon.cgi?BBID=274291, shows apparently
missing issues. However, there are two pieces of inferred information here:

1) that every print issue had a corresponding microfiche supplement (they
didn't, so most of these runs are complete even with the gaps), and
2) that volumes, at least up until 1991, had only 26 issues (that is probably
true, but it is not certain); there is also no way to be certain how many
issues per volume were published beginning with 1992 (28? 52?).

v.95:no.3 (1973)-v.95:no.8 (1973)
v.95:no.10 (1973)-v.95:no.26 (1973)
v.96 (1974)-v.97 (1975)
v.98:no.1 (1976)-v.98:no.14 (1976)
v.98:no.16 (1976)-v.98:no.26 (1976)
v.99:no.1 (1977)-v.99:no.25 (1977)
v.100 (1978)-v.108 (1986)
v.109:no.1 (1987)-v.109:no.19 (1987)
v.109:no.21 (1987)-v.109:no.26 (1987)
v.110 (1988)-v.111 (1989)
v.112:no.1 (1990)-v.112:no.26 (1990)
v.113 (1991)
v.114:no.1 (1992)-v.114:no.21 (1992)
v.114:no.23 (1992)-v.114:no.27 (1992)
v.115 (1993)-v.119 (1997)
v.120:no.2 (1998:Jan.21)-v.120:no.51 (1998:Dec.30)
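Holdings statements in this shape are regular enough to screen mechanically. Below is a minimal Python sketch, assuming the v.N:no.N (YYYY) pattern used above; the regex and function names are illustrative, not taken from any existing tool, and the gap check is purely heuristic (it cannot know whether the "missing" issues were ever published):

```python
import re

# Matches enumeration like "v.95:no.3 (1973" — issue number is optional,
# and the closing paren is not required (catalog data sometimes drops it).
RANGE = re.compile(r"v\.(?P<vol>\d+)(?::no\.(?P<no>\d+))?\s*\((?P<year>\d{4})")

def parse_line(line):
    """Return a list of (volume, issue-or-None, year) tuples found in one line."""
    out = []
    for m in RANGE.finditer(line):
        no = m.group("no")
        out.append((int(m.group("vol")), int(no) if no else None, int(m.group("year"))))
    return out

def find_issue_gaps(lines):
    """Flag a gap when one range ends at issue n and the next range starts
    past n+1 within the same volume."""
    gaps = []
    parsed = [parse_line(l) for l in lines]
    for prev, cur in zip(parsed, parsed[1:]):
        if not prev or not cur:
            continue
        (pv, pn, _), (cv, cn, _) = prev[-1], cur[0]
        if pv == cv and pn is not None and cn is not None and cn > pn + 1:
            gaps.append((pv, pn + 1, cn - 1))
    return gaps
```

Run against the list above, this flags v.95 issue 9 and v.98 issue 15 as apparent gaps, which is exactly the inference problem described: the code cannot tell a true gap from an issue that never had a fiche supplement.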




On Tue, Jun 15, 2010 at 9:56 PM, Bill Dueber b...@dueber.com wrote:

 On Tue, Jun 15, 2010 at 5:49 PM, Kyle Banerjee baner...@uoregon.edu
 wrote:
  No, but parsing holding statements for something that just gets cut off
  early or which starts late should be easy unless entry is insanely
  inconsistent.

 And...there it is. :-)

 We're really dealing with a few problems here:

  - Inconsistent entry by catalogers (probably the least of our worries)
  - Inconsistent publishing schedules (e.g., the Jan 1942 issue was
 just plain never printed)
  - Inconsistent use of volume/number/year/month/whatever throughout a
 serial's run.

 So, for example, http://mirlyn.lib.umich.edu/Record/45417/Holdings#1

 There are six holdings:

 1919-1920 incompl
 1920 incompl.
 1922
 v.4 no.49
 v.6 1921 jul-dec
 v.6 1921jan-jun

 We have no way of knowing what year volume 4 was printed in, which
 issues are incomplete in the two volumes that cover 1920, whether
 volume numbers are associated with earlier (or later) issues, etc. We,
 as humans, could try to make some guesses, but they'd just be guesses.

 It's easy to find examples where month ranges overlap (or leave gaps),
 where month names and issue numbers are sometimes used
 interchangeably, where volume numbers suddenly change in the middle of
 a run because of a merge with another serial (or where the first
 volume isn't 1 because the serial broke off from a parent), etc.
 etc. etc.

 I don't mean to overstate the problem. For many (most?) serials whose
 existence only goes back a few decades, a relatively simple approach
 will likely work much of the time -- although even that relatively
 simple approach will have to take into account a solid dozen or so
 different ways that enumcron data may have been entered.

 But to be able to say, with some confidence, that we have the full
 run? Or a particular issue as labeled by a month name? Much, much
 harder in the general case.
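For a sense of what those "dozen or so different ways" look like in practice, here is a hedged Python sketch that normalizes only the handful of enumcron shapes from the Mirlyn example above; real data would need many more patterns, and the field names are illustrative:

```python
import re

# Each pattern covers one shape seen in the Mirlyn holdings list:
#   "v.6 1921 jul-dec" / "v.6 1921jan-jun", "v.4 no.49", "1919-1920 incompl"
PATTERNS = [
    re.compile(r"^v\.(?P<vol>\d+)\s+(?P<year>\d{4})\s*(?P<months>[a-z]{3}-[a-z]{3})?", re.I),
    re.compile(r"^v\.(?P<vol>\d+)\s+no\.(?P<no>\d+)"),
    re.compile(r"^(?P<year>\d{4})(?:-(?P<year2>\d{4}))?"),
]

def normalize(enumcron):
    """Return whichever fields the first matching pattern can recover."""
    for pat in PATTERNS:
        m = pat.match(enumcron.strip())
        if m:
            return {k: v for k, v in m.groupdict().items() if v}
    return {}
```

Note what the sketch cannot recover, which is Bill's point: "v.4 no.49" yields a volume and issue but no year, and "1919-1920 incompl" yields years but no volume; linking the two requires guesswork.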


  -Bill-


 --
 Bill Dueber
 Library Systems Programmer
 University of Michigan Library



Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?

2010-06-16 Thread Rosalyn Metz
Don't forget inconsistent data from the person sending the OpenURL.

Rosalyn






Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?

2010-06-16 Thread Robertson, Wendy C
Regarding the data in OCLC, my understanding (as a former serials cataloger) is 
that there is detailed information for at least some institutions in the 
interlibrary loan portion of the OCLC database but this is not available via 
worldcat. I know our ILL department added detailed information for commonly 
requested titles years ago. I also know we are in the process of getting our 
detailed holdings loaded into OCLC (possibly just on the ILL side, I'm not sure 
about this) and maintaining our holdings through batch updates. Many of our 
current titles use summary holdings, but not all do. I believe the summary 
holdings work much more effectively with ILL as well so our serials catalogers 
have been working for years to improve our local data. As part of our move to 
summary holdings, we also reduced some of the detail in our holdings, so now we 
show only gaps of entire volumes, but not specific missing issues in our coded 
holdings (the missing issues are included in notes in our item-specific
records).

If there is better data available to ILL staff, this may be an avenue you could 
pursue.

Wendy Robertson
Digital Resources Librarian .  The University of Iowa Libraries
1015 Main Library  .  Iowa City, Iowa 52242
wendy-robert...@uiowa.edu
319-335-5821

-Original Message-
From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Bill 
Dueber
Sent: Tuesday, June 15, 2010 8:57 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?



Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?

2010-06-15 Thread Tom Keays
On Mon, Jun 14, 2010 at 3:47 PM, Jonathan Rochkind rochk...@jhu.edu wrote:

 The trick here is that traditional library metadata practices make it _very
 hard_ to tell if a _specific volume/issue_ is held by a given library.  And
 those are the most common use cases for OpenURL.


Yep. That's true even for individual libraries with link resolvers. OCLC is
not going to be able to solve that particular issue until the local
libraries do.


 If you just want to get to the title level (for a journal or a book), you
 can easily write your own thing that takes an OpenURL, and either just
 redirects straight to worldcat.org on isbn/lccn/oclcnum, or actually does
 a WorldCat API lookup to ensure the record exists first and/or looks up on
 author/title/etc too.


I was mainly thinking of sources that use COinS. If you have a rarely held
book, for instance, then OpenURLs resolved against random institutional
endpoints are going to mostly be unproductive. However, a union catalog
such as OCLC already has the information about libraries in the system that
own it. It seems like the more productive path if the goal of a user is
simply to locate a copy, wherever it is held.
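The "redirect straight to worldcat.org" idea mentioned above can be sketched in a few lines. This is an illustrative Python fragment, assuming worldcat.org's public /isbn/, /issn/, and /oclc/ URL patterns; it is not code from Umlaut or any OCLC API, and the OCLC-number handling via rft_id is an assumption about how the source encodes it:

```python
from urllib.parse import parse_qs

def worldcat_link(openurl_query):
    """Pull an identifier out of an OpenURL query string and build a
    worldcat.org lookup URL from it, or return None if nothing usable."""
    params = parse_qs(openurl_query)
    def first(key):
        return params.get(key, [None])[0]
    isbn = first("rft.isbn")
    if isbn:
        return "http://www.worldcat.org/isbn/" + isbn.replace("-", "")
    issn = first("rft.issn")
    if issn:
        return "http://www.worldcat.org/issn/" + issn
    # rft_id may carry an OCLC number as an info URI (assumption)
    rid = first("rft_id") or ""
    if rid.startswith("info:oclcnum/"):
        return "http://www.worldcat.org/oclc/" + rid.split("/", 1)[1]
    return None
```

This only gets you to the title level, of course; it says nothing about which volumes or issues any holding library actually has.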


 Umlaut already includes the 'naive' just link to worldcat.org based on
 isbn, oclcnum, or lccn approach, functionality that was written before the
 worldcat api existed. That is, Umlaut takes an incoming OpenURL, and provides
 the user with a link to a worldcat record based on isbn, oclcnum, or lccn.


Many institutions have chosen to do this. MPOW, however, is a
counter-example and does not link out to OCLC.

Tom


Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?

2010-06-15 Thread Walker, David
 It seems like the more productive path if the goal of a user is
 simply to locate a copy, where ever it is held.

But I don't think users have *locating a copy* as their goal.  Rather, I think 
their goal is to *get their hands on the book*.

If I discover a book via COinS, and you drop me off at Worldcat.org, that 
allows me to see which libraries own the book.  But, unless I happen to be 
affiliated with those institutions, that's kinda useless information.  I have 
no real way of actually getting the book itself.

If, instead, you drop me off at your institution's link resolver menu, and 
provide me an ILL option in the event you don't have the book, the library can 
get the book for me, which is really my *goal*.

That seems like the more productive path, IMO.

--Dave

==
David Walker
Library Web Services Manager
California State University
http://xerxes.calstate.edu



Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?

2010-06-15 Thread Jonathan Rochkind
IF the user is coming from a recognized on-campus IP, you can configure 
WorldCat to give the user an ILL link to your library too. At least if 
you use ILLiad, maybe if you use something else (esp if your ILL 
software can accept OpenURLs too!).


I haven't yet found any good way to do this if the user is off-campus 
(ezproxy not a good solution, how do we 'force' the user to use ezproxy 
for worldcat.org anyway?).


But in any event, I agree with Dave that worldcat.org isn't a great 
interface even if you DO get it to have an ILL link in an odd place. I 
think we can do better. Which is really the whole purpose of Umlaut as 
an institutional link resolver: giving the user a better screen for "I 
found this citation somewhere else; library, what can you do to get it into 
my hands asap?"


Still wondering why Umlaut hasn't gotten more interest from people, heh. 
But we're using it here at JHU, and NYU and the New School are also 
using it.


Jonathan



Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?

2010-06-15 Thread Kyle Banerjee

  The trick here is that traditional library metadata practices make it _very
  hard_ to tell if a _specific volume/issue_ is held by a given library. And
  those are the most common use cases for OpenURL.

 Yep. That's true even for individual libraries with link resolvers. OCLC is
 not going to be able to solve that particular issue until the local
 libraries do.


This might not be as bad as people think. The normal argument is that
holdings are in free text and there's no way staff will ever have enough
time to record volume level holdings. However, significant chunks of the
problem can be addressed using relatively simple methods.

For example, if you can identify complete runs, you know that a library has
all holdings and can start automating things.

With this in mind, the first step is to identify incomplete holdings. The
mere presence of lingo like "missing," "lost," "incomplete," "scattered,"
"wanting," etc. is a dead giveaway. So are bracketed fields that contain
enumeration or temporal data (though you'll get false hits using this method
when catalogers supply enumeration). Commas in any field that contains
enumeration or temporal data also indicate incomplete holdings.

I suspect that the mere presence of a note is a great indicator that
holdings are incomplete, since what kind of yutz writes a note saying "all
the holdings are here just like you'd expect"? Having said that, I need to
crawl through a lot more data before being comfortable with that statement.

Regexp matches can be used to search for closed date ranges in open serials
or close dates within 866 that don't correspond to close dates within fixed
fields.
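Kyle's first pass could be prototyped as a keyword-and-pattern screen. A hedged Python sketch follows; the word list and regexes are illustrative guesses restating the heuristics above, not a vetted rule set, and they inherit the false-hit caveat about cataloger-supplied enumeration:

```python
import re

# Lingo that flags incomplete holdings ("missing", "lost", "incompl.", etc.)
INCOMPLETE_WORDS = re.compile(
    r"\b(missing|lost|incompl\w*|scattered|wanting)\b", re.IGNORECASE
)
# Bracketed fields containing enumeration or dates, e.g. "[1908]"
BRACKETED_ENUM = re.compile(r"\[[^\]]*\d[^\]]*\]")
# A comma following enumeration/temporal data, e.g. "v.1, v.3-v.10"
ENUM_WITH_COMMA = re.compile(r"(?:v\.|no\.|\(\d{4})[^;]*,")

def probably_incomplete(holdings):
    """First-pass screen: True if the free-text statement looks incomplete."""
    if INCOMPLETE_WORDS.search(holdings):
        return True
    if BRACKETED_ENUM.search(holdings):  # may be a false hit: supplied enum
        return True
    if ENUM_WITH_COMMA.search(holdings):
        return True
    return False
```

Statements that survive this screen are candidates for "complete" treatment; the second pass described below would then chip away at the most common patterns inside the flagged group.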

That's the first pass. The second pass would be to search for the most
common patterns that occur within incomplete holdings. Wash, rinse, repeat.
After a while, you'll get down to all the cornball schemes that don't lend
themselves to automation, but hopefully that group of materials is then of a
more manageable size, where throwing labor at the metadata makes some sense.
Guessing whether a volume is available based on timeframe is possibly a good
way to go.

The worst-case scenario, if the program can't handle a statement, is that you
deflect the request to the next institution, and that already happens all the
time for a variety of reasons.

While my comments are mostly concerned with journal holdings, similar logic
can be used with monographic series as well.

kyle


Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?

2010-06-15 Thread Jonathan Rochkind
When I've tried to do this, it's been much harder than your story, I'm 
afraid.


My library data is very inconsistent in the way it expresses its 
holdings. Even _without_ missing items, the holdings are expressed in 
human-readable narrative form which is very difficult to parse reliably.


Theoretically, the holdings are expressed according to, I forget the 
name of the Z. standard, but some standard for expressing human readable 
holdings with certain punctuation and such. Even if they really WERE all 
exactly according to this standard, this standard is not very easy to 
parse consistently and reliably. But in fact, since when these tags are 
entered nothing validates them to this standard -- and at different 
times in history the cataloging staff entering them in various libraries 
had various ideas about how strictly they should follow this local 
policy -- our holdings are not even reliably according to that standard.


But if you think it's easy, please, give it a try and get back to us. :) 
Maybe your library's data is cleaner than mine.


I think it's kind of a crime that our ILS (and many other ILSs) doesn't 
provide a way for holdings to be efficiently entered (or guessed from 
prediction patterns etc.) AND converted to an internal structured format 
that actually contains the semantic info we want. Offering catalogers 
the option to manually enter an MFHD is not a solution.


Jonathan



Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?

2010-06-15 Thread Markus Fischer

Kyle Banerjee schrieb:

This might not be as bad as people think. The normal argument is that
holdings are in free text and there's no way staff will ever have enough
time to record volume level holdings. However, significant chunks of the
problem can be addressed using relatively simple methods.

For example, if you can identify complete runs, you know that a library has
all holdings and can start automating things.


That's what we've done for journal holdings (only) in

https://sourceforge.net/projects/doctor-doc/

Works perfectly in combination with an EZB account 
(rzblx1.uni-regensburg.de/ezeit) as a link resolver. It may be as exact as 
issue level.


The tool is being used by around 100 libraries in Germany, Switzerland 
and Austria.


If you check this one out: don't expect the perfect open-source system. It 
has been developed by me (head of a library and not an IT professional) and 
a colleague (an IT professional). I learned a lot through this one.


There is plenty of room for improvement: some things are not yet implemented 
so nicely, other things are done quite nicely ;-)


If you want to discuss, use or contribute:

https://sourceforge.net/projects/doctor-doc/support

Very welcome!

Markus Fischer





Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?

2010-06-15 Thread Tom Keays
I think my perspective of the user's goal is actually the same (or close
enough to the same) as David's, just stated differently. The user wants the
most local copy or, failing that, a way to order it from another source.

However, I have plenty of examples of faculty and occasional grad students
who are willing to make the trek to a nearby library -- even out of town
libraries -- rather than do ILL. This doesn't encompass every use case or
even a typical use case (are there typical cases?), but it does no harm to
have information even if you can't always act on it.

The problem with OpenURL tied to a particular institution is
a) the person may not have (or know they have) an affiliation with a given
institution,
b) may be coming from outside their institution's IP range so that even the
OCLC Registry redirect trick will fail to get them to a (let alone the
correct) link resolver,
c) there may not be any recourse to find an item if the institution does not
own it (MPOW does not provide a link to WorldCat).

Tom



Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?

2010-06-15 Thread Tom Keays
I do provide the user with the proxied WorldCat URL for just the reasons
Jonathan cites. But, no, being an otherwise open web resource, you can't
force a user to use it.

On Tue, Jun 15, 2010 at 12:22 PM, Jonathan Rochkind rochk...@jhu.eduwrote:


 I haven't yet found any good way to do this if the user is off-campus
 (ezproxy not a good solution, how do we 'force' the user to use ezproxy for
 worldcat.org anyway?).




Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?

2010-06-15 Thread Jonathan Rochkind
I'm not sure what you mean by complete holdings? The library holds the 
entire run of the journal from the first issue printed to the 
last/current? Or just holdings that don't include missing statements?


Perhaps other institutions have more easily parseable holdings data (or 
even holdings data stored in structured form in the ILS) than mine.  For 
mine, even holdings that don't include "missing" are not feasibly 
reliably parseable; I've tried.


Jonathan

Kyle Banerjee wrote:

But if you think it's easy, please, give it a try and get back to us. :)
Maybe your library's data is cleaner than mine.




I don't think it's easy, but I think detecting *complete* holdings is a big
part of the picture and that can be done fairly well.

Cleanliness of data will vary from one institution to another, and quite a
bit of it will be parsable. Even if you can only get half, you're
still way ahead of where you'd otherwise be.


  

I think it's kind of a crime that our ILS (and many other ILSs) doesn't
provide a way for holdings to be efficiency entered (or guessed from
prediction patterns etc) AND converted to an internal structured format that
actually contains the semantic info we want.




There's too much variation in what people want to do. Even going with
manual MFHD, it's still pretty easy to generate stuff that's pretty hard to
parse.

kyle

  


Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?

2010-06-15 Thread Jonathan Rochkind
Oh you really do mean complete like complete publication run?  Very 
few of our journal holdings are complete in that sense, they are 
definitely in the minority.  We start getting something after issue 1, 
or stop getting it before the last issue. Or stop and then start again.


Is this really unusual?

If all you've figured out is the complete publication run of a 
journal, and are assuming your library holds it... wait, how is this 
something you need for any actual use case?


My use case is trying to figure out IF we have a particular 
volume/issue, and ideally, if so, what shelf it is located on. If I'm 
just going to deal with journals we have the complete publication 
history of, I don't have a problem anymore, because the answer will 
always be "yes"; that's a very simple algorithm: print "yes", heh. So, 
yes, if you assume only holdings of complete publication histories, the 
problem does get very easy.


Incidentally, if anyone is looking for a schema and transmission format 
for actual _structured_ holdings information, that's flexible enough for 
idiosyncratic publication histories and holdings, but still structured 
enough to actually be machine-actionable... I still can't recommend Onix 
Serial Holdings highly enough!   I don't think it gets much use, 
probably because most of our systems simply don't _have_ this structured 
information, most of our staff interfaces don't provide reasonably 
efficient interfaces for entering, etc. But if you can get the other 
pieces and just need a schema and representation format, Onix Serial 
Holdings is nice!


Jonathan

Kyle Banerjee wrote:

On Tue, Jun 15, 2010 at 10:13 AM, Jonathan Rochkind rochk...@jhu.eduwrote:

  

I'm not sure what you mean by complete holdings? The library holds the
entire run of the journal from the first issue printed to the last/current?
Or just holdings that dont' include missing statements?




Obviously, there has to be some sort of holdings statement -- I'm presuming
that something reasonably accurate is available. If there is no summary
holdings statement, items aren't inventoried, and holdings are believed to
be incomplete, there's not much to work with.

As far as retrospectively getting data up to scratch in the case of
 hopeless situations, there are paths that make sense. For instance,
retrospectively inventorying serials may be insane. However, from circ and
ILL data, you should know which titles are actually consulted the most. Get
those in shape first and work backwards.

In a major academic library, it may be the case that some titles are *never*
handled, but that doesn't cause problems if no one wants them. For low use
resources, it can make more sense to just handle things manually.

Perhaps other institutions have more easily parseable holdings data (or even
  

holdings data stored in structured form in the ILS) than mine.  For mine,
even holdings that don't include missing are not feasibly reliably
parseable, I've tried.




Note that you can get structured holdings data from sources other than the
library catalog -- if you know what's missing.

Sounds like your situation is particularly challenging. But there are gains
worth chasing. Service issues aside, problems like these raise existential
questions.

If we do an inadequate job of providing access, patrons will just turn to
subscription databases, and no one will care what we do or even whether
we're still around. Most major academic libraries never got their entire
card collection in the online catalog. Patrons don't use that stuff anymore,
and almost no one cares (even among librarians). It would be a mistake to
think this can't happen again.

kyle

  


Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?

2010-06-15 Thread Kyle Banerjee
 Oh you really do mean complete like complete publication run?  Very few
 of our journal holdings are complete in that sense, they are definitely in
 the minority.  We start getting something after issue 1, or stop getting it
 before the last issue. Or stop and then start again.

 Is this really unusual?


No, but parsing holding statements for something that just gets cut off
early or which starts late should be easy unless entry is insanely
inconsistent. If staff enter info even close to standard practices, you
still should be able to read a lot of it even when there are breaks. This is
when anal retentive behavior in the tech services dept saves your bacon.
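The "starts late or gets cut off early" case can indeed be sketched cheaply: compare the years appearing in the free-text statement against the journal's publication span from the fixed fields. Illustrative Python, assuming you already have the publication start/end years (e.g. extracted from the 008); anything messier than a clean truncation falls through:

```python
import re

# Four-digit years from 1600-2099; crude, but free-text holdings rarely
# carry anything more structured than this.
YEAR = re.compile(r"\b(1[6-9]\d\d|20\d\d)\b")

def classify_run(holdings, pub_start, pub_end):
    """Compare the year span of a holdings statement to the publication span."""
    years = [int(y) for y in YEAR.findall(holdings)]
    if not years:
        return "unknown"
    start, end = min(years), max(years)
    if start <= pub_start and end >= pub_end:
        return "complete"
    if start > pub_start and end >= pub_end:
        return "starts late"
    if start <= pub_start and end < pub_end:
        return "cut off early"
    return "partial"
```

As the thread notes, this is lossy: it says nothing about issues missing inside the span, and it trusts the years in the statement, which "anal retentive behavior in the tech services dept" makes far more likely to be true.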

This process will be lossy, but sometimes that's all you can do. Some
situations may be such that there's no reasonable fix that would
significantly improve things. But in that case, it makes sense to move on to
other problems. Otherwise, we wind up spending all our time futzing with
fringe use cases while people actually get what they need elsewhere.

kyle