Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
We have been trying to enumerate serials holdings as explicitly as possible. E.g., this microfiche supplement to a journal, http://summit.syr.edu/cgi-bin/Pwebrecon.cgi?BBID=274291, shows apparently missing issues. However, there are two pieces of inferred information here: 1) that every print issue had a corresponding microfiche supplement (they didn't, so most of these are complete even with the gaps), and 2) that volumes, at least up until 1991, had only 26 issues (that is probably true, but it is not certain), and there is no way to be certain how many issues per volume were published beginning with 1992 (28? 52?).

v.95:no.3 (1973)-v.95:no.8 (1973)
v.95:no.10 (1973)-v.95:no.26 (1973)
v.96 (1974)-v.97 (1975)
v.98:no.1 (1976)-v.98:no.14 (1976)
v.98:no.16 (1976)-v.98:no.26 (1976)
v.99:no.1 (1977)-v.99:no.25 (1977)
v.100 (1978)-v.108 (1986)
v.109:no.1 (1987)-v.109:no.19 (1987)
v.109:no.21 (1987)-v.109:no.26 (1987)
v.110 (1988)-v.111 (1989)
v.112:no.1 (1990)-v.112:no.26 (1990)
v.113 (1991)
v.114:no.1 (1992)-v.114:no.21 (1992)
v.114:no.23 (1992)-v.114:no.27 (1992)
v.115 (1993)-v.119 (1997)
v.120:no.2 (1998:Jan.21)-v.120:no.51 (1998:Dec.30)

On Tue, Jun 15, 2010 at 9:56 PM, Bill Dueber b...@dueber.com wrote:

On Tue, Jun 15, 2010 at 5:49 PM, Kyle Banerjee baner...@uoregon.edu wrote:
> No, but parsing holding statements for something that just gets cut off early or which starts late should be easy unless entry is insanely inconsistent.

And there it is. :-)

We're really dealing with a few problems here:
- Inconsistent entry by catalogers (probably the least of our worries)
- Inconsistent publishing schedules (e.g., the Jan 1942 issue was just plain never printed)
- Inconsistent use of volume/number/year/month/whatever throughout a serial's run.

So, for example, http://mirlyn.lib.umich.edu/Record/45417/Holdings#1 There are six holdings:

1919-1920 incompl
1920 incompl.
1922
v.4 no.49
v.6 1921 jul-dec
v.6 1921 jan-jun

We have no way of knowing what year volume 4 was printed in, which issues are incomplete in the two volumes that cover 1920, whether volume numbers are associated with earlier (or later) issues, etc. We, as humans, could try to make some guesses, but they'd just be guesses. It's easy to find examples where month ranges overlap (or leave gaps), where month names and issue numbers are sometimes used interchangeably, where volume numbers suddenly change in the middle of a run because of a merge with another serial (or where the first volume isn't 1 because the serial broke off from a parent), etc. etc. etc.

I don't mean to overstate the problem. For many (most?) serials whose existence only goes back a few decades, a relatively simple approach will likely work much of the time -- although even that relatively simple approach will have to take into account a solid dozen or so different ways that enumcron data may have been entered. But to be able to say, with some confidence, that we have the full run? Or a particular issue as labeled by a month name? Much, much harder in the general case.

-Bill-

--
Bill Dueber
Library Systems Programmer
University of Michigan Library
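[Ed.: the well-behaved enumcron strings in the Syracuse list above can at least be tokenized mechanically. A minimal sketch in Python; the pattern and function name are illustrative, not from any system discussed in the thread, and this covers only one of the "solid dozen" entry styles Bill mentions.]

```python
import re

# Matches one well-behaved enumcron piece from the example above:
#   "v.95:no.3 (1973)" or "v.113 (1991)" or "v.120:no.2 (1998:Jan.21)"
# Real data needs many more alternatives than this single pattern.
PIECE = re.compile(
    r"v\.(?P<vol>\d+)"          # volume number
    r"(?::no\.(?P<no>\d+))?"    # optional issue number
    r"\s*\((?P<year>\d{4})"     # year; may be followed by ":Jan.21" etc.
)

def parse_enumcron(text):
    """Return a list of (volume, issue-or-None, year) tuples found in text."""
    return [
        (int(m["vol"]), int(m["no"]) if m["no"] else None, int(m["year"]))
        for m in PIECE.finditer(text)
    ]

print(parse_enumcron("v.95:no.3 (1973)-v.95:no.8 (1973)"))
# [(95, 3, 1973), (95, 8, 1973)]
```

A tokenizer like this says nothing about the hard cases in the thread (unnumbered issues, month names used as issue labels, runs that merge or split); it only turns the clean ranges into something machine-comparable.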
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
Don't forget inconsistent data from the person sending the OpenURL.

Rosalyn
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
Regarding the data in OCLC, my understanding (as a former serials cataloger) is that there is detailed information for at least some institutions in the interlibrary loan portion of the OCLC database, but this is not available via WorldCat. I know our ILL department added detailed information for commonly requested titles years ago. I also know we are in the process of getting our detailed holdings loaded into OCLC (possibly just on the ILL side, I'm not sure about this) and maintaining our holdings through batch updates. Many of our current titles use summary holdings, but not all do. I believe the summary holdings work much more effectively with ILL as well, so our serials catalogers have been working for years to improve our local data. As part of our move to summary holdings, we also reduced some of the detail in our holdings, so now we show only gaps of entire volumes, but not specific missing issues, in our coded holdings (the missing issues are included in notes in our item-specific records).

If there is better data available to ILL staff, this may be an avenue you could pursue.

Wendy Robertson
Digital Resources Librarian . The University of Iowa Libraries
1015 Main Library . Iowa City, Iowa 52242
wendy-robert...@uiowa.edu
319-335-5821
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
On Mon, Jun 14, 2010 at 3:47 PM, Jonathan Rochkind rochk...@jhu.edu wrote:
> The trick here is that traditional library metadata practices make it _very hard_ to tell if a _specific volume/issue_ is held by a given library. And those are the most common use cases for OpenURL.

Yep. That's true even for individual libraries with link resolvers. OCLC is not going to be able to solve that particular issue until the local libraries do.

> If you just want to get to the title level (for a journal or a book), you can easily write your own thing that takes an OpenURL, and either just redirects straight to worldcat.org on isbn/lccn/oclcnum, or actually does a WorldCat API lookup to ensure the record exists first and/or looks up on author/title/etc too.

I was mainly thinking of sources that use COinS. If you have a rarely held book, for instance, then OpenURLs resolved against random institutional endpoints are going to be mostly unproductive. However, a union catalog such as OCLC already has the information about the libraries in the system that own it. It seems like the more productive path if the goal of a user is simply to locate a copy, wherever it is held.

> Umlaut already includes the 'naive' just link to worldcat.org based on isbn, oclcnum, or lccn approach, functionality that was written before the worldcat api existed. That is, Umlaut takes an incoming OpenURL, and provides the user with a link to a worldcat record based on isbn, oclcnum, or lccn.

Many institutions have chosen to do this. MPOW, however, represents a counter-example and does not link out to OCLC.

Tom
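[Ed.: the "naive" title-level redirect described here is small enough to sketch. This assumes KEV-style OpenURL keys (`rft.isbn`, `rft_id`, `rft.btitle`); the function name and the exact key handling are my own illustration, not Umlaut's actual implementation, and no API lookup is done to confirm the record exists.]

```python
from urllib.parse import parse_qs, quote

def worldcat_url(openurl_query):
    """Map an OpenURL query string to a title-level worldcat.org URL.

    Naive sketch: redirect on OCLC number or ISBN if present,
    else fall back to a title search, else give up (return None).
    """
    params = parse_qs(openurl_query)

    def first(key):
        return params.get(key, [None])[0]

    rft_id = first("rft_id")
    if rft_id and rft_id.startswith("info:oclcnum/"):
        return "http://www.worldcat.org/oclc/" + rft_id.split("/", 1)[1]
    isbn = first("rft.isbn")
    if isbn:
        return "http://www.worldcat.org/isbn/" + isbn
    title = first("rft.title") or first("rft.btitle")
    if title:
        return "http://www.worldcat.org/search?q=ti:" + quote(title)
    return None

print(worldcat_url("rft.isbn=9780262033848&rft.btitle=Some+Book"))
# http://www.worldcat.org/isbn/9780262033848
```

As the thread notes, this gets a user to a bibliographic record, not to a specific volume/issue, and not necessarily to a copy they can actually obtain.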
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
> It seems like the more productive path if the goal of a user is simply to locate a copy, wherever it is held.

But I don't think users have *locating a copy* as their goal. Rather, I think their goal is to *get their hands on the book*.

If I discover a book via COinS, and you drop me off at Worldcat.org, that allows me to see which libraries own the book. But, unless I happen to be affiliated with those institutions, that's kinda useless information. I have no real way of actually getting the book itself.

If, instead, you drop me off at your institution's link resolver menu, and provide me an ILL option in the event you don't have the book, the library can get the book for me, which is really my *goal*. That seems like the more productive path, IMO.

--Dave

==
David Walker
Library Web Services Manager
California State University
http://xerxes.calstate.edu
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
IF the user is coming from a recognized on-campus IP, you can configure WorldCat to give the user an ILL link to your library too. At least if you use ILLiad, and maybe if you use something else (esp. if your ILL software can accept OpenURLs too!). I haven't yet found any good way to do this if the user is off-campus (ezproxy is not a good solution; how do we 'force' the user to use ezproxy for worldcat.org anyway?).

But in any event, I agree with Dave that worldcat.org isn't a great interface even if you DO get it to have an ILL link in an odd place. I think we can do better. Which is really the whole purpose of Umlaut as an institutional link resolver: giving the user a better screen for "I found this citation somewhere else; library, what can you do to get it in my hands asap?"

Still wondering why Umlaut hasn't gotten more interest from people, heh. But we're using it here at JHU, and NYU and the New School are also using it.

Jonathan
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
> The trick here is that traditional library metadata practices make it _very hard_ to tell if a _specific volume/issue_ is held by a given library. And those are the most common use cases for OpenURL.

> Yep. That's true even for individual libraries with link resolvers. OCLC is not going to be able to solve that particular issue until the local libraries do.

This might not be as bad as people think. The normal argument is that holdings are in free text and there's no way staff will ever have enough time to record volume-level holdings. However, significant chunks of the problem can be addressed using relatively simple methods. For example, if you can identify complete runs, you know that a library has all holdings and can start automating things.

With this in mind, the first step is to identify incomplete holdings. The mere presence of lingo like "missing," "lost," "incomplete," "scattered," "wanting," etc. is a dead giveaway. So are bracketed fields that contain enumeration or temporal data (though you'll get false hits using this method when catalogers supply enumeration). Commas in any field that contains enumeration or temporal data also indicate incomplete holdings. I suspect that the mere presence of a note is a great indicator that holdings are incomplete, since what kind of yutz writes a note saying all the holdings are here just like you'd expect? Having said that, I need to crawl through a lot more data before being comfortable with that statement. Regexp matches can be used to search for closed date ranges in open serials, or close dates within an 866 that don't correspond to close dates within the fixed fields.

That's the first pass. The second pass would be to search for the most common patterns that occur within incomplete holdings. Wash, rinse, repeat. After a while, you'll get to all the cornball schemes that don't lend themselves toward automation, but hopefully that group of materials is down to a more manageable size where throwing labor at the metadata makes some sense. Possibly guessing whether a volume is available based on timeframe is a good way to go. Worst case, if the program can't handle it, you deflect the request to the next institution, and that already happens all the time for a variety of reasons.

While my comments are mostly concerned with journal holdings, similar logic can be used with monographic series as well.

kyle
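[Ed.: Kyle's first-pass heuristics -- gap vocabulary, bracketed enumeration or dates, commas next to enumeration -- can be sketched directly. The function and sample strings below are my own illustration of the screen he describes, not code from any of the systems mentioned; the patterns would need tuning against real 866 data.]

```python
import re

# Vocabulary that is a dead giveaway of incomplete holdings.
GAP_WORDS = re.compile(
    r"\b(missing|lost|incomplete|scattered|wanting)\b", re.IGNORECASE
)

def looks_incomplete(holdings):
    """First-pass screen of a free-text holdings statement (e.g. an 866).

    Errs on the side of flagging incompleteness; whatever survives
    every pass is a candidate complete run that can be automated.
    """
    if GAP_WORDS.search(holdings):
        return True
    # Bracketed fields containing enumeration or dates, e.g. "[v.3]" or "[1944]"
    if re.search(r"\[[^\]]*(?:v\.\s*\d+|\d{4})[^\]]*\]", holdings):
        return True
    # A comma in a field that contains enumeration/dates usually lists gaps
    if "," in holdings and re.search(r"v\.\s*\d+|\d{4}", holdings):
        return True
    return False

for h in ["v.1 (1990)-v.20 (2009)", "v.1-v.5, v.7-v.20", "v.3 [incomplete]"]:
    print(h, "->", looks_incomplete(h))
```

This implements only the first pass; Kyle's second pass (matching common patterns *within* the flagged statements) and the fixed-field cross-checks would layer on top.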
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
When I've tried to do this, it's been much harder than your story, I'm afraid. My library's data is very inconsistent in the way it expresses its holdings. Even _without_ missing items, the holdings are expressed in human-readable narrative form which is very difficult to parse reliably.

Theoretically, the holdings are expressed according to -- I forget the name of the Z. standard, but some standard for expressing human-readable holdings with certain punctuation and such. Even if they really WERE all exactly according to this standard, this standard is not very easy to parse consistently and reliably. But in fact, since nothing validates these tags against the standard when they are entered -- and at different times in history the cataloging staff entering them in various libraries had various ideas about how strictly they should follow this local policy -- our holdings are not even reliably according to that standard.

But if you think it's easy, please, give it a try and get back to us. :) Maybe your library's data is cleaner than mine.

I think it's kind of a crime that our ILS (and many other ILSs) doesn't provide a way for holdings to be efficiently entered (or guessed from prediction patterns, etc.) AND converted to an internal structured format that actually contains the semantic info we want. Offering catalogers the option to manually enter an MFHD is not a solution.

Jonathan
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
Kyle Banerjee schrieb:
> This might not be as bad as people think. The normal argument is that holdings are in free text and there's no way staff will ever have enough time to record volume level holdings. However, significant chunks of the problem can be addressed using relatively simple methods. For example, if you can identify complete runs, you know that a library has all holdings and can start automating things.

That's what we've done for journal holdings (only) in https://sourceforge.net/projects/doctor-doc/

It works perfectly in combination with an EZB account (rzblx1.uni-regensburg.de/ezeit) as a link resolver, and it can be as exact as the issue level. The tool is being used by around 100 libraries in Germany, Switzerland and Austria.

If you check this one out: don't expect the perfect open-source system. It has been developed by me (a head of library, not an IT professional) and a colleague (an IT professional). I learned a lot through this one. There is plenty of room for improvement in it: some things are implemented not yet so nicely, other things are done quite nicely ;-)

If you want to discuss, use or contribute: https://sourceforge.net/projects/doctor-doc/support Very welcome!

Markus Fischer
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
I think my perspective on the user's goal is actually the same (or close enough to the same) as David's, just stated differently. The user wants the most local copy or, failing that, a way to order it from another source. However, I have plenty of examples of faculty and occasional grad students who are willing to make the trek to a nearby library -- even out-of-town libraries -- rather than do ILL. This doesn't encompass every use case or even a typical use case (are there typical cases?), but it does no harm to have information even if you can't always act on it.

The problem with OpenURL tied to a particular institution is that a) the person may not have (or know they have) an affiliation with a given institution, b) they may be coming from outside their institution's IP range, so that even the OCLC Registry redirect trick will fail to get them to a (let alone the correct) link resolver, and c) there may not be any recourse to find an item if the institution does not own it (MPOW does not provide a link to WorldCat).

Tom
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
I do provide the user with the proxied WorldCat URL for just the reasons Jonathan cites. But, no, being an otherwise open web resource, you can't force a user to use it.

On Tue, Jun 15, 2010 at 12:22 PM, Jonathan Rochkind rochk...@jhu.edu wrote:
> I haven't yet found any good way to do this if the user is off-campus (ezproxy not a good solution, how do we 'force' the user to use ezproxy for worldcat.org anyway?).
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
I'm not sure what you mean by complete holdings? The library holds the entire run of the journal from the first issue printed to the last/current? Or just holdings that don't include missing statements?

Perhaps other institutions have more easily parseable holdings data (or even holdings data stored in structured form in the ILS) than mine. For mine, even holdings that don't include missing are not feasibly, reliably parseable; I've tried.

Jonathan

Kyle Banerjee wrote:
>> But if you think it's easy, please, give it a try and get back to us. :) Maybe your library's data is cleaner than mine.
>
> I don't think it's easy, but I think detecting *complete* holdings is a big part of the picture, and that can be done fairly well. Cleanliness of data will vary from one institution to another, and quite a bit of it will be parsable. Even if you can only get half, you're still way ahead of where you'd otherwise be.
>
>> I think it's kind of a crime that our ILS (and many other ILSs) doesn't provide a way for holdings to be efficiently entered (or guessed from prediction patterns etc) AND converted to an internal structured format that actually contains the semantic info we want.
>
> There's too much variation in what people want to do. Even going with manual MFHD, it's still pretty easy to generate stuff that's pretty hard to parse.
>
> kyle
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
Oh, you really do mean complete like a complete publication run? Very few of our journal holdings are complete in that sense; they are definitely in the minority. We start getting something after issue 1, or stop getting it before the last issue. Or stop and then start again. Is this really unusual?

If all you've figured out is the complete publication run of a journal, and are assuming your library holds it... wait, how is this something you need for any actual use case? My use case is trying to figure out IF we have a particular volume/issue and, ideally, if so, what shelf it is located on. If I'm just going to deal with journals we have the complete publication history of, I don't have a problem anymore, because the answer will always be yes; that's a very simple algorithm, print "yes", heh. So, yes, if you assume only holdings of complete publication histories, the problem does get very easy.

Incidentally, if anyone is looking for a schema and transmission format for actual _structured_ holdings information, that's flexible enough for idiosyncratic publication histories and holdings, but still structured enough to actually be machine-actionable... I still can't recommend ONIX Serial Holdings highly enough! I don't think it gets much use, probably because most of our systems simply don't _have_ this structured information, most of our staff interfaces don't provide reasonably efficient interfaces for entering it, etc. But if you can get the other pieces and just need a schema and representation format, ONIX Serial Holdings is nice!

Jonathan

Kyle Banerjee wrote:
> On Tue, Jun 15, 2010 at 10:13 AM, Jonathan Rochkind rochk...@jhu.edu wrote:
>> I'm not sure what you mean by complete holdings? The library holds the entire run of the journal from the first issue printed to the last/current? Or just holdings that don't include missing statements?
>
> Obviously, there has to be some sort of holdings statement -- I'm presuming that something reasonably accurate is available. If there is no summary holdings statement, items aren't inventoried, but holdings are believed to be incomplete, there's not much to work with.
>
> As far as retrospectively getting data up to scratch in hopeless situations, there are paths that make sense. For instance, retrospectively inventorying serials may be insane. However, from circ and ILL data, you should know which titles are actually consulted the most. Get those in shape first and work backwards. In a major academic library, it may be the case that some titles are *never* handled, but that doesn't cause problems if no one wants them. For low-use resources, it can make more sense to just handle things manually.
>
>> Perhaps other institutions have more easily parseable holdings data (or even holdings data stored in structured form in the ILS) than mine. For mine, even holdings that don't include missing are not feasibly reliably parseable, I've tried.
>
> Note that you can get structured holdings data from sources other than the library catalog -- if you know what's missing. Sounds like your situation is particularly challenging. But there are gains worth chasing.
>
> Service issues aside, problems like these raise existential questions. If we do an inadequate job of providing access, patrons will just turn to subscription databases and no one will even care about what we do or even if we're still around. Most major academic libraries never got their entire card collection into the online catalog. Patrons don't use that stuff anymore, and almost no one cares (even among librarians). It would be a mistake to think this can't happen again.
>
> kyle
Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?
> Oh you really do mean complete like complete publication run? Very few of our journal holdings are complete in that sense, they are definitely in the minority. We start getting something after issue 1, or stop getting it before the last issue. Or stop and then start again. Is this really unusual?

No, but parsing holdings statements for something that just gets cut off early or which starts late should be easy unless entry is insanely inconsistent. If staff enter info even close to standard practices, you still should be able to read a lot of it even when there are breaks. This is when anal-retentive behavior in the tech services dept saves your bacon.

This process will be lossy, but sometimes that's all you can do. Some situations may be such that there's no reasonable fix that would significantly improve things. But in that case, it makes sense to move on to other problems. Otherwise, we wind up spending all our time futzing with fringe use cases while people actually get what they need elsewhere.

kyle