Re: [CODE4LIB] Twitter annotations and library software

2010-06-15 Thread Jakob Voss

On 07.06.2010 16:15, Jay Luker wrote:

Hi all,

I found this thread rather interesting and figured I'd try and revive
the convo since apparently some things have been happening in the
twitter annotation space in the past month. I just read on techcrunch
that testing of the annotation features will commence next week [1].
Also it appears that an initial schema for a book type has been
defined [2].


 [1] http://techcrunch.com/2010/06/02/twitter-annotations-testing/
 [2] http://apiwiki.twitter.com/Annotations-Overview#RecommendedTypes


Have any code4libbers gotten involved in this beyond just opining on list?


I don't think so - the discussion slipped into general data modelling 
questions. For the specific, limited use case of twitter annotations, I 
bet the recommended format from [2] will be fine (title is implied as a 
common attribute; url is optional):


{book:{
  title: ...,
  author: ...,
  isbn: ...,
  year: ...,
  url: ...
}}

The only thing I miss is an article type with a doi field for non-books.
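
Something like this, say (a purely hypothetical sketch; the field names are 
my suggestion, not part of the recommended types; shown as a Ruby hash 
serialized to JSON):

require 'json'

# Hypothetical "article" annotation type, mirroring the recommended "book"
# type above; the doi field and all values are made up for illustration.
annotation = {
  "article" => {
    "title"  => "An example article",
    "author" => "Doe, J.",
    "doi"    => "10.1000/xyz123",
    "year"   => "2010",
    "url"    => "http://dx.doi.org/10.1000/xyz123"
  }
}
puts JSON.generate(annotation)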

Cheers,
Jakob


--
Jakob Voß jakob.v...@gbv.de, skype: nichtich
Verbundzentrale des GBV (VZG) / Common Library Network
Platz der Goettinger Sieben 1, 37073 Göttingen, Germany
+49 (0)551 39-10242, http://www.gbv.de


[CODE4LIB] proposal deadline approaches

2010-06-15 Thread EdUI Conference
For those of you looking for an opportunity to showcase a project, talk
about web design, user experience, or just about anything web, here's a quick
reminder that the edUi 2010 (http://eduiconf.org) deadline for proposals is
about one month away.

What is edUi?
A learning opportunity for web professionals serving institutions of
learning.

When is edUi 2010?
November 8-9, 2010

Where is edUi 2010?
Charlottesville, VA

Thanks!

-Trey


[CODE4LIB] new version of cql-ruby

2010-06-15 Thread Jonathan Rochkind
cql-ruby is a ruby gem for parsing CQL and serializing parse trees back 
to CQL, to xCQL, or to a solr query.


A new version has been released, 0.8.0, available from gem update/install.

The new version greatly improves the #to_solr serialization to a solr 
query: it supports translation from more CQL relations than before, fixes 
a couple of bugs, and makes #to_solr raise appropriate exceptions if you 
try to convert CQL that #to_solr does not support.  See: 
http://cql-ruby.rubyforge.org/svn/trunk/lib/cql_ruby/cql_to_solr.rb
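
A quick usage sketch (assuming the CqlRuby::CqlParser entry point as I 
recall it from the docs; check the link above):

require 'rubygems'
require 'cql_ruby'

# Parse CQL into a parse tree, then serialize it back out a few ways.
parser = CqlRuby::CqlParser.new
tree = parser.parse('title = frog and creator any "smith jones"')

puts tree.to_cql    # round-trip back to CQL

begin
  puts tree.to_solr # translate to a solr query
rescue => e
  # CQL that #to_solr can't support now raises instead of mis-translating
  puts "not translatable to solr: #{e.message}"
end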


That's the only change from the previous version: improved #to_solr.

I wrote the improved #to_solr; Chick Markley wrote the original cql-ruby 
gem, which was a port of Mike Taylor's Java CQL parsing code. Ain't 
open source grand?


Jonathan


Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?

2010-06-15 Thread Tom Keays
On Mon, Jun 14, 2010 at 3:47 PM, Jonathan Rochkind rochk...@jhu.edu wrote:

 The trick here is that traditional library metadata practices make it _very
 hard_ to tell if a _specific volume/issue_ is held by a given library.  And
 those are the most common use cases for OpenURL.


Yep. That's true even for individual libraries with link resolvers. OCLC is
not going to be able to solve that particular issue until the local
libraries do.


 If you just want to get to the title level (for a journal or a book), you
 can easily write your own thing that takes an OpenURL, and either just
 redirects straight to worldcat.org on isbn/lccn/oclcnum, or actually does
 a WorldCat API lookup to ensure the record exists first and/or looks up on
 author/title/etc too.


I was mainly thinking of sources that use COinS. If you have a rarely held
book, for instance, then OpenURLs resolved against random institutional
endpoints are mostly going to be unproductive. However, a union catalog
such as OCLC's already has the information about which libraries in the
system own it. That seems like the more productive path if the goal of a
user is simply to locate a copy, wherever it is held.
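
For what it's worth, the naive redirect Jonathan describes can be only a few 
lines. A hypothetical sketch in Ruby; the only real pieces assumed here are 
worldcat.org's /isbn/ and /oclc/ permalink patterns and the usual OpenURL 
parameter names:

# Naive OpenURL-to-WorldCat redirect target, title level only.
def worldcat_url(openurl_params)
  if (isbn = openurl_params['rft.isbn'])
    "http://www.worldcat.org/isbn/#{isbn.delete('-')}"
  elsif (oclcnum = openurl_params['rft_id'].to_s[%r{info:oclcnum/(\d+)}, 1])
    "http://www.worldcat.org/oclc/#{oclcnum}"
  end
end

puts worldcat_url('rft.isbn' => '978-0-306-40615-7')
# http://www.worldcat.org/isbn/9780306406157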


 Umlaut already includes the 'naive' just link to worldcat.org based on
 isbn, oclcnum, or lccn approach, functionality that was written before the
 WorldCat API existed. That is, Umlaut takes an incoming OpenURL and provides
 the user with a link to a worldcat record based on isbn, oclcnum, or lccn.


Many institutions have chosen to do this. MPOW, however, is a
counter-example and does not link out to OCLC.

Tom


Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?

2010-06-15 Thread Walker, David
 It seems like the more productive path if the goal of a user is
 simply to locate a copy, wherever it is held.

But I don't think users have *locating a copy* as their goal.  Rather, I think 
their goal is to *get their hands on the book*.

If I discover a book via COinS, and you drop me off at Worldcat.org, that 
allows me to see which libraries own the book.  But, unless I happen to be 
affiliated with one of those institutions, that's kinda useless information.  
I have no real way of actually getting the book itself.

If, instead, you drop me off at your institution's link resolver menu, and 
provide me an ILL option in the event you don't have the book, the library can 
get the book for me, which is really my *goal*.

That seems like the more productive path, IMO.

--Dave

==
David Walker
Library Web Services Manager
California State University
http://xerxes.calstate.edu



Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?

2010-06-15 Thread Jonathan Rochkind
IF the user is coming from a recognized on-campus IP, you can configure 
WorldCat to give the user an ILL link to your library too. At least if 
you use ILLiad; maybe if you use something else (especially if your ILL 
software can accept OpenURLs too!).


I haven't yet found any good way to do this if the user is off-campus 
(ezproxy not a good solution, how do we 'force' the user to use ezproxy 
for worldcat.org anyway?).


But in any event, I agree with Dave that worldcat.org isn't a great 
interface even if you DO get it to have an ILL link in an odd place. I 
think we can do better. Which is really the whole purpose of Umlaut as 
an institutional link resolver: giving the user a better screen for "I 
found this citation somewhere else; library, what can you do to get it into 
my hands asap?"


Still wondering why Umlaut hasn't gotten more interest from people, heh. 
But we're using it here at JHU, and NYU and the New School are also 
using it.


Jonathan


Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?

2010-06-15 Thread Kyle Banerjee

  The trick here is that traditional library metadata practices make it _very
  hard_ to tell if a _specific volume/issue_ is held by a given library. And
  those are the most common use cases for OpenURL.

 Yep. That's true even for individual libraries with link resolvers. OCLC is
 not going to be able to solve that particular issue until the local
 libraries do.


This might not be as bad as people think. The usual argument is that
holdings are in free text and staff will never have enough time to record
volume-level holdings. However, significant chunks of the problem can be
addressed using relatively simple methods.

For example, if you can identify complete runs, you know that a library has
all holdings and can start automating things.

With this in mind, the first step is to identify incomplete holdings. The
mere presence of lingo like "missing," "lost," "incomplete," "scattered,"
"wanting," etc. is a dead giveaway.  So are bracketed fields that contain
enumeration or temporal data (though you'll get false hits with this method
when catalogers supply enumeration). Commas in any field that contains
enumeration or temporal data also indicate incomplete holdings.

I suspect that the mere presence of a note is a great indicator that
holdings are incomplete, since what kind of yutz writes a note saying "all
the holdings are here, just like you'd expect"? Having said that, I need to
crawl through a lot more data before being comfortable with that statement.

Regexp matches can be used to search for closed date ranges in open serials,
or for closing dates in 866 fields that don't correspond to closing dates in
the fixed fields.

That's the first pass. The second pass would be to search for the most
common patterns that occur within incomplete holdings. Wash, rinse, repeat.
After a while, you'll be left with all the cornball schemes that don't lend
themselves to automation, but hopefully that group of materials is by then
a more manageable size, where throwing labor at the metadata makes some
sense. Possibly guessing whether a volume is available based on timeframe is
a good way to go.
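
The first pass really can be just a handful of patterns. A rough sketch in 
Ruby; the keyword list and regexps are illustrative, not exhaustive:

# First-pass heuristics for flagging incomplete holdings statements.
INCOMPLETE_LINGO = /\b(missing|lost|incomplete|scattered|wanting)\b/i

def incomplete_holdings?(stmt)
  return true if stmt =~ INCOMPLETE_LINGO
  return true if stmt =~ /\[[^\]]*\d[^\]]*\]/ # bracketed enumeration/dates
  return true if stmt =~ /\d\s*,/             # commas inside enumeration
  false
end

puts incomplete_holdings?('v.1 (1950)-v.40 (1990)')            # false
puts incomplete_holdings?('v.1-4, v.6-12 some issues missing') # true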

Worst case, if the program can't handle a statement, you deflect the request
to the next institution, and that already happens all the time for a variety
of reasons.

While my comments are mostly concerned with journal holdings, similar logic
can be used with monographic series as well.

kyle


Re: [CODE4LIB] Code4Lib Northwest meeting report

2010-06-15 Thread Kyle Banerjee
The event was not recorded, but I'm sure a shoutout for slides will generate
some nice stuff to link to. Note to self: in the future, it might not be a
bad idea to copy slides to a flash drive after each presentation.

kyle

On Mon, Jun 14, 2010 at 7:29 PM, Ed Summers e...@pobox.com wrote:

 Wow, this looks like it was a great event. I don't suppose any of the
 talks were recorded, or that any slides are available? I'm
 particularly interested in Karen Estlund's talk about "NoCode: Digital
 Preservation of Electronic Records"...and, well, all of the talks :-)

 //Ed

 On Mon, Jun 14, 2010 at 2:22 PM, Kyle Banerjee kyle.baner...@gmail.com
 wrote:

  Code4Lib Northwest was held June 7 at the White Stag building in
  Portland, OR.

  Registration was closed and a waiting list established at least a month
  before the event because the room capacity of 65 was reached. Ten
  20-minute sessions and 13 lightning talks (listed at
  http://groups.google.com/group/pnwcode4lib/web/code4lib-northwest-2010)
  made for a full day. Our official timekeeper (a screaming flying monkey
  who gets upset if people yak too long) was the only one who was bored,
  as everyone stayed focused to the end.

  About half of the attendees filled out evaluation forms. Roughly 70%
  rated it as excellent, and everyone else gave it the next-highest
  rating. A number of themes appeared in the responses. People loved the
  format of short presentations and lightning talks. They also gave the
  content, food, and venue high marks.

  However, both at the post-conference evaluation and in written comments,
  people said they'd like more opportunity to interact directly with
  others. Breakout sessions, short Q&A sessions after every couple of
  speakers, and other ideas were suggested. These and other ideas will be
  considered for improving next year's event.

  Significantly, a number of excellent technologists with a feel for the
  code4lib spirit indicated willingness to help organize the next Code4Lib
  Northwest, so we're already looking forward to Code4Lib Northwest 2011.

  Respectfully submitted,

  kyle





-- 
--
Kyle Banerjee
Digital Services Program Manager
Orbis Cascade Alliance
baner...@uoregon.edu / 503.999.9787


Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?

2010-06-15 Thread Jonathan Rochkind
When I've tried to do this, it's been much harder than your story suggests, 
I'm afraid.


My library's data is very inconsistent in the way it expresses its 
holdings. Even _without_ missing items, the holdings are expressed in 
human-readable narrative form, which is very difficult to parse reliably.


Theoretically, the holdings are expressed according to some standard (one 
of the Z39 family, I forget the number) for expressing human-readable 
holdings with certain punctuation and such. Even if they really WERE all 
exactly according to this standard, the standard is not very easy to parse 
consistently and reliably. But in fact, since nothing validates these tags 
against the standard when they are entered -- and at different times in 
history the cataloging staff entering them in various libraries had various 
ideas about how strictly they should follow this local policy -- our 
holdings do not even reliably follow that standard.


But if you think it's easy, please, give it a try and get back to us. :) 
Maybe your library's data is cleaner than mine.


I think it's kind of a crime that our ILS (and many other ILSs) doesn't 
provide a way for holdings to be efficiently entered (or guessed from 
prediction patterns, etc.) AND converted to an internal structured format 
that actually contains the semantic info we want. Offering catalogers 
the option to manually enter an MFHD is not a solution.


Jonathan



Re: [CODE4LIB] Code4Lib Northwest meeting report

2010-06-15 Thread Shirley Lincicum
A couple of presenters have already added links to their presentation slides
on the Schedule page at:
http://groups.google.com/group/pnwcode4lib/web/code4lib-northwest-2010

Perhaps we could encourage other presenters to do this as well?

Shirley



Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?

2010-06-15 Thread Markus Fischer

Kyle Banerjee wrote:

This might not be as bad as people think. The normal argument is that
holdings are in free text and there's no way staff will ever have enough
time to record volume level holdings. However, significant chunks of the
problem can be addressed using relatively simple methods.

For example, if you can identify complete runs, you know that a library has
all holdings and can start automating things.


That's what we've done, for journal holdings only, in

https://sourceforge.net/projects/doctor-doc/

It works well in combination with an EZB account 
(rzblx1.uni-regensburg.de/ezeit) as a link resolver, and can be exact down 
to the issue level.

The tool is being used by around 100 libraries in Germany, Switzerland, 
and Austria.

If you check this one out: don't expect a perfect open-source system. It 
was developed by me (a head of library, not an IT professional) and a 
colleague (an IT professional). I learned a lot through it.

There is plenty of room for improvement in it: some things are not yet 
implemented so nicely, other things are done quite nicely ;-)


If you want to discuss, use or contribute:

https://sourceforge.net/projects/doctor-doc/support

You are very welcome!

Markus Fischer





Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?

2010-06-15 Thread Tom Keays
I think my perspective on the user's goal is actually the same (or close
enough to the same) as David's, just stated differently. The user wants the
most local copy or, failing that, a way to order it from another source.
However, I have plenty of examples of faculty and occasional grad students
who are willing to make the trek to a nearby library -- even out of town
libraries -- rather than do ILL. This doesn't encompass every use case or
even a typical use case (are there typical cases?), but it does no harm to
have information even if you can't always act on it.

The problems with OpenURL tied to a particular institution are:
a) the person may not have (or know they have) an affiliation with a given
institution;
b) they may be coming from outside their institution's IP range, so that
even the OCLC Registry redirect trick will fail to get them to a link
resolver at all (let alone the correct one);
c) there may be no recourse to find an item if the institution does not
own it (MPOW does not provide a link to WorldCat).

Tom



Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?

2010-06-15 Thread Tom Keays
I do provide users with the proxied WorldCat URL, for just the reasons
Jonathan cites. But no, WorldCat being an otherwise open web resource, you
can't force a user to use it.
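
That is, something like the following (hypothetical hostname; the login?url= 
prefix is EZproxy's usual starting-point form):

require 'cgi'

# Build a proxied WorldCat link for off-campus users.
EZPROXY = 'https://ezproxy.example.edu/login?url='

puts EZPROXY + CGI.escape('http://www.worldcat.org/oclc/50001234')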

On Tue, Jun 15, 2010 at 12:22 PM, Jonathan Rochkind rochk...@jhu.edu wrote:


 I haven't yet found any good way to do this if the user is off-campus
 (ezproxy not a good solution, how do we 'force' the user to use ezproxy for
 worldcat.org anyway?).




Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?

2010-06-15 Thread Jonathan Rochkind
I'm not sure what you mean by complete holdings? The library holds the 
entire run of the journal, from the first issue printed to the 
last/current? Or just holdings that don't include "missing" statements?


Perhaps other institutions have more easily parseable holdings data (or 
even holdings data stored in structured form in the ILS) than mine.  For 
mine, even holdings that don't include "missing" are not feasibly and 
reliably parseable; I've tried.


Jonathan

Kyle Banerjee wrote:

 But if you think it's easy, please, give it a try and get back to us. :)
 Maybe your library's data is cleaner than mine.

I don't think it's easy, but I think detecting *complete* holdings is a big
part of the picture, and that can be done fairly well.

Cleanliness of data will vary from one institution to another, and quite a
bit of it will be parsable. Even if you can only get half, you're still way
ahead of where you'd otherwise be.

 I think it's kind of a crime that our ILS (and many other ILSs) doesn't
 provide a way for holdings to be efficiently entered (or guessed from
 prediction patterns, etc.) AND converted to an internal structured format
 that actually contains the semantic info we want.

There's too much variation in what people want to do.  Even going with
manual MFHD, it's still pretty easy to generate stuff that's pretty hard to
parse.

kyle


[CODE4LIB] code4lib.hu codesprint report

2010-06-15 Thread Király Péter

Hi!

I am glad to report that we had the first code4lib.hu codesprint yesterday.
The purpose was to code with each other, and to learn something from each
other. It was a 3.5-hour session at the National Széchényi Library,
Budapest. We created a script which extracts ISBN numbers and book cover
images from an OAI-PMH data provider, embedded as METS records. Hopefully
this code will become part of two or three different library- or
book-related services in the next months. We also discussed the technical
details, the advantages, and the rights problems of uploading a local
history photo collection to Flickr. Unfortunately we didn't have time to
code the Flickr part.
There were only a couple of coders, but we had good talks and made new
acquaintances.
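
The core extraction is simple enough to sketch. This is shown in Ruby with 
Nokogiri just for illustration; our actual script is PHP built on Omeka's 
harvester, and the XPath expressions below are made-up placeholders, not 
the real ones:

require 'nokogiri'

# Pull ISBNs and cover-image links out of a METS record that arrived
# embedded in an OAI-PMH response. Paths and file layout are hypothetical.
NS = {
  'mets'  => 'http://www.loc.gov/METS/',
  'dc'    => 'http://purl.org/dc/elements/1.1/',
  'xlink' => 'http://www.w3.org/1999/xlink'
}

doc = Nokogiri::XML(File.read('record.xml'))

isbns = doc.xpath('//dc:identifier', NS).map(&:text)
           .grep(/\b(?:97[89][\d-]{10,14}|\d{9}[\dX])\b/)

covers = doc.xpath('//mets:fileSec//mets:FLocat/@xlink:href', NS).map(&:value)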

(For those in #code4lib: this time we had no bbq, nor 'slambuc', but lots of
biscuits and mineral water. ;-)

If - for whatever reason - you want to follow or join us, see our group 
page:

http://groups.google.com/group/ikr-fejlesztok/

The meeting was run as a section of the Library's K2 (library 2.0)
task force's workshop about the usage of library 2.0 tools.
http://blog.konyvtar.hu/k2/

Some technical details:
- we use PHP as the common language
- for OAI-PMH harvesting we use Omeka's OAI harvester plugin
- for Flickr communication we planned to use Phlickr, a PHP library
- the OAI server we harvested runs at the University of Debrecen and is
based on DSpace
- we found a bug in the Ubuntu version of PHP 5.2.10 (SimpleXMLElement has
a problem with the xpath() method), but we found a workaround as well.

Regards,
Péter


Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?

2010-06-15 Thread Jonathan Rochkind
Oh, you really do mean "complete" as in the complete publication run?  Very 
few of our journal holdings are complete in that sense; they are definitely 
in the minority.  We start getting something after issue 1, or stop getting 
it before the last issue. Or stop and then start again.

Is this really unusual?

If all you've figured out is the complete publication run of a journal, 
and you are assuming your library holds it... wait, how is this something 
you need for any actual use case?

My use case is trying to figure out IF we have a particular volume/issue 
and, ideally, if so, which shelf it is located on.  If I'm just going to 
deal with journals whose complete publication history we hold, I don't 
have a problem anymore, because the answer will always be yes; that's a 
very simple algorithm: print "yes", heh.  So, yes, if you assume only 
holdings of complete publication histories, the problem does get very easy.


Incidentally, if anyone is looking for a schema and transmission format 
for actual _structured_ holdings information, one flexible enough for 
idiosyncratic publication histories and holdings but still structured 
enough to be machine-actionable... I still can't recommend Onix Serial 
Holdings highly enough!   I don't think it gets much use, probably because 
most of our systems simply don't _have_ this structured information, and 
most of our staff interfaces don't provide reasonably efficient ways of 
entering it, etc. But if you have the other pieces and just need a schema 
and representation format, Onix Serial Holdings is nice!


Jonathan

Kyle Banerjee wrote:

On Tue, Jun 15, 2010 at 10:13 AM, Jonathan Rochkind rochk...@jhu.edu wrote:

 I'm not sure what you mean by complete holdings? The library holds the
 entire run of the journal from the first issue printed to the
 last/current? Or just holdings that don't include "missing" statements?

Obviously, there has to be some sort of holdings statement -- I'm presuming
that something reasonably accurate is available. If there is no summary
holdings statement, items aren't inventoried, and holdings are believed to
be incomplete, there's not much to work with.

As far as retrospectively getting data up to scratch in hopeless
situations, there are paths that make sense. For instance, retrospectively
inventorying serials may be insane. However, from circ and ILL data, you
should know which titles are actually consulted the most. Get those in
shape first and work backwards.

In a major academic library, it may be the case that some titles are
*never* handled, but that doesn't cause problems if no one wants them. For
low-use resources, it can make more sense to just handle things manually.

 Perhaps other institutions have more easily parseable holdings data (or
 even holdings data stored in structured form in the ILS) than mine. For
 mine, even holdings that don't include "missing" are not feasibly and
 reliably parseable; I've tried.

Note that you can get structured holdings data from sources other than the
library catalog -- if you know what's missing.

Sounds like your situation is particularly challenging. But there are gains
worth chasing. Service issues aside, problems like these raise existential
questions.

If we do an inadequate job of providing access, patrons will just turn to
subscription databases, and no one will even care what we do or whether
we're still around. Most major academic libraries never got their entire
card catalogs into the online catalog. Patrons don't use that stuff
anymore, and almost no one cares (even among librarians). It would be a
mistake to think this can't happen again.

kyle


Re: [CODE4LIB] WorldCat as an OpenURL endpoint ?

2010-06-15 Thread Kyle Banerjee
 Oh, you really do mean "complete" as in the complete publication run?  Very
 few of our journal holdings are complete in that sense; they are definitely
 in the minority.  We start getting something after issue 1, or stop getting
 it before the last issue. Or stop and then start again.

 Is this really unusual?


No, but parsing holdings statements for something that just gets cut off
early or starts late should be easy unless entry is insanely inconsistent.
If staff enter info even close to standard practices, you should still be
able to read a lot of it even when there are breaks. This is when
anal-retentive behavior in the tech services dept saves your bacon.
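
For the well-behaved entries, even a crude pattern recovers the endpoints 
of a run. A sketch that assumes statements shaped roughly like 
"v.1 (1950)-v.40 (1990)":

# Crude parse of a well-behaved summary holdings statement; an open run
# such as "v.12 (1988)-" leaves last_vol/last_year nil.
RUN = /v\.(\d+)\s*\((\d{4})\)\s*-\s*(?:v\.(\d+)\s*\((\d{4})\))?/

def parse_run(stmt)
  m = RUN.match(stmt) or return nil
  { first_vol: m[1].to_i, first_year: m[2].to_i,
    last_vol: m[3] && m[3].to_i, last_year: m[4] && m[4].to_i }
end

p parse_run('v.1 (1950)-v.40 (1990)')
p parse_run('v.12 (1988)-')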

This process will be lossy, but sometimes that's all you can do. Some
situations may be such that there's no reasonable fix that would
significantly improve things. But in that case, it makes sense to move on to
other problems. Otherwise, we wind up spending all our time futzing with
fringe use cases while people actually get what they need elsewhere.

kyle