Re: [CODE4LIB] haititrust

2012-08-14 Thread Angelina Z
Hi Eric and others,

Belatedly, let me add that we are currently exploring ways of exposing and
making searchable the subsets of HathiTrust volumes that overlap with
individual partner library collections. It is also possible to do some
analysis (though not robust searching) using data from the hathifiles and
comparison with local holdings. If partners would like reports on holdings,
or have questions, they can send requests to feedb...@issues.hathitrust.org
(or use the feedback link at the top right of any HT page, as Jonathan
Rochkind pointed out).

Thanks,
Angelina Zaytsev
Project Librarian, HathiTrust



Re: [CODE4LIB] haititrust

2012-08-14 Thread Robert Haschart

Eric,

These blog postings are interesting. Here at UVa we have added MARC
records for publicly accessible items from HathiTrust to our Solr-based
online catalog, but we have not yet attempted to link the records drawn
from our ILS, which reference physical items on the shelves, to the
HathiTrust MARC records for digitized versions of the same items. The two
records currently appear as separate search results: one can return the
availability of the physical item(s) on the shelf, and the other provides
a link to the HathiTrust page turner for the item.


I think linking the two together would be useful; we simply haven't yet
started a project to look at doing that.


-Bob Haschart
University of Virginia


On 8/14/2012 3:30 PM, Eric Lease Morgan wrote:

Yes, working with the HathiTrust data is interesting, to say the least.

I did a bit of investigation to determine the feasibility of linking our 
bibliographic records to HathiTrust records. These investigations manifested 
themselves in three blog postings:

   1. http://bit.ly/PVsKBg - Describes the overlap between the Hesburgh 
Libraries book collection at the University of Notre Dame and the HathiTrust. 
It also outlines possible services to be implemented against the 'Trust.

   2. http://bit.ly/OgNhCU - Here I describe how I identified and downloaded a 
set of 25,000 MARC records describing public domain items in both the 
HathiTrust and the Hesburgh Libraries' collection.

   3. http://bit.ly/N0P4cl - In this posting I provide an interface for 
browsing a subset of the MARC records, as well as providing the means for 
downloading them to a local disc.

What did I learn? In short, linking our records to HathiTrust records will 
require the coordinated skills and expertise of collection managers, 
catalogers, public service types, and systems types. I sincerely believe we can 
provide enhanced services against our collection -- services beyond linking -- 
if we figure out ways to exploit the HathiTrust.

(P.S. It looks like I misspelled HathiTrust in my initial posting. Speln iz not 
mi 4ta.)



[CODE4LIB] haititrust

2012-08-03 Thread Eric Lease Morgan
If I needed/wanted to know what materials held by my library were also in the 
HaitTrust, then programmatically how could I figure this out? In other words, 
do you know of a way to query the HaitTrust and limit the results to items my 
library owns? --Eric Lease Morgan


Re: [CODE4LIB] haititrust

2012-08-03 Thread Jon Stroop
You can do an empty query in their catalog, and use the Original 
Location facet to filter to a holding library. Programmatically, I'm not 
sure, but you'd probably need to use the Hathi files: 
http://www.hathitrust.org/hathifiles.


-Jon



Re: [CODE4LIB] haititrust

2012-08-03 Thread Stephanie Collett
Hi Eric,

For on-the-fly queries there is a Bib API: http://www.hathitrust.org/bib_api

The hathifiles, which are tab-delimited dumps of the HathiTrust items, would 
work well for adding links to your catalog records: 
http://www.hathitrust.org/hathifiles
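
A minimal sketch of the hathifiles approach, for illustration only. It
assumes the volume identifier sits in the first column and the OCLC numbers
in the eighth, and that several OCLC numbers in one field are
comma-separated; check those positions against the published field list
before relying on them. The function names and the sample rows are made up.

```python
import csv
import io

# Assumed zero-based column positions in a hathifile row; verify them
# against the field list published alongside the hathifiles.
HTID_COL = 0
OCLC_COL = 7


def read_hathifile(handle):
    """Return a row iterator over an open tab-delimited hathifile."""
    return csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)


def links_by_oclc(rows, htid_col=HTID_COL, oclc_col=OCLC_COL):
    """Map each OCLC number to the HathiTrust item URLs that carry it."""
    links = {}
    for row in rows:
        if len(row) <= max(htid_col, oclc_col):
            continue  # skip short or malformed rows
        url = "https://hdl.handle.net/2027/" + row[htid_col]
        # Assumption: multiple OCLC numbers in one field are comma-separated.
        for number in row[oclc_col].split(","):
            number = number.strip().lstrip("0")  # drop padding zeros
            if number:
                links.setdefault(number, []).append(url)
    return links


# Two made-up rows standing in for a real hathifile.
sample = (
    "mdp.0001\tallow\tpd\t100\t\tMIU\t1\t424023\t\t\t\tTitle A\n"
    "uc1.0002\tdeny\tic\t101\t\tUC\t2\t555,000777\t\t\t\tTitle B\n"
)
links = links_by_oclc(read_hathifile(io.StringIO(sample)))
print(links["424023"])
```

With a mapping like this in hand, a catalog export keyed on OCLC number can
be annotated with links in a single pass.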

-Stephanie



Re: [CODE4LIB] haititrust

2012-08-03 Thread Erik Mitchell
Hi Eric

I used an OCLC number match to get a sense of overlap at WFU -
http://www.erikmitchell.info/2011/05/06/how-much-overlap-do-we-have-with-the-hathitrust/,
http://www.erikmitchell.info/2011/05/07/more-on-hathitrust-overlap/.
As I recall, I simply pulled the OCLC numbers from the MARC files
(perhaps even just their spreadsheets) and did some simple database
querying.
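
A rough sketch of that kind of OCLC number match, using Python sets rather
than a database. The normalization rule (keep the trailing digits, drop
prefixes such as "ocm" and leading zeros) and the sample identifiers are
assumptions for illustration, not a description of the actual WFU workflow.

```python
import re


def normalize_oclc(raw):
    """Reduce an OCLC identifier to its bare digits, or None if it has none."""
    match = re.search(r"(\d+)\s*$", raw.strip())
    if not match:
        return None
    return match.group(1).lstrip("0") or None


def overlap(local_ids, hathi_ids):
    """Return the shared OCLC numbers and the share of local ids matched."""
    local = {n for n in map(normalize_oclc, local_ids) if n}
    hathi = {n for n in map(normalize_oclc, hathi_ids) if n}
    shared = local & hathi
    share = len(shared) / len(local) if local else 0.0
    return shared, share


# Made-up identifiers in the mixed formats a catalog export might contain.
local_ids = ["(OCoLC)ocm00424023", "ocn123456789", "987"]
hathi_ids = ["424023", "0000987", "555"]
shared, share = overlap(local_ids, hathi_ids)
print(sorted(shared), round(share, 2))
```

Matching on identifiers is cheap, but it only finds items whose records
carry the same OCLC number on both sides.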

More recently I have been working with the HT files, using text
similarity measures (e.g. pylevenshtein) to compare holdings across
libraries.  This takes a lot of CPU time but has proven to be a pretty
good way to compare holdings at the title level, and I suppose that with a
detailed enough text string (title, pub date, publisher...) you could
focus the comparison on expressions/manifestations rather than just
titles.
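
To illustrate the idea, here is a small pure-Python sketch of an
edit-distance comparison. It is a stand-in for a library such as
pylevenshtein, not the code actually used; the match_key helper and the way
it joins fields are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a, b):
    """Scale the distance to 0..1, where 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def match_key(title, pub_date="", publisher=""):
    """Build a lowercased, whitespace-normalized comparison string."""
    return " ".join(" ".join([title, pub_date, publisher]).lower().split())


ours = match_key("Moby Dick", "1851", "Harper")
theirs = match_key("Moby-Dick", "1851", "Harper & Brothers")
print(round(similarity(ours, theirs), 2))
```

Every pair of strings costs time proportional to the product of their
lengths, which is why comparing whole collections this way is CPU-heavy.
Adding the date and publisher to the key narrows matches toward a
particular edition.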

Erik



Re: [CODE4LIB] haititrust

2012-08-03 Thread Ford, Kevin
Ideally, you shouldn't need the hathifiles.

The HathiTrust search page links to an OpenSearch document [1], which 
promisingly identifies an RSS feed and a JSON serialization of the search 
results.  Neither appears to work. In theory, doing as Jon says and then 
appending view=rss would get you an RSS feed.  There is a contact email in 
the OpenSearch document you might try.  

FWIW, if you look at the search page HTML, there is a fixme note in an HTML 
comment, the same comment, incidentally, that also comments out the RSS feed 
link in the HTML.

Yours,

Kevin

[1] http://catalog.hathitrust.org/Search/OpenSearch?method=describe







Re: [CODE4LIB] haititrust

2012-08-03 Thread Jonathan Rochkind
There is a HathiTrust search API that you can use, in addition to 
RSS/OpenSearch.  I can look up the details when I'm back at work next week if 
you can't find em googling.  In fact, I think there are two separate HT APIs, 
one that searches HT fulltext and one that just searches metadata. 

I use the metadata searching one in production, and indeed use it to look up HT 
records by ISBN, LCCN, and OCLCnum. 
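
A hedged sketch of building such a lookup request. The URL pattern and the
list of identifier types below reflect my reading of the Bib API
documentation and should be treated as assumptions to verify there; the
function name is made up, and the sketch only builds the URL without
fetching it.

```python
from urllib.parse import quote

# Assumed base path and identifier types; confirm against the Bib API docs.
BASE = "https://catalog.hathitrust.org/api/volumes"
ID_TYPES = {"oclc", "lccn", "isbn", "issn", "htid", "recordnumber"}


def bib_api_url(id_type, value, detail="brief"):
    """Build a metadata lookup URL for a single identifier."""
    if id_type not in ID_TYPES:
        raise ValueError("unsupported identifier type: " + id_type)
    if detail not in ("brief", "full"):
        raise ValueError("detail must be 'brief' or 'full'")
    # Percent-encode the value so odd characters cannot break the path.
    return f"{BASE}/{detail}/{id_type}/{quote(str(value), safe='')}.json"


print(bib_api_url("oclc", "424023"))
```

The response, if the pattern holds, is a JSON document describing the
matching records and items; it says nothing about which library owns a
print copy.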

I am not sure if you can limit to just items your library owns using this API 
though.  At a minimum (this may be obvious) your library would probably need to 
be a HT member, and have shared holdings information with HT -- otherwise HT 
has no idea which items your library owns. (My library is a HT member but has 
not yet shared holdings information with HT, because, well, we aren't able to 
identify our holdings reliably with OCLC numbers, which is how HT (reasonably) 
wants it.) 

The support/question link at the top right of all HT pages, contrary to usual 
expectations (heh), actually does usually get directed to the right person and 
get a response, even for technical questions. I'd give it a shot and ask them 
directly. 

Jonathan



Re: [CODE4LIB] haititrust

2012-08-03 Thread Jonathan Rochkind
Not an answer to your question, but if you want to share I'm curious what your 
use case is where you want to limit to items your library owns. 

If HathiTrust has em in fulltext -- why would it matter to your patrons if your 
library has a print copy or not? And if HT does not have them in fulltext 
still, why would it matter to your patrons if your library has a print copy or 
not?



Re: [CODE4LIB] haititrust

2012-08-03 Thread Karen Coyle
I'm not the original poster, but I've run into this before in terms of 
linking library holdings to digital versions. There are a few reasons I 
can think of for doing this linking:


1) Your library is a selection of works that you think will best serve 
your readers. The library catalog is not the only place they should 
look, but it is a useful first place to look.
2) Other functions, like your courseware, link to your catalog; 
discovering additional copies in this way is useful (of course, this 
assumes they aren't running a service like Umlaut, right?)
3) If you don't have a good record of what was digitized from your 
library, HathiTrust might be the best source of that.


One of the big problems that I see with mass digitization and the access 
to those items is the loss of the role of the library in 
selection/collection building. I suppose if you are in a huge library 
like Harvard the collection is so large that it almost approaches 
whatever. For smaller libraries, and with certain user populations, 
the mass of digitized texts is overwhelming. A library like Harvard 
assumes highly sophisticated users; when you combine Harvard and 
Michigan and California together you get a library that few of us can 
function in. I think the challenge for us now is to make that huge 
collection usable by folks other than a few experts.


kc



--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet