Re: [CODE4LIB] haititrust
Hi Eric and others,

Belatedly, let me add that we are currently exploring ways of exposing and making searchable the subsets of HathiTrust volumes that overlap with individual partner library collections. It is also possible to do some analysis (though not robust searching) using data from the hathifiles and comparison with local holdings. If partners would like reports on holdings, or have questions, they can send requests to feedb...@issues.hathitrust.org (or use the feedback link at the top right of any HT page, as Jonathan Rochkind pointed out).

Thanks,
Angelina Zaytsev
Project Librarian, HathiTrust

On Fri, Aug 3, 2012 at 2:51 PM, Karen Coyle li...@kcoyle.net wrote:

I'm not the original poster, but I've run into this before in terms of linking library holdings to digital versions. There are a few reasons I can think of for doing this linking:

1) Your library is a selection of works that you think will best serve your readers. The library catalog is not the only place they should look, but it is a useful first place to look.
2) Other functions, like your courseware, link to your catalog; discovering additional copies in this way is useful (of course, this assumes they aren't running a service like Umlaut, right?)
3) If you don't have a good record of what was digitized from your library, HathiTrust might be the best source of that.

One of the big problems that I see with mass digitization and the access to those items is the loss of the role of the library in selection/collection building. I suppose if you are in a huge library like Harvard the collection is so large that it almost approaches whatever. For smaller libraries, and with certain user populations, the mass of digitized texts is overwhelming. A library like Harvard assumes highly sophisticated users; when you combine Harvard and Michigan and California together you get a library that few of us can function in.

I think the challenge for us now is to make that huge collection usable by folks other than a few experts.

kc

On 8/3/12 11:26 AM, Jonathan Rochkind wrote:

Not an answer to your question, but if you want to share I'm curious what your use case is where you want to limit to items your library owns. If HathiTrust has em in fulltext -- why would it matter to your patrons if your library has a print copy or not? And if HT does not have them in fulltext still, why would it matter to your patrons if your library has a print copy or not?

From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Eric Lease Morgan [emor...@nd.edu]
Sent: Friday, August 03, 2012 11:07 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] haititrust

If I needed/wanted to know what materials held by my library were also in the HaitTrust, then programmatically how could I figure this out? In other words, do you know of a way to query the HaitTrust and limit the results to items my library owns?

--Eric Lease Morgan

--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet

In your thirst for knowledge, be sure not to drown in all the information. ~Anthony J. D'Angelo
Il y a autant de beaux idéals que de formes de nez différentes ou de caractères différents. ~Stendhal
Education can give you a skill, but a liberal education can give you dignity. ~Ellen Key
Re: [CODE4LIB] haititrust
Eric,

These blog postings are interesting. Here at UVa we have added MARC records for publicly accessible items from Hathi Trust into our Solr-based online catalog, but we have made no attempt yet to link the records drawn from our ILS, which reference physical items on the shelves, to the Hathi Trust MARC records that are digitized versions of the same items. The two records currently appear as separate search results: one can return availability of the physical item(s) on the shelf, and the other provides a link to the Hathi Trust page turner for the item. I think linking the two together would be useful; we simply haven't yet started a project to look at doing that.

-Bob Haschart
University of Virginia

On 8/14/2012 3:30 PM, Eric Lease Morgan wrote:

Yes, working with the HathiTrust data is interesting, to say the least. I did a bit of investigation to determine the feasibility of linking our bibliographic records to HathiTrust records. These investigations manifested themselves in three blog postings:

1. http://bit.ly/PVsKBg - Describes the overlap between the Hesburgh Libraries book collection at the University of Notre Dame and the HathiTrust. It also outlines possible services to be implemented against the 'Trust.
2. http://bit.ly/OgNhCU - Here I describe how I identified and downloaded a set of 25,000 MARC records describing public domain items in both the HathiTrust and the Hesburgh Libraries' collection.
3. http://bit.ly/N0P4cl - In this posting I provide an interface for browsing a subset of the MARC records, as well as providing the means for downloading them to a local disc.

What did I learn? In short, linking our records to HathiTrust records will require the coordinated skills and expertise of collection managers, catalogers, public service types, and systems types. I sincerely believe we can provide enhanced services against our collection -- services beyond linking -- if we figure out ways to exploit the HathiTrust.

(P.S. It looks like I misspelled HathiTrust in my initial posting. Speln iz not mi 4ta.)
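
The linking Bob describes could be sketched roughly as follows. This is only an illustration: the message does not say which key UVa would match on, so the sketch assumes both record sets carry a shared identifier (here an OCLC number), and the dictionary keys "id" and "oclc" are hypothetical.

```python
from collections import defaultdict


def pair_records(print_records, digital_records):
    """Attach digitized records to print records that share an OCLC number.

    Each record is a plain dict; the "oclc" key is a hypothetical field name.
    """
    digital_by_oclc = defaultdict(list)
    for rec in digital_records:
        key = rec.get("oclc")
        if key:  # ignore records that have no identifier to match on
            digital_by_oclc[key].append(rec)

    merged = []
    for rec in print_records:
        key = rec.get("oclc")
        links = digital_by_oclc.get(key, []) if key else []
        merged.append({"print": rec, "digital": links})
    return merged
```

A catalog could then show one search result per print record, with the page-turner links from the "digital" list displayed alongside shelf availability.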
[CODE4LIB] haititrust
If I needed/wanted to know what materials held by my library were also in the HaitTrust, then programmatically how could I figure this out? In other words, do you know of a way to query the HaitTrust and limit the results to items my library owns? --Eric Lease Morgan
Re: [CODE4LIB] haititrust
You can do an empty query in their catalog, and use the Original Location facet to filter to a holding library. Programmatically, I'm not sure, but you'd probably need to use the Hathi files: http://www.hathitrust.org/hathifiles.

-Jon

On 08/03/2012 11:07 AM, Eric Lease Morgan wrote:

If I needed/wanted to know what materials held by my library were also in the HaitTrust, then programmatically how could I figure this out? In other words, do you know of a way to query the HaitTrust and limit the results to items my library owns?

--Eric Lease Morgan
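
A minimal sketch of the hathifiles route Jon suggests, assuming you have a downloaded hathifile and a list of your library's OCLC numbers. The column position of the OCLC number and the comma separator for multiple values are assumptions here; check them against the hathifiles description before relying on this.

```python
import csv

# Assumption: zero-based position of the OCLC number column in a hathifile row.
OCLC_COLUMN = 7


def normalize_oclc(value):
    """Keep only the digits and drop leading zeros so numbers compare equal."""
    digits = "".join(ch for ch in value if ch.isdigit())
    return digits.lstrip("0")


def hathi_overlap(hathifile_lines, local_oclc_numbers, oclc_column=OCLC_COLUMN):
    """Return {oclc_number: [htid, ...]} for volumes that match local holdings.

    hathifile_lines is any iterable of tab-delimited lines, such as an open file.
    """
    local = {normalize_oclc(n) for n in local_oclc_numbers}
    local.discard("")
    matches = {}
    reader = csv.reader(hathifile_lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) <= oclc_column:
            continue  # skip short or malformed rows
        htid = row[0]  # the volume identifier is the first column
        # Assumption: several OCLC numbers in one field are comma-separated.
        for raw in row[oclc_column].split(","):
            number = normalize_oclc(raw)
            if number and number in local:
                matches.setdefault(number, []).append(htid)
    return matches
```

With a real file this might be called as `hathi_overlap(open("hathi_full.txt", encoding="utf-8"), my_numbers)`, where the file name is a placeholder.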
Re: [CODE4LIB] haititrust
Hi Eric,

For on-the-fly queries there is a BibAPI. http://www.hathitrust.org/bib_api

The hathifiles, which is a tab-delimited output of the HathiTrust items, would work well for adding links to your catalog records. http://www.hathitrust.org/hathifiles

-Stephanie

From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Jon Stroop [jstr...@princeton.edu]
Sent: Friday, August 03, 2012 8:15 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] haititrust

You can do an empty query in their catalog, and use the Original Location facet to filter to a holding library. Programmatically, I'm not sure, but you'd probably need to use the Hathi files: http://www.hathitrust.org/hathifiles.

-Jon

On 08/03/2012 11:07 AM, Eric Lease Morgan wrote:

If I needed/wanted to know what materials held by my library were also in the HaitTrust, then programmatically how could I figure this out? In other words, do you know of a way to query the HaitTrust and limit the results to items my library owns?

--Eric Lease Morgan
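
For the on-the-fly case, a lookup against the Bib API might look something like the sketch below. The URL pattern reflects my understanding of the Bib API documentation linked above; treat the base URL and the set of identifier types as assumptions to verify there. The `lookup` helper is an illustrative name, not part of any HathiTrust library.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: base URL and URL pattern as described in the Bib API docs.
BIB_API_BASE = "https://catalog.hathitrust.org/api/volumes"
# Only the identifier types used in this sketch; the API may accept others.
ID_TYPES = {"oclc", "lccn", "isbn", "issn"}


def bib_api_url(id_type, identifier, detail="brief"):
    """Build a single-identifier Bib API URL that returns JSON."""
    if id_type not in ID_TYPES:
        raise ValueError(f"unsupported identifier type: {id_type}")
    if detail not in ("brief", "full"):
        raise ValueError(f"unsupported detail level: {detail}")
    safe_id = quote(str(identifier), safe="")
    return f"{BIB_API_BASE}/{detail}/{id_type}/{safe_id}.json"


def lookup(id_type, identifier):
    """Fetch and decode one Bib API response (makes a network request)."""
    with urlopen(bib_api_url(id_type, identifier), timeout=30) as response:
        return json.load(response)
```

One request per catalog record is fine for a record-display page, but for comparing a whole collection the hathifiles are probably the better fit, as Stephanie notes.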
Re: [CODE4LIB] haititrust
Hi Eric,

I used an OCLC number match to get a sense of overlap at WFU - http://www.erikmitchell.info/2011/05/06/how-much-overlap-do-we-have-with-the-hathitrust/, http://www.erikmitchell.info/2011/05/07/more-on-hathitrust-overlap/. As I recall I simply pulled the OCLC numbers from the MARC files (perhaps even just their spreadsheets) and did some simple database querying.

More recently I have been working with the HT files using text similarity measures (e.g. pylevenshtein) to compare holdings across libraries. This takes a lot of CPU time but has proven to be a pretty good way to compare holdings at a title level, and I suppose with a detailed enough text string (title, pub date, publisher...) you could focus the comparison on expressions/manifestations rather than just titles.

Erik

On Fri, Aug 3, 2012 at 11:15 AM, Jon Stroop jstr...@princeton.edu wrote:

You can do an empty query in their catalog, and use the Original Location facet to filter to a holding library. Programmatically, I'm not sure, but you'd probably need to use the Hathi files: http://www.hathitrust.org/hathifiles.

-Jon

On 08/03/2012 11:07 AM, Eric Lease Morgan wrote:

If I needed/wanted to know what materials held by my library were also in the HaitTrust, then programmatically how could I figure this out? In other words, do you know of a way to query the HaitTrust and limit the results to items my library owns?

--Eric Lease Morgan
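
The text-similarity comparison Erik describes could be sketched as below. Note the substitution: this uses `difflib.SequenceMatcher` from the Python standard library as a stand-in for pylevenshtein, so the scores are not Levenshtein distances. The 0.9 threshold and the function names are hypothetical choices for illustration.

```python
from difflib import SequenceMatcher


def match_key(title, pub_date="", publisher=""):
    """Build a normalized comparison string from bibliographic fields."""
    text = " ".join(part for part in (title, pub_date, publisher) if part).lower()
    # Replace punctuation with spaces, then collapse runs of whitespace.
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in text)
    return " ".join(cleaned.split())


def similarity(a, b):
    """Return a ratio in [0, 1]; 1.0 means the strings are identical."""
    return SequenceMatcher(None, a, b).ratio()


def best_matches(local_keys, hathi_keys, threshold=0.9):
    """Pair each local key with its most similar HathiTrust key, if close enough.

    Every local key is compared against every HathiTrust key, which is why
    this kind of comparison is CPU-hungry on large collections.
    """
    matches = []
    for local in local_keys:
        scored = [(similarity(local, candidate), candidate) for candidate in hathi_keys]
        if not scored:
            continue
        score, candidate = max(scored)
        if score >= threshold:
            matches.append((local, candidate, score))
    return matches
```

Adding the date and publisher to the key, as Erik suggests, pushes the comparison toward specific editions rather than titles alone.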
Re: [CODE4LIB] haititrust
Ideally, you shouldn't need the hathifiles. The HathiTrust search page links to an OpenSearch document [1], which promisingly identifies an RSS feed and a JSON serialization of the search results. Neither appears to work. In theory, doing as Jon says and then appending view=rss would get you an RSS feed. There is a contact email in the OpenSearch document you might try.

FWIW, if you look at the search page HTML, there is a fixme note in an HTML comment, the same comment, incidentally, that also comments out the RSS feed link in the HTML.

Yours,
Kevin

[1] http://catalog.hathitrust.org/Search/OpenSearch?method=describe

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Jon Stroop
Sent: Friday, August 03, 2012 11:15 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] haititrust

You can do an empty query in their catalog, and use the Original Location facet to filter to a holding library. Programmatically, I'm not sure, but you'd probably need to use the Hathi files: http://www.hathitrust.org/hathifiles.

-Jon

On 08/03/2012 11:07 AM, Eric Lease Morgan wrote:

If I needed/wanted to know what materials held by my library were also in the HaitTrust, then programmatically how could I figure this out? In other words, do you know of a way to query the HaitTrust and limit the results to items my library owns?

--Eric Lease Morgan
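
If the feed were working, "appending view=rss" to a faceted search URL could be done along these lines. This only shows the URL manipulation; Kevin reports that the RSS and JSON views did not actually respond at the time, and the helper name `with_view` is made up for this sketch.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_view(search_url, view="rss"):
    """Return search_url with its "view" query parameter set to the given value."""
    parts = urlsplit(search_url)
    # Keep existing parameters (including empty ones), dropping any old "view".
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "view"
    ]
    query.append(("view", view))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

The input would be whatever URL the browser shows after running the empty query and applying the Original Location facet.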
Re: [CODE4LIB] haititrust
There is a HathiTrust search API that you can use, in addition to RSS/OpenSearch. I can look up the details when I'm back at work next week if you can't find em googling.

In fact, I think there are two separate HT APIs, one that searches HT fulltext and one that just searches metadata. I use the metadata searching one in production, and indeed use it to look up HT records by ISBN, LCCN, and OCLCnum.

I am not sure if you can limit to just items your library owns using this API though. At a minimum (this may be obvious) your library would probably need to be a HT member, and have shared holdings information with HT -- otherwise HT has no idea which items your library owns. (My library is a HT member but has not yet shared holdings information with HT, because, well, we aren't able to identify our holdings reliably with OCLCnumbers, which is how HT (reasonably) wants it.)

The support/question link at the top right of all HT pages, contrary to usual expectations (heh), actually does usually get directed to the right person and get a response, even for technical questions. I'd give asking them directly a shot.

Jonathan

From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Ford, Kevin [k...@loc.gov]
Sent: Friday, August 03, 2012 12:20 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] haititrust

Ideally, you shouldn't need the hathifiles. The HathiTrust search page links to an OpenSearch document [1], which promisingly identifies an RSS feed and a JSON serialization of the search results. Neither appears to work. In theory, doing as Jon says and then appending view=rss would get you an RSS feed. There is a contact email in the OpenSearch document you might try.

FWIW, if you look at the search page HTML, there is a fixme note in an HTML comment, the same comment, incidentally, that also comments out the RSS feed link in the HTML.

Yours,
Kevin

[1] http://catalog.hathitrust.org/Search/OpenSearch?method=describe

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Jon Stroop
Sent: Friday, August 03, 2012 11:15 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] haititrust

You can do an empty query in their catalog, and use the Original Location facet to filter to a holding library. Programmatically, I'm not sure, but you'd probably need to use the Hathi files: http://www.hathitrust.org/hathifiles.

-Jon

On 08/03/2012 11:07 AM, Eric Lease Morgan wrote:

If I needed/wanted to know what materials held by my library were also in the HaitTrust, then programmatically how could I figure this out? In other words, do you know of a way to query the HaitTrust and limit the results to items my library owns?

--Eric Lease Morgan
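
Once a metadata lookup of the kind Jonathan describes has returned a decoded JSON response, picking out the readable volumes might look like the sketch below. The field names ("items", "usRightsString", "orig", "htid", "itemURL") are my recollection of the Bib API response and should be checked against its documentation. Also note that "orig" names the institution a scan came from, which is not the same thing as what a library currently holds, so this does not answer the holdings question on its own.

```python
def full_view_items(response, original_source=None):
    """Return (htid, itemURL) pairs for items marked as full view.

    response is a decoded Bib API JSON document (a dict).
    original_source, if given, keeps only items whose "orig" field equals it.
    """
    results = []
    for item in response.get("items", []):
        # Assumption: full-text availability is signalled by this string.
        if item.get("usRightsString", "").lower() != "full view":
            continue
        if original_source and item.get("orig") != original_source:
            continue
        results.append((item.get("htid"), item.get("itemURL")))
    return results
```

Filtering on the original source speaks to Karen's third reason elsewhere in the thread (finding what was digitized from your own library) more than to the print-holdings overlap Eric asked about.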
Re: [CODE4LIB] haititrust
Not an answer to your question, but if you want to share I'm curious what your use case is where you want to limit to items your library owns. If HathiTrust has em in fulltext -- why would it matter to your patrons if your library has a print copy or not? And if HT does not have them in fulltext still, why would it matter to your patrons if your library has a print copy or not?

From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Eric Lease Morgan [emor...@nd.edu]
Sent: Friday, August 03, 2012 11:07 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] haititrust

If I needed/wanted to know what materials held by my library were also in the HaitTrust, then programmatically how could I figure this out? In other words, do you know of a way to query the HaitTrust and limit the results to items my library owns?

--Eric Lease Morgan
Re: [CODE4LIB] haititrust
I'm not the original poster, but I've run into this before in terms of linking library holdings to digital versions. There are a few reasons I can think of for doing this linking:

1) Your library is a selection of works that you think will best serve your readers. The library catalog is not the only place they should look, but it is a useful first place to look.
2) Other functions, like your courseware, link to your catalog; discovering additional copies in this way is useful (of course, this assumes they aren't running a service like Umlaut, right?)
3) If you don't have a good record of what was digitized from your library, HathiTrust might be the best source of that.

One of the big problems that I see with mass digitization and the access to those items is the loss of the role of the library in selection/collection building. I suppose if you are in a huge library like Harvard the collection is so large that it almost approaches whatever. For smaller libraries, and with certain user populations, the mass of digitized texts is overwhelming. A library like Harvard assumes highly sophisticated users; when you combine Harvard and Michigan and California together you get a library that few of us can function in.

I think the challenge for us now is to make that huge collection usable by folks other than a few experts.

kc

On 8/3/12 11:26 AM, Jonathan Rochkind wrote:

Not an answer to your question, but if you want to share I'm curious what your use case is where you want to limit to items your library owns. If HathiTrust has em in fulltext -- why would it matter to your patrons if your library has a print copy or not? And if HT does not have them in fulltext still, why would it matter to your patrons if your library has a print copy or not?

From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Eric Lease Morgan [emor...@nd.edu]
Sent: Friday, August 03, 2012 11:07 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] haititrust

If I needed/wanted to know what materials held by my library were also in the HaitTrust, then programmatically how could I figure this out? In other words, do you know of a way to query the HaitTrust and limit the results to items my library owns?

--Eric Lease Morgan

--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet