Re: [CODE4LIB] hathitrust research center workset browser

2015-06-01 Thread Karen Coyle
Right. Which is why *someone* copied all of the Google digitized books 
to the Internet Archive -- someone not associated with the library 
partners. So generally if you cannot download from HT you can find the 
same scan via openlibrary.org. Unfortunately that doesn't help with 
using the tool that ELM has alerted us to.


kc

On 6/1/15 2:19 PM, Jimmy Ghaphery wrote:

I think we are in agreement (especially about the utility of all things
HathiTrust). My one point is that any restrictions on digitized public
domain works, as I understand it, are not related to copyright.

On Mon, Jun 1, 2015 at 5:00 PM, Terry Reese  wrote:


However, the digitizing agency cannot dictate any copyright
restrictions on the digitized copies once released to the public

The digital objects have not, and as far as I understand, cannot be made
available to the public if digitized as part of the google books
digitization project.  Most institutions got very limited use, and
generally these were tied to their specific, immediate, communities.
Though, with that said each institution has slightly different terms.  For
what it's worth, the research center does not make the digital copies
available for download -- it provides tools for working with data in
aggregate (worksets) and provides a proof of concept environment
demonstrating the feasibility of creating a secured data repository with I
believe the long-term goal of providing data mining for the entire
hathitrust resources (both within and outside of the public domain).  But
even as it stands now, the tool has become a fantastic teaching tool when
talking to instructors and graduate students looking for large data sets to
work with, that also includes some pretty interesting research algorithms
for working with the data.

--tr

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
Jimmy Ghaphery
Sent: Monday, June 1, 2015 4:47 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] hathitrust research center workset browser

Thanks Eric for posting the webinar in the other thread.

I am pretty sure that digitizing something in the public domain does not
change its copyright status, at least in the U.S. The digitizing agency
certainly has the right to sell, restrict access, watermark, or even keep
the scans locked up on a thumb drive in a closet. They are not obligated to
share or to provide the digital files in a re-usable format. However, the
digitizing agency cannot dictate any copyright restrictions on the
digitized copies once released to the public.

#iamnotalawyer and welcome correction

best,

Jimmy



On Mon, Jun 1, 2015 at 12:12 PM, Eric Lease Morgan  wrote:


On Jun 1, 2015, at 10:58 AM, davesgonechina 
wrote:


They just informed me I need a .edu address. Having trouble
understanding the use of the term "public domain" here.

   Gung fhpx, naq fbhaqf ernyyl fbeg bs fghcvq!! --RYZ




--
Jimmy Ghaphery
Head, Digital Technologies
VCU Libraries
804-827-3551






--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: +1-510-435-8234
skype: kcoylenet/+1-510-984-3600


Re: [CODE4LIB] hathitrust research center workset browser

2015-06-01 Thread Jimmy Ghaphery
I think we are in agreement (especially about the utility of all things
HathiTrust). My one point is that any restrictions on digitized public
domain works, as I understand it, are not related to copyright.

On Mon, Jun 1, 2015 at 5:00 PM, Terry Reese  wrote:

> >> However, the digitizing agency cannot dictate any copyright
> >>restrictions on the digitized copies once released to the public
>
> The digital objects have not, and as far as I understand, cannot be made
> available to the public if digitized as part of the google books
> digitization project.  Most institutions got very limited use, and
> generally these were tied to their specific, immediate, communities.
> Though, with that said each institution has slightly different terms.  For
> what it's worth, the research center does not make the digital copies
> available for download -- it provides tools for working with data in
> aggregate (worksets) and provides a proof of concept environment
> demonstrating the feasibility of creating a secured data repository with I
> believe the long-term goal of providing data mining for the entire
> hathitrust resources (both within and outside of the public domain).  But
> even as it stands now, the tool has become a fantastic teaching tool when
> talking to instructors and graduate students looking for large data sets to
> work with, that also includes some pretty interesting research algorithms
> for working with the data.
>
> --tr
>
> -Original Message-
> From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
> Jimmy Ghaphery
> Sent: Monday, June 1, 2015 4:47 PM
> To: CODE4LIB@LISTSERV.ND.EDU
> Subject: Re: [CODE4LIB] hathitrust research center workset browser
>
> Thanks Eric for posting the webinar in the other thread.
>
> I am pretty sure that digitizing something in the public domain does not
> change its copyright status, at least in the U.S. The digitizing agency
> certainly has the right to sell, restrict access, watermark, or even keep
> the scans locked up on a thumb drive in a closet. They are not obligated to
> share or to provide the digital files in a re-usable format. However, the
> digitizing agency cannot dictate any copyright restrictions on the
> digitized copies once released to the public.
>
> #iamnotalawyer and welcome correction
>
> best,
>
> Jimmy
>
>
>
> On Mon, Jun 1, 2015 at 12:12 PM, Eric Lease Morgan  wrote:
>
> > On Jun 1, 2015, at 10:58 AM, davesgonechina 
> > wrote:
> >
> > > They just informed me I need a .edu address. Having trouble
> > > understanding the use of the term "public domain" here.
> >
> >   Gung fhpx, naq fbhaqf ernyyl fbeg bs fghcvq!! --RYZ
> >
>
>
>
> --
> Jimmy Ghaphery
> Head, Digital Technologies
> VCU Libraries
> 804-827-3551
>



-- 
Jimmy Ghaphery
Head, Digital Technologies
VCU Libraries
804-827-3551


Re: [CODE4LIB] hathitrust research center workset browser

2015-06-01 Thread Terry Reese
>> However, the digitizing agency cannot dictate any copyright 
>>restrictions on the digitized copies once released to the public

The digital objects have not, and as far as I understand, cannot be made 
available to the public if digitized as part of the google books digitization 
project.  Most institutions got very limited use, and generally these were tied 
to their specific, immediate, communities.  Though, with that said each 
institution has slightly different terms.  For what it's worth, the research 
center does not make the digital copies available for download -- it provides 
tools for working with data in aggregate (worksets) and provides a proof of 
concept environment demonstrating the feasibility of creating a secured data 
repository with I believe the long-term goal of providing data mining for the 
entire hathitrust resources (both within and outside of the public domain).  
But even as it stands now, the tool has become a fantastic teaching tool when 
talking to instructors and graduate students looking for large data sets to 
work with, that also includes some pretty interesting research algorithms 
for working with the data.

--tr

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Jimmy 
Ghaphery
Sent: Monday, June 1, 2015 4:47 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] hathitrust research center workset browser

Thanks Eric for posting the webinar in the other thread.

I am pretty sure that digitizing something in the public domain does not change 
its copyright status, at least in the U.S. The digitizing agency certainly has 
the right to sell, restrict access, watermark, or even keep the scans locked up 
on a thumb drive in a closet. They are not obligated to share or to provide the 
digital files in a re-usable format. However, the digitizing agency cannot 
dictate any copyright restrictions on the digitized copies once released to the 
public.

#iamnotalawyer and welcome correction

best,

Jimmy



On Mon, Jun 1, 2015 at 12:12 PM, Eric Lease Morgan  wrote:

> On Jun 1, 2015, at 10:58 AM, davesgonechina 
> wrote:
>
> > They just informed me I need a .edu address. Having trouble 
> > understanding the use of the term "public domain" here.
>
>   Gung fhpx, naq fbhaqf ernyyl fbeg bs fghcvq!! --RYZ
>



--
Jimmy Ghaphery
Head, Digital Technologies
VCU Libraries
804-827-3551


Re: [CODE4LIB] hathitrust research center workset browser

2015-06-01 Thread Jimmy Ghaphery
Thanks Eric for posting the webinar in the other thread.

I am pretty sure that digitizing something in the public domain does not
change its copyright status, at least in the U.S. The digitizing agency
certainly has the right to sell, restrict access, watermark, or even keep
the scans locked up on a thumb drive in a closet. They are not obligated to
share or to provide the digital files in a re-usable format. However, the
digitizing agency cannot dictate any copyright restrictions on the
digitized copies once released to the public.

#iamnotalawyer and welcome correction

best,

Jimmy



On Mon, Jun 1, 2015 at 12:12 PM, Eric Lease Morgan  wrote:

> On Jun 1, 2015, at 10:58 AM, davesgonechina 
> wrote:
>
> > They just informed me I need a .edu address. Having trouble understanding
> > the use of the term "public domain" here.
>
>   Gung fhpx, naq fbhaqf ernyyl fbeg bs fghcvq!! --RYZ
>



-- 
Jimmy Ghaphery
Head, Digital Technologies
VCU Libraries
804-827-3551


[CODE4LIB] Let's Hack a Collaborative Website, ALA Annual LITA preconference

2015-06-01 Thread Junior Tidal
Apologies for cross-posting; Feel free to share with your colleagues

Going to ALA Annual? Consider signing up for our LITA preconference program. 
You'll get hands-on experience working with Bootstrap and Git!

Best,

Junior Tidal
Assistant Professor
Web Services and Multimedia Librarian
New York City College of Technology, CUNY 
300 Jay Street, Rm A434
Brooklyn, NY 11201
718.260.5481
 
http://library.citytech.cuny.edu



[CODE4LIB] hathitrust research center user group meeting

2015-06-01 Thread Eric Lease Morgan
Consider participating in Thursday's HathiTrust Research Center User Group 
Meeting:

    Who - anybody and everybody
   What - a discussion of all things HathiTrust Research Center
   When - this Thursday, June 4 from 3-4:00 Eastern Time
  Where - via the telephone: (812) 856-3600 or (317) 278-7008 with PIN 803140
    Why - because both you and they have something to offer librarianship

More specifically, Thursday's conference call is about at least two things: 1) 
your concerns regarding the Center, and 2) a discussion of my fledgling 
"Workset Browser". [1, 2] This is an opportunity for you to learn the why's & 
wherefore's of the Center, as well as influence the direction of programming 
initiatives. For example, you can learn more about their authorization and 
copyright restrictions. You can also discuss how you think the Center can 
provide support for the digital humanities and text mining. 

[1] HathiTrust Research Center - http://hathitrust.org/htrc
[2] blog posting describing the "Browser" - http://ntrda.me/1FUGP2g

—
Eric Lease Morgan
University of Notre Dame


[CODE4LIB] International Linked Data Survey for Implementers: RSVP by 17 July 2015

2015-06-01 Thread Roy Tennant
Posted on behalf of my colleague.
Roy

OCLC Research is repeating its survey to learn details of specific projects
or services that format metadata as linked data and/or make subsequent uses
of it. Many in the libraries/archives/museum community are excited by the
potential of linked data applications to make new, valuable uses of
existing metadata.

If you or a colleague have implemented or are implementing linked data
projects or services (either by publishing data as linked data or ingesting
linked data resources into your own data or applications), please take the
survey at https://www.surveymonkey.com/s/LinkedDataSurvey2015

Expected time to complete the survey: 15-20 minutes for each project
described. We ask that responses be completed by *17 July 2015.*

As with last year’s survey, examples collected will be shared for the
benefit of others wanting to undertake similar efforts, wondering what is
possible to do and how to go about it. Participating institutions will be
identified with the projects described, but contact information will be
held confidential. Responses to this survey will be valuable to others who
are also interested in starting Linked Data projects.

If you took the survey last year, please take this year’s as well, as
things might have changed. The questions are the same, but some multiple
choice questions have additional options taken from the “other” responses
in last year’s survey, and some open-ended questions have been changed to
multiple choice, again based on last year’s responses. You can check what
you answered last year on this publicly available spreadsheet, “Results of
Linked Data Survey for Implementers.”

Please feel free to share the above link to the survey. We’d like as many
responses as possible!

With thanks,

Karen Smith-Yoshimura

OCLC Research


Re: [CODE4LIB] hathitrust research center workset browser

2015-06-01 Thread Eric Lease Morgan
On Jun 1, 2015, at 10:58 AM, davesgonechina  wrote:

> They just informed me I need a .edu address. Having trouble understanding
> the use of the term "public domain" here.

  Gung fhpx, naq fbhaqf ernyyl fbeg bs fghcvq!! --RYZ


[CODE4LIB] Code4Lib North: Car pooling from Brock residences to downtown

2015-06-01 Thread David Fiander
I'm staying at Brock for C4LN this week, but I won't have a car with me, 
so who's going to be around that I could catch a ride with downtown?


- David


Re: [CODE4LIB] hathitrust research center workset browser

2015-06-01 Thread Terry Reese
I know that Robert McDonald lurks around here -- so he could clarify this -- 
but what folks need to realize here is that the research center is providing 
tools that allow research access to materials within the hathitrust that are 
within the public domain.  However, the digitized materials themselves are not 
public domain any more (as I understand it).  These materials, as I understand, 
are governed by the agreements institutions made as part of the google project.  
So, while the materials that the research center is currently providing access 
to are ones identified as within the public domain, access to the research 
center is curated due to those agreements.  Robert or someone else can clarify 
if I've misspoken based on my understanding here.

--tr

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of 
davesgonechina
Sent: Monday, June 1, 2015 10:58 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] hathitrust research center workset browser

They just informed me I need a .edu address. Having trouble understanding the 
use of the term "public domain" here.

On Mon, Jun 1, 2015, 9:58 PM Eric Lease Morgan  wrote:

> On Jun 1, 2015, at 4:33 AM, davesgonechina 
> wrote:
>
> > If your *institutional* email address is not on their whitelist (not
> > sure if it is limited to subscribing ones, they don't say) you
> > cannot register using the signup form, instead you can only request
> > an account by briefly explaining why you want one. Weird, because
> > they'd have potentially learned more about me if they just let me
> > put my gmail address in the signup form.
> >
> > I don't get it - can all users download public domain content? If
> > they give me an account, will I be indistinguishable from a
> > subscribing institution? If not, why the extra hoops?
>
>
> Dave, you are the second person to bring this “white listing” issue to 
> my attention. Bummer! Yes, apparently, unless your email address is a 
> part of wider something or another, then you need to be authorized to 
> use the Research Center. Weird! In my opinion, while the Research 
> Center’s tools work, I believe the site suffers from usability issues.
>
> In any event, I have enhanced the auto-generated reports created by my 
> “Browser”, and while they are very textual, I also believe they are 
> insightful. For example, the complete works of:
>
>   * William Ellery Channing - http://bit.ly/browser-channing-about
>   * Jane Austen - http://bit.ly/browser-austen-about
>   * Ralph Waldo Emerson - http://bit.ly/browser-emerson-about
>   * Henry David Thoreau - http://bit.ly/browser-thoreau-about
>
> —
> Eric “Beginning To Suffer From ‘Creeping Featuritis’” Morgan
>


Re: [CODE4LIB] hathitrust research center workset browser

2015-06-01 Thread davesgonechina
They just informed me I need a .edu address. Having trouble understanding
the use of the term "public domain" here.

On Mon, Jun 1, 2015, 9:58 PM Eric Lease Morgan  wrote:

> On Jun 1, 2015, at 4:33 AM, davesgonechina 
> wrote:
>
> > If your *institutional* email address is not on their whitelist (not sure
> > if it is limited to subscribing ones, they don't say) you cannot register
> > using the signup form, instead you can only request an account by briefly
> > explaining why you want one. Weird, because they'd have potentially
> > learned more about me if they just let me put my gmail address in the
> > signup form.
> >
> > I don't get it - can all users download public domain content? If they
> > give me an account, will I be indistinguishable from a subscribing
> > institution? If not, why the extra hoops?
>
>
> Dave, you are the second person to bring this “white listing” issue to my
> attention. Bummer! Yes, apparently, unless your email address is a part of
> wider something or another, then you need to be authorized to use the
> Research Center. Weird! In my opinion, while the Research Center’s tools
> work, I believe the site suffers from usability issues.
>
> In any event, I have enhanced the auto-generated reports created by my
> “Browser”, and while they are very textual, I also believe they are
> insightful. For example, the complete works of:
>
>   * William Ellery Channing - http://bit.ly/browser-channing-about
>   * Jane Austen - http://bit.ly/browser-austen-about
>   * Ralph Waldo Emerson - http://bit.ly/browser-emerson-about
>   * Henry David Thoreau - http://bit.ly/browser-thoreau-about
>
> —
> Eric “Beginning To Suffer From ‘Creeping Featuritis’” Morgan
>


Re: [CODE4LIB] hathitrust research center workset browser

2015-06-01 Thread Eric Lease Morgan
On Jun 1, 2015, at 4:33 AM, davesgonechina  wrote:

> If your *institutional* email address is not on their whitelist (not sure
> if it is limited to subscribing ones, they don't say) you cannot register
> using the signup form, instead you can only request an account by briefly
> explaining why you want one. Weird, because they'd have potentially learned
> more about me if they just let me put my gmail address in the signup form.
> 
> I don't get it - can all users download public domain content? If they give
> me an account, will I be indistinguishable from a subscribing institution?
> If not, why the extra hoops?


Dave, you are the second person to bring this “white listing” issue to my 
attention. Bummer! Yes, apparently, unless your email address is a part of 
wider something or another, then you need to be authorized to use the Research 
Center. Weird! In my opinion, while the Research Center’s tools work, I believe 
the site suffers from usability issues.

In any event, I have enhanced the auto-generated reports created by my 
“Browser”, and while they are very textual, I also believe they are insightful. 
For example, the complete works of:

  * William Ellery Channing - http://bit.ly/browser-channing-about
  * Jane Austen - http://bit.ly/browser-austen-about
  * Ralph Waldo Emerson - http://bit.ly/browser-emerson-about
  * Henry David Thoreau - http://bit.ly/browser-thoreau-about

—
Eric “Beginning To Suffer From ‘Creeping Featuritis’” Morgan


Re: [CODE4LIB] hathitrust research center workset browser

2015-06-01 Thread davesgonechina
If your *institutional* email address is not on their whitelist (not sure
if it is limited to subscribing ones, they don't say) you cannot register
using the signup form, instead you can only request an account by briefly
explaining why you want one. Weird, because they'd have potentially learned
more about me if they just let me put my gmail address in the signup form.

I don't get it - can all users download public domain content? If they give
me an account, will I be indistinguishable from a subscribing institution?
If not, why the extra hoops?

On Fri, May 29, 2015 at 1:51 AM, Eric Lease Morgan  wrote:

> On May 27, 2015, at 6:33 PM, Karen Coyle  wrote:
>
> >> In my copious spare time I have hacked together a thing I’m calling the
> >> HathiTrust Research Center Workset Browser, a (fledgling) tool for doing
> >> “distant reading” against corpora from the HathiTrust. [0, 1] ...
> >>
> >> 'Want to give it a try? For a limited period of time, go to the
> >> HathiTrust Research Center Portal, create (refine or identify) a collection
> >> of personal interest, use the Algorithms tool to export the collection's
> >> rsync file, and send the file to me. I will feed the rsync file to the
> >> Browser, and then send you the URL pointing to the results.
> >>
> >> [0] introduction in a blog posting - http://ntrda.me/1FUGP2g
> >> [1] HTRC Workset Browser - http://bit.ly/workset-browser
> >
> > Eric, what happens if you access this from a non-HT institution? When I
> > go to HT I am often unable to download public domain titles because they
> > aren't available to members of the general public.
>
>
> The short answer is, “Nothing”.
>
> The long answer is… longer. The HathiTrust proper is accessible to
> anybody, but the downloading of public domain content is only available to
> subscribing institutions.
>
> On the other hand, the “Workset Browser” is designed to work off the
> HathiTrust Research Center Portal, not the HathiTrust proper. The Portal is
> located at http://sharc.hathitrust.org From there anybody can search the
> collection of public domain content, create collections, and apply various
> algorithms against collections. One of the algorithms is “create RSYNC
> file” which, in turn, allows you to download bunches o’ metadata describing
> the items in your collection. (There is also a “download as MARC”
> algorithm.) This rsync file is the root of the Workset Browser. Feed the
> Browser a rsync file, and the Browser will mirror content locally, index
> it, and generate reports describing the collection.
>
> Thank you for asking. Many people do not know there is a HathiTrust
> Research Center.
>
> —
> Eric Morgan
>
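
For anyone wondering what the "index it, and generate reports" step described above might look like, here is a minimal sketch in Python. It is not the Workset Browser's actual code. It assumes the volumes have already been mirrored into a local directory as plain-text files (one .txt file per volume), and the function names (tokenize, index_collection, report) are hypothetical, made up for this illustration.

```python
import re
from collections import Counter
from pathlib import Path


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def index_collection(directory):
    """Build a word-frequency index for every .txt file in a directory.

    Assumption: the directory holds locally mirrored plain-text volumes.
    """
    index = {}
    for path in sorted(Path(directory).glob("*.txt")):
        words = tokenize(path.read_text(encoding="utf-8", errors="replace"))
        index[path.name] = Counter(words)
    return index


def report(index, top=5):
    """Return a plain-text report: word count and most frequent words per file."""
    lines = []
    for name, counts in index.items():
        total = sum(counts.values())
        common = ", ".join(word for word, _ in counts.most_common(top))
        lines.append(f"{name}: {total} words; most frequent: {common}")
    return "\n".join(lines)
```

Calling print(report(index_collection("my-collection"))) on such a directory would list each file with its size in words and its most common words. A real report would likely drop stop words and add metadata, but the mirror / index / report shape is the same.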