On Wed, Jan 30, 2013 at 11:00 AM, Karen Coyle <[email protected]> wrote:
> However, the Internet Archive does have copyright declarations AND
> evidence fields for its digitized materials, so you can view those by
> following the link to the full text. I don't know if there's a
> reasonable way to pull those into the OL database.

The Internet Archive calls their field "possible copyright status",
which doesn't sound very authoritative or useful to me. Certainly any
date of publication or date of copyright would be useful information to
bring over.

> I understand when people are nervous about recording copyright
> information, but my approach at U of C was that it is a disservice to
> users to not at least tell them what you *do* know about materials you
> are making available.

I agree that more information is better. The only caveat that I'd add
is that OL shouldn't replicate dynamic data it isn't committed to
keeping up to date. It'd be better to direct people back to the
original source.

> At the same time, with a very few exceptions, actually determining
> copyright status is a lengthy, expensive process. There is some evidence
> that HathiTrust will be going through that process, and US libraries are
> talking about sharing among them their determinations.

HathiTrust is definitely doing this research on an active basis and
publishing the results. They publish daily update files for the changes
they've made to their database, and those updates include fields for
Access, Rights, and Reason for Rights Determination. Access is the
high-order allow/deny bit, Rights gives what they believe the status
is, and Reason is why they believe that.

They describe the layout of their rights database, along with the
process they use, here:
http://www.hathitrust.org/rights_database

The files themselves are at:
http://www.hathitrust.org/hathifiles

The layout of those files is described at:
http://www.hathitrust.org/hathifiles_description

For example, the Feb. 1 update includes over 17,000 records that they
updated in some way during that day (not necessarily all for rights
reasons).

Access:
  allow       6935
  deny       10678

Rights:
  ic          7919  in-copyright
  pdus        3649  public domain in the U.S.
  pd          3284  public domain
  und         2759  <== These "undetermined" ones are the ones
                        they're researching
  cc-by-nc-nd    2

Reason:
  bib        17415  bibliographically-derived by automatic processes
  ren          103  copyright renewal research was conducted
  nfi           60  needs further investigation (copyright research
                    partially complete; an ambiguous, unclear, or
                    other time-consuming situation was encountered)
  cdpp          12  title page or verso contain copyright date and/or
                    place of publication information not in bib record
  crms           8  derived from multiple reviews in the Copyright
                    Review Management System (CRMS) via an internal
                    resolution policy; consult CRMS records for details
  con            2  contractual agreement with copyright holder on file
  ncn            1  no printed copyright notice

As you can see, they are doing a ton of work and keeping pretty
extensive records. It might be useful to replicate some of the
top-level info, especially for things which have reliable
determinations of being in the public domain, but I'd be wary of
including too much of the fine detail.

Because their records include ISBNs, LCCNs, OCLC numbers, etc., it
should be pretty easy to do strong matching with OL records.

Tom
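
A rough sketch of how tallies like the ones above could be produced
from one of the update files, in Python. It assumes the file is
tab-delimited with one record per line. The column positions below are
assumptions and should be checked against the hathifiles description
page, and `tally_rights` is a hypothetical helper name, not part of any
HathiTrust or OL tooling.

```python
import csv
from collections import Counter

# Assumed zero-based column positions in the tab-delimited file.
# Check them against the hathifiles description before relying on them.
ACCESS_COL = 1
RIGHTS_COL = 2
REASON_COL = 13


def tally_rights(lines, access_col=ACCESS_COL, rights_col=RIGHTS_COL,
                 reason_col=REASON_COL):
    """Count the values seen in the access, rights, and reason columns.

    `lines` is any iterable of text lines, e.g. an open file object.
    Returns three Counters: (access, rights, reason).
    """
    access, rights, reason = Counter(), Counter(), Counter()
    needed = max(access_col, rights_col, reason_col)
    # QUOTE_NONE: treat quote characters in titles as ordinary text.
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) <= needed:
            continue  # skip short or malformed rows
        access[row[access_col]] += 1
        rights[row[rights_col]] += 1
        reason[row[reason_col]] += 1
    return access, rights, reason
```

Passing an open file object for a decompressed update file and then
printing `rights.most_common()` would give a list in the same spirit as
the Rights tally above. The column indices are keyword arguments so
they can be adjusted if the real layout differs.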
_______________________________________________
Ol-tech mailing list
[email protected]
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
To unsubscribe from this mailing list, send email to [email protected]
