Re: [R] [OT] Is data copyrightable?

2007-05-14 Thread Thomas Lumley

This is an area where US law differs importantly from that of other 
countries. US law protects compilations of facts only to the extent that 
the selection of the facts is creative expression (and does not protect 
the facts themselves).  Many other jurisdictions (e.g. the European Union) 
also offer protection based on the effort needed to compile the facts, 
regardless of any creativity.  A 1991 US Supreme Court decision (in a case 
about telephone 
directories) ruled that the 'sweat of the brow' rationale for copyright 
was inconsistent with the intellectual property clause of the US 
Constitution.  So, in the US, it depends on the data and their source.

Publishers that I have talked to tend to claim that data are definitely 
copyrightable, but since they tend to own the copyrights one might do well 
to recall the immortal words of Mandy Rice-Davies.

-thomas

Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle

On Sat, 12 May 2007, hadley wickham wrote:

 Dear all,

 This is a little bit off-topic, but I was wondering if anyone has any
 informed opinion on whether data (ie. a dataset) is copyrightable?

 Hadley

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




[R] [OT] Is data copyrightable?

2007-05-13 Thread Michael Sumner
A relevant book on this important (and evolving) topic is 

Math You Can't Use: Patents, Copyright, and Software
by Ben Klemens (2006)



Re: [R] [OT] Is data copyrightable?

2007-05-13 Thread hadley wickham
These links from the US copyright office seem relevant:

Copyright Registration for Automated Databases
http://www.copyright.gov/circs/circ65.html

and

Furthermore, copyright protection does not extend to works consisting
entirely of information that is common property containing no original
authorship, for example: standard calendars, height and weight charts,
tape measures and rulers, schedules of sporting events, and lists or
tables taken from public documents or other common sources.
from http://www.copyright.gov/circs/circ32.html

and also

Notwithstanding the provisions of sections 106 and 106A, the fair use
of a copyrighted work, including such use by reproduction in copies or
phonorecords or by any other means specified by that section, for
purposes such as criticism, comment, news reporting, teaching
(including multiple copies for classroom use), scholarship, or
research, is not an infringement of copyright. In determining whether
the use made of a work in any particular case is a fair use the
factors to be considered shall include —

(1) the purpose and character of the use, including whether such use
is of a commercial nature or is for nonprofit educational purposes;

(2) the nature of the copyrighted work;

(3) the amount and substantiality of the portion used in relation to
the copyrighted work as a whole; and

(4) the effect of the use upon the potential market for or value of
the copyrighted work.

The fact that a work is unpublished shall not itself bar a finding of
fair use if such finding is made upon consideration of all the above
factors.
from http://www.copyright.gov/title17/92chap1.html#102

and at Stanford:
A fact or a theory--for example, the fact that a comet will pass by
the Earth in 2027 --is not protected by copyright. If a scientist
discovered this fact, anyone would be free to use it without asking
for permission from the scientist. Similarly, if someone creates a
theory that the comet can be destroyed by a nuclear device, anyone
could use that theory to create a book or movie. However, the unique
manner in which a fact is expressed may be protected. Therefore, if a
filmmaker created a movie about destroying a comet with a nuclear
device, the specific way he presented the ideas in the movie would be
protected by copyright.

EXAMPLE: Neil Young wrote a song, Ohio, about the shooting of four
college students during the Vietnam War. You are free to use the facts
surrounding the shooting but you may not copy Mr. Young's unique
expression of these facts without his permission.

In some cases, you are not free to copy a collection of facts because
the collection of facts may be protectible as a compilation (see
Section B5). For more information on how copyright applies to facts,
refer to Chapter 2, Section F3.
http://fairuse.stanford.edu/Copyright_and_Fair_Use_Overview/chapter8/8-a.html#4

Hadley


On 5/13/07, hadley wickham [EMAIL PROTECTED] wrote:
 Dear Brian, Peter, Spencer,

 Thanks for your comments, which have cleared things up a little for
 me.  The thing I find most confusing about copyright is that it is
 emergent, not atomic - ie. if you split a copyrighted work into small
 enough pieces (eg. letters, pixels) those pieces are no longer
 copyrightable.  It is the combination of those small pieces into a
 specific form that is important, and the definition of derivative
 works seems to help define what rearrangement of those pieces is still
 covered under copyright.

 The specific case that I am interested in is creating new data sets from
 publicly available data (itself stored in copyrightable works) - in
 my case to produce interesting data sets to use in class.  For
 example, each individual page on ebay is copyrightable, but if I
 extract the price, name and category from (say) 200 pages, does the
 copyright of that dataset belong to ebay?  I'm quite comfortable using
 that data personally, or for a class, but if I want to publish it (ie.
 in jse) do I need to get permission?  Similarly, if I take a few mp3's
 and calculate some summary statistics for them, would that constitute
 a derivative work?

 Hadley




Re: [R] [OT] Is data copyrightable?

2007-05-12 Thread Prof Brian Ripley
On Sat, 12 May 2007, hadley wickham wrote:

 This is a little bit off-topic, but I was wondering if anyone has any
 informed opinion on whether data (ie. a dataset) is copyrightable?

Yes, informed (we discussed this with legally qualified authorities when 
MASS was first published with software/datasets).

You may note that statistical tables can be copyrighted, and have been, 
and some of those copyrights have been defended.

Generally you cannot copyright numbers (like words), but you can copyright 
their layout if there is seen to be added intellectual content.

MASS/inst/LICENCE says

   Our understanding is that the dataset files MASS/data/*.rda are not
   copyright.

   Files spatial/data/*.dat were generated or digitized by B. D. Ripley: no
   copyright is asserted.

   All other files are copyright (C) 1994-2002  W. N. Venables and
   B. D. Ripley. Those parts which were distributed with the first
   edition are also copyright (C) 1994 Springer-Verlag New York Inc, with
   all rights assigned to W. N. Venables and B. D. Ripley.

(Springer registered the rights with the US copyright authorities, 
although it is moot if _they_ had any intellectual content to assert.)

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] [OT] Is data copyrightable?

2007-05-12 Thread Peter Dalgaard
hadley wickham wrote:
 Dear all,

 This is a little bit off-topic, but I was wondering if anyone has any
 informed opinion on whether data (ie. a dataset) is copyrightable?

 Hadley
   
In general not, I believe. E.g., I didn't have to ask formal permission 
to use data from Altman's book in mine (and I did check with my 
publisher). I suspect that things can get murkier than that though; I 
seem to recall stories of plagiarism cases in relation to collections of 
mathematical tables. Beware also that there can be other legal 
complications, including rights to first publication of new results, 
which usually implies that you cannot publish entire datasets until 
their publication potential has been exhausted. And of course, proper 
attribution is required for reasons of scientific integrity and general 
courtesy. (Disclaimer: I Am Not A Lawyer, esp. not a US one...)



Re: [R] [OT] Is data copyrightable?

2007-05-12 Thread Spencer Graves
Dear Hadley:

  P.s.  Ben Klemens (2006) Math you can't use (Brookings) cites cases 
where people have been successfully sued for copyright infringement for 
using a theorem they independently discovered.  That's pretty scary to 
me and seems totally unreasonable, but is apparently the law, at least in 
the US.

  Spencer Graves


  Brian's reply seems more consistent with what I've heard than
Peter's.

  The briefest summary I know of copyright law is that expression
but not ideas can be copyrighted.  Copyright law exists to promote
useful arts, and a compilation of data is intended to be useful.
Google led me to http://ahds.ac.uk/copyrightfaq.htm#faq1, which says that
data or other materials which (a) are arranged in a systematic or
methodical way, or (b) are individually accessible by electronic or
other means can be copyrighted.

  Beyond that, there is a fair use doctrine, which in the US at
least allows use in many cases by educators in public institutions, but
the same use by someone not affiliated with a public school might be an
infringement.  Ten years ago, I heard from attorneys at the University
of Wisconsin that a college prof can run copies of a journal article and
distribute them to the class without worrying about copyright
infringement (provided any money collected is clearly designed to cover
costs not make a profit), but the same copies prepared by Kinko's off
campus for the same class (sold perhaps at the same price) must get
copyright permission.

  Hope this helps.
  Spencer Graves

Peter Dalgaard wrote:
 hadley wickham wrote:
   
 Dear all,

 This is a little bit off-topic, but I was wondering if anyone has any
 informed opinion on whether data (ie. a dataset) is copyrightable?

 Hadley
   
 
 In general not, I believe. E.g., I didn't have to ask formal permission 
 to use data from Altman's book in mine (and I did check with my 
 publisher). I suspect that things can get murkier than that though; I 
 seem to recall stories of plagiarism cases in relation to collections of 
 mathematical tables. Beware also that there can be other legal 
 complications, including rights to first publication of new results, 
 which usually implies that you cannot publish entire datasets until 
 their publication potential has been exhausted. And of course, proper 
 attribution is required for reasons of scientific integrity and general 
 courtesy. (Disclaimer: I Am Not A Lawyer, esp. not a US one...)


