Re: [R] adding a new dataset to the default R distribution

John Maindonald Thu, 04 Dec 2008 16:39:20 -0800

Making data, especially data that have been the subject of published  
papers, widely available, can be a useful spinoff from the R project,  
another gift to the scientific community beyond the provision of  
computing and analytic tools.  Nowadays, in a complete publication of  
a scientific result, there is every reason for the data to be part of  
that publication.  The Gentleman and Lang 2004 paper "Statistical  
Analyses and Reproducible Research" takes this further still, making a  
compelling case for opening the analysis to ready view
(http://www.bepress.com/bioconductor/paper2/).  How else can critics
know what analysis was done, and whether the data do really support
the claimed conclusions?

As I see it, the first recourse should be use of archives that  
individual communities may establish.  Instructions on how to input  
the data into R would be a useful small item of ancillary  
information.  Links to such archives (under Data Archives, maybe)  
might be included on CRAN.  The Open Archaeology project would seem a  
good umbrella for the archiving of archaeology data.
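
A minimal sketch of what such input instructions might look like, assuming
the archive supplies the data as a plain CSV file.  The file contents and
the column names (id, material, length_cm) are invented here purely for
illustration; they are not taken from any real archive or dataset.

```r
# Hypothetical stand-in for a file downloaded from a community archive.
# The columns and values below are made up for illustration only.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,material,length_cm",
             "1,bronze,12.4",
             "2,iron,16.8",
             "3,bronze,NA"),
           csv_path)

# Read the file into a data frame; read.csv() treats "NA" as missing
# by default, so the third length is read as a missing value.
artefacts <- read.csv(csv_path, stringsAsFactors = FALSE)

# Quick checks that the import behaved as expected.
str(artefacts)
summary(artefacts$length_cm)
```

With a real archive, csv_path would instead be the path or URL of the
downloaded file, and any non-default separator or missing-value code would
need to be stated alongside the data.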

Where there is no available repository, or there are reasons for  
putting the data into an R package, one possibility is to advertise on  
this list: "Orphan dataset, looking for a good home".  In this case I  
have offered to include the data in the DAAGxtras package, and I am  
open to further such requests.  Perhaps, however, there should be a
"miscdata" or suchlike package to which such datasets can be
submitted?  All it would require is for someone to offer to act as
"Keeper of the Miscellaneous Data".

John Maindonald             email: [EMAIL PROTECTED]
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

On 04/12/2008, at 10:00 PM, [EMAIL PROTECTED] wrote:

> From: Stefano Costa <[EMAIL PROTECTED]>
> Date: 3 December 2008 9:29:13 PM
> To: r-help@r-project.org
> Subject: [R] adding a new dataset to the default R distribution
>
>
> Hi,
> I am a student in archaeology with some interest in statistics and R.
>
> Recently I've obtained the permission to distribute in the public domain
> a small dataset (named "spearheads" with 40 obs. of 14 variables) that
> was used in an introductory statistics book for archaeologists
> (published in 1994).
>
> I've rewritten most of the exercises of that book in R and made them
> available at http://wiki.iosa.it/diggingnumbers:start along with the
> original dataset, but I was wondering if there is a standard procedure
> for getting a new dataset included in the default R distribution, like
> "cars" and others.
>
> Please add me to CC if replying because I'm not subscribed to the list.
>
> Best regards,
> Stefano
>
> -- 
> Stefano Costa
> http://www.iosa.it/ Open Archaeology

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.