I found an issue with the data() command this evening when working on the 
survival package.

1. I have a lot of data sets in the package, almost all used in at least one
vignette, help file, or test.  As a space-saving measure, I have bundled many
of them together, e.g., the file data/cancer.rda contains 19 data sets, many
of them small.  The resulting file (using xz compression) is quite a bit
smaller than the individual ones combined.  (I still get a warning note about
size from R CMD check, but I'm no longer 2x the limit.)
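
A minimal sketch of the bundling idea, using two made-up data frames (d1 and
d2 are hypothetical stand-ins, not data sets from survival) and a temporary
directory instead of the package's data/ directory. It only compares the file
sizes; how much is saved will depend on the data.

```r
# Hypothetical toy data sets standing in for the small package data sets.
d1 <- data.frame(id = 1:50, time = seq(10, 500, by = 10))
d2 <- data.frame(id = 1:50, status = rep(c(0L, 1L), 25))

tmp <- tempfile("bundle-demo")
dir.create(tmp)

# One .rda file per data set ...
save(d1, file = file.path(tmp, "d1.rda"), compress = "xz")
save(d2, file = file.path(tmp, "d2.rda"), compress = "xz")

# ... versus a single bundled file holding both objects.
bundle <- file.path(tmp, "bundle.rda")
save(d1, d2, file = bundle, compress = "xz")

# Compare total bytes on disk for the two layouts.
separate <- sum(file.size(file.path(tmp, c("d1.rda", "d2.rda"))))
bundled  <- file.size(bundle)
print(c(separate = separate, bundled = bundled))

# load() restores every object stored in the bundle and returns their names.
e <- new.env()
loaded <- load(bundle, envir = e)
print(loaded)
```

The consequence for data() is that the file on disk is now named after the
bundle, not after the individual objects inside it.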

2. Consider the lung data set.  All of these fail:
    data(lung)
    data("lung")
    data(lung, package="survival")

  a. The lung.Rd file had \usage{data(lung)}; that error was not caught by
R CMD check.  (The same is true of several other .Rd files.)

  b. In broader examples for teaching, I sometimes load data from other
packages, e.g., data(aidssi, package="mstate").  But the equivalent call does
not work for the bundled survival data sets.  (The larger survival data sets,
which are in separate .rda files, can be found.)

  c. What does work is survival::lung.  Might it be useful to add a comment
to data.Rd to this effect?
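
For readers who hit the same problem, a hedged sketch of the access routes.
It assumes survival is installed and that lung is stored in data/cancer.rda
as described above. As far as I read ?data, data() matches on the file name,
not on the object name, so asking for "cancer" should load everything in that
file; the index entry format in the comment is likewise taken from ?data, not
verified against every survival version.

```r
# Assumption: survival is installed and lung lives in data/cancer.rda.
if (requireNamespace("survival", quietly = TRUE)) {
    # Direct access through the namespace; no data() call needed.
    lung1 <- survival::lung

    # Index of data sets in the package. Per ?data, an object stored under
    # a different file name is listed like "lung (cancer)".
    idx <- utils::data(package = "survival")$results[, "Item"]
    print(grep("lung", idx, value = TRUE))

    # Ask data() for the file name; this loads all objects in that file
    # into the environment e.
    e <- new.env()
    utils::data("cancer", package = "survival", envir = e)
    print(exists("lung", envir = e))
}
```

In teaching material the survival::lung form is probably the safer choice,
since it does not depend on how the data files are laid out.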


3. Creating a separate package 'survivaldata' is of course one route, and is
suggested in the "Writing R Extensions" guide.  But this is not possible since
survival is a recommended package: it can't load any non-recommended package
for its tests or vignettes.  Longer term, perhaps there is a way around this
constraint?

Terry T.

-- 
Terry M Therneau, PhD
Department of Health Science Research
Mayo Clinic
thern...@mayo.edu

"TERR-ree THUR-noh"


______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
