Re: [Rd] [RFC] A case for freezing CRAN

Jari Oksanen Fri, 21 Mar 2014 02:18:22 -0700

On 21/03/2014, at 10:40 AM, Rainer M Krug wrote:

> 
> 
> This is a long and (mainly) interesting discussion, which is fanning out
> in many different directions, and I think many are not that relevant to
> the OP's suggestion. 
> 
> I see the advantages of having such a dynamic CRAN, but also of having a
> more stable CRAN. I prefer CRAN as it is now, but ion many cases a more
> stable CRAN might b an advantage. So having releases of CRAN might make
> sense. But then there is the archiving issue of CRAN.
> 
> The suggestion was made to move the responsibility away from CRAN and
> the R infrastructure to the user / researcher to guarantee that the
> results can be re-run years later. It would be nice to have this build
> in CRAN, but let's stick at the scenario that the user should care for
> reproducability.


There are two different problems that alternate in the discussion: 
reproducibility and breakage of CRAN dependencies. Frozen CRAN could make 
*approximate* reproducibility easier to achieve, but real reproducibility needs 
stricter solutions. Actual sessionInfo() is minimal information, but 
re-building a spitting image of old environment may still be demanding (but in 
many cases this does not matter). 

Another problem is that CRAN is so volatile that new versions of packages break 
other packages or old scripts. Here the main problem is how package developers 
work. Freezing CRAN would not change that: if package maintainers release 
breaking code, that would be frozen. I think that most packages do not make 
distinction between development and release branches, and CRAN policy won't 
change that. 

I can sympathize with package maintainers having 150 reverse dependencies. My 
main package only has ~50, and it is sure that I won't test them all with new 
release. I sometimes tried, but I could not even get all those built because 
they had other dependencies on packages that failed. Even those that I could 
test failed to detect problems (in one case all examples were \dontrun and 
passed nicely tests). I only wish that if people *really* depend on my package, 
they test it against R-Forge version and alert me before CRAN releases, but 
that is not very likely (I guess many dependencies are not *really* necessary, 
but only concern marginal features of the package, but CRAN forces to declare 
those). 

Still a few words about reproducibility of scripts: this can be hardly achieved 
with good coverage, because many scripts are so very ad hoc. When I edit and 
review manuscripts for journals, I very often get Sweave or knitr scripts that 
"just work", where "just" means "just so and so". Often they do not work at 
all, because they had some undeclared private functionalities or stray files in 
the author workspace that did not travel with the Sweave document. I think 
these -- published scientific papers -- are the main field where the code 
really should be reproducible, but they often are the hardest to reproduce. 
Nothing CRAN people do can help with sloppy code scientists write for 
publications. You know, they are scientists -- not engineers. 

Cheers, Jari Oksanen
> 
> Leaving the issue of compilation out, a package which is creating a
> custom installation of the R version which includes the source of the R
> version used and the sources of the packages in a on Linux compilable
> format, given that the relevant dependencies are installed, would be a
> huge step forward. 
> 
> I know - compilation on Windows (and sometimes Mac) is a serious
> problem), but to archive *all* binaries and to re-compile all older
> versions of R and all packages would be an impossible task.
> 
> Apart from that - doing your analysis in a Virtual Machine and then
> simply archiving this Virtual Machine, would also be an option, but only
> for the more tech savy users.
> 
> In a nutshell: I think a package would be able to provide the solution
> for a local archiving to make it possible to re-run the simulation with
> the same tools at a later stage - although guarantees would not be
> possible.
> 
> Cheers,
> 
> Rainer
> -- 
> Rainer M. Krug
> email: Rainer<at>krugs<dot>de
> PGP: 0x0F52F982
> 
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] [RFC] A case for freezing CRAN

Reply via email to