On Wed, Mar 19, 2014 at 5:52 AM, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:

> I don't see why CRAN needs to be involved in this effort at all. A third
> party could take snapshots of CRAN at R release dates, and make those
> available to package users in a separate repository. It is not hard to set
> a different repository than CRAN as the default location from which to
> obtain packages.
>

I am happy to see many people giving this some thought and engaging in the
discussion.

Several have suggested that staging & freezing can simply be done by a third
party. This solution and its limitations are also described in the paper [1]
in the section titled "R: downstream staging and repackaging".

If this would solve the problem without affecting CRAN, we would obviously
have done it already. In fact, as described in the paper and pointed out by
some people, initiatives such as Debian or Revolution Enterprise already
include a frozen library of R packages. Also, companies like Google maintain
their own internal repository with packages that are used throughout the
company.

The problem with this approach is that when you use some 3rd party package
snapshot, your R/Sweave scripts will still only be reliable/reproducible for
other users of that specific snapshot. E.g. for the examples above, a script
that is written in R 3.0 by a Debian user is not guaranteed to work on R 3.0
at Google, or on R 3.0 with some other 3rd party CRAN snapshot. Hence this
solution merely redefines the problem from "this script depends on pkgA 1.1
and pkgB 0.2.3" to "this script depends on repository foo 2.0". And given
that most users would still be pulling packages straight from CRAN, it would
still be terribly difficult to reproduce a 5 year old Sweave script from e.g.
JSS.

For this reason I believe the only effective place to organize this staging
is all the way upstream, on CRAN. Imagine a world where your R/Sweave script
would be reliable/reproducible, out of the box, on any system, any platform,
in any company using R 3.0.
No need to investigate which specific packages or CRAN snapshot the author
was using at the time of writing the script, or to reconstruct such libraries
for each script you want to reproduce. No ambiguity about which package
versions are used by R 3.0. However, for better or worse, I think this could
only be accomplished with a CRAN release cycle (i.e. "universal snapshots")
accompanying the already existing R releases.

> The only objection I can see to this is that it requires extra work by the
> third party, rather than extra work by the CRAN team. I don't think the
> total amount of work required is much different. I'm very unsympathetic to
> proposals to dump work on others.

I am merely trying to discuss a technical issue in an attempt to improve
reliability of our software and reproducibility of papers created with R.

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel