Re: [Rd] [RFC] A case for freezing CRAN

2014-03-24 Thread Martin Maechler
-project.org mailto:r-devel@r-project.org Sent: Wednesday, March 19, 2014 11:03:32 PM Subject: Re: [Rd] [RFC] A case for freezing CRAN On Mar 19, 2014, at 7:45 PM, Jeroen Ooms wrote: On Wed, Mar 19, 2014 at 6:55 PM, Michael Weylandt michael.weyla...@gmail.com

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-24 Thread Gábor Csárdi
FWIW, I am mirroring CRAN at GitHub now, here: https://github.com/cran One can install specific package versions using the devtools package: library(devtools); install_github("cran/package@version") In addition, one can also install versions based on the R version, e.g.:

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-21 Thread Jari Oksanen
Freezing CRAN solves no problem of reproducibility. If you know the sessionInfo() or the version of R, the packages used and their versions, you can reproduce that setup. If you do not know, then you cannot. You can try to guess: source code of old release versions of R and old packages are in
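
The point about knowing the R version and the package versions can be illustrated with a small sketch in base R. The helper name record_versions and the choice of packages are assumptions made for illustration only; they do not come from the thread.

```r
# Hypothetical helper: collect the installed version of each named package.
record_versions <- function(pkgs) {
  versions <- vapply(
    pkgs,
    function(p) as.character(utils::packageVersion(p)),
    character(1),
    USE.NAMES = FALSE
  )
  data.frame(package = pkgs, version = versions, stringsAsFactors = FALSE)
}

# Keep the R version alongside the package versions, so the setup can be
# rebuilt later from this record.
setup <- list(
  r_version = R.version.string,
  packages  = record_versions(c("stats", "utils"))
)
print(setup)
```

Saving such a record next to an analysis (or simply saving the output of sessionInfo()) is what makes the later "reproduce that setup" step possible.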

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-21 Thread Rainer M Krug
This is a long and (mainly) interesting discussion, which is fanning out in many different directions, and I think many are not that relevant to the OP's suggestion. I see the advantages of having such a dynamic CRAN, but also of having a more stable CRAN. I prefer CRAN as it is now, but ion

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-21 Thread Rainer M Krug
Jari Oksanen jari.oksa...@oulu.fi writes: Freezing CRAN solves no problem of reproducibility. If you know the sessionInfo() or the version of R, the packages used and their versions, you can reproduce that setup. If you do not know, then you cannot. You can try to guess: source code of old

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-21 Thread Philippe Grosjean
This is becoming an extremely long thread, and it is going in too many directions. However, I would like to mention here our ongoing five-year ECOS project for the study of Open Source Ecosystems, CRAN among them. You can find info here:

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-21 Thread Jari Oksanen
On 21/03/2014, at 10:40 AM, Rainer M Krug wrote: This is a long and (mainly) interesting discussion, which is fanning out in many different directions, and I think many are not that relevant to the OP's suggestion. I see the advantages of having such a dynamic CRAN, but also of having

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-21 Thread Rainer M Krug
Jari Oksanen jari.oksa...@oulu.fi writes: On 21/03/2014, at 10:40 AM, Rainer M Krug wrote: This is a long and (mainly) interesting discussion, which is fanning out in many different directions, and I think many are not that relevant to the OP's suggestion. I see the advantages of

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-21 Thread Philippe GROSJEAN
On 21 Mar 2014, at 11:08, Rainer M Krug rai...@krugs.de wrote: Jari Oksanen jari.oksa...@oulu.fi writes: On 21/03/2014, at 10:40 AM, Rainer M Krug wrote: This is a long and (mainly) interesting discussion, which is fanning out in many different directions, and I think many are not

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-21 Thread Tom Short
For me, the most important aspect is being able to reproduce my own work. Some other tools offer interesting approaches to managing packages: * NPM -- The Node Package Manager for Node.js loads a local copy of all packages and dependencies. This helps ensure reproducibility and avoids dependency

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-20 Thread David Winsemius
On Mar 19, 2014, at 7:45 PM, Jeroen Ooms wrote: On Wed, Mar 19, 2014 at 6:55 PM, Michael Weylandt michael.weyla...@gmail.com wrote: Reading this thread again, is it a fair summary of your position to say reproducibility by default is more important than giving users access to the newest

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-20 Thread Dan Tenenbaum
- Original Message - From: David Winsemius dwinsem...@comcast.net To: Jeroen Ooms jeroen.o...@stat.ucla.edu Cc: r-devel r-devel@r-project.org Sent: Wednesday, March 19, 2014 11:03:32 PM Subject: Re: [Rd] [RFC] A case for freezing CRAN On Mar 19, 2014, at 7:45 PM, Jeroen Ooms

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-20 Thread Rainer M Krug
Michael Weylandt michael.weyla...@gmail.com writes: On Mar 19, 2014, at 22:17, Gavin Simpson ucfa...@gmail.com wrote: Michael, I think the issue is that Jeroen wants to take that responsibility out of the hands of the person trying to reproduce a work. If it used R 3.0.x and packages A, B

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-20 Thread Rainer M Krug
Hadley Wickham h.wick...@gmail.com writes: What would be more useful in terms of reproducibility is the capability of installing a specific version of a package from a repository using install.packages(), which would require archiving older versions in a coordinated fashion. I know CRAN

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-20 Thread Duncan Murdoch
On 14-03-20 2:15 AM, Dan Tenenbaum wrote: - Original Message - From: David Winsemius dwinsem...@comcast.net To: Jeroen Ooms jeroen.o...@stat.ucla.edu Cc: r-devel r-devel@r-project.org Sent: Wednesday, March 19, 2014 11:03:32 PM Subject: Re: [Rd] [RFC] A case for freezing CRAN On Mar

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-20 Thread Roger Bivand
Gavin Simpson ucfagls at gmail.com writes: ... To my mind it is incumbent upon those wanting reproducibility to build the tools to enable users to reproduce works. When you write a paper or release a tool, you will have tested it with a specific set of packages. It is relatively easy to

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-20 Thread S Ellison
If we could all agree on a particular set of cran packages to be used with a certain release of R, then it doesn't matter how the 'snapshotting' gets implemented. This is pretty much the sticking point, though. I see no practical way of reaching that agreement without the kind of decision

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-20 Thread Jari Oksanen
On 20/03/2014, at 14:14 PM, S Ellison wrote: If we could all agree on a particular set of cran packages to be used with a certain release of R, then it doesn't matter how the 'snapshotting' gets implemented. This is pretty much the sticking point, though. I see no practical way of

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-20 Thread Hervé Pagès
: Re: [Rd] [RFC] A case for freezing CRAN On Mar 19, 2014, at 7:45 PM, Jeroen Ooms wrote: On Wed, Mar 19, 2014 at 6:55 PM, Michael Weylandt michael.weyla...@gmail.com wrote: Reading this thread again, is it a fair summary of your position to say reproducibility by default is more important than

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-20 Thread Ted Byers
-devel@r-project.org Sent: Wednesday, March 19, 2014 11:03:32 PM Subject: Re: [Rd] [RFC] A case for freezing CRAN On Mar 19, 2014, at 7:45 PM, Jeroen Ooms wrote: On Wed, Mar 19, 2014 at 6:55 PM, Michael Weylandt michael.weyla...@gmail.com wrote: Reading this thread again, is it a fair

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-20 Thread Jeroen Ooms
On Thu, Mar 20, 2014 at 1:28 PM, Ted Byers r.ted.by...@gmail.com wrote: Herve Pages mentions the risk of irreproducibility across three minor revisions of version 1.0 of Matrix. My gut reaction would be that if the results are not reproducible across such minor revisions of one library, they

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-20 Thread Tim Triche, Jr.
: Wednesday, March 19, 2014 11:03:32 PM Subject: Re: [Rd] [RFC] A case for freezing CRAN On Mar 19, 2014, at 7:45 PM, Jeroen Ooms wrote: On Wed, Mar 19, 2014 at 6:55 PM, Michael Weylandt michael.weyla...@gmail.com wrote: Reading this thread again, is it a fair summary of your

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-20 Thread Ted Byers
On Thu, Mar 20, 2014 at 4:53 PM, Jeroen Ooms jeroen.o...@stat.ucla.edu wrote: On Thu, Mar 20, 2014 at 1:28 PM, Ted Byers r.ted.by...@gmail.com wrote: Herve Pages mentions the risk of irreproducibility across three minor revisions of version 1.0 of Matrix. My gut reaction would be that if the

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-20 Thread Ted Byers
On Thu, Mar 20, 2014 at 5:11 PM, Tim Triche, Jr. tim.tri...@gmail.com wrote: That doesn't make sense. If an API changes (e.g. in Matrix) and a program written against the old API can no longer run, that is a very different issue than if the same numbers (data) give different results. The

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-20 Thread Tim Triche, Jr.
There is nothing like backups with due attention to detail. Agreed, although given the complexity of dependencies among packages, this might entail several GB of snapshots per paper (if not several TB for some papers) in various cases. Anyone who is reasonably prolific then gets the exciting

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-20 Thread Ted Byers
On Thu, Mar 20, 2014 at 5:27 PM, Tim Triche, Jr. tim.tri...@gmail.com wrote: There is nothing like backups with due attention to detail. Agreed, although given the complexity of dependencies among packages, this might entail several GB of snapshots per paper (if not several TB for some

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-20 Thread Hervé Pagès
Sent: Wednesday, March 19, 2014 11:03:32 PM Subject: Re: [Rd] [RFC] A case for freezing CRAN On Mar 19, 2014, at 7:45 PM, Jeroen Ooms wrote: On Wed, Mar 19, 2014 at 6:55 PM, Michael Weylandt michael.weyla

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-20 Thread Uwe Ligges
mailto:r-devel@r-project.org Sent: Wednesday, March 19, 2014 11:03:32 PM Subject: Re: [Rd] [RFC] A case for freezing CRAN On Mar 19, 2014, at 7:45 PM, Jeroen Ooms wrote: On Wed, Mar 19, 2014 at 6:55 PM, Michael Weylandt

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-20 Thread Hervé Pagès
Cc: r-devel r-devel@r-project.org mailto:r-devel@r-project.org Sent: Wednesday, March 19, 2014 11:03:32 PM Subject: Re: [Rd] [RFC] A case for freezing CRAN On Mar 19, 2014, at 7:45 PM, Jeroen Ooms wrote: On Wed

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-20 Thread Gábor Csárdi
Much of the discussion was about reproducibility so far. Let me emphasize another point from Jeroen's proposal. This is hard to measure of course, but I think I can say that the existence and the quality of CRAN and its packages contributed immensely to the success of R and the success of people

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-20 Thread William Dunlap
with good test suites. Bill Dunlap TIBCO Software wdunlap tibco.com -Original Message- From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf Of Gábor Csárdi Sent: Thursday, March 20, 2014 6:24 PM To: r-devel Subject: Re: [Rd] [RFC] A case for freezing

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-20 Thread Gábor Csárdi
On Thu, Mar 20, 2014 at 9:45 PM, William Dunlap wdun...@tibco.com wrote: In particular, updating a package with many reverse dependencies is a frustrating process, for everybody. As a maintainer with ~150 reverse dependencies, I think not twice, but ten times if I really want to publish

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-20 Thread Tim Triche, Jr.
Heh, you just described BioC --t On Mar 20, 2014, at 7:15 PM, Gábor Csárdi csardi.ga...@gmail.com wrote: On Thu, Mar 20, 2014 at 9:45 PM, William Dunlap wdun...@tibco.com wrote: In particular, updating a package with many reverse dependencies is a frustrating process, for everybody. As a

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-20 Thread Tim Triche, Jr.
Except that tests (as vignettes) are mandatory for BioC. So if something blows up you hear about it right quick :-) --t On Mar 20, 2014, at 7:15 PM, Gábor Csárdi csardi.ga...@gmail.com wrote: On Thu, Mar 20, 2014 at 9:45 PM, William Dunlap wdun...@tibco.com wrote: In particular, updating

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-20 Thread Dan Tenenbaum
- Original Message - From: Gábor Csárdi csardi.ga...@gmail.com To: r-devel r-devel@r-project.org Sent: Thursday, March 20, 2014 6:23:33 PM Subject: Re: [Rd] [RFC] A case for freezing CRAN Much of the discussion was about reproducibility so far. Let me emphasize another point

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Frank Harrell
To me it boils down to one simple question: is an update to a package on CRAN more likely to (1) fix a bug, (2) introduce a bug or downward incompatibility, or (3) add a new feature or fix a compatibility problem without introducing a bug? I think the probability of (1) | (3) is much greater

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Joshua Ulrich
On Tue, Mar 18, 2014 at 3:24 PM, Jeroen Ooms jeroen.o...@stat.ucla.edu wrote: snip ## Summary Extending the r-release cycle to CRAN seems like a solution that would be easy to implement. Package updates simply only get pushed to the r-devel branches of cran, rather than r-release and

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Duncan Murdoch
I don't see why CRAN needs to be involved in this effort at all. A third party could take snapshots of CRAN at R release dates, and make those available to package users in a separate repository. It is not hard to set a different repository than CRAN as the default location from which to
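
As a minimal sketch of pointing R at a different default repository: the snapshot URL below is a placeholder invented for illustration, not a real service, and no packages are actually installed.

```r
# Hypothetical snapshot repository; this URL is an assumption, not a real mirror.
snapshot_repo <- "https://cran-snapshot.example.org/R-3.0.3"

# Replace the default repository for this session, keeping the old setting
# so it can be restored later with options(old).
old <- options(repos = c(CRAN = snapshot_repo))

# From here on, install.packages("somePackage") would resolve packages
# against the snapshot rather than the live CRAN.
print(getOption("repos"))
```

The same options() call could be placed in a site-wide or per-user Rprofile so that every session on a machine defaults to the snapshot.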

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Kasper Daniel Hansen
Our experience in Bioconductor is that this is a pretty hard problem. What the OP presumably wants is some guarantee that all packages on CRAN work well together. A good example is when Rcpp was updated, it broke other packages (quick note: The Rcpp developers do an incredible amount of work to

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Dirk Eddelbuettel
Piling on: On 19 March 2014 at 07:52, Joshua Ulrich wrote: | There is nothing preventing you (or anyone else) from creating | repositories that do what you suggest. Create a CRAN mirror (or more | than one) that only include the package versions you think they | should. Then have your

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Hadley Wickham
What would be more useful in terms of reproducibility is the capability of installing a specific version of a package from a repository using install.packages(), which would require archiving older versions in a coordinated fashion. I know CRAN archives old versions, but I am not aware if we
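
For context, old source tarballs on CRAN sit under src/contrib/Archive/<package>/, so a specific archived version can already be installed by URL. The sketch below only builds that URL; the helper name archive_url is an assumption for illustration, and the install call is left commented out because it needs network access and build tools.

```r
# Hypothetical helper: build the URL of an archived source tarball on CRAN.
# Note that the current release of a package lives in src/contrib/ itself,
# not in the Archive directory, so this applies to superseded versions only.
archive_url <- function(pkg, version,
                        cran = "https://cran.r-project.org") {
  sprintf("%s/src/contrib/Archive/%s/%s_%s.tar.gz", cran, pkg, pkg, version)
}

url <- archive_url("Matrix", "1.0-1")
print(url)

# Installing from that tarball (requires network access and a toolchain):
# install.packages(url, repos = NULL, type = "source")
```

This does not resolve dependencies to matching old versions, which is the coordinated part the message asks for.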

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Geoff Jentry
using the identical version of each CRAN package. The bioconductor project uses similar policies. While I agree that this can be an issue, I don't think it is fair to compare CRAN to BioC. Unless things have changed, the latter has a more rigorous barrier to entry which includes buy in of

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Spencer Graves
What about having this purpose met with something like an expansion of R-Forge? We could have packages submitted to R-Forge rather than CRAN, and people who wanted the latest could get it from R-Forge. If changes I make on R-Forge break a reverse dependency, emails explaining the

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Joshua Ulrich
On Wed, Mar 19, 2014 at 12:59 PM, Jeroen Ooms jeroen.o...@stat.ucla.edu wrote: On Wed, Mar 19, 2014 at 5:52 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote: I don't see why CRAN needs to be involved in this effort at all. A third party could take snapshots of CRAN at R release dates, and

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Carl Boettiger
Dear list, I'm curious what people would think of a more modest proposal at this time: State the version of the dependencies used by the package authors when the package was built. Eventually CRAN could enforce such a statement be present in the description. We encourage users to declare the
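
One way such a statement might look, purely as a sketch: the message does not say which DESCRIPTION field or comparison operator would be used, so the "Imports" field, the "==" operator, and the helper name pin_dependencies are all assumptions.

```r
# Hypothetical helper: format a DESCRIPTION-style field that pins each
# dependency to the version installed when the package was built.
pin_dependencies <- function(pkgs, field = "Imports") {
  versions <- vapply(
    pkgs,
    function(p) as.character(utils::packageVersion(p)),
    character(1),
    USE.NAMES = FALSE
  )
  entries <- sprintf("%s (== %s)", pkgs, versions)
  paste0(field, ": ", paste(entries, collapse = ", "))
}

pinned <- pin_dependencies(c("stats", "utils"))
cat(pinned, "\n")
```

An exact pin like this is stricter than the lower bounds most packages declare today; recording the versions in a separate, non-enforced field would be a gentler variant of the same idea.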

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Jeroen Ooms
On Wed, Mar 19, 2014 at 7:00 AM, Kasper Daniel Hansen kasperdanielhan...@gmail.com wrote: Our experience in Bioconductor is that this is a pretty hard problem. What the OP presumably wants is some guarantee that all packages on CRAN work well together. Obviously we can not guarantee that all

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Hervé Pagès
Hi, On 03/19/2014 07:00 AM, Kasper Daniel Hansen wrote: Our experience in Bioconductor is that this is a pretty hard problem. What's hard and requires a substantial amount of human resources is to run our build system (set up the build machines, keep up with changes in R, babysit the builds,

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Jeroen Ooms
On Wed, Mar 19, 2014 at 11:50 AM, Joshua Ulrich josh.m.ulr...@gmail.com wrote: The suggested solution is not described in the referenced article. It was not suggested that it be the operating system's responsibility to distribute snapshots, nor was it suggested to create binary repositories

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Joshua Ulrich
On Wed, Mar 19, 2014 at 4:28 PM, Jeroen Ooms jeroen.o...@stat.ucla.edu wrote: On Wed, Mar 19, 2014 at 11:50 AM, Joshua Ulrich josh.m.ulr...@gmail.com wrote: The suggested solution is not described in the referenced article. It was not suggested that it be the operating system's

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Dan Tenenbaum
- Original Message - From: Joshua Ulrich josh.m.ulr...@gmail.com To: Jeroen Ooms jeroen.o...@stat.ucla.edu Cc: r-devel r-devel@r-project.org Sent: Wednesday, March 19, 2014 2:59:53 PM Subject: Re: [Rd] [RFC] A case for freezing CRAN On Wed, Mar 19, 2014 at 4:28 PM, Jeroen Ooms

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Jeroen Ooms
On Wed, Mar 19, 2014 at 2:59 PM, Joshua Ulrich josh.m.ulr...@gmail.com wrote: So implementation isn't a problem. The problem is that you need a way to force people not to be able to use different package versions than what existed at the time of each R release. I said this in my previous

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Joshua Ulrich
On Wed, Mar 19, 2014 at 5:16 PM, Jeroen Ooms jeroen.o...@stat.ucla.edu wrote: On Wed, Mar 19, 2014 at 2:59 PM, Joshua Ulrich josh.m.ulr...@gmail.com wrote: So implementation isn't a problem. The problem is that you need a way to force people not to be able to use different package versions

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Hervé Pagès
On 03/19/2014 02:59 PM, Joshua Ulrich wrote: On Wed, Mar 19, 2014 at 4:28 PM, Jeroen Ooms jeroen.o...@stat.ucla.edu wrote: On Wed, Mar 19, 2014 at 11:50 AM, Joshua Ulrich josh.m.ulr...@gmail.com wrote: The suggested solution is not described in the referenced article. It was not suggested

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Romain Francois
Weighing in. FWIW, I find the proposal conceptually quite interesting. For package developers, it does not have to be a frustration to have to wait for a new version of R to release their code. Anticipated frustration was my initial reaction. Thinking about this more, I think this could be

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Gavin Simpson
What am I overlooking? That this is already available and possible in R today, but perhaps not widely used. Developers do tend to only include a lower bound if they include any bounds at all on package dependencies. As I mentioned elsewhere, R packages often aren't built against other R packages

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Gavin Simpson
Given that R has moved to a 12-month release cycle, I don't want to either i) wait a year to get new packages (or allow users to use new versions of my packages), or ii) have to run R-devel just to use new packages (or be on R-testing for that matter). People then will start finding ways

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Michael Weylandt
On Mar 19, 2014, at 18:42, Joshua Ulrich josh.m.ulr...@gmail.com wrote: On Wed, Mar 19, 2014 at 5:16 PM, Jeroen Ooms jeroen.o...@stat.ucla.edu wrote: On Wed, Mar 19, 2014 at 2:59 PM, Joshua Ulrich josh.m.ulr...@gmail.com wrote: So implementation isn't a problem. The problem is that

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Gavin Simpson
Michael, I think the issue is that Jeroen wants to take that responsibility out of the hands of the person trying to reproduce a work. If it used R 3.0.x and packages A, B and C then it would be trivial to install that version of R and then pull down the stable versions of A, B and C for that

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Michael Weylandt
On Mar 19, 2014, at 22:17, Gavin Simpson ucfa...@gmail.com wrote: Michael, I think the issue is that Jeroen wants to take that responsibility out of the hands of the person trying to reproduce a work. If it used R 3.0.x and packages A, B and C then it would be trivial to install that

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Jeroen Ooms
On Wed, Mar 19, 2014 at 6:55 PM, Michael Weylandt michael.weyla...@gmail.com wrote: Reading this thread again, is it a fair summary of your position to say reproducibility by default is more important than giving users access to the newest bug fixes and features by default? It's certainly

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Michael Weylandt
On Mar 19, 2014, at 22:45, Jeroen Ooms jeroen.o...@stat.ucla.edu wrote: On Wed, Mar 19, 2014 at 6:55 PM, Michael Weylandt michael.weyla...@gmail.com wrote: Reading this thread again, is it a fair summary of your position to say reproducibility by default is more important than giving users

Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Karl Millar
I think what you really want here is the ability to easily identify and sync to CRAN snapshots. The easy way to do this is setup a CRAN mirror, but back it up with version control, so that it's easy to reproduce the exact state of CRAN at any given point in time. CRAN's not particularly large

[Rd] [RFC] A case for freezing CRAN

2014-03-18 Thread Jeroen Ooms
This came up again recently with an irreproducible paper. Below is an attempt to make a case for extending the r-devel/r-release cycle to CRAN packages. These suggestions are not in any way intended as criticism of anyone or the status quo. The proposal described in [1] is to freeze a snapshot of