Re: [Rd] The case for freezing CRAN

2014-03-26 Thread Geoff Jentry

On Thu, 20 Mar 2014, Dirk Eddelbuettel wrote:

o Roger correctly notes that R scripts and packages are just one issue.
  Compilers, libraries and the OS matter.  To me, the natural approach these
  days would be to think of something based on Docker or Vagrant or (if you
  must) VirtualBox.  The newer alternatives make snapshotting very cheap
  (eg by using Linux LXC).  That approach reproduces a full environment as
  best we can while still ignoring the hardware layer (and some readers
  may recall the infamous Pentium bug of two decades ago).


At one of my previous jobs we did effectively this (albeit in a lower tech
fashion). Every project had its own environment, complete with the exact
snapshot of R & packages used, etc. All scripts and code were kept in that
environment in a versioned fashion, such that at any point one could go back
to any stage of development of that paper/project's analysis and reproduce it
exactly.


It was hugely inefficient in terms of storage, but it solved the problem 
we're discussing here. As you note, with the tools available today it'd be 
trivial to distribute that environment for people to reproduce results.
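
Even without the heavy machinery, base R gets you part of the way. A minimal
sketch (untested; the file name and example version below are illustrative)
that records the exact versions in a project library, and later pulls a
specific version back from CRAN's source archive:

    ## Record the exact packages and versions present in this project library
    snap <- installed.packages()[, c("Package", "Version")]
    write.csv(snap, "pkg-snapshot.csv", row.names = FALSE)

    ## Later: reinstall one specific version from CRAN's source archive,
    ## which keeps superseded releases under src/contrib/Archive/<pkg>/
    install_version <- function(pkg, ver,
        archive = "http://cran.r-project.org/src/contrib/Archive") {
      tarball <- sprintf("%s_%s.tar.gz", pkg, ver)
      download.file(sprintf("%s/%s/%s", archive, pkg, tarball), tarball)
      install.packages(tarball, repos = NULL, type = "source")
    }
    ## install_version("ape", "3.0-8")   # illustrative version number

Dependencies need the same treatment, of course, which is exactly why a
whole-environment snapshot is the more robust route.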




Re: [Rd] The case for freezing CRAN

2014-03-21 Thread Dirk Eddelbuettel

On 21 March 2014 at 07:43, Therneau, Terry M., Ph.D. wrote:
| This has been a fascinating discussion.

I am not so sure. Seems more like a rehashing of old and known arguments, while
some folks try to push their work (Hi Jeroen :) onto already overloaded
others.  The only real thing I learned so far is that Philippe is busy
earning publication credits along the lines of the 'damn, just go and test it'
suggestion I made (somewhat flippantly) in my last email.

| I maintain the survival package which currently has 246 reverse
| dependencies and take a slightly different view, which could be described
| as "the price of fame".  I feel a responsibility to not break R.  I have
| automated scripts which download the latest copy of all 246, using the
| install-tests option, and run them all. Most updates have 1-3 issues.

Same here, but as a somewhat younger package Rcpp is so far "only" at 189 and
counting, with pretty decent growth.  My experience has been positive too,
and CRAN appears appreciative of us doing preemptive work and trying to be
careful about not introducing breaking changes.  I too see the latter as
something we owe the users of our package: a "promise" not to mess with the
interface unless we absolutely must.

| but also worth it.  I've built the test scripts over several years, with
| help from several others; a place to share this information would be a
| useful addition.

I put my script on GitHub next to Rcpp itself; turns out that another thread
participant had a need for exactly that script just yesterday.

Dirk

-- 
Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com



Re: [Rd] The case for freezing CRAN

2014-03-21 Thread Gábor Csárdi
On Fri, Mar 21, 2014 at 8:43 AM, Therneau, Terry M., Ph.D.
<thern...@mayo.edu> wrote:
[...]
>
> Gabor Csardi discussed the problems with maintaining a package with lots
> of dependencies.
> I maintain the survival package which currently has 246 reverse
> dependencies and take a slightly different view, which could be described
> as "the price of fame".  I feel a responsiblity to not break R.  I have
> automated scripts which download the latest copy of all 246, using the
> install-tests option, and run them all. Most updates have 1-3 issues.
>  About 25% of the time it turns out to be a problem that I introduced, and
> in all the others I have found the other package authors to be responsive.
>  It is a nuisance, yes, but also worth it.  I've built the test scripts
> over several years, with help from several others; a place to share this
> information would be a useful addition.
>

Well, maybe you are just a better programmer and maintainer than me, and I
am alone with my problems. I hope that this is the case.

I actually do run automated tests against the reverse dependencies. The run
downloads ~3GB of packages; the output is 500KB (much of it is the
compilation of my package, though), and it contains the word 'error' ~80
times and the word 'warning' ~270 times:
http://pave.igraph.org/job/igraph-r-check-deps/15/consoleFull
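
(If anyone wants to reproduce that kind of tally from their own check output,
a crude sketch — the directory layout here is hypothetical:)

    ## Count 'error' / 'warning' occurrences across R CMD check logs
    logs <- list.files("revdep-checks", pattern = "00check[.]log$",
                       recursive = TRUE, full.names = TRUE)
    txt  <- unlist(lapply(logs, readLines))
    sum(grepl("error",   txt, ignore.case = TRUE))
    sum(grepl("warning", txt, ignore.case = TRUE))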

> This process also keeps me honest about any updates that are not backwards
> compatible.


Not really; this would only be true if all 246 packages had proper tests
for all of their uses of survival. Unlikely. It definitely helps, I am not
saying that it does not, but I also think that it is up to the maintainer
of the package to test it, including testing it against newer versions of
its dependencies. Simply because the maintainers know best how their
packages are supposed to work, and how they are supposed to be tested.

The other thing is that quite often I do want to break the API, and this
would be much easier with a CRAN-devel, so that there is some time
for the problems to come up.

Gabor

> There is hardly a single option that is not used by some other package,
> somewhere.


[...]



Re: [Rd] The case for freezing CRAN

2014-03-21 Thread Therneau, Terry M., Ph.D.

This has been a fascinating discussion.

Carl Boettiger replied with a set of examples where the world is much more
fragile than my examples.  That was useful.  It seems that people in my area
(medical research and survival) are more careful with their packages (whew!).


Gabor Csardi discussed the problems with maintaining a package with lots of 
dependencies.
I maintain the survival package which currently has 246 reverse dependencies and take a 
slightly different view, which could be described as "the price of fame".  I feel a 
responsibility to not break R.  I have automated scripts which download the
all 246, using the install-tests option, and run them all. Most updates have 1-3 issues.  
About 25% of the time it turns out to be a problem that I introduced, and in all the 
others I have found the other package authors to be responsive.  It is a nuisance, yes, 
but also worth it.  I've built the test scripts over several years, with help from several 
others; a place to share this information would be a useful addition.


This process also keeps me honest about any updates that are not backwards compatible.
There is hardly a single option that is not used by some other package, somewhere.
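
(The core of such a script can be surprisingly short. A sketch of the general
idea — not my actual script, with the repository URL spelled out only for
clarity:)

    ## Find the reverse dependencies of survival on CRAN
    repos <- "http://cran.r-project.org"
    db    <- available.packages(contriburl = contrib.url(repos))
    revs  <- tools::package_dependencies("survival", db = db,
                                         reverse = TRUE)$survival

    ## Install them with their test suites included, then run those tests
    install.packages(revs, repos = repos, INSTALL_opts = "--install-tests")
    for (p in revs)
      tools::testInstalledPackage(p, types = "tests")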


Terry Therneau



Re: [Rd] The case for freezing CRAN

2014-03-20 Thread Marc Schwartz

On Mar 20, 2014, at 1:02 PM, Marc Schwartz  wrote:

> 
> On Mar 20, 2014, at 12:23 PM, Greg Snow <538...@gmail.com> wrote:
> 
>> On Thu, Mar 20, 2014 at 7:32 AM, Dirk Eddelbuettel  wrote:
>> [snip]
>> 
>>> (and some readers
>>>   may recall the infamous Pentium bug of two decades ago).
>> 
>> [snip]
> 
> 
> Further segue:
> 
> [snip]
> 
> Intel ultimately spent somewhere in the neighborhood of $500 million (in 1994 
> U.S. dollars), as I recall, to implement a large scale Pentium chip 
> replacement infrastructure targeted to end users. The "Intel Inside" 
> marketing campaign was also an outgrowth of that time period.
> 


Quick correction, thanks to Peter, on my assertion that the "Intel Inside" 
campaign arose from the 1994 Pentium issue. It actually started in 1991.

I had a faulty recollection, from my long-ago reading of Andy Grove's 1996 book
"Only The Paranoid Survive", that the slogan arose from Intel's reaction to the
Pentium fiasco. It actually pre-dated that time frame by a few years.

Thanks Peter!

Regards,

Marc



Re: [Rd] The case for freezing CRAN

2014-03-20 Thread Karl Millar
Given versioned / dated snapshots of CRAN, and an agreement that
reproducibility is the responsibility of the study author, the author
simply needs to sync all their packages to a chosen date, run the analysis
and publish the chosen date.  It is true that this doesn't include
compilers, OS, system packages etc, but in my experience those are
significantly more stable than CRAN packages.
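
(On the client side that protocol would be nearly a one-liner. A sketch — the
snapshot host below is hypothetical:)

    ## Point the session at a (hypothetical) dated CRAN snapshot and install
    ## packages exactly as they stood on the published date
    options(repos = c(CRAN = "http://cran-snapshot.example.org/2014-01-15"))
    install.packages(c("survival", "ggplot2"))
    sessionInfo()   # record alongside the paper what was actually used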


Also, my previous description of how to serve up a dated CRAN was way too
complicated.  Since most of the files on CRAN never change, they don't need
version control.  Only the metadata about which versions are current really
needs to be tracked, and that's small enough that it could be stored in
static files.




On Thu, Mar 20, 2014 at 6:32 AM, Dirk Eddelbuettel  wrote:

> [snip]



Re: [Rd] The case for freezing CRAN

2014-03-20 Thread Marc Schwartz

On Mar 20, 2014, at 12:23 PM, Greg Snow <538...@gmail.com> wrote:

> On Thu, Mar 20, 2014 at 7:32 AM, Dirk Eddelbuettel  wrote:
> [snip]
> 
>> (and some readers
>>   may recall the infamous Pentium bug of two decades ago).
> 
> It was a "Flaw" not a "Bug".  At least I remember the Intel people
> making a big deal about that distinction.
> 
> [snip]


Further segue:

That (1994) was a watershed moment for Intel as a company, a time during which
Intel's future was quite literally at stake. Intel's internal response to that
debacle, which fundamentally altered their own perception of just who their
customer was (the OEMs like IBM, COMPAQ and Dell versus the end users like
us), took time to be realized, as the impact of increasingly negative PR took
hold. It was also a good example of the impact of public perception (a flawed
product) versus the realities of how infrequently the flaw would be observed in
"typical" computing. "Perception is reality", as some would observe.

Intel ultimately spent somewhere in the neighborhood of $500 million (in 1994 
U.S. dollars), as I recall, to implement a large scale Pentium chip replacement 
infrastructure targeted to end users. The "Intel Inside" marketing campaign was 
also an outgrowth of that time period.

Regards,

Marc Schwartz





Re: [Rd] The case for freezing CRAN

2014-03-20 Thread Carl Boettiger
There seems to be some question of how frequently changes to software
packages result in irreproducible results.

I am sure Terry is correct that research using functions like `glm` and
other functions that are shipped with base R is quite reliable; after
all, they already benefit from being versioned with R releases, as Jeroen
argues.

In my field of ecology and evolution, the situation is quite different.
 Packages are frequently developed by scientists without any background in
programming and become widely used, such as [geiger](
http://cran.r-project.org/web/packages/geiger/), with 463 papers citing it
and probably many more using it that do not cite it (both because it is
sometimes used only as a dependency of another package or just because our
community isn't great at citing packages).  The package has changed
substantially over the time it has been on CRAN and many functions that
would once run based on older versions could no longer run on newer ones.
 It's dependencies, notably the phylogenetics package ape, has changed
continually over that interval with both bug fixes and substantial changes
to the basic data structure.  The ape package has 1,276 citations (again a
lower bound).  I suspect that correctly identifying the right version of
the software used in any of these thousands of papers would prove difficult
and for a large fraction the results would simply not execute successfully.
It would be much harder to track down cases where the bug fixes would have
any impact on the result.  I have certainly seen both problems in the
hundreds of Sweave/knitr files I have produced over the years that use
these packages.

Even work that simply relies on a package that has been archived becomes a
substantial challenge for other scientists to reproduce, even when an
expert familiar with the packages (e.g. the original author) would have no
trouble.  The informatics team at the Evolutionary Synthesis Center
recently concluded as much in an exercise trying to reproduce several
papers, including my own, that used an archived package (odesolve,
whose replacement, deSolve, does not use quite the same function call for
the same `lsoda` function).

New methods are being published all the time, and I think it is excellent
that in ecology and evolution it is increasingly standard to publish R
packages implementing those methods, as a scan of any table of contents in
"methods in Ecology and Evolution", for instance, will quickly show.  But
unlike `glm`, these methods have a long way to go before they are fully
tested and debugged, and reproducing any work based on them requires a
close eye to the versions (particularly when unit tests and even detailed
changelogs are not common). The methods are invariably built by
"user-developers", researchers developing the code for their own needs, and
thus these packages can themselves fall afoul of changes as they depend and
build upon work of other nascent ecology and evolution packages.

Detailed reproducibility studies of published work in this area are still
hard to come by, not least because the actual code used by the researchers
is seldom published (other than when it is published as its own R
package).  But incompatibilities between successive versions of the 100s of
packages in our domain, along with the interdependencies of those packages,
might provide some window into the difficulties of computational
reproducibility.  I suspect changes in these fast-moving packages are far
more often the culprit than differences in compilers and operating systems.

Cheers,

Carl


On Thu, Mar 20, 2014 at 10:23 AM, Greg Snow <538...@gmail.com> wrote:

> [snip]

Re: [Rd] The case for freezing CRAN

2014-03-20 Thread Greg Snow
On Thu, Mar 20, 2014 at 7:32 AM, Dirk Eddelbuettel  wrote:
[snip]

>  (and some readers
>    may recall the infamous Pentium bug of two decades ago).

It was a "Flaw" not a "Bug".  At least I remember the Intel people
making a big deal about that distinction.

But I do remember the time well; I was a biostatistics Ph.D. student
at the time and bought one of the flawed Pentiums.  My attempts at
getting the chip replaced resulted in a major runaround, and each
person that I talked to would first try to explain that I really did
not need the fix because the only people likely to be affected were
large corporations and research scientists.  I will admit that I was
not a large corporation, but if a Ph.D. student in biostatistics is
not a research scientist, then I did not know what they defined one
as.  When I pointed this out they would usually then say that it still
would not matter, unless I did a few thousand floating point
operations I was unlikely to encounter one of the problematic
divisions.  I would then point out that some days I did over 10,000
floating point operations before breakfast (I had checked after the
1st person told me this and 10,000 was a low estimate of a lower bound
of one set of simulations) at which point they would admit that I had
a case and then send me to talk to someone else who would start the
process over.



[snip]



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [Rd] The case for freezing CRAN

2014-03-20 Thread Dirk Eddelbuettel

No attempt to summarize the thread, but a few highlighted points:

 o Karl's suggestion of versioned / dated access to the repo by adding a
   layer to web access is (as usual) nice.  It works on the 'supply' side.
   But Jeroen's problem is on the demand side.  Even when we know that an
   analysis was done on 20xx-yy-zz, and we reconstruct CRAN as of that day,
   it only gives us a 'ceiling' estimate of what was on the machine.  In
   production or lab environments, installations get stale.  Maybe packages
   were already a year old?  To me, this is an issue that needs to be
   addressed on the 'demand' side, by the user; just writing out version
   numbers is not good enough (see the sketch after this list).

 o Roger correctly notes that R scripts and packages are just one issue.
   Compilers, libraries and the OS matter.  To me, the natural approach these
   days would be to think of something based on Docker or Vagrant or (if you
   must) VirtualBox.  The newer alternatives make snapshotting very cheap
   (eg by using Linux LXC).  That approach reproduces a full environment as
   best we can while still ignoring the hardware layer (and some readers
   may recall the infamous Pentium bug of two decades ago).

 o Reproducibility will probably remain the responsibility of study
   authors. If an investigator on a mega-grant wants to (or needs to) freeze,
   they do have the tools now.  Letting the needs of a few push work onto
   those already overloaded (ie CRAN), and change the workflow of everybody,
   is a non-starter.

 o As Terry noted, Jeroen made some strong claims about exactly how flawed
   the existing system is and keeps coming back to the example of 'a JSS
   paper that cannot be re-run'.  I would really like to see empirics on
   this.  Studies of reproducibility appear to be publishable these days, so
   maybe some enterprising grad student wants to run with the idea of
   actually _testing_ this.  We may be above Terry's 0/30 and nearer to
   Kevin's 'low'/30.  But let's bring some data to the debate.

 o Overall, I would tend to think that our CRAN standards of releasing with
   tests, examples, and checks on every build and release already do a much
   better job of keeping things tidy and workable than in most if not all
   other related / similar open source projects. I would of course welcome
   contradictory examples.
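
As a minimal gesture toward that demand side, an analysis script can at least
dump the environment it actually ran against — a sketch (output file names
are illustrative):

    ## Capture what this analysis actually ran against, at the time it ran
    capture.output(sessionInfo(), file = "analysis-sessionInfo.txt")
    write.csv(installed.packages()[, c("Package", "Version", "Built")],
              "analysis-packages.csv", row.names = FALSE)

That still says nothing about compilers or the OS, which is where the Docker
/ Vagrant snapshots of the second point come in.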

Dirk
 
-- 
Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com



Re: [Rd] The case for freezing CRAN

2014-03-20 Thread Kevin Coombes


On 3/20/2014 9:00 AM, Therneau, Terry M., Ph.D. wrote:
> [snip]
>
> That was my point as well.  Of the 30+ Sweave documents that I've
> produced I can't think of one that will change its output with a new
> version of R.  My 0/30 estimate is at odds with the "nearly all"
> assertion.  Perhaps I only do dull things?
>
> Terry T.


The only concrete example that comes to mind from my own Sweave reports
was actually caused by BioConductor and not CRAN. I had a set of
analyses that used DNAcopy, and the results changed substantially with a
new release of the package in which the default values of the main
function call were changed.  As a result, I've taken to writing out more
of the defaults that I previously just accepted.  There have been a few
minor issues similar to this one (with changes to parts of the Mclust
package ??). So my estimates are somewhat higher than 0/30 but are still
a long way from "almost all".
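
(The practice is cheap insurance. A runnable illustration with a base R
function, leaving the DNAcopy specifics aside — the spelled-out values below
are glm's documented defaults:)

    ## Fragile: silently inherits whatever the installed version's
    ## defaults happen to be
    fit1 <- glm(am ~ wt, data = mtcars, family = binomial)

    ## Defensive: spell the defaults out, so a future change in the package
    ## shows up as a visible diff in the script rather than a silent change
    ## in the results
    fit2 <- glm(am ~ wt, data = mtcars, family = binomial(link = "logit"),
                control = glm.control(epsilon = 1e-8, maxit = 25))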


Kevin



Re: [Rd] The case for freezing CRAN

2014-03-20 Thread Therneau, Terry M., Ph.D.



On 03/20/2014 07:48 AM, Michael Weylandt wrote:
> On Mar 20, 2014, at 8:19, "Therneau, Terry M., Ph.D."  wrote:
>
>> There is a central assertion to this argument that I don't follow:
>>
>>> At the end of the day most published results obtained with R just won't
>>> be reproducible.
>>
>> This is a very strong assertion. What is the evidence for it?
>
> If I've understood Jeroen correctly, his point might be alternatively
> phrased as "won't be reproducED" (i.e., end user difficulties, not
> software availability).
>
> Michael



That was my point as well.  Of the 30+ Sweave documents that I've produced I can't think 
of one that will change its output with a new version of R.  My 0/30 estimate is at odds 
with the "nearly all" assertion.  Perhaps I only do dull things?


Terry T.



Re: [Rd] The case for freezing CRAN

2014-03-20 Thread Michael Weylandt
On Mar 20, 2014, at 8:19, "Therneau, Terry M., Ph.D."  wrote:

> There is a central assertion to this argument that I don't follow:
> 
>> At the end of the day most published results obtained with R just won't be 
>> reproducible.
> 
> This is a very strong assertion. What is the evidence for it?

If I've understood Jeroen correctly, his point might be alternatively phrased 
as "won't be reproducED" (i.e., end user difficulties, not software 
availability).

Michael



Re: [Rd] The case for freezing CRAN

2014-03-20 Thread Therneau, Terry M., Ph.D.

There is a central assertion to this argument that I don't follow:


> At the end of the day most published results obtained with R just won't be
> reproducible.


This is a very strong assertion. What is the evidence for it?

I write a lot of Sweave/knitr in-house as a way of documenting complex
analyses, and a glm() based logistic regression looks the same yesterday as
it will tomorrow.


Terry Therneau
