On 09/19/2018 10:03 AM, Ben Bolker wrote:
...
    Balancing backward compatibility and correctness is a tough problem
here.

I think improvements in the RNG is a situation where backward compatibility is not really going to be lost, because people can specify the old generator, they just will not get it by default. My opinion is that the default needs to generally be the best option available because too many people will be expecting that, or not know better, in which case that is what they should get.

There are only two small problems that occur to me:

1/ Researchers that want to have reproducible results (all I hope) need to be aware the change has happened. In theory they should have recorded the RNG they were using, along with the seed (and, BTW, the number of nodes if they generate with a parallel generator). If they have not done that then they can figure out the RNG from knowing what version of R they used. If they haven't recorded that then they can figure it out by some experimentation and knowing roughly when they did the research. If none of this works then the research probably should be lost.

As an exercise, researchers might also want to experiment with whether the new default qualitatively changes their results. That might lead to publishable research, so no one should complain.

2/ Package maintainers that have used the default RNG to generate tests may need to change their tests to specify the old generator, or modify results used for comparisons in the tests. Since package testing is usually for code checking rather than statistical results, not using the best available generator is not usually an issue.

Most of my own package testing already specifies the generator, lots uses "buggy Kinderman-Ramage" because tests were set up a long time ago. I will have to change package setRNG which warns when the default generator changes. (This warning is intentional because I was bitten badly by a small change in the S generator circa 1990.)


If this goes into base R, what's the best way to do it?  What was
the protocol for migrating away from the "buggy Kinderman-Ramage"
generator, back in the day?   (Version 1.7 was sometime between 2001 and
2004).

I think there may have been a change in R 0.99 too. At least my notes suggest that the code I changed for R 1.7.0 had worked with the default generator from R 0.99 to 1.6.2.

I don't recall the protocol, I think it just happened and was announced in the NEWS. (Has this protocol changed?) The ramification for me was that I had to go through all of my packages' testing and change the name of the explicitly specified RNG to "buggy Kinderman-Ramage".

Perhaps there does need to be a protocol for testing before release. When my package setRNG fails then many of my other packages will also fail because they depend on it. This is a simple fix but reverse dependencies may make it look like lots of things are broken.

Paul Gilbert

   I couldn't find the exact commit in the GitHub mirror: this is related ...

https://github.com/wch/r-source/commit/7ad3044639fd1fe093c655e573fd1a67aa7f55f6#diff-dbcad570d4fb9b7005550ff630543b37



===
‘normal.kind’ can be ‘"Kinderman-Ramage"’, ‘"Buggy
      Kinderman-Ramage"’ (not for ‘set.seed’), ‘"Ahrens-Dieter"’,
      ‘"Box-Muller"’, ‘"Inversion"’ (the default), or ‘"user-supplied"’.
      (For inversion, see the reference in ‘qnorm’.)  The
      Kinderman-Ramage generator used in versions prior to 1.7.0 (now
      called ‘"Buggy"’) had several approximation errors and should only
      be used for reproduction of old results.

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Reply via email to