I wonder how useful a (set of?) "time machine" function(s) would be that look up or infer things like this, e.g. which RNG was the default, based on a date. That could ease the pain of such changes generally, though not remove it completely.
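To make the idea concrete, here is a rough sketch of what such a lookup might look like. It is written in Python only to keep it self-contained; in R one would presumably wrap something like RNGversion(). The names DEFAULT_RNG_HISTORY and default_rng_on are hypothetical, and the table entries (the cutoff date for the 1.7.0 change and the pre-1.7.0 kind names) are assumptions for illustration that should be checked against R's NEWS before anyone relies on them.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical change log: (date the default took effect, uniform kind, normal kind).
# ASSUMPTION: these entries are illustrative, not an authoritative history of R.
DEFAULT_RNG_HISTORY = [
    (date.min, "Marsaglia-Multicarry", "Buggy Kinderman-Ramage"),
    (date(2003, 4, 16), "Mersenne-Twister", "Inversion"),  # assumed R 1.7.0 date
]


def default_rng_on(day):
    """Return the (uniform kind, normal kind) assumed to be the default on `day`."""
    starts = [entry[0] for entry in DEFAULT_RNG_HISTORY]
    # Index of the last entry whose start date is on or before `day`.
    index = bisect_right(starts, day) - 1
    _, uniform_kind, normal_kind = DEFAULT_RNG_HISTORY[index]
    return uniform_kind, normal_kind


# Which defaults would a script dated 2000-01-01 have been run with?
print(default_rng_on(date(2000, 1, 1)))
```

A date alone only narrows things down, of course: the user may have been running an older release on that day, which is why this eases the pain without removing it.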
~G

On Wed, Aug 31, 2016 at 5:45 PM, Paul Gilbert <pgilbert...@gmail.com> wrote:

> On 08/30/2016 06:29 PM, Duncan Murdoch wrote:
>
>> I don't see evidence of a bug. There have been several versions of the
>> MT; we may be using a different version than you are. Ours is the
>> 1999/10/28 version; the web page you cite uses one from 2002.
>>
>> Perhaps the newer version fixes some problems, and then it would be
>> worth considering a change. But changing the default RNG definitely
>> introduces problems in reproducibility,
>>
>
> Well "problems in reproducibility" is a bit vague. Results would always be
> reproducible by specifying kind="Mersenne-Twister" or kind="Buggy
> Kinderman-Ramage" for older results, so there is no problem reproducing
> results. The only problem is that users expecting to reproduce results
> twenty years later will need to know what random generator they used. (BTW,
> they may also need to record information about the normal or other
> generator, as well as the seed.) Of course, these changes are recorded
> pretty well for R, so the history of "default" can always be found.
>
> I think it is a mistake to encourage users into thinking they do not need
> to keep track of some information if they want reproducibility. Perhaps the
> default should be changed more often in order to encourage better user
> habits.
>
> More seriously, I think "default" should continue to be something that is
> currently considered to be good. So, if there really is a known problem,
> then I think "default" should be changed.
>
> (And, no I did not get burned by the R 1.7.0 change in the default
> generator. I got burned by a much earlier, unadvertised, and more subtle
> change in the Splus generator.)
>
> Paul Gilbert
>
>> so it's not obvious that we
>> would do it.
>>
>> Duncan Murdoch
>>
>>
>> On 30/08/2016 5:45 PM, Mark Roberts wrote:
>>
>>> Whomever,
>>>
>>> I recently sent the "bug report" below to r-c...@r-project.org and have
>>> just been asked to instead submit it to you.
>>>
>>> Although I am basically not an R user, I have installed version 3.3.1
>>> and am also the author of a statistics program written in Visual Basic
>>> that contains a component which correctly implements the Mersenne
>>> Twister (MT) algorithm. I believe that it is not possible to generate
>>> the correct stream of pseudorandom numbers using the MT default random
>>> number generator in R, and am not the first person to notice this. Here
>>> is a posted 2013 entry
>>> (www.r-bloggers.com/reproducibility-and-randomness/) on an R website
>>> that asserts that the SAS computer program implementation of the MT
>>> algorithm produces different numbers than R does when using the same
>>> starting seed number. The author of this post didn’t get anyone to
>>> respond to his query about the reason for this SAS vs. R discrepancy.
>>>
>>> There are two ways of initializing the original MT computer program
>>> (written in C) so that an identical stream of numbers can be repeatedly
>>> generated: 1) with a particular integer seed number, and 2) with a
>>> particular array of integers. In the 'compilation and usage' section
>>> of this webpage (https://github.com/cslarsen/mersenne-twister) there is
>>> a listing of the first 200 random numbers the MT algorithm should
>>> produce for seed number = 1. The inventors of the Mersenne Twister
>>> random number generator provided two different sets of the first 1000
>>> numbers produced by a correctly coded 32-bit implementation of the MT
>>> algorithm when initializing it with a particular array of integers at:
>>> www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/MT2002/CODES/mt19937ar.out.
>>> [There is a link to this output at:
>>> www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/MT2002/emt19937ar.html.]
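For anyone who wants to check an implementation against those two kinds of reference output, below is a minimal sketch of the 32-bit MT19937 with both initialization routes, transcribed into Python from my reading of the 2002 reference C code (mt19937ar.c). It is not R's code and says nothing about how R turns set.seed() into generator state. The class name MT19937 is my own, and the key [0x123, 0x234, 0x345, 0x456] is, as far as I recall, the one used by the reference test driver; treat that as an assumption.

```python
N, M = 624, 397
MATRIX_A = 0x9908B0DF
UPPER_MASK = 0x80000000
LOWER_MASK = 0x7FFFFFFF
MASK32 = 0xFFFFFFFF


class MT19937:
    """Sketch of the 32-bit Mersenne Twister, following the 2002 reference code."""

    def __init__(self):
        self.mt = [0] * N
        self.mti = N + 1  # N + 1 marks "not yet seeded"

    def init_genrand(self, seed):
        """Route 1: initialize from a single integer seed."""
        self.mt[0] = seed & MASK32
        for i in range(1, N):
            prev = self.mt[i - 1]
            self.mt[i] = (1812433253 * (prev ^ (prev >> 30)) + i) & MASK32
        self.mti = N

    def init_by_array(self, key):
        """Route 2: initialize from an array of integers."""
        self.init_genrand(19650218)
        mt = self.mt
        i, j = 1, 0
        for _ in range(max(N, len(key))):
            prev = mt[i - 1]
            mt[i] = ((mt[i] ^ ((prev ^ (prev >> 30)) * 1664525)) + key[j] + j) & MASK32
            i += 1
            j += 1
            if i >= N:
                mt[0] = mt[N - 1]
                i = 1
            if j >= len(key):
                j = 0
        for _ in range(N - 1):
            prev = mt[i - 1]
            mt[i] = ((mt[i] ^ ((prev ^ (prev >> 30)) * 1566083941)) - i) & MASK32
            i += 1
            if i >= N:
                mt[0] = mt[N - 1]
                i = 1
        mt[0] = 0x80000000  # guarantees a non-zero initial state

    def _twist(self):
        """Regenerate all N state words in place."""
        mt = self.mt
        for kk in range(N):
            y = (mt[kk] & UPPER_MASK) | (mt[(kk + 1) % N] & LOWER_MASK)
            mt[kk] = mt[(kk + M) % N] ^ (y >> 1) ^ (MATRIX_A if y & 1 else 0)
        self.mti = 0

    def genrand_int32(self):
        """Return the next 32-bit unsigned output."""
        if self.mti >= N:
            if self.mti == N + 1:  # never seeded: use the reference default seed
                self.init_genrand(5489)
            self._twist()
        y = self.mt[self.mti]
        self.mti += 1
        # Tempering
        y ^= y >> 11
        y ^= (y << 7) & 0x9D2C5680
        y ^= (y << 15) & 0xEFC60000
        y ^= y >> 18
        return y & MASK32


by_seed = MT19937()
by_seed.init_genrand(1)
print(by_seed.genrand_int32())  # 1791095845

by_key = MT19937()
by_key.init_by_array([0x123, 0x234, 0x345, 0x456])
print(by_key.genrand_int32())  # 1067595299
```

The two printed values are the first entries of the seed = 1 stream and of the array-initialized reference output, respectively, so a mismatch at this point already indicates a different seeding procedure or a different algorithm.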
>>>
>>> My statistics program obtains exactly those 200 numbers from the first
>>> site mentioned in the previous paragraph and also obtains those same
>>> numbers from the second website (though I didn't check all 2000 values).
>>> Assuming that the MT code within R uses the 32-bit MT algorithm, I
>>> suspect that the current version of R can't do that. If you (i.e.,
>>> anyone who might knowledgeably respond to this report) are able to
>>> duplicate those reference test-values, then please send me the R code to
>>> initialize the MT code within R to successfully do that, and I apologize
>>> for having wasted your time. If you (collectively) can't do that, then R
>>> is very likely using incorrectly implemented MT code. And if this
>>> latter possibility is true, it seems to me that this is something that
>>> should be fixed.
>>>
>>> Mark Roberts, Ph.D.
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel

--
Gabriel Becker, PhD
Associate Scientist (Bioinformatics)
Genentech Research