On 20 July 2011 at 18:02, peter dalgaard wrote:
|
| On Jul 20, 2011, at 15:38 , Dirk Eddelbuettel wrote:
|
| >
| > On 20 July 2011 at 14:03, Jeroen Ooms wrote:
| > | >> I think Bill Dunlap's answer addressed it: the claim appears to be false.
| > |
| > | Here is another example where there is randomness that is not due to
| > | the seed. On the same machine, the same R binary, but through another
| > | interface. First directly in the shell:
| > |
| > | > sessionInfo()
| > | R version 2.13.1 (2011-07-08)
| > | Platform: i686-pc-linux-gnu (32-bit)
| > |
| > | locale:
| > |  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
| > |  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
| > |  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
| > |  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
| > |  [9] LC_ADDRESS=C               LC_TELEPHONE=C
| > | [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
| > |
| > | attached base packages:
| > | [1] stats graphics grDevices utils datasets methods base
| > |
| > | > set.seed(123)
| > | > print(coef(lm(dist~speed, data=cars)),digits=22)
| > |               (Intercept)                     speed
| > | -17.579094890510951643137   3.932408759124087715975
| >
| > That's PBKAC --- even double precision does NOT get you 22 digits precision.
|
| Hmm, yes, but you would expect the SAME function on the SAME data to yield the same floating point number, and give the SAME printout on the SAME R on the SAME hardware...
|
| FWIW all the Mac versions that I can access give the same results as the eclipse version.
|
| Let's look at the numbers side-by-side
|
| -17.579094890510951643137   3.932408759124087715975
| -17.57909489051087703615    3.93240875912408460735
|                 !                           !
|  12.345678901234567890123   1.234567890123456789012
|
| so we're seeing differences around the 15th/16th significant digit. This is consistent with a difference of about one unit of least precision in the actual objects, but there could conceivably be other explanations, e.g. the print() function picking up random garbage.
|
| Jeroen: Could you save() the results from the two cases, load() them in a new session and compute the difference?
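
The side-by-side comparison above can also be put into numbers by counting how many representable doubles separate the two printed values. What follows is only a minimal sketch in C, not something from the thread: it assumes IEEE 754 binary64 doubles and that the 22-digit printouts convert back to exactly the stored values. The helper names double_bits and ulp_distance are hypothetical, not part of R or libc.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy the bit pattern of a double into a signed 64-bit integer.
 * Assumption: double is an IEEE 754 binary64 value. */
int64_t double_bits(double x)
{
    int64_t bits;
    assert(sizeof bits == sizeof x);
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Number of steps between adjacent doubles needed to get from a to b.
 * Only meaningful when a and b are finite and share the same sign,
 * which holds for both coefficient pairs quoted in the thread. */
int64_t ulp_distance(double a, double b)
{
    int64_t ia = double_bits(a);
    int64_t ib = double_bits(b);
    assert((ia < 0) == (ib < 0));   /* same sign, so no overflow below */
    return ia > ib ? ia - ib : ib - ia;
}
```

Under those assumptions, ulp_distance(-17.579094890510951643137, -17.57909489051087703615) should return 21 and ulp_distance(3.932408759124087715975, 3.93240875912408460735) should return 7. That is a handful of spacings rather than exactly one, but still confined to the 15th/16th significant digit. Comparing the save()d objects in R, as requested above, remains the direct check.
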

Yes, 15 to 16 is common. I should have added that to my post when I said '22 is too much'. And I did not want to give the impression that nine is what one gets; nine is the minimum as per the libc docs I quoted, but as you illustrate, 15 to 16 can often be had. Thanks for the follow-up.

Dirk

| > You may want to read up on 'what every computer scientist should know about
| > floating point arithmetic' by Goldberg (which is a true internet classic)
| > and ponder why a common setting for the various 'epsilon' settings of general
| > convergence is set to one of the constants supplied by the OS and/or its C
| > library. R has
| >
| >    #define SINGLE_EPS FLT_EPSILON
| >    [...]
| >    #define DOUBLE_EPS DBL_EPSILON
| >
| > in Constants.h. You can then chase the definition of FLT_EPSILON and
| > DBL_EPSILON through your system headers (which is a good exercise).
| >
| > One place you may end up is the manual -- the following is from the GNU libc
| > documentation on "Floating Point Parameters":
| >
| >    FLT_EPSILON
| >        This is the minimum positive floating point number of type float such that
| >        1.0 + FLT_EPSILON != 1.0 is true. It's supposed to be no greater than 1E-5.
| >
| >    DBL_EPSILON
| >    LDBL_EPSILON
| >        These are similar to FLT_EPSILON, but for the data types double and long
| >        double, respectively. The type of the macro's value is the same as the type
| >        it describes. The values are not supposed to be greater than 1E-9.
| >
| > So there -- nine digits.
| >
| > Dirk
| >
| >
| > | # And this is through eclipse (java)
| > |
| > | > sessionInfo()
| > | R version 2.13.1 (2011-07-08)
| > | Platform: i686-pc-linux-gnu (32-bit)
| > |
| > | locale:
| > |  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
| > |  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
| > |  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
| > |  [7] LC_PAPER=en_US.UTF-8       LC_NAME=en_US.UTF-8
| > |  [9] LC_ADDRESS=en_US.UTF-8     LC_TELEPHONE=en_US.UTF-8
| > | [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=en_US.UTF-8
| > |
| > | attached base packages:
| > | [1] stats graphics grDevices utils datasets methods base
| > |
| > | other attached packages:
| > | [1] rj_0.5.2-1
| > |
| > | loaded via a namespace (and not attached):
| > | [1] rJava_0.9-1 tools_2.13.1
| > |
| > | > set.seed(123)
| > | > print(coef(lm(dist~speed, data=cars)),digits=22)
| > | (Intercept) speed
| > |
| > | ______________________________________________
| > | R-devel@r-project.org mailing list
| > | https://stat.ethz.ch/mailman/listinfo/r-devel
| >
| > --
| > Gauss once played himself in a zero-sum game and won $50.
| >    -- #11 at http://www.gaussfacts.com
|
| --
| Peter Dalgaard
| Center for Statistics, Copenhagen Business School
| Solbjerg Plads 3, 2000 Frederiksberg, Denmark
| Phone: (+45)38153501
| Email: pd....@cbs.dk Priv: pda...@gmail.com
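
The GNU libc passage quoted above gives 1E-9 only as an upper bound on DBL_EPSILON, which is why "nine digits" is a floor and 15 to 16 is what one usually sees. The following is a minimal sketch in C of that distinction, assuming IEEE 754 doubles and the default round-to-nearest mode. The helper names decimal_digits and is_machine_epsilon are hypothetical and are not taken from R's Constants.h or from libc; only FLT_EPSILON and DBL_EPSILON come from <float.h>.

```c
#include <assert.h>
#include <float.h>

/* Largest n such that 10^-n >= eps, i.e. how many whole decimal digits
 * a relative spacing of eps can resolve. */
int decimal_digits(double eps)
{
    assert(eps > 0.0 && eps < 1.0);
    int n = 0;
    double power = 0.1;             /* holds 10^-(n+1) */
    while (power >= eps) {
        n++;
        power /= 10.0;
    }
    return n;
}

/* Returns 1 if adding eps to 1.0 changes it while adding eps/2 does not.
 * The volatile stores force each sum to be rounded to double precision. */
int is_machine_epsilon(double eps)
{
    volatile double above = 1.0 + eps;
    volatile double same = 1.0 + eps / 2.0;
    return above != 1.0 && same == 1.0;
}
```

On such a platform decimal_digits(DBL_EPSILON) should give 15 and decimal_digits(FLT_EPSILON) should give 6, while is_machine_epsilon(1e-9) is false: 1E-9 satisfies the documented bound but is far coarser than the actual spacing of doubles near 1.0.
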