On Jul 20, 2011, at 15:38 , Dirk Eddelbuettel wrote:

> 
> On 20 July 2011 at 14:03, Jeroen Ooms wrote:
> | >> I think Bill Dunlap's answer addressed it: the claim appears to be false.
> | 
> | Here is another example where there is randomness that is not due to
> | the seed. On the same machine, the same R binary, but through another
> | interface. First directly in the shell:
> | 
> | > sessionInfo()
> | R version 2.13.1 (2011-07-08)
> | Platform: i686-pc-linux-gnu (32-bit)
> | 
> | locale:
> |  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
> |  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
> |  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
> |  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
> |  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> | [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
> | 
> | attached base packages:
> | [1] stats     graphics  grDevices utils     datasets  methods   base
> | 
> | > set.seed(123)
> | > print(coef(lm(dist~speed, data=cars)),digits=22)
> |               (Intercept)                     speed
> | -17.579094890510951643137   3.932408759124087715975
> 
> That's PBKAC --- even double precision does NOT get you 22 digits precision.

Hmm, yes, but you would expect the SAME function on the SAME data to yield the 
same floating point number, and give the SAME printout on the SAME R on the 
SAME hardware... 

FWIW all the Mac versions that I can access give the same results as the 
eclipse version.

Let's look at the numbers side-by-side

-17.579094890510951643137   3.932408759124087715975
-17.57909489051087703615    3.93240875912408460735
                !                           !
 12.345678901234567890123   1.234567890123456789012

so we're seeing differences around the 15th/16th significant digit. This is 
consistent with a difference of about one unit of least precision in the actual 
objects, but there could conceivably be other explanations, e.g. the print() 
function picking up random garbage. Jeroen: Could you save() the results from 
the two cases, load() them in a new session and compute the difference?
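
A minimal sketch of both checks, in case it helps. The first part types the two printouts in as literals and expresses their gap in ULPs, assuming IEEE 754 doubles with a 53-bit significand. Literals only approximate whatever the parser makes of them, so the saved objects remain the real test. The second part saves the coefficients and compares them in a fresh session. The helper names (ulp, save_coefs, compare_coefs) and the file names are hypothetical, not anything from R itself.

```r
# Size of one unit in the last place (ULP) of a double near x,
# assuming IEEE 754 doubles with a 53-bit significand.
ulp <- function(x) 2^(floor(log2(abs(x))) - 52)

# The two printouts from this thread, typed in as literals.
shell   <- c(-17.579094890510951643137, 3.932408759124087715975)
eclipse <- c(-17.57909489051087703615,  3.93240875912408460735)

# Gap between the printouts, in absolute terms and in ULPs.
gap      <- abs(shell - eclipse)
gap_ulps <- gap / ulp(shell)
print(gap)
print(gap_ulps)

# Run once in each interface with a different (hypothetical) file name,
# e.g. "shell.RData" and "eclipse.RData".
save_coefs <- function(path) {
  cf <- coef(lm(dist ~ speed, data = cars))
  save(cf, file = path)
  invisible(path)
}

# Run in a fresh session: load each file into its own environment so
# the two objects named 'cf' do not overwrite each other.
compare_coefs <- function(path_a, path_b) {
  env_a <- new.env(); load(path_a, envir = env_a)
  env_b <- new.env(); load(path_b, envir = env_b)
  list(difference = env_a$cf - env_b$cf,
       identical  = identical(env_a$cf, env_b$cf))
}
```

Usage would be along the lines of compare_coefs("shell.RData", "eclipse.RData"). A nonzero difference there would point at the stored objects rather than at print().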

>  
> 
> You may want to read up on 'What Every Computer Scientist Should Know About
> Floating-Point Arithmetic' by Goldberg (a true internet classic)
> and ponder why a common choice for the various 'epsilon' settings of general
> convergence is one of the constants supplied by the OS and/or its C
> library. R has
> 
>  #define SINGLE_EPS     FLT_EPSILON
>  [...]
>  #define DOUBLE_EPS     DBL_EPSILON
> 
> in Constants.h. You can then chase the definition of FLT_EPSILON and
> DBL_EPSILON through your system headers (which is a good exercise).
> 
> One place you may end up is the manual -- the following is from the GNU libc
> documentation on "Floating Point Parameters":
> 
> FLT_EPSILON
>     This is the minimum positive floating point number of type float such that
>     1.0 + FLT_EPSILON != 1.0 is true. It's supposed to be no greater than
>     1E-5.
> 
> DBL_EPSILON
> LDBL_EPSILON
>     These are similar to FLT_EPSILON, but for the data types double and long
>     double, respectively. The type of the macro's value is the same as the
>     type it describes. The values are not supposed to be greater than 1E-9.
> 
> So there -- nine digits. 
> 
> Dirk 
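
For reference, a small sketch of how the double epsilon can be inspected from within R. The 1E-9 in the libc text is only an upper bound required by the C standard. On IEEE 754 hardware the actual DBL_EPSILON is 2^-52, about 2.2e-16, which corresponds to roughly 15 to 16 significant decimal digits. The sketch assumes such a platform.

```r
# Machine epsilon for doubles as reported by R.
eps <- .Machine$double.eps
print(eps)

# On an IEEE 754 platform this is expected to equal 2^-52.
print(eps == 2^-52)

# Defining property: adding eps to 1 gives a number different from 1.
print(1 + eps != 1)

# Number of significand bits, and the rough number of decimal digits
# that one epsilon corresponds to.
print(.Machine$double.digits)
print(-log10(eps))
```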
> 
> 
> | # And this is through eclipse (java)
> | 
> | > sessionInfo()
> | R version 2.13.1 (2011-07-08)
> | Platform: i686-pc-linux-gnu (32-bit)
> | 
> | locale:
> |  [1] LC_CTYPE=en_US.UTF-8          LC_NUMERIC=C
> |  [3] LC_TIME=en_US.UTF-8           LC_COLLATE=en_US.UTF-8
> |  [5] LC_MONETARY=en_US.UTF-8       LC_MESSAGES=en_US.UTF-8
> |  [7] LC_PAPER=en_US.UTF-8          LC_NAME=en_US.UTF-8
> |  [9] LC_ADDRESS=en_US.UTF-8        LC_TELEPHONE=en_US.UTF-8
> | [11] LC_MEASUREMENT=en_US.UTF-8    LC_IDENTIFICATION=en_US.UTF-8
> | 
> | attached base packages:
> | [1] stats     graphics  grDevices utils     datasets  methods   base
> | 
> | other attached packages:
> | [1] rj_0.5.2-1
> | 
> | loaded via a namespace (and not attached):
> | [1] rJava_0.9-1  tools_2.13.1
> | 
> | > set.seed(123)
> | > print(coef(lm(dist~speed, data=cars)),digits=22)
> |              (Intercept)                    speed
> | 

> | 
> | ______________________________________________
> | R-devel@r-project.org mailing list
> | https://stat.ethz.ch/mailman/listinfo/r-devel
> 
> -- 
> Gauss once played himself in a zero-sum game and won $50.
>                      -- #11 at http://www.gaussfacts.com
> 

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com

