Re: [Rd] significant digits (PR#9682)

Duncan Murdoch Tue, 03 Jun 2008 16:10:00 -0700

[EMAIL PROTECTED] wrote:

On Jun 3, 2008, at 5:12 PM, Duncan Murdoch wrote:
On 6/3/2008 4:36 PM, Simon Urbanek wrote:
On Jun 3, 2008, at 2:48 PM, Duncan Murdoch wrote:
On 6/3/2008 11:43 AM, Patrick Carr wrote:
On 6/3/08, Duncan Murdoch <[EMAIL PROTECTED]> wrote:
because signif(0.90, digits=2) == 0.9. Those two objects areidentical.
My text above that is poorly worded. They're identical internally,
yes. But in terms of the number of significant digits, 0.9 and 0.90
are different. And that matters when the number is printed, sayas anannotation on a graph. Passing it through sprintf() or format()laterrequires you to specify the number of digits after the decimal,whichis different than the number of significant digits, and requirescase
testing for numbers of different orders of magnitude.
The original complainant (and I) expected this behavior fromsignif(),
not merely rounding. As I said before, I wrote my own workaround so
this is somewhat academic, but I don't think we're alone.
As far as I know, rounding is fine in Windows:
round(1:10 + 0.5)
[1]  2  2  4  4  6  6  8  8 10 10
It might not be the rounding, then. (windows xp sp3)
 > signif(12345,digits=4)
 [1] 12340
 > signif(0.12345,digits=4)
 [1] 0.1235
It's easy to make mistakes in this, but a little outside-of-Rexperimentation suggests those are the right answers. The number12345 is exactly representable, so it is exactly half-way between12340 and 12350, so 12340 is the right answer by the unbiasedround- to-even rule. The number 0.12345 is not exactlyrepresentable, but (I think) it is represented by somethingslightly closer to 0.1235 than to 0.1234. So it looks as thoughWindows gets it right.
OS X (10.5.2/intel) does not have that problem.
Which would seem to imply OS X gets it wrong.
This has nothing to do with OS X, you get that same answer onpretty much all other platforms (Intel/Linux, MIPS/IRIX, Sparc/Sun, ...). Windows is the only one delivering the incorrect resulthere.
Both are supposed to be using the 64 bit floating point standard,so they should both give the same answer:
Should, yes, but Windows doesn't. In fact 10000.0 is exactlyrepresentable and so is 1234.5 which is the correct result thatall except Windows get.
I think you skipped a step.
I didn't - I was just pointing out that what you are trying to show isirrelevant. We are dealing with FP arithmetics here, so although yourreasoning is valid algebraically, it's not in FP world. You missed thefact that FP operations are used to actually get the result (*10000.0,round and divide again) and thus those operation will influence it aswell.

But your multiplication isn't accurate: 10000 is 16 times 625.Multiplying by 16 will shift the binary representation by 4 places, butmultiplying by an odd number should leave the last 1 bit in place. Ifyou go through the following series of operations, you get a differentanswer than multiplying by 10000:


> x <- 0.12345
> x <- 16*x
> y <- x
> x <- 624*x
> x + y - 1234.5
[1] 2.273737e-13

versus

> x <- 0.12345
> x <- 10000*x
> x - 1234.5
[1] 0

I get identical results for both of these calculations on a Mac and onWindows. Looks strange, but I think the explanation is that multiplyingby 625 (or 10000) needs one more bit than multiplying by 624 needs, sowe lose a bit at that point. Windows does the multiplication by 10000accurately in its 64 bit registers, and then loses the last bit onstoring the result into x in 53 bits. But in the signif() calculation,everything stays in 64 bit precision, so Windows gets the right result(0.1235), while those systems that round intermediate results get itwrong (0.1234).

It's all FP calculations, but in Windows we've programmed the run-timeto keep extra bits around for intermediate results and get more accurateresults in cases where things stay in registers, whereas the othersystems throw away the bits and get more consistent results regardlessof the execution path.

Duncan Murdoch

The correct answer is either 0.1234 or 0.1235, not something 10000times bigger. The first important question is whether 0.12345 isexactly representable, and the answer is no. The second question iswhether it is represented by a number bigger or smaller than thereal number 0.12345. If it is bigger, the answer should be 0.1235,and if it is smaller, the answer is 0.1234.
No. That was what I was trying to point out. You can see clearly frommy post that 0.12345 is not exactly representable and that therepresentation is slightly bigger. This is, however, irrelevant,because the next step is to multiply that number by 10000 (see fprecsource) and this is where your reasoning breaks down - the result isexact representation of 1234.5, because the imprecision gets lost inthe operation on all platforms but Windows. The result is that Windowsis inconsistent with others, whether that is a bug or feature I don'tcare. All I really wanted to say is that this has nothing to do withOS X - if anything then it's a Windows issue.
 My experiments suggest it is bigger.
I was not claiming otherwise.
 Yours don't look relevant.
Vice versa as it turns out.

Cheers,
Simon
It certainly isn't exactly equal to 1234.5/10000, because thatnumber is not representable. It's equal to x/2^y, for some x and y,and it's a pain to figure out exactly what they are.
However, I am pretty sure R is representing it (at least on Windows)as the binary expansion
0.000111111001101001101011010100001011000011110010011111
while the true binary expansion (using exact rational arithmetic)starts out
0.00011111100110100110101101010000101100001111001001111011101...
If you line those up, you'll see that the first number is biggerthan the second. (Ugly code to derive these is down below.)
Clearly the top representation is the correct one to that number ofbinary digits, so I think Windows got it right, and all those othersystems didn't. This is probably because R on Windows is usingextended precision (64 bit mantissas) for intermediate results, andthose other systems stick with 53 bit mantissas.
However, this means that Windows doesn't conform
Duncan Murdoch

# Convert number to binary expansion; add the decimal point manually

x <- 0.12345
while (x != 0) {
 cat(trunc(x))
 x <- x - trunc(x)
 x <- x * 2
}

# Do the same thing in exact rational arithmetic

num <- 12345
denom <- 100000
for (i in 1:60) {
 cat(ifelse(num > 100000, "1", "0"))
 num <- num %% 100000
 num <- 2*num
}

# Manually cut and paste the results to get these:

"0.000111111001101001101011010100001011000011110010011111"
"0.00011111100110100110101101010000101100001111001001111011101"
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] significant digits (PR#9682)

Reply via email to