On 17.05.2016 23:07, Walter Bright wrote:
On 5/17/2016 11:08 AM, Timon Gehr wrote:
Right. Hence, the 80-bit CTFE results have to be converted to the final precision at some point in order to commence the runtime computation. This means that additional rounding happens, which was not present in the original program. The additional total roundoff error this introduces can exceed the roundoff error you would have suffered by using the lower precision in the first place, sometimes completely defeating precision-enhancing improvements to an algorithm.

I'd like to see an example of double rounding "completely defeating" an algorithm,

I have given this example, and I have explained it.

However, let me provide one of the examples I have given before, in a more concrete fashion. Unfortunately, there is no way to use "standard" D to illustrate the problem, as there is no way to write an implementation that is guaranteed not to be broken. So let us assume, hypothetically for now, that we are using D', where all computations are performed at the specified precision.

I'm copying the code from:
https://en.wikipedia.org/wiki/Kahan_summation_algorithm


$ cat kahanDemo.d

module kahanDemo;

// Baseline: naive left-to-right summation in double precision.
double sum(double[] arr){
    double s = 0.0;
    foreach(x; arr) s += x;
    return s;
}

// Kahan summation: c carries the low-order bits that each rounded
// addition to sum loses, and feeds them back into the next step.
double kahan(double[] arr){
    double sum = 0.0;
    double c = 0.0;
    foreach(x; arr){
        double y = x - c;   // apply the previous correction
        double t = sum + y; // rounded partial sum
        c = (t - sum) - y;  // recover exactly what the rounding lost
        sum = t;
    }
    return sum;
}

// The same algorithm, but with the intermediates y and t computed in
// 80-bit real, as an optimizer that raises precision is allowed to do.
double kahanBroken(double[] arr){
    double sum = 0;
    double c = 0.0;
    foreach(x; arr){
        real y = x - c;
        real t = sum + y;
        c = (t - sum) - y;  // no longer captures the double-precision rounding error
        sum = t;
    }
    return sum;
}

void main(){
    double[] data = [1e16, 1, -9e15];
    import std.stdio;
    writefln("%f", sum(data));         // baseline
    writefln("%f", kahan(data));       // kahan
    writefln("%f", kahanBroken(data)); // broken kahan
}

In D, the compiler is in principle allowed to transform the non-broken version into the broken version. (And maybe it will soon be allowed to transform the baseline version into the Kahan version. Who knows.)
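
To make the failure mode concrete, here is a minimal trace sketch of the second loop iteration of kahanBroken (the one with x == 1), assuming real is the x87 80-bit format:

void main(){
    import std.stdio;
    double sum = 1e16; // state after the first iteration
    double x = 1, c = 0.0;
    real y = x - c;    // y == 1
    real t = sum + y;  // exact in 80 bits: 10000000000000001
    c = (t - sum) - y; // (1) - 1 == 0: the correction term vanishes
    sum = t;           // rounds back to double: 1e16, the 1 is lost for good
    writefln("sum=%f c=%f", sum, c); // expect: sum=10000000000000000.000000 c=0.000000
}

With double intermediates, t rounds to 1e16 and c becomes -1, so the final iteration computes y = -9e15 - (-1) and recovers the correct 1000000000000001.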

Now let's see what DMD does:

$ dmd --version
DMD64 D Compiler v2.071.0
Copyright (c) 1999-2015 by Digital Mars written by Walter Bright

$ dmd -m64 -run kahanDemo.d
1000000000000000.000000
1000000000000001.000000
1000000000000000.000000

Nice, this is what I expect.

$ dmd -m64 -O -run kahanDemo.d
1000000000000000.000000
1000000000000001.000000
1000000000000000.000000

Still great.

$ dmd -m32 -run kahanDemo.d
1000000000000000.000000
1000000000000001.000000
1000000000000000.000000

Liking this.

$ dmd -m32 -O -run kahanDemo.d
1000000000000000.000000
1000000000000000.000000
1000000000000000.000000

Screw you, DMD!

And suddenly, I need to compile and test my code with all combinations of compiler flags, and even then I am not sure the compiler is not intentionally screwing me over. How is this remotely acceptable?

and why an unusual case of producing a slightly worse

It's not just slightly worse; it can cut the number of useful bits in half or more! And it is not unusual: I have actually run into these problems in the past, and it can break an algorithm that is in Phobos today!
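
To see where the "half the bits" comes from: compensated algorithms are built on error-free transformations such as Dekker's Fast2Sum, which splits a sum of two doubles into a rounded result plus the exact rounding error, effectively doubling the number of useful bits. A minimal sketch (the name fast2sum is mine), assuming an implementation like the hypothetical D' above that performs each operation in double precision:

// Fast2Sum (Dekker): for |a| >= |b|, s + e == a + b holds *exactly*.
void fast2sum(double a, double b, out double s, out double e){
    s = a + b;        // rounded sum
    double z = s - a; // the part of b that made it into s
    e = b - z;        // the part that the rounding discarded
}

For example, fast2sum(1e16, 1, s, e) should yield s == 1e16 and e == 1, so the pair (s, e) still represents 10000000000000001 exactly. If the compiler instead keeps the intermediate s at 80-bit precision before it is finally rounded to double, this guarantee is gone, and with it the extra bits.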

answer trumps the usual case of producing better answers.
...

The 'usual case of producing better answers' /is not actually desirable/, because the compiler does not guarantee that it happens all the time! I don't want my code to rely on something that might not always happen. I want to be sure that my code is correct. I cannot conveniently do so if you don't tell me in advance what it does, and/or if the behaviour has a lot of abstraction-breaking special cases.


There are other reasons why I think that this kind of implementation-defined behaviour is a terribly bad idea, e.g.:

- it breaks common assumptions about code, especially how it behaves under seemingly innocuous refactorings, or with a different set of compiler flags.

As pointed out, this already happens with just about every language. It
happens with all C/C++ compilers I'm aware of.

I'm not claiming those languages don't have broken floating point semantics. I have sometimes had to use inline assembler in C++ to get the results I want. It's painful and unnecessary.

It happens as the default behavior of the x86.

I know. I don't care. It is a stupid idea. See above.

And as pointed out, refactoring (x+y)+z to x+(y+z) often produces different results, and surprises a lot of people.
...

As far as I can tell, you are saying: "You think A is bad, but A is similar to B, and B is bad, but B is hard to fix, hence A is actually good." I disagree.


For floating point, '+' means: "add precisely, then round". It is indeed potentially surprising, and not very helpful for generic code, that floating point types and integral types use the same syntax for conceptually different operations (i.e., the language commits operator overloading abuse), but we are stuck with that now. Implementation-defined behaviour, on the other hand, can actually be eliminated in a backward-compatible way.
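
Under that reading, the non-associativity of '+' that was mentioned above is fully specified rather than mysterious. A small sketch, assuming an implementation that actually evaluates each '+' in double precision (as D' would):

void main(){
    import std.stdio;
    double x = 0.1, y = 0.2, z = 0.3;
    writefln("%.17f", (x + y) + z); // 0.60000000000000009
    writefln("%.17f", x + (y + z)); // 0.59999999999999998, since 0.2 + 0.3 rounds to exactly 0.5
}

Both results are individually correctly rounded; they differ because rounding happens after each '+', not once at the end.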

Refactorings as simple as moving some expression into its own function should be possible without surprisingly wreaking havoc. Implementations can (and do) choose strange and useless behaviours. The fact that types do not actually specify what kind of value will be used at runtime is a pointless waste of type safety.


- it breaks reproducibility, which is sometimes more important than being close to the infinite precision result (which you cannot guarantee with any finite floating point type anyway).
  (E.g. in a game, it is enough if the result seems plausible, but it should be the same for everyone. For some scientific experiments, the ideal case is to have 100% reproducibility of the computation, even if it is horribly wrong, such that other scientists can easily uncover and diagnose the problem.)

Nobody is proposing a D feature that does not produce reproducible results

Do you disagree with the notion that implementation-defined behaviour is almost by definition detrimental to reproducibility? (I'm not trying to say that it makes reproducing results impossible. It is a continuum.)

with the same program on the same inputs.

I'm talking about computations, not just programs, and I ideally want consistent behaviour across compilers/compiler versions/compiler flags (at least those flags not specifically designed to change language semantics in precisely that way). If you can additionally give me refactorability (i.e. a lack of unprincipled interplay between language features that should be orthogonal [1]), that would be really awesome.

The result should ideally not depend on, e.g., whether a computation is run at compile time or at run time, or on whatever other irrelevant implicit detail changes during refactoring or when switching machines. (E.g. one person might be running a 32-bit system and another person (or the same person) a 64-bit system, and they get different floating-point results. It just adds a lot of friction and pointless work.)

This complaint is a strawman, as I've pointed out multiple times.
...

A strawman is an argument that one incorrectly attributes to the other party.

Here, I think the underlying problem is that there is a misunderstanding. (I.e. you think I'm saying something I didn't intend to say, or vice-versa. [2])

This is often the case, so I try to avoid calling out strawmen by name and instead put my argument differently. (E.g., when you seemingly started to claim that my argument was something like "using lower precision for the entire computation should be expected to yield more accurate results" and then ridiculed it.)

In fact, the results would be MORE portable than with C/C++,

I know that it is better than completely broken. I'm sorry, but I have higher standards. It should be not broken.

because the FP behavior is completely implementation defined, and compilers take advantage of that.


I guess the reason why it is completely implementation defined is the desire that there be fully standard-compliant C/C++ compilers for every platform, combined with the fact that implementers didn't want to support IEEE 754 everywhere. One way to implement a specification is to weaken the specification.



[1] This is the single most important thing programming languages developed in academia often get right.

[2] To clarify: I know that if I compile a program on systems with identical states, on an identical compiler with identical flags, I will (or should) get identical binaries that then produce identical results on conforming hardware. (See how I didn't need to say "identical hardware"? I want to be there!) It is not always desirable or practical to keep around all that additional state though.
