[Bug tree-optimization/42286] October 23rd change to tree-ssa-pre.c breaks calculix on powerpc with -ffast-math
--- Comment #12 from meissner at gcc dot gnu dot org 2010-03-03 23:01 ---
As I said in the last bug entry, I have convinced myself that the compiler is correct, and that the bug is caused by compiling with -ffast-math while the atan2 library function supports full IEEE semantics, including -0.0.

--
meissner at gcc dot gnu dot org changed:

What       |Removed |Added
Status     |NEW     |RESOLVED
Resolution |        |INVALID

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42286
--- Comment #5 from dje at gcc dot gnu dot org 2010-01-05 15:47 ---
sqrt(-0) = -0, and 0 is a discontinuity for atan2.

Either sqrt should return +0 or atan2 should force +0 if games with signed zeros are allowed.

--
dje at gcc dot gnu dot org changed:

What             |Removed         |Added
Status           |UNCONFIRMED     |NEW
Ever Confirmed   |0               |1
Last reconfirmed |-00-00 00:00:00 |2010-01-05 15:47:15
date             |                |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42286
--- Comment #6 from rguenth at gcc dot gnu dot org 2010-01-05 15:50 --- IIRC the problem is that using fma causes the -0 argument for fsqrt (which behaves 100% correct). Thus a more sensible fix would be to do the 0+ on the fma result. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42286
--- Comment #7 from segher at kernel dot crashing dot org 2010-01-05 15:57 ---
With -fno-signed-zeros, a-b*c is transformed to -(b*c-a), which is a single machine instruction. If the result would have been +0 before, it now is -0. The code then takes the sqrt() of that; sqrt(-0) is -0. This is then passed to atan2(), which has a discontinuity at 0, and we get wildly diverging results.

The compiler does nothing wrong here; the transformation is perfectly valid.

A solution might be to transform e.g. atan2(x,y) into atan2(+0.+x,+0.+y) when -fno-signed-zeros is in effect (and of course somehow make sure those additions aren't optimised away). Similar for other math library functions with discontinuities at +/- 0.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42286
--- Comment #8 from segher at kernel dot crashing dot org 2010-01-05 16:00 ---
(In reply to comment #6)
> IIRC the problem is that using fma causes the -0 argument for fsqrt (which
> behaves 100% correct). Thus a more sensible fix would be to do the 0+ on the
> fma result.

But a -0 result from fsqrt() is no problem /an sich/; problems only show up when you pass a +/- 0 to a function with a discontinuity there.

Also, doing an extra addition on every fma is hugely expensive.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42286
--- Comment #9 from rguenther at suse dot de 2010-01-05 16:03 ---
Subject: Re: October 23rd change to tree-ssa-pre.c breaks calculix on powerpc with -ffast-math

On Tue, 5 Jan 2010, segher at kernel dot crashing dot org wrote:

> --- Comment #7 from segher at kernel dot crashing dot org 2010-01-05 15:57 ---
> With -fno-signed-zeros, a-b*c is transformed to -(b*c-a), which is a
> machine instruction. If the result would have been +0 before, it now
> is -0. The code then takes the sqrt() of that; sqrt(-0) is -0. This
> then is passed to atan2(), which has a discontinuity at 0, and we get
> wildly diverging results.
>
> The compiler does nothing wrong here; the transformation is perfectly
> valid.
>
> A solution might be to transform e.g. atan2(x,y) into atan2(+0.+x,+0.+y)
> when -fno-signed-zeros is in effect (and of course somehow make sure
> those additions aren't optimised away). Similar for other math library
> functions with discontinuities at +/- 0.

Right. Just it might be simpler with -fno-signed-zeros to transform a-b*c to 0 + -(b*c-a). Of course if the result was -0 before then it will be +0 after either variant (and the atan2 discontinuity would still happen even with your fix).

Thus whatever fix the underlying problem is surely that calculix is not really -fno-signed-zeros safe.

Can't we get lucky again as before by trying to recover the PRE code change?

Richard.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42286
--- Comment #10 from segher at kernel dot crashing dot org 2010-01-05 16:42 ---
(In reply to comment #9)
> Right. Just it might be simpler with -fno-signed-zeros to transform
> a-b*c to 0 + -(b*c-a).

a-b*c is two machine instructions; -(b*c-a) is one. Adding zero again makes it two, no good :-(

> Of course if the result was -0 before then it will be +0 after either
> variant (and the atan2 discontinuity would still happen even with your fix).

Sure, it's not a fix, more a band-aid: it will produce more intuitive results when -fno-signed-zeros is in effect, since atan2() does care about the sign of zero, and that compiler flag says we do not want that.

> Thus whatever fix the underlying problem is surely that calculix is
> not really -fno-signed-zeros safe.

Yes, certainly. It seems to me this will happen with more sloppy code though, not just calculix.

> Can't we get lucky again as before by trying to recover the PRE code change?

Well, the code change actually improved the generated code here, do we want to undo that?

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42286
--- Comment #11 from meissner at linux dot vnet dot ibm dot com 2010-01-05 18:40 ---
Subject: Re: October 23rd change to tree-ssa-pre.c breaks calculix on powerpc with -ffast-math

On Tue, Jan 05, 2010 at 04:03:17PM -, rguenther at suse dot de wrote:

> --- Comment #9 from rguenther at suse dot de 2010-01-05 16:03 ---
> Subject: Re: October 23rd change to tree-ssa-pre.c breaks calculix on powerpc with -ffast-math
>
> On Tue, 5 Jan 2010, segher at kernel dot crashing dot org wrote:
>
> > --- Comment #7 from segher at kernel dot crashing dot org 2010-01-05 15:57 ---
> > With -fno-signed-zeros, a-b*c is transformed to -(b*c-a), which is a
> > machine instruction. If the result would have been +0 before, it now
> > is -0. The code then takes the sqrt() of that; sqrt(-0) is -0. This
> > then is passed to atan2(), which has a discontinuity at 0, and we get
> > wildly diverging results.
> >
> > The compiler does nothing wrong here; the transformation is perfectly
> > valid.
> >
> > A solution might be to transform e.g. atan2(x,y) into atan2(+0.+x,+0.+y)
> > when -fno-signed-zeros is in effect (and of course somehow make sure
> > those additions aren't optimised away). Similar for other math library
> > functions with discontinuities at +/- 0.
>
> Right. Just it might be simpler with -fno-signed-zeros to transform
> a-b*c to 0 + -(b*c-a). Of course if the result was -0 before then
> it will be +0 after either variant (and the atan2 discontinuity would
> still happen even with your fix).
>
> Thus whatever fix the underlying problem is surely that calculix is
> not really -fno-signed-zeros safe.
>
> Can't we get lucky again as before by trying to recover the PRE code change?

I've come to the conclusion that the compiler is doing the correct action in terms of the FMA, and that we should not remove the current optimizations. I suspect the AMD/Intel folks will see this also when the AVX/FMA4 hardware shows up, since they also have FMA instructions that generate -0.0 in this case. I don't recall whether, when I was doing the old SSE5 support, we had run the tests through the simulator with -ffast-math or not.

However, I dunno whether there should be a -fno-signed-zeros version of atan2 that does not give a different result for atan2 (-0.0, 1.0) than for atan2 (0.0, 1.0). I know in the past we've floated ideas about having a fast math library that doesn't worry about NaNs/negative 0/etc. Or whether Fortran needs such a wrapper, since I don't believe signed zeroes are a Fortran concept.

For my SPEC runs, I now use the GNU ld --wrap function for atan2, and 'fix' the negative zeroes by adding 0.0:

    static double zero = 0.0;
    extern double __real_atan2 (double, double);

    double
    __wrap_atan2 (double x, double y)
    {
      return __real_atan2 (x + zero, y + zero);
    }

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42286
--- Comment #4 from meissner at linux dot vnet dot ibm dot com 2009-12-07 18:37 ---
Subject: Re: October 23rd change to tree-ssa-pre.c breaks calculix on powerpc with -ffast-math

On Sun, Dec 06, 2009 at 01:25:15PM -, irar at il dot ibm dot com wrote:

> --- Comment #3 from irar at il dot ibm dot com 2009-12-06 13:25 ---
> On powerpc64-suse-linux with current trunk calculix failed after a couple
> of minutes with
>   -O3 -maltivec -ffast-math
>   -O3 -maltivec -ffast-math -fno-tree-vectorize
>   -O2 -maltivec -ffast-math
>   -O1 -maltivec -ffast-math
>
> It is currently running for about an hour with
>   -O0 -maltivec -ffast-math

I suspect you need some amount of optimization to form the negate multiply and subtract/add instruction, and -O0 doesn't do the combine step.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42286
--- Comment #3 from irar at il dot ibm dot com 2009-12-06 13:25 ---
On powerpc64-suse-linux with current trunk calculix failed after a couple of minutes with
  -O3 -maltivec -ffast-math
  -O3 -maltivec -ffast-math -fno-tree-vectorize
  -O2 -maltivec -ffast-math
  -O1 -maltivec -ffast-math

It is currently running for about an hour with
  -O0 -maltivec -ffast-math

Ira

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42286
--- Comment #1 from rguenth at gcc dot gnu dot org 2009-12-04 21:18 --- This change merely turns off PRE in cold code regions. So if -fno-tree-pre breaks calculix for you I'd guess some later optimization manages to miscompile it (my guess: the vectorizer). What options do you build calculix with? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42286
--- Comment #2 from meissner at linux dot vnet dot ibm dot com 2009-12-04 21:28 ---
Subject: Re: October 23rd change to tree-ssa-pre.c breaks calculix on powerpc with -ffast-math

On Fri, Dec 04, 2009 at 09:18:45PM -, rguenth at gcc dot gnu dot org wrote:

> --- Comment #1 from rguenth at gcc dot gnu dot org 2009-12-04 21:18 ---
> This change merely turns off PRE in cold code regions. So if -fno-tree-pre
> breaks calculix for you I'd guess some later optimization manages to
> miscompile it (my guess: the vectorizer).
>
> What options do you build calculix with?

-m32 -O3 -mcpu=power5 -ffast-math

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42286