[Bug tree-optimization/67781] [5/6 Regression] wrong code generated on big-endian with -O1 -fexpensive-optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67781 --- Comment #14 from Thomas Preud'homme --- Author: thopre01 Date: Fri Jan 8 09:21:19 2016 New Revision: 232154 URL: https://gcc.gnu.org/viewcvs?rev=232154=gcc=rev Log: 2016-01-08 Thomas Preud'hommegcc/ PR tree-optimization/67781 * tree-ssa-math-opts.c (find_bswap_or_nop): Zero out bytes in cmpxchg and cmpnop in two steps: first the ones not accessed in original gimple expression in a endian independent way and then the ones not accessed in the final result in an endian-specific way. gcc/testsuite/ PR tree-optimization/67781 * gcc.c-torture/execute/pr67781.c: New file. Added: trunk/gcc/testsuite/gcc.c-torture/execute/pr67781.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-ssa-math-opts.c
[Bug tree-optimization/67781] [5/6 Regression] wrong code generated on big-endian with -O1 -fexpensive-optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67781 --- Comment #12 from Mikael Pettersson --- Thanks for posting the patch, I'm currently doing a sparc64-linux bootstrap and regtest with it.
[Bug tree-optimization/67781] [5/6 Regression] wrong code generated on big-endian with -O1 -fexpensive-optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67781 --- Comment #13 from Mikael Pettersson --- Patch passed bootstrap and regtest on sparc64-linux. No regressions.
[Bug tree-optimization/67781] [5/6 Regression] wrong code generated on big-endian with -O1 -fexpensive-optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67781 --- Comment #11 from Thomas Preud'homme --- Yes, sorry. I had finished bootstrap on x86_64-linux-gnu and regression testsuite on both that and arm-none-eabi but I wanted to do a bootstrap on a big endian system. I then got caught up by holidays. I'll post the patch for review and send an update once big endian bootstrap is over.
[Bug tree-optimization/67781] [5/6 Regression] wrong code generated on big-endian with -O1 -fexpensive-optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67781 --- Comment #10 from Mikael Pettersson --- (In reply to Thomas Preud'homme from comment #9) > I have a patch that seems to be working. Running regression testing and > bootstrap now. Is there any progress on that patch?
[Bug tree-optimization/67781] [5/6 Regression] wrong code generated on big-endian with -O1 -fexpensive-optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67781 --- Comment #9 from Thomas Preud'homme --- I have a patch that seems to be working. Running regression testing and bootstrap now.
[Bug tree-optimization/67781] [5/6 Regression] wrong code generated on big-endian with -O1 -fexpensive-optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67781 --- Comment #8 from Thomas Preud'homme --- Looking more into find_bswap_or_nop, it became clear that the rsize loop is fine for both endianness because it operates on the result of the expression being analyzed and that result lies in register. it's the cmpxchg and cmpnop bit manipulation below that need changing instead. the rsize loop is fine for both endianness because it operates on the result of the expression being analyzed and the result is in register. So it became clear that the cmpxchg and cmpnop bit manipulation were to blame instead. They are tasked of computing what to compare the result of the symbolic execution against. They fail to account that in the case of a memory source, some bytes at either end of the memory range can be lost and there would be no 0 in the symbolic number result. They currently only deal with the most significant bytes being lost. However, note how this mistake is not endian specific: it is true for both end of a symbolic number. The reason for the asymmetry between little endian and big endian is that the current code will always do the load from the lowest address of all sources in the expression being analyzed. This work fine for little endian since the most significant bytes are at the highest address so doing a smaller load is sufficient, and the real range computation in the rsize loop takes care of this. It fails for big endian targets though, where the most significant byte are at the lowest address and the current code does not adjust the address of the load accordingly. The right fix would thus be to change the cmpxchg and cmpnop computation to remove the least significant bytes lost in the expression (by comparing rsize against the range returned by find_bswap_or_nop_1). And this shows a possible improvement: add code to change the address of the load to allow both end of the symbolic number to be truncated. So I'll spin a patch tomorrow and hopefully propose a valid fix for GCC 5 and 6 by the following day.
[Bug tree-optimization/67781] [5/6 Regression] wrong code generated on big-endian with -O1 -fexpensive-optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67781 --- Comment #7 from Thomas Preud'homme --- The exact place where it goes wrong is in find_bswap_or_nop (not the _1 helper function) in the following line: for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER, rsize++); This code removes the most significant zero on little endian systems but the least significant one on big endian ones, changing the value in n->n. It should do some left shift but be careful of undefined behavior. Actually, this makes me realize that we only remove the most significant zeros but the least significant ones would be fine too if the base address is adjusted.
[Bug tree-optimization/67781] [5/6 Regression] wrong code generated on big-endian with -O1 -fexpensive-optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67781 Thomas Preud'homme changed: What|Removed |Added Status|NEW |ASSIGNED CC||thopre01 at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |thopre01 at gcc dot gnu.org
[Bug tree-optimization/67781] [5/6 Regression] wrong code generated on big-endian with -O1 -fexpensive-optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67781 --- Comment #6 from Thomas Preud'homme --- So after performing the symbolic computation, the result is: 0x04030201 with 01 being the byte represented by b. This is correct so far and n->range shows 5 since 5 bytes have been touched. However, the base address points to a because that's the lowest value but the lowest byte in the symbolic number is the second byte in a. That's this inconsistency that's causing the bug. I'll come up with a patch tomorrow and do the testing.
[Bug tree-optimization/67781] [5/6 Regression] wrong code generated on big-endian with -O1 -fexpensive-optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67781 Eric Botcazou changed: What|Removed |Added Target|mips-linux; powerpc-linux |m68k-*-* mips-*-* ||powerpc-*-* sparc-*-* CC||ebotcazou at gcc dot gnu.org Summary|[5/6 Regression] Wrong code |[5/6 Regression] wrong code |generated on mips32 with|generated on big-endian |-O1 |with -O1 |-fexpensive-optimizations |-fexpensive-optimizations Severity|critical|major