http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58295
            Bug ID: 58295
           Summary: The combination pass doesn't eliminates some extra zero extensions
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: uranus at tinlans dot org

$ cat test.c
extern char zeb_test_array[10];

unsigned char ee_isdigit2(unsigned int i)
{
  unsigned char c = zeb_test_array[i];
  unsigned char retval;

  retval = ((c>='0') & (c<='9')) ? 1 : 0;
  return retval;
}

$ arm-eabi-gcc -v
Using built-in specs.
COLLECT_GCC=arm-eabi-gcc
COLLECT_LTO_WRAPPER=/home1/lhtseng/arm/4.9/libexec/gcc/arm-eabi/4.9.0/lto-wrapper
Target: arm-eabi
Configured with: ../../../../work/4.9/src/gcc-4.9.0/configure --target=arm-eabi --prefix=/home1/lhtseng/arm/4.9 --disable-nls --disable-shared --enable-languages=c --enable-__cxa_atexit --enable-c99 --enable-long-long --enable-threads=single --with-newlib --disable-multilib --disable-libssp --disable-libgomp --disable-decimal-float --disable-libffi --disable-libmudflap --disable-lto --with-gmp=/home1/lhtseng/work/general --with-mpfr=/home1/lhtseng/work/general --with-mpc=/home1/lhtseng/work/general --with-isl=/home1/lhtseng/work/general --with-cloog=/home1/lhtseng/work/general
Thread model: single
gcc version 4.9.0 20130802 (experimental) (GCC)

$ arm-eabi-gcc -O3 -S test.c
$ cat test.s
...
ee_isdigit2:
        @ Function supports interworking.
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        ldr     r3, .L2
        ldrb    r0, [r3, r0]    @ zero_extendqisi2
        sub     r0, r0, #48
        and     r0, r0, #255
        cmp     r0, #9
        movhi   r0, #0
        movls   r0, #1
        bx      lr
...

The instruction 'and r0, r0, #255' is a redundant instruction which cannot be
eliminated by the RTL instruction combination pass.
The combination pass was able to handle this case before this commit:
http://gcc.gnu.org/viewcvs/gcc/trunk/gcc/simplify-rtx.c?r1=191909&r2=191928&pathrev=192303

The code was then reorganized into lines 643-656 by this commit:
http://gcc.gnu.org/viewcvs/gcc/trunk/gcc/simplify-rtx.c?r1=192006&r2=192186&pathrev=192303

For example, GCC 4.6.3 handles it perfectly. In GCC 4.9.0, reverting the two
commits, or simply commenting out the lines mentioned above, makes the
combination pass handle this case again:

$ arm-eabi-gcc-modified -O3 -da -S test.c
$ cat test.c.166r.expand
...
(insn 9 8 10 2 (set (reg:SI 120)
        (plus:SI (subreg:SI (reg:QI 118) 0)
            (const_int -48 [0xffffffffffffffd0]))) test.c:6 -1
     (nil))
(insn 10 9 11 2 (set (reg:SI 121)
        (and:SI (reg:SI 120)
            (const_int 255 [0xff]))) test.c:6 -1
     (nil))
(insn 11 10 12 2 (set (reg:CC 100 cc)
        (compare:CC (reg:SI 121)
            (const_int 9 [0x9]))) test.c:6 -1
     (nil))
(insn 12 11 13 2 (set (reg:SI 122)
        (leu:SI (reg:CC 100 cc)
            (const_int 0 [0]))) test.c:6 -1
     (nil))
...

$ cat test.c.197r.combine
...
Trying 9, 10 -> 11:
Failed to match this instruction:
(set (reg:CC 100 cc)
    (compare:CC (plus:SI (reg:SI 119)
            (const_int -48 [0xffffffffffffffd0]))
        (const_int 9 [0x9])))
Successfully matched this instruction:
(set (reg:SI 121)
    (plus:SI (reg:SI 119)
        (const_int -48 [0xffffffffffffffd0])))
Successfully matched this instruction:
(set (reg:CC 100 cc)
    (compare:CC (reg:SI 121)
        (const_int 9 [0x9])))
deferring deletion of insn with uid = 9.
modifying insn i2    10: r121:SI=r119:SI-0x30
      REG_DEAD r119:SI
deferring rescan insn with uid = 10.
modifying insn i3    11: cc:CC=cmp(r121:SI,0x9)
      REG_DEAD r121:SI
deferring rescan insn with uid = 11.
...

Insn 10 is generated by (define_expand "zero_extendqisi2" ...) in ARM's
machine description. Before the commits mentioned above, the combination pass
successfully combined it with insn 9. After those commits, however, the
combination pass never attempts the combination '9, 10 -> 11'.
Judging from the commit messages for 'simplify-rtx.c', the commit r191928 was
intended to improve x86 code generation, but it leads to suboptimal code
generation for the ARM target.
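
The claim that 'and r0, r0, #255' is redundant can be checked exhaustively,
since the loaded value is a zero-extended byte. The following is only a sketch
under assumptions: it models the 32-bit ARM register with uint32_t, and the
function names (isdigit_source, isdigit_masked, isdigit_unmasked,
count_mismatches, check_mask_is_redundant) are hypothetical helpers that do not
appear in GCC or in the test case.

```c
#include <assert.h>
#include <stdint.h>

/* The predicate exactly as written in test.c. */
int isdigit_source(uint8_t c)
{
    return ((c >= '0') & (c <= '9')) ? 1 : 0;
}

/* Models the emitted sequence, including the 'and'. */
int isdigit_masked(uint8_t c)
{
    uint32_t r0 = (uint32_t)c - 48u; /* sub r0, r0, #48 (wraps modulo 2^32) */
    r0 &= 255u;                      /* and r0, r0, #255 */
    return r0 <= 9u;                 /* cmp r0, #9 ; movls/movhi (unsigned) */
}

/* Models the same sequence with the 'and' removed. */
int isdigit_unmasked(uint8_t c)
{
    uint32_t r0 = (uint32_t)c - 48u; /* sub r0, r0, #48 (wraps modulo 2^32) */
    return r0 <= 9u;                 /* cmp r0, #9 ; movls/movhi (unsigned) */
}

/* Counts the byte values for which either model disagrees with the source. */
int count_mismatches(void)
{
    int mismatches = 0;
    for (unsigned v = 0; v <= 255; ++v) {
        uint8_t c = (uint8_t)v;
        int expected = isdigit_source(c);
        if (isdigit_masked(c) != expected || isdigit_unmasked(c) != expected)
            ++mismatches;
    }
    return mismatches;
}

/* Aborts if the mask changes the result for any possible byte. */
void check_mask_is_redundant(void)
{
    assert(count_mismatches() == 0);
}
```

Calling check_mask_is_redundant() from a small main() should return without
aborting. For bytes below 48 the subtraction wraps to a value far above 9 with
or without the mask, and for bytes of 48 and above the difference already fits
in 8 bits, so the mask never changes the comparison.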