http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58295

--- Comment #4 from Ling-hua Tseng <uranus at tinlans dot org> ---
(In reply to Jakub Jelinek from comment #3)
> So perhaps you should just look at combiner dump and see what insns it tried
> and failed to match and see if you couldn't add some of them into the
> affected backends.

It's exactly what I did. Unfortunately, the combiner doesn't give any other
chance to eliminate that redundant zero extension. The cases tried by the
combiner are:
1. (set (reg:SI) (zero_extend:SI (plus:QI (mem:QI) (const_int))))
2. (set (reg:QI) (plus:QI (mem:QI) (const_int)))
3. (set (reg:QI) (plus:QI (subreg:QI) (const_int)))
4. (set (reg:CC) (compare:CC (subreg:QI) (const_int)))
5. (set (reg:CC) (compare:CC (plus:QI (mem:QI) (const_int))))
6. (set (reg:SI) (leu:SI (subreg:QI) (const_int)))
7. (set (reg:SI) (leu:SI (subreg:QI) (const_int)))
8. (set (reg:SI) (leu:SI (plus:QI ...)))

As you know, 1 and 2 cannot be matched on most RISC targets, and making any of
the others recognizable means lying to GCC that your target supports QImode
arithmetic/comparison. Telling GCC such a lie results in code generation bugs.
For example, if you provide a QImode comparison in the machine description,
gcc/testsuite/gcc.c-torture/execute/980617-1.c fails when you run the
testsuite. Here is the source code of that test case:
void foo (unsigned int * p)
{
  if ((signed char)(*p & 0xFF) == 17 || (signed char)(*p & 0xFF) == 18)
    return;
  else
    abort ();
}

int main ()
{
  int i = 0x30011;
  foo(&i);
  exit (0);
}

The upper 16 bits of i contain 0x0003, and the lower 16 bits contain 0x0011.
If you compile this code with -O3, you will find that GCC simplifies the
expression '(signed char)(*p & 0xFF) == 17 || (signed char)(*p & 0xFF) == 18'
to an SImode subtraction and a QImode comparison. The result is incorrect
because the target only supports SImode comparisons, i.e., you actually
generate an SImode hardware instruction for the pattern of a QImode
comparison while the upper 16 bits are still dirty. Hence 3 through 8 are not
patterns we can match in the RTL combination pass.
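
To make the failure concrete, here is a rough sketch in plain C of what goes
wrong. It assumes the folded form is the unsigned range check
(x - 17) <= 1, which is consistent with the leu patterns listed above but is
not taken from an actual dump. The function names are made up for
illustration and do not correspond to anything in GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Reference semantics of the original source expression. */
int reference_check(uint32_t x)
{
  return (signed char)(x & 0xFFu) == 17 || (signed char)(x & 0xFFu) == 18;
}

/* Assumed folded form with true QImode compare semantics:
   only the low 8 bits of the SImode difference are significant. */
int folded_qi_compare(uint32_t x)
{
  return (uint8_t)(x - 17u) <= 1u;
}

/* The same folded form when the QImode compare pattern is emitted as an
   SImode hardware compare: the dirty upper bits take part in the compare. */
int folded_si_compare_dirty(uint32_t x)
{
  return (x - 17u) <= 1u;
}

/* Hypothetical self-check using the value from 980617-1.c. */
void self_check(void)
{
  uint32_t i = 0x30011u;               /* 0x30011 - 17 = 0x30000 */
  assert(reference_check(i) == 1);
  assert(folded_qi_compare(i) == 1);       /* low byte is 0, 0 <= 1 */
  assert(folded_si_compare_dirty(i) == 0); /* 0x30000 <= 1 is false */
}
```

With i = 0x30011 the SImode difference is 0x30000. A real QImode compare
only looks at the low byte (0), so the check passes; the SImode compare sees
0x30000 and takes the abort () path.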

Therefore, we can conclude that the original case tried by the combiner is the
best way to merge/reduce the redundant zero extension insn.
