Machine dependent Tree optimization?

Bingfeng Mei Tue, 16 Oct 2007 06:11:20 -0700

Hello,
I am porting GCC 4.2.1 to our VLIW processor. Our No. 1
priority is code size. I noticed the following code generation:


Source code:

  if (a == 0x1ff )
    c = a + b;
  return c;
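
For reference, a complete function consistent with this fragment and with the dump that follows could look like the sketch below. The int parameter and return types are an assumption; the post does not state them.

```c
/* Sketch of the full function; the int types are assumed,
   since neither the fragment nor the dump shows them. */
int
foo (int a, int b, int c)
{
  if (a == 0x1ff)
    c = a + b;   /* copy propagation rewrites this as c = b + 511 */
  return c;
}
```

On the taken branch a is known to equal 0x1ff, which is why the compiler may substitute the constant for the register operand.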


After tree copy propagation:


foo (a, b, c)
{
<bb 2>:
  if (a_2 == 511) goto <L0>; else goto <L1>;

<L0>:;
  c_5 = b_4 + 511;

  # c_1 = PHI <c_3(2), c_5(3)>;
<L1>:;
  return c_1;

}

It will generate the following assembly code for our processor:

        tstieqw p0, r0, #0x1ff              // Compare r0 with 0x1ff and write result to a predicate
        p0. addwi r2, r1, #0x1ff            // Predicated add
        sbl [link]      :       movw r8, r2


On our processor, p0. addwi r2, r1, #0x1ff is a long (64-bit)
instruction.

Ideally, I don't want this copy propagation if the immediate is outside
a certain range. Without it, the compiler would generate the following
code:

        tstieqw p0, r0, #0x1ff              // Compare r0 with 0x1ff and write result to a predicate
        p0. addw r2, r1, r0                 // Predicated add (32-bit instruction)
        sbl [link]      :       movw r8, r2

This would save us four bytes.

Of course, for processors without long/short instruction encodings, this
copy propagation is beneficial for performance because it removes an
unnecessary dependency. Therefore, whether to apply this copy
propagation is machine dependent to some degree.

What I do now is add some checks to tree-ssa-copy.c and tree-ssa-dom.c
for our target, but this is not very clean. My question is whether there
is a better way to implement such machine-dependent tree-level
optimizations (like the hooks at the RTL level). I believe other
processors have a similar problem. What is the common solution?
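
To make the question concrete, a hook of this kind might look roughly like the sketch below. Everything in it is hypothetical: none of these names are existing GCC hooks or functions, and the signed 8-bit short-immediate range is only an assumed example, not our processor's actual encoding limit.

```c
#include <stdbool.h>

/* Hypothetical hook type: returns true if the target is happy to have
   a register operand replaced by the constant VALUE. */
typedef bool (*const_prop_ok_hook) (long value);

/* Default behaviour for targets without long/short encodings:
   always allow the propagation. */
static bool
default_const_prop_ok (long value)
{
  (void) value;
  return true;
}

/* Assumed short-immediate range, for illustration only. */
#define SHORT_IMM_MIN (-128L)
#define SHORT_IMM_MAX 127L

/* Code-size-sensitive target: allow the propagation only when the
   constant fits in the (assumed) short-immediate field. */
static bool
vliw_const_prop_ok (long value)
{
  return value >= SHORT_IMM_MIN && value <= SHORT_IMM_MAX;
}

/* Stand-in for the point in a tree-level pass where the decision is
   made; the pass asks the hook instead of hard-coding target checks. */
static bool
pass_would_propagate (const_prop_ok_hook hook, long value)
{
  return hook (value);
}
```

With the default hook, the constant 0x1ff from the example above would be propagated; with the size-sensitive hook it would not, so the add would keep its register operand and the short encoding.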


Thanks,
Bingfeng Mei

Broadcom UK
