http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55653



             Bug #: 55653
           Summary: Unnecessary initialization of vector register
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: middle-end
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: josh.m.con...@gmail.com





When initializing all lanes of a vector register, I notice that the register is
first initialized to zero and then all lanes of the vector are independently
initialized, resulting in extra code.



Specifically, I'm looking at the aarch64 target, with the following source:



void
fmla_loop (double * restrict result, double * restrict mul1,
           double mul2, int size)
{
  int i;

  for (i = 0; i < size; i++)
    result[i] = result[i] + mul1[i] * mul2;
}



Compiled with:



aarch64-linux-gnu-gcc -std=c99 -O3 -ftree-vectorize -S -o test.s test.c



The resultant code to initialize a vector register with two instances of mul2
is:



  adr     x3, .LC0
  ld1     {v3.2d}, [x3]
  ins     v3.d[0], v0.d[0]
  ins     v3.d[1], v0.d[0]
...
.LC0:
  .word   0
  .word   0
  .word   0
  .word   0



The first two instructions (the adr/ld1 pair that loads zeros into the vector
register) are unnecessary, as is the space for .LC0, because both lanes are
overwritten by the ins instructions that follow.
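
For reference, here is a rough hand-written equivalent of the loop, using
GCC's generic vector extension, to show where the broadcast of mul2 sits.
The names v2df, splat_v2df and fmla_loop_v2df are made up for illustration,
and it is only an assumption on my part that the { x, x } constructor is
expanded along the same path as the vectorizer's own broadcast; I have not
checked that it yields the identical instruction sequence.

```c
#include <string.h>

/* Two-lane vector of doubles (GCC generic vector extension).  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Hypothetical helper: every lane is set to x, so no lane of the
   result depends on any earlier contents of the register.  */
static v2df
splat_v2df (double x)
{
  v2df v = { x, x };
  return v;
}

/* Hypothetical hand-vectorized form of fmla_loop.  */
void
fmla_loop_v2df (double * restrict result, const double * restrict mul1,
                double mul2, int size)
{
  v2df vmul2 = splat_v2df (mul2);   /* loop-invariant broadcast */
  int i = 0;

  /* Main loop: two elements per iteration.  memcpy keeps the loads
     and stores valid for pointers that are not 16-byte aligned.  */
  for (; i + 2 <= size; i += 2)
    {
      v2df r, m;
      memcpy (&r, &result[i], sizeof r);
      memcpy (&m, &mul1[i], sizeof m);
      r = r + m * vmul2;
      memcpy (&result[i], &r, sizeof r);
    }

  /* Scalar tail for an odd element count.  */
  for (; i < size; i++)
    result[i] = result[i] + mul1[i] * mul2;
}
```

Since splat_v2df writes both lanes, I would expect the broadcast to need
neither a prior zeroing of the register nor a constant-pool entry; on
aarch64 a single lane duplicate (dup v3.2d, v0.d[0]) looks sufficient.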



Note that this initialization is being performed here in store_constructor:



        /* Inform later passes that the old value is dead.  */
        if (!cleared && !vector && REG_P (target))
          emit_move_insn (target, CONST0_RTX (GET_MODE (target)));



right after another check that decides whether the vector needs to be cleared
(and determines that it doesn't).



Instead of the emit_move_insn, that code used to be:



       emit_insn (gen_rtx_CLOBBER (VOIDmode, target));



but it was changed in r101169, with the comment:



  "The expr.c change elides an extra move that's creeped in since we
   changed clobbered values to get new registers in reload."



(see full checkin text here:
http://gcc.gnu.org/ml/gcc-patches/2005-06/msg01584.html)



It's not clear to me whether this can be changed back, or if later passes
should be recognizing this initialization as redundant, or whether we need a
new expand pattern to match vector fill (vector duplicate).  At any rate, the
code is certainly not ideal as it stands.



Thanks!
