https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86209

            Bug ID: 86209
           Summary: Peephole does not happen because the type of zero/sign
                    extended operands is not the same.
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: sameerad at gcc dot gnu.org
  Target Milestone: ---

While implementing a peephole2 that combines loads/stores of narrower types
into a single load/store of a wider type, I found the following aarch64
testcase for which the peephole does not fire, because the zero/sign-extended
operands do not have the same type (mode).

Test program:
unsigned short
subus (unsigned short *array)
{
  return array[0] + array[1];
}
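
For reference, the implicit C conversions in this function can be written out
by hand. The following is only a sketch, assuming int is wider than unsigned
short (as on aarch64); subus_explicit is a hypothetical name and not part of
the original testcase. It shows why the addition is done in a 32-bit mode even
though the loads and the return value are 16 bits wide.

```c
/* Original testcase from the report.  */
unsigned short
subus (unsigned short *array)
{
  return array[0] + array[1];
}

/* Hypothetical equivalent with the implicit conversions spelled out.
   Assumes int is wider than unsigned short (true on aarch64).  */
unsigned short
subus_explicit (unsigned short *array)
{
  int a = (int) array[0];        /* integer promotion, value is zero-extended */
  int b = (int) array[1];        /* same for the second element */
  int sum = a + b;               /* the addition is performed in int */
  return (unsigned short) sum;   /* truncation back to 16 bits */
}
```

Both functions return the same value for any input; only the low 16 bits of
the sum are observable.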

Expander generated RTL:
(insn 6 3 7 2 (set (reg:HI 96)
        (mem:HI (reg/v/f:DI 94 [ array ]) [1 *array_4(D)+0 S2 A16]))
     (nil))
(insn 7 6 8 2 (set (reg:HI 97)
        (mem:HI (plus:DI (reg/v/f:DI 94 [ array ])
                (const_int 2 [0x2])) [1 MEM[(short unsigned int *)array_4(D) + 2B]+0 S2 A16]))
     (nil))
(insn 8 7 9 2 (set (reg:SI 99)
        (subreg:SI (reg:HI 97) 0))
     (nil))
(insn 9 8 10 2 (set (reg:SI 98)
        (plus:SI (subreg:SI (reg:HI 96) 0)
            (reg:SI 99)))
     (expr_list:REG_EQUAL (plus:SI (subreg:SI (reg:HI 96) 0)
            (subreg:SI (reg:HI 97) 0))
        (nil)))

The combiner merges insns 7 and 8 into a single load that zero-extends to
SImode:

(insn 8 7 9 2 (set (reg:SI 99 [ MEM[(short unsigned int *)array_4(D) + 2B] ])
        (zero_extend:SI (mem:HI (plus:DI (reg/v/f:DI 94 [ array ])
                    (const_int 2 [0x2])) [1 MEM[(short unsigned int *)array_4(D) + 2B]+0 S2 A16]))) {*zero_extendhisi2_aarch64}
     (expr_list:REG_DEAD (reg/v/f:DI 94 [ array ])
        (nil)))

The reload pass then removes the SUBREGs, which carried the information about
the desired type. As a result, the HImode load in insn 6 is zero-extended to
DImode, while insn 8 still zero-extends to SImode.

(insn 8 7 6 2 (set (reg:SI 1 x1 [orig:99 MEM[(short unsigned int *)array_4(D) + 2B] ] [99])
        (zero_extend:SI (mem:HI (plus:DI (reg/v/f:DI 0 x0 [orig:94 array ] [94])
                    (const_int 2 [0x2])) [1 MEM[(short unsigned int *)array_4(D) + 2B]+0 S2 A16]))) {*zero_extendhisi2_aarch64}
     (nil))
(insn 6 8 9 2 (set (reg:DI 0 x0)
        (zero_extend:DI (mem:HI (reg/v/f:DI 0 x0 [orig:94 array ] [94]) [1 *array_4(D)+0 S2 A16]))) {*zero_extendhidi2_aarch64}
     (nil))
(insn 9 6 14 2 (set (reg:SI 0 x0 [98])
        (plus:SI (reg:SI 0 x0 [orig:96 *array_4(D) ] [96])
            (reg:SI 1 x1 [orig:99 MEM[(short unsigned int *)array_4(D) + 2B] ] [99]))) {*addsi3_aarch64}
     (nil))
(insn 14 9 15 2 (set (reg/i:HI 0 x0)
        (reg:HI 0 x0 [98])) {*movhi_aarch64}
     (nil))
(insn 15 14 17 2 (use (reg/i:HI 0 x0)) 
     (nil))
(note 17 15 18 NOTE_INSN_DELETED)
(note 18 17 0 NOTE_INSN_DELETED)

Since the two memory accesses are now extended to different modes (SImode and
DImode), the peephole cannot combine them.

As a result, even after sched_fusion has brought the loads/stores next to each
other, they cannot be merged.
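
The merge the peephole is aiming for can be approximated at the source level.
This is a hypothetical sketch, not GCC code: subus_merged is an invented name,
and the memcpy stands in for a single 32-bit load replacing the two 16-bit
loads. It assumes unsigned short is 16 bits wide. Which half of the word holds
array[0] depends on endianness, but the sum of the two halves does not.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical source-level picture of the merged access: one 32-bit
   load covering array[0] and array[1], after which the halves are split.
   Assumes unsigned short is 16 bits wide.  */
unsigned short
subus_merged (const unsigned short *array)
{
  uint32_t word;
  memcpy (&word, array, sizeof word);   /* single 4-byte load */
  uint32_t lo = word & 0xffffu;         /* one element, zero-extended */
  uint32_t hi = word >> 16;             /* the other element, zero-extended */
  /* Which half is array[0] depends on endianness; the sum does not.  */
  return (unsigned short) (lo + hi);
}
```

Here both halves are extended to the same 32-bit type, which is the situation
the peephole needs and which the post-reload RTL above does not provide.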
