https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109040

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ebotcazou at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
That said, I've tried to reproduce this using -O1 -fno-tree-forwprop with:
unsigned int a;
unsigned short b;

__attribute__((noipa))
unsigned int
foo (void)
{
  unsigned int c = a & 0x8084c;
  unsigned short d = c;
  return d + b;
}
but while the RTL is quite similar in that case, this one works.
The problem with the #c0 testcase is during combine.

We have before combine:
(insn 32 23 34 2 (set (reg:SI 167 [ m ])
        (mem/c:SI (reg/v/f:SI 147) [1 m+0 S4 A128])) "pr109040.c":9:30 180 {*movsi_internal}
     (expr_list:REG_DEAD (reg/v/f:SI 147)
        (nil)))
(insn 34 32 35 2 (set (reg:SI 168)
        (const_int 526412 [0x8084c])) "pr109040.c":9:30 176 {*mvconst_internal}
     (nil))
(insn 35 34 47 2 (set (reg:SI 166)
        (and:SI (reg:SI 167 [ m ])
            (reg:SI 168))) "pr109040.c":9:30 95 {andsi3}
     (expr_list:REG_DEAD (reg:SI 168)
        (expr_list:REG_DEAD (reg:SI 167 [ m ])
            (expr_list:REG_EQUAL (and:SI (reg:SI 167 [ m ])
                    (const_int 526412 [0x8084c]))
                (nil)))))
(insn 47 35 39 2 (set (reg:HI 175)
        (subreg:HI (reg:SI 166) 0)) "pr109040.c":9:11 181 {*movhi_internal}
     (expr_list:REG_DEAD (reg:SI 166)
        (nil)))
(insn 39 47 40 2 (set (reg:SI 171)
        (zero_extend:SI (reg:HI 175))) "pr109040.c":9:11 111 {*zero_extendhisi2}
     (expr_list:REG_DEAD (reg:HI 175)
        (nil)))
(insn 40 39 43 2 (set (reg:SI 172)
        (leu:SI (reg:SI 171)
            (const_int 5 [0x5]))) "pr109040.c":9:11 291 {*sleu_sisi}
     (expr_list:REG_DEAD (reg:SI 171)
        (nil)))
Now, the zero extension from HImode to SImode of m & 0x8084c would be best
combined as m & 0x84c, but 0x84c doesn't fit into the signed 12-bit immediate
of the ANDI instruction.
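To make the arithmetic concrete, here is a minimal C sketch. The helper names
fits_simm12, low_half_mask and check_mask_narrowing are hypothetical and are
not GCC functions; the sketch only assumes that the ANDI immediate is a signed
12-bit value, i.e. in the range -2048..2047.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if VALUE fits a signed 12-bit immediate,
   i.e. lies in the range -2048..2047.  */
static int
fits_simm12 (int32_t value)
{
  return value >= -2048 && value <= 2047;
}

/* Hypothetical helper: the part of MASK that survives a zero extension
   from 16 to 32 bits.  */
static uint32_t
low_half_mask (uint32_t mask)
{
  return mask & 0xffffu;
}

/* Checks the two facts used in the text.  */
static void
check_mask_narrowing (void)
{
  /* Zero-extending (m & 0x8084c) from 16 bits is the same as m & 0x84c.  */
  assert (low_half_mask (0x8084cu) == 0x84cu);
  /* 0x84c is 2124, which is above 2047, so it is not a valid immediate.  */
  assert (!fits_simm12 (0x84c));
}
```

Calling check_mask_narrowing () from a test driver passes silently, which is
why the narrowed constant still has to be loaded into a register first.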
On the shorter foo testcase above, the major difference before combine is that
the zero_extend is already combined with the subreg, so there is
(insn 10 9 11 2 (set (reg:SI 148 [ c ])
        (zero_extend:SI (subreg:HI (reg:SI 144 [ c ]) 0))) "pr109040-2.c":10:12 111 {*zero_extendhisi2}
     (expr_list:REG_DEAD (reg:SI 144 [ c ])
        (nil)))
in there.  On the short testcase, the first successful combine is trying to
combine the *mvconst_internal, the and, and the zero_extend:
Failed to match this instruction:
(set (reg:SI 148 [ c ])
    (and:SI (reg:SI 145 [ a ])
        (const_int 2124 [0x84c])))
Successfully matched this instruction:
(set (reg:SI 144 [ c ])
    (const_int 2124 [0x84c]))
Successfully matched this instruction:
(set (reg:SI 148 [ c ])
    (and:SI (reg:SI 145 [ a ])
        (reg:SI 144 [ c ])))
and everything is fine.
On the #c0 testcase, the first successful combine among the insns above is
trying to combine the and (insn 35) and insn 47 (the subreg move) into:
Successfully matched this instruction:
(set (subreg:SI (reg:HI 175) 0)
    (and:SI (reg:SI 167 [ m ])
        (reg:SI 168)))
Now, I'm not really sure that's valid, given that riscv is a
WORD_REGISTER_OPERATIONS 1 target.
But maybe even insn 47 before combine is wrong for such a target.
As pseudo 168 is 0x8084c, the upper half of the and result can contain one
randomish bit (bit 19).
And later this new insn is combined with the leu into
Successfully matched this instruction:
(set (reg:SI 172)
    (leu:SI (subreg:SI (reg:HI 175) 0)
        (const_int 5 [0x5])))
which is definitely wrong, because the zero extension disappeared.
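The effect of the lost zero extension can be modelled in plain C. This is only
a sketch of the semantics, not the #c0 testcase itself; the function names
leu5_with_zext, leu5_without_zext and check_divergence are hypothetical, and
the comparison against 5 is taken from the leu in insn 40.

```c
#include <assert.h>
#include <stdint.h>

/* Intended semantics: mask, truncate to 16 bits, zero-extend, compare.  */
static int
leu5_with_zext (uint32_t m)
{
  uint32_t c = m & 0x8084cu;
  uint16_t d = (uint16_t) c;	/* drops bit 19 */
  return (uint32_t) d <= 5u;
}

/* Model of the miscombined insn: the full 32-bit and result is compared,
   so bit 19 can leak into the comparison.  */
static int
leu5_without_zext (uint32_t m)
{
  uint32_t c = m & 0x8084cu;
  return c <= 5u;
}

/* The two versions agree when bit 19 of M is clear and can differ
   when it is set.  */
static void
check_divergence (void)
{
  assert (leu5_with_zext (0x4u) == leu5_without_zext (0x4u));
  assert (leu5_with_zext (0x80004u) != leu5_without_zext (0x80004u));
}
```

For m == 0x80004 the truncated value is 4, so the intended comparison is true,
while the comparison on the full and result 0x80004 is false.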
