On 3/23/23 19:53, Feng Wang wrote:
This patch optimizes the combine processing for sext.b/h on rv64.
Please refer to the following test case,
int sextb32(int x)
{ return (x << 24) >> 24; }
The rtl expression is as follows,
(insn 6 3 7 2 (set (reg:SI 138)
(ashift:SI (subreg/s/u:SI (reg/v:DI 136 [ xD.2271 ]) 0)
(const_int 24 [0x18]))) "sextb.c":2:13 195 {ashlsi3}
(expr_list:REG_DEAD (reg/v:DI 136 [ xD.2271 ])
(nil)))
(insn 7 6 8 2 (set (reg:SI 137)
(ashiftrt:SI (reg:SI 138)
(const_int 24 [0x18]))) "sextb.c":2:20 196 {ashrsi3}
(expr_list:REG_DEAD (reg:SI 138)
(nil)))
During the combine phase, they are combined into
(set (reg:SI 137)
(ashiftrt:SI (subreg:SI (ashift:DI (reg:DI 140)
(const_int 24 [0x18])) 0)
(const_int 24 [0x18])))
The optimal combine result is
(set (reg:SI 137)
(sign_extend:SI (subreg:QI (reg:DI 140) 0)))
This can be converted to the sext.b instruction.
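
As a quick sanity check of the claim that the shift pair is a sign
extension of the low byte, the C sketch below writes both forms side by
side. The helper names are made up for illustration, and it assumes
GCC's behaviour for the implementation-defined parts (arithmetic right
shift of negative values, wrapping unsigned-to-signed conversion).

```c
#include <stdint.h>

/* Shift form from the test case, written with an unsigned left shift
   so that no signed overflow occurs.  */
static int32_t sext_by_shifts (int32_t x)
{
  return (int32_t) ((uint32_t) x << 24) >> 24;
}

/* What sext.b computes: sign-extend the low 8 bits.  */
static int32_t sext_by_cast (int32_t x)
{
  return (int32_t) (int8_t) x;
}
```

Both helpers should agree for every input, which is what lets the shift
pair be replaced by a single sign_extend of the QImode lowpart.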
Because of the subreg, the current processing can't obtain the
immediate of the left shift. We need to peel off another layer of RTL
to obtain it.
gcc/ChangeLog:
* combine.cc (extract_left_shift): Add SUBREG case.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zbb-sext-rv64.c: New test.
SUBREGs have painful semantics and we should be very careful just
stripping them.
For example, you might have a subreg that extracts the *high* part. Or
you might have (subreg (mem)) or a paradoxical subreg, etc.
At the *least* this case would need verification that you're getting the
lowpart. However, I suspect there's other conditions that need to be
checked to make this valid.
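
To make the high-part concern concrete, here is a rough C model. It is
only a sketch: it treats a DImode register as a uint64_t and an SImode
subreg as a 32-bit slice at a byte offset on a little-endian target,
and all function names are hypothetical. It assumes GCC's arithmetic
right shift for negative values.

```c
#include <stdint.h>

/* Model (subreg:SI (reg:DI) byte_off) on a little-endian target.
   byte_off 0 is the lowpart, byte_off 4 is the highpart.  */
static uint32_t subreg_si (uint64_t reg, unsigned byte_off)
{
  return (uint32_t) (reg >> (8 * byte_off));
}

/* Model (ashiftrt:SI (ashift:SI inner 24) 24).  */
static int32_t shift_pair (uint32_t inner)
{
  return (int32_t) (inner << 24) >> 24;
}

/* What you would get if the subreg were stripped and the low byte of
   the DI register were sign-extended directly.  */
static int32_t sext_low_byte_of_reg (uint64_t reg)
{
  return (int32_t) (int8_t) reg;
}
```

With a register value of 0x000000800000007f, the lowpart case gives 127
either way, but the highpart case gives -128 from the shift pair and
127 from the stripped form. That is the kind of mismatch a blind strip
would introduce.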
But I would suggest we look elsewhere. It could be that combine is
reassociating the subreg in ways that are undesirable and which
ultimately makes our job harder. Additionally if we can fix this in a
generic simplification/folder routine, then multiple passes can benefit.
For example, in simplify_context::simplify_binary_operation we get a
form that is more amenable to optimization:
#0 simplify_context::simplify_binary_operation (this=0x7fffffffda68, code=ASHIFTRT, mode=E_SImode,
op0=0x7fffea11eb40, op1=0x7fffea009610) at /home/jlaw/riscv-persist/ventana/gcc/gcc/simplify-rtx.cc:2558
2558 gcc_assert (GET_RTX_CLASS (code) != RTX_COMPARE);
(gdb) p code
$24 = ASHIFTRT
(gdb) p mode
$25 = E_SImode
(gdb) p debug_rtx (op0)
(ashift:SI (subreg/s/u:SI (reg/v:DI 74 [ x ]) 0)
(const_int 24 [0x18]))
$26 = void
(gdb) p debug_rtx (op1)
(const_int 24 [0x18])
$27 = void
So that's (ashiftrt (ashift (object) 24) 24), i.e. sign extension.
In other words, we really don't have to think about the fact that the
underlying object is a SUBREG because the outer operations are very
clearly a sign extension regardless of the object they're operating on.
With that in mind I would suggest you look at adding a case to detect
zero/sign extension in simplify_context::simplify_binary_operation_1.
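
The check I have in mind might look roughly like the following. This is
a standalone C model, not GCC code: the function name and the plain
integer parameters are stand-ins for the mode precision and the two
shift counts, and a real implementation would work on rtx operands and
machine modes instead.

```c
/* Model of recognizing (ashiftrt:M (ashift:M X left) right) as a sign
   extension.  mode_bits is the precision of M, e.g. 32 for SImode.
   Returns the width in bits of the narrow value being extended, or 0
   if the shift pair is not a sign extension from an integer mode.  */
static int sign_extend_width (int mode_bits, int left, int right)
{
  /* Both shift counts must match and stay inside the mode.  */
  if (left != right || left <= 0 || left >= mode_bits)
    return 0;

  /* The bits that survive the left shift form the narrow value.  */
  int narrow = mode_bits - left;

  /* Only widths that correspond to QI/HI/SI are of interest here.  */
  return (narrow == 8 || narrow == 16 || narrow == 32) ? narrow : 0;
}
```

For the test case above, mode_bits is 32 and both counts are 24, so the
narrow width is 8, i.e. a QImode sign extension. The zero-extension
variant would be the same check with a logical right shift as the outer
operation.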
Thanks,
Jeff