On 9/15/25 9:40 AM, Kishan Parmar wrote:
Hi All,
Changes from v1:
* Corrected commit message.
For a given rtx expression (and (lshiftrt (subreg X) shift) mask)
combine pass tries to simplify the RTL form to
(and (subreg (lshiftrt X shift)) mask)
where the SUBREG wraps the result of the shift. This leaves the AND
and the shift in different modes, which complicates recognition.
(and (lshiftrt (subreg X) shift) mask)
where the SUBREG is inside the shift and both operations share the same
mode. This form is easier to recognize across targets and enables
cleaner pattern matching.
This patch checks in simplify-rtx to perform this transformation when
safe: the SUBREG must be a lowpart, the shift amount must be valid, and
the precision of the operation must be preserved.
Tested on powerpc64le-linux-gnu, powerpc64-linux-gnu, and
x86_64-pc-linux-gnu with no regressions. On rs6000, the change reduces
insn counts due to improved matching.
2025-09-15 Kishan Parmar <[email protected]>
gcc/ChangeLog:
PR rtl-optimization/93738
* simplify-rtx.cc (simplify_context::simplify_binary_operation_1):
Canonicalize SUBREG(LSHIFTRT) into LSHIFTRT(SUBREG) when valid.
This doesn't look correct to me. I just don't see how we can safely
push the subreg to the inner ops like you're trying to do in a generally
safe manner.
Assume we have a narrowing lowpart subreg. Say QImode.
And let's take the original RTL:
(and (subreg (lshift X shift)) mask)
In this case the subreg select the low 8 bits from the output of the
lshift. Let's assume X is HImode and our shift count is 4. Given an
input value like
aaaa bbbb cccc dddd
We shift off the low 4 bits resulting in
0000 aaaa bbbb cccc
Then we apply the narrowing QI subreg resulting in
bbbb cccc
Let's do the same with your updated RTL
(and (lshift (subreg X) shift) mask)
aaaa bbbb cccc dddd
We apply the narrowing subreg resulting in
XXXX XXXX cccc dddd
Then shift 4 bits resulting in
0000 XXXX XXXX cccc
Note how we've lots the bits "bbbb" and replaced them with indeterminate
values.
This can only be safe if the mask was going to wipe those 4 bit positions
Am I missing something here?
jeff