https://gcc.gnu.org/g:9167c9eeea1ca55408a2f500d10ce521a296af0e
commit r16-6765-g9167c9eeea1ca55408a2f500d10ce521a296af0e Author: Jakub Jelinek <[email protected]> Date: Wed Jan 14 13:21:57 2026 +0100 simplify-rtx: Fix up SUBREG and LSHIFTRT order canonicalization for AND with constant [PR123544] On Tue, Nov 04, 2025 at 12:59:03PM +0530, Kishan Parmar wrote: > PR rtl-optimization/93738 > * simplify-rtx.cc (simplify_binary_operation_1): Canonicalize > SUBREG(LSHIFTRT) into LSHIFTRT(SUBREG) when valid. This change regressed the following testcase on aarch64-linux. From what I can see, the PR93738 change has been written with non-paradoxical SUBREGs in mind but on this testcase on aarch64 we have a paradoxical SUBREG, in particular simplify_binary_operation_1 is called with AND, SImode, (subreg:SI (lshiftrt:HI (subreg:HI (reg/v:SI 108 [ x ]) 0) (const_int 8 [0x8])) 0) and op1 (const_int 32767 [0x7fff]) and simplifies that since the PR93738 optimization was added into (and:SI (lshiftrt:SI (reg/v:SI 108 [ x ]) (const_int 8 [0x8])) (const_int 32767 [0x7fff])) This looks wrong to me. Consider (reg/v:SI 108 [ x ]) 0) could have value 0x12345678U. The original expression takes lowpart 16-bits from that, i.e. 0x5678U, shifts that right logically by 8 bits, so 0x56U, makes a paradoxical SUBREG from that, i.e. 0x????0056U and masks that with 0x7fff, i.e. result is 0x56U. The new expression shifts 0x12345678U logically right by 8 bits, i.e. 0x123456U and masks it by 0x7fff, result 0x3456U. Thus, I think we need to limit to non-paradoxical SUBREGs. On the rlwimi-2.c testcase I see on powerpc64le-linux no differences in emitted assembly without/with the patch. 2026-01-14 Jakub Jelinek <[email protected]> PR rtl-optimization/123544 * simplify-rtx.cc (simplify_context::simplify_binary_operation_1) <case AND>: Don't canonicalize (subreg (lshiftrt (x cnt)) low) into (lshiftrt (subreg x low) cnt) if the SUBREG is paradoxical. * gcc.dg/pr123544.c: New test. Diff: --- gcc/simplify-rtx.cc | 4 +++- gcc/testsuite/gcc.dg/pr123544.c | 44 +++++++++++++++++++++++++++++++++++++++++ 2 files changed, 47 insertions(+), 1 deletion(-) diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc index d7d51550820d..089f773290dd 100644 --- a/gcc/simplify-rtx.cc +++ b/gcc/simplify-rtx.cc @@ -4193,7 +4193,9 @@ simplify_context::simplify_binary_operation_1 (rtx_code code, Keeps shift and AND in the same mode, improving recognition. Only applied when subreg is a lowpart, shift is valid, and no precision is lost. */ - if (SUBREG_P (op0) && subreg_lowpart_p (op0) + if (SUBREG_P (op0) + && subreg_lowpart_p (op0) + && !paradoxical_subreg_p (op0) && GET_CODE (XEXP (op0, 0)) == LSHIFTRT /* simplify_subreg asserts the object being accessed is not VOIDmode or BLKmode. We may have a REG_EQUAL note which diff --git a/gcc/testsuite/gcc.dg/pr123544.c b/gcc/testsuite/gcc.dg/pr123544.c new file mode 100644 index 000000000000..4ff5e952af12 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr123544.c @@ -0,0 +1,44 @@ +/* PR rtl-optimization/123544 */ +/* { dg-do run { target bitint } } */ +/* { dg-options "-O1 -w" } */ + +typedef unsigned char V __attribute__((vector_size (4))); +unsigned long long c; + +signed char +foo () +{ + return 0; +} + +unsigned int +bar (V x, short int y, long long z) +{ + _BitInt (15) a = 828; + a &= (unsigned) z < 5 ? 0 : (unsigned) x[1]; + if (y) + return a; + for (unsigned short d; d; ) + ; +} + +int +baz (unsigned char x) +{ + unsigned long long a[] = { }; + unsigned long long b = bar ((V) { 1, 1, 1 }, 1, -1); + if (x) + { + c += b; + return 0; + } + a[bar ((V) {}, 0, foo ())]; +} + +int +main () +{ + baz (1); + if (c) + __builtin_abort (); +}
