On 5/21/2026 9:40 PM, Andrew Pinski wrote:
On Thu, May 21, 2026 at 8:31 PM Jeffrey Law
<[email protected]> wrote:
On 5/18/2026 9:39 PM, [email protected] wrote:
Dear contributor,
Our automatic CI has detected problems related to your patch(es). Please find
some details below.
In aarch64 native, after:
| commit gcc-17-577-gbc19036af435
| Author: Jeff Law <[email protected]>
| Date: Mon May 18 15:17:27 2026 -0600
|
| [RISC-V] Improve ext-dce's live bit tracking for IOR/AND with a
constant argument
|
| Investigation of a regression with some RISC-V target changes exposed
a clear
| missed optimization in ext-dce.c
|
| ... 67 lines of the commit log omitted.
Produces 2 regressions:
|
| regressions.sum:
| Running gcc:gcc.target/aarch64/aarch64.exp ...
| FAIL: gcc.target/aarch64/tbz_1.c check-function-bodies g1
| FAIL: gcc.target/aarch64/tbz_1.c check-function-bodies g2
I've reproduced this locally. I haven't gotten into the debug cycle
yet, but wanted folks to know it's mine AFAICT to avoid duplicating
debugging efforts.
I think the fix is just a testcase fix.
Currently the testcase has:
** tbnz w[0-9]+, #?0, .L([0-9]+)
But that should just can be:
** tbnz [wx][0-9]+, #?0, .L([0-9]+)
to match both x0 and w0 there. But are valid in this case with bit 0.
For both g1 and g2.
Yea, so I wanted to verify this was behaving per expectation. It is.
We have this in g1:
(insn 2 4 3 2 (set (reg/v:SI 101 [ x ])
(zero_extend:SI (reg:QI 0 x0 [ x ]))) "j.c":16:1 149
{*zero_extendqisi2_aarch64}
(expr_list:REG_DEAD (reg:QI 0 x0 [ x ])
(nil)))
(note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
(insn 6 3 7 2 (set (reg:SI 102)
(and:SI (reg/v:SI 101 [ x ])
(const_int 1 [0x1]))) "j.c":17:6 discrim 1 551 {andsi3}
(expr_list:REG_DEAD (reg/v:SI 101 [ x ])
(nil)))
We can trivially see that the zero extension is pointless because the
value is masked with 0x1 immediately thereafter. So ext-dce replaces
the zero_extend with a subreg which then simplifies the sequence into:
(insn 2 4 3 2 (set (reg/v:SI 101 [ x ])
(reg:SI 0 x0 [ x ])) "j.c":16:1 104 {*movsi_aarch64}
(expr_list:REG_DEAD (reg:QI 0 x0 [ x ])
(nil)))
(note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
(insn 6 3 7 2 (set (reg:SI 102)
(and:SI (reg/v:SI 101 [ x ])
(const_int 1 [0x1]))) "j.c":17:6 discrim 1 551 {andsi3}
(expr_list:REG_DEAD (reg/v:SI 101 [ x ])
(nil)))
Combine then turns it into:
(insn 28 4 2 2 (set (reg:SI 103 [ x ])
(reg:SI 0 x0 [ x ])) "j.c":16:1 -1
(expr_list:REG_DEAD (reg:SI 0 x0 [ x ])
(nil)))
(note 2 28 3 2 NOTE_INSN_DELETED)
(note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
(note 6 3 7 2 NOTE_INSN_DELETED)
(note 7 6 8 2 NOTE_INSN_DELETED)
(jump_insn 8 7 9 2 (parallel [
(set (pc)
(if_then_else (eq (zero_extract:DI (subreg:DI (reg:SI
103 [ x ]) 0)
(const_int 1 [0x1])
(const_int 0 [0]))
(const_int 0 [0]))
(label_ref:DI 14)
(pc)))
(clobber (reg:CC 66 cc))
]) "j.c":17:6 discrim 1 76 {aarch64_tbzeqdidi}
(expr_list:REG_UNUSED (reg:CC 66 cc)
(expr_list:REG_DEAD (reg:SI 103 [ x ])
(int_list:REG_BR_PROB 966367641 (nil))))
Without the recent ext-dce improvement we get this out of combine:
(jump_insn 8 7 9 2 (parallel [
(set (pc)
(if_then_else (eq (zero_extract:SI (reg:SI 0 x0 [ x ])
(const_int 1 [0x1])
(const_int 0 [0]))
(const_int 0 [0]))
(label_ref:DI 14)
(pc)))
(clobber (reg:CC 66 cc))
]) "k.c":17:6 discrim 1 47 {aarch64_tbeqsisi}
(expr_list:REG_UNUSED (reg:CC 66 cc)
(expr_list:REG_DEAD (reg:QI 0 x0 [ x ])
(int_list:REG_BR_PROB 966367644 (nil))))
Note the difference in zero_extract's mode within the bit-test insn. DI
vs SI. Hence the w vs x difference in the resulting assembly code.
Attached is the commited change to the testsuite.
Jeff
commit 2ec684467cbc610348ebd7b71e01425629b82f73
Author: Jeff Law <[email protected]>
Date: Fri May 22 09:21:31 2026 -0600
Re: [Linaro-TCWG-CI] gcc-17-577-gbc19036af435: 2 regressions on aarch64
>>> Produces 2 regressions:
>>> |
>>> | regressions.sum:
>>> | Running gcc:gcc.target/aarch64/aarch64.exp ...
>>> | FAIL: gcc.target/aarch64/tbz_1.c check-function-bodies g1
>>> | FAIL: gcc.target/aarch64/tbz_1.c check-function-bodies g2
>> I've reproduced this locally. I haven't gotten into the debug cycle
>> yet, but wanted folks to know it's mine AFAICT to avoid duplicating
>> debugging efforts.
>
> I think the fix is just a testcase fix.
> Currently the testcase has:
> ** tbnz w[0-9]+, #?0, .L([0-9]+)
>
> But that should just can be:
> ** tbnz [wx][0-9]+, #?0, .L([0-9]+)
>
> to match both x0 and w0 there. But are valid in this case with bit 0.
> For both g1 and g2.
Yea, so I wanted to verify this was behaving per expectation. It is. We
have
this in g1:
> (insn 2 4 3 2 (set (reg/v:SI 101 [ x ])
> (zero_extend:SI (reg:QI 0 x0 [ x ]))) "j.c":16:1 149
{*zero_extendqisi2_aarch64}
> (expr_list:REG_DEAD (reg:QI 0 x0 [ x ])
> (nil)))
> (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
> (insn 6 3 7 2 (set (reg:SI 102)
> (and:SI (reg/v:SI 101 [ x ])
> (const_int 1 [0x1]))) "j.c":17:6 discrim 1 551 {andsi3}
> (expr_list:REG_DEAD (reg/v:SI 101 [ x ])
> (nil)))
We can trivially see that the zero extension is pointless because the value
is
masked with 0x1 immediately thereafter. So ext-dce replaces the zero_extend
with a subreg which then simplifies the sequence into:
> (insn 2 4 3 2 (set (reg/v:SI 101 [ x ])
> (reg:SI 0 x0 [ x ])) "j.c":16:1 104 {*movsi_aarch64}
> (expr_list:REG_DEAD (reg:QI 0 x0 [ x ])
> (nil)))
> (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
> (insn 6 3 7 2 (set (reg:SI 102)
> (and:SI (reg/v:SI 101 [ x ])
> (const_int 1 [0x1]))) "j.c":17:6 discrim 1 551 {andsi3}
> (expr_list:REG_DEAD (reg/v:SI 101 [ x ])
> (nil)))
Combine then turns it into:
> (insn 28 4 2 2 (set (reg:SI 103 [ x ])
> (reg:SI 0 x0 [ x ])) "j.c":16:1 -1
> (expr_list:REG_DEAD (reg:SI 0 x0 [ x ])
> (nil)))
> (note 2 28 3 2 NOTE_INSN_DELETED)
> (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
> (note 6 3 7 2 NOTE_INSN_DELETED)
> (note 7 6 8 2 NOTE_INSN_DELETED)
> (jump_insn 8 7 9 2 (parallel [
> (set (pc)
> (if_then_else (eq (zero_extract:DI (subreg:DI (reg:SI 103
[ x ]) 0)
> (const_int 1 [0x1])
> (const_int 0 [0]))
> (const_int 0 [0]))
> (label_ref:DI 14)
> (pc)))
> (clobber (reg:CC 66 cc))
> ]) "j.c":17:6 discrim 1 76 {aarch64_tbzeqdidi}
> (expr_list:REG_UNUSED (reg:CC 66 cc)
> (expr_list:REG_DEAD (reg:SI 103 [ x ])
> (int_list:REG_BR_PROB 966367641 (nil))))
Without the recent ext-dce improvement we get this out of combine:
> (jump_insn 8 7 9 2 (parallel [
> (set (pc)
> (if_then_else (eq (zero_extract:SI (reg:SI 0 x0 [ x ])
> (const_int 1 [0x1])
> (const_int 0 [0]))
> (const_int 0 [0]))
> (label_ref:DI 14)
> (pc)))
> (clobber (reg:CC 66 cc))
> ]) "k.c":17:6 discrim 1 47 {aarch64_tbeqsisi}
> (expr_list:REG_UNUSED (reg:CC 66 cc)
> (expr_list:REG_DEAD (reg:QI 0 x0 [ x ])
> (int_list:REG_BR_PROB 966367644 (nil))))
Note the difference in zero_extract's mode within the bit-test insn. DI vs
SI.
Hence the w vs x difference in the resulting assembly code.
gcc/testsuite
* gcc.target/aarch64/tbz_1.c: Update expected output.
diff --git a/gcc/testsuite/gcc.target/aarch64/tbz_1.c
b/gcc/testsuite/gcc.target/aarch64/tbz_1.c
index 39deb58e278..065e7f62cf8 100644
--- a/gcc/testsuite/gcc.target/aarch64/tbz_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/tbz_1.c
@@ -8,7 +8,7 @@ void h(void);
/*
** g1:
-** tbnz w[0-9]+, #?0, .L([0-9]+)
+** tbnz [wx][0-9]+, #?0, .L([0-9]+)
** ret
** ...
*/
@@ -20,7 +20,7 @@ void g1(bool x)
/*
** g2:
-** tbz w[0-9]+, #?0, .L([0-9]+)
+** tbz [wx][0-9]+, #?0, .L([0-9]+)
** b h
** ...
*/