https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99142
Bug ID: 99142
Summary: [11 Regression] __builtin_clz match.pd transformation too greedy
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: hp at gcc dot gnu.org
Target Milestone: ---

Created attachment 50215
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50215&action=edit
test-case gcc.dg/tree-ssa/prXXXXX.c

See the attached test-case, which is de-macroized from
gcc.target/cris/pr93372-31.c, which started regressing with d2eb616a0f7b
"match.pd: Add clz(X) == 0 -> (int)X < 0 etc. simpifications [PR94802]"

In the test-case, the clz result *is* used more than once (twice more besides
the transformed compare) and the match.pd matching expression *does* have the
s modifier: (op (clz:s @0) INTEGER_CST@1), but since the transformation
doesn't result in "an expression with more than one operator" (cf.
doc/match-and-simplify.texi), it's still performed.

The result is that the *input* is kept alive *after* the clz instruction. This
generally causes additional register pressure and throws away any re-use of
incidentally computed condition codes.

Though the original observation was for cris-elf, where the effect is more
dramatic, the effect is visible even for x86_64 and is of the same kind: losing
the re-use of the non-zero condition codes from the bsrl instruction, i.e. the
transformation causes an additional instruction:

--- prXXXXX.s.64good    2021-02-17 02:26:57.646183108 +0100
+++ prXXXXX.s.64bad     2021-02-17 02:27:33.124979464 +0100
@@ -9,7 +9,8 @@ f:
        bsrl    %edi, %eax
        xorl    $31, %eax
        movl    %eax, (%rsi)
-       je      .L1
+       testl   %edi, %edi
+       js      .L1
        movl    %eax, (%rdx)
 .L1:
        ret

My conclusion is that the matching condition should be gated by
single_use(clz result) *everywhere*. Alternatively, the "s" modifier could be
adjusted somehow, but I'm not sure how, other than by making it mean *exactly*
single_use, and that suggestion has been shot down before.

Maybe there should be an additional *reverse* version of the "simplification",
replacing "y = clz(x); if (x < 0) ...stuff using y but not x" ->
"y = clz(x); if (y == 0) ...stuff using y but not x"!
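
For readers without the attachment, here is a minimal sketch of the kind of code involved. It is reconstructed from the x86_64 assembly quoted above and is NOT the attached test-case; the function names are hypothetical, and it assumes GCC with a 32-bit unsigned int. The first function is the source form, the second writes out by hand what the match.pd rewrite effectively produces (the compare c != 0 becomes (int)x >= 0), and the helper checks that both store the same values.

```c
#include <assert.h>

/* Source form (hypothetical reconstruction): the clz result c has three
   uses, two stores plus the compare, so x is dead after the clz.  */
void f_clz_compare(unsigned int x, int *p, int *q)
{
    int c = __builtin_clz(x);   /* result is undefined for x == 0 */
    *p = c;
    if (c != 0)                 /* can reuse flags already set for c */
        *q = c;
}

/* Hand-written form of the rewritten compare: it now tests x, so x must
   stay live after the clz and a separate test instruction is needed.
   The (int) conversion of values above INT_MAX is implementation-defined;
   GCC wraps modulo 2^32.  */
void f_sign_compare(unsigned int x, int *p, int *q)
{
    int c = __builtin_clz(x);
    *p = c;
    if ((int)x >= 0)
        *q = c;
}

/* Both forms must store the same values for any nonzero x.  */
void check_equivalent(unsigned int x)
{
    int p1 = -1, q1 = -1, p2 = -1, q2 = -1;
    assert(x != 0);
    f_clz_compare(x, &p1, &q1);
    f_sign_compare(x, &p2, &q2);
    assert(p1 == p2 && q1 == q2);
}
```

The two forms are semantically identical for nonzero x, so the question raised here is purely about which one yields better code when c has other uses; the suggested reverse transformation would turn the second form back into the first.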