[Bug rtl-optimization/114515] [14 Regression] Failure to use aarch64 lane forms after PR101523

2024-04-03 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114515

Tamar Christina  changed:

   What|Removed |Added

 CC||tnfchris at gcc dot gnu.org
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=114575

--- Comment #10 from Tamar Christina  ---
This has also broken our addressing modes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114575

[Bug rtl-optimization/114515] [14 Regression] Failure to use aarch64 lane forms after PR101523

2024-04-02 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114515

--- Comment #9 from Jeffrey A. Law  ---
Thanks for that info Edwin -- my tester flagged them too and mentally I'd
figured it was most likely the combiner change.

[Bug rtl-optimization/114515] [14 Regression] Failure to use aarch64 lane forms after PR101523

2024-04-02 Thread ewlu at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114515

--- Comment #8 from Edwin Lu  ---
(In reply to Robin Dapp from comment #7)
> There is some riscv fallout as well.  Edwin has the details.

I haven't done an in depth analysis but the full list of new riscv scan-dump
failures can be found here:
https://github.com/patrick-rivos/gcc-postcommit-ci/issues/694

[Bug rtl-optimization/114515] [14 Regression] Failure to use aarch64 lane forms after PR101523

2024-04-02 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114515

Robin Dapp  changed:

   What|Removed |Added

 CC||ewlu at rivosinc dot com,
   ||rdapp at gcc dot gnu.org

--- Comment #7 from Robin Dapp  ---
There is some riscv fallout as well.  Edwin has the details.

[Bug rtl-optimization/114515] [14 Regression] Failure to use aarch64 lane forms after PR101523

2024-04-02 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114515

Richard Biener  changed:

   What|Removed |Added

   Priority|P2  |P1
 Ever confirmed|0   |1
   Last reconfirmed||2024-04-02
 Status|UNCONFIRMED |NEW

--- Comment #6 from Richard Biener  ---
Note I think given the offending rev fixed a very old bug we should eventually
revert the fix and rework it during next stage1.  This was at least
unexpectedly big fallout AFAIU.

[Bug rtl-optimization/114515] [14 Regression] Failure to use aarch64 lane forms after PR101523

2024-03-29 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114515

Jeffrey A. Law  changed:

   What|Removed |Added

   Priority|P3  |P2
 CC||law at gcc dot gnu.org

[Bug rtl-optimization/114515] [14 Regression] Failure to use aarch64 lane forms after PR101523

2024-03-28 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114515

--- Comment #5 from Richard Sandiford  ---
For the record, the associated new testsuite failures are:

FAIL: gcc.target/aarch64/ashltidisi.c scan-assembler-times asr 3
FAIL: gcc.target/aarch64/asimd-mull-elem.c scan-assembler-times
\\s+fmul\\tv[0-9]+\\.4s, v[0-9]+\\.4s, v[0-9]+\\.s\\[0\\] 4
FAIL: gcc.target/aarch64/asimd-mull-elem.c scan-assembler-times
\\s+mul\\tv[0-9]+\\.4s, v[0-9]+\\.4s, v[0-9]+\\.s\\[0\\] 4
FAIL: gcc.target/aarch64/ccmp_3.c scan-assembler-not \tcbnz\t
FAIL: gcc.target/aarch64/pr100056.c scan-assembler-times \\t[us]bfiz\\tw[0-9]+,
w[0-9]+, 11 2
FAIL: gcc.target/aarch64/pr100056.c scan-assembler-times \\tadd\\tw[0-9]+,
w[0-9]+, w[0-9]+, uxtb\\n 2
FAIL: gcc.target/aarch64/pr108840.c scan-assembler-not and\\tw[0-9]+, w[0-9]+,
31
FAIL: gcc.target/aarch64/pr112105.c scan-assembler-not \\tdup\\t
FAIL: gcc.target/aarch64/pr112105.c scan-assembler-times
(?n)\\tfmul\\t.*v[0-9]+\\.s\\[0\\]\\n 2
FAIL: gcc.target/aarch64/rev16_2.c scan-assembler-times rev16\\tx[0-9]+ 2
FAIL: gcc.target/aarch64/vaddX_high_cost.c scan-assembler-not dup\\t
FAIL: gcc.target/aarch64/vmul_element_cost.c scan-assembler-not dup\\t
FAIL: gcc.target/aarch64/vmul_high_cost.c scan-assembler-not dup\\t
FAIL: gcc.target/aarch64/vsubX_high_cost.c scan-assembler-not dup\\t
FAIL: gcc.target/aarch64/sve/pr98119.c scan-assembler \\tand\\tx[0-9]+,
x[0-9]+, #?-31\\n
FAIL: gcc.target/aarch64/sve/pred-not-gen-1.c scan-assembler-not \\tbic\\t
FAIL: gcc.target/aarch64/sve/pred-not-gen-1.c scan-assembler-times
\\tnot\\tp[0-9]+\\.b, p[0-9]+/z, p[0-9]+\\.b\\n 1
FAIL: gcc.target/aarch64/sve/pred-not-gen-4.c scan-assembler-not \\tbic\\t
FAIL: gcc.target/aarch64/sve/pred-not-gen-4.c scan-assembler-times
\\tnot\\tp[0-9]+\\.b, p[0-9]+/z, p[0-9]+\\.b\\n 1
FAIL: gcc.target/aarch64/sve/var_stride_2.c scan-assembler-times
\\tubfiz\\tx[0-9]+, x2, 10, 16\\n 1
FAIL: gcc.target/aarch64/sve/var_stride_2.c scan-assembler-times
\\tubfiz\\tx[0-9]+, x3, 10, 16\\n 1
FAIL: gcc.target/aarch64/sve/var_stride_4.c scan-assembler-times
\\tsbfiz\\tx[0-9]+, x2, 10, 32\\n 1
FAIL: gcc.target/aarch64/sve/var_stride_4.c scan-assembler-times
\\tsbfiz\\tx[0-9]+, x3, 10, 32\\n 1

[Bug rtl-optimization/114515] [14 Regression] Failure to use aarch64 lane forms after PR101523

2024-03-28 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114515

--- Comment #4 from Richard Sandiford  ---
(In reply to Richard Biener from comment #1)
> Btw, why does forwprop not do this?
Not 100% sure (I wasn't involved in choosing the current heuristics).  But
fwprop can propagate across blocks, so there is probably more risk of
increasing register pressure.

[Bug rtl-optimization/114515] [14 Regression] Failure to use aarch64 lane forms after PR101523

2024-03-28 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114515

--- Comment #3 from Richard Sandiford  ---
In RTL terms, the dup is vec_duplicate.  The combination is:

Trying 10 -> 13:
   10: r107:V4SF=vec_duplicate(r115:SF)
  REG_DEAD r115:SF
   13: r110:V4SF=r111:V4SF*r107:V4SF
  REG_DEAD r111:V4SF
Failed to match this instruction:
(parallel [
(set (reg:V4SF 110 [ _2 ])
(mult:V4SF (vec_duplicate:V4SF (reg:SF 115))
(reg:V4SF 111 [ *ptr_6(D) ])))
(set (reg:V4SF 107)
(vec_duplicate:V4SF (reg:SF 115)))
])
Failed to match this instruction:
(parallel [
(set (reg:V4SF 110 [ _2 ])
(mult:V4SF (vec_duplicate:V4SF (reg:SF 115))
(reg:V4SF 111 [ *ptr_6(D) ])))
(set (reg:V4SF 107)
(vec_duplicate:V4SF (reg:SF 115)))
])
Successfully matched this instruction:
(set (reg:V4SF 107)
(vec_duplicate:V4SF (reg:SF 115)))
Successfully matched this instruction:
(set (reg:V4SF 110 [ _2 ])
(mult:V4SF (vec_duplicate:V4SF (reg:SF 115))
(reg:V4SF 111 [ *ptr_6(D) ])))
allowing combination of insns 10 and 13
original costs 8 + 20 = 28
replacement costs 8 + 20 = 28
modifying insn i210: r107:V4SF=vec_duplicate(r115:SF)
deferring rescan insn with uid = 10.
modifying insn i313: r110:V4SF=vec_duplicate(r115:SF)*r111:V4SF
  REG_DEAD r115:SF
  REG_DEAD r111:V4SF
deferring rescan insn with uid = 13.

[Bug rtl-optimization/114515] [14 Regression] Failure to use aarch64 lane forms after PR101523

2024-03-28 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114515

--- Comment #2 from Segher Boessenkool  ---
The PR101523 fix makes sure we do not get the same I2 back, because that
violates algorithmic assumptions of combine.  Importantly, the way it was
things can be changed back time and time again, and that actually happened.
There is no "canonical form" in combine, it all depends on what little
piece of context is and is not considered what form combine prefers.  Things
can -- and DID -- oscillate.

So, what is happening here?  The "dup" here is really a "splat"?  Should the
backend have some extra define_insn or define_split, or maybe even a peephole?

[Bug rtl-optimization/114515] [14 Regression] Failure to use aarch64 lane forms after PR101523

2024-03-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114515

--- Comment #1 from Richard Biener  ---
Btw, why does forwprop not do this?

[Bug rtl-optimization/114515] [14 Regression] Failure to use aarch64 lane forms after PR101523

2024-03-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114515

Richard Biener  changed:

   What|Removed |Added

 Target||aarch64
   Target Milestone|--- |14.0
 CC||rguenth at gcc dot gnu.org