[Bug target/103781] generic/cortex-a53 cost model for SLP for aarch64 is good
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103781

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2024-01-26
     Ever confirmed|0                           |1

--- Comment #6 from Andrew Pinski ---
Confirmed.

Note that if SVE is turned on, we get:
```
.L2:
        ldr     q30, [x1], 16
        ldr     q29, [x2], 16
        mul     z29.d, z30.d, z29.d
        add     v31.2d, v31.2d, v29.2d
        cmp     x1, x3
        bne     .L2
```
That is the inner loop on the trunk, which is 100% what you want, as it is then vectorized.
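
For readers following the assembly above: each iteration loads two 128-bit vectors, multiplies their 64-bit lanes with an SVE `mul` on the `z` registers, and accumulates into `v31.2d`. Base Advanced SIMD has no vector `mul` for 64-bit lanes, which is presumably why enabling SVE makes the difference here. A loop of roughly the following shape could produce such code. This is only a minimal sketch and not the test case attached to the bug; the function name `dot_u64` and its signature are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduction, NOT the test case from the bug report:
   sums the element-wise products of two arrays of 64-bit integers.
   Unsigned arithmetic wraps modulo 2^64, so overflow is well defined. */
uint64_t dot_u64(const uint64_t *a, const uint64_t *b, size_t n)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];   /* one 64x64-bit multiply and one add per element */
    return sum;
}
```

Whether a given GCC version vectorizes this, and with which instructions, depends on the `-march`/`-mcpu` settings and on the cost model under discussion, so the output should be checked rather than assumed.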
[Bug target/103781] generic/cortex-a53 cost model for SLP for aarch64 is good
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103781

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pinskia at gcc dot gnu.org

--- Comment #5 from Andrew Pinski ---
I know that the generic cost model has changed on the trunk but I am not sure this one is fixed ...
[Bug target/103781] generic/cortex-a53 cost model for SLP for aarch64 is good
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103781

--- Comment #4 from Devin Hussey ---
Makes sense because the multiplier is what, 5 cycles on an A53?