[Bug target/97513] [11 regression] aarch64 SVE regressions since r11-3822
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97513

rsandifo at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #10 from rsandifo at gcc dot gnu.org ---
Fixed. I was going to count r11-8059 against this too, but forgot.
[Bug target/97513] [11 regression] aarch64 SVE regressions since r11-3822
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97513

--- Comment #9 from Jakub Jelinek ---
So fixed?
[Bug target/97513] [11 regression] aarch64 SVE regressions since r11-3822
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97513

--- Comment #8 from CVS Commits ---
The master branch has been updated by Richard Sandiford:

https://gcc.gnu.org/g:2f3d9104610cb2058cf091707a20c1c6eff8d470

commit r11-8030-g2f3d9104610cb2058cf091707a20c1c6eff8d470
Author: Richard Sandiford
Date:   Wed Apr 7 15:21:56 2021 +0100

    vect: Restore variable-length SLP permutes [PR97513]

    Many of the gcc.target/sve/slp-perm*.c tests started failing after the
    introduction of separate SLP permute nodes. This patch adds
    variable-length support using a similar technique to
    vect_transform_slp_perm_load. As there, the idea is to detect when
    every permute mask vector is the same and can be generated using a
    regular stepped sequence. We can easily handle those cases for
    variable-length, but still need to restrict the general case to
    constant-length.

    Again copying vect_transform_slp_perm_load, the idea is to distinguish
    the two cases regardless of whether the length is variable or not,
    partly to increase testing coverage and partly because it avoids
    generating redundant trees.

    Doing this means that we can also use SLP for the two-vector permute
    in pr88834.c, which we couldn't before VEC_PERM_EXPR nodes were
    introduced. The patch therefore makes pr88834.c check that we don't
    regress back to not using SLP and adds pr88834_ld3.c to check for the
    original problem in the PR.

    gcc/
            PR tree-optimization/97513
            * tree-vect-slp.c (vect_add_slp_permutation): New function,
            split out from...
            (vectorizable_slp_permutation): ...here. Detect cases in which
            all VEC_PERM_EXPRs are guaranteed to have the same stepped
            permute vector and only generate one permute vector for that
            case. Extend that case to handle variable-length vectors.

    gcc/testsuite/
            * gcc.target/aarch64/sve/pr88834.c: Expect the vectorizer to
            use SLP.
            * gcc.target/aarch64/sve/pr88834_ld3.c: New test.
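For readers unfamiliar with the "regular stepped sequence" idea in the commit message above, here is a minimal Python sketch of the check, not the code in tree-vect-slp.c. The function names and the `npatterns` parameter are hypothetical, and the encoding is deliberately simpler than the one GCC uses. The assumption is that a mask can be extended to any vector length if it splits into interleaved lanes that each advance by a constant step.

```python
from typing import List, Optional, Tuple


def stepped_patterns(mask: List[int],
                     npatterns: int) -> Optional[List[Tuple[int, int]]]:
    """Return (base, step) for each interleaved lane of the mask, or None
    if some lane is not an arithmetic progression.

    Hypothetical helper: a simplified stand-in for a stepped encoding.
    """
    if npatterns <= 0 or not mask or len(mask) % npatterns != 0:
        return None
    result = []
    for i in range(npatterns):
        lane = mask[i::npatterns]          # elements i, i + npatterns, ...
        step = lane[1] - lane[0] if len(lane) > 1 else 0
        if any(b - a != step for a, b in zip(lane, lane[1:])):
            return None                    # lane does not step regularly
        result.append((lane[0], step))
    return result


def single_stepped_mask(masks: List[List[int]],
                        npatterns: int) -> Optional[List[Tuple[int, int]]]:
    """Return the shared stepped description only if every mask is
    identical and stepped; otherwise None (the general case)."""
    if not masks or any(m != masks[0] for m in masks[1:]):
        return None
    return stepped_patterns(masks[0], npatterns)


# Byte reversal within each 64-bit element, written out for 16 bytes.
byte_reverse = [7, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8]
print(single_stepped_mask([byte_reverse, byte_reverse], 8))
```

In this sketch, a mask that passes the check is fully described by its per-lane bases and steps, so element `j` of a longer vector would be `base[j % npatterns] + step[j % npatterns] * (j // npatterns)`. That is why such masks do not need a compile-time vector length, whereas an arbitrary mask does.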
[Bug target/97513] [11 regression] aarch64 SVE regressions since r11-3822
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97513

rsandifo at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |ASSIGNED

--- Comment #7 from rsandifo at gcc dot gnu.org ---
Testing a patch for the slp_perm_*.c regressions, which as Jakub says are real. The remaining SVE testsuite failures look like testisms, so I'll deal with those separately.
[Bug target/97513] [11 regression] aarch64 SVE regressions since r11-3822
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97513

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #6 from Jakub Jelinek ---
I picked one test at random to bisect: gcc.target/aarch64/sve/slp_perm_1.c started FAILing with r11-3823 and still FAILs with current trunk. The difference seems to be that it previously used the SVE variable-length vectors and now uses fixed V16QImode vectors.

As this PR mentions a lot of different FAILs, some of which started earlier, others later, some of which got fixed afterwards, others not, it is hard to find out what can or should be done.

I can't reproduce the gcc.target/aarch64/sve/mask_load_slp_1.c FAILs: I get 48 / 40 instructions as expected even with r11-3822. I get 40 / 32 from r11-3823 up to r11-4480, and 48 / 40 again from r11-4481 until now.
[Bug target/97513] [11 regression] aarch64 SVE regressions since r11-3822
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97513

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-03-09
             Status|UNCONFIRMED                 |WAITING

--- Comment #5 from Richard Biener ---
What's the current state of affairs?
[Bug target/97513] [11 regression] aarch64 SVE regressions since r11-3822
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97513

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1
[Bug target/97513] [11 regression] aarch64 SVE regressions since r11-3822
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97513

--- Comment #4 from Christophe Lyon ---
Not quite: as of r11-5140, I see:

FAIL: gcc.target/aarch64/sve/slp_perm_1.c -march=armv8.2-a+sve scan-assembler-times \\trevb\\tz[0-9]+\\.d, p[0-7]/m, z[0-9]+\\.d\\n 1
FAIL: gcc.target/aarch64/sve/slp_perm_2.c -march=armv8.2-a+sve scan-assembler-times \\trevb\\tz[0-9]+\\.s, p[0-7]/m, z[0-9]+\\.s\\n 1
FAIL: gcc.target/aarch64/sve/slp_perm_3.c -march=armv8.2-a+sve scan-assembler-times \\trevb\\tz[0-9]+\\.h, p[0-7]/m, z[0-9]+\\.h\\n 1
FAIL: gcc.target/aarch64/sve/slp_perm_6.c -march=armv8.2-a+sve scan-assembler-times \\ttbl\\tz[0-9]+\\.b, z[0-9]+\\.b, z[0-9]+\\.b\\n 1
FAIL: gcc.target/aarch64/sve/vcond_3.c -march=armv8.2-a+sve scan-assembler-times \\tmov\\tz[0-9]+\\.[hsd], p[0-7]/z, #-32768\\n 3
FAIL: gcc.target/aarch64/sve/vcond_3.c -march=armv8.2-a+sve scan-assembler-times \\tmov\\tz[0-9]+\\.[hsd], p[0-7]/z, #256\\n 3
FAIL: gcc.target/aarch64/sve/vcond_3.c -march=armv8.2-a+sve scan-assembler-times \\tmov\\tz[0-9]+\\.[hsd], p[0-7]/z, #2\\n 3
FAIL: gcc.target/aarch64/sve/vcond_3.c -march=armv8.2-a+sve scan-assembler-times \\tmov\\tz[0-9]+\\.b, p[0-7]/z, #-128\\n 1
FAIL: gcc.target/aarch64/sve/vcond_3.c -march=armv8.2-a+sve scan-assembler-times \\tmov\\tz[0-9]+\\.b, p[0-7]/z, #2\\n 1

gcc.target/aarch64/sve/loop_add_4.c started passing between r4882 and r4894.
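Each FAIL line above comes from a scan-assembler-times check, which passes only when the pattern matches the generated assembly exactly the stated number of times. The Python sketch below approximates that counting for the slp_perm_1.c pattern. The helper name and both assembly snippets are hypothetical, hand-written illustrations, not compiler output, and the real check is done by the DejaGnu testsuite using Tcl regular expressions.

```python
import re


def scan_assembler_times(asm: str, pattern: str, expected: int) -> bool:
    """Approximate the check: count non-overlapping regex matches in the
    assembly text and compare the count with the expected number."""
    return len(re.findall(pattern, asm)) == expected


# The slp_perm_1.c pattern from the log, with the doubled backslashes of
# the test log reduced to single regex escapes.
PATTERN = r"\trevb\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n"

# Hypothetical assembly using SVE variable-length vectors.
sve_asm = (
    "\tld1d\tz0.d, p0/z, [x0]\n"
    "\trevb\tz0.d, p0/m, z0.d\n"
    "\tst1d\tz0.d, p0, [x1]\n"
)

# Hypothetical assembly using fixed 128-bit Advanced SIMD vectors.
fixed_asm = (
    "\tldr\tq0, [x0]\n"
    "\trev64\tv0.16b, v0.16b\n"
    "\tstr\tq0, [x1]\n"
)

print(scan_assembler_times(sve_asm, PATTERN, 1))
print(scan_assembler_times(fixed_asm, PATTERN, 1))
```

Under these assumptions, code that performs the same byte reversal with fixed-length vectors would contain no `revb` on `z` registers, so the count is zero and the check reports a FAIL even though the generated code may still be correct.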
[Bug target/97513] [11 regression] aarch64 SVE regressions since r11-3822
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97513

rsandifo at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rsandifo at gcc dot gnu.org

--- Comment #3 from rsandifo at gcc dot gnu.org ---
With current trunk I'm seeing:

FAIL: gcc.target/aarch64/sve/loop_add_4.c -march=armv8.2-a+sve scan-assembler-times \\tadd\\tz[0-9]+\\.d, z[0-9]+\\.d, z[0-9]+\\.d\\n 10
FAIL: gcc.target/aarch64/sve/loop_add_4.c -march=armv8.2-a+sve scan-assembler-times \\tadd\\tz[0-9]+\\.h, z[0-9]+\\.h, z[0-9]+\\.h\\n 10
FAIL: gcc.target/aarch64/sve/loop_add_4.c -march=armv8.2-a+sve scan-assembler-times \\tadd\\tz[0-9]+\\.s, z[0-9]+\\.s, z[0-9]+\\.s\\n 10
FAIL: gcc.target/aarch64/sve/loop_add_4.c -march=armv8.2-a+sve scan-assembler-times \\tdecd\\tz[0-9]+\\.d, all, mul #15\\n 1
FAIL: gcc.target/aarch64/sve/loop_add_4.c -march=armv8.2-a+sve scan-assembler-times \\tdecd\\tz[0-9]+\\.d, all, mul #16\\n 1
FAIL: gcc.target/aarch64/sve/loop_add_4.c -march=armv8.2-a+sve scan-assembler-times \\tdecd\\tz[0-9]+\\.d\\n 1
FAIL: gcc.target/aarch64/sve/loop_add_4.c -march=armv8.2-a+sve scan-assembler-times \\tdech\\tz[0-9]+\\.h, all, mul #15\\n 1
FAIL: gcc.target/aarch64/sve/loop_add_4.c -march=armv8.2-a+sve scan-assembler-times \\tdech\\tz[0-9]+\\.h, all, mul #16\\n 1
FAIL: gcc.target/aarch64/sve/loop_add_4.c -march=armv8.2-a+sve scan-assembler-times \\tdech\\tz[0-9]+\\.h\\n 1
FAIL: gcc.target/aarch64/sve/loop_add_4.c -march=armv8.2-a+sve scan-assembler-times \\tdecw\\tz[0-9]+\\.s, all, mul #15\\n 1
FAIL: gcc.target/aarch64/sve/loop_add_4.c -march=armv8.2-a+sve scan-assembler-times \\tdecw\\tz[0-9]+\\.s, all, mul #16\\n 1
FAIL: gcc.target/aarch64/sve/loop_add_4.c -march=armv8.2-a+sve scan-assembler-times \\tdecw\\tz[0-9]+\\.s\\n 1
FAIL: gcc.target/aarch64/sve/slp_perm_1.c -march=armv8.2-a+sve scan-assembler-times \\trevb\\tz[0-9]+\\.d, p[0-7]/m, z[0-9]+\\.d\\n 1
FAIL: gcc.target/aarch64/sve/slp_perm_2.c -march=armv8.2-a+sve scan-assembler-times \\trevb\\tz[0-9]+\\.s, p[0-7]/m, z[0-9]+\\.s\\n 1
FAIL: gcc.target/aarch64/sve/slp_perm_3.c -march=armv8.2-a+sve scan-assembler-times \\trevb\\tz[0-9]+\\.h, p[0-7]/m, z[0-9]+\\.h\\n 1
FAIL: gcc.target/aarch64/sve/slp_perm_6.c -march=armv8.2-a+sve scan-assembler-times \\ttbl\\tz[0-9]+\\.b, z[0-9]+\\.b, z[0-9]+\\.b\\n 1
FAIL: gcc.target/aarch64/sve/vcond_3.c -march=armv8.2-a+sve scan-assembler-times \\tmov\\tz[0-9]+\\.[hsd], p[0-7]/z, #-32768\\n 3
FAIL: gcc.target/aarch64/sve/vcond_3.c -march=armv8.2-a+sve scan-assembler-times \\tmov\\tz[0-9]+\\.[hsd], p[0-7]/z, #256\\n 3
FAIL: gcc.target/aarch64/sve/vcond_3.c -march=armv8.2-a+sve scan-assembler-times \\tmov\\tz[0-9]+\\.[hsd], p[0-7]/z, #2\\n 3
FAIL: gcc.target/aarch64/sve/vcond_3.c -march=armv8.2-a+sve scan-assembler-times \\tmov\\tz[0-9]+\\.b, p[0-7]/z, #-128\\n 1
FAIL: gcc.target/aarch64/sve/vcond_3.c -march=armv8.2-a+sve scan-assembler-times \\tmov\\tz[0-9]+\\.b, p[0-7]/z, #2\\n 1

Christophe, does that match your results?
[Bug target/97513] [11 regression] aarch64 SVE regressions since r11-3822
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97513

--- Comment #2 from Christophe Lyon ---
Right, but the builds were broken before that (did not work with gcc-4.8.5 on the host), so I didn't notice this problem earlier.
[Bug target/97513] [11 regression] aarch64 SVE regressions since r11-3822
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97513

Alex Coplan changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |acoplan at gcc dot gnu.org

--- Comment #1 from Alex Coplan ---
> Since r11-3822 (g:7e7352b2ad089ea68d689f3b79d93e3ee26326f7), I have noticed
> several aarch64/SVE regressions

Presumably that is just a revision that you've seen this at rather than the result of a bisection?
[Bug target/97513] [11 regression] aarch64 SVE regressions since r11-3822
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97513

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|tree-optimization           |target
   Target Milestone|---                         |11.0