[Bug testsuite/111228] [14 regression] gcc.target/powerpc/vsx-extract-6.c fails after r14-3381-g27de9aa152141e
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111228

--- Comment #8 from Peter Bergner ---
(In reply to CVS Commits from comment #7)
> The master branch has been updated by Peter Bergner:

Bah, wrong PR#, Sorry! :-(
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111228

--- Comment #7 from CVS Commits ---
The master branch has been updated by Peter Bergner:

https://gcc.gnu.org/g:80277e18e1a77b68f938b605c3ecd2750194ed75

commit r14-3598-g80277e18e1a77b68f938b605c3ecd2750194ed75
Author: Peter Bergner
Date:   Thu Aug 31 08:56:47 2023 -0500

    rs6000: Update instruction counts to match vec_* calls [PR111228]

    Commit r14-3258-ge7a36e4715c716 increased the amount of folding we perform,
    leading to better code.  Update the expected instruction counts to match
    the changes.

    2023-08-31  Peter Bergner

    gcc/testsuite/
            PR testsuite/111228
            * gcc.target/powerpc/fold-vec-logical-ors-char.c: Update instruction
            counts to match the number of associated vec_* built-in calls.
            * gcc.target/powerpc/fold-vec-logical-ors-int.c: Likewise.
            * gcc.target/powerpc/fold-vec-logical-ors-longlong.c: Likewise.
            * gcc.target/powerpc/fold-vec-logical-ors-short.c: Likewise.
            * gcc.target/powerpc/fold-vec-logical-other-char.c: Likewise.
            * gcc.target/powerpc/fold-vec-logical-other-int.c: Likewise.
            * gcc.target/powerpc/fold-vec-logical-other-longlong.c: Likewise.
            * gcc.target/powerpc/fold-vec-logical-other-short.c: Likewise.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111228

--- Comment #6 from Peter Bergner ---
(In reply to Richard Biener from comment #5)
> Should be fixed.

Confirmed fixed.  Thanks!
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111228

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #5 from Richard Biener ---
Should be fixed.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111228

--- Comment #4 from CVS Commits ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:caa7a99a052929d5970677c5b639e1fa5166e334

commit r14-3571-gcaa7a99a052929d5970677c5b639e1fa5166e334
Author: Richard Biener
Date:   Wed Aug 30 11:57:47 2023 +0200

    tree-optimization/111228 - combine two VEC_PERM_EXPRs

    The following adds simplification of two VEC_PERM_EXPRs where
    the later one replaces all elements from either the first or
    the second input of the earlier permute.  This allows a three
    input permute to be simplified to a two input one.

    I'm following the existing two input simplification case and only
    allow non-VLA permutes.  The now existing three cases and the
    single case in tree-ssa-forwprop.cc somehow ask for merging,
    I'm not doing this as part of this change though.

            PR tree-optimization/111228
            * match.pd ((vec_perm (vec_perm ..) @5 ..) -> (vec_perm @x @5 ..)):
            New simplifications.

            * gcc.dg/tree-ssa/forwprop-42.c: New testcase.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111228

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |14.0
           Keywords|                            |testsuite-fail
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org

--- Comment #3 from Richard Biener ---
Mine.  We have

/* Merge
     c = VEC_PERM_EXPR ;
     d = VEC_PERM_EXPR ;
   to
     d = VEC_PERM_EXPR ;  */

and tree-ssa-forwprop.cc has some other special-cases.  What we lack is
simplification of two consecutive permutes.

> For gimple IRs:
>
>   res_3 = VEC_PERM_EXPR ;
>   res_5 = VEC_PERM_EXPR ;
>
> I'd expect it can be further optimized into
>
>   res_5 = VEC_PERM_EXPR ;

where I think the vectors are all vector(2) unsigned long.  This works
because the later permute replaces all elements the first permute takes
from its first or second operand.  Thus the key is to identify whether the
inherited elements are all from a single operand of the first source (and
which ones).
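
The check described in comment #3 can be sketched outside the compiler. The following is only an illustrative model in plain C, not the match.pd pattern that was committed: the helper names model_vec_perm and combine_perms are hypothetical, vectors are modeled as arrays of unsigned long, selectors as arrays of indices in the range 0 .. 2*n-1, and the earlier permute is assumed to be the first operand of the later one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a two-input permute on n-element vectors:
   selector value s picks a[s] when s < n, otherwise b[s - n].  */
static void
model_vec_perm (const unsigned long *a, const unsigned long *b,
                const unsigned *sel, size_t n, unsigned long *out)
{
  for (size_t i = 0; i < n; i++)
    {
      assert (sel[i] < 2 * n);
      out[i] = sel[i] < n ? a[sel[i]] : b[sel[i] - n];
    }
}

/* Try to fold
     t = perm (a, b, sel1);
     d = perm (t, c, sel2);
   into
     d = perm (x, c, out_sel);   with x being either a or b.
   The fold is only valid when every lane that d inherits from t
   originated in the same operand of the first permute.  Returns false
   when the inherited lanes mix a and b; on success *use_b tells
   whether x is b (true) or a (false).  */
static bool
combine_perms (const unsigned *sel1, const unsigned *sel2, size_t n,
               unsigned *out_sel, bool *use_b)
{
  bool from_a = false, from_b = false;

  /* Classify the lanes of t that survive into d.  */
  for (size_t i = 0; i < n; i++)
    if (sel2[i] < n)
      {
        if (sel1[sel2[i]] < n)
          from_a = true;
        else
          from_b = true;
      }

  if (from_a && from_b)
    return false;

  *use_b = from_b;

  /* Lanes taken from c keep their index; lanes taken from t are
     redirected to the lane of a (or b) that t copied.  */
  for (size_t i = 0; i < n; i++)
    {
      if (sel2[i] >= n)
        out_sel[i] = sel2[i];
      else if (from_b)
        out_sel[i] = sel1[sel2[i]] - n;
      else
        out_sel[i] = sel1[sel2[i]];
    }
  return true;
}
```

With n = 2, sel1 = { 0, 3 } and sel2 = { 0, 3 }, the later permute keeps only lane 0 of t, which came from a, so the pair folds to a single permute of a and c with selector { 0, 3 }. With sel2 = { 0, 1 } the later permute keeps one lane from a and one from b, so no two-input form exists and the sketch returns false.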
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111228

--- Comment #2 from Kewen Lin ---
(In reply to Peter Bergner from comment #1)
> Confirmed.  The testsuite log shows for vsx-extract-6.c and vsx-extract-7.c:
>
> gcc.target/powerpc/vsx-extract-6.c: \\mxxpermdi\\M found 2 times
> FAIL: gcc.target/powerpc/vsx-extract-6.c scan-assembler-times \\mxxpermdi\\M
> 1
> FAIL: gcc.target/powerpc/vsx-extract-6.c scan-assembler-not \\mvspltisw\\M
>
> So we have an extra xxpermdi than we expected and we also have a vspltisw
> when we expected none.  I haven't looked at whether the code is better or
> worse though, to know whether we should just update the expected counts or
> whether this is really a code quality regression.

The commit makes the vsx-extract-6.c end up with:

test_vpasted:
.LFB0:
        .cfi_startproc
        xxspltib 0,0
        xxpermdi 34,34,0,1
        xxpermdi 34,34,35,1
        blr

instead of (the original expected):

test_vpasted:
.LFB0:
        .cfi_startproc
        xxpermdi 34,34,35,1
        blr

I think it's a code quality regression.

The optimized gimple IR is changed to:

__vector unsigned long long test_vpasted (__vector unsigned long long high,
__vector unsigned long long low)
{
  __vector unsigned long long res;

  [local count: 1073741824]:
  res_3 = VEC_PERM_EXPR ;
  res_5 = VEC_PERM_EXPR ;
  return res_5;

}

from:

__vector unsigned long long test_vpasted (__vector unsigned long long high,
__vector unsigned long long low)
{
  __vector unsigned long long res;
  long long unsigned int _1;
  long long unsigned int _2;

  [local count: 1073741824]:
  _1 = BIT_FIELD_REF ;
  res_5 = BIT_INSERT_EXPR ;
  _2 = BIT_FIELD_REF ;
  res_7 = BIT_INSERT_EXPR ;
  return res_7;

}

For gimple IRs:

  res_3 = VEC_PERM_EXPR ;
  res_5 = VEC_PERM_EXPR ;

I'd expect it can be further optimized into

  res_5 = VEC_PERM_EXPR ;
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111228

Peter Bergner changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2023-08-30
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #1 from Peter Bergner ---
Confirmed.  The testsuite log shows for vsx-extract-6.c and vsx-extract-7.c:

gcc.target/powerpc/vsx-extract-6.c: \\mxxpermdi\\M found 2 times
FAIL: gcc.target/powerpc/vsx-extract-6.c scan-assembler-times \\mxxpermdi\\M 1
FAIL: gcc.target/powerpc/vsx-extract-6.c scan-assembler-not \\mvspltisw\\M

So we have an extra xxpermdi than we expected and we also have a vspltisw
when we expected none.  I haven't looked at whether the code is better or
worse though, to know whether we should just update the expected counts or
whether this is really a code quality regression.