[Bug testsuite/111228] [14 regression] gcc.target/powerpc/vsx-extract-6.c fails after r14-3381-g27de9aa152141e

2023-08-31 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111228

--- Comment #8 from Peter Bergner  ---
(In reply to CVS Commits from comment #7)
> The master branch has been updated by Peter Bergner :

Bah, wrong PR#, Sorry! :-(

2023-08-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111228

--- Comment #7 from CVS Commits  ---
The master branch has been updated by Peter Bergner :

https://gcc.gnu.org/g:80277e18e1a77b68f938b605c3ecd2750194ed75

commit r14-3598-g80277e18e1a77b68f938b605c3ecd2750194ed75
Author: Peter Bergner 
Date:   Thu Aug 31 08:56:47 2023 -0500

rs6000: Update instruction counts to match vec_* calls [PR111228]

Commit r14-3258-ge7a36e4715c716 increased the amount of folding we perform,
leading to better code.  Update the expected instruction counts to match
the changes.

2023-08-31  Peter Bergner  

gcc/testsuite/
PR testsuite/111228
* gcc.target/powerpc/fold-vec-logical-ors-char.c: Update instruction
counts to match the number of associated vec_* built-in calls.
* gcc.target/powerpc/fold-vec-logical-ors-int.c: Likewise.
* gcc.target/powerpc/fold-vec-logical-ors-longlong.c: Likewise.
* gcc.target/powerpc/fold-vec-logical-ors-short.c: Likewise.
* gcc.target/powerpc/fold-vec-logical-other-char.c: Likewise.
* gcc.target/powerpc/fold-vec-logical-other-int.c: Likewise.
* gcc.target/powerpc/fold-vec-logical-other-longlong.c: Likewise.
* gcc.target/powerpc/fold-vec-logical-other-short.c: Likewise.

2023-08-30 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111228

--- Comment #6 from Peter Bergner  ---
(In reply to Richard Biener from comment #5)
> Should be fixed.

Confirmed fixed.  Thanks!

2023-08-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111228

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #5 from Richard Biener  ---
Should be fixed.

2023-08-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111228

--- Comment #4 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:caa7a99a052929d5970677c5b639e1fa5166e334

commit r14-3571-gcaa7a99a052929d5970677c5b639e1fa5166e334
Author: Richard Biener 
Date:   Wed Aug 30 11:57:47 2023 +0200

tree-optimization/111228 - combine two VEC_PERM_EXPRs

The following adds simplification of two VEC_PERM_EXPRs where
the later one replaces all elements from either the first or the
second input of the earlier permute.  This allows a three input
permute to be simplified to a two input one.

I'm following the existing two input simplification case and only
allow non-VLA permutes.  The now existing three cases and the
single case in tree-ssa-forwprop.cc somehow ask for merging,
I'm not doing this as part of this change though.

PR tree-optimization/111228
* match.pd ((vec_perm (vec_perm ..) @5 ..) -> (vec_perm @x @5 ..)):
New simplifications.

* gcc.dg/tree-ssa/forwprop-42.c: New testcase.
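
The fold described in this commit message can be modelled outside the compiler. The following is a minimal Python sketch, not the match.pd pattern itself: it assumes fixed-length vectors and constant selectors, and the names vec_perm and combine are hypothetical helpers introduced only for illustration.

```python
def vec_perm(v0, v1, sel):
    """Model VEC_PERM_EXPR <v0, v1, sel>: index into v0 followed by v1."""
    n = len(v0)
    both = list(v0) + list(v1)
    return [both[i % (2 * n)] for i in sel]


def combine(inner_sel, outer_sel, n):
    """Try to fold vec_perm(vec_perm(a, b, inner_sel), c, outer_sel).

    Returns (kept, new_sel), where kept is 0 for `a` or 1 for `b`, such that
    the nested form equals vec_perm(kept_operand, c, new_sel).  Returns None
    when the outer permute still reads lanes originating from both a and b.
    """
    kept = None
    new_sel = []
    for j in outer_sel:
        j %= 2 * n
        if j >= n:
            # Lane comes from c, so the index is unchanged.
            new_sel.append(j)
            continue
        # Lane comes from the inner permute: trace it back to a or b.
        src = inner_sel[j] % (2 * n)
        operand = 0 if src < n else 1
        if kept is None:
            kept = operand
        elif kept != operand:
            return None  # would need three inputs
        new_sel.append(src % n)
    if kept is None:
        kept = 0  # only c is read, so either inner operand will do
    return kept, new_sel
```

With the made-up selectors inner_sel = [0, 3] and outer_sel = [0, 3], combine returns (0, [0, 3]): the outer permute only inherits a lane that came from the first inner operand, so the second inner operand drops out. With outer_sel = [0, 1] it returns None, because both inner operands are still needed.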

2023-08-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111228

Richard Biener  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
   Target Milestone|---                          |14.0
           Keywords|                             |testsuite-fail
             Status|NEW                          |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org|rguenth at gcc dot gnu.org

--- Comment #3 from Richard Biener  ---
Mine.  We have

/* Merge
   c = VEC_PERM_EXPR ;
   d = VEC_PERM_EXPR ;
   to
   d = VEC_PERM_EXPR ;  */

and tree-ssa-forwprop.cc has some other special-cases.  What we lack is
simplification of two consecutive permutes.

> For gimple IRs:
> 
>  res_3 = VEC_PERM_EXPR ;
>  res_5 = VEC_PERM_EXPR ;
>
> I'd expect it can be further optimized into
>
>  res_5 = VEC_PERM_EXPR ;

where I think the vectors are all vector(2) unsigned long.  This works
because the later permute replaces all elements the first permute takes
from its first or second operand.  Thus the key is to identify whether
the elements inherited from the first permute all come from a single one
of its operands (and which one).
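
The check described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not GCC code: selectors are constant, the outer permute takes the inner result as its first operand, and inherited_operands is a hypothetical helper name.

```python
def inherited_operands(inner_sel, outer_sel, n):
    """Return which operands of the inner permute still reach the outer result.

    Models d = VEC_PERM_EXPR <c, other, outer_sel> with
    c = VEC_PERM_EXPR <a, b, inner_sel> on n-element vectors.
    Operand 0 is a, operand 1 is b.
    """
    used = set()
    for j in outer_sel:
        j %= 2 * n
        if j < n:
            # This lane is inherited from the inner permute.
            src = inner_sel[j] % (2 * n)
            used.add(0 if src < n else 1)
    return used
```

If the returned set has at most one member, the inner permute can be bypassed and the outer one rewritten to read that operand directly. For the made-up selectors inner_sel = [0, 3] and outer_sel = [0, 3] on two-element vectors, the result is {0}.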

2023-08-29 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111228

--- Comment #2 from Kewen Lin  ---
(In reply to Peter Bergner from comment #1)
> Confirmed.  The testsuite log shows for vsx-extract-6.c and vsx-extract-7.c:
> 
> gcc.target/powerpc/vsx-extract-6.c: \\mxxpermdi\\M found 2 times
> FAIL: gcc.target/powerpc/vsx-extract-6.c scan-assembler-times \\mxxpermdi\\M 1
> FAIL: gcc.target/powerpc/vsx-extract-6.c scan-assembler-not \\mvspltisw\\M
> 
> So we have one more xxpermdi than we expected and we also have a vspltisw
> when we expected none.  I haven't looked at whether the code is better or
> worse though, to know whether we should just update the expected counts or
> whether this is really a code quality regression.

With the commit, vsx-extract-6.c ends up with:

test_vpasted:
.LFB0:
.cfi_startproc
xxspltib 0,0
xxpermdi 34,34,0,1
xxpermdi 34,34,35,1
blr

instead of (the original expected):

test_vpasted:
.LFB0:
.cfi_startproc
xxpermdi 34,34,35,1
blr

I think it's a code quality regression. The optimized gimple IR is changed to:

__vector unsigned long long test_vpasted (__vector unsigned long long high,
__vector unsigned long long low)
{
  __vector unsigned long long res;

   [local count: 1073741824]:
  res_3 = VEC_PERM_EXPR ;
  res_5 = VEC_PERM_EXPR ;
  return res_5;

}

from:

__vector unsigned long long test_vpasted (__vector unsigned long long high,
__vector unsigned long long low)
{
  __vector unsigned long long res;
  long long unsigned int _1;
  long long unsigned int _2;

   [local count: 1073741824]:
  _1 = BIT_FIELD_REF ;
  res_5 = BIT_INSERT_EXPR ;
  _2 = BIT_FIELD_REF ;
  res_7 = BIT_INSERT_EXPR ;
  return res_7;

}

For gimple IRs:

  res_3 = VEC_PERM_EXPR ;
  res_5 = VEC_PERM_EXPR ;

I'd expect it can be further optimized into

  res_5 = VEC_PERM_EXPR ;

2023-08-29 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111228

Peter Bergner  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2023-08-30
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #1 from Peter Bergner  ---
Confirmed.  The testsuite log shows for vsx-extract-6.c and vsx-extract-7.c:

gcc.target/powerpc/vsx-extract-6.c: \\mxxpermdi\\M found 2 times
FAIL: gcc.target/powerpc/vsx-extract-6.c scan-assembler-times \\mxxpermdi\\M 1
FAIL: gcc.target/powerpc/vsx-extract-6.c scan-assembler-not \\mvspltisw\\M

So we have one more xxpermdi than we expected and we also have a vspltisw when
we expected none.  I haven't looked at whether the code is better or worse
though, to know whether we should just update the expected counts or whether
this is really a code quality regression.
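
For readers unfamiliar with scan-assembler-times, the count it reports can be approximated with a small Python sketch. It is an illustration only, not the DejaGnu implementation: Python's \b stands in for the Tcl \m and \M word anchors, count_insn is a hypothetical helper, and the two listings are the ones quoted in comment #2.

```python
import re

# Assembly for test_vpasted after the offending commit (from comment #2).
REGRESSED = """\
test_vpasted:
.LFB0:
.cfi_startproc
xxspltib 0,0
xxpermdi 34,34,0,1
xxpermdi 34,34,35,1
blr
"""

# Assembly the test originally expected (from comment #2).
EXPECTED = """\
test_vpasted:
.LFB0:
.cfi_startproc
xxpermdi 34,34,35,1
blr
"""


def count_insn(asm, mnemonic):
    """Count whole-word occurrences of a mnemonic in an assembly listing."""
    pattern = r"\b" + re.escape(mnemonic) + r"\b"
    return len(re.findall(pattern, asm))


print(count_insn(REGRESSED, "xxpermdi"))  # 2
print(count_insn(EXPECTED, "xxpermdi"))   # 1
```

The regressed listing yields two matches where the test allows exactly one, which corresponds to the "found 2 times" line in the log.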