Re: [PATCH, 11 backport] rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]

2022-02-11 Thread Bill Schmidt via Gcc-patches
Fine.  I withdraw the patch request, and will remove my name from
the bugzilla.  Somebody else can deal with it.  I have more important
things to worry about.

Bill

On 2/11/22 1:31 AM, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Feb 10, 2022 at 04:28:02PM -0600, Bill Schmidt wrote:
>> On 2/10/22 4:11 PM, Segher Boessenkool wrote:
>>>> No, trunk has this, for example:
>>>>
>>>>   const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
>>>>     VCLZLSBB_V16QI vctzlsbb_v16qi {endian}
>>> I see this on trunk:
>>>
>>>   const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
>>>     VCLZLSBB_V16QI vclzlsbb_v16qi {}
>>>
>>> Oh, you changed it?  Please fix it, then.
>> In a patch you approved, yes.
> Yes, I missed it.  That is not an argument that it would be good or
> should not be changed.
>
>> I don't really understand why you want
>> it changed now.
> Because it is wrong.
>
>> You must not be looking at the most recent trunk revision.
> Indeed I haven't been able to update master for a week or so, it does
> not bootstrap, as we have talked about.
>
>>>> Throughout the new builtin infrastructure, the defaults are set for
>>>> little-endian, and the "endian" flag changes behavior for big-endian.
>>> That is a big mistake.  There are many machine instructions that are
>>> *always* big-endian (most even!), and none that are always
>>> little-endian.  So this should be fixed, sooner rather than later :-(
>> That does not seem like a good idea in stage 4 to me.  That requires
>> yet another patch to reverse a bunch of other things unnecessarily.
> Things that were added in stage 4, a few days ago even.  Things that are
> broken and wrong.  Things I do not want to have to release with and deal
> with all the pain of having broken released versions.
>
>> This is a purely arbitrary choice.
> No, it is not.  It flies in the face of consistency.
>
>> The endian flag is only used when
>> a built-in function must have one behavior for big-endian, and another
>> behavior for little-endian.  Which one is chosen as the default is
>> absolutely arbitrary.
> The one that corresponds to the name should be the default.  I don't see
> how you can argue otherwise.
>
>> When we expand the built-in we will either
>> accept the default or change to the other.  The existence of machine
>> instructions that are only big-endian has nothing to do with the case;
>> what matters is the existence of built-in functions that have two
>> behaviors.
> Everything in our backend is BE by default, just like everything in the
> architecture is.  Yes, LE works almost as well (or just as well) in most
> places, but everything is named assuming BE.  This consistency is hugely
> important, without it the reader will not understand things as well and
> as easily.
>
>>>> That's something that should be fixed, I guess, but it's orthogonal
>>>> to this patch.
>>> Fixing it later is more work :-(
>>>
>>> Please at least open a bug report for it.
>> I can do that.
> Thanks!
>
>>> The other things need fixing before the patch is okay.
>> I'd ask you to reconsider, as explained above.
> It is purely an implementation thing, and it is completely trivial to
> do.  If you truly are afraid of breaking things (you should not be), it
> is marginally acceptable to do this as the very first thing in stage 1.
>
> Consistency matters.  Naming matters.  These shape how we think about
> things.
>
>
> Segher


Re: [PATCH, 11 backport] rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]

2022-02-10 Thread Bill Schmidt via Gcc-patches
Hi!

On 2/10/22 4:11 PM, Segher Boessenkool wrote:
> On Thu, Feb 10, 2022 at 03:17:05PM -0600, Bill Schmidt wrote:
>>>>  /* 1 argument vector functions added in ISA 3.0 (power9). */
>>>> -BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi",CONST,  vclzlsbb_v16qi)
>>>> -BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",  CONST,  vclzlsbb_v8hi)
>>>> -BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",  CONST,  vclzlsbb_v4si)
>>>> -BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi",CONST,  vctzlsbb_v16qi)
>>>> -BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",  CONST,  vctzlsbb_v8hi)
>>>> -BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",  CONST,  vctzlsbb_v4si)
>>>> +BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi",CONST,  vctzlsbb_v16qi)
>>>> +BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",  CONST,  vctzlsbb_v8hi)
>>>> +BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",  CONST,  vctzlsbb_v4si)
>>>> +BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi",CONST,  vclzlsbb_v16qi)
>>>> +BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",  CONST,  vclzlsbb_v8hi)
>>>> +BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",  CONST,  vclzlsbb_v4si)
>>> Please change the default to be equal to the builtin name, so, the BE
>>> version.  We do that everywhere else as well, and it makes a lot more
>>> sense (since everything in Power has BE numbering).
>>>
>>> The trunk version has this correct afaics?
>> No, trunk has this, for example:
>>
>>   const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
>>     VCLZLSBB_V16QI vctzlsbb_v16qi {endian}
> I see this on trunk:
>
>   const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
>     VCLZLSBB_V16QI vclzlsbb_v16qi {}
>
> Oh, you changed it?  Please fix it, then.

In a patch you approved, yes.  I don't really understand why you want
it changed now.  You must not be looking at the most recent trunk
revision.

>
>> Throughout the new builtin infrastructure, the defaults are set for
>> little-endian, and the "endian" flag changes behavior for big-endian.
> That is a big mistake.  There are many machine instructions that are
> *always* big-endian (most even!), and none that are always
> little-endian.  So this should be fixed, sooner rather than later :-(

That does not seem like a good idea in stage 4 to me.  That requires
yet another patch to reverse a bunch of other things unnecessarily.

This is a purely arbitrary choice.  The endian flag is only used when
a built-in function must have one behavior for big-endian, and another
behavior for little-endian.  Which one is chosen as the default is
absolutely arbitrary.  When we expand the built-in we will either
accept the default or change to the other.  The existence of machine
instructions that are only big-endian has nothing to do with the case;
what matters is the existence of built-in functions that have two
behaviors.

>>>>  /* { dg-require-effective-target powerpc_p9vector_ok } */
>>>>  /* { dg-options "-mdejagnu-cpu=power9" } */
>>>> +/* { dg-additional-options "-mbig" { target powerpc64le-*-* } } */
>>> You don't need the target clause, if it already is BE by default it does
>>> not do anything to add it redundantly.
>>>
>>> But this is wrong anyway: the name of the target triple does not say
>>> whether we are BE or LE.  Instead you should use the be or le selectors.
>>> But again, just add -mbig always.
>> This was added by David Edelsohn to the trunk version of the patch, because
>> -mbig actually is not supported on all subtargets.  (I found that quite
>> surprising also.)
> Huh.  Yeah I think I encountered that before.
>
> So this is because these options are in sysv4.opt .
>
>> Apparently this doesn't work on AIX, for example.  But 
>> -mlittle works everywhere.  Go figure.
> ... and -mlittle is exactly the same?  Wtw.
>
> I only looked at the .opt files, maybe one of them is handled directly,
> or more likely in specs?  And not symmetrically?
>
>> That's something that should be fixed, I guess, but it's orthogonal
>> to this patch.
> Fixing it later is more work :-(
>
> Please at least open a bug report for it.

I can do that.

>
>
> The other things need fixing before the patch is okay.

I'd ask you to reconsider, as explained above.

Thanks,
Bill

>
>
> Segher


Re: [PATCH, 11 backport] rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]

2022-02-10 Thread Bill Schmidt via Gcc-patches
Hi!

On 2/10/22 2:50 PM, Segher Boessenkool wrote:
> On Thu, Feb 10, 2022 at 12:22:28PM -0600, Bill Schmidt wrote:
>> This is a backport from mainline 3f30f2d1dbb3228b8468b26239fe60c2974ce2ac.
>> These built-ins were misimplemented as always having big-endian semantics.
>>
>> Because the built-in infrastructure has changed, the modifications to the
>> source are different but achieve the same purpose.  The modifications to
>> the test suite are identical (after fixing the issue with -mbig that David
>> pointed out with the original patch).
>>  /* 1 argument vector functions added in ISA 3.0 (power9). */
>> -BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi",  CONST,  vclzlsbb_v16qi)
>> -BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",CONST,  vclzlsbb_v8hi)
>> -BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",CONST,  vclzlsbb_v4si)
>> -BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi",  CONST,  vctzlsbb_v16qi)
>> -BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",CONST,  vctzlsbb_v8hi)
>> -BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",CONST,  vctzlsbb_v4si)
>> +BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi",  CONST,  vctzlsbb_v16qi)
>> +BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",CONST,  vctzlsbb_v8hi)
>> +BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",CONST,  vctzlsbb_v4si)
>> +BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi",  CONST,  vclzlsbb_v16qi)
>> +BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",CONST,  vclzlsbb_v8hi)
>> +BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",CONST,  vclzlsbb_v4si)
> Please change the default to be equal to the builtin name, so, the BE
> version.  We do that everywhere else as well, and it makes a lot more
> sense (since everything in Power has BE numbering).
>
> The trunk version has this correct afaics?

No, trunk has this, for example:

  const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
    VCLZLSBB_V16QI vctzlsbb_v16qi {endian}

So the backport matches what is on trunk.  

Throughout the new builtin infrastructure, the defaults are set for
little-endian, and the "endian" flag changes behavior for big-endian.

>
>> --- a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c
>> +++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c
>> @@ -1,6 +1,7 @@
>>  /* { dg-do compile { target { powerpc*-*-* } } } */
> (Delete the redundant target clause when modifying any testcase, please).

Okay.
>
>>  /* { dg-require-effective-target powerpc_p9vector_ok } */
>>  /* { dg-options "-mdejagnu-cpu=power9" } */
>> +/* { dg-additional-options "-mbig" { target powerpc64le-*-* } } */
> You don't need the target clause, if it already is BE by default it does
> not do anything to add it redundantly.
>
> But this is wrong anyway: the name of the target triple does not say
> whether we are BE or LE.  Instead you should use the be or le selectors.
> But again, just add -mbig always.

This was added by David Edelsohn to the trunk version of the patch, because
-mbig actually is not supported on all subtargets.  (I found that quite
surprising also.)  Apparently this doesn't work on AIX, for example.  But 
-mlittle works everywhere.  Go figure.

That's something that should be fixed, I guess, but it's orthogonal
to this patch.

Thanks!
Bill

>
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c
>> @@ -0,0 +1,15 @@
>> +/* { dg-do compile { target { powerpc*-*-* } } } */
>> +/* { dg-require-effective-target powerpc_p9vector_ok } */
>> +/* { dg-options "-mdejagnu-cpu=power9 -mlittle" } */
> And here you do it correctly :-)
>
> Okay with those fixes (all happen a few times).  Thanks!
>
>
> Segher


Re: [PATCH, 11 backport] rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]

2022-02-10 Thread Bill Schmidt via Gcc-patches
Hi!

On 2/10/22 2:06 PM, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Feb 10, 2022 at 12:22:28PM -0600, Bill Schmidt wrote:
>> This is a backport from mainline 3f30f2d1dbb3228b8468b26239fe60c2974ce2ac.
>> These built-ins were misimplemented as always having big-endian semantics.
> What is different compared to the trunk version?

The infrastructure changed, so:

(1) Instead of changing the default pattern in rs6000-builtins.def, I have
to change it in rs6000-builtin.def.  (Note the missing "s".)

(2) Instead of having the endian change driven by an "endian" flag in the
built-in description in rs6000-builtins.def, I have to add some more ad-hoc
code in rs6000_expand_builtin to handle the change to the big-endian
pattern.

That's all.

Thanks!
Bill

>
>
> Segher
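
For readers following the thread, the selection scheme under discussion can be modelled in a few lines of C.  This is only a sketch under stated assumptions: struct builtin_entry, the table, and select_pattern are hypothetical names invented for illustration, and pattern names are stored as strings, whereas the real GCC tables hold insn codes.  It mirrors the arrangement described above, where the table default is the little-endian pattern and a big-endian adjustment is applied at expansion time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one built-in table entry.  */
struct builtin_entry
{
  const char *name;             /* built-in name */
  const char *default_pattern;  /* pattern used when no adjustment applies */
  const char *be_pattern;       /* pattern used for big-endian, or NULL */
};

/* Defaults follow the backport: the little-endian pattern is the default.  */
static const struct builtin_entry table[] = {
  { "vclzlsbb_v16qi", "vctzlsbb_v16qi", "vclzlsbb_v16qi" },
  { "vctzlsbb_v16qi", "vclzlsbb_v16qi", "vctzlsbb_v16qi" },
};

/* Return the pattern name to expand for NAME, or NULL if NAME is unknown.  */
static const char *
select_pattern (const char *name, bool bytes_big_endian)
{
  assert (name != NULL);
  for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
    if (strcmp (table[i].name, name) == 0)
      {
        /* Big-endian adjustment, analogous to the ad-hoc switch cases.  */
        if (bytes_big_endian && table[i].be_pattern != NULL)
          return table[i].be_pattern;
        return table[i].default_pattern;
      }
  return NULL;
}
```

Swapping which column is treated as the default does not change what select_pattern returns; it changes only which pattern a reader sees next to the built-in name in the table, which is the point the review comments turn on.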


[PATCH, 11 backport] rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]

2022-02-10 Thread Bill Schmidt via Gcc-patches
Hi!

This is a backport from mainline 3f30f2d1dbb3228b8468b26239fe60c2974ce2ac.
These built-ins were misimplemented as always having big-endian semantics.
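
As a rough illustration of why the pattern has to depend on endianness, here is a portable C model.  It is a sketch, not GCC code: the model_* function names are made up, and it assumes that on little-endian element 0 of the vector sits in the last byte of the register when bytes are numbered big-endian style.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Model of vclzlsbb: count bytes starting at register byte 0 (big-endian
   numbering) until one has its least-significant bit set.  */
static int
model_vclzlsbb (const uint8_t reg[16])
{
  int n = 0;
  while (n < 16 && (reg[n] & 1) == 0)
    n++;
  return n;
}

/* Model of vctzlsbb: the same count, starting at register byte 15.  */
static int
model_vctzlsbb (const uint8_t reg[16])
{
  int n = 0;
  while (n < 16 && (reg[15 - n] & 1) == 0)
    n++;
  return n;
}

/* Place the elements into a register image.  Assumption: on little-endian,
   element i lands in register byte 15 - i.  */
static void
to_register (const uint8_t elems[16], bool bytes_big_endian, uint8_t reg[16])
{
  assert (elems != NULL && reg != NULL);
  for (int i = 0; i < 16; i++)
    reg[bytes_big_endian ? i : 15 - i] = elems[i];
}

/* vec_cntlz_lsbb counts from element 0, so little-endian needs vctzlsbb.  */
static int
model_vec_cntlz_lsbb (const uint8_t elems[16], bool bytes_big_endian)
{
  uint8_t reg[16];
  to_register (elems, bytes_big_endian, reg);
  return bytes_big_endian ? model_vclzlsbb (reg) : model_vctzlsbb (reg);
}

/* vec_cnttz_lsbb counts from element 15, so little-endian needs vclzlsbb.  */
static int
model_vec_cnttz_lsbb (const uint8_t elems[16], bool bytes_big_endian)
{
  uint8_t reg[16];
  to_register (elems, bytes_big_endian, reg);
  return bytes_big_endian ? model_vctzlsbb (reg) : model_vclzlsbb (reg);
}
```

With this model, both endiannesses give the same answer for the same element sequence, which is the behavior the patch is after.  Always using model_vclzlsbb on the little-endian register image would instead count from the wrong end.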

Because the built-in infrastructure has changed, the modifications to the
source are different but achieve the same purpose.  The modifications to
the test suite are identical (after fixing the issue with -mbig that David
pointed out with the original patch).

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
Is this okay for releases/gcc-11?

Thanks!
Bill


2022-02-10  Bill Schmidt  

gcc/
PR target/95082
* config/rs6000/rs6000-builtin.def (VCLZLSBB_V16QI): Change default
pattern.
(VCLZLSBB_V8HI): Likewise.
(VCLZLSBB_V4SI): Likewise.
(VCTZLSBB_V16QI): Likewise.
(VCTZLSBB_V8HI): Likewise.
(VCTZLSBB_V4SI): Likewise.
* config/rs6000/rs6000-call.c (rs6000_expand_builtin): Make big-endian
adjustments to P9V_BUILTIN_VC[LT]ZLSBB_* built-in expansions.

gcc/testsuite/
PR target/95082
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c: Restrict to big-endian.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-1.c: Likewise.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c: New.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-4.c: New.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-0.c: Restrict to big-endian.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-1.c: Likewise.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-3.c: New.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-4.c: New.
---
 gcc/config/rs6000/rs6000-builtin.def  | 12 
 gcc/config/rs6000/rs6000-call.c   | 30 +++
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c |  1 +
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-1.c |  1 +
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c | 15 ++
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-4.c | 15 ++
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-0.c |  1 +
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-1.c |  1 +
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-3.c | 15 ++
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-4.c | 15 ++
 10 files changed, 100 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-4.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-3.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-4.c

diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index 6270444ef70..b28ee02070a 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -2678,12 +2678,12 @@ BU_P9V_64BIT_AV_X (STXVL,   "stxvl",MISC)
 BU_P9V_64BIT_AV_X (XST_LEN_R,  "xst_len_r",MISC)
 
 /* 1 argument vector functions added in ISA 3.0 (power9). */
-BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi", CONST,  vclzlsbb_v16qi)
-BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",   CONST,  vclzlsbb_v8hi)
-BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",   CONST,  vclzlsbb_v4si)
-BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi", CONST,  vctzlsbb_v16qi)
-BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",   CONST,  vctzlsbb_v8hi)
-BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",   CONST,  vctzlsbb_v4si)
+BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi", CONST,  vctzlsbb_v16qi)
+BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",   CONST,  vctzlsbb_v8hi)
+BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",   CONST,  vctzlsbb_v4si)
+BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi", CONST,  vclzlsbb_v16qi)
+BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",   CONST,  vclzlsbb_v8hi)
+BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",   CONST,  vclzlsbb_v4si)
 
 /* Built-in support for Power9 "VSU option" string operations includes
new awareness of the "vector compare not equal" (vcmpneb, vcmpneb.,
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index ef20cb30388..27bb25fa4d8 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -13221,6 +13221,36 @@ rs6000_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED,
}
   break;
 
+case P9V_BUILTIN_VCLZLSBB_V16QI:
+  if (BYTES_BIG_ENDIAN)
+   icode = CODE_FOR_vclzlsbb_v16qi;
+  break;
+
+case P9V_BUILTIN_VCLZLSBB_V8HI:
+  if (BYTES_BIG_ENDIAN)
+   icode = CODE_FOR_vclzlsbb_v8hi;
+  break;
+
+case P9V_BUILTIN_VCLZLSBB_V4SI:
+  if (BYTES_BIG_ENDIAN)
+   icode = CODE_FOR_vclzlsbb_v4si;
+  break;
+
+case P9V_BUILTIN_VCTZLSBB_V16QI:
+  if (BYTES_BIG_ENDIAN)
+   icode = CODE_FOR_vctzlsbb_v16qi;
+  break;
+
+case P9V_BUILTIN_VCTZLSBB_V8HI:
+  if (BYTES_BIG_ENDIAN)
+   icode 

[PATCH] rs6000: Rename vec_clrl and vec_clrr to agreed-upon names

2022-02-09 Thread Bill Schmidt via Gcc-patches
Hi!

After vec_clrl and vec_clrr were implemented and during review of the
documentation, it was agreed to change their names to vec_clr_first and
vec_clr_last to more clearly describe their bi-endian semantics.  ("Left"
and "right" are the wrong terms to be using.)  It looks like I neglected
to make that change, so fixing it now.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
Is this okay for trunk, and for backport to gcc 11 after some burn-in?

Thanks!
Bill


2022-02-09  Bill Schmidt  

gcc/
* config/rs6000/rs6000-overload.def (VEC_CLR_FIRST): Rename from
VEC_CLRL.
(VEC_CLR_LAST): Rename from VEC_CLRR.

gcc/testsuite/
* gcc.target/powerpc/vec-clrl-0.c: Adjust to new names.
* gcc.target/powerpc/vec-clrl-1.c: Likewise.
* gcc.target/powerpc/vec-clrl-2.c: Likewise.
* gcc.target/powerpc/vec-clrl-3.c: Likewise.
* gcc.target/powerpc/vec-clrr-0.c: Likewise.
* gcc.target/powerpc/vec-clrr-1.c: Likewise.
* gcc.target/powerpc/vec-clrr-2.c: Likewise.
* gcc.target/powerpc/vec-clrr-3.c: Likewise.
---
 gcc/config/rs6000/rs6000-overload.def | 12 ++--
 gcc/testsuite/gcc.target/powerpc/vec-clrl-0.c |  4 ++--
 gcc/testsuite/gcc.target/powerpc/vec-clrl-1.c |  4 ++--
 gcc/testsuite/gcc.target/powerpc/vec-clrl-2.c |  4 ++--
 gcc/testsuite/gcc.target/powerpc/vec-clrl-3.c |  4 ++--
 gcc/testsuite/gcc.target/powerpc/vec-clrr-0.c |  4 ++--
 gcc/testsuite/gcc.target/powerpc/vec-clrr-1.c |  4 ++--
 gcc/testsuite/gcc.target/powerpc/vec-clrr-2.c |  4 ++--
 gcc/testsuite/gcc.target/powerpc/vec-clrr-3.c |  4 ++--
 9 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index 44e2945aaa0..0b68cc3c3b2 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -557,16 +557,16 @@
   vuc __builtin_vec_vcipherlast_be (vuc, vuc);
 VCIPHERLAST_BE
 
-[VEC_CLRL, vec_clrl, __builtin_vec_clrl]
-  vsc __builtin_vec_clrl (vsc, unsigned int);
+[VEC_CLR_FIRST, vec_clr_first, __builtin_vec_clr_first]
+  vsc __builtin_vec_clr_first (vsc, unsigned int);
 VCLRLB  VCLRLB_S
-  vuc __builtin_vec_clrl (vuc, unsigned int);
+  vuc __builtin_vec_clr_first (vuc, unsigned int);
 VCLRLB  VCLRLB_U
 
-[VEC_CLRR, vec_clrr, __builtin_vec_clrr]
-  vsc __builtin_vec_clrr (vsc, unsigned int);
+[VEC_CLR_LAST, vec_clr_last, __builtin_vec_clr_last]
+  vsc __builtin_vec_clr_last (vsc, unsigned int);
 VCLRRB  VCLRRB_S
-  vuc __builtin_vec_clrr (vuc, unsigned int);
+  vuc __builtin_vec_clr_last (vuc, unsigned int);
 VCLRRB  VCLRRB_U
 
 ; We skip generating a #define because of the C-versus-C++ complexity
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-clrl-0.c b/gcc/testsuite/gcc.target/powerpc/vec-clrl-0.c
index d0b183ebfaf..df055c6535e 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-clrl-0.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-clrl-0.c
@@ -5,11 +5,11 @@
 
 extern void abort (void);
 
-/* Vector string clear left-most bytes of unsigned char.  */
+/* Vector string clear first bytes of unsigned char.  */
 vector unsigned char
 clrl (vector unsigned char arg, int n)
 {
-  return vec_clrl (arg, n);
+  return vec_clr_first (arg, n);
 }
 
 /* { dg-final { scan-assembler {\mvclrlb\M} { target be } } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-clrl-1.c b/gcc/testsuite/gcc.target/powerpc/vec-clrl-1.c
index 43ab32c0278..692f83e033b 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-clrl-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-clrl-1.c
@@ -7,11 +7,11 @@
 
 extern void abort (void);
 
-/* Vector string clear left-most bytes of unsigned char.  */
+/* Vector string clear first bytes of unsigned char.  */
 vector unsigned char
 clrl (vector unsigned char arg, int n)
 {
-  return vec_clrl (arg, n);
+  return vec_clr_first (arg, n);
 }
 
 int main (int argc, char *argv [])
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-clrl-2.c b/gcc/testsuite/gcc.target/powerpc/vec-clrl-2.c
index b9676b8b04c..ffecf432736 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-clrl-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-clrl-2.c
@@ -5,11 +5,11 @@
 
 extern void abort (void);
 
-/* Vector string clear left-most bytes of unsigned char.  */
+/* Vector string clear first bytes of unsigned char.  */
 vector signed char
 clrl (vector signed char arg, int n)
 {
-  return vec_clrl (arg, n);
+  return vec_clr_first (arg, n);
 }
 
 /* { dg-final { scan-assembler {\mvclrlb\M} { target be } } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-clrl-3.c b/gcc/testsuite/gcc.target/powerpc/vec-clrl-3.c
index 0ae5abcee50..456f655e7aa 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-clrl-3.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-clrl-3.c
@@ -7,11 +7,11 @@
 
 extern void abort (void);
 
-/* Vector string clear left-most bytes of unsigned char.  */
+/* Vector string clear first bytes of unsign

[PATCH] rs6000: Correct function prototypes for vec_replace_unaligned

2022-02-08 Thread Bill Schmidt via Gcc-patches
Hi!

Due to a pasto error in the documentation, vec_replace_unaligned was
implemented with the same function prototypes as vec_replace_elt.  It was
intended that vec_replace_unaligned always specify output vectors as having
type vector unsigned char, to emphasize that elements are potentially
misaligned by this built-in function.  This patch corrects the
misimplementation.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
Is this okay for trunk?  Eventually I would also like to backport it
to GCC 11, after burn-in.

Thanks!
Bill


2022-02-04  Bill Schmidt  

gcc/
* config/rs6000/rs6000-builtins.def (VREPLACE_UN_UV2DI): Change
function prototype.
(VREPLACE_UN_UV4SI): Likewise.
(VREPLACE_UN_V2DF): Likewise.
(VREPLACE_UN_V2DI): Likewise.
(VREPLACE_UN_V4SF): Likewise.
(VREPLACE_UN_V4SI): Likewise.
* config/rs6000/rs6000-overload.def (VEC_REPLACE_UN): Change all
function prototypes.
* config/rs6000/vsx.md (vreplace_un_): Remove define_expand.
(vreplace_un_): New define_insn.

gcc/testsuite/
* gcc.target/powerpc/vec-replace-word-runnable.c: Handle expected
prototypes for each call to vec_replace_unaligned.
---
 gcc/config/rs6000/rs6000-builtins.def | 16 ++--
 gcc/config/rs6000/rs6000-overload.def | 12 -
 gcc/config/rs6000/vsx.md  | 25 ---
 .../powerpc/vec-replace-word-runnable.c   | 20 ++-
 4 files changed, 38 insertions(+), 35 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 5c988cc1152..846c0bafd45 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -3387,25 +3387,25 @@
   const vull __builtin_altivec_vpextd (vull, vull);
 VPEXTD vpextd {}
 
-  const vull __builtin_altivec_vreplace_un_uv2di (vull, unsigned long long, \
-  const int<4>);
+  const vuc __builtin_altivec_vreplace_un_uv2di (vull, unsigned long long, \
+ const int<4>);
 VREPLACE_UN_UV2DI vreplace_un_v2di {}
 
-  const vui __builtin_altivec_vreplace_un_uv4si (vui, unsigned int, \
+  const vuc __builtin_altivec_vreplace_un_uv4si (vui, unsigned int, \
  const int<4>);
 VREPLACE_UN_UV4SI vreplace_un_v4si {}
 
-  const vd __builtin_altivec_vreplace_un_v2df (vd, double, const int<4>);
+  const vuc __builtin_altivec_vreplace_un_v2df (vd, double, const int<4>);
 VREPLACE_UN_V2DF vreplace_un_v2df {}
 
-  const vsll __builtin_altivec_vreplace_un_v2di (vsll, signed long long, \
- const int<4>);
+  const vuc __builtin_altivec_vreplace_un_v2di (vsll, signed long long, \
+const int<4>);
 VREPLACE_UN_V2DI vreplace_un_v2di {}
 
-  const vf __builtin_altivec_vreplace_un_v4sf (vf, float, const int<4>);
+  const vuc __builtin_altivec_vreplace_un_v4sf (vf, float, const int<4>);
 VREPLACE_UN_V4SF vreplace_un_v4sf {}
 
-  const vsi __builtin_altivec_vreplace_un_v4si (vsi, signed int, const int<4>);
+  const vuc __builtin_altivec_vreplace_un_v4si (vsi, signed int, const int<4>);
 VREPLACE_UN_V4SI vreplace_un_v4si {}
 
   const vull __builtin_altivec_vreplace_uv2di (vull, unsigned long long, \
diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index 49a6104ddd2..44e2945aaa0 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -3059,17 +3059,17 @@
 VREPLACE_ELT_V2DF
 
 [VEC_REPLACE_UN, vec_replace_unaligned, __builtin_vec_replace_un]
-  vui __builtin_vec_replace_un (vui, unsigned int, const int);
+  vuc __builtin_vec_replace_un (vui, unsigned int, const int);
 VREPLACE_UN_UV4SI
-  vsi __builtin_vec_replace_un (vsi, signed int, const int);
+  vuc __builtin_vec_replace_un (vsi, signed int, const int);
 VREPLACE_UN_V4SI
-  vull __builtin_vec_replace_un (vull, unsigned long long, const int);
+  vuc __builtin_vec_replace_un (vull, unsigned long long, const int);
 VREPLACE_UN_UV2DI
-  vsll __builtin_vec_replace_un (vsll, signed long long, const int);
+  vuc __builtin_vec_replace_un (vsll, signed long long, const int);
 VREPLACE_UN_V2DI
-  vf __builtin_vec_replace_un (vf, float, const int);
+  vuc __builtin_vec_replace_un (vf, float, const int);
 VREPLACE_UN_V4SF
-  vd __builtin_vec_replace_un (vd, double, const int);
+  vuc __builtin_vec_replace_un (vd, double, const int);
 VREPLACE_UN_V2DF
 
 [VEC_REVB, vec_revb, __builtin_vec_revb]
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 2f5a2f7828d..b53de103872 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -4197,21 +4197,6 @

Re: [PATCH v3] rs6000: Fix some issues in rs6000_can_inline_p [PR102059]

2022-02-08 Thread Bill Schmidt via Gcc-patches
Hi!

From some discussion today, I think we want to limit the scope of
this patch to just the power8-fusion flag that's causing trouble for
now, given stage 4.  We've talked about making power8-fusion a
do-nothing flag, since it doesn't add much benefit now and probably
shouldn't be a separate flag anyway.  Having it as a meaningless
flag makes it more palatable to add an exception for it in the
inlining path.

Others, feel free to weigh in.

Thanks,
Bill

On 1/5/22 1:34 AM, Kewen.Lin wrote:
> Hi,
>
> This patch fixes the inconsistent behavior between non-LTO mode
> and LTO mode.  As Martin pointed out, the function
> rs6000_can_inline_p currently treats the callee as inlinable
> whenever callee_tree is NULL, which is unexpected; we should use the
> command line options from target_option_default_node as the default.
>
> It replaces rs6000_isa_flags with target_option_default_node when
> caller_tree is NULL, since that is more straightforward and is not
> affected by any bug that fails to keep rs6000_isa_flags at its default.
>
> It also extends the scope of the check to the case where the callee
> has explicitly set options.  Previously, inlining in test case
> pr102059-5.c could happen unexpectedly; that is fixed accordingly.
>
> As Richi/Mike pointed out, some tuning flags like MASK_P8_FUSION
> can be ignored for always_inline callees, so this patch also handles
> some such flags specially when the callee is marked always_inline.
>
> v1: https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578552.html
> v2: https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586112.html
>
> This patch is a re-post of the updated version [1], rebased and
> adjusted on top of the related commit r12-6219.
>
> Bootstrapped and regtested on powerpc64-linux-gnu P8 and
> powerpc64le-linux-gnu P9 and P10.
>
> Is it ok for trunk?
>
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586296.html
>
> BR,
> Kewen
> -
> gcc/ChangeLog:
>
>   PR target/102059
>   * config/rs6000/rs6000.c (rs6000_can_inline_p): Adjust with
>   target_option_default_node and consider always_inline_safe flags.
>
> gcc/testsuite/ChangeLog:
>
>   PR target/102059
>   * gcc.target/powerpc/pr102059-4.c: New test.
>   * gcc.target/powerpc/pr102059-5.c: New test.
>   * gcc.target/powerpc/pr102059-6.c: New test.
>   * gcc.target/powerpc/pr102059-7.c: New test.
>   * gcc.target/powerpc/pr102059-8.c: New test.
>   * gcc.dg/lto/pr102059-1_0.c: Remove unneeded option.
>
>


Re: [PATCH] rs6000: Add support for vmsumcud and vec_msumc

2022-02-08 Thread Bill Schmidt via Gcc-patches


On 2/8/22 9:45 AM, Segher Boessenkool wrote:
> On Mon, Feb 07, 2022 at 10:06:36PM -0600, Bill Schmidt wrote:
>> On 2/7/22 5:05 PM, Segher Boessenkool wrote:
>>> On Mon, Feb 07, 2022 at 04:20:24PM -0600, Bill Schmidt wrote:
>>>> I observed recently that a couple of Power10 instructions and built-in
>>>> functions were somehow not implemented.  This patch adds one of them
>>>> (vmsumcud).  Although this isn't normally stage-4 material, this is really
>>>> simple and carries no discernible risk, so I hope it can be considered.
>>> But what is the advantage?  That will be very tiny as well, afaics?
>>>
>>> Ah, this implements a builtin as well.  But that builtin is not in the
>>> PVIPR, so no one yet uses it most likely?
>> It's in the yet unpublished version of PVIPR that adds ISA 3.1 support,
>> currently awaiting public review.  It should have been implemented with
>> the rest of the ISA 3.1 built-ins.  (There are two more that were missed
>> as well, which I haven't yet addressed.)
> Ugh.  Too much process, not enough speed.
>
>>>> +;; vmsumcud
>>>> +(define_insn "vmsumcud"
>>>> +[(set (match_operand:V1TI 0 "register_operand" "+v")
>>>> +  (unspec:V1TI [(match_operand:V2DI 1 "register_operand" "v")
>>>> +(match_operand:V2DI 2 "register_operand" "v")
>>>> +  (match_operand:V1TI 3 "register_operand" "v")]
>>>> + UNSPEC_VMSUMCUD))]
>>>> +  "TARGET_POWER10"
>>>> +  "vmsumcud %0,%1,%2,%3"
>>>> +  [(set_attr "type" "vecsimple")]
>>>> +)
>>> This can be properly described in RTL instead of using an unspec.  This
>>> is much preferable.  I would say compare to maddhd[u], but those insns
>>> aren't implemented either (maddld is though).
>> Is it?  Note that vmsumcud produces the carry out of the final
>> result, not the result itself.  I couldn't immediately see how
>> to express this in RTL.
> It produces the top 128 bits of the (infinitely precise) result.  But
> yeah that requires an OImode here (for the temp itself), and we do not
> have that in the backend yet.
>
>> The full operation multiplies the corresponding lanes of each
>> doubleword of arguments 1 and 2, adds them together with the
>> 128-bit value in argument 3, and produces the carry out of the
>> result as a 128-bit value in the result.  I think I'd need to
>> have a 256-bit mode to express this properly in RTL, right?
> Not if you actually calculate the carry, instead of computing the
> 256-bit result and truncating it.  But this is very unwieldy (it
> would be fine if adding just two datums, but here there are three).
>
> Should the type be vecsimple?  Don't we have a type for multiplications?
> Hrm it looks like we use veccomplex usually.
>
> Okay for trunk with that taken care of.  Thanks!

Thanks!  Revised as requested and pushed as r12-7110 (943d631abdd7be623c).

Bill

>
>
> Segher


Re: [PATCH] rs6000: Add support for vmsumcud and vec_msumc

2022-02-07 Thread Bill Schmidt via Gcc-patches
Hi!

On 2/7/22 5:05 PM, Segher Boessenkool wrote:
> Hi!
>
> On Mon, Feb 07, 2022 at 04:20:24PM -0600, Bill Schmidt wrote:
>> I observed recently that a couple of Power10 instructions and built-in functions
>> were somehow not implemented.  This patch adds one of them (vmsumcud).  Although
>> this isn't normally stage-4 material, this is really simple and carries no
>> discernible risk, so I hope it can be considered.
> But what is the advantage?  That will be very tiny as well, afaics?
>
> Ah, this implements a builtin as well.  But that builtin is not in the
> PVIPR, so no one yet uses it most likely?

It's in the yet unpublished version of PVIPR that adds ISA 3.1 support,
currently awaiting public review.  It should have been implemented with
the rest of the ISA 3.1 built-ins.  (There are two more that were missed
as well, which I haven't yet addressed.)

>> gcc/
>>  * config/rs6000/rs6000-builtins.def (VMSUMCUD): New.
>>  * config/rs6000/rs6000-overload.def (VEC_MSUMC): New.
>>  * config/rs6000/vsx.md (UNSPEC_VMSUMCUD): New constant.
>>  (vmsumcud): New define_insn.
>>
>> gcc/testsuite/
>>  * gcc.target/powerpc/vec-msumc.c: New test.
>> +;; vmsumcud
>> +(define_insn "vmsumcud"
>> +[(set (match_operand:V1TI 0 "register_operand" "+v")
>> +  (unspec:V1TI [(match_operand:V2DI 1 "register_operand" "v")
>> +(match_operand:V2DI 2 "register_operand" "v")
>> +(match_operand:V1TI 3 "register_operand" "v")]
>> +   UNSPEC_VMSUMCUD))]
>> +  "TARGET_POWER10"
>> +  "vmsumcud %0,%1,%2,%3"
>> +  [(set_attr "type" "vecsimple")]
>> +)
> This can be properly described in RTL instead of using an unspec.  This
> is much preferable.  I would say compare to maddhd[u], but those insns
> aren't implemented either (maddld is though).

Is it?  Note that vmsumcud produces the carry out of the final
result, not the result itself.  I couldn't immediately see how
to express this in RTL.

The full operation multiplies the corresponding lanes of each
doubleword of arguments 1 and 2, adds them together with the
128-bit value in argument 3, and produces the carry out of the
result as a 128-bit value in the result.  I think I'd need to
have a 256-bit mode to express this properly in RTL, right?
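
For reference, here is a minimal scalar sketch of that operation in C,
assuming GCC's unsigned __int128 extension.  The name msumc_model is made
up for illustration and is not part of the patch.  It tracks the carry out
of each 128-bit addition instead of forming a 256-bit intermediate, so the
value it returns is the part of the sum above bit 127 (0, 1, or 2).

```c
#include <stdint.h>

typedef unsigned __int128 u128;

/* Hypothetical scalar model of the operation described above: multiply
   the corresponding doublewords, add both products to the 128-bit
   addend, and return only what overflows out of 128 bits.  */
static u128
msumc_model (uint64_t a0, uint64_t a1, uint64_t b0, uint64_t b1, u128 c)
{
  u128 p0 = (u128) a0 * b0;	/* 64x64 -> 128, cannot overflow.  */
  u128 p1 = (u128) a1 * b1;
  u128 carry = 0;

  u128 sum = p0 + p1;		/* Wraps modulo 2^128.  */
  carry += (sum < p0);		/* Wrapped iff the result got smaller.  */

  u128 total = sum + c;
  carry += (total < sum);

  return carry;
}
```

With the inputs used in vec-msumc.c ({111, 300}, {700, 222}, and an addend
of 2^128 - 1) this model gives 1, matching the expected value in the test.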

Thanks,
Bill

>
>
> Segher


[PATCH] rs6000: Add support for vmsumcud and vec_msumc

2022-02-07 Thread Bill Schmidt via Gcc-patches
Hi!

I observed recently that a couple of Power10 instructions and built-in functions
were somehow not implemented.  This patch adds one of them (vmsumcud).  Although
this isn't normally stage-4 material, this is really simple and carries no
discernible risk, so I hope it can be considered.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is this
okay for trunk?

Thanks!
Bill


2022-02-07  Bill Schmidt  

gcc/
* config/rs6000/rs6000-builtins.def (VMSUMCUD): New.
* config/rs6000/rs6000-overload.def (VEC_MSUMC): New.
* config/rs6000/vsx.md (UNSPEC_VMSUMCUD): New constant.
(vmsumcud): New define_insn.

gcc/testsuite/
* gcc.target/powerpc/vec-msumc.c: New test.
---
 gcc/config/rs6000/rs6000-builtins.def|  3 ++
 gcc/config/rs6000/rs6000-overload.def|  4 ++
 gcc/config/rs6000/vsx.md | 13 +++
 gcc/testsuite/gcc.target/powerpc/vec-msumc.c | 39 
 4 files changed, 59 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-msumc.c

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index d0ea54d77e4..846c0bafd45 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -3497,6 +3497,9 @@
   const signed int __builtin_altivec_vstrihr_p (vss);
 VSTRIHR_P vstrir_p_v8hi {}
 
+  const vuq __builtin_vsx_vmsumcud (vull, vull, vuq);
+VMSUMCUD vmsumcud {}
+
   const signed int __builtin_vsx_xvtlsbb_all_ones (vsc);
 XVTLSBB_ONES xvtlsbbo {}
 
diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index 5e38d597722..44e2945aaa0 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -2456,6 +2456,10 @@
   vuq __builtin_vec_msum (vull, vull, vuq);
 VMSUMUDM  VMSUMUDM_U
 
+[VEC_MSUMC, vec_msumc, __builtin_vec_msumc]
+  vuq __builtin_vec_msumc (vull, vull, vuq);
+VMSUMCUD
+
 [VEC_MSUMS, vec_msums, __builtin_vec_msums]
   vui __builtin_vec_msums (vus, vus, vui);
 VMSUMUHS
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 88053f11e29..e4904102526 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -372,6 +372,7 @@ (define_c_enum "unspec"
UNSPEC_REPLACE_UN
UNSPEC_VDIVES
UNSPEC_VDIVEU
+   UNSPEC_VMSUMCUD
UNSPEC_XXEVAL
UNSPEC_XXSPLTIW
UNSPEC_XXSPLTIDP
@@ -6615,3 +6616,15 @@ (define_split
   emit_move_insn (operands[0], tmp4);
   DONE;
 })
+
+;; vmsumcud
+(define_insn "vmsumcud"
+[(set (match_operand:V1TI 0 "register_operand" "+v")
+  (unspec:V1TI [(match_operand:V2DI 1 "register_operand" "v")
+(match_operand:V2DI 2 "register_operand" "v")
+   (match_operand:V1TI 3 "register_operand" "v")]
+  UNSPEC_VMSUMCUD))]
+  "TARGET_POWER10"
+  "vmsumcud %0,%1,%2,%3"
+  [(set_attr "type" "vecsimple")]
+)
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-msumc.c b/gcc/testsuite/gcc.target/powerpc/vec-msumc.c
new file mode 100644
index 000..524a2225c6c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-msumc.c
@@ -0,0 +1,39 @@
+/* { dg-do run { target { power10_hw } } } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+#include <altivec.h>
+
+#define DEBUG 0
+
+#if DEBUG
+#include <stdio.h>
+#endif
+
+extern void abort (void);
+
+int
+main ()
+{
+  vector unsigned long long arg1, arg2;
+  vector unsigned __int128 arg3, result, expected;
+  unsigned __int128 c = (unsigned __int128) (-1); /* 2^128 - 1 */
+
+  arg1 = (vector unsigned long long) { 111ULL, 300ULL };
+  arg2 = (vector unsigned long long) { 700ULL, 222ULL };
+  arg3 = (vector unsigned __int128) { c };
+  expected = (vector unsigned __int128) { 1 };
+
+  result = vec_msumc (arg1, arg2, arg3);
+  if (result[0] != expected[0])
+{
+#if DEBUG
+  printf ("ERROR, expected %d, result %d\n",
+ (unsigned int) expected[0],
+ (unsigned int) result[0]);
+#else
+  abort ();
+#endif
+}
+
+  return 0;
+}
-- 
2.27.0




Re: [PATCH 7/8] rs6000: vec_neg built-ins wrongly require POWER8

2022-02-07 Thread Bill Schmidt via Gcc-patches
Hi Segher,

Thanks for all the reviews for this series!  I'd like to gently ping the last 
two patches.

BR,
Bill

On 1/28/22 11:50 AM, Bill Schmidt via Gcc-patches wrote:
> As the subject states.  Fixing this is accomplished by moving the built-ins
> to the correct stanzas, [altivec] and [vsx].
>
> Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
> Is this okay for trunk?
>
> Thanks,
> Bill
>
>
> 2022-01-27  Bill Schmidt  
>
> gcc/
>   * config/rs6000/rs6000-builtins.def (NEG_V16QI): Move to [altivec]
>   stanza.
>   (NEG_V4SF): Likewise.
>   (NEG_V4SI): Likewise.
>   (NEG_V8HI): Likewise.
>   (NEG_V2DF): Move to [vsx] stanza.
>   (NEG_V2DI): Likewise.
> ---
>  gcc/config/rs6000/rs6000-builtins.def | 36 +--
>  1 file changed, 18 insertions(+), 18 deletions(-)
>
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index 2bb997a5279..c8f0cf332eb 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -410,6 +410,18 @@
>const vss __builtin_altivec_nabs_v8hi (vss);
>  NABS_V8HI nabsv8hi2 {}
>
> +  const vsc __builtin_altivec_neg_v16qi (vsc);
> +NEG_V16QI negv16qi2 {}
> +
> +  const vf __builtin_altivec_neg_v4sf (vf);
> +NEG_V4SF negv4sf2 {}
> +
> +  const vsi __builtin_altivec_neg_v4si (vsi);
> +NEG_V4SI negv4si2 {}
> +
> +  const vss __builtin_altivec_neg_v8hi (vss);
> +NEG_V8HI negv8hi2 {}
> +
>void __builtin_altivec_stvebx (vsc, signed long, void *);
>  STVEBX altivec_stvebx {stvec}
>
> @@ -1175,6 +1187,12 @@
>const vsll __builtin_altivec_nabs_v2di (vsll);
>  NABS_V2DI nabsv2di2 {}
>
> +  const vd __builtin_altivec_neg_v2df (vd);
> +NEG_V2DF negv2df2 {}
> +
> +  const vsll __builtin_altivec_neg_v2di (vsll);
> +NEG_V2DI negv2di2 {}
> +
>void __builtin_altivec_stvx_v2df (vd, signed long, void *);
>  STVX_V2DF altivec_stvx_v2df {stvec}
>
> @@ -2118,24 +2136,6 @@
>const vus __builtin_altivec_nand_v8hi_uns (vus, vus);
>  NAND_V8HI_UNS nandv8hi3 {}
>
> -  const vsc __builtin_altivec_neg_v16qi (vsc);
> -NEG_V16QI negv16qi2 {}
> -
> -  const vd __builtin_altivec_neg_v2df (vd);
> -NEG_V2DF negv2df2 {}
> -
> -  const vsll __builtin_altivec_neg_v2di (vsll);
> -NEG_V2DI negv2di2 {}
> -
> -  const vf __builtin_altivec_neg_v4sf (vf);
> -NEG_V4SF negv4sf2 {}
> -
> -  const vsi __builtin_altivec_neg_v4si (vsi);
> -NEG_V4SI negv4si2 {}
> -
> -  const vss __builtin_altivec_neg_v8hi (vss);
> -NEG_V8HI negv8hi2 {}
> -
>const vsc __builtin_altivec_orc_v16qi (vsc, vsc);
>  ORC_V16QI orcv16qi3 {}
>


[PATCH, committed] rs6000: Clean up ISA 3.1 documentation [PR100808]

2022-02-04 Thread Bill Schmidt via Gcc-patches
Hi!

PR100808 pointed out some trivial formatting issues with Power documentation
for basic ISA 3.1 built-in functions.  This patch cleans those up.

Tested on powerpc64le-linux-gnu, committed as obvious.

Thanks!
Bill


2022-02-04  Bill Schmidt  

gcc/
PR target/100808
* doc/extend.texi (Basic PowerPC Built-in Functions Available on ISA
3.1): Provide consistent type names.  Remove unnecessary semicolons.
Fix bad line breaks.
---
 gcc/doc/extend.texi | 71 +++--
 1 file changed, 43 insertions(+), 28 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index a961fc4e0a2..cb1b2b98ca8 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -18276,74 +18276,89 @@ The following built-in functions are available on Linux 64-bit systems
 that use a future architecture instruction set (@option{-mcpu=power10}):
 
 @smallexample
-@exdent unsigned long long int
-@exdent __builtin_cfuged (unsigned long long int, unsigned long long int)
+@exdent unsigned long long
+@exdent __builtin_cfuged (unsigned long long, unsigned long long)
 @end smallexample
 Perform a 64-bit centrifuge operation, as if implemented by the
 @code{cfuged} instruction.
 @findex __builtin_cfuged
 
 @smallexample
-@exdent unsigned long long int
-@exdent __builtin_cntlzdm (unsigned long long int, unsigned long long int)
+@exdent unsigned long long
+@exdent __builtin_cntlzdm (unsigned long long, unsigned long long)
 @end smallexample
 Perform a 64-bit count leading zeros operation under mask, as if
 implemented by the @code{cntlzdm} instruction.
 @findex __builtin_cntlzdm
 
 @smallexample
-@exdent unsigned long long int
-@exdent __builtin_cnttzdm (unsigned long long int, unsigned long long int)
+@exdent unsigned long long
+@exdent __builtin_cnttzdm (unsigned long long, unsigned long long)
 @end smallexample
 Perform a 64-bit count trailing zeros operation under mask, as if
 implemented by the @code{cnttzdm} instruction.
 @findex __builtin_cnttzdm
 
 @smallexample
-@exdent unsigned long long int
-@exdent __builtin_pdepd (unsigned long long int, unsigned long long int)
+@exdent unsigned long long
+@exdent __builtin_pdepd (unsigned long long, unsigned long long)
 @end smallexample
 Perform a 64-bit parallel bits deposit operation, as if implemented by the
 @code{pdepd} instruction.
 @findex __builtin_pdepd
 
 @smallexample
-@exdent unsigned long long int
-@exdent __builtin_pextd (unsigned long long int, unsigned long long int)
+@exdent unsigned long long
+@exdent __builtin_pextd (unsigned long long, unsigned long long)
 @end smallexample
 Perform a 64-bit parallel bits extract operation, as if implemented by the
 @code{pextd} instruction.
 @findex __builtin_pextd
 
 @smallexample
-@exdent vector signed __int128 vsx_xl_sext (signed long long, signed char *);
-@exdent vector signed __int128 vsx_xl_sext (signed long long, signed short *);
-@exdent vector signed __int128 vsx_xl_sext (signed long long, signed int *);
-@exdent vector signed __int128 vsx_xl_sext (signed long long, signed long long *);
-@exdent vector unsigned __int128 vsx_xl_zext (signed long long, unsigned char *);
-@exdent vector unsigned __int128 vsx_xl_zext (signed long long, unsigned short *);
-@exdent vector unsigned __int128 vsx_xl_zext (signed long long, unsigned int *);
-@exdent vector unsigned __int128 vsx_xl_zext (signed long long, unsigned long long *);
+@exdent vector signed __int128 vsx_xl_sext (signed long long, signed char *)
+
+@exdent vector signed __int128 vsx_xl_sext (signed long long, signed short *)
+
+@exdent vector signed __int128 vsx_xl_sext (signed long long, signed int *)
+
+@exdent vector signed __int128 vsx_xl_sext (signed long long, signed long long *)
+
+@exdent vector unsigned __int128 vsx_xl_zext (signed long long, unsigned char *)
+
+@exdent vector unsigned __int128 vsx_xl_zext (signed long long, unsigned short *)
+
+@exdent vector unsigned __int128 vsx_xl_zext (signed long long, unsigned int *)
+
+@exdent vector unsigned __int128 vsx_xl_zext (signed long long, unsigned long long *)
 @end smallexample
 
 Load (and sign extend) to an __int128 vector, as if implemented by the ISA 3.1
-@code{lxvrbx} @code{lxvrhx} @code{lxvrwx} @code{lxvrdx} instructions.
+@code{lxvrbx}, @code{lxvrhx}, @code{lxvrwx}, and  @code{lxvrdx} instructions.
 @findex vsx_xl_sext
 @findex vsx_xl_zext
 
 @smallexample
-@exdent void vec_xst_trunc (vector signed __int128, signed long long, signed char *);
-@exdent void vec_xst_trunc (vector signed __int128, signed long long, signed short *);
-@exdent void vec_xst_trunc (vector signed __int128, signed long long, signed int *);
-@exdent void vec_xst_trunc (vector signed __int128, signed long long, signed long long *);
-@exdent void vec_xst_trunc (vector unsigned __int128, signed long long, unsigned char *);
-@exdent void vec_xst_trunc (vector unsigned __int128, signed long long, unsigned short *);
-@exdent void vec_xst_trunc (vector

[PATCH v3 1/8] rs6000: More factoring of overload processing

2022-02-03 Thread Bill Schmidt via Gcc-patches
Hi!

Although the previous patch was correct, the logic around what to do when
the number of arguments is wrong was still hard to understand.  It should
be better now.  I'm now explicitly counting the number of expected arguments
and comparing against that.  The way the argument list is represented ensures
there is always at least one element in the argument chain, by terminating
the chain with an argument type of void, which is why the previous logic was
so convoluted.
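
To illustrate the counting idea, here is a minimal sketch in plain C.  It
does not use GCC's tree representation; the struct param chain and the
T_VOID sentinel are stand-ins I made up to mirror a parameter chain that
always ends in a void entry, and count_expected_args and arg_count_ok are
hypothetical names.

```c
#include <stddef.h>

/* Hypothetical stand-in for a parameter-type chain.  The chain is never
   empty because it always ends with a T_VOID sentinel entry.  */
enum ptype { T_VOID, T_INT, T_VEC };

struct param
{
  enum ptype type;
  const struct param *next;
};

/* Count the parameters that precede the void terminator.  */
static unsigned
count_expected_args (const struct param *chain)
{
  unsigned n = 0;
  for (; chain != NULL && chain->type != T_VOID; chain = chain->next)
    n++;
  return n;
}

/* Compare an explicit expected count against the number supplied.  */
static int
arg_count_ok (const struct param *chain, unsigned nargs)
{
  return count_expected_args (chain) == nargs;
}
```

Counting up front keeps the "is the chain exhausted?" question out of the
main comparison, which is the part that was hard to follow before.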

The revisions are in altivec_resolve_overloaded_builtin.  Otherwise the patch
is the same as before.  I hope this is much easier to read!  Bootstrapped and
tested on powerpc64le-linux-gnu.  Is this okay for trunk?

Original changelog message follows:

This patch continues the refactoring started with r12-6014.  I had previously
noted that the resolve_vec* routines can be further simplified by processing
the argument list earlier, so that all routines can use the arrays of arguments
and types.  I found that this was useful for some of the routines, but not for
all of them.

For several of the special-cased overloads, we don't specify all of the
possible type combinations in rs6000-overload.def, because the types don't
matter for the expansion we do.  For these, we can't use generic error message
handling when the number of arguments is incorrect, because the result is
misleading error messages that indicate argument types are wrong.

So this patch goes halfway and improves the factoring on the remaining special
cases, but leaves vec_splats, vec_promote, vec_extract, vec_insert, and
vec_step alone.

Thanks,
Bill


2022-02-02  Bill Schmidt  

gcc/
* config/rs6000/rs6000-c.cc (resolve_vec_mul): Accept args and types
parameters instead of arglist and nargs.  Simplify accordingly.  Remove
unnecessary test for argument count mismatch.
(resolve_vec_cmpne): Likewise.
(resolve_vec_adde_sube): Likewise.
(resolve_vec_addec_subec): Likewise.
(altivec_resolve_overloaded_builtin): Move overload special handling
after the gathering of arguments into args[] and types[] and the test
for correct number of arguments.  Don't perform the test for correct
number of arguments for certain special cases.  Call the other special
cases with args and types instead of arglist and nargs.
---
 gcc/config/rs6000/rs6000-c.cc | 304 ++
 1 file changed, 127 insertions(+), 177 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index 145421ab8f2..15251efc209 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -939,37 +939,25 @@ altivec_build_resolved_builtin (tree *args, int n, tree fntype, tree ret_type,
 enum resolution { unresolved, resolved, resolved_bad };
 
 /* Resolve an overloaded vec_mul call and return a tree expression for the
-   resolved call if successful.  NARGS is the number of arguments to the call.
-   ARGLIST contains the arguments.  RES must be set to indicate the status of
+   resolved call if successful.  ARGS contains the arguments to the call.
+   TYPES contains their types.  RES must be set to indicate the status of
the resolution attempt.  LOC contains statement location information.  */
 
 static tree
-resolve_vec_mul (resolution *res, vec<tree, va_gc> *arglist, unsigned nargs,
-location_t loc)
+resolve_vec_mul (resolution *res, tree *args, tree *types, location_t loc)
 {
   /* vec_mul needs to be special cased because there are no instructions for it
  for the {un}signed char, {un}signed short, and {un}signed int types.  */
-  if (nargs != 2)
-{
-  error ("builtin %qs only accepts 2 arguments", "vec_mul");
-  *res = resolved;
-  return error_mark_node;
-}
-
-  tree arg0 = (*arglist)[0];
-  tree arg0_type = TREE_TYPE (arg0);
-  tree arg1 = (*arglist)[1];
-  tree arg1_type = TREE_TYPE (arg1);
 
   /* Both arguments must be vectors and the types must be compatible.  */
-  if (TREE_CODE (arg0_type) != VECTOR_TYPE
-  || !lang_hooks.types_compatible_p (arg0_type, arg1_type))
+  if (TREE_CODE (types[0]) != VECTOR_TYPE
+  || !lang_hooks.types_compatible_p (types[0], types[1]))
 {
   *res = resolved_bad;
   return error_mark_node;
 }
 
-  switch (TYPE_MODE (TREE_TYPE (arg0_type)))
+  switch (TYPE_MODE (TREE_TYPE (types[0])))
 {
 case E_QImode:
 case E_HImode:
@@ -978,21 +966,21 @@ resolve_vec_mul (resolution *res, vec<tree, va_gc> *arglist, unsigned nargs,
 case E_TImode:
   /* For scalar types just use a multiply expression.  */
   *res = resolved;
-  return fold_build2_loc (loc, MULT_EXPR, TREE_TYPE (arg0), arg0,
- fold_convert (TREE_TYPE (arg0), arg1));
+  return fold_build2_loc (loc, MULT_EXPR, types[0], args[0],
+ fold_convert (types[0], args[1]));
 case E_SFmode:
   {
/* For floats use the xvmulsp 

Re: [PATCH v2 1/8] rs6000: More factoring of overload processing

2022-02-02 Thread Bill Schmidt via Gcc-patches
Hi!

On 2/1/22 3:48 PM, Segher Boessenkool wrote:
> On Tue, Feb 01, 2022 at 08:49:34AM -0600, Bill Schmidt wrote:
>> I've modified the previous patch to add more explanatory commentary about
>> the number-of-arguments test that was previously confusing, and to convert
>> the switch into an if-then-else chain.  The rest of the patch is unchanged.
>> Bootstrapped and tested on powerpc64le-linux-gnu.  Is this okay for trunk?
>> gcc/
>>  * config/rs6000/rs6000-c.cc (resolve_vec_mul): Accept args and types
>>  parameters instead of arglist and nargs.  Simplify accordingly.  Remove
>>  unnecessary test for argument count mismatch.
>>  (resolve_vec_cmpne): Likewise.
>>  (resolve_vec_adde_sube): Likewise.
>>  (resolve_vec_addec_subec): Likewise.
>>  (altivec_resolve_overloaded_builtin): Move overload special handling
>>  after the gathering of arguments into args[] and types[] and the test
>>  for correct number of arguments.  Don't perform the test for correct
>>  number of arguments for certain special cases.  Call the other special
>>  cases with args and types instead of arglist and nargs.
>> +  if (fcode != RS6000_OVLD_VEC_PROMOTE
>> +  && fcode != RS6000_OVLD_VEC_SPLATS
>> +  && fcode != RS6000_OVLD_VEC_EXTRACT
>> +  && fcode != RS6000_OVLD_VEC_INSERT
>> +  && fcode != RS6000_OVLD_VEC_STEP
>> +  && (!VOID_TYPE_P (TREE_VALUE (fnargs)) || n < nargs))
>>  return NULL;
> Please don't do De Morgan manually, let the compiler deal with it?
> Although even with that the logic is as clear as mud.  This matters if
> someone (maybe even you) will have to debug this later, or modify this.
> Maybe adding some suitably named variables can clarify things  here?

I can de-deMorgan this.  Do you want to see the patch again, or is it okay
with that change?
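
Something along these lines is what I have in mind, shown here as a
stand-alone sketch rather than the actual rs6000-c.cc code.  The enum
values, the helper names, and the chain_ended parameter (standing in for
the VOID_TYPE_P test) are all hypothetical; the point is only that named
booleans carry the same logic as the hand-expanded condition.

```c
#include <stdbool.h>

/* Hypothetical overload codes, for illustration only.  */
enum ovld
{
  OVLD_VEC_MUL,
  OVLD_VEC_PROMOTE,
  OVLD_VEC_SPLATS,
  OVLD_VEC_EXTRACT,
  OVLD_VEC_INSERT,
  OVLD_VEC_STEP
};

/* True for the special cases that do their own argument checking.  */
static bool
skips_arg_count_check (enum ovld fcode)
{
  return (fcode == OVLD_VEC_PROMOTE
	  || fcode == OVLD_VEC_SPLATS
	  || fcode == OVLD_VEC_EXTRACT
	  || fcode == OVLD_VEC_INSERT
	  || fcode == OVLD_VEC_STEP);
}

/* Same truth table as
     !special && (!chain_ended || n < nargs)
   but written with named intermediate results.  */
static bool
should_reject (enum ovld fcode, bool chain_ended, unsigned n, unsigned nargs)
{
  bool special_case = skips_arg_count_check (fcode);
  bool arg_count_matches = chain_ended && n >= nargs;
  return !special_case && !arg_count_matches;
}
```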

Thanks!
Bill

>
>> +  if (fcode == RS6000_OVLD_VEC_MUL)
>> +    returned_expr = resolve_vec_mul (&res, args, types, loc);
>> +  else if (fcode == RS6000_OVLD_VEC_CMPNE)
>> +    returned_expr = resolve_vec_cmpne (&res, args, types, loc);
>> +  else if (fcode == RS6000_OVLD_VEC_ADDE || fcode == RS6000_OVLD_VEC_SUBE)
>> +    returned_expr = resolve_vec_adde_sube (&res, fcode, args, types, loc);
>> +  else if (fcode == RS6000_OVLD_VEC_ADDEC || fcode == RS6000_OVLD_VEC_SUBEC)
>> +    returned_expr = resolve_vec_addec_subec (&res, fcode, args, types, loc);
>> +  else if (fcode == RS6000_OVLD_VEC_SPLATS || fcode == RS6000_OVLD_VEC_PROMOTE)
>> +    returned_expr = resolve_vec_splats (&res, fcode, arglist, nargs);
>> +  else if (fcode == RS6000_OVLD_VEC_EXTRACT)
>> +    returned_expr = resolve_vec_extract (&res, arglist, nargs, loc);
>> +  else if (fcode == RS6000_OVLD_VEC_INSERT)
>> +    returned_expr = resolve_vec_insert (&res, arglist, nargs, loc);
>> +  else if (fcode == RS6000_OVLD_VEC_STEP)
>> +    returned_expr = resolve_vec_step (&res, arglist, nargs);
>> +
>> +  if (res == resolved)
>> +return returned_expr;
> This is so convoluted because the functions do two things, and have two
> return values (res and returned_expr).
>
>
> Segher


Re: [PATCH] rs6000: Fix up PCH on powerpc* [PR104323]

2022-02-01 Thread Bill Schmidt via Gcc-patches
Hi!

Jakub, thanks for fixing this.  I didn't realize the PCH implications here, 
clearly...

On 2/1/22 12:33 PM, Segher Boessenkool wrote:
> Hi!
>
> On Tue, Feb 01, 2022 at 04:27:40PM +0100, Jakub Jelinek wrote:
>> +/* PR target/104323 */
>> +/* { dg-require-effective-target powerpc_altivec_ok } */
>> +/* { dg-options "-maltivec" } */
>> +
>> +#include <altivec.h>
>> testcase which I'm not including into testsuite because for some reason
>> the test fails on non-powerpc* targets (is done even on those and fails
>> because of missing altivec.h etc.),
> powerpc_altivec_ok returns false if the target isn't Power, you can use
> this in the testsuite fine?  Why does it still fail on other targets,
> the test should be SKIPPED there?
>
> Or wait, proc check_effective_target_powerpc_altivec_ok is broken, and
> does not implement its intention or documentation.  Will fix.
>
>> PCH is broken on powerpc*-*-* since the
>> new builtin generator has been introduced.
>> The generator contains or emits comments like:
>>   /*  Cannot mark this as a GC root because only pointer types can
>>  be marked as GTY((user)) and be GC roots.  All trees in here are
>>  kept alive by other globals, so not a big deal.  Alternatively,
>>  we could change the enum fields to ints and cast them in and out
>>  to avoid requiring a GTY((user)) designation, but that seems
>>  unnecessarily gross.  */
>> Having the fntypes stored in other GC roots can work fine for GC,
>> ggc_collect will then always mark them and so they won't disappear from
>> the tables, but it definitely doesn't work for PCH, which when the
>> arrays with fntype members aren't GTY marked means on PCH write we create
>> copies of those FUNCTION_TYPEs and store in *.gch that the GC roots should
>> be updated, but don't store that rs6000_builtin_info[?].fntype etc. should
>> be updated.  When PCH is read again, the blob is read at some other address,
>> GC roots are updated, rs6000_builtin_info[?].fntype contains garbage
>> pointers (GC freed pointers with random data, or random unrelated types or
>> other trees).
>> The following patch fixes that.  It stops any user markings because that
>> is totally unnecessary, just skips fields we don't need to mark and adds
>> GTY(()) to the 2 array variables.  We can get rid of all those global
>> vars for the fn types, they can be now automatic vars.
>> With the patch we get
>>   {
>>     &rs6000_instance_info[0].fntype,
>>     1 * (RS6000_INST_MAX),
>>     sizeof (rs6000_instance_info[0]),
>>     &gt_ggc_mx_tree_node,
>>     &gt_pch_nx_tree_node
>>   },
>>   {
>>     &rs6000_builtin_info[0].fntype,
>>     1 * (RS6000_BIF_MAX),
>>     sizeof (rs6000_builtin_info[0]),
>>     &gt_ggc_mx_tree_node,
>>     &gt_pch_nx_tree_node
>>   },
>> as the new roots which is exactly what we want and significantly more
>> compact than countless
>>   {
>>     &uv2di_ftype_pudi_usi,
>>     1,
>>     sizeof (uv2di_ftype_pudi_usi),
>>     &gt_ggc_mx_tree_node,
>>     &gt_pch_nx_tree_node
>>   },
>>   {
>>     &uv2di_ftype_lg_puv2di,
>>     1,
>>     sizeof (uv2di_ftype_lg_puv2di),
>>     &gt_ggc_mx_tree_node,
>>     &gt_pch_nx_tree_node
>>   },
>>   {
>>     &uv2di_ftype_lg_pudi,
>>     1,
>>     sizeof (uv2di_ftype_lg_pudi),
>>     &gt_ggc_mx_tree_node,
>>     &gt_pch_nx_tree_node
>>   },
>>   {
>>     &uv2di_ftype_di_puv2di,
>>     1,
>>     sizeof (uv2di_ftype_di_puv2di),
>>     &gt_ggc_mx_tree_node,
>>     &gt_pch_nx_tree_node
>>   },
>> cases (822 of these instead of just those 4 shown).
> Bill, can you review the builtin side of this?

Yes, I've just read through it and it looks just fine to me.
It's a big improvement over what I had there, even ignoring
the PCH issues.
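
For anyone else who missed the PCH angle as I did, here is a toy model in
plain C of why an unregistered pointer survives GC but not a PCH round
trip.  This is only an analogy under my own assumptions: there is no
gengtype or GTY machinery in it, and reload_at_new_address and struct node
are made-up names.  Objects are copied to a new address and only the
pointers listed as roots are fixed up.

```c
#include <stddef.h>

struct node
{
  int value;
};

/* Toy stand-in for writing a heap out and reading it back at a
   different address.  Only pointers registered in ROOTS are adjusted
   to point into the new copy; any other pointer keeps its old,
   now meaningless, address.  */
static void
reload_at_new_address (const struct node *old_heap, struct node *new_heap,
		       size_t n, struct node ***roots, size_t nroots)
{
  for (size_t i = 0; i < n; i++)
    new_heap[i] = old_heap[i];

  for (size_t r = 0; r < nroots; r++)
    *roots[r] = new_heap + (*roots[r] - old_heap);
}
```

A pointer that is merely kept alive by some other root is fine while
nothing moves, which is the GC case.  Once the data is reloaded somewhere
else, only the registered pointers follow it.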

Thanks again, Jakub!

Bill

>
>>  PR target/104323
>>  * config/rs6000/t-rs6000 (EXTRA_GTYPE_DEPS): Append rs6000-builtins.h
>>  rather than $(srcdir)/config/rs6000/rs6000-builtins.def.
>>  * config/rs6000/rs6000-gen-builtins.cc (write_decls): Don't use
>>  GTY((user)) for struct bifdata and struct ovlddata.  Instead add
>>  GTY((skip(""))) to members with pointer and enum types that don't need
>>  to be tracked.  Add GTY(()) to rs6000_builtin_info and 
>> rs6000_instance_info
>>  declarations.  Don't emit gt_ggc_mx and gt_pch_nx declarations.
> Nice :-)
>
>>  (write_extern_fntype, write_fntype): Remove.
>>  (write_fntype_init): Emit the fntype vars as automatic vars instead
>>  of file scope ones.
>>  (write_header_file): Don't iterate with write_extern_fntype.
>>  (write_init_file): Don't iterate with write_fntype.  Don't emit
>>  gt_ggc_mx and gt_pch_nx definitions.
>>if (tf_found)
>> -fprintf (init_file, "  if (float128_type_node)\n  ");
>> +fprintf (init_file,
>> + "  tree %s = NULL_TREE;\n  if (float128_type_node)\n",
>> + buf);
>>else if (dfp_found)
>> -fprintf (init_file, "  if (dfloat64_type_node)\n  ");
>> +fprintf (init_file,
>> + "  tree %s = NULL_TREE;\n  if (dfloat64_type_node)\n",
>> + buf);
> Things are 

[PATCH v2 3/8] rs6000: Unify error messages for built-in constant restrictions

2022-02-01 Thread Bill Schmidt via Gcc-patches
Hi!

As discussed, I simplified this patch by just changing how the error
message is produced:

We currently give different error messages for built-in functions that
violate range restrictions on their arguments, depending on whether we
record them as requiring an n-bit literal or a literal between two values.
It's better to be consistent.  Change the error message for the n-bit
literal to look like the other one.
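
As a rough sketch of the idea (not the code in rs6000-call.cc), an n-bit
unsigned literal restriction is just the range 0 .. 2^n - 1, so both cases
can go through one range-style formatter.  The function names and the
message wording below are assumptions for illustration; the real
diagnostic text is whatever the patch emits.

```c
#include <stdio.h>

/* Largest value representable in BITS unsigned bits.  Assumes BITS is
   small enough that the shift does not overflow a long.  */
static long
nbit_upper_bound (unsigned bits)
{
  return (1L << bits) - 1;
}

/* Hypothetical single formatter used for both restriction kinds.
   Returns the number of characters snprintf would have written.  */
static int
format_range_error (char *buf, size_t len, int argno, long lo, long hi)
{
  return snprintf (buf, len,
		   "argument %d must be a literal between %ld and %ld,"
		   " inclusive", argno, lo, hi);
}
```

A 5-bit restriction on argument 2 would then be reported through
format_range_error (buf, sizeof buf, 2, 0, nbit_upper_bound (5)), which is
the same path a plain range restriction takes.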

Bootstrapped and tested on powerpc64le-linux-gnu.  Is this okay for trunk?

Thanks!
Bill


2022-01-31  Bill Schmidt  

gcc/
* config/rs6000/rs6000-call.cc (rs6000_expand_builtin): Revise
error message for RES_BITS case.

gcc/testsuite/
* gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-data-class-10.c:
Adjust error messages.
* gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-data-class-2.c:
Likewise.
* gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-data-class-3.c:
Likewise.
* gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-data-class-4.c:
Likewise.
* gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-data-class-5.c:
Likewise.
* gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-data-class-9.c:
Likewise.
* gcc/testsuite/gcc.target/powerpc/bfp/vec-test-data-class-4.c:
Likewise.
* gcc/testsuite/gcc.target/powerpc/bfp/vec-test-data-class-5.c:
Likewise.
* gcc/testsuite/gcc.target/powerpc/bfp/vec-test-data-class-6.c:
Likewise.
* gcc/testsuite/gcc.target/powerpc/bfp/vec-test-data-class-7.c:
Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-12.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-14.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-17.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-19.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-2.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-22.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-24.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-27.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-29.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-32.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-34.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-37.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-39.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-4.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-42.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-44.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-47.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-49.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-52.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-54.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-57.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-59.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-62.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-64.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-67.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-69.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-7.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-72.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-74.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-77.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-79.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/dfp/dtstsfi-9.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/pr80315-1.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/pr80315-2.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/pr80315-3.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/pr80315-4.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/pr82015.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/pr91903.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_error.c:
Likewise.
* gcc/testsuite/gcc.target/powerpc/vec-ternarylogic-10.c: Likewise.
---
 gcc/config/rs6000/rs6000-call.cc  |  6 +-
 .../powerpc/bfp/scalar-test-data-class-10.c   |  2 +-
 .../powerpc/bfp/scalar-test-data-class-2.c|  2 +-
 .../powerpc/bfp/scalar-test-data-class-3.c|  2 +-
 .../powerpc/bfp/scalar-test-data-class-4.c|  2 +-
 .../powerpc/bfp/scalar-test-data-class-5.c|  2 +-
 .../powerpc/bfp/scalar-test-data-class-9.c|  2 +-
 .../powerpc/bfp/vec-test-data-class-4.c   |  2 +-
 .../powerpc/bfp/vec-test-data-class-5.c   |  2 +-
 .../powerpc/bfp/vec-test-data-class-6.c   |  2 +-
 .../powerpc

[PATCH v2 1/8] rs6000: More factoring of overload processing

2022-02-01 Thread Bill Schmidt via Gcc-patches
Hi,

I've modified the previous patch to add more explanatory commentary about
the number-of-arguments test that was previously confusing, and to convert
the switch into an if-then-else chain.  The rest of the patch is unchanged.
Bootstrapped and tested on powerpc64le-linux-gnu.  Is this okay for trunk?

Remainder of commit message follows:

This patch continues the refactoring started with r12-6014.  I had previously
noted that the resolve_vec* routines can be further simplified by processing
the argument list earlier, so that all routines can use the arrays of arguments
and types.  I found that this was useful for some of the routines, but not for
all of them.

For several of the special-cased overloads, we don't specify all of the
possible type combinations in rs6000-overload.def, because the types don't
matter for the expansion we do.  For these, we can't use generic error message
handling when the number of arguments is incorrect, because the result is
misleading error messages that indicate argument types are wrong.

So this patch goes halfway and improves the factoring on the remaining special
cases, but leaves vec_splats, vec_promote, vec_extract, vec_insert, and
vec_step alone.

Thanks!
Bill


2022-01-31  Bill Schmidt  

gcc/
* config/rs6000/rs6000-c.cc (resolve_vec_mul): Accept args and types
parameters instead of arglist and nargs.  Simplify accordingly.  Remove
unnecessary test for argument count mismatch.
(resolve_vec_cmpne): Likewise.
(resolve_vec_adde_sube): Likewise.
(resolve_vec_addec_subec): Likewise.
(altivec_resolve_overloaded_builtin): Move overload special handling
after the gathering of arguments into args[] and types[] and the test
for correct number of arguments.  Don't perform the test for correct
number of arguments for certain special cases.  Call the other special
cases with args and types instead of arglist and nargs.
---
 gcc/config/rs6000/rs6000-c.cc | 297 ++
 1 file changed, 120 insertions(+), 177 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index 145421ab8f2..4911e5f509c 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -939,37 +939,25 @@ altivec_build_resolved_builtin (tree *args, int n, tree fntype, tree ret_type,
 enum resolution { unresolved, resolved, resolved_bad };
 
 /* Resolve an overloaded vec_mul call and return a tree expression for the
-   resolved call if successful.  NARGS is the number of arguments to the call.
-   ARGLIST contains the arguments.  RES must be set to indicate the status of
+   resolved call if successful.  ARGS contains the arguments to the call.
+   TYPES contains their types.  RES must be set to indicate the status of
the resolution attempt.  LOC contains statement location information.  */
 
 static tree
-resolve_vec_mul (resolution *res, vec<tree, va_gc> *arglist, unsigned nargs,
-location_t loc)
+resolve_vec_mul (resolution *res, tree *args, tree *types, location_t loc)
 {
   /* vec_mul needs to be special cased because there are no instructions for it
  for the {un}signed char, {un}signed short, and {un}signed int types.  */
-  if (nargs != 2)
-{
-  error ("builtin %qs only accepts 2 arguments", "vec_mul");
-  *res = resolved;
-  return error_mark_node;
-}
-
-  tree arg0 = (*arglist)[0];
-  tree arg0_type = TREE_TYPE (arg0);
-  tree arg1 = (*arglist)[1];
-  tree arg1_type = TREE_TYPE (arg1);
 
   /* Both arguments must be vectors and the types must be compatible.  */
-  if (TREE_CODE (arg0_type) != VECTOR_TYPE
-  || !lang_hooks.types_compatible_p (arg0_type, arg1_type))
+  if (TREE_CODE (types[0]) != VECTOR_TYPE
+  || !lang_hooks.types_compatible_p (types[0], types[1]))
 {
   *res = resolved_bad;
   return error_mark_node;
 }
 
-  switch (TYPE_MODE (TREE_TYPE (arg0_type)))
+  switch (TYPE_MODE (TREE_TYPE (types[0])))
 {
 case E_QImode:
 case E_HImode:
@@ -978,21 +966,21 @@ resolve_vec_mul (resolution *res, vec<tree, va_gc> *arglist, unsigned nargs,
 case E_TImode:
   /* For scalar types just use a multiply expression.  */
   *res = resolved;
-  return fold_build2_loc (loc, MULT_EXPR, TREE_TYPE (arg0), arg0,
- fold_convert (TREE_TYPE (arg0), arg1));
+  return fold_build2_loc (loc, MULT_EXPR, types[0], args[0],
+ fold_convert (types[0], args[1]));
 case E_SFmode:
   {
/* For floats use the xvmulsp instruction directly.  */
*res = resolved;
tree call = rs6000_builtin_decls[RS6000_BIF_XVMULSP];
-   return build_call_expr (call, 2, arg0, arg1);
+   return build_call_expr (call, 2, args[0], args[1]);
   }
 case E_DFmode:
   {
/* For doubles use the xvmuldp instruction directly.  */
*res = resolved;
tree call = 

Re: [PATCH 4/8] rs6000: Consolidate target built-ins code

2022-01-31 Thread Bill Schmidt via Gcc-patches
Hi Segher,

On 1/31/22 3:32 PM, Segher Boessenkool wrote:
> Hi!
>
> On Fri, Jan 28, 2022 at 11:50:22AM -0600, Bill Schmidt wrote:
>> Continuing with the refactoring effort, this patch moves as much of the
>> target-specific built-in support code into a new file, rs6000-builtin.cc.
>> However, we can't easily move the overloading support code out of
>> rs6000-c.cc, because the build machinery understands that as a special file
>> to be included with the C and C++ front ends.
> And the other C-like frontends.
>
>> This patch is just a straightforward move, with one exception.  I found
>> that the builtin_mode_to_type[] array is no longer used, so I also removed
>> all code having to do with it.
> Oh nice, your rewrite removed the need for that array.  Great :-)
>
>> The code in rs6000-builtin.cc is organized in related sections:
>>  - General support functions
>>  - Initialization support
>>  - GIMPLE folding support
>>  - Expansion support
>>
>> Overloading support remains in rs6000-c.cc.
> So, what is needed to move that as well?  Is moving that in the plan?

No, as explained above, that code needs to stay in the "special" file
that the build machinery understands.  It looks very difficult to
tease that apart, so I've given up on that.  Sorry!

>
>>  * config/rs6000/rs6000-builtin.cc: New file, containing code moved
>>  from other files.
> (You're breaking lines early again.)
>
>> -extra_objs="${extra_objs} rs6000-builtins.o"
>> +extra_objs="${extra_objs} rs6000-builtins.o rs6000-builtin.o"
> It's pretty unfortunate that these files are named alike.  The source
> files exist in different places of course, so the danger of confusion
> is minimal usually.
>
>> +/* Support targetm.vectorize.builtin_mask_for_load.  */
>> +tree altivec_builtin_mask_for_load;
> "Support"?  What does that mean?  Please describe what this tree is.

That comment is just moved.  There's a target hook from the vectorizer
for Altivec-style unaligned load masking.  The target needs to provide
a built-in function for this.  The tree contains its function decl.
I can change the comment.

>
>> +/*  General support functions.  */
> This isn't a sentence so should not have a full stop.  (And otherwise
> it should be followed by two spaces!)
>
>> +bool
>> +rs6000_builtin_is_supported (enum rs6000_gen_builtins fncode)
>> +{
>> +  switch (rs6000_builtin_info[(size_t) fncode].enable)
>> +{
>> +case ENB_ALWAYS:
>> +  return true;
>> +case ENB_P5:
>> +  return TARGET_POPCNTB;
>> +case ENB_P6:
>> +  return TARGET_CMPB;
>> +case ENB_P6_64:
>> +  return TARGET_CMPB && TARGET_POWERPC64;
>> +case ENB_P7:
>> +  return TARGET_POPCNTD;
>> +case ENB_P7_64:
>> +  return TARGET_POPCNTD && TARGET_POWERPC64;
>> +case ENB_P8:
>> +  return TARGET_DIRECT_MOVE;
>> +case ENB_P8V:
>> +  return TARGET_P8_VECTOR;
>> +case ENB_P9:
>> +  return TARGET_MODULO;
>> +case ENB_P9_64:
>> +  return TARGET_MODULO && TARGET_POWERPC64;
>> +case ENB_P9V:
>> +  return TARGET_P9_VECTOR;
>> +case ENB_P10:
>> +  return TARGET_POWER10;
>> +case ENB_P10_64:
>> +  return TARGET_POWER10 && TARGET_POWERPC64;
>> +case ENB_ALTIVEC:
>> +  return TARGET_ALTIVEC;
>> +case ENB_VSX:
>> +  return TARGET_VSX;
>> +case ENB_CELL:
>> +  return TARGET_ALTIVEC && rs6000_cpu == PROCESSOR_CELL;
>> +case ENB_IEEE128_HW:
>> +  return TARGET_FLOAT128_HW;
>> +case ENB_DFP:
>> +  return TARGET_DFP;
>> +case ENB_CRYPTO:
>> +  return TARGET_CRYPTO;
>> +case ENB_HTM:
>> +  return TARGET_HTM;
>> +case ENB_MMA:
>> +  return TARGET_MMA;
>> +default:
>> +  gcc_unreachable ();
>> +}
>> +  gcc_unreachable ();
>> +}
> If you rewrite this without switch it is shorter and clearer, and you do
> not need to duplicate the gcc_unreachable (which the broken warning
> forces you to).
>
>> +  if (fcode >= RS6000_OVLD_MAX)
>> +return error_mark_node;
> This shows that that isn't really the max, it is the number of elts in
> the array, instead (maximum is inclusive).  Maybe fix that some day :-)
>
>> +/* Implement targetm.vectorize.builtin_md_vectorized_function.  */
>> +
>> +tree
>> +rs6000_builtin_md_vectorized_function (tree fndecl, tree type_out,
>> +   

Re: [PATCH 3/8] rs6000: Convert <x> built-in constraints to <x,y> form

2022-01-31 Thread Bill Schmidt via Gcc-patches
On 1/31/22 11:28 AM, Segher Boessenkool wrote:
> On Mon, Jan 31, 2022 at 11:21:32AM -0600, Bill Schmidt wrote:
>> On 1/28/22 5:24 PM, Segher Boessenkool wrote:
>>> On Fri, Jan 28, 2022 at 11:50:21AM -0600, Bill Schmidt wrote:
>>>> When introducing the new built-in support, I tried to match as many
>>>> existing error messages as possible.  One common form was "argument X must
>>>> be a Y-bit unsigned literal".  Another was "argument X must be a literal
>>>> between X' and  Y', inclusive".  During reviews, Segher requested that I
>>>> eventually convert all messages of the first form into the second form for
>>>> consistency.  That's what this patch does, replacing all <x>-form
>>>> constraints (first form) with <x,y>-form constraints (second form).
>>> Well, I asked for the error messages to be clearer and more consistent
>>> like that.  I don't think changing our source code like this is an
>>> improvement (*we* know what a 5-bit signed number is).  Do you think
>>> after your patch it is clearer and we will make fewer errors?
>> No, I don't think the patch is a particular improvement.  It sounds like
>> I may have misinterpreted what you were looking for here.  Please let me
>> know what I might do differently.
>>
>> For example, if we leave the <x> format in place in the source, I could
>> change the error messages that we produce to calculate the minimum and
>> maximum allowed values.  Then we'd still have the changes to the test
>> cases, but fewer changes to the source.  Thoughts?
> That is exactly what I asked for, and what I still think is the best
> option.  I haven't tried it out though, so there may be arguments
> against this :-)

Thanks for the clarification!  I'll make a run at it.

Bill

>
> Segher


Re: [PATCH 3/8] rs6000: Convert <x> built-in constraints to <x,y> form

2022-01-31 Thread Bill Schmidt via Gcc-patches
On 1/28/22 5:24 PM, Segher Boessenkool wrote:
> On Fri, Jan 28, 2022 at 11:50:21AM -0600, Bill Schmidt wrote:
>> When introducing the new built-in support, I tried to match as many
>> existing error messages as possible.  One common form was "argument X must
>> be a Y-bit unsigned literal".  Another was "argument X must be a literal
>> between X' and  Y', inclusive".  During reviews, Segher requested that I
>> eventually convert all messages of the first form into the second form for
>> consistency.  That's what this patch does, replacing all <x>-form
>> constraints (first form) with <x,y>-form constraints (second form).
> Well, I asked for the error messages to be clearer and more consistent
> like that.  I don't think changing our source code like this is an
> improvement (*we* know what a 5-bit signed number is).  Do you think
> after your patch it is clearer and we will make fewer errors?

No, I don't think the patch is a particular improvement.  It sounds like
I may have misinterpreted what you were looking for here.  Please let me
know what I might do differently.

For example, if we leave the <x> format in place in the source, I could
change the error messages that we produce to calculate the minimum and
maximum allowed values.  Then we'd still have the changes to the test
cases, but fewer changes to the source.  Thoughts?
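As a rough illustration of that alternative, the sketch below derives the
inclusive range from a bit width and formats it in the "literal between X'
and Y', inclusive" style quoted above.  This is not GCC code: the names
LiteralRange, range_for_bits and range_message are hypothetical, and the
signed case is included only because a 5-bit signed number is mentioned in
the review.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper type (not part of GCC): an inclusive literal range.
struct LiteralRange
{
  std::int64_t min;
  std::int64_t max;
};

// Compute the inclusive range of a BITS-wide literal.  An unsigned N-bit
// literal spans 0 .. 2^N - 1; a signed one spans -2^(N-1) .. 2^(N-1) - 1.
LiteralRange
range_for_bits (unsigned bits, bool is_signed)
{
  assert (bits >= 1 && bits <= 32);
  const std::int64_t one = 1;
  if (is_signed)
    return { -(one << (bits - 1)), (one << (bits - 1)) - 1 };
  return { 0, (one << bits) - 1 };
}

// Build a message in the second form from a first-form (bit-width)
// description of the constraint.
std::string
range_message (unsigned argno, unsigned bits, bool is_signed)
{
  const LiteralRange r = range_for_bits (bits, is_signed);
  return "argument " + std::to_string (argno)
	 + " must be a literal between " + std::to_string (r.min)
	 + " and " + std::to_string (r.max) + ", inclusive";
}
```

With this shape the source can keep describing a constraint by its width
while the diagnostic reports the computed bounds, so only the expected
messages in the test cases would need to change.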

Thanks,
Bill

>
> Segher


Re: [PATCH 2/8] rs6000: Don't #ifdef "short" built-in names

2022-01-28 Thread Bill Schmidt via Gcc-patches


On 1/28/22 2:32 PM, Segher Boessenkool wrote:
> On Fri, Jan 28, 2022 at 11:50:20AM -0600, Bill Schmidt wrote:
>> It was recently pointed out that we get anomalous behavior when using
>> __attribute__((target)) to select a CPU.  As an example, when building for
>> -mcpu=power8 but using __attribute__((target("cpu=power10"))), it is legal
>> to call __builtin_vec_mod, but not vec_mod, even though these are
>> equivalent.  This is because the equivalence is established with a #define
>> that is guarded by #ifdef _ARCH_PWR10.
> Yeah that is bad.
>
>> This goofy behavior occurs with both the old builtins support and the
>> new.  One of the goals of the new builtins support was to make sure all
>> appropriate interfaces are available using __attribute__((target)), so I
>> failed in this respect.  This patch corrects the problem by removing the
>> apply.  For example, #ifdef __PPU__ is still appropriate.
> "By removing the apply"...  What does that mean?

Er, wow.  Meant to say "by removing the #define."  Strange error... will fix.

Thanks for catching that!
Bill

>
> Nice cleanup (and nice bugfix of course).  Okay for trunk (with that
> comment improved a bit perhaps).  Thanks!
>
>
> Segher


Re: [PATCH 1/8] rs6000: More factoring of overload processing

2022-01-28 Thread Bill Schmidt via Gcc-patches


On 1/28/22 1:11 PM, Segher Boessenkool wrote:
> On Fri, Jan 28, 2022 at 11:50:19AM -0600, Bill Schmidt wrote:
>> This patch continues the refactoring started with r12-6014.
> ab3f5b71dc6e
>
>> + and the generic code will issue the appropriate error message.  Skip
>> + this test for functions where we don't fully describe all the possible
>> + overload signatures in rs6000-overload.def (because they aren't relevant
>> + to the expansion here).  If we don't, we get confusing error messages.  */
>> +  if (fcode != RS6000_OVLD_VEC_PROMOTE
>> +  && fcode != RS6000_OVLD_VEC_SPLATS
>> +  && fcode != RS6000_OVLD_VEC_EXTRACT
>> +  && fcode != RS6000_OVLD_VEC_INSERT
>> +  && fcode != RS6000_OVLD_VEC_STEP
>> +  && (!VOID_TYPE_P (TREE_VALUE (fnargs)) || n < nargs))
>>  return NULL;
> Can you expand a bit on this, give an example for example?  It is very
> hard to understand this code, the way it depends on code following many
> lines later.

Sure, sorry.

This check gives up if the number of arguments doesn't match the prototype.
It gives a fairly generic error message.  That part of it has always been
in here.

Now, I moved this check forward relative to the big switch statement on
fcode, because there are redundant checks for the number of arguments
in each of the resolve_vec_* helper functions.  This allowed me to simplify
those a bit.

Now, it turns out that this doesn't work so well for functions that aren't
fully described in rs6000-overload.def.  For example, for vec_splats we
have:

; There are no actual builtins for vec_splats.  There is special handling for
; this in altivec_resolve_overloaded_builtin in rs6000-c.cc, where the call
; is replaced by a constructor.  The single overload here causes
; __builtin_vec_splats to be registered with the front end so that can happen.
[VEC_SPLATS, vec_splats, __builtin_vec_splats]
  vsi __builtin_vec_splats (vsi);
ABS_V4SI SPLATS_FAKERY

So even though __builtin_vec_splats accepts all vector types, the
infrastructure cheats and just records one prototype.  We end up getting
an error message that refers to this specific prototype even when we are
handling a different argument type.  That is completely confusing to the
user.  So I felt I was starting to get too deep for a simple refactoring
patch, and gave up on early number-of-arguments checking for the special
cases that use the _FAKERY technique.

That's probably still not clear, but maybe clearer?
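If it helps, here is a toy model of the idea, written as a standalone
sketch rather than as the actual rs6000-c.cc code.  The enum Ovld and the
functions uses_placeholder_prototype and reject_on_arg_count are
hypothetical names; the exempted set mirrors the five overloads listed in
the patch.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-ins for a few overload codes named in this thread.
enum class Ovld
{
  VecMul,
  VecSplats,
  VecPromote,
  VecExtract,
  VecInsert,
  VecStep
};

// True when the one prototype recorded for CODE is only a placeholder and
// does not describe every signature the overload really accepts.
bool
uses_placeholder_prototype (Ovld code)
{
  static const std::set<Ovld> exempt = {
    Ovld::VecSplats, Ovld::VecPromote, Ovld::VecExtract,
    Ovld::VecInsert, Ovld::VecStep
  };
  return exempt.count (code) != 0;
}

// Returns true when the early, generic argument-count check should give
// up on the call.  Placeholder overloads skip the check, because the
// generic message would quote a prototype the user never wrote.
bool
reject_on_arg_count (Ovld code, unsigned expected, unsigned actual)
{
  if (uses_placeholder_prototype (code))
    return false;
  return expected != actual;
}
```

In this model a vec_mul call with three arguments is rejected early, while
a vec_splats call with the wrong count falls through to its own special
handling, which can issue a message that fits the call.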

>
>> +default:
>> +  ;
> Don't.
>
> I like this better than a BS break statement, but it is just as stupid.
>
> If you need this, you don't want a switch statement, but some number of
> if statements.  You cannot use a switch as a shorthand for this because
> we have a silly warning and -Werror for this use.
>
> You probably get easier to understand code that way, too, you can get
> rid of the above (just do some early returns), etc.

If I understand correctly, you'd like me to resubmit this in if-then-else
form.  That's fine, just want to be sure that's what you want.
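For concreteness, this is the shape I have in mind, as a toy sketch only.
The enum Ovld, the dispatch function and the string-returning helpers are
hypothetical stand-ins; the real helpers return trees and set a resolution
status.

```cpp
#include <cassert>
#include <string>

// Hypothetical overload codes; VecAbs stands for any overload that has no
// special handling.
enum class Ovld { VecMul, VecCmpne, VecAdde, VecSube, VecAbs };

// Toy stand-ins for the resolve_vec_* helpers; each just reports its name.
static std::string resolve_mul () { return "resolve_vec_mul"; }
static std::string resolve_cmpne () { return "resolve_vec_cmpne"; }
static std::string resolve_adde_sube () { return "resolve_vec_adde_sube"; }

// Dispatch with early returns.  There is no switch, so no empty default
// case is needed to keep the warning about unhandled enumerators quiet.
// An empty result means "fall through to the generic resolution path".
std::string
dispatch (Ovld code)
{
  if (code == Ovld::VecMul)
    return resolve_mul ();
  if (code == Ovld::VecCmpne)
    return resolve_cmpne ();
  if (code == Ovld::VecAdde || code == Ovld::VecSube)
    return resolve_adde_sube ();
  return std::string ();
}
```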

Thanks for the review!
Bill

>
>
> Segher


[PATCH 8/8] rs6000: Fix some missing built-in attributes [PR104004]

2022-01-28 Thread Bill Schmidt via Gcc-patches
PR104004 caught some misses on my part in converting to the new built-in
function infrastructure.  In particular, I forgot to mark all of the "nosoft"
built-ins, and one of those should also have been marked "no32bit".

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
Is this okay for trunk?

Thanks,
Bill


2022-01-27  Bill Schmidt  

gcc/
* config/rs6000/rs6000-builtins.def (MFFSL): Mark nosoft.
(MTFSB0): Likewise.
(MTFSB1): Likewise.
(SET_FPSCR_RN): Likewise.
(SET_FPSCR_DRN): Mark nosoft and no32bit.
---
 gcc/config/rs6000/rs6000-builtins.def | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index c8f0cf332eb..98619a649e3 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -215,7 +215,7 @@
 ; processors, this builtin automatically falls back to mffs on older
 ; platforms.  Thus it appears here in the [always] stanza.
   double __builtin_mffsl ();
-MFFSL rs6000_mffsl {}
+MFFSL rs6000_mffsl {nosoft}
 
 ; This is redundant with __builtin_pack_ibm128, as it requires long
 ; double to be __ibm128.  Should probably be deprecated.
@@ -226,10 +226,10 @@
 MFTB rs6000_mftb_di {32bit}
 
   void __builtin_mtfsb0 (const int<0,31>);
-MTFSB0 rs6000_mtfsb0 {}
+MTFSB0 rs6000_mtfsb0 {nosoft}
 
   void __builtin_mtfsb1 (const int<0,31>);
-MTFSB1 rs6000_mtfsb1 {}
+MTFSB1 rs6000_mtfsb1 {nosoft}
 
   void __builtin_mtfsf (const int<0,255>, double);
 MTFSF rs6000_mtfsf {}
@@ -238,7 +238,7 @@
 PACK_IF packif {}
 
   void __builtin_set_fpscr_rn (const int[0,3]);
-SET_FPSCR_RN rs6000_set_fpscr_rn {}
+SET_FPSCR_RN rs6000_set_fpscr_rn {nosoft}
 
   const double __builtin_unpack_ibm128 (__ibm128, const int<0,1>);
 UNPACK_IF unpackif {}
@@ -2969,7 +2969,7 @@
 PACK_TD packtd {}
 
   void __builtin_set_fpscr_drn (const int[0,7]);
-SET_FPSCR_DRN rs6000_set_fpscr_drn {}
+SET_FPSCR_DRN rs6000_set_fpscr_drn {nosoft,no32bit}
 
   const unsigned long long __builtin_unpack_dec128 (_Decimal128, \
 const int<0,1>);
-- 
2.27.0



[PATCH 7/8] rs6000: vec_neg built-ins wrongly require POWER8

2022-01-28 Thread Bill Schmidt via Gcc-patches
As the subject states.  Fixing this is accomplished by moving the built-ins
to the correct stanzas, [altivec] and [vsx].

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
Is this okay for trunk?

Thanks,
Bill


2022-01-27  Bill Schmidt  

gcc/
* config/rs6000/rs6000-builtins.def (NEG_V16QI): Move to [altivec]
stanza.
(NEG_V4SF): Likewise.
(NEG_V4SI): Likewise.
(NEG_V8HI): Likewise.
(NEG_V2DF): Move to [vsx] stanza.
(NEG_V2DI): Likewise.
---
 gcc/config/rs6000/rs6000-builtins.def | 36 +--
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 2bb997a5279..c8f0cf332eb 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -410,6 +410,18 @@
   const vss __builtin_altivec_nabs_v8hi (vss);
 NABS_V8HI nabsv8hi2 {}
 
+  const vsc __builtin_altivec_neg_v16qi (vsc);
+NEG_V16QI negv16qi2 {}
+
+  const vf __builtin_altivec_neg_v4sf (vf);
+NEG_V4SF negv4sf2 {}
+
+  const vsi __builtin_altivec_neg_v4si (vsi);
+NEG_V4SI negv4si2 {}
+
+  const vss __builtin_altivec_neg_v8hi (vss);
+NEG_V8HI negv8hi2 {}
+
   void __builtin_altivec_stvebx (vsc, signed long, void *);
 STVEBX altivec_stvebx {stvec}
 
@@ -1175,6 +1187,12 @@
   const vsll __builtin_altivec_nabs_v2di (vsll);
 NABS_V2DI nabsv2di2 {}
 
+  const vd __builtin_altivec_neg_v2df (vd);
+NEG_V2DF negv2df2 {}
+
+  const vsll __builtin_altivec_neg_v2di (vsll);
+NEG_V2DI negv2di2 {}
+
   void __builtin_altivec_stvx_v2df (vd, signed long, void *);
 STVX_V2DF altivec_stvx_v2df {stvec}
 
@@ -2118,24 +2136,6 @@
   const vus __builtin_altivec_nand_v8hi_uns (vus, vus);
 NAND_V8HI_UNS nandv8hi3 {}
 
-  const vsc __builtin_altivec_neg_v16qi (vsc);
-NEG_V16QI negv16qi2 {}
-
-  const vd __builtin_altivec_neg_v2df (vd);
-NEG_V2DF negv2df2 {}
-
-  const vsll __builtin_altivec_neg_v2di (vsll);
-NEG_V2DI negv2di2 {}
-
-  const vf __builtin_altivec_neg_v4sf (vf);
-NEG_V4SF negv4sf2 {}
-
-  const vsi __builtin_altivec_neg_v4si (vsi);
-NEG_V4SI negv4si2 {}
-
-  const vss __builtin_altivec_neg_v8hi (vss);
-NEG_V8HI negv8hi2 {}
-
   const vsc __builtin_altivec_orc_v16qi (vsc, vsc);
 ORC_V16QI orcv16qi3 {}
 
-- 
2.27.0



[PATCH 6/8] rs6000: Remove -m[no-]fold-gimple flag [PR103686]

2022-01-28 Thread Bill Schmidt via Gcc-patches
The -m[no-]fold-gimple flag was really intended primarily for internal
testing while implementing GIMPLE folding for rs6000 vector built-in
functions.  It ended up leaking into other places, causing problems such
as PR103686 identifies.  Let's remove it.

There are a number of tests in the testsuite that require adjustment.
Some specify -mfold-gimple directly, which is the default, so that is
handled by removing the option.  Others unnecessarily specify
-mno-fold-gimple, as the tests work fine without this.  Again that is
handled by removing the option.  There are a couple of extra variants of
tests specifically for -mno-fold-gimple; for those, we can just remove the
whole test.

gcc.target/powerpc/builtins-1.c was more problematic.  It was written in
such a way as to be extremely fragile.  For this one, I rewrote the whole
test in a different style, using individual functions to test each
built-in function.  These same tests are also largely covered by
builtins-1-be-folded.c and builtins-1-le-folded.c, so I chose to
explicitly make this test -mbig for simplicity, and use -O2 for clean code
generation.  I made some slight modifications to the expected instruction
counts as a result, and tested on both 32- and 64-bit.  Most instruction
count tests now use the {\m ... \M} style, but I wasn't able to figure out
how to get this right for vcmpequd. and vcmpgtud.  Using \. didn't do the
trick, and I got tired of messing with it.  I can change those if you
suggest the proper incantation for an opcode ending with a period.
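A possible explanation, offered as a guess rather than something I have
checked against the testsuite: Tcl documents \M as matching only at the end
of a word, and a period is not a word character, so a \M placed right after
\. cannot match.  Dropping the trailing \M and keeping only the leading \m
may be enough.  The sketch below shows the same effect with C++ std::regex,
where \b stands in for Tcl's \m and \M (an assumption made for illustration;
the two syntaxes are not identical).  The helper name matches is
hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true when PATTERN (ECMAScript syntax) is found anywhere in LINE.
bool
matches (const std::string &pattern, const std::string &line)
{
  return std::regex_search (line, std::regex (pattern));
}
```

For an assembly line such as "vcmpequd. 2,2,3", the pattern \bvcmpequd\.
is found, while \bvcmpequd\.\b is not, because there is no word boundary
between the period and the following space.  The first pattern also does
not match the plain vcmpequd form, since it still requires the period.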

Bootstrapped and tested on powerpc64le-linux-gnu and on
powerpc64-linux-gnu (32- and 64-bit) with no regressions.
Is this okay for trunk?

Thanks,
Bill


2022-01-27  Bill Schmidt  

gcc/
PR target/103686
* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_builtin):
Remove test for !rs6000_fold_gimple.
* config/rs6000/rs6000.cc (rs6000_option_override_internal):
Likewise.
* config/rs6000/rs6000.opt (mfold-gimple): Remove.

gcc/testsuite/
PR target/103686
* gcc.target/powerpc/builtins-1-be-folded.c: Remove -mfold-gimple
option.
* gcc.target/powerpc/builtins-1-le-folded.c: Likewise.
* gcc.target/powerpc/builtins-1.c: Rewrite to use small functions
and restrict to -O2 -mbig for predictability.  Adjust instruction
counts.
* gcc.target/powerpc/builtins-5.c: Remove -mno-fold-gimple
option.
* gcc.target/powerpc/p8-vec-xl-xst.c: Likewise.
* gcc.target/powerpc/pr83926.c: Likewise.
* gcc.target/powerpc/pr86731-nogimplefold-longlong.c: Delete.
* gcc.target/powerpc/pr86731-nogimplefold.c: Delete.
* gcc.target/powerpc/swaps-p8-17.c: Remove -mno-fold-gimple
option.
---
 gcc/config/rs6000/rs6000-builtin.cc   |3 -
 gcc/config/rs6000/rs6000.cc   |4 -
 gcc/config/rs6000/rs6000.opt  |4 -
 .../gcc.target/powerpc/builtins-1-be-folded.c |2 +-
 .../gcc.target/powerpc/builtins-1-le-folded.c |2 +-
 gcc/testsuite/gcc.target/powerpc/builtins-1.c | 1210 +
 gcc/testsuite/gcc.target/powerpc/builtins-5.c |3 +-
 .../gcc.target/powerpc/p8-vec-xl-xst.c|3 +-
 gcc/testsuite/gcc.target/powerpc/pr83926.c|3 +-
 .../powerpc/pr86731-nogimplefold-longlong.c   |   32 -
 .../gcc.target/powerpc/pr86731-nogimplefold.c |   63 -
 .../gcc.target/powerpc/swaps-p8-17.c  |3 +-
 12 files changed, 951 insertions(+), 381 deletions(-)
 delete mode 100644 gcc/testsuite/gcc.target/powerpc/pr86731-nogimplefold-longlong.c
 delete mode 100644 gcc/testsuite/gcc.target/powerpc/pr86731-nogimplefold.c

diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
index 163287f2b67..dc9e3a4df1d 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -1299,9 +1299,6 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
   fprintf (stderr, "rs6000_gimple_fold_builtin %d %s %s\n",
   fn_code, fn_name1, fn_name2);
 
-  if (!rs6000_fold_gimple)
-return false;
-
   /* Prevent gimple folding for code that does not have a LHS, unless it is
  allowed per the rs6000_builtin_valid_without_lhs helper function.  */
   if (!gimple_call_lhs (stmt)
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index d27e1ec4a60..a4acb5d1f43 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3851,10 +3851,6 @@ rs6000_option_override_internal (bool global_init_p)
   & OPTION_MASK_DIRECT_MOVE))
 rs6000_isa_flags |= ~rs6000_isa_flags_explicit & OPTION_MASK_STRICT_ALIGN;
 
-  if (!rs6000_fold_gimple)
- fprintf (stderr,
- "gimple folding of rs6000 builtins has been disabled.\n");
-
   /* Add some warnings for VSX.  */
   if (TARGET_VSX)
 {
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
in

[PATCH 5/8] rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]

2022-01-28 Thread Bill Schmidt via Gcc-patches
These built-ins were misimplemented as always having big-endian semantics.
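To make the problem concrete, here is a small model of the semantics as I
understand them; it is a sketch, not the GCC implementation.  The register
is modeled as 16 bytes in big-endian order, the two instruction models
count bytes whose least-significant bit is zero from either end, and the
hypothetical wrappers vec_cntlz_lsbb and vec_cnttz_lsbb pick the
instruction according to the element order, which is reversed on
little-endian.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// A vector register modeled as 16 bytes in big-endian (register) order.
using Reg = std::array<std::uint8_t, 16>;

// Model of vclzlsbb: count bytes from register byte 0 upward whose
// least-significant bit is zero, stopping at the first byte with LSB set.
int
vclzlsbb (const Reg &r)
{
  int n = 0;
  while (n < 16 && (r[n] & 1) == 0)
    ++n;
  return n;
}

// Model of vctzlsbb: the same count, starting from register byte 15.
int
vctzlsbb (const Reg &r)
{
  int n = 0;
  while (n < 16 && (r[15 - n] & 1) == 0)
    ++n;
  return n;
}

// The built-ins count from element 0 in the target's element order.  On
// little-endian, element 0 lives in register byte 15, so the "leading"
// count needs the trailing-count instruction, and vice versa.
int
vec_cntlz_lsbb (const Reg &r, bool little_endian)
{
  return little_endian ? vctzlsbb (r) : vclzlsbb (r);
}

int
vec_cnttz_lsbb (const Reg &r, bool little_endian)
{
  return little_endian ? vclzlsbb (r) : vctzlsbb (r);
}
```

This matches the swap in rs6000-builtins.def below, where the default
pattern for each built-in becomes the opposite instruction and the
{endian} flag selects the original pattern for big-endian.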

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
Is this okay for trunk?

Thanks,
Bill


2022-01-18  Bill Schmidt  

gcc/
PR target/95082
* config/rs6000/rs6000-builtin.cc (rs6000_expand_builtin): Handle
endianness for vclzlsbb and vctzlsbb.
* config/rs6000/rs6000-builtins.def (VCLZLSBB_V16QI): Change
default pattern and indicate a different pattern will be used for
big endian.
(VCLZLSBB_V4SI): Likewise.
(VCLZLSBB_V8HI): Likewise.
(VCTZLSBB_V16QI): Likewise.
(VCTZLSBB_V4SI): Likewise.
(VCTZLSBB_V8HI): Likewise.

gcc/testsuite/
PR target/95082
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c: Restrict to -mbig.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-1.c: Likewise.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c: New.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-4.c: New.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-0.c: Restrict to -mbig.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-1.c: Likewise.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-3.c: New.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-4.c: New.
---
 gcc/config/rs6000/rs6000-builtin.cc   | 12 
 gcc/config/rs6000/rs6000-builtins.def | 12 ++--
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c |  2 +-
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-1.c |  2 +-
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c | 15 +++
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-4.c | 15 +++
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-0.c |  2 +-
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-1.c |  2 +-
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-3.c | 15 +++
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-4.c | 15 +++
 10 files changed, 82 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-4.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-3.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-4.c

diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
index 191a6108a5e..163287f2b67 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -3485,6 +3485,18 @@ rs6000_expand_builtin (tree exp, rtx target, rtx /* subtarget */,
icode = CODE_FOR_vsx_store_v8hi;
   else if (fcode == RS6000_BIF_ST_ELEMREV_V16QI)
icode = CODE_FOR_vsx_store_v16qi;
+  else if (fcode == RS6000_BIF_VCLZLSBB_V16QI)
+   icode = CODE_FOR_vclzlsbb_v16qi;
+  else if (fcode == RS6000_BIF_VCLZLSBB_V4SI)
+   icode = CODE_FOR_vclzlsbb_v4si;
+  else if (fcode == RS6000_BIF_VCLZLSBB_V8HI)
+   icode = CODE_FOR_vclzlsbb_v8hi;
+  else if (fcode == RS6000_BIF_VCTZLSBB_V16QI)
+   icode = CODE_FOR_vctzlsbb_v16qi;
+  else if (fcode == RS6000_BIF_VCTZLSBB_V4SI)
+   icode = CODE_FOR_vctzlsbb_v4si;
+  else if (fcode == RS6000_BIF_VCTZLSBB_V8HI)
+   icode = CODE_FOR_vctzlsbb_v8hi;
   else
gcc_unreachable ();
 }
diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index cfe31c2e7de..2bb997a5279 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -2551,13 +2551,13 @@
 VBPERMD altivec_vbpermd {}
 
   const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
-VCLZLSBB_V16QI vclzlsbb_v16qi {}
+VCLZLSBB_V16QI vctzlsbb_v16qi {endian}
 
   const signed int __builtin_altivec_vclzlsbb_v4si (vsi);
-VCLZLSBB_V4SI vclzlsbb_v4si {}
+VCLZLSBB_V4SI vctzlsbb_v4si {endian}
 
   const signed int __builtin_altivec_vclzlsbb_v8hi (vss);
-VCLZLSBB_V8HI vclzlsbb_v8hi {}
+VCLZLSBB_V8HI vctzlsbb_v8hi {endian}
 
   const vsc __builtin_altivec_vctzb (vsc);
 VCTZB ctzv16qi2 {}
@@ -2572,13 +2572,13 @@
 VCTZW ctzv4si2 {}
 
   const signed int __builtin_altivec_vctzlsbb_v16qi (vsc);
-VCTZLSBB_V16QI vctzlsbb_v16qi {}
+VCTZLSBB_V16QI vclzlsbb_v16qi {endian}
 
   const signed int __builtin_altivec_vctzlsbb_v4si (vsi);
-VCTZLSBB_V4SI vctzlsbb_v4si {}
+VCTZLSBB_V4SI vclzlsbb_v4si {endian}
 
   const signed int __builtin_altivec_vctzlsbb_v8hi (vss);
-VCTZLSBB_V8HI vctzlsbb_v8hi {}
+VCTZLSBB_V8HI vclzlsbb_v8hi {endian}
 
   const signed int __builtin_altivec_vcmpaeb_p (vsc, vsc);
 VCMPAEB_P vector_ae_v16qi_p {}
diff --git a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c
index 0faf233425e..dc92d6fdd65 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c
@@ -1,6 +1,6 @@
 /* { dg-do compile { target { powerpc

[PATCH 3/8] rs6000: Convert <x> built-in constraints to <x,y> form

2022-01-28 Thread Bill Schmidt via Gcc-patches
When introducing the new built-in support, I tried to match as many
existing error messages as possible.  One common form was "argument X must
be a Y-bit unsigned literal".  Another was "argument X must be a literal
between X' and  Y', inclusive".  During reviews, Segher requested that I
eventually convert all messages of the first form into the second form for
consistency.  That's what this patch does, replacing all <x>-form
constraints (first form) with <x,y>-form constraints (second form).

For the moment, the parser will still accept <x> arguments, but I've added
a note in rs6000-builtins.def that this form is deprecated in favor of
<x,y>.  I think it's harmless to leave it in, in case a desire for the
distinction comes up in the future.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
Is this okay for trunk?

Thanks,
Bill


2022-01-12  Bill Schmidt  

gcc/
* config/rs6000/rs6000-builtins.def (MTFSB0): Replace <x>-form
constraints with <x,y>-form constraints.
(MTFSB1): Likewise.
(MTFSF): Likewise.
(UNPACK_IF): Likewise.
(UNPACK_TF): Likewise.
(DSS): Likewise.
(DST): Likewise.
(DSTST): Likewise.
(DSTSTT): Likewise.
(DSTT): Likewise.
(VCFSX): Likewise.
(VCFUX): Likewise.
(VCTSXS): Likewise.
(VCTUXS): Likewise.
(VSLDOI_16QI): Likewise.
(VSLDOI_4SF): Likewise.
(VSLDOI_4SI): Likewise.
(VSLDOI_8HI): Likewise.
(VSPLTB): Likewise.
(VSPLTH): Likewise.
(VSPLTW): Likewise.
(VEC_SET_V16QI): Likewise.
(VEC_SET_V4SF): Likewise.
(VEC_SET_V4SI): Likewise.
(VEC_SET_V8HI): Likewise.
(VSLDOI_2DF): Likewise.
(VSLDOI_2DI): Likewise.
(VEC_SET_V2DF): Likewise.
(VEC_SET_V2DI): Likewise.
(XVCVSXDDP_SCALE): Likewise.
(XVCVUXDDP_SCALE): Likewise.
(XXPERMDI_16QI): Likewise.
(XXPERMDI_1TI): Likewise.
(XXPERMDI_2DF): Likewise.
(XXPERMDI_2DI): Likewise.
(XXPERMDI_4SF): Likewise.
(XXPERMDI_4SI): Likewise.
(XXPERMDI_8HI): Likewise.
(XXSLDWI_16QI): Likewise.
(XXSLDWI_2DF): Likewise.
(XXSLDWI_2DI): Likewise.
(XXSLDWI_4SF): Likewise.
(XXSLDWI_4SI): Likewise.
(XXSLDWI_8HI): Likewise.
(XXSPLTD_V2DF): Likewise.
(XXSPLTD_V2DI): Likewise.
(UNPACK_V1TI): Likewise.
(BCDADD_V1TI): Likewise.
(BCDADD_V16QI): Likewise.
(BCDADD_EQ_V1TI): Likewise.
(BCDADD_EQ_V16QI): Likewise.
(BCDADD_GT_V1TI): Likewise.
(BCDADD_GT_V16QI): Likewise.
(BCDADD_LT_V1TI): Likewise.
(BCDADD_LT_V16QI): Likewise.
(BCDADD_OV_V1TI): Likewise.
(BCDADD_OV_V16QI): Likewise.
(BCDSUB_V1TI): Likewise.
(BCDSUB_V16QI): Likewise.
(BCDSUB_EQ_V1TI): Likewise.
(BCDSUB_EQ_V16QI): Likewise.
(BCDSUB_GT_V1TI): Likewise.
(BCDSUB_GT_V16QI): Likewise.
(BCDSUB_LT_V1TI): Likewise.
(BCDSUB_LT_V16QI): Likewise.
(BCDSUB_OV_V1TI): Likewise.
(BCDSUB_OV_V16QI): Likewise.
(VSTDCDP): Likewise.
(VSTDCSP): Likewise.
(VTDCDP): Likewise.
(VTDCSP): Likewise.
(TSTSFI_EQ_DD): Likewise.
(TSTSFI_EQ_TD): Likewise.
(TSTSFI_GT_DD): Likewise.
(TSTSFI_GT_TD): Likewise.
(TSTSFI_LT_DD): Likewise.
(TSTSFI_LT_TD): Likewise.
(TSTSFI_OV_DD): Likewise.
(TSTSFI_OV_TD): Likewise.
(VSTDCQP): Likewise.
(DDEDPD): Likewise.
(DDEDPDQ): Likewise.
(DENBCD): Likewise.
(DENBCDQ): Likewise.
(DSCLI): Likewise.
(DSCLIQ): Likewise.
(DSCRI): Likewise.
(DSCRIQ): Likewise.
(UNPACK_TD): Likewise.
(VSHASIGMAD): Likewise.
(VSHASIGMAW): Likewise.
(VCNTMBB): Likewise.
(VCNTMBD): Likewise.
(VCNTMBH): Likewise.
(VCNTMBW): Likewise.
(VREPLACE_UN_UV2DI): Likewise.
(VREPLACE_UN_UV4SI): Likewise.
(VREPLACE_UN_V2DF): Likewise.
(VREPLACE_UN_V2DI): Likewise.
(VREPLACE_UN_V4SF): Likewise.
(VREPLACE_UN_V4SI): Likewise.
(VREPLACE_ELT_UV2DI): Likewise.
(VREPLACE_ELT_UV4SI): Likewise.
(VREPLACE_ELT_V2DF): Likewise.
(VREPLACE_ELT_V2DI): Likewise.
(VREPLACE_ELT_V4SF): Likewise.
(VREPLACE_ELT_V4SI): Likewise.
(VSLDB_V16QI): Likewise.
(VSLDB_V2DI): Likewise.
(VSLDB_V4SI): Likewise.
(VSLDB_V8HI): Likewise.
(VSRDB_V16QI): Likewise.
(VSRDB_V2DI): Likewise.
(VSRDB_V4SI): Likewise.
(VSRDB_V8HI): Likewise.
(VXXSPLTI32DX_V4SF): Likewise.
(VXXSPLTI32DX_V4SI): Likewise.
(XXEVAL): Likewise.
(XXGENPCVM_V16QI): Likewise.
(XXGENPCVM_V2DI): Likewise.
(XXGENPCVM_V4SI): Likewise.
(XXGEN

[PATCH 2/8] rs6000: Don't #ifdef "short" built-in names

2022-01-28 Thread Bill Schmidt via Gcc-patches
It was recently pointed out that we get anomalous behavior when using
__attribute__((target)) to select a CPU.  As an example, when building for
-mcpu=power8 but using __attribute__((target("cpu=power10"))), it is legal
to call __builtin_vec_mod, but not vec_mod, even though these are
equivalent.  This is because the equivalence is established with a #define
that is guarded by #ifdef _ARCH_PWR10.
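
The effect can be modeled in portable C.  Every name in this sketch is
hypothetical: HYPOTHETICAL_ARCH_PWR10 stands in for _ARCH_PWR10, and
model_builtin_mod stands in for __builtin_vec_mod.  A guarded short name
exists only when the file-level macro is defined, whereas an unguarded short
name is always visible, so it behaves the same as the long name.

```c
#include <assert.h>

/* Stand-in for the always-declared long name (__builtin_vec_mod in the
   text above).  Hypothetical: a plain integer remainder for illustration.  */
static int
model_builtin_mod (int a, int b)
{
  assert (b != 0);
  return a % b;
}

/* Guarded form: the short name exists only if the whole translation unit
   is compiled for the newer CPU, whatever any per-function attribute says.  */
#ifdef HYPOTHETICAL_ARCH_PWR10
#define model_mod_guarded model_builtin_mod
#endif

/* Unguarded form: the short name is always an alias for the long name.  */
#define model_mod model_builtin_mod

/* Report whether the guarded alias is visible in this translation unit.  */
static int
guarded_alias_visible (void)
{
#ifdef model_mod_guarded
  return 1;
#else
  return 0;
#endif
}
```

Compiled without HYPOTHETICAL_ARCH_PWR10, model_mod is usable while
model_mod_guarded does not exist at all, which mirrors the vec_mod versus
__builtin_vec_mod asymmetry.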

This goofy behavior occurs with both the old builtins support and the
new.  One of the goals of the new builtins support was to make sure all
appropriate interfaces are available using __attribute__((target)), so I
failed in this respect.  This patch corrects the problem by removing the
apply.  For example, #ifdef __PPU__ is still appropriate.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
Is this okay for trunk?

Thanks,
Bill


2022-01-06  Bill Schmidt  

gcc/
* config/rs6000/rs6000-overload.def (VEC_ABSD): Remove #ifdef token.
(VEC_BLENDV): Likewise.
(VEC_BPERM): Likewise.
(VEC_CFUGE): Likewise.
(VEC_CIPHER_BE): Likewise.
(VEC_CIPHERLAST_BE): Likewise.
(VEC_CLRL): Likewise.
(VEC_CLRR): Likewise.
(VEC_CMPNEZ): Likewise.
(VEC_CNTLZ): Likewise.
(VEC_CNTLZM): Likewise.
(VEC_CNTTZM): Likewise.
(VEC_CNTLZ_LSBB): Likewise.
(VEC_CNTM): Likewise.
(VEC_CNTTZ): Likewise.
(VEC_CNTTZ_LSBB): Likewise.
(VEC_CONVERT_4F32_8F16): Likewise.
(VEC_DIV): Likewise.
(VEC_DIVE): Likewise.
(VEC_EQV): Likewise.
(VEC_EXPANDM): Likewise.
(VEC_EXTRACT_FP_FROM_SHORTH): Likewise.
(VEC_EXTRACT_FP_FROM_SHORTL): Likewise.
(VEC_EXTRACTH): Likewise.
(VEC_EXTRACTL): Likewise.
(VEC_EXTRACTM): Likewise.
(VEC_EXTRACT4B): Likewise.
(VEC_EXTULX): Likewise.
(VEC_EXTURX): Likewise.
(VEC_FIRSTMATCHINDEX): Likewise.
(VEC_FIRSTMATCHOREOSINDEX): Likewise.
(VEC_FIRSTMISMATCHINDEX): Likewise.
(VEC_FIRSTMISMATCHOREOSINDEX): Likewise.
(VEC_GB): Likewise.
(VEC_GENBM): Likewise.
(VEC_GENHM): Likewise.
(VEC_GENWM): Likewise.
(VEC_GENDM): Likewise.
(VEC_GENQM): Likewise.
(VEC_GENPCVM): Likewise.
(VEC_GNB): Likewise.
(VEC_INSERTH): Likewise.
(VEC_INSERTL): Likewise.
(VEC_INSERT4B): Likewise.
(VEC_LXVL): Likewise.
(VEC_MERGEE): Likewise.
(VEC_MERGEO): Likewise.
(VEC_MOD): Likewise.
(VEC_MSUB): Likewise.
(VEC_MULH): Likewise.
(VEC_NAND): Likewise.
(VEC_NCIPHER_BE): Likewise.
(VEC_NCIPHERLAST_BE): Likewise.
(VEC_NEARBYINT): Likewise.
(VEC_NMADD): Likewise.
(VEC_ORC): Likewise.
(VEC_PDEP): Likewise.
(VEC_PERMX): Likewise.
(VEC_PEXT): Likewise.
(VEC_POPCNT): Likewise.
(VEC_PARITY_LSBB): Likewise.
(VEC_REPLACE_ELT): Likewise.
(VEC_REPLACE_UN): Likewise.
(VEC_REVB): Likewise.
(VEC_RINT): Likewise.
(VEC_RLMI): Likewise.
(VEC_RLNM): Likewise.
(VEC_SBOX_BE): Likewise.
(VEC_SIGNEXTI): Likewise.
(VEC_SIGNEXTLL): Likewise.
(VEC_SIGNEXTQ): Likewise.
(VEC_SLDB): Likewise.
(VEC_SLV): Likewise.
(VEC_SPLATI): Likewise.
(VEC_SPLATID): Likewise.
(VEC_SPLATI_INS): Likewise.
(VEC_SQRT): Likewise.
(VEC_SRDB): Likewise.
(VEC_SRV): Likewise.
(VEC_STRIL): Likewise.
(VEC_STRIL_P): Likewise.
(VEC_STRIR): Likewise.
(VEC_STRIR_P): Likewise.
(VEC_STXVL): Likewise.
(VEC_TERNARYLOGIC): Likewise.
(VEC_TEST_LSBB_ALL_ONES): Likewise.
(VEC_TEST_LSBB_ALL_ZEROS): Likewise.
(VEC_VEE): Likewise.
(VEC_VES): Likewise.
(VEC_VIE): Likewise.
(VEC_VPRTYB): Likewise.
(VEC_VSCEEQ): Likewise.
(VEC_VSCEGT): Likewise.
(VEC_VSCELT): Likewise.
(VEC_VSCEUO): Likewise.
(VEC_VSEE): Likewise.
(VEC_VSES): Likewise.
(VEC_VSIE): Likewise.
(VEC_VSTDC): Likewise.
(VEC_VSTDCN): Likewise.
(VEC_VTDC): Likewise.
(VEC_XL): Likewise.
(VEC_XL_BE): Likewise.
(VEC_XL_LEN_R): Likewise.
(VEC_XL_SEXT): Likewise.
(VEC_XL_ZEXT): Likewise.
(VEC_XST): Likewise.
(VEC_XST_BE): Likewise.
(VEC_XST_LEN_R): Likewise.
(VEC_XST_TRUNC): Likewise.
(VEC_XXPERMDI): Likewise.
(VEC_XXSLDWI): Likewise.
(VEC_TSTSFI_EQ_DD): Likewise.
(VEC_TSTSFI_EQ_TD): Likewise.
(VEC_TSTSFI_GT_DD): Likewise.
(VEC_TSTSFI_GT_TD): Likewise.
(VEC_TSTSFI_LT_DD): Likewise.
(VEC_TSTSFI_LT_TD): Likewise.
(VEC_TSTSFI_OV_DD): Likewise.
(VEC_TSTSFI_OV_TD): Likewise.
(VEC_VADDCUQ): Likewise.
(VE

[PATCH 1/8] rs6000: More factoring of overload processing

2022-01-28 Thread Bill Schmidt via Gcc-patches
This patch continues the refactoring started with r12-6014.  I had previously
noted that the resolve_vec* routines can be further simplified by processing
the argument list earlier, so that all routines can use the arrays of arguments
and types.  I found that this was useful for some of the routines, but not for
all of them.

For several of the special-cased overloads, we don't specify all of the
possible type combinations in rs6000-overload.def, because the types don't
matter for the expansion we do.  For these, we can't use generic error message
handling when the number of arguments is incorrect, because the result is
misleading error messages that indicate argument types are wrong.

So this patch goes halfway and improves the factoring on the remaining special
cases, but leaves vec_splats, vec_promote, vec_extract, vec_insert, and
vec_step alone.
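
The control flow described above can be sketched in portable C.  This is a
simplified model, not the code in rs6000-c.cc: the enums, the struct, and
every function name below are hypothetical, and plain integers stand in for
GCC trees.  MODEL_VEC_STEP stands for an overload that bypasses the generic
argument-count check and diagnoses the count itself; MODEL_VEC_MUL stands for
a special case that now receives the gathered arrays.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_ARGS 3

enum model_overload { MODEL_VEC_MUL, MODEL_VEC_STEP };
enum model_status { MODEL_RESOLVED, MODEL_BAD_COUNT, MODEL_BAD_TYPES };

struct model_arg { int value; int type; };

/* Special case that works on args[] and types[].  It relies on the caller
   having already verified that exactly two arguments were supplied.  */
static enum model_status
model_resolve_mul (const int *args, const int *types, int *result)
{
  if (types[0] != types[1])
    return MODEL_BAD_TYPES;
  *result = args[0] * args[1];
  return MODEL_RESOLVED;
}

/* Special case that checks its own argument count.  */
static enum model_status
model_resolve_step (const int *types, size_t n, int *result)
{
  if (n != 1)
    return MODEL_BAD_COUNT;
  *result = types[0];
  return MODEL_RESOLVED;
}

static enum model_status
model_resolve (enum model_overload which, const struct model_arg *list,
               size_t n, int *result)
{
  int args[MODEL_MAX_ARGS] = { 0 };
  int types[MODEL_MAX_ARGS] = { 0 };

  assert (result != NULL);

  /* Step 1: gather the arguments and their types once, up front.  */
  if (n > MODEL_MAX_ARGS)
    return MODEL_BAD_COUNT;
  for (size_t i = 0; i < n; i++)
    {
      args[i] = list[i].value;
      types[i] = list[i].type;
    }

  /* Step 2: generic count check, skipped for overloads that report
     their own count errors.  */
  if (which != MODEL_VEC_STEP && n != 2)
    return MODEL_BAD_COUNT;

  /* Step 3: special-case handling, now fed from the arrays.  */
  switch (which)
    {
    case MODEL_VEC_MUL:
      return model_resolve_mul (args, types, result);
    case MODEL_VEC_STEP:
      return model_resolve_step (types, n, result);
    }
  return MODEL_BAD_COUNT;
}
```

The design point is that the count test lives in one place for the overloads
that can share it, so helpers such as model_resolve_mul no longer need their
own.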

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
Is this okay for trunk?

Thanks,
Bill


2022-01-18  Bill Schmidt  

gcc/
* config/rs6000/rs6000-c.cc (resolve_vec_mul): Accept args and types
parameters instead of arglist and nargs.  Simplify accordingly.  Remove
unnecessary test for argument count mismatch.
(resolve_vec_cmpne): Likewise.
(resolve_vec_adde_sube): Likewise.
(resolve_vec_addec_subec): Likewise.
(altivec_resolve_overloaded_builtin): Move overload special handling
after the gathering of arguments into args[] and types[] and the test
for correct number of arguments.  Don't perform the test for correct
number of arguments for certain special cases.  Call the other special
cases with args and types instead of arglist and nargs.
---
 gcc/config/rs6000/rs6000-c.cc | 304 ++
 1 file changed, 127 insertions(+), 177 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index 145421ab8f2..35c1383f059 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -939,37 +939,25 @@ altivec_build_resolved_builtin (tree *args, int n, tree fntype, tree ret_type,
 enum resolution { unresolved, resolved, resolved_bad };
 
 /* Resolve an overloaded vec_mul call and return a tree expression for the
-   resolved call if successful.  NARGS is the number of arguments to the call.
-   ARGLIST contains the arguments.  RES must be set to indicate the status of
+   resolved call if successful.  ARGS contains the arguments to the call.
+   TYPES contains their types.  RES must be set to indicate the status of
the resolution attempt.  LOC contains statement location information.  */
 
 static tree
-resolve_vec_mul (resolution *res, vec<tree, va_gc> *arglist, unsigned nargs,
-location_t loc)
+resolve_vec_mul (resolution *res, tree *args, tree *types, location_t loc)
 {
   /* vec_mul needs to be special cased because there are no instructions for it
  for the {un}signed char, {un}signed short, and {un}signed int types.  */
-  if (nargs != 2)
-{
-  error ("builtin %qs only accepts 2 arguments", "vec_mul");
-  *res = resolved;
-  return error_mark_node;
-}
-
-  tree arg0 = (*arglist)[0];
-  tree arg0_type = TREE_TYPE (arg0);
-  tree arg1 = (*arglist)[1];
-  tree arg1_type = TREE_TYPE (arg1);
 
   /* Both arguments must be vectors and the types must be compatible.  */
-  if (TREE_CODE (arg0_type) != VECTOR_TYPE
-  || !lang_hooks.types_compatible_p (arg0_type, arg1_type))
+  if (TREE_CODE (types[0]) != VECTOR_TYPE
+  || !lang_hooks.types_compatible_p (types[0], types[1]))
 {
   *res = resolved_bad;
   return error_mark_node;
 }
 
-  switch (TYPE_MODE (TREE_TYPE (arg0_type)))
+  switch (TYPE_MODE (TREE_TYPE (types[0])))
 {
 case E_QImode:
 case E_HImode:
@@ -978,21 +966,21 @@ resolve_vec_mul (resolution *res, vec<tree, va_gc> *arglist, unsigned nargs,
 case E_TImode:
   /* For scalar types just use a multiply expression.  */
   *res = resolved;
-  return fold_build2_loc (loc, MULT_EXPR, TREE_TYPE (arg0), arg0,
- fold_convert (TREE_TYPE (arg0), arg1));
+  return fold_build2_loc (loc, MULT_EXPR, types[0], args[0],
+ fold_convert (types[0], args[1]));
 case E_SFmode:
   {
/* For floats use the xvmulsp instruction directly.  */
*res = resolved;
tree call = rs6000_builtin_decls[RS6000_BIF_XVMULSP];
-   return build_call_expr (call, 2, arg0, arg1);
+   return build_call_expr (call, 2, args[0], args[1]);
   }
 case E_DFmode:
   {
/* For doubles use the xvmuldp instruction directly.  */
*res = resolved;
tree call = rs6000_builtin_decls[RS6000_BIF_XVMULDP];
-   return build_call_expr (call, 2, arg0, arg1);
+   return build_call_expr (call, 2, args[0], args[1]);
   }
 /* Other types are errors.  */
 default:
@@ -1002,37 +990,25 @@ resolve_vec_mul (

[PATCH 0/8] rs6000: Built-in function cleanups and bug fixes

2022-01-28 Thread Bill Schmidt via Gcc-patches
Hi!

This is a resubmission of some patches and a new submission of others.
Patches 1, 3, and 4 finish up the pending clean-up work for the new built-in
infrastructure support.  Patches 2 and 5-8 fix a variety of bugs not specific
to the new infrastructure.  I'm submitting these as a group primarily because
5-8 are dependent on the previous patches, particularly patch 4, which
consolidates much of the built-in code in a new file.

Thanks for your consideration!

Bill


Bill Schmidt (8):
  rs6000: More factoring of overload processing
  rs6000: Don't #ifdef "short" built-in names
  rs6000: Convert <x> built-in constraints to <x,y> form
  rs6000: Consolidate target built-ins code
  rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]
  rs6000: Remove -m[no-]fold-gimple flag [PR103686]
  rs6000: vec_neg built-ins wrongly require POWER8
  rs6000: Fix some missing built-in attributes [PR104004]

 gcc/config.gcc|2 +-
 gcc/config/rs6000/rs6000-builtin.cc   | 3721 +
 gcc/config/rs6000/rs6000-builtins.def |  578 +--
 gcc/config/rs6000/rs6000-c.cc |  304 +-
 gcc/config/rs6000/rs6000-call.cc  | 3524 
 gcc/config/rs6000/rs6000-overload.def |  344 +-
 gcc/config/rs6000/rs6000.cc   |  167 +-
 gcc/config/rs6000/rs6000.h|1 -
 gcc/config/rs6000/rs6000.opt  |4 -
 gcc/config/rs6000/t-rs6000|4 +
 .../powerpc/bfp/scalar-test-data-class-10.c   |2 +-
 .../powerpc/bfp/scalar-test-data-class-2.c|2 +-
 .../powerpc/bfp/scalar-test-data-class-3.c|2 +-
 .../powerpc/bfp/scalar-test-data-class-4.c|2 +-
 .../powerpc/bfp/scalar-test-data-class-5.c|2 +-
 .../powerpc/bfp/scalar-test-data-class-9.c|2 +-
 .../powerpc/bfp/vec-test-data-class-4.c   |2 +-
 .../powerpc/bfp/vec-test-data-class-5.c   |2 +-
 .../powerpc/bfp/vec-test-data-class-6.c   |2 +-
 .../powerpc/bfp/vec-test-data-class-7.c   |2 +-
 .../gcc.target/powerpc/builtins-1-be-folded.c |2 +-
 .../gcc.target/powerpc/builtins-1-le-folded.c |2 +-
 gcc/testsuite/gcc.target/powerpc/builtins-1.c | 1210 --
 gcc/testsuite/gcc.target/powerpc/builtins-5.c |3 +-
 .../gcc.target/powerpc/dfp/dtstsfi-12.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-14.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-17.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-19.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-2.c|2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-22.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-24.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-27.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-29.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-32.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-34.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-37.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-39.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-4.c|2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-42.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-44.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-47.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-49.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-52.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-54.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-57.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-59.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-62.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-64.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-67.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-69.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-7.c|2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-72.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-74.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-77.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-79.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-9.c|2 +-
 .../gcc.target/powerpc/p8-vec-xl-xst.c|3 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-1.c  |2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-2.c  |2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-3.c  |2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-4.c  |2 +-
 gcc/testsuite/gcc.target/powerpc/pr82015.c|4 +-
 gcc/testsuite/gcc.target/powerpc/pr83926.c|3 +-
 .../powerpc/pr86731-nogimplefold-longlong.c   |   32 -
 .../gcc.target/powerpc/pr86731-nogimplefold.c |   63 -
 gcc/testsuite/gcc.target/powerpc/pr91903.c|   60 +-
 .../gcc.target/powerpc/swaps-p8-17.c  |3 +-
 .../powerpc/test_fpscr_rn_builtin_error.c |8 +-
 .../gcc.target/powerpc/vec-ternarylogic-10.c  |6 +-
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c |2 +-
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-1.c |2 +-
 .../gcc.targ

Re: [PATCH v9] rtl: builtins: (not just) rs6000: Add builtins for fegetround, feclearexcept and feraiseexcept [PR94193]

2022-01-24 Thread Bill Schmidt via Gcc-patches
Adding the patch author for his information.

Thanks,
Bill

On 1/24/22 2:26 PM, Jakub Jelinek via Gcc-patches wrote:
> On Mon, Jan 24, 2022 at 08:55:37AM -0600, Segher Boessenkool wrote:
>> Hi!
>>
>> On Thu, Jan 13, 2022 at 02:08:53PM -0300, Raoni Fassina Firmino wrote:
>>> Changes since v8[8]:
>>>   - Refactored and expanded builtin-feclearexcept-feraiseexcept-2.c
>>> testcase:
>>> + Use a macro to avoid extended repetition of the core test code.
>>> + Expanded the test code to check builtins return code.
>>> + Added more tests to test all valid (standard) exceptions input
>> This is okay for trunk (Jeff already approved the generic parts).
>> Thanks!
> This breaks bootstrap with --enable-checking=rtl, e.g. while compiling
> libquadmath/math/llrintq.c
> #0  internal_error (gmsgid=0x131bb1e0 "RTL check: expected code '%s', have 
> '%s' in %s, at %s:%d") at ../../gcc/diagnostic.cc:1938
> #1  0x113a0e94 in rtl_check_failed_code1 (r=0x3fffaf4a24a8, 
> code=CONST_INT, file=0x13400018 "../../gcc/config/rs6000/rs6000.md", 
> line=7010, 
> func=0x13409298  
> "gen_feraiseexceptsi") at ../../gcc/rtl.cc:918
> #2  0x125154e8 in gen_feraiseexceptsi (operand0=0x3fffaf4a3720, 
> operand1=0x3fffaf4a24a8) at ../../gcc/config/rs6000/rs6000.md:7010
> #3  0x108badf4 in insn_gen_fn::operator() 
> (this=0x138ee440 ) at ../../gcc/recog.h:407
> #4  0x10890b1c in expand_builtin_feclear_feraise_except 
> (exp=0x3fffaf3041a0, target=0x3fffaf4a3720, target_mode=E_SImode, 
> op_optab=feraiseexcept_optab)
> at ../../gcc/builtins.cc:2606
> #5  0x108a6f74 in expand_builtin (exp=0x3fffaf3041a0, 
> target=0x3fffaf100490, subtarget=0x0, mode=E_VOIDmode, ignore=1) at 
> ../../gcc/builtins.cc:7130
> #6  0x10c01770 in expand_expr_real_1 (exp=0x3fffaf3041a0, target=0x0, 
> tmode=E_VOIDmode, modifier=EXPAND_NORMAL, alt_rtl=0x0, 
> inner_reference_p=false)
> at ../../gcc/expr.cc:11536
> #7  0x10bf0604 in expand_expr_real (exp=0x3fffaf3041a0, 
> target=0x3fffaf100490, tmode=E_VOIDmode, modifier=EXPAND_NORMAL, alt_rtl=0x0, 
> inner_reference_p=false)
> at ../../gcc/expr.cc:8737
> #8  0x108ffa00 in expand_expr (exp=0x3fffaf3041a0, 
> target=0x3fffaf100490, mode=E_VOIDmode, modifier=EXPAND_NORMAL) at 
> ../../gcc/expr.h:301
> #9  0x1090c934 in expand_call_stmt (stmt=0x3fffaf1314d0) at 
> ../../gcc/cfgexpand.cc:2831
> #10 0x10911e18 in expand_gimple_stmt_1 (stmt=0x3fffaf1314d0) at 
> ../../gcc/cfgexpand.cc:3864
> #11 0x10912730 in expand_gimple_stmt (stmt=0x3fffaf1314d0) at 
> ../../gcc/cfgexpand.cc:4028
> #12 0x1091ecb0 in expand_gimple_basic_block (bb=0x3fffaf190c98, 
> disable_tail_calls=false) at ../../gcc/cfgexpand.cc:6069
> #13 0x10921be8 in (anonymous namespace)::pass_expand::execute 
> (this=0x13ab0d40, fun=0x3fffaf0c0c38) at ../../gcc/cfgexpand.cc:6795
> #14 0x11216ea4 in execute_one_pass (pass=0x13ab0d40) at 
> ../../gcc/passes.cc:2637
> #15 0x112173d8 in execute_pass_list_1 (pass=0x13ab0d40) at 
> ../../gcc/passes.cc:2737
> #16 0x112174b0 in execute_pass_list (fn=0x3fffaf0c0c38, 
> pass=0x13aac8c0) at ../../gcc/passes.cc:2748
> #17 0x109b4e4c in cgraph_node::expand (this=0x3fffaf151760) at 
> ../../gcc/cgraphunit.cc:1834
> #18 0x109b5844 in expand_all_functions () at 
> ../../gcc/cgraphunit.cc:1998
> #19 0x109b67d0 in symbol_table::compile (this=0x3fffaf0d) at 
> ../../gcc/cgraphunit.cc:2348
> #20 0x109b6f40 in symbol_table::finalize_compilation_unit 
> (this=0x3fffaf0d) at ../../gcc/cgraphunit.cc:2529
> #21 0x114f10f4 in compile_file () at ../../gcc/toplev.cc:479
> #22 0x114f6204 in do_compile (no_backend=false) at 
> ../../gcc/toplev.cc:2158
> #23 0x114f68d0 in toplev::main (this=0x3fffeb64, argc=45, 
> argv=0x3fffef98) at ../../gcc/toplev.cc:2310
> #24 0x12f97a6c in main (argc=45, argv=0x3fffef98) at 
> ../../gcc/main.cc:39
>
> expand_builtin_feclear_feraise_except doesn't check if op0 matches
> the predicate of operands[1], the backend requires const_int_operand,
> but because the call isn't done with a constant integer:
> feraiseexcept (t == LLONG_MIN ? FE_INEXACT : FE_INVALID);
> op0 is a REG.
> If CONST_INT is what is expected on all targets, then it should punt if
> op0 isn't one, otherwise it should check the predicate.
>
>   Jakub
>
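
The check Jakub outlines can be modeled in portable C.  Everything in this
sketch is hypothetical: the operand struct, the predicate, and the expander
are stand-ins, not GCC's RTL API.  It only shows the shape of the fix, which
is to test the operand against the pattern's predicate and punt (fall back to
the library call) when it does not match.

```c
#include <assert.h>
#include <stdbool.h>

enum model_code { MODEL_REG, MODEL_CONST_INT };

struct model_operand
{
  enum model_code code;
  long value;
};

/* Stand-in for a const_int_operand-style predicate.  */
static bool
model_const_int_operand (const struct model_operand *op)
{
  return op->code == MODEL_CONST_INT;
}

/* Stand-in for the generator.  Like the pattern in the backtrace, it reads
   the operand as a constant, so any other code trips a check.  */
static long
model_gen_feraiseexcept (const struct model_operand *op)
{
  assert (op->code == MODEL_CONST_INT);
  return op->value;
}

/* Return true and store the expansion in *OUT when the operand satisfies
   the predicate; return false (punt) otherwise, leaving *OUT untouched.  */
static bool
model_expand_feraiseexcept (const struct model_operand *op, long *out)
{
  if (!model_const_int_operand (op))
    return false;
  *out = model_gen_feraiseexcept (op);
  return true;
}
```

With the predicate test in place, a register operand such as the one produced
by the conditional expression above never reaches the generator.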


Re: [PATCH] rs6000: Support vector float/double for vec_sldw

2022-01-21 Thread Bill Schmidt via Gcc-patches
Thanks!  Pushed as r12-6806 with the testcase adjusted.

Bill

On 1/21/22 11:47 AM, Segher Boessenkool wrote:
> Hi!
>
> On Fri, Jan 21, 2022 at 11:31:34AM -0600, Bill Schmidt wrote:
>> It was recently discovered that Clang supports a couple of variants of
>> vec_sldw that GCC does not.  After some discussion, we decided that these
>> variants are reasonable, and GCC will also support them.  This patch adds
>> that support.
> As we discussed, this is reasonable only because we already allow
> non-integer inputs (and outputs) for all(?) other permute class
> instructions.
>
>> I updated an existing test and discovered it wasn't actually checking for
>> generation of the xxsldwi instruction, so I added that check as well.
> It can always generate vsldoi instead, which is a strict superset (if
> all registers used are VRs).  They will not likely be here, because
> these are such simple functions, but that is a bit fragile.
>
>>  * gcc.target/powerpc/builtins-4.c: Add two test variants.  Adjust
>>  assembler counts.
> Is there any justification for the new counts?
>
> ... Ah, it didn't count the sld's at all before.  Okay.
>
>> @@ -161,6 +175,6 @@ test_sll_vuill_vuill_vuc (vector unsigned long long int x,
>>  /* { dg-final { scan-assembler-times "xvnabssp"  1 } } */
>>  /* { dg-final { scan-assembler-times "xvnabsdp"  1 } } */
>>  /* { dg-final { scan-assembler-times "vslo"  4 } } */
>> -/* { dg-final { scan-assembler-times "xxlor" 30 } } */
>> +/* { dg-final { scan-assembler-times "xxlor" 32 } } */
> This will need modification for the phase of the moon.  It also does not
> even test only xxlor insn (also xxlorc insns, for example).
>
>> +/* { dg-final { scan-assembler-times "xxsldwi"   10 } } */
> Okay if you make this
>   \mxxsldwi\M
> or even
>   \m(?:xxsldwi|vsldoi)\M
>
> Thanks!
>
>
> Segher


[PATCH] rs6000: Support vector float/double for vec_sldw

2022-01-21 Thread Bill Schmidt via Gcc-patches
Hi,

It was recently discovered that Clang supports a couple of variants of
vec_sldw that GCC does not.  After some discussion, we decided that these
variants are reasonable, and GCC will also support them.  This patch adds
that support.
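
For readers unfamiliar with the operation, here is a portable C model of a
shift-left-double-by-word.  It is a sketch under stated assumptions, not GCC's
implementation: the function name is hypothetical, words are numbered from the
left end of the register, and no endian adjustment is modeled.  Because the
operation only moves 32-bit words and never interprets them, the same data
movement applies whatever the element type is.

```c
#include <assert.h>
#include <stdint.h>

/* Model: concatenate A and B (four words each), shift the 8-word value
   left by SHIFT words, and keep the leftmost four words.  */
static void
model_sldw (const uint32_t a[4], const uint32_t b[4], unsigned shift,
            uint32_t out[4])
{
  assert (shift <= 3);
  for (unsigned i = 0; i < 4; i++)
    {
      unsigned src = i + shift;
      /* Words 0..3 of the concatenation come from A, words 4..7 from B.  */
      out[i] = src < 4 ? a[src] : b[src - 4];
    }
}
```

In this model a shift of 3 yields the last word of the first input followed by
the first three words of the second input.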

I updated an existing test and discovered it wasn't actually checking for
generation of the xxsldwi instruction, so I added that check as well.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is this
okay for trunk?

Thanks!
Bill


2022-01-21  Bill Schmidt  

gcc/
* config/rs6000/rs6000-overload.def (VEC_SLDW): Add instances for
vector float and vector double.

gcc/testsuite/
* gcc.target/powerpc/builtins-4.c: Add two test variants.  Adjust
assembler counts.
---
 gcc/config/rs6000/rs6000-overload.def |  4 +++
 gcc/testsuite/gcc.target/powerpc/builtins-4.c | 34 +--
 2 files changed, 28 insertions(+), 10 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index dea6f5d4258..cdc703e9764 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -3405,6 +3405,10 @@
 XXSLDWI_2DI  XXSLDWI_VSLL
   vull __builtin_vec_sldw (vull, vull, const int);
 XXSLDWI_2DI  XXSLDWI_VULL
+  vf __builtin_vec_sldw (vf, vf, const int);
+XXSLDWI_4SF  XXSLDWI_VF
+  vd __builtin_vec_sldw (vd, vd, const int);
+XXSLDWI_2DF  XXSLDWI_VD
 
 [VEC_SLL, vec_sll, __builtin_vec_sll]
   vsc __builtin_vec_sll (vsc, vuc);
diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-4.c b/gcc/testsuite/gcc.target/powerpc/builtins-4.c
index 4e3b543f242..df012e9b7d6 100644
--- a/gcc/testsuite/gcc.target/powerpc/builtins-4.c
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-4.c
@@ -119,6 +119,18 @@ test_vul_sldw_vul_vul (vector unsigned long long x,
return vec_sldw (x, y, 3);
 }
 
+vector float
+test_vf_sldw_vf_vf (vector float x, vector float y)
+{
+  return vec_sldw (x, y, 3);
+}
+
+vector double
+test_vd_sldw_vd_vd (vector double x, vector double y)
+{
+  return vec_sldw (x, y, 1);
+}
+
 vector signed int long long
 test_sll_vsill_vsill_vuc (vector signed long long int x,
  vector unsigned char y)
@@ -146,14 +158,16 @@ test_sll_vuill_vuill_vuc (vector unsigned long long int x,
  test_slo_vsll_slo_vsll_vuc1 vslo
  test_slo_vull_slo_vull_vsc1 vslo
  test_slo_vull_slo_vull_vuc1 vslo
- test_vsc_sldw_vsc_vsc 1 xxlor
- test_vuc_sldw_vuc_vuc 1 xxlor
- test_vssi_sldw_vssi_vssi  1 xxlor
- test_vusi_sldw_vusi_vusi  1 xxlor
- test_vsi_sldw_vsi_vsi 1 xxlor
- test_vui_sldw_vui_vui 1 xxlor
- test_vsl_sldw_vsl_vsl 1 xxlor
- test_vul_sldw_vul_vul 1 xxlor
+ test_vsc_sldw_vsc_vsc 1 xxlor, 1 xxsldwi
+ test_vuc_sldw_vuc_vuc 1 xxlor, 1 xxsldwi
+ test_vssi_sldw_vssi_vssi  1 xxlor, 1 xxsldwi
+ test_vusi_sldw_vusi_vusi  1 xxlor, 1 xxsldwi
+ test_vsi_sldw_vsi_vsi 1 xxlor, 1 xxsldwi
+ test_vui_sldw_vui_vui 1 xxlor, 1 xxsldwi
+ test_vsl_sldw_vsl_vsl 1 xxlor, 1 xxsldwi
+ test_vul_sldw_vul_vul 1 xxlor, 1 xxsldwi
+ test_vf_sldw_vf_vf1 xxlor, 1 xxsldwi
+ test_vd_sldw_vd_vd1 xxlor, 1 xxsldwi
  test_sll_vsill_vsill_vuc  1 vsl
  test_sll_vuill_vuill_vuc  1 vsl  */
 
@@ -161,6 +175,6 @@ test_sll_vuill_vuill_vuc (vector unsigned long long int x,
 /* { dg-final { scan-assembler-times "xvnabssp"  1 } } */
 /* { dg-final { scan-assembler-times "xvnabsdp"  1 } } */
 /* { dg-final { scan-assembler-times "vslo"  4 } } */
-/* { dg-final { scan-assembler-times "xxlor" 30 } } */
+/* { dg-final { scan-assembler-times "xxlor" 32 } } */
 /* { dg-final { scan-assembler-times {\mvsl\M}   5 } } */
-
+/* { dg-final { scan-assembler-times "xxsldwi"   10 } } */
-- 
2.27.0




[PATCH v2] rs6000: More factoring of overload processing

2022-01-19 Thread Bill Schmidt via Gcc-patches
Hi!

[I'm resubmitting this because the filename changed with the recent conversion
from .c to .cc.]

This patch continues the refactoring started with r12-6014.  I had previously
noted that the resolve_vec* routines can be further simplified by processing
the argument list earlier, so that all routines can use the arrays of arguments
and types.  I found that this was useful for some of the routines, but not for
all of them.

For several of the special-cased overloads, we don't specify all of the
possible type combinations in rs6000-overload.def, because the types don't
matter for the expansion we do.  For these, we can't use generic error message
handling when the number of arguments is incorrect, because the result is
misleading error messages that indicate argument types are wrong.

So this patch goes halfway and improves the factoring on the remaining special
cases, but leaves vec_splats, vec_promote, vec_extract, vec_insert, and
vec_step alone.

Bootstrapped and tested on powerpc64le-linux-gnu.  Is this okay for trunk?

Thanks,
Bill


2022-01-18  Bill Schmidt  

gcc/
* config/rs6000/rs6000-c.cc (resolve_vec_mul): Accept args and types
parameters instead of arglist and nargs.  Simplify accordingly.  Remove
unnecessary test for argument count mismatch.
(resolve_vec_cmpne): Likewise.
(resolve_vec_adde_sube): Likewise.
(resolve_vec_addec_subec): Likewise.
(altivec_resolve_overloaded_builtin): Move overload special handling
after the gathering of arguments into args[] and types[] and the test
for correct number of arguments.  Don't perform the test for correct
number of arguments for certain special cases.  Call the other special
cases with args and types instead of arglist and nargs.
---
 gcc/config/rs6000/rs6000-c.cc | 304 ++
 1 file changed, 127 insertions(+), 177 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index 145421ab8f2..35c1383f059 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -939,37 +939,25 @@ altivec_build_resolved_builtin (tree *args, int n, tree fntype, tree ret_type,
 enum resolution { unresolved, resolved, resolved_bad };
 
 /* Resolve an overloaded vec_mul call and return a tree expression for the
-   resolved call if successful.  NARGS is the number of arguments to the call.
-   ARGLIST contains the arguments.  RES must be set to indicate the status of
+   resolved call if successful.  ARGS contains the arguments to the call.
+   TYPES contains their types.  RES must be set to indicate the status of
the resolution attempt.  LOC contains statement location information.  */
 
 static tree
-resolve_vec_mul (resolution *res, vec<tree, va_gc> *arglist, unsigned nargs,
-location_t loc)
+resolve_vec_mul (resolution *res, tree *args, tree *types, location_t loc)
 {
   /* vec_mul needs to be special cased because there are no instructions for it
  for the {un}signed char, {un}signed short, and {un}signed int types.  */
-  if (nargs != 2)
-{
-  error ("builtin %qs only accepts 2 arguments", "vec_mul");
-  *res = resolved;
-  return error_mark_node;
-}
-
-  tree arg0 = (*arglist)[0];
-  tree arg0_type = TREE_TYPE (arg0);
-  tree arg1 = (*arglist)[1];
-  tree arg1_type = TREE_TYPE (arg1);
 
   /* Both arguments must be vectors and the types must be compatible.  */
-  if (TREE_CODE (arg0_type) != VECTOR_TYPE
-  || !lang_hooks.types_compatible_p (arg0_type, arg1_type))
+  if (TREE_CODE (types[0]) != VECTOR_TYPE
+  || !lang_hooks.types_compatible_p (types[0], types[1]))
 {
   *res = resolved_bad;
   return error_mark_node;
 }
 
-  switch (TYPE_MODE (TREE_TYPE (arg0_type)))
+  switch (TYPE_MODE (TREE_TYPE (types[0])))
 {
 case E_QImode:
 case E_HImode:
@@ -978,21 +966,21 @@ resolve_vec_mul (resolution *res, vec<tree, va_gc> *arglist, unsigned nargs,
 case E_TImode:
   /* For scalar types just use a multiply expression.  */
   *res = resolved;
-  return fold_build2_loc (loc, MULT_EXPR, TREE_TYPE (arg0), arg0,
- fold_convert (TREE_TYPE (arg0), arg1));
+  return fold_build2_loc (loc, MULT_EXPR, types[0], args[0],
+ fold_convert (types[0], args[1]));
 case E_SFmode:
   {
/* For floats use the xvmulsp instruction directly.  */
*res = resolved;
tree call = rs6000_builtin_decls[RS6000_BIF_XVMULSP];
-   return build_call_expr (call, 2, arg0, arg1);
+   return build_call_expr (call, 2, args[0], args[1]);
   }
 case E_DFmode:
   {
/* For doubles use the xvmuldp instruction directly.  */
*res = resolved;
tree call = rs6000_builtin_decls[RS6000_BIF_XVMULDP];
-   return build_call_expr (call, 2, arg0, arg1);
+   return build_call_expr (call, 2, args[0], args[1]);
   }
 /* Ot

[PATCH] rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]

2022-01-19 Thread Bill Schmidt via Gcc-patches
Hi!

https://gcc.gnu.org/PR95082 demonstrates that we don't generate correct code for
vec_cntlz_lsbb and vec_cnttz_lsbb for little-endian targets.  This patch corrects
the problem by marking the built-ins as bif_is_endian and using the correct
target patterns for each endianness.  Note that the default patterns are for
little endian, and the overridden patterns in rs6000-builtin.cc are for big
endian.
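
The relationship between the two instructions can be illustrated with a
portable C model.  This is a sketch under assumptions, not GCC code: the
function names are hypothetical, the register is modeled as 16 bytes numbered
from the left, and little-endian element i is assumed to occupy register byte
15 - i.  Under that model, counting leading elements in little-endian element
order is the same as counting trailing bytes in register order, which is why
each built-in needs a different pattern per endianness.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Count bytes whose least-significant bit is zero, starting at register
   byte 0 and stopping at the first byte with a one bit.  */
static unsigned
model_vclzlsbb (const uint8_t reg[16])
{
  unsigned count = 0;
  while (count < 16 && (reg[count] & 1) == 0)
    count++;
  return count;
}

/* The same count, but starting at register byte 15 and moving left.  */
static unsigned
model_vctzlsbb (const uint8_t reg[16])
{
  unsigned count = 0;
  while (count < 16 && (reg[15 - count] & 1) == 0)
    count++;
  return count;
}

/* Count leading elements (from element 0) with a zero LSB.  ELEM is in
   element order; the register image depends on the endianness.  */
static unsigned
model_vec_cntlz_lsbb (const uint8_t elem[16], bool little_endian)
{
  uint8_t reg[16];
  for (unsigned i = 0; i < 16; i++)
    reg[little_endian ? 15 - i : i] = elem[i];

  /* Little endian must use the trailing-count instruction to get a
     leading count in element order.  */
  unsigned count = little_endian ? model_vctzlsbb (reg) : model_vclzlsbb (reg);
  assert (count <= 16);
  return count;
}
```

Both endiannesses give the same element-order answer only because the
instruction choice is swapped along with the byte layout.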

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is this
okay for trunk, and eventually for backport to GCC 11?

Thanks!
Bill


2022-01-18  Bill Schmidt  

gcc/
PR target/95082
* config/rs6000/rs6000-builtin.cc (rs6000_expand_builtin): Handle
endianness for vclzlsbb and vctzlsbb.
* config/rs6000/rs6000-builtins.def (VCLZLSBB_V16QI): Change
default pattern and indicate a different pattern will be used for
big endian.
(VCLZLSBB_V4SI): Likewise.
(VCLZLSBB_V8HI): Likewise.
(VCTZLSBB_V16QI): Likewise.
(VCTZLSBB_V4SI): Likewise.
(VCTZLSBB_V8HI): Likewise.

gcc/testsuite/
PR target/95082
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c: Restrict to -mbig.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-1.c: Likewise.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c: New.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-4.c: New.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-0.c: Restrict to -mbig.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-1.c: Likewise.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-3.c: New.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-4.c: New.
---
 gcc/config/rs6000/rs6000-builtin.cc   | 12 
 gcc/config/rs6000/rs6000-builtins.def | 12 ++--
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c |  2 +-
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-1.c |  2 +-
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c | 15 +++
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-4.c | 15 +++
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-0.c |  2 +-
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-1.c |  2 +-
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-3.c | 15 +++
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-4.c | 15 +++
 10 files changed, 82 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-4.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-3.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-4.c

diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
index 6eca3568c02..421277a0ef0 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -3485,6 +3485,18 @@ rs6000_expand_builtin (tree exp, rtx target, rtx /* subtarget */,
icode = CODE_FOR_vsx_store_v8hi;
   else if (fcode == RS6000_BIF_ST_ELEMREV_V16QI)
icode = CODE_FOR_vsx_store_v16qi;
+  else if (fcode == RS6000_BIF_VCLZLSBB_V16QI)
+   icode = CODE_FOR_vclzlsbb_v16qi;
+  else if (fcode == RS6000_BIF_VCLZLSBB_V4SI)
+   icode = CODE_FOR_vclzlsbb_v4si;
+  else if (fcode == RS6000_BIF_VCLZLSBB_V8HI)
+   icode = CODE_FOR_vclzlsbb_v8hi;
+  else if (fcode == RS6000_BIF_VCTZLSBB_V16QI)
+   icode = CODE_FOR_vctzlsbb_v16qi;
+  else if (fcode == RS6000_BIF_VCTZLSBB_V4SI)
+   icode = CODE_FOR_vctzlsbb_v4si;
+  else if (fcode == RS6000_BIF_VCTZLSBB_V8HI)
+   icode = CODE_FOR_vctzlsbb_v8hi;
   else
gcc_unreachable ();
 }
diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index cfe31c2e7de..2bb997a5279 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -2551,13 +2551,13 @@
 VBPERMD altivec_vbpermd {}
 
   const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
-VCLZLSBB_V16QI vclzlsbb_v16qi {}
+VCLZLSBB_V16QI vctzlsbb_v16qi {endian}
 
   const signed int __builtin_altivec_vclzlsbb_v4si (vsi);
-VCLZLSBB_V4SI vclzlsbb_v4si {}
+VCLZLSBB_V4SI vctzlsbb_v4si {endian}
 
   const signed int __builtin_altivec_vclzlsbb_v8hi (vss);
-VCLZLSBB_V8HI vclzlsbb_v8hi {}
+VCLZLSBB_V8HI vctzlsbb_v8hi {endian}
 
   const vsc __builtin_altivec_vctzb (vsc);
 VCTZB ctzv16qi2 {}
@@ -2572,13 +2572,13 @@
 VCTZW ctzv4si2 {}
 
   const signed int __builtin_altivec_vctzlsbb_v16qi (vsc);
-VCTZLSBB_V16QI vctzlsbb_v16qi {}
+VCTZLSBB_V16QI vclzlsbb_v16qi {endian}
 
   const signed int __builtin_altivec_vctzlsbb_v4si (vsi);
-VCTZLSBB_V4SI vctzlsbb_v4si {}
+VCTZLSBB_V4SI vclzlsbb_v4si {endian}
 
   const signed int __builtin_altivec_vctzlsbb_v8hi (vss);
-VCTZLSBB_V8HI vctzlsbb_v8hi {}
+VCTZLSBB_V8HI vclzlsbb_v8hi {endian}
 
   const signed int __builtin_altivec_vcmpaeb_p (vsc, vsc);
 VCMPAEB_P

[PATCH] rs6000: Convert <x> built-in constraints to <x,y> form

2022-01-12 Thread Bill Schmidt via Gcc-patches
Hi!

When introducing the new built-in support, I tried to match as many
existing error messages as possible.  One common form was "argument X must
be a Y-bit unsigned literal".  Another was "argument X must be a literal
between X' and  Y', inclusive".  During reviews, Segher requested that I
eventually convert all messages of the first form into the second form for
consistency.  That's what this patch does, replacing all <x>-form
constraints (first form) with <x,y>-form constraints (second form).

For the moment, the parser will still accept <x> arguments, but I've added
a note in rs6000-builtins.def that this form is deprecated in favor of
<x,y>.  I think it's harmless to leave it in, in case a desire for the
distinction comes up in the future.
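
For readers following along, the relationship between the two forms is simple
arithmetic: an N-bit unsigned literal accepts the inclusive range 0 to 2^N - 1.
A minimal sketch in C, where struct literal_range and both function names are
hypothetical and not taken from the parser:

```c
#include <assert.h>
#include <stdbool.h>

/* Inclusive range of literal values accepted by a constraint.  */
struct literal_range
{
  long lo;
  long hi;
};

/* Range accepted by an N-bit unsigned literal constraint.  */
static struct literal_range
range_from_bits (unsigned bits)
{
  assert (bits > 0 && bits < 31);
  struct literal_range r = { 0, (1L << bits) - 1 };
  return r;
}

/* Check corresponding to "must be a literal between lo and hi, inclusive".  */
static bool
literal_in_range (long value, struct literal_range r)
{
  return value >= r.lo && value <= r.hi;
}
```

So a 6-bit unsigned literal corresponds to the range 0 through 63, and a 1-bit
literal to 0 through 1.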

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
Is this okay for trunk?

Thanks!
Bill

2022-01-12  Bill Schmidt  

gcc/
* config/rs6000/rs6000-builtins.def (MTFSB0): Replace <x>-form
constraints with <x,y>-form constraints.
(MTFSB1): Likewise.
(MTFSF): Likewise.
(UNPACK_IF): Likewise.
(UNPACK_TF): Likewise.
(DSS): Likewise.
(DST): Likewise.
(DSTST): Likewise.
(DSTSTT): Likewise.
(DSTT): Likewise.
(VCFSX): Likewise.
(VCFUX): Likewise.
(VCTSXS): Likewise.
(VCTUXS): Likewise.
(VSLDOI_16QI): Likewise.
(VSLDOI_4SF): Likewise.
(VSLDOI_4SI): Likewise.
(VSLDOI_8HI): Likewise.
(VSPLTB): Likewise.
(VSPLTH): Likewise.
(VSPLTW): Likewise.
(VEC_SET_V16QI): Likewise.
(VEC_SET_V4SF): Likewise.
(VEC_SET_V4SI): Likewise.
(VEC_SET_V8HI): Likewise.
(VSLDOI_2DF): Likewise.
(VSLDOI_2DI): Likewise.
(VEC_SET_V2DF): Likewise.
(VEC_SET_V2DI): Likewise.
(XVCVSXDDP_SCALE): Likewise.
(XVCVUXDDP_SCALE): Likewise.
(XXPERMDI_16QI): Likewise.
(XXPERMDI_1TI): Likewise.
(XXPERMDI_2DF): Likewise.
(XXPERMDI_2DI): Likewise.
(XXPERMDI_4SF): Likewise.
(XXPERMDI_4SI): Likewise.
(XXPERMDI_8HI): Likewise.
(XXSLDWI_16QI): Likewise.
(XXSLDWI_2DF): Likewise.
(XXSLDWI_2DI): Likewise.
(XXSLDWI_4SF): Likewise.
(XXSLDWI_4SI): Likewise.
(XXSLDWI_8HI): Likewise.
(XXSPLTD_V2DF): Likewise.
(XXSPLTD_V2DI): Likewise.
(UNPACK_V1TI): Likewise.
(BCDADD_V1TI): Likewise.
(BCDADD_V16QI): Likewise.
(BCDADD_EQ_V1TI): Likewise.
(BCDADD_EQ_V16QI): Likewise.
(BCDADD_GT_V1TI): Likewise.
(BCDADD_GT_V16QI): Likewise.
(BCDADD_LT_V1TI): Likewise.
(BCDADD_LT_V16QI): Likewise.
(BCDADD_OV_V1TI): Likewise.
(BCDADD_OV_V16QI): Likewise.
(BCDSUB_V1TI): Likewise.
(BCDSUB_V16QI): Likewise.
(BCDSUB_EQ_V1TI): Likewise.
(BCDSUB_EQ_V16QI): Likewise.
(BCDSUB_GT_V1TI): Likewise.
(BCDSUB_GT_V16QI): Likewise.
(BCDSUB_LT_V1TI): Likewise.
(BCDSUB_LT_V16QI): Likewise.
(BCDSUB_OV_V1TI): Likewise.
(BCDSUB_OV_V16QI): Likewise.
(VSTDCDP): Likewise.
(VSTDCSP): Likewise.
(VTDCDP): Likewise.
(VTDCSP): Likewise.
(TSTSFI_EQ_DD): Likewise.
(TSTSFI_EQ_TD): Likewise.
(TSTSFI_GT_DD): Likewise.
(TSTSFI_GT_TD): Likewise.
(TSTSFI_LT_DD): Likewise.
(TSTSFI_LT_TD): Likewise.
(TSTSFI_OV_DD): Likewise.
(TSTSFI_OV_TD): Likewise.
(VSTDCQP): Likewise.
(DDEDPD): Likewise.
(DDEDPDQ): Likewise.
(DENBCD): Likewise.
(DENBCDQ): Likewise.
(DSCLI): Likewise.
(DSCLIQ): Likewise.
(DSCRI): Likewise.
(DSCRIQ): Likewise.
(UNPACK_TD): Likewise.
(VSHASIGMAD): Likewise.
(VSHASIGMAW): Likewise.
(VCNTMBB): Likewise.
(VCNTMBD): Likewise.
(VCNTMBH): Likewise.
(VCNTMBW): Likewise.
(VREPLACE_UN_UV2DI): Likewise.
(VREPLACE_UN_UV4SI): Likewise.
(VREPLACE_UN_V2DF): Likewise.
(VREPLACE_UN_V2DI): Likewise.
(VREPLACE_UN_V4SF): Likewise.
(VREPLACE_UN_V4SI): Likewise.
(VREPLACE_ELT_UV2DI): Likewise.
(VREPLACE_ELT_UV4SI): Likewise.
(VREPLACE_ELT_V2DF): Likewise.
(VREPLACE_ELT_V2DI): Likewise.
(VREPLACE_ELT_V4SF): Likewise.
(VREPLACE_ELT_V4SI): Likewise.
(VSLDB_V16QI): Likewise.
(VSLDB_V2DI): Likewise.
(VSLDB_V4SI): Likewise.
(VSLDB_V8HI): Likewise.
(VSRDB_V16QI): Likewise.
(VSRDB_V2DI): Likewise.
(VSRDB_V4SI): Likewise.
(VSRDB_V8HI): Likewise.
(VXXSPLTI32DX_V4SF): Likewise.
(VXXSPLTI32DX_V4SI): Likewise.
(XXEVAL): Likewise.
(XXGENPCVM_V16QI): Likewise.
(XXGENPCVM_V2DI): Likewise.
(XXGENPCVM_V4SI): Likewise.
(XXGEN

Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only

2022-01-12 Thread Bill Schmidt via Gcc-patches
I think we need a fix or a revert for this today, please.  Bootstrap has been
broken for a couple of days during the last week of stage 3, which is really
problematic.

Thanks,
Bill

On 1/12/22 6:57 AM, Richard Biener via Gcc-patches wrote:
> On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
>
>> On 12/01/2022 11:59, Richard Biener wrote:
>>> On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
>>>
>>>> On 12/01/2022 11:44, Richard Sandiford wrote:
>>>>> Another alternative would be to push autodetected_vector_mode when the
>>>>> length is 1 and keep 1 as the starting point.
>>>>>
>>>>> Richard
>>>> I'm guessing we would still want to skip epilogue vectorization if
>>>> !VECTOR_MODE_P (autodetected_vector_mode) in that case?
>>> Practically we currently only support fixed width word_mode there,
>>> but eventually one could end up with 64bit DImode for the main loop
>>> and 32bit V4QImode in the epilogue ... so not sure if it's worth
>>> special-casing.  But I don't mind adding that skip.
>>>
>>> Richard.
>> I left out the skip, it shouldn't break anything as it would try that same
>> mode before anyway.
>> Just to clarify what I meant though was to skip if autodetected_vector_mode
>> wasn't a vector AND the target didn't define autovectorize_vector_modes, so in
>> that scenario it wouldn't ever try V4QImode for the epilogue if the mainloop
>> was autodetected DImode, I think...
>> Either way, this is less code, less complicated and doesn't analyze more than
>> it did before the original patch, so I'm happy with that too.
>>
>> Is this what you had in mind?
> -  mode_i = 1;
> +  if (vector_modes.length () == 1)
> +{
> +  /* If we only had VOIDmode then use AUTODETECTED_VECTOR_MODE to see if
> +an epilogue can be created with that mode.  */
> +  vector_modes[0] = autodetected_vector_mode;
> +  mode_i = 0;
> +}
> +  else
> +mode_i = 1;
> +
>
> I would have left out the condition and unconditionally do
>
>   vector_modes[0] = autodetected_vector_mode;
>   mode_i = 0;
>
> but OK if you think it makes sense to special case length == 1.
>
> Richard.
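
For reference, the special case discussed in the quoted hunk can be modelled
outside of GCC.  This is only a sketch: modes are plain integers, 0 stands in
for VOIDmode, and pick_epilogue_start is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Return the index at which epilogue mode iteration should start.  When the
   list holds only the placeholder entry, overwrite it with the autodetected
   mode and start there; otherwise skip the placeholder at index 0.  */
static size_t
pick_epilogue_start (int *vector_modes, size_t length, int autodetected_mode)
{
  assert (length >= 1);
  if (length == 1)
    {
      vector_modes[0] = autodetected_mode;
      return 0;
    }
  return 1;
}
```

Richard's unconditional variant would drop the length test and always store
the autodetected mode in slot 0 and start from index 0.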


Re: [PATCH] PR 102935, Fix pr101384-1.c code generation test.

2022-01-11 Thread Bill Schmidt via Gcc-patches
Hi Mike,

This looks fine to me.  Maintainers?

Thanks,
Bill

On 1/7/22 6:33 PM, Michael Meissner wrote:
> Fix pr101384-1.c code generation test.
>
> Add support for the compiler using XXSPLTIB reg,255 to load all 1's into a
> register on power9 and above instead of using VSPLTI{B,H,W} reg,-1.
>
> gcc/testsuite/
> 2022-01-07  Michael Meissner  
>
>   PR testsuite/102935
>   * gcc.target/powerpc/pr101384-1.c: Update insn regexp for power9
>   and power10.
> ---
>  gcc/testsuite/gcc.target/powerpc/pr101384-1.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr101384-1.c b/gcc/testsuite/gcc.target/powerpc/pr101384-1.c
> index 627d7d76721..41cf84bf8bc 100644
> --- a/gcc/testsuite/gcc.target/powerpc/pr101384-1.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr101384-1.c
> @@ -2,7 +2,7 @@
>  /* { dg-do compile { target le } } */
>  /* { dg-options "-O2 -maltivec" } */
>  /* { dg-require-effective-target powerpc_altivec_ok } */
> -/* { dg-final { scan-assembler-times {\mvspltis[whb] [^\n\r]*,-1\M} 9 } } */
> +/* { dg-final { scan-assembler-times {\mvspltis[whb] [^\n\r]*,-1\M|\mxxspltib[^\n\r]*,255\M} 9 } } */
>  /* { dg-final { scan-assembler-times {\mvslw\M} 3 } } */
>  /* { dg-final { scan-assembler-times {\mvslh\M} 3 } } */
>  /* { dg-final { scan-assembler-times {\mvslb\M} 3 } } */
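
The reason both instruction forms can satisfy the test is that splatting the
byte 255 and splatting -1 at any element width produce the same all-ones
register.  A small portable check of that equivalence (the helper names are
hypothetical, and the 16-byte array only models a vector register):

```c
#include <stdint.h>
#include <string.h>

/* Fill a 16-byte register image with copies of one byte.  */
static void
splat_byte (uint8_t reg[16], uint8_t value)
{
  memset (reg, value, 16);
}

/* Fill it with eight copies of a 16-bit halfword.  */
static void
splat_halfword (uint8_t reg[16], int16_t value)
{
  for (int i = 0; i < 8; i++)
    memcpy (reg + 2 * i, &value, sizeof value);
}

/* Fill it with four copies of a 32-bit word.  */
static void
splat_word (uint8_t reg[16], int32_t value)
{
  for (int i = 0; i < 4; i++)
    memcpy (reg + 4 * i, &value, sizeof value);
}

/* Nonzero when two register images hold identical bits.  */
static int
same_bits (const uint8_t a[16], const uint8_t b[16])
{
  return memcmp (a, b, 16) == 0;
}
```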


[PATCH] rs6000: Don't #ifdef "short" built-in names

2022-01-06 Thread Bill Schmidt via Gcc-patches
Hi!

It was recently pointed out that we get anomalous behavior when using
__attribute__((target)) to select a CPU.  As an example, when building for
-mcpu=power8 but using __attribute__((target("cpu=power10"))), it is legal
to call __builtin_vec_mod, but not vec_mod, even though these are
equivalent.  This is because the equivalence is established with a #define
that is guarded by #ifdef _ARCH_PWR10.
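
A stripped-down illustration of the mechanism, using made-up names
(long_name_mod, guarded_mod, short_mod and HYPOTHETICAL_ARCH_PWR10 are all
stand-ins, not the real header contents): a guarded #define only exists when
the macro is predefined for the whole translation unit, so a per-function
target attribute cannot bring the short name into existence.

```c
/* Stand-in for a builtin that is always declared.  */
static int
long_name_mod (int a, int b)
{
  return a % b;
}

/* Guarded form: the short name exists only when the command-line CPU
   selection predefines the macro.  */
#ifdef HYPOTHETICAL_ARCH_PWR10
#define guarded_mod long_name_mod
#endif

/* Unguarded form: the short name is always available, and whether the call
   is legal is decided where the builtin itself is checked.  */
#define short_mod long_name_mod

static int
use_short_name (int a, int b)
{
  return short_mod (a, b);
}
```

In this sketch, guarded_mod is undefined unless the file is compiled with
-DHYPOTHETICAL_ARCH_PWR10, while short_mod always resolves.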

This goofy behavior occurs with both the old builtins support and the
new.  One of the goals of the new builtins support was to make sure all
appropriate interfaces are available using __attribute__((target)), so I
failed in this respect.  This patch corrects the problem by removing the
apply.  For example, #ifdef __PPU__ is still appropriate.

Bootstrapped and tested on powerpc64le-linux-gnu.  Is this okay for trunk?

Thanks!
Bill


2022-01-06  Bill Schmidt  

gcc/
* config/rs6000/rs6000-overload.def (VEC_ABSD): Remove #ifdef token.
(VEC_BLENDV): Likewise.
(VEC_BPERM): Likewise.
(VEC_CFUGE): Likewise.
(VEC_CIPHER_BE): Likewise.
(VEC_CIPHERLAST_BE): Likewise.
(VEC_CLRL): Likewise.
(VEC_CLRR): Likewise.
(VEC_CMPNEZ): Likewise.
(VEC_CNTLZ): Likewise.
(VEC_CNTLZM): Likewise.
(VEC_CNTTZM): Likewise.
(VEC_CNTLZ_LSBB): Likewise.
(VEC_CNTM): Likewise.
(VEC_CNTTZ): Likewise.
(VEC_CNTTZ_LSBB): Likewise.
(VEC_CONVERT_4F32_8F16): Likewise.
(VEC_DIV): Likewise.
(VEC_DIVE): Likewise.
(VEC_EQV): Likewise.
(VEC_EXPANDM): Likewise.
(VEC_EXTRACT_FP_FROM_SHORTH): Likewise.
(VEC_EXTRACT_FP_FROM_SHORTL): Likewise.
(VEC_EXTRACTH): Likewise.
(VEC_EXTRACTL): Likewise.
(VEC_EXTRACTM): Likewise.
(VEC_EXTRACT4B): Likewise.
(VEC_EXTULX): Likewise.
(VEC_EXTURX): Likewise.
(VEC_FIRSTMATCHINDEX): Likewise.
(VEC_FIRSTMACHOREOSINDEX): Likewise.
(VEC_FIRSTMISMATCHINDEX): Likewise.
(VEC_FIRSTMISMATCHOREOSINDEX): Likewise.
(VEC_GB): Likewise.
(VEC_GENBM): Likewise.
(VEC_GENHM): Likewise.
(VEC_GENWM): Likewise.
(VEC_GENDM): Likewise.
(VEC_GENQM): Likewise.
(VEC_GENPCVM): Likewise.
(VEC_GNB): Likewise.
(VEC_INSERTH): Likewise.
(VEC_INSERTL): Likewise.
(VEC_INSERT4B): Likewise.
(VEC_LXVL): Likewise.
(VEC_MERGEE): Likewise.
(VEC_MERGEO): Likewise.
(VEC_MOD): Likewise.
(VEC_MSUB): Likewise.
(VEC_MULH): Likewise.
(VEC_NAND): Likewise.
(VEC_NCIPHER_BE): Likewise.
(VEC_NCIPHERLAST_BE): Likewise.
(VEC_NEARBYINT): Likewise.
(VEC_NMADD): Likewise.
(VEC_ORC): Likewise.
(VEC_PDEP): Likewise.
(VEC_PERMX): Likewise.
(VEC_PEXT): Likewise.
(VEC_POPCNT): Likewise.
(VEC_PARITY_LSBB): Likewise.
(VEC_REPLACE_ELT): Likewise.
(VEC_REPLACE_UN): Likewise.
(VEC_REVB): Likewise.
(VEC_RINT): Likewise.
(VEC_RLMI): Likewise.
(VEC_RLNM): Likewise.
(VEC_SBOX_BE): Likewise.
(VEC_SIGNEXTI): Likewise.
(VEC_SIGNEXTLL): Likewise.
(VEC_SIGNEXTQ): Likewise.
(VEC_SLDB): Likewise.
(VEC_SLV): Likewise.
(VEC_SPLATI): Likewise.
(VEC_SPLATID): Likewise.
(VEC_SPLATI_INS): Likewise.
(VEC_SQRT): Likewise.
(VEC_SRDB): Likewise.
(VEC_SRV): Likewise.
(VEC_STRIL): Likewise.
(VEC_STRIL_P): Likewise.
(VEC_STRIR): Likewise.
(VEC_STRIR_P): Likewise.
(VEC_STXVL): Likewise.
(VEC_TERNARYLOGIC): Likewise.
(VEC_TEST_LSBB_ALL_ONES): Likewise.
(VEC_TEST_LSBB_ALL_ZEROS): Likewise.
(VEC_VEE): Likewise.
(VEC_VES): Likewise.
(VEC_VIE): Likewise.
(VEC_VPRTYB): Likewise.
(VEC_VSCEEQ): Likewise.
(VEC_VSCEGT): Likewise.
(VEC_VSCELT): Likewise.
(VEC_VSCEUO): Likewise.
(VEC_VSEE): Likewise.
(VEC_VSES): Likewise.
(VEC_VSIE): Likewise.
(VEC_VSTDC): Likewise.
(VEC_VSTDCN): Likewise.
(VEC_VTDC): Likewise.
(VEC_XL): Likewise.
(VEC_XL_BE): Likewise.
(VEC_XL_LEN_R): Likewise.
(VEC_XL_SEXT): Likewise.
(VEC_XL_ZEXT): Likewise.
(VEC_XST): Likewise.
(VEC_XST_BE): Likewise.
(VEC_XST_LEN_R): Likewise.
(VEC_XST_TRUNC): Likewise.
(VEC_XXPERMDI): Likewise.
(VEC_XXSLDWI): Likewise.
(VEC_TSTSFI_EQ_DD): Likewise.
(VEC_TSTSFI_EQ_TD): Likewise.
(VEC_TSTSFI_GT_DD): Likewise.
(VEC_TSTSFI_GT_TD): Likewise.
(VEC_TSTSFI_LT_DD): Likewise.
(VEC_TSTSFI_LT_TD): Likewise.
(VEC_TSTSFI_OV_DD): Likewise.
(VEC_TSTSFI_OV_TD): Likewise.
(VEC_VADDCUQ): Likewise.
(VEC_VADDECUQ)

[PATCH] rs6000: More factoring of overload processing

2022-01-06 Thread Bill Schmidt via Gcc-patches
Hi!

This patch continues the refactoring started with r12-6014.  I had previously
noted that the resolve_vec* routines can be further simplified by processing
the argument list earlier, so that all routines can use the arrays of arguments
and types.  I found that this was useful for some of the routines, but not for
all of them.

For several of the special-cased overloads, we don't specify all of the
possible type combinations in rs6000-overload.def, because the types don't
matter for the expansion we do.  For these, we can't use generic error message
handling when the number of arguments is incorrect, because the result is
misleading error messages that indicate argument types are wrong.

So this patch goes halfway and improves the factoring on the remaining special
cases, but leaves vec_splats, vec_promote, vec_extract, vec_insert, and
vec_step alone.
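
The shape of the refactoring can be sketched independently of GCC's tree
types.  In this hypothetical model, arguments are gathered into parallel
args[] and types[] arrays and the count is checked once up front, so that a
helper such as resolve_mul_model only looks at the arrays.  All names and
types here are illustrative assumptions.

```c
#include <stddef.h>

enum arg_type { T_INT, T_FLOAT, T_VECTOR };

struct arg
{
  enum arg_type type;
  int value;
};

#define MAX_ARGS 3

/* Gather arguments and their types into parallel arrays.  Returns 0 on
   success, -1 when the argument count does not match.  */
static int
gather_args (const struct arg *arglist, size_t nargs, size_t expected,
             int *args, enum arg_type *types)
{
  if (nargs != expected || nargs > MAX_ARGS)
    return -1;
  for (size_t i = 0; i < nargs; i++)
    {
      args[i] = arglist[i].value;
      types[i] = arglist[i].type;
    }
  return 0;
}

/* Helper that relies on the caller having checked the count already.  */
static int
resolve_mul_model (const int *args, const enum arg_type *types, int *result)
{
  /* Both arguments must be vectors of the same type.  */
  if (types[0] != T_VECTOR || types[0] != types[1])
    return -1;
  *result = args[0] * args[1];
  return 0;
}
```

The overloads left alone by the patch are the ones where this generic count
check would produce a misleading diagnostic.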

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is this
okay for trunk?

Thanks!
Bill


2022-01-06  Bill Schmidt  

gcc/
* config/rs6000/rs6000-c.c (resolve_vec_mul): Accept args and types
parameters instead of arglist and nargs.  Simplify accordingly.  Remove
unnecessary test for argument count mismatch.
(resolve_vec_cmpne): Likewise.
(resolve_vec_adde_sube): Likewise.
(resolve_vec_addec_subec): Likewise.
(altivec_resolve_overloaded_builtin): Move overload special handling
after the gathering of arguments into args[] and types[] and the test
for correct number of arguments.  Don't perform the test for correct
number of arguments for certain special cases.  Call the other special
cases with args and types instead of arglist and nargs.
---
 gcc/config/rs6000/rs6000-c.c | 304 +++
 1 file changed, 127 insertions(+), 177 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index 24a081ced37..189a70d89bf 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -939,37 +939,25 @@ altivec_build_resolved_builtin (tree *args, int n, tree fntype, tree ret_type,
 enum resolution { unresolved, resolved, resolved_bad };
 
 /* Resolve an overloaded vec_mul call and return a tree expression for the
-   resolved call if successful.  NARGS is the number of arguments to the call.
-   ARGLIST contains the arguments.  RES must be set to indicate the status of
+   resolved call if successful.  ARGS contains the arguments to the call.
+   TYPES contains their types.  RES must be set to indicate the status of
the resolution attempt.  LOC contains statement location information.  */
 
 static tree
-resolve_vec_mul (resolution *res, vec<tree, va_gc> *arglist, unsigned nargs,
-location_t loc)
+resolve_vec_mul (resolution *res, tree *args, tree *types, location_t loc)
 {
   /* vec_mul needs to be special cased because there are no instructions for it
  for the {un}signed char, {un}signed short, and {un}signed int types.  */
-  if (nargs != 2)
-{
-  error ("builtin %qs only accepts 2 arguments", "vec_mul");
-  *res = resolved;
-  return error_mark_node;
-}
-
-  tree arg0 = (*arglist)[0];
-  tree arg0_type = TREE_TYPE (arg0);
-  tree arg1 = (*arglist)[1];
-  tree arg1_type = TREE_TYPE (arg1);
 
   /* Both arguments must be vectors and the types must be compatible.  */
-  if (TREE_CODE (arg0_type) != VECTOR_TYPE
-  || !lang_hooks.types_compatible_p (arg0_type, arg1_type))
+  if (TREE_CODE (types[0]) != VECTOR_TYPE
+  || !lang_hooks.types_compatible_p (types[0], types[1]))
 {
   *res = resolved_bad;
   return error_mark_node;
 }
 
-  switch (TYPE_MODE (TREE_TYPE (arg0_type)))
+  switch (TYPE_MODE (TREE_TYPE (types[0])))
 {
 case E_QImode:
 case E_HImode:
@@ -978,21 +966,21 @@ resolve_vec_mul (resolution *res, vec<tree, va_gc> *arglist, unsigned nargs,
 case E_TImode:
   /* For scalar types just use a multiply expression.  */
   *res = resolved;
-  return fold_build2_loc (loc, MULT_EXPR, TREE_TYPE (arg0), arg0,
- fold_convert (TREE_TYPE (arg0), arg1));
+  return fold_build2_loc (loc, MULT_EXPR, types[0], args[0],
+ fold_convert (types[0], args[1]));
 case E_SFmode:
   {
/* For floats use the xvmulsp instruction directly.  */
*res = resolved;
tree call = rs6000_builtin_decls[RS6000_BIF_XVMULSP];
-   return build_call_expr (call, 2, arg0, arg1);
+   return build_call_expr (call, 2, args[0], args[1]);
   }
 case E_DFmode:
   {
/* For doubles use the xvmuldp instruction directly.  */
*res = resolved;
tree call = rs6000_builtin_decls[RS6000_BIF_XVMULDP];
-   return build_call_expr (call, 2, arg0, arg1);
+   return build_call_expr (call, 2, args[0], args[1]);
   }
 /* Other types are errors.  */
 default:
@@ -1002,37 +990,25 @@ resolve_vec_mul (

Re: [PATCH] rs6000: Skip overload instances with uninitialized fntype (PR103622)

2022-01-05 Thread Bill Schmidt via Gcc-patches
Hi!  I'd like to ping this patch, now that I'm back from break.

Thanks!
Bill

On 12/13/21 10:15 AM, Bill Schmidt wrote:
> Hi!
>
> For some data types like IEEE-128, we determine whether the type is available
> at built-in function initialization time.  If it's not, then we don't provide
> the function type for function instances that require the data type.  PR103622
> observes that this can cause us to ICE when running the list of instances when
> the target doesn't support the data type.
>
> Ideally, we wouldn't even put such an instance in the list of instances that
> an overload can map to, but to do that is much more complicated.  Instead,
> this patch just ensures we don't dereference a NULL pointer when the situation
> arises.
>
> Tested the fix on a powerpc-e300c3-linux-gnu cross.  Bootstrapped and tested on
> powerpc64le-linux-gnu with no regressions.  Is this okay for trunk?
>
> Thanks!
> Bill
>
>
> 2021-12-13  Bill Schmidt  
>
> gcc/
>   PR target/103622
>   * config/rs6000/rs6000-c.c (altivec_resolve_new_overloaded_builtin):
>   Skip over instances with undefined function types.
> ---
>  gcc/config/rs6000/rs6000-c.c | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
> index 8e83d97e72f..fc4cc929884 100644
> --- a/gcc/config/rs6000/rs6000-c.c
> +++ b/gcc/config/rs6000/rs6000-c.c
> @@ -2943,6 +2943,12 @@ altivec_resolve_new_overloaded_builtin (location_t loc, tree fndecl,
>  
>   for (; instance != NULL; instance = instance->next)
> {
> + /* It is possible for an instance to require a data type that isn't
> +defined on this target, in which case instance->fntype will be
> +NULL.  */
> + if (!instance->fntype)
> +   continue;
> +
>   bool mismatch = false;
>   tree nextparm = TYPE_ARG_TYPES (instance->fntype);
>  
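
The guard in the hunk above follows a common pattern: walk the list and skip
entries whose optional field was never filled in.  A self-contained model of
it, where struct instance_model and count_usable are hypothetical and only
mirror the shape of the real data structure:

```c
#include <stddef.h>

/* Model of an overload instance; fntype is NULL when the required data
   type does not exist on the target.  */
struct instance_model
{
  const char *fntype;
  struct instance_model *next;
};

/* Count the instances that can safely be examined.  */
static size_t
count_usable (const struct instance_model *instance)
{
  size_t n = 0;
  for (; instance != NULL; instance = instance->next)
    {
      /* Skip before any dereference of fntype would happen.  */
      if (!instance->fntype)
        continue;
      n++;
    }
  return n;
}
```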


Re: [PATCH 2/2] rs6000: Update darn testcases

2021-12-17 Thread Bill Schmidt via Gcc-patches
Hi!

On 12/17/21 11:36 AM, Segher Boessenkool wrote:
> Make the darn testcases work (and be tested) in 32-bit mode as well.
> They used to ICE, but they no longer do.
>
>
> 2021-12-17  Segher Boessenkool 
>
> gcc/testsuite/
>   PR target/103624
>   * gcc.target/powerpc/darn-0.c: Remove target clause.
>   * gcc.target/powerpc/darn-1.c: Remove target clause. Remove lp64
>   requirement.  Change return type to long.
>   * gcc.target/powerpc/darn-2.c: Ditto.
>   * gcc.target/powerpc/darn-3.c: Remove target clause.

LGTM.

Thanks!
Bill

>
> ---
>  gcc/testsuite/gcc.target/powerpc/darn-0.c | 2 +-
>  gcc/testsuite/gcc.target/powerpc/darn-1.c | 5 ++---
>  gcc/testsuite/gcc.target/powerpc/darn-2.c | 5 ++---
>  gcc/testsuite/gcc.target/powerpc/darn-3.c | 2 +-
>  4 files changed, 6 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.target/powerpc/darn-0.c b/gcc/testsuite/gcc.target/powerpc/darn-0.c
> index f446f494b06d..64d98f5f91d7 100644
> --- a/gcc/testsuite/gcc.target/powerpc/darn-0.c
> +++ b/gcc/testsuite/gcc.target/powerpc/darn-0.c
> @@ -1,4 +1,4 @@
> -/* { dg-do compile { target { powerpc*-*-* } } } */
> +/* { dg-do compile } */
>  /* { dg-require-effective-target powerpc_p9vector_ok } */
>  /* { dg-skip-if "" { powerpc*-*-aix* } } */
>  /* { dg-options "-mdejagnu-cpu=power9" } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/darn-1.c b/gcc/testsuite/gcc.target/powerpc/darn-1.c
> index 0938718a5ad6..f483a89862d0 100644
> --- a/gcc/testsuite/gcc.target/powerpc/darn-1.c
> +++ b/gcc/testsuite/gcc.target/powerpc/darn-1.c
> @@ -1,12 +1,11 @@
> -/* { dg-do compile { target { powerpc*-*-* } } } */
> +/* { dg-do compile } */
>  /* { dg-require-effective-target powerpc_p9vector_ok } */
> -/* { dg-require-effective-target lp64 } */
>  /* { dg-skip-if "" { powerpc*-*-aix* } } */
>  /* { dg-options "-mdejagnu-cpu=power9" } */
>  
>  #include 
>  
> -long long get_conditioned_random ()
> +long get_conditioned_random ()
>  {
>return __builtin_darn ();
>  }
> diff --git a/gcc/testsuite/gcc.target/powerpc/darn-2.c b/gcc/testsuite/gcc.target/powerpc/darn-2.c
> index 64e44b244c4b..56a9ffb677b4 100644
> --- a/gcc/testsuite/gcc.target/powerpc/darn-2.c
> +++ b/gcc/testsuite/gcc.target/powerpc/darn-2.c
> @@ -1,12 +1,11 @@
> -/* { dg-do compile { target { powerpc*-*-* } } } */
> +/* { dg-do compile } */
>  /* { dg-require-effective-target powerpc_p9vector_ok } */
> -/* { dg-require-effective-target lp64 } */
>  /* { dg-skip-if "" { powerpc*-*-aix* } } */
>  /* { dg-options "-mdejagnu-cpu=power9" } */
>  
>  #include 
>  
> -long long get_raw_random ()
> +long get_raw_random ()
>  {
>return __builtin_darn_raw ();
>  }
> diff --git a/gcc/testsuite/gcc.target/powerpc/darn-3.c b/gcc/testsuite/gcc.target/powerpc/darn-3.c
> index 477901fde70d..4c68fad80d5d 100644
> --- a/gcc/testsuite/gcc.target/powerpc/darn-3.c
> +++ b/gcc/testsuite/gcc.target/powerpc/darn-3.c
> @@ -1,4 +1,4 @@
> -/* { dg-do compile { target { powerpc*-*-* } } } */
> +/* { dg-do compile } */
>  /* { dg-skip-if "" { powerpc*-*-aix* } } */
>  /* { dg-options "-O2 -mdejagnu-cpu=power9" } */
>  


Re: [PATCH 1/2] rs6000: Redo darn (PR103624)

2021-12-17 Thread Bill Schmidt via Gcc-patches
Hi!

On 12/17/21 11:36 AM, Segher Boessenkool wrote:
> The builtins now all return "long".  The patterns have :GPR as the
> output mode, so they can be 32-bit as well (the instruction makes sense
> in 32 bit just fine).  The builtins expand to the DImode version
> normally, but to the SImode if {32bit} is true.
>
> 2021-12-17  Segher Boessenkool 
>
>   PR target/103624
>   * config/rs6000/rs6000-builtins.def (__builtin_darn): Expand to
>   darn_64_di.  Add {32bit} attribute.  Return long.
>   (__builtin_darn_32): Expand to darn_32_di.  Add {32bit} attribute.
>   Return long.
>   (__builtin_darn_raw): Expand to darn_raw_di.  Add {32bit} attribute.
>   Return long.
>   * config/rs6000/rs6000-call.c (rs6000_expand_builtin): Expand the darn
>   builtins to the _si variants for -m32.
>   * config/rs6000/rs6000.md (UNSPECV_DARN_32, UNSPECV_DARN_RAW): Delete.
>   (UNSPECV_DARN): Update comment.
>   (darn_32, darn_raw, darn): Delete.
>   (darn_32_<mode>, darn_64_<mode>, darn_raw_<mode> for GPR): New.
>   (@darn<mode> for GPR): New.

Patch LGTM.  Thanks for doing the legwork on this!

Bill
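
As a reading aid for the quoted patch below, the selection it describes
(DImode pattern by default, SImode pattern for 32-bit) can be modelled like
this.  The enumerators echo the pattern names in the patch, but the enums and
select_darn_icode are hypothetical and not GCC code.

```c
/* The three darn builtins.  */
enum darn_builtin { BIF_DARN, BIF_DARN_32, BIF_DARN_RAW };

/* Insn patterns, one DImode and one SImode variant per builtin.  */
enum darn_icode
{
  ICODE_DARN_64_DI, ICODE_DARN_32_DI, ICODE_DARN_RAW_DI,
  ICODE_DARN_64_SI, ICODE_DARN_32_SI, ICODE_DARN_RAW_SI
};

/* Pick the pattern: the default comes from the builtin definition, and the
   32-bit override mirrors the added branch in the builtin expander.  */
static enum darn_icode
select_darn_icode (enum darn_builtin fcode, int target_32bit)
{
  switch (fcode)
    {
    case BIF_DARN:
      return target_32bit ? ICODE_DARN_64_SI : ICODE_DARN_64_DI;
    case BIF_DARN_32:
      return target_32bit ? ICODE_DARN_32_SI : ICODE_DARN_32_DI;
    default:
      return target_32bit ? ICODE_DARN_RAW_SI : ICODE_DARN_RAW_DI;
    }
}
```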

>
> ---
>  gcc/config/rs6000/rs6000-builtins.def | 12 -
>  gcc/config/rs6000/rs6000-call.c   |  6 +
>  gcc/config/rs6000/rs6000.md   | 47 +--
>  3 files changed, 40 insertions(+), 25 deletions(-)
>
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index 45ce160bd421..3ad5a135eaec 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -2798,14 +2798,14 @@
>  
>  ; Miscellaneous P9 functions
>  [power9]
> -  signed long long __builtin_darn ();
> -DARN darn {}
> +  signed long __builtin_darn ();
> +DARN darn_64_di {32bit}
>  
> -  signed int __builtin_darn_32 ();
> -DARN_32 darn_32 {}
> +  signed long __builtin_darn_32 ();
> +DARN_32 darn_32_di {32bit}
>  
> -  signed long long __builtin_darn_raw ();
> -DARN_RAW darn_raw {}
> +  signed long __builtin_darn_raw ();
> +DARN_RAW darn_raw_di {32bit}
>  
>const signed int __builtin_dtstsfi_eq_dd (const int<6>, _Decimal64);
>  TSTSFI_EQ_DD dfptstsfi_eq_dd {}
> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index b98f4a4c97f7..cc55174c6b72 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -5631,6 +5631,12 @@ rs6000_expand_builtin (tree exp, rtx target, rtx /* subtarget */,
>   icode = CODE_FOR_rs6000_mftb_si;
>else if (fcode == RS6000_BIF_BPERMD)
>   icode = CODE_FOR_bpermd_si;
> +  else if (fcode == RS6000_BIF_DARN)
> + icode = CODE_FOR_darn_64_si;
> +  else if (fcode == RS6000_BIF_DARN_32)
> + icode = CODE_FOR_darn_32_si;
> +  else if (fcode == RS6000_BIF_DARN_RAW)
> + icode = CODE_FOR_darn_raw_si;
>else
>   gcc_unreachable ();
>  }
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 4122acb98cfd..9be484c7cf83 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -172,9 +172,7 @@ (define_c_enum "unspecv"
> UNSPECV_EH_RR ; eh_reg_restore
> UNSPECV_ISYNC ; isync instruction
> UNSPECV_MFTB  ; move from time base
> -   UNSPECV_DARN  ; darn 1 (deliver a random number)
> -   UNSPECV_DARN_32   ; darn 2
> -   UNSPECV_DARN_RAW  ; darn 0
> +   UNSPECV_DARN  ; darn (deliver a random number)
> UNSPECV_NLGR  ; non-local goto receiver
> UNSPECV_MFFS  ; Move from FPSCR
> UNSPECV_MFFSL ; Move from FPSCR light instruction version
> @@ -15065,25 +15063,36 @@ (define_insn "*cmp_hw"
>  
>  ;; Miscellaneous ISA 3.0 (power9) instructions
>  
> -(define_insn "darn_32"
> -  [(set (match_operand:SI 0 "register_operand" "=r")
> -(unspec_volatile:SI [(const_int 0)] UNSPECV_DARN_32))]
> +(define_expand "darn_32_<mode>"
> +  [(use (match_operand:GPR 0 "register_operand"))]
>"TARGET_P9_MISC"
> -  "darn %0,0"
> -  [(set_attr "type" "integer")])
> +{
> +  emit_insn (gen_darn (<MODE>mode, operands[0], const0_rtx));
> +  DONE;
> +})
>  
> -(define_insn "darn_raw"
> -  [(set (match_operand:DI 0 "register_operand" "=r")
> -(unspec_volatile:DI [(const_int 0)] UNSPECV_DARN_RAW))]
> -  "TARGET_P9_MISC && TARGET_64BIT"
> -  "darn %0,2"
> -  [(set_attr "type" "integer")])
> +(define_expand "darn_64_<mode>"
> +  [(use (match_operand:GPR 0 "register_operand"))]
> +  "TARGET_P9_MISC"
> +{
> +  emit_insn (gen_darn (<MODE>mode, operands[0], const1_rtx));
> +  DONE;
> +})
>  
> -(define_insn "darn"
> -  [(set (match_operand:DI 0 "register_operand" "=r")
> -(unspec_volatile:DI [(const_int 0)] UNSPECV_DARN))]
> -  "TARGET_P9_MISC && TARGET_64BIT"
> -  "darn %0,1"
> +(define_expand "darn_raw_<mode>"
> +  [(use (match_operand:GPR 0 "register_operand"))]
> +  

[COMMITTED] rs6000: Fix fake vec_promote overload

2021-12-17 Thread Bill Schmidt via Gcc-patches
Hi!

rs6000-overload.def defines one instance of vec_promote so that it can be
registered with the front end.  Actual expansion of the vec_promote overload
is done with special-case code in rs6000-c.c.  During another cleanup, I
observed that the fake instance has the wrong number of arguments.  Fix that.
This has no effect other than to avoid confusion.

Bootstrapped and tested on powerpc64le-linux-gnu, committed as obvious in
r12-6043.

Thanks!
Bill


diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index 531a4fcd1af..2b2853918c0 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -3021,7 +3021,7 @@
 ; is replaced by a constructor.  The single overload here causes
 ; __builtin_vec_promote to be registered with the front end so that can happen.
 [VEC_PROMOTE, vec_promote, __builtin_vec_promote]
-  vsi __builtin_vec_promote (vsi);
+  vsi __builtin_vec_promote (vsi, const int);
 ABS_V4SI PROMOTE_FAKERY
 
 [VEC_RE, vec_re, __builtin_vec_re]



Re: [pushed] Darwin, ppc: Additional change for r12-5974.

2021-12-17 Thread Bill Schmidt via Gcc-patches
Iain, thanks very much for fixing this, and I'm very sorry for the oversight!

Bill

On 12/17/21 3:46 AM, Iain Sandoe via Gcc-patches wrote:
> This adds a missed change from r12-5974-g926d64906af.
> The builtin_decls array has been renamed to drop the trailing
> _x that was used during the main changes to the builtins.
>
> This fixes bootstrap for powerpc-darwin9, tested there, pushed
> to master, thanks,
> Iain
>
> Signed-off-by: Iain Sandoe 
>
> gcc/ChangeLog:
>
>   * config/rs6000/darwin.h: Drop trailing _x from the
>   builtin_decls array name.
> ---
>  gcc/config/rs6000/darwin.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/config/rs6000/darwin.h b/gcc/config/rs6000/darwin.h
> index 7bc1009a523..8288003038e 100644
> --- a/gcc/config/rs6000/darwin.h
> +++ b/gcc/config/rs6000/darwin.h
> @@ -507,7 +507,7 @@
>  #define SUBTARGET_INIT_BUILTINS \
>  do { \
>darwin_patch_builtins ();  \
> -  rs6000_builtin_decls_x[(unsigned) (RS6000_BIF_CFSTRING)]   \
> +  rs6000_builtin_decls[(unsigned) (RS6000_BIF_CFSTRING)] \
>  = darwin_init_cfstring_builtins ((unsigned) (RS6000_BIF_CFSTRING)); \
>  } while(0)
>


Re: [PATCH] rs6000: Refactor altivec_build_resolved_builtin

2021-12-15 Thread Bill Schmidt via Gcc-patches
Hi!

On 12/15/21 12:16 PM, Segher Boessenkool wrote:
>> +  /* Note:  vec_nand also works but opt changes vec_nand's
>> + to vec_nor's anyway.  */
> Maybe there should be a vec_not?  There is one at the RTL level (called
> one_cmpl2).

As I recall, we have an issue open for this already... but nobody's grabbed it
yet.

Thanks for the review!

(I'll change all those VEC_* things to lower-case.)

Bill



Re: [PATCH] rs6000: __builtin_darn[_raw] should be in [power9-64] (PR103624)

2021-12-15 Thread Bill Schmidt via Gcc-patches


On 12/15/21 12:41 PM, Segher Boessenkool wrote:
> On Wed, Dec 15, 2021 at 08:00:02AM -0600, Bill Schmidt wrote:
>>> No, all builtins should work in either mode, and always return long.
>>> If the patterns are broken, the *patterns* should be fixed :-)
>> OK, thanks!  This is much clearer now.
>>
>> I've opened an internal issue about the deficiencies of the darn patterns and
>> their associated built-ins.  In response to PR103624, I would like to start
>> with the existing patch to ensure the new support mirrors what we had before,
>> so we have that as a baseline.  We can then move on to fixing the larger
>> set of problems.  Is that a reasonable plan?
> It is much more work than doing it correct in the first place.
>
> I'll do the RTL side, if you want?

Sure, go ahead.

Bill

>
>
> Segher


[PATCH] rs6000: Refactor altivec_build_resolved_builtin

2021-12-15 Thread Bill Schmidt via Gcc-patches
Hi!

While replacing the built-in machinery, we agreed to defer some necessary
refactoring of the overload processing.  This patch cleans it up considerably.

I've put in one FIXME for an additional level of cleanup that should be done
independently.  The various helper functions (resolve_VEC_*) can be simplified
if we move the argument processing in altivec_resolve_overloaded_builtin
earlier.  But this requires making nontrivial changes to those functions that
will need careful review.  Let's do that in a later patch.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is this
okay for trunk?

Thanks!
Bill


2021-12-09  Bill Schmidt  

gcc/
* config/rs6000/rs6000-c.c (resolution): New enum.
(resolve_VEC_MUL): New function.
(resolve_VEC_CMPNE): Likewise.
(resolve_VEC_ADDE_SUBE): Likewise.
(resolve_VEC_ADDEC_SUBEC): Likewise.
(resolve_VEC_SPLATS): Likewise.
(resolve_VEC_EXTRACT): Likewise.
(resolve_VEC_INSERT): Likewise.
(resolve_VEC_STEP): Likewise.
(find_instance): Likewise.
(altivec_resolve_overloaded_builtin): Many cleanups:  Call factored-out
functions.  Move variable declarations closer to uses.  Add commentary.
Remove unnecessary levels of braces.  Avoid use of gotos.  Change
misleading variable names.  Use switches over if-else-if chains.
---
 gcc/config/rs6000/rs6000-c.c | 1835 +++---
 1 file changed, 1004 insertions(+), 831 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index e0ebdeed548..45f485aab44 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -928,28 +928,847 @@ altivec_build_resolved_builtin (tree *args, int n, tree fntype, tree ret_type,
   return fold_convert (ret_type, call);
 }
 
+/* Enumeration of possible results from attempted overload resolution.
+   This is used by special-case helper functions to tell their caller
+   whether they succeeded and what still needs to be done.
+
+   unresolved = Still needs processing
+ resolved = Resolved (but may be an error_mark_node)
+  resolved_bad = An error that needs handling by the caller.  */
+
+enum resolution { unresolved, resolved, resolved_bad };
+
+/* Resolve an overloaded vec_mul call and return a tree expression for the
+   resolved call if successful.  NARGS is the number of arguments to the call.
+   ARGLIST contains the arguments.  RES must be set to indicate the status of
+   the resolution attempt.  LOC contains statement location information.  */
+
+static tree
+resolve_VEC_MUL (resolution *res, vec<tree, va_gc> *arglist, unsigned nargs,
+location_t loc)
+{
+  /* vec_mul needs to be special cased because there are no instructions for it
+ for the {un}signed char, {un}signed short, and {un}signed int types.  */
+  if (nargs != 2)
+{
+  error ("builtin %qs only accepts 2 arguments", "vec_mul");
+  *res = resolved;
+  return error_mark_node;
+}
+
+  tree arg0 = (*arglist)[0];
+  tree arg0_type = TREE_TYPE (arg0);
+  tree arg1 = (*arglist)[1];
+  tree arg1_type = TREE_TYPE (arg1);
+
+  /* Both arguments must be vectors and the types must be compatible.  */
+  if (TREE_CODE (arg0_type) != VECTOR_TYPE
+  || !lang_hooks.types_compatible_p (arg0_type, arg1_type))
+{
+  *res = resolved_bad;
+  return error_mark_node;
+}
+
+  switch (TYPE_MODE (TREE_TYPE (arg0_type)))
+{
+case E_QImode:
+case E_HImode:
+case E_SImode:
+case E_DImode:
+case E_TImode:
+  /* For scalar types just use a multiply expression.  */
+  *res = resolved;
+  return fold_build2_loc (loc, MULT_EXPR, TREE_TYPE (arg0), arg0,
+ fold_convert (TREE_TYPE (arg0), arg1));
+case E_SFmode:
+  {
+   /* For floats use the xvmulsp instruction directly.  */
+   *res = resolved;
+   tree call = rs6000_builtin_decls[RS6000_BIF_XVMULSP];
+   return build_call_expr (call, 2, arg0, arg1);
+  }
+case E_DFmode:
+  {
+   /* For doubles use the xvmuldp instruction directly.  */
+   *res = resolved;
+   tree call = rs6000_builtin_decls[RS6000_BIF_XVMULDP];
+   return build_call_expr (call, 2, arg0, arg1);
+  }
+/* Other types are errors.  */
+default:
+  *res = resolved_bad;
+  return error_mark_node;
+}
+}
+
+/* Resolve an overloaded vec_cmpne call and return a tree expression for the
+   resolved call if successful.  NARGS is the number of arguments to the call.
+   ARGLIST contains the arguments.  RES must be set to indicate the status of
+   the resolution attempt.  LOC contains statement location information.  */
+
+static tree
+resolve_VEC_CMPNE (resolution *res, vec<tree, va_gc> *arglist, unsigned nargs,
+  location_t loc)
+{
+  /* vec_cmpne needs to be special cased because there are no instructions
+ for it (prior to power 9).  
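
The enum in the patch above sets up a small protocol between each special-case helper and its caller: the helper always sets the status, and the caller uses it to decide whether anything is left to do.  A minimal sketch of that protocol outside GCC, assuming simplified stand-in types, could look as follows.  Only enum resolution is taken from the patch; elem_kind, lowering and resolve_mul_model are hypothetical names introduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Same three states as the enum in the patch.  */
enum resolution { unresolved, resolved, resolved_bad };

/* Hypothetical stand-in for the element mode of a vector argument.  */
enum elem_kind { ELEM_INT, ELEM_FLOAT, ELEM_DOUBLE, ELEM_OTHER };

/* Hypothetical stand-in for the tree a real helper would return.  */
enum lowering { LOWER_ERROR, LOWER_MULT_EXPR, LOWER_XVMULSP, LOWER_XVMULDP };

/* Model of a special-case helper: it always sets *RES and returns how
   the call would be lowered.  */
enum lowering
resolve_mul_model (enum resolution *res, const enum elem_kind *args,
                   size_t nargs)
{
  assert (res != NULL);
  if (nargs != 2)
    {
      /* The helper reports this error itself, so the caller has
         nothing left to do.  */
      *res = resolved;
      return LOWER_ERROR;
    }
  if (args[0] != args[1])
    {
      /* Incompatible types: the caller issues the generic diagnostic.  */
      *res = resolved_bad;
      return LOWER_ERROR;
    }
  switch (args[0])
    {
    case ELEM_INT:
      *res = resolved;
      return LOWER_MULT_EXPR;
    case ELEM_FLOAT:
      *res = resolved;
      return LOWER_XVMULSP;
    case ELEM_DOUBLE:
      *res = resolved;
      return LOWER_XVMULDP;
    default:
      *res = resolved_bad;
      return LOWER_ERROR;
    }
}
```

A caller would return the result directly on resolved, fall into its common error path on resolved_bad, and continue with generic overload matching on unresolved.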

Re: [PATCH] rs6000: __builtin_darn[_raw] should be in [power9-64] (PR103624)

2021-12-15 Thread Bill Schmidt via Gcc-patches


On 12/14/21 8:23 PM, Segher Boessenkool wrote:
> On Tue, Dec 14, 2021 at 07:32:30AM -0600, Bill Schmidt wrote:
>> On 12/13/21 6:22 PM, Segher Boessenkool wrote:
>>> On Mon, Dec 13, 2021 at 02:37:43PM -0600, Bill Schmidt wrote:
>>>> On 12/13/21 10:54 AM, Segher Boessenkool wrote:
>>>>> On Mon, Dec 13, 2021 at 11:30:28AM -0500, David Edelsohn wrote:
>>>>>> On Mon, Dec 13, 2021 at 10:48 AM Bill Schmidt wrote:
>>>>>>> PR103624 observes that we get segfaults for the 64-bit darn builtins
>>>>>>> when compiled on a 32-bit architecture.  The old built-in
>>>>>>> infrastructure requires TARGET_64BIT, and this was missed in the new
>>>>>>> support.  Moving these two builtins from the [power9] stanza to the
>>>>>>> [power9-64] stanza solves the problem.
>>>>>>>
>>>>>>> Tested the fix on a powerpc-e300c3-linux-gnu cross.  Bootstrapped and
>>>>>>> tested on powerpc64le-linux-gnu with no regressions.  Is this okay for
>>>>>>> trunk?
>>>>>> Okay.
>>>>> No, as I said before this is not correct, not without a lot more
>>>>> explanation at least.  We should not copy errors in the old code into
>>>>> the new code.  That is negating one of the main advantages of
>>>>> reimplementing this in the first place!
>>>> Can you please be more specific?
>>>>
>>>> All I have from you before is "It should work for 32-bit though?"  I
>>>> responded in the bug report that __builtin_darn_32 was used for this
>>>> purpose.  I haven't seen a response to that.  What do you want to see
>>>> happen?
>>> That of course does not work for _raw.
>>>
>>> These builtins should just return a "long", just like __builtin_ppc_mftb
>>> does.  All three of them.
>> Well, that seems wrong for __builtin_darn_32, which maps to an SImode 
>> pattern.
> That is Yet Another Bug, then.
>
> The insn returns a full register.  The patterns should use either :P or
> :GPR (the latter if SImode makes sense for it, so we could have that for
> all darn variants).  :DI and :SI never make sense for this.
>
>> So, I assume what you'd like to see is for the other two built-ins to return
>> long, and for the "&& TARGET_64BIT" to be removed from the darn_raw and darn
>> patterns?
> No, all builtins should work in either mode, and always return long.
> If the patterns are broken, the *patterns* should be fixed :-)


OK, thanks!  This is much clearer now.

I've opened an internal issue about the deficiencies of the darn patterns and
their associated built-ins.  In response to PR103624, I would like to start
with the existing patch to ensure the new support mirrors what we had before,
so we have that as a baseline.  We can then move on to fixing the larger
set of problems.  Is that a reasonable plan?

Thanks!
Bill

>
>>> Avoiding ICEs should not be a goal.  It should be a side effect of doing
>>> the right thing in the first place!
>> There's no reason to get snippy.  Given that you approved Kelvin's original
>> implementation of the darn patterns and built-in functions, I think I can be
>> forgiven for thinking that those were the desired semantics. :-)
> Sorry if I sound annoyed.  I am annoyed, but not with you.  Just with
> the world in general I suppose.
>
> With the new builtins representation it is much easier to spot problems,
> it is a great success already!
>
>
> Segher
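
As a rough model of what is being debated: a stanza acts as a gate, and moving a built-in from [power9] to [power9-64] adds a 64-bit requirement on top of the ISA requirement.  The sketch below is only an assumption about the shape of that check.  target_flags, stanza and stanza_enabled are hypothetical names, and the exact target macros each stanza tests are not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the target state a stanza is checked against.  */
struct target_flags
{
  bool power9;    /* stands in for the [power9] ISA requirement */
  bool is_64bit;  /* stands in for TARGET_64BIT */
};

enum stanza { STANZA_POWER9, STANZA_POWER9_64 };

/* Return true if built-ins listed under stanza S are usable for FLAGS.  */
bool
stanza_enabled (enum stanza s, const struct target_flags *flags)
{
  assert (flags != NULL);
  switch (s)
    {
    case STANZA_POWER9:
      return flags->power9;
    case STANZA_POWER9_64:
      return flags->power9 && flags->is_64bit;
    }
  return false;
}
```

Under the submitted patch the 64-bit darn built-ins would be rejected up front on a 32-bit target; under the alternative argued for in the quoted text they would stay in the wider stanza and the patterns would be fixed instead.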


Re: [PATCH v2 6/6] rs6000: Rename arrays to remove temporary _x suffix

2021-12-14 Thread Bill Schmidt via Gcc-patches
Ping.  Thanks!

Bill

On 12/6/21 2:49 PM, Bill Schmidt via Gcc-patches wrote:
> Hi!
>
> While we had two sets of built-in infrastructure at once, I added _x as a
> suffix to two arrays to disambiguate the old and new versions.  Time to fix
> that also.
>
> Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is this
> okay for trunk?
>
> Thanks!
> Bill
>
> 2021-12-06  Bill Schmidt  
>
> gcc/
>   * config/rs6000/rs6000-c.c (altivec_build_resolved_builtin): Rename
>   rs6000_builtin_decls_x to rs6000_builtin_decls.
>   (altivec_resolve_overloaded_builtin): Likewise.  Also rename
>   rs6000_builtin_info_x to rs6000_builtin_info.
>   * config/rs6000/rs6000-call.c (rs6000_invalid_builtin): Rename
>   rs6000_builtin_info_x to rs6000_builtin_info.
>   (rs6000_builtin_is_supported): Likewise.
>   (rs6000_gimple_fold_mma_builtin): Likewise.  Also rename
>   rs6000_builtin_decls_x to rs6000_builtin_decls.
>   (rs6000_gimple_fold_builtin): Rename rs6000_builtin_info_x to
>   rs6000_builtin_info.
>   (cpu_expand_builtin): Likewise.
>   (rs6000_expand_builtin): Likewise.
>   (rs6000_init_builtins): Likewise.  Also rename rs6000_builtin_decls_x
>   to rs6000_builtin_decls.
>   (rs6000_builtin_decl): Rename rs6000_builtin_decls_x to
>   rs6000_builtin_decls.
>   * config/rs6000/rs6000-gen-builtins.c (write_decls): In generated code,
>   rename rs6000_builtin_decls_x to rs6000_builtin_decls, and rename
>   rs6000_builtin_info_x to rs6000_builtin_info.
>   (write_bif_static_init): In generated code, rename
>   rs6000_builtin_info_x to rs6000_builtin_info.
>   (write_init_bif_table): In generated code, rename
>   rs6000_builtin_decls_x to rs6000_builtin_decls, and rename
>   rs6000_builtin_info_x to rs6000_builtin_info.
>   (write_init_ovld_table): In generated code, rename
>   rs6000_builtin_decls_x to rs6000_builtin_decls.
>   (write_init_file): Likewise.
>   * config/rs6000/rs6000.c (rs6000_builtin_vectorized_function):
>   Likewise.
>   (rs6000_builtin_md_vectorized_function): Likewise.
>   (rs6000_builtin_reciprocal): Likewise.
>   (add_condition_to_bb): Likewise.
>   (rs6000_atomic_assign_expand_fenv): Likewise.
> ---
>  gcc/config/rs6000/rs6000-c.c| 64 -
>  gcc/config/rs6000/rs6000-call.c | 46 +-
>  gcc/config/rs6000/rs6000-gen-builtins.c | 27 +--
>  gcc/config/rs6000/rs6000.c  | 58 +++---
>  4 files changed, 96 insertions(+), 99 deletions(-)
>
> diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
> index f790c72d621..e0ebdeed548 100644
> --- a/gcc/config/rs6000/rs6000-c.c
> +++ b/gcc/config/rs6000/rs6000-c.c
> @@ -867,7 +867,7 @@ altivec_build_resolved_builtin (tree *args, int n, tree fntype, tree ret_type,
>  {
>tree argtypes = TYPE_ARG_TYPES (fntype);
>tree arg_type[MAX_OVLD_ARGS];
> -  tree fndecl = rs6000_builtin_decls_x[bif_id];
> +  tree fndecl = rs6000_builtin_decls[bif_id];
>
>for (int i = 0; i < n; i++)
>  {
> @@ -1001,13 +1001,13 @@ altivec_resolve_overloaded_builtin (location_t loc, tree fndecl,
> case E_SFmode:
>   {
> /* For floats use the xvmulsp instruction directly.  */
> -   tree call = rs6000_builtin_decls_x[RS6000_BIF_XVMULSP];
> +   tree call = rs6000_builtin_decls[RS6000_BIF_XVMULSP];
> return build_call_expr (call, 2, arg0, arg1);
>   }
> case E_DFmode:
>   {
> /* For doubles use the xvmuldp instruction directly.  */
> -   tree call = rs6000_builtin_decls_x[RS6000_BIF_XVMULDP];
> +   tree call = rs6000_builtin_decls[RS6000_BIF_XVMULDP];
> return build_call_expr (call, 2, arg0, arg1);
>   }
> /* Other types are errors.  */
> @@ -1066,7 +1066,7 @@ altivec_resolve_overloaded_builtin (location_t loc, tree fndecl,
>   vec_safe_push (params, arg0);
>   vec_safe_push (params, arg1);
>   tree call = altivec_resolve_overloaded_builtin
> -   (loc, rs6000_builtin_decls_x[RS6000_OVLD_VEC_CMPEQ],
> +   (loc, rs6000_builtin_decls[RS6000_OVLD_VEC_CMPEQ],
>  params);
>   /* Use save_expr to ensure that operands used more than once
>  that may have side effects (like calls) are only evaluated
> @@ -1076,7 +1076,7 @@ altivec_resolve_overloaded_builtin (location_t loc, tree fndecl,
>   vec_safe_push (params, call);
>   vec_safe_push (params, call);
>

Re: [PATCH v2 5/6] rs6000: Rename functions with "new" in their names

2021-12-14 Thread Bill Schmidt via Gcc-patches
Ping.  Thanks!

Bill

On 12/6/21 2:49 PM, Bill Schmidt via Gcc-patches wrote:
> Hi!
>
> While we had two sets of built-in functionality at the same time, I put "new"
> in the names of quite a few functions.  Time to undo that.
>
> Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is this
> okay for trunk?
>
> Thanks!
> Bill
>
> 2021-12-02  Bill Schmidt  
>
> gcc/
>   * config/rs6000/rs6000-c.c (altivec_resolve_new_overloaded_builtin):
>   Remove forward declaration.
>   (rs6000_new_builtin_type_compatible): Rename to
>   rs6000_builtin_type_compatible.
>   (rs6000_builtin_type_compatible): Remove.
>   (altivec_resolve_overloaded_builtin): Remove.
>   (altivec_build_new_resolved_builtin): Rename to
>   altivec_build_resolved_builtin.
>   (altivec_resolve_new_overloaded_builtin): Rename to
>   altivec_resolve_overloaded_builtin.  Remove static keyword.  Adjust
>   called function names.
>   * config/rs6000/rs6000-call.c (rs6000_expand_new_builtin): Remove
>   forward declaration.
>   (rs6000_gimple_fold_new_builtin): Likewise.
>   (rs6000_invalid_new_builtin): Rename to rs6000_invalid_builtin.
>   (rs6000_gimple_fold_builtin): Remove.
>   (rs6000_new_builtin_valid_without_lhs): Rename to
>   rs6000_builtin_valid_without_lhs.
>   (rs6000_new_builtin_is_supported): Rename to
>   rs6000_builtin_is_supported.
>   (rs6000_gimple_fold_new_mma_builtin): Rename to
>   rs6000_gimple_fold_mma_builtin.
>   (rs6000_gimple_fold_new_builtin): Rename to
>   rs6000_gimple_fold_builtin.  Remove static keyword.  Adjust called
>   function names.
>   (rs6000_expand_builtin): Remove.
>   (new_cpu_expand_builtin): Rename to cpu_expand_builtin.
>   (new_mma_expand_builtin): Rename to mma_expand_builtin.
>   (new_htm_spr_num): Rename to htm_spr_num.
>   (new_htm_expand_builtin): Rename to htm_expand_builtin.  Change name
>   of called function.
>   (rs6000_expand_new_builtin): Rename to rs6000_expand_builtin.  Remove
>   static keyword.  Adjust called function names.
>   (rs6000_new_builtin_decl): Rename to rs6000_builtin_decl.  Remove
>   static keyword.
>   (rs6000_builtin_decl): Remove.
>   * config/rs6000/rs6000-gen-builtins.c (write_decls): In generated code,
>   rename rs6000_new_builtin_is_supported to rs6000_builtin_is_supported.
>   * config/rs6000/rs6000-internal.h (rs6000_invalid_new_builtin): Rename
>   to rs6000_invalid_builtin.
>   * config/rs6000/rs6000.c (rs6000_new_builtin_vectorized_function):
>   Rename to rs6000_builtin_vectorized_function.
>   (rs6000_new_builtin_md_vectorized_function): Rename to
>   rs6000_builtin_md_vectorized_function.
>   (rs6000_builtin_vectorized_function): Remove.
>   (rs6000_builtin_md_vectorized_function): Remove.
> ---
>  gcc/config/rs6000/rs6000-c.c| 120 +---
>  gcc/config/rs6000/rs6000-call.c |  99 ++-
>  gcc/config/rs6000/rs6000-gen-builtins.c |   3 +-
>  gcc/config/rs6000/rs6000-internal.h |   2 +-
>  gcc/config/rs6000/rs6000.c  |  31 ++
>  5 files changed, 80 insertions(+), 175 deletions(-)
>
> diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
> index d44edf585aa..f790c72d621 100644
> --- a/gcc/config/rs6000/rs6000-c.c
> +++ b/gcc/config/rs6000/rs6000-c.c
> @@ -37,9 +37,6 @@
>
>  #include "rs6000-internal.h"
>
> -static tree altivec_resolve_new_overloaded_builtin (location_t, tree, void *);
> -
> -
>  /* Handle the machine specific pragma longcall.  Its syntax is
>
> # pragma longcall ( TOGGLE )
> @@ -817,7 +814,7 @@ is_float128_p (tree t)
>
>  /* Return true iff ARGTYPE can be compatibly passed as PARMTYPE.  */
>  static bool
> -rs6000_new_builtin_type_compatible (tree parmtype, tree argtype)
> +rs6000_builtin_type_compatible (tree parmtype, tree argtype)
>  {
>if (parmtype == error_mark_node)
>  return false;
> @@ -840,23 +837,6 @@ rs6000_new_builtin_type_compatible (tree parmtype, tree argtype)
>return lang_hooks.types_compatible_p (parmtype, argtype);
>  }
>
> -static inline bool
> -rs6000_builtin_type_compatible (tree t, int id)
> -{
> -  tree builtin_type;
> -  builtin_type = rs6000_builtin_type (id);
> -  if (t == error_mark_node)
> -return false;
> -  if (INTEGRAL_TYPE_P (t) && INTEGRAL_TYPE_P (builtin_type))
> -return true;
> -  else if (TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128
> -&& is_float128_p (t) && is_float128_p (builtin_type))
> -return t

Re: [PATCH v2 4/6] rs6000: Remove rs6000-builtin.def and associated data and functions

2021-12-14 Thread Bill Schmidt via Gcc-patches
Ping.  Thanks!

Bill

On 12/6/21 2:49 PM, Bill Schmidt via Gcc-patches wrote:
> Hi!
>
> The old rs6000-builtin.def file is no longer needed.  Remove it and the code
> that depends on it.
>
> Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is this
> okay for trunk?
>
> Thanks!
> Bill
>
> 2021-12-02  Bill Schmidt  
>
> gcc/
>   * config/rs6000/rs6000-builtin.def: Delete.
>   * config/rs6000/rs6000-call.c (builtin_compatibility): Delete.
>   (builtin_description): Delete.
>   (builtin_hash_struct): Delete.
>   (builtin_hasher): Delete.
>   (builtin_hash_table): Delete.
>   (builtin_hasher::hash): Delete.
>   (builtin_hasher::equal): Delete.
>   (rs6000_builtin_info_type): Delete.
>   (rs6000_builtin_info): Delete.
>   (bdesc_compat): Delete.
>   (bdesc_3arg): Delete.
>   (bdesc_4arg): Delete.
>   (bdesc_dst): Delete.
>   (bdesc_2arg): Delete.
>   (bdesc_altivec_preds): Delete.
>   (bdesc_abs): Delete.
>   (bdesc_1arg): Delete.
>   (bdesc_0arg): Delete.
>   (bdesc_htm): Delete.
>   (bdesc_mma): Delete.
>   (rs6000_overloaded_builtin_p): Delete.
>   (rs6000_overloaded_builtin_name): Delete.
>   (htm_spr_num): Delete.
>   (rs6000_builtin_is_supported_p): Delete.
>   (rs6000_gimple_fold_mma_builtin): Delete.
>   (gt-rs6000-call.h): Remove include directive.
>   * config/rs6000/rs6000-protos.h (rs6000_overloaded_builtin_p): Delete.
>   (rs6000_builtin_is_supported_p): Delete.
>   (rs6000_overloaded_builtin_name): Delete.
>   * config/rs6000/rs6000.c (rs6000_builtin_decls): Delete.
>   (rs6000_debug_reg_global): Remove reference to RS6000_BUILTIN_COUNT.
>   * config/rs6000/rs6000.h (rs6000_builtins): Delete.
>   (altivec_builtin_types): Delete.
>   (rs6000_builtin_decls): Delete.
>   * config/rs6000/t-rs6000 (TM_H): Don't add rs6000-builtin.def.
> ---
>  gcc/config/rs6000/rs6000-builtin.def | 3350 --
>  gcc/config/rs6000/rs6000-call.c  |  712 --
>  gcc/config/rs6000/rs6000-protos.h|3 -
>  gcc/config/rs6000/rs6000.c   |3 -
>  gcc/config/rs6000/rs6000.h   |   57 -
>  gcc/config/rs6000/t-rs6000   |1 -
>  6 files changed, 4126 deletions(-)
>  delete mode 100644 gcc/config/rs6000/rs6000-builtin.def
>
> diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
> deleted file mode 100644
> index 9dbf16f48c4..000
> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index 86054f75756..a5ee06c991f 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -89,20 +89,6 @@
>  #define TARGET_NO_PROTOTYPE 0
>  #endif
>
> -struct builtin_compatibility
> -{
> -  const enum rs6000_builtins code;
> -  const char *const name;
> -};
> -
> -struct builtin_description
> -{
> -  const HOST_WIDE_INT mask;
> -  const enum insn_code icode;
> -  const char *const name;
> -  const enum rs6000_builtins code;
> -};
> -
>  /* Used by __builtin_cpu_is(), mapping from PLATFORM names to values.  */
>  static const struct
>  {
> @@ -184,127 +170,6 @@ static const struct
>
>  static rtx rs6000_expand_new_builtin (tree, rtx, rtx, machine_mode, int);
>  static bool rs6000_gimple_fold_new_builtin (gimple_stmt_iterator *gsi);
> -
> -
> -/* Hash table to keep track of the argument types for builtin functions.  */
> -
> -struct GTY((for_user)) builtin_hash_struct
> -{
> -  tree type;
> -  machine_mode mode[4];  /* return value + 3 arguments.  */
> -  unsigned char uns_p[4];/* and whether the types are unsigned.  */
> -};
> -
> -struct builtin_hasher : ggc_ptr_hash<builtin_hash_struct>
> -{
> -  static hashval_t hash (builtin_hash_struct *);
> -  static bool equal (builtin_hash_struct *, builtin_hash_struct *);
> -};
> -
> -static GTY (()) hash_table<builtin_hasher> *builtin_hash_table;
> -
> -/* Hash function for builtin functions with up to 3 arguments and a return
> -   type.  */
> -hashval_t
> -builtin_hasher::hash (builtin_hash_struct *bh)
> -{
> -  unsigned ret = 0;
> -  int i;
> -
> -  for (i = 0; i < 4; i++)
> -{
> -  ret = (ret * (unsigned)MAX_MACHINE_MODE) + ((unsigned)bh->mode[i]);
> -  ret = (ret * 2) + bh->uns_p[i];
> -}
> -
> -  return ret;
> -}
> -
> -/* Compare builtin hash entries H1 and H2 for equivalence.  */
> -bool
> -builtin_hasher::equal (builtin_hash_struct *p1, builtin_hash_struct *p2)
> -{
> -  return ((p1->mode[0] == p2->mode[0])
> -   && (p1->mode[1] == p2->mode[1])

Re: [PATCH v2 3/6] rs6000: Rename rs6000-builtin-new.def to rs6000-builtins.def

2021-12-14 Thread Bill Schmidt via Gcc-patches
Ping.  Thanks!
Bill

On 12/6/21 2:49 PM, Bill Schmidt via Gcc-patches wrote:
> Hi!
>
> This patch just renames a file and updates the build machinery accordingly.
>
> Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is this
> okay for trunk?
>
> Thanks!
> Bill
>
> 2021-12-02  Bill Schmidt  
>
> gcc/
>   * config/rs6000/rs6000-builtin-new.def: Rename to...
>   * config/rs6000/rs6000-builtins.def: ...this.
>   * config/rs6000/rs6000-gen-builtins.c: Adjust header commentary.
>   * config/rs6000/t-rs6000 (EXTRA_GTYPE_DEPS): Rename
>   rs6000-builtin-new.def to rs6000-builtins.def.
>   (rs6000-builtins.c): Likewise.
> ---
>  .../rs6000/{rs6000-builtin-new.def => rs6000-builtins.def}  | 0
>  gcc/config/rs6000/rs6000-gen-builtins.c | 4 ++--
>  gcc/config/rs6000/t-rs6000  | 6 +++---
>  3 files changed, 5 insertions(+), 5 deletions(-)
>  rename gcc/config/rs6000/{rs6000-builtin-new.def => rs6000-builtins.def} (100%)
>
> diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtins.def
> similarity index 100%
> rename from gcc/config/rs6000/rs6000-builtin-new.def
> rename to gcc/config/rs6000/rs6000-builtins.def
> diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c b/gcc/config/rs6000/rs6000-gen-builtins.c
> index 78b2486aafc..9c61b7d9fe6 100644
> --- a/gcc/config/rs6000/rs6000-gen-builtins.c
> +++ b/gcc/config/rs6000/rs6000-gen-builtins.c
> @@ -22,7 +22,7 @@ along with GCC; see the file COPYING3.  If not see
> recognition code for Power targets, based on text files that
> describe the built-in functions and vector overloads:
>
> - rs6000-builtin-new.def Table of built-in functions
> + rs6000-builtins.defTable of built-in functions
>   rs6000-overload.defTable of overload functions
>
> Both files group similar functions together in "stanzas," as
> @@ -125,7 +125,7 @@ along with GCC; see the file COPYING3.  If not see
>
> The second line contains the  that this particular instance of
> the overloaded function maps to.  It must match a token that appears in
> -   rs6000-builtin-new.def.  Optionally, a second token may appear.  If only
> +   rs6000-builtins.def.  Optionally, a second token may appear.  If only
> one token is on the line, it is also used to build the unique identifier
> for the overloaded function.  If a second token is present, the second
> token is used instead for this purpose.  This is necessary in cases
> diff --git a/gcc/config/rs6000/t-rs6000 b/gcc/config/rs6000/t-rs6000
> index d48a4b1be6c..3d3143a171d 100644
> --- a/gcc/config/rs6000/t-rs6000
> +++ b/gcc/config/rs6000/t-rs6000
> @@ -22,7 +22,7 @@ TM_H += $(srcdir)/config/rs6000/rs6000-builtin.def
>  TM_H += $(srcdir)/config/rs6000/rs6000-cpus.def
>  TM_H += $(srcdir)/config/rs6000/rs6000-modes.h
>  PASSES_EXTRA += $(srcdir)/config/rs6000/rs6000-passes.def
> -EXTRA_GTYPE_DEPS += $(srcdir)/config/rs6000/rs6000-builtin-new.def
> +EXTRA_GTYPE_DEPS += $(srcdir)/config/rs6000/rs6000-builtins.def
>
>  rs6000-pcrel-opt.o: $(srcdir)/config/rs6000/rs6000-pcrel-opt.c
>   $(COMPILE) $<
> @@ -59,10 +59,10 @@ build/rs6000-gen-builtins$(build_exeext): build/rs6000-gen-builtins.o \
>  # For now, the header files depend on rs6000-builtins.c, which avoids
>  # races because the .c file is closed last in rs6000-gen-builtins.c.
>  rs6000-builtins.c: build/rs6000-gen-builtins$(build_exeext) \
> -$(srcdir)/config/rs6000/rs6000-builtin-new.def \
> +$(srcdir)/config/rs6000/rs6000-builtins.def \
>  $(srcdir)/config/rs6000/rs6000-overload.def
>   $(RUN_GEN) ./build/rs6000-gen-builtins$(build_exeext) \
> - $(srcdir)/config/rs6000/rs6000-builtin-new.def \
> + $(srcdir)/config/rs6000/rs6000-builtins.def \
>   $(srcdir)/config/rs6000/rs6000-overload.def rs6000-builtins.h \
>   rs6000-builtins.c rs6000-vecdefines.h
>


Re: [PATCH v2 0/6] Remove "old" built-in function infrastructure

2021-12-14 Thread Bill Schmidt via Gcc-patches
Hi!  I'd like to ping patches 2 through 6 of this series.  Much obliged!

Thanks,
Bill


On 12/6/21 2:49 PM, Bill Schmidt via Gcc-patches wrote:
> Hi!
>
> Now that the new built-in function support is all upstream and enabled, it
> seems safe and prudent to remove the old code to avoid confusion.  I broke
> this up to the extent possible, but a couple of patches are still pretty
> large.
>
> David Edelsohn found that I had broken some C++ library functions for AIX, and
> his fix for that required me to re-spin the patches.  I also generated the
> diff with a more efficient algorithm to reduce the patch size.  Otherwise this
> series is identical to V1.
>
> Thanks!
> Bill
>
> Bill Schmidt (6):
>   rs6000: Remove new_builtins_are_live and dead code it was guarding
>   rs6000: Remove altivec_overloaded_builtins array and initialization
>   rs6000: Rename rs6000-builtin-new.def to rs6000-builtins.def
>   rs6000: Remove rs6000-builtin.def and associated data and functions
>   rs6000: Rename functions with "new" in their names
>   rs6000: Rename arrays to remove temporary _x suffix
>
>  gcc/config/rs6000/darwin.h| 8 +-
>  gcc/config/rs6000/rs6000-builtin.def  |  3350 -
>  ...00-builtin-new.def => rs6000-builtins.def} | 0
>  gcc/config/rs6000/rs6000-c.c  |  1266 +-
>  gcc/config/rs6000/rs6000-call.c   | 11964 +---
>  gcc/config/rs6000/rs6000-gen-builtins.c   |   115 +-
>  gcc/config/rs6000/rs6000-internal.h   | 2 +-
>  gcc/config/rs6000/rs6000-protos.h | 3 -
>  gcc/config/rs6000/rs6000.c|   334 +-
>  gcc/config/rs6000/rs6000.h|58 -
>  gcc/config/rs6000/t-rs6000| 7 +-
>  11 files changed, 224 insertions(+), 16883 deletions(-)
>  delete mode 100644 gcc/config/rs6000/rs6000-builtin.def
>  rename gcc/config/rs6000/{rs6000-builtin-new.def => rs6000-builtins.def} (100%)
>


Re: [PATCH] rs6000: __builtin_darn[_raw] should be in [power9-64] (PR103624)

2021-12-14 Thread Bill Schmidt via Gcc-patches
On 12/14/21 7:32 AM, Bill Schmidt wrote:
> Hi!
>
> On 12/13/21 6:22 PM, Segher Boessenkool wrote:
>>
>> These builtins should just return a "long", just like __builtin_ppc_mftb
>> does.  All three of them.
> Well, that seems wrong for __builtin_darn_32, which maps to an SImode pattern.
>
> So, I assume what you'd like to see is for the other two built-ins to return
> long, and for the "&& TARGET_64BIT" to be removed from the darn_raw and darn
> patterns?
>
For the record, I don't see how this can work.  When I compile:

#include 

long get_raw_random ()
{
  return __builtin_darn_raw ();
}

with these changes, the compiler thinks that __builtin_darn_raw returns a
register pair, presumably due to it being a DImode pattern.  It then pulls
the second register of the pair as the actual result.

get_raw_random:
.LFB0:
darn 10,2
mr 3,11
blr

The vregs dump shows:

(insn 5 2 6 2 (set (reg:DI 118)
(unspec_volatile:DI [
(const_int 0 [0])
] UNSPECV_DARN_RAW)) "darn-thing.c":11:10 1043 {darn_raw}
 (nil))
(insn 6 5 10 2 (set (reg:SI 117 [  ])
(subreg:SI (reg:DI 118) 4)) "darn-thing.c":11:10 543 {*movsi_internal1}
 (nil))
(insn 10 6 11 2 (set (reg/i:SI 3 3)
(reg:SI 117 [  ])) "darn-thing.c":12:1 543 {*movsi_internal1}
 (nil))
(insn 11 10 0 2 (use (reg/i:SI 3 3)) "darn-thing.c":12:1 -1
 (nil))

So if you want to support these patterns for 32-bit mode, there's more work
required.

Given this, I'd like to ask you to reconsider the original submitted patch
for now.

Thanks,
Bill
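
To make the failure mode concrete, here is a small model of what the dump and assembly above appear to show: in 32-bit mode a DImode value is assumed to span two registers, and the SImode subreg at byte offset 4 selects the second one, while the instruction itself writes only the first.  This is a sketch of one reading of the dump, not compiler code.  regfile, write_one_reg and read_subreg_si are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 32-bit general register file.  */
struct regfile
{
  uint32_t gpr[32];
};

/* Model of an instruction that writes a single register, as darn does
   in the assembly above.  */
void
write_one_reg (struct regfile *rf, unsigned regno, uint32_t value)
{
  assert (regno < 32);
  rf->gpr[regno] = value;
}

/* Model of (subreg:SI (reg:DI N) BYTE) when the DImode value is assumed
   to occupy the pair N, N+1 in big-endian word order: byte offset 0
   selects register N, byte offset 4 selects register N+1.  */
uint32_t
read_subreg_si (const struct regfile *rf, unsigned regno, unsigned byte)
{
  assert (regno < 31 && (byte == 0 || byte == 4));
  return rf->gpr[regno + byte / 4];
}
```

With register 10 written by the instruction and register 11 holding whatever was there before, reading at offset 4 returns the stale value, which matches the "mr 3,11" in the generated code.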



Re: [PATCH] rs6000: __builtin_darn[_raw] should be in [power9-64] (PR103624)

2021-12-14 Thread Bill Schmidt via Gcc-patches
Hi!

On 12/13/21 6:22 PM, Segher Boessenkool wrote:
> On Mon, Dec 13, 2021 at 02:37:43PM -0600, Bill Schmidt wrote:
>> On 12/13/21 10:54 AM, Segher Boessenkool wrote:
>>> On Mon, Dec 13, 2021 at 11:30:28AM -0500, David Edelsohn wrote:
>>>> On Mon, Dec 13, 2021 at 10:48 AM Bill Schmidt wrote:
>>>>> PR103624 observes that we get segfaults for the 64-bit darn builtins
>>>>> when compiled on a 32-bit architecture.  The old built-in infrastructure
>>>>> requires TARGET_64BIT, and this was missed in the new support.  Moving
>>>>> these two builtins from the [power9] stanza to the [power9-64] stanza
>>>>> solves the problem.
>>>>>
>>>>> Tested the fix on a powerpc-e300c3-linux-gnu cross.  Bootstrapped and
>>>>> tested on powerpc64le-linux-gnu with no regressions.  Is this okay for
>>>>> trunk?
>>>> Okay.
>>> No, as I said before this is not correct, not without a lot more
>>> explanation at least.  We should not copy errors in the old code into
>>> the new code.  That is negating one of the main advantages of
>>> reimplementing this in the first place!
>> Can you please be more specific?
>>
>> All I have from you before is "It should work for 32-bit though?"  I
>> responded in the bug report that __builtin_darn_32 was used for this
>> purpose.  I haven't seen a response to that.  What do you want to see
>> happen?
> That of course does not work for _raw.
>
> These builtins should just return a "long", just like __builtin_ppc_mftb
> does.  All three of them.

Well, that seems wrong for __builtin_darn_32, which maps to an SImode pattern.

So, I assume what you'd like to see is for the other two built-ins to return
long, and for the "&& TARGET_64BIT" to be removed from the darn_raw and darn
patterns?

>
>> The patterns in rs6000.md are darn_32, gated by TARGET_P9_MISC; darn_raw,
>> gated by TARGET_P9_MISC && TARGET_64BIT; and darn, gated by TARGET_P9_MISC
>> && TARGET_64BIT.  The builtins correspond to these patterns in the obvious
>> way.
>>
>> If you think that these patterns should be enabled differently, that's fine,
>> but that's a completely different patch than fixing the incorrect built-ins
>> to match what the patterns do and thus avoid ICEing.
> Avoiding ICEs should not be a goal.  It should be a side effect of doing
> the right thing in the first place!


There's no reason to get snippy.  Given that you approved Kelvin's original
implementation of the darn patterns and built-in functions, I think I can be
forgiven for thinking that those were the desired semantics. :-)

Thanks,
Bill

>
>
> Segher


Re: [PATCH] rs6000: Some builtins require IBM-128 long double format (PR103623)

2021-12-13 Thread Bill Schmidt via Gcc-patches
Hi!

On 12/13/21 2:15 PM, Martin Sebor wrote:
> On 12/13/21 8:55 AM, Bill Schmidt via Gcc-patches wrote:
>> Hi!
>>
>> PR103623 shows that we ICE if __builtin_pack_longdouble or
>> __builtin_unpack_longdouble is used when long double is not defined to be
>> the IBM-128 (double-double) format.  To solve this, I introduce a new
>> built-in function attribute "ibmld" that enforces this requirement.
>>
>> Tested the fix on a powerpc-e300c3-linux-gnu cross.  Bootstrapped and tested
>> on powerpc64le-linux-gnu with no regressions.  Is this okay for trunk?
>
> Just a minor note about the format of the new error message
> below:
>
> ...
>> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
>> index d9736eaf21c..b6f0c6c4c08 100644
>> --- a/gcc/config/rs6000/rs6000-call.c
>> +++ b/gcc/config/rs6000/rs6000-call.c
>> @@ -15741,6 +15741,13 @@ rs6000_expand_new_builtin (tree exp, rtx target,
>>     return const0_rtx;
>>   }
>>   +  if (bif_is_ibmld (*bifaddr) && !FLOAT128_2REG_P (TFmode))
>> +    {
>> +  error ("%<%s%> requires long double to be IBM 128-bit format",
>
> as a keyword long double should be quoted in the message.

Thanks, Martin, good point.  Sorry for overlooking that!

Bill
>
> Martin


Re: [PATCH] rs6000: __builtin_darn[_raw] should be in [power9-64] (PR103624)

2021-12-13 Thread Bill Schmidt via Gcc-patches
Hi!

On 12/13/21 10:54 AM, Segher Boessenkool wrote:
> On Mon, Dec 13, 2021 at 11:30:28AM -0500, David Edelsohn wrote:
>> On Mon, Dec 13, 2021 at 10:48 AM Bill Schmidt  wrote:
>>> Hi!
>>>
>>> PR103624 observes that we get segfaults for the 64-bit darn builtins when
>>> compiled on a 32-bit architecture.  The old built-in infrastructure
>>> requires TARGET_64BIT, and this was missed in the new support.  Moving
>>> these two builtins from the [power9] stanza to the [power9-64] stanza
>>> solves the problem.
>>>
>>> Tested the fix on a powerpc-e300c3-linux-gnu cross.  Bootstrapped and
>>> tested on powerpc64le-linux-gnu with no regressions.  Is this okay for
>>> trunk?
>> Okay.
> No, as I said before this is not correct, not without a lot more
> explanation at least.  We should not copy errors in the old code into
> the new code.  That is negating one of the main advantages of
> reimplementing this in the first place!

Can you please be more specific?

All I have from you before is "It should work for 32-bit though?"  I responded
in the bug report that __builtin_darn_32 was used for this purpose.  I haven't
seen a response to that.  What do you want to see happen?

The patterns in rs6000.md are darn_32, gated by TARGET_P9_MISC; darn_raw, gated
by TARGET_P9_MISC && TARGET_64BIT; and darn, gated by TARGET_P9_MISC &&
TARGET_64BIT.  The builtins correspond to these patterns in the obvious way.

If you think that these patterns should be enabled differently, that's fine, but
that's a completely different patch than fixing the incorrect built-ins to match
what the patterns do and thus avoid ICEing.

Thanks,
Bill

>
> Segher
>
>
>>> PR target/103624
>>> * config/rs6000/rs6000-builtin-new.def (__builtin_darn): Move to
>>> [power9-64] stanza.
>>> (__builtin_darn_raw): Likewise.


[PATCH] rs6000: Skip overload instances with uninitialized fntype (PR103622)

2021-12-13 Thread Bill Schmidt via Gcc-patches
Hi!

For some data types like IEEE-128, we determine whether the type is available
at built-in function initialization time.  If it's not, then we don't provide
the function type for function instances that require the data type.  PR103622
observes that this can cause us to ICE when running the list of instances when
the target doesn't support the data type.

Ideally, we wouldn't even put such an instance in the list of instances that
an overload can map to, but to do that is much more complicated.  Instead,
this patch just ensures we don't dereference a NULL pointer when the situation
arises.
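
As a standalone illustration of the guard (not the GCC code itself), the
sketch below walks a linked list and skips entries whose type is missing.
The struct layout, the string-based "fntype", and the function name
first_matching_instance are assumptions made for this example; in GCC the
field is a tree.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an overload instance.  In GCC, fntype is a
   tree; a string is used here so the sketch is self-contained.  */
struct instance
{
  const char *name;
  const char *fntype;	/* NULL when the required data type is unavailable.  */
  struct instance *next;
};

/* Return the first instance in LIST whose fntype equals WANTED, or NULL
   if there is none.  Instances without a function type are skipped.  */
const struct instance *
first_matching_instance (const struct instance *list, const char *wanted)
{
  for (const struct instance *inst = list; inst != NULL; inst = inst->next)
    {
      /* Without this guard, the strcmp below would dereference NULL.  */
      if (!inst->fntype)
	continue;

      if (strcmp (inst->fntype, wanted) == 0)
	return inst;
    }
  return NULL;
}
```

The cost of this approach is one extra test per instance; the list itself
still contains entries that can never match on the current target.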

Tested the fix on a powerpc-e300c3-linux-gnu cross.  Bootstrapped and tested on
powerpc64le-linux-gnu with no regressions.  Is this okay for trunk?

Thanks!
Bill


2021-12-13  Bill Schmidt  

gcc/
PR target/103622
* config/rs6000/rs6000-c.c (altivec_resolve_new_overloaded_builtin):
Skip over instances with undefined function types.
---
 gcc/config/rs6000/rs6000-c.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index 8e83d97e72f..fc4cc929884 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -2943,6 +2943,12 @@ altivec_resolve_new_overloaded_builtin (location_t loc, tree fndecl,
 
       for (; instance != NULL; instance = instance->next)
         {
+          /* It is possible for an instance to require a data type that isn't
+             defined on this target, in which case instance->fntype will be
+             NULL.  */
+          if (!instance->fntype)
+            continue;
+
           bool mismatch = false;
           tree nextparm = TYPE_ARG_TYPES (instance->fntype);
 
-- 
2.27.0




[PATCH] rs6000: Builtins for doubleword compare should be in [power8-vector] (PR103625)

2021-12-13 Thread Bill Schmidt via Gcc-patches
Hi!

PR103625 observes that we ICE when doing vector compares on doublewords.
The original built-in function support requires Power8 vector support for
these, but this was missed in the new built-in support.  Moving these
functions to the [power8-vector] stanza solves the problem.

Tested the fix on a powerpc-e300c3-linux-gnu cross.  Bootstrapped and tested on
powerpc64le-linux-gnu with no regressions.  Is this okay for trunk?

Thanks!
Bill


2021-12-13  Bill Schmidt  

gcc/
PR target/103625
* config/rs6000/rs6000-builtin-new.def (__builtin_altivec_vcmpequd):
Move to power8-vector stanza.
(__builtin_altivec_vcmpequd_p): Likewise.
(__builtin_altivec_vcmpgtsd): Likewise.
(__builtin_altivec_vcmpgtsd_p): Likewise.
(__builtin_altivec_vcmpgtud): Likewise.
(__builtin_altivec_vcmpgtud_p): Likewise.
---
 gcc/config/rs6000/rs6000-builtin-new.def | 36 
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
index a020dbbe2fb..bd950f8db36 100644
--- a/gcc/config/rs6000/rs6000-builtin-new.def
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -1200,24 +1200,6 @@
   const vull __builtin_altivec_vandc_v2di_uns (vull, vull);
 VANDC_V2DI_UNS andcv2di3 {}
 
-  const vsll __builtin_altivec_vcmpequd (vull, vull);
-VCMPEQUD vector_eqv2di {}
-
-  const int __builtin_altivec_vcmpequd_p (int, vsll, vsll);
-VCMPEQUD_P vector_eq_v2di_p {pred}
-
-  const vsll __builtin_altivec_vcmpgtsd (vsll, vsll);
-VCMPGTSD vector_gtv2di {}
-
-  const int __builtin_altivec_vcmpgtsd_p (int, vsll, vsll);
-VCMPGTSD_P vector_gt_v2di_p {pred}
-
-  const vsll __builtin_altivec_vcmpgtud (vull, vull);
-VCMPGTUD vector_gtuv2di {}
-
-  const int __builtin_altivec_vcmpgtud_p (int, vsll, vsll);
-VCMPGTUD_P vector_gtu_v2di_p {pred}
-
   const vd __builtin_altivec_vnor_v2df (vd, vd);
 VNOR_V2DF norv2df3 {}
 
@@ -2221,6 +2203,24 @@
   const vsc __builtin_altivec_vbpermq2 (vsc, vsc);
 VBPERMQ2 altivec_vbpermq2 {}
 
+  const vsll __builtin_altivec_vcmpequd (vull, vull);
+VCMPEQUD vector_eqv2di {}
+
+  const int __builtin_altivec_vcmpequd_p (int, vsll, vsll);
+VCMPEQUD_P vector_eq_v2di_p {pred}
+
+  const vsll __builtin_altivec_vcmpgtsd (vsll, vsll);
+VCMPGTSD vector_gtv2di {}
+
+  const int __builtin_altivec_vcmpgtsd_p (int, vsll, vsll);
+VCMPGTSD_P vector_gt_v2di_p {pred}
+
+  const vsll __builtin_altivec_vcmpgtud (vull, vull);
+VCMPGTUD vector_gtuv2di {}
+
+  const int __builtin_altivec_vcmpgtud_p (int, vsll, vsll);
+VCMPGTUD_P vector_gtu_v2di_p {pred}
+
   const vsll __builtin_altivec_vmaxsd (vsll, vsll);
 VMAXSD smaxv2di3 {}
 
-- 
2.27.0




[PATCH] rs6000: Some builtins require IBM-128 long double format (PR103623)

2021-12-13 Thread Bill Schmidt via Gcc-patches
Hi!

PR103623 shows that we ICE if __builtin_pack_longdouble or
__builtin_unpack_longdouble is used when long double is not defined to be the
IBM-128 (double-double) format.  To solve this, I introduce a new built-in
function attribute "ibmld" that enforces this requirement.
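
The two halves of such an attribute (recording it when the definition is
parsed, then checking it when the built-in is expanded) can be sketched in
isolation.  This is a reduced model rather than the generator's code: the
two-field struct, the names parse_attr and may_expand, and the
long_double_is_ibm128 parameter standing in for the real target query are
all assumptions of the example.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, reduced version of the per-builtin attribute flags.  */
struct attrinfo
{
  bool isendian;
  bool isibmld;
};

/* Record attribute ATTRNAME in ATTRS.  Return false if it is unknown.  */
bool
parse_attr (struct attrinfo *attrs, const char *attrname)
{
  if (!strcmp (attrname, "endian"))
    attrs->isendian = true;
  else if (!strcmp (attrname, "ibmld"))
    attrs->isibmld = true;
  else
    return false;
  return true;
}

/* Return true if a builtin named BIFNAME with ATTRS may be expanded.
   LONG_DOUBLE_IS_IBM128 is an assumed stand-in for the target check.  */
bool
may_expand (const struct attrinfo *attrs, const char *bifname,
	    bool long_double_is_ibm128)
{
  if (attrs->isibmld && !long_double_is_ibm128)
    {
      fprintf (stderr, "'%s' requires long double to be IBM 128-bit format\n",
	       bifname);
      return false;
    }
  return true;
}
```

Keeping the requirement as a flag on the definition means the check lives in
one place at expansion time instead of being repeated per built-in.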

Tested the fix on a powerpc-e300c3-linux-gnu cross.  Bootstrapped and tested on
powerpc64le-linux-gnu with no regressions.  Is this okay for trunk?

Thanks!
Bill


2021-12-13  Bill Schmidt  

gcc/
PR target/103623
* config/rs6000/rs6000-builtin-new.def (__builtin_pack_longdouble): Add
ibmld attribute.
(__builtin_unpack_longdouble): Likewise.
* config/rs6000/rs6000-call.c (rs6000_expand_new_builtin): Add special
handling for ibmld attribute.
* config/rs6000/rs6000-gen-builtins.c (attrinfo): Add isibmld.
(parse_bif_attrs): Handle ibmld.
(write_decls): Likewise.
(write_bif_static_init): Likewise.
---
 gcc/config/rs6000/rs6000-builtin-new.def | 11 +++
 gcc/config/rs6000/rs6000-call.c  |  7 +++
 gcc/config/rs6000/rs6000-gen-builtins.c  | 13 +++--
 3 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
index 2becd96a36c..a020dbbe2fb 100644
--- a/gcc/config/rs6000/rs6000-builtin-new.def
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -137,6 +137,7 @@
 ;   lxvrse   Needs special handling for load-rightmost, sign-extended
 ;   lxvrze   Needs special handling for load-rightmost, zero-extended
 ;   endian   Needs special handling for endianness
+;   ibmld    Restrict usage to the case when TFmode is IBM-128
 ;
 ; Each attribute corresponds to extra processing required when
 ; the built-in is expanded.  All such special processing should
@@ -215,13 +216,8 @@
   double __builtin_mffsl ();
 MFFSL rs6000_mffsl {}
 
-; This thing really assumes long double == __ibm128, and I'm told it has
-; been used as such within libgcc.  Given that __builtin_pack_ibm128
-; exists for the same purpose, this should really not be used at all.
-; TODO: Consider adding special handling for this to warn whenever
-; long double is not __ibm128.
   const long double __builtin_pack_longdouble (double, double);
-PACK_TF packtf {}
+PACK_TF packtf {ibmld}
 
   unsigned long __builtin_ppc_mftb ();
 MFTB rs6000_mftb_di {32bit}
@@ -244,9 +240,8 @@
   const double __builtin_unpack_ibm128 (__ibm128, const int<1>);
 UNPACK_IF unpackif {}
 
-; See above comments for __builtin_pack_longdouble.
   const double __builtin_unpack_longdouble (long double, const int<1>);
-UNPACK_TF unpacktf {}
+UNPACK_TF unpacktf {ibmld}
 
 
 ; Builtins that have been around just about forever, but not quite.
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index d9736eaf21c..b6f0c6c4c08 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -15741,6 +15741,13 @@ rs6000_expand_new_builtin (tree exp, rtx target,
   return const0_rtx;
 }
 
+  if (bif_is_ibmld (*bifaddr) && !FLOAT128_2REG_P (TFmode))
+{
+  error ("%<%s%> requires long double to be IBM 128-bit format",
+bifaddr->bifname);
+  return const0_rtx;
+}
+
   if (bif_is_cpu (*bifaddr))
 return new_cpu_expand_builtin (fcode, exp, target);
 
diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c b/gcc/config/rs6000/rs6000-gen-builtins.c
index d2e9c4ce547..ababe83895c 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -92,6 +92,7 @@ along with GCC; see the file COPYING3.  If not see
  lxvrse   Needs special handling for load-rightmost, sign-extended
  lxvrze   Needs special handling for load-rightmost, zero-extended
  endian   Needs special handling for endianness
+ ibmld    Restrict usage to the case when TFmode is IBM-128
 
An example stanza might look like this:
 
@@ -390,6 +391,7 @@ struct attrinfo
   bool islxvrse;
   bool islxvrze;
   bool isendian;
+  bool isibmld;
 };
 
 /* Fields associated with a function prototype (bif or overload).  */
@@ -1435,6 +1437,8 @@ parse_bif_attrs (attrinfo *attrptr)
  attrptr->islxvrze = 1;
else if (!strcmp (attrname, "endian"))
  attrptr->isendian = 1;
+   else if (!strcmp (attrname, "ibmld"))
+ attrptr->isibmld = 1;
else
  {
diag (oldpos, "unknown attribute.\n");
@@ -1468,14 +1472,14 @@ parse_bif_attrs (attrinfo *attrptr)
"ldvec = %d, stvec = %d, reve = %d, pred = %d, htm = %d, "
"htmspr = %d, htmcr = %d, mma = %d, quad = %d, pair = %d, "
"mmaint = %d, no32bit = %d, 32bit = %d, cpu = %d, ldstmask = %d, "
-   "lxvrse = %d, lxvrze = %d, endian = %d.\n",
+   "lxvrse = %d, lxvrze = %d, 

[PATCH] rs6000: __builtin_darn[_raw] should be in [power9-64] (PR103624)

2021-12-13 Thread Bill Schmidt via Gcc-patches
Hi!

PR103624 observes that we get segfaults for the 64-bit darn builtins when
compiled on a 32-bit architecture.  The old built-in infrastructure requires
TARGET_64BIT, and this was missed in the new support.  Moving these two
builtins from the [power9] stanza to the [power9-64] stanza solves the problem.
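
The effect of the move can be modeled with a small table.  This is only a
sketch of the idea that a stanza decides when its built-ins are available:
the enum, the table, and the names stanza_enabled and bif_enabled are
hypothetical, and the "power9" flag stands in for whatever condition enables
the [power9] stanza in the real infrastructure.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the two stanzas involved.  */
enum stanza { STANZA_POWER9, STANZA_POWER9_64 };

struct bif
{
  const char *name;
  enum stanza stanza;
};

/* Placement after the patch: only darn_32 stays in [power9].  */
static const struct bif darn_bifs[] = {
  { "__builtin_darn_32",  STANZA_POWER9 },
  { "__builtin_darn",     STANZA_POWER9_64 },
  { "__builtin_darn_raw", STANZA_POWER9_64 },
};

/* Return true if stanza S is enabled for the given target properties.  */
bool
stanza_enabled (enum stanza s, bool power9, bool is_64bit)
{
  switch (s)
    {
    case STANZA_POWER9:
      return power9;
    case STANZA_POWER9_64:
      return power9 && is_64bit;
    }
  return false;
}

/* Return true if the built-in NAME is usable; unknown names are not.  */
bool
bif_enabled (const char *name, bool power9, bool is_64bit)
{
  for (size_t i = 0; i < sizeof darn_bifs / sizeof darn_bifs[0]; i++)
    if (!strcmp (darn_bifs[i].name, name))
      return stanza_enabled (darn_bifs[i].stanza, power9, is_64bit);
  return false;
}
```

With this placement, a 32-bit Power9 target sees __builtin_darn_32 only, so
a use of the 64-bit built-ins is rejected instead of reaching a pattern that
is not enabled.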

Tested the fix on a powerpc-e300c3-linux-gnu cross.  Bootstrapped and tested on
powerpc64le-linux-gnu with no regressions.  Is this okay for trunk?

Thanks!
Bill


2021-12-13  Bill Schmidt  

gcc/
PR target/103624
* config/rs6000/rs6000-builtin-new.def (__builtin_darn): Move to
[power9-64] stanza.
(__builtin_darn_raw): Likewise.
---
 gcc/config/rs6000/rs6000-builtin-new.def | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
index 30556e5c7f2..2becd96a36c 100644
--- a/gcc/config/rs6000/rs6000-builtin-new.def
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -2799,15 +2799,9 @@
 
 ; Miscellaneous P9 functions
 [power9]
-  signed long long __builtin_darn ();
-DARN darn {}
-
   signed int __builtin_darn_32 ();
 DARN_32 darn_32 {}
 
-  signed long long __builtin_darn_raw ();
-DARN_RAW darn_raw {}
-
   const signed int __builtin_dtstsfi_eq_dd (const int<6>, _Decimal64);
 TSTSFI_EQ_DD dfptstsfi_eq_dd {}
 
@@ -2840,6 +2834,12 @@
   void __builtin_altivec_stxvl (vsc, void *, long);
 STXVL stxvl {}
 
+  signed long long __builtin_darn ();
+DARN darn {}
+
+  signed long long __builtin_darn_raw ();
+DARN_RAW darn_raw {}
+
   const signed int __builtin_scalar_byte_in_set (signed int, signed long long);
 CMPEQB cmpeqb {}
 
-- 
2.27.0




Re: [PATCH] rs6000: Refactor altivec_build_resolved_builtin

2021-12-09 Thread Bill Schmidt via Gcc-patches
I forgot to point out that this patch is dependent on the pending patches
to remove the old builtins code.

Thanks,
Bill

On 12/9/21 12:33 PM, Bill Schmidt via Gcc-patches wrote:
> Hi!
>
> While replacing the built-in machinery, we agreed to defer some necessary
> refactoring of the overload processing.  This patch cleans it up considerably.
>
> I've put in one FIXME for an additional level of cleanup that should be done
> independently.  The various helper functions (resolve_VEC_*) can be simplified
> if we move the argument processing in altivec_resolve_overloaded_builtin
> earlier.  But this requires making nontrivial changes to those functions that
> will need careful review.  Let's do that in a later patch.
>
> Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is this
> okay for trunk?
>
> Thanks!
> Bill
>
>
> 2021-12-09  Bill Schmidt  
>
> gcc/
>   * config/rs6000/rs6000-c.c (resolution): New enum.
>   (resolve_VEC_MUL): New function.
>   (resolve_VEC_CMPNE): Likewise.
>   (resolve_VEC_ADDE_SUBE): Likewise.
>   (resolve_VEC_ADDEC_SUBEC): Likewise.
>   (resolve_VEC_SPLATS): Likewise.
>   (resolve_VEC_EXTRACT): Likewise.
>   (resolve_VEC_INSERT): Likewise.
>   (resolve_VEC_STEP): Likewise.
>   (find_instance): Likewise.
>   (altivec_resolve_overloaded_builtin): Many cleanups:  Call factored-out
>   functions.  Move variable declarations closer to uses.  Add commentary.
>   Remove unnecessary levels of braces.  Avoid use of gotos.  Change
>   misleading variable names.  Use switches over if-else-if chains.
> ---
>  gcc/config/rs6000/rs6000-c.c | 1717 +++---
>  1 file changed, 945 insertions(+), 772 deletions(-)
>
> diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
> index e0ebdeed548..45f485aab44 100644
> --- a/gcc/config/rs6000/rs6000-c.c
> +++ b/gcc/config/rs6000/rs6000-c.c
> @@ -928,710 +928,939 @@ altivec_build_resolved_builtin (tree *args, int n, tree fntype, tree ret_type,
>return fold_convert (ret_type, call);
>  }
>
> -/* Implementation of the resolve_overloaded_builtin target hook, to
> -   support Altivec's overloaded builtins.  FIXME: This code needs
> -   to be brutally factored.  */
> +/* Enumeration of possible results from attempted overload resolution.
> +   This is used by special-case helper functions to tell their caller
> +   whether they succeeded and what still needs to be done.
>
> -tree
> -altivec_resolve_overloaded_builtin (location_t loc, tree fndecl,
> - void *passed_arglist)
> + unresolved = Still needs processing
> +   resolved = Resolved (but may be an error_mark_node)
> +  resolved_bad = An error that needs handling by the caller.  */
> +
> +enum resolution { unresolved, resolved, resolved_bad };
> +
> +/* Resolve an overloaded vec_mul call and return a tree expression for the
> +   resolved call if successful.  NARGS is the number of arguments to the call.
> +   ARGLIST contains the arguments.  RES must be set to indicate the status of
> +   the resolution attempt.  LOC contains statement location information.  */
> +
> +static tree
> +resolve_VEC_MUL (resolution *res, vec<tree, va_gc> *arglist, unsigned nargs,
> +  location_t loc)
>  {
> -  vec<tree, va_gc> *arglist = static_cast<vec<tree, va_gc> *> (passed_arglist);
> -  unsigned int nargs = vec_safe_length (arglist);
> -  enum rs6000_gen_builtins fcode
> -= (enum rs6000_gen_builtins) DECL_MD_FUNCTION_CODE (fndecl);
> -  tree fnargs = TYPE_ARG_TYPES (TREE_TYPE (fndecl));
> -  tree types[MAX_OVLD_ARGS];
> -  tree args[MAX_OVLD_ARGS];
> +  /* vec_mul needs to be special cased because there are no instructions for it
> + for the {un}signed char, {un}signed short, and {un}signed int types.  */
> +  if (nargs != 2)
> +{
> +  error ("builtin %qs only accepts 2 arguments", "vec_mul");
> +  *res = resolved;
> +  return error_mark_node;
> +}
>
> -  /* Return immediately if this isn't an overload.  */
> -  if (fcode <= RS6000_OVLD_NONE)
> -return NULL_TREE;
> +  tree arg0 = (*arglist)[0];
> +  tree arg0_type = TREE_TYPE (arg0);
> +  tree arg1 = (*arglist)[1];
> +  tree arg1_type = TREE_TYPE (arg1);
>
> -  unsigned int adj_fcode = fcode - RS6000_OVLD_NONE;
> +  /* Both arguments must be vectors and the types must be compatible.  */
> +  if (TREE_CODE (arg0_type) != VECTOR_TYPE
> +  || !lang_hooks.types_compatible_p (arg0_type, arg1_type))
> +{
> +  *res = resolved_bad;
> +  return error_mark_node;
> +}
>
> -  if (TARGET_DEBUG_BUILTIN)
> -fprintf (stderr, "alti

[PATCH] rs6000: Refactor altivec_build_resolved_builtin

2021-12-09 Thread Bill Schmidt via Gcc-patches
Hi!

While replacing the built-in machinery, we agreed to defer some necessary
refactoring of the overload processing.  This patch cleans it up considerably.

I've put in one FIXME for an additional level of cleanup that should be done
independently.  The various helper functions (resolve_VEC_*) can be simplified
if we move the argument processing in altivec_resolve_overloaded_builtin
earlier.  But this requires making nontrivial changes to those functions that
will need careful review.  Let's do that in a later patch.
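
The contract between a special-case helper and its caller, using a
three-state result, can be sketched on its own.  Everything below except the
three enumerator names is hypothetical: the character "kind" replaces real
type checking, and resolve_mul, resolve_overload, and enum action are
illustrative names rather than functions from the patch.

```c
#include <stdbool.h>

/* Status reported by a special-case helper.  */
enum resolution { unresolved, resolved, resolved_bad };

/* What the caller ends up doing (hypothetical, for illustration).  */
enum action
{
  use_expansion,
  error_already_reported,
  report_bad_arguments,
  generic_lookup
};

/* Hypothetical helper for a two-argument "mul" overload.  KIND stands in
   for the element type: 'i' is handled here, 'f' is left to the generic
   path, and anything else is invalid.  Returns true only when the helper
   produced an expansion; *RES is always set.  */
bool
resolve_mul (enum resolution *res, char kind, unsigned nargs)
{
  if (nargs != 2)
    {
      /* The helper would diagnose this itself, so it is "resolved".  */
      *res = resolved;
      return false;
    }

  switch (kind)
    {
    case 'i':
      *res = resolved;
      return true;
    case 'f':
      *res = unresolved;
      return false;
    default:
      *res = resolved_bad;
      return false;
    }
}

/* Caller: dispatch on the status, not on the return value alone.  */
enum action
resolve_overload (char kind, unsigned nargs)
{
  enum resolution res = unresolved;
  bool have_expansion = resolve_mul (&res, kind, nargs);

  switch (res)
    {
    case resolved:
      return have_expansion ? use_expansion : error_already_reported;
    case resolved_bad:
      return report_bad_arguments;
    case unresolved:
    default:
      return generic_lookup;
    }
}
```

Separating "resolved" from "resolved_bad" lets a helper either emit its own
diagnostic or defer to the caller's common error path, which is what makes
it possible to drop the gotos from the main function.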

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is this
okay for trunk?

Thanks!
Bill


2021-12-09  Bill Schmidt  

gcc/
* config/rs6000/rs6000-c.c (resolution): New enum.
(resolve_VEC_MUL): New function.
(resolve_VEC_CMPNE): Likewise.
(resolve_VEC_ADDE_SUBE): Likewise.
(resolve_VEC_ADDEC_SUBEC): Likewise.
(resolve_VEC_SPLATS): Likewise.
(resolve_VEC_EXTRACT): Likewise.
(resolve_VEC_INSERT): Likewise.
(resolve_VEC_STEP): Likewise.
(find_instance): Likewise.
(altivec_resolve_overloaded_builtin): Many cleanups:  Call factored-out
functions.  Move variable declarations closer to uses.  Add commentary.
Remove unnecessary levels of braces.  Avoid use of gotos.  Change
misleading variable names.  Use switches over if-else-if chains.
---
 gcc/config/rs6000/rs6000-c.c | 1717 +++---
 1 file changed, 945 insertions(+), 772 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index e0ebdeed548..45f485aab44 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -928,710 +928,939 @@ altivec_build_resolved_builtin (tree *args, int n, tree fntype, tree ret_type,
   return fold_convert (ret_type, call);
 }
 
-/* Implementation of the resolve_overloaded_builtin target hook, to
-   support Altivec's overloaded builtins.  FIXME: This code needs
-   to be brutally factored.  */
+/* Enumeration of possible results from attempted overload resolution.
+   This is used by special-case helper functions to tell their caller
+   whether they succeeded and what still needs to be done.
 
-tree
-altivec_resolve_overloaded_builtin (location_t loc, tree fndecl,
-   void *passed_arglist)
+   unresolved = Still needs processing
+ resolved = Resolved (but may be an error_mark_node)
+  resolved_bad = An error that needs handling by the caller.  */
+
+enum resolution { unresolved, resolved, resolved_bad };
+
+/* Resolve an overloaded vec_mul call and return a tree expression for the
+   resolved call if successful.  NARGS is the number of arguments to the call.
+   ARGLIST contains the arguments.  RES must be set to indicate the status of
+   the resolution attempt.  LOC contains statement location information.  */
+
+static tree
+resolve_VEC_MUL (resolution *res, vec<tree, va_gc> *arglist, unsigned nargs,
+location_t loc)
 {
-  vec<tree, va_gc> *arglist = static_cast<vec<tree, va_gc> *> (passed_arglist);
-  unsigned int nargs = vec_safe_length (arglist);
-  enum rs6000_gen_builtins fcode
-= (enum rs6000_gen_builtins) DECL_MD_FUNCTION_CODE (fndecl);
-  tree fnargs = TYPE_ARG_TYPES (TREE_TYPE (fndecl));
-  tree types[MAX_OVLD_ARGS];
-  tree args[MAX_OVLD_ARGS];
+  /* vec_mul needs to be special cased because there are no instructions for it
+ for the {un}signed char, {un}signed short, and {un}signed int types.  */
+  if (nargs != 2)
+{
+  error ("builtin %qs only accepts 2 arguments", "vec_mul");
+  *res = resolved;
+  return error_mark_node;
+}
 
-  /* Return immediately if this isn't an overload.  */
-  if (fcode <= RS6000_OVLD_NONE)
-return NULL_TREE;
+  tree arg0 = (*arglist)[0];
+  tree arg0_type = TREE_TYPE (arg0);
+  tree arg1 = (*arglist)[1];
+  tree arg1_type = TREE_TYPE (arg1);
 
-  unsigned int adj_fcode = fcode - RS6000_OVLD_NONE;
+  /* Both arguments must be vectors and the types must be compatible.  */
+  if (TREE_CODE (arg0_type) != VECTOR_TYPE
+  || !lang_hooks.types_compatible_p (arg0_type, arg1_type))
+{
+  *res = resolved_bad;
+  return error_mark_node;
+}
 
-  if (TARGET_DEBUG_BUILTIN)
-fprintf (stderr, "altivec_resolve_overloaded_builtin, code = %4d, %s\n",
-(int) fcode, IDENTIFIER_POINTER (DECL_NAME (fndecl)));
+  switch (TYPE_MODE (TREE_TYPE (arg0_type)))
+{
+case E_QImode:
+case E_HImode:
+case E_SImode:
+case E_DImode:
+case E_TImode:
+  /* For scalar types just use a multiply expression.  */
+  *res = resolved;
+  return fold_build2_loc (loc, MULT_EXPR, TREE_TYPE (arg0), arg0,
+ fold_convert (TREE_TYPE (arg0), arg1));
+case E_SFmode:
+  {
+   /* For floats use the xvmulsp instruction directly.  */
+   *res = resolved;
+   tree call = rs6000_builtin_decls[RS6000_BIF_XVMULSP];

[PATCH 6/6] rs6000: Rename arrays to remove temporary _x suffix

2021-12-06 Thread Bill Schmidt via Gcc-patches
Hi!

While we had two sets of built-in infrastructure at once, I added _x as a
suffix to two arrays to disambiguate the old and new versions.  Time to fix
that also.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is this
okay for trunk?

Thanks!
Bill

2021-12-06  Bill Schmidt  

gcc/
* config/rs6000/rs6000-c.c (altivec_build_resolved_builtin): Rename
rs6000_builtin_decls_x to rs6000_builtin_decls.
(altivec_resolve_overloaded_builtin): Likewise.  Also rename
rs6000_builtin_info_x to rs6000_builtin_info.
* config/rs6000/rs6000-call.c (rs6000_invalid_builtin): Rename
rs6000_builtin_info_x to rs6000_builtin_info.
(rs6000_builtin_is_supported): Likewise.
(rs6000_gimple_fold_mma_builtin): Likewise.  Also rename
rs6000_builtin_decls_x to rs6000_builtin_decls.
(rs6000_gimple_fold_builtin): Rename rs6000_builtin_info_x to
rs6000_builtin_info.
(cpu_expand_builtin): Likewise.
(rs6000_expand_builtin): Likewise.
(rs6000_init_builtins): Likewise.  Also rename rs6000_builtin_decls_x
to rs6000_builtin_decls.
(rs6000_builtin_decl): Rename rs6000_builtin_decls_x to
rs6000_builtin_decls.
* config/rs6000/rs6000-gen-builtins.c (write_decls): In generated code,
rename rs6000_builtin_decls_x to rs6000_builtin_decls, and rename
rs6000_builtin_info_x to rs6000_builtin_info.
(write_bif_static_init): In generated code, rename
rs6000_builtin_info_x to rs6000_builtin_info.
(write_init_bif_table): In generated code, rename
rs6000_builtin_decls_x to rs6000_builtin_decls, and rename
rs6000_builtin_info_x to rs6000_builtin_info.
(write_init_ovld_table): In generated code, rename
rs6000_builtin_decls_x to rs6000_builtin_decls.
(write_init_file): Likewise.
* config/rs6000/rs6000.c (rs6000_builtin_vectorized_function):
Likewise.
(rs6000_builtin_md_vectorized_function): Likewise.
(rs6000_builtin_reciprocal): Likewise.
(add_condition_to_bb): Likewise.
(rs6000_atomic_assign_expand_fenv): Likewise.
---
 gcc/config/rs6000/rs6000-c.c| 64 -
 gcc/config/rs6000/rs6000-call.c | 46 +-
 gcc/config/rs6000/rs6000-gen-builtins.c | 27 +--
 gcc/config/rs6000/rs6000.c  | 58 +++---
 4 files changed, 96 insertions(+), 99 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index f790c72d621..e0ebdeed548 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -867,7 +867,7 @@ altivec_build_resolved_builtin (tree *args, int n, tree fntype, tree ret_type,
 {
   tree argtypes = TYPE_ARG_TYPES (fntype);
   tree arg_type[MAX_OVLD_ARGS];
-  tree fndecl = rs6000_builtin_decls_x[bif_id];
+  tree fndecl = rs6000_builtin_decls[bif_id];
 
   for (int i = 0; i < n; i++)
 {
@@ -1001,13 +1001,13 @@ altivec_resolve_overloaded_builtin (location_t loc, tree fndecl,
  case E_SFmode:
{
  /* For floats use the xvmulsp instruction directly.  */
- tree call = rs6000_builtin_decls_x[RS6000_BIF_XVMULSP];
+ tree call = rs6000_builtin_decls[RS6000_BIF_XVMULSP];
  return build_call_expr (call, 2, arg0, arg1);
}
  case E_DFmode:
{
  /* For doubles use the xvmuldp instruction directly.  */
- tree call = rs6000_builtin_decls_x[RS6000_BIF_XVMULDP];
+ tree call = rs6000_builtin_decls[RS6000_BIF_XVMULDP];
  return build_call_expr (call, 2, arg0, arg1);
}
  /* Other types are errors.  */
@@ -1066,7 +1066,7 @@ altivec_resolve_overloaded_builtin (location_t loc, tree fndecl,
vec_safe_push (params, arg0);
vec_safe_push (params, arg1);
tree call = altivec_resolve_overloaded_builtin
- (loc, rs6000_builtin_decls_x[RS6000_OVLD_VEC_CMPEQ],
+ (loc, rs6000_builtin_decls[RS6000_OVLD_VEC_CMPEQ],
   params);
/* Use save_expr to ensure that operands used more than once
   that may have side effects (like calls) are only evaluated
@@ -1076,7 +1076,7 @@ altivec_resolve_overloaded_builtin (location_t loc, tree fndecl,
vec_safe_push (params, call);
vec_safe_push (params, call);
return altivec_resolve_overloaded_builtin
- (loc, rs6000_builtin_decls_x[RS6000_OVLD_VEC_NOR], params);
+ (loc, rs6000_builtin_decls[RS6000_OVLD_VEC_NOR], params);
  }
  /* Other types are errors.  */
default:
@@ -1129,9 +1129,9 @@ altivec_resolve_overloaded_builtin (location_t loc, tree fndecl,
  vec_safe_push (params, arg1);
 
  if (fc

[PATCH 5/6] rs6000: Rename functions with "new" in their names

2021-12-06 Thread Bill Schmidt via Gcc-patches
Hi!

While we had two sets of built-in functionality at the same time, I put "new"
in the names of quite a few functions.  Time to undo that.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is this
okay for trunk?

Thanks!
Bill

2021-12-02  Bill Schmidt  

gcc/
* config/rs6000/rs6000-c.c (altivec_resolve_new_overloaded_builtin):
Remove forward declaration.
(rs6000_new_builtin_type_compatible): Rename to
rs6000_builtin_type_compatible.
(rs6000_builtin_type_compatible): Remove.
(altivec_resolve_overloaded_builtin): Remove.
(altivec_build_new_resolved_builtin): Rename to
altivec_build_resolved_builtin.
(altivec_resolve_new_overloaded_builtin): Rename to
altivec_resolve_overloaded_builtin.  Remove static keyword.  Adjust
called function names.
* config/rs6000/rs6000-call.c (rs6000_expand_new_builtin): Remove
forward declaration.
(rs6000_gimple_fold_new_builtin): Likewise.
(rs6000_invalid_new_builtin): Rename to rs6000_invalid_builtin.
(rs6000_gimple_fold_builtin): Remove.
(rs6000_new_builtin_valid_without_lhs): Rename to
rs6000_builtin_valid_without_lhs.
(rs6000_new_builtin_is_supported): Rename to
rs6000_builtin_is_supported.
(rs6000_gimple_fold_new_mma_builtin): Rename to
rs6000_gimple_fold_mma_builtin.
(rs6000_gimple_fold_new_builtin): Rename to
rs6000_gimple_fold_builtin.  Remove static keyword.  Adjust called
function names.
(rs6000_expand_builtin): Remove.
(new_cpu_expand_builtin): Rename to cpu_expand_builtin.
(new_mma_expand_builtin): Rename to mma_expand_builtin.
(new_htm_spr_num): Rename to htm_spr_num.
(new_htm_expand_builtin): Rename to htm_expand_builtin.  Change name
of called function.
(rs6000_expand_new_builtin): Rename to rs6000_expand_builtin.  Remove
static keyword.  Adjust called function names.
(rs6000_new_builtin_decl): Rename to rs6000_builtin_decl.  Remove
static keyword.
(rs6000_builtin_decl): Remove.
* config/rs6000/rs6000-gen-builtins.c (write_decls): In generated code,
rename rs6000_new_builtin_is_supported to rs6000_builtin_is_supported.
* config/rs6000/rs6000-internal.h (rs6000_invalid_new_builtin): Rename
to rs6000_invalid_builtin.
* config/rs6000/rs6000.c (rs6000_new_builtin_vectorized_function):
Rename to rs6000_builtin_vectorized_function.
(rs6000_new_builtin_md_vectorized_function): Rename to
rs6000_builtin_md_vectorized_function.
(rs6000_builtin_vectorized_function): Remove.
(rs6000_builtin_md_vectorized_function): Remove.
---
 gcc/config/rs6000/rs6000-c.c| 120 +---
 gcc/config/rs6000/rs6000-call.c |  99 ++-
 gcc/config/rs6000/rs6000-gen-builtins.c |   3 +-
 gcc/config/rs6000/rs6000-internal.h |   2 +-
 gcc/config/rs6000/rs6000.c  |  31 ++
 5 files changed, 80 insertions(+), 175 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index d44edf585aa..f790c72d621 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -37,9 +37,6 @@
 
 #include "rs6000-internal.h"
 
-static tree altivec_resolve_new_overloaded_builtin (location_t, tree, void *);
-
-
 /* Handle the machine specific pragma longcall.  Its syntax is
 
# pragma longcall ( TOGGLE )
@@ -817,7 +814,7 @@ is_float128_p (tree t)
 
 /* Return true iff ARGTYPE can be compatibly passed as PARMTYPE.  */
 static bool
-rs6000_new_builtin_type_compatible (tree parmtype, tree argtype)
+rs6000_builtin_type_compatible (tree parmtype, tree argtype)
 {
   if (parmtype == error_mark_node)
 return false;
@@ -840,23 +837,6 @@ rs6000_new_builtin_type_compatible (tree parmtype, tree argtype)
   return lang_hooks.types_compatible_p (parmtype, argtype);
 }
 
-static inline bool
-rs6000_builtin_type_compatible (tree t, int id)
-{
-  tree builtin_type;
-  builtin_type = rs6000_builtin_type (id);
-  if (t == error_mark_node)
-return false;
-  if (INTEGRAL_TYPE_P (t) && INTEGRAL_TYPE_P (builtin_type))
-return true;
-  else if (TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128
-  && is_float128_p (t) && is_float128_p (builtin_type))
-return true;
-  else
-return lang_hooks.types_compatible_p (t, builtin_type);
-}
-
-
 /* In addition to calling fold_convert for EXPR of type TYPE, also
call c_fully_fold to remove any C_MAYBE_CONST_EXPRs that could be
hiding there (PR47197).  */
@@ -873,16 +853,6 @@ fully_fold_convert (tree type, tree expr)
   return result;
 }
 
-/* Implementation of the resolve_overloaded_builtin target hook, to
-   support Altivec's overloaded builtins.  */
-
-tree
-altivec_resolve_overloaded_builtin (location_t loc, t

[PATCH 4/6] rs6000: Remove rs6000-builtin.def and associated data and functions

2021-12-06 Thread Bill Schmidt via Gcc-patches
Hi!

The old rs6000-builtin.def file is no longer needed.  Remove it and the code
that depends on it.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is this
okay for trunk?

Thanks!
Bill

2021-12-02  Bill Schmidt  

gcc/
* config/rs6000/rs6000-builtin.def: Delete.
* config/rs6000/rs6000-call.c (builtin_compatibility): Delete.
(builtin_description): Delete.
(builtin_hash_struct): Delete.
(builtin_hasher): Delete.
(builtin_hash_table): Delete.
(builtin_hasher::hash): Delete.
(builtin_hasher::equal): Delete.
(rs6000_builtin_info_type): Delete.
(rs6000_builtin_info): Delete.
(bdesc_compat): Delete.
(bdesc_3arg): Delete.
(bdesc_4arg): Delete.
(bdesc_dst): Delete.
(bdesc_2arg): Delete.
(bdesc_altivec_preds): Delete.
(bdesc_abs): Delete.
(bdesc_1arg): Delete.
(bdesc_0arg): Delete.
(bdesc_htm): Delete.
(bdesc_mma): Delete.
(rs6000_overloaded_builtin_p): Delete.
(rs6000_overloaded_builtin_name): Delete.
(htm_spr_num): Delete.
(rs6000_builtin_is_supported_p): Delete.
(rs6000_gimple_fold_mma_builtin): Delete.
(gt-rs6000-call.h): Remove include directive.
* config/rs6000/rs6000-protos.h (rs6000_overloaded_builtin_p): Delete.
(rs6000_builtin_is_supported_p): Delete.
(rs6000_overloaded_builtin_name): Delete.
* config/rs6000/rs6000.c (rs6000_builtin_decls): Delete.
(rs6000_debug_reg_global): Remove reference to RS6000_BUILTIN_COUNT.
* config/rs6000/rs6000.h (rs6000_builtins): Delete.
(altivec_builtin_types): Delete.
(rs6000_builtin_decls): Delete.
* config/rs6000/t-rs6000 (TM_H): Don't add rs6000-builtin.def.
---
 gcc/config/rs6000/rs6000-builtin.def | 3350 --
 gcc/config/rs6000/rs6000-call.c  |  712 --
 gcc/config/rs6000/rs6000-protos.h|3 -
 gcc/config/rs6000/rs6000.c   |3 -
 gcc/config/rs6000/rs6000.h   |   57 -
 gcc/config/rs6000/t-rs6000   |1 -
 6 files changed, 4126 deletions(-)
 delete mode 100644 gcc/config/rs6000/rs6000-builtin.def

diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
deleted file mode 100644
index 9dbf16f48c4..000
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 86054f75756..a5ee06c991f 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -89,20 +89,6 @@
 #define TARGET_NO_PROTOTYPE 0
 #endif
 
-struct builtin_compatibility
-{
-  const enum rs6000_builtins code;
-  const char *const name;
-};
-
-struct builtin_description
-{
-  const HOST_WIDE_INT mask;
-  const enum insn_code icode;
-  const char *const name;
-  const enum rs6000_builtins code;
-};
-
 /* Used by __builtin_cpu_is(), mapping from PLATFORM names to values.  */
 static const struct
 {
@@ -184,127 +170,6 @@ static const struct
 
 static rtx rs6000_expand_new_builtin (tree, rtx, rtx, machine_mode, int);
 static bool rs6000_gimple_fold_new_builtin (gimple_stmt_iterator *gsi);
-
-
-/* Hash table to keep track of the argument types for builtin functions.  */
-
-struct GTY((for_user)) builtin_hash_struct
-{
-  tree type;
-  machine_mode mode[4];/* return value + 3 arguments.  */
-  unsigned char uns_p[4];  /* and whether the types are unsigned.  */
-};
-
-struct builtin_hasher : ggc_ptr_hash
-{
-  static hashval_t hash (builtin_hash_struct *);
-  static bool equal (builtin_hash_struct *, builtin_hash_struct *);
-};
-
-static GTY (()) hash_table *builtin_hash_table;
-
-/* Hash function for builtin functions with up to 3 arguments and a return
-   type.  */
-hashval_t
-builtin_hasher::hash (builtin_hash_struct *bh)
-{
-  unsigned ret = 0;
-  int i;
-
-  for (i = 0; i < 4; i++)
-{
-  ret = (ret * (unsigned)MAX_MACHINE_MODE) + ((unsigned)bh->mode[i]);
-  ret = (ret * 2) + bh->uns_p[i];
-}
-
-  return ret;
-}
-
-/* Compare builtin hash entries H1 and H2 for equivalence.  */
-bool
-builtin_hasher::equal (builtin_hash_struct *p1, builtin_hash_struct *p2)
-{
-  return ((p1->mode[0] == p2->mode[0])
- && (p1->mode[1] == p2->mode[1])
- && (p1->mode[2] == p2->mode[2])
- && (p1->mode[3] == p2->mode[3])
- && (p1->uns_p[0] == p2->uns_p[0])
- && (p1->uns_p[1] == p2->uns_p[1])
- && (p1->uns_p[2] == p2->uns_p[2])
- && (p1->uns_p[3] == p2->uns_p[3]));
-}
-
-
-/* Table that classifies rs6000 builtin functions (pure, const, etc.).  */
-#undef RS6000_BUILTIN_0
-#undef RS6000_BUILTIN_1
-#undef RS6000_BUILTIN_2
-#undef RS6000_BUILTIN_3
-#undef RS6000_BUILTIN_4
-#undef RS6000_BUILTIN_A
-#undef RS6000_BUILTIN_D
-#undef RS6000_BUILTIN_H
-#undef RS6000_BUILTIN_M
-#undef RS6000

[PATCH 3/6] rs6000: Rename rs6000-builtin-new.def to rs6000-builtins.def

2021-12-06 Thread Bill Schmidt via Gcc-patches
Hi!

This patch just renames a file and updates the build machinery accordingly.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is this
okay for trunk?

Thanks!
Bill

2021-12-02  Bill Schmidt  

gcc/
* config/rs6000/rs6000-builtin-new.def: Rename to...
* config/rs6000/rs6000-builtins.def: ...this.
* config/rs6000/rs6000-gen-builtins.c: Adjust header commentary.
* config/rs6000/t-rs6000 (EXTRA_GTYPE_DEPS): Rename
rs6000-builtin-new.def to rs6000-builtins.def.
(rs6000-builtins.c): Likewise.
---
 .../rs6000/{rs6000-builtin-new.def => rs6000-builtins.def}  | 0
 gcc/config/rs6000/rs6000-gen-builtins.c | 4 ++--
 gcc/config/rs6000/t-rs6000  | 6 +++---
 3 files changed, 5 insertions(+), 5 deletions(-)
 rename gcc/config/rs6000/{rs6000-builtin-new.def => rs6000-builtins.def} (100%)

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtins.def
similarity index 100%
rename from gcc/config/rs6000/rs6000-builtin-new.def
rename to gcc/config/rs6000/rs6000-builtins.def
diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c b/gcc/config/rs6000/rs6000-gen-builtins.c
index 78b2486aafc..9c61b7d9fe6 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -22,7 +22,7 @@ along with GCC; see the file COPYING3.  If not see
recognition code for Power targets, based on text files that
describe the built-in functions and vector overloads:
 
- rs6000-builtin-new.def Table of built-in functions
+ rs6000-builtins.defTable of built-in functions
  rs6000-overload.defTable of overload functions
 
Both files group similar functions together in "stanzas," as
@@ -125,7 +125,7 @@ along with GCC; see the file COPYING3.  If not see
 
The second line contains the  that this particular instance of
the overloaded function maps to.  It must match a token that appears in
-   rs6000-builtin-new.def.  Optionally, a second token may appear.  If only
+   rs6000-builtins.def.  Optionally, a second token may appear.  If only
one token is on the line, it is also used to build the unique identifier
for the overloaded function.  If a second token is present, the second
token is used instead for this purpose.  This is necessary in cases
diff --git a/gcc/config/rs6000/t-rs6000 b/gcc/config/rs6000/t-rs6000
index d48a4b1be6c..3d3143a171d 100644
--- a/gcc/config/rs6000/t-rs6000
+++ b/gcc/config/rs6000/t-rs6000
@@ -22,7 +22,7 @@ TM_H += $(srcdir)/config/rs6000/rs6000-builtin.def
 TM_H += $(srcdir)/config/rs6000/rs6000-cpus.def
 TM_H += $(srcdir)/config/rs6000/rs6000-modes.h
 PASSES_EXTRA += $(srcdir)/config/rs6000/rs6000-passes.def
-EXTRA_GTYPE_DEPS += $(srcdir)/config/rs6000/rs6000-builtin-new.def
+EXTRA_GTYPE_DEPS += $(srcdir)/config/rs6000/rs6000-builtins.def
 
 rs6000-pcrel-opt.o: $(srcdir)/config/rs6000/rs6000-pcrel-opt.c
$(COMPILE) $<
@@ -59,10 +59,10 @@ build/rs6000-gen-builtins$(build_exeext): build/rs6000-gen-builtins.o \
 # For now, the header files depend on rs6000-builtins.c, which avoids
 # races because the .c file is closed last in rs6000-gen-builtins.c.
 rs6000-builtins.c: build/rs6000-gen-builtins$(build_exeext) \
-  $(srcdir)/config/rs6000/rs6000-builtin-new.def \
+  $(srcdir)/config/rs6000/rs6000-builtins.def \
   $(srcdir)/config/rs6000/rs6000-overload.def
$(RUN_GEN) ./build/rs6000-gen-builtins$(build_exeext) \
-   $(srcdir)/config/rs6000/rs6000-builtin-new.def \
+   $(srcdir)/config/rs6000/rs6000-builtins.def \
$(srcdir)/config/rs6000/rs6000-overload.def rs6000-builtins.h \
rs6000-builtins.c rs6000-vecdefines.h
 
-- 
2.27.0



[PATCH v2 0/6] Remove "old" built-in function infrastructure

2021-12-06 Thread Bill Schmidt via Gcc-patches
Hi!

Now that the new built-in function support is all upstream and enabled, it
seems safe and prudent to remove the old code to avoid confusion.  I broke this
up to the extent possible, but a couple of patches are still pretty large.

David Edelsohn found that I had broken some C++ library functions for AIX, and
his fix for that required me to re-spin the patches.  I also generated the diff
with a more efficient algorithm to reduce the patch size.  Otherwise this
series is identical to V1.

Thanks!
Bill

Bill Schmidt (6):
  rs6000: Remove new_builtins_are_live and dead code it was guarding
  rs6000: Remove altivec_overloaded_builtins array and initialization
  rs6000: Rename rs6000-builtin-new.def to rs6000-builtins.def
  rs6000: Remove rs6000-builtin.def and associated data and functions
  rs6000: Rename functions with "new" in their names
  rs6000: Rename arrays to remove temporary _x suffix

 gcc/config/rs6000/darwin.h| 8 +-
 gcc/config/rs6000/rs6000-builtin.def  |  3350 -
 ...00-builtin-new.def => rs6000-builtins.def} | 0
 gcc/config/rs6000/rs6000-c.c  |  1266 +-
 gcc/config/rs6000/rs6000-call.c   | 11964 +---
 gcc/config/rs6000/rs6000-gen-builtins.c   |   115 +-
 gcc/config/rs6000/rs6000-internal.h   | 2 +-
 gcc/config/rs6000/rs6000-protos.h | 3 -
 gcc/config/rs6000/rs6000.c|   334 +-
 gcc/config/rs6000/rs6000.h|58 -
 gcc/config/rs6000/t-rs6000| 7 +-
 11 files changed, 224 insertions(+), 16883 deletions(-)
 delete mode 100644 gcc/config/rs6000/rs6000-builtin.def
 rename gcc/config/rs6000/{rs6000-builtin-new.def => rs6000-builtins.def} (100%)

-- 
2.27.0



Re: [PATCH 0/6] rs6000: Remove "old" built-in function infrastructure

2021-12-06 Thread Bill Schmidt via Gcc-patches
I had difficulty with patch 1/6 being too large, and there have been some small
upstream changes in this area, so I will resubmit this series shortly.  There
were also problems with my SMTP server for some of the CCs as well...

Sorry for the churn!
Bill

On 12/3/21 12:22 PM, Bill Schmidt wrote:
> From: Bill Schmidt 
>
> Hi!
>
> Now that the new built-in function support is all upstream and enabled, it
> seems safe and prudent to remove the old code to avoid confusion.  I broke
> this up to the extent possible, but the first patch is a bit large and messy
> because so many dead functions have to be removed when taking out the
> "new_builtins_are_live" variable.
>
> Bill Schmidt (6):
>   rs6000: Remove new_builtins_are_live and dead code it was guarding
>   rs6000: Remove altivec_overloaded_builtins array and initialization
>   rs6000: Rename rs6000-builtin-new.def to rs6000-builtins.def
>   rs6000: Remove rs6000-builtin.def and associated data and functions
>   rs6000: Rename functions with "new" in their names
>   rs6000: Rename arrays to remove temporary _x suffix
>
>  gcc/config/rs6000/darwin.h| 8 +-
>  gcc/config/rs6000/rs6000-builtin.def  |  3350 ---
>  ...00-builtin-new.def => rs6000-builtins.def} | 0
>  gcc/config/rs6000/rs6000-c.c  |  1342 +-
>  gcc/config/rs6000/rs6000-call.c   | 17810 +++-
>  gcc/config/rs6000/rs6000-gen-builtins.c   |   115 +-
>  gcc/config/rs6000/rs6000-internal.h   | 2 +-
>  gcc/config/rs6000/rs6000-protos.h | 3 -
>  gcc/config/rs6000/rs6000.c|   334 +-
>  gcc/config/rs6000/rs6000.h|58 -
>  gcc/config/rs6000/t-rs6000| 7 +-
>  11 files changed, 3173 insertions(+), 19856 deletions(-)
>  delete mode 100644 gcc/config/rs6000/rs6000-builtin.def
>  rename gcc/config/rs6000/{rs6000-builtin-new.def => rs6000-builtins.def} (100%)
>


[PATCH 6/6] rs6000: Rename arrays to remove temporary _x suffix

2021-12-03 Thread Bill Schmidt via Gcc-patches
From: Bill Schmidt 

Hi!

While we had two sets of built-in infrastructure at once, I added _x as a
suffix to two arrays to disambiguate the old and new versions.  Time to fix
that also.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is this
okay for trunk?

Thanks!
Bill

2021-12-02  Bill Schmidt  

gcc/
* config/rs6000/rs6000-c.c (altivec_build_resolved_builtin): Rename
rs6000_builtin_decls_x to rs6000_builtin_decls.
(altivec_resolve_overloaded_builtin): Likewise.  Also rename
rs6000_builtin_info_x to rs6000_builtin_info.
* config/rs6000/rs6000-call.c (rs6000_invalid_builtin): Rename
rs6000_builtin_info_x to rs6000_builtin_info.
(rs6000_builtin_is_supported): Likewise.
(rs6000_gimple_fold_mma_builtin): Likewise.  Also rename
rs6000_builtin_decls_x to rs6000_builtin_decls.
(rs6000_gimple_fold_builtin): Rename rs6000_builtin_info_x to
rs6000_builtin_info.
(cpu_expand_builtin): Likewise.
(rs6000_expand_builtin): Likewise.
(rs6000_init_builtins): Likewise.  Also rename rs6000_builtin_decls_x
to rs6000_builtin_decls.
(rs6000_builtin_decl): Rename rs6000_builtin_decls_x to
rs6000_builtin_decls.
* config/rs6000/rs6000-gen-builtins.c (write_decls): In generated code,
rename rs6000_builtin_decls_x to rs6000_builtin_decls, and rename
rs6000_builtin_info_x to rs6000_builtin_info.
(write_bif_static_init): In generated code, rename
rs6000_builtin_info_x to rs6000_builtin_info.
(write_init_bif_table): In generated code, rename
rs6000_builtin_decls_x to rs6000_builtin_decls, and rename
rs6000_builtin_info_x to rs6000_builtin_info.
(write_init_ovld_table): In generated code, rename
rs6000_builtin_decls_x to rs6000_builtin_decls.
(write_init_file): Likewise.
* config/rs6000/rs6000.c (rs6000_builtin_vectorized_function):
Likewise.
(rs6000_builtin_md_vectorized_function): Likewise.
(rs6000_builtin_reciprocal): Likewise.
(add_condition_to_bb): Likewise.
(rs6000_atomic_assign_expand_fenv): Likewise.
---
 gcc/config/rs6000/rs6000-c.c| 64 -
 gcc/config/rs6000/rs6000-call.c | 46 +-
 gcc/config/rs6000/rs6000-gen-builtins.c | 27 +--
 gcc/config/rs6000/rs6000.c  | 58 +++---
 4 files changed, 96 insertions(+), 99 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index f790c72d621..e0ebdeed548 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -867,7 +867,7 @@ altivec_build_resolved_builtin (tree *args, int n, tree fntype, tree ret_type,
 {
   tree argtypes = TYPE_ARG_TYPES (fntype);
   tree arg_type[MAX_OVLD_ARGS];
-  tree fndecl = rs6000_builtin_decls_x[bif_id];
+  tree fndecl = rs6000_builtin_decls[bif_id];
 
   for (int i = 0; i < n; i++)
 {
@@ -1001,13 +1001,13 @@ altivec_resolve_overloaded_builtin (location_t loc, tree fndecl,
  case E_SFmode:
{
  /* For floats use the xvmulsp instruction directly.  */
- tree call = rs6000_builtin_decls_x[RS6000_BIF_XVMULSP];
+ tree call = rs6000_builtin_decls[RS6000_BIF_XVMULSP];
  return build_call_expr (call, 2, arg0, arg1);
}
  case E_DFmode:
{
  /* For doubles use the xvmuldp instruction directly.  */
- tree call = rs6000_builtin_decls_x[RS6000_BIF_XVMULDP];
+ tree call = rs6000_builtin_decls[RS6000_BIF_XVMULDP];
  return build_call_expr (call, 2, arg0, arg1);
}
  /* Other types are errors.  */
@@ -1066,7 +1066,7 @@ altivec_resolve_overloaded_builtin (location_t loc, tree fndecl,
vec_safe_push (params, arg0);
vec_safe_push (params, arg1);
tree call = altivec_resolve_overloaded_builtin
- (loc, rs6000_builtin_decls_x[RS6000_OVLD_VEC_CMPEQ],
+ (loc, rs6000_builtin_decls[RS6000_OVLD_VEC_CMPEQ],
   params);
/* Use save_expr to ensure that operands used more than once
   that may have side effects (like calls) are only evaluated
@@ -1076,7 +1076,7 @@ altivec_resolve_overloaded_builtin (location_t loc, tree fndecl,
vec_safe_push (params, call);
vec_safe_push (params, call);
return altivec_resolve_overloaded_builtin
- (loc, rs6000_builtin_decls_x[RS6000_OVLD_VEC_NOR], params);
+ (loc, rs6000_builtin_decls[RS6000_OVLD_VEC_NOR], params);
  }
  /* Other types are errors.  */
default:
@@ -1129,9 +1129,9 @@ altivec_resolve_overloaded_builtin (location_t loc, tree fndecl,
  vec_safe_push (params, a

[PATCH 5/6] rs6000: Rename functions with "new" in their names

2021-12-03 Thread Bill Schmidt via Gcc-patches
From: Bill Schmidt 

Hi!

While we had two sets of built-in functionality at the same time, I put "new"
in the names of quite a few functions.  Time to undo that.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is this
okay for trunk?

Thanks!
Bill

2021-12-02  Bill Schmidt  

gcc/
* config/rs6000/rs6000-c.c (altivec_resolve_new_overloaded_builtin):
Remove forward declaration.
(rs6000_new_builtin_type_compatible): Rename to
rs6000_builtin_type_compatible.
(rs6000_builtin_type_compatible): Remove.
(altivec_resolve_overloaded_builtin): Remove.
(altivec_build_new_resolved_builtin): Rename to
altivec_build_resolved_builtin.
(altivec_resolve_new_overloaded_builtin): Rename to
altivec_resolve_overloaded_builtin.  Remove static keyword.  Adjust
called function names.
* config/rs6000/rs6000-call.c (rs6000_expand_new_builtin): Remove
forward declaration.
(rs6000_gimple_fold_new_builtin): Likewise.
(rs6000_invalid_new_builtin): Rename to rs6000_invalid_builtin.
(rs6000_gimple_fold_builtin): Remove.
(rs6000_new_builtin_valid_without_lhs): Rename to
rs6000_builtin_valid_without_lhs.
(rs6000_new_builtin_is_supported): Rename to
rs6000_builtin_is_supported.
(rs6000_gimple_fold_new_mma_builtin): Rename to
rs6000_gimple_fold_mma_builtin.
(rs6000_gimple_fold_new_builtin): Rename to
rs6000_gimple_fold_builtin.  Remove static keyword.  Adjust called
function names.
(rs6000_expand_builtin): Remove.
(new_cpu_expand_builtin): Rename to cpu_expand_builtin.
(new_mma_expand_builtin): Rename to mma_expand_builtin.
(new_htm_spr_num): Rename to htm_spr_num.
(new_htm_expand_builtin): Rename to htm_expand_builtin.  Change name
of called function.
(rs6000_expand_new_builtin): Rename to rs6000_expand_builtin.  Remove
static keyword.  Adjust called function names.
(rs6000_new_builtin_decl): Rename to rs6000_builtin_decl.  Remove
static keyword.
(rs6000_builtin_decl): Remove.
* config/rs6000/rs6000-gen-builtins.c (write_decls): In generated code,
rename rs6000_new_builtin_is_supported to rs6000_builtin_is_supported.
* config/rs6000/rs6000-internal.h (rs6000_invalid_new_builtin): Rename
to rs6000_invalid_builtin.
* config/rs6000/rs6000.c (rs6000_new_builtin_vectorized_function):
Rename to rs6000_builtin_vectorized_function.
(rs6000_new_builtin_md_vectorized_function): Rename to
rs6000_builtin_md_vectorized_function.
(rs6000_builtin_vectorized_function): Remove.
(rs6000_builtin_md_vectorized_function): Remove.
---
 gcc/config/rs6000/rs6000-c.c| 120 +---
 gcc/config/rs6000/rs6000-call.c |  99 ++-
 gcc/config/rs6000/rs6000-gen-builtins.c |   3 +-
 gcc/config/rs6000/rs6000-internal.h |   2 +-
 gcc/config/rs6000/rs6000.c  |  31 ++
 5 files changed, 80 insertions(+), 175 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index d44edf585aa..f790c72d621 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -37,9 +37,6 @@
 
 #include "rs6000-internal.h"
 
-static tree altivec_resolve_new_overloaded_builtin (location_t, tree, void *);
-
-
 /* Handle the machine specific pragma longcall.  Its syntax is
 
# pragma longcall ( TOGGLE )
@@ -817,7 +814,7 @@ is_float128_p (tree t)
 
 /* Return true iff ARGTYPE can be compatibly passed as PARMTYPE.  */
 static bool
-rs6000_new_builtin_type_compatible (tree parmtype, tree argtype)
+rs6000_builtin_type_compatible (tree parmtype, tree argtype)
 {
   if (parmtype == error_mark_node)
 return false;
@@ -840,23 +837,6 @@ rs6000_new_builtin_type_compatible (tree parmtype, tree argtype)
   return lang_hooks.types_compatible_p (parmtype, argtype);
 }
 
-static inline bool
-rs6000_builtin_type_compatible (tree t, int id)
-{
-  tree builtin_type;
-  builtin_type = rs6000_builtin_type (id);
-  if (t == error_mark_node)
-return false;
-  if (INTEGRAL_TYPE_P (t) && INTEGRAL_TYPE_P (builtin_type))
-return true;
-  else if (TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128
-  && is_float128_p (t) && is_float128_p (builtin_type))
-return true;
-  else
-return lang_hooks.types_compatible_p (t, builtin_type);
-}
-
-
 /* In addition to calling fold_convert for EXPR of type TYPE, also
call c_fully_fold to remove any C_MAYBE_CONST_EXPRs that could be
hiding there (PR47197).  */
@@ -873,16 +853,6 @@ fully_fold_convert (tree type, tree expr)
   return result;
 }
 
-/* Implementation of the resolve_overloaded_builtin target hook, to
-   support Altivec's overloaded builtins.  */
-
-tree
-altivec_res

[PATCH 3/6] rs6000: Rename rs6000-builtin-new.def to rs6000-builtins.def

2021-12-03 Thread Bill Schmidt via Gcc-patches
From: Bill Schmidt 

Hi!

This patch just renames a file and updates the build machinery accordingly.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is this
okay for trunk?

Thanks!
Bill

2021-12-02  Bill Schmidt  

gcc/
* config/rs6000/rs6000-builtin-new.def: Rename to...
* config/rs6000/rs6000-builtins.def: ...this.
* config/rs6000/rs6000-gen-builtins.c: Adjust header commentary.
* config/rs6000/t-rs6000 (EXTRA_GTYPE_DEPS): Rename
rs6000-builtin-new.def to rs6000-builtins.def.
(rs6000-builtins.c): Likewise.
---
 .../rs6000/{rs6000-builtin-new.def => rs6000-builtins.def}  | 0
 gcc/config/rs6000/rs6000-gen-builtins.c | 4 ++--
 gcc/config/rs6000/t-rs6000  | 6 +++---
 3 files changed, 5 insertions(+), 5 deletions(-)
 rename gcc/config/rs6000/{rs6000-builtin-new.def => rs6000-builtins.def} (100%)

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtins.def
similarity index 100%
rename from gcc/config/rs6000/rs6000-builtin-new.def
rename to gcc/config/rs6000/rs6000-builtins.def
diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c b/gcc/config/rs6000/rs6000-gen-builtins.c
index 78b2486aafc..9c61b7d9fe6 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -22,7 +22,7 @@ along with GCC; see the file COPYING3.  If not see
recognition code for Power targets, based on text files that
describe the built-in functions and vector overloads:
 
- rs6000-builtin-new.def Table of built-in functions
+ rs6000-builtins.defTable of built-in functions
  rs6000-overload.defTable of overload functions
 
Both files group similar functions together in "stanzas," as
@@ -125,7 +125,7 @@ along with GCC; see the file COPYING3.  If not see
 
The second line contains the  that this particular instance of
the overloaded function maps to.  It must match a token that appears in
-   rs6000-builtin-new.def.  Optionally, a second token may appear.  If only
+   rs6000-builtins.def.  Optionally, a second token may appear.  If only
one token is on the line, it is also used to build the unique identifier
for the overloaded function.  If a second token is present, the second
token is used instead for this purpose.  This is necessary in cases
diff --git a/gcc/config/rs6000/t-rs6000 b/gcc/config/rs6000/t-rs6000
index d48a4b1be6c..3d3143a171d 100644
--- a/gcc/config/rs6000/t-rs6000
+++ b/gcc/config/rs6000/t-rs6000
@@ -22,7 +22,7 @@ TM_H += $(srcdir)/config/rs6000/rs6000-builtin.def
 TM_H += $(srcdir)/config/rs6000/rs6000-cpus.def
 TM_H += $(srcdir)/config/rs6000/rs6000-modes.h
 PASSES_EXTRA += $(srcdir)/config/rs6000/rs6000-passes.def
-EXTRA_GTYPE_DEPS += $(srcdir)/config/rs6000/rs6000-builtin-new.def
+EXTRA_GTYPE_DEPS += $(srcdir)/config/rs6000/rs6000-builtins.def
 
 rs6000-pcrel-opt.o: $(srcdir)/config/rs6000/rs6000-pcrel-opt.c
$(COMPILE) $<
@@ -59,10 +59,10 @@ build/rs6000-gen-builtins$(build_exeext): build/rs6000-gen-builtins.o \
 # For now, the header files depend on rs6000-builtins.c, which avoids
 # races because the .c file is closed last in rs6000-gen-builtins.c.
 rs6000-builtins.c: build/rs6000-gen-builtins$(build_exeext) \
-  $(srcdir)/config/rs6000/rs6000-builtin-new.def \
+  $(srcdir)/config/rs6000/rs6000-builtins.def \
   $(srcdir)/config/rs6000/rs6000-overload.def
$(RUN_GEN) ./build/rs6000-gen-builtins$(build_exeext) \
-   $(srcdir)/config/rs6000/rs6000-builtin-new.def \
+   $(srcdir)/config/rs6000/rs6000-builtins.def \
$(srcdir)/config/rs6000/rs6000-overload.def rs6000-builtins.h \
rs6000-builtins.c rs6000-vecdefines.h
 
-- 
2.27.0



[PATCH 0/6] rs6000: Remove "old" built-in function infrastructure

2021-12-03 Thread Bill Schmidt via Gcc-patches
From: Bill Schmidt 

Hi!

Now that the new built-in function support is all upstream and enabled, it
seems safe and prudent to remove the old code to avoid confusion.  I broke this
up to the extent possible, but the first patch is a bit large and messy because
so many dead functions have to be removed when taking out the
"new_builtins_are_live" variable.

Bill Schmidt (6):
  rs6000: Remove new_builtins_are_live and dead code it was guarding
  rs6000: Remove altivec_overloaded_builtins array and initialization
  rs6000: Rename rs6000-builtin-new.def to rs6000-builtins.def
  rs6000: Remove rs6000-builtin.def and associated data and functions
  rs6000: Rename functions with "new" in their names
  rs6000: Rename arrays to remove temporary _x suffix

 gcc/config/rs6000/darwin.h| 8 +-
 gcc/config/rs6000/rs6000-builtin.def  |  3350 ---
 ...00-builtin-new.def => rs6000-builtins.def} | 0
 gcc/config/rs6000/rs6000-c.c  |  1342 +-
 gcc/config/rs6000/rs6000-call.c   | 17810 +++-
 gcc/config/rs6000/rs6000-gen-builtins.c   |   115 +-
 gcc/config/rs6000/rs6000-internal.h   | 2 +-
 gcc/config/rs6000/rs6000-protos.h | 3 -
 gcc/config/rs6000/rs6000.c|   334 +-
 gcc/config/rs6000/rs6000.h|58 -
 gcc/config/rs6000/t-rs6000| 7 +-
 11 files changed, 3173 insertions(+), 19856 deletions(-)
 delete mode 100644 gcc/config/rs6000/rs6000-builtin.def
 rename gcc/config/rs6000/{rs6000-builtin-new.def => rs6000-builtins.def} (100%)

-- 
2.27.0



Re: [PATCH] rs6000: Fix use of wrong enum for built-in function code.

2021-12-03 Thread Bill Schmidt via Gcc-patches
On 12/3/21 10:26 AM, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Dec 02, 2021 at 04:53:18PM -0600, Bill Schmidt wrote:
>> I discovered this bug while working on patches to remove the old built-ins
> >> infrastructure.  I missed a spot in converting from the rs6000_builtins enum
> >> to the rs6000_gen_builtins enum.  This fixes it.  The fix is technically not
> >> right if new_builtins_are_enabled were to be set to zero, but we're not going
> >> to do that anymore, and the remnants of that code will be removed shortly.
>> gcc/
>>  * config/rs6000/rs6000.c (rs6000_builtin_reciprocal): Fix builtin
>>  identifiers.
> What an informative changelog ;-)
>
> Okay for trunk.  Thanks!

Thanks!  Pushed as r12-5776.

Bill

>
>
> Segher


[PATCH] rs6000: Fix use of wrong enum for built-in function code.

2021-12-02 Thread Bill Schmidt via Gcc-patches
Hi!

I discovered this bug while working on patches to remove the old built-ins
infrastructure.  I missed a spot in converting from the rs6000_builtins enum to
the rs6000_gen_builtins enum.  This fixes it.  The fix is technically not right
if new_builtins_are_enabled were to be set to zero, but we're not going to do
that anymore, and the remnants of that code will be removed shortly.

Regstrap is in progress on powerpc64le-linux-gnu.  Assuming no problems, is this
okay to commit to trunk?

Thanks!
Bill


2021-12-02  Bill Schmidt  

gcc/
* config/rs6000/rs6000.c (rs6000_builtin_reciprocal): Fix builtin
identifiers.
---
 gcc/config/rs6000/rs6000.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 945157b1c1a..0c18e69b012 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -22741,7 +22741,7 @@ rs6000_builtin_reciprocal (tree fndecl)
 {
   switch (DECL_MD_FUNCTION_CODE (fndecl))
 {
-case VSX_BUILTIN_XVSQRTDP:
+case RS6000_BIF_XVSQRTDP:
   if (!RS6000_RECIP_AUTO_RSQRTE_P (V2DFmode))
return NULL_TREE;
 
@@ -22749,7 +22749,7 @@ rs6000_builtin_reciprocal (tree fndecl)
return rs6000_builtin_decls_x[RS6000_BIF_RSQRT_2DF];
   return rs6000_builtin_decls[VSX_BUILTIN_RSQRT_2DF];
 
-case VSX_BUILTIN_XVSQRTSP:
+case RS6000_BIF_XVSQRTSP:
   if (!RS6000_RECIP_AUTO_RSQRTE_P (V4SFmode))
return NULL_TREE;
 
-- 
2.27.0
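
[Editorial illustration, not part of the posted patch.]  The class of bug
fixed above can be sketched in a few lines of C.  The enums and string
results below are hypothetical stand-ins; the real rs6000_builtins and
rs6000_gen_builtins enumerations are much larger and their numeric values are
not reproduced here.  The sketch only assumes that the two enumerations
number their members differently, which is enough for a switch that uses
labels from the wrong one to pick the wrong case or none at all.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the old and new function-code enums.
   The member names and values are invented for illustration.  */
enum old_code { OLD_XVSQRTSP, OLD_XVSQRTDP };
enum new_code { NEW_NOP, NEW_XVSQRTSP, NEW_XVSQRTDP };

/* Buggy lookup: CODE comes from new_code, but the case labels are taken
   from old_code.  C accepts this silently, since both are integers.  */
const char *
lookup_buggy (int code)
{
  switch (code)
    {
    case OLD_XVSQRTDP:
      return "rsqrt_2df";
    case OLD_XVSQRTSP:
      return "rsqrt_4sf";
    default:
      return NULL;
    }
}

/* Fixed lookup: the case labels match the enum the caller uses.  */
const char *
lookup_fixed (int code)
{
  switch (code)
    {
    case NEW_XVSQRTDP:
      return "rsqrt_2df";
    case NEW_XVSQRTSP:
      return "rsqrt_4sf";
    default:
      return NULL;
    }
}

/* Small self-check showing how the two lookups differ.  */
void
demo (void)
{
  /* The fixed version maps each new code to the expected result.  */
  assert (strcmp (lookup_fixed (NEW_XVSQRTSP), "rsqrt_4sf") == 0);
  assert (strcmp (lookup_fixed (NEW_XVSQRTDP), "rsqrt_2df") == 0);
  /* The buggy version picks the wrong case for one code and misses
     the other entirely.  */
  assert (strcmp (lookup_buggy (NEW_XVSQRTSP), "rsqrt_2df") == 0);
  assert (lookup_buggy (NEW_XVSQRTDP) == NULL);
}
```

Nothing in the language flags the mismatch, which is probably why a spot like
this is easy to miss during a conversion between two enums.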




Re: [PATCH] rs6000: Builtins test changes for test_fpscr_[d]rn_builtin_error.c

2021-12-02 Thread Bill Schmidt via Gcc-patches
Hi!

On 12/1/21 5:00 PM, Segher Boessenkool wrote:
> On Thu, Nov 18, 2021 at 10:36:52AM -0600, Bill Schmidt wrote:
>> Hi!  This is the last patch broken out of the previous test suite patch
>> for the new builtins support.
> Whew :-)
>
>> One advantage of the new builtins support is uniform error messages for
>> arguments with restricted values.  Previously this was done in many places
>> in an ad hoc manner, with little uniformity.  This patch adjusts the
>> expected error messages accordingly.
>>
>> All such error messages are now one of the following:
>>   "argument %d must be a %d-bit unsigned literal"
>>   "argument %d must be a literal between %d and %d, inclusive"
>>   "argument %d must be a variable or a literal between %d and %d, inclusive"
>>   "argument %d must be either a literal %d or a literal %d"
>>
>> These messages were chosen to require the fewest changes from previous
>> messages while still introducing uniformity.  This patch adjusts error
>> messages for some cases where this produces changed messages.  In
>> particular, some messages are improved because previously they did not
>> admit the possibility that an argument could hold a variable.
> Same comment as on the previous patch.  But, okay for trunk.  Thanks!

Thank you for all of the reviews!  I combined recent patches that need to go
upstream together to avoid bisect problems, and pushed them as r12-5752.
The new built-in infrastructure is now enabled!

I'll start working on a patch series to remove all the no longer needed code.

Thanks again for all of your help in getting this work reviewed, and for
all the improvements as a result!

Bill

>
>
> Segher


Re: [PATCH] rs6000: Builtins test changes for pr80315-*.c, pr88100.c

2021-12-01 Thread Bill Schmidt via Gcc-patches
Hi!

On 12/1/21 4:29 PM, Segher Boessenkool wrote:
> On Thu, Nov 18, 2021 at 10:15:21AM -0600, Bill Schmidt wrote:
>> Hi!  This patch is broken out from the test case patch for the new
>> builtins support.
>>
>> One advantage of the new builtins support is uniform error messages for
>> arguments with restricted values.  Previously this was done in many places
>> in an ad hoc manner, with little uniformity.  This patch adjusts the
>> expected error messages accordingly.
>>
>> All error messages are now one of the following:
>>   "argument %d must be a %d-bit unsigned literal"
>>   "argument %d must be a literal between %d and %d, inclusive"
>>   "argument %d must be a variable or a literal between %d and %d, inclusive"
>>   "argument %d must be either a literal %d or a literal %d"
>>
>> These messages were chosen to require the fewest changes from previous
>> messages while still introducing uniformity.  This patch adjusts error
>> messages for some cases where this produces changed messages.
>>
>> Tested on powerpc64le-linux-gnu and powerpc64-linux-gnu (-m32/-m64) with
> >> no regressions.  Is this okay for trunk?
> We should have only the middle two of those messages.  But, okay for
> trunk if you put this on some to-do list.  Thanks!

The last one is actually needed also, because we have at least one case
where the two supported values aren't contiguous.  We can do without the
first one, but that will affect quite a number of test cases, so I agree
that this should be done later.  (We already had a whole lot of tests
of this form.)
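
To make the four message shapes concrete, here is a minimal sketch in C.
It is not the rs6000 code: the enum, fits_unsigned_bits and
format_restriction are hypothetical names chosen for illustration, and
only the message templates themselves come from the patch description.
It also shows why "5-bit unsigned literal" and "between 0 and 31" describe
the same set of values.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical restriction kinds, one per message template.  */
enum restr_kind
{
  RES_BITS,      /* "%d-bit unsigned literal" */
  RES_RANGE,     /* "literal between %d and %d, inclusive" */
  RES_VAR_RANGE, /* "variable or a literal between %d and %d, inclusive" */
  RES_VALUES     /* "either a literal %d or a literal %d" */
};

/* An N-bit unsigned literal is any value in [0, 2^N - 1].
   Assumption for this sketch: 0 < bits < 31.  */
static int
fits_unsigned_bits (long value, int bits)
{
  assert (bits > 0 && bits < 31);
  return value >= 0 && value < (1L << bits);
}

/* Format the diagnostic for argument ARGNO into BUF.  For RES_BITS, A is
   the bit width and B is ignored; otherwise A and B are the two bounds
   (or the two permitted values for RES_VALUES).  Returns the snprintf
   result, or -1 for an unknown kind.  */
static int
format_restriction (char *buf, size_t len, enum restr_kind kind,
                    int argno, int a, int b)
{
  assert (buf != NULL && len > 0);
  switch (kind)
    {
    case RES_BITS:
      return snprintf (buf, len,
                       "argument %d must be a %d-bit unsigned literal",
                       argno, a);
    case RES_RANGE:
      return snprintf (buf, len,
                       "argument %d must be a literal between %d and %d,"
                       " inclusive", argno, a, b);
    case RES_VAR_RANGE:
      return snprintf (buf, len,
                       "argument %d must be a variable or a literal"
                       " between %d and %d, inclusive", argno, a, b);
    case RES_VALUES:
      return snprintf (buf, len,
                       "argument %d must be either a literal %d or a"
                       " literal %d", argno, a, b);
    }
  return -1;
}
```

The RES_VALUES case is the one that cannot be folded into a range message
when the two permitted values are not adjacent.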

Thanks for the review!
Bill

>
> Segher


Re: [PATCH v2] rs6000: Fix a handful of 32-bit built-in function problems

2021-12-01 Thread Bill Schmidt via Gcc-patches
Hi!

On 12/1/21 3:08 PM, Segher Boessenkool wrote:
> On Tue, Nov 16, 2021 at 12:56:52PM -0600, Bill Schmidt wrote:
>> Hi!  I previously posted [1] to correct some problems with the new builtins
>> support targeting 32-bit code gen.  Based on the discussion, I've made some
>> adjustments and would like to submit this for consideration.
>>
>> We eventually agreed that the strange behavior for -m32 -mpowerpc64 for
>> certain HTM builtins should be removed.  All of the registers TEXASR,
>> TEXASRU, TFHAR, and TFIAR are now accessed using the unsigned long data
>> type in all configurations.
>>
>> gcc/
>>  * config/rs6000/rs6000-builtin-new.def (CMPB): Flag as no32bit.
>>  (BPERMD): Flag as 32bit (needing special handling for 32-bit).
>>  (UNPACK_TD): Return unsigned long long instead of unsigned long.
>>  (GET_TEXASR): Return unsigned long instead of unsigned long long.
>>  (GET_TEXASRU): Likewise.
>>  (GET_TFHAR): Likewise.
>>  (GET_TFIAR): Likewise.
>>  (SET_TEXASR): Pass unsigned long instead of unsigned long long.
>>  (SET_TEXASRU): Likewise.
>>  (SET_TFHAR): Likewise.
>>  (SET_TFIAR): Likewise.
>>  (TABORTDC): Likewise.
>>  (TABORTDCI): Likewise.
>>  * config/rs6000/rs6000-call.c (rs6000_expand_new_builtin): Fix error
>>  handling for no32bit.  Add 32bit handling for RS6000_BIF_BPERMD.
>>
>> gcc/testsuite/
>>  * gcc.target/powerpc/cmpb-3.c: Adjust error message.
> Okay for trunk.  Thanks!
>
>
> Could you put some short blurb about the changed prototype of the HTM
> reg builtins in the release notes please?  Thanks x2 :-)

Already done!

Bill

>
> Segher


Re: [PATCH] rs6000: Mirror fix for PR102347 into the new builtins support

2021-12-01 Thread Bill Schmidt via Gcc-patches
On 12/1/21 11:21 AM, Segher Boessenkool wrote:
> Hi!
>
> On Wed, Dec 01, 2021 at 09:29:42AM -0600, Bill Schmidt wrote:
>> Recently Kewen fixed a problem in the old builtins support where
>> rs6000_builtin_decl prematurely indicated that a target builtin is
>> unavailable.  This also needs to be done for the new builtins support, but in
>> this case we have to ensure the error message is still produced from the
>> overload support in rs6000-c.c.  Unfortunately, this is less straightforward
>> than it could be, because header file includes need to be adjusted to make
>> this happen.  Someday we'll consolidate all the builtin code in one file and
>> this won't have to be so ugly.
>>
>> Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is
>> this okay for trunk?
> This is okay for trunk.  Thanks!
>
> Is there some place we can store what original builtin was used when
> some overload is resolved?  Just in the new builtin code, don't spend
> time on the old stuff :-)

I think we can do much better, with a little work.  The macros are a little
problematic, but I have ideas about how to make this better in a future
patch.  (There's a fair amount of test suite fallout, so it should wait.)

Thanks for the review!

Bill
>
>
> Segher


[PATCH, PING] rs6000: Builtins test changes for test_fpscr_[d]rn_builtin_error.c

2021-12-01 Thread Bill Schmidt via Gcc-patches
Hi!  I'd like to ping this patch.

Thanks!
Bill

On 11/18/21 10:36 AM, Bill Schmidt wrote:
> Hi!  This is the last patch broken out of the previous test suite patch
> for the new builtins support.
>
> One advantage of the new builtins support is uniform error messages for
> arguments with restricted values.  Previously this was done in many places
> in an ad hoc manner, with little uniformity.  This patch adjusts the
> expected error messages accordingly.
>
> All such error messages are now one of the following:
>   "argument %d must be a %d-bit unsigned literal"
>   "argument %d must be a literal between %d and %d, inclusive"
>   "argument %d must be a variable or a literal between %d and %d, inclusive"
>   "argument %d must be either a literal %d or a literal %d"
>
> These messages were chosen to require the fewest changes from previous
> messages while still introducing uniformity.  This patch adjusts error
> messages for some cases where this produces changed messages.  In
> particular, some messages are improved because previously they did not
> admit the possibility that an argument could hold a variable.
>
> Tested on powerpc64le-linux-gnu and powerpc64-linux-gnu (-m32/-m64)
> with no regressions.  Is this okay for trunk?
>
> Thanks!
> Bill
>
>
> 2021-11-17  Bill Schmidt  
>
> gcc/testsuite/
>   * gcc.target/powerpc/test_fpscr_drn_builtin_error.c: Adjust error
>   messages.
>   * gcc.target/powerpc/test_fpscr_rn_builtin_error.c: Likewise.
> ---
>  .../powerpc/test_fpscr_drn_builtin_error.c   |  4 ++--
>  .../gcc.target/powerpc/test_fpscr_rn_builtin_error.c | 12 ++--
>  2 files changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.target/powerpc/test_fpscr_drn_builtin_error.c b/gcc/testsuite/gcc.target/powerpc/test_fpscr_drn_builtin_error.c
> index 028ab0b6d66..4f9d9e08e8a 100644
> --- a/gcc/testsuite/gcc.target/powerpc/test_fpscr_drn_builtin_error.c
> +++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_drn_builtin_error.c
> @@ -9,8 +9,8 @@ int main ()
>   __builtin_set_fpscr_drn() also support a variable as an argument but
>   can't test variable value at compile time.  */
>  
> -  __builtin_set_fpscr_drn(-1);  /* { dg-error "Argument must be a value between 0 and 7" } */
> -  __builtin_set_fpscr_drn(8);   /* { dg-error "Argument must be a value between 0 and 7" } */
> +  __builtin_set_fpscr_drn(-1);  /* { dg-error "argument 1 must be a variable or a literal between 0 and 7, inclusive" } */
> +  __builtin_set_fpscr_drn(8);   /* { dg-error "argument 1 must be a variable or a literal between 0 and 7, inclusive" } */
>  
>  }
>  
> diff --git a/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_error.c b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_error.c
> index aea65091b0c..10391b71008 100644
> --- a/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_error.c
> +++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_error.c
> @@ -8,13 +8,13 @@ int main ()
>   int arguments.  The builtins __builtin_set_fpscr_rn() also supports a
>   variable as an argument but can't test variable value at compile time.  */
>
> -  __builtin_mtfsb0(-1);  /* { dg-error "Argument must be a constant between 0 and 31" } */
> -  __builtin_mtfsb0(32);  /* { dg-error "Argument must be a constant between 0 and 31" } */
> +  __builtin_mtfsb0(-1);  /* { dg-error "argument 1 must be a 5-bit unsigned literal" } */
> +  __builtin_mtfsb0(32);  /* { dg-error "argument 1 must be a 5-bit unsigned literal" } */
>
> -  __builtin_mtfsb1(-1);  /* { dg-error "Argument must be a constant between 0 and 31" } */
> -  __builtin_mtfsb1(32);  /* { dg-error "Argument must be a constant between 0 and 31" } */
> +  __builtin_mtfsb1(-1);  /* { dg-error "argument 1 must be a 5-bit unsigned literal" } */
> +  __builtin_mtfsb1(32);  /* { dg-error "argument 1 must be a 5-bit unsigned literal" } */
>
> -  __builtin_set_fpscr_rn(-1);  /* { dg-error "Argument must be a value between 0 and 3" } */
> -  __builtin_set_fpscr_rn(4);   /* { dg-error "Argument must be a value between 0 and 3" } */
> +  __builtin_set_fpscr_rn(-1);  /* { dg-error "argument 1 must be a variable or a literal between 0 and 3, inclusive" } */
> +  __builtin_set_fpscr_rn(4);   /* { dg-error "argument 1 must be a variable or a literal between 0 and 3, inclusive" } */
>  }
>  


[PATCH, PING] rs6000: Builtins test changes for pragma_misc9.c

2021-12-01 Thread Bill Schmidt via Gcc-patches
Hi!  I'd like to ping this patch.

Thanks!
Bill

On 11/18/21 10:18 AM, Bill Schmidt wrote:
> Hi!  This patch is broken out from the test suite patch for the new
> builtins support.  This one is just a minor adjustment for the error
> message wording.
>
> Tested on powerpc64le-linux-gnu and powerpc64-linux-gnu (-m32/-m64)
> with no regressions.  Is this okay for trunk?
>
> Thanks!
> Bill
>
>
> 2021-11-17  Bill Schmidt  
>
> gcc/testsuite/
>   * gcc.target/powerpc/pragma_misc9.c: Adjust error message.
> ---
>  gcc/testsuite/gcc.target/powerpc/pragma_misc9.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.target/powerpc/pragma_misc9.c b/gcc/testsuite/gcc.target/powerpc/pragma_misc9.c
> index e03099bd084..c1667d9f7db 100644
> --- a/gcc/testsuite/gcc.target/powerpc/pragma_misc9.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pragma_misc9.c
> @@ -20,7 +20,7 @@ vector bool int
>  test2 (vector signed int a, vector signed int b)
>  {
>    return vec_cmpnez (a, b);
> -  /* { dg-error "'__builtin_altivec_vcmpnezw' requires the '-mcpu=power9' option" "" { target *-*-* } .-1 } */
> +  /* { dg-error "'__builtin_altivec_vcmpnezw' requires the '-mcpu=power9' and '-mvsx' options" "" { target *-*-* } .-1 } */
>  }
>  
>  #pragma GCC target ("cpu=power7")
> @@ -28,7 +28,7 @@ vector signed int
>  test3 (vector signed int a, vector signed int b)
>  {
>    return vec_mergee (a, b);
> -  /* { dg-error "'__builtin_altivec_vmrgew_v4si' requires the '-mpower8-vector' option" "" { target *-*-* } .-1 } */
> +  /* { dg-error "'__builtin_altivec_vmrgew_v4si' requires the '-mcpu=power8' and '-mvsx' options" "" { target *-*-* } .-1 } */
>  }
>  
>  #pragma GCC target ("cpu=power6")


[PATCH, PING] rs6000: Builtins test changes for pr80315-*.c, pr88100.c

2021-12-01 Thread Bill Schmidt via Gcc-patches
Hi!  I'd like to ping this patch.

Thanks!
Bill

On 11/18/21 10:15 AM, Bill Schmidt wrote:
> Hi!  This patch is broken out from the test case patch for the new
> builtins support.
>
> One advantage of the new builtins support is uniform error messages for
> arguments with restricted values.  Previously this was done in many places
> in an ad hoc manner, with little uniformity.  This patch adjusts the
> expected error messages accordingly.
>
> All error messages are now one of the following:
>   "argument %d must be a %d-bit unsigned literal"
>   "argument %d must be a literal between %d and %d, inclusive"
>   "argument %d must be a variable or a literal between %d and %d, inclusive"
>   "argument %d must be either a literal %d or a literal %d"
>
> These messages were chosen to require the fewest changes from previous
> messages while still introducing uniformity.  This patch adjusts error
> messages for some cases where this produces changed messages.
>
> Tested on powerpc64le-linux-gnu and powerpc64-linux-gnu (-m32/-m64) with
> no regressions.  Is this okay for trunk?
>
> Thanks!
> Bill
>
>
> 2021-11-17  Bill Schmidt  
>
> gcc/testsuite/
>   * gcc.target/powerpc/pr80315-1.c: Adjust error message.
>   * gcc.target/powerpc/pr80315-2.c: Likewise.
>   * gcc.target/powerpc/pr80315-3.c: Likewise.
>   * gcc.target/powerpc/pr80315-4.c: Likewise.
>   * gcc.target/powerpc/pr88100.c: Likewise.
> ---
>  gcc/testsuite/gcc.target/powerpc/pr80315-1.c |  2 +-
>  gcc/testsuite/gcc.target/powerpc/pr80315-2.c |  2 +-
>  gcc/testsuite/gcc.target/powerpc/pr80315-3.c |  2 +-
>  gcc/testsuite/gcc.target/powerpc/pr80315-4.c |  2 +-
>  gcc/testsuite/gcc.target/powerpc/pr88100.c   | 12 ++--
>  5 files changed, 10 insertions(+), 10 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr80315-1.c b/gcc/testsuite/gcc.target/powerpc/pr80315-1.c
> index e2db0ff4b5f..f37f1f169a2 100644
> --- a/gcc/testsuite/gcc.target/powerpc/pr80315-1.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr80315-1.c
> @@ -10,6 +10,6 @@ main()
>    int mask;
>
>    /* Argument 2 must be 0 or 1.  Argument 3 must be in range 0..15.  */
> -  res = __builtin_crypto_vshasigmaw (test, 1, 0xff); /* { dg-error {argument 3 must be in the range \[0, 15\]} } */
> +  res = __builtin_crypto_vshasigmaw (test, 1, 0xff); /* { dg-error {argument 3 must be a 4-bit unsigned literal} } */
>    return 0;
>  }
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr80315-2.c b/gcc/testsuite/gcc.target/powerpc/pr80315-2.c
> index 144b705c012..0819a0511b7 100644
> --- a/gcc/testsuite/gcc.target/powerpc/pr80315-2.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr80315-2.c
> @@ -10,6 +10,6 @@ main ()
>    int mask;
>
>    /* Argument 2 must be 0 or 1.  Argument 3 must be in range 0..15.  */
> -  res = __builtin_crypto_vshasigmad (test, 1, 0xff); /* { dg-error {argument 3 must be in the range \[0, 15\]} } */
> +  res = __builtin_crypto_vshasigmad (test, 1, 0xff); /* { dg-error {argument 3 must be a 4-bit unsigned literal} } */
>    return 0;
>  }
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr80315-3.c b/gcc/testsuite/gcc.target/powerpc/pr80315-3.c
> index 99a3e24eadd..cc2e46cf5cb 100644
> --- a/gcc/testsuite/gcc.target/powerpc/pr80315-3.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr80315-3.c
> @@ -12,6 +12,6 @@ main ()
>    int mask;
>
>    /* Argument 2 must be 0 or 1.  Argument 3 must be in range 0..15.  */
> -  res = vec_shasigma_be (test, 1, 0xff); /* { dg-error {argument 3 must be in the range \[0, 15\]} } */
> +  res = vec_shasigma_be (test, 1, 0xff); /* { dg-error {argument 3 must be a 4-bit unsigned literal} } */
>    return res;
>  }
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr80315-4.c b/gcc/testsuite/gcc.target/powerpc/pr80315-4.c
> index 7f5f6f75029..ac12910741b 100644
> --- a/gcc/testsuite/gcc.target/powerpc/pr80315-4.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr80315-4.c
> @@ -12,6 +12,6 @@ main ()
>    int mask;
>
>    /* Argument 2 must be 0 or 1.  Argument 3 must be in range 0..15.  */
> -  res = vec_shasigma_be (test, 1, 0xff); /* { dg-error {argument 3 must be in the range \[0, 15\]} } */
> +  res = vec_shasigma_be (test, 1, 0xff); /* { dg-error {argument 3 must be a 4-bit unsigned literal} } */
>    return res;
>  }
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr88100.c b/gcc/testsuite/gcc.target/powerpc/pr88100.c
> index 4452145ce95..764c897a497 100644
> --- a/gcc/testsuite/gcc.target/powerpc/pr88100.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr88100.c
> @@ -10,35 +10,35 @@
>  vector unsigned char
>  splatu1 (void)
>

[PATCH, PING] rs6000: Builtins test changes for compare-bytes tests

2021-12-01 Thread Bill Schmidt via Gcc-patches
Hi!  I'd like to ping this patch.

Thanks!
Bill

On 11/18/21 7:47 AM, Bill Schmidt wrote:
> Hi!  This patch is broken out from the patch with test suite changes for the
> new builtins support.
>
> With the old builtins support, cmpb-2.c produces:
>   warning: implicit declaration of function '__builtin_cmpb'; did you mean
>   '__builtin_bcmp'?
>
> With the new support, it produces:
>   error: '__builtin_p6_cmpb' requires the '-mcpu=power6' option and either
>   the '-m64' or '-mpowerpc64' option
>   note: builtin '__builtin_cmpb' requires builtin '__builtin_p6_cmpb'
>
> The reason for this is that this builtin wasn't even initialized in the
> old support.  This reflects a difference in philosophy between the old and
> new methods.  The old support often doesn't initialize builtins for which
> the conditions don't apply based on compile options, but this can backfire
> in general when such constructs as "#pragma target" are used.  The new
> support initializes all builtins, and waits until expand time to determine
> whether or not they are enabled.  Besides added flexibility, we also get
> better error messages as a result.
>
> The case for cmpb32-2.c is similar.
>
> Tested on powerpc64le-linux-gnu and powerpc64-linux-gnu (-m32/-m64) with
> no regressions.  Is this okay for trunk?
>
> Thanks!
> Bill
>
>
> 2021-11-17  Bill Schmidt  
>
> gcc/testsuite/
>   * gcc.target/powerpc/cmpb-2.c: Adjust error message.
>   * gcc.target/powerpc/cmpb32-2.c: Likewise.
> ---
>  gcc/testsuite/gcc.target/powerpc/cmpb-2.c   | 2 +-
>  gcc/testsuite/gcc.target/powerpc/cmpb32-2.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.target/powerpc/cmpb-2.c b/gcc/testsuite/gcc.target/powerpc/cmpb-2.c
> index 113ab6a5f99..02b84d0731d 100644
> --- a/gcc/testsuite/gcc.target/powerpc/cmpb-2.c
> +++ b/gcc/testsuite/gcc.target/powerpc/cmpb-2.c
> @@ -8,7 +8,7 @@ void abort ();
>  unsigned long long int
>  do_compare (unsigned long long int a, unsigned long long int b)
>  {
> -  return __builtin_cmpb (a, b);  /* { dg-warning "implicit declaration of function '__builtin_cmpb'" } */
> +  return __builtin_cmpb (a, b);  /* { dg-error "'__builtin_p6_cmpb' requires the '-mcpu=power6' option" } */
>  }
>
>  void
> diff --git a/gcc/testsuite/gcc.target/powerpc/cmpb32-2.c b/gcc/testsuite/gcc.target/powerpc/cmpb32-2.c
> index 37b54745e0e..d4264ab6e7d 100644
> --- a/gcc/testsuite/gcc.target/powerpc/cmpb32-2.c
> +++ b/gcc/testsuite/gcc.target/powerpc/cmpb32-2.c
> @@ -7,7 +7,7 @@ void abort ();
>  unsigned int
>  do_compare (unsigned int a, unsigned int b)
>  {
> -  return __builtin_cmpb (a, b);  /* { dg-warning "implicit declaration of function '__builtin_cmpb'" } */
> +  return __builtin_cmpb (a, b);  /* { dg-error "'__builtin_p6_cmpb_32' requires the '-mcpu=power6' option" } */
>  }
>  
>  void


[PATCH, PING] rs6000: Builtins test changes for BFP scalar tests

2021-12-01 Thread Bill Schmidt via Gcc-patches
Hi!  I'd like to ping this patch.  Segher had objected to the change in
diagnostics, but I hope we've solved that now with the better informational
message [1].

Thanks!
Bill

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585250.html

On 11/17/21 2:58 PM, Bill Schmidt wrote:
> Hi!  This patch is broken out of the previous patch for all the builtins test
> suite adjustments.  Here we have some slight changes in error messages due to
> how the internals have changed between the old and new builtins methods.
>
> For scalar-extract-exp-2.c we change:
>   error: '__builtin_vec_scalar_extract_exp' is not supported in this
>   compiler configuration
>
> to:
>   error: '__builtin_vsx_scalar_extract_exp' requires the '-mcpu=power9'
>   option and either the '-m64' or '-mpowerpc64' option
>   note: builtin '__builtin_vec_scalar_extract_exp' requires builtin
>   '__builtin_vsx_scalar_extract_exp'
>
> The new message provides more information.  In both cases, it is less than
> ideal that we don't refer to scalar_extract_exp, which is referenced in
> the source line, but this is because scalar_extract_exp is #define'd to
> __builtin_vec_scalar_extract_exp, so it's unavoidable.  Certainly this is no
> worse than before, and arguably better.
>
> The cases for:
>   scalar-insert-exp-2.c
>   scalar-insert-exp-5.c
>   scalar-insert-exp-8.c
> are all similar.
>
> For scalar-extract-sig-2.c we again change:
>   error: '__builtin_vec_scalar_extract_sig' is not supported in this
>   compiler configuration
>
> to:
>   error: '__builtin_vsx_scalar_extract_sig' requires the '-mcpu=power9'
>   option and either the '-m64' or '-mpowerpc64' option
>   note: builtin '__builtin_vec_scalar_extract_sig' requires builtin
>   '__builtin_vsx_scalar_extract_sig'
>
> Here it is clearer because there is no #define to muddy things up, and
> again the new message is arguably better than the old.
>
> For scalar-test-neg-{2,3,5}.c, we actually change the test case.  This is
> because we deliberately removed some undocumented and pointless overloads,
> where each overload mapped to a single builtin.  These were:
>   __builtin_vec_scalar_test_neg_sp
>   __builtin_vec_scalar_test_neg_dp
>   __builtin_vec_scalar_test_neg_qp
> which are redundant with the "real" overload:
>   __builtin_vec_scalar_test_neg
> The latter maps to three builtins of the appropriate type.
>
> The revised test case uses the "real" overload instead, and otherwise the
> changes to the error messages are the same as for all the other cases.
>
> 2021-11-17  Bill Schmidt  
>
> gcc/testsuite/
>   * gcc.target/powerpc/bfp/scalar-extract-exp-2.c: Adjust error
>   message.
>   * gcc.target/powerpc/bfp/scalar-extract-sig-2.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-insert-exp-2.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-insert-exp-5.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-insert-exp-8.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-test-neg-2.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-test-neg-3.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-test-neg-5.c: Likewise.
> ---
>  gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c | 2 +-
>  gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c | 2 +-
>  gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-2.c  | 2 +-
>  gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-5.c  | 2 +-
>  gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-8.c  | 2 +-
>  gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-neg-2.c| 2 +-
>  gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-neg-3.c| 2 +-
>  gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-neg-5.c| 2 +-
>  8 files changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c
> index 922180675fc..53b67c95cf9 100644
> --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c
> +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c
> @@ -14,7 +14,7 @@ get_exponent (double *p)
>  {
>    double source = *p;
>
> -  return scalar_extract_exp (source); /* { dg-error "'__builtin_vec_scalar_extract_exp' is not supported in this compiler configuration" } */
> +  return scalar_extract_exp (source); /* { dg-error "'__builtin_vsx_scalar_extract_exp' requires the" } */
>  }
>
>
> diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c
> index e24d4bd23fe..39ee74c94dc 100644
> --- a/gcc/testsuite/gcc.ta

Re: [PATCH v2] rs6000: Fix a handful of 32-bit built-in function problems

2021-12-01 Thread Bill Schmidt via Gcc-patches
Hi!

I'd like to ping this patch.  By the way, the diagnostics are improved [1]
since I sent it, so that we now inform the user that the overloaded function
is implemented by the instantiated function.

Thanks!
Bill

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585250.html

On 11/16/21 12:56 PM, Bill Schmidt wrote:
> Hi!  I previously posted [1] to correct some problems with the new builtins
> support targeting 32-bit code gen.  Based on the discussion, I've made some
> adjustments and would like to submit this for consideration.
>
> We eventually agreed that the strange behavior for -m32 -mpowerpc64 for
> certain HTM builtins should be removed.  All of the registers TEXASR,
> TEXASRU, TFHAR, and TFIAR are now accessed using the unsigned long data
> type in all configurations.
>
> Segher didn't like the change in the error message for the cmpb-3.c test
> case, but I think this should be fine.  The test case just tests for the
> error message, but there is also a "note" message that provides additional
> information.  The diagnostics that the user sees will look like this:
>
> cmpb-3.c:11:3: error: '__builtin_p6_cmpb' requires the '-mcpu=power6' option and either the '-m64' or '-mpowerpc64' option
> cmpb-3.c:11:3: note: builtin '__builtin_cmpb' requires builtin '__builtin_p6_cmpb'
>
> So it's clear to the user that their use of __builtin_cmpb at line 11
> triggered the error.
>
> Bootstrapped and tested on powerpc64le-linux-gnu, and on powerpc64-linux-gnu
> using -m32/-m64.  Is this okay for trunk?
>
> Thanks!
> Bill
>
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583905.html
>
>
> 2021-11-16  Bill Schmidt  
>
> gcc/
>   * config/rs6000/rs6000-builtin-new.def (CMPB): Flag as no32bit.
>   (BPERMD): Flag as 32bit (needing special handling for 32-bit).
>   (UNPACK_TD): Return unsigned long long instead of unsigned long.
>   (GET_TEXASR): Return unsigned long instead of unsigned long long.
>   (GET_TEXASRU): Likewise.
>   (GET_TFHAR): Likewise.
>   (GET_TFIAR): Likewise.
>   (SET_TEXASR): Pass unsigned long instead of unsigned long long.
>   (SET_TEXASRU): Likewise.
>   (SET_TFHAR): Likewise.
>   (SET_TFIAR): Likewise.
>   (TABORTDC): Likewise.
>   (TABORTDCI): Likewise.
>   * config/rs6000/rs6000-call.c (rs6000_expand_new_builtin): Fix error
>   handling for no32bit.  Add 32bit handling for RS6000_BIF_BPERMD.
>
> gcc/testsuite/
>   * gcc.target/powerpc/cmpb-3.c: Adjust error message.
> ---
>  gcc/config/rs6000/rs6000-builtin-new.def  | 30 +++
>  gcc/config/rs6000/rs6000-call.c   |  9 ---
>  gcc/testsuite/gcc.target/powerpc/cmpb-3.c |  2 +-
>  3 files changed, 22 insertions(+), 19 deletions(-)
>
> diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
> index 58dfce1ca37..30556e5c7f2 100644
> --- a/gcc/config/rs6000/rs6000-builtin-new.def
> +++ b/gcc/config/rs6000/rs6000-builtin-new.def
> @@ -273,7 +273,7 @@
>  ; Power6 builtins requiring 64-bit GPRs (even with 32-bit addressing).
>  [power6-64]
>const signed long __builtin_p6_cmpb (signed long, signed long);
> -CMPB cmpbdi3 {}
> +CMPB cmpbdi3 {no32bit}
>  
>  
>  ; AltiVec builtins.
> @@ -2018,7 +2018,7 @@
>  ADDG6S addg6s {}
>  
>const signed long __builtin_bpermd (signed long, signed long);
> -BPERMD bpermd_di {}
> +BPERMD bpermd_di {32bit}
>  
>const unsigned int __builtin_cbcdtd (unsigned int);
>  CBCDTD cbcdtd {}
> @@ -2971,7 +2971,7 @@
>void __builtin_set_fpscr_drn (const int[0,7]);
>  SET_FPSCR_DRN rs6000_set_fpscr_drn {}
>  
> -  const unsigned long __builtin_unpack_dec128 (_Decimal128, const int<1>);
> +  const unsigned long long __builtin_unpack_dec128 (_Decimal128, const int<1>);
>  UNPACK_TD unpacktd {}
>  
>  
> @@ -3014,39 +3014,39 @@
>  
>  
>  [htm]
> -  unsigned long long __builtin_get_texasr ();
> +  unsigned long __builtin_get_texasr ();
>  GET_TEXASR nothing {htm,htmspr}
>  
> -  unsigned long long __builtin_get_texasru ();
> +  unsigned long __builtin_get_texasru ();
>  GET_TEXASRU nothing {htm,htmspr}
>  
> -  unsigned long long __builtin_get_tfhar ();
> +  unsigned long __builtin_get_tfhar ();
>  GET_TFHAR nothing {htm,htmspr}
>  
> -  unsigned long long __builtin_get_tfiar ();
> +  unsigned long __builtin_get_tfiar ();
>  GET_TFIAR nothing {htm,htmspr}
>  
> -  void __builtin_set_texasr (unsigned long long);
> +  void __builtin_set_texasr (unsigned long);
>  SET_TEXASR nothing {htm,htmspr}
>  
> -  

[PATCH] rs6000: Mirror fix for PR102347 into the new builtins support

2021-12-01 Thread Bill Schmidt via Gcc-patches
Hi!

Recently Kewen fixed a problem in the old builtins support where
rs6000_builtin_decl prematurely indicated that a target builtin is
unavailable.  This also needs to be done for the new builtins support, but in
this case we have to ensure the error message is still produced from the
overload support in rs6000-c.c.  Unfortunately, this is less straightforward
than it could be, because header file includes need to be adjusted to make this
happen.  Someday we'll consolidate all the builtin code in one file and this
won't have to be so ugly.
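
The shape of the change can be sketched in a few lines of C.  This is a
toy model, not the rs6000 code: bif_table, lookup_decl and
resolve_overload are hypothetical stand-ins, and the "enabled" flags are
made up for illustration.  The point it tries to show is only that the
decl lookup no longer judges availability, while overload resolution
reports the error for the instance followed by the note tying it back to
the overloaded name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one non-overloaded builtin instance.  */
struct bif_info
{
  const char *name;   /* instance name */
  const char *option; /* option that enables it */
  int enabled;        /* nonzero if the current target flags enable it */
};

/* Made-up table; the names and options are taken from the test cases
   in this thread, the enabled flags are assumptions.  */
static const struct bif_info bif_table[] = {
  { "__builtin_p6_cmpb", "-mcpu=power6", 0 },
  { "__builtin_altivec_vcmpnezw", "-mcpu=power9", 1 },
};

#define BIF_MAX (sizeof bif_table / sizeof bif_table[0])

/* Decl lookup: only rejects out-of-range codes.  It deliberately does
   not test whether the builtin is enabled.  */
static const struct bif_info *
lookup_decl (size_t code)
{
  return code < BIF_MAX ? &bif_table[code] : NULL;
}

/* Overload resolution: if the chosen instance is not enabled, write the
   error and then the clarifying note into OUT.  Returns the number of
   diagnostics written, or -1 for an invalid code.  */
static int
resolve_overload (const char *ovld_name, size_t code, char *out, size_t len)
{
  const struct bif_info *bif = lookup_decl (code);

  assert (out != NULL && len > 0);
  out[0] = '\0';
  if (bif == NULL)
    return -1;
  if (bif->enabled)
    return 0;
  snprintf (out, len,
            "error: '%s' requires the '%s' option\n"
            "note: overloaded builtin '%s' is implemented by builtin '%s'\n",
            bif->name, bif->option, ovld_name, bif->name);
  return 2;
}
```

With this split, a disabled builtin still has a decl, and the diagnostic
pair comes from one place.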

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is this
okay for trunk?

Thanks!
Bill


2021-12-01  Bill Schmidt  

gcc/
PR target/102347
* config/rs6000/rs6000-c.c (rs6000-builtins.h): Stop including.
(rs6000-internal.h): Include.
(altivec_resolve_new_overloaded_builtin): Move call to
rs6000_invalid_new_builtin here from rs6000_new_builtin_decl.
* config/rs6000/rs6000-call.c (rs6000-builtins.h): Stop including.
(rs6000_invalid_new_builtin): Remove static qualifier.
(rs6000_new_builtin_decl): Remove test for supported builtin.
* config/rs6000/rs6000-internal.h (rs6000-builtins.h): Include.
(rs6000_invalid_new_builtin): Declare.
* config/rs6000/rs6000.c (rs6000-builtins.h): Don't include.
---
 gcc/config/rs6000/rs6000-c.c| 11 +++
 gcc/config/rs6000/rs6000-call.c |  9 +
 gcc/config/rs6000/rs6000-internal.h |  3 +++
 gcc/config/rs6000/rs6000.c  |  1 -
 4 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index 5eeac9d4c06..8e83d97e72f 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -35,7 +35,7 @@
 #include "langhooks.h"
 #include "c/c-tree.h"
 
-#include "rs6000-builtins.h"
+#include "rs6000-internal.h"
 
 static tree altivec_resolve_new_overloaded_builtin (location_t, tree, void *);
 
@@ -2987,11 +2987,14 @@ altivec_resolve_new_overloaded_builtin (location_t loc, tree fndecl,
const char *name = rs6000_overload_info[adj_fcode].ovld_name;
if (!supported)
  {
+   /* Indicate that the instantiation of the overloaded builtin
+  name is not available with the target flags in effect.  */
+   rs6000_gen_builtins fcode = (rs6000_gen_builtins) instance->bifid;
+   rs6000_invalid_new_builtin (fcode);
+   /* Provide clarity of the relationship between the overload
+  and the instantiation.  */
const char *internal_name
  = rs6000_builtin_info_x[instance->bifid].bifname;
-   /* An error message making reference to the name of the
-  non-overloaded function has already been issued.  Add
-  clarification of the previous message.  */
rich_location richloc (line_table, input_location);
inform (&richloc,
"overloaded builtin %qs is implemented by builtin %qs",
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index cd477fa4876..01688c4169d 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -69,7 +69,6 @@
 #include "opts.h"
 
 #include "rs6000-internal.h"
-#include "rs6000-builtins.h"
 
 #if TARGET_MACHO
 #include "gstab.h"  /* for N_SLINE */
@@ -11905,7 +11904,7 @@ rs6000_invalid_builtin (enum rs6000_builtins fncode)
 /* Raise an error message for a builtin function that is called without the
appropriate target options being set.  */
 
-static void
+void
 rs6000_invalid_new_builtin (enum rs6000_gen_builtins fncode)
 {
   size_t j = (size_t) fncode;
@@ -16624,12 +16623,6 @@ rs6000_new_builtin_decl (unsigned code, bool /* initialize_p */)
   if (fcode >= RS6000_OVLD_MAX)
 return error_mark_node;
 
-  if (!rs6000_new_builtin_is_supported (fcode))
-{
-  rs6000_invalid_new_builtin (fcode);
-  return error_mark_node;
-}
-
   return rs6000_builtin_decls_x[code];
 }
 
diff --git a/gcc/config/rs6000/rs6000-internal.h b/gcc/config/rs6000/rs6000-internal.h
index 88cf9bd5692..a880fd37618 100644
--- a/gcc/config/rs6000/rs6000-internal.h
+++ b/gcc/config/rs6000/rs6000-internal.h
@@ -22,6 +22,8 @@
 #ifndef GCC_RS6000_INTERNAL_H
 #define GCC_RS6000_INTERNAL_H
 
+#include "rs6000-builtins.h"
+
 /* Structure used to define the rs6000 stack */
 typedef struct rs6000_stack {
   int reload_completed;/* stack info won't change from here on 
*/
@@ -140,6 +142,7 @@ extern void rs6000_output_mi_thunk (FILE *file,
 extern bool rs6000_output_addr_const_extra (FILE *file, rtx x);
 extern bool rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi);
 extern tree rs6000_build_builtin_va_list (void);
+extern void rs6000_invalid_new_builtin (rs6000_gen_builtins fncode);
 extern void rs6000

[PATCH, committed] rs6000: Fix test_mffsl.c effective target check

2021-11-23 Thread Bill Schmidt via Gcc-patches
Hi!

Paul Clarke pointed out to me that I had wrongly used a compile-time check
instead of a run-time check in this executable test.  This patch fixes
that.  I also fixed a typo in a string that caught my eye.

Tested on powerpc64le-linux-gnu, committed as obvious.

Thanks!
Bill


2021-11-23  Bill Schmidt  

gcc/testsuite/
* gcc.target/powerpc/test_mffsl.c: Change effective target to
a run-time check.  Fix a typo in a debug print statement.
---
 gcc/testsuite/gcc.target/powerpc/test_mffsl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/test_mffsl.c b/gcc/testsuite/gcc.target/powerpc/test_mffsl.c
index 28c2b91988e..f1f960c51c7 100644
--- a/gcc/testsuite/gcc.target/powerpc/test_mffsl.c
+++ b/gcc/testsuite/gcc.target/powerpc/test_mffsl.c
@@ -1,6 +1,6 @@
 /* { dg-do run { target { powerpc*-*-* } } } */
 /* { dg-options "-O2 -std=c99 -mcpu=power9" } */
-/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-require-effective-target p9vector_hw } */
 
 #ifdef DEBUG
 #include <stdio.h>
@@ -28,7 +28,7 @@ int main ()
   if (mffs_val.ll != mffsl_val.ll)
 {
 #ifdef DEBUG
-  printf("ERROR, __builtin_mffsl() returned 0x%llx, not the expecected value 0x%llx\n",
+  printf("ERROR, __builtin_mffsl() returned 0x%llx, not the expected value 0x%llx\n",
 mffsl_val.ll, mffs_val.ll);
 #else
   abort();
-- 
2.25.1




[PATCH] rs6000: Clarify overloaded builtin diagnostic

2021-11-23 Thread Bill Schmidt via Gcc-patches
Hi!

When a built-in function required by an overloaded function name is not
currently enabled, the diagnostic message is not as clear as it should be.
Saying that one built-in "requires" another is somewhat misleading.  It is
better to explicitly state that the overloaded builtin is implemented by the
missing builtin, so the user knows that the previous error message for the
implementing builtin is because of the overload relationship.

This patch adjusts the informational diagnostic for both the original support
and the new builtin support.  This doesn't affect the test suite, since we
don't test for "note" diagnostics anywhere.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is
this okay for trunk?

Thanks!
Bill


2021-11-23  Bill Schmidt  

gcc/
* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin):
Clarify diagnostic.
(altivec_resolve_new_overloaded_builtin): Likewise.
---
 gcc/config/rs6000/rs6000-c.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index d08bdfec3ae..5eeac9d4c06 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -1946,7 +1946,8 @@ altivec_resolve_overloaded_builtin (location_t loc, tree fndecl,
   non-overloaded function has already been issued.  Add
   clarification of the previous message.  */
rich_location richloc (line_table, input_location);
-   inform (&richloc, "builtin %qs requires builtin %qs",
+   inform (&richloc,
+   "overloaded builtin %qs is implemented by builtin %qs",
name, internal_name);
  }
else
@@ -2992,7 +2993,8 @@ altivec_resolve_new_overloaded_builtin (location_t loc, tree fndecl,
   non-overloaded function has already been issued.  Add
   clarification of the previous message.  */
rich_location richloc (line_table, input_location);
-   inform (&richloc, "builtin %qs requires builtin %qs",
+   inform (&richloc,
+   "overloaded builtin %qs is implemented by builtin %qs",
name, internal_name);
  }
else
-- 
2.27.0




Re: [PATCH 1/3] Add power10 zero cycle moves for switches & indirect jumps

2021-11-22 Thread Bill Schmidt via Gcc-patches
Hi Mike,

Thanks for this patch!

On 11/19/21 8:53 AM, Michael Meissner wrote:
> Add power10 zero cycle moves for switches.
>
> Power10 will fuse adjacent 'mtctr' and 'bctr' instructions to form zero
> cycle moves.  This code exploits this fusion opportunity.
>
> I have built bootstrapped compilers with this patch on little endian power9
> and power10 systems with no regressions.  Can I install this into the master
> branch?
>
> 2021-11-19  Michael Meissner  
>
>   * config/rs6000/rs6000-cpus.def (ISA_3_1_MASKS_SERVER): Add
>   support for -mpower10-fusion-zero-cycle.
>   (POWERPC_MASKS): Likewise.
>   * config/rs6000/rs6000.c (rs6000_option_override_internal):
>   Likewise.
>   * config/rs6000/rs6000.md (indirect_jump): Support zero cycle
>   moves.
>   (indirect_jump_zero_cycle): New insns.
>   (tablejump_normal): Likewise.
>   (tablejump_absolute): Likewise.
>   (tablejump_insn_zero_cycle): New insn.
>   * config/rs6000/rs6000.opt (-mpower10-fusion-zero-cycle): New
>   debug switch.
> ---
>  gcc/config/rs6000/rs6000-cpus.def |  4 ++-
>  gcc/config/rs6000/rs6000.c|  4 +++
>  gcc/config/rs6000/rs6000.md   | 52 ---
>  gcc/config/rs6000/rs6000.opt  |  4 +++
>  4 files changed, 59 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
> index f5812da0184..cc072ee94ea 100644
> --- a/gcc/config/rs6000/rs6000-cpus.def
> +++ b/gcc/config/rs6000/rs6000-cpus.def
> @@ -91,7 +91,8 @@
>| OPTION_MASK_P10_FUSION_LOGADD\
>| OPTION_MASK_P10_FUSION_ADDLOG\
>| OPTION_MASK_P10_FUSION_2ADD  \
> -  | OPTION_MASK_P10_FUSION_2STORE)
> +  | OPTION_MASK_P10_FUSION_2STORE\
> +  | OPTION_MASK_P10_FUSION_ZERO_CYCLE)

I guess it's fine to introduce one more for now, but ultimately we want
all these to get collapsed down to one.  No worries from me.

>  
>  /* Flags that need to be turned off if -mno-power9-vector.  */
>  #define OTHER_P9_VECTOR_MASKS (OPTION_MASK_FLOAT128_HW \
> @@ -145,6 +146,7 @@
>| OPTION_MASK_P10_FUSION_ADDLOG\
>| OPTION_MASK_P10_FUSION_2ADD  \
>| OPTION_MASK_P10_FUSION_2STORE\
> +  | OPTION_MASK_P10_FUSION_ZERO_CYCLE\
>| OPTION_MASK_HTM  \
>| OPTION_MASK_ISEL \
>| OPTION_MASK_MFCRF\
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index e4843eb0f1c..6780304a5eb 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -4497,6 +4497,10 @@ rs6000_option_override_internal (bool global_init_p)
>&& (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_2STORE) == 0)
>  rs6000_isa_flags |= OPTION_MASK_P10_FUSION_2STORE;
>  
> +  if (TARGET_POWER10
> +  && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_ZERO_CYCLE) == 0)
> +rs6000_isa_flags |= OPTION_MASK_P10_FUSION_ZERO_CYCLE;
> +
>/* Turn off vector pair/mma options on non-power10 systems.  */
>else if (!TARGET_POWER10 && TARGET_MMA)
>  {
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 6bec2bddbde..ea41eb4ada3 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -12988,15 +12988,34 @@ (define_expand "indirect_jump"
>  emit_jump_insn (gen_indirect_jump_nospec (Pmode, operands[0], ccreg));
>  DONE;
>}
> +  if (TARGET_P10_FUSION && TARGET_P10_FUSION_ZERO_CYCLE)
> +{
> +  emit_jump_insn (gen_indirect_jump_zero_cycle (Pmode, operands[0]));
> +  DONE;
> +}
>  })
>  
>  (define_insn "*indirect_jump"
>[(set (pc)
>   (match_operand:P 0 "register_operand" "c,*l"))]
> -  "rs6000_speculate_indirect_jumps"
> +  "rs6000_speculate_indirect_jumps
> +   && !(TARGET_P10_FUSION && TARGET_P10_FUSION_ZERO_CYCLE)"
>"b%T0"
>[(set_attr "type" "jmpreg")])
>  
> +(define_insn "@indirect_jump_zero_cycle"

I don't know why this is an "@" pattern, but honestly I don't
know why @indirect_jump_nospec is an "@" pattern either.
The documentation for such things is hard for me to understand,
so I'm probably just missing something obvious, but I don't
immediately see why we would need the @ here.

> +  [(set (pc)
> + (match_operand:P 0 "register_operand" "r,r,!cl"))
> +   (clobber (match_scratch:P 1 "=c,*l,X"))]

Do we need the *l and X alternatives if we're only doing this for
mtctr/bctr?

> +  "rs6000_speculate_indirect_jumps && TARGET_P10_FUSION
> +   && TARGET_P10_FUSION_ZERO_CYCLE"
> +  "@
> +   mt%T1 

Re: [PATCH 2/3] Set power10 fusion if -mtune=power10.

2021-11-22 Thread Bill Schmidt via Gcc-patches
Hi Mike,

On 11/19/21 8:55 AM, Michael Meissner wrote:
> Set power10 fusion if -mtune=power10.
>
> In doing the patch for zero cycle moves for switch statements and indirect
> jumps, I noticed the fusion support is only done if -mcpu=power10.  This
> option enables power10 fusion if we use -mtune=power10.
>
> I have built and run the testsuites on little endian power9 and power10
> systems with no regressions.  Can I install this patch?

This all seems fine, but since we're planning on collapsing all those flags
anyway, maybe it would be better if we did that first.  This seems like work
that will mostly be removed soon.  But no concerns from me otherwise.

Thanks!
Bill

>
> 2021-11-19  Michael Meissner  
>
>   * config/rs6000/rs6000.c (rs6000_option_override_internal): Enable
>   power10 fusion if -mtune=power10.
>   (rs6000_opt_masks): Add power10 fusion options.
> ---
>  gcc/config/rs6000/rs6000.c | 25 +
>  1 file changed, 17 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 6780304a5eb..8531cef0337 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -4469,35 +4469,36 @@ rs6000_option_override_internal (bool global_init_p)
>if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_MMA) == 0)
>  rs6000_isa_flags |= OPTION_MASK_MMA;
>  
> -  if (TARGET_POWER10
> +  /* Enable power10 tuning if either -mcpu=power10 or -mtune=power10.  */
> +  if ((TARGET_POWER10 || rs6000_tune == PROCESSOR_POWER10)
>&& (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION) == 0)
>  rs6000_isa_flags |= OPTION_MASK_P10_FUSION;
>  
> -  if (TARGET_POWER10 &&
> +  if (TARGET_P10_FUSION &&
>(rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_LD_CMPI) == 0)
>  rs6000_isa_flags |= OPTION_MASK_P10_FUSION_LD_CMPI;
>  
> -  if (TARGET_POWER10
> +  if (TARGET_P10_FUSION
>&& (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_2LOGICAL) == 0)
>  rs6000_isa_flags |= OPTION_MASK_P10_FUSION_2LOGICAL;
>  
> -  if (TARGET_POWER10
> +  if (TARGET_P10_FUSION
>&& (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_LOGADD) == 0)
>  rs6000_isa_flags |= OPTION_MASK_P10_FUSION_LOGADD;
>  
> -  if (TARGET_POWER10
> +  if (TARGET_P10_FUSION
>&& (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_ADDLOG) == 0)
>  rs6000_isa_flags |= OPTION_MASK_P10_FUSION_ADDLOG;
>  
> -  if (TARGET_POWER10
> +  if (TARGET_P10_FUSION
>&& (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_2ADD) == 0)
>  rs6000_isa_flags |= OPTION_MASK_P10_FUSION_2ADD;
>  
> -  if (TARGET_POWER10
> +  if (TARGET_P10_FUSION
>&& (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_2STORE) == 0)
>  rs6000_isa_flags |= OPTION_MASK_P10_FUSION_2STORE;
>  
> -  if (TARGET_POWER10
> +  if (TARGET_P10_FUSION
>&& (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_ZERO_CYCLE) == 0)
>  rs6000_isa_flags |= OPTION_MASK_P10_FUSION_ZERO_CYCLE;
>  
> @@ -24292,6 +24293,14 @@ static struct rs6000_opt_mask const 
> rs6000_opt_masks[] =
>{ "power9-misc",   OPTION_MASK_P9_MISC,false, true  },
>{ "power9-vector", OPTION_MASK_P9_VECTOR,  false, true  },
>{ "power10-fusion",OPTION_MASK_P10_FUSION, false, 
> true  },
> +  { "power10-fusion-ld-cmpi",OPTION_MASK_P10_FUSION_LD_CMPI, false, 
> true  },
> +  { "power10-fusion-2logical",   OPTION_MASK_P10_FUSION_2LOGICAL,false, 
> true  },
> +  { "power10-fusion-logical-add", OPTION_MASK_P10_FUSION_LOGADD,false, true  
> },
> +  { "power10-fusion-add-logical", OPTION_MASK_P10_FUSION_ADDLOG,false, true  
> },
> +  { "power10-fusion-2add",   OPTION_MASK_P10_FUSION_2ADD,false, true  },
> +  { "power10-fusion-2store", OPTION_MASK_P10_FUSION_2STORE,  false, true  },
> +  { "power10-fusion-zero-cycle", OPTION_MASK_P10_FUSION_ZERO_CYCLE,
> + false, true  },
>{ "powerpc-gfxopt",OPTION_MASK_PPC_GFXOPT, false, 
> true  },
>{ "powerpc-gpopt", OPTION_MASK_PPC_GPOPT,  false, true  },
>{ "prefixed",  OPTION_MASK_PREFIXED,   false, 
> true  },


Re: [PATCH 0/3] Add zero cycle move support

2021-11-22 Thread Bill Schmidt via Gcc-patches
Hi!

On 11/19/21 8:49 AM, Michael Meissner wrote:
> The next set of 3 patches add zero cycle move support to the Power10.  Zero
> cycle moves are where the move to LR/CTR/TAR register that is adjacent to the
> jump to LR/CTR/TAR register can be fused together.
>
> At the moment, this set of three patches adds support for zero cycle moves for
> indirect jumps and switch tables using the CTR register.  Potential zero cycle
> moves for doing returns are not currently handled.
>
> In looking at the code, I discovered that just using zero cycle moves isn't as
> helpful unless we can eliminate the add instruction before doing the jump.  I
> also noticed that the various power10 fusion options are only done if
> -mcpu=power10.  I added a patch to do the fusion for -mtune=power10 as well.
>
> I have done bootstraps and make check with these patches installed on both
> little endian power9 and little endian power10 systems.  Can I install these
> patches?
>
> The following patches will be posted:
>
> 1) Patch to add zero cycle move for indirect jumps and switches.
>
> 2) Patch to enable p10 fusion for -mtune=power10 in addition to -mcpu=power10.
>
> 3) Patch to use absolute addresses for switch tables instead of relative
>addresses if zero cycle fusion is enabled.
>
For this last point, I had thought that the plan was to always switch over to
absolute addresses for switch tables, following the work that Hao Chen did in
this area.  Am I misremembering?  Hao Chen, can you please remind me where we
ended up here?

Thanks!
Bill



Re: [PATCH] rs6000: Builtins test changes for BFP scalar tests

2021-11-18 Thread Bill Schmidt via Gcc-patches
Hi!

On 11/18/21 3:32 PM, Segher Boessenkool wrote:
> On Thu, Nov 18, 2021 at 03:30:48PM -0600, Bill Schmidt wrote:
>> On 11/18/21 3:16 PM, Segher Boessenkool wrote:
>>> Hi!
>>>
>>> On Wed, Nov 17, 2021 at 05:06:05PM -0600, Bill Schmidt wrote:
>>>>> I don't like that at all.  The user didn't write the _vsx thing, and it
>>>>> isn't documented either (neither is the _vec one, but that is a separate
>>>>> issue, specific to this builtin).
>>>> I feel like I haven't explained this well.  This kind of thing has been in
>>>> existence forever even in the old builtins code.  The combination of the
>>>> error showing the internal builtin name, and the note tying the overload
>>>> name to the internal builtin name, has been there all along.  The name of
>>>> the internal builtin is pretty meaningless.  The only thing that's
>>>> interesting in this case is that we previously didn't get this *for this
>>>> specific case* because the old code went to a generic fallback.  But in
>>>> many other cases you get exactly this same kind of error message for the
>>>> old code.
>>> Yes.  And it still is a regression (in *this* case).
>> Sorry, I don't understand.  Why specifically is this a regression?
> It is wrong now, in ways that it wasn't wrong before.  That is the
> definition of regression!

I'm sorry, I disagree.  With clarification of the note, I don't see how
this can be considered a regression.  It is providing information in a
different way, but it is still clear that the user has misused the builtin
in the context, and, unlike before, it now tells them *what* is wrong with
the options that were used (not just "unavailable in this configuration").
The fact that an internal builtin name is *also* disclosed as part of
this does not make it wrong.

The way that overloads work, we can only tell whether a builtin is
enabled with the current set of options by looking at the true builtin
that the overload maps to.  The enablement checking code doesn't have
any knowledge that an overloaded function maps to it.  So that error
message is produced without knowledge of the overloading.  The note
diagnostic is added by the overload processing code that *is* aware
that the mapping exists.

The enablement checking code (rs6000_invalid_builtin in the old code,
rs6000_invalid_new_builtin in the new code) is called from multiple
places, and not always in an overload context, so we can't assume
this is the case.  Not all builtins are mapped to by overloads, but
they still need enablement checking.
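
To make that split concrete, here is a minimal sketch in C.  It is not the
GCC code: the struct, the function names (check_enabled, resolve_overload),
and any builtin names passed to them are hypothetical, and the real
diagnostics go through error () and inform () rather than fprintf.  It only
illustrates why the error can name nothing but the internal builtin, while
the note is issued from the one place where the overload name is known.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical description of one internal (non-overloaded) builtin.  */
struct builtin_info
{
  const char *internal_name;   /* name of the implementing builtin */
  const char *required_option; /* option that enables it */
  bool enabled;                /* true if current options enable it */
};

/* Counts diagnostics so the behavior can be checked.  */
static int diagnostics_emitted;

/* Enablement check.  It sees only the internal builtin and has no idea
   whether an overload mapped to it.  Returns true if usable.  */
static bool
check_enabled (const struct builtin_info *b)
{
  if (b->enabled)
    return true;
  fprintf (stderr, "error: '%s' requires the '%s' option\n",
           b->internal_name, b->required_option);
  diagnostics_emitted++;
  return false;
}

/* Overload resolution.  This is the only code that knows the mapping,
   so it adds the note after the error has already been reported.  */
static bool
resolve_overload (const char *overload_name, const struct builtin_info *b)
{
  assert (overload_name != NULL && b != NULL);
  if (check_enabled (b))
    return true;
  fprintf (stderr,
           "note: overloaded builtin '%s' is implemented by builtin '%s'\n",
           overload_name, b->internal_name);
  diagnostics_emitted++;
  return false;
}
```

Calling check_enabled directly, as happens for builtins that no overload
maps to, yields the error alone; going through resolve_overload yields the
error followed by the note.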

Would it be possible to change things so that we pass in the overload
name to be used in the error message when appropriate?  Yes.  But
this would have a much larger impact on the test suite than the
current method, because all error tests for overloads would now
have to change.  That is, there are many existing tests that are
already "wrong" in the sense that they report the internal builtin
name.

I suggest that we add that to the list of future cleanups due to
the size of the impact.  I agree that it has never been ideal to
mention the internal builtin name on these messages.  It's just
not unique to the changes I've made here; it's a pre-existing
condition that needs work to cleanse it everywhere.

Can we move forward this way?

Thanks,
Bill

>
>
> Segher


Re: [PATCH] rs6000: Builtins test changes for BFP scalar tests

2021-11-18 Thread Bill Schmidt via Gcc-patches


On 11/18/21 3:16 PM, Segher Boessenkool wrote:
> Hi!
>
> On Wed, Nov 17, 2021 at 05:06:05PM -0600, Bill Schmidt wrote:
>>> I don't like that at all.  The user didn't write the _vsx thing, and it
>>> isn't documented either (neither is the _vec one, but that is a separate
>>> issue, specific to this builtin).
>> I feel like I haven't explained this well.  This kind of thing has been in
>> existence forever even in the old builtins code.  The combination of the
>> error showing the internal builtin name, and the note tying the overload
>> name to the internal builtin name, has been there all along.  The name of
>> the internal builtin is pretty meaningless.  The only thing that's
>> interesting in this case is that we previously didn't get this *for this
>> specific case* because the old code went to a generic fallback.  But in
>> many other cases you get exactly this same kind of error message for the
>> old code.
> Yes.  And it still is a regression (in *this* case).

Sorry, I don't understand.  Why specifically is this a regression?

Bill

>
>
> Segher


Summary of outstanding builtins infrastructure patches

2021-11-18 Thread Bill Schmidt via Gcc-patches
Hi!  Thanks for all the recent reviews and conversations on the builtins
infrastructure patches.  I've posted a lot of stuff in the last couple
of days, so I thought it might be useful to summarize which patches still
need review.  No rush, just trying to make it easier to consume...

https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584631.html
  Add [power6-64] stanza
  Not yet reviewed

https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584638.html
  V2: Fix a handful of 32-bit built-in function problems
  Follow-up patch after lengthy discussion

https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584826.html
  Test changes for BFP scalar tests
  This is the last post in the conversation.  I'm hoping that an additional
patch to change the note diagnostic will make this more palatable.
Resolving this will help with some of the other patches.

https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584827.html
  Test changes for byte-in-set-2.c
  Not yet reviewed

https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584829.html
  Test changes for compare-bytes
  Not yet reviewed

https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584857.html
  Test changes for pr80315-*.c, pr88100.c
  Not yet reviewed

https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584860.html
  Test changes for pragma_misc9.c
  Not yet reviewed

https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584866.html
  Test changes for test_fpscr_[d]rn_builtin_error.c
  Not yet reviewed


That's the lot.  Once again, I really appreciate all the time you've
spent reviewing this series this year!

Thanks,
Bill



[PATCH] rs6000: Builtins test changes for test_fpscr_[d]rn_builtin_error.c

2021-11-18 Thread Bill Schmidt via Gcc-patches
Hi!  This is the last patch broken out of the previous test suite patch
for the new builtins support.

One advantage of the new builtins support is uniform error messages for
arguments with restricted values.  Previously this was done in many places
in an ad hoc manner, with little uniformity.  This patch adjusts the
expected error messages accordingly.

All such error messages are now one of the following:
  "argument %d must be a %d-bit unsigned literal"
  "argument %d must be a literal between %d and %d, inclusive"
  "argument %d must be a variable or a literal between %d and %d, inclusive"
  "argument %d must be either a literal %d or a literal %d"

These messages were chosen to require the fewest changes from previous
messages while still introducing uniformity.  This patch adjusts error
messages for some cases where this produces changed messages.  In
particular, some messages are improved because previously they did not
admit the possibility that an argument could hold a variable.
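
As an illustration of how four message forms can cover every restricted
argument, here is a small sketch in C.  The enum, the check_literal
function, and its parameter layout are assumptions made for this example
and are not the names used in the rs6000 sources; only the four format
strings come from the list above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical restriction kinds, one per message form listed above.  */
enum restriction
{
  RES_BITS,      /* literal that fits in A unsigned bits */
  RES_RANGE,     /* literal in [A, B] */
  RES_VAR_RANGE, /* variable, or literal in [A, B] */
  RES_VALUES     /* literal equal to A or to B */
};

/* Check the literal VALUE passed as argument ARGNO (1-based).  Variables
   are assumed to be accepted by the caller without reaching this point.
   Returns true if VALUE is allowed; otherwise writes the matching
   message into BUF and returns false.  */
static bool
check_literal (enum restriction r, int argno, long value, int a, int b,
               char *buf, size_t len)
{
  bool ok = false;

  assert (buf != NULL && len > 0);
  buf[0] = '\0';

  switch (r)
    {
    case RES_BITS:
      assert (a > 0 && a < 31);
      ok = value >= 0 && value < (1L << a);
      if (!ok)
        snprintf (buf, len,
                  "argument %d must be a %d-bit unsigned literal", argno, a);
      break;
    case RES_RANGE:
      ok = value >= a && value <= b;
      if (!ok)
        snprintf (buf, len, "argument %d must be a literal between %d "
                  "and %d, inclusive", argno, a, b);
      break;
    case RES_VAR_RANGE:
      ok = value >= a && value <= b;
      if (!ok)
        snprintf (buf, len, "argument %d must be a variable or a literal "
                  "between %d and %d, inclusive", argno, a, b);
      break;
    case RES_VALUES:
      ok = value == a || value == b;
      if (!ok)
        snprintf (buf, len, "argument %d must be either a literal %d or "
                  "a literal %d", argno, a, b);
      break;
    }
  return ok;
}
```

With these definitions, check_literal (RES_BITS, 1, 32, 5, 0, buf, sizeof buf)
returns false and leaves "argument 1 must be a 5-bit unsigned literal" in
buf, which is the wording the adjusted tests expect for __builtin_mtfsb0(32).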

Tested on powerpc64le-linux-gnu and powerpc64-linux-gnu (-m32/-m64)
with no regressions.  Is this okay for trunk?

Thanks!
Bill


2021-11-17  Bill Schmidt  

gcc/testsuite/
* gcc.target/powerpc/test_fpscr_drn_builtin_error.c: Adjust error
messages.
* gcc.target/powerpc/test_fpscr_rn_builtin_error.c: Likewise.
---
 .../powerpc/test_fpscr_drn_builtin_error.c   |  4 ++--
 .../gcc.target/powerpc/test_fpscr_rn_builtin_error.c | 12 ++--
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/test_fpscr_drn_builtin_error.c b/gcc/testsuite/gcc.target/powerpc/test_fpscr_drn_builtin_error.c
index 028ab0b6d66..4f9d9e08e8a 100644
--- a/gcc/testsuite/gcc.target/powerpc/test_fpscr_drn_builtin_error.c
+++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_drn_builtin_error.c
@@ -9,8 +9,8 @@ int main ()
  __builtin_set_fpscr_drn() also support a variable as an argument but
  can't test variable value at compile time.  */
 
-  __builtin_set_fpscr_drn(-1);  /* { dg-error "Argument must be a value between 0 and 7" } */
-  __builtin_set_fpscr_drn(8);   /* { dg-error "Argument must be a value between 0 and 7" } */
+  __builtin_set_fpscr_drn(-1);  /* { dg-error "argument 1 must be a variable or a literal between 0 and 7, inclusive" } */
+  __builtin_set_fpscr_drn(8);   /* { dg-error "argument 1 must be a variable or a literal between 0 and 7, inclusive" } */
 
 }
 
diff --git a/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_error.c b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_error.c
index aea65091b0c..10391b71008 100644
--- a/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_error.c
+++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_error.c
@@ -8,13 +8,13 @@ int main ()
  int arguments.  The builtins __builtin_set_fpscr_rn() also supports a
  variable as an argument but can't test variable value at compile time.  */
 
-  __builtin_mtfsb0(-1);  /* { dg-error "Argument must be a constant between 0 and 31" } */
-  __builtin_mtfsb0(32);  /* { dg-error "Argument must be a constant between 0 and 31" } */
+  __builtin_mtfsb0(-1);  /* { dg-error "argument 1 must be a 5-bit unsigned literal" } */
+  __builtin_mtfsb0(32);  /* { dg-error "argument 1 must be a 5-bit unsigned literal" } */
 
-  __builtin_mtfsb1(-1);  /* { dg-error "Argument must be a constant between 0 and 31" } */
-  __builtin_mtfsb1(32);  /* { dg-error "Argument must be a constant between 0 and 31" } */
+  __builtin_mtfsb1(-1);  /* { dg-error "argument 1 must be a 5-bit unsigned literal" } */
+  __builtin_mtfsb1(32);  /* { dg-error "argument 1 must be a 5-bit unsigned literal" } */
 
-  __builtin_set_fpscr_rn(-1);  /* { dg-error "Argument must be a value between 0 and 3" } */
-  __builtin_set_fpscr_rn(4);   /* { dg-error "Argument must be a value between 0 and 3" } */
+  __builtin_set_fpscr_rn(-1);  /* { dg-error "argument 1 must be a variable or a literal between 0 and 3, inclusive" } */
+  __builtin_set_fpscr_rn(4);   /* { dg-error "argument 1 must be a variable or a literal between 0 and 3, inclusive" } */
 }
 
-- 
2.27.0




Re: [PATCH] rs6000: Builtin test changes for int_128bit-runnable.c

2021-11-18 Thread Bill Schmidt via Gcc-patches
Hi!

On 11/18/21 10:22 AM, Segher Boessenkool wrote:
> On Thu, Nov 18, 2021 at 10:09:53AM -0600, Bill Schmidt wrote:
>> Hi!  This patch is broken out from the test case patch for the new builtins 
>> support.
>>
>> The old builtins code performs gimple folding on 128-bit compares.  This
>> results in correct but very inefficient code.  (I suspect we may be
>> missing some optab entries, misleading gimple into 64-bit emulation.)
> It is sub-optimal if Gimple ever does this: it is better done later, at
> expand time.
>
>> In
>> the new support, I disabled this gimple folding, which results in us
>> directly generating the 128-bit comparison instructions.  This patch
>> adjusts the scan-assembler-times counts for those instructions.
>>
>> I've opened PR103316 to track this.
> Thanks.
>
> So when the generic code wisens up this testcase will still pass?  But
> you do then need to re-introduce the folding here?

Right.  If we fix the generic code, the test will still pass.  We need
to re-introduce the folding to leverage it, and will only do that if
the test still passes.  We always want these single-instruction
comparisons to fall out for simple tests like these.

>
>> gcc/testsuite/
>>  * gcc.target/powerpc/int_128bit-runnable.c: Adjust instruction
>>  counts since we do better by not gimple-folding some builtins.
> Wrap later please?  80 chars is fine, 79 chars is fine, 10 chars or 70
> chars is not :-(
>
> (Not that it matters much *here* of course; it just annoys me

A slight argument in favor of earlier wrapping:  With Git, the ChangeLog
entries in the commit messages get indented, so wrapping a little
earlier makes those much easier to read.  That's why I started reducing
the length of my entries a little.  Not a big deal either way, but it's
really noticeable in git log output.

>
> Also, s/ since.*/./ please.  Changelogs say what changed, not why, and
> the "why" you say here is only half of the story, pretty misleading for
> future archaeologists.

Good call.
>
>> --- a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
>> +++ b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
>> @@ -11,9 +11,9 @@
>>  /* { dg-final { scan-assembler-times {\mvrlq\M} 2 } } */
>>  /* { dg-final { scan-assembler-times {\mvrlqnm\M} 2 } } */
>>  /* { dg-final { scan-assembler-times {\mvrlqmi\M} 2 } } */
>> -/* { dg-final { scan-assembler-times {\mvcmpequq\M} 16 } } */
>> -/* { dg-final { scan-assembler-times {\mvcmpgtsq\M} 16 } } */
>> -/* { dg-final { scan-assembler-times {\mvcmpgtuq\M} 16 } } */
>> +/* { dg-final { scan-assembler-times {\mvcmpequq\M} 24 } } */
>> +/* { dg-final { scan-assembler-times {\mvcmpgtsq\M} 26 } } */
>> +/* { dg-final { scan-assembler-times {\mvcmpgtuq\M} 26 } } */
>>  /* { dg-final { scan-assembler-times {\mvmuloud\M} 1 } } */
>>  /* { dg-final { scan-assembler-times {\mvmulesd\M} 1 } } */
>>  /* { dg-final { scan-assembler-times {\mvmulosd\M} 1 } } */
> If you think it actually generates better code now, and this is expected
> code, then okay for trunk.  Thanks!

Thanks very much for the review!
Bill
>
>
> Segher


[PATCH] rs6000: Builtins test changes for pragma_misc9.c

2021-11-18 Thread Bill Schmidt via Gcc-patches
Hi!  This patch is broken out from the test suite patch for the new
builtins support.  This one is just a minor adjustment for the error
message wording.

Tested on powerpc64le-linux-gnu and powerpc64-linux-gnu (-m32/-m64)
with no regressions.  Is this okay for trunk?

Thanks!
Bill


2021-11-17  Bill Schmidt  

gcc/testsuite/
* gcc.target/powerpc/pragma_misc9.c: Adjust error message.
---
 gcc/testsuite/gcc.target/powerpc/pragma_misc9.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/pragma_misc9.c b/gcc/testsuite/gcc.target/powerpc/pragma_misc9.c
index e03099bd084..c1667d9f7db 100644
--- a/gcc/testsuite/gcc.target/powerpc/pragma_misc9.c
+++ b/gcc/testsuite/gcc.target/powerpc/pragma_misc9.c
@@ -20,7 +20,7 @@ vector bool int
 test2 (vector signed int a, vector signed int b)
 {
   return vec_cmpnez (a, b);
-  /* { dg-error "'__builtin_altivec_vcmpnezw' requires the '-mcpu=power9' option" "" { target *-*-* } .-1 } */
+  /* { dg-error "'__builtin_altivec_vcmpnezw' requires the '-mcpu=power9' and '-mvsx' options" "" { target *-*-* } .-1 } */
 }
 
 #pragma GCC target ("cpu=power7")
@@ -28,7 +28,7 @@ vector signed int
 test3 (vector signed int a, vector signed int b)
 {
   return vec_mergee (a, b);
-  /* { dg-error "'__builtin_altivec_vmrgew_v4si' requires the '-mpower8-vector' option" "" { target *-*-* } .-1 } */
+  /* { dg-error "'__builtin_altivec_vmrgew_v4si' requires the '-mcpu=power8' and '-mvsx' options" "" { target *-*-* } .-1 } */
 }
 
 #pragma GCC target ("cpu=power6")
-- 
2.27.0




[PATCH] rs6000: Builtins test changes for pr80315-*.c, pr88100.c

2021-11-18 Thread Bill Schmidt via Gcc-patches
Hi!  This patch is broken out from the test case patch for the new
builtins support.

One advantage of the new builtins support is uniform error messages for
arguments with restricted values.  Previously this was done in many places
in an ad hoc manner, with little uniformity.  This patch adjusts the
expected error messages accordingly.

All error messages are now one of the following:
  "argument %d must be a %d-bit unsigned literal"
  "argument %d must be a literal between %d and %d, inclusive"
  "argument %d must be a variable or a literal between %d and %d, inclusive"
  "argument %d must be either a literal %d or a literal %d"

These messages were chosen to require the fewest changes from previous
messages while still introducing uniformity.  This patch adjusts error
messages for some cases where this produces changed messages.

Tested on powerpc64le-linux-gnu and powerpc64-linux-gnu (-m32/-m64) with
no regressions.  Is this okay for trunk?

Thanks!
Bill


2021-11-17  Bill Schmidt  

gcc/testsuite/
* gcc.target/powerpc/pr80315-1.c: Adjust error message.
* gcc.target/powerpc/pr80315-2.c: Likewise.
* gcc.target/powerpc/pr80315-3.c: Likewise.
* gcc.target/powerpc/pr80315-4.c: Likewise.
* gcc.target/powerpc/pr88100.c: Likewise.
---
 gcc/testsuite/gcc.target/powerpc/pr80315-1.c |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-2.c |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-3.c |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-4.c |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr88100.c   | 12 ++--
 5 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/pr80315-1.c b/gcc/testsuite/gcc.target/powerpc/pr80315-1.c
index e2db0ff4b5f..f37f1f169a2 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr80315-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr80315-1.c
@@ -10,6 +10,6 @@ main()
   int mask;
 
   /* Argument 2 must be 0 or 1.  Argument 3 must be in range 0..15.  */
-  res = __builtin_crypto_vshasigmaw (test, 1, 0xff); /* { dg-error {argument 3 must be in the range \[0, 15\]} } */
+  res = __builtin_crypto_vshasigmaw (test, 1, 0xff); /* { dg-error {argument 3 must be a 4-bit unsigned literal} } */
   return 0;
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/pr80315-2.c b/gcc/testsuite/gcc.target/powerpc/pr80315-2.c
index 144b705c012..0819a0511b7 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr80315-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr80315-2.c
@@ -10,6 +10,6 @@ main ()
   int mask;
 
   /* Argument 2 must be 0 or 1.  Argument 3 must be in range 0..15.  */
-  res = __builtin_crypto_vshasigmad (test, 1, 0xff); /* { dg-error {argument 3 must be in the range \[0, 15\]} } */
+  res = __builtin_crypto_vshasigmad (test, 1, 0xff); /* { dg-error {argument 3 must be a 4-bit unsigned literal} } */
   return 0;
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/pr80315-3.c b/gcc/testsuite/gcc.target/powerpc/pr80315-3.c
index 99a3e24eadd..cc2e46cf5cb 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr80315-3.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr80315-3.c
@@ -12,6 +12,6 @@ main ()
   int mask;
 
   /* Argument 2 must be 0 or 1.  Argument 3 must be in range 0..15.  */
-  res = vec_shasigma_be (test, 1, 0xff); /* { dg-error {argument 3 must be in the range \[0, 15\]} } */
+  res = vec_shasigma_be (test, 1, 0xff); /* { dg-error {argument 3 must be a 4-bit unsigned literal} } */
   return res;
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/pr80315-4.c b/gcc/testsuite/gcc.target/powerpc/pr80315-4.c
index 7f5f6f75029..ac12910741b 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr80315-4.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr80315-4.c
@@ -12,6 +12,6 @@ main ()
   int mask;
 
   /* Argument 2 must be 0 or 1.  Argument 3 must be in range 0..15.  */
-  res = vec_shasigma_be (test, 1, 0xff); /* { dg-error {argument 3 must be in the range \[0, 15\]} } */
+  res = vec_shasigma_be (test, 1, 0xff); /* { dg-error {argument 3 must be a 4-bit unsigned literal} } */
   return res;
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/pr88100.c b/gcc/testsuite/gcc.target/powerpc/pr88100.c
index 4452145ce95..764c897a497 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr88100.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr88100.c
@@ -10,35 +10,35 @@
 vector unsigned char
 splatu1 (void)
 {
-  return vec_splat_u8(0x100);/* { dg-error "argument 1 must be a 5-bit signed literal" } */
+  return vec_splat_u8(0x100);/* { dg-error "argument 1 must be a literal between -16 and 15, inclusive" } */
 }
 
 vector unsigned short
 splatu2 (void)
 {
-  return vec_splat_u16(0x1);/* { dg-error "argument 1 must be a 5-bit signed literal" } */
+  return vec_splat_u16(0x1);/* { dg-error "argument 1 must be a literal between -16 and 15, inclusive" } */
 }
 
 vector unsigned int
 splatu3 (void)
 {
-  return vec_splat_u32(0x1000);/* { dg-error "argument 1 mu
