Re: [PATCH V4] Rework 128-bit complex multiply and divide.

2023-03-20 Thread Segher Boessenkool
On Mon, Mar 20, 2023 at 01:43:41PM -0400, Michael Meissner wrote:
> On Fri, Mar 17, 2023 at 02:35:16PM -0500, Segher Boessenkool wrote:
> > On Thu, Mar 09, 2023 at 08:40:36PM -0500, Michael Meissner wrote:
> > /* { dg-final { scan-assembler {\m__divtc3\M} } } */
> > 
> > It might well be that we can use a sloppier regexp here, but why would
> > we do that?  It is a good thing to use the \m and \M constraint escapes
> > pretty much always.
> 
> The last time I posted the patch, you said:
> 
> | > +/* { dg-final { scan-assembler "bl __divtc3" } } */
> |
> | This name depends on what object format and ABI is in use (some have an
> | extra leading underscore, or a dot, or whatever).
> 
> So the patch was an attempt to match the other cases.

I also said you could do {\mbl .?__divtc3\M} or similar, in cases where
you need to consider multiple ABIs.  Just searching for a substring is
very suboptimal always.

> > Does this need backports?
> 
> I think we will need backports for GCC 12.  The issue exists in GCC 11, but I
> don't think that GCC 11 can really work on systems with IEEE long double, 
> since
> a lot of the stuff to really finish up the support was not in GCC 11.  I think
> I tried dropping the patch into GCC 12, and it looks like something else may 
> be
> needed.  I will look into it.

Okay, thanks.  We'll see.


Segher


Re: [PATCH V4] Rework 128-bit complex multiply and divide.

2023-03-20 Thread Michael Meissner via Gcc-patches
On Mon, Mar 20, 2023 at 01:43:41PM -0400, Michael Meissner wrote:
> I think we will need backports for GCC 12.  The issue exists in GCC 11, but I
> don't think that GCC 11 can really work on systems with IEEE long double, 
> since
> a lot of the stuff to really finish up the support was not in GCC 11.  I think
> I tried dropping the patch into GCC 12, and it looks like something else may 
> be
> needed.  I will look into it.

The current patch applies to GCC 12 without changes, and it does fix the
problem.

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Re: [PATCH V4] Rework 128-bit complex multiply and divide.

2023-03-20 Thread Michael Meissner via Gcc-patches
On Fri, Mar 17, 2023 at 02:35:16PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Mar 09, 2023 at 08:40:36PM -0500, Michael Meissner wrote:
> > PR target/109067
> > * config/rs6000/rs6000.cc (create_complex_muldiv): Delete.
> > (init_float128_ieee): Delete code to switch complex multiply and divide
> > for long double.
> > (complex_multiply_builtin_code): New helper function.
> > (complex_divide_builtin_code): Likewise.
> > (rs6000_mangle_decl_assembler_name): Add support for mangling the name
> > of complex 128-bit multiply and divide built-in functions.
> 
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/divic3-1.c
> > +/* { dg-final { scan-assembler "__divtc3" } } */
> 
> /* { dg-final { scan-assembler {\m__divtc3\M} } } */
> 
> It might well be that we can use a sloppier regexp here, but why would
> we do that?  It is a good thing to use the \m and \M constraint escapes
> pretty much always.

The last time I posted the patch, you said:

| > +/* { dg-final { scan-assembler "bl __divtc3" } } */
|
| This name depends on what object format and ABI is in use (some have an
| extra leading underscore, or a dot, or whatever).

So the patch was an attempt to match the other cases.

> Similar for the other three testcases of course.
> 
> This patch is okay for trunk, if you have tested it on all
> configurations (powerpc-linux, powerpc64-linux, powerpc64le-linux with
> and without default IEEE128 long double at least).  Thank you!
> 
> Does this need backports?

I think we will need backports for GCC 12.  The issue exists in GCC 11, but I
don't think that GCC 11 can really work on systems with IEEE long double, since
a lot of the stuff to really finish up the support was not in GCC 11.  I think
I tried dropping the patch into GCC 12, and it looks like something else may be
needed.  I will look into it.

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Re: [PATCH V4] Rework 128-bit complex multiply and divide.

2023-03-17 Thread Segher Boessenkool
Hi!

On Thu, Mar 09, 2023 at 08:40:36PM -0500, Michael Meissner wrote:
>   PR target/109067
>   * config/rs6000/rs6000.cc (create_complex_muldiv): Delete.
>   (init_float128_ieee): Delete code to switch complex multiply and divide
>   for long double.
>   (complex_multiply_builtin_code): New helper function.
>   (complex_divide_builtin_code): Likewise.
>   (rs6000_mangle_decl_assembler_name): Add support for mangling the name
>   of complex 128-bit multiply and divide built-in functions.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/divic3-1.c
> +/* { dg-final { scan-assembler "__divtc3" } } */

/* { dg-final { scan-assembler {\m__divtc3\M} } } */

It might well be that we can use a sloppier regexp here, but why would
we do that?  It is a good thing to use the \m and \M constraint escapes
pretty much always.

Similar for the other three testcases of course.

This patch is okay for trunk, if you have tested it on all
configurations (powerpc-linux, powerpc64-linux, powerpc64le-linux with
and without default IEEE128 long double at least).  Thank you!

Does this need backports?


Segher


[PATCH V4] Rework 128-bit complex multiply and divide.

2023-03-09 Thread Michael Meissner via Gcc-patches
This patch reworks how the complex multiply and divide built-in functions are
done.  Previously GCC created built-in declarations for doing long double 
complex
multiply and divide when long double is IEEE 128-bit.  However, it did not
support __ibm128 complex multiply and divide if long double is IEEE 128-bit.

This code does not create the built-in declaration with the changed name.
Instead, it uses the TARGET_MANGLE_DECL_ASSEMBLER_NAME hook to change the name
before it is written out to the assembler file like it now does for all of the
other long double built-in functions.

Originally, the patch was part of a larger patch set and the comments reflected
this.  I have removed the comments referring to the other patches.  While this
patch was originally developed as part of those other patches, it is a stand
alone patch.

I have tried to take the comments in the last patch review in this patch.
Note, I will be away from the computer from March 10 through the 13th.  So I
would not be checking in the patches until I get back.  But I thought I would
share the results of the changes that were asked for.

I fixed the complex_multiply_builtin_code and complex_divide_builtin_code
functions to have an assert tht the mode is within the proper modes.  I have
tried to make the code a little bit clearer.

I have cleaned up the tests to eliminate the target powerpc in the tests.  I
have elimited the -mpower8-vector option.  I have changed the scan assembler
lines jut to look for __divtc3 or __multc3, and not depend on the format of the
'bl' call to those functions.  I have kept the -Wno-psabi option, because this
is needed to prevent spurious errors on systems with older libraries (like big
endian) that don't have IEEE 128-bit support.

2023-03-09   Michael Meissner  

gcc/

PR target/109067
* config/rs6000/rs6000.cc (create_complex_muldiv): Delete.
(init_float128_ieee): Delete code to switch complex multiply and divide
for long double.
(complex_multiply_builtin_code): New helper function.
(complex_divide_builtin_code): Likewise.
(rs6000_mangle_decl_assembler_name): Add support for mangling the name
of complex 128-bit multiply and divide built-in functions.

gcc/testsuite/

PR target/109067
* gcc.target/powerpc/divic3-1.c: New test.
* gcc.target/powerpc/divic3-2.c: Likewise.
* gcc.target/powerpc/mulic3-1.c: Likewise.
* gcc.target/powerpc/mulic3-2.c: Likewise.
---
 gcc/config/rs6000/rs6000.cc | 111 +++-
 gcc/testsuite/gcc.target/powerpc/divic3-1.c |  21 
 gcc/testsuite/gcc.target/powerpc/divic3-2.c |  25 +
 gcc/testsuite/gcc.target/powerpc/mulic3-1.c |  21 
 gcc/testsuite/gcc.target/powerpc/mulic3-2.c |  25 +
 5 files changed, 156 insertions(+), 47 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/divic3-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/divic3-2.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/mulic3-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/mulic3-2.c

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 8e0b0d022db..fa5f93a874f 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -11154,26 +11154,6 @@ init_float128_ibm (machine_mode mode)
 }
 }
 
-/* Create a decl for either complex long double multiply or complex long double
-   divide when long double is IEEE 128-bit floating point.  We can't use
-   __multc3 and __divtc3 because the original long double using IBM extended
-   double used those names.  The complex multiply/divide functions are encoded
-   as builtin functions with a complex result and 4 scalar inputs.  */
-
-static void
-create_complex_muldiv (const char *name, built_in_function fncode, tree fntype)
-{
-  tree fndecl = add_builtin_function (name, fntype, fncode, BUILT_IN_NORMAL,
- name, NULL_TREE);
-
-  set_builtin_decl (fncode, fndecl, true);
-
-  if (TARGET_DEBUG_BUILTIN)
-fprintf (stderr, "create complex %s, fncode: %d\n", name, (int) fncode);
-
-  return;
-}
-
 /* Set up IEEE 128-bit floating point routines.  Use different names if the
arguments can be passed in a vector register.  The historical PowerPC
implementation of IEEE 128-bit floating point used _q_ for the names, so
@@ -11185,32 +11165,6 @@ init_float128_ieee (machine_mode mode)
 {
   if (FLOAT128_VECTOR_P (mode))
 {
-  static bool complex_muldiv_init_p = false;
-
-  /* Set up to call __mulkc3 and __divkc3 under -mabi=ieeelongdouble.  If
-we have clone or target attributes, this will be called a second
-time.  We want to create the built-in function only once.  */
- if (mode == TFmode && TARGET_IEEEQUAD && !complex_muldiv_init_p)
-   {
-complex_muldiv_init_p = true;
-built_in_function fncode_mul =
-  (built_in_function) (BUILT_IN_COMPLEX_MUL_MIN + TCmode
-