Re: [PATCH] rs6000, __builtin_set_fpscr_rn add retrun value

2023-06-29 Thread Peter Bergner via Gcc-patches
On 6/29/23 4:13 AM, Kewen.Lin wrote:
> on 2023/6/19 23:57, Carl Love wrote:
>> The following patch changes the return type of the
>>  __builtin_set_fpscr_rn builtin from void to double.  The return value
>> is the current value of the various FPSCR fields DRN, VE, OE, UE, ZE,
>> XE, NI, RN bit positions when the builtin is called.  The builtin then
>> updated the RN field with the new value provided as an argument to the
>> builtin.  The patch adds new testcases to test_fpscr_rn_builtin.c to
>> check that the builtin returns the current value of the FPSCR fields
>> and then updates the RN field.
> 
> But this patch also introduces one more overloading instance with argument
> type double, I had a check on glibc usage of mffscrn/mffscrni, I don't
> think it's necessary to add this new instance, as it takes the given
> rounding mode as integral type.

I agree with Ke Wen, I don't know why there is an extra overloaded
instance.  I assumed we'd have just the one we had before, except now
its return type is double and not void.

That said, we need to announce to users that the return type has
changed, so new code compiled with a new GCC can get the new return
value, but if that new code is compiled with an old GCC (still has void
return type), it knows it needs to use some other method to get the
FPSCR value, if it needs it.  We normally do that with a predefine
macro (#define ...) that the user can test for in their code, ala:

#ifdef __SET_FPSCR_RN_RETURNS_FPSCR__  /* Or whatever name.  */
  ret = __builtin_set_fpscr_rn (..);
#else
  __builtin_set_fpscr_rn (..);
  ret = ;
#endif

We add those predefined macros in rs6000-c.cc:rs6000_target_modify_macros()
and it should be gated via the same conditions that the built-in itself
is enabled.  Look at my addition of the __TM_FENCE__ macro that let the
user know our HTM insn patterns were fixed to now act as memory barriers.


Peter



Re: [PATCH] rs6000, __builtin_set_fpscr_rn add retrun value

2023-06-29 Thread Kewen.Lin via Gcc-patches
Hi Carl,

on 2023/6/19 23:57, Carl Love wrote:
> GCC maintainers:
> 
> 
> The GLibC team requested a builtin to replace the mffscrn and mffscrni inline 
> asm instructions in the GLibC code> Previously there was discussion on adding 
> builtins for the mffscrn instructions.
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620261.html
> 
> In the end, it was felt that it would be to extend the existing
> __builtin_set_fpscr_rn builtin to return a double instead of a void
> type.  The desire is that we could have the functionality of the
> mffscrn and mffscrni instructions on older ISAs.  The two instructions
> were initially added in ISA 3.0.  The __builtin_set_fpscr_rn has the
> needed functionality to set the RN field using the mffscrn and mffscrni
> instructions if ISA 3.0 is supported or fall back to using logical
> instructions to mask and set the bits for earlier ISAs.  The
> instructions return the current value of the FPSCR fields DRN, VE, OE,
> UE, ZE, XE, NI, RN bit positions then update the RN bit positions with
> the new RN value provided.
> 
> The current __builtin_set_fpscr_rn builtin has a return type of void. 
> So, changing the return type to double and returning the  FPSCR fields
> DRN, VE, OE, UE, ZE, XE, NI, RN bit positions would then give the
> functionally equivalent of the mffscrn and mffscrni instructions.  Any
> current uses of the builtin would just ignore the return value yet any
> new uses could use the return value.  So the requirement is for the
> change to the __builtin_set_fpscr_rn builtin to be backwardly
> compatible and work for all ISAs.
> 
> The following patch changes the return type of the
>  __builtin_set_fpscr_rn builtin from void to double.  The return value
> is the current value of the various FPSCR fields DRN, VE, OE, UE, ZE,
> XE, NI, RN bit positions when the builtin is called.  The builtin then
> updated the RN field with the new value provided as an argument to the
> builtin.  The patch adds new testcases to test_fpscr_rn_builtin.c to
> check that the builtin returns the current value of the FPSCR fields
> and then updates the RN field.

But this patch also introduces one more overloading instance with argument
type double, I had a check on glibc usage of mffscrn/mffscrni, I don't
think it's necessary to add this new instance, as it takes the given
rounding mode as integral type.

For examples:

#define __fe_mffscrn(rn)\
  ({register fenv_union_t __fr; \
if (__builtin_constant_p (rn))  \
  __asm__ __volatile__ (\
".machine push; .machine \"power9\"; mffscrni %0,%1; .machine pop" \
: "=f" (__fr.fenv) : "n" (rn)); \
else\
{   \
  __fr.l = (rn);\
  __asm__ __volatile__ (\
".machine push; .machine \"power9\"; mffscrn %0,%1; .machine pop" \
: "=f" (__fr.fenv) : "f" (__fr.fenv));  \
}   \
__fr.fenv;  \
  })


/* Same as __fesetround_inline, however without runtime check to use DFP
   mtfsfi syntax (as relax_fenv_state) or if round value is valid.  */
static inline void
__fesetround_inline_nocheck (const int round)
{
#ifdef _ARCH_PWR9
  __fe_mffscrn (round);
#else
  if (__glibc_likely (GLRO(dl_hwcap2) & PPC_FEATURE2_ARCH_3_00))
__fe_mffscrn (round);
  else
asm volatile ("mtfsfi 7,%0" : : "n" (round));
#endif
}

So could you just extend return type (from void to double) but without one
more overloading instance?

Without overloading, we can still use the original bif instance SET_FPSCR_RN
and its correpsonding expander rs6000_set_fpscr_rn, just add some more
handlings to fetch bits for return value.  It would be simpler IMHO.

> 
> The GLibC team has reviewed the patch to make sure it met their needs
> as a drop in replacement for the inline asm mffscr and mffscrni
> statements in the GLibC code.  T
> 
> The patch has been tested on Power 8 LE/BE, Power 9 LE/BE and Power 10
> LE.
> 
> Please let me know if the patch is acceptable for mainline.  Thanks.
> 
>Carl 
> 
> 
> rs6000, __builtin_set_fpscr_rn add retrun value
> 
> Change the return value from void to double.  The return value consists of
> the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, RN bit positions.  Add an
> overloaded version which accepts a double argument.
> 
> The test powerpc/test_fpscr_rn_builtin.c is updated to add tests for the
> double reterun value and the new double argument.
> 
> gcc/ChangeLog:

[PATCH] rs6000, __builtin_set_fpscr_rn add retrun value

2023-06-19 Thread Carl Love via Gcc-patches
GCC maintainers:


The GLibC team requested a builtin to replace the mffscrn and mffscrniinline 
asm instructions in the GLibC code.  Previously there was discussion on adding 
builtins for the mffscrn instructions.

https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620261.html

In the end, it was felt that it would be to extend the existing
__builtin_set_fpscr_rn builtin to return a double instead of a void
type.  The desire is that we could have the functionality of the
mffscrn and mffscrni instructions on older ISAs.  The two instructions
were initially added in ISA 3.0.  The __builtin_set_fpscr_rn has the
needed functionality to set the RN field using the mffscrn and mffscrni
instructions if ISA 3.0 is supported or fall back to using logical
instructions to mask and set the bits for earlier ISAs.  The
instructions return the current value of the FPSCR fields DRN, VE, OE,
UE, ZE, XE, NI, RN bit positions then update the RN bit positions with
the new RN value provided.

The current __builtin_set_fpscr_rn builtin has a return type of void. 
So, changing the return type to double and returning the  FPSCR fields
DRN, VE, OE, UE, ZE, XE, NI, RN bit positions would then give the
functionally equivalent of the mffscrn and mffscrni instructions.  Any
current uses of the builtin would just ignore the return value yet any
new uses could use the return value.  So the requirement is for the
change to the __builtin_set_fpscr_rn builtin to be backwardly
compatible and work for all ISAs.

The following patch changes the return type of the
 __builtin_set_fpscr_rn builtin from void to double.  The return value
is the current value of the various FPSCR fields DRN, VE, OE, UE, ZE,
XE, NI, RN bit positions when the builtin is called.  The builtin then
updated the RN field with the new value provided as an argument to the
builtin.  The patch adds new testcases to test_fpscr_rn_builtin.c to
check that the builtin returns the current value of the FPSCR fields
and then updates the RN field.

The GLibC team has reviewed the patch to make sure it met their needs
as a drop in replacement for the inline asm mffscr and mffscrni
statements in the GLibC code.  T

The patch has been tested on Power 8 LE/BE, Power 9 LE/BE and Power 10
LE.

Please let me know if the patch is acceptable for mainline.  Thanks.

   Carl 


rs6000, __builtin_set_fpscr_rn add retrun value

Change the return value from void to double.  The return value consists of
the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, RN bit positions.  Add an
overloaded version which accepts a double argument.

The test powerpc/test_fpscr_rn_builtin.c is updated to add tests for the
double reterun value and the new double argument.

gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_set_fpscr_rn): Delete.
(__builtin_set_fpscr_rn_i): New builtin definition.
(__builtin_set_fpscr_rn_d): New builtin definition.
* config/rs6000/rs6000-overload.def (__builtin_set_fpscr_rn): New
overloaded definition.
* config/rs6000/rs6000.md ((rs6000_get_fpscr_fields): New
define_expand.
(rs6000_update_fpscr_rn_field): New define_expand.
(rs6000_set_fpscr_rn_d): New define expand.
(rs6000_set_fpscr_rn_i): Renamed from rs6000_set_fpscr_rn, Added
return argument.  Updated to use new rs6000_get_fpscr_fields and
rs6000_update_fpscr_rn_field define _expands.
* doc/extend.texi (__builtin_set_fpscr_rn): Update description for
the return value and new double argument.

gcc/testsuite/ChangeLog:
gcc.target/powerpc/test_fpscr_rn_builtin.c: Add new tests th check
double return value.  Add tests for overloaded double argument.
re
---
 gcc/config/rs6000/rs6000-builtins.def |   7 +-
 gcc/config/rs6000/rs6000-overload.def |   6 +
 gcc/config/rs6000/rs6000.md   | 122 ---
 gcc/doc/extend.texi   |  25 ++-
 .../powerpc/test_fpscr_rn_builtin.c   | 143 +-
 5 files changed, 262 insertions(+), 41 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def 
b/gcc/config/rs6000/rs6000-builtins.def
index 289a37998b1..30e0b0bb06d 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -237,8 +237,11 @@
   const __ibm128 __builtin_pack_ibm128 (double, double);
 PACK_IF packif {ibm128}
 
-  void __builtin_set_fpscr_rn (const int[0,3]);
-SET_FPSCR_RN rs6000_set_fpscr_rn {nosoft}
+  double __builtin_set_fpscr_rn_i (const int[0,3]);
+SET_FPSCR_RN_I rs6000_set_fpscr_rn_i {nosoft}
+
+  double __builtin_set_fpscr_rn_d (double);
+SET_FPSCR_RN_D rs6000_set_fpscr_rn_d {nosoft}
 
   const double __builtin_unpack_ibm128 (__ibm128, const int<1>);
 UNPACK_IF unpackif {ibm128}
diff --git a/gcc/config/rs6000/rs6000-overload.def 
b/gcc/config/rs6000/rs6000-overload.def
index