Re: [PATCH] rs6000, __builtin_set_fpscr_rn add retrun value
On 6/29/23 4:13 AM, Kewen.Lin wrote: > on 2023/6/19 23:57, Carl Love wrote: >> The following patch changes the return type of the >> __builtin_set_fpscr_rn builtin from void to double. The return value >> is the current value of the various FPSCR fields DRN, VE, OE, UE, ZE, >> XE, NI, RN bit positions when the builtin is called. The builtin then >> updated the RN field with the new value provided as an argument to the >> builtin. The patch adds new testcases to test_fpscr_rn_builtin.c to >> check that the builtin returns the current value of the FPSCR fields >> and then updates the RN field. > > But this patch also introduces one more overloading instance with argument > type double, I had a check on glibc usage of mffscrn/mffscrni, I don't > think it's necessary to add this new instance, as it takes the given > rounding mode as integral type. I agree with Ke Wen, I don't know why there is an extra overloaded instance. I assumed we'd have just the one we had before, except now its return type is double and not void. That said, we need to announce to users that the return type has changed, so new code compiled with a new GCC can get the new return value, but if that new code is compiled with an old GCC (still has void return type), it knows it needs to use some other method to get the FPSCR value, if it needs it. We normally do that with a predefine macro (#define ...) that the user can test for in their code, ala: #ifdef __SET_FPSCR_RN_RETURNS_FPSCR__ /* Or whatever name. */ ret = __builtin_set_fpscr_rn (..); #else __builtin_set_fpscr_rn (..); ret = ; #endif We add those predefined macros in rs6000-c.cc:rs6000_target_modify_macros() and it should be gated via the same conditions that the built-in itself is enabled. Look at my addition of the __TM_FENCE__ macro that let the user know our HTM insn patterns were fixed to now act as memory barriers. Peter
Re: [PATCH] rs6000, __builtin_set_fpscr_rn add retrun value
Hi Carl, on 2023/6/19 23:57, Carl Love wrote: > GCC maintainers: > > > The GLibC team requested a builtin to replace the mffscrn and mffscrni inline > asm instructions in the GLibC code> Previously there was discussion on adding > builtins for the mffscrn instructions. > > https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620261.html > > In the end, it was felt that it would be to extend the existing > __builtin_set_fpscr_rn builtin to return a double instead of a void > type. The desire is that we could have the functionality of the > mffscrn and mffscrni instructions on older ISAs. The two instructions > were initially added in ISA 3.0. The __builtin_set_fpscr_rn has the > needed functionality to set the RN field using the mffscrn and mffscrni > instructions if ISA 3.0 is supported or fall back to using logical > instructions to mask and set the bits for earlier ISAs. The > instructions return the current value of the FPSCR fields DRN, VE, OE, > UE, ZE, XE, NI, RN bit positions then update the RN bit positions with > the new RN value provided. > > The current __builtin_set_fpscr_rn builtin has a return type of void. > So, changing the return type to double and returning the FPSCR fields > DRN, VE, OE, UE, ZE, XE, NI, RN bit positions would then give the > functionally equivalent of the mffscrn and mffscrni instructions. Any > current uses of the builtin would just ignore the return value yet any > new uses could use the return value. So the requirement is for the > change to the __builtin_set_fpscr_rn builtin to be backwardly > compatible and work for all ISAs. > > The following patch changes the return type of the > __builtin_set_fpscr_rn builtin from void to double. The return value > is the current value of the various FPSCR fields DRN, VE, OE, UE, ZE, > XE, NI, RN bit positions when the builtin is called. The builtin then > updated the RN field with the new value provided as an argument to the > builtin. The patch adds new testcases to test_fpscr_rn_builtin.c to > check that the builtin returns the current value of the FPSCR fields > and then updates the RN field. But this patch also introduces one more overloading instance with argument type double, I had a check on glibc usage of mffscrn/mffscrni, I don't think it's necessary to add this new instance, as it takes the given rounding mode as integral type. For examples: #define __fe_mffscrn(rn)\ ({register fenv_union_t __fr; \ if (__builtin_constant_p (rn)) \ __asm__ __volatile__ (\ ".machine push; .machine \"power9\"; mffscrni %0,%1; .machine pop" \ : "=f" (__fr.fenv) : "n" (rn)); \ else\ { \ __fr.l = (rn);\ __asm__ __volatile__ (\ ".machine push; .machine \"power9\"; mffscrn %0,%1; .machine pop" \ : "=f" (__fr.fenv) : "f" (__fr.fenv)); \ } \ __fr.fenv; \ }) /* Same as __fesetround_inline, however without runtime check to use DFP mtfsfi syntax (as relax_fenv_state) or if round value is valid. */ static inline void __fesetround_inline_nocheck (const int round) { #ifdef _ARCH_PWR9 __fe_mffscrn (round); #else if (__glibc_likely (GLRO(dl_hwcap2) & PPC_FEATURE2_ARCH_3_00)) __fe_mffscrn (round); else asm volatile ("mtfsfi 7,%0" : : "n" (round)); #endif } So could you just extend return type (from void to double) but without one more overloading instance? Without overloading, we can still use the original bif instance SET_FPSCR_RN and its correpsonding expander rs6000_set_fpscr_rn, just add some more handlings to fetch bits for return value. It would be simpler IMHO. > > The GLibC team has reviewed the patch to make sure it met their needs > as a drop in replacement for the inline asm mffscr and mffscrni > statements in the GLibC code. T > > The patch has been tested on Power 8 LE/BE, Power 9 LE/BE and Power 10 > LE. > > Please let me know if the patch is acceptable for mainline. Thanks. > >Carl > > > rs6000, __builtin_set_fpscr_rn add retrun value > > Change the return value from void to double. The return value consists of > the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, RN bit positions. Add an > overloaded version which accepts a double argument. > > The test powerpc/test_fpscr_rn_builtin.c is updated to add tests for the > double reterun value and the new double argument. > > gcc/ChangeLog:
[PATCH] rs6000, __builtin_set_fpscr_rn add retrun value
GCC maintainers: The GLibC team requested a builtin to replace the mffscrn and mffscrniinline asm instructions in the GLibC code. Previously there was discussion on adding builtins for the mffscrn instructions. https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620261.html In the end, it was felt that it would be to extend the existing __builtin_set_fpscr_rn builtin to return a double instead of a void type. The desire is that we could have the functionality of the mffscrn and mffscrni instructions on older ISAs. The two instructions were initially added in ISA 3.0. The __builtin_set_fpscr_rn has the needed functionality to set the RN field using the mffscrn and mffscrni instructions if ISA 3.0 is supported or fall back to using logical instructions to mask and set the bits for earlier ISAs. The instructions return the current value of the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, RN bit positions then update the RN bit positions with the new RN value provided. The current __builtin_set_fpscr_rn builtin has a return type of void. So, changing the return type to double and returning the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, RN bit positions would then give the functionally equivalent of the mffscrn and mffscrni instructions. Any current uses of the builtin would just ignore the return value yet any new uses could use the return value. So the requirement is for the change to the __builtin_set_fpscr_rn builtin to be backwardly compatible and work for all ISAs. The following patch changes the return type of the __builtin_set_fpscr_rn builtin from void to double. The return value is the current value of the various FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, RN bit positions when the builtin is called. The builtin then updated the RN field with the new value provided as an argument to the builtin. The patch adds new testcases to test_fpscr_rn_builtin.c to check that the builtin returns the current value of the FPSCR fields and then updates the RN field. The GLibC team has reviewed the patch to make sure it met their needs as a drop in replacement for the inline asm mffscr and mffscrni statements in the GLibC code. T The patch has been tested on Power 8 LE/BE, Power 9 LE/BE and Power 10 LE. Please let me know if the patch is acceptable for mainline. Thanks. Carl rs6000, __builtin_set_fpscr_rn add retrun value Change the return value from void to double. The return value consists of the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, RN bit positions. Add an overloaded version which accepts a double argument. The test powerpc/test_fpscr_rn_builtin.c is updated to add tests for the double reterun value and the new double argument. gcc/ChangeLog: * config/rs6000/rs6000-builtins.def (__builtin_set_fpscr_rn): Delete. (__builtin_set_fpscr_rn_i): New builtin definition. (__builtin_set_fpscr_rn_d): New builtin definition. * config/rs6000/rs6000-overload.def (__builtin_set_fpscr_rn): New overloaded definition. * config/rs6000/rs6000.md ((rs6000_get_fpscr_fields): New define_expand. (rs6000_update_fpscr_rn_field): New define_expand. (rs6000_set_fpscr_rn_d): New define expand. (rs6000_set_fpscr_rn_i): Renamed from rs6000_set_fpscr_rn, Added return argument. Updated to use new rs6000_get_fpscr_fields and rs6000_update_fpscr_rn_field define _expands. * doc/extend.texi (__builtin_set_fpscr_rn): Update description for the return value and new double argument. gcc/testsuite/ChangeLog: gcc.target/powerpc/test_fpscr_rn_builtin.c: Add new tests th check double return value. Add tests for overloaded double argument. re --- gcc/config/rs6000/rs6000-builtins.def | 7 +- gcc/config/rs6000/rs6000-overload.def | 6 + gcc/config/rs6000/rs6000.md | 122 --- gcc/doc/extend.texi | 25 ++- .../powerpc/test_fpscr_rn_builtin.c | 143 +- 5 files changed, 262 insertions(+), 41 deletions(-) diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def index 289a37998b1..30e0b0bb06d 100644 --- a/gcc/config/rs6000/rs6000-builtins.def +++ b/gcc/config/rs6000/rs6000-builtins.def @@ -237,8 +237,11 @@ const __ibm128 __builtin_pack_ibm128 (double, double); PACK_IF packif {ibm128} - void __builtin_set_fpscr_rn (const int[0,3]); -SET_FPSCR_RN rs6000_set_fpscr_rn {nosoft} + double __builtin_set_fpscr_rn_i (const int[0,3]); +SET_FPSCR_RN_I rs6000_set_fpscr_rn_i {nosoft} + + double __builtin_set_fpscr_rn_d (double); +SET_FPSCR_RN_D rs6000_set_fpscr_rn_d {nosoft} const double __builtin_unpack_ibm128 (__ibm128, const int<1>); UNPACK_IF unpackif {ibm128} diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def index