Re: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.
On Thu, Aug 27, 2020 at 08:43:40AM -0700, Carl Love wrote: > 2020-08-26 Carl Love > * config/rs6000/rs6000-builtin.def: (BU_P10V_VSX_1) New builtin macro > expansion. > (XVCVBF16SPN, XVCVSPBF16): Replace macro expansion BU_VSX_1 with > BU_P10V_VSX_1. > * config/rs6000/rs6000-call.c: (VSX_BUILTIN_XVCVSPBF16, > VSX_BUILTIN_XVCVBF16SPN): > Replace with P10V_BUILTIN_XVCVSPBF16, P10V_BUILTIN_XVCVBF16SPN > respectively. (lines too long) This patch is fine for GCC 10. Thanks! As an aside: > -case VSX_BUILTIN_XVCVSPBF16: > -case VSX_BUILTIN_XVCVBF16SPN: > +case P10V_BUILTIN_XVCVSPBF16: > +case P10V_BUILTIN_XVCVBF16SPN: Having "P10V" in the name here doesn't really help anything; in fact, it could just be BUILTIN_XVCBCVXBCXBNVC? Simpler names like that will improve readability as well. But that is all for the future :-) Segher
Re: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.
GCC maintainers: The following patch has been updated based on the comments from Will and Segher. The patch is a subset of the mainline commit: commit 07d456bb80a16405723c98c2ab74ccc2a5a23898 Author: Carl Love Date: Mon Aug 10 19:37:41 2020 -0500 rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions. Only the changes from the mainline patch to restrict the bfloat convert intrinsics (XVCVBF16SPN and XVCVSPBF16) to Power 10 are included in this patch. This patch adds the BU_P10V_VSX_1 macro definition for Power 10. The macro definition restricts the use of the named intrinsics similarly to Power 8 and Power 9. The changes for the BU_10V macro definitions from the mainline patch do not apply to the GCC 10 branch as the macro definitions and uses do not exist in the GCC 10 branch. The patch has been updated per the comments above. It was rebased onto the latest GCC 10 code base and retested on powerpc64le-unknown-linux-gnu (Power 9 LE) with no regressions. Please let me know if this patch is acceptable for the GCC 10 branch. Thank you. Carl Love rs6000, restrict bfloat convert intrinsic to Power 10. gcc/ChangeLog 2020-08-26 Carl Love * config/rs6000/rs6000-builtin.def: (BU_P10V_VSX_1) New builtin macro expansion. (XVCVBF16SPN, XVCVSPBF16): Replace macro expansion BU_VSX_1 with BU_P10V_VSX_1. * config/rs6000/rs6000-call.c: (VSX_BUILTIN_XVCVSPBF16, VSX_BUILTIN_XVCVBF16SPN): Replace with P10V_BUILTIN_XVCVSPBF16, P10V_BUILTIN_XVCVBF16SPN respectively. --- gcc/config/rs6000/rs6000-builtin.def | 12 ++-- gcc/config/rs6000/rs6000-call.c | 4 ++-- 2 files changed, 12 insertions(+), 4 deletions(-) diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def index 88f78cb0a15..5de17e79855 100644 --- a/gcc/config/rs6000/rs6000-builtin.def +++ b/gcc/config/rs6000/rs6000-builtin.def @@ -1014,6 +1014,14 @@ | RS6000_BTC_BINARY), \ CODE_FOR_ ## ICODE) /* ICODE */ +/* Built-ins for ISA 3.1 Altivec instructions. */ +#define BU_P10V_VSX_1(ENUM, NAME, ATTR, ICODE)\ + RS6000_BUILTIN_1 (P10V_BUILTIN_ ## ENUM, /* ENUM */ \ + "__builtin_vsx_" NAME, /* NAME */ \ + RS6000_BTM_P10, /* MASK */ \ + (RS6000_BTC_ ## ATTR/* ATTR */ \ + | RS6000_BTC_UNARY),\ + CODE_FOR_ ## ICODE) /* ICODE */ #endif @@ -2698,8 +2706,8 @@ BU_SPECIAL_X (RS6000_BUILTIN_CFSTRING, "__builtin_cfstring", RS6000_BTM_ALWAYS, RS6000_BTC_MISC) /* POWER10 MMA builtins. */ -BU_VSX_1 (XVCVBF16SPN, "xvcvbf16spn", MISC, vsx_xvcvbf16spn) -BU_VSX_1 (XVCVSPBF16, "xvcvspbf16", MISC, vsx_xvcvspbf16) +BU_P10V_VSX_1 (XVCVBF16SPN,"xvcvbf16spn", MISC, vsx_xvcvbf16spn) +BU_P10V_VSX_1 (XVCVSPBF16, "xvcvspbf16", MISC, vsx_xvcvspbf16) BU_MMA_1 (XXMFACC, "xxmfacc", QUAD, mma_xxmfacc) BU_MMA_1 (XXMTACC, "xxmtacc", QUAD, mma_xxmtacc) diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c index 2cf78dfa5fe..fc1671e1bb2 100644 --- a/gcc/config/rs6000/rs6000-call.c +++ b/gcc/config/rs6000/rs6000-call.c @@ -13383,8 +13383,8 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0, case P8V_BUILTIN_VGBBD: case MISC_BUILTIN_CDTBCD: case MISC_BUILTIN_CBCDTD: -case VSX_BUILTIN_XVCVSPBF16: -case VSX_BUILTIN_XVCVBF16SPN: +case P10V_BUILTIN_XVCVSPBF16: +case P10V_BUILTIN_XVCVBF16SPN: h.uns_p[0] = 1; h.uns_p[1] = 1; break; -- 2.17.1
Re: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.
Hi! (All that Will says included by reference ;-) ) On Mon, Aug 24, 2020 at 02:39:41PM -0700, Carl Love wrote: > I attempted to do the backport however the patch doesn't even come > close to applying. The names XVCVBF16SPN and XVCVSPBF16 are the only > two builtin names that exist in GCC 10. The other issue is there is no > Power 10 builtin macro definitions in GCC 10. So basically, I can fix > the issue with XVCVBF16SPN and XVCVSPBF16 to be restricted to Power 10 > by adding the needed Power 10 macro definition. Okay, that is what we should do I agree. > This is a whole new patch so I figure it needs to be reviewed to make > sure we want to make this change to GCC 10. I did run the regression > tests again using a Power 9 machine to verify it complies and there are > no regression test failures. > > Please let me know if this is OK for the GCC 10 tree. Thanks. The patch looks fine. It now is impossible to write a correct changelog for a backport like this, so I won't review that part. Please make it clear that this is a partial backport in the commit message (and commit of what ofc). Okay for trunk with that. Thanks! Segher
Re: [EXTERNAL] Re: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.
On Mon, 2020-08-24 at 14:39 -0700, Carl Love wrote: > Segher: > > On Wed, 2020-08-19 at 15:16 -0500, Segher Boessenkool wrote: > > On Wed, Aug 19, 2020 at 02:19:12PM -0500, Peter Bergner wrote: > > > On 8/14/20 7:42 PM, Segher Boessenkool wrote: > > > > I think your current code is fine; I hadn't considered Bill's > > > > upcoming > > > > rewrite. It is more important to make that go smoother than to > > > > fix some > > > > aesthetics right now. > > > > > > > > Please check if the names for the builtins match the (future) > > > > documentation, and then, approved for trunk. Thank you! > > > > > > This is also a bug in GCC 10, so this should be backported too. > > > > > > Segher, is this ok for Carl to backport to GCC 10 after it has sat > > > on > > > trunk for a few days? > > > > Yes. Thanks guys. Hi, > > I attempted to do the backport however the patch doesn't even come > close to applying. The names XVCVBF16SPN and XVCVSPBF16 are the only > two builtin names that exist in GCC 10. The other issue is there is no > Power 10 builtin macro definitions in GCC 10. So basically, I can fix > the issue with XVCVBF16SPN and XVCVSPBF16 to be restricted to Power 10 > by adding the needed Power 10 macro definition. > > This is a whole new patch so I figure it needs to be reviewed to make > sure we want to make this change to GCC 10. I did run the regression > tests again using a Power 9 machine to verify it complies and there are > no regression test failures. > I recommend a pure and clean description of what the patch is doing in a paragraph, separate from the history. " This patch corrects the built-in definitions for cvcvspbf16 and xvcvbf16spn to restrict their use to P10 and beyond in a way that is consistent with the P8 and P9 builtin naming conventions. This is a subset of changes made to trunk ... " > Please let me know if this is OK for the GCC 10 tree. Thanks. > > Carl Love > > > [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Back port to > GCC 10. > > gcc/ChangeLog > > 2020-08-24 Carl Love whitespace/indentation > * config/rs6000/rs6000-builtin.def: (BU_P10V_VSX_1) New builtin macro > expansion. > (XVCVBF16SPN, XVCVSPBF16): Replace macro expansion BU_VSX_1 with > BU_P10V_VSX_1. OK as long as it's clear 'replace' means the define being used, versus the macro expansions themselves being replaced. > * config/rs6000/rs6000-call.c: (VSX_BUILTIN_XVCVSPBF16, > VSX_BUILTIN_XVCVBF16SPN): > Replace with P10V_BUILTIN_XVCVSPBF16, P10V_BUILTIN_XVCVBF16SPN > respectively. s/Replace with/Rename to/ ? > --- > gcc/config/rs6000/rs6000-builtin.def | 14 -- > gcc/config/rs6000/rs6000-call.c | 4 ++-- > 2 files changed, 14 insertions(+), 4 deletions(-) > > diff --git a/gcc/config/rs6000/rs6000-builtin.def > b/gcc/config/rs6000/rs6000-builtin.def > index 88f78cb0a15..512b7629a46 100644 > --- a/gcc/config/rs6000/rs6000-builtin.def > +++ b/gcc/config/rs6000/rs6000-builtin.def > @@ -1014,6 +1014,16 @@ >| RS6000_BTC_BINARY), \ > CODE_FOR_ ## ICODE) /* ICODE */ > > +/* For builtins for power10 vector instructions that are encoded as altivec > + instructions, use __builtin_altivec_ as the builtin name. */ That comment doesn't seem to apply to this change, update or drop ? > + > +#define BU_P10V_VSX_1(ENUM, NAME, ATTR, ICODE)\ > + RS6000_BUILTIN_1 (P10_BUILTIN_ ## ENUM,/* ENUM */ \ > + "__builtin_vsx_" NAME, /* NAME */ \ > + RS6000_BTM_P10, /* MASK */ \ > + (RS6000_BTC_ ## ATTR/* ATTR */ \ > + | RS6000_BTC_UNARY),\ > + CODE_FOR_ ## ICODE) /* ICODE */ > #endif > > > @@ -2698,8 +2708,8 @@ BU_SPECIAL_X (RS6000_BUILTIN_CFSTRING, > "__builtin_cfstring", RS6000_BTM_ALWAYS, > RS6000_BTC_MISC) > > /* POWER10 MMA builtins. */ > -BU_VSX_1 (XVCVBF16SPN, "xvcvbf16spn", MISC, vsx_xvcvbf16spn) > -BU_VSX_1 (XVCVSPBF16,"xvcvspbf16", MISC, vsx_xvcvspbf16) > +BU_P10V_VSX_1 (XVCVBF16SPN, "xvcvbf16spn", MISC, vsx_xvcvbf16spn) > +BU_P10V_VSX_1 (XVCVSPBF16, "xvcvspbf16", MISC, vsx_xvcvspbf16) > > BU_MMA_1 (XXMFACC, "xxmfacc", QUAD, mma_xxmfacc) > BU_MMA_1 (XXMTACC, "xxmtacc", QUAD, mma_xxmtacc) > diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c > index 2cf78dfa5fe..fc1671e1bb2 100644 > --- a/gcc/config/rs6000/rs6000-call.c > +++ b/gcc/config/rs6000/rs6000-call.c > @@ -13383,8 +13383,8 @@ builtin_function_type (machine_mode mode_ret, > machine_mode mode_arg0, > case P8V_BUILTIN_VGBBD: > case
RE: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.
Segher: On Wed, 2020-08-19 at 15:16 -0500, Segher Boessenkool wrote: > On Wed, Aug 19, 2020 at 02:19:12PM -0500, Peter Bergner wrote: > > On 8/14/20 7:42 PM, Segher Boessenkool wrote: > > > I think your current code is fine; I hadn't considered Bill's > > > upcoming > > > rewrite. It is more important to make that go smoother than to > > > fix some > > > aesthetics right now. > > > > > > Please check if the names for the builtins match the (future) > > > documentation, and then, approved for trunk. Thank you! > > > > This is also a bug in GCC 10, so this should be backported too. > > > > Segher, is this ok for Carl to backport to GCC 10 after it has sat > > on > > trunk for a few days? > > Yes. Thanks guys. I attempted to do the backport however the patch doesn't even come close to applying. The names XVCVBF16SPN and XVCVSPBF16 are the only two builtin names that exist in GCC 10. The other issue is there is no Power 10 builtin macro definitions in GCC 10. So basically, I can fix the issue with XVCVBF16SPN and XVCVSPBF16 to be restricted to Power 10 by adding the needed Power 10 macro definition. This is a whole new patch so I figure it needs to be reviewed to make sure we want to make this change to GCC 10. I did run the regression tests again using a Power 9 machine to verify it complies and there are no regression test failures. Please let me know if this is OK for the GCC 10 tree. Thanks. Carl Love [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Back port to GCC 10. gcc/ChangeLog 2020-08-24 Carl Love * config/rs6000/rs6000-builtin.def: (BU_P10V_VSX_1) New builtin macro expansion. (XVCVBF16SPN, XVCVSPBF16): Replace macro expansion BU_VSX_1 with BU_P10V_VSX_1. * config/rs6000/rs6000-call.c: (VSX_BUILTIN_XVCVSPBF16, VSX_BUILTIN_XVCVBF16SPN): Replace with P10V_BUILTIN_XVCVSPBF16, P10V_BUILTIN_XVCVBF16SPN respectively. --- gcc/config/rs6000/rs6000-builtin.def | 14 -- gcc/config/rs6000/rs6000-call.c | 4 ++-- 2 files changed, 14 insertions(+), 4 deletions(-) diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def index 88f78cb0a15..512b7629a46 100644 --- a/gcc/config/rs6000/rs6000-builtin.def +++ b/gcc/config/rs6000/rs6000-builtin.def @@ -1014,6 +1014,16 @@ | RS6000_BTC_BINARY), \ CODE_FOR_ ## ICODE) /* ICODE */ +/* For builtins for power10 vector instructions that are encoded as altivec + instructions, use __builtin_altivec_ as the builtin name. */ + +#define BU_P10V_VSX_1(ENUM, NAME, ATTR, ICODE)\ + RS6000_BUILTIN_1 (P10_BUILTIN_ ## ENUM, /* ENUM */ \ + "__builtin_vsx_" NAME, /* NAME */ \ + RS6000_BTM_P10, /* MASK */ \ + (RS6000_BTC_ ## ATTR/* ATTR */ \ + | RS6000_BTC_UNARY),\ + CODE_FOR_ ## ICODE) /* ICODE */ #endif @@ -2698,8 +2708,8 @@ BU_SPECIAL_X (RS6000_BUILTIN_CFSTRING, "__builtin_cfstring", RS6000_BTM_ALWAYS, RS6000_BTC_MISC) /* POWER10 MMA builtins. */ -BU_VSX_1 (XVCVBF16SPN, "xvcvbf16spn", MISC, vsx_xvcvbf16spn) -BU_VSX_1 (XVCVSPBF16, "xvcvspbf16", MISC, vsx_xvcvspbf16) +BU_P10V_VSX_1 (XVCVBF16SPN,"xvcvbf16spn", MISC, vsx_xvcvbf16spn) +BU_P10V_VSX_1 (XVCVSPBF16, "xvcvspbf16", MISC, vsx_xvcvspbf16) BU_MMA_1 (XXMFACC, "xxmfacc", QUAD, mma_xxmfacc) BU_MMA_1 (XXMTACC, "xxmtacc", QUAD, mma_xxmtacc) diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c index 2cf78dfa5fe..fc1671e1bb2 100644 --- a/gcc/config/rs6000/rs6000-call.c +++ b/gcc/config/rs6000/rs6000-call.c @@ -13383,8 +13383,8 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0, case P8V_BUILTIN_VGBBD: case MISC_BUILTIN_CDTBCD: case MISC_BUILTIN_CBCDTD: -case VSX_BUILTIN_XVCVSPBF16: -case VSX_BUILTIN_XVCVBF16SPN: +case P10V_BUILTIN_XVCVSPBF16: +case P10V_BUILTIN_XVCVBF16SPN: h.uns_p[0] = 1; h.uns_p[1] = 1; break; -- 2.17.1
RE: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.
On Wed, 2020-08-19 at 15:16 -0500, Segher Boessenkool wrote: > On Wed, Aug 19, 2020 at 02:19:12PM -0500, Peter Bergner wrote: > > On 8/14/20 7:42 PM, Segher Boessenkool wrote: > > > I think your current code is fine; I hadn't considered Bill's > > > upcoming > > > rewrite. It is more important to make that go smoother than to > > > fix some > > > aesthetics right now. > > > > > > Please check if the names for the builtins match the (future) > > > documentation, and then, approved for trunk. Thank you! > > > > This is also a bug in GCC 10, so this should be backported too. > > > > Segher, is this ok for Carl to backport to GCC 10 after it has sat > > on > > trunk for a few days? > > Yes. Thanks guys. The patch was committed today, I left issue 935 open for the moment. I will work on a backport patch for GCC 10 and then as long as nothing bad happens I will push the patch early next week. Carl
Re: [EXTERNAL] Re: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.
On Wed, Aug 19, 2020 at 02:19:12PM -0500, Peter Bergner wrote: > On 8/14/20 7:42 PM, Segher Boessenkool wrote: > > I think your current code is fine; I hadn't considered Bill's upcoming > > rewrite. It is more important to make that go smoother than to fix some > > aesthetics right now. > > > > Please check if the names for the builtins match the (future) > > documentation, and then, approved for trunk. Thank you! > > This is also a bug in GCC 10, so this should be backported too. > > Segher, is this ok for Carl to backport to GCC 10 after it has sat on > trunk for a few days? Yes. Thanks guys. Segher
Re: [EXTERNAL] Re: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.
On 8/14/20 7:42 PM, Segher Boessenkool wrote: > I think your current code is fine; I hadn't considered Bill's upcoming > rewrite. It is more important to make that go smoother than to fix some > aesthetics right now. > > Please check if the names for the builtins match the (future) > documentation, and then, approved for trunk. Thank you! This is also a bug in GCC 10, so this should be backported too. Segher, is this ok for Carl to backport to GCC 10 after it has sat on trunk for a few days? Peter
RE: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.
Bill: On Mon, 2020-08-17 at 13:09 -0500, Bill Schmidt wrote: > > > > There are three prototypes __builtin_cfuged, __builtin_pdepd, > > __builtin_pextd defined in the document. > > > > The corresponding builtin definitions in GCC are: > > > > __builtin_altivec_cfuged, __builtin_altivec_pdepd, > > __builtin_altivec_pextd > > > > which does not match the defined prototype in the document. > > > These are scalar instructions, not vector, so they should not be > using > any flavor of "V". They should be using BU_P10_MISC_n, where n is > the > number of arguments. Yes, looks like that is those are the scalar versions. I got them mixed up with the vector definitions vector unsigned long long int vec_pdep() vector unsigned long long int vec_pext () vector unsigned long long int vec_cfuge () I was thinking the __builtin_name() was also referring to the vector versions. So, given that there are separate definitions, it does appear that the names are all consistent with the documentation. Thanks Bill. Carl
Re: [EXTERNAL] Re: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.
On 8/17/20 12:13 PM, Carl Love wrote: Segher, Bill, Peter: On Fri, 2020-08-14 at 19:42 -0500, Segher Boessenkool wrote: Do the names agree with the (future) documentation now? Did not double check on the documentation. Someone should... Looking at the box document "Proposed function Prototypes for P10". There are a number of builtins of the form "name()" which get expanded to __builtin_altivec_name or __builtin_vsx_name. But there does not appear to be any additional defined prototype for the __builtin_altivec_name or __builtin_vsx_name in the document so we don't need to worry about these prototypes as far as I can see. There are three prototypes __builtin_cfuged, __builtin_pdepd, __builtin_pextd defined in the document. The corresponding builtin definitions in GCC are: __builtin_altivec_cfuged, __builtin_altivec_pdepd, __builtin_altivec_pextd which does not match the defined prototype in the document. These are scalar instructions, not vector, so they should not be using any flavor of "V". They should be using BU_P10_MISC_n, where n is the number of arguments. Bill I don't see any defines in gcc/config/rs6000 that would map __builtin_name to __builtin_altivec_name so these three appear to be unsupported as far as I can see. I assume adding #define __builtin_name __builtin_altivec_name to gcc/config/rs6000/altivec.h would be the easiest way to define the prototypes from the document. I can add the defines if you think that is the correct fix. Please let me know. The MMA related builtins at the end of the document appear to have the proper define BU_MMA_# macro expansions to generate the defined prototype names. Looking at the builtin definitions in box for RFC 2608, RFC 2609, RFC 2629 the builtins are all of the form name() so I don't see any issues with the internal GCC name changes for the builtins in these documents. Carl
RE: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.
Segher, Bill, Peter: On Fri, 2020-08-14 at 19:42 -0500, Segher Boessenkool wrote: > > > Do the names agree with the (future) documentation now? > > > > Did not double check on the documentation. > > Someone should... Looking at the box document "Proposed function Prototypes for P10". There are a number of builtins of the form "name()" which get expanded to __builtin_altivec_name or __builtin_vsx_name. But there does not appear to be any additional defined prototype for the __builtin_altivec_name or __builtin_vsx_name in the document so we don't need to worry about these prototypes as far as I can see. There are three prototypes __builtin_cfuged, __builtin_pdepd, __builtin_pextd defined in the document. The corresponding builtin definitions in GCC are: __builtin_altivec_cfuged, __builtin_altivec_pdepd, __builtin_altivec_pextd which does not match the defined prototype in the document. I don't see any defines in gcc/config/rs6000 that would map __builtin_name to __builtin_altivec_name so these three appear to be unsupported as far as I can see. I assume adding #define __builtin_name __builtin_altivec_name to gcc/config/rs6000/altivec.h would be the easiest way to define the prototypes from the document. I can add the defines if you think that is the correct fix. Please let me know. The MMA related builtins at the end of the document appear to have the proper define BU_MMA_# macro expansions to generate the defined prototype names. Looking at the builtin definitions in box for RFC 2608, RFC 2609, RFC 2629 the builtins are all of the form name() so I don't see any issues with the internal GCC name changes for the builtins in these documents. Carl
Re: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.
Hi! On Fri, Aug 14, 2020 at 03:32:47PM -0700, Carl Love wrote: > On Fri, 2020-08-14 at 16:33 -0500, Segher Boessenkool wrote: > > So _vsx if it is for all VSRs, but _altivec for just the VRs? > > Yes, I worked off the rule that Altivec registers are designated with > 5-bits and VSR registers are designated with 6-bits. Excellent :-) > > Do the names agree with the (future) documentation now? > > Did not double check on the documentation. Someone should... > > > -#define BU_P10V_1(ENUM, NAME, ATTR, ICODE) > > > \ > > > - RS6000_BUILTIN_1 (P10_BUILTIN_ ## ENUM,/* ENUM */ > > > \ > > > - "__builtin_altivec_" NAME, /* NAME */ > > > \ > > > +#define BU_P10V_VSX_1(ENUM, NAME, ATTR, ICODE) > > > \ > > > > Hrm, this now says "V" (for vector) as well as "VSX" (for all 64 > > vector > > regs allowed). One of those is superfluous. > > > > > + RS6000_BUILTIN_1 (P10V_BUILTIN_ ## ENUM, /* ENUM */ > > > \ > > > > So this enum name doesn't say it allows all 64 registers? > > > > > + "__builtin_vsx_" NAME, /* NAME */ \ > > > > > > > /* Builtins for vector instructions added in ISA 3.1 > > > (power10). */ > > > -BU_P10V_2 (VCLRLB, "vclrlb", CONST, vclrlb) > > > +BU_P10V_AV_2 (VCLRLB, "vclrlb", CONST, vclrlb) > > > > Maybe you should just keep "V" for insns using only the VRs (which > > you > > call V_AV now), and do "VS" for those working on all VSRs (which you > > call > > V_VSX here)? > > The names BU_P10V_AV_# and BU_P10V_VSX_# were used for consistency with > the Power 8 and Power 9 naming, per my discussion with Peter. Ah, okay, yes let's keep it all consistent then, to make the rewrite easier :-) > The overload macros: > > #define BU_P8V_OVERLOAD_1(ENUM, NAME) > #define BU_P8V_OVERLOAD_2(ENUM, NAME) > #define BU_P8V_OVERLOAD_3(ENUM, NAME) > > #define BU_P9_OVERLOAD_2(ENUM, NAME) > > #define BU_P9V_OVERLOAD_1(ENUM, NAME) > #define BU_P9V_OVERLOAD_2(ENUM, NAME) > #define BU_P9V_OVERLOAD_3(ENUM, NAME) > > Looks like BU_P9_OVERLOAD_2 and BU_P9V_OVERLOAD_2 are identical > definitions. That should probably be fixed. It's __builtin_ vs. __builtin_vec_. > Thoughts? I will change to whatever everyone agrees on. I think your current code is fine; I hadn't considered Bill's upcoming rewrite. It is more important to make that go smoother than to fix some aesthetics right now. Please check if the names for the builtins match the (future) documentation, and then, approved for trunk. Thank you! Segher
RE: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.
On Fri, 2020-08-14 at 16:33 -0500, Segher Boessenkool wrote: > Hi Carl, > > On Thu, Aug 13, 2020 at 09:12:48AM -0700, Carl Love wrote: > > The macro expansion for the bfloat convert intrinsics XVCVBF16SP > > and > > XVCVSPBF16 need to be restricted to P10. > > The following patch creates new macro expansions BU_P10V_VSX_# and > > BU_P10V_AV_# for the VSX and Altivec instructions > > respectively. The > > new names are consistent with the P8 and P9 naming convention for > > the > > VSX and Altivec instructions. > > So _vsx if it is for all VSRs, but _altivec for just the VRs? Yes, I worked off the rule that Altivec registers are designated with 5-bits and VSR registers are designated with 6-bits. > > > The macro expansion for XVCVBF16SP and XVCVSPBF16 is changed from > > BU_VSX_1 to BU_P10V_VSX_1 to restrict it to P10 and beyond. Also > > MISC > > is changed to CONST in the macro expansion call. > > The spelling of the xvcvbf16sp name will probably change, fwiw. So > you > might want to wait before committing this (but changing it later is > fine > as well). OK, not sure of the urgency to get this patch in. Bill noted the error recently in the expansion. Will check with him to see if we should wait and go with the new name or go with the old name for now. > > Do the names agree with the (future) documentation now? Did not double check on the documentation. > > > -#define BU_P10V_1(ENUM, NAME, ATTR, ICODE) > > \ > > - RS6000_BUILTIN_1 (P10_BUILTIN_ ## ENUM, /* ENUM */ \ > > - "__builtin_altivec_" NAME, /* NAME */ > > \ > > +#define BU_P10V_VSX_1(ENUM, NAME, ATTR, ICODE) > > \ > > Hrm, this now says "V" (for vector) as well as "VSX" (for all 64 > vector > regs allowed). One of those is superfluous. > > > + RS6000_BUILTIN_1 (P10V_BUILTIN_ ## ENUM, /* ENUM */ \ > > So this enum name doesn't say it allows all 64 registers? > > > + "__builtin_vsx_" NAME, /* NAME */ \ > > > > /* Builtins for vector instructions added in ISA 3.1 > > (power10). */ > > -BU_P10V_2 (VCLRLB, "vclrlb", CONST, vclrlb) > > +BU_P10V_AV_2 (VCLRLB, "vclrlb", CONST, vclrlb) > > Maybe you should just keep "V" for insns using only the VRs (which > you > call V_AV now), and do "VS" for those working on all VSRs (which you > call > V_VSX here)? The names BU_P10V_AV_# and BU_P10V_VSX_# were used for consistency with the Power 8 and Power 9 naming, per my discussion with Peter. The following already exist in rs6000-builtin.def for similar uses. #define BU_P8V_AV_1(ENUM, NAME, ATTR, ICODE) #define BU_P8V_AV_2(ENUM, NAME, ATTR, ICODE) #define BU_P8V_AV_3(ENUM, NAME, ATTR, ICODE) #define BU_P9V_AV_1(ENUM, NAME, ATTR, ICODE) #define BU_P9V_AV_2(ENUM, NAME, ATTR, ICODE) #define BU_P9V_AV_3(ENUM, NAME, ATTR, ICODE) #define BU_P8V_VSX_1(ENUM, NAME, ATTR, ICODE) #define BU_P8V_VSX_2(ENUM, NAME, ATTR, ICODE) #define BU_P9V_VSX_1(ENUM, NAME, ATTR, ICODE) #define BU_P9V_VSX_2(ENUM, NAME, ATTR, ICODE) #define BU_P9V_VSX_3(ENUM, NAME, ATTR, ICODE) I agree the V seems redundant given the VSX, AV in the name. The overload macros: #define BU_P8V_OVERLOAD_1(ENUM, NAME) #define BU_P8V_OVERLOAD_2(ENUM, NAME) #define BU_P8V_OVERLOAD_3(ENUM, NAME) #define BU_P9_OVERLOAD_2(ENUM, NAME) #define BU_P9V_OVERLOAD_1(ENUM, NAME) #define BU_P9V_OVERLOAD_2(ENUM, NAME) #define BU_P9V_OVERLOAD_3(ENUM, NAME) Looks like BU_P9_OVERLOAD_2 and BU_P9V_OVERLOAD_2 are identical definitions. That should probably be fixed. Seems like you need V in the processor name for the OVERLOAD definitions to make it clear they are vector definitions. I don't believe the OVERLOAD cares if the builtin is for vsx or altivec registers. It kinda makes sense to leave the V in BU_P8V_VSX_#, BU_P9V_VSX_#, BU_P8V_AV_#, BU_P9V_AV_# definitions for consistency with the OVERLOAD macro name? Thoughts? I will change to whatever everyone agrees on. Carl
Re: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.
Hi Carl, On Thu, Aug 13, 2020 at 09:12:48AM -0700, Carl Love wrote: > The macro expansion for the bfloat convert intrinsics XVCVBF16SP and > XVCVSPBF16 need to be restricted to P10. > The following patch creates new macro expansions BU_P10V_VSX_# and > BU_P10V_AV_# for the VSX and Altivec instructions respectively. The > new names are consistent with the P8 and P9 naming convention for the > VSX and Altivec instructions. So _vsx if it is for all VSRs, but _altivec for just the VRs? > The macro expansion for XVCVBF16SP and XVCVSPBF16 is changed from > BU_VSX_1 to BU_P10V_VSX_1 to restrict it to P10 and beyond. Also MISC > is changed to CONST in the macro expansion call. The spelling of the xvcvbf16sp name will probably change, fwiw. So you might want to wait before committing this (but changing it later is fine as well). > The side effect of creating the macro expansions for VSX and Altivec is > it changes all of the expanded names. The patch fixes all the uses of > the expanded names as needed for the new VSX and Altivec macros. Joy :-) Do the names agree with the (future) documentation now? > -#define BU_P10V_1(ENUM, NAME, ATTR, ICODE) \ > - RS6000_BUILTIN_1 (P10_BUILTIN_ ## ENUM,/* ENUM */ \ > - "__builtin_altivec_" NAME, /* NAME */ \ > +#define BU_P10V_VSX_1(ENUM, NAME, ATTR, ICODE) > \ Hrm, this now says "V" (for vector) as well as "VSX" (for all 64 vector regs allowed). One of those is superfluous. > + RS6000_BUILTIN_1 (P10V_BUILTIN_ ## ENUM, /* ENUM */ \ So this enum name doesn't say it allows all 64 registers? > + "__builtin_vsx_" NAME, /* NAME */ \ > /* Builtins for vector instructions added in ISA 3.1 (power10). */ > -BU_P10V_2 (VCLRLB, "vclrlb", CONST, vclrlb) > +BU_P10V_AV_2 (VCLRLB, "vclrlb", CONST, vclrlb) Maybe you should just keep "V" for insns using only the VRs (which you call V_AV now), and do "VS" for those working on all VSRs (which you call V_VSX here)? Segher
Re: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.
On 8/13/20 3:00 PM, Carl Love wrote: > On Thu, 2020-08-13 at 14:48 -0500, Bill Schmidt wrote: >> OK, but that was just meant as an example. We have a fair number of >> things that changed names, so I was somewhat surprised. It could be >> that all of these are likewise hidden via the overload mechanism. >> Just >> checking to be sure. > > OK, I will go dig thru the test cases in a similar way for all of the > changes just to make sure. I didn't get any test failures but yea, a > lot of changes so lets double check. I too was surprised there were no testsuite changes required. If you ran the testsuite twice with the unpatched and patched builds (as is required for patch submission) and there were no regressions, then great. Wow, but great. Peter
Re: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.
Bill: On Thu, 2020-08-13 at 14:48 -0500, Bill Schmidt wrote: > OK, but that was just meant as an example. We have a fair number of > things that changed names, so I was somewhat surprised. It could be > that all of these are likewise hidden via the overload mechanism. > Just > checking to be sure. OK, I will go dig thru the test cases in a similar way for all of the changes just to make sure. I didn't get any test failures but yea, a lot of changes so lets double check. Carl
Re: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.
On 8/13/20 2:24 PM, Carl Love wrote: Bill: On Thu, 2020-08-13 at 13:38 -0500, Bill Schmidt wrote: Hi Carl, Thanks for cleaning up the consistency issue. The new names and related adjustments LGTM. Are there no affected test cases that need adjusting? That surprises me. For example, didn't __builtin_altivec_xxeval become __builtin_vsx_xxeval as a result of this change? Does that not appear in any test cases? Thanks, Bill In gcc/config/rs6000/rs6000-builtin.def we have #define vec_ternarylogic(a, b, c, d) __builtin_vec_xxeval (a, b, c, d) The vec_ternarylogic() builtin is used in test files gcc/testsuite/gcc.target/powerpc/vec-ternarylogic-X.c where X stands for 1, 2, 3, 4, 5, 6, 7, 8, 9. In gcc/confit/rs6000/rs6000-builtin.def BU_P10V_VSX_4 (XXEVAL, "xxeval", CONST, xxeval) now expands to __builtin_vsx_xxeval as you expect. I do not see a test case that uses the old builtin name __builtin_altivec_xxeval. carll@genoa:~/GCC/gcc-mainline-935/gcc/testsuite/gcc.target/powerpc$ grep -r xxeval * vec-ternarylogic-0.c:/* { dg-final { scan-assembler {\mxxeval\M} } } */ vec-ternarylogic-2.c:/* { dg-final { scan-assembler {\mxxeval\M} } } */ vec-ternarylogic-3.c:/* { dg-final { scan-assembler {\mxxeval\M} } } */ vec-ternarylogic-4.c:/* { dg-final { scan-assembler {\mxxeval\M} } } */ vec-ternarylogic-6.c:/* { dg-final { scan-assembler {\mxxeval\M} } } */ vec-ternarylogic-8.c:/* { dg-final { scan-assembler {\mxxeval\M} } } */ vec-ternarylogic-9.c:/* { dg-final { scan-assembler {\mxxeval\M} } } */ carll@genoa:~/GCC/gcc-mainline-935/gcc/testsuite/gcc.target/powerpc$ There just seems to be the various tests that are expected to generate the xxeval instruction. As far as I can see there is no test program that uses the __builtin_altivec_xxeval name. OK, but that was just meant as an example. We have a fair number of things that changed names, so I was somewhat surprised. It could be that all of these are likewise hidden via the overload mechanism. Just checking to be sure. Thanks, Bill Carl
Re: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.
Bill: On Thu, 2020-08-13 at 13:38 -0500, Bill Schmidt wrote: > Hi Carl, > > Thanks for cleaning up the consistency issue. The new names and > related > adjustments LGTM. > > Are there no affected test cases that need adjusting? That > surprises > me. For example, didn't __builtin_altivec_xxeval become > __builtin_vsx_xxeval as a result of this change? Does that not > appear > in any test cases? > > Thanks, > > Bill In gcc/config/rs6000/rs6000-builtin.def we have #define vec_ternarylogic(a, b, c, d) __builtin_vec_xxeval (a, b, c, d) The vec_ternarylogic() builtin is used in test files gcc/testsuite/gcc.target/powerpc/vec-ternarylogic-X.c where X stands for 1, 2, 3, 4, 5, 6, 7, 8, 9. In gcc/confit/rs6000/rs6000-builtin.def BU_P10V_VSX_4 (XXEVAL, "xxeval", CONST, xxeval) now expands to __builtin_vsx_xxeval as you expect. I do not see a test case that uses the old builtin name __builtin_altivec_xxeval. carll@genoa:~/GCC/gcc-mainline-935/gcc/testsuite/gcc.target/powerpc$ grep -r xxeval * vec-ternarylogic-0.c:/* { dg-final { scan-assembler {\mxxeval\M} } } */ vec-ternarylogic-2.c:/* { dg-final { scan-assembler {\mxxeval\M} } } */ vec-ternarylogic-3.c:/* { dg-final { scan-assembler {\mxxeval\M} } } */ vec-ternarylogic-4.c:/* { dg-final { scan-assembler {\mxxeval\M} } } */ vec-ternarylogic-6.c:/* { dg-final { scan-assembler {\mxxeval\M} } } */ vec-ternarylogic-8.c:/* { dg-final { scan-assembler {\mxxeval\M} } } */ vec-ternarylogic-9.c:/* { dg-final { scan-assembler {\mxxeval\M} } } */ carll@genoa:~/GCC/gcc-mainline-935/gcc/testsuite/gcc.target/powerpc$ There just seems to be the various tests that are expected to generate the xxeval instruction. As far as I can see there is no test program that uses the __builtin_altivec_xxeval name. Carl
Re: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.
On 8/13/20 11:12 AM, Carl Love wrote: GCC maintainers: The macro expansion for the bfloat convert intrinsics XVCVBF16SP and XVCVSPBF16 need to be restricted to P10. The macro expansions BU_P10V_0, BU_P10V_1, BU_P10V_2, BU_P10V_3 expand the name field as "__builtin_altivec_". These macro expansions are being used for both VSX and Altivec instructions. There needs to be separate expansions for VSX with the name field "__builtin_vsx_" and for Altivec with the name field "__builtin_altivec_". The following patch creates new macro expansions BU_P10V_VSX_# and BU_P10V_AV_# for the VSX and Altivec instructions respectively. The new names are consistent with the P8 and P9 naming convention for the VSX and Altivec instructions. The macro expansion for XVCVBF16SP and XVCVSPBF16 is changed from BU_VSX_1 to BU_P10V_VSX_1 to restrict it to P10 and beyond. Also MISC is changed to CONST in the macro expansion call. The side effect of creating the macro expansions for VSX and Altivec is it changes all of the expanded names. The patch fixes all the uses of the expanded names as needed for the new VSX and Altivec macros. The patch has been run on powerpc64le-unknown-linux-gnu (Power 8 LE) powerpc64le-unknown-linux-gnu (Power 9 LE) with no regressions. Please let me know if the patch is acceptable for trunk. Hi Carl, Thanks for cleaning up the consistency issue. The new names and related adjustments LGTM. Are there no affected test cases that need adjusting? That surprises me. For example, didn't __builtin_altivec_xxeval become __builtin_vsx_xxeval as a result of this change? Does that not appear in any test cases? Thanks, Bill Carl Love - [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions. gcc/ChangeLog 2020-08-12 Carl Love * config/rs6000/rs6000-builtin.def (BU_P10V_0, BU_P10V_1, BU_P10V_2, BU_P10V_3): Rename BU_P10V_VSX_0, BU_P10V_VSX_1, BU_P10V_VSX_2, BU_P10V_VSX_3 respectively. (BU_P10V_4): Remove. (BU_P10V_AV_0, BU_P10V_AV_1, BU_P10V_AV_2, BU_P10V_AV_3, BU_P10V_AV_4): New definitions for Power 10 Altivec macros. (VSTRIBR, VSTRIHR, VSTRIBL, VSTRIHL, VSTRIBR_P, VSTRIHR_P, VSTRIBL_P, VSTRIHL_P, MTVSRBM, MTVSRHM, MTVSRWM, MTVSRDM, MTVSRQM, VEXPANDMB, VEXPANDMH, VEXPANDMW, VEXPANDMD, VEXPANDMQ, VEXTRACTMB, VEXTRACTMH, VEXTRACTMW, VEXTRACTMD, VEXTRACTMQ): Replace macro expansion BU_P10V_1 with BU_P10V_AV_1. (VCLRLB, VCLRRB, VCFUGED, VCLZDM, VCTZDM, VPDEPD, VPEXTD, VGNB, VCNTMBB, VCNTMBH, VCNTMBW, VCNTMBD): Replace macro expansion BU_P10V_2 with BU_P10V_AV_2. (VEXTRACTBL, VEXTRACTHL, VEXTRACTWL, VEXTRACTDL, VEXTRACTBR, VEXTRACTHR, VEXTRACTWR, VEXTRACTDR, VINSERTGPRBL, VINSERTGPRHL, VINSERTGPRWL, VINSERTGPRDL, VINSERTVPRBL, VINSERTVPRHL, VINSERTVPRWL, VINSERTGPRBR, VINSERTGPRHR, VINSERTGPRWR, VINSERTGPRDR, VINSERTVPRBR, VINSERTVPRHR, VINSERTVPRWR, VREPLACE_ELT_V4SI, VREPLACE_ELT_UV4SI, VREPLACE_ELT_V2DF, VREPLACE_ELT_V4SF, VREPLACE_ELT_V2DI, VREPLACE_ELT_UV2DI, VREPLACE_UN_V4SI, VREPLACE_UN_UV4SI, VREPLACE_UN_V4SF, VREPLACE_UN_V2DI, VREPLACE_UN_UV2DI, VREPLACE_UN_V2DF, VSLDB_V16QI, VSLDB_V8HI, VSLDB_V4SI, VSLDB_V2DI, VSRDB_V16QI, VSRDB_V8HI, VSRDB_V4SI, VSRDB_V2DI): Replace macro expansion BU_P10V_3 with BU_P10V_AV_3. (VXXSPLTIW_V4SI, VXXSPLTIW_V4SF, VXXSPLTID): Replace macro expansion BU_P10V_1 with BU_P10V_AV_1. (XXGENPCVM_V16QI, XXGENPCVM_V8HI, XXGENPCVM_V4SI, XXGENPCVM_V2DI): Replace macro expansion BU_P10V_2 with BU_P10V_VSX_2. (VXXSPLTI32DX_V4SI, VXXSPLTI32DX_V4SF, VXXBLEND_V16QI, VXXBLEND_V8HI, VXXBLEND_V4SI, VXXBLEND_V2DI, VXXBLEND_V4SF, VXXBLEND_V2DF): Replace macor expansion BU_P10V_3 with BU_P10V_VSX_3. (XXEVAL, VXXPERMX): Replace macro expansion BU_P10V_4 with BU_P10V_VSX_4. (XVCVBF16SP, XVCVSPBF16): Replace macro expansion BU_VSX_1 with BU_P10V_VSX_1. Also change MISC to CONST. * config/rs6000/rs6000-c.c: (P10_BUILTIN_VXXPERMX): Replace with P10V_BUILTIN_VXXPERMX. (P10_BUILTIN_VCLRLB, P10_BUILTIN_VCLRLB, P10_BUILTIN_VCLRRB, P10_BUILTIN_VGNB, P10_BUILTIN_XXEVAL, P10_BUILTIN_VXXPERMX, P10_BUILTIN_VEXTRACTBL, P10_BUILTIN_VEXTRACTHL, P10_BUILTIN_VEXTRACTWL, P10_BUILTIN_VEXTRACTDL, P10_BUILTIN_VINSERTGPRHL, P10_BUILTIN_VINSERTGPRWL, P10_BUILTIN_VINSERTGPRDL, P10_BUILTIN_VINSERTVPRBL, P10_BUILTIN_VINSERTVPRHL, P10_BUILTIN_VEXTRACTBR, P10_BUILTIN_VEXTRACTHR, P10_BUILTIN_VEXTRACTWR, P10_BUILTIN_VEXTRACTDR, P10_BUILTIN_VINSERTGPRBR, P10_BUILTIN_VINSERTGPRHR, P10_BUILTIN_VINSERTGPRWR, P10_BUILTIN_VINSERTGPRDR, P10_BUILTIN_VINSERTVPRBR,