Re: [PATCH, rs6000] Repair vec_xl, vec_xst, vec_xl_be, vec_xst_be built-in functions
Hi! On Tue, Nov 14, 2017 at 02:24:13PM -0600, Bill Schmidt wrote: > + for (i = 0; i < 16; ++i) > + perm[i] = GEN_INT (reorder[i]); > + > + pcv = force_reg (V16QImode, > + gen_rtx_CONST_VECTOR (V16QImode, > + gen_rtvec_v (16, perm))); > + emit_insn (gen_altivec_vperm_v8hi_direct (operands[0], subreg2, > + subreg2, pcv)); > + DONE; Many whitespace problems on these lines, please fix. More times later. > + (match_operand:V16QI 1 "vsx_register_operand" "wa") > + (parallel [(const_int 15) (const_int 14) > + (const_int 13) (const_int 12) > + (const_int 11) (const_int 10) > + (const_int 9) (const_int 8) > + (const_int 7) (const_int 6) > + (const_int 5) (const_int 4) > + (const_int 3) (const_int 2) > + (const_int 1) (const_int 0)])))] Here, too. The rest looks fine. Thanks! Segher
Re: [PATCH, rs6000] Repair vec_xl, vec_xst, vec_xl_be, vec_xst_be built-in functions
Hi, I had a pasto in the function prototype for vec_xst_be. Fixed patch is below. Thanks, Bill On 11/14/17 8:11 AM, Bill Schmidt wrote: > Please hold review on this until I investigate something that Carl brought > up. Thanks, > and sorry for the noise! > > -- Bill >> On Nov 13, 2017, at 10:30 AM, Bill Schmidt>> wrote: >> >> Hi, >> >> Some previous patches to add support for vec_xl_be and fill in gaps in >> testing >> for vec_xl and vec_xst resulted in incorrect semantics for these built-in >> functions. My fault for not reviewing them in detail. Essentially the >> vec_xl >> and vec_xl_be semantics were reversed, and vec_xst received semantics that >> should be associated with vec_xst_be. Also, vec_xst_be has not yet been >> implemented. Also, my attempt to add P8-specific code for the element- >> reversing loads and stores had the same problem with wrong semantics. This >> patch addresses these issues in a uniform manner. >> >> Most of the work is done by adjusting the rs6000-c.c mapping between >> overloaded >> function names and type-specific implementations, and adding missing >> interfaces >> there. I've also removed the previous implementation function >> altivec_expand_xl_be_builtin, and moved the byte-reversal into define_expands >> for the code gen patterns with some simplification. These generate single >> instructions for P9 (lxvh8x, lxvq16x, etc.) and lvxw4x + permute for P8. >> >> I've verified the code generation by hand, but have not included test cases >> in this patch because Carl Love has been independently working on adding a >> full set of test cases for these built-ins. The code does pass those >> relevant >> tests that are already in the test suite. I hope it's okay to install this >> patch while waiting for the full tests; I'll of course give high priority to >> fixing any possible fallout from those tests. >> >> Bootstrapped and tested on powerpc64le-linux-gnu (POWER8 and POWER9) with no >> regressions. Is this okay for trunk? >> >> Thanks, >> Bill >> [gcc] 2017-11-14 Bill Schmidt * config/rs6000/altivec.h (vec_xst_be): New #define. * config/rs6000/altivec.md (altivec_vperm__direct): Rename and externalize from *altivec_vperm__internal. * config/rs6000/rs6000-builtin.def (XL_BE_V16QI): Remove macro instantiation. (XL_BE_V8HI): Likewise. (XL_BE_V4SI): Likewise. (XL_BE_V4SI): Likewise. (XL_BE_V2DI): Likewise. (XL_BE_V4SF): Likewise. (XL_BE_V2DF): Likewise. (XST_BE): Add BU_VSX_OVERLOAD_X macro instantiation. * config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Correct all array entries with these keys: VSX_BUILTIN_VEC_XL, VSX_BUILTIN_VEC_XL_BE, VSX_BUILTIN_VEC_XST. Add entries for key VSX_BUILTIN_VEC_XST_BE. * config/rs6000/rs6000.c (altivec_expand_xl_be_builtin): Remove. (altivec_expand_builtin): Remove handling for VSX_BUILTIN_XL_BE_* built-ins. (altivec_init_builtins): Replace conditional calls to def_builtin for __builtin_vsx_ld_elemrev_{v8hi,v16qi} and __builtin_vsx_st_elemrev_{v8hi,v16qi} based on TARGET_P9_VECTOR with unconditional calls. Remove calls to def_builtin for __builtin_vsx_le_be_. Add a call to def_builtin for __builtin_vec_xst_be. * config/rs6000/vsx.md (vsx_ld_elemrev_v8hi): Convert define_insn to define_expand, and add alternate RTL generation for P8. (*vsx_ld_elemrev_v8hi_internal): New define_insn based on vsx_ld_elemrev_v8hi. (vsx_ld_elemrev_v16qi): Convert define_insn to define_expand, and add alternate RTL generation for P8. (*vsx_ld_elemrev_v16qi_internal): New define_insn based on vsx_ld_elemrev_v16qi. (vsx_st_elemrev_v8hi): Convert define_insn to define_expand, and add alternate RTL generation for P8. (*vsx_st_elemrev_v8hi_internal): New define_insn based on vsx_st_elemrev_v8hi. (vsx_st_elemrev_v16qi): Convert define_insn to define_expand, and add alternate RTL generation for P8. (*vsx_st_elemrev_v16qi_internal): New define_insn based on vsx_st_elemrev_v16qi. [gcc/testsuite] 2017-11-14 Bill Schmidt * gcc.target/powerpc/swaps-p8-26.c: Modify expected code generation. Index: gcc/config/rs6000/altivec.h === --- gcc/config/rs6000/altivec.h (revision 254629) +++ gcc/config/rs6000/altivec.h (working copy) @@ -357,6 +357,7 @@ #define vec_xl __builtin_vec_vsx_ld #define vec_xl_be __builtin_vec_xl_be #define vec_xst __builtin_vec_vsx_st +#define vec_xst_be __builtin_vec_xst_be /* Note, xxsldi and xxpermdi were added as __builtin_vsx_ functions instead of __builtin_vec_ */ Index: gcc/config/rs6000/altivec.md
Re: [PATCH, rs6000] Repair vec_xl, vec_xst, vec_xl_be, vec_xst_be built-in functions
Please hold review on this until I investigate something that Carl brought up. Thanks, and sorry for the noise! -- Bill > On Nov 13, 2017, at 10:30 AM, Bill Schmidt> wrote: > > Hi, > > Some previous patches to add support for vec_xl_be and fill in gaps in testing > for vec_xl and vec_xst resulted in incorrect semantics for these built-in > functions. My fault for not reviewing them in detail. Essentially the vec_xl > and vec_xl_be semantics were reversed, and vec_xst received semantics that > should be associated with vec_xst_be. Also, vec_xst_be has not yet been > implemented. Also, my attempt to add P8-specific code for the element- > reversing loads and stores had the same problem with wrong semantics. This > patch addresses these issues in a uniform manner. > > Most of the work is done by adjusting the rs6000-c.c mapping between > overloaded > function names and type-specific implementations, and adding missing > interfaces > there. I've also removed the previous implementation function > altivec_expand_xl_be_builtin, and moved the byte-reversal into define_expands > for the code gen patterns with some simplification. These generate single > instructions for P9 (lxvh8x, lxvq16x, etc.) and lvxw4x + permute for P8. > > I've verified the code generation by hand, but have not included test cases > in this patch because Carl Love has been independently working on adding a > full set of test cases for these built-ins. The code does pass those relevant > tests that are already in the test suite. I hope it's okay to install this > patch while waiting for the full tests; I'll of course give high priority to > fixing any possible fallout from those tests. > > Bootstrapped and tested on powerpc64le-linux-gnu (POWER8 and POWER9) with no > regressions. Is this okay for trunk? > > Thanks, > Bill > > > [gcc] > > 2017-11-13 Bill Schmidt > > * config/rs6000/altivec.h (vec_xst_be): New #define. > * config/rs6000/altivec.md (altivec_vperm__direct): Rename > and externalize from *altivec_vperm__internal. > * config/rs6000/rs6000-builtin.def (XL_BE_V16QI): Remove macro > instantiation. > (XL_BE_V8HI): Likewise. > (XL_BE_V4SI): Likewise. > (XL_BE_V4SI): Likewise. > (XL_BE_V2DI): Likewise. > (XL_BE_V4SF): Likewise. > (XL_BE_V2DF): Likewise. > (XST_BE): Add BU_VSX_OVERLOAD_X macro instantiation. > * config/rss6000/rs6000-c.c (altivec_overloaded_builtins): Correct > all array entries with these keys: VSX_BUILTIN_VEC_XL, > VSX_BUILTIN_VEC_XL_BE, VSX_BUILTIN_VEC_XST. Add entries for key > VSX_BUILTIN_VEC_XST_BE. > (altivec_expand_xl_be_builtin): Remove. > (altivec_expand_builtin): Remove handling for VSX_BUILTIN_XL_BE_* > built-ins. Replace conditional calls to def_builtin for > __builtin_vsx_ld_elemrev_{v8hi,v16qi} and > __builtin_vsx_st_elemrev_{v8hi,v16qi} based on TARGET_P9_VECTOR > with unconditional calls. Remove calls to def_builtin for > __builtin_vsx_le_be_. Add a call to def_builtin for > __builtin_vec_xst_be. > * config/rs6000/vsx.md (vsx_ld_elemrev_v8hi): Convert define_insn > to define_expand, and add alternate RTL generation for P8. > (*vsx_ld_elemrev_v8hi_internal): New define_insn based on > vsx_ld_elemrev_v8hi. > (vsx_ld_elemrev_v16qi): Convert define_insn to define_expand, and > add alternate RTL generation for P8. > (*vsx_ld_elemrev_v16qi_internal): New define_insn based on > vsx_ld_elemrev_v16qi. > (vsx_st_elemrev_v8hi): Convert define_insn > to define_expand, and add alternate RTL generation for P8. > (*vsx_st_elemrev_v8hi_internal): New define_insn based on > vsx_st_elemrev_v8hi. > (vsx_st_elemrev_v16qi): Convert define_insn to define_expand, and > add alternate RTL generation for P8. > (*vsx_st_elemrev_v16qi_internal): New define_insn based on > vsx_st_elemrev_v16qi. > > [gcc/testsuite] > > 2017-11-13 Bill Schmidt > > * gcc.target/powerpc/swaps-p8-26.c: Modify expected code > generation. > > > Index: gcc/config/rs6000/altivec.h > === > --- gcc/config/rs6000/altivec.h (revision 254629) > +++ gcc/config/rs6000/altivec.h (working copy) > @@ -357,6 +357,7 @@ > #define vec_xl __builtin_vec_vsx_ld > #define vec_xl_be __builtin_vec_xl_be > #define vec_xst __builtin_vec_vsx_st > +#define vec_xst_be __builtin_vec_xst_be > > /* Note, xxsldi and xxpermdi were added as __builtin_vsx_ functions >instead of __builtin_vec_ */ > Index: gcc/config/rs6000/altivec.md > === > --- gcc/config/rs6000/altivec.md (revision 254629) > +++ gcc/config/rs6000/altivec.md (working copy) > @@ -2130,7
[PATCH, rs6000] Repair vec_xl, vec_xst, vec_xl_be, vec_xst_be built-in functions
Hi, Some previous patches to add support for vec_xl_be and fill in gaps in testing for vec_xl and vec_xst resulted in incorrect semantics for these built-in functions. My fault for not reviewing them in detail. Essentially the vec_xl and vec_xl_be semantics were reversed, and vec_xst received semantics that should be associated with vec_xst_be. Also, vec_xst_be has not yet been implemented. Also, my attempt to add P8-specific code for the element- reversing loads and stores had the same problem with wrong semantics. This patch addresses these issues in a uniform manner. Most of the work is done by adjusting the rs6000-c.c mapping between overloaded function names and type-specific implementations, and adding missing interfaces there. I've also removed the previous implementation function altivec_expand_xl_be_builtin, and moved the byte-reversal into define_expands for the code gen patterns with some simplification. These generate single instructions for P9 (lxvh8x, lxvq16x, etc.) and lvxw4x + permute for P8. I've verified the code generation by hand, but have not included test cases in this patch because Carl Love has been independently working on adding a full set of test cases for these built-ins. The code does pass those relevant tests that are already in the test suite. I hope it's okay to install this patch while waiting for the full tests; I'll of course give high priority to fixing any possible fallout from those tests. Bootstrapped and tested on powerpc64le-linux-gnu (POWER8 and POWER9) with no regressions. Is this okay for trunk? Thanks, Bill [gcc] 2017-11-13 Bill Schmidt* config/rs6000/altivec.h (vec_xst_be): New #define. * config/rs6000/altivec.md (altivec_vperm__direct): Rename and externalize from *altivec_vperm__internal. * config/rs6000/rs6000-builtin.def (XL_BE_V16QI): Remove macro instantiation. (XL_BE_V8HI): Likewise. (XL_BE_V4SI): Likewise. (XL_BE_V4SI): Likewise. (XL_BE_V2DI): Likewise. (XL_BE_V4SF): Likewise. (XL_BE_V2DF): Likewise. (XST_BE): Add BU_VSX_OVERLOAD_X macro instantiation. * config/rss6000/rs6000-c.c (altivec_overloaded_builtins): Correct all array entries with these keys: VSX_BUILTIN_VEC_XL, VSX_BUILTIN_VEC_XL_BE, VSX_BUILTIN_VEC_XST. Add entries for key VSX_BUILTIN_VEC_XST_BE. (altivec_expand_xl_be_builtin): Remove. (altivec_expand_builtin): Remove handling for VSX_BUILTIN_XL_BE_* built-ins. Replace conditional calls to def_builtin for __builtin_vsx_ld_elemrev_{v8hi,v16qi} and __builtin_vsx_st_elemrev_{v8hi,v16qi} based on TARGET_P9_VECTOR with unconditional calls. Remove calls to def_builtin for __builtin_vsx_le_be_. Add a call to def_builtin for __builtin_vec_xst_be. * config/rs6000/vsx.md (vsx_ld_elemrev_v8hi): Convert define_insn to define_expand, and add alternate RTL generation for P8. (*vsx_ld_elemrev_v8hi_internal): New define_insn based on vsx_ld_elemrev_v8hi. (vsx_ld_elemrev_v16qi): Convert define_insn to define_expand, and add alternate RTL generation for P8. (*vsx_ld_elemrev_v16qi_internal): New define_insn based on vsx_ld_elemrev_v16qi. (vsx_st_elemrev_v8hi): Convert define_insn to define_expand, and add alternate RTL generation for P8. (*vsx_st_elemrev_v8hi_internal): New define_insn based on vsx_st_elemrev_v8hi. (vsx_st_elemrev_v16qi): Convert define_insn to define_expand, and add alternate RTL generation for P8. (*vsx_st_elemrev_v16qi_internal): New define_insn based on vsx_st_elemrev_v16qi. [gcc/testsuite] 2017-11-13 Bill Schmidt * gcc.target/powerpc/swaps-p8-26.c: Modify expected code generation. Index: gcc/config/rs6000/altivec.h === --- gcc/config/rs6000/altivec.h (revision 254629) +++ gcc/config/rs6000/altivec.h (working copy) @@ -357,6 +357,7 @@ #define vec_xl __builtin_vec_vsx_ld #define vec_xl_be __builtin_vec_xl_be #define vec_xst __builtin_vec_vsx_st +#define vec_xst_be __builtin_vec_xst_be /* Note, xxsldi and xxpermdi were added as __builtin_vsx_ functions instead of __builtin_vec_ */ Index: gcc/config/rs6000/altivec.md === --- gcc/config/rs6000/altivec.md(revision 254629) +++ gcc/config/rs6000/altivec.md(working copy) @@ -2130,7 +2130,7 @@ }) ;; Slightly prefer vperm, since the target does not overlap the source -(define_insn "*altivec_vperm__internal" +(define_insn "altivec_vperm__direct" [(set (match_operand:VM 0 "register_operand" "=v,?wo") (unspec:VM [(match_operand:VM 1 "register_operand" "v,wo") (match_operand:VM 2