Re: [PATCH, rs6000] Repair vec_xl, vec_xst, vec_xl_be, vec_xst_be built-in functions

2017-11-15 Thread Segher Boessenkool
Hi!

On Tue, Nov 14, 2017 at 02:24:13PM -0600, Bill Schmidt wrote:
> +  for (i = 0; i < 16; ++i)
> + perm[i] = GEN_INT (reorder[i]);
> +
> +  pcv = force_reg (V16QImode,
> +   gen_rtx_CONST_VECTOR (V16QImode,
> +  gen_rtvec_v (16, perm)));
> +  emit_insn (gen_altivec_vperm_v8hi_direct (operands[0], subreg2,
> + subreg2, pcv));
> +  DONE;

Many whitespace problems on these lines, please fix.  More times later.

> +   (match_operand:V16QI 1 "vsx_register_operand" "wa")
> +   (parallel [(const_int 15) (const_int 14)
> +  (const_int 13) (const_int 12)
> +  (const_int 11) (const_int 10)
> +  (const_int  9) (const_int  8)
> +  (const_int  7) (const_int  6)
> +  (const_int  5) (const_int  4)
> +  (const_int  3) (const_int  2)
> +  (const_int  1) (const_int  0)])))]

Here, too.

The rest looks fine.  Thanks!


Segher


Re: [PATCH, rs6000] Repair vec_xl, vec_xst, vec_xl_be, vec_xst_be built-in functions

2017-11-14 Thread Bill Schmidt
Hi,

I had a pasto in the function prototype for vec_xst_be.  Fixed patch is below.

Thanks,
Bill

On 11/14/17 8:11 AM, Bill Schmidt wrote:

> Please hold review on this until I investigate something that Carl brought 
> up.  Thanks,
> and sorry for the noise!
>
> -- Bill
>> On Nov 13, 2017, at 10:30 AM, Bill Schmidt  
>> wrote:
>>
>> Hi,
>>
>> Some previous patches to add support for vec_xl_be and fill in gaps in 
>> testing
>> for vec_xl and vec_xst resulted in incorrect semantics for these built-in
>> functions.  My fault for not reviewing them in detail.  Essentially the 
>> vec_xl
>> and vec_xl_be semantics were reversed, and vec_xst received semantics that
>> should be associated with vec_xst_be.  Also, vec_xst_be has not yet been
>> implemented.  Also, my attempt to add P8-specific code for the element-
>> reversing loads and stores had the same problem with wrong semantics.  This 
>> patch addresses these issues in a uniform manner.
>>
>> Most of the work is done by adjusting the rs6000-c.c mapping between 
>> overloaded
>> function names and type-specific implementations, and adding missing 
>> interfaces
>> there.  I've also removed the previous implementation function
>> altivec_expand_xl_be_builtin, and moved the byte-reversal into define_expands
>> for the code gen patterns with some simplification.  These generate single
>> instructions for P9 (lxvh8x, lxvq16x, etc.) and lvxw4x + permute for P8.
>>
>> I've verified the code generation by hand, but have not included test cases
>> in this patch because Carl Love has been independently working on adding a
>> full set of test cases for these built-ins.  The code does pass those 
>> relevant
>> tests that are already in the test suite.  I hope it's okay to install this
>> patch while waiting for the full tests; I'll of course give high priority to
>> fixing any possible fallout from those tests.
>>
>> Bootstrapped and tested on powerpc64le-linux-gnu (POWER8 and POWER9) with no
>> regressions.  Is this okay for trunk?
>>
>> Thanks,
>> Bill
>>
[gcc]

2017-11-14  Bill Schmidt  

* config/rs6000/altivec.h (vec_xst_be): New #define.
* config/rs6000/altivec.md (altivec_vperm__direct): Rename
and externalize from *altivec_vperm__internal.
* config/rs6000/rs6000-builtin.def (XL_BE_V16QI): Remove macro
instantiation.
(XL_BE_V8HI): Likewise.
(XL_BE_V4SI): Likewise.
(XL_BE_V4SI): Likewise.
(XL_BE_V2DI): Likewise.
(XL_BE_V4SF): Likewise.
(XL_BE_V2DF): Likewise.
(XST_BE): Add BU_VSX_OVERLOAD_X macro instantiation.
* config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Correct
all array entries with these keys: VSX_BUILTIN_VEC_XL,
VSX_BUILTIN_VEC_XL_BE, VSX_BUILTIN_VEC_XST.  Add entries for key
VSX_BUILTIN_VEC_XST_BE.
* config/rs6000/rs6000.c (altivec_expand_xl_be_builtin): Remove.
(altivec_expand_builtin): Remove handling for VSX_BUILTIN_XL_BE_*
built-ins.
(altivec_init_builtins): Replace conditional calls to def_builtin
for __builtin_vsx_ld_elemrev_{v8hi,v16qi} and
__builtin_vsx_st_elemrev_{v8hi,v16qi} based on TARGET_P9_VECTOR
with unconditional calls.  Remove calls to def_builtin for
__builtin_vsx_le_be_.  Add a call to def_builtin for
__builtin_vec_xst_be.
* config/rs6000/vsx.md (vsx_ld_elemrev_v8hi): Convert define_insn
to define_expand, and add alternate RTL generation for P8.
(*vsx_ld_elemrev_v8hi_internal): New define_insn based on
vsx_ld_elemrev_v8hi.
(vsx_ld_elemrev_v16qi): Convert define_insn to define_expand, and
add alternate RTL generation for P8.
(*vsx_ld_elemrev_v16qi_internal): New define_insn based on
vsx_ld_elemrev_v16qi.
(vsx_st_elemrev_v8hi): Convert define_insn
to define_expand, and add alternate RTL generation for P8.
(*vsx_st_elemrev_v8hi_internal): New define_insn based on
vsx_st_elemrev_v8hi.
(vsx_st_elemrev_v16qi): Convert define_insn to define_expand, and
add alternate RTL generation for P8.
(*vsx_st_elemrev_v16qi_internal): New define_insn based on
vsx_st_elemrev_v16qi.

[gcc/testsuite]

2017-11-14  Bill Schmidt  

* gcc.target/powerpc/swaps-p8-26.c: Modify expected code
generation.


Index: gcc/config/rs6000/altivec.h
===
--- gcc/config/rs6000/altivec.h (revision 254629)
+++ gcc/config/rs6000/altivec.h (working copy)
@@ -357,6 +357,7 @@
 #define vec_xl __builtin_vec_vsx_ld
 #define vec_xl_be __builtin_vec_xl_be
 #define vec_xst __builtin_vec_vsx_st
+#define vec_xst_be __builtin_vec_xst_be
 
 /* Note, xxsldi and xxpermdi were added as __builtin_vsx_ functions
instead of __builtin_vec_  */
Index: gcc/config/rs6000/altivec.md

Re: [PATCH, rs6000] Repair vec_xl, vec_xst, vec_xl_be, vec_xst_be built-in functions

2017-11-14 Thread Bill Schmidt
Please hold review on this until I investigate something that Carl brought up.  
Thanks,
and sorry for the noise!

-- Bill


> On Nov 13, 2017, at 10:30 AM, Bill Schmidt  
> wrote:
> 
> Hi,
> 
> Some previous patches to add support for vec_xl_be and fill in gaps in testing
> for vec_xl and vec_xst resulted in incorrect semantics for these built-in
> functions.  My fault for not reviewing them in detail.  Essentially the vec_xl
> and vec_xl_be semantics were reversed, and vec_xst received semantics that
> should be associated with vec_xst_be.  Also, vec_xst_be has not yet been
> implemented.  Also, my attempt to add P8-specific code for the element-
> reversing loads and stores had the same problem with wrong semantics.  This 
> patch addresses these issues in a uniform manner.
> 
> Most of the work is done by adjusting the rs6000-c.c mapping between 
> overloaded
> function names and type-specific implementations, and adding missing 
> interfaces
> there.  I've also removed the previous implementation function
> altivec_expand_xl_be_builtin, and moved the byte-reversal into define_expands
> for the code gen patterns with some simplification.  These generate single
> instructions for P9 (lxvh8x, lxvq16x, etc.) and lvxw4x + permute for P8.
> 
> I've verified the code generation by hand, but have not included test cases
> in this patch because Carl Love has been independently working on adding a
> full set of test cases for these built-ins.  The code does pass those relevant
> tests that are already in the test suite.  I hope it's okay to install this
> patch while waiting for the full tests; I'll of course give high priority to
> fixing any possible fallout from those tests.
> 
> Bootstrapped and tested on powerpc64le-linux-gnu (POWER8 and POWER9) with no
> regressions.  Is this okay for trunk?
> 
> Thanks,
> Bill
> 
> 
> [gcc]
> 
> 2017-11-13  Bill Schmidt  
> 
>   * config/rs6000/altivec.h (vec_xst_be): New #define.
>   * config/rs6000/altivec.md (altivec_vperm__direct): Rename
>   and externalize from *altivec_vperm__internal.
>   * config/rs6000/rs6000-builtin.def (XL_BE_V16QI): Remove macro
>   instantiation.
>   (XL_BE_V8HI): Likewise.
>   (XL_BE_V4SI): Likewise.
>   (XL_BE_V4SI): Likewise.
>   (XL_BE_V2DI): Likewise.
>   (XL_BE_V4SF): Likewise.
>   (XL_BE_V2DF): Likewise.
>   (XST_BE): Add BU_VSX_OVERLOAD_X macro instantiation.
>   * config/rss6000/rs6000-c.c (altivec_overloaded_builtins): Correct
>   all array entries with these keys: VSX_BUILTIN_VEC_XL,
>   VSX_BUILTIN_VEC_XL_BE, VSX_BUILTIN_VEC_XST.  Add entries for key
>   VSX_BUILTIN_VEC_XST_BE.
>   (altivec_expand_xl_be_builtin): Remove.
>   (altivec_expand_builtin): Remove handling for VSX_BUILTIN_XL_BE_*
>   built-ins.  Replace conditional calls to def_builtin for
>   __builtin_vsx_ld_elemrev_{v8hi,v16qi} and
>   __builtin_vsx_st_elemrev_{v8hi,v16qi} based on TARGET_P9_VECTOR
>   with unconditional calls.  Remove calls to def_builtin for
>   __builtin_vsx_le_be_.  Add a call to def_builtin for
>   __builtin_vec_xst_be.
>   * config/rs6000/vsx.md (vsx_ld_elemrev_v8hi): Convert define_insn
>   to define_expand, and add alternate RTL generation for P8.
>   (*vsx_ld_elemrev_v8hi_internal): New define_insn based on
>   vsx_ld_elemrev_v8hi.
>   (vsx_ld_elemrev_v16qi): Convert define_insn to define_expand, and
>   add alternate RTL generation for P8.
>   (*vsx_ld_elemrev_v16qi_internal): New define_insn based on
>   vsx_ld_elemrev_v16qi.
>   (vsx_st_elemrev_v8hi): Convert define_insn
>   to define_expand, and add alternate RTL generation for P8.
>   (*vsx_st_elemrev_v8hi_internal): New define_insn based on
>   vsx_st_elemrev_v8hi.
>   (vsx_st_elemrev_v16qi): Convert define_insn to define_expand, and
>   add alternate RTL generation for P8.
>   (*vsx_st_elemrev_v16qi_internal): New define_insn based on
>   vsx_st_elemrev_v16qi.
> 
> [gcc/testsuite]
> 
> 2017-11-13  Bill Schmidt  
> 
>   * gcc.target/powerpc/swaps-p8-26.c: Modify expected code
>   generation.
> 
> 
> Index: gcc/config/rs6000/altivec.h
> ===
> --- gcc/config/rs6000/altivec.h   (revision 254629)
> +++ gcc/config/rs6000/altivec.h   (working copy)
> @@ -357,6 +357,7 @@
> #define vec_xl __builtin_vec_vsx_ld
> #define vec_xl_be __builtin_vec_xl_be
> #define vec_xst __builtin_vec_vsx_st
> +#define vec_xst_be __builtin_vec_xst_be
> 
> /* Note, xxsldi and xxpermdi were added as __builtin_vsx_ functions
>instead of __builtin_vec_  */
> Index: gcc/config/rs6000/altivec.md
> ===
> --- gcc/config/rs6000/altivec.md  (revision 254629)
> +++ gcc/config/rs6000/altivec.md  (working copy)
> @@ -2130,7 

[PATCH, rs6000] Repair vec_xl, vec_xst, vec_xl_be, vec_xst_be built-in functions

2017-11-13 Thread Bill Schmidt
Hi,

Some previous patches to add support for vec_xl_be and fill in gaps in testing
for vec_xl and vec_xst resulted in incorrect semantics for these built-in
functions.  My fault for not reviewing them in detail.  Essentially the vec_xl
and vec_xl_be semantics were reversed, and vec_xst received semantics that
should be associated with vec_xst_be.  Also, vec_xst_be has not yet been
implemented.  Also, my attempt to add P8-specific code for the element-
reversing loads and stores had the same problem with wrong semantics.  This 
patch addresses these issues in a uniform manner.

Most of the work is done by adjusting the rs6000-c.c mapping between overloaded
function names and type-specific implementations, and adding missing interfaces
there.  I've also removed the previous implementation function
altivec_expand_xl_be_builtin, and moved the byte-reversal into define_expands
for the code gen patterns with some simplification.  These generate single
instructions for P9 (lxvh8x, lxvq16x, etc.) and lvxw4x + permute for P8.

I've verified the code generation by hand, but have not included test cases
in this patch because Carl Love has been independently working on adding a
full set of test cases for these built-ins.  The code does pass those relevant
tests that are already in the test suite.  I hope it's okay to install this
patch while waiting for the full tests; I'll of course give high priority to
fixing any possible fallout from those tests.

Bootstrapped and tested on powerpc64le-linux-gnu (POWER8 and POWER9) with no
regressions.  Is this okay for trunk?

Thanks,
Bill


[gcc]

2017-11-13  Bill Schmidt  

* config/rs6000/altivec.h (vec_xst_be): New #define.
* config/rs6000/altivec.md (altivec_vperm__direct): Rename
and externalize from *altivec_vperm__internal.
* config/rs6000/rs6000-builtin.def (XL_BE_V16QI): Remove macro
instantiation.
(XL_BE_V8HI): Likewise.
(XL_BE_V4SI): Likewise.
(XL_BE_V4SI): Likewise.
(XL_BE_V2DI): Likewise.
(XL_BE_V4SF): Likewise.
(XL_BE_V2DF): Likewise.
(XST_BE): Add BU_VSX_OVERLOAD_X macro instantiation.
* config/rss6000/rs6000-c.c (altivec_overloaded_builtins): Correct
all array entries with these keys: VSX_BUILTIN_VEC_XL,
VSX_BUILTIN_VEC_XL_BE, VSX_BUILTIN_VEC_XST.  Add entries for key
VSX_BUILTIN_VEC_XST_BE.
(altivec_expand_xl_be_builtin): Remove.
(altivec_expand_builtin): Remove handling for VSX_BUILTIN_XL_BE_*
built-ins.  Replace conditional calls to def_builtin for
__builtin_vsx_ld_elemrev_{v8hi,v16qi} and
__builtin_vsx_st_elemrev_{v8hi,v16qi} based on TARGET_P9_VECTOR
with unconditional calls.  Remove calls to def_builtin for
__builtin_vsx_le_be_.  Add a call to def_builtin for
__builtin_vec_xst_be.
* config/rs6000/vsx.md (vsx_ld_elemrev_v8hi): Convert define_insn
to define_expand, and add alternate RTL generation for P8.
(*vsx_ld_elemrev_v8hi_internal): New define_insn based on
vsx_ld_elemrev_v8hi.
(vsx_ld_elemrev_v16qi): Convert define_insn to define_expand, and
add alternate RTL generation for P8.
(*vsx_ld_elemrev_v16qi_internal): New define_insn based on
vsx_ld_elemrev_v16qi.
(vsx_st_elemrev_v8hi): Convert define_insn
to define_expand, and add alternate RTL generation for P8.
(*vsx_st_elemrev_v8hi_internal): New define_insn based on
vsx_st_elemrev_v8hi.
(vsx_st_elemrev_v16qi): Convert define_insn to define_expand, and
add alternate RTL generation for P8.
(*vsx_st_elemrev_v16qi_internal): New define_insn based on
vsx_st_elemrev_v16qi.

[gcc/testsuite]

2017-11-13  Bill Schmidt  

* gcc.target/powerpc/swaps-p8-26.c: Modify expected code
generation.


Index: gcc/config/rs6000/altivec.h
===
--- gcc/config/rs6000/altivec.h (revision 254629)
+++ gcc/config/rs6000/altivec.h (working copy)
@@ -357,6 +357,7 @@
 #define vec_xl __builtin_vec_vsx_ld
 #define vec_xl_be __builtin_vec_xl_be
 #define vec_xst __builtin_vec_vsx_st
+#define vec_xst_be __builtin_vec_xst_be
 
 /* Note, xxsldi and xxpermdi were added as __builtin_vsx_ functions
instead of __builtin_vec_  */
Index: gcc/config/rs6000/altivec.md
===
--- gcc/config/rs6000/altivec.md(revision 254629)
+++ gcc/config/rs6000/altivec.md(working copy)
@@ -2130,7 +2130,7 @@
 })
 
 ;; Slightly prefer vperm, since the target does not overlap the source
-(define_insn "*altivec_vperm__internal"
+(define_insn "altivec_vperm__direct"
   [(set (match_operand:VM 0 "register_operand" "=v,?wo")
(unspec:VM [(match_operand:VM 1 "register_operand" "v,wo")
(match_operand:VM 2