Re: [PATCH, rs6000] Support float from/to long conversion vectorization

2019-09-28 Thread Kewen.Lin
Hi Segher,

on 2019/9/28 上午12:12, Segher Boessenkool wrote:
> On Fri, Sep 27, 2019 at 04:52:30PM +0800, Kewen.Lin wrote:
>>> (Maybe one of the gen* tools complains any_fix needs a mode? :QI will do
>>> if so, or :P if you like that better).
>>
>> I didn't encounter any errors, it sounds it's allowable now?
> 
> Ah, that particular warning (from genrecog.c) only happens for a SET,
> so you won't get it here.  It would be a warning, not an error, btw.
> 

Thanks for the clarification.  :)

Committed by r276266.


Thanks,
Kewen



Re: [PATCH, rs6000] Support float from/to long conversion vectorization

2019-09-27 Thread Segher Boessenkool
On Fri, Sep 27, 2019 at 04:52:30PM +0800, Kewen.Lin wrote:
> > (Maybe one of the gen* tools complains any_fix needs a mode? :QI will do
> > if so, or :P if you like that better).
> 
> I didn't encounter any errors, it sounds it's allowable now?

Ah, that particular warning (from genrecog.c) only happens for a SET,
so you won't get it here.  It would be a warning, not an error, btw.


Segher


Re: [PATCH, rs6000] Support float from/to long conversion vectorization

2019-09-27 Thread Kewen.Lin
Hi Segher,

on 2019/9/27 下午3:27, Segher Boessenkool wrote:
> Hi Kewen,
> 
>> +;; Support signed/unsigned long long to float conversion vectorization.
>> +(define_expand "vec_pack_float_v2di"
>> +  [(match_operand:V4SF 0 "vfloat_operand")
>> +   (any_float:V4SF (parallel [(match_operand:V2DI 1 "vint_operand")
>> + (match_operand:V2DI 2 "vint_operand")]))]
> 
> To concatenate two vectors, the syntax is vec_concat.  So
> 
>   [(set (match_operand:V4SF 0 "vfloat_operand")
>   (any_float:V4SF
> (vec_concat:V4DI (match_operand:V2DI 1 "vint_operand")
>  (match_operand:V2DI 2 "vint_operand"]
> 
> It is of course a define_expand here, and it always calls DONE, so the
> only thing the RTL template is used for is the match_operands; but also
> important here is that you use an iterator (any_float), so you need to
> work that into the template some way.
> 
> Your code would work, but it is a bit misleading, an unsuspecting reader
> (*cough* me *cough*) might think this is the actual insn this expander
> will create.
> 
>> +;; Support float to signed/unsigned long long conversion vectorization.
>> +(define_expand "vec_unpack_fix_trunc_hi_v4sf"
>> +  [(match_operand:V2DI 0 "vint_operand")
>> +   (any_fix:V2DI (match_operand:V4SF 1 "vfloat_operand"))]
> 
> Similarly here: the pattern as you wrote it isn't valid RTL.
> 
>   [(set (match_operand:V2DI 0 "vint_operand")
>   (any_fix:V2DI (vec_select:V2SF ...
> uh-oh, we do not have a mode V2SF.
> Let's go with what you have then, add a comment that the template isn't
> valid RTL, but you need it for the iterator?
> 
> Or can you think of a different way of putting an iterator like this in
> the template?  Maybe something like
> 
> (define_expand "vec_unpack_fix_trunc_hi_v4sf"
>   [(match_operand:V2DI 0 "vint_operand")
>(match_operand:V4SF 1 "vfloat_operand")
>(any_fix (pc))]
> 
> works?  If it does, please do that; if you cannot find a reasonably clear
> syntax, go with what you had, but please add a comment saying the template
> won't ever be inserted as instruction.
> 

Thanks for your advice on "any_fix (pc)", it works perfectly by testing.
Attached patch has adopted this writing and add a comment saying it's just
for code attribute.  Bootstrapped, I'll commit it once regress tested.
Thanks!

> (Maybe one of the gen* tools complains any_fix needs a mode? :QI will do
> if so, or :P if you like that better).

I didn't encounter any errors, it sounds it's allowable now?


Thanks,
Kewen
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 7633171..dc6a6f6 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -5538,3 +5538,48 @@
   operands[SFBOOL_TMP_VSX_DI] = gen_rtx_REG (DImode, regno_tmp_vsx);
   operands[SFBOOL_MTVSR_D_V4SF] = gen_rtx_REG (V4SFmode, regno_mtvsr_d);
 })
+
+;; Support signed/unsigned long long to float conversion vectorization.
+;; Note that any_float (pc) here is just for code attribute .
+(define_expand "vec_pack_float_v2di"
+  [(match_operand:V4SF 0 "vfloat_operand")
+   (match_operand:V2DI 1 "vint_operand")
+   (match_operand:V2DI 2 "vint_operand")
+   (any_float (pc))]
+  "TARGET_VSX"
+{
+  rtx r1 = gen_reg_rtx (V4SFmode);
+  rtx r2 = gen_reg_rtx (V4SFmode);
+  emit_insn (gen_vsx_xvcvxdsp (r1, operands[1]));
+  emit_insn (gen_vsx_xvcvxdsp (r2, operands[2]));
+  rs6000_expand_extract_even (operands[0], r1, r2);
+  DONE;
+})
+
+;; Support float to signed/unsigned long long conversion vectorization.
+;; Note that any_fix (pc) here is just for code attribute .
+(define_expand "vec_unpack_fix_trunc_hi_v4sf"
+  [(match_operand:V2DI 0 "vint_operand")
+   (match_operand:V4SF 1 "vfloat_operand")
+   (any_fix (pc))]
+  "TARGET_VSX"
+{
+  rtx reg = gen_reg_rtx (V4SFmode);
+  rs6000_expand_interleave (reg, operands[1], operands[1], BYTES_BIG_ENDIAN);
+  emit_insn (gen_vsx_xvcvspxds (operands[0], reg));
+  DONE;
+})
+
+;; Note that any_fix (pc) here is just for code attribute .
+(define_expand "vec_unpack_fix_trunc_lo_v4sf"
+  [(match_operand:V2DI 0 "vint_operand")
+   (match_operand:V4SF 1 "vfloat_operand")
+   (any_fix (pc))]
+  "TARGET_VSX"
+{
+  rtx reg = gen_reg_rtx (V4SFmode);
+  rs6000_expand_interleave (reg, operands[1], operands[1], !BYTES_BIG_ENDIAN);
+  emit_insn (gen_vsx_xvcvspxds (operands[0], reg));
+  DONE;
+})
+
diff --git a/gcc/testsuite/gcc.target/powerpc/conv-vectorize-1.c 
b/gcc/testsuite/gcc.target/powerpc/conv-vectorize-1.c
new file mode 100644
index 000..d96db14
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/conv-vectorize-1.c
@@ -0,0 +1,37 @@
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -ftree-vectorize -mvsx" } */
+
+/* Test vectorizer can exploit vector conversion instructions to convert
+   unsigned/signed long long to float.  */
+
+#include 
+
+#define SIZE 32
+#define ALIGN 16
+
+float sflt_array[SIZE] __attribute__ ((__aligned__ (ALIGN)));
+float uflt_array[SIZE] __attribute__ ((__aligned__ (ALIGN)));
+

Re: [PATCH, rs6000] Support float from/to long conversion vectorization

2019-09-27 Thread Segher Boessenkool
Hi Kewen,

On Fri, Sep 27, 2019 at 10:33:01AM +0800, Kewen.Lin wrote:
> This patch is to add the support for float from/to long conversion
> vectorization.  ISA 2.06 supports the vector version instructions
> for conversion between float and long long (both signed and unsigned),
> but vectorizer can't exploit them since the optab check fails.

Nice :-)

> +;; Support signed/unsigned long long to float conversion vectorization.
> +(define_expand "vec_pack_float_v2di"
> +  [(match_operand:V4SF 0 "vfloat_operand")
> +   (any_float:V4SF (parallel [(match_operand:V2DI 1 "vint_operand")
> + (match_operand:V2DI 2 "vint_operand")]))]

To concatenate two vectors, the syntax is vec_concat.  So

  [(set (match_operand:V4SF 0 "vfloat_operand")
(any_float:V4SF
  (vec_concat:V4DI (match_operand:V2DI 1 "vint_operand")
   (match_operand:V2DI 2 "vint_operand"]

It is of course a define_expand here, and it always calls DONE, so the
only thing the RTL template is used for is the match_operands; but also
important here is that you use an iterator (any_float), so you need to
work that into the template some way.

Your code would work, but it is a bit misleading, an unsuspecting reader
(*cough* me *cough*) might think this is the actual insn this expander
will create.

> +;; Support float to signed/unsigned long long conversion vectorization.
> +(define_expand "vec_unpack_fix_trunc_hi_v4sf"
> +  [(match_operand:V2DI 0 "vint_operand")
> +   (any_fix:V2DI (match_operand:V4SF 1 "vfloat_operand"))]

Similarly here: the pattern as you wrote it isn't valid RTL.

  [(set (match_operand:V2DI 0 "vint_operand")
(any_fix:V2DI (vec_select:V2SF ...
uh-oh, we do not have a mode V2SF.
Let's go with what you have then, add a comment that the template isn't
valid RTL, but you need it for the iterator?

Or can you think of a different way of putting an iterator like this in
the template?  Maybe something like

(define_expand "vec_unpack_fix_trunc_hi_v4sf"
  [(match_operand:V2DI 0 "vint_operand")
   (match_operand:V4SF 1 "vfloat_operand")
   (any_fix (pc))]

works?  If it does, please do that; if you cannot find a reasonably clear
syntax, go with what you had, but please add a comment saying the template
won't ever be inserted as instruction.

(Maybe one of the gen* tools complains any_fix needs a mode? :QI will do
if so, or :P if you like that better).

With either way, approved for trunk (after testing, of course).  Thanks!


Segher


[PATCH, rs6000] Support float from/to long conversion vectorization

2019-09-26 Thread Kewen.Lin
Hi,

This patch is to add the support for float from/to long conversion
vectorization.  ISA 2.06 supports the vector version instructions
for conversion between float and long long (both signed and unsigned),
but vectorizer can't exploit them since the optab check fails.
So this patch is mainly to add related optab supports.

I've verified the functionality on both LE and BE.  Bootstrapped and
regress tested on powerpc64le-linux-gnu.


Thanks,
Kewen

--- 
gcc/ChangeLog

2019-09-27  Kewen Lin  

* config/rs6000/vsx.md (vec_pack[su]_float_v2di): New define_expand.
(vec_unpack_[su]fix_trunc_hi_v4sf): Likewise.
(vec_unpack_[su]fix_trunc_lo_v4sf): Likewise.

gcc/testsuite/ChangeLog

2019-09-27  Kewen Lin  

* gcc.target/powerpc/conv-vectorize-1.c: New test.
* gcc.target/powerpc/conv-vectorize-2.c: New test.
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 7633171..ef32971 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -5538,3 +5538,42 @@
   operands[SFBOOL_TMP_VSX_DI] = gen_rtx_REG (DImode, regno_tmp_vsx);
   operands[SFBOOL_MTVSR_D_V4SF] = gen_rtx_REG (V4SFmode, regno_mtvsr_d);
 })
+
+;; Support signed/unsigned long long to float conversion vectorization.
+(define_expand "vec_pack_float_v2di"
+  [(match_operand:V4SF 0 "vfloat_operand")
+   (any_float:V4SF (parallel [(match_operand:V2DI 1 "vint_operand")
+ (match_operand:V2DI 2 "vint_operand")]))]
+  "TARGET_VSX"
+{
+  rtx r1 = gen_reg_rtx (V4SFmode);
+  rtx r2 = gen_reg_rtx (V4SFmode);
+  emit_insn (gen_vsx_xvcvxdsp (r1, operands[1]));
+  emit_insn (gen_vsx_xvcvxdsp (r2, operands[2]));
+  rs6000_expand_extract_even (operands[0], r1, r2);
+  DONE;
+})
+
+;; Support float to signed/unsigned long long conversion vectorization.
+(define_expand "vec_unpack_fix_trunc_hi_v4sf"
+  [(match_operand:V2DI 0 "vint_operand")
+   (any_fix:V2DI (match_operand:V4SF 1 "vfloat_operand"))]
+  "TARGET_VSX"
+{
+  rtx reg = gen_reg_rtx (V4SFmode);
+  rs6000_expand_interleave (reg, operands[1], operands[1], BYTES_BIG_ENDIAN);
+  emit_insn (gen_vsx_xvcvspxds (operands[0], reg));
+  DONE;
+})
+
+(define_expand "vec_unpack_fix_trunc_lo_v4sf"
+  [(match_operand:V2DI 0 "vint_operand")
+   (any_fix:V2DI (match_operand:V4SF 1 "vfloat_operand"))]
+  "TARGET_VSX"
+{
+  rtx reg = gen_reg_rtx (V4SFmode);
+  rs6000_expand_interleave (reg, operands[1], operands[1], !BYTES_BIG_ENDIAN);
+  emit_insn (gen_vsx_xvcvspxds (operands[0], reg));
+  DONE;
+})
+
diff --git a/gcc/testsuite/gcc.target/powerpc/conv-vectorize-1.c 
b/gcc/testsuite/gcc.target/powerpc/conv-vectorize-1.c
new file mode 100644
index 000..d96db14
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/conv-vectorize-1.c
@@ -0,0 +1,37 @@
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -ftree-vectorize -mvsx" } */
+
+/* Test vectorizer can exploit vector conversion instructions to convert
+   unsigned/signed long long to float.  */
+
+#include 
+
+#define SIZE 32
+#define ALIGN 16
+
+float sflt_array[SIZE] __attribute__ ((__aligned__ (ALIGN)));
+float uflt_array[SIZE] __attribute__ ((__aligned__ (ALIGN)));
+
+unsigned long long ulong_array[SIZE] __attribute__ ((__aligned__ (ALIGN)));
+signed long long slong_array[SIZE] __attribute__ ((__aligned__ (ALIGN)));
+
+void
+convert_slong_to_float (void)
+{
+  size_t i;
+
+  for (i = 0; i < SIZE; i++)
+sflt_array[i] = (float) slong_array[i];
+}
+
+void
+convert_ulong_to_float (void)
+{
+  size_t i;
+
+  for (i = 0; i < SIZE; i++)
+uflt_array[i] = (float) ulong_array[i];
+}
+
+/* { dg-final { scan-assembler {\mxvcvsxdsp\M} } } */
+/* { dg-final { scan-assembler {\mxvcvuxdsp\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/conv-vectorize-2.c 
b/gcc/testsuite/gcc.target/powerpc/conv-vectorize-2.c
new file mode 100644
index 000..5dd5dea
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/conv-vectorize-2.c
@@ -0,0 +1,37 @@
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -ftree-vectorize -mvsx" } */
+
+/* Test vectorizer can exploit vector conversion instructions to convert
+   float to unsigned/signed long long.  */
+
+#include 
+
+#define SIZE 32
+#define ALIGN 16
+
+float sflt_array[SIZE] __attribute__ ((__aligned__ (ALIGN)));
+float uflt_array[SIZE] __attribute__ ((__aligned__ (ALIGN)));
+
+unsigned long long ulong_array[SIZE] __attribute__ ((__aligned__ (ALIGN)));
+signed long long slong_array[SIZE] __attribute__ ((__aligned__ (ALIGN)));
+
+void
+convert_float_to_slong (void)
+{
+  size_t i;
+
+  for (i = 0; i < SIZE; i++)
+slong_array[i] = (signed long long) sflt_array[i];
+}
+
+void
+convert_float_to_ulong (void)
+{
+  size_t i;
+
+  for (i = 0; i < SIZE; i++)
+ulong_array[i] = (unsigned long long) uflt_array[i];
+}
+
+/* { dg-final { scan-assembler {\mxvcvspsxds\M} } } */
+/* { dg-final { scan-assembler {\mxvcvspuxds\M} } } */