Re: [PATCH] PR target/66144, PowerPC improve vector compare

2017-02-06 Thread Segher Boessenkool
On Mon, Feb 06, 2017 at 01:28:29PM -0500, Michael Meissner wrote:
> Can I check it into the trunk?

Yes please.  Thanks,

Segher


> 2017-02-06  Michael Meissner  
> 
>   PR target/66144
>   * config/rs6000/vector.md (vcond): Allow the true and
>   false values to be constant vectors with all 0 or all 1 bits set.
>   (vcondu): Likewise.
>   * config/rs6000/predicates.md (vector_int_reg_or_same_bit): New
>   predicate.
>   (fpmask_comparison_operator): Update comment.
>   (vecint_comparison_operator): New predicate.
>   * config/rs6000/rs6000.c (rs6000_emit_vector_cond_expr): Optimize
>   vector conditionals when the true and false values are constant
>   vectors with all 0 bits or all 1 bits set.
> 
> [gcc/testsuite]
> 2017-02-06  Michael Meissner  
> 
>   PR target/66144
>   * gcc.target/powerpc/pr66144-1.c: New test.
>   * gcc.target/powerpc/pr66144-2.c: Likewise.
>   * gcc.target/powerpc/pr66144-3.c: Likewise.


Re: [PATCH] PR target/66144, PowerPC improve vector compare

2017-02-06 Thread Michael Meissner
On Fri, Feb 03, 2017 at 06:07:56PM -0600, Segher Boessenkool wrote:
> On Fri, Feb 03, 2017 at 04:25:00PM -0500, Michael Meissner wrote:
> > +;; Return 1 if operand is either a vector constant of all 0 bits of a 
> > vector
> > +;; constant of all 1 bits.
> > +(define_predicate "vector_int_same_bit"
> > +  (match_code "const_vector")
> > +{
> > +  if (GET_MODE_CLASS (mode) != MODE_VECTOR_INT)
> > +return 0;
> > +
> > +  else
> > +return op == CONST0_RTX (mode) || op == CONSTM1_RTX (mode);
> > +})
> 
> This predicate is unused as far as I see?

I removed it.  Thanks.

> > +  /* Optimize vec1 == vec2, to know the mask generates -1/0.  */
> > +  if (GET_MODE_CLASS (dest_mode) == MODE_VECTOR_INT)
> >  {
> > -  tmp = op_true;
> > -  op_true = op_false;
> > -  op_false = tmp;
> > +  if (op_true == constant_m1 && op_false == constant_0)
> > +   {
> > + emit_move_insn (dest, mask);
> > + return 1;
> > +   }
> > +
> > +  else if (op_true == constant_0 && op_false == constant_m1)
> > +   {
> > + emit_insn (gen_rtx_SET (dest, gen_rtx_NOT (dest_mode, mask)));
> > + return 1;
> > +   }
> >  }
> 
> Do you need to test for dest_mode == mask_mode here, like below?

I reworked the test, see below.

> > +  if (op_true == constant_m1 && dest_mode == mask_mode)
> > +op_true = mask;
> > +  else if (!REG_P (op_true) && !SUBREG_P (op_true))
> > +op_true = force_reg (dest_mode, op_true);
> > +
> > +  if (op_false == constant_0 && dest_mode == mask_mode)
> > +op_false = mask;
> > +  else if (!REG_P (op_false) && !SUBREG_P (op_false))
> > +op_false = force_reg (dest_mode, op_false);
> 
> Another thing you could try is, if either op_true or op_false is 0
> or -1, let the result be
>   (mask & op_true) | (~mask & op_false)
> 
> and let the rest of the optimisers sort it out (it's a single vor/vand
> or vorc/vandc, or a vnot, or nothing).  A later improvement perhaps.
> Or does it already handle all cases now :-)

That will be a later optimization if desired.  I reran the tests with no
regressions.

When I reran the spec benchmark in more controlled conditions, I wasn't able to
reproduce the gcc speedups like I saw with a quick run.  I did see the tonto
speedups with these changes.

As I said, but perhaps I wasn't clear.  I will be in the office on Tuesday, but
I will be on vacation starting on Wednesday.  I would prefer checking in the
changes today (Monday) just in case something needs fixing before I leave.  Can
I check it into the trunk?

[gcc]
2017-02-06  Michael Meissner  

PR target/66144
* config/rs6000/vector.md (vcond): Allow the true and
false values to be constant vectors with all 0 or all 1 bits set.
(vcondu): Likewise.
* config/rs6000/predicates.md (vector_int_reg_or_same_bit): New
predicate.
(fpmask_comparison_operator): Update comment.
(vecint_comparison_operator): New predicate.
* config/rs6000/rs6000.c (rs6000_emit_vector_cond_expr): Optimize
vector conditionals when the true and false values are constant
vectors with all 0 bits or all 1 bits set.

[gcc/testsuite]
2017-02-06  Michael Meissner  

PR target/66144
* gcc.target/powerpc/pr66144-1.c: New test.
* gcc.target/powerpc/pr66144-2.c: Likewise.
* gcc.target/powerpc/pr66144-3.c: Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/predicates.md
===
--- gcc/config/rs6000/predicates.md (revision 245137)
+++ gcc/config/rs6000/predicates.md (working copy)
@@ -808,6 +808,21 @@ (define_predicate "all_ones_constant"
   (and (match_code "const_int,const_double,const_wide_int,const_vector")
(match_test "op == CONSTM1_RTX (mode) && !FLOAT_MODE_P (mode)")))
 
+;; Return 1 if operand is a vector int register or is either a vector constant
+;; of all 0 bits of a vector constant of all 1 bits.
+(define_predicate "vector_int_reg_or_same_bit"
+  (match_code "reg,subreg,const_vector")
+{
+  if (GET_MODE_CLASS (mode) != MODE_VECTOR_INT)
+return 0;
+
+  else if (REG_P (op) || SUBREG_P (op))
+return vint_operand (op, mode);
+
+  else
+return op == CONST0_RTX (mode) || op == CONSTM1_RTX (mode);
+})
+
 ;; Return 1 if operand is 0.0.
 (define_predicate "zero_fp_constant"
   (and (match_code "const_double")
@@ -1260,8 +1275,8 @@ (define_predicate "scc_rev_comparison_op
   (and (match_operand 0 "branch_comparison_operator")
(match_code "ne,le,ge,leu,geu,ordered")))
 
-;; Return 1 if OP is a comparison operator suitable for vector/scalar
-;; comparisons that generate a -1/0 mask.
+;; Return 1 if OP is a comparison operator suitable for floating point
+;; vector/scalar comparisons that generate a -1/0 mask.
 

Re: [PATCH] PR target/66144, PowerPC improve vector compare

2017-02-03 Thread Michael Meissner
On Fri, Feb 03, 2017 at 06:07:56PM -0600, Segher Boessenkool wrote:
> On Fri, Feb 03, 2017 at 04:25:00PM -0500, Michael Meissner wrote:
> > +;; Return 1 if operand is either a vector constant of all 0 bits of a 
> > vector
> > +;; constant of all 1 bits.
> > +(define_predicate "vector_int_same_bit"
> > +  (match_code "const_vector")
> > +{
> > +  if (GET_MODE_CLASS (mode) != MODE_VECTOR_INT)
> > +return 0;
> > +
> > +  else
> > +return op == CONST0_RTX (mode) || op == CONSTM1_RTX (mode);
> > +})
> 
> This predicate is unused as far as I see?

Right.  It was used in my first attempt when I had a peephole2 to
eliminate the extra loads.  Since I moved the processing to when we create the
conditional vector assignment and deleted the peephole2, I missed deleting the
predicate.  Thanks for catching it.

> > +  /* Optimize vec1 == vec2, to know the mask generates -1/0.  */
> > +  if (GET_MODE_CLASS (dest_mode) == MODE_VECTOR_INT)
> >  {
> > -  tmp = op_true;
> > -  op_true = op_false;
> > -  op_false = tmp;
> > +  if (op_true == constant_m1 && op_false == constant_0)
> > +   {
> > + emit_move_insn (dest, mask);
> > + return 1;
> > +   }
> > +
> > +  else if (op_true == constant_0 && op_false == constant_m1)
> > +   {
> > + emit_insn (gen_rtx_SET (dest, gen_rtx_NOT (dest_mode, mask)));
> > + return 1;
> > +   }
> >  }
> 
> Do you need to test for dest_mode == mask_mode here, like below?

Yes, because there is support for vcondv4siv4sf and vcondv4sfv4si where the
mask is one of V4SI or V4SF and the destination is the other.  The dest_mode ==
mask_mode checks for that, and ignore that case.

I.e. something like:

static float af[1024], bf[1024], cf[1024];
static int di[1024], ei[1024];

// ..

for (i = 0; i < 1024; i++)
  af[i] = (di[i] == ei[i]) ? bf[i] : cf[i];

> 
> > +  if (op_true == constant_m1 && dest_mode == mask_mode)
> > +op_true = mask;
> > +  else if (!REG_P (op_true) && !SUBREG_P (op_true))
> > +op_true = force_reg (dest_mode, op_true);
> > +
> > +  if (op_false == constant_0 && dest_mode == mask_mode)
> > +op_false = mask;
> > +  else if (!REG_P (op_false) && !SUBREG_P (op_false))
> > +op_false = force_reg (dest_mode, op_false);
> 
> Another thing you could try is, if either op_true or op_false is 0
> or -1, let the result be
>   (mask & op_true) | (~mask & op_false)
> 
> and let the rest of the optimisers sort it out (it's a single vor/vand
> or vorc/vandc, or a vnot, or nothing).  A later improvement perhaps.
> Or does it already handle all cases now :-)

I don't know if it would handle it.

> 
> Okay for trunk with the unused predicate removed, and the dest_mode ==
> mask_mode thing looked at.  Thanks!

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: [PATCH] PR target/66144, PowerPC improve vector compare

2017-02-03 Thread Segher Boessenkool
On Fri, Feb 03, 2017 at 04:25:00PM -0500, Michael Meissner wrote:
> +;; Return 1 if operand is either a vector constant of all 0 bits of a vector
> +;; constant of all 1 bits.
> +(define_predicate "vector_int_same_bit"
> +  (match_code "const_vector")
> +{
> +  if (GET_MODE_CLASS (mode) != MODE_VECTOR_INT)
> +return 0;
> +
> +  else
> +return op == CONST0_RTX (mode) || op == CONSTM1_RTX (mode);
> +})

This predicate is unused as far as I see?

> +  /* Optimize vec1 == vec2, to know the mask generates -1/0.  */
> +  if (GET_MODE_CLASS (dest_mode) == MODE_VECTOR_INT)
>  {
> -  tmp = op_true;
> -  op_true = op_false;
> -  op_false = tmp;
> +  if (op_true == constant_m1 && op_false == constant_0)
> + {
> +   emit_move_insn (dest, mask);
> +   return 1;
> + }
> +
> +  else if (op_true == constant_0 && op_false == constant_m1)
> + {
> +   emit_insn (gen_rtx_SET (dest, gen_rtx_NOT (dest_mode, mask)));
> +   return 1;
> + }
>  }

Do you need to test for dest_mode == mask_mode here, like below?

> +  if (op_true == constant_m1 && dest_mode == mask_mode)
> +op_true = mask;
> +  else if (!REG_P (op_true) && !SUBREG_P (op_true))
> +op_true = force_reg (dest_mode, op_true);
> +
> +  if (op_false == constant_0 && dest_mode == mask_mode)
> +op_false = mask;
> +  else if (!REG_P (op_false) && !SUBREG_P (op_false))
> +op_false = force_reg (dest_mode, op_false);

Another thing you could try is, if either op_true or op_false is 0
or -1, let the result be
(mask & op_true) | (~mask & op_false)

and let the rest of the optimisers sort it out (it's a single vor/vand
or vorc/vandc, or a vnot, or nothing).  A later improvement perhaps.
Or does it already handle all cases now :-)

Okay for trunk with the unused predicate removed, and the dest_mode ==
mask_mode thing looked at.  Thanks!


Segher


[PATCH] PR target/66144, PowerPC improve vector compare

2017-02-03 Thread Michael Meissner
This patch improves the code generation for:

vector char
compare_vc (vector char a, vector char b)
{
  return a == b;
}

Previously it generated:

vcmpequb 2,2,3
vspltisw 1,-1
vspltisw 0,0
xxsel 34,32,33,34

and now it generates:

vcmpequb 2,2,3

I changed the vector conditional support for integer vectors to allow constant
all 0's or all 1's in addition to registers, and it knows that after a vector
comparison, the vector element will be all 1's if the comparison was true, and
all 0's if the comparison was false.

I did the normal bootstrap build and make check.  There were no regressions.

I built Spec 2006 with the compiler, and I noticed that 15 out of the 29
benchmarks had code changes with this fix (bzip2, gcc, gamess,milc, gromacs,
gobmk, dealII, calculix, hmmer, sjeng, libquantum, h264ref, tonto, wrf,
sphinx3).  I did a quick run (one pass instead of the normal 3) of the
benchmarks that had code changes.  There were no regressions in performance,
and 2 runs that were faster (gcc and tonto) by 3-4%.

Can I check this fix into the GCC trunk?

Note, I will be on starting on vacation on Wednesday (February 8th), and I will
return to work on Friday February 24th.  If I don't get the approval by Monday,
I will likely hold the patches until I return to work.

[gcc]
2017-02-03  Michael Meissner  

PR target/66144
* config/rs6000/vector.md (vcond): Allow the true and
false values to be constant vectors with all 0 or all 1 bits set.
(vcondu): Likewise.
* config/rs6000/predicates.md (vector_int_same_bit): New
predicate.
(vector_int_reg_or_same_bit): Likewise.
(fpmask_comparison_operator): Update comment.
(vecint_comparison_operator): New predicate.
* config/rs6000/rs6000.c (rs6000_emit_vector_cond_expr): Optimize
vector conditionals when the true and false values are constant
vectors with all 0 bits or all 1 bits set.

[gcc/testsuite]
2017-02-03  Michael Meissner  

PR target/66144
* gcc.target/powerpc/pr66144-1.c: New test.
* gcc.target/powerpc/pr66144-2.c: Likewise.
* gcc.target/powerpc/pr66144-3.c: Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/vector.md
===
--- gcc/config/rs6000/vector.md (revision 245101)
+++ gcc/config/rs6000/vector.md (working copy)
@@ -390,13 +390,13 @@ (define_expand "vcond"
 }")
 
 (define_expand "vcond"
-  [(set (match_operand:VEC_I 0 "vint_operand" "")
+  [(set (match_operand:VEC_I 0 "vint_operand")
(if_then_else:VEC_I
 (match_operator 3 "comparison_operator"
-[(match_operand:VEC_I 4 "vint_operand" "")
- (match_operand:VEC_I 5 "vint_operand" "")])
-(match_operand:VEC_I 1 "vint_operand" "")
-(match_operand:VEC_I 2 "vint_operand" "")))]
+[(match_operand:VEC_I 4 "vint_operand")
+ (match_operand:VEC_I 5 "vint_operand")])
+(match_operand:VEC_I 1 "vector_int_reg_or_same_bit")
+(match_operand:VEC_I 2 "vector_int_reg_or_same_bit")))]
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)"
   "
 {
@@ -446,13 +446,13 @@ (define_expand "vcondv4siv4sf"
 }")
 
 (define_expand "vcondu"
-  [(set (match_operand:VEC_I 0 "vint_operand" "")
+  [(set (match_operand:VEC_I 0 "vint_operand")
(if_then_else:VEC_I
 (match_operator 3 "comparison_operator"
-[(match_operand:VEC_I 4 "vint_operand" "")
- (match_operand:VEC_I 5 "vint_operand" "")])
-(match_operand:VEC_I 1 "vint_operand" "")
-(match_operand:VEC_I 2 "vint_operand" "")))]
+[(match_operand:VEC_I 4 "vint_operand")
+ (match_operand:VEC_I 5 "vint_operand")])
+(match_operand:VEC_I 1 "vector_int_reg_or_same_bit")
+(match_operand:VEC_I 2 "vector_int_reg_or_same_bit")))]
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)"
   "
 {
Index: gcc/config/rs6000/predicates.md
===
--- gcc/config/rs6000/predicates.md (revision 245101)
+++ gcc/config/rs6000/predicates.md (working copy)
@@ -808,6 +808,33 @@ (define_predicate "all_ones_constant"
   (and (match_code "const_int,const_double,const_wide_int,const_vector")
(match_test "op == CONSTM1_RTX (mode) && !FLOAT_MODE_P (mode)")))
 
+;; Return 1 if operand is either a vector constant of all 0 bits of a vector
+;; constant of all 1 bits.
+(define_predicate "vector_int_same_bit"
+  (match_code "const_vector")
+{
+  if (GET_MODE_CLASS (mode) != MODE_VECTOR_INT)
+return 0;
+
+  else
+return op == CONST0_RTX (mode)