Hi Richard,
I was wondering what you think of the following patch as a solution to
PR tree-optimization/96912, i.e. the ability to recognize pblendvb from
regular code rather than as a target-specific builtin?
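
For reference, here is a small scalar sketch of the identity the new
pattern relies on.  It is not part of the patch, and the helper names
(blend_xor, blend_andor, blend_select, blend_forms_agree) are made up
for this illustration.  It assumes each mask element is either 0 or all
ones, which is what vector_mask_p is meant to guarantee; under that
assumption the xor/and/xor form, the and/or form used in the test case,
and a plain per-element select all produce the same value.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of the patch.  */

/* Blend written as xor/and/xor, the shape the simplify pattern matches.  */
static uint8_t blend_xor (uint8_t x, uint8_t y, uint8_t m)
{
  return (uint8_t) (((x ^ y) & m) ^ x);
}

/* Blend written as and/or, as in the foo/bar test functions.  */
static uint8_t blend_andor (uint8_t x, uint8_t y, uint8_t m)
{
  return (uint8_t) ((~m & x) | (m & y));
}

/* Blend written as a per-element select: y where the mask is set, else x.  */
static uint8_t blend_select (uint8_t x, uint8_t y, uint8_t m)
{
  return m ? y : x;
}

/* Return 1 if all three forms agree for every x and y, with the mask
   restricted to 0x00 or 0xff (the assumed "vector mask" property).  */
static int blend_forms_agree (void)
{
  static const uint8_t masks[2] = { 0x00, 0xff };
  for (unsigned int i = 0; i < 2; i++)
    for (unsigned int x = 0; x < 256; x++)
      for (unsigned int y = 0; y < 256; y++)
        {
          uint8_t a = blend_xor ((uint8_t) x, (uint8_t) y, masks[i]);
          uint8_t b = blend_andor ((uint8_t) x, (uint8_t) y, masks[i]);
          uint8_t c = blend_select ((uint8_t) x, (uint8_t) y, masks[i]);
          if (a != b || b != c)
            return 0;
        }
  return 1;
}
```

The two bitwise forms agree for any mask value; only the equivalence
with a whole-element select needs the 0-or-all-ones restriction, which
is why the pattern is guarded by vector_mask_p.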

The obvious point of contention is that the current middle-end philosophy
around vector expressions is that the middle-end should continually check
for backend support, whereas I prefer the "old school" view that trees
are an abstraction and that RTL expansion is the point where these abstract
operations get converted/lowered into instructions supported by the target.
[The exceptions being built-in functions, IFN_* etc.]  Should tree.texi
document which tree codes can't be used without checking the backend?


Bootstrapped and regression tested, but this obviously depends upon RTL
expansion being able to perform the inverse operation/lowering if required.


2022-05-23  Roger Sayle  <ro...@nextmovesoftware.com>

gcc/ChangeLog
        PR tree-optimization/96912
        * match.pd (vector_mask_p): New predicate to identify vectors
        where every element must be zero or all ones.
        (bit_xor (bit_and (bit_xor ...) ...) ...): Recognize a VEC_COND_EXPR
        expressed as logical vector operations.

gcc/testsuite/ChangeLog
        PR tree-optimization/96912
        * gcc.target/i386/pr96912.c: New test case.


Thoughts?  How would you solve this PR?  Are there convenience predicates
for testing whether a target supports vec_cond_expr, vec_duplicate, etc.?

Cheers,
Roger
--

diff --git a/gcc/match.pd b/gcc/match.pd
index c2fed9b..e365f28 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4221,6 +4221,35 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
    (if (integer_all_onesp (@1) && integer_zerop (@2))
     @0))))
 
+/* An integer vector where every element must be 0 or -1.  */
+(match vector_mask_p
+ @0
+ (if (VECTOR_BOOLEAN_TYPE_P (type))))
+(match vector_mask_p
+ VECTOR_CST@0
+ (if (integer_zerop (@0) || integer_all_onesp (@0))))
+(match vector_mask_p
+ (vec_cond @0 vector_mask_p@1 vector_mask_p@2))
+(match vector_mask_p
+ (bit_not vector_mask_p@0))
+(for op (bit_and bit_ior bit_xor)
+ (match vector_mask_p
+  (op vector_mask_p@0 vector_mask_p@1)))
+
+/* Recognize VEC_COND_EXPR.  */
+(simplify
+ (bit_xor:c (bit_and:c (bit_xor:c @0 @1) (view_convert vector_mask_p@2)) @0)
+ (if (VECTOR_TYPE_P (type)
+      && VECTOR_TYPE_P (TREE_TYPE (@2)))
+  (with { tree masktype = truth_type_for (TREE_TYPE (@2));
+          tree vecttype = maybe_ne (TYPE_VECTOR_SUBPARTS (masktype),
+                                   TYPE_VECTOR_SUBPARTS (type))
+                         ? unsigned_type_for (masktype)
+                         : type; }
+   (view_convert (vec_cond:vecttype (view_convert:masktype @2)
+                                   (view_convert:vecttype @1)
+                                   (view_convert:vecttype @0))))))
+
 /* A few simplifications of "a ? CST1 : CST2". */
 /* NOTE: Only do this on gimple as the if-chain-to-switch
    optimization depends on the gimple to have if statements in it. */
/* { dg-do compile { target { ! ia32 } } } */
/* { dg-options "-O2 -msse4" } */

typedef char V __attribute__((vector_size(16)));
typedef long long W __attribute__((vector_size(16)));

W
foo (W x, W y, V m)
{
  W t = (m < 0);
  return (~t & x) | (t & y);
}

V
bar (V x, V y, V m)
{
  V t = (m < 0);
  return (~t & x) | (t & y);
}

/* { dg-final { scan-assembler-times "pblend" 2 } } */
