Hi Richard,

I was wondering what you think of the following patch as a solution to
PR tree-optimization/96912, i.e. the ability to recognize pblendvb from
regular code rather than as a target-specific builtin?

The obvious point of contention is that the current middle-end philosophy
around vector expressions is that the middle-end should continually check
for backend support, whereas I prefer the "old school" view that trees
are an abstraction and that RTL expansion is the point where these abstract
operations get converted/lowered into instructions supported by the target.
[The exceptions being built-in functions, IFN_*, etc.]  Should tree.texi
document which tree codes can't be used without checking the backend?

Bootstrapped and regression tested, but this obviously depends upon RTL
expansion being able to perform the inverse operation/lowering if required.


2022-05-23  Roger Sayle  <ro...@nextmovesoftware.com>

gcc/ChangeLog
        PR tree-optimization/96912
        * match.pd (vector_mask_p): New predicate to identify vectors
        where every element must be zero or all ones.
        (bit_xor (bit_and (bit_xor ...) ...) ...): Recognize a VEC_COND_EXPR
        expressed as logical vector operations.

gcc/testsuite/ChangeLog
        PR tree-optimization/96912
        * gcc.target/i386/pr96912.c: New test case.


Thoughts?  How would you solve this PR?  Are there convenience predicates
for testing whether a target supports vec_cond_expr, vec_duplicate, etc.?

Cheers,
Roger
--
diff --git a/gcc/match.pd b/gcc/match.pd
index c2fed9b..e365f28 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4221,6 +4221,35 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
    (if (integer_all_onesp (@1) && integer_zerop (@2))
     @0))))
 
+/* An integer vector where every element must be 0 or -1.  */
+(match vector_mask_p
+ @0
+ (if (VECTOR_BOOLEAN_TYPE_P (type))))
+(match vector_mask_p
+ VECTOR_CST@0
+ (if (integer_zerop (@0) || integer_all_onesp (@0))))
+(match vector_mask_p
+ (vec_cond @0 vector_mask_p@1 vector_mask_p@2))
+(match vector_mask_p
+ (bit_not vector_mask_p@0))
+(for op (bit_and bit_ior bit_xor)
+ (match vector_mask_p
+  (op vector_mask_p@0 vector_mask_p@1)))
+
+/* Recognize VEC_COND_EXPR.  */
+(simplify
+ (bit_xor:c (bit_and:c (bit_xor:c @0 @1) (view_convert vector_mask_p@2)) @0)
+ (if (VECTOR_TYPE_P (type)
+      && VECTOR_TYPE_P (TREE_TYPE (@2)))
+  (with { tree masktype = truth_type_for (TREE_TYPE (@2));
+          tree vecttype = maybe_ne (TYPE_VECTOR_SUBPARTS (masktype),
+                                    TYPE_VECTOR_SUBPARTS (type))
+                          ? unsigned_type_for (masktype)
+                          : type; }
+   (view_convert (vec_cond:vecttype (view_convert:masktype @2)
+                                    (view_convert:vecttype @1)
+                                    (view_convert:vecttype @0))))))
+
 /* A few simplifications of "a ? CST1 : CST2". */
 /* NOTE: Only do this on gimple as the if-chain-to-switch
    optimization depends on the gimple to have if statements in it. */
/* { dg-do compile { target { ! ia32 } } } */
/* { dg-options "-O2 -msse4" } */

typedef char V __attribute__((vector_size(16)));
typedef long long W __attribute__((vector_size(16)));

W foo (W x, W y, V m)
{
  W t = (m < 0);
  return (~t & x) | (t & y);
}

V bar (V x, V y, V m)
{
  V t = (m < 0);
  return (~t & x) | (t & y);
}

/* { dg-final { scan-assembler-times "pblend" 2 } } */
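
For readers who want to convince themselves of the identity the new simplify pattern relies on, here is a minimal scalar sketch, one 8-bit lane at a time. It is not part of the patch. The helper names (blend_xor, blend_andor, blend_select, blend_identity_holds) are hypothetical and exist only for this illustration. It checks that the xor form matched by the pattern, ((x ^ y) & m) ^ x, and the and/or form used in the test case, (~m & x) | (m & y), both agree with "m ? y : x" whenever the mask lane is 0 or all ones. That restriction is what vector_mask_p is meant to guarantee.

```c
#include <stdint.h>

/* Blend in the shape matched by the simplify pattern: ((x ^ y) & m) ^ x.  */
static uint8_t blend_xor (uint8_t x, uint8_t y, uint8_t m)
{
  return (uint8_t) (((x ^ y) & m) ^ x);
}

/* Blend in the shape used by the test case: (~m & x) | (m & y).  */
static uint8_t blend_andor (uint8_t x, uint8_t y, uint8_t m)
{
  return (uint8_t) ((~m & x) | (m & y));
}

/* Reference semantics of one VEC_COND_EXPR lane: pick y where the mask
   is set, x otherwise.  */
static uint8_t blend_select (uint8_t x, uint8_t y, uint8_t m)
{
  return m ? y : x;
}

/* Exhaustively compare the three forms for every pair of 8-bit values
   and both legal mask values.  Returns 1 if they always agree, else 0.  */
static int blend_identity_holds (void)
{
  static const uint8_t masks[2] = { 0x00, 0xff };
  for (int k = 0; k < 2; k++)
    for (int x = 0; x < 256; x++)
      for (int y = 0; y < 256; y++)
        {
          uint8_t sel = blend_select ((uint8_t) x, (uint8_t) y, masks[k]);
          if (blend_xor ((uint8_t) x, (uint8_t) y, masks[k]) != sel
              || blend_andor ((uint8_t) x, (uint8_t) y, masks[k]) != sel)
            return 0;
        }
  return 1;
}
```

For a mask lane that is neither 0 nor all ones, the logical forms select bit by bit rather than lane by lane, so they no longer agree with blend_select. This is why the pattern only fires when the mask operand satisfies vector_mask_p.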