Hi all, 
I've been attempting to get early gimple-folding to work with the
vec_sel intrinsic for powerpc, and I've run into a snag or two such that
I'm not sure how to best proceed.  Code snippet is below, followed by a
description of the issues as I interpret them below.

Apologies for the ramble, Thanks in advance, ...  :-)

-Will

---8<---
/* vector selects */
/* d = vec_sel (a, b, c)
Each bit of the result vector (d) has the value of
the corresponding bit of (a) if the corresponding bit
of (c) is 0. Otherwise, each bit of the result vector
has the value of the corresponding bit of (b).  */
    case ALTIVEC_BUILTIN_VSEL_16QI:
    case ALTIVEC_BUILTIN_VSEL_8HI:
    case ALTIVEC_BUILTIN_VSEL_4SI:
    case ALTIVEC_BUILTIN_VSEL_2DI:
    case ALTIVEC_BUILTIN_VSEL_4SF:
    case ALTIVEC_BUILTIN_VSEL_2DF:
{
        tree cond_tree = gimple_call_arg (stmt, 2);
        tree then_tree = gimple_call_arg (stmt, 0);
        tree else_tree = gimple_call_arg (stmt, 1);
        lhs = gimple_call_lhs (stmt);
        location_t loc = gimple_location (stmt);
        gimple_seq stmts = NULL;
       tree truth_cond_tree_type
         = build_same_sized_truth_vector_type (TREE_TYPE(cond_tree));
       tree truth_cond_tree
         = gimple_convert (&stmts, loc, truth_cond_tree_type, cond_tree);
       gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
       g = gimple_build_assign (lhs, VEC_COND_EXPR, truth_cond_tree,
                                then_tree, else_tree);
       gimple_set_location (g, gimple_location (stmt));
       gsi_replace (gsi, g, true);
       return true;
}
---8<---

First issue (easier?) - The above code snippet works for the
existing powerpc/fold-vec-sel-* testcases, except that when comparing
before and after codegen, we end up with an extra pair of instructions
(vspltisw and vcmpgtsh ~~ splat zero, compare).  This appears to be due
to taking the "Fake op0 < 0" path during optabs.c:
expand_vec_cond_expr(),  I've not fully exhausted my debug on that
front, but mention it in case this is something obvious.  And, this is
probably minor with respect to issue 2.

Second issue - This works for the simple tests, so seems like the
implementation should be close to correct, but triggers an ICE
when trying to build some of the pre-existing tests.  After some
investigation, this appears to be specific to those tests that have
non-variable values for the condition vector.

For instance, our powerpc/altivec-32.c testcase contains: 
  unsigned k = 1;
  a = (vector unsigned) { 0, 0, 0, 1 };
  b = c = (vector unsigned) { 0, 0, 0, 0 };
  a = vec_add (a, vec_splats (k));
  b = vec_add (b, a);
  c = vec_sel (c, a, b);

which ends up as...
c = vec_sel ({0,0,0,0}, {1,1,1,2}, {1,1,1,2});
Our condition vector is the last argument, so this gets rearranged a bit
when we build the vec_cond_expr, and we eventually get (at gimple time)

  _1 = VEC_COND_EXPR <{ 1(OVF), 1(OVF), 1(OVF), 2(OVF) }, { 0, 0, 0, 0 }, { 1, 
1, 1, 2 }>;

And we subsequently ICE (at expand time) when we hit a gcc_unreachable()
in expr.c const_vector_mask_from_tree() when we try to map that '2(OVF)'
value to a (boolean) zero or minus_one, and fail.

during RTL pass: expand
dump file: altivec-34.c.230r.expand
/home/willschm/gcc/gcc-mainline-regtest_patches/gcc/testsuite/gcc.target/powerpc/altivec-34.c:
 In function ‘foo’:
/home/willschm/gcc/gcc-mainline-regtest_patches/gcc/testsuite/gcc.target/powerpc/altivec-34.c:21:6:
 internal compiler error: in const_vector_mask_from_tree, at expr.c:12247
0x1059fba7 const_vector_mask_from_tree
        /home/willschm/gcc/gcc-mainline-regtest_patches/gcc/expr.c:12247

Ultimately, this seems like an impasse between the vec_sel intrinsic
being a bit-wise operation, and the truth_vector being of a boolean
type.   But.. i'm not certain.
I had at an earlier time tried to implement vec_sel() up using a mix of
BIT_NOT_EXPR, BIT_AND_EXPR, BIT_IOR_EXPR, but ended up with some
incredibly horrible codegen.  (which would be a no-go).  I may need to
revisit that..

Thanks,
-Will



Reply via email to