Hi all,

I've been attempting to get early gimple-folding to work with the vec_sel intrinsic for powerpc, and I've run into a snag or two, such that I'm not sure how best to proceed. The code snippet is below, followed by a description of the issues as I interpret them.

Apologies for the ramble,
Thanks in advance,
... :-)
-Will

---8<---
    /* vector selects */
    /* d = vec_sel (a, b, c)
       Each bit of the result vector (d) has the value of the corresponding
       bit of (a) if the corresponding bit of (c) is 0.  Otherwise, each bit
       of the result vector has the value of the corresponding bit of (b).  */
    case ALTIVEC_BUILTIN_VSEL_16QI:
    case ALTIVEC_BUILTIN_VSEL_8HI:
    case ALTIVEC_BUILTIN_VSEL_4SI:
    case ALTIVEC_BUILTIN_VSEL_2DI:
    case ALTIVEC_BUILTIN_VSEL_4SF:
    case ALTIVEC_BUILTIN_VSEL_2DF:
      {
        tree cond_tree = gimple_call_arg (stmt, 2);
        tree then_tree = gimple_call_arg (stmt, 0);
        tree else_tree = gimple_call_arg (stmt, 1);
        lhs = gimple_call_lhs (stmt);
        location_t loc = gimple_location (stmt);
        gimple_seq stmts = NULL;
        /* Convert the selector to a same-sized truth (boolean) vector.  */
        tree truth_cond_tree_type
          = build_same_sized_truth_vector_type (TREE_TYPE (cond_tree));
        tree truth_cond_tree
          = gimple_convert (&stmts, loc, truth_cond_tree_type, cond_tree);
        gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
        /* Replace the builtin call with a VEC_COND_EXPR.  */
        g = gimple_build_assign (lhs, VEC_COND_EXPR, truth_cond_tree,
                                 then_tree, else_tree);
        gimple_set_location (g, gimple_location (stmt));
        gsi_replace (gsi, g, true);
        return true;
      }
---8<---

First issue (easier?) - The above code snippet works for the existing powerpc/fold-vec-sel-* testcases, except that when comparing before and after codegen, we end up with an extra pair of instructions (vspltisw and vcmpgtsh ~~ splat zero, compare). This appears to be due to taking the "Fake op0 < 0" path in optabs.c: expand_vec_cond_expr(). I've not finished debugging that path, but mention it in case this is something obvious. This is probably minor with respect to issue 2.

Second issue - This works for the simple tests, so it seems like the implementation should be close to correct, but it triggers an ICE when trying to build some of the pre-existing tests. After some investigation, this appears to be specific to those tests that have non-variable values for the condition vector.
For instance, our powerpc/altivec-32.c testcase contains:

    unsigned k = 1;
    a = (vector unsigned) { 0, 0, 0, 1 };
    b = c = (vector unsigned) { 0, 0, 0, 0 };
    a = vec_add (a, vec_splats (k));
    b = vec_add (b, a);
    c = vec_sel (c, a, b);

which ends up as...

    c = vec_sel ({0,0,0,0}, {1,1,1,2}, {1,1,1,2});

Our condition vector is the last argument, so this gets rearranged a bit when we build the vec_cond_expr, and we eventually get (at gimple time)

    _1 = VEC_COND_EXPR <{ 1(OVF), 1(OVF), 1(OVF), 2(OVF) }, { 0, 0, 0, 0 }, { 1, 1, 1, 2 }>;

We subsequently ICE (at expand time) when we hit a gcc_unreachable() in expr.c const_vector_mask_from_tree(), when we try to map that '2(OVF)' value to a (boolean) zero or minus_one, and fail.

during RTL pass: expand
dump file: altivec-34.c.230r.expand
/home/willschm/gcc/gcc-mainline-regtest_patches/gcc/testsuite/gcc.target/powerpc/altivec-34.c: In function ‘foo’:
/home/willschm/gcc/gcc-mainline-regtest_patches/gcc/testsuite/gcc.target/powerpc/altivec-34.c:21:6: internal compiler error: in const_vector_mask_from_tree, at expr.c:12247
0x1059fba7 const_vector_mask_from_tree
        /home/willschm/gcc/gcc-mainline-regtest_patches/gcc/expr.c:12247

Ultimately, this seems like an impasse between the vec_sel intrinsic being a bit-wise operation and the truth_vector being of a boolean type. But... I'm not certain.

At an earlier time I had tried to implement vec_sel() using a mix of BIT_NOT_EXPR, BIT_AND_EXPR, and BIT_IOR_EXPR, but ended up with some incredibly horrible codegen (which would be a no-go). I may need to revisit that.

Thanks,
-Will
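
P.S. To make the bit-wise versus per-element mismatch concrete, here is a small standalone C model. It is only a sketch and not GCC code: the names bitwise_sel, lane_sel and is_boolean_lane are made up for illustration, the vectors are modeled as four 32-bit lanes, and lane_sel treats any nonzero lane as true, which is an assumption of the model rather than a statement about how VEC_COND_EXPR gets expanded.

```c
#include <stdint.h>

#define LANES 4

/* Bit-wise select, following the vec_sel description: each result bit
   comes from a where the matching bit of c is 0, otherwise from b.  */
void
bitwise_sel (const uint32_t *a, const uint32_t *b, const uint32_t *c,
             uint32_t *d)
{
  for (int i = 0; i < LANES; i++)
    d[i] = (a[i] & (uint32_t) ~c[i]) | (b[i] & c[i]);
}

/* Per-lane select: a whole lane is taken from then_v or else_v.
   Assumption of this model: any nonzero mask lane counts as true.  */
void
lane_sel (const uint32_t *mask, const uint32_t *then_v,
          const uint32_t *else_v, uint32_t *d)
{
  for (int i = 0; i < LANES; i++)
    d[i] = mask[i] ? then_v[i] : else_v[i];
}

/* A lane is a well-formed boolean mask only if it is 0 or all ones
   (minus one).  Values such as 1 or 2 are neither.  */
int
is_boolean_lane (uint32_t lane)
{
  return lane == 0u || lane == UINT32_MAX;
}
```

With a = 0xF0F0F0F0, b = 0x0F0F0F0F and c = 0x0000FFFF in one lane, bitwise_sel mixes bits from both inputs (giving 0xF0F00F0F), while lane_sel can only return one input whole. The two agree only when every selector lane is 0 or all ones, which is what is_boolean_lane checks; a lane value of 2 fails that check.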