On Fri, Feb 03, 2017 at 06:07:56PM -0600, Segher Boessenkool wrote:
> On Fri, Feb 03, 2017 at 04:25:00PM -0500, Michael Meissner wrote:
> > +;; Return 1 if operand is either a vector constant of all 0 bits of a vector
> > +;; constant of all 1 bits.
> > +(define_predicate "vector_int_same_bit"
> > +  (match_code "const_vector")
> > +{
> > +  if (GET_MODE_CLASS (mode) != MODE_VECTOR_INT)
> > +    return 0;
> > +
> > +  else
> > +    return op == CONST0_RTX (mode) || op == CONSTM1_RTX (mode);
> > +})
>
> This predicate is unused as far as I see?

Right.  It was used in my first attempt, when I had a peephole2 to eliminate
the extra loads.  When I moved the processing to the point where we create the
conditional vector assignment and deleted the peephole2, I missed deleting the
predicate.  Thanks for catching it.

> > +  /* Optimize vec1 == vec2, to know the mask generates -1/0.  */
> > +  if (GET_MODE_CLASS (dest_mode) == MODE_VECTOR_INT)
> >      {
> > -      tmp = op_true;
> > -      op_true = op_false;
> > -      op_false = tmp;
> > +      if (op_true == constant_m1 && op_false == constant_0)
> > +        {
> > +          emit_move_insn (dest, mask);
> > +          return 1;
> > +        }
> > +
> > +      else if (op_true == constant_0 && op_false == constant_m1)
> > +        {
> > +          emit_insn (gen_rtx_SET (dest, gen_rtx_NOT (dest_mode, mask)));
> > +          return 1;
> > +        }
> >      }
>
> Do you need to test for dest_mode == mask_mode here, like below?

Yes, because there is support for vcondv4siv4sf and vcondv4sfv4si, where the
mask is one of V4SI or V4SF and the destination is the other.  The
dest_mode == mask_mode test checks for that and ignores that case.

I.e. something like:

        static float af[1024], bf[1024], cf[1024];
        static int di[1024], ei[1024];

        // ..

        for (i = 0; i < 1024; i++)
          af[i] = (di[i] == ei[i]) ? bf[i] : cf[i];

>
> > +  if (op_true == constant_m1 && dest_mode == mask_mode)
> > +    op_true = mask;
> > +  else if (!REG_P (op_true) && !SUBREG_P (op_true))
> > +    op_true = force_reg (dest_mode, op_true);
> > +
> > +  if (op_false == constant_0 && dest_mode == mask_mode)
> > +    op_false = mask;
> > +  else if (!REG_P (op_false) && !SUBREG_P (op_false))
> > +    op_false = force_reg (dest_mode, op_false);
>
> Another thing you could try is, if either op_true or op_false is 0
> or -1, let the result be
>   (mask & op_true) | (~mask & op_false)
>
> and let the rest of the optimisers sort it out (it's a single vor/vand
> or vorc/vandc, or a vnot, or nothing).  A later improvement perhaps.
> Or does it already handle all cases now :-)

I don't know if it would handle it.
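
For reference, here is a rough scalar sketch of the identity quoted above,
applied to a single 32-bit lane.  It is not GCC code and says nothing about
what the RTL optimizers actually do; select_lane and check_select_folds are
made-up names, and the lane is modelled as a plain uint32_t.  It only shows
why each 0/-1 special case reduces to one logical operation (or none).

```c
#include <assert.h>
#include <stdint.h>

/* One lane of a vector select: take op_true's bits where the mask is set
   and op_false's bits where it is clear.  */
static uint32_t
select_lane (uint32_t mask, uint32_t op_true, uint32_t op_false)
{
  return (mask & op_true) | ((uint32_t) ~mask & op_false);
}

/* Hypothetical self-check: assert each 0/-1 special case for the given
   lane mask and an arbitrary other operand X.  */
static void
check_select_folds (uint32_t mask, uint32_t x)
{
  const uint32_t m1 = UINT32_MAX;	/* all ones, i.e. -1 */
  const uint32_t not_mask = (uint32_t) ~mask;

  assert (select_lane (mask, m1, 0) == mask);		/* nothing (a move) */
  assert (select_lane (mask, 0, m1) == not_mask);	/* not */
  assert (select_lane (mask, m1, x) == (mask | x));	/* or */
  assert (select_lane (mask, x, 0) == (mask & x));	/* and */
  assert (select_lane (mask, 0, x) == (not_mask & x));	/* and-complement */
  assert (select_lane (mask, x, m1) == (x | not_mask));	/* or-complement */
}
```

The first two asserts correspond to the two cases the patch handles directly
(the move and the NOT); the other four are the forms a later improvement
would have to rely on the optimizers to find.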
>
> Okay for trunk with the unused predicate removed, and the dest_mode ==
> mask_mode thing looked at.

Thanks!

--
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797