Hi Ke Wen, On Tue, Nov 05, 2019 at 04:35:05PM +0800, Kewen.Lin wrote: > >> ;; 128-bit one's complement > >> -(define_insn_and_split "*one_cmpl<mode>3_internal" > >> +(define_insn_and_split "one_cmpl<mode>3_internal" > > > > Instead, rename it to "one_cmpl<mode>3" and delete the define_expand that > > serves no function? > > Renamed. Sorry, what's the "define_expand" specified here. I thought it's > for existing one_cmpl<mode>3 but I didn't find it.
The expander named "one_cmpl<mode>3": Erm. 2, not 3 :-) (define_expand "one_cmpl<mode>2" [(set (match_operand:BOOL_128 0 "vlogical_operand") (not:BOOL_128 (match_operand:BOOL_128 1 "vlogical_operand")))] "" "") while the define_insn is (define_insn_and_split "*one_cmpl<mode>3_internal" [(set (match_operand:BOOL_128 0 "vlogical_operand" "=<BOOL_REGS_OUTPUT>") (not:BOOL_128 (match_operand:BOOL_128 1 "vlogical_operand" "<BOOL_REGS_UNARY>")))] "" { etc., so you can just delete the expand and rename the insn to the proper name (one_cmpl<mode>2). It sometimes is useful to have an expand like this if there are multiple insns that could implement this, but that is not the case here. > >> +(define_code_iterator fpcmpun [ungt unge unlt unle]) > > > > Why these four? Should there be more? Should this be added to some > > existing iterator? > > For floating point comparison operator and vector type, currently rs6000 > supports eq, gt, ge, *ltgt, *unordered, *ordered, *uneq (* for unnamed). > We can leverage gt, ge, eq for lt, le, ne, then these four left. There are four conditions for FP: lt/gt/eq/un. For every comparison, exactly one of the four is true. If not HONOR_NANS for this mode you never have un, so it is one of lt/gt/eq then, just like with integers. If we have HONOR_NANS(mode) (or !flag_finite_math_only), there are 14 possible combinations to test for (testing for any of the four or none of the four is easy ;-) ) Four test just if lt, gt, eq, or un is set. Another four test if one of the flags is *not* set, or said differently, if one of three flags is set: ordered, ne, unle, unge. The remaining six test two flags each: ltgt, le, unlt, ge, ungt, uneq. > I originally wanted to merge them into the existing unordered or uneq, but > I found it's hard to share their existing patterns. For example, the uneq > looks like: > > [(set (match_dup 3) > (gt:VEC_F (match_dup 1) > (match_dup 2))) > (set (match_dup 4) > (gt:VEC_F (match_dup 2) > (match_dup 1))) > (set (match_dup 0) > (and:VEC_F (not:VEC_F (match_dup 3)) > (not:VEC_F (match_dup 4))))] Or ge/ge/eqv, etc. -- there are multiple options. > While ungt looks like: > > [(set (match_dup 3) > (ge:VEC_F (match_dup 1) > (match_dup 2))) > (set (match_dup 4) > (ge:VEC_F (match_dup 2) > (match_dup 1))) > (set (match_dup 3) > (ior:VEC_F (not:VEC_F (match_dup 3)) > (not:VEC_F (match_dup 4)))) > (set (match_dup 4) > (gt:VEC_F (match_dup 1) > (match_dup 2))) > (set (match_dup 3) > (ior:VEC_F (match_dup 3) > (match_dup 4)))] (set (match_dup 3) (ge:VEC_F (match_dup 2) (match_dup 1))) (set (match_dup 0) (not:VEC_F (match_dup 3))) should be enough? So we have only gt/ge/eq. I think the following are ooptimal (not tested!): lt(a,b) = gt(b,a) gt(a,b) = gt(a,b) eq(a,b) = eq(a,b) un(a,b) = ~(ge(a,b) | ge(b,a)) ltgt(a,b) = ge(a,b) ^ ge(b,a) le(a,b) = ge(b,a) unlt(a,b) = ~ge(a,b) ge(a,b) = ge(a,b) ungt(a,b) = ~ge(b,a) uneq(a,b) = ~(ge(a,b) ^ ge(b,a)) ord(a,b) = ge(a,b) | ge(b,a) ne(a,b) = ~eq(a,b) unle(a,b) = ~gt(a,b) unge(a,b) = ~gt(b,a) This is quite regular :-) 5 are done with one cmp; 5 are done with a cmp and an inversion; 4 are done with two compares and one xor/eqv/or/nor. Half are pretty simple: lt(a,b) = gt(b,a) gt(a,b) = gt(a,b) eq(a,b) = eq(a,b) le(a,b) = ge(b,a) ge(a,b) = ge(a,b) ltgt(a,b) = ge(a,b) ^ ge(b,a) ord(a,b) = ge(a,b) | ge(b,a) The other half are the negations of those: unge(a,b) = ~gt(b,a) unle(a,b) = ~gt(a,b) ne(a,b) = ~eq(a,b) ungt(a,b) = ~ge(b,a) unlt(a,b) = ~ge(a,b) uneq(a,b) = ~(ge(a,b) ^ ge(b,a)) un(a,b) = ~(ge(a,b) | ge(b,a)) And please remember to test everythin with -ffast-math :-) That is, when flag_finite_math_only is set. You cannot get unordered results, then, making the optimal sequences different in some cases (and changing what "ne" means!) 8 codes, ordered: never lt gt ltgt eq le ge ordered 8 codes, unordered: unordered unlt ungt ne uneq unle unge always 8 codes, fast-math: never lt gt ne eq le ge always 8 codes, non-fp: never lt gt ne eq le ge always > >> +;; Same mode for condition true/false values and predicate operand. > >> +(define_expand "vcond_mask_<mode><mode>" > >> + [(match_operand:VEC_I 0 "vint_operand") > >> + (match_operand:VEC_I 1 "vint_operand") > >> + (match_operand:VEC_I 2 "vint_operand") > >> + (match_operand:VEC_I 3 "vint_operand")] > >> + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" > >> +{ > >> + emit_insn (gen_vector_select_<mode> (operands[0], operands[2], > >> operands[1], > >> + operands[3])); > >> + DONE; > >> +}) > > > > So is this exactly the same as vsel/xxsel? > > Yes, expanded into if_then_else and ne against zero, can match their patterns. Ah, so vector_select is not the canonical name. > > You shouldn't allow those for signed comparisons, that will only hide > > problems. > > OK, moved into vec_cmpu*. > > Why *are* there separate vec_cmp and vec_cmpu patterns, in the first place? > > If I understood the question correctly, you were asking why not have one > unique pattern for them? Yes, it is redundant, the comparison code already says if it is an unsigned comparison. So this a question about the generic patterns, not your implementation of them :-) And if it is *one* pattern then handling LTU etc. makes perfect sense. > I noticed some vectorization related SPNs have > separate signed and unsigned patterns, I guess it's due to that sign matters > for some vector instructions, some platform may only support some of them, > using sign for fine grain queries and checks? I think it is because one particular implementation has different machine insns for both, the one this interface was implemented for first. > Updated patch attached by addressing above comments. I'll review it later, sorry. Segher