Richard Biener <rguent...@suse.de> writes:
> On Mon, 11 Dec 2023, Tamar Christina wrote:
>> @@ -5553,6 +5554,83 @@ integer_type_for_mask (tree var, vec_info *vinfo)
>>    return build_nonstandard_integer_type (def_stmt_info->mask_precision, 1);
>>  }
>>
>> +/* Function vect_recog_gcond_pattern
>> +
>> +   Try to find pattern like following:
>> +
>> +     if (a op b)
>> +
>> +   where operator 'op' is not != and convert it to an adjusted boolean pattern
>> +
>> +     mask = a op b
>> +     if (mask != 0)
>> +
>> +   and set the mask type on MASK.
>> +
>> +   Input:
>> +
>> +   * STMT_VINFO: The stmt at the end from which the pattern
>> +                 search begins, i.e. cast of a bool to
>> +                 an integer type.
>> +
>> +   Output:
>> +
>> +   * TYPE_OUT: The type of the output of this pattern.
>> +
>> +   * Return value: A new stmt that will be used to replace the pattern. */
>> +
>> +static gimple *
>> +vect_recog_gcond_pattern (vec_info *vinfo,
>> +                          stmt_vec_info stmt_vinfo, tree *type_out)
>> +{
>> +  gimple *last_stmt = STMT_VINFO_STMT (stmt_vinfo);
>> +  gcond* cond = NULL;
>> +  if (!(cond = dyn_cast <gcond *> (last_stmt)))
>> +    return NULL;
>> +
>> +  auto lhs = gimple_cond_lhs (cond);
>> +  auto rhs = gimple_cond_rhs (cond);
>> +  auto code = gimple_cond_code (cond);
>> +
>> +  tree scalar_type = TREE_TYPE (lhs);
>> +  if (VECTOR_TYPE_P (scalar_type))
>> +    return NULL;
>> +
>> +  if (code == NE_EXPR && zerop (rhs))
>
> I think you need && VECT_SCALAR_BOOLEAN_TYPE_P (scalar_type) here,
> an integer != 0 would not be an appropriate mask. I guess two
> relevant testcases would have an early exit like
>
>   if (here[i] != 0)
>     break;
>
> once with a 'bool here[]' and once with a 'int here[]'.
>
>> +    return NULL;
>> +
>> +  tree vecitype = get_vectype_for_scalar_type (vinfo, scalar_type);
>> +  if (vecitype == NULL_TREE)
>> +    return NULL;
>> +
>> +  /* Build a scalar type for the boolean result that when vectorized matches the
>> +     vector type of the result in size and number of elements. */
>> +  unsigned prec
>> +    = vector_element_size (tree_to_poly_uint64 (TYPE_SIZE (vecitype)),
>> +                           TYPE_VECTOR_SUBPARTS (vecitype));
>> +
>> +  scalar_type
>> +    = build_nonstandard_integer_type (prec, TYPE_UNSIGNED (scalar_type));
>> +
>> +  vecitype = get_vectype_for_scalar_type (vinfo, scalar_type);
>> +  if (vecitype == NULL_TREE)
>> +    return NULL;
>> +
>> +  tree vectype = truth_type_for (vecitype);
>
> That looks awfully complicated. I guess one complication is that
> we compute mask_precision & friends before this pattern gets
> recognized. See vect_determine_mask_precision and its handling
> of tcc_comparison, see also integer_type_for_mask. For comparisons
> properly handled during pattern recog the vector type is determined
> in vect_get_vector_types_for_stmt via
>
>   else if (vect_use_mask_type_p (stmt_info))
>     {
>       unsigned int precision = stmt_info->mask_precision;
>       scalar_type = build_nonstandard_integer_type (precision, 1);
>       vectype = get_mask_type_for_scalar_type (vinfo, scalar_type,
>                                                group_size);
>       if (!vectype)
>         return opt_result::failure_at (stmt, "not vectorized: unsupported"
>                                        " data-type %T\n", scalar_type);
>
> Richard, do you have any advice here? I suppose vect_determine_precisions
> needs to handle the gcond case with bool != 0 somehow and for the
> extra mask producer we add here we have to emulate what it would have
> done, right?

How about handling gconds directly in vect_determine_mask_precision?
In a sense it's not needed, since gconds are always roots, and so we
could calculate their precision on the fly instead. But handling it in
vect_determine_mask_precision feels like it should reduce the number
of special cases.

Thanks,
Richard
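
For illustration only, here is a toy model of the decision being discussed:
which conditions need a separate mask producer, and what precision that mask
would get. Nothing below is GCC code. ToyCond, toy_vector_element_size and
toy_gcond_mask_precision are hypothetical names, and the rule "mask precision
equals the width of the compared operands" is an assumption that only loosely
follows the tcc_comparison handling mentioned above; the real code also has to
check what the target supports.

```cpp
#include <cassert>

// Hypothetical stand-in for a comparison code; not a GCC enum.
enum class ToyCmp { EQ, NE, LT, GE };

// Hypothetical stand-in for the interesting parts of a gcond.
struct ToyCond {
  ToyCmp code;                 // comparison used by "if (a op b)"
  unsigned operand_precision;  // width in bits of the compared scalars
  bool operand_is_bool;        // true when the operands are scalar booleans
  bool rhs_is_zero;            // true when the right-hand side is constant 0
};

// Width of one vector element: total vector bits divided by lane count.
constexpr unsigned toy_vector_element_size (unsigned vector_bits,
                                            unsigned lanes)
{
  return vector_bits / lanes;
}

// Returns the assumed mask precision for the comparison feeding the
// condition, or 0 when the condition is already "bool_mask != 0" and
// needs no extra mask producer.  An "int != 0" test is NOT skipped,
// because an integer is not an appropriate mask.
unsigned toy_gcond_mask_precision (const ToyCond &c)
{
  if (c.code == ToyCmp::NE && c.rhs_is_zero && c.operand_is_bool)
    return 0;
  return c.operand_precision;
}
```

With this model, the 'bool here[]' early exit is left alone while the
'int here[]' one gets a 32-bit mask, which is the distinction the
VECT_SCALAR_BOOLEAN_TYPE_P check is meant to draw. Computing the value in one
place for every root condition is the toy analogue of handling gconds in
vect_determine_mask_precision rather than on the fly.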