Hi,

the recent changes that allowed multi-step conversions for
"non-packing/unpacking" targets, i.e. modifier == NONE, included
promoting to-float and demoting to-int variants.  This patch adds
demoting to-float and promoting to-int handling.
Bootstrapped and regtested on x86 and aarch64.

A question that seems related: Why do we require !flag_trapping_math
for the "NONE" multistep conversion but not for the "NARROW_DST" case
when both seem to handle float -> int and there are float values that
do not have an int representation?  If a backend can guarantee that the
conversion traps, should it just implement a multistep conversion in a
matching expander?

Regards
 Robin

gcc/ChangeLog:

        * tree-vect-stmts.cc (vectorizable_conversion): Handle
        more demotion/promotion for modifier == NONE.
---
 gcc/tree-vect-stmts.cc | 40 +++++++++++++++++++++++++++++-----------
 1 file changed, 29 insertions(+), 11 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 10e71178ce7..78e0510be7e 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -5324,28 +5324,46 @@ vectorizable_conversion (vec_info *vinfo,
           break;
         }
 
-      /* For conversions between float and smaller integer types try whether we
-         can use intermediate signed integer types to support the
+      /* For conversions between float and larger integer types try whether
+         we can use intermediate signed integer types to support the
          conversion.  */
       if ((code == FLOAT_EXPR
-           && GET_MODE_SIZE (lhs_mode) > GET_MODE_SIZE (rhs_mode))
+           && GET_MODE_SIZE (lhs_mode) != GET_MODE_SIZE (rhs_mode))
           || (code == FIX_TRUNC_EXPR
-              && GET_MODE_SIZE (rhs_mode) > GET_MODE_SIZE (lhs_mode)
-              && !flag_trapping_math))
+              && ((GET_MODE_SIZE (rhs_mode) > GET_MODE_SIZE (lhs_mode)
+                  && !flag_trapping_math)
+                  || GET_MODE_SIZE (rhs_mode) < GET_MODE_SIZE (lhs_mode))))
         {
+          bool demotion = GET_MODE_SIZE (rhs_mode) > GET_MODE_SIZE (lhs_mode);
           bool float_expr_p = code == FLOAT_EXPR;
-          scalar_mode imode = float_expr_p ? rhs_mode : lhs_mode;
-          fltsz = GET_MODE_SIZE (float_expr_p ? lhs_mode : rhs_mode);
+          unsigned short target_size;
+          scalar_mode intermediate_mode;
+          if (demotion)
+            {
+              intermediate_mode = lhs_mode;
+              target_size = GET_MODE_SIZE (rhs_mode);
+            }
+          else
+            {
+              target_size = GET_MODE_SIZE (lhs_mode);
+              tree itype
+                = build_nonstandard_integer_type (GET_MODE_BITSIZE
+                                                  (rhs_mode), 0);
+              intermediate_mode = SCALAR_TYPE_MODE (itype);
+            }
           code1 = float_expr_p ? code : NOP_EXPR;
           codecvt1 = float_expr_p ? NOP_EXPR : code;
-          FOR_EACH_2XWIDER_MODE (rhs_mode_iter, imode)
+          opt_scalar_mode mode_iter;
+          FOR_EACH_2XWIDER_MODE (mode_iter, intermediate_mode)
             {
-              imode = rhs_mode_iter.require ();
-              if (GET_MODE_SIZE (imode) > fltsz)
+              intermediate_mode = mode_iter.require ();
+
+              if (GET_MODE_SIZE (intermediate_mode) > target_size)
                 break;
 
               cvt_type
-                = build_nonstandard_integer_type (GET_MODE_BITSIZE (imode),
+                = build_nonstandard_integer_type (GET_MODE_BITSIZE
+                                                  (intermediate_mode),
                                                   0);
               cvt_type = get_vectype_for_scalar_type (vinfo, cvt_type,
                                                       slp_node);
-- 
2.41.0