Hi,

the recent changes that allowed multi-step conversions for
"non-packing/unpacking" targets, i.e. modifier == NONE, covered
the promoting to-float and demoting to-int variants.  This patch
adds the two remaining variants: demoting to-float and promoting
to-int.
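
For illustration only, here is a scalar sketch of the step order the
two new variants use (codecvt1 first, then code1).  It is not
vectorizer code.  The function names are made up, and the widths are
picked only so that the sketch compiles with standard C++ types; in
the vectorizer the intermediate type is whatever integer type the
target supports for both steps.

```cpp
#include <cstdint>

// Demoting to-float: first narrow the integer (the NOP_EXPR-like step),
// then convert the narrower integer to float (the FLOAT_EXPR-like step).
// Assumption: the value fits the intermediate type; otherwise the
// narrowing step already changes it.
float demote_to_float(std::int64_t x) {
    std::int32_t narrowed = static_cast<std::int32_t>(x);
    return static_cast<float>(narrowed);
}

// Promoting to-int: first truncate the float to an integer of the same
// size (the FIX_TRUNC_EXPR-like step), then widen that integer (the
// NOP_EXPR-like step).
// Assumption: the truncated value fits the intermediate type.
std::int64_t promote_to_int(float x) {
    std::int32_t truncated = static_cast<std::int32_t>(x);
    return static_cast<std::int64_t>(truncated);
}
```

In both directions only one step is an actual float/int conversion;
the other is a plain integer widening or narrowing.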

Bootstrapped and regtested on x86 and aarch64.

A related question: why do we require !flag_trapping_math for the
"NONE" multistep conversion but not for the "NARROW_DST" case,
when both seem to handle float -> int and there are float values
that have no int representation?  If a backend can guarantee that
the conversion traps, should it just implement the multistep
conversion in a matching expander?
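
To make the concern concrete, a scalar sketch (again not vectorizer
code, helper names are hypothetical): 3e9 is exactly representable as
a double and fits a 64-bit integer, but not a 32-bit one.  A direct
double -> int32 conversion of that value would normally raise the
IEEE invalid exception, whereas going through int64 first converts
without complaint and then silently wraps in the integer narrowing.

```cpp
#include <cstdint>

// True if truncating x toward zero gives a value representable in
// int32_t.  NaN compares false on both sides and is rejected.
bool fits_int32(double x) {
    return x > -2147483649.0 && x < 2147483648.0;
}

// Two-step conversion: double -> int64 (the FIX_TRUNC-like step), then
// int64 -> int32 (plain integer narrowing).
// Precondition: x is within the int64 range.
// The narrowing is modular since C++20 and implementation-defined
// before that (GCC wraps).
std::int32_t via_int64(double x) {
    std::int64_t wide = static_cast<std::int64_t>(x);
    return static_cast<std::int32_t>(wide);
}
```

For an in-range input both routes agree; for 3e9 the two-step route
returns a wrapped value and never signals that the input was out of
range for the destination.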

Regards
 Robin


gcc/ChangeLog:

        * tree-vect-stmts.cc (vectorizable_conversion): Handle
        more demotion/promotion for modifier == NONE.
---
 gcc/tree-vect-stmts.cc | 40 +++++++++++++++++++++++++++++-----------
 1 file changed, 29 insertions(+), 11 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 10e71178ce7..78e0510be7e 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -5324,28 +5324,46 @@ vectorizable_conversion (vec_info *vinfo,
        break;
       }
 
-      /* For conversions between float and smaller integer types try whether we
-        can use intermediate signed integer types to support the
+      /* For conversions between float and larger integer types try whether
+        we can use intermediate signed integer types to support the
         conversion.  */
       if ((code == FLOAT_EXPR
-          && GET_MODE_SIZE (lhs_mode) > GET_MODE_SIZE (rhs_mode))
+          && GET_MODE_SIZE (lhs_mode) != GET_MODE_SIZE (rhs_mode))
          || (code == FIX_TRUNC_EXPR
-             && GET_MODE_SIZE (rhs_mode) > GET_MODE_SIZE (lhs_mode)
-             && !flag_trapping_math))
+             && ((GET_MODE_SIZE (rhs_mode) > GET_MODE_SIZE (lhs_mode)
+                 && !flag_trapping_math)
+                 || GET_MODE_SIZE (rhs_mode) < GET_MODE_SIZE (lhs_mode))))
        {
+         bool demotion = GET_MODE_SIZE (rhs_mode) > GET_MODE_SIZE (lhs_mode);
          bool float_expr_p = code == FLOAT_EXPR;
-         scalar_mode imode = float_expr_p ? rhs_mode : lhs_mode;
-         fltsz = GET_MODE_SIZE (float_expr_p ? lhs_mode : rhs_mode);
+         unsigned short target_size;
+         scalar_mode intermediate_mode;
+         if (demotion)
+           {
+             intermediate_mode = lhs_mode;
+             target_size = GET_MODE_SIZE (rhs_mode);
+           }
+         else
+           {
+             target_size = GET_MODE_SIZE (lhs_mode);
+             tree itype
+               = build_nonstandard_integer_type (GET_MODE_BITSIZE
+                                                 (rhs_mode), 0);
+             intermediate_mode = SCALAR_TYPE_MODE (itype);
+           }
          code1 = float_expr_p ? code : NOP_EXPR;
          codecvt1 = float_expr_p ? NOP_EXPR : code;
-         FOR_EACH_2XWIDER_MODE (rhs_mode_iter, imode)
+         opt_scalar_mode mode_iter;
+         FOR_EACH_2XWIDER_MODE (mode_iter, intermediate_mode)
            {
-             imode = rhs_mode_iter.require ();
-             if (GET_MODE_SIZE (imode) > fltsz)
+             intermediate_mode = mode_iter.require ();
+
+             if (GET_MODE_SIZE (intermediate_mode) > target_size)
                break;
 
              cvt_type
-               = build_nonstandard_integer_type (GET_MODE_BITSIZE (imode),
+               = build_nonstandard_integer_type (GET_MODE_BITSIZE
+                                                 (intermediate_mode),
                                                  0);
              cvt_type = get_vectype_for_scalar_type (vinfo, cvt_type,
                                                      slp_node);
-- 
2.41.0
