Re: [PATCH 1/5]middle-end: Add scaffolding to support narrowing IFNs

Richard Biener Wed, 20 Aug 2025 05:48:44 -0700

On Tue, 19 Aug 2025, Tamar Christina wrote:

> This adds scaffolding for supporting narrowing IFNs inside the vectorizer in a
> similar way as how widening is supported.  However, because narrowing operations
> always have the same number of elements in the input and output, we need to be
> able to combine the results.  One way this could have been done is by using a
> vec_perm_expr, but this can then become tricky to recognize as low/hi pairs in
> backends.
> 
> As such I've chosen the design where the _hi and _odd variants of the
> instructions must always be RMW (read-modify-write).  This simplifies the
> implementation, and targets that don't want this can use the direct conversion
> variant.


The canonical way for "narrowing" would be to have a

vec_pack_saddh_optab

that takes two input vectors for each operand (we currently have such
for conversions, aka the single operand case).  There's no hi/lo
involved, that's only for widening as we can't have two outputs.

So - no, we don't want this new odd way of doing it.  Either only go with
vec_saddh_narrow, aka the result mode is of half size, if that suits
you, or please add the first "pack" variant of a binary operation.

"pack" would imply narrow here.  Alternatively vec_pack_narrow_saddh
and vec_narrow_saddh as the two variants.
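To make the two shapes concrete, here is a scalar model of them.  This is
only an illustrative sketch, not GCC code: the names narrow_binop,
pack_binop and add_high_half are hypothetical, the per-lane operation is
an arbitrary stand-in (I am not asserting what "saddh" computes), and lane
ordering ignores endianness.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

using Wide = std::vector<uint16_t>;   // N wide lanes
using Narrow = std::vector<uint8_t>;  // narrow lanes, half the element size
using LaneOp = std::function<uint8_t (uint16_t, uint16_t)>;

// "narrow" shape: one input vector per operand, N wide lanes in and
// N narrow lanes out, so the result vector is half the size in bits.
Narrow
narrow_binop (const Wide &a, const Wide &b, const LaneOp &op)
{
  assert (a.size () == b.size ());
  Narrow r (a.size ());
  for (std::size_t i = 0; i < a.size (); ++i)
    r[i] = op (a[i], b[i]);
  return r;
}

// "pack" shape: two input vectors per operand; both halves are narrowed
// and concatenated into one full-size result with 2*N narrow lanes.
// No hi/lo instruction pair and no read-modify-write operand is involved.
Narrow
pack_binop (const Wide &a0, const Wide &a1,
	    const Wide &b0, const Wide &b1, const LaneOp &op)
{
  Narrow r = narrow_binop (a0, b0, op);
  Narrow second = narrow_binop (a1, b1, op);
  r.insert (r.end (), second.begin (), second.end ());
  return r;
}

// Stand-in lane operation (an assumption for illustration only):
// wrapping 16-bit add, keeping the high byte of the sum.
uint8_t
add_high_half (uint16_t x, uint16_t y)
{
  return static_cast<uint8_t> (static_cast<uint16_t> (x + y) >> 8);
}
```

In this model the "pack" form corresponds to what the existing vec_pack_*
conversions do for the single operand case, extended to a binary operation.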

Note that for composition I'd use a CTOR.  Note that in your scheme
the even/odd variant would interleave one result into the other?
Would the binary optab then fill only every 2nd output lane?  The
documentation in 2/n isn't exactly clear here.
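The two possible readings of the composition can be sketched with a scalar
model.  This is a hypothetical illustration only: compose_concat and
compose_interleave are made-up names, and the interleaving variant is just
my assumption of what an even/odd RMW scheme might produce, which is
exactly what the question above asks to have confirmed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Lanes = std::vector<uint8_t>;

// Composition as a CONSTRUCTOR of two half-size results would give it:
// the lanes of the first half followed by the lanes of the second half.
Lanes
compose_concat (const Lanes &lo, const Lanes &hi)
{
  Lanes r (lo);
  r.insert (r.end (), hi.begin (), hi.end ());
  return r;
}

// Assumed even/odd reading: each half-size result fills only every
// second output lane, so the two results end up interleaved.
Lanes
compose_interleave (const Lanes &even, const Lanes &odd)
{
  assert (even.size () == odd.size ());
  Lanes r (even.size () * 2);
  for (std::size_t i = 0; i < even.size (); ++i)
    {
      r[2 * i] = even[i];
      r[2 * i + 1] = odd[i];
    }
  return r;
}
```

With halves {1, 2} and {3, 4}, the first gives {1, 2, 3, 4} and the second
gives {1, 3, 2, 4}, so the documentation needs to say which lane layout
each optab produces.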

Thanks,
Richard.

> Bootstrapped Regtested on aarch64-none-linux-gnu,
> arm-none-linux-gnueabihf, x86_64-pc-linux-gnu
> -m32, -m64 and no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>       * internal-fn.cc (lookup_hilo_internal_fn,
>       DEF_INTERNAL_NARROWING_OPTAB_FN, lookup_evenodd_internal_fn,
>       narrowing_fn_p, narrowing_evenodd_fn_p): New.
>       * internal-fn.def (DEF_INTERNAL_NARROWING_OPTAB_FN): New.
>       * internal-fn.h (narrowing_fn_p, narrowing_evenodd_fn_p): New.
>       * tree-vect-stmts.cc (simple_integer_narrowing, vectorizable_call,
>       vectorizable_conversion, supportable_widening_operation,
>       supportable_narrowing_operation): Use it.
>       * tree-vectorizer.h (supportable_narrowing_operation): Modify
>       signature.
> 
> ---
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> index bf2fac8180706ec418de7eb97cd1260f1d078c03..83438dd2ff57474cec999adaeabe92c0540e2a51 100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -101,7 +101,7 @@ lookup_internal_fn (const char *name)
>  extern void
>  lookup_hilo_internal_fn (internal_fn ifn, internal_fn *lo, internal_fn *hi)
>  {
> -  gcc_assert (widening_fn_p (ifn));
> +  gcc_assert (widening_fn_p (ifn) || narrowing_fn_p (ifn));
>  
>    switch (ifn)
>      {
> @@ -113,6 +113,11 @@ lookup_hilo_internal_fn (internal_fn ifn, internal_fn *lo, internal_fn *hi)
>        *lo = internal_fn (IFN_##NAME##_LO);                   \
>        *hi = internal_fn (IFN_##NAME##_HI);                   \
>        break;
> +#define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, F, S, SO, UO, T1, T2) \
> +    case IFN_##NAME:                                             \
> +      *lo = internal_fn (IFN_##NAME##_LO);                       \
> +      *hi = internal_fn (IFN_##NAME##_HI);                       \
> +      break;
>  #include "internal-fn.def"
>      }
>  }
> @@ -124,7 +129,7 @@ extern void
>  lookup_evenodd_internal_fn (internal_fn ifn, internal_fn *even,
>                           internal_fn *odd)
>  {
> -  gcc_assert (widening_fn_p (ifn));
> +  gcc_assert (widening_fn_p (ifn) || narrowing_fn_p (ifn));
>  
>    switch (ifn)
>      {
> @@ -136,6 +141,11 @@ lookup_evenodd_internal_fn (internal_fn ifn, internal_fn *even,
>        *even = internal_fn (IFN_##NAME##_EVEN);                       \
>        *odd = internal_fn (IFN_##NAME##_ODD);                 \
>        break;
> +#define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, F, S, SO, UO, T1, T2) \
> +    case IFN_##NAME:                                             \
> +      *even = internal_fn (IFN_##NAME##_EVEN);                           \
> +      *odd = internal_fn (IFN_##NAME##_ODD);                     \
> +      break;
>  #include "internal-fn.def"
>      }
>  }
> @@ -4548,6 +4558,35 @@ widening_fn_p (code_helper code)
>      }
>  }
>  
> +/* Return true if this CODE describes an internal_fn that returns a vector with
> +   elements twice as narrow as the element size of the input vectors.  */
> +
> +bool
> +narrowing_fn_p (code_helper code)
> +{
> +  if (!code.is_fn_code ())
> +    return false;
> +
> +  if (!internal_fn_p ((combined_fn) code))
> +    return false;
> +
> +  internal_fn fn = as_internal_fn ((combined_fn) code);
> +  switch (fn)
> +    {
> +    #define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, F, S, SO, UO, T1, T2) \
> +    case IFN_##NAME:                                                 \
> +    case IFN_##NAME##_HI:                                            \
> +    case IFN_##NAME##_LO:                                            \
> +    case IFN_##NAME##_EVEN:                                          \
> +    case IFN_##NAME##_ODD:                                           \
> +      return true;
> +    #include "internal-fn.def"
> +
> +    default:
> +      return false;
> +    }
> +}
> +
>  /* Return true if this CODE describes an internal_fn that returns a vector with
>     elements twice as wide as the element size of the input vectors and operates
>     on even/odd parts of the input.  */
> @@ -4575,6 +4614,33 @@ widening_evenodd_fn_p (code_helper code)
>      }
>  }
>  
> +/* Return true if this CODE describes an internal_fn that returns a vector with
> +   elements twice as narrow as the element size of the input vectors and
> +   operates on even/odd parts of the input.  */
> +
> +bool
> +narrowing_evenodd_fn_p (code_helper code)
> +{
> +  if (!code.is_fn_code ())
> +    return false;
> +
> +  if (!internal_fn_p ((combined_fn) code))
> +    return false;
> +
> +  internal_fn fn = as_internal_fn ((combined_fn) code);
> +  switch (fn)
> +    {
> +    #define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, F, S, SO, UO, T1, T2) \
> +    case IFN_##NAME##_EVEN:                                          \
> +    case IFN_##NAME##_ODD:                                           \
> +      return true;
> +    #include "internal-fn.def"
> +
> +    default:
> +      return false;
> +    }
> +}
> +
>  /* Return true if IFN_SET_EDOM is supported.  */
>  
>  bool
> diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> index d2480a1bf7927476215bc7bb99c0b74197d2b7e9..69677dd10b980c83dec36487b1214ff066f4789b 100644
> --- a/gcc/internal-fn.def
> +++ b/gcc/internal-fn.def
> @@ -40,6 +40,8 @@ along with GCC; see the file COPYING3.  If not see
>       DEF_INTERNAL_SIGNED_COND_FN (NAME, FLAGS, OPTAB, TYPE)
>       DEF_INTERNAL_WIDENING_OPTAB_FN (NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB,
>                                    TYPE)
> +     DEF_INTERNAL_NARROWING_OPTAB_FN (NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB,
> +                                  TYPE_LO, TYPE_HI)
>  
>     where NAME is the name of the function, FLAGS is a set of
>     ECF_* flags and FNSPEC is a string describing functions fnspec.
> @@ -122,6 +124,21 @@ along with GCC; see the file COPYING3.  If not see
>     These five internal functions will require two optabs each, a SIGNED_OPTAB
>     and an UNSIGNED_OTPAB.
>  
> +   DEF_INTERNAL_NARROWING_OPTAB_FN is a wrapper that defines five internal
> +   functions with DEF_INTERNAL_SIGNED_OPTAB_FN:
> +   - one that describes a narrowing operation with the same number of elements
> +   in the output and input vectors,
> +   - two that describe a pair of high-low narrowing operations where the output
> +   vectors each have half the number of elements of the input vectors,
> +   corresponding to the result of the narrowing operation on the top half and
> +   bottom half, these have the suffixes _HI and _LO,
> +   - and two that describe a pair of even-odd narrowing operations where the
> +   output vectors each have half the number of elements of the input vectors,
> +   corresponding to the result of the narrowing operation on the even and odd
> +   elements, these have the suffixes _EVEN and _ODD.
> +   These five internal functions will require two optabs each, a SIGNED_OPTAB
> +   and an UNSIGNED_OTPAB.
> +
>     DEF_INTERNAL_COND_FN is a wrapper that defines 2 internal functions with
>     DEF_INTERNAL_OPTAB_FN:
>     - One is COND_* operations that are predicated by mask only. Such operations
> @@ -184,6 +201,15 @@ along with GCC; see the file COPYING3.  If not see
>    DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _ODD, FLAGS, SELECTOR, SOPTAB##_odd, UOPTAB##_odd, TYPE)
>  #endif
>  
> +#ifndef DEF_INTERNAL_NARROWING_OPTAB_FN
> +#define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE_LO, TYPE_HI)       \
> +  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE_LO)                             \
> +  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _LO, FLAGS, SELECTOR, SOPTAB##_lo, UOPTAB##_lo, TYPE_LO)       \
> +  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _HI, FLAGS, SELECTOR, SOPTAB##_hi, UOPTAB##_hi, TYPE_HI)       \
> +  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _EVEN, FLAGS, SELECTOR, SOPTAB##_even, UOPTAB##_even, TYPE_LO) \
> +  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _ODD, FLAGS, SELECTOR, SOPTAB##_odd, UOPTAB##_odd, TYPE_HI)
> +#endif
> +
>  #ifndef DEF_INTERNAL_COND_FN
>  #define DEF_INTERNAL_COND_FN(NAME, FLAGS, OPTAB, TYPE)                         \
>    DEF_INTERNAL_OPTAB_FN (COND_##NAME, FLAGS, cond_##OPTAB, cond_##TYPE)        \
> @@ -608,6 +634,7 @@ DEF_INTERNAL_OPTAB_FN (BIT_ANDN, ECF_CONST, andn, binary)
>  DEF_INTERNAL_OPTAB_FN (BIT_IORN, ECF_CONST, iorn, binary)
>  
>  #undef DEF_INTERNAL_WIDENING_OPTAB_FN
> +#undef DEF_INTERNAL_NARROWING_OPTAB_FN
>  #undef DEF_INTERNAL_SIGNED_COND_FN
>  #undef DEF_INTERNAL_COND_FN
>  #undef DEF_INTERNAL_INT_EXT_FN
> diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
> index fd21694dfebfb8518810fd85f7aa8c45dd4c362e..8c6ad218e4412716ba7b79b24af708920e11e3be 100644
> --- a/gcc/internal-fn.h
> +++ b/gcc/internal-fn.h
> @@ -220,6 +220,8 @@ extern int first_commutative_argument (internal_fn);
>  extern bool associative_binary_fn_p (internal_fn);
>  extern bool widening_fn_p (code_helper);
>  extern bool widening_evenodd_fn_p (code_helper);
> +extern bool narrowing_fn_p (code_helper);
> +extern bool narrowing_evenodd_fn_p (code_helper);
>  
>  extern bool set_edom_supported_p (void);
>  
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index 675c6e2e683c59df44d5d7d65b87900a70506f50..97b3d4801d19f3168b91c91271e882bad3f99f13 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -3157,15 +3157,20 @@ simple_integer_narrowing (tree vectype_out, tree vectype_in,
>        || !INTEGRAL_TYPE_P (TREE_TYPE (vectype_in)))
>      return false;
>  
> -  code_helper code;
> +  code_helper code1 = ERROR_MARK, code2 = ERROR_MARK;
>    int multi_step_cvt = 0;
>    auto_vec <tree, 8> interm_types;
>    if (!supportable_narrowing_operation (NOP_EXPR, vectype_out, vectype_in,
> -                                     &code, &multi_step_cvt, &interm_types)
> +                                     &code1, &code2, &multi_step_cvt,
> +                                     &interm_types)
>        || multi_step_cvt)
>      return false;
>  
> -  *convert_code = code;
> +  /* Simple narrowing never have hi/lo splits.  */
> +  if (code2 != ERROR_MARK)
> +    return false;
> +
> +  *convert_code = code1;
>    return true;
>  }
>  
> @@ -3375,6 +3380,7 @@ vectorizable_call (vec_info *vinfo,
>    if (cfn != CFN_LAST
>        && (modifier == NONE
>         || (modifier == NARROW
> +           && !narrowing_fn_p (cfn)
>             && simple_integer_narrowing (vectype_out, vectype_in,
>                                          &convert_code))))
>      ifn = vectorizable_internal_function (cfn, callee, vectype_out,
> @@ -3511,7 +3517,7 @@ vectorizable_call (vec_info *vinfo,
>    if (clz_ctz_arg1)
>      ++vect_nargs;
>  
> -  if (modifier == NONE || ifn != IFN_LAST)
> +  if (modifier == NONE || (ifn != IFN_LAST && !narrowing_fn_p (ifn)))
>      {
>        tree prev_res = NULL_TREE;
>        vargs.safe_grow (vect_nargs, true);
> @@ -5058,7 +5064,8 @@ vectorizable_conversion (vec_info *vinfo,
>    if (!widen_arith
>        && !CONVERT_EXPR_CODE_P (code)
>        && code != FIX_TRUNC_EXPR
> -      && code != FLOAT_EXPR)
> +      && code != FLOAT_EXPR
> +      && !narrowing_fn_p (code))
>      return false;
>  
>    /* Check types of lhs and rhs.  */
> @@ -5102,7 +5109,8 @@ vectorizable_conversion (vec_info *vinfo,
>      {
>        gcc_assert (code == WIDEN_MULT_EXPR
>                 || code == WIDEN_LSHIFT_EXPR
> -               || widening_fn_p (code));
> +               || widening_fn_p (code)
> +               || narrowing_fn_p (code));
>  
>        op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) :
>                                    gimple_call_arg (stmt, 0);
> @@ -5285,9 +5293,9 @@ vectorizable_conversion (vec_info *vinfo,
>        break;
>  
>      case NARROW_DST:
> -      gcc_assert (op_type == unary_op);
> +      gcc_assert (op_type == unary_op || op_type == binary_op);
>        if (supportable_narrowing_operation (code, vectype_out, vectype_in,
> -                                        &code1, &multi_step_cvt,
> +                                        &code1, &code2, &multi_step_cvt,
>                                          &interm_types))
>       break;
>  
> @@ -5307,7 +5315,7 @@ vectorizable_conversion (vec_info *vinfo,
>         else
>           goto unsupported;
>         if (supportable_narrowing_operation (NOP_EXPR, vectype_out, cvt_type,
> -                                            &code1, &multi_step_cvt,
> +                                            &code1, &code2, &multi_step_cvt,
>                                              &interm_types))
>           break;
>       }
> @@ -5336,7 +5344,7 @@ vectorizable_conversion (vec_info *vinfo,
>         if (cvt_type == NULL_TREE)
>           goto unsupported;
>         if (!supportable_narrowing_operation (NOP_EXPR, cvt_type, vectype_in,
> -                                             &code1, &multi_step_cvt,
> +                                             &code1, &code2, &multi_step_cvt,
>                                               &interm_types))
>           goto unsupported;
>         if (supportable_convert_operation ((tree_code) code, vectype_out,
> @@ -5553,11 +5561,44 @@ vectorizable_conversion (vec_info *vinfo,
>           vec_oprnds0[i] = new_temp;
>         }
>  
> -      vect_create_vectorized_demotion_stmts (vinfo, &vec_oprnds0,
> -                                          multi_step_cvt,
> -                                          stmt_info, vec_dsts, gsi,
> -                                          slp_node, code1,
> -                                          modifier == NARROW_SRC);
> +      if (modifier == NARROW_DST && narrowing_fn_p (code))
> +     {
> +       gcc_assert (op_type == binary_op);
> +       vect_get_vec_defs (vinfo, slp_node, op0, &vec_oprnds0,
> +                          op1, &vec_oprnds1);
> +       tree vop0, vop1;
> +       internal_fn ifn1 = as_internal_fn ((combined_fn)code1);
> +       internal_fn ifn2 = as_internal_fn ((combined_fn)code2);
> +       tree small_type
> +         = get_related_vectype_for_scalar_type (TYPE_MODE (vectype_out),
> +                                                TREE_TYPE (vectype_out),
> +                                                exact_div (TYPE_VECTOR_SUBPARTS (vectype_out), 2));
> +       for (unsigned i = 0; i < vec_oprnds0.length (); i += 2)
> +         {
> +           vop0 = vec_oprnds0[i];
> +           vop1 = vec_oprnds1[i];
> +           gimple *new_stmt
> +             = gimple_build_call_internal (ifn1, 2, vop0, vop1);
> +           tree new_tmp = make_ssa_name (small_type);
> +           gimple_call_set_lhs (new_stmt, new_tmp);
> +           vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
> +
> +           vop0 = vec_oprnds0[i + 1];
> +           vop1 = vec_oprnds1[i + 1];
> +           new_stmt
> +             = gimple_build_call_internal (ifn2, 3, vop0, vop1, new_tmp);
> +           new_tmp = make_ssa_name (vec_dest, new_stmt);
> +           gimple_call_set_lhs (new_stmt, new_tmp);
> +           vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
> +           slp_node->push_vec_def (new_stmt);
> +         }
> +     }
> +      else
> +        vect_create_vectorized_demotion_stmts (vinfo, &vec_oprnds0,
> +                                            multi_step_cvt,
> +                                            stmt_info, vec_dsts, gsi,
> +                                            slp_node, code1,
> +                                            modifier == NARROW_SRC);
>        /* After demoting op0 to cvt_type, convert it to dest.  */
>        if (cvt_type && code == FLOAT_EXPR)
>       {
> @@ -13616,6 +13657,8 @@ supportable_widening_operation (vec_info *vinfo,
>     Output:
>     - CODE1 is the code of a vector operation to be used when
>     vectorizing the operation, if available.
> +   - CODE2 is the code of a vector operation for the high part to be used when
> +   vectorizing the operation, if available.
>     - MULTI_STEP_CVT determines the number of required intermediate steps in
>     case of multi-step conversion (like int->short->char - in that case
>     MULTI_STEP_CVT will be 1).
> @@ -13625,64 +13668,117 @@ supportable_widening_operation (vec_info *vinfo,
>  bool
>  supportable_narrowing_operation (code_helper code,
>                                tree vectype_out, tree vectype_in,
> -                              code_helper *code1, int *multi_step_cvt,
> -                                 vec<tree> *interm_types)
> +                              code_helper *code1, code_helper *code2,
> +                              int *multi_step_cvt, vec<tree> *interm_types)
>  {
>    machine_mode vec_mode;
> -  enum insn_code icode1;
> -  optab optab1, interm_optab;
> +  enum insn_code icode1 = CODE_FOR_nothing, icode2 = CODE_FOR_nothing;
> +  optab optab1 = unknown_optab, optab2 = unknown_optab, interm_optab;
>    tree vectype = vectype_in;
>    tree narrow_vectype = vectype_out;
> -  enum tree_code c1;
> +  code_helper c1 = ERROR_MARK;
>    tree intermediate_type, prev_type;
>    machine_mode intermediate_mode, prev_mode;
>    int i;
>    unsigned HOST_WIDE_INT n_elts;
>    bool uns;
>  
> -  if (!code.is_tree_code ())
> -    return false;
> -
> +  vec_mode = TYPE_MODE (vectype);
>    *multi_step_cvt = 0;
> -  switch ((tree_code) code)
> +  if (narrowing_fn_p (code))
> +     {
> +       /* If this is an internal fn then we must check whether the target
> +       supports the narrowing in one go.  */
> +      internal_fn ifn = as_internal_fn ((combined_fn) code);
> +
> +      internal_fn lo, hi, even, odd;
> +      lookup_hilo_internal_fn (ifn, &lo, &hi);
> +      if (BYTES_BIG_ENDIAN)
> +     std::swap (lo, hi);
> +      *code1 = as_combined_fn (lo);
> +      *code2 = as_combined_fn (hi);
> +      optab1 = direct_internal_fn_optab (lo, {vectype, vectype});
> +      optab2 = direct_internal_fn_optab (hi, {vectype, vectype});
> +
> +      /* If we don't support low-high, then check for even-odd.  */
> +      if (!optab1
> +       || (icode1 = optab_handler (optab1, vec_mode)) == CODE_FOR_nothing
> +       || !optab2
> +       || (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing)
> +     {
> +       lookup_evenodd_internal_fn (ifn, &even, &odd);
> +       *code1 = as_combined_fn (even);
> +       *code2 = as_combined_fn (odd);
> +       optab1 = direct_internal_fn_optab (even, {vectype, vectype});
> +       optab2 = direct_internal_fn_optab (odd, {vectype, vectype});
> +     }
> +    }
> +  else if (code.is_tree_code ())
>      {
> -    CASE_CONVERT:
> -      c1 = VEC_PACK_TRUNC_EXPR;
> -      if (VECTOR_BOOLEAN_TYPE_P (narrow_vectype)
> -       && VECTOR_BOOLEAN_TYPE_P (vectype)
> -       && SCALAR_INT_MODE_P (TYPE_MODE (vectype))
> -       && TYPE_VECTOR_SUBPARTS (vectype).is_constant (&n_elts)
> -       && n_elts < BITS_PER_UNIT)
> -     optab1 = vec_pack_sbool_trunc_optab;
> -      else
> -     optab1 = optab_for_tree_code (c1, vectype, optab_default);
> -      break;
> +      switch ((tree_code) code)
> +     {
> +     CASE_CONVERT:
> +       c1 = VEC_PACK_TRUNC_EXPR;
> +       if (VECTOR_BOOLEAN_TYPE_P (narrow_vectype)
> +           && VECTOR_BOOLEAN_TYPE_P (vectype)
> +           && SCALAR_INT_MODE_P (TYPE_MODE (vectype))
> +           && TYPE_VECTOR_SUBPARTS (vectype).is_constant (&n_elts)
> +           && n_elts < BITS_PER_UNIT)
> +         optab1 = vec_pack_sbool_trunc_optab;
> +       else
> +         optab1 = optab_for_tree_code ((tree_code)c1, vectype,
> +                                       optab_default);
> +       break;
>  
> -    case FIX_TRUNC_EXPR:
> -      c1 = VEC_PACK_FIX_TRUNC_EXPR;
> -      /* The signedness is determined from output operand.  */
> -      optab1 = optab_for_tree_code (c1, vectype_out, optab_default);
> -      break;
> +     case FIX_TRUNC_EXPR:
> +       c1 = VEC_PACK_FIX_TRUNC_EXPR;
> +       /* The signedness is determined from output operand.  */
> +       optab1 = optab_for_tree_code ((tree_code)c1, vectype_out,
> +                                     optab_default);
> +       break;
>  
> -    case FLOAT_EXPR:
> -      c1 = VEC_PACK_FLOAT_EXPR;
> -      optab1 = optab_for_tree_code (c1, vectype, optab_default);
> -      break;
> +     case FLOAT_EXPR:
> +       c1 = VEC_PACK_FLOAT_EXPR;
> +       optab1 = optab_for_tree_code ((tree_code)c1, vectype_out,
> +                                     optab_default);
> +       break;
>  
> -    default:
> -      gcc_unreachable ();
> +     default:
> +       gcc_unreachable ();
> +     }
>      }
> +  else
> +    return false;
>  
>    if (!optab1)
>      return false;
>  
> -  vec_mode = TYPE_MODE (vectype);
> -  if ((icode1 = optab_handler (optab1, vec_mode)) == CODE_FOR_nothing)
> -    return false;
> +  if (narrowing_fn_p (code))
> +    {
> +      if (!optab2)
> +     return false;
> +      if ((icode1 = optab_handler (optab1, vec_mode)) == CODE_FOR_nothing
> +       || (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing)
> +     return false;
> +    }
> +  else
> +    {
> +      if ((icode1 = optab_handler (optab1, vec_mode)) == CODE_FOR_nothing)
> +     return false;
>  
> -  *code1 = c1;
> +      *code1 = c1;
> +    }
>  
> -  if (insn_data[icode1].operand[0].mode == TYPE_MODE (narrow_vectype))
> +  machine_mode nmode;
> +  machine_mode vmode = TYPE_MODE (narrow_vectype);
> +  scalar_mode emode = GET_MODE_INNER (vmode);
> +  poly_uint64 hnunits;
> +  if (insn_data[icode1].operand[0].mode == vmode
> +      || (narrowing_fn_p (code)
> +       && known_ne (hnunits = exact_div (GET_MODE_NUNITS (vmode), 2U), 0U)
> +       && related_vector_mode (vmode, emode, hnunits).exists (&nmode)
> +       && insn_data[icode1].operand[0].mode == nmode
> +       && insn_data[icode2].operand[0].mode == vmode))
>      {
>        if (!VECTOR_BOOLEAN_TYPE_P (vectype))
>       return true;
> @@ -13716,7 +13812,7 @@ supportable_narrowing_operation (code_helper code,
>        intermediate_type
>       = lang_hooks.types.type_for_mode (TYPE_MODE (vectype_out), 0);
>        interm_optab
> -     = optab_for_tree_code (c1, intermediate_type, optab_default);
> +     = optab_for_tree_code ((tree_code)c1, intermediate_type, optab_default);
>        if (interm_optab != unknown_optab
>         && (icode2 = optab_handler (optab1, vec_mode)) != CODE_FOR_nothing
>         && insn_data[icode1].operand[0].mode
> diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> index 3d8a9466982a0c29099e60ed7a84e0f5ed207fa9..026dfb131b4c2808290fdbd015b63dab5918c7f2 100644
> --- a/gcc/tree-vectorizer.h
> +++ b/gcc/tree-vectorizer.h
> @@ -2463,8 +2463,8 @@ extern bool supportable_widening_operation (vec_info*, code_helper,
>                                           code_helper*, code_helper*,
>                                           int*, vec<tree> *);
>  extern bool supportable_narrowing_operation (code_helper, tree, tree,
> -                                          code_helper *, int *,
> -                                          vec<tree> *);
> +                                          code_helper *, code_helper *,
> +                                          int *, vec<tree> *);
>  extern bool supportable_indirect_convert_operation (code_helper,
>                                                   tree, tree,
>                                                   vec<std::pair<tree, tree_code> > &,
> 
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
