Committed, thanks Richard.

Pan

-----Original Message-----
From: Richard Biener <richard.guent...@gmail.com>
Sent: Thursday, November 2, 2023 12:43 AM
To: Li, Pan2 <pan2...@intel.com>
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang <yanzhang.w...@intel.com>; kito.ch...@gmail.com; Liu, Hongtao <hongtao....@intel.com>
Subject: Re: [PATCH v4] VECT: Refine the type size restriction of call vectorizer

> On 31.10.2023 at 16:10, pan2...@intel.com wrote:
>
> From: Pan Li <pan2...@intel.com>
>
> Update in v4:
>
> * Append the check to vectorizable_internal_function.
>
> Update in v3:
>
> * Add a function to check whether the type size is legal for the
>   vectorizer call.
>
> Update in v2:
>
> * Fix one ICE of type assertion.
> * Adjust some test cases for aarch64 sve and riscv vector.
>
> Original log:
>
> vectorizable_call has one restriction on the size of the data type:
> DF to DI is allowed but SF to DI isn't. You may see the message below
> when trying to vectorize a function call like lrintf.
>
> void
> test_lrintf (long *out, float *in, unsigned count)
> {
>   for (unsigned i = 0; i < count; i++)
>     out[i] = __builtin_lrintf (in[i]);
> }
>
> lrintf.c:5:26: missed: couldn't vectorize loop
> lrintf.c:5:26: missed: not vectorized: unsupported data-type
>
> As a result, a standard name pattern like lrintmn2 cannot work for
> different data type sizes like SF => DI. This patch refines the data
> type size check and unblocks standard names like lrintmn2 under
> certain conditions.
>
> The type size of vectype_out needs to be exactly the same as the type
> size of vectype_in when vectype_out isn't participating in the optab
> selection. There is no such restriction when vectype_out is part of
> the optab query.
>
> The tests below pass with this patch.
>
> * The risc-v regression tests.
> * Ensure the lrintf standard name in risc-v.
>
> The tests below are ongoing.
>
> * The x86 bootstrap and regression test.
> * The aarch64 regression test.
>

Ok

Thanks,
Richard

> gcc/ChangeLog:
>
>     * tree-vect-stmts.cc (vectorizable_internal_function): Add type
>     size check for vectype_out doesn't participating for optab query.
>     (vectorizable_call): Remove the type size check.
>
> Signed-off-by: Pan Li <pan2...@intel.com>
> ---
>  gcc/tree-vect-stmts.cc | 22 +++++++++-------------
>  1 file changed, 9 insertions(+), 13 deletions(-)
>
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index a9200767f67..799b4ab10c7 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -1420,8 +1420,17 @@ vectorizable_internal_function (combined_fn cfn, tree fndecl,
>        const direct_internal_fn_info &info = direct_internal_fn (ifn);
>        if (info.vectorizable)
>          {
> +          bool same_size_p = TYPE_SIZE (vectype_in) == TYPE_SIZE (vectype_out);
>            tree type0 = (info.type0 < 0 ? vectype_out : vectype_in);
>            tree type1 = (info.type1 < 0 ? vectype_out : vectype_in);
> +
> +          /* The type size of both the vectype_in and vectype_out should be
> +             exactly the same when vectype_out isn't participating the optab.
> +             While there is no restriction for type size when vectype_out
> +             is part of the optab query.  */
> +          if (type0 != vectype_out && type1 != vectype_out && !same_size_p)
> +            return IFN_LAST;
> +
>            if (direct_internal_fn_supported_p (ifn, tree_pair (type0, type1),
>                                                OPTIMIZE_FOR_SPEED))
>              return ifn;
> @@ -3361,19 +3370,6 @@ vectorizable_call (vec_info *vinfo,
>
>        return false;
>      }
> -  /* FORNOW: we don't yet support mixtures of vector sizes for calls,
> -     just mixtures of nunits.  E.g. DI->SI versions of __builtin_ctz*
> -     are traditionally vectorized as two VnDI->VnDI IFN_CTZs followed
> -     by a pack of the two vectors into an SI vector.  We would need
> -     separate code to handle direct VnDI->VnSI IFN_CTZs.  */
> -  if (TYPE_SIZE (vectype_in) != TYPE_SIZE (vectype_out))
> -    {
> -      if (dump_enabled_p ())
> -        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> -                         "mismatched vector sizes %T and %T\n",
> -                         vectype_in, vectype_out);
> -      return false;
> -    }
>
>    if (VECTOR_BOOLEAN_TYPE_P (vectype_out)
>        != VECTOR_BOOLEAN_TYPE_P (vectype_in))
> --
> 2.34.1
>
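
For readers following the thread, the condition added to vectorizable_internal_function can be modelled outside of GCC roughly as below. This is only a sketch under stated assumptions: struct fn_info_model, size_check_passes and check_examples are hypothetical names that do not exist in GCC, vector type sizes are reduced to plain bit counts, and the selector values and sizes in the examples are illustrative rather than taken from any real internal function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two selectors the patch reads from
   direct_internal_fn_info: a negative value selects vectype_out,
   any other value selects vectype_in.  */
struct fn_info_model
{
  int type0;
  int type1;
};

/* Model of the new check.  Returns false where the patch would return
   IFN_LAST: vectype_out takes no part in the optab query and the input
   and output vector sizes differ.  Otherwise returns true, meaning the
   query would go on to the optab support test.  */
static bool
size_check_passes (struct fn_info_model info, unsigned in_bits,
                   unsigned out_bits)
{
  bool same_size_p = in_bits == out_bits;
  bool out_in_query_p = info.type0 < 0 || info.type1 < 0;

  return out_in_query_p || same_size_p;
}

/* Illustrative cases, assuming 128-bit input vectors.  */
static void
check_examples (void)
{
  struct fn_info_model in_only = { 0, 0 };   /* Query uses vectype_in only.  */
  struct fn_info_model uses_out = { -1, 0 }; /* Query includes vectype_out.  */

  /* Same size in and out (e.g. DF -> DI): always allowed.  */
  assert (size_check_passes (in_only, 128, 128));
  /* Different size (e.g. SF -> DI) without vectype_out in the query.  */
  assert (!size_check_passes (in_only, 128, 256));
  /* Different size, but vectype_out is part of the query.  */
  assert (size_check_passes (uses_out, 128, 256));
}
```

Calling check_examples () from a small main exercises the three cases. In the real code the comparison is on TYPE_SIZE of the vector types, and passing this check only means the query reaches direct_internal_fn_supported_p; the target still has to provide the pattern.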