RE: [PATCH v1 1/7] [Vectorizer]: SLP MATCH Pattern: Add Internal Functions and Optabs

Tamar Christina Mon, 02 Feb 2026 05:40:21 -0800

Hi Andrei,

This is a middle-end patch, so you'll need a middle-end reviewer to approve it.

That said, you didn't document the new optabs or the new IFNs in the internal
documentation.  See md.texi.

> -----Original Message-----
> From: Andrei Tirziu via Sourceware Forge <forge-bot+why@forge-stage.sourceware.org>
> Sent: 20 January 2026 16:51
> To: gcc-patches mailing list <[email protected]>
> Cc: Tamar Christina <[email protected]>; Victor Do Nascimento
> <[email protected]>
> Subject: [PATCH v1 1/7] [Vectorizer]: SLP MATCH Pattern: Add Internal
> Functions and Optabs
> 
> From: Andrei Nichita Tirziu <[email protected]>
> 
> The MATCH Pattern is a new addition to the vectorizer.
> After the pattern is identified, a replacement node
> is created for the initial expression.
> 
> This requires several new Internal Functions:
>   - `IFN_MATCH_EQ`, `IFN_MATCH_NE`: the basic IFNs for
>      unpredicated `match / nmatch` instructions.
>      These are used when building the new node, but are expected to
>      get transformed into the conditional versions.
>   - `IFN_COND_MATCH_EQ`, `IFN_COND_MATCH_NE`: conditional versions.
>   - `IFN_COND_LEN_MATCH_EQ`, `IFN_COND_LEN_MATCH_NE`: conditional
>      with length versions.
> 
> The arguments of these functions are:
>   - `IFN_MATCH_EQ (variants, invariants)`
>   - `IFN_MATCH_NE (variants, invariants)`

I think the names here are slightly confusing.  In vector terms EQ and NE mean
something else: when comparing two vectors, EQ is usually "Forall" and NE is
"Any".

But these instructions, as you say, are a cross product.  MATCH checks whether
any value from vector A matches any element in vector B and if so returns true;
NMATCH checks that no value from A matches any from B and returns true.

So I think better names are IFN_MATCHES_ANY_FROM and IFN_MATCHES_NONE_FROM.
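
As a rough scalar model of those semantics, the sketch below computes the
result per lane of the first vector, which is an assumption based on the patch
description saying the output is a vector of booleans.  The function names are
illustrative only and are not existing GCC APIs.

```cpp
#include <cstddef>
#include <vector>

// Illustrative scalar model, not GCC code.
// Lane i of the result is true when variants[i] equals at least one
// element of invariants.
std::vector<bool> matches_any_from(const std::vector<int>& variants,
                                   const std::vector<int>& invariants) {
  std::vector<bool> result(variants.size(), false);
  for (std::size_t i = 0; i < variants.size(); ++i) {
    for (int candidate : invariants) {
      if (variants[i] == candidate) {
        result[i] = true;
        break;
      }
    }
  }
  return result;
}

// Lane i of the result is true when variants[i] equals no element of
// invariants, i.e. the lane-wise negation of matches_any_from.
std::vector<bool> matches_none_from(const std::vector<int>& variants,
                                    const std::vector<int>& invariants) {
  std::vector<bool> result = matches_any_from(variants, invariants);
  result.flip();
  return result;
}
```

Written this way, neither operation is an element-wise EQ or NE of two
vectors, which is why the ANY_FROM / NONE_FROM naming reads more naturally.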

>   - `IFN_COND_MATCH_EQ (variants, invariants, mask, else)`
>   - `IFN_COND_MATCH_NE (variants, invariants, mask, else)`
>   - `IFN_COND_LEN_MATCH_EQ (variants, invariants, mask, else, len, bias)`
>   - `IFN_COND_LEN_MATCH_NE (variants, invariants, mask, else, len, bias)`
> 

Inactive elements are explicitly set to zero by these instructions, and they
have to be, since any value other than zero would give a false positive (as
it'll be interpreted as a match having occurred).

So the `else` argument should be dropped.  There's no ISA that can actually
implement this instruction with something other than zero.
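
A minimal scalar sketch of the predicated form without an `else` operand,
assuming per-lane results and a mask with one entry per lane.  The name
cond_matches_any_from is hypothetical and not part of the patch.

```cpp
#include <cstddef>
#include <vector>

// Illustrative scalar model, not GCC code.
// Active lanes get the match result; inactive lanes are always false,
// so there is nothing for an `else` operand to supply.
std::vector<bool> cond_matches_any_from(const std::vector<bool>& mask,
                                        const std::vector<int>& variants,
                                        const std::vector<int>& invariants) {
  std::vector<bool> result(variants.size(), false);
  for (std::size_t i = 0; i < variants.size(); ++i) {
    // Lanes without a mask entry are treated as inactive.
    if (i >= mask.size() || !mask[i])
      continue;
    for (int candidate : invariants) {
      if (variants[i] == candidate) {
        result[i] = true;
        break;
      }
    }
  }
  return result;
}
```

In this model a true lane can only mean "active and matched"; letting an
`else` value set an inactive lane to true would be indistinguishable from a
real match.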

> These IFNs have corresponding optabs. It is up to the backends to support
> them further. All the optabs are declared as "conversion direct" optabs,
> since two modes are being used. It is expected that the first mode
> corresponds to the result, while the second mode represents the input.
> 

Conversion optabs don't mean that two modes are being used.  A conversion optab
is required when two modes are needed to determine which operation to perform.
(The text in optabs.def is slightly simplistic here.)

An example is sign/zero extension, where for instance from V4QI we can extend
to either V4HI or V4SI.  As such, one type is not sufficient to determine the
operation.

This is not the case for these instructions.  The types of the input vector
operands are always the same and the result type is just
truth_type_for (<input>).  So they should be direct optabs, as the input mode
alone is enough to determine the output type and the types of the other
operands.
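
The distinction can be pictured with a toy lookup table.  This is only a
sketch: the Mode enum, the tables and truth_mode_for below are stand-ins made
up for illustration and do not reflect how GCC stores optabs.

```cpp
#include <map>
#include <string>
#include <utility>

// Toy stand-ins for machine modes; not GCC's machine_mode values.
enum class Mode { V4QI, V4HI, V4SI, V4BI };

// Conversion-style table: the operation is identified by (result, input).
// From V4QI there are two valid extensions, so the input alone is ambiguous.
const std::map<std::pair<Mode, Mode>, std::string> extend_table = {
    {{Mode::V4HI, Mode::V4QI}, "extend V4QI to V4HI"},
    {{Mode::V4SI, Mode::V4QI}, "extend V4QI to V4SI"},
};

// Direct-style table: the input mode alone identifies the operation,
// because the result mode is a function of the input mode.
const std::map<Mode, std::string> match_table = {
    {Mode::V4QI, "match on V4QI"},
    {Mode::V4HI, "match on V4HI"},
};

// Hypothetical analogue of truth_type_for in this toy model: every
// four-lane input maps to the same four-lane boolean mode.
Mode truth_mode_for(Mode) { return Mode::V4BI; }

// Number of distinct extensions available from a given input mode.
int count_extensions_from(Mode input) {
  int n = 0;
  for (const auto& entry : extend_table)
    if (entry.first.second == input)
      ++n;
  return n;
}
```

With two possible results for one input, the extension lookup needs both
modes as its key, while the match lookup needs only the input mode.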

> The modes are required because the output of the `match / nmatch`
> instructions is a vector of booleans (or a mask), while the inputs
> are the arrays of variants and invariants (which are INT types
> of different sizes).

Which brings up the next design point of these instructions.  In SVE, since the
"invariant" argument has to be built up element-wise from scalar invariants, its
size is (currently) limited to the minimum supported vector size (and does not
change with -msve-vector-bits=..) because for VLA vectorization that's the only
safe size we can use.  I see that in the vectorizer pattern you check this limit
by reading coeff[0] of GET_MODE_NUNITS.  I'll get to that later.

But for this one, I think the IFN should contain a scalar constant denoting the
restriction on the "invariant" argument of the IFN (or a fixed-size mode used
here, though a fixed-size mode would introduce other issues and make things less
extensible).

This scalar argument would be similar to the TBAA cookie we use for
gather/scatters, the intention being to encode the semantics of the
instruction.  Other passes can modify the invariant argument and extend or
shrink it, neither of which is valid, so we need a way to represent this
limitation.  To me it makes more sense to make this explicit.
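
One possible reading of that proposal, as a scalar sketch: the call carries an
explicit count of meaningful invariant elements, and only those take part in
the comparison.  The parameter n_invariants and the function name are
hypothetical, and the exact encoding of the restriction is an open design
question here, not something the patch defines.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Illustrative scalar model, not GCC code.
// n_invariants plays the role of the scalar constant on the IFN call:
// it states how many leading elements of invariants are meaningful.
std::vector<bool> matches_any_from_n(const std::vector<int>& variants,
                                     const std::vector<int>& invariants,
                                     std::size_t n_invariants) {
  // A shrunk invariant vector can no longer satisfy the stated count.
  if (n_invariants > invariants.size())
    throw std::invalid_argument("invariant vector shorter than stated");

  std::vector<bool> result(variants.size(), false);
  for (std::size_t i = 0; i < variants.size(); ++i) {
    // Elements past n_invariants are ignored, so extending the vector
    // cannot change the result.
    for (std::size_t j = 0; j < n_invariants; ++j) {
      if (variants[i] == invariants[j]) {
        result[i] = true;
        break;
      }
    }
  }
  return result;
}
```

Because the count travels with the call, a pass that widens or narrows the
invariant operand either leaves the result unchanged or produces something
that is detectably invalid.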

Any thoughts on these Richi?

Thanks,
Tamar

> 
> gcc/ChangeLog:
> 
>       * internal-fn.def: New IFN for MATCH and NMATCH (including
>                          conditional versions).
>       * internal-fn.cc: Define parameter positions for new IFNs.
>       * optabs.def: New optabs for MATCH and NMATCH (including
>                     conditional versions).
> 
> Change-Id: I173c5989b844137a600622ebfa5c474f8829321b
> ---
>  gcc/internal-fn.cc  | 33 ++++++++++++++++++++++++++++++++-
>  gcc/internal-fn.def | 11 +++++++++++
>  gcc/optabs.def      |  6 ++++++
>  3 files changed, 49 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> index d879568c6e3e..bae1cb38d433 100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -179,6 +179,7 @@ init_internal_fns ()
>  #define unary_direct { 0, 0, true }
>  #define unary_convert_direct { -1, 0, true }
>  #define binary_direct { 0, 0, true }
> +#define binary_convert_direct { -1, 1, true }
>  #define ternary_direct { 0, 0, true }
>  #define cond_unary_direct { 1, 1, true }
>  #define cond_binary_direct { 1, 1, true }
> @@ -186,6 +187,8 @@ init_internal_fns ()
>  #define cond_len_unary_direct { 1, 1, true }
>  #define cond_len_binary_direct { 1, 1, true }
>  #define cond_len_ternary_direct { 1, 1, true }
> +#define cond_binary_convert_direct { -1, 1, true }
> +#define cond_len_binary_convert_direct { -1, 1, true }
>  #define while_direct { 0, 2, false }
>  #define fold_extract_direct { 2, 2, false }
>  #define fold_len_extract_direct { 2, 2, false }
> @@ -4253,6 +4256,15 @@ expand_reduc_sbool_optab_fn (internal_fn fn, gcall *stmt, direct_optab optab)
>  #define expand_unary_convert_optab_fn(FN, STMT, OPTAB) \
>    expand_convert_optab_fn (FN, STMT, OPTAB, 1)
> 
> +#define expand_binary_convert_optab_fn(FN, STMT, OPTAB) \
> +  expand_convert_optab_fn (FN, STMT, OPTAB, 2)
> +
> +#define expand_cond_binary_convert_optab_fn(FN, STMT, OPTAB) \
> +  expand_convert_optab_fn (FN, STMT, OPTAB, 4)
> +
> +#define expand_cond_len_binary_convert_optab_fn(FN, STMT, OPTAB) \
> +  expand_convert_optab_fn (FN, STMT, OPTAB, 6)
> +
>  #define expand_vec_extract_optab_fn(FN, STMT, OPTAB) \
>    expand_convert_optab_fn (FN, STMT, OPTAB, 2)
> 
> @@ -4330,6 +4342,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
>  #define direct_unary_optab_supported_p direct_optab_supported_p
>  #define direct_unary_convert_optab_supported_p convert_optab_supported_p
>  #define direct_binary_optab_supported_p direct_optab_supported_p
> +#define direct_binary_convert_optab_supported_p convert_optab_supported_p
>  #define direct_ternary_optab_supported_p direct_optab_supported_p
>  #define direct_cond_unary_optab_supported_p direct_optab_supported_p
>  #define direct_cond_binary_optab_supported_p direct_optab_supported_p
> @@ -4337,6 +4350,8 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
>  #define direct_cond_len_unary_optab_supported_p direct_optab_supported_p
>  #define direct_cond_len_binary_optab_supported_p direct_optab_supported_p
>  #define direct_cond_len_ternary_optab_supported_p direct_optab_supported_p
> +#define direct_cond_binary_convert_optab_supported_p convert_optab_supported_p
> +#define direct_cond_len_binary_convert_optab_supported_p convert_optab_supported_p
>  #define direct_crc_optab_supported_p convert_optab_supported_p
>  #define direct_mask_load_optab_supported_p convert_optab_supported_p
>  #define direct_load_lanes_optab_supported_p multi_vector_optab_supported_p
> @@ -4834,7 +4849,9 @@ get_conditional_len_internal_fn (tree_code code)
>    T (ROUND) \
>    T (FLOOR) \
>    T (RINT) \
> -  T (CEIL)
> +  T (CEIL) \
> +  T (MATCH_EQ) \
> +  T (MATCH_NE)
> 
>  /* Return a function that only performs internal function FN when a
>     certain condition is met and that uses a given fallback value otherwise.
> @@ -4890,6 +4907,10 @@ get_len_internal_fn (internal_fn fn)
>        return IFN_MASK_LEN_LOAD_LANES;
>      case IFN_MASK_GATHER_LOAD:
>        return IFN_MASK_LEN_GATHER_LOAD;
> +    case IFN_MATCH_EQ:
> +      return IFN_COND_LEN_MATCH_EQ;
> +    case IFN_MATCH_NE:
> +      return IFN_COND_LEN_MATCH_NE;
>      default:
>        return IFN_LAST;
>      }
> @@ -5106,6 +5127,8 @@ internal_fn_len_index (internal_fn fn)
>      case IFN_COND_LEN_XOR:
>      case IFN_COND_LEN_SHL:
>      case IFN_COND_LEN_SHR:
> +    case IFN_COND_LEN_MATCH_EQ:
> +    case IFN_COND_LEN_MATCH_NE:
>      case IFN_MASK_LEN_STRIDED_STORE:
>        return 4;
> 
> @@ -5179,6 +5202,10 @@ internal_fn_else_index (internal_fn fn)
>      case IFN_COND_LEN_XOR:
>      case IFN_COND_LEN_SHL:
>      case IFN_COND_LEN_SHR:
> +    case IFN_COND_MATCH_EQ:
> +    case IFN_COND_MATCH_NE:
> +    case IFN_COND_LEN_MATCH_EQ:
> +    case IFN_COND_LEN_MATCH_NE:
>        return 3;
> 
>      case IFN_MASK_LOAD:
> @@ -5225,6 +5252,10 @@ internal_fn_mask_index (internal_fn fn)
>      case IFN_MASK_LEN_STORE_LANES:
>      case IFN_MASK_LEN_LOAD:
>      case IFN_MASK_LEN_STORE:
> +    case IFN_COND_MATCH_EQ:
> +    case IFN_COND_MATCH_NE:
> +    case IFN_COND_LEN_MATCH_EQ:
> +    case IFN_COND_LEN_MATCH_NE:
>        return 2;
> 
>      case IFN_MASK_LEN_STRIDED_LOAD:
> diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> index 084a92716312..de3d71ea981d 100644
> --- a/gcc/internal-fn.def
> +++ b/gcc/internal-fn.def
> @@ -246,6 +246,17 @@ DEF_INTERNAL_OPTAB_FN (VEC_SET, ECF_CONST | ECF_NOTHROW, vec_set, vec_set)
>  DEF_INTERNAL_OPTAB_FN (VEC_EXTRACT, ECF_CONST | ECF_NOTHROW,
>                      vec_extract, vec_extract)
> 
> +// Internal functions for Match EQ and Match NE patterns
> +// (conjunction of inequalities or disjunction of equalities)
> +DEF_INTERNAL_OPTAB_FN (MATCH_EQ, ECF_PURE | ECF_NOTHROW,
> +                    vec_match_eq, binary_convert)
> +DEF_INTERNAL_OPTAB_FN (MATCH_NE, ECF_PURE | ECF_NOTHROW,
> +                    vec_match_ne, binary_convert)
> +DEF_INTERNAL_COND_FN (MATCH_EQ, ECF_PURE | ECF_NOTHROW,
> +                   vec_match_eq, binary_convert)
> +DEF_INTERNAL_COND_FN (MATCH_NE, ECF_PURE | ECF_NOTHROW,
> +                   vec_match_ne, binary_convert)
> +
>  DEF_INTERNAL_OPTAB_FN (LEN_STORE, 0, len_store, len_store)
>  DEF_INTERNAL_OPTAB_FN (MASK_LEN_STORE, 0, mask_len_store, mask_len_store)
> 
> diff --git a/gcc/optabs.def b/gcc/optabs.def
> index 193f42a728a2..6d9007730e8b 100644
> --- a/gcc/optabs.def
> +++ b/gcc/optabs.def
> @@ -99,6 +99,12 @@ OPTAB_CD(vcond_mask_optab, "vcond_mask_$a$b")
>  OPTAB_CD(vec_cmp_optab, "vec_cmp$a$b")
>  OPTAB_CD(vec_cmpu_optab, "vec_cmpu$a$b")
>  OPTAB_CD(vec_cmpeq_optab, "vec_cmpeq$a$b")
> +OPTAB_CD(vec_match_eq_optab, "vec_match_$a$b")
> +OPTAB_CD(vec_match_ne_optab, "vec_nmatch_$a$b")
> +OPTAB_CD(cond_vec_match_eq_optab, "vec_match_cond_$a$b")
> +OPTAB_CD(cond_vec_match_ne_optab, "vec_nmatch_cond_$a$b")
> +OPTAB_CD(cond_len_vec_match_eq_optab, "vec_match_cond_len_$a$b")
> +OPTAB_CD(cond_len_vec_match_ne_optab, "vec_nmatch_cond_len_$a$b")
>  OPTAB_CD(maskload_optab, "maskload$a$b")
>  OPTAB_CD(maskstore_optab, "maskstore$a$b")
>  OPTAB_CD(mask_len_load_optab, "mask_len_load$a$b")
> --
> 2.52.0
