RE: [PATCH 1/4]middle-end Support not decomposing specific divisions during vectorization.

2022-11-09 Thread Kyrylo Tkachov via Gcc-patches
Hi Tamar,

> -----Original Message-----
> From: Gcc-patches On Behalf Of Tamar
> Christina via Gcc-patches
> Sent: Friday, September 23, 2022 10:33 AM
> To: gcc-patches@gcc.gnu.org
> Cc: nd ; rguent...@suse.de
> Subject: [PATCH 1/4]middle-end Support not decomposing specific divisions
> during vectorization.
> 
> Hi All,
> 
> In plenty of image and video processing code it's common to modify pixel
> values
> by a widening operation and then scale them back into range by dividing by
> 255.
> 
> e.g.:
> 
>    x = y / (2 ^ (bitsize (y)/2) - 1)
> 
> This patch adds a new target hook can_special_div_by_const, similar to
> can_vec_perm which can be called to check if a target will handle a particular
> division in a special way in the back-end.
> 
> The vectorizer will then vectorize the division using the standard tree code
> and at expansion time the hook is called again to generate the code for the
> division.
> 
> A lot of the changes in the patch are to pass down the tree operands in all
> paths
> that can lead to the divmod expansion so that the target hook always has the
> type of the expression you're expanding since the types can change the
> expansion.
> 
> Bootstrapped and regtested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu
> with no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * expmed.h (expand_divmod): Pass tree operands down in addition
> to RTX.
>   * expmed.cc (expand_divmod): Likewise.
>   * explow.cc (round_push, align_dynamic_address): Likewise.
>   * expr.cc (force_operand, expand_expr_divmod): Likewise.
>   * optabs.cc (expand_doubleword_mod,
> expand_doubleword_divmod):
>   Likewise.
>   * target.h: Include tree-core.
>   * target.def (can_special_div_by_const): New.
>   * targhooks.cc (default_can_special_div_by_const): New.
>   * targhooks.h (default_can_special_div_by_const): New.
>   * tree-vect-generic.cc (expand_vector_operation): Use it.
>   * doc/tm.texi.in: Document it.
>   * doc/tm.texi: Regenerate.
>   * tree-vect-patterns.cc (vect_recog_divmod_pattern): Check for
> support.
>   * tree-vect-stmts.cc (vectorizable_operation): Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/vect-div-bitmask-1.c: New test.
>   * gcc.dg/vect/vect-div-bitmask-2.c: New test.
>   * gcc.dg/vect/vect-div-bitmask-3.c: New test.
>   * gcc.dg/vect/vect-div-bitmask.h: New file.
> 
> --- inline copy of patch --
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index
> 92bda1a7e14a3c9ea63e151e4a49a818bf4d1bdb..adba9fe97a9b43729c5e86d
> 244a2a23e76cac097 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -6112,6 +6112,22 @@ instruction pattern.  There is no need for the
> hook to handle these two
>  implementation approaches itself.
>  @end deftypefn
> 
> +@deftypefn {Target Hook} bool
> TARGET_VECTORIZE_CAN_SPECIAL_DIV_BY_CONST (enum @var{tree_code},
> tree @var{vectype}, tree @var{treeop0}, tree @var{treeop1}, rtx
> *@var{output}, rtx @var{in0}, rtx @var{in1})
> +This hook is used to test whether the target has a special method of
> +division of vectors of type @var{vectype} using the two operands
> @code{treeop0},
> +and @code{treeop1} and producing a vector of type @var{vectype}.  The
> division
> +will then not be decomposed by the and kept as a div.

I think the grammar here is wonky, can you reword this sentence please?
(I was just reading this patch to understand the optab semantics further in the
series)
Thanks,
Kyrill

> +
> +When the hook is being used to test whether the target supports a special
> +divide, @var{in0}, @var{in1}, and @var{output} are all null.  When the hook
> +is being used to emit a division, @var{in0} and @var{in1} are the source
> +vectors of type @var{vectype} and @var{output} is the destination vector
> of
> +type @var{vectype}.
> +
> +Return true if the operation is possible, emitting instructions for it
> +if rtxes are provided and updating @var{output}.
> +@end deftypefn
> +
>  @deftypefn {Target Hook} tree
> TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION (unsigned
> @var{code}, tree @var{vec_type_out}, tree @var{vec_type_in})
>  This hook should return the decl of a function that implements the
>  vectorized variant of the function with the @code{combined_fn} code
> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> index
> 112462310b134705d860153294287cfd7d4af81d..d5a745a02acdf051ea1da1b
> 04076d058c24ce093 100644
> --- a/gcc/doc/tm.texi.in
> +++ b/gcc/doc/tm.texi.in
> @@ -4164,6 +4164,8 @@ address;  but often a machine-dependent strategy
> can gene

Re: [PATCH 1/4]middle-end Support not decomposing specific divisions during vectorization.

2022-11-09 Thread Tamar Christina via Gcc-patches
Ah sorry, I missed that one.

Thanks,
Tamar


From: Richard Biener 
Sent: Wednesday, November 9, 2022 8:01 AM
To: Tamar Christina 
Cc: gcc-patches@gcc.gnu.org ; nd ; 
jeffreya...@gmail.com 
Subject: RE: [PATCH 1/4]middle-end Support not decomposing specific divisions 
during vectorization.

On Tue, 8 Nov 2022, Tamar Christina wrote:

> Ping.

Jeff approved this already.  I think it's OK if the rest of the series
is approved.

Richard.

RE: [PATCH 1/4]middle-end Support not decomposing specific divisions during vectorization.

2022-11-09 Thread Richard Biener via Gcc-patches
On Tue, 8 Nov 2022, Tamar Christina wrote:

> Ping.

Jeff approved this already.  I think it's OK if the rest of the series
is approved.

Richard.

RE: [PATCH 1/4]middle-end Support not decomposing specific divisions during vectorization.

2022-11-08 Thread Tamar Christina via Gcc-patches
Ping.

Re: [PATCH 1/4]middle-end Support not decomposing specific divisions during vectorization.

2022-10-31 Thread Jeff Law via Gcc-patches



On 10/31/22 05:34, Tamar Christina wrote:

> > The type of the expression should be available via the mode and the
> > signedness, no?  So maybe to avoid having both RTX and TREE on the target
> > hook pass it a wide_int instead for the divisor?
>
> Done.
>
> Bootstrapped and regtested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu
> with no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>   * expmed.h (expand_divmod): Pass tree operands down in addition to RTX.
>   * expmed.cc (expand_divmod): Likewise.
>   * explow.cc (round_push, align_dynamic_address): Likewise.
>   * expr.cc (force_operand, expand_expr_divmod): Likewise.
>   * optabs.cc (expand_doubleword_mod, expand_doubleword_divmod):
>   Likewise.
>   * target.h: Include tree-core.
>   * target.def (can_special_div_by_const): New.
>   * targhooks.cc (default_can_special_div_by_const): New.
>   * targhooks.h (default_can_special_div_by_const): New.
>   * tree-vect-generic.cc (expand_vector_operation): Use it.
>   * doc/tm.texi.in: Document it.
>   * doc/tm.texi: Regenerate.
>   * tree-vect-patterns.cc (vect_recog_divmod_pattern): Check for support.
>   * tree-vect-stmts.cc (vectorizable_operation): Likewise.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.dg/vect/vect-div-bitmask-1.c: New test.
>   * gcc.dg/vect/vect-div-bitmask-2.c: New test.
>   * gcc.dg/vect/vect-div-bitmask-3.c: New test.
>   * gcc.dg/vect/vect-div-bitmask.h: New file.
>
> --- inline copy of patch ---


OK for the trunk.


Jeff



RE: [PATCH 1/4]middle-end Support not decomposing specific divisions during vectorization.

2022-10-31 Thread Tamar Christina via Gcc-patches
> 
> The type of the expression should be available via the mode and the
> signedness, no?  So maybe to avoid having both RTX and TREE on the target
> hook pass it a wide_int instead for the divisor?
> 

Done.

Bootstrapped and regtested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu
with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

* expmed.h (expand_divmod): Pass tree operands down in addition to RTX.
* expmed.cc (expand_divmod): Likewise.
* explow.cc (round_push, align_dynamic_address): Likewise.
* expr.cc (force_operand, expand_expr_divmod): Likewise.
* optabs.cc (expand_doubleword_mod, expand_doubleword_divmod):
Likewise.
* target.h: Include tree-core.
* target.def (can_special_div_by_const): New.
* targhooks.cc (default_can_special_div_by_const): New.
* targhooks.h (default_can_special_div_by_const): New.
* tree-vect-generic.cc (expand_vector_operation): Use it.
* doc/tm.texi.in: Document it.
* doc/tm.texi: Regenerate.
* tree-vect-patterns.cc (vect_recog_divmod_pattern): Check for support.
* tree-vect-stmts.cc (vectorizable_operation): Likewise.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/vect-div-bitmask-1.c: New test.
* gcc.dg/vect/vect-div-bitmask-2.c: New test.
* gcc.dg/vect/vect-div-bitmask-3.c: New test.
* gcc.dg/vect/vect-div-bitmask.h: New file.

--- inline copy of patch ---

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 92bda1a7e14a3c9ea63e151e4a49a818bf4d1bdb..a29f5c39be3f0927f8ef6e094c7a712c0604fb77 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6112,6 +6112,22 @@ instruction pattern.  There is no need for the hook to handle these two
 implementation approaches itself.
 @end deftypefn
 
+@deftypefn {Target Hook} bool TARGET_VECTORIZE_CAN_SPECIAL_DIV_BY_CONST (enum @var{tree_code}, tree @var{vectype}, wide_int @var{constant}, rtx *@var{output}, rtx @var{in0}, rtx @var{in1})
+This hook is used to test whether the target has a special method of
+division of vectors of type @var{vectype} using the value @var{constant},
+and producing a vector of type @var{vectype}.  The division
+will then not be decomposed by the and kept as a div.
+
+When the hook is being used to test whether the target supports a special
+divide, @var{in0}, @var{in1}, and @var{output} are all null.  When the hook
+is being used to emit a division, @var{in0} and @var{in1} are the source
+vectors of type @var{vectype} and @var{output} is the destination vector of
+type @var{vectype}.
+
+Return true if the operation is possible, emitting instructions for it
+if rtxes are provided and updating @var{output}.
+@end deftypefn
+
 @deftypefn {Target Hook} tree TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION (unsigned @var{code}, tree @var{vec_type_out}, tree @var{vec_type_in})
 This hook should return the decl of a function that implements the
 vectorized variant of the function with the @code{combined_fn} code
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 112462310b134705d860153294287cfd7d4af81d..d5a745a02acdf051ea1da1b04076d058c24ce093 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4164,6 +4164,8 @@ address;  but often a machine-dependent strategy can generate better code.
 
 @hook TARGET_VECTORIZE_VEC_PERM_CONST
 
+@hook TARGET_VECTORIZE_CAN_SPECIAL_DIV_BY_CONST
+
 @hook TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION
 
 @hook TARGET_VECTORIZE_BUILTIN_MD_VECTORIZED_FUNCTION
diff --git a/gcc/explow.cc b/gcc/explow.cc
index ddb4d6ae3600542f8d2bb5617cdd3933a9fae6c0..568e0eb1a158c696458ae678f5e346bf34ba0036 100644
--- a/gcc/explow.cc
+++ b/gcc/explow.cc
@@ -1037,7 +1037,7 @@ round_push (rtx size)
  TRUNC_DIV_EXPR.  */
   size = expand_binop (Pmode, add_optab, size, alignm1_rtx,
   NULL_RTX, 1, OPTAB_LIB_WIDEN);
-  size = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, size, align_rtx,
+  size = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, NULL, NULL, size, align_rtx,
NULL_RTX, 1);
   size = expand_mult (Pmode, size, align_rtx, NULL_RTX, 1);
 
@@ -1203,7 +1203,7 @@ align_dynamic_address (rtx target, unsigned required_align)
 gen_int_mode (required_align / BITS_PER_UNIT - 1,
   Pmode),
 NULL_RTX, 1, OPTAB_LIB_WIDEN);
-  target = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, target,
+  target = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, NULL, NULL, target,
  gen_int_mode (required_align / BITS_PER_UNIT,
Pmode),
  NULL_RTX, 1);
diff --git a/gcc/expmed.h b/gcc/expmed.h
index 0b2538c4c6bd51dfdc772ef70bdf631c0bed8717..0db2986f11ff4a4b10b59501c6f33cb3595659b5 100644
--- a/gcc/expmed.h
+++ b/gcc/expmed.h
@@ -708,8 +708,9 @@ extern rtx expand_variable_shift (enum tree_code, machine_mode,
 extern rtx expand_shift (enum tree_code, 

Re: [PATCH 1/4]middle-end Support not decomposing specific divisions during vectorization.

2022-09-26 Thread Richard Biener via Gcc-patches
On Fri, 23 Sep 2022, Tamar Christina wrote:

> Hi All,
> 
> In plenty of image and video processing code it's common to modify pixel 
> values
> by a widening operation and then scale them back into range by dividing by 
> 255.
> 
> e.g.:
> 
>    x = y / (2 ^ (bitsize (y)/2) - 1)
> 
> This patch adds a new target hook can_special_div_by_const, similar to
> can_vec_perm which can be called to check if a target will handle a particular
> division in a special way in the back-end.
> 
> The vectorizer will then vectorize the division using the standard tree code
> and at expansion time the hook is called again to generate the code for the
> division.
> 
> A lot of the changes in the patch are to pass down the tree operands in all
> paths
> that can lead to the divmod expansion so that the target hook always has the
> type of the expression you're expanding since the types can change the
> expansion.

The type of the expression should be available via the mode and the
signedness, no?  So maybe to avoid having both RTX and TREE on the
target hook pass it a wide_int instead for the divisor?

> Bootstrapped and regtested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu
> with no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * expmed.h (expand_divmod): Pass tree operands down in addition to RTX.
>   * expmed.cc (expand_divmod): Likewise.
>   * explow.cc (round_push, align_dynamic_address): Likewise.
>   * expr.cc (force_operand, expand_expr_divmod): Likewise.
>   * optabs.cc (expand_doubleword_mod, expand_doubleword_divmod):
>   Likewise.
>   * target.h: Include tree-core.
>   * target.def (can_special_div_by_const): New.
>   * targhooks.cc (default_can_special_div_by_const): New.
>   * targhooks.h (default_can_special_div_by_const): New.
>   * tree-vect-generic.cc (expand_vector_operation): Use it.
>   * doc/tm.texi.in: Document it.
>   * doc/tm.texi: Regenerate.
>   * tree-vect-patterns.cc (vect_recog_divmod_pattern): Check for support.
>   * tree-vect-stmts.cc (vectorizable_operation): Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/vect-div-bitmask-1.c: New test.
>   * gcc.dg/vect/vect-div-bitmask-2.c: New test.
>   * gcc.dg/vect/vect-div-bitmask-3.c: New test.
>   * gcc.dg/vect/vect-div-bitmask.h: New file.
> 
> --- inline copy of patch -- 
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index 
> 92bda1a7e14a3c9ea63e151e4a49a818bf4d1bdb..adba9fe97a9b43729c5e86d244a2a23e76cac097
>  100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -6112,6 +6112,22 @@ instruction pattern.  There is no need for the hook to 
> handle these two
>  implementation approaches itself.
>  @end deftypefn
>  
> +@deftypefn {Target Hook} bool TARGET_VECTORIZE_CAN_SPECIAL_DIV_BY_CONST 
> (enum @var{tree_code}, tree @var{vectype}, tree @var{treeop0}, tree 
> @var{treeop1}, rtx *@var{output}, rtx @var{in0}, rtx @var{in1})
> +This hook is used to test whether the target has a special method of
> +division of vectors of type @var{vectype} using the two operands 
> @code{treeop0},
> +and @code{treeop1} and producing a vector of type @var{vectype}.  The 
> division
> +will then not be decomposed by the and kept as a div.
> +
> +When the hook is being used to test whether the target supports a special
> +divide, @var{in0}, @var{in1}, and @var{output} are all null.  When the hook
> +is being used to emit a division, @var{in0} and @var{in1} are the source
> +vectors of type @var{vectype} and @var{output} is the destination vector of
> +type @var{vectype}.
> +
> +Return true if the operation is possible, emitting instructions for it
> +if rtxes are provided and updating @var{output}.
> +@end deftypefn
> +
>  @deftypefn {Target Hook} tree TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION 
> (unsigned @var{code}, tree @var{vec_type_out}, tree @var{vec_type_in})
>  This hook should return the decl of a function that implements the
>  vectorized variant of the function with the @code{combined_fn} code
> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> index 
> 112462310b134705d860153294287cfd7d4af81d..d5a745a02acdf051ea1da1b04076d058c24ce093
>  100644
> --- a/gcc/doc/tm.texi.in
> +++ b/gcc/doc/tm.texi.in
> @@ -4164,6 +4164,8 @@ address;  but often a machine-dependent strategy can 
> generate better code.
>  
>  @hook TARGET_VECTORIZE_VEC_PERM_CONST
>  
> +@hook TARGET_VECTORIZE_CAN_SPECIAL_DIV_BY_CONST
> +
>  @hook TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION
>  
>  @hook TARGET_VECTORIZE_BUILTIN_MD_VECTORIZED_FUNCTION
> diff --git a/gcc/explow.cc b/gcc/explow.cc
> index 
> ddb4d6ae3600542f8d2bb5617cdd3933a9fae6c0..568e0eb1a158c696458ae678f5e346bf34ba0036
>  100644
> --- a/gcc/explow.cc
> +++ b/gcc/explow.cc
> @@ -1037,7 +1037,7 @@ round_push (rtx size)
>   TRUNC_DIV_EXPR.  */
>size = expand_binop (Pmode, add_optab, size, alignm1_rtx,
>  NULL_RTX, 1, OPTAB_LIB_WIDEN);
> -  size = expand_divmod (0, 

[PATCH 1/4]middle-end Support not decomposing specific divisions during vectorization.

2022-09-23 Thread Tamar Christina via Gcc-patches
Hi All,

In plenty of image and video processing code it's common to modify pixel values
by a widening operation and then scale them back into range by dividing by 255.

e.g.:

   x = y / (2 ^ (bitsize (y)/2) - 1)
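
To make the pattern concrete, here is a minimal C sketch of the kind of loop
described above. It is an illustration only: half_width_mask and scale_pixels
are hypothetical names, not functions from the patch or its testsuite, and the
sketch assumes 8-bit pixels whose product is widened to 16 bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Divisor of the form 2^(bitsize/2) - 1 for a type that is BITSIZE bits
   wide (hypothetical helper; BITSIZE must be below 128).  */
static inline uint64_t
half_width_mask (unsigned bitsize)
{
  return (UINT64_C (1) << (bitsize / 2)) - 1;
}

/* Scale N 8-bit pixels by an 8-bit LEVEL: the product is widened to
   16 bits and then brought back into range by dividing by 255.  */
void
scale_pixels (uint8_t *restrict pixels, uint8_t level, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      /* 255 * 255 = 65025, so the product always fits in 16 bits.  */
      uint16_t widened = (uint16_t) (pixels[i] * level);
      pixels[i] = (uint8_t) (widened / 0xff);
    }
}
```

For a 16-bit intermediate the divisor is 2^8 - 1 = 255, which is the case
called out in the description.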

This patch adds a new target hook can_special_div_by_const, similar to
can_vec_perm which can be called to check if a target will handle a particular
division in a special way in the back-end.

The vectorizer will then vectorize the division using the standard tree code
and at expansion time the hook is called again to generate the code for the
division.
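
The query-then-emit protocol can be sketched with a toy model in plain C.
Everything here is an assumption made for illustration:
toy_can_special_div_by_const is a hypothetical stand-in and not the GCC hook,
it works on arrays rather than RTL, and the shift-and-add sequence in the emit
branch is a standard identity for unsigned 16-bit division by 255 that this
message does not itself specify.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model (hypothetical, not GCC code) of a hook with two modes.
   Query mode: OUTPUT is NULL, only report whether the division is handled.
   Emit mode: OUTPUT is non-NULL, also divide IN0[0..N-1] into OUTPUT.  */
bool
toy_can_special_div_by_const (unsigned elt_bits, uint64_t divisor,
                              uint16_t *output, const uint16_t *in0, size_t n)
{
  /* This toy target only special-cases unsigned 16-bit elements / 255.  */
  if (elt_bits != 16 || divisor != 0xff)
    return false;

  /* Query mode: nothing to emit, just say yes.  */
  if (output == NULL)
    return true;

  /* Emit mode.  For every 16-bit x,
       x / 255 == (x + ((x + 257) >> 8)) >> 8
     provided the additions are done in more than 16 bits.  */
  for (size_t i = 0; i < n; i++)
    {
      uint32_t x = in0[i];
      output[i] = (uint16_t) ((x + ((x + 257) >> 8)) >> 8);
    }
  return true;
}
```

A caller would first ask with a null output to decide whether to keep the
division as is, and later call again with real operands to produce the result.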

A lot of the changes in the patch are to pass down the tree operands in all paths
that can lead to the divmod expansion so that the target hook always has the
type of the expression you're expanding since the types can change the
expansion.

Bootstrapped and regtested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu
with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

* expmed.h (expand_divmod): Pass tree operands down in addition to RTX.
* expmed.cc (expand_divmod): Likewise.
* explow.cc (round_push, align_dynamic_address): Likewise.
* expr.cc (force_operand, expand_expr_divmod): Likewise.
* optabs.cc (expand_doubleword_mod, expand_doubleword_divmod):
Likewise.
* target.h: Include tree-core.
* target.def (can_special_div_by_const): New.
* targhooks.cc (default_can_special_div_by_const): New.
* targhooks.h (default_can_special_div_by_const): New.
* tree-vect-generic.cc (expand_vector_operation): Use it.
* doc/tm.texi.in: Document it.
* doc/tm.texi: Regenerate.
* tree-vect-patterns.cc (vect_recog_divmod_pattern): Check for support.
* tree-vect-stmts.cc (vectorizable_operation): Likewise.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/vect-div-bitmask-1.c: New test.
* gcc.dg/vect/vect-div-bitmask-2.c: New test.
* gcc.dg/vect/vect-div-bitmask-3.c: New test.
* gcc.dg/vect/vect-div-bitmask.h: New file.

--- inline copy of patch -- 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 92bda1a7e14a3c9ea63e151e4a49a818bf4d1bdb..adba9fe97a9b43729c5e86d244a2a23e76cac097 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6112,6 +6112,22 @@ instruction pattern.  There is no need for the hook to handle these two
 implementation approaches itself.
 @end deftypefn
 
+@deftypefn {Target Hook} bool TARGET_VECTORIZE_CAN_SPECIAL_DIV_BY_CONST (enum @var{tree_code}, tree @var{vectype}, tree @var{treeop0}, tree @var{treeop1}, rtx *@var{output}, rtx @var{in0}, rtx @var{in1})
+This hook is used to test whether the target has a special method of
+division of vectors of type @var{vectype} using the two operands @code{treeop0},
+and @code{treeop1} and producing a vector of type @var{vectype}.  The division
+will then not be decomposed by the and kept as a div.
+
+When the hook is being used to test whether the target supports a special
+divide, @var{in0}, @var{in1}, and @var{output} are all null.  When the hook
+is being used to emit a division, @var{in0} and @var{in1} are the source
+vectors of type @var{vectype} and @var{output} is the destination vector of
+type @var{vectype}.
+
+Return true if the operation is possible, emitting instructions for it
+if rtxes are provided and updating @var{output}.
+@end deftypefn
+
 @deftypefn {Target Hook} tree TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION (unsigned @var{code}, tree @var{vec_type_out}, tree @var{vec_type_in})
 This hook should return the decl of a function that implements the
 vectorized variant of the function with the @code{combined_fn} code
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 112462310b134705d860153294287cfd7d4af81d..d5a745a02acdf051ea1da1b04076d058c24ce093 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4164,6 +4164,8 @@ address;  but often a machine-dependent strategy can generate better code.
 
 @hook TARGET_VECTORIZE_VEC_PERM_CONST
 
+@hook TARGET_VECTORIZE_CAN_SPECIAL_DIV_BY_CONST
+
 @hook TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION
 
 @hook TARGET_VECTORIZE_BUILTIN_MD_VECTORIZED_FUNCTION
diff --git a/gcc/explow.cc b/gcc/explow.cc
index ddb4d6ae3600542f8d2bb5617cdd3933a9fae6c0..568e0eb1a158c696458ae678f5e346bf34ba0036 100644
--- a/gcc/explow.cc
+++ b/gcc/explow.cc
@@ -1037,7 +1037,7 @@ round_push (rtx size)
  TRUNC_DIV_EXPR.  */
   size = expand_binop (Pmode, add_optab, size, alignm1_rtx,
   NULL_RTX, 1, OPTAB_LIB_WIDEN);
-  size = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, size, align_rtx,
+  size = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, NULL, NULL, size, align_rtx,
NULL_RTX, 1);
   size = expand_mult (Pmode, size, align_rtx, NULL_RTX, 1);
 
@@ -1203,7 +1203,7 @@ align_dynamic_address (rtx target, unsigned required_align)
 gen_int_mode (required_align / BITS_PER_UNIT - 1,
   Pmode),