Re: [PATCH v2] tree-optimization/110279- Check for nested FMA chains in reassoc

2023-07-19 Thread Richard Biener via Gcc-patches
On Tue, Jul 11, 2023 at 5:00 AM Di Zhao OS via Gcc-patches
 wrote:
>
> Attached is an updated version of the patch.
>
> Based on Philipp's review, some changes:
>
> 1. Defined new enum fma_state to describe the state of FMA candidates
>for a list of operands. (Since the tests seems simple after the
>change, I didn't add predicates on it.)
> 2. Changed return type of convert_mult_to_fma_1 and convert_mult_to_fma
>to tree, to remove the in/out parameter.
> 3. Added description of return value values of rank_ops_for_fma.

I'll note that rank_ops_for_fma works on the single-use addition chain only so
there might be FMA ops "elsewhere" in the dependence chain that are not
in 'ops'.

You are using defer_p = false for the fma_deferring_state so I wonder why
you need this machinery at all?  Wouldn't the restriction only apply when
we'd actually deferred a FMA generation?  I'm CCing Martin who did this work.

But what I'm curious about is that if any of the new conditions trigger then
you leave the ops chain alone - but it _could_ already be in "bad" shape,
is there anything we could do to improve ordering?

Also

   if (ops_mult.length () >= 2 && ops_mult.length () != ops_length)
 {
+  if (maybe_le (tree_to_poly_int64 (TYPE_SIZE (type)),
+   param_avoid_fma_max_bits))
+   /* Avoid re-arrange to produce less FMA chains that can be slow.  */
+   return FMA_STATE_MULTIPLE;
+
   /* Put no-mult ops and mult ops alternately at the end of the
 queue, which is conducive to generating more FMA and reducing the
 loss of FMA when breaking the chain.  */
@@ -6829,9 +6909,9 @@ rank_ops_for_fma (vec *ops)
  if (opindex > 0)
opindex--;
}
-  return true;
+  return FMA_STATE_MULTIPLE;

so we end up parallel rewriting in any case and just avoid
"optimizing" the ops list here.
>From the PR it looks like without the rewriting we are lucky that the
FMA generation
heuristic to avoid cross backedge FMA dependences doesn't trigger
without associating
(but parallel rewrite is still good)?

For the NESTED case we avoid parallel rewriting completely, independent on
param_avoid_fma_max_bits - it seems from the PR we want more FMAs here
and the parallel rewriting makes it worse?  But I don't see exactly how.

I think these are all a bit fragile heuristics also give that reassoc
works on single-use
chains only.  The more we interwind FMA creation in widen_mult and associating
for FMA in reassoc the more I think that reassoc itself should form the FMAs?
The passes are unfortunately quite a bit separated.

Can you produce testcases for the two problematical cases in the PR?

That said, the added heuristics are not looking universally good to me without
better motivation.

> ---
> gcc/ChangeLog:
>

Missing

   PR tree-optimization/110279

so bugzilla picks up the commit.

> * tree-ssa-math-opts.cc (convert_mult_to_fma_1): Added new parameter
> check_only_p. Changed return type to tree.
> (struct fma_transformation_info): Moved to header.
> (class fma_deferring_state): Moved to header.
> (convert_mult_to_fma): Added new parameter check_only_p. Changed
> return type to tree.
> * tree-ssa-math-opts.h (struct fma_transformation_info): Moved from 
> .cc.
> (class fma_deferring_state): Moved from .cc.
> (convert_mult_to_fma): Add function decl.
> * tree-ssa-reassoc.cc (enum fma_state): Defined new enum to describe
> the state of FMA candidates for a list of operands.
> (rewrite_expr_tree_parallel): Changed boolean parameter to enum type.
> (rank_ops_for_fma): Return enum fma_state.
> (reassociate_bb): Avoid rewriting to parallel if nested FMAs are 
> found.
>
> Thanks,
> Di Zhao
>
>


Re: [PATCH v2] tree-optimization/110279- Check for nested FMA chains in reassoc

2023-07-18 Thread Richard Biener via Gcc-patches
On Mon, Jul 17, 2023 at 4:26 PM Tamar Christina via Gcc-patches
 wrote:
>
> I think Andrew is listed as maintainer for tree-ssa, or maybe it's on one of 
> the Richard's lists?

It's on my rather longish list of things to review ...

> > -Original Message-
> > From: Gcc-patches  > bounces+tamar.christina=arm@gcc.gnu.org> On Behalf Of Philipp
> > Tomsich
> > Sent: Tuesday, July 11, 2023 7:51 AM
> > To: Jakub Jelinek 
> > Cc: gcc-patches@gcc.gnu.org; Di Zhao OS
> > 
> > Subject: Re: [PATCH v2] tree-optimization/110279- Check for nested FMA
> > chains in reassoc
> >
> > Jakub,
> >
> > it looks like you did a lot of work on reassoc in the past — could you have 
> > a
> > quick look and comment?
> >
> > Thanks,
> > Philipp.
> >
> >
> > On Tue, 11 Jul 2023 at 04:59, Di Zhao OS
> >  wrote:
> > >
> > > Attached is an updated version of the patch.
> > >
> > > Based on Philipp's review, some changes:
> > >
> > > 1. Defined new enum fma_state to describe the state of FMA candidates
> > >for a list of operands. (Since the tests seems simple after the
> > >change, I didn't add predicates on it.) 2. Changed return type of
> > > convert_mult_to_fma_1 and convert_mult_to_fma
> > >to tree, to remove the in/out parameter.
> > > 3. Added description of return value values of rank_ops_for_fma.
> > >
> > > ---
> > > gcc/ChangeLog:
> > >
> > > * tree-ssa-math-opts.cc (convert_mult_to_fma_1): Added new
> > parameter
> > > check_only_p. Changed return type to tree.
> > > (struct fma_transformation_info): Moved to header.
> > > (class fma_deferring_state): Moved to header.
> > > (convert_mult_to_fma): Added new parameter check_only_p. Changed
> > > return type to tree.
> > > * tree-ssa-math-opts.h (struct fma_transformation_info): Moved 
> > > from
> > .cc.
> > > (class fma_deferring_state): Moved from .cc.
> > > (convert_mult_to_fma): Add function decl.
> > > * tree-ssa-reassoc.cc (enum fma_state): Defined new enum to 
> > > describe
> > > the state of FMA candidates for a list of operands.
> > > (rewrite_expr_tree_parallel): Changed boolean parameter to enum 
> > > type.
> > > (rank_ops_for_fma): Return enum fma_state.
> > > (reassociate_bb): Avoid rewriting to parallel if nested FMAs are 
> > > found.
> > >
> > > Thanks,
> > > Di Zhao
> > >
> > >


RE: [PATCH v2] tree-optimization/110279- Check for nested FMA chains in reassoc

2023-07-17 Thread Tamar Christina via Gcc-patches
I think Andrew is listed as maintainer for tree-ssa, or maybe it's on one of 
the Richard's lists?

> -Original Message-
> From: Gcc-patches  bounces+tamar.christina=arm@gcc.gnu.org> On Behalf Of Philipp
> Tomsich
> Sent: Tuesday, July 11, 2023 7:51 AM
> To: Jakub Jelinek 
> Cc: gcc-patches@gcc.gnu.org; Di Zhao OS
> 
> Subject: Re: [PATCH v2] tree-optimization/110279- Check for nested FMA
> chains in reassoc
> 
> Jakub,
> 
> it looks like you did a lot of work on reassoc in the past — could you have a
> quick look and comment?
> 
> Thanks,
> Philipp.
> 
> 
> On Tue, 11 Jul 2023 at 04:59, Di Zhao OS
>  wrote:
> >
> > Attached is an updated version of the patch.
> >
> > Based on Philipp's review, some changes:
> >
> > 1. Defined new enum fma_state to describe the state of FMA candidates
> >for a list of operands. (Since the tests seems simple after the
> >change, I didn't add predicates on it.) 2. Changed return type of
> > convert_mult_to_fma_1 and convert_mult_to_fma
> >to tree, to remove the in/out parameter.
> > 3. Added description of return value values of rank_ops_for_fma.
> >
> > ---
> > gcc/ChangeLog:
> >
> > * tree-ssa-math-opts.cc (convert_mult_to_fma_1): Added new
> parameter
> > check_only_p. Changed return type to tree.
> > (struct fma_transformation_info): Moved to header.
> > (class fma_deferring_state): Moved to header.
> > (convert_mult_to_fma): Added new parameter check_only_p. Changed
> > return type to tree.
> > * tree-ssa-math-opts.h (struct fma_transformation_info): Moved from
> .cc.
> > (class fma_deferring_state): Moved from .cc.
> > (convert_mult_to_fma): Add function decl.
> > * tree-ssa-reassoc.cc (enum fma_state): Defined new enum to describe
> > the state of FMA candidates for a list of operands.
> > (rewrite_expr_tree_parallel): Changed boolean parameter to enum 
> > type.
> > (rank_ops_for_fma): Return enum fma_state.
> > (reassociate_bb): Avoid rewriting to parallel if nested FMAs are 
> > found.
> >
> > Thanks,
> > Di Zhao
> >
> >


Re: [PATCH v2] tree-optimization/110279- Check for nested FMA chains in reassoc

2023-07-11 Thread Philipp Tomsich
Jakub,

it looks like you did a lot of work on reassoc in the past — could you
have a quick look and comment?

Thanks,
Philipp.


On Tue, 11 Jul 2023 at 04:59, Di Zhao OS  wrote:
>
> Attached is an updated version of the patch.
>
> Based on Philipp's review, some changes:
>
> 1. Defined new enum fma_state to describe the state of FMA candidates
>for a list of operands. (Since the tests seems simple after the
>change, I didn't add predicates on it.)
> 2. Changed return type of convert_mult_to_fma_1 and convert_mult_to_fma
>to tree, to remove the in/out parameter.
> 3. Added description of return value values of rank_ops_for_fma.
>
> ---
> gcc/ChangeLog:
>
> * tree-ssa-math-opts.cc (convert_mult_to_fma_1): Added new parameter
> check_only_p. Changed return type to tree.
> (struct fma_transformation_info): Moved to header.
> (class fma_deferring_state): Moved to header.
> (convert_mult_to_fma): Added new parameter check_only_p. Changed
> return type to tree.
> * tree-ssa-math-opts.h (struct fma_transformation_info): Moved from 
> .cc.
> (class fma_deferring_state): Moved from .cc.
> (convert_mult_to_fma): Add function decl.
> * tree-ssa-reassoc.cc (enum fma_state): Defined new enum to describe
> the state of FMA candidates for a list of operands.
> (rewrite_expr_tree_parallel): Changed boolean parameter to enum type.
> (rank_ops_for_fma): Return enum fma_state.
> (reassociate_bb): Avoid rewriting to parallel if nested FMAs are 
> found.
>
> Thanks,
> Di Zhao
>
>