On Wed, Jun 26, 2024 at 4:58 PM Feng Xue OS <[email protected]> wrote:
>
> Allow shift-by-induction for slp node, when it is single lane, which is
> aligned with the original loop-based handling.
OK.
Did you try whether we handle multiple lanes correctly?  The simplest
case would be a loop body with, say,

  a[2*i] = x << i;
  a[2*i+1] = x << i;

I'm not sure how we match up multiple (different) inductions in the
same SLP node, but one node could be x << (i + 1).
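
Spelled out as a compilable sketch, the two-lane case might look like the
following.  The function names, the bound N, the unsigned element type and
the second variant (which puts x << (i + 1) in the odd lane) are
assumptions for illustration, not taken from the patch or from an existing
testcase.

```c
#define N 16

/* Hypothetical testcase sketch: both store lanes use the same shift,
   whose amount is the loop induction variable i.  The caller must
   provide room for 2*N elements in A.  */
void
shift_by_iv_two_lanes (unsigned *a, unsigned x)
{
  for (int i = 0; i < N; i++)
    {
      a[2*i] = x << i;
      a[2*i+1] = x << i;
    }
}

/* Hypothetical variant with a different induction per lane:
   i in the even lane and i + 1 in the odd lane.  */
void
shift_by_iv_mixed_lanes (unsigned *a, unsigned x)
{
  for (int i = 0; i < N; i++)
    {
      a[2*i] = x << i;
      a[2*i+1] = x << (i + 1);
    }
}
```

With N at 16 the shift amount stays below the width of a 32-bit unsigned,
so the scalar results are well defined and can serve as the reference for
a vectorized run.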

Note you enable a nested cycle def the same way.  I think that could be
treated like an internal def, and also generally.  There's probably no
test coverage for that though.  Something like

  for (m ...)
    {
      i = m;
      j = i + 1;
      for (k ...)
        {
          res1 += k << i;
          res2 += k << j;
          i++;
          j++;
        }
      a[2*m] = res1;
      a[2*m+1] = res2;
    }

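
A self-contained version of that nested-cycle loop could be sketched as
below.  The function name, the bounds M and K, the unsigned types and the
choice to reset res1/res2 on every outer iteration are assumptions filled
in for illustration; the outline above leaves them open.

```c
#define M 8
#define K 8

/* Hypothetical testcase sketch: the shift amounts i and j are
   inductions of the inner loop, and each is re-initialized from the
   outer induction m on every outer iteration.  The caller must provide
   room for 2*M elements in A.  */
void
shift_by_nested_iv (unsigned *a)
{
  for (int m = 0; m < M; m++)
    {
      unsigned res1 = 0, res2 = 0;	/* Assumed reset per outer iteration.  */
      int i = m;
      int j = i + 1;
      for (int k = 0; k < K; k++)
	{
	  res1 += (unsigned) k << i;
	  res2 += (unsigned) k << j;
	  i++;
	  j++;
	}
      a[2*m] = res1;
      a[2*m+1] = res2;
    }
}
```

The largest shift amount here is M + K - 1, i.e. 15, so the shifts stay
in range for a 32-bit unsigned.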
Thanks,
Richard.
> Thanks,
> Feng
>
> ---
> gcc/tree-vect-stmts.cc | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index ca6052662a3..840e162c7f0 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -6247,7 +6247,7 @@ vectorizable_shift (vec_info *vinfo,
>    if ((dt[1] == vect_internal_def
>         || dt[1] == vect_induction_def
>         || dt[1] == vect_nested_cycle)
> -      && !slp_node)
> +      && (!slp_node || SLP_TREE_LANES (slp_node) == 1))
>      scalar_shift_arg = false;
>    else if (dt[1] == vect_constant_def
>            || dt[1] == vect_external_def
> --
> 2.17.1