[Bug tree-optimization/96208] non-grouped load can be SLP vectorized for 2-element vectors case

cvs-commit at gcc dot gnu.org via Gcc-bugs Tue, 27 Jun 2023 00:48:25 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96208


--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rgue...@gcc.gnu.org>:

https://gcc.gnu.org/g:dd86a5a69cbda40cf76388a65d3317c91cb2b501

commit r14-2117-gdd86a5a69cbda40cf76388a65d3317c91cb2b501
Author: Richard Biener <rguent...@suse.de>
Date:   Thu Jun 22 11:40:46 2023 +0200

    tree-optimization/96208 - SLP of non-grouped loads

    The following extends SLP discovery to handle non-grouped loads
    in loop vectorization in the case the same load appears in all
    lanes.

    Code generation is adjusted to mimic what we do for the case
    of single element interleaving (when the load is not unit-stride)
    which is already handled by SLP.  There are some limits we
    run into because peeling for gap cannot cover all cases and
    we choose VMAT_CONTIGUOUS.  The patch does not try to address
    these issues yet.

    The main obstacle is that these loads are not
    STMT_VINFO_GROUPED_ACCESS and that's a new thing with SLP.
    I know from the past that it's not a good idea to make them
    grouped.  Instead the following massages places to deal
    with SLP loads that are not STMT_VINFO_GROUPED_ACCESS.

    There is already a testcase covering the case the PR is
    after, currently XFAILed; the following adjusts that testcase
    instead of adding another.

    I do expect to have missed some so I don't plan to push this
    on a Friday.  Still there may be feedback, so posting this
    now.

    Bootstrapped and tested on x86_64-unknown-linux-gnu.

            PR tree-optimization/96208
            * tree-vect-slp.cc (vect_build_slp_tree_1): Allow
            a non-grouped load if it is the same for all lanes.
            (vect_build_slp_tree_2): Handle not grouped loads.
            (vect_optimize_slp_pass::remove_redundant_permutations):
            Likewise.
            (vect_transform_slp_perm_load_1): Likewise.
            * tree-vect-stmts.cc (vect_model_load_cost): Likewise.
            (get_group_load_store_type): Likewise.  Handle
            invariant accesses.
            (vectorizable_load): Likewise.

            * gcc.dg/vect/slp-46.c: Adjust for new vectorizations.
            * gcc.dg/vect/bb-slp-pr65935.c: Adjust.
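
For readers unfamiliar with the term, here is a minimal sketch of the kind of loop the commit message describes: one load per iteration whose value feeds every lane of a two-lane store group. The function names dup_pairs and dup_pairs_ok are hypothetical and used only for illustration; this is not claimed to be the contents of gcc.dg/vect/slp-46.c.

```c
#include <stddef.h>

/* Hypothetical illustration.  Each iteration loads y[i] once and
   stores it to the two adjacent lanes x[2*i] and x[2*i+1].  The
   stores form a group of size 2, but the load is a single,
   non-grouped access that is the same for both lanes.  */
void
dup_pairs (double *restrict x, const double *restrict y, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    {
      x[2 * i] = y[i];
      x[2 * i + 1] = y[i];
    }
}

/* Scalar check: return 1 if x holds every element of y duplicated
   into adjacent pairs, 0 otherwise.  */
int
dup_pairs_ok (const double *x, const double *y, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    if (x[2 * i] != y[i] || x[2 * i + 1] != y[i])
      return 0;
  return 1;
}
```

Whether such a loop is actually vectorized depends on the GCC version, target and options; compiling with something like `gcc -O3 -fopt-info-vec` reports the vectorizer's decisions.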
