This patch series extends the GCC vectorizer so that it is able to
vectorize uncounted loops, such as the following example:
  while (str[i] != 0)
    str[i] ^= 0x20;
Though this implementation has been demonstrated not to cause any
regressions, either in the GCC testsuite or in performance, the scope
of this patch series is limited.  It lays the groundwork for the
vectorization of such loops but leaves further features, namely
peeling for alignment and alias analysis, to be enabled in separate
patches.
This submission provides only limited unit tests and is made with the
primary purpose of getting feedback on the design choices made.
Further tests used in development will be added in the run-up to the
stage 1 deadline.
As such, this patch series is split into a large number of patches.
The intent is for each commit message to explain the rationale behind
its change, making it easier to give feedback on any individual
choice.
The work borrows heavily from, and builds upon, the previous
early-break vectorization work, where an early-break loop is one with
both a counting IV exit and one or more non-IV-counting exits.
By having all exits behave as early break exits, we are able to extend
the types of loops which get vectorized while keeping changes to the
code base fairly minimal.
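
To make the terminology concrete, the three kinds of loop discussed
above can be sketched as below.  This is only an illustration: the
function names are hypothetical and do not appear in the patches or in
the testsuite.

```c
#include <assert.h>
#include <stddef.h>

/* Counted loop: the trip count N is known on entry and the only exit
   is the IV-counting one.  */
void
flip_counted (char *str, size_t n)
{
  assert (str != NULL);
  for (size_t i = 0; i < n; i++)
    str[i] ^= 0x20;
}

/* Early-break loop: one IV-counting exit (i < n) plus one exit that
   depends on the data (str[i] == 0).  Returns the number of
   characters modified.  */
size_t
flip_early_break (char *str, size_t n)
{
  assert (str != NULL);
  size_t i;
  for (i = 0; i < n; i++)
    {
      if (str[i] == 0)
	break;
      str[i] ^= 0x20;
    }
  return i;
}

/* Uncounted loop: there is no IV-counting exit at all, so the trip
   count depends only on the data.  Returns the number of characters
   modified.  */
size_t
flip_uncounted (char *str)
{
  assert (str != NULL);
  size_t i = 0;
  while (str[i] != 0)
    {
      str[i] ^= 0x20;
      i++;
    }
  return i;
}
```

In the last function every exit is data-dependent, which is why
treating all exits as early-break exits is a natural fit.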
The changes can be broadly described as follows:
- Relax the constraint that loops must have a known iteration count
in order to be considered for vectorization.
- Implement a way of retrieving whether loop_vinfo refers to an
uncounted loop.  This is done with the LOOP_VINFO_NITERS_UNCOUNTED_P
accessor macro for loop_vinfo.
- Categorize uncounted loops as satisfying the criterion given in
LOOP_VINFO_EARLY_BREAKS_VECT_PEELED. This ensures that whatever
exit is assigned to the "main" exit is given equal treatment to
early-break exits.
- Make all exit conditions early breaks for uncounted loops: Given
the absence of any IV-counting exits, it makes no sense for any exit
to be associated with LOOP_VINFO_LOOP_IV_COND. Consequently, all
exit conditions are assigned to LOOP_VINFO_LOOP_CONDS.
- Choose the last exit in the loop as the loop's "main" exit.  This
choice is made as it facilitates the job of implementing peeling for
alignment, wherein it is required that the effective latch be empty.
- Ensure that we don't segfault in functions which attempt to derive
useful information from the niter count.  For such functions, we
return some value such as `NULL_TREE'.  This allows the calling
function to choose how to deal with the unknown.  Where types would be
derived from `TREE_TYPE (niters)', we fall back to `size_type_node',
given the association of `size_t' with the maximum size of a
theoretically possible object of any type.  This should thus be able
to accommodate any induction variable count as well as the type of
`niters' would.
- Disable niter-based profitability checking.  At runtime, this
would require knowledge of the maximum number of iterations that
will be executed so as to ascertain whether or not it will be
beneficial to performance to run the vectorized loop.  No such bound
is available for an uncounted loop.
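
As a rough mental model of the points above, the scalar code below
mimics a vectorized uncounted loop in which every exit is an
early-break exit.  It is a sketch under stated assumptions, not the
code GCC generates: `VF' and `flip_chunked' are hypothetical names,
and the buffer is assumed to have a size that is a multiple of `VF' so
that reading a whole chunk never runs past the allocation.

```c
#include <assert.h>
#include <stddef.h>

#define VF 4	/* Hypothetical vectorization factor.  */

/* Scalar model of a vectorized uncounted loop.  ASSUMPTION: STR is
   NUL-terminated within a buffer of BUF_SIZE bytes and BUF_SIZE is a
   multiple of VF, so each VF-wide chunk read stays in bounds.
   Returns the number of characters modified.  */
size_t
flip_chunked (char *str, size_t buf_size)
{
  assert (str != NULL);
  assert (buf_size % VF == 0);
  size_t i = 0;

  /* "Vector" loop.  No trip count is known, so the loop is entered
     unconditionally and is left only through the exit test.  */
  for (;;)
    {
      /* Exit test over the whole chunk: does any lane satisfy the
	 scalar exit condition?  */
      int any_exit = 0;
      for (size_t lane = 0; lane < VF; lane++)
	any_exit |= (str[i + lane] == 0);
      if (any_exit)
	break;

      /* No lane exits, so the whole chunk can be processed.  */
      for (size_t lane = 0; lane < VF; lane++)
	str[i + lane] ^= 0x20;
      i += VF;
    }

  /* Scalar epilogue: redo the chunk in which an exit was detected,
     one element at a time.  */
  while (str[i] != 0)
    {
      str[i] ^= 0x20;
      i++;
    }
  return i;
}
```

The chunk that triggers the exit is left untouched by the chunked loop
and handled entirely by the scalar loop.  The padding assumption
stands in for the guarantees that peeling for alignment would
otherwise have to provide.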
Victor Do Nascimento (13):
vect: Relax known iteration number constraint
vect: Make all exit conditions early breaks for uncounted loops
vect: Correct analysis of nested loops
vect: Extend `vec_init_loop_exit_info' to handle uncounted loops
vect: Add default types & retvals for uncounted loops
vect: guard niters manipulation with `LOOP_VINFO_NITERS_UNCOUNTED_P'
vect: Disable niters-based skipping of uncounted vectorized loops
vect: Reclassify early break fold left reductions as simple reductions
vect: Fix uncounted PHI handling of
`slpeel_tree_duplicate_loop_to_edge_cfg'
vect: Correct resetting of live out values on epilog loop entry
vect: Disable use of partial vectors for uncounted loops
vect: Reject uncounted loop vectorization where alias analysis may
fail
vect: Add uncounted loop unit tests
.../gcc.dg/vect/vect-early-break_40.c | 3 +-
gcc/testsuite/gcc.dg/vect/vect-uncounted-1.c | 18 +++
gcc/testsuite/gcc.dg/vect/vect-uncounted-2.c | 24 ++++
gcc/testsuite/gcc.dg/vect/vect-uncounted-3.c | 16 +++
.../gcc.dg/vect/vect-uncounted-run-1.c | 33 ++++++
gcc/tree-vect-data-refs.cc | 11 +-
gcc/tree-vect-loop-manip.cc | 103 +++++++++++-------
gcc/tree-vect-loop.cc | 88 ++++++++++-----
gcc/tree-vect-stmts.cc | 3 +-
gcc/tree-vectorizer.h | 8 +-
10 files changed, 235 insertions(+), 72 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/vect/vect-uncounted-1.c
create mode 100644 gcc/testsuite/gcc.dg/vect/vect-uncounted-2.c
create mode 100644 gcc/testsuite/gcc.dg/vect/vect-uncounted-3.c
create mode 100644 gcc/testsuite/gcc.dg/vect/vect-uncounted-run-1.c
--
2.43.0