This patch series extends the GCC vectorizer so that it can vectorize
uncounted loops, such as the following:

while (str[i] != 0)
  str[i++] ^= 0x20;

Though this implementation has been shown to cause no regressions,
either in the GCC testsuite or in performance, the scope of this patch
series is limited.  It lays the groundwork for the vectorization of
such loops but leaves two further features, namely peeling for
alignment and alias analysis, to be enabled in separate patches.

This submission provides only limited unit tests and is made primarily
to get feedback on the design choices made.  Further tests used in
development will be added in the run-up to the stage 1 deadline.

To that end, this patch series is split into a large number of
patches, so that each commit message can explain the rationale behind
its change and feedback on any individual choice is easier to give.

The work borrows heavily from, and builds upon, the previous
early-break vectorization work, where an early-break loop is one with
both a counting IV exit and one or more non-IV-counting exits.

By having all exits behave as early break exits, we are able to extend
the types of loops which get vectorized while keeping changes to the
code base fairly minimal.
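
As an illustration of the distinction drawn above, the following
sketch contrasts the two loop shapes.  The function names are
hypothetical and do not appear in the series; the second function
wraps the example loop from the top of this letter, advancing the
index on each iteration.

```c
#include <stddef.h>

/* Early-break loop: a counting IV exit (i < n) plus one exit whose
   condition depends on loaded data.  Hypothetical example.  */
ptrdiff_t
find_first (const int *a, size_t n, int key)
{
  for (size_t i = 0; i < n; i++)
    if (a[i] == key)
      return (ptrdiff_t) i;
  return -1;
}

/* Uncounted loop: there is no counting IV exit at all; the only exit
   depends on the data being read.  Returns the number of characters
   processed.  Hypothetical example.  */
size_t
toggle_case_bit (char *str)
{
  size_t i = 0;
  while (str[i] != 0)
    str[i++] ^= 0x20;
  return i;
}
```

In the first function the iteration count is bounded by `n'; in the
second nothing bounds it before the loop runs, which is why every exit
has to be handled the way early-break exits are.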

The changes can be broadly described as follows:

  - Relax the constraint that loops must have a known iteration count
  in order to be considered for vectorization.
  - Implement a way of retrieving whether loop_vinfo refers to an
  uncounted loop.  This is done with the LOOP_VINFO_NITERS_UNCOUNTED_P
  accessor macro for loop_vinfo.
  - Categorize uncounted loops as satisfying the criterion given in
  LOOP_VINFO_EARLY_BREAKS_VECT_PEELED.  This ensures that whatever
  exit is assigned to the "main" exit is given equal treatment to
  early-break exits.
  - Make all exit conditions early breaks for uncounted loops: Given
  the absence of any IV-counting exits, it makes no sense for any exit
  to be associated with LOOP_VINFO_LOOP_IV_COND.  Consequently, all
  exit conditions are assigned to LOOP_VINFO_LOOP_CONDS.
  - Choose the last exit in the loop as the loop's "main" exit.  This
  choice is made as it facilitates the job of implementing peeling for
  alignment, wherein it is required that the effective latch be empty.
  - Ensure that we don't segfault in functions which attempt to
  derive useful information from the niter count.  For such functions,
  we return some value such as `NULL_TREE'.  This allows the calling
  function to choose how to deal with the unknown.  Where types would be
  derived from `TREE_TYPE (niters)', we fall back to
  `size_type_node', given the association of `size_t' with the maximum
  size of a theoretically possible object of any type.  This should
  thus be able to accommodate any induction variable count as well as
  the type of `niters' would.
  - Disable niter-based profitability checking.  At runtime, this
  would require knowledge of the maximum number of iterations that
  will be executed so as to ascertain whether or not it will be
  beneficial to performance to run the vectorized loop.  For an
  uncounted loop, that number is not known ahead of time.
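
To make the "all exits are early breaks" idea more concrete, here is a
purely conceptual scalar emulation of how such a loop can be processed
in blocks.  It is not the code GCC generates.  The name
`toggle_case_bit_chunked' and the constant `VF' are hypothetical, and
the sketch assumes that the buffer is readable up to the next multiple
of `VF' elements past the terminator.

```c
#include <stddef.h>

#define VF 4  /* Hypothetical vectorization factor.  */

/* Conceptual emulation only.  Assumes STR is readable (and zero- or
   otherwise validly-filled) up to the next multiple of VF elements
   past its terminator.  Returns the number of characters processed.  */
size_t
toggle_case_bit_chunked (char *str)
{
  size_t i = 0;

  /* "Vector" loop: there is no iteration count to test against, so
     the only way out is a data-dependent early break.  */
  for (;;)
    {
      int hit = 0;
      for (size_t lane = 0; lane < VF; lane++)
        hit |= (str[i + lane] == 0);
      if (hit)
        break;  /* Some lane would exit; leave the block untouched.  */
      for (size_t lane = 0; lane < VF; lane++)
        str[i + lane] ^= 0x20;
      i += VF;
    }

  /* Scalar epilogue: redo the partial block one element at a time.  */
  while (str[i] != 0)
    str[i++] ^= 0x20;
  return i;
}
```

Note that there is no point before the loop at which a check such as
"are there at least N iterations?" could be made, which is the
situation the last item above describes.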

Victor Do Nascimento (13):
  vect: Relax known iteration number constraint
  vect: Make all exit conditions early breaks for uncounted loops
  vect: Correct analysis of nested loops
  vect: Extend `vec_init_loop_exit_info' to handle uncounted loops
  vect: Add default types & retvals for uncounted loops
  vect: guard niters manipulation with `LOOP_VINFO_NITERS_UNCOUNTED_P'
  vect: Disable niters-based skipping of uncounted vectorized loops
  vect: Reclassify early break fold left reductions as simple reductions
  vect: Fix uncounted PHI handling of
    `slpeel_tree_duplicate_loop_to_edge_cfg'
  vect: Correct resetting of live out values on epilog loop entry
  vect: Disable use of partial vectors for uncounted loops
  vect: Reject uncounted loop vectorization where alias analysis may
    fail
  vect: Add uncounted loop unit tests

 .../gcc.dg/vect/vect-early-break_40.c         |   3 +-
 gcc/testsuite/gcc.dg/vect/vect-uncounted-1.c  |  18 +++
 gcc/testsuite/gcc.dg/vect/vect-uncounted-2.c  |  24 ++++
 gcc/testsuite/gcc.dg/vect/vect-uncounted-3.c  |  16 +++
 .../gcc.dg/vect/vect-uncounted-run-1.c        |  33 ++++++
 gcc/tree-vect-data-refs.cc                    |  11 +-
 gcc/tree-vect-loop-manip.cc                   | 103 +++++++++++-------
 gcc/tree-vect-loop.cc                         |  88 ++++++++++-----
 gcc/tree-vect-stmts.cc                        |   3 +-
 gcc/tree-vectorizer.h                         |   8 +-
 10 files changed, 235 insertions(+), 72 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-uncounted-1.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-uncounted-2.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-uncounted-3.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-uncounted-run-1.c

-- 
2.43.0
