https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123266

            Bug ID: 123266
           Summary: [16 Regression] Recent change exposes failure to
                    optimize in VRP/DOM and code quality regression
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: law at gcc dot gnu.org
  Target Milestone: ---

This change:

commit 89fb47c1ce4e0762e469cc76fe56e44ddab0969a (HEAD)
Author: Victor Do Nascimento <[email protected]>
Date:   Thu Nov 6 10:24:43 2025 +0000

    vect: Enable prolog peeling for uncounted loops

    The categorization of uncounted loops as
    LOOP_VINFO_EARLY_BREAKS_VECT_PEELED disables prolog peeling by
    default.  This is due to the assumption that you have early break
    exits following the IV counting main exit.  For such loops, prolog
    peeling is indeed problematic.
[ ... ]

is causing a regression on RISC-V:

unix/-march=rv64gc_zba_zbb_zbs_zicond: gcc:
gcc.target/riscv/rvv/autovec/pr113206-1.c -O3 -ftree-vectorize 
scan-assembler-times vsetvli 2
unix/-march=rv64gc_zba_zbb_zbs_zicond: gcc:
gcc.target/riscv/rvv/autovec/pr113206-2.c -O3 -ftree-vectorize 
scan-assembler-times vsetvli 1

Essentially we're failing to optimize away some code and as a result we have
more vsetvli instructions than we are expecting.

This is from the DOM3 dump, but it's just as applicable to VRP2. 

So at the end of bb2 we have:

  if (e.2_23 >= -6)
    goto <bb 3>; [89.00%]
  else
    goto <bb 9>; [11.00%]

bb9 is a path to exit.  Effectively we know on entering bb3 that e.2_23 >= -6,
and that will hold true for any block dominated by the edge 2->3.

The only really interesting statement in bb3 is:

  # e.2_24 = PHI <_15(5), e.2_23(2)>

The edge 5->3 is a loop backedge.  But the property that e.2_23 >= -6 still
holds on the 2->3 edge, and the same bound would hold for _15 on the backedge
as well if you dive into it a bit.  So e.2_24 has the property as well.
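The PHI argument above can be sketched as a small range calculation.  This is
only an illustration: the struct and function names are hypothetical and are
not GCC's range infrastructure, the element type is assumed to be int, and the
backedge range [-5, INT_MAX] is an assumed value standing in for whatever bound
on _15 the loop body actually provides.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical closed integer range [lo, hi]; not GCC's range class. */
struct int_range { int lo; int hi; };

/* Range of a PHI result: the smallest range covering both arguments. */
static struct int_range phi_union(struct int_range a, struct int_range b)
{
    struct int_range r;
    assert(a.lo <= a.hi && b.lo <= b.hi);   /* both inputs must be non-empty */
    r.lo = a.lo < b.lo ? a.lo : b.lo;
    r.hi = a.hi > b.hi ? a.hi : b.hi;
    return r;
}

/* True when every value in r satisfies "x >= bound". */
static bool range_always_ge(struct int_range r, int bound)
{
    return r.lo >= bound;
}
```

With an entry range of [-6, INT_MAX] for e.2_23 and any backedge range whose
lower bound is at least -6, phi_union yields a range for e.2_24 on which
range_always_ge(..., -6) is true.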

In bb4 (dominated by the 2->3 edge) we have:

  # prephitmp_19 = PHI <e.2_24(3)>
  _49 = [vec_duplicate_expr] e.2_24;
  mask_patt_38.15_50 = _49 >= { -6, ... };
  vexit_inv_51 = ~mask_patt_38.15_50;
  goto <bb 6>; [100.00%]


So we should have been able to optimize away everything and initialize
vexit_inv_51 to a vector constant.  That in turn would feed into this code in
bb6:

  if (vexit_inv_51 != { 0, ... })
    goto <bb 7>; [11.00%]
  else
    goto <bb 6>; [89.00%]

And we should have realized that the branch will always go to the same place.
The combination of optimizing away junk in bb4 and the branch in bb6 would
eliminate the unnecessary vsetvli and other vector code.
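As a rough scalar model of why the exit mask is a constant, the bb4 sequence
can be written out lane by lane.  This is a sketch under assumptions, not what
the vectorizer emits: LANES is an arbitrary fixed lane count chosen for
illustration (RVV vectors are length-agnostic), the element type is assumed to
be int, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LANES 4  /* illustrative lane count only */

/* Models bb4: broadcast e, compare each lane against -6, invert the mask. */
static void exit_mask(int e, bool out[LANES])
{
    for (size_t i = 0; i < LANES; i++) {
        int lane = e;           /* _49 = [vec_duplicate_expr] e.2_24 */
        bool ge = lane >= -6;   /* mask_patt_38.15_50 = _49 >= { -6, ... } */
        out[i] = !ge;           /* vexit_inv_51 = ~mask_patt_38.15_50 */
    }
}

/* Models the bb6 test "vexit_inv_51 != { 0, ... }": is any lane set? */
static bool any_lane_set(const bool m[LANES])
{
    for (size_t i = 0; i < LANES; i++)
        if (m[i])
            return true;
    return false;
}

/* For a value that passed the bb2 guard, reports that no lane is set. */
static bool exit_never_taken(int e)
{
    bool m[LANES];
    assert(e >= -6);            /* precondition from the bb2 branch */
    exit_mask(e, m);
    return !any_lane_set(m);
}
```

For every e that satisfies the bb2 guard, exit_never_taken(e) returns true,
which corresponds to vexit_inv_51 being the all-zeros vector and the bb6
condition being false.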
