https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125788

--- Comment #6 from Guoce Feng <fengguoce at hygon dot cn> ---
I think the sink-like direction makes sense here. One possible way to start
  would be to handle the all-predecessors common-tail case first:

    - scan the tail statements of all unconditional predecessors of a join
block;
    - require matching statement shape, while allowing operands to differ;
    - create PHIs in the join block for differing operands;
    - rebuild one common statement after those PHIs;
    - replace/remove result PHIs that exactly merge the old predecessor
results.

  That keeps the initial transform local and avoids split-block creation or
  partial predecessor participation.

  The initial legality checks could be conservative. For example, the first
  version could handle mechanically rebuildable GIMPLE_ASSIGNs and pure/const
  calls with one SSA register result, while rejecting statements with
VUSE/VDEF,
  side effects, possible EH, volatile operands, or anything requiring
memory-SSA
  repair. For calls, restricting to non-looping ECF_CONST/ECF_PURE direct or
  internal calls seems like a useful starting point. That would cover cases
such
  as sqrt/fabs/pow under the normal semantic restrictions without making sqrt
  itself a special case.

  It may also be useful to distinguish ordinary sinking from a pre-ifcvt or
  pre-vectorization use. For normal runtime-oriented sinking, reducing
duplicated
  static code can be unattractive if it increases PHI pressure, since only one
  predecessor path executes. Before if-conversion, however, operand PHIs may
  become selects and can expose a single operation after the merge.

  For example:

    void f(int *restrict a, int *restrict b, int *restrict c, int n)
    {
      for (int i = 0; i < n; ++i)
        {
          int r, x, y;

          if (b[i] > 0)
            {
              x = a[i] + 1;
              y = x + c[i];
              r = y * 2;
            }
          else
            {
              x = a[i] - 1;
              y = x + c[i];
              r = y * 2;
            }

          a[i] = r;
        }
    }

  A pre-ifcvt common-sinking pass could turn the common `+ c[i]` and `* 2`
  tail into merged data flow by creating operand PHIs/selects before rebuilding
  the common operations after the join.

  So the profitability model might want to:

    - charge newly-created operand PHIs;
    - reuse existing operand PHIs where possible;
    - credit result PHIs that are removed;
    - cap total new PHIs and PHIs per sunk statement;
    - add extra cost for PHIs with more than two incoming values in the ifcvt
      case, since they no longer map to one simple two-way select;
    - be more permissive in a pre-ifcvt context than in ordinary sinking.

  This would also keep load/store handling as a later phase. Starting with
  side-effect-free operations and pure/const calls avoids memory ordering and
  memory-SSA issues while still covering the motivating sqrt/div cases.

Reply via email to