https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67396

Mikhail Maltsev <miyuki at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|x86_64-*-*                  |
                 CC|                            |miyuki at gcc dot gnu.org
            Version|5.2.0                       |unknown
   Target Milestone|4.9.4                       |---
            Summary|[4.9/5/6 regression]        |[4.9/5.0 regression]
                   |Performance regression      |Performance regression
                   |compiling variadic function |compiling variadic function
                   |with many arguments in RTL  |with many arguments
                   |DSE                         |

--- Comment #7 from Mikhail Maltsev <miyuki at gcc dot gnu.org> ---
(In reply to Paul Pluzhnikov from comment #5)

> for j in 500 1000 2000 3000; do perl ./gen_bz18872.pl $j > t.c ; (time
> gcc-svn-r227326-rel/bin/gcc -c -O2 t.c) |& grep user; done
> 
> user  0m1.488s
> user  0m12.010s
> user  1m40.353s
> user  5m47.668s

Rough fitting suggests that it is O(n^3).

I did some profiling and it looks like we get rather deep recursion in alias
analysis. Most time is spent in the following call chain: dse_step1 ->
scan_insn -> record_store -> canon_true_dependence -> true_dependence1. It has
2 callees, which start deep recursion: memrefs_conflict_p and find_base_term
(the latter is invoked twice, the profiler suggests that only the second call,
i.e.

rtx mem_base = find_base_term (true_mem_addr);

is a hot spot). Then we start to recur via these 2 (interleaving) calls:

      f = val->locs;
      /* Temporarily reset val->locs to avoid infinite recursion.  */
      val->locs = NULL;

      for (l = f; l; l = l->next)
        if (GET_CODE (l->loc) == VALUE
            && CSELIB_VAL_PTR (l->loc)->locs
            && !CSELIB_VAL_PTR (l->loc)->locs->next
            && CSELIB_VAL_PTR (l->loc)->locs->loc == x)
          continue;
        else if ((ret = find_base_term (l->loc)) != 0)

1865            else if ((ret = find_base_term (l->loc)) != 0)
(gdb) p l->loc
$6 = (rtx_def *) (plus:DI (value:DI 2:4321 @0x1abc518/0x1ab8a20)
    (const_int -8 [0xfffffffffffffff8]))


and:

        /* Go ahead and find the base term for both operands.  If either base
           term is from a pointer or is a named object or a special address
           (like an argument or stack reference), then use it for the
           base term.  */
        rtx base = find_base_term (tmp1);
        if (base != NULL_RTX
            && ((REG_P (tmp1) && REG_POINTER (tmp1))
                 || known_base_value_p (base)))
          return base;

#1  0x000000000062b92a in find_base_term (x=<optimized out>) at
/home/miyuki/gcc/src/gcc/alias.c:1917
1917            rtx base = find_base_term (tmp1);
(gdb) p tmp1
$7 = (rtx_def *) (value:DI 2:4321 @0x1abc518/0x1ab8a20)

It's harder to say what happens in memrefs_conflict_p because it is
tail-recursive. This function has ~20 calls of itself. Perhaps basic block
profiling will tell more.

P.S. The profiler shows an estimate: alias.c:get_addr is invoked ~300000000
times (that is for 1000 printf arguments).

Reply via email to