Hi Martin and Honza,

On 2019/11/18 21:02, Martin Liška wrote:
> On 11/16/19 10:59 AM, luoxhu wrote:
>> Sorry that I don't quite understand your meaning here.  I didn't grep the
>> word "cgraph_edge_summary" in source code, do you mean add new structure
> 
> Hello.
> 
> He wanted to write call_summary class and so you need something similar to
> ipa-sra.c:431. It's a data structure which associate a data to cgraph_edge.
> Is it understandable please?
> 
> Martin
> 

I updated the patch below, generated with "git format-patch -U15" for review
convenience.  The GC issue is fixed by using the full GC framework in
ipa-profile.c.  Thanks.

v6 Changes:
 1. Define and use speculative_call_targets summary, move
 speculative_call_target from cgraph.h to ipa-profile.c.
 2. Use num_speculative_call_targets in cgraph_indirect_call_info.
 3. Refine with review comments.

This patch aims to fix PR69678, a performance issue with PGO indirect call
profiling.
The bug that profiling data never worked was fixed by Martin's pull
back of topN patches, which improved the performance GEOMEAN by ~1% (+24% for
511.povray_r specifically).
Still, the default profile currently generates only a SINGLE indirect target,
one that is called more than 75% of the time.  This patch enables the use of
MULTIPLE indirect targets in the LTO-WPA and LTO-LTRANS stages; as a result,
function specialization, profiling, partial devirtualization, inlining and
cloning can be done based on them.
Performance improves from 0.70 sec to 0.38 sec on simple tests.
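
For readers unfamiliar with the transformation, here is a minimal hand-written
sketch of what speculating on two targets amounts to at the source level.  It
is not code from the patch: target1, target2, cold_target and
call_with_two_speculations are hypothetical names used only for illustration.

```cpp
#include <cassert>

// Hypothetical callees standing in for profiled indirect-call targets.
static int target1 (int x) { return x + 1; }
static int target2 (int x) { return x * 2; }
static int cold_target (int x) { return -x; }

using fn_ptr = int (*) (int);

// Hand-written equivalent of speculating on two hot targets: compare the
// pointer against each one, call it directly on a match (which makes the
// call a candidate for inlining), and otherwise fall back to the original
// indirect call.
static int
call_with_two_speculations (fn_ptr call_dest, int arg)
{
  assert (call_dest != nullptr);
  if (call_dest == target1)
    return target1 (arg);
  else if (call_dest == target2)
    return target2 (arg);
  else
    return call_dest (arg);
}
```

Calling call_with_two_speculations (cold_target, 3) takes the fallback branch,
so behaviour is unchanged for targets the profile did not predict.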
Details are:
  1.  PGO with topn is enabled by default now, but only one indirect
  target edge is generated in the ipa-profile pass, so add variables to enable
  multiple speculative edges through the passes.  speculative_id records the
  index of the direct edge bound to the indirect edge, and the length of
  indirect_call_targets records how many direct edges the indirect edge owns.
  Postpone gimple_ic to ipa-profile, as in the default case, since the inline
  pass decides whether it is beneficial to transform the indirect call.
  2.  Use speculative_id to track and look up the reference node that matches
  the direct edge's callee when there are multiple targets.  It is the
  caller's responsibility to handle the direct edges mapped to the same
  indirect edge.  speculative_call_info returns one of the direct edges,
  which mostly reuses the current IPA edge processing framework.
  3.  Enable multiple indirect call targets analysis in the LTO WPA/LTRANS
  stages for full profile support in ipa passes and cgraph_edge functions.
  speculative_id can be set by make_speculative when multiple targets are
  bound to one indirect edge, and it is cloned when a new edge is cloned.
  speculative_id is streamed out and streamed in by LTO like lto_stmt_uid.
  4.  Add 1 in-module testcase and 2 cross-module testcases.
  5.  Bootstrap and regression tests passed on Power8-LE.  No functional
  or performance regressions for SPEC2017.
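
The bookkeeping described in items 1 and 2 can be modelled with a small
stand-alone sketch.  This is only a toy under stated assumptions: toy_ref,
toy_direct_edge, toy_indirect_edge, find_ref and resolve_one are hypothetical
names, not the real cgraph_edge / ipa_ref API, and the target counter is
bumped in make_speculative here purely for brevity (in the patch it is
generated from the summaries in ipa_profile).

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for ipa_ref and a direct cgraph_edge; not the GCC types.
struct toy_ref { int callee; unsigned speculative_id; };
struct toy_direct_edge { int callee; unsigned speculative_id; };

// Toy stand-in for a speculative indirect edge and what hangs off it.
struct toy_indirect_edge
{
  bool speculative = false;
  unsigned num_speculative_call_targets = 0;
  std::vector<toy_direct_edge> directs;
  std::vector<toy_ref> refs;

  // Add one speculative target; the direct edge and its reference share
  // the same speculative_id so they can be matched up later.
  void make_speculative (int callee, unsigned id)
  {
    speculative = true;
    directs.push_back ({callee, id});
    refs.push_back ({callee, id});
    num_speculative_call_targets++;
  }

  // Find the reference paired with direct edge E via speculative_id.
  const toy_ref *find_ref (const toy_direct_edge &e) const
  {
    for (const toy_ref &r : refs)
      if (r.speculative_id == e.speculative_id)
        return &r;
    return nullptr;
  }

  // Resolve one direct edge: the indirect edge stays speculative until
  // the last remaining target has been resolved.
  void resolve_one ()
  {
    assert (num_speculative_call_targets > 0);
    if (--num_speculative_call_targets == 0)
      speculative = false;
  }
};
```

With two targets registered, the first resolve_one () leaves the indirect edge
speculative and the second clears the flag, which mirrors the "iteratively
resolve each direct call" responsibility placed on the caller.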

gcc/ChangeLog

        2019-12-02  Xiong Hu Luo  <luo...@linux.ibm.com>

        PR ipa/69678
        * Makefile.in (GTFILES): Add ipa-profile.c.
        * cgraph.c (symbol_table::create_edge): Init speculative_id.
        (cgraph_edge::make_speculative): Add param for setting speculative_id.
        (cgraph_edge::speculative_call_info): Update comments and find reference
        by speculative_id for multiple indirect targets.
        (cgraph_edge::resolve_speculation): Decrease the speculations
        for indirect edge, drop its speculative flag if no direct target
        is left.  Update comments.
        (cgraph_edge::redirect_call_stmt_to_callee): Likewise.
        (cgraph_node::dump): Print num_speculative_call_targets.
        (cgraph_node::verify_node): Don't report error if speculative
        edge does not include a statement.
        (cgraph_edge::num_speculative_call_targets_p): New function.
        * cgraph.h (int common_target_id): Remove.
        (int common_target_probability): Remove.
        (num_speculative_call_targets): New variable.
        (make_speculative): Add param for setting speculative_id.
        (cgraph_edge::num_speculative_call_targets_p): New declaration.
        (speculative_id): New variable.
        * cgraphclones.c (cgraph_node::create_clone): Clone speculative_id.
        * ipa-profile.c (struct speculative_call_target): New struct.
        (class speculative_call_summary): New class.
        (class ipa_profile_call_summaries): New class.
        (call_sums): New variable.
        (ipa_profile_generate_summary): Generate indirect multiple targets
        summaries.
        (ipa_profile_write_edge_summary): New function.
        (ipa_profile_write_summary): Stream out indirect multiple targets
        summaries.
        (ipa_profile_dump_all_summaries): New function.
        (ipa_profile_read_edge_summary): New function.
        (ipa_profile_read_summary_section): New function.
        (ipa_profile_read_summary): Stream in indirect multiple targets
        summaries.
        (ipa_profile): Generate num_speculative_call_targets from
        summaries.
        * ipa-ref.h (speculative_id): New variable.
        * lto-cgraph.c (lto_output_edge): Remove indirect common_target_id and
        common_target_probability.   Stream out speculative_id and
        num_speculative_call_targets.
        (input_edge): Likewise.
        * predict.c (dump_prediction): Remove edges count assert to be
        precise.
        * symtab.c (symtab_node::create_reference): Init speculative_id.
        (symtab_node::clone_references): Clone speculative_id.
        (symtab_node::clone_referring): Clone speculative_id.
        (symtab_node::clone_reference): Clone speculative_id.
        (symtab_node::clear_stmts_in_references): Clear speculative_id.
        * tree-inline.c (copy_bb): Duplicate all the speculative edges
        if indirect call contains multiple speculative targets.
        * tree-profile.c (gimple_gen_ic_profiler): Use the new variable
        __gcov_indirect_call.counters and __gcov_indirect_call.callee.
        (gimple_gen_ic_func_profiler): Likewise.
        (pass_ipa_tree_profile::gate): Fix comment typos.
        * value-prof.h  (check_ic_target): Remove.
        * value-prof.c  (gimple_value_profile_transformations):
        Use void function gimple_ic_transform.
        * value-prof.c  (gimple_ic_transform): Handle topn case.
        Fix comment typos.  Change it to a void function.

gcc/testsuite/ChangeLog

        2019-12-02  Xiong Hu Luo  <luo...@linux.ibm.com>

        PR ipa/69678
        * gcc.dg/tree-prof/indir-call-prof-topn.c: New testcase.
        * gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c: New testcase.
        * gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c: New testcase.
        * gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c: New testcase.
        * lib/scandump.exp: Dump executable file name.
        * lib/scanwpaipa.exp: New scan-pgo-wpa-ipa-dump.
---
 gcc/Makefile.in                               |   1 +
 gcc/cgraph.c                                  | 106 ++++-
 gcc/cgraph.h                                  |  16 +-
 gcc/cgraphclones.c                            |   1 +
 gcc/ipa-profile.c                             | 384 ++++++++++++++++--
 gcc/ipa-ref.h                                 |   1 +
 gcc/lto-cgraph.c                              |  28 +-
 gcc/predict.c                                 |   1 -
 gcc/symtab.c                                  |   5 +
 .../tree-prof/crossmodule-indir-call-topn-1.c |  33 ++
 .../crossmodule-indir-call-topn-1a.c          |  22 +
 .../tree-prof/crossmodule-indir-call-topn-2.c |  40 ++
 .../gcc.dg/tree-prof/indir-call-prof-topn.c   |  37 ++
 gcc/testsuite/lib/scandump.exp                |   1 +
 gcc/testsuite/lib/scanwpaipa.exp              |  23 ++
 gcc/tree-inline.c                             |  22 +
 gcc/value-prof.c                              |  87 ++--
 gcc/value-prof.h                              |   1 -
 18 files changed, 705 insertions(+), 104 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-topn.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 7d3c13230e4..366aceb8330 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2578,30 +2578,31 @@ GTFILES = $(CPPLIB_H) $(srcdir)/input.h $(srcdir)/coretypes.h \
   $(srcdir)/omp-offload.c \
   $(srcdir)/omp-expand.c \
   $(srcdir)/omp-low.c \
   $(srcdir)/targhooks.c $(out_file) $(srcdir)/passes.c $(srcdir)/cgraphunit.c \
   $(srcdir)/cgraphclones.c \
   $(srcdir)/tree-phinodes.c \
   $(srcdir)/tree-ssa-alias.h \
   $(srcdir)/tree-ssanames.h \
   $(srcdir)/tree-vrp.h \
   $(srcdir)/ipa-prop.h \
   $(srcdir)/trans-mem.c \
   $(srcdir)/lto-streamer.h \
   $(srcdir)/target-globals.h \
   $(srcdir)/ipa-predicate.h \
   $(srcdir)/ipa-fnsummary.h \
+  $(srcdir)/ipa-profile.c \
   $(srcdir)/value-range.h \
   $(srcdir)/vtable-verify.c \
   $(srcdir)/asan.c \
   $(srcdir)/ubsan.c \
   $(srcdir)/tsan.c \
   $(srcdir)/sanopt.c \
   $(srcdir)/sancov.c \
   $(srcdir)/ipa-devirt.c \
   $(srcdir)/internal-fn.h \
   $(srcdir)/hsa-common.c \
   $(srcdir)/calls.c \
   $(srcdir)/omp-general.h \
   @all_gtfiles@
 
 # Compute the list of GT header files from the corresponding C sources,
diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index b75430f3f3a..8aa4cc91939 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -852,30 +852,31 @@ symbol_table::create_edge (cgraph_node *caller, cgraph_node *callee,
 
   edge = ggc_alloc<cgraph_edge> ();
   edge->m_summary_id = -1;
   edges_count++;
 
   gcc_assert (++edges_max_uid != 0);
   edge->m_uid = edges_max_uid;
   edge->aux = NULL;
   edge->caller = caller;
   edge->callee = callee;
   edge->prev_caller = NULL;
   edge->next_caller = NULL;
   edge->prev_callee = NULL;
   edge->next_callee = NULL;
   edge->lto_stmt_uid = 0;
+  edge->speculative_id = 0;
 
   edge->count = count;
   edge->call_stmt = call_stmt;
   edge->indirect_info = NULL;
   edge->indirect_inlining_edge = 0;
   edge->speculative = false;
   edge->indirect_unknown_callee = indir_unknown_callee;
   if (call_stmt && caller->call_site_hash)
     cgraph_add_edge_to_call_site_hash (edge);
 
   if (cloning_p)
     return edge;
 
   edge->can_throw_external
     = call_stmt ? stmt_can_throw_external (DECL_STRUCT_FUNCTION (caller->decl),
@@ -1041,66 +1042,95 @@ cgraph_edge::remove (void)
    At clone materialization time, the indirect call E will
    be expanded as:
 
    if (call_dest == N2)
      n2 ();
    else
      call call_dest
 
    At this time the function just creates the direct call,
    the reference representing the if conditional and attaches
    them all to the original indirect call statement.  
 
    Return direct edge created.  */
 
 cgraph_edge *
-cgraph_edge::make_speculative (cgraph_node *n2, profile_count direct_count)
+cgraph_edge::make_speculative (cgraph_node *n2, profile_count direct_count,
+                              unsigned int speculative_id)
 {
   cgraph_node *n = caller;
   ipa_ref *ref = NULL;
   cgraph_edge *e2;
 
   if (dump_file)
     fprintf (dump_file, "Indirect call -> speculative call %s => %s\n",
             n->dump_name (), n2->dump_name ());
   speculative = true;
   e2 = n->create_edge (n2, call_stmt, direct_count);
   initialize_inline_failed (e2);
   e2->speculative = true;
   if (TREE_NOTHROW (n2->decl))
     e2->can_throw_external = false;
   else
     e2->can_throw_external = can_throw_external;
   e2->lto_stmt_uid = lto_stmt_uid;
+  e2->speculative_id = speculative_id;
   e2->in_polymorphic_cdtor = in_polymorphic_cdtor;
   count -= e2->count;
   symtab->call_edge_duplication_hooks (this, e2);
   ref = n->create_reference (n2, IPA_REF_ADDR, call_stmt);
   ref->lto_stmt_uid = lto_stmt_uid;
+  ref->speculative_id = speculative_id;
   ref->speculative = speculative;
   n2->mark_address_taken ();
   return e2;
 }
 
-/* Speculative call consist of three components:
-   1) an indirect edge representing the original call
-   2) an direct edge representing the new call
-   3) ADDR_EXPR reference representing the speculative check.
-   All three components are attached to single statement (the indirect
-   call) and if one of them exists, all of them must exist.
+/* Speculative calls represent a transformation of indirect calls
+   which may be later inserted into gimple in the following form:
 
-   Given speculative call edge, return all three components.
+   if (call_dest == target1)
+   target1 ();
+   else if (call_dest == target2)
+   target2 ();
+   else
+   call_dest ();
+
+   This is a win in case when target1 and target2 are common values for
+   call_dest as determined by ipa-devirt or indirect call profiling.
+   In particular this may enable inlining and other optimizations.
+
+   Speculative call consists of the following main components:
+
+   1) One or more "speculative" direct call (num_speculative_call_targets is
+   speculative direct call count belongs to the speculative indirect call)
+   2) One or more IPA_REF_ADDR references (representing the fact that code above
+   takes address of target1 and target2)
+   3) The fallback "speculative" indirect call
+
+   Direct calls and corresponding references are linked by
+   speculative_id.
+
+   speculative_call_info returns triple
+   (direct_call, indirect call, IPA_REF_ADDR reference)
+   when called on one edge participating in the speculative call:
+
+   1) If called on direct call, its corresponding IPA_REF_ADDR and related
+   indirect call are returned.
+
+   2) If called on indirect call, it will return one of direct edges and its
+   matching IPA_REF_ADDR.
  */
 
 void
 cgraph_edge::speculative_call_info (cgraph_edge *&direct,
                                    cgraph_edge *&indirect,
                                    ipa_ref *&reference)
 {
   ipa_ref *ref;
   int i;
   cgraph_edge *e2;
   cgraph_edge *e = this;
 
   if (!e->indirect_unknown_callee)
     for (e2 = e->caller->indirect_calls;
         e2->call_stmt != e->call_stmt || e2->lto_stmt_uid != e->lto_stmt_uid;
@@ -1116,47 +1146,56 @@ cgraph_edge::speculative_call_info (cgraph_edge *&direct,
          gcc_assert (e->speculative && !e->indirect_unknown_callee);
        }
       else
        for (e = e->caller->callees; 
             e2->call_stmt != e->call_stmt
             || e2->lto_stmt_uid != e->lto_stmt_uid;
             e = e->next_callee)
          ;
     }
   gcc_assert (e->speculative && e2->speculative);
   direct = e;
   indirect = e2;
 
   reference = NULL;
   for (i = 0; e->caller->iterate_reference (i, ref); i++)
-    if (ref->speculative
+    if (ref->speculative && ref->speculative_id == e->speculative_id
        && ((ref->stmt && ref->stmt == e->call_stmt)
            || (!ref->stmt && ref->lto_stmt_uid == e->lto_stmt_uid)))
       {
        reference = ref;
        break;
       }
 
   /* Speculative edge always consist of all three components - direct edge,
      indirect and reference.  */
   
   gcc_assert (e && e2 && ref);
 }
 
 /* Speculative call edge turned out to be direct call to CALLEE_DECL.
    Remove the speculative call sequence and return edge representing the call.
-   It is up to caller to redirect the call as appropriate. */
+
+   For "speculative" indirect call that contains multiple "speculative"
+   targets (i.e. edge->indirect_info->num_speculative_call_targets > 1),
+   decrease the count and only remove current direct edge.
+
+   If no speculative direct call left to the speculative indirect call, remove
+   the speculative of both the indirect call and corresponding direct edge.
+
+   It is up to caller to iteratively resolve each "speculative" direct call and
+   redirect the call as appropriate.  */
 
 cgraph_edge *
 cgraph_edge::resolve_speculation (tree callee_decl)
 {
   cgraph_edge *edge = this;
   cgraph_edge *e2;
   ipa_ref *ref;
 
   gcc_assert (edge->speculative);
   edge->speculative_call_info (e2, edge, ref);
   if (!callee_decl
       || !ref->referred->semantically_equivalent_p
           (symtab_node::get (callee_decl)))
     {
       if (dump_file)
@@ -1177,31 +1216,40 @@ cgraph_edge::resolve_speculation (tree callee_decl)
                       e2->callee->dump_name ());
            }
        }
     }
   else
     {
       cgraph_edge *tmp = edge;
       if (dump_file)
         fprintf (dump_file, "Speculative call turned into direct call.\n");
       edge = e2;
       e2 = tmp;
       /* FIXME:  If EDGE is inlined, we should scale up the frequencies and counts
          in the functions inlined through it.  */
     }
   edge->count += e2->count;
-  edge->speculative = false;
+  if (edge->num_speculative_call_targets_p ())
+    {
+      /* The indirect edge has multiple speculative targets, don't remove
+        speculative until all related direct edges are resolved.  */
+      edge->indirect_info->num_speculative_call_targets--;
+      if (!edge->indirect_info->num_speculative_call_targets)
+       edge->speculative = false;
+    }
+  else
+    edge->speculative = false;
   e2->speculative = false;
   ref->remove_reference ();
   if (e2->indirect_unknown_callee || e2->inline_failed)
     e2->remove ();
   else
     e2->callee->remove_symbol_and_inline_clones ();
   if (edge->caller->call_site_hash)
     cgraph_update_edge_in_call_site_hash (edge);
   return edge;
 }
 
 /* Make an indirect edge with an unknown callee an ordinary edge leading to
    CALLEE.  DELTA is an integer constant that is to be added to the this
    pointer (first parameter) to compensate for skipping a thunk adjustment.  */
 
@@ -1237,31 +1285,41 @@ cgraph_edge::make_direct (cgraph_node *callee)
   prev_callee = NULL;
   next_callee = caller->callees;
   if (caller->callees)
     caller->callees->prev_callee = edge;
   caller->callees = edge;
 
   /* Insert to callers list of the new callee.  */
   edge->set_callee (callee);
 
   /* We need to re-determine the inlining status of the edge.  */
   initialize_inline_failed (edge);
   return edge;
 }
 
 /* If necessary, change the function declaration in the call statement
-   associated with E so that it corresponds to the edge callee.  */
+   associated with E so that it corresponds to the edge callee.
+
+   The edge could be one of speculative direct call generated from speculative
+   indirect call.  In this circumstance, decrease the speculative targets
+   count (i.e. num_speculative_call_targets) and redirect call stmt to the
+   corresponding i-th target.  If no speculative direct call left to the
+   speculative indirect call, remove "speculative" of the indirect call and
+   also redirect stmt to its final direct target.
+
+   It is up to caller to iteratively transform each "speculative"
+   direct call as appropriate.  */
 
 gimple *
 cgraph_edge::redirect_call_stmt_to_callee (void)
 {
   cgraph_edge *e = this;
 
   tree decl = gimple_call_fndecl (e->call_stmt);
   gcall *new_stmt;
   gimple_stmt_iterator gsi;
 
   if (e->speculative)
     {
       cgraph_edge *e2;
       gcall *new_stmt;
       ipa_ref *ref;
@@ -1285,31 +1343,41 @@ cgraph_edge::redirect_call_stmt_to_callee (void)
            }
          gcc_assert (e2->speculative);
          push_cfun (DECL_STRUCT_FUNCTION (e->caller->decl));
 
          profile_probability prob = e->count.probability_in (e->count
                                                              + e2->count);
          if (!prob.initialized_p ())
            prob = profile_probability::even ();
          new_stmt = gimple_ic (e->call_stmt,
                                dyn_cast<cgraph_node *> (ref->referred),
                                prob);
          e->speculative = false;
          e->caller->set_call_stmt_including_clones (e->call_stmt, new_stmt,
                                                     false);
          e->count = gimple_bb (e->call_stmt)->count;
-         e2->speculative = false;
+         if (e2->num_speculative_call_targets_p ())
+           {
+             /* The indirect edge has multiple speculative targets, don't
+                remove speculative until all related direct edges are
+                redirected.  */
+             e2->indirect_info->num_speculative_call_targets--;
+             if (!e2->indirect_info->num_speculative_call_targets)
+               e2->speculative = false;
+           }
+         else
+           e2->speculative = false;
          e2->count = gimple_bb (e2->call_stmt)->count;
          ref->speculative = false;
          ref->stmt = NULL;
          /* Indirect edges are not both in the call site hash.
             get it updated.  */
          if (e->caller->call_site_hash)
            cgraph_update_edge_in_call_site_hash (e2);
          pop_cfun ();
          /* Continue redirecting E to proper target.  */
        }
     }
 
 
   if (e->indirect_unknown_callee
       || decl == e->callee->decl)
@@ -2097,30 +2165,32 @@ cgraph_node::dump (FILE *f)
        }
       else
         fprintf (f, "   Indirect call");
       edge->dump_edge_flags (f);
       if (edge->indirect_info->param_index != -1)
        {
          fprintf (f, " of param:%i", edge->indirect_info->param_index);
          if (edge->indirect_info->agg_contents)
           fprintf (f, " loaded from %s %s at offset %i",
                    edge->indirect_info->member_ptr ? "member ptr" : "aggregate",
                    edge->indirect_info->by_ref ? "passed by reference":"",
                    (int)edge->indirect_info->offset);
          if (edge->indirect_info->vptr_changed)
            fprintf (f, " (vptr maybe changed)");
        }
+      fprintf (f, " Num speculative call targets: %i",
+              edge->indirect_info->num_speculative_call_targets);
       fprintf (f, "\n");
       if (edge->indirect_info->polymorphic)
        edge->indirect_info->context.dump (f);
     }
 }
 
 /* Dump call graph node to file F in graphviz format.  */
 
 void
 cgraph_node::dump_graphviz (FILE *f)
 {
   cgraph_edge *edge;
 
   for (edge = callees; edge; edge = edge->next_callee)
     {
@@ -3374,31 +3444,31 @@ cgraph_node::verify_node (void)
              }
            for (i = 0; iterate_reference (i, ref); i++)
              if (ref->stmt && !stmts.contains (ref->stmt))
                {
                  error ("reference to dead statement");
                  cgraph_debug_gimple_stmt (this_cfun, ref->stmt);
                  error_found = true;
                }
        }
       else
        /* No CFG available?!  */
        gcc_unreachable ();
 
       for (e = callees; e; e = e->next_callee)
        {
-         if (!e->aux)
+         if (!e->aux && !e->speculative)
            {
              error ("edge %s->%s has no corresponding call_stmt",
                     identifier_to_locale (e->caller->name ()),
                     identifier_to_locale (e->callee->name ()));
              cgraph_debug_gimple_stmt (this_cfun, e->call_stmt);
              error_found = true;
            }
          e->aux = 0;
        }
       for (e = indirect_calls; e; e = e->next_callee)
        {
          if (!e->aux && !e->speculative)
            {
              error ("an indirect edge from %s has no corresponding call_stmt",
                     identifier_to_locale (e->caller->name ()));
@@ -3711,30 +3781,38 @@ cgraph_edge::possibly_call_in_translation_unit_p (void)
   if (!TREE_PUBLIC (callee->decl) && !DECL_EXTERNAL (callee->decl))
     return true;
 
   /* Otherwise we need to lookup prevailing symbol (symbol table is not merged,
      yet) and see if it is a definition.  In fact we may also resolve aliases,
      but that is probably not too important.  */
   symtab_node *node = callee;
   for (int n = 10; node->previous_sharing_asm_name && n ; n--)
     node = node->previous_sharing_asm_name;
   if (node->previous_sharing_asm_name)
     node = symtab_node::get_for_asmname (DECL_ASSEMBLER_NAME (callee->decl));
   gcc_assert (TREE_PUBLIC (node->decl));
   return node->get_availability () >= AVAIL_INTERPOSABLE;
 }
 
+/* Return num_speculative_call_targets of this edge.  */
+
+int
+cgraph_edge::num_speculative_call_targets_p (void)
+{
+  return indirect_info ? indirect_info->num_speculative_call_targets : 0;
+}
+
 /* A stashed copy of "symtab" for use by selftest::symbol_table_test.
    This needs to be a global so that it can be a GC root, and thus
    prevent the stashed copy from being garbage-collected if the GC runs
    during a symbol_table_test.  */
 
 symbol_table *saved_symtab;
 
 #if CHECKING_P
 
 namespace selftest {
 
 /* class selftest::symbol_table_test.  */
 
 /* Constructor.  Store the old value of symtab, and create a new one.  */
 
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 9c086fedaef..4cb047776e9 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -1652,34 +1652,33 @@ class GTY(()) cgraph_indirect_call_info
 {
 public:
   /* When agg_content is set, an offset where the call pointer is located
      within the aggregate.  */
   HOST_WIDE_INT offset;
   /* Context of the polymorphic call; use only when POLYMORPHIC flag is set.  */
   ipa_polymorphic_call_context context;
   /* OBJ_TYPE_REF_TOKEN of a polymorphic call (if polymorphic is set).  */
   HOST_WIDE_INT otr_token;
   /* Type of the object from OBJ_TYPE_REF_OBJECT. */
   tree otr_type;
   /* Index of the parameter that is called.  */
   int param_index;
   /* ECF flags determined from the caller.  */
   int ecf_flags;
-  /* Profile_id of common target obtained from profile.  */
-  int common_target_id;
-  /* Probability that call will land in function with COMMON_TARGET_ID.  */
-  int common_target_probability;
+
+  /* Number of speculative call targets, it's less than GCOV_TOPN_VALUES.  */
+  unsigned num_speculative_call_targets : 16;
 
   /* Set when the call is a virtual call with the parameter being the
      associated object pointer rather than a simple direct call.  */
   unsigned polymorphic : 1;
   /* Set when the call is a call of a pointer loaded from contents of an
      aggregate at offset.  */
   unsigned agg_contents : 1;
   /* Set when this is a call through a member pointer.  */
   unsigned member_ptr : 1;
   /* When the agg_contents bit is set, this one determines whether the
      destination is loaded from a parameter passed by reference. */
   unsigned by_ref : 1;
   /* When the agg_contents bit is set, this one determines whether we can
      deduce from the function body that the loaded value from the reference is
      never modified between the invocation of the function and the load
@@ -1712,31 +1711,32 @@ public:
   /* If the edge does not lead to a thunk, simply redirect it to N.  Otherwise
      create one or more equivalent thunks for N and redirect E to the first in
      the chain.  Note that it is then necessary to call
      n->expand_all_artificial_thunks once all callers are redirected.  */
   void redirect_callee_duplicating_thunks (cgraph_node *n);
 
   /* Make an indirect edge with an unknown callee an ordinary edge leading to
      CALLEE.  DELTA is an integer constant that is to be added to the this
      pointer (first parameter) to compensate for skipping
      a thunk adjustment.  */
   cgraph_edge *make_direct (cgraph_node *callee);
 
   /* Turn edge into speculative call calling N2. Update
      the profile so the direct call is taken COUNT times
      with FREQUENCY.  */
-  cgraph_edge *make_speculative (cgraph_node *n2, profile_count direct_count);
+  cgraph_edge *make_speculative (cgraph_node *n2, profile_count direct_count,
+                                unsigned int speculative_id = 0);
 
    /* Given speculative call edge, return all three components.  */
   void speculative_call_info (cgraph_edge *&direct, cgraph_edge *&indirect,
                              ipa_ref *&reference);
 
   /* Speculative call edge turned out to be direct call to CALLEE_DECL.
      Remove the speculative call sequence and return edge representing the call.
      It is up to caller to redirect the call as appropriate. */
   cgraph_edge *resolve_speculation (tree callee_decl = NULL);
 
   /* If necessary, change the function declaration in the call statement
      associated with the edge so that it corresponds to the edge callee.  */
   gimple *redirect_call_stmt_to_callee (void);
 
   /* Create clone of edge in the node N represented
@@ -1771,49 +1771,55 @@ public:
     return m_summary_id;
   }
 
   /* Rebuild cgraph edges for current function node.  This needs to be run after
      passes that don't update the cgraph.  */
   static unsigned int rebuild_edges (void);
 
   /* Rebuild cgraph references for current function node.  This needs to be run
      after passes that don't update the cgraph.  */
   static void rebuild_references (void);
 
   /* During LTO stream in this can be used to check whether call can possibly
      be internal to the current translation unit.  */
   bool possibly_call_in_translation_unit_p (void);
 
+  /* Return num_speculative_call_targets of this edge.  */
+  int num_speculative_call_targets_p (void);
+
   /* Expected number of executions: calculated in profile.c.  */
   profile_count count;
   cgraph_node *caller;
   cgraph_node *callee;
   cgraph_edge *prev_caller;
   cgraph_edge *next_caller;
   cgraph_edge *prev_callee;
   cgraph_edge *next_callee;
   gcall *call_stmt;
   /* Additional information about an indirect call.  Not cleared when an edge
      becomes direct.  */
   cgraph_indirect_call_info *indirect_info;
   PTR GTY ((skip (""))) aux;
   /* When equal to CIF_OK, inline this call.  Otherwise, points to the
      explanation why function was not inlined.  */
   enum cgraph_inline_failed_t inline_failed;
   /* The stmt_uid of call_stmt.  This is used by LTO to recover the call_stmt
      when the function is serialized in.  */
   unsigned int lto_stmt_uid;
+  /* speculative id is used to link direct calls with their corresponding
+     IPA_REF_ADDR references when representing speculative calls.  */
+  unsigned int speculative_id : 16;
   /* Whether this edge was made direct by indirect inlining.  */
   unsigned int indirect_inlining_edge : 1;
   /* Whether this edge describes an indirect call with an undetermined
      callee.  */
   unsigned int indirect_unknown_callee : 1;
   /* Whether this edge is still a dangling  */
   /* True if the corresponding CALL stmt cannot be inlined.  */
   unsigned int call_stmt_cannot_inline_p : 1;
   /* Can this call throw externally?  */
   unsigned int can_throw_external : 1;
   /* Edges with SPECULATIVE flag represents indirect calls that was
      speculatively turned into direct (i.e. by profile feedback).
      The final code sequence will have form:
 
      if (call_target == expected_fn)
diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c
index 81c5dfd194f..da9a4f3f6f5 100644
--- a/gcc/cgraphclones.c
+++ b/gcc/cgraphclones.c
@@ -121,30 +121,31 @@ cgraph_edge::clone (cgraph_node *n, gcall *call_stmt, unsigned stmt_uid,
     }
   else
     {
       new_edge = n->create_edge (callee, call_stmt, prof_count, true);
       if (indirect_info)
        {
          new_edge->indirect_info
            = ggc_cleared_alloc<cgraph_indirect_call_info> ();
          *new_edge->indirect_info = *indirect_info;
        }
     }
 
   new_edge->inline_failed = inline_failed;
   new_edge->indirect_inlining_edge = indirect_inlining_edge;
   new_edge->lto_stmt_uid = stmt_uid;
+  new_edge->speculative_id = speculative_id;
   /* Clone flags that depend on call_stmt availability manually.  */
   new_edge->can_throw_external = can_throw_external;
   new_edge->call_stmt_cannot_inline_p = call_stmt_cannot_inline_p;
   new_edge->speculative = speculative;
   new_edge->in_polymorphic_cdtor = in_polymorphic_cdtor;
 
   /* Update IPA profile.  Local profiles need no updating in original.  */
   if (update_original)
     count = count.combine_with_ipa_count_within (count.ipa () 
                                                 - new_edge->count.ipa (),
                                                 caller->count);
   symtab->call_edge_duplication_hooks (this, new_edge);
   return new_edge;
 }
 
diff --git a/gcc/ipa-profile.c b/gcc/ipa-profile.c
index 4b28b94aaad..17ab5ce60d7 100644
--- a/gcc/ipa-profile.c
+++ b/gcc/ipa-profile.c
@@ -147,144 +147,417 @@ dump_histogram (FILE *file, vec<histogram_entry *> histogram)
   if (!overall_size)
     overall_size = 1;
   for (i = 0; i < histogram.length (); i++)
     {
       cumulated_time += histogram[i]->count * histogram[i]->time;
       cumulated_size += histogram[i]->size;
       fprintf (file, "  %" PRId64": time:%i (%2.2f) size:%i (%2.2f)\n",
               (int64_t) histogram[i]->count,
               histogram[i]->time,
               cumulated_time * 100.0 / overall_time,
               histogram[i]->size,
               cumulated_size * 100.0 / overall_size);
    }
 }
 
-/* Collect histogram from CFG profiles.  */
+/* Structure containing speculative target information from profile.  */
+
+struct GTY (()) speculative_call_target
+{
+  speculative_call_target (unsigned int id, int prob)
+    : target_id (id), target_probability (prob)
+  {
+  }
+
+  /* Profile_id of target obtained from profile.  */
+  unsigned int target_id;
+  /* Probability that call will land in function with target_id.  */
+  int target_probability;
+};
+
+class GTY ((for_user)) speculative_call_summary
+{
+public:
+  speculative_call_summary () : speculative_call_targets ()
+  {}
+
+  vec<speculative_call_target, va_gc> *speculative_call_targets;
+  ~speculative_call_summary ();
+
+  void dump (FILE *f);
+
+  /* Check whether this is an empty summary.  */
+  bool is_empty ();
+};
+
+  /* Class to manage call summaries.  */
+
+class GTY ((user)) ipa_profile_call_summaries
+  : public call_summary<speculative_call_summary *>
+{
+public:
+  ipa_profile_call_summaries (symbol_table *table, bool ggc)
+    : call_summary<speculative_call_summary *> (table, ggc)
+  {}
+
+  /* Duplicate info when an edge is cloned.  */
+  virtual void duplicate (cgraph_edge *, cgraph_edge *,
+                         speculative_call_summary *old_sum,
+                         speculative_call_summary *new_sum);
+};
+
+static GTY (()) ipa_profile_call_summaries *call_sums = NULL;
+
+speculative_call_summary::~speculative_call_summary ()
+{
+  if (speculative_call_targets)
+    {
+      vec_free (speculative_call_targets);
+      speculative_call_targets = NULL;
+    }
+}
+
+/* Dump all information in speculative call summary to F.  */
+
+void
+speculative_call_summary::dump (FILE *f)
+{
+  speculative_call_target *item;
+  cgraph_node *n2;
+  unsigned int i;
+
+  FOR_EACH_VEC_SAFE_ELT (speculative_call_targets, i, item)
+    {
+      n2 = find_func_by_profile_id (item->target_id);
+      if (n2)
+       fprintf (f, "    The %i speculative target is %s with prob %3.2f\n", i,
+                n2->dump_name (),
+                item->target_probability / (float) REG_BR_PROB_BASE);
+      else
+       fprintf (f, "    The %i speculative target is %u with prob %3.2f\n", i,
+                item->target_id,
+                item->target_probability / (float) REG_BR_PROB_BASE);
+    }
+}
+
+/* Check whether this is an empty summary.  */
+bool
+speculative_call_summary::is_empty ()
+{
+  return speculative_call_targets == NULL
+        || speculative_call_targets->is_empty ();
+}
+
+/* Duplicate info when an edge is cloned.  */
+
+void
+ipa_profile_call_summaries::duplicate (cgraph_edge *, cgraph_edge *,
+                                 speculative_call_summary *old_sum,
+                                 speculative_call_summary *new_sum)
+{
+  if (!old_sum || !old_sum->speculative_call_targets)
+    return;
+
+  speculative_call_target *item;
+  unsigned int i;
+
+  FOR_EACH_VEC_SAFE_ELT (old_sum->speculative_call_targets, i, item)
+    {
+      vec_safe_push (new_sum->speculative_call_targets, *item);
+    }
+}
+
+/* Collect histogram and speculative target summaries from CFG profiles.  */
 
 static void
 ipa_profile_generate_summary (void)
 {
   struct cgraph_node *node;
   gimple_stmt_iterator gsi;
   basic_block bb;
 
   hash_table<histogram_hash> hashtable (10);
-  
+
+  gcc_checking_assert (!call_sums);
+  call_sums = (new (ggc_alloc_no_dtor<ipa_profile_call_summaries> ())
+                ipa_profile_call_summaries (symtab, true));
+
   FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)
     if (ENTRY_BLOCK_PTR_FOR_FN (DECL_STRUCT_FUNCTION (node->decl))->count.ipa_p ())
       FOR_EACH_BB_FN (bb, DECL_STRUCT_FUNCTION (node->decl))
        {
          int time = 0;
          int size = 0;
          for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
            {
              gimple *stmt = gsi_stmt (gsi);
              if (gimple_code (stmt) == GIMPLE_CALL
                  && !gimple_call_fndecl (stmt))
                {
                  histogram_value h;
                  h = gimple_histogram_value_of_type
                        (DECL_STRUCT_FUNCTION (node->decl),
                         stmt, HIST_TYPE_INDIR_CALL);
                  /* No need to do sanity check: gimple_ic_transform already
                     takes away bad histograms.  */
                  if (h)
                    {
                      gcov_type val, count, all;
-                     if (get_nth_most_common_value (NULL, "indirect call", h,
-                                                    &val, &count, &all))
+                     struct cgraph_edge *e = node->get_edge (stmt);
+                     if (e && !e->indirect_unknown_callee)
+                       continue;
+
+                     speculative_call_summary *csum
+                       = call_sums->get_create (e);
+
+                     for (unsigned j = 0; j < GCOV_TOPN_VALUES; j++)
                        {
-                         struct cgraph_edge * e = node->get_edge (stmt);
-                         if (e && !e->indirect_unknown_callee)
+                         if (!get_nth_most_common_value (NULL, "indirect call",
+                                                         h, &val, &count, &all,
+                                                         j))
                            continue;
 
-                         e->indirect_info->common_target_id = val;
-                         e->indirect_info->common_target_probability
-                           = GCOV_COMPUTE_SCALE (count, all);
-                         if (e->indirect_info->common_target_probability > REG_BR_PROB_BASE)
+                         if (val == 0)
+                           continue;
+
+                         speculative_call_target item (
+                           val, GCOV_COMPUTE_SCALE (count, all));
+                         if (item.target_probability > REG_BR_PROB_BASE)
                            {
                              if (dump_file)
-                               fprintf (dump_file, "Probability capped to 1\n");
-                             e->indirect_info->common_target_probability = REG_BR_PROB_BASE;
+                               fprintf (dump_file,
+                                        "Probability capped to 1\n");
+                             item.target_probability = REG_BR_PROB_BASE;
                            }
+                         vec_safe_push (csum->speculative_call_targets, item);
                        }
+
                      gimple_remove_histogram_value (DECL_STRUCT_FUNCTION (node->decl),
                                                      stmt, h);
                    }
                }
              time += estimate_num_insns (stmt, &eni_time_weights);
              size += estimate_num_insns (stmt, &eni_size_weights);
            }
          if (bb->count.ipa_p () && bb->count.initialized_p ())
            account_time_size (&hashtable, histogram, bb->count.ipa ().to_gcov_type (),
                               time, size);
        }
   histogram.qsort (cmp_counts);
 }
 
+/* Serialize the speculative summary info for LTO.  */
+
+static void
+ipa_profile_write_edge_summary (lto_simple_output_block *ob,
+                               speculative_call_summary *csum)
+{
+  speculative_call_target *item;
+  unsigned int i;
+  unsigned len = 0;
+
+  if (!csum->is_empty ())
+    len = csum->speculative_call_targets->length ();
+
+  gcc_assert (len <= GCOV_TOPN_VALUES);
+
+  streamer_write_hwi_stream (ob->main_stream, len);
+
+  if (len)
+    {
+      FOR_EACH_VEC_SAFE_ELT (csum->speculative_call_targets, i, item)
+       {
+         gcc_assert (item->target_id);
+         streamer_write_hwi_stream (ob->main_stream, item->target_id);
+         streamer_write_hwi_stream (ob->main_stream, item->target_probability);
+       }
+    }
+}
+
 /* Serialize the ipa info for lto.  */
 
 static void
 ipa_profile_write_summary (void)
 {
   struct lto_simple_output_block *ob
     = lto_create_simple_output_block (LTO_section_ipa_profile);
   unsigned int i;
 
   streamer_write_uhwi_stream (ob->main_stream, histogram.length ());
   for (i = 0; i < histogram.length (); i++)
     {
       streamer_write_gcov_count_stream (ob->main_stream, histogram[i]->count);
       streamer_write_uhwi_stream (ob->main_stream, histogram[i]->time);
       streamer_write_uhwi_stream (ob->main_stream, histogram[i]->size);
     }
+
+  if (!call_sums)
+    return;
+
+  /* Serialize speculative targets information.  */
+  unsigned int count = 0;
+  lto_symtab_encoder_t encoder = ob->decl_state->symtab_node_encoder;
+  lto_symtab_encoder_iterator lsei;
+  cgraph_node *node;
+
+  for (lsei = lsei_start_function_in_partition (encoder); !lsei_end_p (lsei);
+       lsei_next_function_in_partition (&lsei))
+    {
+      node = lsei_cgraph_node (lsei);
+      if (node->definition && node->has_gimple_body_p ()
+         && node->indirect_calls)
+       count++;
+    }
+
+  streamer_write_uhwi_stream (ob->main_stream, count);
+
+  /* Process all of the functions.  */
+  for (lsei = lsei_start_function_in_partition (encoder);
+       !lsei_end_p (lsei) && count; lsei_next_function_in_partition (&lsei))
+    {
+      cgraph_node *node = lsei_cgraph_node (lsei);
+      if (node->definition && node->has_gimple_body_p ()
+         && node->indirect_calls)
+       {
+         int node_ref = lto_symtab_encoder_encode (encoder, node);
+         streamer_write_uhwi_stream (ob->main_stream, node_ref);
+
+         for (cgraph_edge *e = node->indirect_calls; e; e = e->next_callee)
+           {
+             speculative_call_summary *csum = call_sums->get_create (e);
+             ipa_profile_write_edge_summary (ob, csum);
+           }
+      }
+    }
+
   lto_destroy_simple_output_block (ob);
 }
 
-/* Deserialize the ipa info for lto.  */
+/* Dump all profile summary data for all cgraph nodes and edges to file F.  */
+
+static void
+ipa_profile_dump_all_summaries (FILE *f)
+{
+  fprintf (f,
+          "\n========== IPA-profile speculative targets: ==========\n");
+  cgraph_node *node;
+  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)
+    {
+      fprintf (f, "\nSummary for node %s:\n", node->dump_name ());
+      for (cgraph_edge *e = node->indirect_calls; e; e = e->next_callee)
+       {
+         fprintf (f, "  Summary for %s of indirect edge %d:\n",
+                  e->caller->dump_name (), e->lto_stmt_uid);
+         speculative_call_summary *csum = call_sums->get_create (e);
+         if (!csum->is_empty ())
+           csum->dump (f);
+         else
+           fprintf (f, "    No indirect call summary.\n");
+       }
+    }
+  fprintf (f, "\n\n");
+}
+
+/* Read speculative targets information about edge for LTO WPA.  */
+
+static void
+ipa_profile_read_edge_summary (class lto_input_block *ib, cgraph_edge *edge)
+{
+  unsigned i, len;
+
+  len = streamer_read_hwi (ib);
+  gcc_assert (len <= GCOV_TOPN_VALUES);
+
+  speculative_call_summary *csum = call_sums->get_create (edge);
+
+  for (i = 0; i < len; i++)
+    {
+      unsigned int id = streamer_read_hwi (ib);
+      speculative_call_target item (id, streamer_read_hwi (ib));
+      vec_safe_push (csum->speculative_call_targets, item);
+    }
+}
+
+/* Read profile speculative targets section information for LTO WPA.  */
+
+static void
+ipa_profile_read_summary_section (struct lto_file_decl_data *file_data,
+                                 class lto_input_block *ib)
+{
+  if (!ib)
+    return;
+
+  lto_symtab_encoder_t encoder = file_data->symtab_node_encoder;
+
+  unsigned int count = streamer_read_uhwi (ib);
+
+  unsigned int i;
+  unsigned int index;
+  cgraph_node * node;
+
+  for (i = 0; i < count; i++)
+    {
+      index = streamer_read_uhwi (ib);
+      encoder = file_data->symtab_node_encoder;
+      node
+       = dyn_cast<cgraph_node *> (lto_symtab_encoder_deref (encoder, index));
+
+      for (cgraph_edge *e = node->indirect_calls; e; e = e->next_callee)
+       ipa_profile_read_edge_summary (ib, e);
+    }
+}
+
+/* Deserialize the IPA histogram and speculative targets summary info for
+   LTO.  */
 
 static void
 ipa_profile_read_summary (void)
 {
   struct lto_file_decl_data ** file_data_vec
     = lto_get_file_decl_data ();
   struct lto_file_decl_data * file_data;
   int j = 0;
 
   hash_table<histogram_hash> hashtable (10);
 
+  gcc_checking_assert (!call_sums);
+  call_sums = (new (ggc_alloc_no_dtor<ipa_profile_call_summaries> ())
+                ipa_profile_call_summaries (symtab, true));
+
   while ((file_data = file_data_vec[j++]))
     {
       const char *data;
       size_t len;
       class lto_input_block *ib
        = lto_create_simple_input_block (file_data,
                                         LTO_section_ipa_profile,
                                         &data, &len);
       if (ib)
        {
           unsigned int num = streamer_read_uhwi (ib);
          unsigned int n;
          for (n = 0; n < num; n++)
            {
              gcov_type count = streamer_read_gcov_count (ib);
              int time = streamer_read_uhwi (ib);
              int size = streamer_read_uhwi (ib);
              account_time_size (&hashtable, histogram,
                                 count, time, size);
            }
+
+         ipa_profile_read_summary_section (file_data, ib);
+
          lto_destroy_simple_input_block (file_data,
                                          LTO_section_ipa_profile,
                                          ib, data, len);
        }
     }
   histogram.qsort (cmp_counts);
 }
 
 /* Data used by ipa_propagate_frequency.  */
 
 struct ipa_propagate_frequency_data
 {
   cgraph_node *function_symbol;
   bool maybe_unlikely_executed;
   bool maybe_executed_once;
@@ -500,46 +773,45 @@ check_argument_count (struct cgraph_node *n, struct cgraph_edge *e)
 /* Simple ipa profile pass propagating frequencies across the callgraph.  */
 
 static unsigned int
 ipa_profile (void)
 {
   struct cgraph_node **order;
   struct cgraph_edge *e;
   int order_pos;
   bool something_changed = false;
   int i;
   gcov_type overall_time = 0, cutoff = 0, cumulated = 0, overall_size = 0;
   struct cgraph_node *n,*n2;
   int nindirect = 0, ncommon = 0, nunknown = 0, nuseless = 0, nconverted = 0;
   int nmismatch = 0, nimpossible = 0;
   bool node_map_initialized = false;
+  gcov_type threshold;
 
   if (dump_file)
     dump_histogram (dump_file, histogram);
   for (i = 0; i < (int)histogram.length (); i++)
     {
       overall_time += histogram[i]->count * histogram[i]->time;
       overall_size += histogram[i]->size;
     }
+  threshold = 0;
   if (overall_time)
     {
-      gcov_type threshold;
-
       gcc_assert (overall_size);
 
       cutoff = (overall_time * param_hot_bb_count_ws_permille + 500) / 1000;
-      threshold = 0;
       for (i = 0; cumulated < cutoff; i++)
        {
          cumulated += histogram[i]->count * histogram[i]->time;
           threshold = histogram[i]->count;
        }
       if (!threshold)
        threshold = 1;
       if (dump_file)
        {
          gcov_type cumulated_time = 0, cumulated_size = 0;
 
           for (i = 0;
               i < (int)histogram.length () && histogram[i]->count >= threshold;
               i++)
            {
@@ -551,65 +823,101 @@ ipa_profile (void)
                   (int64_t)threshold,
                   cumulated_time * 100.0 / overall_time,
                   cumulated_size * 100.0 / overall_size);
        }
 
       if (in_lto_p)
        {
          if (dump_file)
            fprintf (dump_file, "Setting hotness threshold in LTO mode.\n");
           set_hot_bb_threshold (threshold);
        }
     }
   histogram.release ();
   histogram_pool.release ();
 
-  /* Produce speculative calls: we saved common target from porfiling into
-     e->common_target_id.  Now, at link time, we can look up corresponding
+  /* Produce speculative calls: we saved common target from profiling into
+     e->target_id.  Now, at link time, we can look up corresponding
      function node and produce speculative call.  */
 
+  gcc_checking_assert (call_sums);
+
+  if (dump_file)
+    {
+      if (!node_map_initialized)
+       init_node_map (false);
+      node_map_initialized = true;
+
+      ipa_profile_dump_all_summaries (dump_file);
+    }
+
   FOR_EACH_DEFINED_FUNCTION (n)
     {
       bool update = false;
 
       if (!opt_for_fn (n->decl, flag_ipa_profile))
        continue;
 
       for (e = n->indirect_calls; e; e = e->next_callee)
        {
          if (n->count.initialized_p ())
            nindirect++;
-         if (e->indirect_info->common_target_id)
+
+         speculative_call_summary *csum = call_sums->get_create (e);
+         if (!csum->is_empty ())
            {
              if (!node_map_initialized)
-               init_node_map (false);
+               init_node_map (false);
              node_map_initialized = true;
              ncommon++;
-             n2 = find_func_by_profile_id (e->indirect_info->common_target_id);
+
+             if (in_lto_p)
+               {
+                 if (dump_file)
+                   {
+                     fprintf (dump_file,
+                              "Updating hotness threshold in LTO mode.\n");
+                     fprintf (dump_file, "Updated min count: %" PRId64 "\n",
+                              (int64_t) threshold
+                                / csum->speculative_call_targets->length ());
+                   }
+                 set_hot_bb_threshold (threshold
+                   / csum->speculative_call_targets->length ());
+               }
+
+             unsigned speculative_id = 0;
+             speculative_call_target *item;
+             /* The code below is not reindented yet for review convenience.
+                Moving it to a separate small function is not easy as it uses
+                too many local variables.  Reformat it and remove this comment
+                once approved.  */
+             FOR_EACH_VEC_SAFE_ELT (csum->speculative_call_targets, i, item)
+              {
+             bool speculative_found = false;
+             n2 = find_func_by_profile_id (item->target_id);
              if (n2)
                {
                  if (dump_file)
                    {
                      fprintf (dump_file, "Indirect call -> direct call from"
                               " other module %s => %s, prob %3.2f\n",
                               n->dump_name (),
                               n2->dump_name (),
-                              e->indirect_info->common_target_probability
-                              / (float)REG_BR_PROB_BASE);
+                              item->target_probability
+                                / (float) REG_BR_PROB_BASE);
                    }
-                 if (e->indirect_info->common_target_probability
-                     < REG_BR_PROB_BASE / 2)
+                 if (item->target_probability < REG_BR_PROB_BASE / 2)
                    {
                      nuseless++;
                      if (dump_file)
                        fprintf (dump_file,
                                 "Not speculating: probability is too low.\n");
                    }
                  else if (!e->maybe_hot_p ())
                    {
                      nuseless++;
                      if (dump_file)
                        fprintf (dump_file,
                                 "Not speculating: call is cold.\n");
                    }
                  else if (n2->get_availability () <= AVAIL_INTERPOSABLE
                           && n2->can_be_discarded_p ())
@@ -641,44 +949,58 @@ ipa_profile (void)
                    }
                  else
                    {
                      /* Target may be overwritable, but profile says that
                         control flow goes to this particular implementation
                         of N2.  Speculate on the local alias to allow inlining.
                       */
                      if (!n2->can_be_discarded_p ())
                        {
                          cgraph_node *alias;
                          alias = dyn_cast<cgraph_node *> (n2->noninterposable_alias ());
                          if (alias)
                            n2 = alias;
                        }
                      nconverted++;
-                     e->make_speculative
-                       (n2,
-                        e->count.apply_probability
-                                    (e->indirect_info->common_target_probability));
+                     e->make_speculative (n2,
+                                          e->count.apply_probability (
+                                            item->target_probability),
+                                          speculative_id);
                      update = true;
+                     speculative_id++;
+                     speculative_found = true;
                    }
                }
              else
                {
                  if (dump_file)
                    fprintf (dump_file, "Function with profile-id %i not found.\n",
-                            e->indirect_info->common_target_id);
+                            item->target_id);
                  nunknown++;
                }
+             if (!speculative_found && !csum->is_empty ())
+               {
+                 /* Remove item from speculative_call_targets if no
+                    speculative edge generated, rollback the iteration.  */
+                 csum->speculative_call_targets->ordered_remove (i);
+                 if (i)
+                   i--;
+               }
+              }
+             if (!csum->is_empty ())
+               e->indirect_info->num_speculative_call_targets
+                 = csum->speculative_call_targets->length ();
            }
         }
        if (update)
         ipa_update_overall_fn_summary (n);
      }
   if (node_map_initialized)
     del_node_map ();
   if (dump_file && nindirect)
     fprintf (dump_file,
             "%i indirect calls trained.\n"
             "%i (%3.2f%%) have common target.\n"
             "%i (%3.2f%%) targets was not found.\n"
             "%i (%3.2f%%) targets had parameter count mismatch.\n"
             "%i (%3.2f%%) targets was not in polymorphic call target list.\n"
             "%i (%3.2f%%) speculations seems useless.\n"
@@ -764,15 +1086,17 @@ public:
   {}
 
   /* opt_pass methods: */
   virtual bool gate (function *) { return flag_ipa_profile || in_lto_p; }
   virtual unsigned int execute (function *) { return ipa_profile (); }
 
 }; // class pass_ipa_profile
 
 } // anon namespace
 
 ipa_opt_pass_d *
 make_pass_ipa_profile (gcc::context *ctxt)
 {
   return new pass_ipa_profile (ctxt);
 }
+/* Tell the garbage collector about GTY markers in this source file.  */
+#include "gt-ipa-profile.h"
diff --git a/gcc/ipa-ref.h b/gcc/ipa-ref.h
index 00af24c77db..b68661d45c8 100644
--- a/gcc/ipa-ref.h
+++ b/gcc/ipa-ref.h
@@ -47,30 +47,31 @@ public:
   bool cannot_lead_to_return ();
 
   /* Return true if reference may be used in address compare.  */
   bool address_matters_p ();
 
   /* Return reference list this reference is in.  */
   struct ipa_ref_list * referring_ref_list (void);
 
   /* Return reference list this reference is in.  */
   struct ipa_ref_list * referred_ref_list (void);
 
   symtab_node *referring;
   symtab_node *referred;
   gimple *stmt;
   unsigned int lto_stmt_uid;
+  unsigned int speculative_id;
   unsigned int referred_index;
   ENUM_BITFIELD (ipa_ref_use) use:3;
   unsigned int speculative:1;
 };
 
 typedef struct ipa_ref ipa_ref_t;
 typedef struct ipa_ref *ipa_ref_ptr;
 
 
 /* List of references.  This is stored in both callgraph and varpool nodes.  */
 struct GTY(()) ipa_ref_list
 {
 public:
   /* Return first reference in list or NULL if empty.  */
   struct ipa_ref *first_reference (void)
diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c
index b5221cd41f9..00a79d35a51 100644
--- a/gcc/lto-cgraph.c
+++ b/gcc/lto-cgraph.c
@@ -250,62 +250,58 @@ lto_output_edge (struct lto_simple_output_block *ob, struct cgraph_edge *edge,
   if (!edge->indirect_unknown_callee)
     {
       ref = lto_symtab_encoder_lookup (encoder, edge->callee);
       gcc_assert (ref != LCC_NOT_FOUND);
       streamer_write_hwi_stream (ob->main_stream, ref);
     }
 
   edge->count.stream_out (ob->main_stream);
 
   bp = bitpack_create (ob->main_stream);
   uid = (!gimple_has_body_p (edge->caller->decl) || edge->caller->thunk.thunk_p
         ? edge->lto_stmt_uid : gimple_uid (edge->call_stmt) + 1);
   bp_pack_enum (&bp, cgraph_inline_failed_t,
                CIF_N_REASONS, edge->inline_failed);
   bp_pack_var_len_unsigned (&bp, uid);
+  bp_pack_var_len_unsigned (&bp, edge->speculative_id);
   bp_pack_value (&bp, edge->indirect_inlining_edge, 1);
   bp_pack_value (&bp, edge->speculative, 1);
   bp_pack_value (&bp, edge->call_stmt_cannot_inline_p, 1);
   gcc_assert (!edge->call_stmt_cannot_inline_p
              || edge->inline_failed != CIF_BODY_NOT_AVAILABLE);
   bp_pack_value (&bp, edge->can_throw_external, 1);
   bp_pack_value (&bp, edge->in_polymorphic_cdtor, 1);
   if (edge->indirect_unknown_callee)
     {
       int flags = edge->indirect_info->ecf_flags;
       bp_pack_value (&bp, (flags & ECF_CONST) != 0, 1);
       bp_pack_value (&bp, (flags & ECF_PURE) != 0, 1);
       bp_pack_value (&bp, (flags & ECF_NORETURN) != 0, 1);
       bp_pack_value (&bp, (flags & ECF_MALLOC) != 0, 1);
       bp_pack_value (&bp, (flags & ECF_NOTHROW) != 0, 1);
       bp_pack_value (&bp, (flags & ECF_RETURNS_TWICE) != 0, 1);
       /* Flags that should not appear on indirect calls.  */
       gcc_assert (!(flags & (ECF_LOOPING_CONST_OR_PURE
                             | ECF_MAY_BE_ALLOCA
                             | ECF_SIBCALL
                             | ECF_LEAF
                             | ECF_NOVOPS)));
+
+      bp_pack_value (&bp, edge->indirect_info->num_speculative_call_targets,
+                    16);
     }
   streamer_write_bitpack (&bp);
-  if (edge->indirect_unknown_callee)
-    {
-      streamer_write_hwi_stream (ob->main_stream,
-                                edge->indirect_info->common_target_id);
-      if (edge->indirect_info->common_target_id)
-       streamer_write_hwi_stream
-          (ob->main_stream, edge->indirect_info->common_target_probability);
-    }
 }
 
 /* Return if NODE contain references from other partitions.  */
 
 bool
 referenced_from_other_partition_p (symtab_node *node, lto_symtab_encoder_t encoder)
 {
   int i;
   struct ipa_ref *ref = NULL;
 
   for (i = 0; node->iterate_referring (i, ref); i++)
     {
       /* Ignore references from non-offloadable nodes while streaming NODE into
         offload LTO section.  */
       if (!ref->referring->need_lto_streaming)
@@ -678,30 +674,31 @@ lto_output_ref (struct lto_simple_output_block *ob, struct ipa_ref *ref,
 
   bp = bitpack_create (ob->main_stream);
   bp_pack_value (&bp, ref->use, 3);
   bp_pack_value (&bp, ref->speculative, 1);
   streamer_write_bitpack (&bp);
   nref = lto_symtab_encoder_lookup (encoder, ref->referred);
   gcc_assert (nref != LCC_NOT_FOUND);
   streamer_write_hwi_stream (ob->main_stream, nref);
   
   node = dyn_cast <cgraph_node *> (ref->referring);
   if (node)
     {
       if (ref->stmt)
        uid = gimple_uid (ref->stmt) + 1;
       streamer_write_hwi_stream (ob->main_stream, uid);
+      streamer_write_hwi_stream (ob->main_stream, ref->speculative_id);
     }
 }
 
 /* Stream out profile_summary to OB.  */
 
 static void
 output_profile_summary (struct lto_simple_output_block *ob)
 {
   if (profile_info)
     {
       /* We do not output num and run_max, they are not used by
          GCC profile feedback and they are difficult to merge from multiple
          units.  */
       unsigned runs = (profile_info->runs);
       streamer_write_uhwi_stream (ob->main_stream, runs);
@@ -1416,99 +1413,104 @@ input_ref (class lto_input_block *ib,
           vec<symtab_node *> nodes)
 {
   symtab_node *node = NULL;
   struct bitpack_d bp;
   enum ipa_ref_use use;
   bool speculative;
   struct ipa_ref *ref;
 
   bp = streamer_read_bitpack (ib);
   use = (enum ipa_ref_use) bp_unpack_value (&bp, 3);
   speculative = (enum ipa_ref_use) bp_unpack_value (&bp, 1);
   node = nodes[streamer_read_hwi (ib)];
   ref = referring_node->create_reference (node, use);
   ref->speculative = speculative;
   if (is_a <cgraph_node *> (referring_node))
-    ref->lto_stmt_uid = streamer_read_hwi (ib);
+    {
+      ref->lto_stmt_uid = streamer_read_hwi (ib);
+      ref->speculative_id = streamer_read_hwi (ib);
+    }
 }
 
 /* Read an edge from IB.  NODES points to a vector of previously read nodes for
    decoding caller and callee of the edge to be read.  If INDIRECT is true, the
    edge being read is indirect (in the sense that it has
    indirect_unknown_callee set).  */
 
 static void
 input_edge (class lto_input_block *ib, vec<symtab_node *> nodes,
            bool indirect)
 {
   struct cgraph_node *caller, *callee;
   struct cgraph_edge *edge;
-  unsigned int stmt_id;
+  unsigned int stmt_id, speculative_id;
   profile_count count;
   cgraph_inline_failed_t inline_failed;
   struct bitpack_d bp;
   int ecf_flags = 0;
 
   caller = dyn_cast<cgraph_node *> (nodes[streamer_read_hwi (ib)]);
   if (caller == NULL || caller->decl == NULL_TREE)
     internal_error ("bytecode stream: no caller found while reading edge");
 
   if (!indirect)
     {
       callee = dyn_cast<cgraph_node *> (nodes[streamer_read_hwi (ib)]);
       if (callee == NULL || callee->decl == NULL_TREE)
        internal_error ("bytecode stream: no callee found while reading edge");
     }
   else
     callee = NULL;
 
   count = profile_count::stream_in (ib);
 
   bp = streamer_read_bitpack (ib);
   inline_failed = bp_unpack_enum (&bp, cgraph_inline_failed_t, CIF_N_REASONS);
   stmt_id = bp_unpack_var_len_unsigned (&bp);
+  speculative_id = bp_unpack_var_len_unsigned (&bp);
 
   if (indirect)
     edge = caller->create_indirect_edge (NULL, 0, count);
   else
     edge = caller->create_edge (callee, NULL, count);
 
   edge->indirect_inlining_edge = bp_unpack_value (&bp, 1);
   edge->speculative = bp_unpack_value (&bp, 1);
   edge->lto_stmt_uid = stmt_id;
+  edge->speculative_id = speculative_id;
   edge->inline_failed = inline_failed;
   edge->call_stmt_cannot_inline_p = bp_unpack_value (&bp, 1);
   edge->can_throw_external = bp_unpack_value (&bp, 1);
   edge->in_polymorphic_cdtor = bp_unpack_value (&bp, 1);
   if (indirect)
     {
       if (bp_unpack_value (&bp, 1))
        ecf_flags |= ECF_CONST;
       if (bp_unpack_value (&bp, 1))
        ecf_flags |= ECF_PURE;
       if (bp_unpack_value (&bp, 1))
        ecf_flags |= ECF_NORETURN;
       if (bp_unpack_value (&bp, 1))
        ecf_flags |= ECF_MALLOC;
       if (bp_unpack_value (&bp, 1))
        ecf_flags |= ECF_NOTHROW;
       if (bp_unpack_value (&bp, 1))
        ecf_flags |= ECF_RETURNS_TWICE;
       edge->indirect_info->ecf_flags = ecf_flags;
-      edge->indirect_info->common_target_id = streamer_read_hwi (ib);
-      if (edge->indirect_info->common_target_id)
-        edge->indirect_info->common_target_probability = streamer_read_hwi (ib);
+
+      edge->indirect_info->num_speculative_call_targets
+       = bp_unpack_value (&bp, 16);
     }
 }
 
 
 /* Read a cgraph from IB using the info in FILE_DATA.  */
 
 static vec<symtab_node *> 
 input_cgraph_1 (struct lto_file_decl_data *file_data,
                class lto_input_block *ib)
 {
   enum LTO_symtab_tags tag;
   vec<symtab_node *> nodes = vNULL;
   symtab_node *node;
   unsigned i;
 
diff --git a/gcc/predict.c b/gcc/predict.c
index 67f850de17a..592db3421f3 100644
--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -749,31 +749,30 @@ dump_prediction (FILE *file, enum br_predictor predictor, int probability,
          fprintf (file, " hit ");
          e->count ().dump (file);
          fprintf (file, " (%.1f%%)", e->count ().to_gcov_type() * 100.0
                   / bb->count.to_gcov_type ());
        }
     }
 
   fprintf (file, "\n");
 
   /* Print output that be easily read by analyze_brprob.py script. We are
      interested only in counts that are read from GCDA files.  */
   if (dump_file && (dump_flags & TDF_DETAILS)
       && bb->count.precise_p ()
       && reason == REASON_NONE)
     {
-      gcc_assert (e->count ().precise_p ());
       fprintf (file, ";;heuristics;%s;%" PRId64 ";%" PRId64 ";%.1f;\n",
               predictor_info[predictor].name,
               bb->count.to_gcov_type (), e->count ().to_gcov_type (),
               probability * 100.0 / REG_BR_PROB_BASE);
     }
 }
 
 /* Return true if STMT is known to be unlikely executed.  */
 
 static bool
 unlikely_executed_stmt_p (gimple *stmt)
 {
   if (!is_gimple_call (stmt))
     return false;
   /* NORETURN attribute alone is not strong enough: exit() may be quite
diff --git a/gcc/symtab.c b/gcc/symtab.c
index f4317d02b71..c43bf613f08 100644
--- a/gcc/symtab.c
+++ b/gcc/symtab.c
@@ -591,30 +591,31 @@ symtab_node::create_reference (symtab_node *referred_node,
       ref->referred_index = 0;
 
       for (unsigned int i = 1; i < list2->referring.length (); i++)
        list2->referring[i]->referred_index = i;
     }
   else
     {
       list2->referring.safe_push (ref);
       ref->referred_index = list2->referring.length () - 1;
     }
 
   ref->referring = this;
   ref->referred = referred_node;
   ref->stmt = stmt;
   ref->lto_stmt_uid = 0;
+  ref->speculative_id = 0;
   ref->use = use_type;
   ref->speculative = 0;
 
   /* If vector was moved in memory, update pointers.  */
   if (old_references != list->references->address ())
     {
       int i;
       for (i = 0; iterate_reference(i, ref2); i++)
        ref2->referred_ref_list ()->referring[ref2->referred_index] = ref2;
     }
   return ref;
 }
 
 ipa_ref *
 symtab_node::maybe_create_reference (tree val, gimple *stmt)
@@ -648,63 +649,66 @@ symtab_node::maybe_create_reference (tree val, gimple *stmt)
 /* Clone all references from symtab NODE to this symtab_node.  */
 
 void
 symtab_node::clone_references (symtab_node *node)
 {
   ipa_ref *ref = NULL, *ref2 = NULL;
   int i;
   for (i = 0; node->iterate_reference (i, ref); i++)
     {
       bool speculative = ref->speculative;
       unsigned int stmt_uid = ref->lto_stmt_uid;
 
       ref2 = create_reference (ref->referred, ref->use, ref->stmt);
       ref2->speculative = speculative;
       ref2->lto_stmt_uid = stmt_uid;
+      ref2->speculative_id = ref->speculative_id;
     }
 }
 
 /* Clone all referring from symtab NODE to this symtab_node.  */
 
 void
 symtab_node::clone_referring (symtab_node *node)
 {
   ipa_ref *ref = NULL, *ref2 = NULL;
   int i;
   for (i = 0; node->iterate_referring(i, ref); i++)
     {
       bool speculative = ref->speculative;
       unsigned int stmt_uid = ref->lto_stmt_uid;
 
       ref2 = ref->referring->create_reference (this, ref->use, ref->stmt);
       ref2->speculative = speculative;
       ref2->lto_stmt_uid = stmt_uid;
+      ref2->speculative_id = ref->speculative_id;
     }
 }
 
 /* Clone reference REF to this symtab_node and set its stmt to STMT.  */
 
 ipa_ref *
 symtab_node::clone_reference (ipa_ref *ref, gimple *stmt)
 {
   bool speculative = ref->speculative;
   unsigned int stmt_uid = ref->lto_stmt_uid;
   ipa_ref *ref2;
 
   ref2 = create_reference (ref->referred, ref->use, stmt);
   ref2->speculative = speculative;
   ref2->lto_stmt_uid = stmt_uid;
+  ref2->speculative_id = ref->speculative_id;
   return ref2;
 }
 
 /* Find the structure describing a reference to REFERRED_NODE
    and associated with statement STMT.  */
 
 ipa_ref *
 symtab_node::find_reference (symtab_node *referred_node,
                             gimple *stmt, unsigned int lto_stmt_uid)
 {
   ipa_ref *r = NULL;
   int i;
 
   for (i = 0; iterate_reference (i, r); i++)
     if (r->referred == referred_node
@@ -735,30 +739,31 @@ symtab_node::remove_stmt_references (gimple *stmt)
    Those are not maintained during inlining & cloning.
    The exception are speculative references that are updated along
    with callgraph edges associated with them.  */
 
 void
 symtab_node::clear_stmts_in_references (void)
 {
   ipa_ref *r = NULL;
   int i;
 
   for (i = 0; iterate_reference (i, r); i++)
     if (!r->speculative)
       {
        r->stmt = NULL;
        r->lto_stmt_uid = 0;
+       r->speculative_id = 0;
       }
 }
 
 /* Remove all references in ref list.  */
 
 void
 symtab_node::remove_all_references (void)
 {
   while (vec_safe_length (ref_list.references))
     ref_list.references->last ().remove_reference ();
   vec_free (ref_list.references);
 }
 
 /* Remove all referring items in ref list.  */
 
diff --git a/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c
new file mode 100644
index 00000000000..a13b08cd60e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c
@@ -0,0 +1,33 @@
+/* { dg-require-effective-target lto } */
+/* { dg-additional-sources "crossmodule-indir-call-topn-1a.c" } */
+/* { dg-require-profiling "-fprofile-generate" } */
+/* { dg-options "-O2 -flto -DDOJOB=1 -fdump-ipa-profile_estimate" } */
+
+#include <stdio.h>
+
+typedef int (*fptr) (int);
+int
+one (int a);
+
+int
+two (int a);
+
+fptr table[] = {&one, &two};
+
+int
+main()
+{
+  int i, x;
+  fptr p = &one;
+
+  x = one (3);
+
+  for (i = 0; i < 350000000; i++)
+    {
+      x = (*p) (3);
+      p = table[x];
+    }
+  printf ("done:%d\n", x);
+}
+
+/* { dg-final-use-not-autofdo { scan-pgo-wpa-ipa-dump "2 \\(200.00%\\) speculations produced." "profile_estimate" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c
new file mode 100644
index 00000000000..a8c6e365fb9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c
@@ -0,0 +1,22 @@
+/* It seems there is no way to prevent the additional source of a
+   multiple-source testcase from being compiled independently.  Just
+   avoid the error.  */
+#ifdef DOJOB
+int
+one (int a)
+{
+  return 1;
+}
+
+int
+two (int a)
+{
+  return 0;
+}
+#else
+int
+main()
+{
+  return 0;
+}
+#endif
diff --git a/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c
new file mode 100644
index 00000000000..9b996fcf0ed
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c
@@ -0,0 +1,40 @@
+/* { dg-require-effective-target lto } */
+/* { dg-additional-sources "crossmodule-indir-call-topn-1a.c" } */
+/* { dg-require-profiling "-fprofile-generate" } */
+/* { dg-options "-O2 -flto -DDOJOB=1 -fdump-ipa-profile_estimate" } */
+
+#include <stdio.h>
+
+typedef int (*fptr) (int);
+int
+one (int a);
+
+int
+two (int a);
+
+fptr table[] = {&one, &two};
+
+int foo ()
+{
+  int i, x;
+  fptr p = &one;
+
+  x = one (3);
+
+  for (i = 0; i < 350000000; i++)
+    {
+      x = (*p) (3);
+      p = table[x];
+    }
+  return x;
+}
+
+int
+main()
+{
+  int x = foo ();
+  printf ("done:%d\n", x);
+}
+
+/* { dg-final-use-not-autofdo { scan-pgo-wpa-ipa-dump "2 \\(200.00%\\) speculations produced." "profile_estimate" } } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-topn.c b/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-topn.c
new file mode 100644
index 00000000000..063996c71df
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-topn.c
@@ -0,0 +1,37 @@
+/* { dg-require-profiling "-fprofile-generate" } */
+/* { dg-options "-O2 -fdump-ipa-profile_estimate" } */
+
+#include <stdio.h>
+
+typedef int (*fptr) (int);
+int
+one (int a)
+{
+  return 1;
+}
+
+int
+two (int a)
+{
+  return 0;
+}
+
+fptr table[] = {&one, &two};
+
+int
+main()
+{
+  int i, x;
+  fptr p = &one;
+
+  one (3);
+
+  for (i = 0; i < 350000000; i++)
+    {
+      x = (*p) (3);
+      p = table[x];
+    }
+  printf ("done:%d\n", x);
+}
+
+/* { dg-final-use-not-autofdo { scan-ipa-dump "2 \\(200.00%\\) speculations produced." "profile_estimate" } } */
diff --git a/gcc/testsuite/lib/scandump.exp b/gcc/testsuite/lib/scandump.exp
index 42f5c01aa60..8b7cd7cfc16 100644
--- a/gcc/testsuite/lib/scandump.exp
+++ b/gcc/testsuite/lib/scandump.exp
@@ -58,30 +58,31 @@ proc scan-dump { args } {
         }
     }
 
     set testcase [testname-for-summary]
     # The name might include a list of options; extract the file name.
     set filename [lindex $testcase 0]
 
     set printable_pattern [make_pattern_printable [lindex $args 1]]
     set suf [dump-suffix [lindex $args 2]]
     set testname "$testcase scan-[lindex $args 0]-dump $suf \"$printable_pattern\""
     set src [file tail $filename]
     set dumpbase [dump-base $src [lindex $args 3]]
     set output_file "[glob -nocomplain $dumpbase.[lindex $args 2]]"
     if { $output_file == "" } {
        verbose -log "$testcase: dump file does not exist"
+       verbose -log "dump file: $dumpbase.$suf"
        unresolved "$testname"
        return
     }
 
     set fd [open $output_file r]
     set text [read $fd]
     close $fd
 
     if [regexp -- [lindex $args 1] $text] {
        pass "$testname"
     } else {
        fail "$testname"
     }
 }
 
diff --git a/gcc/testsuite/lib/scanwpaipa.exp b/gcc/testsuite/lib/scanwpaipa.exp
index b5549fd688e..8aafd6c82e8 100644
--- a/gcc/testsuite/lib/scanwpaipa.exp
+++ b/gcc/testsuite/lib/scanwpaipa.exp
@@ -33,30 +33,53 @@ proc scan-wpa-ipa-dump { args } {
     }
     if { [llength $args] > 3 } {
        error "scan-wpa-ipa-dump: too many arguments"
        return
     }
     if { [llength $args] >= 3 } {
        scan-dump "wpa-ipa" [lindex $args 0] \
                  "\[0-9\]\[0-9\]\[0-9\]i.[lindex $args 1]" ".exe.wpa" \
                  [lindex $args 2]
     } else {
        scan-dump "wpa-ipa" [lindex $args 0] \
                  "\[0-9\]\[0-9\]\[0-9\]i.[lindex $args 1]" ".exe.wpa"
     }
 }
 
+# Argument 0 is the regexp to match
+# Argument 1 is the name of the dumped ipa pass
+# Argument 2 handles expected failures and the like
+proc scan-pgo-wpa-ipa-dump { args } {
+
+    if { [llength $args] < 2 } {
+       error "scan-pgo-wpa-ipa-dump: too few arguments"
+       return
+    }
+    if { [llength $args] > 3 } {
+       error "scan-pgo-wpa-ipa-dump: too many arguments"
+       return
+    }
+    if { [llength $args] >= 3 } {
+       scan-dump "pgo-wpa-ipa" [lindex $args 0] \
+                 "\[0-9\]\[0-9\]\[0-9\]i.[lindex $args 1]" ".x02.wpa" \
+                 [lindex $args 2]
+    } else {
+       scan-dump "pgo-wpa-ipa" [lindex $args 0] \
+                 "\[0-9\]\[0-9\]\[0-9\]i.[lindex $args 1]" ".x02.wpa"
+    }
+}
+
 # Call pass if pattern is present given number of times, otherwise fail.
 # Argument 0 is the regexp to match
 # Argument 1 is number of times the regexp must be found
 # Argument 2 is the name of the dumped ipa pass
 # Argument 3 handles expected failures and the like
 proc scan-wpa-ipa-dump-times { args } {
 
     if { [llength $args] < 3 } {
        error "scan-wpa-ipa-dump-times: too few arguments"
        return
     }
     if { [llength $args] > 4 } {
        error "scan-wpa-ipa-dump-times: too many arguments"
        return
     }
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 720f50eefec..ee024388cf7 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -2185,30 +2185,52 @@ copy_bb (copy_body_data *id, basic_block bb,
                      edge = edge->clone (id->dst_node, call_stmt,
                                          gimple_uid (stmt),
                                          num, den,
                                          true);
 
                      /* Speculative calls consist of two edges - direct and
                         indirect.  Duplicate the whole thing and distribute
                         frequencies accordingly.  */
                      if (edge->speculative)
                        {
                          struct cgraph_edge *direct, *indirect;
                          struct ipa_ref *ref;
 
                          gcc_assert (!edge->indirect_unknown_callee);
                          old_edge->speculative_call_info (direct, indirect, ref);
+                         while (old_edge->next_callee
+                                && old_edge->next_callee->speculative
+                                && indirect->num_speculative_call_targets_p ()
+                                     > 1)
+                           {
+                             id->dst_node->clone_reference (ref, stmt);
+
+                             edge = old_edge->next_callee;
+                             edge = edge->clone (id->dst_node, call_stmt,
+                                                 gimple_uid (stmt), num, den,
+                                                 true);
+                             old_edge = old_edge->next_callee;
+                             gcc_assert (!edge->indirect_unknown_callee);
+
+                             /* If the indirect edge has multiple speculative
+                                calls, iterate through all direct calls
+                                associated to the speculative call and clone
+                                all related direct edges before cloning the
+                                related indirect edge.  */
+                             old_edge->speculative_call_info (direct, indirect,
+                                                              ref);
+                           }
 
                          profile_count indir_cnt = indirect->count;
                          indirect = indirect->clone (id->dst_node, call_stmt,
                                                      gimple_uid (stmt),
                                                      num, den,
                                                      true);
 
                          profile_probability prob
                             = indir_cnt.probability_in (old_cnt + indir_cnt);
                          indirect->count
                             = copy_basic_block->count.apply_probability (prob);
                          edge->count = copy_basic_block->count - indirect->count;
                          id->dst_node->clone_reference (ref, stmt);
                        }
                      else
diff --git a/gcc/value-prof.c b/gcc/value-prof.c
index cc3542f0295..f64f515c1ee 100644
--- a/gcc/value-prof.c
+++ b/gcc/value-prof.c
@@ -94,31 +94,31 @@ along with GCC; see the file COPYING3.  If not see
    Limitations / FIXME / TODO:
    * Only one histogram of each type can be associated with a statement.
    * Some value profile transformations are done in builtins.c (?!)
    * Updating of histograms needs some TLC.
    * The value profiling code could be used to record analysis results
      from non-profiling (e.g. VRP).
    * Adding new profilers should be simplified, starting with a cleanup
      of what-happens-where and with making gimple_find_values_to_profile
      and gimple_value_profile_transformations table-driven, perhaps...
 */
 
 static bool gimple_divmod_fixed_value_transform (gimple_stmt_iterator *);
 static bool gimple_mod_pow2_value_transform (gimple_stmt_iterator *);
 static bool gimple_mod_subtract_transform (gimple_stmt_iterator *);
 static bool gimple_stringops_transform (gimple_stmt_iterator *);
-static bool gimple_ic_transform (gimple_stmt_iterator *);
+static void gimple_ic_transform (gimple_stmt_iterator *);
 
 /* Allocate histogram value.  */
 
 histogram_value
 gimple_alloc_histogram_value (struct function *fun ATTRIBUTE_UNUSED,
                              enum hist_type type, gimple *stmt, tree value)
 {
    histogram_value hist = (histogram_value) xcalloc (1, sizeof (*hist));
    hist->hvalue.value = value;
    hist->hvalue.stmt = stmt;
    hist->type = type;
    return hist;
 }
 
 /* Hash value for histogram.  */
@@ -604,42 +604,44 @@ gimple_value_profile_transformations (void)
              fprintf (dump_file, "Trying transformations on stmt ");
              print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
              dump_histograms_for_stmt (cfun, dump_file, stmt);
            }
 
          /* Transformations:  */
          /* The order of things in this conditional controls which
             transformation is used when more than one is applicable.  */
          /* It is expected that any code added by the transformations
             will be added before the current statement, and that the
             current statement remain valid (although possibly
             modified) upon return.  */
          if (gimple_mod_subtract_transform (&gsi)
              || gimple_divmod_fixed_value_transform (&gsi)
              || gimple_mod_pow2_value_transform (&gsi)
-             || gimple_stringops_transform (&gsi)
-             || gimple_ic_transform (&gsi))
+             || gimple_stringops_transform (&gsi))
            {
              stmt = gsi_stmt (gsi);
              changed = true;
              /* Original statement may no longer be in the same block. */
              if (bb != gimple_bb (stmt))
                {
                  bb = gimple_bb (stmt);
                  gsi = gsi_for_stmt (stmt);
                }
            }
+
+         /* The function never transforms a GIMPLE statement.  */
+         gimple_ic_transform (&gsi);
         }
     }
 
   return changed;
 }
 
 /* Generate code for transformation 1 (with parent gimple assignment
    STMT and probability of taking the optimal path PROB, which is
    equivalent to COUNT/ALL within roundoff error).  This generates the
    result into a temp and returns the temp; it does not replace or
    alter the original STMT.  */
 
 static tree
 gimple_divmod_fixed_value (gassign *stmt, tree value, profile_probability prob,
                           gcov_type count, gcov_type all)
@@ -1374,92 +1376,97 @@ gimple_ic (gcall *icall_stmt, struct cgraph_node *direct_call,
        e = make_edge (dcall_bb, e_eh->dest, e_eh->flags);
        e->probability = e_eh->probability;
        for (gphi_iterator psi = gsi_start_phis (e_eh->dest);
             !gsi_end_p (psi); gsi_next (&psi))
          {
            gphi *phi = psi.phi ();
            SET_USE (PHI_ARG_DEF_PTR_FROM_EDGE (phi, e),
                     PHI_ARG_DEF_FROM_EDGE (phi, e_eh));
          }
        }
   if (!stmt_could_throw_p (cfun, dcall_stmt))
     gimple_purge_dead_eh_edges (dcall_bb);
   return dcall_stmt;
 }
 
-/*
-  For every checked indirect/virtual call determine if most common pid of
-  function/class method has probability more than 50%. If yes modify code of
-  this call to:
- */
+/* There may be multiple indirect targets in the histogram.  For every
+   indirect/virtual call, check whether the callee function exists; if it
+   does not, leave it to the LTO stage for later processing.  The indirect
+   call is finally rewritten into an if-else structure in ipa-profile.  */
 
-static bool
+static void
 gimple_ic_transform (gimple_stmt_iterator *gsi)
 {
   gcall *stmt;
   histogram_value histogram;
   gcov_type val, count, all;
   struct cgraph_node *direct_call;
 
   stmt = dyn_cast <gcall *> (gsi_stmt (*gsi));
   if (!stmt)
-    return false;
+    return;
 
   if (gimple_call_fndecl (stmt) != NULL_TREE)
-    return false;
+    return;
 
   if (gimple_call_internal_p (stmt))
-    return false;
+    return;
 
   histogram = gimple_histogram_value_of_type (cfun, stmt, HIST_TYPE_INDIR_CALL);
   if (!histogram)
-    return false;
+    return;
 
-  if (!get_nth_most_common_value (NULL, "indirect call", histogram, &val,
-                                 &count, &all))
-    return false;
+  count = 0;
+  all = histogram->hvalue.counters[0];
 
-  if (4 * count <= 3 * all)
-    return false;
+  for (unsigned j = 0; j < GCOV_TOPN_VALUES; j++)
+    {
+      if (!get_nth_most_common_value (NULL, "indirect call", histogram, &val,
+                                     &count, &all, j))
+       return;
 
-  direct_call = find_func_by_profile_id ((int)val);
+      /* Minimum probability: should be higher than 25%.  */
+      if (4 * count <= all)
+       return;
 
-  if (direct_call == NULL)
-    {
-      if (val)
+      direct_call = find_func_by_profile_id ((int) val);
+
+      if (direct_call == NULL)
        {
-         if (dump_enabled_p ())
-           dump_printf_loc (MSG_MISSED_OPTIMIZATION, stmt,
-                            "Indirect call -> direct call from other "
-                            "module %T=> %i (will resolve only with LTO)\n",
-                            gimple_call_fn (stmt), (int)val);
+         if (val)
+           {
+             if (dump_enabled_p ())
+               dump_printf_loc (
+                 MSG_MISSED_OPTIMIZATION, stmt,
+                 "Indirect call -> direct call from other "
+                 "module %T=> %i (will resolve only with LTO)\n",
+                 gimple_call_fn (stmt), (int) val);
+           }
+         return;
        }
-      return false;
-    }
 
-  if (dump_enabled_p ())
-    {
-      dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, stmt,
-                      "Indirect call -> direct call "
-                      "%T => %T transformation on insn postponed\n",
-                      gimple_call_fn (stmt), direct_call->decl);
-      dump_printf_loc (MSG_NOTE, stmt,
-                      "hist->count %" PRId64
-                      " hist->all %" PRId64"\n", count, all);
+      if (dump_enabled_p ())
+       {
+         dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, stmt,
+                          "Indirect call -> direct call "
+                          "%T => %T transformation on insn postponed\n",
+                          gimple_call_fn (stmt), direct_call->decl);
+         dump_printf_loc (MSG_NOTE, stmt,
+                          "hist->count %" PRId64 " hist->all %" PRId64 "\n",
+                          count, all);
+       }
     }
-
-  return true;
 }
 
 /* Return true if the stringop CALL shall be profiled.  SIZE_ARG be
    set to the argument index for the size of the string operation.  */
 
 static bool
 interesting_stringop_to_profile_p (gcall *call, int *size_arg)
 {
   enum built_in_function fcode;
 
   fcode = DECL_FUNCTION_CODE (gimple_call_fndecl (call));
   switch (fcode)
     {
      case BUILT_IN_MEMCPY:
      case BUILT_IN_MEMPCPY:
diff --git a/gcc/value-prof.h b/gcc/value-prof.h
index 77c06f60096..b3eeb57d37d 100644
--- a/gcc/value-prof.h
+++ b/gcc/value-prof.h
@@ -77,31 +77,30 @@ histogram_value gimple_alloc_histogram_value (struct function *, enum hist_type,
                                              tree value = NULL_TREE);
 histogram_value gimple_histogram_value (struct function *, gimple *);
 histogram_value gimple_histogram_value_of_type (struct function *, gimple *,
                                                enum hist_type);
 void gimple_add_histogram_value (struct function *, gimple *, histogram_value);
 void dump_histograms_for_stmt (struct function *, FILE *, gimple *);
 void gimple_remove_histogram_value (struct function *, gimple *, histogram_value);
 void gimple_remove_stmt_histograms (struct function *, gimple *);
 void gimple_duplicate_stmt_histograms (struct function *, gimple *,
                                       struct function *, gimple *);
 void gimple_move_stmt_histograms (struct function *, gimple *, gimple *);
 void verify_histograms (void);
 void free_histograms (function *);
 void stringop_block_profile (gimple *, unsigned int *, HOST_WIDE_INT *);
 gcall *gimple_ic (gcall *, struct cgraph_node *, profile_probability);
-bool check_ic_target (gcall *, struct cgraph_node *);
 bool get_nth_most_common_value (gimple *stmt, const char *counter_type,
                                histogram_value hist, gcov_type *value,
                                gcov_type *count, gcov_type *all,
                                unsigned n = 0);
 
 /* In tree-profile.c.  */
 extern void gimple_init_gcov_profiler (void);
 extern void gimple_gen_edge_profiler (int, edge);
 extern void gimple_gen_interval_profiler (histogram_value, unsigned);
 extern void gimple_gen_pow2_profiler (histogram_value, unsigned);
 extern void gimple_gen_topn_values_profiler (histogram_value, unsigned);
 extern void gimple_gen_ic_profiler (histogram_value, unsigned);
 extern void gimple_gen_ic_func_profiler (void);
 extern void gimple_gen_time_profiler (unsigned);
 extern void gimple_gen_average_profiler (histogram_value, unsigned);
-- 
2.21.0.777.g83232e3864

