v4: Update according to Andi's comments. The requirement is not to display
    the number of removed loops but the average number of iterations.
    The number of iterations is computed by counting the removed loops.

v3: 1. Display the count of TSX aborts and remove the abort percentage.

    2. The branch history code has a loop detection that removes small
       loops in util/machine.c:remove_loops(). It would be nice to note
       how many loops were removed, so a note is added to some
       callchain entries.
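
As a rough illustration of the loop handling described above, here is a
minimal sketch. It is not the code in util/machine.c:remove_loops(); the
struct, MAX_LOOP_LEN and collapse_loops() are hypothetical names used only
for this example. It collapses back-to-back repeats of a short branch
sequence and counts how many repeats were dropped, which is the kind of
number an iteration estimate can be derived from.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an LBR entry: addresses only. */
struct branch {
	uint64_t from;
	uint64_t to;
};

/* Illustrative limit on what counts as a "small" loop (an assumption). */
#define MAX_LOOP_LEN 4

/* Return 1 if the two runs of len branches are identical, else 0. */
static int same_run(const struct branch *a, const struct branch *b, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (a[i].from != b[i].from || a[i].to != b[i].to)
			return 0;
	return 1;
}

/*
 * Collapse immediately repeated runs of up to MAX_LOOP_LEN branches in
 * place. Returns the new number of entries; *removed is set to the
 * number of repeated runs (loop iterations) that were dropped.
 */
static size_t collapse_loops(struct branch *br, size_t nr, unsigned int *removed)
{
	size_t i = 0;

	*removed = 0;
	while (i < nr) {
		int collapsed = 0;

		for (size_t len = 1; len <= MAX_LOOP_LEN && i + 2 * len <= nr; len++) {
			if (!same_run(&br[i], &br[i + len], len))
				continue;
			/* Drop the second copy of the run and retry at i. */
			memmove(&br[i + len], &br[i + 2 * len],
				(nr - i - 2 * len) * sizeof(*br));
			nr -= len;
			(*removed)++;
			collapsed = 1;
			break;
		}
		if (!collapsed)
			i++;
	}
	return nr;
}
```

With a history of A B A B A B C, the sketch leaves A B C and reports two
removed repeats.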

v2: Just a rebase to Arnaldo's perf/core branch, no functional changes.

Initial post

perf record -g -b ...
perf report --branch-history

Currently it only shows the branches from the LBR in the callgraph view.
It would be useful to also annotate branch predictions, TSX aborts and
timed LBR cycles in the callgraph view.

This would allow a quick overview of how well branches are predicted and
how costly basic blocks are.

For example:

# Overhead  Source:Line                  Symbol                       Shared Object      Predicted  Abort  Cycles
# ........  ...........................  ...........................  .................  .........  .....  ......
#
    38.25%  div.c:45                     [.] main                     div                    97.6%      0      3
            |
            ---main div.c:42 (cycles:2)
               compute_flag div.c:28 (cycles:2)
               compute_flag div.c:27 (cycles:1)
               rand rand.c:28 (cycles:1)
               rand rand.c:28 (cycles:1)
               __random random.c:298 (cycles:1)
               __random random.c:297 (cycles:1)
               __random random.c:295 (cycles:1)
               __random random.c:295 (cycles:1)
               __random random.c:295 (cycles:1)
               __random random.c:295 (cycles:9)
               |
               |--36.73%--__random_r random_r.c:392 (cycles:9)
               |          __random_r random_r.c:357 (cycles:1)
               |          __random random.c:293 (cycles:1)
               |          __random random.c:293 (cycles:1)
               |          __random random.c:291 (cycles:1)
               |          __random random.c:291 (cycles:1)
               |          __random random.c:291 (cycles:1)
               |          __random random.c:288 (cycles:1)
               |          rand rand.c:27 (cycles:1)
               |          rand rand.c:26 (cycles:1)
               |          rand@plt +4194304 (cycles:1)
               |          rand@plt +4194304 (cycles:1)
               |          compute_flag div.c:25 (cycles:1)
               |          compute_flag div.c:22 (cycles:1)
               |          main div.c:40 (cycles:1)
               |          main div.c:40 (cycles:16)
               |          main div.c:39 (cycles:16)
               |          |
               |          |--29.93%--main div.c:39 (predicted:50.6%, cycles:1, iterations:18)
               |          |          main div.c:44 (predicted:50.6%, cycles:1)
               |          |          |
               |          |           --22.69%--main div.c:42 (cycles:2, iterations:17)
               |          |                     compute_flag div.c:28 (cycles:2)
               |          |                     |
               |          |                      --10.52%--compute_flag div.c:27 (cycles:1)
               |          |                                rand rand.c:28 (cycles:1)
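
To make the annotations in the output above concrete, here is a minimal
sketch of how per-entry values such as predicted, cycles and iterations
could be aggregated. It is an assumption for illustration only, not the
code in this series; struct branch_stats and all of its fields and
helpers are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical per-callchain-entry counters; names are illustrative. */
struct branch_stats {
	uint64_t nr_branches;		/* samples that hit this entry */
	uint64_t nr_predicted;		/* samples flagged as predicted */
	uint64_t nr_aborts;		/* TSX aborts, kept as a plain count */
	uint64_t cycles_sum;		/* sum of timed LBR cycle counts */
	uint64_t loops_removed;		/* loop repeats dropped by loop removal */
	uint64_t nr_loop_samples;	/* samples in which a loop was removed */
};

/* Fold one sample's branch flags into the counters. */
static void stats_add(struct branch_stats *s, int predicted, int abort,
		      uint64_t cycles, uint64_t removed)
{
	s->nr_branches++;
	if (predicted)
		s->nr_predicted++;
	if (abort)
		s->nr_aborts++;
	s->cycles_sum += cycles;
	if (removed) {
		s->loops_removed += removed;
		s->nr_loop_samples++;
	}
}

/* Percentage of samples whose branch was predicted. */
static double predicted_pct(const struct branch_stats *s)
{
	return s->nr_branches ? 100.0 * s->nr_predicted / s->nr_branches : 0.0;
}

/* Average cycles per sample (integer division). */
static uint64_t avg_cycles(const struct branch_stats *s)
{
	return s->nr_branches ? s->cycles_sum / s->nr_branches : 0;
}

/* Average iterations, derived from the number of removed loop repeats. */
static uint64_t avg_iterations(const struct branch_stats *s)
{
	return s->nr_loop_samples ? s->loops_removed / s->nr_loop_samples : 0;
}
```

In this sketch the abort value stays a raw count rather than a
percentage, matching the v3 change, and the iteration average only
considers samples in which a loop was actually removed; that last choice
is a design assumption of the example.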

Jin Yao (6):
  perf report: Add branch flag to callchain cursor node
  perf report: Create a symbol_conf flag for showing branch flag
    counting
  perf report: Calculate and return the branch flag counting
  perf report: Show branch info in callchain entry for stdio mode
  perf report: Show branch info in callchain entry for browser mode
  perf report: Display columns Predicted/Abort/Cycles in
    --branch-history

 tools/perf/Documentation/perf-report.txt |   9 ++
 tools/perf/builtin-report.c              |   9 +-
 tools/perf/ui/browsers/hists.c           |  20 ++-
 tools/perf/ui/stdio/hist.c               |  35 +++++-
 tools/perf/util/callchain.c              | 203 ++++++++++++++++++++++++++++++-
 tools/perf/util/callchain.h              |  22 +++-
 tools/perf/util/hist.c                   |   3 +
 tools/perf/util/hist.h                   |   3 +
 tools/perf/util/machine.c                |  82 ++++++++++---
 tools/perf/util/sort.c                   | 113 ++++++++++++++++-
 tools/perf/util/sort.h                   |   3 +
 tools/perf/util/symbol.h                 |   1 +
 12 files changed, 476 insertions(+), 27 deletions(-)

-- 
2.7.4
