Re: [GIT PULL 00/25] perf/core improvements and fixes
* Arnaldo Carvalho de Melo wrote:

> Hi Ingo,
>
>         Please consider pulling,
>
> - Arnaldo
>
> Test results at the end of this message, as usual.
>
> The following changes since commit 007b811b4041989ec2dc91b9614aa2c41332723e:
>
>   Merge tag 'perf-core-for-mingo-4.13-20170719' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2017-06-20 10:49:08 +0200)
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.13-20170621
>
> for you to fetch changes up to 701516ae3dec801084bc913d21e03fce15c61a0b:
>
>   perf script: Fix message because field list option is -F not -f (2017-06-21 11:35:53 -0300)
>
>
> perf/core improvements and fixes:
>
> New features:
>
> - Add support to measure SMI cost in 'perf stat' (Kan Liang)
>
> - Add support for unwinding callchains in powerpc with libdw (Paolo Bonzini)
>
> Fixes:
>
> - Fix message: cpu list option is -C not -c (Adrian Hunter)
>
> - Fix 'perf script' message: field list option is -F not -f (Adrian Hunter)
>
> - Intel PT fixes: (Adrian Hunter)
>
>   o Fix missing stack clear
>   o Ensure IP is zero when state is INTEL_PT_STATE_NO_IP
>   o Fix last_ip usage
>   o Ensure never to set 'last_ip' when packet 'count' is zero
>   o Clear FUP flag on error
>   o Fix transactions_sample_type
>
> Infrastructure:
>
> - Intel PT cleanups/refactorings (Adrian Hunter)
>
>   o Use FUP always when scanning for an IP
>   o Add missing __fallthrough
>   o Remove redundant initial_skip checks
>   o Allow decoding with branch tracing disabled
>   o Add default config for pass-through branch enable
>   o Add documentation for new config terms
>   o Add decoder support for ptwrite and power event packets
>   o Add reserved byte to CBR packet payload
>   o Add decoder support for CBR events
>
> - Move find_process() to the only place that uses it, skimming some
>   more fat from util.[ch] (Arnaldo Carvalho de Melo)
>
> - Do parameter validation earlier on fetch_kernel_version()
>   (Arnaldo Carvalho de Melo)
>
> - Remove unused _ALL_SOURCE define (Arnaldo Carvalho de Melo)
>
> - Add sysfs__write_int function (Kan Liang)
>
> Signed-off-by: Arnaldo Carvalho de Melo
>
>
> Adrian Hunter (19):
>       perf intel-pt: Move decoder error setting into one condition
>       perf intel-pt: Improve sample timestamp
>       perf intel-pt: Fix missing stack clear
>       perf intel-pt: Ensure IP is zero when state is INTEL_PT_STATE_NO_IP
>       perf intel-pt: Fix last_ip usage
>       perf intel-pt: Ensure never to set 'last_ip' when packet 'count' is zero
>       perf intel-pt: Use FUP always when scanning for an IP
>       perf intel-pt: Clear FUP flag on error
>       perf intel-pt: Add missing __fallthrough
>       perf intel-pt: Allow decoding with branch tracing disabled
>       perf intel-pt: Add default config for pass-through branch enable
>       perf intel-pt: Add documentation for new config terms
>       perf intel-pt: Add decoder support for ptwrite and power event packets
>       perf intel-pt: Add reserved byte to CBR packet payload
>       perf intel-pt: Add decoder support for CBR events
>       perf intel-pt: Remove redundant initial_skip checks
>       perf intel-pt: Fix transactions_sample_type
>       perf tools: Fix message because cpu list option is -C not -c
>       perf script: Fix message because field list option is -F not -f
>
> Arnaldo Carvalho de Melo (3):
>       perf evsel: Adopt find_process()
>       perf tools: Do parameter validation earlier on fetch_kernel_version()
>       perf tools: Remove unused _ALL_SOURCE define
>
> Kan Liang (2):
>       tools lib api fs: Add sysfs__write_int function
>       perf stat: Add support to measure SMI cost
>
> Paolo Bonzini (1):
>       perf unwind: Support for powerpc
>
>  tools/lib/api/fs/fs.c                              |  30 +++
>  tools/lib/api/fs/fs.h                              |   4 +
>  tools/perf/Documentation/intel-pt.txt              |  36 +++
>  tools/perf/Documentation/perf-stat.txt             |  14 +
>  tools/perf/Makefile.config                         |   2 +-
>  tools/perf/arch/powerpc/util/Build                 |   2 +
>  tools/perf/arch/powerpc/util/unwind-libdw.c        |  73 ++
>  tools/perf/arch/x86/util/intel-pt.c                |   5 +
>  tools/perf/builtin-script.c                        |   2 +-
>  tools/perf/builtin-stat.c                          |  49
>  tools/perf/util/evsel.c                            |  39 +++
>  .../perf/util/intel-pt-decoder/intel-pt-decoder.c  | 290 +++--
>  .../perf/util/intel-pt-decoder/intel-pt-decoder.h  |  13 +
>  .../util/intel-pt-decoder/intel-pt-pkt-decoder.c   | 110 +++-
>
Re: [GIT PULL 00/25] perf/core improvements and fixes
* Arnaldo Carvalho de Melo a...@infradead.org wrote:

Hi Ingo,

        Please consider pulling,

- Arnaldo

The following changes since commit 152fefa921535665f95840c08062844ab2f5593e:

  Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2013-01-31 10:20:14 +0100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux tags/perf-core-for-mingo

for you to fetch changes up to 2ac3634a7e1c8eedc961030c87c5c36ebd5bbf8e:

  perf: Document the ABI of perf sysfs entries (2013-01-31 13:07:51 -0300)

perf/core improvements and fixes:

. Make some POWER7 events available in sysfs, equivalent to what was
  done on x86, from Sukadev Bhattiprolu.

. Add event group view, from Namyung Kim:

  To use it, 'perf record' should group events when recording. Then
  'perf report' parses the saved group relation from the file header and
  prints the events together if the --group option is provided. You can
  use the 'perf evlist' command to see event group information:

    $ perf record -e '{ref-cycles,cycles}' noploop 1
    [ perf record: Woken up 2 times to write data ]
    [ perf record: Captured and wrote 0.385 MB perf.data (~16807 samples) ]

    $ perf evlist --group
    {ref-cycles,cycles}

  With this example, the default 'perf report' will show each event
  separately, like this:

    $ perf report
    ...
    # group: {ref-cycles,cycles}
    #
    # Samples: 3K of event 'ref-cycles'
    # Event count (approx.): 3153797218
    #
    # Overhead  Command      Shared Object                      Symbol
    # ........  .......  .................  ..........................
        99.84%  noploop  noploop            [.] main
         0.07%  noploop  ld-2.15.so         [.] strcmp
         0.03%  noploop  [kernel.kallsyms]  [k] timerqueue_del
         0.03%  noploop  [kernel.kallsyms]  [k] sched_clock_cpu
         0.02%  noploop  [kernel.kallsyms]  [k] account_user_time
         0.01%  noploop  [kernel.kallsyms]  [k] __alloc_pages_nodemask
         0.00%  noploop  [kernel.kallsyms]  [k] native_write_msr_safe

    # Samples: 3K of event 'cycles'
    # Event count (approx.): 3722310525
    #
    # Overhead  Command      Shared Object                      Symbol
    # ........  .......  .................  ..........................
        99.76%  noploop  noploop            [.] main
         0.11%  noploop  [kernel.kallsyms]  [k] _raw_spin_lock
         0.06%  noploop  [kernel.kallsyms]  [k] find_get_page
         0.03%  noploop  [kernel.kallsyms]  [k] sched_clock_cpu
         0.02%  noploop  [kernel.kallsyms]  [k] rcu_check_callbacks
         0.02%  noploop  [kernel.kallsyms]  [k] __current_kernel_time
         0.00%  noploop  [kernel.kallsyms]  [k] native_write_msr_safe

  In this case the event group information is shown at the end of the
  header area. Use the --group option to enable the event group view:

    $ perf report --group
    ...
    # group: {ref-cycles,cycles}
    #
    # Samples: 7K of event 'anon group { ref-cycles, cycles }'
    # Event count (approx.): 6876107743
    #
    #         Overhead  Command      Shared Object                      Symbol
    # ................  .......  .................  ..........................
        99.84%  99.76%  noploop  noploop            [.] main
         0.07%   0.00%  noploop  ld-2.15.so         [.] strcmp
         0.03%   0.00%  noploop  [kernel.kallsyms]  [k] timerqueue_del
         0.03%   0.03%  noploop  [kernel.kallsyms]  [k] sched_clock_cpu
         0.02%   0.00%  noploop  [kernel.kallsyms]  [k] account_user_time
         0.01%   0.00%  noploop  [kernel.kallsyms]  [k] __alloc_pages_nodemask
         0.00%   0.00%  noploop  [kernel.kallsyms]  [k] native_write_msr_safe
         0.00%   0.11%  noploop  [kernel.kallsyms]  [k] _raw_spin_lock
         0.00%   0.06%  noploop  [kernel.kallsyms]  [k] find_get_page
         0.00%   0.02%  noploop  [kernel.kallsyms]  [k] rcu_check_callbacks
         0.00%   0.02%  noploop  [kernel.kallsyms]  [k] __current_kernel_time

  As you can see, the Overhead column now contains both ref-cycles and
  cycles, and the header line also shows the group information:
  'anon group { ref-cycles, cycles }'. The output is sorted by the period
  of the group leader first.

  If the perf.data file doesn't contain group information, the --group
  option does nothing. If you want to enable the event group view by
  default, you can set it in the ~/.perfconfig file:

    $ cat ~/.perfconfig
    [report]
    group = true

  It can be overridden on the command line if you want:

    $ perf report --no-group

Signed-off-by: Arnaldo Carvalho de Melo a...@redhat.com

Arnaldo Carvalho de Melo (2):
      perf top: Stop