[GIT PULL 00/30] perf/core improvements and fixes

2019-03-11 Thread Arnaldo Carvalho de Melo
Hi Ingo,

Please consider pulling,

Best regards,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit b339da480315505aa28a723a983217ebcff95c86:

  Merge tag 'perf-core-for-mingo-5.1-20190307' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent (2019-03-09 17:00:17 +0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-5.1-20190311

for you to fetch changes up to dfcbc2f2994b8a3af3605a26dc29c07ad7378bf4:

  tools lib bpf: Fix the build by adding a missing stdarg.h include (2019-03-11 17:14:31 -0300)


perf/core improvements and fixes:

kernel:

  Stephane Eranian:

  - Restore mmap record type correctly when handling PERF_RECORD_MMAP2
    events, as the same template is used for all the threads interested
    in mmap events, some may want just PERF_RECORD_MMAP, while some
    may want the extra info in MMAP2 records.

perf probe:

  Adrian Hunter:

  - Fix getting the kernel map: since the changes related to x86 PTI
    entry trampoline handling, there is more than one kernel map.

perf script:

  Andi Kleen:

  - Support insn output for normal samples, i.e.:

      perf script -F ip,sym,insn --xed

    This fetches the sample IP from the thread address space and feeds it
    to Intel's XED disassembler, producing lines such as:

      a4068804 native_write_msr            wrmsr
      a415b95e __hrtimer_next_event_base   movq  0x18(%rax), %rdx

    This matches 'perf annotate's output.

  - Make the --cpu filter apply to PERF_RECORD_COMM/FORK/... events, in
    addition to PERF_RECORD_SAMPLE.

perf report:

  - Add a new --samples option to save a small random number of samples
    per hist entry, using a reservoir technique to select a representative
    number of samples.

    Then allow browsing the samples using 'perf script' as part of the hist
    entry context menu. This automatically adds the right filters, so only
    the thread or CPU of the sample is displayed. Then we use less' search
    functionality to directly jump to the time stamp of the selected sample.

    It uses different menus for assembler and source display. Assembler
    needs xed installed and source needs debuginfo.

  - Fix the UI browser scripts pop up menu when there are many scripts
    available.
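
As an illustration of the "reservoir technique" mentioned for --samples, here
is a minimal sketch of the textbook reservoir sampling algorithm (Algorithm R).
It is not taken from the perf sources; the function name and parameters are
hypothetical, and perf's actual implementation may differ in detail.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Keep at most k items, chosen uniformly at random, from an iterable
    whose length is not known in advance (classic Algorithm R).

    Hypothetical helper for illustration only, not perf code.
    """
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir
```

The point of the technique is that memory stays bounded at k entries per hist
entry no matter how many samples are seen, while every sample still has the
same chance of being kept.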

perf report:

  Andi Kleen:

  - Add 'time' sort option. E.g.:

% perf report --sort time,overhead,symbol --time-quantum 1ms --stdio
...
 0.67%  277061.87300  [.] _dl_start
 0.50%  277061.87300  [.] f1
 0.50%  277061.87300  [.] f2
 0.33%  277061.87300  [.] main
 0.29%  277061.87300  [.] _dl_lookup_symbol_x
 0.29%  277061.87300  [.] dl_main
 0.29%  277061.87300  [.] do_lookup_x
 0.17%  277061.87300  [.] _dl_debug_initialize
 0.17%  277061.87300  [.] _dl_init_paths
 0.08%  277061.87300  [.] check_match
 0.04%  277061.87300  [.] _dl_count_modids
 1.33%  277061.87400  [.] f1
 1.33%  277061.87400  [.] f2
 1.33%  277061.87400  [.] main
 1.17%  277061.87500  [.] main
 1.08%  277061.87500  [.] f1
 1.08%  277061.87500  [.] f2
 1.00%  277061.87600  [.] main
 0.83%  277061.87600  [.] f1
 0.83%  277061.87600  [.] f2
 1.00%  277061.87700  [.] main
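
The grouping seen in the output above can be sketched as follows: each sample
time stamp is mapped to the start of its time quantum, and samples are then
counted per (quantum, symbol) pair. This is only a sketch under the assumption
that time stamps are in nanoseconds and are rounded down; the names are
hypothetical and are not perf's internal ones.

```python
from collections import Counter

NSEC_PER_MSEC = 1_000_000


def quantize(timestamp_ns, quantum_ns):
    """Round a time stamp down to the start of its quantum (assumed policy)."""
    return timestamp_ns - timestamp_ns % quantum_ns


def count_by_time_and_symbol(samples, quantum_ns=NSEC_PER_MSEC):
    """Count samples per (quantized time, symbol) pair.

    `samples` is an iterable of (timestamp_ns, symbol) tuples.
    """
    counts = Counter()
    for timestamp_ns, symbol in samples:
        counts[(quantize(timestamp_ns, quantum_ns), symbol)] += 1
    return counts
```

With a 1ms quantum, two samples of f1 taken 0.8ms apart inside the same
millisecond end up in one bucket, which is why the same symbol shows up once
per time slot in the listing.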

tools headers:

  Arnaldo Carvalho de Melo:

  - Update x86's syscall_64.tbl, no change in tools/perf behaviour.

  - Sync copies of asm-generic/unistd.h and linux/in with the kernel sources.

perf data:

  Jiri Olsa:

  - Prep work to support having perf.data stored as a directory, with one
    file per CPU, that ultimately will allow having one ring buffer reading
    thread per CPU.

Vendor events:

  Martin Liška:

  - perf PMU events for AMD Family 17h.

perf script python:

  Tony Jones:

  - Add python3 support for the remaining Intel PT related scripts, with
    these we should have a clean build of perf with python3 while still
    supporting the build with python2.

libbpf:

  Arnaldo Carvalho de Melo:

  - Fix the build on uClibc, adding the missing stdarg.h since we use
    va_list in one typedef.

Signed-off-by: Arnaldo Carvalho de Melo 


Adrian Hunter (1):
  perf probe: Fix getting the kernel map

Andi Kleen (14):
  perf script: Support insn output for normal samples
  perf report: Support output in nanoseconds
  perf time-utils: Add utility function to print time stamps in nanoseconds
  perf report: Parse time quantum
  perf report: Use less for scripts output
  perf script: Filter COMM/FORK/.. events by CPU
  perf report: Support time sort key
  perf report: Support running scripts for current time range
  perf report: Support builtin perf script in scripts menu
 

Re: [GIT PULL 00/30] perf/core improvements and fixes

2017-07-01 Thread Ingo Molnar

* Arnaldo Carvalho de Melo  wrote:

> Hi Ingo,
> 
>   Please consider pulling,
> 
> - Arnaldo
> 
> Test results at the end of this message, as usual.
> 
> The following changes since commit 8e70e8409102a37ab066bd91007b75fd5d113931:
> 
>   Merge tag 'perf-core-for-mingo-4.13-20170621' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2017-06-21 20:11:53 +0200)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.13-20170630
> 
> for you to fetch changes up to 644e0840ad4615e032d67adec6ee60f821b669fe:
> 
>   perf auxtrace: Add CPU filter support (2017-06-30 11:50:55 -0300)
> 
> 
> perf/core improvements and fixes:
> 
> Intel PT:
> 
> - Support "ptwrite" instruction, a way to stuff 32 or 64 bit values into
>   the Intel PT trace (Adrian Hunter)
> 
> - Support power events in Intel PT to report changes to C-state (Adrian
>   Hunter)
> 
> - Synthesize Intel PT events as PERF_RECORD_SAMPLE records with a
>   perf_event_attr.type (PERF_TYPE_SYNTH) just after the range used by the
>   kernel, i.e. right after what is allocated for PMUs, at INT_MAX + 1U,
>   attr.config will have the identification for the synthesized event and
>   the PERF_SAMPLE_RAW payload will have its fields (Adrian Hunter)
> 
> Infrastructure:
> 
> - Remove warning() and error(), using instead pr_warning() and
>   pr_err(), consolidating error reporting (Arnaldo Carvalho de Melo)
> 
> - Add platform dependency to 'perf test 15' (Thomas Richter)
> 
> Signed-off-by: Arnaldo Carvalho de Melo 
> 
> 
> Adrian Hunter (19):
>   x86/insn: perf tools: Add new ptwrite instruction
>   perf script: Add 'synth' event type for synthesized events
>   tools include: Add byte-swapping macros to kernel.h
>   perf auxtrace: Add itrace option to output ptwrite events
>   perf auxtrace: Add itrace option to output power events
>   perf script: Add 'synth' field for synthesized event payloads
>   perf script: Add synthesized Intel PT power and ptwrite events
>   perf intel-pt: Factor out common code synthesizing event samples
>   perf intel-pt: Remove unused instructions_sample_period
>   perf intel-pt: Join needlessly wrapped lines
>   perf intel-pt: Tidy Intel PT evsel lookup into separate function
>   perf intel-pt: Tidy messages into called function intel_pt_synth_event()
>   perf intel-pt: Factor out intel_pt_set_event_name()
>   perf intel-pt: Move code in intel_pt_synth_events() to simplify attr setting
>   perf intel-pt: Synthesize new power and "ptwrite" events
>   perf intel-pt: Add example script for power events and PTWRITE
>   perf intel-pt: Update documentation to include new ptwrite and power events
>   perf intel-pt: Do not use TSC packets for calculating CPU cycles to TSC
>   perf auxtrace: Add CPU filter support
> 
> Arnaldo Carvalho de Melo (9):
>   perf help: Introduce exec_failed() to avoid code duplication
>   perf help: Elliminate dup code for reporting
>   perf help: Use pr_warning()
>   perf config: Use pr_warning()
>   perf event-parse: Use pr_warning()
>   perf tools: Remove warning()
>   perf tools: Replace error() with pr_err()
>   perf config: Do not die when parsing u64 or int config values
>   perf tools: Kill die()
> 
> Colin Ian King (1):
>   perf jit: fix typo: "incalid" -> "invalid"
> 
> Thomas Richter (1):
>   perf tests: Add platform dependency to test 15
> 
>  arch/x86/lib/x86-opcode-map.txt                    |   2 +-
>  tools/include/linux/kernel.h                       |  35 +-
>  tools/objtool/arch/x86/insn/x86-opcode-map.txt     |   2 +-
>  tools/perf/Documentation/intel-pt.txt              |  42 +-
>  tools/perf/Documentation/itrace.txt                |   8 +-
>  tools/perf/Documentation/perf-script.txt           |   6 +-
>  tools/perf/arch/x86/tests/insn-x86-dat-32.c        |  12 +
>  tools/perf/arch/x86/tests/insn-x86-dat-64.c        |  30 +
>  tools/perf/arch/x86/tests/insn-x86-dat-src.c       |  30 +
>  tools/perf/builtin-c2c.c                           |   4 +-
>  tools/perf/builtin-diff.c                          |   5 +-
>  tools/perf/builtin-help.c                          |  48 +-
>  tools/perf/builtin-kmem.c                          |   4 +-
>  tools/perf/builtin-record.c                        |   4 +-
>  tools/perf/builtin-report.c                        |   8 +-
>  tools/perf/builtin-sched.c                         |   2 +-
>  tools/perf/builtin-script.c                        | 205 ++-
>  tools/perf/builtin-stat.c                          |   4 +-
>  tools/perf/builtin-top.c                           |   2 +-
>  tools/perf/jvmti/jvmti_agent.c                     |   2 +-
>  .../perf/scripts/python/bin/intel-pt-events-record |  

[GIT PULL 00/30] perf/core improvements and fixes

2017-06-30 Thread Arnaldo Carvalho de Melo
Hi Ingo,

Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit 8e70e8409102a37ab066bd91007b75fd5d113931:

  Merge tag 'perf-core-for-mingo-4.13-20170621' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2017-06-21 20:11:53 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.13-20170630

for you to fetch changes up to 644e0840ad4615e032d67adec6ee60f821b669fe:

  perf auxtrace: Add CPU filter support (2017-06-30 11:50:55 -0300)


perf/core improvements and fixes:

Intel PT:

- Support "ptwrite" instruction, a way to stuff 32 or 64 bit values into
  the Intel PT trace (Adrian Hunter)

- Support power events in Intel PT to report changes to C-state (Adrian
  Hunter)

- Synthesize Intel PT events as PERF_RECORD_SAMPLE records with a
  perf_event_attr.type (PERF_TYPE_SYNTH) just after the range used by the
  kernel, i.e. right after what is allocated for PMUs, at INT_MAX + 1U,
  attr.config will have the identification for the synthesized event and
  the PERF_SAMPLE_RAW payload will have its fields (Adrian Hunter)
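
To make the "INT_MAX + 1U" remark concrete, the arithmetic can be worked
through as below. This is only a sketch: it assumes a 32-bit int, and the
helper name is hypothetical rather than anything from the perf sources.

```python
# Assumption: 'int' is 32 bits wide, so INT_MAX is 2**31 - 1.
INT_MAX = 2**31 - 1
UINT_MASK = 2**32 - 1

# INT_MAX + 1U evaluated as an unsigned 32-bit value.
PERF_TYPE_SYNTH = (INT_MAX + 1) & UINT_MASK


def is_synth_type(attr_type):
    """Hypothetical helper: true if attr.type marks a synthesized event."""
    return attr_type == PERF_TYPE_SYNTH
```

Under that assumption PERF_TYPE_SYNTH is 0x80000000, the first value that no
longer fits in a signed int, which is what keeps it clear of the type values
handed out for PMUs.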

Infrastructure:

- Remove warning() and error(), using instead pr_warning() and
  pr_err(), consolidating error reporting (Arnaldo Carvalho de Melo)

- Add platform dependency to 'perf test 15' (Thomas Richter)

Signed-off-by: Arnaldo Carvalho de Melo 


Adrian Hunter (19):
  x86/insn: perf tools: Add new ptwrite instruction
  perf script: Add 'synth' event type for synthesized events
  tools include: Add byte-swapping macros to kernel.h
  perf auxtrace: Add itrace option to output ptwrite events
  perf auxtrace: Add itrace option to output power events
  perf script: Add 'synth' field for synthesized event payloads
  perf script: Add synthesized Intel PT power and ptwrite events
  perf intel-pt: Factor out common code synthesizing event samples
  perf intel-pt: Remove unused instructions_sample_period
  perf intel-pt: Join needlessly wrapped lines
  perf intel-pt: Tidy Intel PT evsel lookup into separate function
  perf intel-pt: Tidy messages into called function intel_pt_synth_event()
  perf intel-pt: Factor out intel_pt_set_event_name()
  perf intel-pt: Move code in intel_pt_synth_events() to simplify attr setting
  perf intel-pt: Synthesize new power and "ptwrite" events
  perf intel-pt: Add example script for power events and PTWRITE
  perf intel-pt: Update documentation to include new ptwrite and power events
  perf intel-pt: Do not use TSC packets for calculating CPU cycles to TSC
  perf auxtrace: Add CPU filter support

Arnaldo Carvalho de Melo (9):
  perf help: Introduce exec_failed() to avoid code duplication
  perf help: Elliminate dup code for reporting
  perf help: Use pr_warning()
  perf config: Use pr_warning()
  perf event-parse: Use pr_warning()
  perf tools: Remove warning()
  perf tools: Replace error() with pr_err()
  perf config: Do not die when parsing u64 or int config values
  perf tools: Kill die()

Colin Ian King (1):
  perf jit: fix typo: "incalid" -> "invalid"

Thomas Richter (1):
  perf tests: Add platform dependency to test 15

 arch/x86/lib/x86-opcode-map.txt                    |   2 +-
 tools/include/linux/kernel.h                       |  35 +-
 tools/objtool/arch/x86/insn/x86-opcode-map.txt     |   2 +-
 tools/perf/Documentation/intel-pt.txt              |  42 +-
 tools/perf/Documentation/itrace.txt                |   8 +-
 tools/perf/Documentation/perf-script.txt           |   6 +-
 tools/perf/arch/x86/tests/insn-x86-dat-32.c        |  12 +
 tools/perf/arch/x86/tests/insn-x86-dat-64.c        |  30 +
 tools/perf/arch/x86/tests/insn-x86-dat-src.c       |  30 +
 tools/perf/builtin-c2c.c                           |   4 +-
 tools/perf/builtin-diff.c                          |   5 +-
 tools/perf/builtin-help.c                          |  48 +-
 tools/perf/builtin-kmem.c                          |   4 +-
 tools/perf/builtin-record.c                        |   4 +-
 tools/perf/builtin-report.c                        |   8 +-
 tools/perf/builtin-sched.c                         |   2 +-
 tools/perf/builtin-script.c                        | 205 ++-
 tools/perf/builtin-stat.c                          |   4 +-
 tools/perf/builtin-top.c                           |   2 +-
 tools/perf/jvmti/jvmti_agent.c                     |   2 +-
 .../perf/scripts/python/bin/intel-pt-events-record |  13 +
 .../perf/scripts/python/bin/intel-pt-events-report |   3 +
 tools/perf/scripts/python/intel-pt-events.py       | 128 +
 tools/perf/tests/attr.c                            |  10 +-
 tools/perf/tests/attr.py                           |  48 ++
 tools

Re: [GIT PULL 00/30] perf/core improvements and fixes

2016-04-27 Thread Ingo Molnar

* Arnaldo Carvalho de Melo  wrote:

> From: Arnaldo Carvalho de Melo 
> 
> Hi Ingo,
> 
>   Please consider pulling,
> 
> - Arnaldo
> 
> The following changes since commit 67d61296ffcc850bffdd4466430cb91e5328f39a:
> 
>   Merge tag 'perf-core-for-mingo-20160419' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux (2016-04-23 14:50:39 +0200)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20160427
> 
> for you to fetch changes up to 4cb93446c587d56e2a54f4f83113daba2c0b6dee:
> 
>   perf tools: Set the maximum allowed stack from /proc/sys/kernel/perf_event_max_stack (2016-04-27 10:29:07 -0300)
> 
> 
> perf/core improvements and fixes:
> 
> User visible:
> 
> - perf trace --pf maj/min/all works with --call-graph: (Arnaldo Carvalho de Melo)
>
>   Tracing write syscalls and major page faults with callchains while starting
>   firefox, limiting the stack to 5 frames:
>
>    # perf trace -e write --pf maj --max-stack 5 firefox
>      589.549 ( 0.014 ms): firefox/15377 write(fd: 4, buf: 0x7fff80acc898, count: 151) = 151
>                                         [0xfaed] (/usr/lib64/libpthread-2.22.so)
>                                         fire_glxtest_process+0x5c (/usr/lib64/firefox/libxul.so)
>                                         InstallGdkErrorHandler+0x41 (/usr/lib64/firefox/libxul.so)
>                                         XREMain::XRE_mainInit+0x12c (/usr/lib64/firefox/libxul.so)
>                                         XREMain::XRE_main+0x1e4 (/usr/lib64/firefox/libxul.so)
>      760.704 ( 0.000 ms): firefox/15332 majfault [gtk_tree_view_accessible_get_type+0x0] => /usr/lib64/libgtk-3.so.0.1800.9@0xa0850 (x.)
>                                         gtk_tree_view_accessible_get_type+0x0 (/usr/lib64/libgtk-3.so.0.1800.9)
>                                         gtk_tree_view_class_intern_init+0x1a54 (/usr/lib64/libgtk-3.so.0.1800.9)
>                                         g_type_class_ref+0x6dd (/usr/lib64/libgobject-2.0.so.0.4600.2)
>                                         [0x115378] (/usr/lib64/libgnutls.so.30.6.3)
>
>   This automagically selects "--call-graph dwarf"; use "--call-graph fp" on
>   systems where -fno-omit-frame-pointer was used to build the components of
>   interest, to incur less overhead, or tune "--call-graph dwarf"
>   appropriately, see 'perf record --help'.
>
> - Allow tuning the maximum stack depth via
>   /proc/sys/kernel/perf_event_max_stack, which defaults to the old hard
>   coded value of PERF_MAX_STACK_DEPTH (127). This is useful for huge
>   callstacks for things like Groovy, Ruby, etc, and also to reduce overhead
>   by limiting it to a smaller value. Upcoming work will allow this to be
>   done per-event (Arnaldo Carvalho de Melo)
>
> - Make 'perf trace --min-stack' be honoured by --pf and --event (Arnaldo Carvalho de Melo)
>
> - Make 'perf evlist -v' decode perf_event_attr->branch_sample_type (Arnaldo Carvalho de Melo)
>
>    # perf record --call lbr usleep 1
>    # perf evlist -v
>    cycles:ppp: ... sample_type: IP|TID|TIME|CALLCHAIN|PERIOD|BRANCH_STACK, ... branch_sample_type: USER|CALL_STACK|NO_FLAGS|NO_CYCLES
>    #
>
> - Clear dummy entry accumulated period, fixing such 'perf top/report' output
>   as: (Kan Liang)
>
>     4769.98%  0.01%  0.00%  0.01%  tchain_edit  [kernel] [k] update_fast_timekeeper
>
> - System calls with pid_t arguments get them augmented with the COMM event
>   more thoroughly:
>
>   # trace -e perf_event_open perf stat -e cycles -p 15608
>    6.876 ( 0.014 ms): perf_event_open(attr_uptr: 0x2ae20d8, pid: 15608 (hexchat), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3
>    6.882 ( 0.005 ms): perf_event_open(attr_uptr: 0x2ae20d8, pid: 15639 (gmain), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
>    6.889 ( 0.005 ms): perf_event_open(attr_uptr: 0x2ae20d8, pid: 15640 (gdbus), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 5
> ^^
>^C
> 
> - Fix offline module name mismatch issue in 'perf probe' (Ravi Bangoria)
> 
> - Fix module probe issue if no dwarf support (Ravi Bangoria)
> 
> Assorted fixes:
> 
> - Fix off-by-one in write_buildid() (Andrey Ryabinin)
> 
> - Fix segfault when printing callchains in 'perf script' (Chris Phlipot)
> 
> - Replace assignment with comparison on assert check in 'perf test' entry (Colin Ian King)
> 
> - Fix off-by-one comparison in intel-pt code (Colin Ian King)
> 
> - Close target file on error path in 'perf probe' (Masami Hiramatsu)
> 
> - Set default kprobe group name if not given in 'perf probe' (Masami Hiramatsu)
> 
> - Avoid partial perf_event_header reads (Wang Nan)
> 
> Infrastructure:
> 
> - Update x86's syscall_64.tbl copy, adding preadv2 & pwritev2 (Arnaldo Carvalho de Melo)
> 
> - Make the x86 clean qui

[GIT PULL 00/30] perf/core improvements and fixes

2016-04-27 Thread Arnaldo Carvalho de Melo
From: Arnaldo Carvalho de Melo 

Hi Ingo,

Please consider pulling,

- Arnaldo

The following changes since commit 67d61296ffcc850bffdd4466430cb91e5328f39a:

  Merge tag 'perf-core-for-mingo-20160419' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux (2016-04-23 14:50:39 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20160427

for you to fetch changes up to 4cb93446c587d56e2a54f4f83113daba2c0b6dee:

  perf tools: Set the maximum allowed stack from /proc/sys/kernel/perf_event_max_stack (2016-04-27 10:29:07 -0300)


perf/core improvements and fixes:

User visible:

- perf trace --pf maj/min/all works with --call-graph: (Arnaldo Carvalho de Melo)

  Tracing write syscalls and major page faults with callchains while starting
  firefox, limiting the stack to 5 frames:

   # perf trace -e write --pf maj --max-stack 5 firefox
     589.549 ( 0.014 ms): firefox/15377 write(fd: 4, buf: 0x7fff80acc898, count: 151) = 151
                                        [0xfaed] (/usr/lib64/libpthread-2.22.so)
                                        fire_glxtest_process+0x5c (/usr/lib64/firefox/libxul.so)
                                        InstallGdkErrorHandler+0x41 (/usr/lib64/firefox/libxul.so)
                                        XREMain::XRE_mainInit+0x12c (/usr/lib64/firefox/libxul.so)
                                        XREMain::XRE_main+0x1e4 (/usr/lib64/firefox/libxul.so)
     760.704 ( 0.000 ms): firefox/15332 majfault [gtk_tree_view_accessible_get_type+0x0] => /usr/lib64/libgtk-3.so.0.1800.9@0xa0850 (x.)
                                        gtk_tree_view_accessible_get_type+0x0 (/usr/lib64/libgtk-3.so.0.1800.9)
                                        gtk_tree_view_class_intern_init+0x1a54 (/usr/lib64/libgtk-3.so.0.1800.9)
                                        g_type_class_ref+0x6dd (/usr/lib64/libgobject-2.0.so.0.4600.2)
                                        [0x115378] (/usr/lib64/libgnutls.so.30.6.3)

  This automagically selects "--call-graph dwarf"; use "--call-graph fp" on
  systems where -fno-omit-frame-pointer was used to build the components of
  interest, to incur less overhead, or tune "--call-graph dwarf"
  appropriately, see 'perf record --help'.

- Allow tuning the maximum stack depth via
  /proc/sys/kernel/perf_event_max_stack, which defaults to the old hard
  coded value of PERF_MAX_STACK_DEPTH (127). This is useful for huge
  callstacks for things like Groovy, Ruby, etc, and also to reduce overhead
  by limiting it to a smaller value. Upcoming work will allow this to be
  done per-event (Arnaldo Carvalho de Melo)

- Make 'perf trace --min-stack' be honoured by --pf and --event (Arnaldo Carvalho de Melo)

- Make 'perf evlist -v' decode perf_event_attr->branch_sample_type (Arnaldo Carvalho de Melo)

   # perf record --call lbr usleep 1
   # perf evlist -v
   cycles:ppp: ... sample_type: IP|TID|TIME|CALLCHAIN|PERIOD|BRANCH_STACK, ... branch_sample_type: USER|CALL_STACK|NO_FLAGS|NO_CYCLES
   #

- Clear dummy entry accumulated period, fixing such 'perf top/report' output
  as: (Kan Liang)

    4769.98%  0.01%  0.00%  0.01%  tchain_edit  [kernel] [k] update_fast_timekeeper

- System calls with pid_t arguments get them augmented with the COMM event
  more thoroughly:

  # trace -e perf_event_open perf stat -e cycles -p 15608
   6.876 ( 0.014 ms): perf_event_open(attr_uptr: 0x2ae20d8, pid: 15608 (hexchat), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3
   6.882 ( 0.005 ms): perf_event_open(attr_uptr: 0x2ae20d8, pid: 15639 (gmain), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
   6.889 ( 0.005 ms): perf_event_open(attr_uptr: 0x2ae20d8, pid: 15640 (gdbus), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 5
^^
   ^C

- Fix offline module name mismatch issue in 'perf probe' (Ravi Bangoria)

- Fix module probe issue if no dwarf support (Ravi Bangoria)
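
For the /proc/sys/kernel/perf_event_max_stack item in the list above, a tool
could pick up the limit roughly as sketched below. This is an illustrative
sketch, not the perf implementation: the function name is hypothetical, and the
only facts taken from this message are the sysctl path and the old default
of 127.

```python
PERF_MAX_STACK_DEPTH = 127  # old hard coded default, per the text above


def read_max_stack(path="/proc/sys/kernel/perf_event_max_stack",
                   default=PERF_MAX_STACK_DEPTH):
    """Return the configured maximum stack depth, or `default` when the
    sysctl file is missing or does not hold an integer (e.g. older kernels).

    Hypothetical helper for illustration only.
    """
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return default
```

Falling back to the old constant keeps the behaviour unchanged on kernels that
do not provide the sysctl.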

Assorted fixes:

- Fix off-by-one in write_buildid() (Andrey Ryabinin)

- Fix segfault when printing callchains in 'perf script' (Chris Phlipot)

- Replace assignment with comparison on assert check in 'perf test' entry (Colin Ian King)

- Fix off-by-one comparison in intel-pt code (Colin Ian King)

- Close target file on error path in 'perf probe' (Masami Hiramatsu)

- Set default kprobe group name if not given in 'perf probe' (Masami Hiramatsu)

- Avoid partial perf_event_header reads (Wang Nan)

Infrastructure:

- Update x86's syscall_64.tbl copy, adding preadv2 & pwritev2 (Arnaldo Carvalho de Melo)

- Make the x86 clean quiet wrt syscall table removal (Jiri Olsa)

Cleanups:

- Simplify wrapper for LOCK_PI in 'perf bench futex' (Davidlohr Bueso)

- Remove duplicate const qualifier (Eric Engestrom)

Signed-off-by: Arnaldo Carvalho de Melo 


Andrey Ryabinin (1):

Re: [GIT PULL 00/30] perf/core improvements and fixes

2015-05-18 Thread Arnaldo Carvalho de Melo
On Fri, May 15, 2015 at 11:08:04AM +0900, Namhyung Kim wrote:
> Hi Arnaldo,
> 
> On Thu, May 14, 2015 at 10:18:27AM -0300, Arnaldo Carvalho de Melo wrote:
> > On Thu, May 14, 2015 at 05:23:30PM +0900, Namhyung Kim wrote:
> > > On Mon, May 11, 2015 at 11:06:26AM -0300, Arnaldo Carvalho de Melo wrote:
> > We need to improve this segfault backtrace, I have to always use
> > addr2line to resolve those missing entries, i.e. if you try:
> > 
> > addr2line -fe /path/to/your/perf 0x4dd9c8
> > addr2line -fe /path/to/your/perf 0x4e2580
> > 
> > We would have resolved those lines :-/
> 
> Right, I'll add it to my TODO list.
> 
> Anyway, this is a backtrace using gdb..
 
Ok, reproduced here:


[acme@ibm-x3650m4-01 linux]$ fg
gdb perf
list
134             if (verbose) {
135                     dso_name_l = dso_l->long_name;
136                     dso_name_r = dso_r->long_name;
137             } else {
138                     dso_name_l = dso_l->short_name;
139                     dso_name_r = dso_r->short_name;
140             }
141
142             return strcmp(dso_name_l, dso_name_r);
143     }
(gdb) p dso_l
$2 = (struct dso *) 0x1924ba0
(gdb) 
$3 = (struct dso *) 0x1924ba0
(gdb) p dso_r
$4 = (struct dso *) 0x1
(gdb) bt
#0  0x004f557b in _sort__dso_cmp (map_l=0x182ab3120, map_r=0xd5325b0) at util/sort.c:139
#1  0x004f55f1 in sort__dso_cmp (left=0x606c7f0, right=0x7fffb850) at util/sort.c:148
#2  0x004f8470 in __sort__hpp_cmp (fmt=0x1922fb0, a=0x606c7f0, b=0x7fffb850) at util/sort.c:1313
#3  0x004fc3b8 in hist_entry__cmp (left=0x606c7f0, right=0x7fffb850) at util/hist.c:911
#4  0x004fafcc in add_hist_entry (hists=0x1922d80, entry=0x7fffb850, al=0x7fffbbe0, sample_self=false) at util/hist.c:389
#5  0x004fb350 in __hists__add_entry (hists=0x1922d80, al=0x7fffbbe0, sym_parent=0x0, bi=0x0, mi=0x0, period=557536, weight=0, transaction=0, sample_self=false) at util/hist.c:471
#6  0x004fc03c in iter_add_next_cumulative_entry (iter=0x7fffbc10, al=0x7fffbbe0) at util/hist.c:797
#7  0x004fc291 in hist_entry_iter__add (iter=0x7fffbc10, al=0x7fffbbe0, evsel=0x1922c50, sample=0x7fffbdf0, max_stack_depth=127, arg=0x7fffc810) at util/hist.c:882
#8  0x0042f1b7 in process_sample_event (tool=0x7fffc810, event=0x7ffed74b41e0, sample=0x7fffbdf0, evsel=0x1922c50, machine=0x19213d0) at builtin-report.c:171
#9  0x004da272 in perf_evlist__deliver_sample (evlist=0x1922260, tool=0x7fffc810, event=0x7ffed74b41e0, sample=0x7fffbdf0, evsel=0x1922c50, machine=0x19213d0) at util/session.c:1000
#10 0x004da40c in machines__deliver_event (machines=0x19213d0, evlist=0x1922260, event=0x7ffed74b41e0, sample=0x7fffbdf0, tool=0x7fffc810, file_offset=1097646560) at util/session.c:1037
#11 0x004da659 in perf_session__deliver_event (session=0x1921310, event=0x7ffed74b41e0, sample=0x7fffbdf0, tool=0x7fffc810, file_offset=1097646560) at util/session.c:1082
#12 0x004d7d7b in ordered_events__deliver_event (oe=0x1921558, event=0x2050430) at util/session.c:109
#13 0x004dd65b in __ordered_events__flush (oe=0x1921558) at util/ordered-events.c:207
#14 0x004dd92f in ordered_events__flush (oe=0x1921558, how=OE_FLUSH__ROUND) at util/ordered-events.c:271
#15 0x004d94c8 in process_finished_round (tool=0x7fffc810, event=0x7ffed74c6830, oe=0x1921558) at util/session.c:663
#16 0x004da7cd in perf_session__process_user_event (session=0x1921310, event=0x7ffed74c6830, file_offset=1097721904) at util/session.c:1119
#17 0x004daced in perf_session__process_event (session=0x1921310, event=0x7ffed74c6830, file_offset=1097721904) at util/session.c:1232
#18 0x004db811 in __perf_session__process_events (session=0x1921310, data_offset=232, data_size=5774474704, file_size=5774474936) at util/session.c:1533
#19 0x004dba01 in perf_session__process_events (session=0x1921310) at util/session.c:1580
#20 0x0042ff9f in __cmd_report (rep=0x7fffc810) at builtin-report.c:487
#21 0x004315d9 in cmd_report (argc=0, argv=0x7fffddd0, prefix=0x0) at builtin-report.c:878
#22 0x00490fb8 in run_builtin (p=0x886528 , argc=1, argv=0x7fffddd0) at perf.c:370
#23 0x00491217 in handle_internal_command (argc=1, argv=0x7fffddd0) at perf.c:429
#24 0x00491363 in run_argv (argcp=0x7fffdc2c, argv=0x7fffdc20) at perf.c:473
#25 0x004916c4 in main (argc=1, argv=0x7fffddd0) at perf.c:588
(gdb) 

Looking at the frame #1 I see:

(gdb) p left->hists
$22 = (struct hists *) 0x1922d80
(gdb) p right->hists
$23 = (struct hists *) 0x1922d80
(gdb)

I.e. both look like fine hist_entry instances, both are on the same struct hists, but:

(gdb) p right->ms.map->dso
$25 = (struct dso *) 0x1924ba0
(gdb) p right->ms.ma
There is no member named ma.
(gdb) p right->ms.map
$26 = 

Re: [GIT PULL 00/30] perf/core improvements and fixes

2015-05-14 Thread Namhyung Kim
Hi Arnaldo,

On Thu, May 14, 2015 at 10:18:27AM -0300, Arnaldo Carvalho de Melo wrote:
> On Thu, May 14, 2015 at 05:23:30PM +0900, Namhyung Kim wrote:
> > On Mon, May 11, 2015 at 11:06:26AM -0300, Arnaldo Carvalho de Melo wrote:
> > > On Mon, May 11, 2015 at 02:09:39PM +0900, Namhyung Kim wrote:
> > > > I'm seeing a segfault on 'perf report' with a large data file after
> > > > applying thread refcount change - it happens regardless of the atomic
> > > > operation.
> 
> > > Any specific 'perf record' command line? Does it take a long time to
> > > reproduce? Any backtraces? I'll try to repro, it's possible that we're
> > > doing one too many thread__put()...
>  
> > It's a kernel build with '-j 20' and recorded data size is ~2.1GB.
> > It takes ~30 sec to reproduce.
> > 
> >   $ perf report -i threaded/kbuild7.data --header-only
> >   # 
> >   # captured on: Thu Dec 18 12:06:35 2014
> >   # hostname : sejong
> >   # os release : 3.17.4-1-ARCH
> >   # perf version : 3.18.rc3.gcb4774b
> >   # arch : x86_64
> >   # nrcpus online : 12
> >   # nrcpus avail : 12
> >   # cpudesc : Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz
> >   # cpuid : GenuineIntel,6,45,7
> >   # total memory : 24646828 kB
> >   # cmdline : /home/namhyung/project/linux/tools/perf/perf record -ag -o /home/namhyung/tmp/perf/threaded/kbuild7.data -- make -j20
> >   # event : name = cycles, , size = 104, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CALLCHAIN|CPU|PERIOD, disabled = 1, inherit
> >   # HEADER_CPU_TOPOLOGY info available, use -I to display
> >   # HEADER_NUMA_TOPOLOGY info available, use -I to display
> >   # pmu mappings: cpu = 4, software = 1, power = 24, uncore_pcu = 13, tracepoint = 2, uncore_imc_0 = 15, uncore_imc_1 = 16, uncore_imc_2 = 17, uncore_
> >   # 
> >   #
> > 
> > 
> >   $ perf data stat -i threaded/kbuild7.data
> > 
> >Total event stats for 'threaded/kbuild7.data' file:
> >   
> >              TOTAL events:   25126492
> >               MMAP events:        114
> >               COMM events:     117957
> >               EXIT events:     240544
> >           THROTTLE events:         16
> >         UNTHROTTLE events:         16
> >               FORK events:     120488
> >             SAMPLE events:   23878219
> >              MMAP2 events:     745325
> >     FINISHED_ROUND events:      23813
> >   
> >Sample event stats:
> >   
> >   20,579,564,471,104  cycles
> >   23,878,219  samples   #   sampling ratio  99.745% (3989/4000)
> > 
> >498.736917889 second time sampled
> > 
> > 
> >   $ perf report -i threaded/kbuild7.data
> 
> We need to improve this segfault backtrace, I have to always use
> addr2line to resolve those missing entries, i.e. if you try:
> 
> addr2line -fe /path/to/your/perf 0x4dd9c8
> addr2line -fe /path/to/your/perf 0x4e2580
> 
> We would have resolved those lines :-/

Right, I'll add it to my TODO list.

Anyway, this is a backtrace using gdb..

Thanks,
Namhyung


Program received signal SIGSEGV, Segmentation fault.
0x75fb229e in __strcmp_sse2_unaligned () from /usr/lib/libc.so.6
(gdb) bt
#0  0x75fb229e in __strcmp_sse2_unaligned () from /usr/lib/libc.so.6
#1  0x004d3948 in _sort__dso_cmp (map_r=, map_l=) at util/sort.c:142
#2  sort__dso_cmp (left=, right=) at util/sort.c:148
#3  0x004d7f08 in hist_entry__cmp (right=0x7fffc530, left=0x323a27f0) at util/hist.c:911
#4  add_hist_entry (sample_self=true, al=0x7fffc710, entry=0x7fffc530, hists=0x18f6690) at util/hist.c:389
#5  __hists__add_entry (hists=0x18f6690, al=0x7fffc710, sym_parent=, bi=bi@entry=0x0, mi=mi@entry=0x0, period=, weight=0, transaction=0, sample_self=true) at util/hist.c:471
#6  0x004d8234 in iter_add_single_normal_entry (iter=0x7fffc740, al=) at util/hist.c:662
#7  0x004d8765 in hist_entry_iter__add (iter=0x7fffc740, al=0x7fffc710, evsel=0x18f6550, sample=, max_stack_depth=, arg=0x7fffd0a0) at util/hist.c:871
#8  0x00436353 in process_sample_event (tool=0x7fffd0a0, event=, sample=0x7fffc870, evsel=0x18f6550, machine=) at builtin-report.c:171
#9  0x004bbe23 in perf_evlist__deliver_sample (machine=0x18f4cc0, evsel=0x18f6550, sample=0x7fffc870, event=0x7fffe0bd3220, tool=0x7fffd0a0, evlist=0x18f5b50) at util/session.c:972
#10 machines__deliver_event (machines=machines@entry=0x18f4cc0, evlist=, event=event@entry=0x7fffe0bd3220, sample=sample@entry=0x7fffc870, tool=tool@entry=0x7fffd0a0, file_offset=file_offset@entry=1821434400) at util/session.c:1009
#11 0x004bc681 in perf_session__deliver_event (file_offset=1821434400, tool=0x7fffd0a0, sample=0x7fffc870, event=0x7fffe0bd3220, session=) at util/session.c:1050
#12 ordered_events__deliver_event (oe=0x18f4e00, event=) at util/session.c:109
#13 0x004bf12b in __ordered_events__flush (oe=0x18f4e00) at util/order

Re: [GIT PULL 00/30] perf/core improvements and fixes

2015-05-14 Thread Arnaldo Carvalho de Melo
Em Thu, May 14, 2015 at 05:23:30PM +0900, Namhyung Kim escreveu:
> On Mon, May 11, 2015 at 11:06:26AM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Mon, May 11, 2015 at 02:09:39PM +0900, Namhyung Kim escreveu:
> > > I'm seeing a segfault on 'perf report' with a large data file after
> > > applying thread refcount change - it happens regardless of the atomic
> > > operation.

> > Any specific 'perf record' command line? Does it take a long time to
> > reproduce? Any backtraces? I'll try to repro, it's possible that we're
> > doing one too many thread__put()...
 
> It's a kernel build with '-j 20' and recorded data size is ~2.1GB.
> It takes ~30 sec to reproduce.
> 
>   $ perf report -i threaded/kbuild7.data --header-only
>   # 
>   # captured on: Thu Dec 18 12:06:35 2014
>   # hostname : sejong
>   # os release : 3.17.4-1-ARCH
>   # perf version : 3.18.rc3.gcb4774b
>   # arch : x86_64
>   # nrcpus online : 12
>   # nrcpus avail : 12
>   # cpudesc : Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz
>   # cpuid : GenuineIntel,6,45,7
>   # total memory : 24646828 kB
>   # cmdline : /home/namhyung/project/linux/tools/perf/perf record -ag -o 
> /home/namhyung/tmp/perf/threaded/kbuild7.data -- make -j20
>   # event : name = cycles, , size = 104, { sample_period, sample_freq } = 
> 4000, sample_type = IP|TID|TIME|CALLCHAIN|CPU|PERIOD, disabled = 1, inherit
>   # HEADER_CPU_TOPOLOGY info available, use -I to display
>   # HEADER_NUMA_TOPOLOGY info available, use -I to display
>   # pmu mappings: cpu = 4, software = 1, power = 24, uncore_pcu = 13, 
> tracepoint = 2, uncore_imc_0 = 15, uncore_imc_1 = 16, uncore_imc_2 = 17, 
> uncore_
>   # 
>   #
> 
> 
>   $ perf data stat -i threaded/kbuild7.data
> 
>Total event stats for 'threaded/kbuild7.data' file:
>   
>  TOTAL events:   25126492
>   MMAP events:114
>   COMM events: 117957
>   EXIT events: 240544
>   THROTTLE events: 16
> UNTHROTTLE events: 16
>   FORK events: 120488
> SAMPLE events:   23878219
>  MMAP2 events: 745325
> FINISHED_ROUND events:  23813
>   
>Sample event stats:
>   
>   20,579,564,471,104  cycles
>   23,878,219  samples   #   sampling ratio  
> 99.745% (3989/4000)
> 
>498.736917889 second time sampled
> 
> 
>   $ perf report -i threaded/kbuild7.data

We need to improve this segfault backtrace: I always have to use
addr2line to resolve the entries that come without a symbol, e.g. running:

addr2line -fe /path/to/your/perf 0x4dd9c8
addr2line -fe /path/to/your/perf 0x4e2580

would have resolved those lines :-/
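
A rough sketch of automating that lookup, assuming the backtrace format
seen in this thread: frames printed as "perf[0xADDR]" have no symbol,
while frames printed as "perf(sym+0xOFF)[0xADDR]" are already resolved.
The helper name unresolved_frame() and the STANDALONE macro are made up
for this illustration, they are not part of perf:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns 1 and stores the address in *addr when 'line' is an
 * unresolved frame of the binary named 'exe' ("exe[0xADDR]"),
 * 0 for resolved frames, other DSOs and anything else.
 */
static int unresolved_frame(const char *line, const char *exe,
			    unsigned long *addr)
{
	size_t len = strlen(exe);
	const char *hex;
	char *end;

	/* skip indentation and mail quote markers */
	while (*line == ' ' || *line == '\t' || *line == '>')
		line++;

	if (strncmp(line, exe, len) != 0 || line[len] != '[')
		return 0;

	hex = line + len + 1;
	*addr = strtoul(hex, &end, 16);	/* base 16 accepts the "0x" prefix */
	return end != hex && *end == ']';
}

#ifdef STANDALONE
/* Filter: backtrace on stdin, one addr2line command per unresolved frame. */
int main(int argc, char **argv)
{
	const char *path = argc > 1 ? argv[1] : "/path/to/your/perf";
	char line[512];
	unsigned long addr;

	while (fgets(line, sizeof(line), stdin))
		if (unresolved_frame(line, "perf", &addr))
			printf("addr2line -fe %s %#lx\n", path, addr);
	return 0;
}
#endif
```

Built with -DSTANDALONE and fed the backtrace on stdin, this prints one
addr2line command per symbol-less perf frame; libc frames and the final
"[0x0]" line are skipped.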

But I think this is a longstanding bug in handling hist_entries, i.e.
probably we have more than one pointer to a hist_entry and are accessing
it in two places at the same time, with one of them deleting it and
possibly reusing the data.

>   perf: Segmentation fault
>    backtrace 
>   perf[0x51c7cb]
>   /usr/lib/libc.so.6(+0x33540)[0x7f37eb37e540]
>   /usr/lib/libc.so.6(+0x9029e)[0x7f37eb3db29e]
>   perf[0x4dd9c8]
>   perf(__hists__add_entry+0x188)[0x4e2258]
>   perf[0x4e2580]
>   perf(hist_entry_iter__add+0x9d)[0x4e2a7d]
>   perf[0x437fda]
>   perf[0x4c4c8e]
>   perf[0x4c5176]
>   perf[0x4c8bab]
>   perf[0x4c53c2]
>   perf[0x4c5f0c]
>   perf(perf_session__process_events+0xb3)[0x4c6b23]
>   perf(cmd_report+0x12a0)[0x439310]
>   perf[0x483ec3]
>   perf(main+0x60a)[0x42979a]
>   /usr/lib/libc.so.6(__libc_start_main+0xf0)[0x7f37eb36b800]
>   perf(_start+0x29)[0x4298b9]
>   [0x0]
> 
> It seems like some memory area was corrupted..

Right, looks like use after free, for instance, freeing something still
on a list or rbtree :-/
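
To illustrate the "freeing something still on a list" pattern, here is a
minimal sketch with a hypothetical circular doubly linked list in the
style of list_head (struct node and the list__*() helpers are invented
for this example, they are not the perf or kernel code). The point is
only the ordering: unlink first, free afterwards, so that no later list
walk can reach freed memory:

```c
#include <stdlib.h>

/* Hypothetical minimal circular doubly linked list node. */
struct node {
	struct node *prev, *next;
};

static void list__init(struct node *head)
{
	head->prev = head->next = head;
}

/* Insert 'n' right after 'head'. */
static void list__add(struct node *head, struct node *n)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/*
 * Unlink before freeing.  Calling free(n) without the two pointer
 * updates would leave the neighbours pointing at freed memory, which
 * the next traversal (or comparison) would then dereference.
 */
static void list__del_and_free(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	free(n);
}
```

The same ordering applies to an rbtree: erase the node from the tree
before releasing it.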

- Arnaldo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [GIT PULL 00/30] perf/core improvements and fixes

2015-05-14 Thread Namhyung Kim
On Mon, May 11, 2015 at 11:06:26AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Mon, May 11, 2015 at 02:09:39PM +0900, Namhyung Kim escreveu:
> > Hi Arnaldo,
> > 
> > I'm seeing a segfault on 'perf report' with a large data file after
> > applying thread refcount change - it happens regardless of the atomic
> > operation.
> 
> Any specific 'perf record' command line? Does it take a long time to
> reproduce? Any backtraces? I'll try to repro, it's possible that we're
> doing one too many thread__put()...

It's a kernel build with '-j 20' and recorded data size is ~2.1GB.
It takes ~30 sec to reproduce.

  $ perf report -i threaded/kbuild7.data --header-only
  # 
  # captured on: Thu Dec 18 12:06:35 2014
  # hostname : sejong
  # os release : 3.17.4-1-ARCH
  # perf version : 3.18.rc3.gcb4774b
  # arch : x86_64
  # nrcpus online : 12
  # nrcpus avail : 12
  # cpudesc : Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz
  # cpuid : GenuineIntel,6,45,7
  # total memory : 24646828 kB
  # cmdline : /home/namhyung/project/linux/tools/perf/perf record -ag -o 
/home/namhyung/tmp/perf/threaded/kbuild7.data -- make -j20
  # event : name = cycles, , size = 104, { sample_period, sample_freq } = 4000, 
sample_type = IP|TID|TIME|CALLCHAIN|CPU|PERIOD, disabled = 1, inherit
  # HEADER_CPU_TOPOLOGY info available, use -I to display
  # HEADER_NUMA_TOPOLOGY info available, use -I to display
  # pmu mappings: cpu = 4, software = 1, power = 24, uncore_pcu = 13, 
tracepoint = 2, uncore_imc_0 = 15, uncore_imc_1 = 16, uncore_imc_2 = 17, uncore_
  # 
  #


  $ perf data stat -i threaded/kbuild7.data

   Total event stats for 'threaded/kbuild7.data' file:
  
 TOTAL events:   25126492
  MMAP events:114
  COMM events: 117957
  EXIT events: 240544
  THROTTLE events: 16
UNTHROTTLE events: 16
  FORK events: 120488
SAMPLE events:   23878219
 MMAP2 events: 745325
FINISHED_ROUND events:  23813
  
   Sample event stats:
  
  20,579,564,471,104  cycles
  23,878,219  samples   #   sampling ratio  99.745% 
(3989/4000)

   498.736917889 second time sampled


  $ perf report -i threaded/kbuild7.data
  perf: Segmentation fault
   backtrace 
  perf[0x51c7cb]
  /usr/lib/libc.so.6(+0x33540)[0x7f37eb37e540]
  /usr/lib/libc.so.6(+0x9029e)[0x7f37eb3db29e]
  perf[0x4dd9c8]
  perf(__hists__add_entry+0x188)[0x4e2258]
  perf[0x4e2580]
  perf(hist_entry_iter__add+0x9d)[0x4e2a7d]
  perf[0x437fda]
  perf[0x4c4c8e]
  perf[0x4c5176]
  perf[0x4c8bab]
  perf[0x4c53c2]
  perf[0x4c5f0c]
  perf(perf_session__process_events+0xb3)[0x4c6b23]
  perf(cmd_report+0x12a0)[0x439310]
  perf[0x483ec3]
  perf(main+0x60a)[0x42979a]
  /usr/lib/libc.so.6(__libc_start_main+0xf0)[0x7f37eb36b800]
  perf(_start+0x29)[0x4298b9]
  [0x0]

It seems like some memory area was corrupted..

Thanks,
Namhyung


Re: [GIT PULL 00/30] perf/core improvements and fixes

2015-05-11 Thread Arnaldo Carvalho de Melo
Em Mon, May 11, 2015 at 02:09:39PM +0900, Namhyung Kim escreveu:
> Hi Arnaldo,
> 
> I'm seeing a segfault on 'perf report' with a large data file after
> applying thread refcount change - it happens regardless of the atomic
> operation.

Any specific 'perf record' command line? Does it take a long time to
reproduce? Any backtraces? I'll try to repro, it's possible that we're
doing one too many thread__put()...
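
For readers unfamiliar with the get/put convention, a minimal sketch of
what "one too many put" means, using a hypothetical struct obj and
obj__get()/obj__put() helpers built on C11 atomics (this is not the
actual thread__get()/thread__put() code, just the general pattern):

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted object such as 'struct thread'. */
struct obj {
	atomic_int refcnt;
	int	   pid;
};

static struct obj *obj__new(int pid)
{
	struct obj *o = calloc(1, sizeof(*o));

	if (o) {
		atomic_init(&o->refcnt, 1);	/* creator holds the first reference */
		o->pid = pid;
	}
	return o;
}

/* Take an extra reference for every additional pointer that is kept. */
static struct obj *obj__get(struct obj *o)
{
	if (o)
		atomic_fetch_add(&o->refcnt, 1);
	return o;
}

/* Drop one reference; returns true when this call freed the object. */
static bool obj__put(struct obj *o)
{
	if (o && atomic_fetch_sub(&o->refcnt, 1) == 1) {
		free(o);
		return true;
	}
	return false;
}
```

Every stored pointer must be paired with exactly one get and one put. If
some path calls put without a matching get, the count reaches zero while
another holder still keeps its pointer, and that holder's next access is
a use after free.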

- Arnaldo


Re: [GIT PULL 00/30] perf/core improvements and fixes

2015-05-10 Thread Namhyung Kim
Hi Arnaldo,

On Fri, May 08, 2015 at 05:56:12PM -0300, Arnaldo Carvalho de Melo wrote:
> Hi Ingo,
> 
>   Please consider pulling,
> 
> - Arnaldo
> 
> 
> 
> The following changes since commit cb307113746b4d184155d2c412e8069aeaa60d42:
> 
>   perf_event: Don't allow vmalloc() backed perf on powerpc (2015-05-08 
> 12:26:01 +0200)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git 
> tags/perf-core-for-mingo
> 
> for you to fetch changes up to 76d408498b08447e0f61dfdd611aeb6e8e61ce80:
> 
>   perf build: Disable libdw DWARF unwind when built with NO_DWARF (2015-05-08 
> 16:43:14 -0300)
> 
> 
> perf/core improvements and fixes:
> 
> User visible:
> 
> - 'perf probe' improvements (Masami Hiramatsu)
> 
>   - Support glob wildcards for function name
>   - Support $params special probe argument: Collect all function arguments
>   - Make --line checks validate C-style function name.
>   - Add --no-inlines option to avoid searching inline functions
> 
> - Introduce new 'perf bench futex' benchmark: 'wake-parallel', to
>   measure parallel waker threads generating contention for kernel
>   locks (hb->lock) (Davidlohr Bueso)
> 
> Bug fixes:
> 
> - 'perf top' survives much longer on high core count machines, more work
>   needed to refcount more data structures besides 'struct thread' and fix
>   more races (Arnaldo Carvalho de Melo)

I'm seeing a segfault on 'perf report' with a large data file after
applying thread refcount change - it happens regardless of the atomic
operation.

Thanks,
Namhyung


Re: [GIT PULL 00/30] perf/core improvements and fixes

2015-05-08 Thread Ingo Molnar

* Arnaldo Carvalho de Melo  wrote:

> Hi Ingo,
> 
>   Please consider pulling,
> 
> - Arnaldo
> 
> 
> 
> The following changes since commit cb307113746b4d184155d2c412e8069aeaa60d42:
> 
>   perf_event: Don't allow vmalloc() backed perf on powerpc (2015-05-08 
> 12:26:01 +0200)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git 
> tags/perf-core-for-mingo
> 
> for you to fetch changes up to 76d408498b08447e0f61dfdd611aeb6e8e61ce80:
> 
>   perf build: Disable libdw DWARF unwind when built with NO_DWARF (2015-05-08 
> 16:43:14 -0300)
> 
> 
> perf/core improvements and fixes:
> 
> User visible:
> 
> - 'perf probe' improvements (Masami Hiramatsu)
> 
>   - Support glob wildcards for function name
>   - Support $params special probe argument: Collect all function arguments
>   - Make --line checks validate C-style function name.
>   - Add --no-inlines option to avoid searching inline functions
> 
> - Introduce new 'perf bench futex' benchmark: 'wake-parallel', to
>   measure parallel waker threads generating contention for kernel
>   locks (hb->lock) (Davidlohr Bueso)
> 
> Bug fixes:
> 
> - 'perf top' survives much longer on high core count machines, more work
>   needed to refcount more data structures besides 'struct thread' and fix
>   more races (Arnaldo Carvalho de Melo)
> 
> Infrastructure:
> 
> - Move barrier.h mb/rmb/wmb API from tools/perf/ to the kernel-like
>   tools/arch/ hierarchy (Arnaldo Carvalho de Melo)
> 
> - Borrow atomic.h from the kernel, initially the x86 implementations
>   with a fallback to gcc intrinsics for the other arches, with all the
>   kernel-like framework in place for doing arch specific implementations,
>   preferably cloning what is in the kernel to the greatest extent
>   possible (Arnaldo Carvalho de Melo)
> 
> - Protect the 'struct thread' lifetime with a reference counter,
>   and protect data structures that contain its instances with
>   a mutex (Arnaldo Carvalho de Melo)
> 
> - Disable libdw DWARF unwind when built with NO_DWARF (Naveen N. Rao)
> 
> Signed-off-by: Arnaldo Carvalho de Melo 
> 
> 
> Arnaldo Carvalho de Melo (17):
>   perf tools: Move x86 barrier.h stuff to 
> tools/arch/x86/include/asm/barrier.h
>   perf tools: Move powerpc barrier.h stuff to 
> tools/arch/powerpc/include/asm/barrier.h
>   perf tools: Move s390 barrier.h stuff to 
> tools/arch/s390/include/asm/barrier.h
>   perf tools: Move barrier() definition to tools/include/linux/compiler.h
>   tools: Adopt asm-generic/barrier.h
>   perf tools: Move sh barrier.h stuff to 
> tools/arch/sh/include/asm/barrier.h
>   perf tools: Move sparc barrier.h stuff to 
> tools/arch/sparc/include/asm/barrier.h
>   perf tools: Move alpha barrier.h stuff to 
> tools/arch/alpha/include/asm/barrier.h
>   perf tools: Move ia64 barrier.h stuff to 
> tools/arch/ia64/include/asm/barrier.h
>   perf tools: Move arm(64) barrier.h stuff to 
> tools/arch/arm*/include/asm/barrier.h
>   perf tools: Move xtensa barrier.h stuff to 
> tools/arch/xtensa/include/asm/barrier.h
>   perf tools: Move mips barrier.h stuff to 
> tools/arch/mips/include/asm/barrier.h
>   perf tools: Move tile barrier.h stuff to 
> tools/arch/tile/include/asm/barrier.h
>   perf tools: Move generic barriers out of perf-sys.h
>   tools include: Add basic atomic.h implementation from the kernel sources
>   perf tools: Use atomic_t to implement thread__{get,put} refcnt
>   perf machine: Protect the machine->threads with a rwlock
> 
> Davidlohr Bueso (2):
>   perf bench futex: Support parallel waker threads
>   perf bench futex: Handle spurious wakeups
> 
> Masami Hiramatsu (10):
>   perf probe: Fix to close probe_events file in error
>   perf probe: Fix a typo for the flags of open
>   perf probe: Fix to return 0 when positive value returned
>   perf probe: Make --line checks validate C-style function name
>   perf probe: Skip kernel symbols which is out of .text
>   perf probe: Support $params special probe argument
>   perf probe: Use perf_probe_event.target instead of passing as an 
> argument
>   perf probe: Introduce probe_conf global configs
>   perf probe: Add --no-inlines option to avoid searching inline functions
>   perf probe: Support glob wildcards for function name
> 
> Naveen N. Rao (1):
>   perf build: Disable libdw DWARF unwind when built with NO_DWARF
> 
>  tools/arch/alpha/include/asm/barrier.h|   8 +
>  tools/arch/arm/include/asm/barrier.h  |  12 ++
>  tools/arch/arm64/include/asm/barrier.h|  16 ++
>  tools/arch/ia64/include/asm/barrier.h |  48 +
>  tools/arch/mips/include/asm/barrier.h |  20 ++
>  tools/arch/powerpc/include/asm/barrier.h  |  29 +++
>  tools/arch/s390/include/asm/barrier.h |  30 +++
>  tools/arch/sh/inc

[GIT PULL 00/30] perf/core improvements and fixes

2015-05-08 Thread Arnaldo Carvalho de Melo
Hi Ingo,

Please consider pulling,

- Arnaldo



The following changes since commit cb307113746b4d184155d2c412e8069aeaa60d42:

  perf_event: Don't allow vmalloc() backed perf on powerpc (2015-05-08 12:26:01 
+0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git 
tags/perf-core-for-mingo

for you to fetch changes up to 76d408498b08447e0f61dfdd611aeb6e8e61ce80:

  perf build: Disable libdw DWARF unwind when built with NO_DWARF (2015-05-08 
16:43:14 -0300)


perf/core improvements and fixes:

User visible:

- 'perf probe' improvements (Masami Hiramatsu)

  - Support glob wildcards for function name
  - Support $params special probe argument: Collect all function arguments
  - Make --line checks validate C-style function name.
  - Add --no-inlines option to avoid searching inline functions

- Introduce new 'perf bench futex' benchmark: 'wake-parallel', to
  measure parallel waker threads generating contention for kernel
  locks (hb->lock) (Davidlohr Bueso)

Bug fixes:

- 'perf top' survives much longer on high core count machines, more work
  needed to refcount more data structures besides 'struct thread' and fix
  more races (Arnaldo Carvalho de Melo)

Infrastructure:

- Move barrier.h mb/rmb/wmb API from tools/perf/ to the kernel-like
  tools/arch/ hierarchy (Arnaldo Carvalho de Melo)

- Borrow atomic.h from the kernel, initially the x86 implementations
  with a fallback to gcc intrinsics for the other arches, with all the
  kernel-like framework in place for doing arch specific implementations,
  preferably cloning what is in the kernel to the greatest extent
  possible (Arnaldo Carvalho de Melo)

- Protect the 'struct thread' lifetime with a reference counter,
  and protect data structures that contain its instances with
  a mutex (Arnaldo Carvalho de Melo)

- Disable libdw DWARF unwind when built with NO_DWARF (Naveen N. Rao)

Signed-off-by: Arnaldo Carvalho de Melo 


Arnaldo Carvalho de Melo (17):
  perf tools: Move x86 barrier.h stuff to 
tools/arch/x86/include/asm/barrier.h
  perf tools: Move powerpc barrier.h stuff to 
tools/arch/powerpc/include/asm/barrier.h
  perf tools: Move s390 barrier.h stuff to 
tools/arch/s390/include/asm/barrier.h
  perf tools: Move barrier() definition to tools/include/linux/compiler.h
  tools: Adopt asm-generic/barrier.h
  perf tools: Move sh barrier.h stuff to tools/arch/sh/include/asm/barrier.h
  perf tools: Move sparc barrier.h stuff to 
tools/arch/sparc/include/asm/barrier.h
  perf tools: Move alpha barrier.h stuff to 
tools/arch/alpha/include/asm/barrier.h
  perf tools: Move ia64 barrier.h stuff to 
tools/arch/ia64/include/asm/barrier.h
  perf tools: Move arm(64) barrier.h stuff to 
tools/arch/arm*/include/asm/barrier.h
  perf tools: Move xtensa barrier.h stuff to 
tools/arch/xtensa/include/asm/barrier.h
  perf tools: Move mips barrier.h stuff to 
tools/arch/mips/include/asm/barrier.h
  perf tools: Move tile barrier.h stuff to 
tools/arch/tile/include/asm/barrier.h
  perf tools: Move generic barriers out of perf-sys.h
  tools include: Add basic atomic.h implementation from the kernel sources
  perf tools: Use atomic_t to implement thread__{get,put} refcnt
  perf machine: Protect the machine->threads with a rwlock

Davidlohr Bueso (2):
  perf bench futex: Support parallel waker threads
  perf bench futex: Handle spurious wakeups

Masami Hiramatsu (10):
  perf probe: Fix to close probe_events file in error
  perf probe: Fix a typo for the flags of open
  perf probe: Fix to return 0 when positive value returned
  perf probe: Make --line checks validate C-style function name
  perf probe: Skip kernel symbols which is out of .text
  perf probe: Support $params special probe argument
  perf probe: Use perf_probe_event.target instead of passing as an argument
  perf probe: Introduce probe_conf global configs
  perf probe: Add --no-inlines option to avoid searching inline functions
  perf probe: Support glob wildcards for function name

Naveen N. Rao (1):
  perf build: Disable libdw DWARF unwind when built with NO_DWARF

 tools/arch/alpha/include/asm/barrier.h|   8 +
 tools/arch/arm/include/asm/barrier.h  |  12 ++
 tools/arch/arm64/include/asm/barrier.h|  16 ++
 tools/arch/ia64/include/asm/barrier.h |  48 +
 tools/arch/mips/include/asm/barrier.h |  20 ++
 tools/arch/powerpc/include/asm/barrier.h  |  29 +++
 tools/arch/s390/include/asm/barrier.h |  30 +++
 tools/arch/sh/include/asm/barrier.h   |  32 
 tools/arch/sparc/include/asm/barrier.h|   8 +
 tools/arch/sparc/include/asm/barrier_32.h |   6 +
 tools/arch/sparc/include/asm/barrier_64.h |  42 +
 tools/arch/tile/include/asm/barrier.h |  15 ++
 tools/arch/x86/include/asm/atomic.h  

[GIT PULL 00/30] perf/core improvements and fixes

2012-09-24 Thread Arnaldo Carvalho de Melo
Hi Ingo,

Please consider pulling,

- Arnaldo

The following changes since commit 1e6dd8adc78d4a153db253d051fd4ef6c49c9019:

  perf: Fix off by one test in perf_reg_value() (2012-09-19 17:08:40 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux 
tags/perf-core-for-mingo

for you to fetch changes up to b1ac754b67b5a875d63bee880f60ccb0c6bd8899:

  tools lib traceevent: Handle alloc_arg failure (2012-09-24 12:31:52 -0300)


perf/core improvements and fixes:

. Convert the trace builtins to use the growing evsel/evlist
  tracepoint infrastructure, removing several open-coded constructs,
  such as switch-like series of strcmp calls to dispatch events, etc.
  Basically what had already been showcased in 'perf sched'.

. Add an evsel constructor for tracepoints that uses libtraceevent
  just to parse the /format events file, and use it in a new 'perf test'
  so that libtraceevent format parsing regressions can be caught
  more readily.

. Some strange errors were happening in some builds but not in the
  next ones, as reported by several people. The problem was that some
  parser related files, generated during the build, didn't have proper
  make deps. Fix from Eric Sandeen.

. Fix some compiling errors on 32-bit, from Feng Tang.

. Don't use sscanf extension %as, not available on bionic, reimplementation
  by Irina Tirdea.

. Fix bfd.h/libbfd detection with recent binutils, from Markus Trippelsdorf.

. Introduce struct and cache information about the environment where a
  perf.data file was captured, from Namhyung Kim.

. Fix several error paths in libtraceevent, from Namhyung Kim.

. Print event causing perf_event_open() to fail in 'perf record',
  from Stephane Eranian.

. New 'kvm' analysis tool, from Xiao Guangrong.

Signed-off-by: Arnaldo Carvalho de Melo 


Arnaldo Carvalho de Melo (11):
  perf kvm: Use perf_evsel__intval
  perf kmem: Use perf_evsel__intval and 
perf_session__set_tracepoints_handlers
  perf lock: Use perf_evsel__intval and 
perf_session__set_tracepoints_handlers
  perf timechart: Use zalloc and fix a couple leaks
  tools lib traceevent: Use asprintf were applicable
  tools lib traceevent: Use calloc were applicable
  tools lib traceevent: Fix afterlife gotos
  tools lib traceevent: Remove some die() calls
  tools lib traceevent: Carve out events format parsing routine
  perf evsel: Provide a new constructor for tracepoints
  perf test: Add test for the sched tracepoint format fields

Eric Sandeen (1):
  perf tools: Fix parallel build

Feng Tang (2):
  perf tools: Fix a compiling error in trace-event-perl.c for 32 bits 
machine
  perf tools: Fix a compiling error in util/map.c

Irina Tirdea (1):
  perf tools: remove sscanf extension %as

Markus Trippelsdorf (1):
  perf tools: bfd.h/libbfd detection fails with recent binutils

Namhyung Kim (11):
  perf header: Add struct perf_session_env
  perf header: Add ->process callbacks to most of features
  perf header: Use pre-processed session env when printing
  perf header: Remove unused @feat arg from ->process callback
  perf kvm: Use perf_session_env for reading cpuid
  perf header: Remove perf_header__read_feature
  tools lib traceevent: Fix error path on process_array()
  tools lib traceevent: Make sure that arg->op.right is set properly
  tools lib traceevent: Free field if an error occurs on process_fields
  tools lib traceevent: Free field if an error occurs on 
process_flags/symbols
  tools lib traceevent: Handle alloc_arg failure

Stephane Eranian (1):
  perf record: Print event causing perf_event_open() to fail

Xiao Guangrong (2):
  KVM: x86: Export svm/vmx exit code and vector code to userspace
  perf kvm: Events analysis tool

 arch/x86/include/asm/kvm.h |   16 +
 arch/x86/include/asm/kvm_host.h|   16 -
 arch/x86/include/asm/svm.h |  205 +++--
 arch/x86/include/asm/vmx.h |  127 ++-
 arch/x86/kvm/trace.h   |   89 ---
 tools/lib/traceevent/event-parse.c |  570 +
 tools/lib/traceevent/event-parse.h |3 +
 tools/perf/Documentation/perf-kvm.txt  |   30 +-
 tools/perf/MANIFEST|3 +
 tools/perf/Makefile|6 +-
 tools/perf/builtin-kmem.c  |   90 +--
 tools/perf/builtin-kvm.c   |  836 +++-
 tools/perf/builtin-lock.c  |  233 ++
 tools/perf/builtin-record.c|6 +-
 tools/perf/builtin-test.c  |   86 ++
 tools/perf/builtin-timechart.c |   40 +-
 tools/perf/uti