[tip:perf/core] perf script: Share code and output format for uregs and iregs output

2018-11-21 Thread tip-bot for Milian Wolff
Commit-ID:  9add8fe8e6f63db47e40e65173530dcb68cd7a07
Gitweb: https://git.kernel.org/tip/9add8fe8e6f63db47e40e65173530dcb68cd7a07
Author: Milian Wolff 
AuthorDate: Wed, 7 Nov 2018 23:34:37 +0100
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 21 Nov 2018 12:00:32 -0300

perf script: Share code and output format for uregs and iregs output

The iregs output was missing both the trailing newline and the leading
ABI field, which made it hard to compare the iregs and uregs values.
Use a single function to print the register values for both iregs and
uregs so that the output is consistent.

Before:

  perf  7049 [-01]  1343.354347:  1 cycles:ppp:
a7bc21ce perf_event_exec+0x18e 
(/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
a7c7ead3 setup_new_exec+0xf3 
(/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
a7cd7be5 load_elf_binary+0x395 
(/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
a7c7e540 search_binary_handler+0x80 
(/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
a7c7f1aa __do_execve_file.isra.13+0x58a 
(/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
a7c7f561 do_execve+0x21 
(/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
a7c7f596 __x64_sys_execve+0x26 
(/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
a7a041cb do_syscall_64+0x5b 
(/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
a840008c entry_SYSCALL_64+0x7c 
(/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
AX:0x8000BX:0x0CX:0x0DX:0x7SI:0xfDI:0x286
BP:0x95bc8213a460SP:0xacbf0ba97d18IP:0xa7bc21cd 
FLAGS:0x28eCS:0x10SS:0x18R8:0x2R9:0x21440   R10:0x33816fb3b8c   
R11:0x1   R12:0x95bc8213a460   R13:0x95bc8213a400   
R14:0x95bc8213a400   R15:0x1  ABI:2AX:0xffda
BX:0xCX:0x7f84ad85798bDX:0x560209699d50
SI:0x7ffe2c7a6820DI:0x7ffe2c7a8c9bBP:0x7ffe2c7a20d0
SP:0x7ffe2c7a2058IP:0x7f84ad85798b FLAGS:0x206CS:0x33SS:0x2b
R8:0x7ffe2c7a2030R9:0x7f84ae55f010   R10:0x8   R11:0x206   
R12:0x   R13:0x   R14:0x   
R15:0x

  perf  7049 [-01]  1343.354363:  1 cycles:ppp:
...

After:

  perf  7049 [-01]  1343.354347:  1 cycles:ppp:
a7bc21ce perf_event_exec+0x18e 
(/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
a7c7ead3 setup_new_exec+0xf3 
(/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
a7cd7be5 load_elf_binary+0x395 
(/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
a7c7e540 search_binary_handler+0x80 
(/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
a7c7f1aa __do_execve_file.isra.13+0x58a 
(/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
a7c7f561 do_execve+0x21 
(/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
a7c7f596 __x64_sys_execve+0x26 
(/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
a7a041cb do_syscall_64+0x5b 
(/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
a840008c entry_SYSCALL_64+0x7c 
(/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
ABI:2AX:0x8000BX:0x0CX:0x0DX:0x7SI:0xfDI:0x286  
  BP:0x95bc8213a460SP:0xacbf0ba97d18IP:0xa7bc21cd 
FLAGS:0x28eCS:0x10SS:0x18R8:0x2R9:0x21440   R10:0x33816fb3b8c   
R11:0x1   R12:0x95bc8213a460   R13:0x95bc8213a400   
R14:0x95bc8213a400   R15:0x1
ABI:2AX:0xffdaBX:0x
CX:0x7f84ad85798bDX:0x560209699d50SI:0x7ffe2c7a6820
DI:0x7ffe2c7a8c9bBP:0x7ffe2c7a20d0SP:0x7ffe2c7a2058
IP:0x7f84ad85798b FLAGS:0x206CS:0x33SS:0x2bR8:0x7ffe2c7a2030
R9:0x7f84ae55f010   R10:0x8   R11:0x206   R12:0x   
R13:0x   R14:0x   R15:0x

  perf  7049 [-01]  1343.354363:  1 cycles:ppp:
...

Signed-off-by: Milian Wolff 
Acked-by: Jiri Olsa 
Link: http://lkml.kernel.org/r/20181107223437.9071-1-milian.wo...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-script.c | 40 +---
 1 file changed, 17 insertions(+), 23 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 

[tip:perf/core] perf script: Add newline after uregs output

2018-11-21 Thread tip-bot for Milian Wolff
Commit-ID:  b07d16f7e9e4cf2562f61b5f68a4b0831fe5ef14
Gitweb: https://git.kernel.org/tip/b07d16f7e9e4cf2562f61b5f68a4b0831fe5ef14
Author: Milian Wolff 
AuthorDate: Wed, 7 Nov 2018 10:37:05 +0100
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 21 Nov 2018 12:00:31 -0300

perf script: Add newline after uregs output

This change makes it much easier to distinguish between consecutive
samples by keeping the empty line between them, as is already the case
when uregs output is not enabled.

Before:

  cpp-inlining 28298 [-01] 54837.342780:3068085 cycles:pp:
  77c96709 __hypot_finite+0xa9 (/usr/lib/libm-2.28.so)
  ...
   ABI:2AX:0x0BX:0x40f56cf6CX:0x294a3ae7...
  cpp-inlining 28298 [-01] 54837.344493:2881929 cycles:pp:
  77c96696 __hypot_finite+0x36 (/usr/lib/libm-2.28.so)
  ...
   ABI:2AX:0x40d440c7BX:0x40d440c7CX:0x4d45e5da...

After:

  cpp-inlining 28298 [-01] 54837.342780:3068085 cycles:pp:
  77c96709 __hypot_finite+0xa9 (/usr/lib/libm-2.28.so)
  ...
   ABI:2AX:0x0BX:0x40f56cf6CX:0x294a3ae7...

  cpp-inlining 28298 [-01] 54837.344493:2881929 cycles:pp:
  77c96696 __hypot_finite+0x36 (/usr/lib/libm-2.28.so)
  ...
   ABI:2AX:0x40d440c7BX:0x40d440c7CX:0x4d45e5da...

Signed-off-by: Milian Wolff 
Cc: Jiri Olsa 
Link: http://lkml.kernel.org/r/20181107093705.16346-1-milian.wo...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-script.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index b5bc85bd0bbe..daf73832743e 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -603,6 +603,8 @@ static int perf_sample__fprintf_uregs(struct perf_sample *sample,
 		printed += fprintf(fp, "%5s:0x%"PRIx64" ", perf_reg_name(r), val);
 	}
 
+	fprintf(fp, "\n");
+
 	return printed;
 }
 


[tip:perf/urgent] perf unwind: Take pgoff into account when reporting elf to libdwfl

2018-10-31 Thread tip-bot for Milian Wolff
Commit-ID:  1fe627da30331024f453faef04d500079b901107
Gitweb: https://git.kernel.org/tip/1fe627da30331024f453faef04d500079b901107
Author: Milian Wolff 
AuthorDate: Mon, 29 Oct 2018 15:16:44 +0100
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 31 Oct 2018 09:57:50 -0300

perf unwind: Take pgoff into account when reporting elf to libdwfl

libdwfl parses an ELF file itself and creates mappings for the
individual sections. perf on the other hand sees raw mmap events which
represent individual sections. When we encounter an address pointing
into a mapping with pgoff != 0, we must take that into account and
report the file at the non-offset base address.
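
As a rough sketch of the address arithmetic described above: the base to report is the mapping's start address minus its offset into the file. `struct mapping` and `elf_report_base` are hypothetical names, and the sketch assumes `pgoff` is expressed in bytes; it is an illustration, not the patch itself.

```c
#include <stdint.h>

/* Hypothetical, pared-down view of one mmap event as perf sees it. */
struct mapping {
	uint64_t start;	/* address where this piece of the file is mapped */
	uint64_t pgoff;	/* offset of this piece within the file (assumed bytes) */
};

/* Address at which offset 0 of the file would sit in the process. */
uint64_t elf_report_base(const struct mapping *map)
{
	return map->start - map->pgoff;
}
```

For a mapping with `pgoff == 0` the result is simply the start address, so only mappings at a non-zero file offset are affected.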

This fixes unwinding with libdwfl in some cases. E.g. for a file like:

```
#include <complex>
#include <future>
#include <iostream>
#include <mutex>
#include <random>
#include <vector>

using namespace std;

mutex g_mutex;

double worker()
{
    lock_guard<mutex> guard(g_mutex);
    uniform_real_distribution<double> uniform(-1E5, 1E5);
    default_random_engine engine;
    double s = 0;
    for (int i = 0; i < 1000; ++i) {
        s += norm(complex<double>(uniform(engine), uniform(engine)));
    }
    cout << s << endl;
    return s;
}

int main()
{
    vector<future<double>> results;
    for (int i = 0; i < 1; ++i) {
        results.push_back(async(launch::async, worker));
    }
    return 0;
}
```

Compile it with `g++ -g -O2 -lpthread cpp-locking.cpp  -o cpp-locking`,
then record it with `perf record --call-graph dwarf -e
sched:sched_switch`.

When you analyze it with `perf script` and libunwind, you should see:

```
cpp-locking 20038 [005] 54830.236589: sched:sched_switch: prev_comm=cpp-locking 
prev_pid=20038 prev_prio=120 prev_state=T ==> next_comm=swapper/5 next_pid=0 
next_prio=120
b166fec5 __sched_text_start+0x545 
(/lib/modules/4.14.78-1-lts/build/vmlinux)
b166fec5 __sched_text_start+0x545 
(/lib/modules/4.14.78-1-lts/build/vmlinux)
b1670208 schedule+0x28 
(/lib/modules/4.14.78-1-lts/build/vmlinux)
b16737cc rwsem_down_read_failed+0xec 
(/lib/modules/4.14.78-1-lts/build/vmlinux)
b1665e04 call_rwsem_down_read_failed+0x14 
(/lib/modules/4.14.78-1-lts/build/vmlinux)
b1672a03 down_read+0x13 
(/lib/modules/4.14.78-1-lts/build/vmlinux)
b106bd85 __do_page_fault+0x445 
(/lib/modules/4.14.78-1-lts/build/vmlinux)
b18015f5 page_fault+0x45 
(/lib/modules/4.14.78-1-lts/build/vmlinux)
7f38e4252591 new_heap+0x101 (/usr/lib/libc-2.28.so)
7f38e4252d0b arena_get2.part.4+0x2fb (/usr/lib/libc-2.28.so)
7f38e4255b1c tcache_init.part.6+0xec (/usr/lib/libc-2.28.so)
7f38e42569e5 __GI___libc_malloc+0x115 (inlined)
7f38e4241790 __GI__IO_file_doallocate+0x90 (inlined)
7f38e424fbbf __GI__IO_doallocbuf+0x4f (inlined)
7f38e424ee47 __GI__IO_file_overflow+0x197 (inlined)
7f38e424df36 _IO_new_file_xsputn+0x116 (inlined)
7f38e4242bfb __GI__IO_fwrite+0xdb (inlined)
7f38e463fa6d std::basic_streambuf 
>::sputn(char const*, long)+0x1cd (inlined)
7f38e463fa6d std::ostreambuf_iterator 
>::_M_put(char const*, long)+0x1cd (inlined)
7f38e463fa6d std::ostreambuf_iterator 
> std::__write(std::ostreambuf_iterator >, 
char const*, int)+0x1cd (inlined)
7f38e463fa6d std::ostreambuf_iterator 
> std::num_put > 
>::_M_insert_float(std::ostreambuf_iterator
7f38e464bd70 std::num_put > >::put(std::ostreambuf_iterator >, std::ios_base&, char, double) const+0x90 (inl>
7f38e464bd70 std::ostream& 
std::ostream::_M_insert(double)+0x90 (/usr/lib/libstdc++.so.6.0.25)
563b9cb502f7 std::ostream::operator<<(double)+0xb7 (inlined)
563b9cb502f7 worker()+0xb7 
(/ssd/milian/projects/kdab/rnd/hotspot/build/tests/test-clients/cpp-locking/cpp-locking)
563b9cb506fb double std::__invoke_impl(std::__invoke_other, double (*&&)())+0x2b (inlined)
563b9cb506fb std::__invoke_result::type 
std::__invoke(double (*&&)())+0x2b (inlined)
563b9cb506fb decltype (__invoke((_S_declval<0ul>)())) 
std::thread::_Invoker 
>::_M_invoke<0ul>(std::_Index_tuple<0ul>)+0x2b (inlined)
563b9cb506fb std::thread::_Invoker 
>::operator()()+0x2b (inlined)
563b9cb506fb 
std::__future_base::_Task_setter,
 std::__future_base::_Result_base::_Deleter>, 
std::thread::_Invoker >, dou>
563b9cb506fb 
std::_Function_handler (), 
std::__future_base::_Task_setter
563b9cb507e8 
std::function ()>::operator()() const+0x28 
(inlined)
563b9cb507e8 
std::__future_base::_State_baseV2::_M_do_set(std::function ()>*, bool*)+0x28 (/ssd/milian/>
7f38e46d24fe __pthread_once_slow+0xbe (/usr/lib/libpthread-2.28.so)
563b9cb51149 __gthread_once+0xe9 (inlined)
563b9cb51149 void std::call_once ()>*, bool*)>
563b9cb51149 
std::__future_base::_State_baseV2::_M_set_result(std::function ()>, bool)+0xe9 (inlined)
563b9cb51149
```

[tip:perf/urgent] perf script: Flush output stream after events in verbose mode

2018-10-26 Thread tip-bot for Milian Wolff
Commit-ID:  7ee40678af935fb489b0c6cf0f75808175214cd7
Gitweb: https://git.kernel.org/tip/7ee40678af935fb489b0c6cf0f75808175214cd7
Author: Milian Wolff 
AuthorDate: Sun, 21 Oct 2018 21:14:24 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 22 Oct 2018 14:27:11 -0300

perf script: Flush output stream after events in verbose mode

When the perf script output is written to a terminal stream, the normal
output of `perf script` would get buffered, but its debug output would
be written directly. This made it quite hard to figure out where a given
debug output is coming from.

We can improve on this by flushing the output buffer after processing an
event. To see the value, compare the following output for a `perf script
-v` run:

Before this patch:
```
unwind: reg 16, val 7faf7dfdc000
unwind: reg 7, val 7ffc80811e30
unwind: find_proc_info dso /usr/lib/ld-2.28.so
unwind: reg 6, val 0
unwind: _start:ip = 0x7faf7dfdc000 (0x2000)
unwind: reg 16, val 7faf7dfdc000
unwind: reg 7, val 7ffc80811e30
unwind: find_proc_info dso /usr/lib/ld-2.28.so
unwind: reg 6, val 0
unwind: _start:ip = 0x7faf7dfdc000 (0x2000)
unwind: reg 16, val 7faf7dfdc000
unwind: reg 7, val 7ffc80811e30
unwind: find_proc_info dso /usr/lib/ld-2.28.so
unwind: reg 6, val 0
unwind: _start:ip = 0x7faf7dfdc000 (0x2000)
unwind: reg 16, val 7faf7dfdc000
unwind: reg 7, val 7ffc80811e30
... lots and lots of verbose debug output
cpp-inlining 24617 90229.122036534:  1 cycles:uppp:
7faf7dfdc000 _start+0x0 (/usr/lib/ld-2.28.so)

cpp-inlining 24617 90229.122043974:  1 cycles:uppp:
7faf7dfdc000 _start+0x0 (/usr/lib/ld-2.28.so)
...
```

After this patch:
```
...
unwind: reg 16, val 7faf7dfdc000
unwind: reg 7, val 7ffc80811e30
unwind: find_proc_info dso /usr/lib/ld-2.28.so
unwind: reg 6, val 0
unwind: _start:ip = 0x7faf7dfdc000 (0x2000)
cpp-inlining 24617 90229.122036534:  1 cycles:uppp:
7faf7dfdc000 _start+0x0 (/usr/lib/ld-2.28.so)

unwind: reg 16, val 7faf7dfdc000
unwind: reg 7, val 7ffc80811e30
unwind: find_proc_info dso /usr/lib/ld-2.28.so
unwind: reg 6, val 0
unwind: _start:ip = 0x7faf7dfdc000 (0x2000)
cpp-inlining 24617 90229.122043974:  1 cycles:uppp:
7faf7dfdc000 _start+0x0 (/usr/lib/ld-2.28.so)
...
```

This new output format makes it much easier to use perf script output
for debugging purposes, e.g. to investigate broken dwarf unwinding.

Signed-off-by: Milian Wolff 
Acked-by: Jiri Olsa 
Link: http://lkml.kernel.org/r/20181021191424.16183-2-milian.wo...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-script.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index bd468b90801b..ca09b7d2adb7 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1737,6 +1737,9 @@ static void process_event(struct perf_script *script,
 
 	if (PRINT_FIELD(METRIC))
 		perf_sample__fprint_metric(script, thread, evsel, sample, fp);
+
+	if (verbose)
+		fflush(fp);
 }
 
 static struct scripting_ops	*scripting_ops;


[tip:perf/urgent] perf script: Allow extended console debug output

2018-10-26 Thread tip-bot for Milian Wolff
Commit-ID:  c1c9b9695cc8868048f45c7e2559f65bc0be7382
Gitweb: https://git.kernel.org/tip/c1c9b9695cc8868048f45c7e2559f65bc0be7382
Author: Milian Wolff 
AuthorDate: Sun, 21 Oct 2018 21:14:23 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 22 Oct 2018 12:37:53 -0300

perf script: Allow extended console debug output

The script tool isn't using a browser, yet use_browser wasn't set
explicitly to zero. This in turn led to confusing output such as:

  ```
  $ perf script -vvv ...
  ...
  overlapping maps in /home/milian/foobar (disable tui for more info)
  ...
  ```

Explicitly set use_browser to 0, which gives us the extended
debug information in perf script as expected.
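
To illustrate why the flag matters, here is a minimal sketch of a debug site that picks its message based on a use_browser-style flag. `format_overlap` and the extended message format are hypothetical and only mimic the short message quoted above; they are not perf's implementation.

```c
#include <stdio.h>

/*
 * Hypothetical debug site, not perf's code: any non-zero flag value
 * selects the short "disable tui" hint, zero selects the detailed text.
 */
int format_overlap(char *buf, size_t len, int use_browser,
		   const char *dso_name, unsigned long long start,
		   unsigned long long end)
{
	if (use_browser)
		return snprintf(buf, len,
				"overlapping maps in %s (disable tui for more info)\n",
				dso_name);

	return snprintf(buf, len, "overlapping maps in %s: %#llx-%#llx\n",
			dso_name, start, end);
}
```

Because the check treats every non-zero value as "browser in use", a flag that is never explicitly cleared can hide the detailed output even though no TUI is running.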

Signed-off-by: Milian Wolff 
Acked-by: Jiri Olsa 
Tested-by: Arnaldo Carvalho de Melo 
Link: http://lkml.kernel.org/r/20181021191424.16183-1-milian.wo...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-script.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 4da5e32b9e03..bd468b90801b 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -3417,8 +3417,10 @@ int cmd_script(int argc, const char **argv)
 		exit(-1);
 	}
 
-	if (!script_name)
+	if (!script_name) {
 		setup_pager();
+		use_browser = 0;
+	}
 
 	session = perf_session__new(&data, false, &script.tool);
 	if (session == NULL)


[tip:perf/urgent] perf report: Don't crash on invalid inline debug information

2018-10-18 Thread tip-bot for Milian Wolff
Commit-ID:  d4046e8e17b9f378cb861982ef71c63911b5dff3
Gitweb: https://git.kernel.org/tip/d4046e8e17b9f378cb861982ef71c63911b5dff3
Author: Milian Wolff 
AuthorDate: Wed, 26 Sep 2018 15:52:07 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 16 Oct 2018 14:52:21 -0300

perf report: Don't crash on invalid inline debug information

When the function name for an inline frame is invalid, we must not try
to demangle this symbol, otherwise we crash with:

  #0  0x55895c01 in bfd_demangle ()
  #1  0x55823262 in demangle_sym (dso=0x55d92b90, elf_name=0x0, 
kmodule=0) at util/symbol-elf.c:215
  #2  dso__demangle_sym (dso=dso@entry=0x55d92b90, kmodule=, 
kmodule@entry=0, elf_name=elf_name@entry=0x0) at util/symbol-elf.c:400
  #3  0x557fef4b in new_inline_sym (funcname=0x0, 
base_sym=0x55d92b90, dso=0x55d92b90) at util/srcline.c:89
  #4  inline_list__append_dso_a2l (dso=dso@entry=0x55c7bb00, 
node=node@entry=0x55e31810, sym=sym@entry=0x55d92b90) at 
util/srcline.c:264
  #5  0x557ff27f in addr2line (dso_name=dso_name@entry=0x55d92430 
"/home/milian/.debug/.build-id/f7/186d14bb94f3c6161c010926da66033d24fce5/elf", 
addr=addr@entry=2888, file=file@entry=0x0,
  line=line@entry=0x0, dso=dso@entry=0x55c7bb00, 
unwind_inlines=unwind_inlines@entry=true, node=0x55e31810, 
sym=0x55d92b90) at util/srcline.c:313
  #6  0x557ffe7c in addr2inlines (sym=0x55d92b90, 
dso=0x55c7bb00, addr=2888, dso_name=0x55d92430 
"/home/milian/.debug/.build-id/f7/186d14bb94f3c6161c010926da66033d24fce5/elf")
  at util/srcline.c:358

So instead handle the case where we get invalid function names for
inlined frames and use a fallback '??' function name instead.

While this crash was originally reported by Hadrien for rust code, I can
now also reproduce it with trivial C++ code. Indeed, it seems like
libbfd fails to interpret the debug information for the inline frame
symbol name:

  $ addr2line -e /home/milian/.debug/.build-id/f7/186d14bb94f3c6161c010926da66033d24fce5/elf -if b48
  main
  /usr/include/c++/8.2.1/complex:610
  ??
  /usr/include/c++/8.2.1/complex:618
  ??
  /usr/include/c++/8.2.1/complex:675
  ??
  /usr/include/c++/8.2.1/complex:685
  main
  /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39

I've reported this bug upstream and also attached a patch there which
should fix this issue:

https://sourceware.org/bugzilla/show_bug.cgi?id=23715

Reported-by: Hadrien Grasland 
Signed-off-by: Milian Wolff 
Cc: Jin Yao 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Fixes: a64489c56c30 ("perf report: Find the inline stack for a given address")
[ The above 'Fixes:' cset is where originally the problem was
  introduced, i.e.  using a2l->funcname without checking if it is NULL,
  but this current patch fixes the current codebase, i.e. multiple csets
  were applied after a64489c56c30 before the problem was reported by Hadrien ]
Link: http://lkml.kernel.org/r/20180926135207.30263-3-milian.wo...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/srcline.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index 09d6746e6ec8..e767c4a9d4d2 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -85,6 +85,9 @@ static struct symbol *new_inline_sym(struct dso *dso,
 	struct symbol *inline_sym;
 	char *demangled = NULL;
 
+	if (!funcname)
+		funcname = "??";
+
 	if (dso) {
 		demangled = dso__demangle_sym(dso, 0, funcname);
 		if (demangled)


[tip:perf/urgent] perf record: Use unmapped IP for inline callchain cursors

2018-10-05 Thread tip-bot for Milian Wolff
Commit-ID:  7a8a8fcf7b860e4b2d4edc787c844d41cad9dfcf
Gitweb: https://git.kernel.org/tip/7a8a8fcf7b860e4b2d4edc787c844d41cad9dfcf
Author: Milian Wolff 
AuthorDate: Wed, 26 Sep 2018 15:52:06 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 5 Oct 2018 11:18:09 -0300

perf record: Use unmapped IP for inline callchain cursors

Only use the mapped IP to find inline frames, but keep using the
unmapped IP for the callchain cursor. This ensures we properly show the
unmapped IP when displaying a frame we received via the
dso__parse_addr_inlines API for a module which does not contain
sufficient debug symbols to show the srcline.

This is another follow-up to commit 19610184693c ("perf script: Show
virtual addresses instead of offsets").

Signed-off-by: Milian Wolff 
Acked-by: Jiri Olsa 
Tested-by: Ravi Bangoria 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Jin Yao 
Cc: Namhyung Kim 
Cc: Sandipan Das 
Fixes: 19610184693c ("perf script: Show virtual addresses instead of offsets")
Link: http://lkml.kernel.org/r/20180926135207.30263-2-milian.wo...@kdab.com
Link: http://lkml.kernel.org/r/20181002073949.3297-1-milian.wo...@kdab.com
[ Squashed a fix from Milian for a problem reported by Ravi, fixed up space damage ]
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/machine.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 0cb4f8bf3ca7..111ae858cbcb 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2286,7 +2286,8 @@ static int append_inlines(struct callchain_cursor *cursor,
 	if (!symbol_conf.inline_name || !map || !sym)
 		return ret;
 
-	addr = map__rip_2objdump(map, ip);
+	addr = map__map_ip(map, ip);
+	addr = map__rip_2objdump(map, addr);
 
 	inline_node = inlines__tree_find(&map->dso->inlined_nodes, addr);
 	if (!inline_node) {


[tip:perf/urgent] perf report: Don't try to map ip to invalid map

2018-10-05 Thread tip-bot for Milian Wolff
Commit-ID:  ff4ce2885af8f9e8e99864d78dbeb4673f089c76
Gitweb: https://git.kernel.org/tip/ff4ce2885af8f9e8e99864d78dbeb4673f089c76
Author: Milian Wolff 
AuthorDate: Wed, 26 Sep 2018 15:52:05 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Thu, 27 Sep 2018 16:05:43 -0300

perf report: Don't try to map ip to invalid map

Fixes a crash when the report encounters an address that could not be
associated with an mmaped region:

  #0  0x557bdc4a in callchain_srcline (ip=<optimized out>, sym=0x0, map=0x0) at util/machine.c:2329
  #1  unwind_entry (entry=entry@entry=0x7fff9180, 
arg=arg@entry=0x75642498) at util/machine.c:2329
  #2  0x558370af in entry (arg=0x75642498, cb=0x557bdb50 
<unwind_entry>, thread=<optimized out>, ip=18446744073709551615) at 
util/unwind-libunwind-local.c:586
  #3  get_entries (ui=ui@entry=0x7fff9620, cb=0x557bdb50 
<unwind_entry>, arg=0x75642498, max_stack=<optimized out>) at 
util/unwind-libunwind-local.c:703
  #4  0x55837192 in _unwind__get_entries (cb=<optimized out>, 
arg=<optimized out>, thread=<optimized out>, data=<optimized out>, 
max_stack=<optimized out>) at util/unwind-libunwind-local.c:725
  #5  0x557c310f in thread__resolve_callchain_unwind (max_stack=127, 
sample=0x7fff9830, evsel=0x55c7b3b0, cursor=0x75642498, 
thread=0x55c7f6f0) at util/machine.c:2351
  #6  thread__resolve_callchain (thread=0x55c7f6f0, cursor=0x75642498, 
evsel=0x55c7b3b0, sample=0x7fff9830, parent=0x7fff97b8, 
root_al=0x7fff9750, max_stack=127) at util/machine.c:2378
  #7  0x557ba4ee in sample__resolve_callchain (sample=<optimized out>, 
cursor=<optimized out>, parent=parent@entry=0x7fff97b8, evsel=<optimized out>, al=al@entry=0x7fff9750,
  max_stack=<optimized out>) at util/callchain.c:1085

Signed-off-by: Milian Wolff 
Tested-by: Sandipan Das 
Acked-by: Jiri Olsa 
Cc: Jin Yao 
Cc: Namhyung Kim 
Fixes: 2a9d5050dc84 ("perf script: Show correct offsets for DWARF-based unwinding")
Link: http://lkml.kernel.org/r/20180926135207.30263-1-milian.wo...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/machine.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index c4acd2001db0..0cb4f8bf3ca7 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2312,7 +2312,7 @@ static int unwind_entry(struct unwind_entry *entry, void *arg)
 {
 	struct callchain_cursor *cursor = arg;
 	const char *srcline = NULL;
-	u64 addr;
+	u64 addr = entry->ip;
 
 	if (symbol_conf.hide_unresolved && entry->sym == NULL)
 		return 0;
@@ -2324,7 +2324,8 @@ static int unwind_entry(struct unwind_entry *entry, void *arg)
 	 * Convert entry->ip from a virtual address to an offset in
 	 * its corresponding binary.
 	 */
-	addr = map__map_ip(entry->map, entry->ip);
+	if (entry->map)
+		addr = map__map_ip(entry->map, entry->ip);
 
 	srcline = callchain_srcline(entry->map, entry->sym, addr);
 	return callchain_cursor_append(cursor, entry->ip,


[tip:perf/core] perf util: Enable handling of inlined frames by default

2017-10-25 Thread tip-bot for Milian Wolff
Commit-ID:  d8a88dd243a170a226aba33e7c53704db2f82aa6
Gitweb: https://git.kernel.org/tip/d8a88dd243a170a226aba33e7c53704db2f82aa6
Author: Milian Wolff 
AuthorDate: Thu, 19 Oct 2017 13:38:36 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 25 Oct 2017 10:50:47 -0300

perf util: Enable handling of inlined frames by default

Now that we have caches in place to speed up the process of finding
inlined frames and srcline information repeatedly, we can enable this
useful option by default.

Suggested-by: Ingo Molnar 
Signed-off-by: Milian Wolff 
Reviewed-by: Andi Kleen 
Cc: David Ahern 
Cc: Jin Yao 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/20171019113836.5548-6-milian.wo...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-report.txt | 3 ++-
 tools/perf/Documentation/perf-script.txt | 3 ++-
 tools/perf/util/symbol.c | 1 +
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 383a98d..ddde2b5 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -434,7 +434,8 @@ include::itrace.txt[]
 
 --inline::
 	If a callgraph address belongs to an inlined function, the inline stack
-	will be printed. Each entry is function name or file/line.
+	will be printed. Each entry is function name or file/line. Enabled by
+	default, disable with --no-inline.
 
 include::callchain-overhead-calculation.txt[]
 
diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index bcc1ba3..25e6773 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -327,7 +327,8 @@ include::itrace.txt[]
 
 --inline::
 	If a callgraph address belongs to an inlined function, the inline stack
-	will be printed. Each entry has function name and file/line.
+	will be printed. Each entry has function name and file/line. Enabled by
+	default, disable with --no-inline.
 
 SEE ALSO
 
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 066e38a..ce6993b 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -45,6 +45,7 @@ struct symbol_conf symbol_conf = {
 	.show_hist_headers	= true,
 	.symfs			= "",
 	.event_group		= true,
+	.inline_name		= true,
 };
 
 static enum dso_binary_type binary_type_symtab[] = {


[tip:perf/core] perf report: Use srcline from callchain for hist entries

2017-10-25 Thread tip-bot for Milian Wolff
Commit-ID:  1fb7d06a509e82893e59e0f0b223e7d5d6d0ef8c
Gitweb: https://git.kernel.org/tip/1fb7d06a509e82893e59e0f0b223e7d5d6d0ef8c
Author: Milian Wolff 
AuthorDate: Thu, 19 Oct 2017 13:38:35 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 25 Oct 2017 10:50:46 -0300

perf report: Use srcline from callchain for hist entries

This also removes the symbol name from the srcline column, more on this
below.

This ensures we use the correct srcline, which could originate from a
potentially inlined function. The hist entries used to query for the
srcline based purely on the IP, which leads to wrong results for inlined
entries.

Before:

~
  perf report --inline -s srcline -g none --stdio
  ...
  # Children      Self  Source:Line
  # ........  ........  ...........
  #
      94.23%     0.00%  __libc_start_main+18446603487898210537
      94.23%     0.00%  _start+41
      44.58%     0.00%  main+100
      44.58%     0.00%  std::_Norm_helper::_S_do_it+100
      44.58%     0.00%  std::__complex_abs+100
      44.58%     0.00%  std::abs+100
      44.58%     0.00%  std::norm+100
      36.01%     0.00%  hypot+18446603487892193300
      25.81%     0.00%  main+41
      25.81%     0.00%  std::__detail::_Adaptor::operator()+41
      25.81%     0.00%  std::uniform_real_distribution::operator()+41
      25.75%    25.75%  random.h:143
      18.39%     0.00%  main+57
      18.39%     0.00%  std::__detail::_Adaptor::operator()+57
      18.39%     0.00%  std::uniform_real_distribution::operator()+57
      13.80%    13.80%  random.tcc:3330
       5.64%     0.00%  ??:0
       4.13%     4.13%  __hypot_finite+163
       4.13%     0.00%  __hypot_finite+18446603487892193443
  ...
~

After:

~
  perf report --inline -s srcline -g none --stdio
  ...
  # Children      Self  Source:Line
  # ........  ........  ...........
  #
      94.30%     1.19%  main.cpp:39
      94.23%     0.00%  __libc_start_main+18446603487898210537
      94.23%     0.00%  _start+41
      48.44%     1.70%  random.h:1823
      48.44%     0.00%  random.h:1814
      46.74%     2.53%  random.h:185
      44.68%     0.10%  complex:589
      44.68%     0.00%  complex:597
      44.68%     0.00%  complex:654
      44.68%     0.00%  complex:664
      40.61%    13.80%  random.tcc:3330
      36.01%     0.00%  hypot+18446603487892193300
      26.81%     0.00%  random.h:151
      26.81%     0.00%  random.h:332
      25.75%    25.75%  random.h:143
       5.64%     0.00%  ??:0
       4.13%     4.13%  __hypot_finite+163
       4.13%     0.00%  __hypot_finite+18446603487892193443
  ...
~

Note that this change removes the symbol from the source:line hist
column. Users who want this information should query for it
explicitly, i.e. run this command instead:

~
  perf report --inline -s sym,srcline -g none --stdio
  ...
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 1K of event 'cycles:uppp'
  # Event count (approx.): 1381229476
  #
  # Children      Self  Symbol                                                    Source:Line
  # ........  ........  ........................................................  ...........
  ...
  #
      94.30%     1.19%  [.] main                                                  main.cpp:39
      94.23%     0.00%  [.] __libc_start_main                                     __libc_start_main+18446603487898210537
      94.23%     0.00%  [.] _start                                                _start+41
      48.44%     0.00%  [.] std::uniform_real_distribution::operator() (inlined)  random.h:1814
      48.44%     0.00%  [.] std::uniform_real_distribution::operator() (inlined)  random.h:1823
      46.74%     0.00%  [.] std::__detail::_Adaptor::operator() (inlined)         random.h:185
      44.68%     0.00%  [.] std::_Norm_helper::_S_do_it (inlined)                 complex:654
      44.68%     0.00%  [.] std::__complex_abs (inlined)                          complex:589
      44.68%     0.00%  [.] std::abs (inlined)                                    complex:597
      44.68%     0.00%  [.] std::norm (inlined)

[tip:perf/core] perf report: Cache srclines for callchain nodes

2017-10-25 Thread tip-bot for Milian Wolff
Commit-ID:  21ac9d547fdde79c1e8692587d9044fde549214b
Gitweb: https://git.kernel.org/tip/21ac9d547fdde79c1e8692587d9044fde549214b
Author: Milian Wolff 
AuthorDate: Thu, 19 Oct 2017 13:38:34 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 25 Oct 2017 10:50:46 -0300

perf report: Cache srclines for callchain nodes

On one hand this ensures that the memory is properly freed when the DSO
gets freed. On the other hand this significantly speeds up the
processing of the callchain nodes when lots of srclines are requested.
For one of my data files e.g.:

Before:

 Performance counter stats for 'perf report -s srcline -g srcline --stdio':

      52496.495043      task-clock (msec)         #    0.999 CPUs utilized
               634      context-switches          #    0.012 K/sec
                 2      cpu-migrations            #    0.000 K/sec
           191,561      page-faults               #    0.004 M/sec
   165,074,498,235      cycles                    #    3.144 GHz
   334,170,832,408      instructions              #    2.02  insn per cycle
    90,220,029,745      branches                  # 1718.591 M/sec
       654,525,177      branch-misses             #    0.73% of all branches

      52.533273822 seconds time elapsed

 Processed 236605 events and lost 40 chunks!

After:

 Performance counter stats for 'perf report -s srcline -g srcline --stdio':

      22606.323706      task-clock (msec)         #    1.000 CPUs utilized
                31      context-switches          #    0.001 K/sec
                 0      cpu-migrations            #    0.000 K/sec
           185,471      page-faults               #    0.008 M/sec
    71,188,113,681      cycles                    #    3.149 GHz
   133,204,943,083      instructions              #    1.87  insn per cycle
    34,886,384,979      branches                  # 1543.214 M/sec
       278,214,495      branch-misses             #    0.80% of all branches

      22.609857253 seconds time elapsed

Note that the difference is only this large when `--inline` is not
passed. When `--inline` is passed, we use the inliner cache instead and
thus do not run this code path that often.

I think that this cache should actually be used in other places, too.
When looking at the valgrind leak report for perf report, we see tons of
srclines being leaked, most notably from calls to
hist_entry__get_srcline. The problem is that get_srcline has many
different formatting options (show_sym, show_addr, potentially even
unwind_inlines when calling __get_srcline directly). As such, the
srcline cannot easily be cached for all calls, or we'd have to add
caches for all formatting combinations (6 so far). An alternative would
be to remove the formatting options and handle that on a different level
- i.e. print the sym/addr on demand wherever we actually output
something. And the unwind_inlines could be moved into a separate
function that does not return the srcline.

Signed-off-by: Milian Wolff 
Reviewed-by: Andi Kleen 
Cc: David Ahern 
Cc: Jin Yao 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/20171019113836.5548-4-milian.wo...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/dso.c |  2 ++
 tools/perf/util/dso.h |  1 +
 tools/perf/util/machine.c | 17 +---
 tools/perf/util/srcline.c | 66 +++
 tools/perf/util/srcline.h |  7 +
 5 files changed, 90 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 75c8250..3192b60 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -1203,6 +1203,7 @@ struct dso *dso__new(const char *name)
 			dso->symbols[i] = dso->symbol_names[i] = RB_ROOT;
 		dso->data.cache = RB_ROOT;
 		dso->inlined_nodes = RB_ROOT;
+		dso->srclines = RB_ROOT;
 		dso->data.fd = -1;
 		dso->data.status = DSO_DATA_STATUS_UNKNOWN;
 		dso->symtab_type = DSO_BINARY_TYPE__NOT_FOUND;
@@ -1237,6 +1238,7 @@ void dso__delete(struct dso *dso)
 
 	/* free inlines first, as they reference symbols */
 	inlines__tree_delete(&dso->inlined_nodes);
+	srcline__tree_delete(&dso->srclines);
 	for (i = 0; i < MAP__NR_TYPES; ++i)
 		symbols__delete(&dso->symbols[i]);
 
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 122eca0..821b16c 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -142,6 +142,7 @@ struct dso {
struct rb_root   symbols[MAP__NR_TYPES];
struct rb_root   symbol_names[MAP__NR_TYPES];
struct rb_root   inlined_nodes;
+   struct rb_root   srclines;
struct {
u64 addr;
struct symbol   *symbol;
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 177c1d4..94d8f1c 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1711,11 +1711,22 @@ struct 

[tip:perf/core] perf report: Compare symbol name for inlined frames when sorting

2017-10-25 Thread tip-bot for Milian Wolff
Commit-ID:  aa441895f7b4ff5394d4964a8e6749f3866e44d0
Gitweb: https://git.kernel.org/tip/aa441895f7b4ff5394d4964a8e6749f3866e44d0
Author: Milian Wolff 
AuthorDate: Mon, 9 Oct 2017 22:33:04 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 24 Oct 2017 09:59:56 -0300

perf report: Compare symbol name for inlined frames when sorting

Similar to the callstack frame matching, we also have to compare the
symbol name when sorting hist entries. The reason is twofold: On one
hand, multiple inlined functions will use the same symbol start/end
values of the parent, non-inlined symbol.

As such, all of these symbols often end up missing from top-level
report, as they get merged with the non-inlined frame. On the other
hand, multiple different functions may end up inlining the same
function, and we need to aggregate these values properly.
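
A minimal sketch of the comparison rule described above, using a
reduced stand-in for the symbol structure. The names demo_symbol and
demo_sym_cmp are hypothetical, and the real comparator in sort.c does
more than this; the sketch only shows why the name has to be consulted
once either side is an inlined frame.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced stand-in for perf's struct symbol. */
struct demo_symbol {
	uint64_t start;
	bool inlined;
	const char *name;
};

/*
 * Returns 0 when both symbols should be merged into one entry.
 * Inlined symbols reuse the start address of their non-inlined parent,
 * so the address cannot tell them apart; compare them by name instead.
 */
static int64_t demo_sym_cmp(const struct demo_symbol *l,
			    const struct demo_symbol *r)
{
	if (l == r)
		return 0;

	if (l->inlined || r->inlined)
		return strcmp(l->name, r->name);

	/* Simplified: the sketch only looks at the start address here. */
	if (l->start != r->start)
		return l->start < r->start ? 1 : -1;

	return 0;
}
```

Two inlined copies of the same function compare equal even when their
parents differ, which gives the aggregation across callers mentioned in
the commit message.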

Before:

~
  perf report --stdio --inline -g none
  # Children      Self  Command       Shared Object  Symbol
  # ........  ........  ............  .............  ...................
  #
     100.00%    39.69%  cpp-inlining  cpp-inlining   [.] main
     100.00%     0.00%  cpp-inlining  cpp-inlining   [.] _start
     100.00%     0.00%  cpp-inlining  libc-2.25.so   [.] __libc_start_main
      97.03%     0.00%  cpp-inlining  cpp-inlining   [.] std::norm (inlined)
      59.53%     4.26%  cpp-inlining  libm-2.25.so   [.] hypot
      55.21%    55.08%  cpp-inlining  libm-2.25.so   [.] __hypot_finite
       0.52%     0.52%  cpp-inlining  libm-2.25.so   [.] cabs
~

After:

~
  perf report --stdio --inline -g none
  # Children      Self  Command       Shared Object  Symbol
  # ........  ........  ............  .............  ...................
  #
     100.00%    39.69%  cpp-inlining  cpp-inlining   [.] main
     100.00%     0.00%  cpp-inlining  cpp-inlining   [.] _start
     100.00%     0.00%  cpp-inlining  libc-2.25.so   [.] __libc_start_main
      62.57%     0.00%  cpp-inlining  cpp-inlining   [.] std::_Norm_helper::_S_do_it (inlined)
      62.57%     0.00%  cpp-inlining  cpp-inlining   [.] std::__complex_abs (inlined)
      62.57%     0.00%  cpp-inlining  cpp-inlining   [.] std::abs (inlined)
      62.57%     0.00%  cpp-inlining  cpp-inlining   [.] std::norm (inlined)
      59.53%     4.26%  cpp-inlining  libm-2.25.so   [.] hypot
      55.21%    55.08%  cpp-inlining  libm-2.25.so   [.] __hypot_finite
      34.46%     0.00%  cpp-inlining  cpp-inlining   [.] std::uniform_real_distribution::operator() (inlined)
      32.39%     0.00%  cpp-inlining  cpp-inlining   [.] std::__detail::_Adaptor::operator() (inlined)
      32.39%     0.00%  cpp-inlining  cpp-inlining   [.] std::generate_canonical (inlined)
      12.29%     0.00%  cpp-inlining  cpp-inlining   [.] std::__detail::_Mod::__calc (inlined)
      12.29%     0.00%  cpp-inlining  cpp-inlining   [.] std::__detail::__mod (inlined)
      12.29%     0.00%  cpp-inlining  cpp-inlining   [.] std::linear_congruential_engine::operator() (inlined)
       0.52%     0.52%  cpp-inlining  libm-2.25.so   [.] cabs
~

Signed-off-by: Milian Wolff 
Reviewed-by: Jiri Olsa 
Reviewed-by: Namhyung Kim 
Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Yao Jin 
Link: http://lkml.kernel.org/r/20171009203310.17362-11-milian.wo...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/sort.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index acb9210..006d10a 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -225,6 +225,9 @@ static int64_t _sort__sym_cmp(struct symbol *sym_l, struct symbol *sym_r)
if (sym_l == sym_r)
return 0;
 
+   if (sym_l->inlined || sym_r->inlined)
+   return strcmp(sym_l->name, sym_r->name);
+
if (sym_l->start != sym_r->start)
return (int64_t)(sym_r->start - sym_l->start);
 


[tip:perf/core] perf report: Cache failed lookups of inlined frames

2017-10-25 Thread tip-bot for Milian Wolff
Commit-ID:  b38775cf7678d7715b35dded3dcfab66e244baae
Gitweb: https://git.kernel.org/tip/b38775cf7678d7715b35dded3dcfab66e244baae
Author: Milian Wolff 
AuthorDate: Thu, 19 Oct 2017 13:38:33 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 25 Oct 2017 10:50:45 -0300

perf report: Cache failed lookups of inlined frames

When no inlined frames could be found for a given address, we did not
store this information anywhere. That means we potentially do the costly
inliner lookup repeatedly for cases where we know it can never succeed.

This patch makes dso__parse_addr_inlines always return a valid
inline_node. It will be empty when no inliners are found. This enables
us to cache the empty list in the DSO, thereby improving the performance
when many addresses fail to find the inliners.
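
The idea of caching the empty result can be sketched as follows. This
is a simplified illustration that uses a small open-addressing table
instead of the rb-tree kept in the DSO, and all names in it (demo_dso,
demo_inline_node, demo_parse_addr_inlines, demo_find_inlines) are
hypothetical stand-ins rather than perf's API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CACHE_SLOTS 64

/* Hypothetical cache entry: nr_inlines == 0 records a failed lookup. */
struct demo_inline_node {
	uint64_t addr;
	size_t nr_inlines;
	bool used;
};

struct demo_dso {
	struct demo_inline_node cache[DEMO_CACHE_SLOTS];
	unsigned long expensive_lookups;	/* counts slow-path runs */
};

/*
 * Stand-in for the costly debug-info walk. For the sake of the demo it
 * pretends that odd addresses have one inlined frame and even ones none.
 */
static size_t demo_parse_addr_inlines(struct demo_dso *dso, uint64_t addr)
{
	dso->expensive_lookups++;
	return addr & 1;
}

/* Always returns a node; an empty node means "looked up, nothing found". */
static const struct demo_inline_node *
demo_find_inlines(struct demo_dso *dso, uint64_t addr)
{
	size_t i, slot = addr % DEMO_CACHE_SLOTS;

	for (i = 0; i < DEMO_CACHE_SLOTS; i++) {
		struct demo_inline_node *node =
			&dso->cache[(slot + i) % DEMO_CACHE_SLOTS];

		if (node->used && node->addr == addr)
			return node;	/* hit, possibly a negative entry */

		if (!node->used) {
			node->addr = addr;
			node->nr_inlines = demo_parse_addr_inlines(dso, addr);
			node->used = true;	/* cached even when empty */
			return node;
		}
	}

	return NULL;	/* table full; a real cache would grow instead */
}
```

A second lookup of an address without inlined frames then returns the
cached empty node without touching the slow path again.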

For my trivial example, the performance impact is already quite
significant:

Before:

~
 Performance counter stats for 'perf report --stdio --inline -g srcline -s srcline' (5 runs):

        594.804032      task-clock (msec)         #    0.998 CPUs utilized            ( +-  0.07% )
                53      context-switches          #    0.089 K/sec                    ( +-  4.09% )
                 0      cpu-migrations            #    0.000 K/sec                    ( +-100.00% )
             5,687      page-faults               #    0.010 M/sec                    ( +-  0.02% )
     2,300,918,213      cycles                    #    3.868 GHz                      ( +-  0.09% )
     4,395,839,080      instructions              #    1.91  insn per cycle           ( +-  0.00% )
       939,177,205      branches                  # 1578.969 M/sec                    ( +-  0.00% )
        11,824,633      branch-misses             #    1.26% of all branches          ( +-  0.10% )

       0.596246531 seconds time elapsed                                          ( +-  0.07% )
~

After:

~
 Performance counter stats for 'perf report --stdio --inline -g srcline -s srcline' (5 runs):

        113.111405      task-clock (msec)         #    0.990 CPUs utilized            ( +-  0.89% )
                29      context-switches          #    0.255 K/sec                    ( +- 54.25% )
                 0      cpu-migrations            #    0.000 K/sec
             5,380      page-faults               #    0.048 M/sec                    ( +-  0.01% )
       432,378,779      cycles                    #    3.823 GHz                      ( +-  0.75% )
       670,057,633      instructions              #    1.55  insn per cycle           ( +-  0.01% )
       141,001,247      branches                  # 1246.570 M/sec                    ( +-  0.01% )
         2,346,845      branch-misses             #    1.66% of all branches          ( +-  0.19% )

       0.114222393 seconds time elapsed                                          ( +-  1.19% )
~

Signed-off-by: Milian Wolff 
Reviewed-by: Andi Kleen 
Cc: David Ahern 
Cc: Jin Yao 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/20171019113836.5548-3-milian.wo...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/machine.c | 15 +++
 tools/perf/util/srcline.c | 16 +---
 2 files changed, 8 insertions(+), 23 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 3d049cb..177c1d4 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2115,9 +2115,10 @@ static int append_inlines(struct callchain_cursor *cursor,
struct inline_node *inline_node;
struct inline_list *ilist;
u64 addr;
+   int ret = 1;
 
if (!symbol_conf.inline_name || !map || !sym)
-   return 1;
+   return ret;
 
addr = map__rip_2objdump(map, ip);
 
@@ -2125,22 +2126,20 @@ static int append_inlines(struct callchain_cursor *cursor,
if (!inline_node) {
inline_node = dso__parse_addr_inlines(map->dso, addr, sym);
if (!inline_node)
-   return 1;
-
+   return ret;
inlines__tree_insert(&map->dso->inlined_nodes, inline_node);
}
 
list_for_each_entry(ilist, &inline_node->val, list) {
-   int ret = callchain_cursor_append(cursor, ip, map,
- ilist->symbol, false,
- NULL, 0, 0, 0,
- ilist->srcline);
+   ret = callchain_cursor_append(cursor, ip, map,
+ ilist->symbol, false,
+ NULL, 0, 0, 0, ilist->srcline);
 
if (ret != 0)
return ret;
}
 
-   return 0;
+   return ret;
 }
 
 static int unwind_entry(struct unwind_entry *entry, void *arg)
diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index 

[tip:perf/core] perf report: Properly handle branch count in match_chain()

2017-10-25 Thread tip-bot for Milian Wolff
Commit-ID:  bf36eb5c4b3ef0ebfb19b1a67a5fa5821e6c9fa7
Gitweb: https://git.kernel.org/tip/bf36eb5c4b3ef0ebfb19b1a67a5fa5821e6c9fa7
Author: Milian Wolff 
AuthorDate: Fri, 20 Oct 2017 12:14:47 -0300
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 25 Oct 2017 10:50:37 -0300

perf report: Properly handle branch count in match_chain()

Some of the code paths I introduced before returned too early without
running the code to handle a node's branch count.  By refactoring
match_chain to only have one exit point, this can be remedied.
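
To illustrate the single-exit structure in isolation, here is a reduced
sketch. The types and names (demo_frame, demo_match_chain, the
by_function flag) are hypothetical simplifications and do not mirror
the real callchain structures; the point is only that every comparison
path stores its result in one variable, so the branch-count bookkeeping
at the end cannot be skipped by an early return.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum demo_match {
	DEMO_MATCH_ERROR = -1,
	DEMO_MATCH_EQ,
	DEMO_MATCH_LT,
	DEMO_MATCH_GT,
};

/* Hypothetical, reduced callchain frame. */
struct demo_frame {
	const char *sym_name;	/* NULL when the frame is unresolved */
	uint64_t ip;
	uint64_t branch_count;
};

static enum demo_match demo_match_strings(const char *l, const char *r)
{
	int ret;

	if (!l || !r)
		return DEMO_MATCH_ERROR;

	ret = strcmp(l, r);
	if (ret == 0)
		return DEMO_MATCH_EQ;
	return ret < 0 ? DEMO_MATCH_LT : DEMO_MATCH_GT;
}

static enum demo_match demo_match_chain(const struct demo_frame *node,
					struct demo_frame *cnode,
					bool by_function)
{
	enum demo_match match = DEMO_MATCH_ERROR;

	if (by_function)
		match = demo_match_strings(cnode->sym_name, node->sym_name);

	/* No usable symbol names: fall back to comparing addresses. */
	if (match == DEMO_MATCH_ERROR) {
		if (cnode->ip == node->ip)
			match = DEMO_MATCH_EQ;
		else
			match = cnode->ip < node->ip ? DEMO_MATCH_LT
						     : DEMO_MATCH_GT;
	}

	/* Single exit point: runs for every path that ended in a match. */
	if (match == DEMO_MATCH_EQ)
		cnode->branch_count += node->branch_count;

	return match;
}
```

With early returns in the name-based paths, the accumulation at the end
would only have run for the address fallback, which is the kind of bug
the commit message describes.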

Signed-off-by: Milian Wolff 
Acked-by: Namhyung Kim 
Cc: David Ahern 
Cc: Jin Yao 
Cc: Peter Zijlstra 
Cc: Ravi Bangoria 
Link: http://lkml.kernel.org/r/1707691.qaJ269GSZW@agathebauer
Link: http://lkml.kernel.org/r/20171018185350.14893-2-milian.wo...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/callchain.c | 140 
 1 file changed, 78 insertions(+), 62 deletions(-)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 35a920f..19bfcad 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -666,83 +666,99 @@ static enum match_result match_chain_strings(const char *left,
return ret;
 }
 
-static enum match_result match_chain(struct callchain_cursor_node *node,
-struct callchain_list *cnode)
+/*
+ * We need to always use relative addresses because we're aggregating
+ * callchains from multiple threads, i.e. different address spaces, so
+ * comparing absolute addresses make no sense as a symbol in a DSO may end up
+ * in a different address when used in a different binary or even the same
+ * binary but with some sort of address randomization technique, thus we need
+ * to compare just relative addresses. -acme
+ */
+static enum match_result match_chain_dso_addresses(struct map *left_map, u64 left_ip,
+  struct map *right_map, u64 right_ip)
 {
-   struct symbol *sym = node->sym;
-   u64 left, right;
-   struct dso *left_dso = NULL;
-   struct dso *right_dso = NULL;
+   struct dso *left_dso = left_map ? left_map->dso : NULL;
+   struct dso *right_dso = right_map ? right_map->dso : NULL;
 
-   if (callchain_param.key == CCKEY_SRCLINE) {
-   enum match_result match = match_chain_strings(cnode->srcline,
- node->srcline);
+   if (left_dso != right_dso)
+   return left_dso < right_dso ? MATCH_LT : MATCH_GT;
 
-   /* if no srcline is available, fallback to symbol name */
-   if (match == MATCH_ERROR && cnode->ms.sym && node->sym)
-   match = match_chain_strings(cnode->ms.sym->name,
-   node->sym->name);
+   if (left_ip != right_ip)
+   return left_ip < right_ip ? MATCH_LT : MATCH_GT;
 
-   if (match != MATCH_ERROR)
-   return match;
+   return MATCH_EQ;
+}
 
-   /* otherwise fall-back to IP-based comparison below */
-   }
+static enum match_result match_chain(struct callchain_cursor_node *node,
+struct callchain_list *cnode)
+{
+   enum match_result match = MATCH_ERROR;
 
-   if (cnode->ms.sym && sym && callchain_param.key == CCKEY_FUNCTION) {
-   /*
-* Compare inlined frames based on their symbol name because
-* different inlined frames will have the same symbol start
-*/
-   if (cnode->ms.sym->inlined || node->sym->inlined)
-   return match_chain_strings(cnode->ms.sym->name,
-  node->sym->name);
-
-   left = cnode->ms.sym->start;
-   right = sym->start;
-   left_dso = cnode->ms.map->dso;
-   right_dso = node->map->dso;
-   } else {
-   left = cnode->ip;
-   right = node->ip;
+   switch (callchain_param.key) {
+   case CCKEY_SRCLINE:
+   match = match_chain_strings(cnode->srcline, node->srcline);
+   if (match != MATCH_ERROR)
+   break;
+   /* otherwise fall-back to symbol-based comparison below */
+   __fallthrough;
+   case CCKEY_FUNCTION:
+   if (node->sym && cnode->ms.sym) {
+   /*
+* Compare inlined frames based on their symbol name
+* because different inlined frames will have the same
+* symbol start. Otherwise do a faster comparison based
+* on the symbol start address.
+*/
+   if (cnode->ms.sym->inlined || node->sym->inlined) {
+   match = 

[tip:perf/core] perf callchain: Compare symbol name for inlined frames when matching

2017-10-25 Thread tip-bot for Milian Wolff
Commit-ID:  9856240ad3269f2fdab0b2fa4400ef8aab792061
Gitweb: https://git.kernel.org/tip/9856240ad3269f2fdab0b2fa4400ef8aab792061
Author: Milian Wolff 
AuthorDate: Mon, 9 Oct 2017 22:33:03 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 24 Oct 2017 09:59:56 -0300

perf callchain: Compare symbol name for inlined frames when matching

The fake symbols we create for inlined frames will represent different
functions but can share the same symbol start address. This leads to
issues when different inline branches all lead to the same function.

Before:
~
$ perf report -s sym -i perf.inlining.data --inline --stdio -g function
...
             --38.86%--_start
                       __libc_start_main
                       main
                       |
                        --37.57%--std::norm (inlined)
                                  std::_Norm_helper::_S_do_it (inlined)
                                  |
                                   --36.36%--std::abs (inlined)
                                             std::__complex_abs (inlined)
                                             |
                                              --12.24%--std::linear_congruential_engine::operator() (inlined)
                                                        std::__detail::__mod (inlined)
                                                        std::__detail::_Mod::__calc (inlined)
~

Note that this backtrace representation is completely bogus.
Complex abs does not call the linear congruential engine! It
is just a side-effect of a longer inlined stack being appended
to a shorter, different inlined stack, both of which originate
in the same function (main).
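
The effect can be reproduced with a small sketch that counts how many
leading frames two callchains are considered to share. The types and
helpers (demo_cc_frame, demo_frames_match, demo_shared_prefix) are
hypothetical and much simpler than the callchain code; they assume, as
described above, that inlined frames carry the start address of the
function they were inlined into.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced callchain frame. */
struct demo_cc_frame {
	uint64_t sym_start;
	bool inlined;
	const char *name;
};

static bool demo_frames_match(const struct demo_cc_frame *a,
			      const struct demo_cc_frame *b,
			      bool compare_inlined_names)
{
	/* Inlined frames share the parent's start, so use the name. */
	if (compare_inlined_names && (a->inlined || b->inlined))
		return strcmp(a->name, b->name) == 0;

	return a->sym_start == b->sym_start;
}

/* Number of leading frames that two callchains are taken to share. */
static size_t demo_shared_prefix(const struct demo_cc_frame *a, size_t na,
				 const struct demo_cc_frame *b, size_t nb,
				 bool compare_inlined_names)
{
	size_t i = 0;

	while (i < na && i < nb &&
	       demo_frames_match(&a[i], &b[i], compare_inlined_names))
		i++;

	return i;
}
```

When only the start address is compared, a three-frame chain and a
four-frame chain that both begin in main share all three leading
frames, so the remainder of the longer chain is attached below the
shorter one. Comparing the names of inlined frames limits the shared
prefix to main itself.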

This patch fixes the issue:

~
$ perf report -s sym -i perf.inlining.data --inline --stdio -g function
...
             --38.86%--_start
                       __libc_start_main
                       main
                       |
                       |--35.59%--std::uniform_real_distribution::operator() (inlined)
                       |          std::uniform_real_distribution::operator() (inlined)
                       |          |
                       |           --34.37%--std::__detail::_Adaptor::operator() (inlined)
                       |                     std::generate_canonical (inlined)
                       |                     |
                       |                      --12.24%--std::linear_congruential_engine::operator() (inlined)
                       |                                std::__detail::__mod (inlined)
                       |                                std::__detail::_Mod::__calc (inlined)
                       |
                        --1.99%--std::norm (inlined)
                                  std::_Norm_helper::_S_do_it (inlined)
                                  std::abs (inlined)
                                  std::__complex_abs (inlined)
~

Signed-off-by: Milian Wolff 
Reviewed-by: Jiri Olsa 
Reviewed-by: Namhyung Kim 
Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Ravi Bangoria 
Cc: Yao Jin 
Link: http://lkml.kernel.org/r/20171009203310.17362-10-milian.wo...@kdab.com
Cc: Arnaldo Carvalho de Melo 
[ Fix up conflict with c1fbc0cf81f1 ("perf callchain: Compare dsos (as well) 
for CCKEY_FUNCTION"), remove unneeded hunk ]
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/callchain.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 77031ef..35a920f 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -690,6 +690,14 @@ static enum match_result match_chain(struct callchain_cursor_node *node,
}
 
if (cnode->ms.sym && sym && callchain_param.key == CCKEY_FUNCTION) {
+   /*
+* Compare inlined frames based on their symbol name because
+* different inlined frames will have the same symbol start
+*/
+   if (cnode->ms.sym->inlined || node->sym->inlined)
+   return match_chain_strings(cnode->ms.sym->name,
+  node->sym->name);
+
left = cnode->ms.sym->start;
right = sym->start;
left_dso = cnode->ms.map->dso;


[tip:perf/core] perf callchain: Compare symbol name for inlined frames when matching

2017-10-25 Thread tip-bot for Milian Wolff
Commit-ID:  9856240ad3269f2fdab0b2fa4400ef8aab792061
Gitweb: https://git.kernel.org/tip/9856240ad3269f2fdab0b2fa4400ef8aab792061
Author: Milian Wolff 
AuthorDate: Mon, 9 Oct 2017 22:33:03 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 24 Oct 2017 09:59:56 -0300

perf callchain: Compare symbol name for inlined frames when matching

The fake symbols we create for inlined frames will represent different
functions but can use the symbol start address. This leads to issues
when different inline branches all lead to the same function.

Before:
~
$ perf report -s sym -i perf.inlining.data --inline --stdio -g function
...
 --38.86%--_start
   __libc_start_main
   main
   |
--37.57%--std::norm (inlined)
  std::_Norm_helper::_S_do_it 
(inlined)
  |
   --36.36%--std::abs (inlined)
 std::__complex_abs (inlined)
 |
  
--12.24%--std::linear_congruential_engine::operator() (inlined)

std::__detail::__mod (inlined)

std::__detail::_Mod::__calc (inlined)
~

Note that this backtrace representation is completely bogus.
Complex abs does not call the linear congruential engine! It
is just a side-effect of a longer inlined stack being appended
to a shorter, different inlined stack, both of which originate
in the same function (main).

This patch fixes the issue:

~
$ perf report -s sym -i perf.inlining.data --inline --stdio -g function
...
 --38.86%--_start
           __libc_start_main
           main
           |
           |--35.59%--std::uniform_real_distribution::operator() > (inlined)
           |          std::uniform_real_distribution::operator() > (inlined)
           |          |
           |           --34.37%--std::__detail::_Adaptor, double>::operator() (inlined)
           |                     std::generate_canonical > (inlined)
           |                     |
           |                      --12.24%--std::linear_congruential_engine::operator() (inlined)
           |                                std::__detail::__mod (inlined)
           |                                std::__detail::_Mod::__calc (inlined)
           |
            --1.99%--std::norm (inlined)
                     std::_Norm_helper::_S_do_it (inlined)
                     std::abs (inlined)
                     std::__complex_abs (inlined)
~
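
The comparison rule described above can be illustrated with a small,
self-contained sketch. It is not the perf implementation: struct demo_symbol,
enum demo_match and demo_match_frames() are hypothetical stand-ins that only
model the idea that inlined frames share the start address of their enclosing
function and therefore have to be told apart by name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for perf's struct symbol. */
struct demo_symbol {
	uint64_t start;   /* inlined frames reuse the enclosing function's start */
	const char *name;
	bool inlined;
};

enum demo_match { DEMO_MATCH_LT = -1, DEMO_MATCH_EQ = 0, DEMO_MATCH_GT = 1 };

/*
 * Compare two frames. If either one is inlined, the start address is
 * ambiguous, so the names are compared instead.
 */
static enum demo_match demo_match_frames(const struct demo_symbol *a,
					 const struct demo_symbol *b)
{
	assert(a && b);

	if (a->inlined || b->inlined) {
		int cmp = strcmp(a->name, b->name);

		if (cmp == 0)
			return DEMO_MATCH_EQ;
		return cmp < 0 ? DEMO_MATCH_LT : DEMO_MATCH_GT;
	}

	if (a->start == b->start)
		return DEMO_MATCH_EQ;
	return a->start < b->start ? DEMO_MATCH_LT : DEMO_MATCH_GT;
}
```

With this rule, two inlined frames that both report the start address of main
but carry different names no longer compare equal, so their chains are not
merged into one branch.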

Signed-off-by: Milian Wolff 
Reviewed-by: Jiri Olsa 
Reviewed-by: Namhyung Kim 
Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Ravi Bangoria 
Cc: Yao Jin 
Link: http://lkml.kernel.org/r/20171009203310.17362-10-milian.wo...@kdab.com
Cc: Arnaldo Carvalho de Melo 
[ Fix up conflict with c1fbc0cf81f1 ("perf callchain: Compare dsos (as well) 
for CCKEY_FUNCTION"), remove unneeded hunk ]
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/callchain.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 77031ef..35a920f 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -690,6 +690,14 @@ static enum match_result match_chain(struct callchain_cursor_node *node,
 	}
 
 	if (cnode->ms.sym && sym && callchain_param.key == CCKEY_FUNCTION) {
+		/*
+		 * Compare inlined frames based on their symbol name because
+		 * different inlined frames will have the same symbol start
+		 */
+		if (cnode->ms.sym->inlined || node->sym->inlined)
+			return match_chain_strings(cnode->ms.sym->name,
+						   node->sym->name);
+
 		left = cnode->ms.sym->start;
 		right = sym->start;
 		left_dso = cnode->ms.map->dso;


[tip:perf/core] perf script: Mark inlined frames and do not print DSO for them

2017-10-25 Thread tip-bot for Milian Wolff
Commit-ID:  9628b56dc1240ce0faa3bd9b7a3390fa4451c59f
Gitweb: https://git.kernel.org/tip/9628b56dc1240ce0faa3bd9b7a3390fa4451c59f
Author: Milian Wolff 
AuthorDate: Mon, 9 Oct 2017 22:33:02 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 24 Oct 2017 09:59:56 -0300

perf script: Mark inlined frames and do not print DSO for them

Instead of showing the (repeated) DSO name of the non-inlined frame, we
now show the "(inlined)" suffix instead.

Before:
   214f7 __hypot_finite (/usr/lib/libm-2.25.so)
    ace3 hypot (/usr/lib/libm-2.25.so)
     a4a std::__complex_abs (/home/milian/projects/src/perf-tests/inlining)
     a4a std::abs (/home/milian/projects/src/perf-tests/inlining)
     a4a std::_Norm_helper::_S_do_it (/home/milian/projects/src/perf-tests/inlining)
     a4a std::norm (/home/milian/projects/src/perf-tests/inlining)
     a4a main (/home/milian/projects/src/perf-tests/inlining)
   20510 __libc_start_main (/usr/lib/libc-2.25.so)
     bd9 _start (/home/milian/projects/src/perf-tests/inlining)

After:
   214f7 __hypot_finite (/usr/lib/libm-2.25.so)
    ace3 hypot (/usr/lib/libm-2.25.so)
     a4a std::__complex_abs (inlined)
     a4a std::abs (inlined)
     a4a std::_Norm_helper::_S_do_it (inlined)
     a4a std::norm (inlined)
     a4a main (/home/milian/projects/src/perf-tests/inlining)
   20510 __libc_start_main (/usr/lib/libc-2.25.so)
     bd9 _start (/home/milian/projects/src/perf-tests/inlining)
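
The output rule shown above can be sketched as follows. This is only an
illustration under stated assumptions: struct demo_frame and
demo_format_frame() are hypothetical names, and the sketch formats into a
buffer instead of writing to a FILE as perf does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified description of one callchain frame. */
struct demo_frame {
	unsigned long ip;
	const char *sym_name;
	const char *dso_name;
	bool inlined;
};

/*
 * Format one frame. Inlined frames get the " (inlined)" suffix and never a
 * DSO name; regular frames get the DSO name only when print_dso is set.
 * Returns whatever snprintf() returns.
 */
static int demo_format_frame(char *buf, size_t size,
			     const struct demo_frame *f, bool print_dso)
{
	assert(buf && f);

	if (f->inlined)
		return snprintf(buf, size, "%lx %s (inlined)",
				f->ip, f->sym_name);
	if (print_dso)
		return snprintf(buf, size, "%lx %s (%s)",
				f->ip, f->sym_name, f->dso_name);
	return snprintf(buf, size, "%lx %s", f->ip, f->sym_name);
}
```

Because the inlined frames all belong to the DSO of the enclosing non-inlined
frame, dropping the repeated path loses no information.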

Signed-off-by: Milian Wolff 
Reviewed-by: Jiri Olsa 
Reviewed-by: Namhyung Kim 
Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Yao Jin 
Link: http://lkml.kernel.org/r/20171009203310.17362-9-milian.wo...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/evsel_fprintf.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/evsel_fprintf.c b/tools/perf/util/evsel_fprintf.c
index f2c6c5e..5b9e892 100644
--- a/tools/perf/util/evsel_fprintf.c
+++ b/tools/perf/util/evsel_fprintf.c
@@ -157,7 +157,7 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment,
 				}
 			}
 
-			if (print_dso) {
+			if (print_dso && (!node->sym || !node->sym->inlined)) {
 				printed += fprintf(fp, " (");
 				printed += map__fprintf_dsoname(node->map, fp);
 				printed += fprintf(fp, ")");
@@ -166,6 +166,9 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment,
 			if (print_srcline)
 				printed += map__fprintf_srcline(node->map, addr, "\n  ", fp);
 
+			if (node->sym && node->sym->inlined)
+				printed += fprintf(fp, " (inlined)");
+
 			if (!print_oneline)
 				printed += fprintf(fp, "\n");
 


[tip:perf/core] perf report: Remove code to handle inline frames from browsers

2017-10-25 Thread tip-bot for Milian Wolff
Commit-ID:  2a704fc8db7b0080a67d9f4f4cb2a7bcaf79949d
Gitweb: https://git.kernel.org/tip/2a704fc8db7b0080a67d9f4f4cb2a7bcaf79949d
Author: Milian Wolff 
AuthorDate: Mon, 9 Oct 2017 22:32:55 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 24 Oct 2017 09:59:55 -0300

perf report: Remove code to handle inline frames from browsers

The follow-up commits will make inline frames first-class citizens in
the callchain, thereby obsoleting all of this special code.

Signed-off-by: Milian Wolff 
Reviewed-by: Jiri Olsa 
Reviewed-by: Namhyung Kim 
Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Yao Jin 
Link: http://lkml.kernel.org/r/20171009203310.17362-2-milian.wo...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/ui/browsers/hists.c  | 180 +++-
 tools/perf/ui/stdio/hist.c  |  77 +
 tools/perf/util/evsel_fprintf.c |  32 ---
 tools/perf/util/hist.c  |   5 --
 tools/perf/util/sort.h  |   1 -
 5 files changed, 13 insertions(+), 282 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 13dfb0a..3a433f3 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -154,57 +154,9 @@ static void callchain_list__set_folding(struct 
callchain_list *cl, bool unfold)
cl->unfolded = unfold ? cl->has_children : false;
 }
 
-static struct inline_node *inline_node__create(struct map *map, u64 ip)
-{
-   struct dso *dso;
-   struct inline_node *node;
-
-   if (map == NULL)
-   return NULL;
-
-   dso = map->dso;
-   if (dso == NULL)
-   return NULL;
-
-   node = dso__parse_addr_inlines(dso,
-  map__rip_2objdump(map, ip));
-
-   return node;
-}
-
-static int inline__count_rows(struct inline_node *node)
-{
-   struct inline_list *ilist;
-   int i = 0;
-
-   if (node == NULL)
-   return 0;
-
-   list_for_each_entry(ilist, &node->val, list) {
-   if ((ilist->filename != NULL) || (ilist->funcname != NULL))
-   i++;
-   }
-
-   return i;
-}
-
-static int callchain_list__inline_rows(struct callchain_list *chain)
-{
-   struct inline_node *node;
-   int rows;
-
-   node = inline_node__create(chain->ms.map, chain->ip);
-   if (node == NULL)
-   return 0;
-
-   rows = inline__count_rows(node);
-   inline_node__delete(node);
-   return rows;
-}
-
 static int callchain_node__count_rows_rb_tree(struct callchain_node *node)
 {
-   int n = 0, inline_rows;
+   int n = 0;
struct rb_node *nd;
 
for (nd = rb_first(&node->rb_root); nd; nd = rb_next(nd)) {
@@ -215,12 +167,6 @@ static int callchain_node__count_rows_rb_tree(struct 
callchain_node *node)
list_for_each_entry(chain, >val, list) {
++n;
 
-   if (symbol_conf.inline_name) {
-   inline_rows =
-   callchain_list__inline_rows(chain);
-   n += inline_rows;
-   }
-
/* We need this because we may not have children */
folded_sign = callchain_list__folded(chain);
if (folded_sign == '+')
@@ -272,7 +218,7 @@ static int callchain_node__count_rows(struct callchain_node 
*node)
 {
struct callchain_list *chain;
bool unfolded = false;
-   int n = 0, inline_rows;
+   int n = 0;
 
if (callchain_param.mode == CHAIN_FLAT)
return callchain_node__count_flat_rows(node);
@@ -281,10 +227,6 @@ static int callchain_node__count_rows(struct 
callchain_node *node)
 
list_for_each_entry(chain, &node->val, list) {
++n;
-   if (symbol_conf.inline_name) {
-   inline_rows = callchain_list__inline_rows(chain);
-   n += inline_rows;
-   }
 
unfolded = chain->unfolded;
}
@@ -432,19 +374,6 @@ static void hist_entry__init_have_children(struct 
hist_entry *he)
he->init_have_children = true;
 }
 
-static void hist_entry_init_inline_node(struct hist_entry *he)
-{
-   if (he->inline_node)
-   return;
-
-   he->inline_node = inline_node__create(he->ms.map, he->ip);
-
-   if (he->inline_node == NULL)
-   return;
-
-   he->has_children = true;
-}
-
 static bool hist_browser__toggle_fold(struct hist_browser *browser)
 {
struct hist_entry *he = browser->he_selection;
@@ -476,12 +405,8 @@ static bool hist_browser__toggle_fold(struct hist_browser 
*browser)
 
if (he->unfolded) {
if (he->leaf)
-   if (he->inline_node)
-   he->nr_rows = inline__count_rows(
-   

[tip:perf/core] perf callchain: Create real callchain entries for inlined frames

2017-10-25 Thread tip-bot for Milian Wolff
Commit-ID:  11ea2515f32e783b9a7984c148e742c377383915
Gitweb: https://git.kernel.org/tip/11ea2515f32e783b9a7984c148e742c377383915
Author: Milian Wolff 
AuthorDate: Mon, 9 Oct 2017 22:32:59 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 24 Oct 2017 09:59:55 -0300

perf callchain: Create real callchain entries for inlined frames

The inline_node structs are maintained by the new dso->inlines tree.
This in turn keeps ownership of the fake symbols and srcline string
representing an inline frame.

This tree is sorted by address to allow quick lookups. All other entries
of the symbol besides the function name are unused for inline frames. The
advantage of this approach is that all existing users of the callchain
API can now transparently display inlined frames without having to patch
their code.
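
The lookup pattern described here (find by address, otherwise parse once and
cache) can be sketched as below. This is a simplified illustration, not the
perf code: all demo_* names are hypothetical, and a plain unbalanced binary
search tree stands in for the rbtree that perf uses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical cache entry, keyed by address. */
struct demo_inline_node {
	uint64_t addr;
	struct demo_inline_node *left, *right;
};

struct demo_inline_tree {
	struct demo_inline_node *root;
	unsigned int nr_parsed;   /* how often the expensive path ran */
};

static struct demo_inline_node *demo_tree_find(struct demo_inline_tree *tree,
						uint64_t addr)
{
	struct demo_inline_node *n = tree->root;

	while (n && n->addr != addr)
		n = addr < n->addr ? n->left : n->right;
	return n;
}

static void demo_tree_insert(struct demo_inline_tree *tree,
			     struct demo_inline_node *node)
{
	struct demo_inline_node **p = &tree->root;

	while (*p)
		p = node->addr < (*p)->addr ? &(*p)->left : &(*p)->right;
	*p = node;
}

/* Stand-in for the expensive debug-info parse of one address. */
static struct demo_inline_node *demo_parse(struct demo_inline_tree *tree,
					   uint64_t addr)
{
	struct demo_inline_node *node = calloc(1, sizeof(*node));

	if (node) {
		node->addr = addr;
		tree->nr_parsed++;
	}
	return node;
}

/* Return the cached node for addr, parsing and caching it on first use. */
static struct demo_inline_node *demo_lookup(struct demo_inline_tree *tree,
					    uint64_t addr)
{
	struct demo_inline_node *node;

	assert(tree);
	node = demo_tree_find(tree, addr);
	if (!node) {
		node = demo_parse(tree, addr);
		if (node)
			demo_tree_insert(tree, node);
	}
	return node;
}

/* The tree owns its nodes, so deleting the tree frees all of them. */
static void demo_tree_delete(struct demo_inline_node *node)
{
	if (!node)
		return;
	demo_tree_delete(node->left);
	demo_tree_delete(node->right);
	free(node);
}
```

Keeping ownership in one tree means that callers can hold plain pointers to
the cached data and that everything is released in a single place.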

Signed-off-by: Milian Wolff 
Reviewed-by: Jiri Olsa 
Reviewed-by: Namhyung Kim 
Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Yao Jin 
Link: http://lkml.kernel.org/r/20171009203310.17362-6-milian.wo...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/dso.c |  5 +
 tools/perf/util/dso.h |  1 +
 tools/perf/util/machine.c | 37 ++
 tools/perf/util/srcline.c | 51 +++
 tools/perf/util/srcline.h |  9 +
 5 files changed, 103 insertions(+)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 339e529..75c8250 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -10,6 +10,7 @@
 #include "compress.h"
 #include "path.h"
 #include "symbol.h"
+#include "srcline.h"
 #include "dso.h"
 #include "machine.h"
 #include "auxtrace.h"
@@ -1201,6 +1202,7 @@ struct dso *dso__new(const char *name)
 		for (i = 0; i < MAP__NR_TYPES; ++i)
 			dso->symbols[i] = dso->symbol_names[i] = RB_ROOT;
 		dso->data.cache = RB_ROOT;
+		dso->inlined_nodes = RB_ROOT;
 		dso->data.fd = -1;
 		dso->data.status = DSO_DATA_STATUS_UNKNOWN;
 		dso->symtab_type = DSO_BINARY_TYPE__NOT_FOUND;
@@ -1232,6 +1234,9 @@ void dso__delete(struct dso *dso)
 	if (!RB_EMPTY_NODE(&dso->rb_node))
 		pr_err("DSO %s is still in rbtree when being deleted!\n",
 		       dso->long_name);
+
+	/* free inlines first, as they reference symbols */
+	inlines__tree_delete(&dso->inlined_nodes);
 	for (i = 0; i < MAP__NR_TYPES; ++i)
 		symbols__delete(&dso->symbols[i]);
 
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index a2bbb21..122eca0 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -141,6 +141,7 @@ struct dso {
struct rb_root   *root; /* root of rbtree that rb_node is in */
struct rb_root   symbols[MAP__NR_TYPES];
struct rb_root   symbol_names[MAP__NR_TYPES];
+   struct rb_root   inlined_nodes;
struct {
u64 addr;
struct symbol   *symbol;
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index a37e1c0..3d049cb 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2109,6 +2109,40 @@ check_calls:
 	return 0;
 }
 
+static int append_inlines(struct callchain_cursor *cursor,
+			  struct map *map, struct symbol *sym, u64 ip)
+{
+	struct inline_node *inline_node;
+	struct inline_list *ilist;
+	u64 addr;
+
+	if (!symbol_conf.inline_name || !map || !sym)
+		return 1;
+
+	addr = map__rip_2objdump(map, ip);
+
+	inline_node = inlines__tree_find(&map->dso->inlined_nodes, addr);
+	if (!inline_node) {
+		inline_node = dso__parse_addr_inlines(map->dso, addr, sym);
+		if (!inline_node)
+			return 1;
+
+		inlines__tree_insert(&map->dso->inlined_nodes, inline_node);
+	}
+
+	list_for_each_entry(ilist, &inline_node->val, list) {
+		int ret = callchain_cursor_append(cursor, ip, map,
+						  ilist->symbol, false,
+						  NULL, 0, 0, 0,
+						  ilist->srcline);
+
+		if (ret != 0)
+			return ret;
+	}
+
+	return 0;
+}
+
 static int unwind_entry(struct unwind_entry *entry, void *arg)
 {
 	struct callchain_cursor *cursor = arg;
@@ -2117,6 +2151,9 @@ static int unwind_entry(struct unwind_entry *entry, void *arg)
 	if (symbol_conf.hide_unresolved && entry->sym == NULL)
 		return 0;
 
+	if (append_inlines(cursor, entry->map, entry->sym, entry->ip) == 0)
+		return 0;
+
 	srcline = callchain_srcline(entry->map, entry->sym, entry->ip);
 	return callchain_cursor_append(cursor, entry->ip,
 				       entry->map, entry->sym,
diff --git a/tools/perf/util/srcline.c 

[tip:perf/core] perf callchain: Mark inlined frames in output by " (inlined)" suffix

2017-10-25 Thread tip-bot for Milian Wolff
Commit-ID:  8932f8071cae8a12dfd5f49224ee176b0da4
Gitweb: https://git.kernel.org/tip/8932f8071cae8a12dfd5f49224ee176b0da4
Author: Milian Wolff 
AuthorDate: Mon, 9 Oct 2017 22:33:01 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 24 Oct 2017 09:59:56 -0300

perf callchain: Mark inlined frames in output by " (inlined)" suffix

The original patch that introduced inline frame output in the various
browsers used this suffix already. The new centralized approach that
uses fake symbols for inlined frames was missing this approach so far.

Instead of changing the symbol name itself, we only print the suffix
where needed. This allows us to efficiently lookup the symbol for a
given name without first having to append the suffix before the lookup.
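
A minimal sketch of this design choice, assuming a hypothetical struct
demo_sym and a linear name search in place of perf's symbol lookup: the suffix
is produced only while formatting, so the stored name stays usable as a lookup
key.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified symbol: the name never contains the suffix. */
struct demo_sym {
	const char *name;
	bool inlined;
};

/* Format the display label; the suffix is appended only here. */
static int demo_sym_label(char *buf, size_t size, const struct demo_sym *sym)
{
	assert(buf && sym);
	return snprintf(buf, size, "%s%s", sym->name,
			sym->inlined ? " (inlined)" : "");
}

/* Look a symbol up by its plain name (linear search for illustration). */
static const struct demo_sym *demo_find_by_name(const struct demo_sym *syms,
						size_t nr, const char *name)
{
	for (size_t i = 0; i < nr; i++)
		if (strcmp(syms[i].name, name) == 0)
			return &syms[i];
	return NULL;
}
```

Had the suffix been stored in the name, every lookup for "std::abs" would
first have to build "std::abs (inlined)" to find the inlined variant.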

Signed-off-by: Milian Wolff 
Reviewed-by: Jiri Olsa 
Reviewed-by: Namhyung Kim 
Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Yao Jin 
Link: http://lkml.kernel.org/r/20171009203310.17362-8-milian.wo...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/callchain.c | 10 +++---
 tools/perf/util/sort.c  |  3 +++
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 0f2ba49..77031ef 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -,11 +,15 @@ char *callchain_list__sym_name(struct callchain_list *cl,
 	int printed;
 
 	if (cl->ms.sym) {
+		const char *inlined = cl->ms.sym->inlined ? " (inlined)" : "";
+
 		if (show_srcline && cl->srcline)
-			printed = scnprintf(bf, bfsize, "%s %s",
-					cl->ms.sym->name, cl->srcline);
+			printed = scnprintf(bf, bfsize, "%s %s%s",
+					    cl->ms.sym->name, cl->srcline,
+					    inlined);
 		else
-			printed = scnprintf(bf, bfsize, "%s", cl->ms.sym->name);
+			printed = scnprintf(bf, bfsize, "%s%s",
+					    cl->ms.sym->name, inlined);
 	} else
 		printed = scnprintf(bf, bfsize, "%#" PRIx64, cl->ip);
 
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index eb3ab90..acb9210 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -283,6 +283,9 @@ static int _hist_entry__sym_snprintf(struct map *map, struct symbol *sym,
 			ret += repsep_snprintf(bf + ret, size - ret, "%.*s",
 					       width - ret,
 					       sym->name);
+			if (sym->inlined)
+				ret += repsep_snprintf(bf + ret, size - ret,
+						       " (inlined)");
 		}
 	} else {
 		size_t len = BITS_PER_LONG / 4;


[tip:perf/core] perf callchain: Store srcline in callchain_cursor_node

2017-10-25 Thread tip-bot for Milian Wolff
Commit-ID:  40a342cda2cd9bc8f7bf81c5ce1a141584760757
Gitweb: https://git.kernel.org/tip/40a342cda2cd9bc8f7bf81c5ce1a141584760757
Author: Milian Wolff 
AuthorDate: Mon, 9 Oct 2017 22:32:56 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 24 Oct 2017 09:59:55 -0300

perf callchain: Store srcline in callchain_cursor_node

This is mostly a preparation to enable the creation of full callchain
nodes for inline frames. Such frames will reference the IP of the
non-inlined frame, but hold the symbol and srcline for an inlined
location. As such, we won't be able to query the srcline on-demand based
on the IP alone. Instead, we will leverage the functionality provided by
this patch here, and store the srcline for the inlined nodes in the new
srcline member of callchain_cursor_node.

Note that this patch on its own leaks the srcline, as there is no
free_callchain_cursor_node or similar. A future patch will add caching
of the srcline and handle deletion properly.
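
Why the srcline has to travel with the node can be seen in a small sketch. It
assumes hypothetical demo_cursor types with a fixed-size array instead of
perf's linked cursor. Two frames share one IP, so the IP alone cannot recover
their distinct source lines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_NODES 16

/* Hypothetical, simplified cursor node. */
struct demo_cursor_node {
	uint64_t ip;
	const char *sym_name;
	const char *srcline;   /* borrowed pointer, not freed by the cursor */
};

struct demo_cursor {
	struct demo_cursor_node nodes[DEMO_MAX_NODES];
	size_t nr;
};

/* Append one frame and store its srcline; returns 0 or -1 when full. */
static int demo_cursor_append(struct demo_cursor *cursor, uint64_t ip,
			      const char *sym_name, const char *srcline)
{
	struct demo_cursor_node *node;

	assert(cursor);
	if (cursor->nr == DEMO_MAX_NODES)
		return -1;

	node = &cursor->nodes[cursor->nr++];
	node->ip = ip;
	node->sym_name = sym_name;
	node->srcline = srcline;
	return 0;
}
```

In this sketch the cursor only borrows the string, so whoever created it stays
responsible for freeing it. That is the ownership question the commit message
defers to a later patch.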

Signed-off-by: Milian Wolff 
Reviewed-by: Jiri Olsa 
Reviewed-by: Namhyung Kim 
Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Yao Jin 
Link: http://lkml.kernel.org/r/20171009203310.17362-3-milian.wo...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/callchain.c | 31 +--
 tools/perf/util/callchain.h |  6 --
 tools/perf/util/machine.c   | 18 --
 3 files changed, 29 insertions(+), 26 deletions(-)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index a971caf..e7ee794 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -566,6 +566,7 @@ fill_node(struct callchain_node *node, struct 
callchain_cursor *cursor)
call->ip = cursor_node->ip;
call->ms.sym = cursor_node->sym;
call->ms.map = map__get(cursor_node->map);
+   call->srcline = cursor_node->srcline;
 
if (cursor_node->branch) {
call->branch_count = 1;
@@ -647,20 +648,11 @@ enum match_result {
 static enum match_result match_chain_srcline(struct callchain_cursor_node 
*node,
 struct callchain_list *cnode)
 {
-   char *left = NULL;
-   char *right = NULL;
+   const char *left = cnode->srcline;
+   const char *right = node->srcline;
enum match_result ret = MATCH_EQ;
int cmp;
 
-   if (cnode->ms.map)
-   left = get_srcline(cnode->ms.map->dso,
-map__rip_2objdump(cnode->ms.map, cnode->ip),
-cnode->ms.sym, true, false);
-   if (node->map)
-   right = get_srcline(node->map->dso,
- map__rip_2objdump(node->map, node->ip),
- node->sym, true, false);
-
if (left && right)
cmp = strcmp(left, right);
else if (!left && right)
@@ -675,8 +667,6 @@ static enum match_result match_chain_srcline(struct 
callchain_cursor_node *node,
if (cmp != 0)
ret = cmp < 0 ? MATCH_LT : MATCH_GT;
 
-   free_srcline(left);
-   free_srcline(right);
return ret;
 }
 
@@ -969,7 +959,7 @@ merge_chain_branch(struct callchain_cursor *cursor,
list_for_each_entry_safe(list, next_list, >val, list) {
callchain_cursor_append(cursor, list->ip,
list->ms.map, list->ms.sym,
-   false, NULL, 0, 0, 0);
+   false, NULL, 0, 0, 0, list->srcline);
list_del(&list->list);
map__zput(list->ms.map);
free(list);
@@ -1009,7 +999,8 @@ int callchain_merge(struct callchain_cursor *cursor,
 int callchain_cursor_append(struct callchain_cursor *cursor,
u64 ip, struct map *map, struct symbol *sym,
bool branch, struct branch_flags *flags,
-   int nr_loop_iter, u64 iter_cycles, u64 branch_from)
+   int nr_loop_iter, u64 iter_cycles, u64 branch_from,
+   const char *srcline)
 {
struct callchain_cursor_node *node = *cursor->last;
 
@@ -1028,6 +1019,7 @@ int callchain_cursor_append(struct callchain_cursor 
*cursor,
node->branch = branch;
node->nr_loop_iter = nr_loop_iter;
node->iter_cycles = iter_cycles;
+   node->srcline = srcline;
 
if (flags)
memcpy(&node->branch_flags, flags,
@@ -1115,12 +1107,7 @@ char *callchain_list__sym_name(struct callchain_list *cl,
int printed;
 
if (cl->ms.sym) {
-   if (show_srcline && cl->ms.map && !cl->srcline)
-   cl->srcline = 

[tip:perf/core] perf report: Fall-back to function name comparison for -g srcline

2017-10-25 Thread tip-bot for Milian Wolff
Commit-ID:  cbe50f61727f538f05e63186c2e0022182a3a28f
Gitweb: https://git.kernel.org/tip/cbe50f61727f538f05e63186c2e0022182a3a28f
Author: Milian Wolff 
AuthorDate: Mon, 9 Oct 2017 22:33:00 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 24 Oct 2017 09:59:55 -0300

perf report: Fall-back to function name comparison for -g srcline

When a callchain entry has no srcline available, we ended up comparing
the instruction pointer. I consider this to be not too useful. Rather, I
think we should group the entries by function name, which this patch
adds. For people who want to split the data on the IP boundary, using
`-g address` is the correct choice.
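
The fallback order described above (srcline first, then function name, then address) can be sketched as follows. This is a minimal, self-contained illustration: `struct entry`, `match_strings` and `match_entry` are hypothetical stand-ins, not the structures or functions used in perf itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum match_result { MATCH_ERROR = -1, MATCH_EQ, MATCH_LT, MATCH_GT };

/* Hypothetical, simplified stand-in for a callchain entry. */
struct entry {
	const char *srcline;	/* "file:line", or NULL if unknown */
	const char *sym_name;	/* function name, or NULL if unresolved */
};

/* Three-way string comparison; MATCH_ERROR means neither side has a string. */
static enum match_result match_strings(const char *left, const char *right)
{
	int cmp;

	if (left && right)
		cmp = strcmp(left, right);
	else if (!left && right)
		cmp = 1;
	else if (left && !right)
		cmp = -1;
	else
		return MATCH_ERROR;

	if (cmp == 0)
		return MATCH_EQ;
	return cmp < 0 ? MATCH_LT : MATCH_GT;
}

/* Compare by srcline first, then fall back to the function name. */
static enum match_result match_entry(const struct entry *a,
				     const struct entry *b)
{
	enum match_result match = match_strings(a->srcline, b->srcline);

	if (match == MATCH_ERROR && a->sym_name && b->sym_name)
		match = match_strings(a->sym_name, b->sym_name);

	/* MATCH_ERROR here means the caller has to compare addresses. */
	return match;
}
```

With this ordering, two entries that lack a srcline but share a function name compare equal, which is what collapses the repeated `__hypot_finite` rows into one.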

Before:

~
   100.00%38.86%  [.] main
|
|--61.14%--main inlining.cpp:14
|  std::norm complex:664
|  std::_Norm_helper::_S_do_it complex:654
|  std::abs complex:597
|  std::__complex_abs complex:589
|  |
|  |--56.03%--hypot
|  |  |
|  |  |--8.45%--__hypot_finite
|  |  |
|  |  |--7.62%--__hypot_finite
|  |  |
|  |  |--2.29%--__hypot_finite
|  |  |
|  |  |--2.24%--__hypot_finite
|  |  |
|  |  |--2.06%--__hypot_finite
|  |  |
|  |  |--1.81%--__hypot_finite
...
~

After:

~
   100.00%38.86%  [.] main
|
|--61.14%--main inlining.cpp:14
|  std::norm complex:664
|  std::_Norm_helper::_S_do_it complex:654
|  std::abs complex:597
|  std::__complex_abs complex:589
|  |
|  |--60.29%--hypot
|  |  |
|  |   --56.03%--__hypot_finite
|  |
|   --0.85%--cabs
~

Signed-off-by: Milian Wolff 
Reviewed-by: Jiri Olsa 
Reviewed-by: Namhyung Kim 
Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Yao Jin 
Link: http://lkml.kernel.org/r/20171009203310.17362-7-milian.wo...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/callchain.c | 20 
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index e7ee794..0f2ba49 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -645,11 +645,9 @@ enum match_result {
MATCH_GT,
 };
 
-static enum match_result match_chain_srcline(struct callchain_cursor_node 
*node,
-struct callchain_list *cnode)
+static enum match_result match_chain_strings(const char *left,
+const char *right)
 {
-   const char *left = cnode->srcline;
-   const char *right = node->srcline;
enum match_result ret = MATCH_EQ;
int cmp;
 
@@ -659,10 +657,8 @@ static enum match_result match_chain_srcline(struct 
callchain_cursor_node *node,
cmp = 1;
else if (left && !right)
cmp = -1;
-   else if (cnode->ip == node->ip)
-   cmp = 0;
else
-   cmp = (cnode->ip < node->ip) ? -1 : 1;
+   return MATCH_ERROR;
 
if (cmp != 0)
ret = cmp < 0 ? MATCH_LT : MATCH_GT;
@@ -679,10 +675,18 @@ static enum match_result match_chain(struct 
callchain_cursor_node *node,
struct dso *right_dso = NULL;
 
if (callchain_param.key == CCKEY_SRCLINE) {
-   enum match_result match = match_chain_srcline(node, cnode);
+   enum match_result match = match_chain_strings(cnode->srcline,
+ node->srcline);
+
+   /* if no srcline is available, fallback to symbol name */
+   if (match == MATCH_ERROR && cnode->ms.sym && node->sym)
+   match = match_chain_strings(cnode->ms.sym->name,
+   node->sym->name);
 
if (match != MATCH_ERROR)
return match;
+
+   /* otherwise fall-back to IP-based comparison below */
}
 
if (cnode->ms.sym && sym && callchain_param.key == CCKEY_FUNCTION) {


[tip:perf/core] perf callchain: Store srcline in callchain_cursor_node

2017-10-25 Thread tip-bot for Milian Wolff
Commit-ID:  40a342cda2cd9bc8f7bf81c5ce1a141584760757
Gitweb: https://git.kernel.org/tip/40a342cda2cd9bc8f7bf81c5ce1a141584760757
Author: Milian Wolff 
AuthorDate: Mon, 9 Oct 2017 22:32:56 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 24 Oct 2017 09:59:55 -0300

perf callchain: Store srcline in callchain_cursor_node

This is mostly a preparation to enable the creation of full callchain
nodes for inline frames. Such frames will reference the IP of the
non-inlined frame, but hold the symbol and srcline for an inlined
location. As such, we won't be able to query the srcline on-demand based
on the IP alone. Instead, we will leverage the functionality provided by
this patch here, and store the srcline for the inlined nodes in the new
srcline member of callchain_cursor_node.

Note that this patch on its own leaks the srcline, as there is no
free_callchain_cursor_node or similar. A future patch will add caching
of the srcline and handle deletion properly.
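
To illustrate why the srcline has to travel with the node, here is a minimal sketch under simplifying assumptions. `struct cursor_node`, `cursor_node_set` and `same_location` are hypothetical names, and the address used in the usage note is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified cursor node; not perf's actual struct layout. */
struct cursor_node {
	uint64_t ip;
	const char *sym_name;
	const char *srcline;	/* borrowed pointer, stored when the node is filled */
};

static void cursor_node_set(struct cursor_node *node, uint64_t ip,
			    const char *sym_name, const char *srcline)
{
	node->ip = ip;
	node->sym_name = sym_name;
	node->srcline = srcline;
}

/* Returns 1 if two nodes refer to the same source location, 0 otherwise. */
static int same_location(const struct cursor_node *a,
			 const struct cursor_node *b)
{
	/* Without a srcline on either side, only the address can decide. */
	if (!a->srcline || !b->srcline)
		return a->srcline == b->srcline && a->ip == b->ip;
	return strcmp(a->srcline, b->srcline) == 0;
}
```

An inlined frame and its enclosing frame can share one `ip` (say `0x4005d0`) while carrying `complex:664` and `inlining.cpp:14` respectively, so a lookup keyed on the address alone could not tell them apart.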

Signed-off-by: Milian Wolff 
Reviewed-by: Jiri Olsa 
Reviewed-by: Namhyung Kim 
Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Yao Jin 
Link: http://lkml.kernel.org/r/20171009203310.17362-3-milian.wo...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/callchain.c | 31 +--
 tools/perf/util/callchain.h |  6 --
 tools/perf/util/machine.c   | 18 --
 3 files changed, 29 insertions(+), 26 deletions(-)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index a971caf..e7ee794 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -566,6 +566,7 @@ fill_node(struct callchain_node *node, struct 
callchain_cursor *cursor)
call->ip = cursor_node->ip;
call->ms.sym = cursor_node->sym;
call->ms.map = map__get(cursor_node->map);
+   call->srcline = cursor_node->srcline;
 
if (cursor_node->branch) {
call->branch_count = 1;
@@ -647,20 +648,11 @@ enum match_result {
 static enum match_result match_chain_srcline(struct callchain_cursor_node 
*node,
 struct callchain_list *cnode)
 {
-   char *left = NULL;
-   char *right = NULL;
+   const char *left = cnode->srcline;
+   const char *right = node->srcline;
enum match_result ret = MATCH_EQ;
int cmp;
 
-   if (cnode->ms.map)
-   left = get_srcline(cnode->ms.map->dso,
-map__rip_2objdump(cnode->ms.map, cnode->ip),
-cnode->ms.sym, true, false);
-   if (node->map)
-   right = get_srcline(node->map->dso,
- map__rip_2objdump(node->map, node->ip),
- node->sym, true, false);
-
if (left && right)
cmp = strcmp(left, right);
else if (!left && right)
@@ -675,8 +667,6 @@ static enum match_result match_chain_srcline(struct 
callchain_cursor_node *node,
if (cmp != 0)
ret = cmp < 0 ? MATCH_LT : MATCH_GT;
 
-   free_srcline(left);
-   free_srcline(right);
return ret;
 }
 
@@ -969,7 +959,7 @@ merge_chain_branch(struct callchain_cursor *cursor,
list_for_each_entry_safe(list, next_list, &src->val, list) {
callchain_cursor_append(cursor, list->ip,
list->ms.map, list->ms.sym,
-   false, NULL, 0, 0, 0);
+   false, NULL, 0, 0, 0, list->srcline);
list_del(&list->list);
map__zput(list->ms.map);
free(list);
@@ -1009,7 +999,8 @@ int callchain_merge(struct callchain_cursor *cursor,
 int callchain_cursor_append(struct callchain_cursor *cursor,
u64 ip, struct map *map, struct symbol *sym,
bool branch, struct branch_flags *flags,
-   int nr_loop_iter, u64 iter_cycles, u64 branch_from)
+   int nr_loop_iter, u64 iter_cycles, u64 branch_from,
+   const char *srcline)
 {
struct callchain_cursor_node *node = *cursor->last;
 
@@ -1028,6 +1019,7 @@ int callchain_cursor_append(struct callchain_cursor 
*cursor,
node->branch = branch;
node->nr_loop_iter = nr_loop_iter;
node->iter_cycles = iter_cycles;
+   node->srcline = srcline;
 
if (flags)
memcpy(&node->branch_flags, flags,
@@ -1115,12 +1107,7 @@ char *callchain_list__sym_name(struct callchain_list *cl,
int printed;
 
if (cl->ms.sym) {
-   if (show_srcline && cl->ms.map && !cl->srcline)
-   cl->srcline = get_srcline(cl->ms.map->dso,
- map__rip_2objdump(cl->ms.map,
-   cl->ip),
- 

[tip:perf/core] perf callchain: Refactor inline_list to operate on symbols

2017-10-25 Thread tip-bot for Milian Wolff
Commit-ID:  fea0cf842c7aa08950063264ab1cfbce4ba38c1b
Gitweb: https://git.kernel.org/tip/fea0cf842c7aa08950063264ab1cfbce4ba38c1b
Author: Milian Wolff 
AuthorDate: Mon, 9 Oct 2017 22:32:57 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 24 Oct 2017 09:59:55 -0300

perf callchain: Refactor inline_list to operate on symbols

This is a requirement to create real callchain entries for inlined
frames.

Since the list of inlines usually contains the target symbol too, i.e.
the location where the frames get inlined to, we alias that symbol and
reuse it as-is. This ensures that other dependent functionality keeps
working, most notably annotation of the target frames.

For all other entries in the inline_list, a fake symbol is created.
These are marked by a new 'inlined' member, which is set to true. Only
those symbols are managed by the inline_list and get freed when the
inline_list is deleted from within inline_node__delete.
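
The ownership rule described above (alias the real symbol, own only the fake ones) can be sketched like this. It is a simplified illustration, not the code from the patch: `struct sym` and the helper names are hypothetical, and the real `struct symbol` carries more fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified symbol. */
struct sym {
	bool inlined;	/* true only for fake symbols owned by the inline list */
	char *name;
};

static struct sym *sym_new(const char *name, bool inlined)
{
	size_t len = strlen(name) + 1;
	struct sym *s = calloc(1, sizeof(*s));

	if (!s)
		return NULL;
	s->name = malloc(len);
	if (!s->name) {
		free(s);
		return NULL;
	}
	memcpy(s->name, name, len);
	s->inlined = inlined;
	return s;
}

static void sym_delete(struct sym *s)
{
	if (s) {
		free(s->name);
		free(s);
	}
}

/* Reuse base when the names match, otherwise create a fake inlined symbol. */
static struct sym *inline_sym_for(struct sym *base, const char *funcname)
{
	if (base && strcmp(funcname, base->name) == 0)
		return base;
	return sym_new(funcname, true);
}

/* Free only what the inline list owns; aliased symbols are left alone. */
static void inline_sym_release(struct sym *s)
{
	if (s && s->inlined)
		sym_delete(s);
}
```

Checking the `inlined` flag on release is what keeps the aliased target symbol from being freed twice, once by the list and once by its real owner.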

Signed-off-by: Milian Wolff 
Reviewed-by: Jiri Olsa 
Reviewed-by: Namhyung Kim 
Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Yao Jin 
Link: http://lkml.kernel.org/r/20171009203310.17362-4-milian.wo...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/srcline.c | 93 ---
 tools/perf/util/srcline.h |  7 +++-
 tools/perf/util/symbol.h  |  1 +
 3 files changed, 69 insertions(+), 32 deletions(-)

diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index ed8e8d2..c0af61b 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -33,29 +33,19 @@ static const char *dso__name(struct dso *dso)
return dso_name;
 }
 
-static int inline_list__append(char *filename, char *funcname, int line_nr,
-  struct inline_node *node, struct dso *dso)
+static int inline_list__append(struct symbol *symbol, char *filename,
+  int line_nr, struct inline_node *node)
 {
struct inline_list *ilist;
-   char *demangled;
 
ilist = zalloc(sizeof(*ilist));
if (ilist == NULL)
return -1;
 
+   ilist->symbol = symbol;
ilist->filename = filename;
ilist->line_nr = line_nr;
 
-   if (dso != NULL) {
-   demangled = dso__demangle_sym(dso, 0, funcname);
-   if (demangled == NULL) {
-   ilist->funcname = funcname;
-   } else {
-   ilist->funcname = demangled;
-   free(funcname);
-   }
-   }
-
if (callchain_param.order == ORDER_CALLEE)
list_add_tail(&ilist->list, &node->val);
else
@@ -206,19 +196,56 @@ static void addr2line_cleanup(struct a2l_data *a2l)
 
 #define MAX_INLINE_NEST 1024
 
+static struct symbol *new_inline_sym(struct dso *dso,
+struct symbol *base_sym,
+const char *funcname)
+{
+   struct symbol *inline_sym;
+   char *demangled = NULL;
+
+   if (dso) {
+   demangled = dso__demangle_sym(dso, 0, funcname);
+   if (demangled)
+   funcname = demangled;
+   }
+
+   if (base_sym && strcmp(funcname, base_sym->name) == 0) {
+   /* reuse the real, existing symbol */
+   inline_sym = base_sym;
+   /* ensure that we don't alias an inlined symbol, which could
+* lead to double frees in inline_node__delete
+*/
+   assert(!base_sym->inlined);
+   } else {
+   /* create a fake symbol for the inline frame */
+   inline_sym = symbol__new(base_sym ? base_sym->start : 0,
+base_sym ? base_sym->end : 0,
+base_sym ? base_sym->binding : 0,
+funcname);
+   if (inline_sym)
+   inline_sym->inlined = 1;
+   }
+
+   free(demangled);
+
+   return inline_sym;
+}
+
 static int inline_list__append_dso_a2l(struct dso *dso,
-  struct inline_node *node)
+  struct inline_node *node,
+  struct symbol *sym)
 {
struct a2l_data *a2l = dso->a2l;
-   char *funcname = a2l->funcname ? strdup(a2l->funcname) : NULL;
-   char *filename = a2l->filename ? strdup(a2l->filename) : NULL;
+   struct symbol *inline_sym = new_inline_sym(dso, sym, a2l->funcname);
 
-   return inline_list__append(filename, funcname, a2l->line, node, dso);
+   return inline_list__append(inline_sym, strdup(a2l->filename),
+  a2l->line, node);
 }
 
 static int addr2line(const char *dso_name, u64 addr,
 char **file, unsigned int *line, struct dso *dso,
-bool unwind_inlines, struct inline_node *node)
+bool 

[tip:perf/core] perf callchain: Refactor inline_list to store srcline string directly

2017-10-25 Thread tip-bot for Milian Wolff
Commit-ID:  2be8832f3c51cf9e36a3e80ff57f4137505c2ba4
Gitweb: https://git.kernel.org/tip/2be8832f3c51cf9e36a3e80ff57f4137505c2ba4
Author: Milian Wolff 
AuthorDate: Mon, 9 Oct 2017 22:32:58 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 24 Oct 2017 09:59:55 -0300

perf callchain: Refactor inline_list to store srcline string directly

This is a preparation for the creation of real callchain entries for
inlined frames. The rest of the perf code uses the srcline string. As
such, using that also for the srcline API allows us to simplify some of
the upcoming code. Most notably, it will allow us to cache the srcline
for a given inline node and reuse it for different callchain entries.
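
The srcline string referred to here has the form `file:line`. A minimal sketch of building one is shown below; it uses `snprintf` plus `malloc` in place of the GNU `asprintf` extension for portability. `base_name` and `make_srcline` are hypothetical names, and the `full_path` parameter stands in for a global setting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return the part of path after the last '/', without modifying path. */
static const char *base_name(const char *path)
{
	const char *base = strrchr(path, '/');

	return base ? base + 1 : path;
}

/* Build a heap-allocated "file:line" string; the caller frees it. */
static char *make_srcline(const char *file, unsigned int line, bool full_path)
{
	char *srcline;
	int len;

	if (!file)
		return NULL;
	if (!full_path)
		file = base_name(file);

	/* First pass only measures the formatted length. */
	len = snprintf(NULL, 0, "%s:%u", file, line);
	if (len < 0)
		return NULL;

	srcline = malloc((size_t)len + 1);
	if (!srcline)
		return NULL;

	snprintf(srcline, (size_t)len + 1, "%s:%u", file, line);
	return srcline;
}
```

Because the result is a single owned string, a list entry only needs one pointer and one `free()` instead of a separate filename and line number.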

Signed-off-by: Milian Wolff 
Reviewed-by: Jiri Olsa 
Reviewed-by: Namhyung Kim 
Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Yao Jin 
Link: http://lkml.kernel.org/r/20171009203310.17362-5-milian.wo...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/srcline.c | 54 +++
 tools/perf/util/srcline.h |  3 +--
 2 files changed, 41 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index c0af61b..f202fc7 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -33,8 +33,8 @@ static const char *dso__name(struct dso *dso)
return dso_name;
 }
 
-static int inline_list__append(struct symbol *symbol, char *filename,
-  int line_nr, struct inline_node *node)
+static int inline_list__append(struct symbol *symbol, char *srcline,
+  struct inline_node *node)
 {
struct inline_list *ilist;
 
@@ -43,8 +43,7 @@ static int inline_list__append(struct symbol *symbol, char 
*filename,
return -1;
 
ilist->symbol = symbol;
-   ilist->filename = filename;
-   ilist->line_nr = line_nr;
+   ilist->srcline = srcline;
 
if (callchain_param.order == ORDER_CALLEE)
list_add_tail(&ilist->list, &node->val);
@@ -54,6 +53,30 @@ static int inline_list__append(struct symbol *symbol, char 
*filename,
return 0;
 }
 
+/* basename version that takes a const input string */
+static const char *gnu_basename(const char *path)
+{
+   const char *base = strrchr(path, '/');
+
+   return base ? base + 1 : path;
+}
+
+static char *srcline_from_fileline(const char *file, unsigned int line)
+{
+   char *srcline;
+
+   if (!file)
+   return NULL;
+
+   if (!srcline_full_filename)
+   file = gnu_basename(file);
+
+   if (asprintf(&srcline, "%s:%u", file, line) < 0)
+   return NULL;
+
+   return srcline;
+}
+
 #ifdef HAVE_LIBBFD_SUPPORT
 
 /*
@@ -237,9 +260,12 @@ static int inline_list__append_dso_a2l(struct dso *dso,
 {
struct a2l_data *a2l = dso->a2l;
struct symbol *inline_sym = new_inline_sym(dso, sym, a2l->funcname);
+   char *srcline = NULL;
 
-   return inline_list__append(inline_sym, strdup(a2l->filename),
-  a2l->line, node);
+   if (a2l->filename)
+   srcline = srcline_from_fileline(a2l->filename, a2l->line);
+
+   return inline_list__append(inline_sym, srcline, node);
 }
 
 static int addr2line(const char *dso_name, u64 addr,
@@ -437,13 +463,15 @@ static struct inline_node *addr2inlines(const char 
*dso_name, u64 addr,
node->addr = addr;
 
while (getline(&filename, &len, fp) != -1) {
+   char *srcline;
 
if (filename_split(filename, &line_nr) != 1) {
free(filename);
goto out;
}
 
-   if (inline_list__append(sym, filename, line_nr, node) != 0)
+   srcline = srcline_from_fileline(filename, line_nr);
+   if (inline_list__append(sym, srcline, node) != 0)
goto out;
 
filename = NULL;
@@ -487,16 +515,14 @@ char *__get_srcline(struct dso *dso, u64 addr, struct 
symbol *sym,
   unwind_inlines, NULL, sym))
goto out;
 
-   if (asprintf(&srcline, "%s:%u",
-   srcline_full_filename ? file : basename(file),
-   line) < 0) {
-   free(file);
+   srcline = srcline_from_fileline(file, line);
+   free(file);
+
+   if (!srcline)
goto out;
-   }
 
dso->a2l_fails = 0;
 
-   free(file);
return srcline;
 
 out:
@@ -548,7 +574,7 @@ void inline_node__delete(struct inline_node *node)
 
list_for_each_entry_safe(ilist, tmp, &node->val, list) {
list_del_init(&ilist->list);
-   zfree(&ilist->filename);
+   free_srcline(ilist->srcline);
/* only the inlined symbols are owned by the list */
if (ilist->symbol && ilist->symbol->inlined)
symbol__delete(ilist->symbol);
diff --git a/tools/perf/util/srcline.h 

[tip:perf/urgent] perf stat: Wait for the correct child

2017-09-13 Thread tip-bot for Milian Wolff
Commit-ID:  dfc9eec7716cc0a9f7eb743c703d74cd2d6085a0
Gitweb: http://git.kernel.org/tip/dfc9eec7716cc0a9f7eb743c703d74cd2d6085a0
Author: Milian Wolff 
AuthorDate: Tue, 12 Sep 2017 17:25:23 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 12 Sep 2017 12:49:13 -0300

perf stat: Wait for the correct child

When packaging the perf userland application into an AppImage, the
wait() call in perf stat returned too early. It turned out that some
other child process exited, but not the one perf stat launched:

  $ sudo strace -e fork,execve,clone,wait4 -f ./perf-x86_64.AppImage stat sleep 
1
  execve("./perf-git.3a73b7f9-x86_64.AppImage", 
["./perf-git.3a73b7f9-x86_64.AppIm"..., "stat", "sleep", "1"], 0x7ffec1bbf050 
/* 18 vars */) = 0
  clone(child_stack=NULL, 
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, 
child_tidptr=0x7f6a6e7efe50) = 3912
  strace: Process 3912 attached
  [pid  3912] clone(child_stack=NULL, 
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, 
child_tidptr=0x7f6a6e7efe50) = 3914
  strace: Process 3914 attached
  [pid  3912] +++ exited with 0 +++
  [pid  3911] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=3912, 
si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
  [pid  3914] clone(strace: Process 3915 attached
  child_stack=0x7f6a6d9fefb0, 
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
 parent_tidptr=0x7f6a6d9ff9d0, tls=0x7f6a6d9ff700, child_tidptr=0x7f6a6d9ff9d0) 
= 3915
  [pid  3911] execve("/tmp/.mount_perf-g6VYMpl/AppRun", 
["./perf-git.3a73b7f9-x86_64.AppIm"..., "stat", "sleep", "1"], 0x14aab70 /* 21 
vars */) = 0
  [pid  3911] clone(child_stack=NULL, 
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, 
child_tidptr=0x7f4ae113c4d0) = 3916
  strace: Process 3916 attached
  [pid  3911] wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 3912
  [pid  3916] execve("/usr/libexec/perf-core/sleep", ["sleep", "1"], 0x27d3650 
/* 22 vars */) = -1 ENOENT (No such file or directory)
  [pid  3916] execve("/tmp/./sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = 
-1 ENOENT (No such file or directory)
  [pid  3916] execve("/home/milian/.bin/sleep", ["sleep", "1"], 0x27d3650 /* 22 
vars */) = -1 ENOENT (No such file or directory)
  [pid  3916] execve("/usr/lib/icecream/libexec/icecc/bin/sleep", ["sleep", 
"1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory)
  [pid  3916] execve("/ssd2/milian/projects/compiled/other/bin/sleep", 
["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory)
  [pid  3916] execve("/home/milian/.bin/kf5/sleep", ["sleep", "1"], 0x27d3650 
/* 22 vars */) = -1 ENOENT (No such file or directory)
  [pid  3916] execve("/ssd2/milian/projects/compiled/kf5/bin/sleep", ["sleep", 
"1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory)
  [pid  3916] execve("/home/milian/projects/compiled/other/bin/sleep", 
["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory)
  [pid  3916] execve("/home/milian/projects/compiled/kf5/bin/sleep", ["sleep", 
"1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory)
  [pid  3916] execve("/usr/local/sbin/sleep", ["sleep", "1"], 0x27d3650 /* 22 
vars */) = -1 ENOENT (No such file or directory)
  [pid  3916] execve("/usr/local/bin/sleep", ["sleep", "1"], 0x27d3650 /* 22 
vars */) = -1 ENOENT (No such file or directory)
  [pid  3916] execve("/usr/bin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */
   Performance counter stats for 'sleep 1':

   task-clock
   context-switches
   cpu-migrations
   page-faults
   cycles
   instructions
 branches
 branch-misses

 0.47194 seconds time elapsed

  [pid  3916] --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=3911, si_uid=0} ---
  [pid  3916] +++ killed by SIGTERM +++
  [pid  3911] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=3916, si_uid=0, si_status=SIGTERM, si_utime=0, si_stime=0} ---
  [pid  3915] --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=3914, si_uid=0} ---
  [pid  3911] +++ exited with 0 +++
  [pid  3915] --- SIGHUP {si_signo=SIGHUP, si_code=SI_USER, si_pid=3914, si_uid=0} ---
  [pid  3915] +++ exited with 0 +++
  +++ exited with 0 +++

This patch uses waitpid() instead, so that the call waits for the
workload process launched by 'perf stat' rather than for whichever child
exits first. This fixes 'perf stat' when launched from an AppImage:

  $ ./perf-x86_64.AppImage stat sleep 1

   Performance counter stats for 'sleep 1':

          0.357235      task-clock (msec)         #    0.000 CPUs utilized
                 1      context-switches          #    0.003 M/sec
                 0      cpu-migrations            #    0.000 K/sec
                50      page-faults               #    0.140 M/sec
           1269602      cycles                    #    3.554 GHz
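
The difference between waiting for any child and waiting for one specific child can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the patch: `spawn_child()` and `wait_for_workload()` are hypothetical helpers, and the short-lived extra child stands in for the unrelated process that the AppImage runtime forks.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that sleeps delay_ms, then exits. */
static pid_t spawn_child(long delay_ms, int exit_code)
{
    pid_t pid = fork();

    if (pid == 0) {
        struct timespec ts = { delay_ms / 1000, (delay_ms % 1000) * 1000000L };

        nanosleep(&ts, NULL);
        _exit(exit_code);
    }
    return pid;
}

/*
 * Start an unrelated short-lived child and a "workload" child, then wait
 * for the workload only. Returns the workload's exit code, or -1 on error.
 */
int wait_for_workload(void)
{
    int status = 0;
    pid_t other, workload;

    other = spawn_child(0, 0);          /* exits almost immediately */
    if (other < 0)
        return -1;

    workload = spawn_child(200, 7);     /* the process we care about */
    if (workload < 0) {
        waitpid(other, NULL, 0);
        return -1;
    }

    /* wait(&status) could return `other` here; waitpid(workload, ...) cannot. */
    if (waitpid(workload, &status, 0) != workload)
        return -1;

    /* Reap the unrelated child so it does not linger as a zombie. */
    waitpid(other, NULL, 0);

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling `wait_for_workload()` returns the workload's exit code even though the other child terminates first, which is the behavior the strace log above shows was missing with a plain wait().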
   

[tip:perf/urgent] perf tools: Support running perf binaries with a dash in their name

2017-09-13 Thread tip-bot for Milian Wolff
Commit-ID:  3192f1ed3dd3a6883d5ae31bf2ff69984ea0fd54
Gitweb: http://git.kernel.org/tip/3192f1ed3dd3a6883d5ae31bf2ff69984ea0fd54
Author: Milian Wolff 
AuthorDate: Mon, 11 Sep 2017 13:14:22 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 12 Sep 2017 12:48:54 -0300

perf tools: Support running perf binaries with a dash in their name

Previously the part after "perf-" was always interpreted as an internal
perf command. If that suffix could not be handled, execution stopped.
This made it impossible to launch perf binaries that were renamed to
have the `perf-` prefix. This is the case for AppImages (e.g.
"perf-x86_64.AppImage"), but it also applies to any other scenario
where users symlink or rename perf themselves:

Status quo with the broken behavior:

  $ ln -s ./perf ./perf-custom-suffix
  $ ./perf-custom-suffix list
  cannot handle custom-suffix internally$

Also note the missing newline at the end of the error message.

With this patch applied, the above works properly:

  $ ./perf-custom-suffix list

  List of pre-defined events (to be used in -e):
  ...

Signed-off-by: Milian Wolff 
Acked-by: David Ahern 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Yao Jin 
Link: http://lkml.kernel.org/r/2017091422.31903-1-milian.wo...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/perf.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index e0279ba..2f19e03 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -467,15 +467,21 @@ int main(int argc, const char **argv)
          *  - cannot execute it externally (since it would just do
          *    the same thing over again)
          *
-         * So we just directly call the internal command handler, and
-         * die if that one cannot handle it.
+         * So we just directly call the internal command handler. If that one
+         * fails to handle this, then maybe we just run a renamed perf binary
+         * that contains a dash in its name. To handle this scenario, we just
+         * fall through and ignore the "" part of the command string.
          */
         if (strstarts(cmd, "perf-")) {
                 cmd += 5;
                 argv[0] = cmd;
                 handle_internal_command(argc, argv);
-                fprintf(stderr, "cannot handle %s internally", cmd);
-                goto out;
+                /*
+                 * If the command is handled, the above function does not
+                 * return undo changes and fall through in such a case.
+                 */
+                cmd -= 5;
+                argv[0] = cmd;
         }
         if (strstarts(cmd, "trace")) {
 #ifdef HAVE_LIBAUDIT_SUPPORT
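
The fallback described in this patch can be sketched in isolation. The snippet below is a simplified illustration, not perf's actual dispatch code: the command table `builtin_cmds` and the function `resolve_command()` are hypothetical, and unlike perf's handle_internal_command(), which does not return when it handles a command, this sketch only decides which subcommand name would be used.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table of built-in subcommands; not perf's real list. */
static const char *const builtin_cmds[] = { "list", "stat", "record" };

static int is_builtin(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(builtin_cmds) / sizeof(builtin_cmds[0]); i++)
        if (strcmp(name, builtin_cmds[i]) == 0)
            return 1;
    return 0;
}

/*
 * Pick the subcommand for a binary invoked as argv0 with first argument
 * arg1 (may be NULL). A "perf-<cmd>" name selects <cmd> only if <cmd> is
 * a known built-in; otherwise the suffix is ignored and arg1 is used.
 */
const char *resolve_command(const char *argv0, const char *arg1)
{
    const char *base = strrchr(argv0, '/');

    base = base ? base + 1 : argv0;

    if (strncmp(base, "perf-", 5) == 0 && is_builtin(base + 5))
        return base + 5;

    /* Renamed binary such as "perf-custom-suffix": fall through. */
    return arg1;
}
```

With this logic, `resolve_command("./perf-custom-suffix", "list")` yields "list" instead of failing on the unknown suffix, while `resolve_command("./perf-stat", NULL)` still yields "stat".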


[tip:perf/urgent] perf tests: Fix compile when libunwind's unwind.h is available

2017-09-13 Thread tip-bot for Milian Wolff
Commit-ID:  df90cc41d662ad5f700afc042df43e57ce1ed0a4
Gitweb: http://git.kernel.org/tip/df90cc41d662ad5f700afc042df43e57ce1ed0a4
Author: Milian Wolff 
AuthorDate: Wed, 6 Sep 2017 17:02:09 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 12 Sep 2017 12:34:02 -0300

perf tests: Fix compile when libunwind's unwind.h is available

When cross-compiling perf and linking against a self-compiled
libunwind, I usually make the libunwind headers visible by adding the
libunwind prefix to the include path when compiling perf, i.e.:

~~~~~
$ ls $HOME/projects/compiled/other/include/
libunwind-coredump.h  libunwind.h          libunwind-x86_64.h
libunwind-common.h    libunwind-dynamic.h  libunwind-ptrace.h
unwind.h
$ make EXTRA_CFLAGS="-I$HOME/projects/compiled/other/include/"
~~~~~

Note the `unwind.h` header from libunwind, which leads to compile
errors when compiling tests/dwarf-unwind.c because it shadows perf's
util/unwind.h:

~~~~~
tests/dwarf-unwind.c:41:32: error: ‘struct unwind_entry’ declared inside parameter list will not be visible outside of this definition or declaration [-Werror]
 static int unwind_entry(struct unwind_entry *entry, void *arg)
                                ^~~~
tests/dwarf-unwind.c: In function ‘unwind_entry’:
tests/dwarf-unwind.c:44:22: error: dereferencing pointer to incomplete type ‘struct unwind_entry’
  char *symbol = entry->sym ? entry->sym->name : NULL;
                      ^~
tests/dwarf-unwind.c: In function ‘unwind_thread’:
tests/dwarf-unwind.c:92:8: error: implicit declaration of function ‘unwind__get_entries’; did you mean ‘unwind_entry’? [-Werror=implicit-function-declaration]
  err = unwind__get_entries(unwind_entry, &cnt, thread,
        ^~~
        unwind_entry
tests/dwarf-unwind.c:92:8: error: nested extern declaration of ‘unwind__get_entries’ [-Werror=nested-externs]
~~~~~

Fix this compile error by specifying an explicit include of perf's
unwind.h in the util folder.

Signed-off-by: Milian Wolff 
Cc: David Ahern 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Yao Jin 
Link: http://lkml.kernel.org/r/20170906150209.12579-1-milian.wo...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/tests/dwarf-unwind.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/tests/dwarf-unwind.c b/tools/perf/tests/dwarf-unwind.c
index 2a7b9b4..9ba1d21 100644
--- a/tools/perf/tests/dwarf-unwind.c
+++ b/tools/perf/tests/dwarf-unwind.c
@@ -6,7 +6,7 @@
 #include "debug.h"
 #include "machine.h"
 #include "event.h"
-#include "unwind.h"
+#include "../util/unwind.h"
 #include "perf_regs.h"
 #include "map.h"
 #include "thread.h"


[tip:perf/core] perf srcline: Do not consider empty files as valid srclines

2017-08-14 Thread tip-bot for Milian Wolff
Commit-ID:  d964b1cdbd94f359f1f65f81440be84ceb45978e
Gitweb: http://git.kernel.org/tip/d964b1cdbd94f359f1f65f81440be84ceb45978e
Author: Milian Wolff 
AuthorDate: Sun, 6 Aug 2017 23:24:45 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 11 Aug 2017 16:06:31 -0300

perf srcline: Do not consider empty files as valid srclines

Sometimes we get a non-null, but empty, string for the filename from
bfd. This then results in srclines of the form ":0", which is different
from the canonical SRCLINE_UNKNOWN in the form "??:0".  Set the file to
NULL if it is empty to fix this.

Signed-off-by: Milian Wolff 
Cc: David Ahern 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Yao Jin 
Link: http://lkml.kernel.org/r/20170806212446.24925-14-milian.wo...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/srcline.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index ebc88a7..ed8e8d2 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -155,6 +155,9 @@ static void find_address_in_section(bfd *abfd, asection *section, void *data)
         a2l->found = bfd_find_nearest_line(abfd, section, a2l->syms, pc - vma,
                                            &a2l->filename, &a2l->funcname,
                                            &a2l->line);
+
+        if (a2l->filename && !strlen(a2l->filename))
+                a2l->filename = NULL;
 }
 
 static struct a2l_data *addr2line_init(const char *path)
@@ -248,6 +251,9 @@ static int addr2line(const char *dso_name, u64 addr,
                                              &a2l->funcname, &a2l->line) &&
                        cnt++ < MAX_INLINE_NEST) {
 
+                        if (a2l->filename && !strlen(a2l->filename))
+                                a2l->filename = NULL;
+
                         if (node != NULL) {
                                 if (inline_list__append_dso_a2l(dso, node))
                                         return 0;
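
The effect of the normalization on the resulting srcline string can be sketched as below. This is an illustration only: `format_srcline()` is a hypothetical helper, not a function from tools/perf/util/srcline.c, and the "??:0" literal is taken from the SRCLINE_UNKNOWN form quoted in the commit message.

```c
#include <stdio.h>
#include <string.h>

/*
 * Write "file:line" into buf, or the unknown marker "??:0" when file is
 * NULL or an empty string. Returns the snprintf() result.
 */
int format_srcline(char *buf, size_t size, const char *file, unsigned int line)
{
    /* Same check as in the patch: treat an empty filename as missing. */
    if (file && !strlen(file))
        file = NULL;

    if (!file)
        return snprintf(buf, size, "??:0");

    return snprintf(buf, size, "%s:%u", file, line);
}
```

Without the empty-string check, an empty filename with line 0 would be formatted as ":0", which is the mismatch the patch removes.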


[tip:perf/core] perf util: Take elf_name as const string in dso__demangle_sym

2017-08-14 Thread tip-bot for Milian Wolff
Commit-ID:  80c345b255cbb4e9dfb193bf0bf5536217237f6a
Gitweb: http://git.kernel.org/tip/80c345b255cbb4e9dfb193bf0bf5536217237f6a
Author: Milian Wolff 
AuthorDate: Sun, 6 Aug 2017 23:24:34 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 11 Aug 2017 16:06:31 -0300

perf util: Take elf_name as const string in dso__demangle_sym

The input string is not modified and thus can be passed in as a pointer
to const data.

Signed-off-by: Milian Wolff 
Cc: David Ahern 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Yao Jin 
Link: http://lkml.kernel.org/r/20170806212446.24925-3-milian.wo...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/symbol-elf.c | 2 +-
 tools/perf/util/symbol-minimal.c | 2 +-
 tools/perf/util/symbol.h | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 502505c..7cf18f1 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -391,7 +391,7 @@ out_elf_end:
         return 0;
 }
 
-char *dso__demangle_sym(struct dso *dso, int kmodule, char *elf_name)
+char *dso__demangle_sym(struct dso *dso, int kmodule, const char *elf_name)
 {
         return demangle_sym(dso, kmodule, elf_name);
 }
diff --git a/tools/perf/util/symbol-minimal.c b/tools/perf/util/symbol-minimal.c
index 40bf5d4..1a5aa35 100644
--- a/tools/perf/util/symbol-minimal.c
+++ b/tools/perf/util/symbol-minimal.c
@@ -377,7 +377,7 @@ void symbol__elf_init(void)
 
 char *dso__demangle_sym(struct dso *dso __maybe_unused,
                         int kmodule __maybe_unused,
-                        char *elf_name __maybe_unused)
+                        const char *elf_name __maybe_unused)
 {
         return NULL;
 }
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 41ebba9..f0b0881 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -306,7 +306,7 @@ int dso__load_sym(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss,
                                 struct map *map);
 
-char *dso__demangle_sym(struct dso *dso, int kmodule, char *elf_name);
+char *dso__demangle_sym(struct dso *dso, int kmodule, const char *elf_name);
 
 void __symbols__insert(struct rb_root *symbols, struct symbol *sym, bool kernel);
 void symbols__insert(struct rb_root *symbols, struct symbol *sym);
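
Why a const-qualified parameter is useful for callers can be shown with a small stand-alone example. The function `count_underscores()` is hypothetical and unrelated to perf's demangling code; it only reads its input, so it takes a pointer to const data, just as the patch does for elf_name.

```c
#include <stddef.h>

/* Only reads `name`, never writes through it, hence `const char *`. */
size_t count_underscores(const char *name)
{
    size_t n = 0;

    for (; *name; name++)
        if (*name == '_')
            n++;
    return n;
}
```

Because the parameter is const-qualified, the function accepts string literals, `const char *` variables and plain `char` buffers alike without casts; with a `char *` parameter, passing a `const char *` would trigger a compiler diagnostic about discarding the qualifier.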


[tip:perf/urgent] perf unwind: Report module before querying isactivation in dwfl unwind

2017-06-16 Thread tip-bot for Milian Wolff
Commit-ID:  9126cbbacecb8917bd0418809ef1d26616b2061e
Gitweb: http://git.kernel.org/tip/9126cbbacecb8917bd0418809ef1d26616b2061e
Author: Milian Wolff 
AuthorDate: Fri, 2 Jun 2017 16:37:53 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 16 Jun 2017 14:37:30 -0300

perf unwind: Report module before querying isactivation in dwfl unwind

The PC returned by dwfl_frame_pc() may map into a not-yet-reported
module. We have to report it before we continue unwinding. But when we
query for the isactivation flag in dwfl_frame_pc, libdw will actually do
one more unwinding step internally which can then break and lead to
missed frames or broken stacks.

With libunwind we get e.g.:

~~~~~
  heaptrack_gui  2228 135073.400474: 613969 cycles:
  108c8e [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  1093bc [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  109e7b QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
  1470ff [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  147f67 QSystemLocale::query (/usr/lib/libQt5Core.so.5.8.0)
  109fbf QLocalePrivate::updateSystemPrivate (/usr/lib/libQt5Core.so.5.8.0)
  10aa27 QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
  1e02c3 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  2113bb [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  211505 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  1b5df0 QFileInfo::exists (/usr/lib/libQt5Core.so.5.8.0)
   92eb2 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
   93423 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
   93d2a QLibraryInfo::location (/usr/lib/libQt5Core.so.5.8.0)
  2170af [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  297c53 QCoreApplicationPrivate::init (/usr/lib/libQt5Core.so.5.8.0)
   f7cde QGuiApplicationPrivate::init (/usr/lib/libQt5Gui.so.5.8.0)
  1589e8 QApplicationPrivate::init (/usr/lib/libQt5Widgets.so.5.8.0)
   78622 main (/home/milian/projects/compiled/other/bin/heaptrack_gui)
   20439 __libc_start_main (/usr/lib/libc-2.25.so)
   78299 _start (/home/milian/projects/compiled/other/bin/heaptrack_gui)

  heaptrack_gui  2228 135073.401156: 569521 cycles:
  131633 QString::endsWith (/usr/lib/libQt5Core.so.5.8.0)
  1a0701 QDir::cleanPath (/usr/lib/libQt5Core.so.5.8.0)
  21b82d [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  1b3727 QFileInfo::canonicalFilePath (/usr/lib/libQt5Core.so.5.8.0)
  2780c7 QFactoryLoader::update (/usr/lib/libQt5Core.so.5.8.0)
  279525 QFactoryLoader::QFactoryLoader (/usr/lib/libQt5Core.so.5.8.0)
   e5bd0 QPlatformIntegrationFactory::create (/usr/lib/libQt5Gui.so.5.8.0)
   f5a1c QGuiApplicationPrivate::createPlatformIntegration (/usr/lib/libQt5Gui.so.5.8.0)
   f650c QGuiApplicationPrivate::createEventDispatcher (/usr/lib/libQt5Gui.so.5.8.0)
  298524 QCoreApplicationPrivate::init (/usr/lib/libQt5Core.so.5.8.0)
   f7cde QGuiApplicationPrivate::init (/usr/lib/libQt5Gui.so.5.8.0)
  1589e8 QApplicationPrivate::init (/usr/lib/libQt5Widgets.so.5.8.0)
   78622 main (/home/milian/projects/compiled/other/bin/heaptrack_gui)
   20439 __libc_start_main (/usr/lib/libc-2.25.so)
   78299 _start (/home/milian/projects/compiled/other/bin/heaptrack_gui)
~~~~~

Note the two frames 1589e8 and 78622 in the first sample. These are
missing when unwinding with libdw. The second sample's breakage is
more obvious:

~~~~~
  heaptrack_gui  2228 135073.400474: 613969 cycles:
  108c8e [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  1093bc [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  109e7b QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
  1470ff [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  147f67 QSystemLocale::query (/usr/lib/libQt5Core.so.5.8.0)
  109fbf QLocalePrivate::updateSystemPrivate (/usr/lib/libQt5Core.so.5.8.0)
  10aa27 QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
  1e02c3 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  2113bb [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  211505 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  1b5df0 QFileInfo::exists (/usr/lib/libQt5Core.so.5.8.0)
   92eb2 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
   93423 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
   93d2a QLibraryInfo::location (/usr/lib/libQt5Core.so.5.8.0)
  2170af [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  297c53 

[tip:perf/urgent] perf unwind: Report module before querying isactivation in dwfl unwind

2017-06-16 Thread tip-bot for Milian Wolff
Commit-ID:  9126cbbacecb8917bd0418809ef1d26616b2061e
Gitweb: http://git.kernel.org/tip/9126cbbacecb8917bd0418809ef1d26616b2061e
Author: Milian Wolff 
AuthorDate: Fri, 2 Jun 2017 16:37:53 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 16 Jun 2017 14:37:30 -0300

perf unwind: Report module before querying isactivation in dwfl unwind

The PC returned by dwfl_frame_pc() may map into a not-yet-reported
module. We have to report it before we continue unwinding. But when we
query for the isactivation flag in dwfl_frame_pc, libdw will actually do
one more unwinding step internally which can then break and lead to
missed frames or broken stacks.

With libunwind we get e.g.:

~
  heaptrack_gui  2228 135073.400474: 613969 cycles:
  108c8e [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  1093bc [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  109e7b QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
  1470ff [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  147f67 QSystemLocale::query (/usr/lib/libQt5Core.so.5.8.0)
  109fbf QLocalePrivate::updateSystemPrivate 
(/usr/lib/libQt5Core.so.5.8.0)
  10aa27 QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
  1e02c3 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  2113bb [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  211505 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  1b5df0 QFileInfo::exists (/usr/lib/libQt5Core.so.5.8.0)
   92eb2 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
   93423 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
   93d2a QLibraryInfo::location (/usr/lib/libQt5Core.so.5.8.0)
  2170af [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  297c53 QCoreApplicationPrivate::init 
(/usr/lib/libQt5Core.so.5.8.0)
   f7cde QGuiApplicationPrivate::init 
(/usr/lib/libQt5Gui.so.5.8.0)
  1589e8 QApplicationPrivate::init 
(/usr/lib/libQt5Widgets.so.5.8.0)
   78622 main 
(/home/milian/projects/compiled/other/bin/heaptrack_gui)
   20439 __libc_start_main (/usr/lib/libc-2.25.so)
   78299 _start 
(/home/milian/projects/compiled/other/bin/heaptrack_gui)

  heaptrack_gui  2228 135073.401156: 569521 cycles:
  131633 QString::endsWith (/usr/lib/libQt5Core.so.5.8.0)
  1a0701 QDir::cleanPath (/usr/lib/libQt5Core.so.5.8.0)
  21b82d [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  1b3727 QFileInfo::canonicalFilePath (/usr/lib/libQt5Core.so.5.8.0)
  2780c7 QFactoryLoader::update (/usr/lib/libQt5Core.so.5.8.0)
  279525 QFactoryLoader::QFactoryLoader (/usr/lib/libQt5Core.so.5.8.0)
   e5bd0 QPlatformIntegrationFactory::create (/usr/lib/libQt5Gui.so.5.8.0)
   f5a1c QGuiApplicationPrivate::createPlatformIntegration (/usr/lib/libQt5Gui.so.5.8.0)
   f650c QGuiApplicationPrivate::createEventDispatcher (/usr/lib/libQt5Gui.so.5.8.0)
  298524 QCoreApplicationPrivate::init (/usr/lib/libQt5Core.so.5.8.0)
   f7cde QGuiApplicationPrivate::init (/usr/lib/libQt5Gui.so.5.8.0)
  1589e8 QApplicationPrivate::init (/usr/lib/libQt5Widgets.so.5.8.0)
   78622 main (/home/milian/projects/compiled/other/bin/heaptrack_gui)
   20439 __libc_start_main (/usr/lib/libc-2.25.so)
   78299 _start (/home/milian/projects/compiled/other/bin/heaptrack_gui)
~

Note the two frames 1589e8 and 78622 in the first sample. These are
missing when unwinding with libdw. The second sample's breakage is
more obvious:

~
  heaptrack_gui  2228 135073.400474: 613969 cycles:
  108c8e [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  1093bc [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  109e7b QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
  1470ff [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  147f67 QSystemLocale::query (/usr/lib/libQt5Core.so.5.8.0)
  109fbf QLocalePrivate::updateSystemPrivate (/usr/lib/libQt5Core.so.5.8.0)
  10aa27 QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
  1e02c3 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  2113bb [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  211505 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  1b5df0 QFileInfo::exists (/usr/lib/libQt5Core.so.5.8.0)
   92eb2 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
   93423 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
   93d2a QLibraryInfo::location (/usr/lib/libQt5Core.so.5.8.0)
  2170af [unknown] (/usr/lib/libQt5Core.so.5.8.0)
  297c53 QCoreApplicationPrivate::init 

[tip:perf/urgent] perf report: Ensure the perf DSO mapping matches what libdw sees

2017-06-07 Thread tip-bot for Milian Wolff
Commit-ID:  2538b9e2450ae255337c04356e9e0f8cb9ec48d9
Gitweb: http://git.kernel.org/tip/2538b9e2450ae255337c04356e9e0f8cb9ec48d9
Author: Milian Wolff 
AuthorDate: Fri, 2 Jun 2017 16:37:52 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 5 Jun 2017 14:18:05 -0300

perf report: Ensure the perf DSO mapping matches what libdw sees

In some situations the libdw unwinder stopped working properly. E.g.
with libunwind we see:

~
heaptrack_gui  2228 135073.400112: 641314 cycles:
e8ed _dl_fixup (/usr/lib/ld-2.25.so)
   15f06 _dl_runtime_resolve_sse_vex (/usr/lib/ld-2.25.so)
   ed94c KDynamicJobTracker::KDynamicJobTracker (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
   608f3 _GLOBAL__sub_I_kdynamicjobtracker.cpp (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
f199 call_init.part.0 (/usr/lib/ld-2.25.so)
f2a5 _dl_init (/usr/lib/ld-2.25.so)
 db9 _dl_start_user (/usr/lib/ld-2.25.so)
~

But with libdw and without this patch this sample is not properly
unwound:

~
heaptrack_gui  2228 135073.400112: 641314 cycles:
e8ed _dl_fixup (/usr/lib/ld-2.25.so)
   15f06 _dl_runtime_resolve_sse_vex (/usr/lib/ld-2.25.so)
   ed94c KDynamicJobTracker::KDynamicJobTracker (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
~

Debug output showed me that libdw found a module for the last frame
address, but it thinks it belongs to /usr/lib/ld-2.25.so. This patch
double-checks what libdw sees and what perf knows. If the mappings
mismatch, we now report the elf known to perf. This fixes the situation
above, and the libdw unwinder produces the same stack as libunwind.
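
The check described above can be pictured with a small self-contained
model. This is only a sketch under stated assumptions: it does not call
libdw, and struct module, struct mapping, struct module_table and
report_module() are hypothetical stand-ins for the unwinder's module list
and perf's map. Overwriting the stale entry is a simplification of this
model, not a description of what dwfl_report_elf does internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a module the unwinder has registered. */
struct module {
	const char *name;
	uint64_t start;
	uint64_t end;
};

/* Hypothetical stand-in for perf's view of the mapping containing ip. */
struct mapping {
	const char *name;
	uint64_t start;
	uint64_t end;
};

#define MAX_MODULES 8

struct module_table {
	struct module mods[MAX_MODULES];
	size_t count;
};

/* Return the registered module whose range covers ip, or NULL. */
static struct module *find_module(struct module_table *t, uint64_t ip)
{
	for (size_t i = 0; i < t->count; i++)
		if (ip >= t->mods[i].start && ip < t->mods[i].end)
			return &t->mods[i];
	return NULL;
}

/*
 * Resolve the module for ip. A cached module is trusted only when it
 * starts at the same address as the mapping perf knows about; otherwise
 * the mapping known to perf is (re)registered.
 */
static struct module *report_module(struct module_table *t,
				    const struct mapping *map, uint64_t ip)
{
	struct module *mod = find_module(t, ip);

	if (mod && mod->start == map->start)
		return mod; /* both views agree */

	if (!mod) {
		/* Nothing registered for this address yet: take a new slot. */
		assert(t->count < MAX_MODULES);
		mod = &t->mods[t->count++];
	}

	/* Missing or stale entry: record what perf knows instead. */
	mod->name = map->name;
	mod->start = map->start;
	mod->end = map->end;
	return mod;
}
```

In this model a lookup that lands in a stale ld-2.25.so entry is replaced
by the mapping perf reports for the same address, which mirrors the
"report the elf known to perf" step of the patch.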

Signed-off-by: Milian Wolff 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Link: http://lkml.kernel.org/r/20170602143753.16907-1-milian.wo...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/unwind-libdw.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tools/perf/util/unwind-libdw.c b/tools/perf/util/unwind-libdw.c
index b4c2012..da45c4b 100644
--- a/tools/perf/util/unwind-libdw.c
+++ b/tools/perf/util/unwind-libdw.c
@@ -39,6 +39,14 @@ static int __report_module(struct addr_location *al, u64 ip,
return 0;
 
mod = dwfl_addrmodule(ui->dwfl, ip);
+   if (mod) {
+   Dwarf_Addr s;
+
+   dwfl_module_info(mod, NULL, &s, NULL, NULL, NULL, NULL, NULL);
+   if (s != al->map->start)
+   mod = 0;
+   }
+
if (!mod)
mod = dwfl_report_elf(ui->dwfl, dso->short_name,
  dso->long_name, -1, al->map->start,


[tip:perf/urgent] perf report: Include partial stacks unwound with libdw

2017-06-07 Thread tip-bot for Milian Wolff
Commit-ID:  5ea0416f51cc93436bbe497c62ab49fd9cb245b6
Gitweb: http://git.kernel.org/tip/5ea0416f51cc93436bbe497c62ab49fd9cb245b6
Author: Milian Wolff 
AuthorDate: Thu, 1 Jun 2017 23:00:21 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 5 Jun 2017 14:18:03 -0300

perf report: Include partial stacks unwound with libdw

So far the whole stack was thrown away when any error occurred before
the maximum stack depth was unwound. This is actually a very common
scenario though. The stacks that got unwound so far are still
interesting. This removes a large chunk of differences when comparing
perf script output for libunwind and libdw perf unwinding.

E.g. with libunwind:

~
heaptrack_gui  2228 135073.388524: 479408 cycles:
811749ed perf_iterate_ctx ([kernel.kallsyms])
81181662 perf_event_mmap ([kernel.kallsyms])
811cf5ed mmap_region ([kernel.kallsyms])
811cfe6b do_mmap ([kernel.kallsyms])
811b0dca vm_mmap_pgoff ([kernel.kallsyms])
811cdb0c sys_mmap_pgoff ([kernel.kallsyms])
81033acb sys_mmap ([kernel.kallsyms])
81631d37 entry_SYSCALL_64_fastpath ([kernel.kallsyms])
   192ca mmap64 (/usr/lib/ld-2.25.so)
59a9 _dl_map_object_from_fd (/usr/lib/ld-2.25.so)
83d0 _dl_map_object (/usr/lib/ld-2.25.so)
cda1 openaux (/usr/lib/ld-2.25.so)
   1834f _dl_catch_error (/usr/lib/ld-2.25.so)
cfe2 _dl_map_object_deps (/usr/lib/ld-2.25.so)
3481 dl_main (/usr/lib/ld-2.25.so)
   17387 _dl_sysdep_start (/usr/lib/ld-2.25.so)
4d37 _dl_start (/usr/lib/ld-2.25.so)
 d87 _start (/usr/lib/ld-2.25.so)

heaptrack_gui  2228 135073.388677: 611329 cycles:
   1a3e0 strcmp (/usr/lib/ld-2.25.so)
82b2 _dl_map_object (/usr/lib/ld-2.25.so)
cda1 openaux (/usr/lib/ld-2.25.so)
   1834f _dl_catch_error (/usr/lib/ld-2.25.so)
cfe2 _dl_map_object_deps (/usr/lib/ld-2.25.so)
3481 dl_main (/usr/lib/ld-2.25.so)
   17387 _dl_sysdep_start (/usr/lib/ld-2.25.so)
4d37 _dl_start (/usr/lib/ld-2.25.so)
 d87 _start (/usr/lib/ld-2.25.so)
~

With libdw without this patch:

~
heaptrack_gui  2228 135073.388524: 479408 cycles:
811749ed perf_iterate_ctx ([kernel.kallsyms])
81181662 perf_event_mmap ([kernel.kallsyms])
811cf5ed mmap_region ([kernel.kallsyms])
811cfe6b do_mmap ([kernel.kallsyms])
811b0dca vm_mmap_pgoff ([kernel.kallsyms])
811cdb0c sys_mmap_pgoff ([kernel.kallsyms])
81033acb sys_mmap ([kernel.kallsyms])
81631d37 entry_SYSCALL_64_fastpath ([kernel.kallsyms])

heaptrack_gui  2228 135073.388677: 611329 cycles:
~

With this patch applied, the libdw unwinder will produce the same
output as the libunwind unwinder.
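
The change in the error condition can be illustrated with a small
standalone model. This is a sketch under assumptions, not the perf code:
struct unwind_state, walk_frames() and unwind() are hypothetical names,
and the walk is simulated by counting frames. It relies on the depth
budget being decremented once per recorded frame, so that a budget that
differs from its initial value means at least one frame was unwound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical unwinder state: remaining depth budget and frames kept. */
struct unwind_state {
	int max_stack;  /* decremented once per recorded frame */
	size_t nframes; /* frames recorded so far */
};

/*
 * Simulate walking a stack of `total` frames of which only the first
 * `readable` ones can be unwound. Returns nonzero when the walk stops
 * early, either because the depth budget is used up or because the next
 * frame cannot be unwound.
 */
static int walk_frames(struct unwind_state *st, int readable, int total)
{
	for (int i = 0; i < readable; i++) {
		st->nframes++;
		if (--st->max_stack == 0)
			return -1; /* budget exhausted, walk aborted */
	}
	return readable < total ? -1 : 0;
}

/*
 * Run the simulated unwind and report how many frames survive.
 * keep_partial == false models the old rule (forgive the error only when
 * the budget is used up), true models the new rule (forgive it as soon as
 * any frame was recorded).
 */
static int unwind(int max_stack, int readable, int total,
		  bool keep_partial, size_t *kept)
{
	struct unwind_state st = { .max_stack = max_stack, .nframes = 0 };
	int err;

	assert(max_stack > 0);
	err = walk_frames(&st, readable, total);

	if (keep_partial) {
		if (err && st.max_stack != max_stack)
			err = 0;
	} else {
		if (err && !st.max_stack)
			err = 0;
	}

	*kept = err ? 0 : st.nframes;
	return err;
}
```

With a budget of 127 and a walk that fails after 8 of 18 frames, the old
rule keeps nothing while the new rule keeps the 8 frames that were
unwound, which is the behaviour difference shown in the samples above.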

Signed-off-by: Milian Wolff 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Link: http://lkml.kernel.org/r/20170601210021.20046-1-milian.wo...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/unwind-libdw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/unwind-libdw.c b/tools/perf/util/unwind-libdw.c
index 943a0629..b4c2012 100644
--- a/tools/perf/util/unwind-libdw.c
+++ b/tools/perf/util/unwind-libdw.c
@@ -224,7 +224,7 @@ int unwind__get_entries(unwind_entry_cb_t cb, void *arg,
 
err = dwfl_getthread_frames(ui->dwfl, thread->tid, frame_callback, ui);
 
-   if (err && !ui->max_stack)
+   if (err && ui->max_stack != max_stack)
err = 0;
 
/*


[tip:perf/urgent] perf report: Do not drop last inlined frame

2017-05-24 Thread tip-bot for Milian Wolff
Commit-ID:  4d53b9d546f9f4505e6e3d58c8eed894d6f684e7
Gitweb: http://git.kernel.org/tip/4d53b9d546f9f4505e6e3d58c8eed894d6f684e7
Author: Milian Wolff 
AuthorDate: Wed, 24 May 2017 15:21:28 +0900
Committer:  Ingo Molnar 
CommitDate: Wed, 24 May 2017 08:41:48 +0200

perf report: Do not drop last inlined frame

The very last inlined frame, i.e. the one furthest away from the
non-inlined frame, was silently dropped. This is apparent when
comparing the output of `perf script` and `addr2line`:

~~
  $ perf script --inline
  ...
  a.out 26722 80836.309329:  72425 cycles:
 21561 __hypot_finite (/usr/lib/libm-2.25.so)
  ace3 hypot (/usr/lib/libm-2.25.so)
   a4a main (a.out)
   std::abs
   std::_Norm_helper::_S_do_it
   std::norm
   main
 20510 __libc_start_main (/usr/lib/libc-2.25.so)
   bd9 _start (a.out)

  $ addr2line -a -f -i -e /tmp/a.out a4a | c++filt
  0x0a4a
  std::__complex_abs(doublecomplex )
  /usr/include/c++/6.3.1/complex:589
  double std::abs<double>(std::complex<double> const&)
  /usr/include/c++/6.3.1/complex:597
  double std::_Norm_helper<true>::_S_do_it<double>(std::complex<double> const&)
  /usr/include/c++/6.3.1/complex:654
  double std::norm<double>(std::complex<double> const&)
  /usr/include/c++/6.3.1/complex:664
  main
  /tmp/inlining.cpp:14
~

Note how `std::__complex_abs` is missing from the `perf script`
output. This is similarly showing up in `perf report`. The patch
here fixes this issue, and the output becomes:

~
  a.out 26722 80836.309329:  72425 cycles:
 21561 __hypot_finite (/usr/lib/libm-2.25.so)
  ace3 hypot (/usr/lib/libm-2.25.so)
   a4a main (a.out)
   std::__complex_abs
   std::abs
   std::_Norm_helper::_S_do_it
   std::norm
   main
 20510 __libc_start_main (/usr/lib/libc-2.25.so)
   bd9 _start (a.out)
~
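
The off-by-one can be reproduced with a small standalone model. This is a
sketch under assumptions: struct inline_cursor, next_inliner() and
collect_frames() are hypothetical and only imitate a step-then-read
interface in which the frame found first has to be recorded before asking
for its inliners. No BFD calls are made.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_INLINE_NEST 1024

/*
 * Hypothetical cursor over the inline chain of one address. names[0] is
 * the innermost (most deeply inlined) function, names[count - 1] the
 * non-inlined function containing the address.
 */
struct inline_cursor {
	const char *const *names;
	size_t count;
	size_t pos; /* frame the cursor currently points at */
};

/* Step to the next enclosing frame; returns 0 when none is left. */
static int next_inliner(struct inline_cursor *c)
{
	if (c->pos + 1 >= c->count)
		return 0;
	c->pos++;
	return 1;
}

/*
 * Copy frame names into out (at most cap entries) and return how many
 * were stored. With include_current == 0 the frame the cursor starts on
 * is never stored, which models the dropped frame.
 */
static size_t collect_frames(struct inline_cursor *c, int include_current,
			     const char **out, size_t cap)
{
	size_t n = 0;
	int cnt = 0;

	assert(c->count > 0 && c->pos < c->count);

	if (include_current && n < cap)
		out[n++] = c->names[c->pos];

	while (n < cap && cnt++ < MAX_INLINE_NEST && next_inliner(c))
		out[n++] = c->names[c->pos];

	return n;
}
```

Starting the cursor on std::__complex_abs and collecting without the
current frame yields four names beginning with std::abs; including the
current frame yields all five, matching the corrected output.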

Signed-off-by: Milian Wolff 
Signed-off-by: Namhyung Kim 
Cc: Arnaldo Carvalho de Melo 
Cc: Arnaldo Carvalho de Melo 
Cc: David Ahern 
Cc: Jiri Olsa 
Cc: Jiri Olsa 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Yao Jin 
Cc: kernel-t...@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-7-namhy...@kernel.org
Signed-off-by: Ingo Molnar 
---
 tools/perf/util/srcline.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index 6af0364..ebc88a7 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -203,6 +203,16 @@ static void addr2line_cleanup(struct a2l_data *a2l)
 
 #define MAX_INLINE_NEST 1024
 
+static int inline_list__append_dso_a2l(struct dso *dso,
+  struct inline_node *node)
+{
+   struct a2l_data *a2l = dso->a2l;
+   char *funcname = a2l->funcname ? strdup(a2l->funcname) : NULL;
+   char *filename = a2l->filename ? strdup(a2l->filename) : NULL;
+
+   return inline_list__append(filename, funcname, a2l->line, node, dso);
+}
+
 static int addr2line(const char *dso_name, u64 addr,
 char **file, unsigned int *line, struct dso *dso,
 bool unwind_inlines, struct inline_node *node)
@@ -231,15 +241,15 @@ static int addr2line(const char *dso_name, u64 addr,
if (unwind_inlines) {
int cnt = 0;
 
+   if (node && inline_list__append_dso_a2l(dso, node))
+   return 0;
+
while (bfd_find_inliner_info(a2l->abfd, &a2l->filename,
 &a2l->funcname, &a2l->line) &&
   cnt++ < MAX_INLINE_NEST) {
 
if (node != NULL) {
-   if (inline_list__append(strdup(a2l->filename),
-   strdup(a2l->funcname),
-   a2l->line, node,
-   dso) != 0)
+   if (inline_list__append_dso_a2l(dso, node))
return 0;
// found at least one inline frame
ret = 1;


[tip:perf/urgent] perf report: Fix memory leak in addr2line when called by addr2inlines

2017-05-24 Thread tip-bot for Milian Wolff
Commit-ID:  b21cc97810932a551f7aac46f0b89c469c828b3f
Gitweb: http://git.kernel.org/tip/b21cc97810932a551f7aac46f0b89c469c828b3f
Author: Milian Wolff 
AuthorDate: Wed, 24 May 2017 15:21:24 +0900
Committer:  Ingo Molnar 
CommitDate: Wed, 24 May 2017 08:41:48 +0200

perf report: Fix memory leak in addr2line when called by addr2inlines

When a filename was found in addr2line it was duplicated via strdup()
but never freed. Now we pass NULL and handle this gracefully in
addr2line.

Detected by Valgrind:

  ==16331== 1,680 bytes in 21 blocks are definitely lost in loss record 148 of 220
  ==16331==at 0x4C2AF1F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
  ==16331==by 0x672FA69: strdup (in /usr/lib/libc-2.25.so)
  ==16331==by 0x52769F: addr2line (srcline.c:256)
  ==16331==by 0x52769F: addr2inlines (srcline.c:294)
  ==16331==by 0x52769F: dso__parse_addr_inlines (srcline.c:502)
  ==16331==by 0x574D7A: inline__fprintf (hist.c:41)
  ==16331==by 0x574D7A: ipchain__fprintf_graph (hist.c:147)
  ==16331==by 0x57518A: __callchain__fprintf_graph (hist.c:212)
  ==16331==by 0x5753CF: callchain__fprintf_graph.constprop.6 (hist.c:337)
  ==16331==by 0x57738E: hist_entry__fprintf (hist.c:628)
  ==16331==by 0x57738E: hists__fprintf (hist.c:882)
  ==16331==by 0x44A20F: perf_evlist__tty_browse_hists (builtin-report.c:399)
  ==16331==by 0x44A20F: report__browse_hists (builtin-report.c:491)
  ==16331==by 0x44A20F: __cmd_report (builtin-report.c:624)
  ==16331==by 0x44A20F: cmd_report (builtin-report.c:1054)
  ==16331==by 0x4A49CE: run_builtin (perf.c:296)
  ==16331==by 0x4A4CC0: handle_internal_command (perf.c:348)
  ==16331==by 0x434371: run_argv (perf.c:392)
  ==16331==by 0x434371: main (perf.c:530)
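
The pattern behind the fix, out-parameters that may be NULL so nothing is
duplicated for a caller that does not want the result, can be sketched in
isolation. This is a simplified illustration under assumptions: struct
lookup and lookup_export() are hypothetical names, and unlike addr2line
the return value here only reflects whether a filename was handed out.

```c
#define _POSIX_C_SOURCE 200809L /* for strdup() */

#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical lookup result; the strings are borrowed, not owned. */
struct lookup {
	const char *filename;
	unsigned int line;
};

/*
 * Copy the result into the caller's out-parameters. Either pointer may
 * be NULL, in which case nothing is allocated or written for it.
 * Returns 1 if a filename copy was handed to the caller (who must free
 * it), 0 otherwise.
 */
static int lookup_export(const struct lookup *res, char **file,
			 unsigned int *line)
{
	int ret = 0;

	assert(res != NULL);

	if (file) {
		*file = res->filename ? strdup(res->filename) : NULL;
		ret = *file ? 1 : 0;
	}

	if (line)
		*line = res->line;

	return ret;
}
```

A caller that only needs the side effects passes NULL for both pointers
and has nothing to free; a caller that asks for the filename owns the
copy and releases it with free().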

Signed-off-by: Milian Wolff 
Signed-off-by: Namhyung Kim 
Cc: Arnaldo Carvalho de Melo 
Cc: Arnaldo Carvalho de Melo 
Cc: David Ahern 
Cc: Jiri Olsa 
Cc: Jiri Olsa 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Yao Jin 
Cc: kernel-t...@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-3-namhy...@kernel.org
Signed-off-by: Ingo Molnar 
---
 tools/perf/util/srcline.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index df051a5..5e376d6 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -230,7 +230,10 @@ static int addr2line(const char *dso_name, u64 addr,
 
bfd_map_over_sections(a2l->abfd, find_address_in_section, a2l);
 
-   if (a2l->found && unwind_inlines) {
+   if (!a2l->found)
+   return 0;
+
+   if (unwind_inlines) {
int cnt = 0;
 
while (bfd_find_inliner_info(a2l->abfd, &a2l->filename,
@@ -243,6 +246,8 @@ static int addr2line(const char *dso_name, u64 addr,
a2l->line, node,
dso) != 0)
return 0;
+   // found at least one inline frame
+   ret = 1;
}
}
 
@@ -252,14 +257,14 @@ static int addr2line(const char *dso_name, u64 addr,
}
}
 
-   if (a2l->found && a2l->filename) {
-   *file = strdup(a2l->filename);
-   *line = a2l->line;
-
-   if (*file)
-   ret = 1;
+   if (file) {
+   *file = a2l->filename ? strdup(a2l->filename) : NULL;
+   ret = *file ? 1 : 0;
}
 
+   if (line)
+   *line = a2l->line;
+
return ret;
 }
 
@@ -278,8 +283,6 @@ void dso__free_a2l(struct dso *dso)
 static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
struct dso *dso)
 {
-   char *file = NULL;
-   unsigned int line = 0;
struct inline_node *node;
 
node = zalloc(sizeof(*node));
@@ -291,7 +294,7 @@ static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
INIT_LIST_HEAD(&node->val);
node->addr = addr;
 
-   if (!addr2line(dso_name, addr, &file, &line, dso, TRUE, node))
+   if (!addr2line(dso_name, addr, NULL, NULL, dso, TRUE, node))
goto out_free_inline_node;
 
if (list_empty(&node->val))


[tip:perf/urgent] perf report: Always honor callchain order for inlined nodes

2017-05-24 Thread tip-bot for Milian Wolff
Commit-ID:  28071f51839e393f697d0d1df0b223a4bc373606
Gitweb: http://git.kernel.org/tip/28071f51839e393f697d0d1df0b223a4bc373606
Author: Milian Wolff 
AuthorDate: Wed, 24 May 2017 15:21:27 +0900
Committer:  Ingo Molnar 
CommitDate: Wed, 24 May 2017 08:41:48 +0200

perf report: Always honor callchain order for inlined nodes

So far, the inlined nodes were only reversed when we built perf
against libbfd. If that was not available, the addr2line fallback
code path was missing the inline_list__reverse call.

Now we always add the nodes in the correct order within
inline_list__append. This removes the need to reverse the list
and also ensures that all callers construct the list in the right
order.
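
The idea of choosing the insertion end at append time, instead of
reversing afterwards, can be sketched with a minimal list. This is an
illustration under assumptions: struct frame, struct frame_list and
frame_list_append() are hypothetical, the enum is redefined locally, and
a plain singly linked list stands in for the kernel list helpers. Frames
are assumed to be discovered innermost first.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for the callchain order setting. */
enum chain_order { ORDER_CALLER, ORDER_CALLEE };

struct frame {
	const char *name;
	struct frame *next;
};

struct frame_list {
	struct frame *head;
	struct frame *tail;
};

/*
 * Add one frame so that the list is already in the requested order:
 * callee order keeps discovery order (append at the tail), caller order
 * puts each newly found outer frame in front (insert at the head).
 */
static void frame_list_append(struct frame_list *l, struct frame *f,
			      enum chain_order order)
{
	assert(l != NULL && f != NULL);

	f->next = NULL;
	if (!l->head) {
		l->head = l->tail = f;
	} else if (order == ORDER_CALLEE) {
		l->tail->next = f;
		l->tail = f;
	} else {
		f->next = l->head;
		l->head = f;
	}
}
```

Because every caller goes through the same append function, no code path
can forget a separate reverse step.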

Signed-off-by: Milian Wolff 
Signed-off-by: Namhyung Kim 
Cc: Arnaldo Carvalho de Melo 
Cc: Arnaldo Carvalho de Melo 
Cc: David Ahern 
Cc: Jiri Olsa 
Cc: Jiri Olsa 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Yao Jin 
Cc: kernel-t...@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-6-namhy...@kernel.org
Signed-off-by: Ingo Molnar 
---
 tools/perf/util/srcline.c | 18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index 5e376d6..6af0364 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -56,7 +56,10 @@ static int inline_list__append(char *filename, char *funcname, int line_nr,
}
}
 
-   list_add_tail(&ilist->list, &node->val);
+   if (callchain_param.order == ORDER_CALLEE)
+   list_add_tail(&ilist->list, &node->val);
+   else
+   list_add(&ilist->list, &node->val);
 
return 0;
 }
@@ -200,14 +203,6 @@ static void addr2line_cleanup(struct a2l_data *a2l)
 
 #define MAX_INLINE_NEST 1024
 
-static void inline_list__reverse(struct inline_node *node)
-{
-   struct inline_list *ilist, *n;
-
-   list_for_each_entry_safe_reverse(ilist, n, &node->val, list)
-   list_move_tail(&ilist->list, &node->val);
-}
-
 static int addr2line(const char *dso_name, u64 addr,
 char **file, unsigned int *line, struct dso *dso,
 bool unwind_inlines, struct inline_node *node)
@@ -250,11 +245,6 @@ static int addr2line(const char *dso_name, u64 addr,
ret = 1;
}
}
-
-   if ((node != NULL) &&
-   (callchain_param.order != ORDER_CALLEE)) {
-   inline_list__reverse(node);
-   }
}
 
if (file) {


[tip:perf/urgent] perf report: Don't crash on invalid maps in `-g srcline` mode

2017-05-24 Thread tip-bot for Milian Wolff
Commit-ID:  7d4df089d77306914426a604c890175f91a9a459
Gitweb: http://git.kernel.org/tip/7d4df089d77306914426a604c890175f91a9a459
Author: Milian Wolff 
AuthorDate: Wed, 24 May 2017 15:21:23 +0900
Committer:  Ingo Molnar 
CommitDate: Wed, 24 May 2017 08:41:47 +0200

perf report: Don't crash on invalid maps in `-g srcline` mode

I just hit a segfault when doing `perf report -g srcline`.
Valgrind pointed me at this code as the culprit:

  ==8359== Invalid read of size 8
  ==8359==    at 0x3096D9: map__rip_2objdump (map.c:430)
  ==8359==    by 0x2FC1A3: match_chain_srcline (callchain.c:645)
  ==8359==    by 0x2FC1A3: match_chain (callchain.c:700)
  ==8359==    by 0x2FC1A3: append_chain (callchain.c:895)
  ==8359==    by 0x2FC1A3: append_chain_children (callchain.c:846)
  ==8359==    by 0x2FF719: callchain_append (callchain.c:944)
  ==8359==    by 0x2FF719: hist_entry__append_callchain (callchain.c:1058)
  ==8359==    by 0x32FA06: iter_add_single_cumulative_entry (hist.c:908)
  ==8359==    by 0x33195C: hist_entry_iter__add (hist.c:1050)
  ==8359==    by 0x258F65: process_sample_event (builtin-report.c:204)
  ==8359==    by 0x30D60C: perf_session__deliver_event (session.c:1310)
  ==8359==    by 0x30D60C: ordered_events__deliver_event (session.c:119)
  ==8359==    by 0x310D12: __ordered_events__flush (ordered-events.c:210)
  ==8359==    by 0x310D12: ordered_events__flush.part.3 (ordered-events.c:277)
  ==8359==    by 0x30DD3C: perf_session__process_user_event (session.c:1349)
  ==8359==    by 0x30DD3C: perf_session__process_event (session.c:1475)
  ==8359==    by 0x30FC3C: __perf_session__process_events (session.c:1867)
  ==8359==    by 0x30FC3C: perf_session__process_events (session.c:1921)
  ==8359==    by 0x25A985: __cmd_report (builtin-report.c:575)
  ==8359==    by 0x25A985: cmd_report (builtin-report.c:1054)
  ==8359==    by 0x2B9A80: run_builtin (perf.c:296)
  ==8359==  Address 0x70 is not stack'd, malloc'd or (recently) free'd

This patch fixes the issue.
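
As a rough standalone sketch of the guard (not the perf code itself): the source line is looked up only when a map is present, and the comparison then has to cope with either side being NULL. `struct map` here is a simplified, hypothetical stand-in, and the way a missing side is ordered is an assumption made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for perf's struct map. */
struct map {
	const char *srcline; /* pretend every address in the map resolves here */
};

/* Return the source line for a map, or NULL when there is no map. */
static const char *srcline_or_null(const struct map *map)
{
	return map ? map->srcline : NULL;
}

/*
 * Compare two frames by source line without dereferencing a NULL map.
 * Assumption for this sketch: a frame with a srcline sorts before one
 * without, and two frames without a srcline compare equal.
 */
static int srcline_cmp(const struct map *left_map, const struct map *right_map)
{
	const char *left = srcline_or_null(left_map);
	const char *right = srcline_or_null(right_map);

	if (left && right)
		return strcmp(left, right);
	if (!left && !right)
		return 0;
	return left ? -1 : 1;
}
```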

Signed-off-by: Milian Wolff 
[ Remove dependency from another change ]
Signed-off-by: Namhyung Kim 
Cc: Arnaldo Carvalho de Melo 
Cc: Arnaldo Carvalho de Melo 
Cc: David Ahern 
Cc: Jiri Olsa 
Cc: Jiri Olsa 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Yao Jin 
Cc: kernel-t...@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-2-namhy...@kernel.org
Signed-off-by: Ingo Molnar 
---
 tools/perf/util/callchain.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 81fc29a..b4204b4 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -621,14 +621,19 @@ enum match_result {
 static enum match_result match_chain_srcline(struct callchain_cursor_node *node,
 struct callchain_list *cnode)
 {
-   char *left = get_srcline(cnode->ms.map->dso,
+   char *left = NULL;
+   char *right = NULL;
+   enum match_result ret = MATCH_EQ;
+   int cmp;
+
+   if (cnode->ms.map)
+   left = get_srcline(cnode->ms.map->dso,
 map__rip_2objdump(cnode->ms.map, cnode->ip),
 cnode->ms.sym, true, false);
-   char *right = get_srcline(node->map->dso,
+   if (node->map)
+   right = get_srcline(node->map->dso,
  map__rip_2objdump(node->map, node->ip),
  node->sym, true, false);
-   enum match_result ret = MATCH_EQ;
-   int cmp;
 
if (left && right)
cmp = strcmp(left, right);


[tip:perf/urgent] perf report: Fix off-by-one for non-activation frames

2017-05-24 Thread tip-bot for Milian Wolff
Commit-ID:  1982ad48fc82c284a5cc55697a012d3357e84d01
Gitweb: http://git.kernel.org/tip/1982ad48fc82c284a5cc55697a012d3357e84d01
Author: Milian Wolff 
AuthorDate: Wed, 24 May 2017 15:21:25 +0900
Committer:  Ingo Molnar 
CommitDate: Wed, 24 May 2017 08:41:48 +0200

perf report: Fix off-by-one for non-activation frames

As the documentation for dwfl_frame_pc says, frames that
are not activation frames need to have their program counter
decremented by one to properly find the function of the caller.
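
A small sketch of why the decrement matters: for a frame that is not an activation frame, the program counter is a return address, which may already fall into the address range of the next source line. The helper names, the `struct line_range` table and the addresses used with it are made up for illustration and are not part of perf or elfutils.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical line table entry: addresses in [start, end) map to line. */
struct line_range {
	uint64_t start, end;
	int line;
};

/*
 * Address to use for symbol and srcline lookup of an unwound frame.
 * For a non-activation frame, pc is a return address, so step back by
 * one to land inside the call instruction.
 */
static uint64_t frame_lookup_addr(uint64_t pc, bool is_activation)
{
	return is_activation ? pc : pc - 1;
}

/* Linear lookup in the toy line table; returns -1 if addr is unmapped. */
static int line_for_addr(const struct line_range *tbl, size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr >= tbl[i].start && addr < tbl[i].end)
			return tbl[i].line;
	return -1;
}
```

With a table in which a call on line 8 ends exactly where line 9 begins, looking up the raw return address yields line 9, while the adjusted address yields line 8.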

This fixes many cases where perf report currently attributes
the cost to the next line. For example, consider code like this:

~~~
  #include <thread>
  #include <chrono>

  using namespace std;

  int main()
  {
    this_thread::sleep_for(chrono::milliseconds(1000));
    this_thread::sleep_for(chrono::milliseconds(100));
    this_thread::sleep_for(chrono::milliseconds(10));

    return 0;
  }
~~~

Now compile and record it:

~~~
  g++ -std=c++11 -g -O2 test.cpp
  echo 1 | sudo tee /proc/sys/kernel/sched_schedstats
  perf record \
--event sched:sched_stat_sleep \
--event sched:sched_process_exit \
--event sched:sched_switch --call-graph=dwarf \
--output perf.data.raw \
./a.out
  echo 0 | sudo tee /proc/sys/kernel/sched_schedstats
  perf inject --sched-stat --input perf.data.raw --output perf.data
~~~

Before this patch, the report clearly shows the off-by-one issue.
Most notably, the last sleep invocation is incorrectly attributed
to the "return 0;" line:

~~~
  Overhead  Source:Line
  ........  ...........

   100.00%  core.c:0
            |
            ---__schedule core.c:0
               schedule
               do_nanosleep hrtimer.c:0
               hrtimer_nanosleep
               sys_nanosleep
               entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
               __nanosleep_nocancel .:0
               std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
               |
               |--90.08%--main test.cpp:9
               |          __libc_start_main
               |          _start
               |
               |--9.01%--main test.cpp:10
               |          __libc_start_main
               |          _start
               |
                --0.91%--main test.cpp:13
                          __libc_start_main
                          _start
~~~

With this patch applied, the issue is fixed. The report becomes
much more usable:

~~~
  Overhead  Source:Line
  ........  ...........

   100.00%  core.c:0
            |
            ---__schedule core.c:0
               schedule
               do_nanosleep hrtimer.c:0
               hrtimer_nanosleep
               sys_nanosleep
               entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
               __nanosleep_nocancel .:0
               std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
               |
               |--90.08%--main test.cpp:8
               |          __libc_start_main
               |          _start
               |
               |--9.01%--main test.cpp:9
               |          __libc_start_main
               |          _start
               |
                --0.91%--main test.cpp:10
                          __libc_start_main
                          _start
~~~

Similarly it works for signal frames:

~~~
  __noinline void bar(void)
  {
    volatile long cnt = 0;

    for (cnt = 0; cnt < 1; cnt++);
  }

  __noinline void foo(void)
  {
    bar();
  }

  void sig_handler(int sig)
  {
    foo();
  }

  int main(void)
  {
    signal(SIGUSR1, sig_handler);
    raise(SIGUSR1);

    foo();
    return 0;
  }
~~~

Before, the report wrongly points to `signal.c:29` after raise():

~~~
  $ perf report --stdio --no-children -g srcline -s srcline
  ...
   100.00%  signal.c:11
            |
            ---bar signal.c:11
               |
               |--50.49%--main signal.c:29
               |          __libc_start_main
               |          _start
               |
                --49.51%--0x33a8f
                          raise .:0
                          main signal.c:29
                          __libc_start_main
                          _start
~~~

With this patch in, the issue is fixed and we instead get:

~~~
   100.00%  signal   signal  [.] bar
            |
            ---bar signal.c:11
               |
               |--50.49%--main signal.c:29
               |          __libc_start_main
               |          _start
               |
                --49.51%--0x33a8f
                          raise .:0
                          main signal.c:27
                          __libc_start_main
                          _start
~~~

Note how this patch fixes this issue for both unwinding methods, i.e.
both dwfl and libunwind. The former case is straight-forward thanks
to dwfl_frame_pc(). For libunwind, we replace the functionality via


[tip:perf/core] perf report: Enable sorting by srcline as key

2017-03-27 Thread tip-bot for Milian Wolff
Commit-ID:  5dfa210e407d0fedf746958bff206995bd46570d
Gitweb: http://git.kernel.org/tip/5dfa210e407d0fedf746958bff206995bd46570d
Author: Milian Wolff 
AuthorDate: Sat, 18 Mar 2017 22:49:28 +0100
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 27 Mar 2017 12:13:28 -0300

perf report: Enable sorting by srcline as key

Often it is interesting to know how costly a given source line is in
total. Previously, one had to build these sums manually based on all
addresses that pointed to the same source line. This patch introduces
srcline as a sort key, which will do the aggregation for us.
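
To make the aggregation concrete, here is a minimal sketch of summing sample costs per source line instead of per address. It is not how perf implements hist entries; `struct sample`, `struct bucket` and `aggregate_by_srcline` are hypothetical names, and any numbers fed into it are arbitrary.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sample: one resolved address and its cost. */
struct sample {
	const char *srcline;   /* e.g. "random.tcc:3326" */
	unsigned long period;  /* cost attributed to this sample */
};

/* Hypothetical aggregate: total cost of all samples on one source line. */
struct bucket {
	const char *srcline;
	unsigned long total;
};

/*
 * Sum periods per srcline. Returns the number of buckets used, at most cap.
 * Samples that would need a new bucket when out[] is full are dropped.
 */
static size_t aggregate_by_srcline(const struct sample *s, size_t n,
				   struct bucket *out, size_t cap)
{
	size_t used = 0;

	for (size_t i = 0; i < n; i++) {
		size_t j;

		/* Find an existing bucket for this source line. */
		for (j = 0; j < used; j++)
			if (strcmp(out[j].srcline, s[i].srcline) == 0)
				break;

		if (j == used) {
			if (used == cap)
				continue;
			out[used].srcline = s[i].srcline;
			out[used].total = 0;
			used++;
		}
		out[j].total += s[i].period;
	}
	return used;
}
```

Several addresses that resolve to the same file and line collapse into one bucket, which is the effect the new sort key has on the report.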

Paired with the recent addition of showing inline frames, this makes
perf report much more useful for many C++ workloads.

The following shows the new feature in action. First, let's show the
status quo output when we sort by address. The result contains many hist
entries that generate the same output:

  
  $ perf report --stdio --inline -g address
  # Children      Self  Command       Shared Object  Symbol
  # ........  ........  ............  .............  ......
  #
      99.89%    35.34%  cpp-inlining  cpp-inlining   [.] main
              |
              |--64.55%--main complex:655
              |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
              |          /usr/include/c++/6.3.1/complex:664 (inline)
              |          |
              |          |--60.31%--hypot +20
              |          |          |
              |          |          |--8.52%--__hypot_finite +273
              |          |          |
              |          |          |--7.32%--__hypot_finite +411
  ...
               --35.34%--_start +4194346
                         __libc_start_main +241
                         |
                         |--6.65%--main random.tcc:3326
                         |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                         |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
                         |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
                         |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
                         |
                         |--2.70%--main random.tcc:3326
                         |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                         |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
                         |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
                         |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
                         |
                         |--1.69%--main random.tcc:3326
                         |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                         |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
                         |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
                         |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
  ...
  

With this patch and `-g srcline` we instead get the following output:

  
  $ perf report --stdio --inline -g srcline
  # Children      Self  Command       Shared Object  Symbol
  # ........  ........  ............  .............  ......
  #
      99.89%    35.34%  cpp-inlining  cpp-inlining   [.] main
              |
              |--64.55%--main complex:655
              |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
              |          /usr/include/c++/6.3.1/complex:664 (inline)
              |          |
              |          |--64.02%--hypot
              |          |          |
              |          |           --59.81%--__hypot_finite
              |          |
              |           --0.53%--cabs
              |
               --35.34%--_start
                         __libc_start_main
                         |
                         |--12.48%--main random.tcc:3326
                         |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                         |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
                         |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
                         |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
  ...
  

Signed-off-by: Milian Wolff 
Cc: Jiri Olsa 
Cc: Yao Jin 
Link: http://lkml.kernel.org/r/20170318214928.9047-1-milian.wo...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-report.txt |  1 +
 tools/perf/ui/browsers/hists.c   |  3 +-
 tools/perf/ui/stdio/hist.c   |  3 +-
 tools/perf/util/annotate.c 


[tip:perf/core] perf evsel: Allow specifying a file to output in perf_evsel__print_ip

2016-04-13 Thread tip-bot for Milian Wolff
Commit-ID:  6186de9a491af030889b372193fc9f38c248e69a
Gitweb: http://git.kernel.org/tip/6186de9a491af030889b372193fc9f38c248e69a
Author: Milian Wolff 
AuthorDate: Mon, 11 Apr 2016 10:18:11 -0300
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 11 Apr 2016 22:18:14 -0300

perf evsel: Allow specifying a file to output in perf_evsel__print_ip

As this function will be used in 'perf trace'.
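
The general shape of the change, sketched outside of perf: a printing helper takes the destination stream as a parameter instead of writing to stdout unconditionally. `print_ip` below is a hypothetical, much reduced helper and not the real perf_evsel__print_ip.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: print one instruction pointer, right-aligned in a
 * 16 character wide hex field, to the stream chosen by the caller.
 * Returns what fprintf returns (characters written, or a negative value).
 */
static int print_ip(FILE *fp, uint64_t ip)
{
	return fprintf(fp, "%16" PRIx64 "\n", ip);
}
```

A caller that wants the old behavior passes stdout, as the 'perf script' call sites in the diff below do, while another tool can pass a different stream.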

Cc: Jiri Olsa 
Link: http://lkml.kernel.org/n/tip-8x297v9utnxq77onikevv...@git.kernel.org
[ Split from a larger patch ]
Signed-off-by: Milian Wolff 
---
 tools/perf/builtin-script.c |  4 ++--
 tools/perf/util/session.c   | 39 +--
 tools/perf/util/session.h   |  3 ++-
 3 files changed, 25 insertions(+), 21 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 8f6ab2a..dbf208f 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -580,7 +580,7 @@ static void print_sample_bts(struct perf_sample *sample,
}
}
perf_evsel__print_ip(evsel, sample, al, print_opts,
-scripting_max_stack);
+scripting_max_stack, stdout);
}
 
/* print branch_to information */
@@ -790,7 +790,7 @@ static void process_event(struct perf_script *script,
 
perf_evsel__print_ip(evsel, sample, al,
 output[attr->type].print_ip_opts,
-scripting_max_stack);
+scripting_max_stack, stdout);
}
 
if (PRINT_FIELD(IREGS))
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index ef37055..bbac0ef 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1955,7 +1955,8 @@ struct perf_evsel *perf_session__find_first_evtype(struct perf_session *session,
 
 void perf_evsel__print_ip(struct perf_evsel *evsel, struct perf_sample *sample,
  struct addr_location *al,
- unsigned int print_opts, unsigned int stack_depth)
+ unsigned int print_opts, unsigned int stack_depth,
+ FILE *fp)
 {
struct callchain_cursor_node *node;
int print_ip = print_opts & PRINT_IP_OPT_IP;
@@ -1992,33 +1993,35 @@ void perf_evsel__print_ip(struct perf_evsel *evsel, struct perf_sample *sample,
goto next;
 
if (print_ip)
-   printf("%c%16" PRIx64, s, node->ip);
+   fprintf(fp, "%c%16" PRIx64, s, node->ip);
 
if (node->map)
addr = node->map->map_ip(node->map, node->ip);
 
if (print_sym) {
-   printf(" ");
+   fprintf(fp, " ");
if (print_symoffset) {
node_al.addr = addr;
node_al.map  = node->map;
-   symbol__fprintf_symname_offs(node->sym, &node_al, stdout);
+   symbol__fprintf_symname_offs(node->sym,
+                                &node_al,
+                                fp);
} else
-   symbol__fprintf_symname(node->sym, stdout);
+   symbol__fprintf_symname(node->sym, fp);
}
 
if (print_dso) {
-   printf(" (");
-   map__fprintf_dsoname(node->map, stdout);
-   printf(")");
+   fprintf(fp, " (");
+   map__fprintf_dsoname(node->map, fp);
+   fprintf(fp, ")");
}
 
if (print_srcline)
map__fprintf_srcline(node->map, addr, "\n  ",
-stdout);
+fp);
 
if (!print_oneline)
-   printf("\n");
+   fprintf(fp, "\n");
 
stack_depth--;
 next:
@@ -2030,25 +2033,25 @@ next:
return;
 
if (print_ip)
-   printf("%16" PRIx64, sample->ip);
+   fprintf(fp, "%16" PRIx64, sample->ip);
 
if (print_sym) {
-   printf(" ");
+   fprintf(fp, " ");
if (print_symoffset)
symbol__fprintf_symname_offs(al->sym, al,
-   

[tip:perf/core] perf trace: Write to stderr by default

2015-08-06 Thread tip-bot for Milian Wolff
Commit-ID:  007d66a0bd43d886eb3e4aceaf1a96b8743ccaff
Gitweb: http://git.kernel.org/tip/007d66a0bd43d886eb3e4aceaf1a96b8743ccaff
Author: Milian Wolff 
AuthorDate: Wed, 5 Aug 2015 16:52:23 -0300
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 5 Aug 2015 16:52:23 -0300

perf trace: Write to stderr by default

Without this patch, it is cumbersome to read the trace output while
ignoring the normal, potentially verbose, output of the debuggee.  One
common example is doing something like the following:

 perf trace -s find /tmp > /dev/null

Without this patch, the trace summary will be lost. Now, it will still
be printed at the end. This behavior is also applied by strace.
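
A minimal sketch of the pattern, with hypothetical names (`struct trace_opts`, `trace_opts_default`, `emit_summary`) that only mirror the `.output` field touched by the diff below: the tool keeps its output stream in a field that defaults to stderr, so redirecting the traced program's stdout does not swallow the summary.

```c
#include <stdio.h>

/* Hypothetical, reduced options struct with just the output stream. */
struct trace_opts {
	FILE *output;
};

/* Default to stderr so "> /dev/null" on the command line keeps the summary. */
static struct trace_opts trace_opts_default(void)
{
	struct trace_opts o = { .output = stderr };
	return o;
}

/* Write a one-line summary to whichever stream the options select. */
static int emit_summary(const struct trace_opts *o, unsigned long nr_syscalls)
{
	return fprintf(o->output, "syscalls: %lu\n", nr_syscalls);
}
```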

Cc: Milian Wolff 
Cc: David Ahern 
Link: http://lkml.kernel.org/n/tip-tqnks6y2cnvm5f9g2dsfr...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-trace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 98d423e..a474970 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2965,7 +2965,7 @@ int cmd_trace(int argc, const char **argv, const char *prefix __maybe_unused)
.mmap_pages= UINT_MAX,
.proc_map_timeout  = 500,
},
-   .output = stdout,
+   .output = stderr,
.show_comm = true,
.trace_syscalls = true,
};

