On Thu, 23 Feb 2017 16:50:18 +0900 Masami Hiramatsu <mhira...@kernel.org> wrote:
[sorry for the delay, I just saw this]

> perf record -g dwarf (and perf report) doesn't show correct callchain
> on aarch64. Here is how to reproduce it.
...
> # Samples: 6K of event 'cpu-clock:u'
> # Event count (approx.): 1623750000
> #
> # Children      Self  Command  Shared Object  Symbol
> # ........  ........  .......  .............  ..........................
> #
>     17.21%    17.21%  main     main           [.] func2
>             |
>             ---func2
>
>     17.09%    17.09%  main     main           [.] func1
>             |
>             ---func1
>
>     16.67%    16.67%  main     main           [.] main
>             |
>             ---main
> .....
>
> So, as you can see, the call graph reported each function has been
> called from itself. If I report it with fp as below, perf reported
> correct callgraph.
...
> I guess there is a bug in libunwind on aarch64 or we missed to pass
> the stack data to libunwind. (BTW, it works correctly on arm32)

Trying to replicate this on a debian 9 ("stretch") arm64 box:

Building acme's 'perf/urgent' branch (currently with the tag
perf-urgent-for-mingo-4.11-20170317), natively (cd tools; make clean;
make DEBUG=5 -C perf) shows this system has unwind support:

Auto-detecting system features:
...                         dwarf: [ on  ]
...            dwarf_getlocations: [ on  ]
...                         glibc: [ on  ]
...                          gtk2: [ on  ]
...                      libaudit: [ on  ]
...                        libbfd: [ on  ]
...                        libelf: [ on  ]
...                       libnuma: [ on  ]
...        numa_num_possible_cpus: [ on  ]
...                       libperl: [ OFF ]
...                     libpython: [ on  ]
...                      libslang: [ on  ]
...                     libcrypto: [ on  ]
...                     libunwind: [ on  ]
...            libdw-dwarf-unwind: [ on  ]
...                          zlib: [ on  ]
...                          lzma: [ on  ]
...                     get_cpuid: [ OFF ]
...                           bpf: [ on  ]

for which an apt search unwind returns the version:

libunwind-dev/testing,now 1.1-4.1 arm64 [installed]
  library to determine the call-chain of a program - development

libunwind8/testing,now 1.1-4.1 arm64 [installed,automatic]
  library to determine the call-chain of a program - runtime

continuing, and ignoring the no debug_frame support perf configure
mentions:

Makefile.config:421: No debug_frame support found in libunwind-aarch64
Makefile.config:480: No debug_frame support found in libunwind

$ ./perf --version
perf version 4.10.rc4.ge7ede72
$ gcc --version
gcc (Debian 6.3.0-6) 6.3.0 20170205
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gcc -O0 -ggdb3 -funwind-tables -o main main.c
$ ./perf record -g --call-graph dwarf,1024 -e cpu-clock:u -o /tmp/perf.data -- ./main
^C[ perf record: Woken up 121 times to write data ]
[ perf record: Captured and wrote 30.154 MB /tmp/perf.data (22975 samples) ]

$ ./perf --no-pager report -i /tmp/perf.data --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 22K of event 'cpu-clock:u'
# Event count (approx.): 5743750000
#
# Children      Self  Command  Shared Object  Symbol
# ........  ........  .......  .............  .....................
#
   100.00%     8.14%  main     main           [.] main
            |
            |--91.86%--main
            |          func0
            |          |
            |           --76.41%--func1
            |                     |
            |                      --60.82%--func2
            |                                |
            |                                 --45.31%--func3
            |                                           |
            |                                            --30.17%--func4
            |                                                      |
            |                                                       --15.04%--func
            |
             --8.14%--__libc_start_main
                       main
...

which looks like it should, i.e., I can't reproduce.

You mentioned you're using the 'latest' sources for libunwind, etc.,
but can you provide more exact details like commit IDs, and what, if
anything, is being cross-built vs. native?

Thanks,

Kim