On Thu, 23 Feb 2017 16:50:18 +0900
Masami Hiramatsu <mhira...@kernel.org> wrote:

[sorry for the delay, I just saw this]

> perf record -g dwarf (and perf report) doesn't show correct callchain
> on aarch64. Here is how to reproduce it.
...
> # Samples: 6K of event 'cpu-clock:u'
> # Event count (approx.): 1623750000
> #
> # Children      Self  Command  Shared Object  Symbol                    
> # ........  ........  .......  .............  ..........................
> #
>     17.21%    17.21%  main     main           [.] func2
>             |
>             ---func2
> 
>     17.09%    17.09%  main     main           [.] func1
>             |
>             ---func1
> 
>     16.67%    16.67%  main     main           [.] main
>             |
>             ---main
> .....
> 
> So, as you can see, the call graph reported each function has been
> called from itself. If I report it with fp as below, perf reported
> correct callgraph.
...
> I guess there is a bug in libunwind on aarch64 or we missed to pass
> the stack data to libunwind. (BTW, it works correctly on arm32)

Trying to replicate this on a Debian 9 ("stretch") arm64 box:

Building acme's 'perf/urgent' branch (currently with the tag
perf-urgent-for-mingo-4.11-20170317), natively (cd tools; make clean;
make DEBUG=5 -C perf) shows this system has unwind support:

Auto-detecting system features:
...                         dwarf: [ on  ]
...            dwarf_getlocations: [ on  ]
...                         glibc: [ on  ]
...                          gtk2: [ on  ]
...                      libaudit: [ on  ]
...                        libbfd: [ on  ]
...                        libelf: [ on  ]
...                       libnuma: [ on  ]
...        numa_num_possible_cpus: [ on  ]
...                       libperl: [ OFF ]
...                     libpython: [ on  ]
...                      libslang: [ on  ]
...                     libcrypto: [ on  ]
...                     libunwind: [ on  ]
...            libdw-dwarf-unwind: [ on  ]
...                          zlib: [ on  ]
...                          lzma: [ on  ]
...                     get_cpuid: [ OFF ]
...                           bpf: [ on  ]

for which 'apt search unwind' returns the version:

libunwind-dev/testing,now 1.1-4.1 arm64 [installed]
  library to determine the call-chain of a program - development
libunwind8/testing,now 1.1-4.1 arm64 [installed,automatic]
  library to determine the call-chain of a program - runtime

Continuing, and ignoring the "No debug_frame support" warnings that the
perf build emits:

Makefile.config:421: No debug_frame support found in libunwind-aarch64
Makefile.config:480: No debug_frame support found in libunwind

$ ./perf --version
perf version 4.10.rc4.ge7ede72
$ gcc --version
gcc (Debian 6.3.0-6) 6.3.0 20170205
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gcc -O0 -ggdb3 -funwind-tables -o main main.c
$ ./perf record -g --call-graph dwarf,1024 -e cpu-clock:u -o /tmp/perf.data -- ./main
^C[ perf record: Woken up 121 times to write data ]
[ perf record: Captured and wrote 30.154 MB /tmp/perf.data (22975 samples) ]

$ ./perf --no-pager report -i /tmp/perf.data --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 22K of event 'cpu-clock:u'
# Event count (approx.): 5743750000
#
# Children      Self  Command  Shared Object  Symbol               
# ........  ........  .......  .............  .....................
#
   100.00%     8.14%  main     main           [.] main
            |          
            |--91.86%--main
            |          func0
            |          |          
            |           --76.41%--func1
            |                     |          
            |                      --60.82%--func2
            |                                |          
            |                                 --45.31%--func3
            |                                           |          
            |                                            --30.17%--func4
            |                                                      |          
            |                                                       --15.04%--func
            |          
             --8.14%--__libc_start_main
                       main
...

which looks like it should, i.e., I can't reproduce.

You mentioned you're using the 'latest' sources for libunwind, etc.,
but can you provide more exact details like commit IDs, and what, if
anything, is being cross-built vs. native?

Thanks,

Kim
