On Mon, Mar 15, 2010 at 10:04:54PM +0100, Frederic Weisbecker wrote:
> On Mon, Mar 15, 2010 at 04:46:15PM +1100, Paul Mackerras wrote:
> >     14.99%  perf  [kernel.kallsyms]  [k] ._raw_spin_lock
> >                 |
> >                 --- ._raw_spin_lock
> >                    |
> >                    |--25.00%-- .alloc_fd
> >                    |          (nil)
> >                    |          |
> >                    |          |--50.00%-- .anon_inode_getfd
> >                    |          |          .sys_perf_event_open
> >                    |          |          syscall_exit
> >                    |          |          syscall
> >                    |          |          create_counter
> >                    |          |          __cmd_record
> >                    |          |          run_builtin
> >                    |          |          main
> >                    |          |          0xfd2e704
> >                    |          |          0xfd2e8c0
> >                    |          |          (nil)
> >
> > ... etc.
> >
> > Signed-off-by: Paul Mackerras <pau...@samba.org>
>
> Cool!

By the way, I notice that gcc tends to inline the tracing functions,
which means that by going up 2 stack frames we miss some of the
functions. For example, for the lock:lock_acquire event, we have

    _raw_spin_lock() -> lock_acquire() -> trace_lock_acquire() ->
    perf_trace_lock_acquire() -> perf_trace_templ_lock_acquire() ->
    perf_fetch_caller_regs() -> perf_arch_fetch_caller_regs().

But in the ppc64 kernel binary I just built, gcc inlined
trace_lock_acquire in lock_acquire, and perf_trace_templ_lock_acquire
in perf_trace_lock_acquire. Given that perf_fetch_caller_regs is
explicitly inlined, going up two levels from perf_fetch_caller_regs
gets us to _raw_spin_lock, whereas I think you intended it to get us
to trace_lock_acquire.

I'm not sure what to do about that - any thoughts?

Paul.
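To make the frame arithmetic above concrete, here is a small user-space model of it. This is only a sketch, not kernel code: `struct call`, `frame_after_skip()`, `self_check()` and the two tables are hypothetical names made up for illustration. It assumes that an inlined function has no stack frame of its own and shares the frame of its nearest non-inlined caller. The chain starts at perf_fetch_caller_regs, since that is where the "two levels" are counted from.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One entry in the source-level call chain, innermost function first. */
struct call {
    const char *name;
    int inlined;    /* nonzero: merged into its caller, no frame of its own */
};

/*
 * Hypothetical helper: name of the function that owns the physical stack
 * frame reached by going up `skip` frames from the innermost function.
 * Returns NULL if the walk runs off the end of the chain.
 */
static const char *frame_after_skip(const struct call *chain, size_t n,
                                    unsigned int skip)
{
    size_t i = 0;

    /* The starting frame belongs to the first non-inlined function. */
    while (i < n && chain[i].inlined)
        i++;

    while (i < n && skip > 0) {
        i++;                            /* step to the caller ... */
        while (i < n && chain[i].inlined)
            i++;                        /* ... skipping inlined callers */
        skip--;
    }
    return i < n ? chain[i].name : NULL;
}

/* Only perf_fetch_caller_regs inlined (the layout the code seems to expect). */
static const struct call expected_layout[] = {
    { "perf_fetch_caller_regs",        1 },
    { "perf_trace_templ_lock_acquire", 0 },
    { "perf_trace_lock_acquire",       0 },
    { "trace_lock_acquire",            0 },
    { "lock_acquire",                  0 },
    { "_raw_spin_lock",                0 },
};

/* Inlining as described for the ppc64 build in the mail above. */
static const struct call ppc64_layout[] = {
    { "perf_fetch_caller_regs",        1 },
    { "perf_trace_templ_lock_acquire", 1 },
    { "perf_trace_lock_acquire",       0 },
    { "trace_lock_acquire",            1 },
    { "lock_acquire",                  0 },
    { "_raw_spin_lock",                0 },
};

/* Checks that the model reproduces both outcomes described in the mail. */
static void self_check(void)
{
    assert(strcmp(frame_after_skip(expected_layout, 6, 2),
                  "trace_lock_acquire") == 0);
    assert(strcmp(frame_after_skip(ppc64_layout, 6, 2),
                  "_raw_spin_lock") == 0);
}
```

With a skip of 2, the first table ends at trace_lock_acquire and the second at _raw_spin_lock, which is the discrepancy described above. The model says nothing about how to fix it; it only shows that a fixed skip count depends on the compiler's inlining decisions.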