Re: timing information with perf

Kornievskaia, Olga Thu, 12 Nov 2015 15:10:13 -0800

> On Nov 12, 2015, at 4:57 PM, Arnaldo Carvalho de Melo <a...@kernel.org> wrote:
> 
> Em Thu, Nov 12, 2015 at 09:07:20PM +0000, Kornievskaia, Olga escreveu:
>> Hi, 
>> 
>> Can somebody tell me if perf can do “timing-of-operation” type of 
>> measurement? By that I mean, say there are two tracepoints “foobar_enter” 
>> and “foobar_exit”. Each tracepoint logs a timestamp. I’d like to be able to 
>> say that on average it takes that many time-units between “enter” and “exit” 
>> tracepoints. I haven’t been able to find something like that with “perf”.
> 
> Well, you can do:
> 
>  # perf record -a -e sched:*waking*,sched:*wakeup* sleep 2
>  [ perf record: Woken up 1 times to write data ]
>  [ perf record: Captured and wrote 0.199 MB perf.data (156 samples) ]
>  # perf script -g python
>  generated Python script: perf-script.py
>  # grep '^def ' perf-script.py 
>  def trace_begin():
>  def trace_end():
>  def sched__sched_wakeup_new(event_name, context, common_cpu,
>  def sched__sched_wakeup(event_name, context, common_cpu,
>  def sched__sched_waking(event_name, context, common_cpu,
>  def trace_unhandled(event_name, context, event_fields_dict):
>  def print_header(event_name, cpu, secs, nsecs, pid, comm):
>  #
> 
> The prototype for sched__sched_wakeup(), for instance is:
> 
> def sched__sched_wakeup(event_name, context, common_cpu,
>        common_secs, common_nsecs, common_pid, common_comm,
>        common_callchain, comm, pid, prio, success,
>        target_cpu):
> 
> Then do some minimal coding for what you want, because you need
> to match it using some key, i.e. same pid is enough for some cases, for
> others you need pid + fd, etc.
> 
>       If you prefer perl, just use 'perf script -g perl' instead.
> 
>       If all you want is for syscalls, then 'perf trace' does get
> enter+exit and does the math for you:
> 
>  # perf trace usleep 1 2>&1 | head -10
>     0.083 ( 0.003 ms): usleep/5487 brk(                                       
>                            ) = 0x560cf53db000
>     0.100 ( 0.003 ms): usleep/5487 mmap(len: 4096, prot: READ|WRITE, flags: 
> PRIVATE|ANONYMOUS, fd: -1    ) = 0x7f38f3617000
>     0.108 ( 0.005 ms): usleep/5487 access(filename: 0xf3415270, mode: R       
>                            ) = -1 ENOENT No such file or directory
>     0.118 ( 0.005 ms): usleep/5487 open(filename: 0xf3413b92, flags: CLOEXEC  
>                            ) = 3
>     0.122 ( 0.003 ms): usleep/5487 fstat(fd: 3, statbuf: 0x7ffeaa0ad9a0       
>                            ) = 0
>     0.126 ( 0.003 ms): usleep/5487 mmap(len: 111605, prot: READ, flags: 
> PRIVATE, fd: 3                   ) = 0x7f38f35fb000
>     0.128 ( 0.001 ms): usleep/5487 close(fd: 3                                
>                            ) = 0
>     0.139 ( 0.004 ms): usleep/5487 open(filename: 0xf360c37c, flags: CLOEXEC  
>                            ) = 3
>     0.142 ( 0.002 ms): usleep/5487 read(fd: 3, buf: 0x7ffeaa0adb30, count: 
> 832                           ) = 832
>     0.145 ( 0.001 ms): usleep/5487 fstat(fd: 3, statbuf: 0x7ffeaa0ad9d0       
>                            ) = 0
>  # 
> 
> And you can specify, just like with strace, what are the syscalls you want (or
> the ones you don't, prefixing with !):
> 
>  # perf trace -e nanosleep sleep 1
>  1000.744 (1000.239 ms): sleep/5661 nanosleep(rqtp: 0x7ffc4dce2960            
>                             ) = 0
>  # 
> 
> You can also mix it with any other tracepoint, including any you have created
> with 'perf probe', for kernel or userspace:
> 
>  # perf trace --event sched:*exit,sched:*switch,kmem:kmalloc -e nanosleep 
> sleep 1 2>&1 | tail -5
>     0.343 (           ): kmem:kmalloc:(perf_event_mmap+0xc7) 
> call_site=ffffffff81194617 ptr=0xffff88022a4ad000 bytes_req=4096 
> bytes_alloc=4096 gfp_flags=GFP_KERNEL)
>     0.484 (   0.006 ms): sleep/6081 nanosleep(rqtp: 0x7ffdbcc9b6e0            
>                             ) ...
>     0.484 (           ): sched:sched_switch:sleep:6081 [120] S ==> tail:6080 
> [120])
>  1000.664 (1000.186 ms): sleep/6081  ... [continued]: nanosleep()) = 0
>  1000.800 (           ): sched:sched_process_exit:comm=sleep pid=6081 
> prio=120)
>  #


Thank you Arnaldo for the answer. 

Just to elaborate further, what I’m interested in timings of NFS operations, so 
I’m not using any new tracepoints, but just the ones that are present in the 
code. However since NFS doesn’t various things outside of the process calling 
into NFS then will “perf trace” still work? 

I’m using “perf record -e ‘nfs4:*’,’nfs:*’ -ag” command to generate tracing 
info. That captures tracepoints executed by the application as well as the 
kernel threads that do asynchronous IO. 

> 
> - Arnaldo

Re: timing information with perf

Reply via email to