Re: [PATCH 00/32] tracing: Inter-event (e.g. latency) support
Hi Sebastian, On Tue, 2017-07-04 at 18:12 +0200, Sebastian Andrzej Siewior wrote: > On 2017-06-26 17:49:00 [-0500], Tom Zanussi wrote: > > This patchset adds support for 'inter-event' quantities to the trace > > event subsystem. The most important example of inter-event quantities > > are latencies, or the time differences between two events. > … > > I took this series and pushed into my v4.11 RT tree and removed the > "old" latency hist patches I had there. That's great, thanks! > Please keep me in CC if you post new versions of this series. > I will, and will have a new version coming soon. ;-) Tom > Sebastian
Re: [PATCH 00/32] tracing: Inter-event (e.g. latency) support
Hi Sebastian, On Tue, 2017-07-04 at 18:12 +0200, Sebastian Andrzej Siewior wrote: > On 2017-06-26 17:49:00 [-0500], Tom Zanussi wrote: > > This patchset adds support for 'inter-event' quantities to the trace > > event subsystem. The most important example of inter-event quantities > > are latencies, or the time differences between two events. > … > > I took this series and pushed into my v4.11 RT tree and removed the > "old" latency hist patches I had there. That's great, thanks! > Please keep me in CC if you post new versions of this series. > I will, and will have a new version coming soon. ;-) Tom > Sebastian
Re: [PATCH 00/32] tracing: Inter-event (e.g. latency) support
Hi Joel, Sorry for the delayed reply, was out for the holidays.. On Sat, 2017-07-01 at 00:01 -0700, Joel Fernandes (Google) wrote: > Hi Tom, > Nice series and nice ELC talk as well. Thanks. > > On Mon, Jun 26, 2017 at 3:49 PM, Tom Zanussi >wrote: > > This patchset adds support for 'inter-event' quantities to the trace > > event subsystem. The most important example of inter-event quantities > > are latencies, or the time differences between two events. > > I tried your patches out and they are working fine for me. I will test > them out more. > > I think for the wakeup latency in your examples, its better / more > accurate to use sched_waking instead of sched_wakeup tracepoint? [1] > Yeah, makes sense - thanks for pointing it out. > Also, one other comment I had is, it would be nice to suppress the > output of individual trace events that are part of the synthetic event > into the trace buffer. Otherwise I feel the value of it is slightly > diminished - because one can simply post-process the individual > non-synthetic trace events themselves to get wake up latencies which > the synthetic events is trying to calculate in the first place. > Inorder to conserve space, if a user doesn't care about individual > events, and just the synthetic events then the individual ones > shouldn't be written to the trace buffer IMO. > Yes, you're right, this is also something I need to fix. I added a flag to avoid discards, but this has the effect of, well, avoiding discards, so unwanted events appear in the trace buffer. Also, regardless, any trigger that exists simply to provide input to a synthetic event should be filtered out, regardless of the enable state. I'll have to think about a better way do handle all this... Thanks, Tom > -Joel > > [1] commit fbd705a0c6184580d0e2fbcbd47a37b6e5822511 (sched: Introduce > the 'trace_sched_waking' tracepoint)
Re: [PATCH 00/32] tracing: Inter-event (e.g. latency) support
Hi Joel, Sorry for the delayed reply, was out for the holidays.. On Sat, 2017-07-01 at 00:01 -0700, Joel Fernandes (Google) wrote: > Hi Tom, > Nice series and nice ELC talk as well. Thanks. > > On Mon, Jun 26, 2017 at 3:49 PM, Tom Zanussi > wrote: > > This patchset adds support for 'inter-event' quantities to the trace > > event subsystem. The most important example of inter-event quantities > > are latencies, or the time differences between two events. > > I tried your patches out and they are working fine for me. I will test > them out more. > > I think for the wakeup latency in your examples, its better / more > accurate to use sched_waking instead of sched_wakeup tracepoint? [1] > Yeah, makes sense - thanks for pointing it out. > Also, one other comment I had is, it would be nice to suppress the > output of individual trace events that are part of the synthetic event > into the trace buffer. Otherwise I feel the value of it is slightly > diminished - because one can simply post-process the individual > non-synthetic trace events themselves to get wake up latencies which > the synthetic events is trying to calculate in the first place. > Inorder to conserve space, if a user doesn't care about individual > events, and just the synthetic events then the individual ones > shouldn't be written to the trace buffer IMO. > Yes, you're right, this is also something I need to fix. I added a flag to avoid discards, but this has the effect of, well, avoiding discards, so unwanted events appear in the trace buffer. Also, regardless, any trigger that exists simply to provide input to a synthetic event should be filtered out, regardless of the enable state. I'll have to think about a better way do handle all this... Thanks, Tom > -Joel > > [1] commit fbd705a0c6184580d0e2fbcbd47a37b6e5822511 (sched: Introduce > the 'trace_sched_waking' tracepoint)
Re: [PATCH 00/32] tracing: Inter-event (e.g. latency) support
On 2017-06-26 17:49:00 [-0500], Tom Zanussi wrote: > This patchset adds support for 'inter-event' quantities to the trace > event subsystem. The most important example of inter-event quantities > are latencies, or the time differences between two events. … I took this series and pushed into my v4.11 RT tree and removed the "old" latency hist patches I had there. Please keep me in CC if you post new versions of this series. Sebastian
Re: [PATCH 00/32] tracing: Inter-event (e.g. latency) support
On 2017-06-26 17:49:00 [-0500], Tom Zanussi wrote: > This patchset adds support for 'inter-event' quantities to the trace > event subsystem. The most important example of inter-event quantities > are latencies, or the time differences between two events. … I took this series and pushed into my v4.11 RT tree and removed the "old" latency hist patches I had there. Please keep me in CC if you post new versions of this series. Sebastian
Re: [PATCH 00/32] tracing: Inter-event (e.g. latency) support
Hi Tom, Nice series and nice ELC talk as well. Thanks. On Mon, Jun 26, 2017 at 3:49 PM, Tom Zanussiwrote: > This patchset adds support for 'inter-event' quantities to the trace > event subsystem. The most important example of inter-event quantities > are latencies, or the time differences between two events. I tried your patches out and they are working fine for me. I will test them out more. I think for the wakeup latency in your examples, its better / more accurate to use sched_waking instead of sched_wakeup tracepoint? [1] Also, one other comment I had is, it would be nice to suppress the output of individual trace events that are part of the synthetic event into the trace buffer. Otherwise I feel the value of it is slightly diminished - because one can simply post-process the individual non-synthetic trace events themselves to get wake up latencies which the synthetic events is trying to calculate in the first place. Inorder to conserve space, if a user doesn't care about individual events, and just the synthetic events then the individual ones shouldn't be written to the trace buffer IMO. -Joel [1] commit fbd705a0c6184580d0e2fbcbd47a37b6e5822511 (sched: Introduce the 'trace_sched_waking' tracepoint)
Re: [PATCH 00/32] tracing: Inter-event (e.g. latency) support
Hi Tom, Nice series and nice ELC talk as well. Thanks. On Mon, Jun 26, 2017 at 3:49 PM, Tom Zanussi wrote: > This patchset adds support for 'inter-event' quantities to the trace > event subsystem. The most important example of inter-event quantities > are latencies, or the time differences between two events. I tried your patches out and they are working fine for me. I will test them out more. I think for the wakeup latency in your examples, its better / more accurate to use sched_waking instead of sched_wakeup tracepoint? [1] Also, one other comment I had is, it would be nice to suppress the output of individual trace events that are part of the synthetic event into the trace buffer. Otherwise I feel the value of it is slightly diminished - because one can simply post-process the individual non-synthetic trace events themselves to get wake up latencies which the synthetic events is trying to calculate in the first place. Inorder to conserve space, if a user doesn't care about individual events, and just the synthetic events then the individual ones shouldn't be written to the trace buffer IMO. -Joel [1] commit fbd705a0c6184580d0e2fbcbd47a37b6e5822511 (sched: Introduce the 'trace_sched_waking' tracepoint)
Re: [PATCH 00/32] tracing: Inter-event (e.g. latency) support
Hi Masami, On Wed, 2017-06-28 at 23:21 +0900, Masami Hiramatsu wrote: > Hi Tom, > > On Mon, 26 Jun 2017 17:49:01 -0500 > Tom Zanussiwrote: > > > This patchset adds support for 'inter-event' quantities to the trace > > event subsystem. The most important example of inter-event quantities > > are latencies, or the time differences between two events. > > > > Thank you for your great work! > I'm playing this and found some issues. > > (1) new event format > > # echo "wakeup_latency u64 lat; pid_t pid; int prio" > synthetic_events > # cat /sys/kernel/debug/tracing # cat events/synthetic/wakeup_latency/format > name: wakeup_latency > ID: 972 > format: > field:unsigned short common_type; offset:0; size:2; > signed:0; > field:unsigned char common_flags; offset:2; size:1; > signed:0; > field:unsigned char common_preempt_count; offset:3; > size:1;signed:0; > field:int common_pid; offset:4; size:4; signed:1; > > field:u64 lat; offset:16; size:8; signed:0; > field:pid_t pid;offset:24; size:4; signed:1; > field:int prio; offset:32; size:4; signed:1; > > print fmt: "lat: 0x%08lx, pid: 0x%08lx, prio: 0x%08lx", ((u64)(REC->lat)), > ((u64)(REC->pid)), ((u64)(REC->prio)) > > Here, IMO, this format would be better "lat=0x%08lx pid=0x%08lx prio=0x%08lx" > so that perf-script can parse it correctly. > OK, will change it. > (2) lockdep found a dead lock case > When I added a histogram trigger, it happened. > > # echo 'hist:keys=pid:ts0=$common_timestamp.usecs' >> > /sys/kernel/debug/tracing/events/sched/sched_wakeup/trigger > [33967.672485] > [33967.673117] == > [33967.673117] WARNING: possible circular locking dependency detected > [33967.673117] 4.12.0-rc5+ #1 Not tainted > [33967.673117] -- Yeah, will fix this and anything else similar, thanks for pointing it out. Tom
Re: [PATCH 00/32] tracing: Inter-event (e.g. latency) support
Hi Masami, On Wed, 2017-06-28 at 23:21 +0900, Masami Hiramatsu wrote: > Hi Tom, > > On Mon, 26 Jun 2017 17:49:01 -0500 > Tom Zanussi wrote: > > > This patchset adds support for 'inter-event' quantities to the trace > > event subsystem. The most important example of inter-event quantities > > are latencies, or the time differences between two events. > > > > Thank you for your great work! > I'm playing this and found some issues. > > (1) new event format > > # echo "wakeup_latency u64 lat; pid_t pid; int prio" > synthetic_events > # cat /sys/kernel/debug/tracing # cat events/synthetic/wakeup_latency/format > name: wakeup_latency > ID: 972 > format: > field:unsigned short common_type; offset:0; size:2; > signed:0; > field:unsigned char common_flags; offset:2; size:1; > signed:0; > field:unsigned char common_preempt_count; offset:3; > size:1;signed:0; > field:int common_pid; offset:4; size:4; signed:1; > > field:u64 lat; offset:16; size:8; signed:0; > field:pid_t pid;offset:24; size:4; signed:1; > field:int prio; offset:32; size:4; signed:1; > > print fmt: "lat: 0x%08lx, pid: 0x%08lx, prio: 0x%08lx", ((u64)(REC->lat)), > ((u64)(REC->pid)), ((u64)(REC->prio)) > > Here, IMO, this format would be better "lat=0x%08lx pid=0x%08lx prio=0x%08lx" > so that perf-script can parse it correctly. > OK, will change it. > (2) lockdep found a dead lock case > When I added a histogram trigger, it happened. > > # echo 'hist:keys=pid:ts0=$common_timestamp.usecs' >> > /sys/kernel/debug/tracing/events/sched/sched_wakeup/trigger > [33967.672485] > [33967.673117] == > [33967.673117] WARNING: possible circular locking dependency detected > [33967.673117] 4.12.0-rc5+ #1 Not tainted > [33967.673117] -- Yeah, will fix this and anything else similar, thanks for pointing it out. Tom
Re: [PATCH 00/32] tracing: Inter-event (e.g. latency) support
Hi Tom, On Mon, 26 Jun 2017 17:49:01 -0500 Tom Zanussiwrote: > This patchset adds support for 'inter-event' quantities to the trace > event subsystem. The most important example of inter-event quantities > are latencies, or the time differences between two events. > Thank you for your great work! I'm playing this and found some issues. (1) new event format # echo "wakeup_latency u64 lat; pid_t pid; int prio" > synthetic_events # cat /sys/kernel/debug/tracing # cat events/synthetic/wakeup_latency/format name: wakeup_latency ID: 972 format: field:unsigned short common_type; offset:0; size:2; signed:0; field:unsigned char common_flags; offset:2; size:1; signed:0; field:unsigned char common_preempt_count; offset:3; size:1;signed:0; field:int common_pid; offset:4; size:4; signed:1; field:u64 lat; offset:16; size:8; signed:0; field:pid_t pid;offset:24; size:4; signed:1; field:int prio; offset:32; size:4; signed:1; print fmt: "lat: 0x%08lx, pid: 0x%08lx, prio: 0x%08lx", ((u64)(REC->lat)), ((u64)(REC->pid)), ((u64)(REC->prio)) Here, IMO, this format would be better "lat=0x%08lx pid=0x%08lx prio=0x%08lx" so that perf-script can parse it correctly. (2) lockdep found a dead lock case When I added a histogram trigger, it happened. # echo 'hist:keys=pid:ts0=$common_timestamp.usecs' >> /sys/kernel/debug/tracing/events/sched/sched_wakeup/trigger [33967.672485] [33967.673117] == [33967.673117] WARNING: possible circular locking dependency detected [33967.673117] 4.12.0-rc5+ #1 Not tainted [33967.673117] -- [33967.673117] bash/708 is trying to acquire lock: [33967.673117] (trace_types_lock){+.+.+.}, at: [] tracing_set_clock+0x66/0xe0 [33967.673117] [33967.673117] but task is already holding lock: [33967.673117] (trigger_cmd_mutex){+.+.+.}, at: [] event_trigger_write+0xb4/0x1a0 [33967.673117] [33967.673117] which lock already depends on the new lock. [33967.673117] [33967.673117] [33967.673117] the existing dependency chain (in reverse order) is: [33967.673117] [33967.673117] -> #2 (trigger_cmd_mutex){+.+.+.}: [33967.673117]lock_acquire+0xe3/0x1d0 [33967.673117]__mutex_lock+0x81/0x950 [33967.673117]mutex_lock_nested+0x1b/0x20 [33967.673117]event_trigger_write+0xb4/0x1a0 [33967.673117]__vfs_write+0x28/0x120 [33967.673117]vfs_write+0xc7/0x1b0 [33967.673117]SyS_write+0x49/0xa0 [33967.673117]entry_SYSCALL_64_fastpath+0x1f/0xbe [33967.673117] [33967.673117] -> #1 (event_mutex){+.+.+.}: [33967.673117]lock_acquire+0xe3/0x1d0 [33967.673117]__mutex_lock+0x81/0x950 [33967.673117]mutex_lock_nested+0x1b/0x20 [33967.673117]trace_add_event_call+0x28/0xc0 [33967.673117]create_synth_event+0x40a/0x880 [33967.673117]trace_run_command+0x54/0x60 [33967.673117]trace_parse_run_command+0xc4/0x160 [33967.673117]synth_events_write+0x10/0x20 [33967.673117]__vfs_write+0x28/0x120 [33967.673117]vfs_write+0xc7/0x1b0 [33967.673117]SyS_write+0x49/0xa0 [33967.673117]entry_SYSCALL_64_fastpath+0x1f/0xbe [33967.673117] [33967.673117] -> #0 (trace_types_lock){+.+.+.}: [33967.673117]__lock_acquire+0x1026/0x11d0 [33967.673117]lock_acquire+0xe3/0x1d0 [33967.673117]__mutex_lock+0x81/0x950 [33967.673117]mutex_lock_nested+0x1b/0x20 [33967.673117]tracing_set_clock+0x66/0xe0 [33967.673117]hist_register_trigger+0x209/0x2a0 [33967.673117]event_hist_trigger_func+0xc6b/0x2610 [33967.673117]event_trigger_write+0xfa/0x1a0 [33967.673117]__vfs_write+0x28/0x120 [33967.673117]vfs_write+0xc7/0x1b0 [33967.673117]SyS_write+0x49/0xa0 [33967.673117]entry_SYSCALL_64_fastpath+0x1f/0xbe [33967.673117] [33967.673117] other info that might help us debug this: [33967.673117] [33967.673117] Chain exists of: [33967.673117] trace_types_lock --> event_mutex --> trigger_cmd_mutex [33967.673117] [33967.673117] Possible unsafe locking scenario: [33967.673117] [33967.673117]CPU0CPU1 [33967.673117] [33967.673117] lock(trigger_cmd_mutex); [33967.673117]lock(event_mutex); [33967.673117]lock(trigger_cmd_mutex); [33967.673117] lock(trace_types_lock); [33967.673117] [33967.673117] *** DEADLOCK *** [33967.673117] [33967.673117] 3 locks held by bash/708: [33967.673117] #0: (sb_writers#8){.+.+.+}, at: [] vfs_write+0x18f/0x1b0 [33967.673117] #1: (event_mutex){+.+.+.}, at: [] event_trigger_write+0x6c/0x1a0 [33967.673117] #2: (trigger_cmd_mutex){+.+.+.}, at: [] event_trigger_write+0xb4/0x1a0 [33967.673117]
Re: [PATCH 00/32] tracing: Inter-event (e.g. latency) support
Hi Tom, On Mon, 26 Jun 2017 17:49:01 -0500 Tom Zanussi wrote: > This patchset adds support for 'inter-event' quantities to the trace > event subsystem. The most important example of inter-event quantities > are latencies, or the time differences between two events. > Thank you for your great work! I'm playing this and found some issues. (1) new event format # echo "wakeup_latency u64 lat; pid_t pid; int prio" > synthetic_events # cat /sys/kernel/debug/tracing # cat events/synthetic/wakeup_latency/format name: wakeup_latency ID: 972 format: field:unsigned short common_type; offset:0; size:2; signed:0; field:unsigned char common_flags; offset:2; size:1; signed:0; field:unsigned char common_preempt_count; offset:3; size:1;signed:0; field:int common_pid; offset:4; size:4; signed:1; field:u64 lat; offset:16; size:8; signed:0; field:pid_t pid;offset:24; size:4; signed:1; field:int prio; offset:32; size:4; signed:1; print fmt: "lat: 0x%08lx, pid: 0x%08lx, prio: 0x%08lx", ((u64)(REC->lat)), ((u64)(REC->pid)), ((u64)(REC->prio)) Here, IMO, this format would be better "lat=0x%08lx pid=0x%08lx prio=0x%08lx" so that perf-script can parse it correctly. (2) lockdep found a dead lock case When I added a histogram trigger, it happened. # echo 'hist:keys=pid:ts0=$common_timestamp.usecs' >> /sys/kernel/debug/tracing/events/sched/sched_wakeup/trigger [33967.672485] [33967.673117] == [33967.673117] WARNING: possible circular locking dependency detected [33967.673117] 4.12.0-rc5+ #1 Not tainted [33967.673117] -- [33967.673117] bash/708 is trying to acquire lock: [33967.673117] (trace_types_lock){+.+.+.}, at: [] tracing_set_clock+0x66/0xe0 [33967.673117] [33967.673117] but task is already holding lock: [33967.673117] (trigger_cmd_mutex){+.+.+.}, at: [] event_trigger_write+0xb4/0x1a0 [33967.673117] [33967.673117] which lock already depends on the new lock. [33967.673117] [33967.673117] [33967.673117] the existing dependency chain (in reverse order) is: [33967.673117] [33967.673117] -> #2 (trigger_cmd_mutex){+.+.+.}: [33967.673117]lock_acquire+0xe3/0x1d0 [33967.673117]__mutex_lock+0x81/0x950 [33967.673117]mutex_lock_nested+0x1b/0x20 [33967.673117]event_trigger_write+0xb4/0x1a0 [33967.673117]__vfs_write+0x28/0x120 [33967.673117]vfs_write+0xc7/0x1b0 [33967.673117]SyS_write+0x49/0xa0 [33967.673117]entry_SYSCALL_64_fastpath+0x1f/0xbe [33967.673117] [33967.673117] -> #1 (event_mutex){+.+.+.}: [33967.673117]lock_acquire+0xe3/0x1d0 [33967.673117]__mutex_lock+0x81/0x950 [33967.673117]mutex_lock_nested+0x1b/0x20 [33967.673117]trace_add_event_call+0x28/0xc0 [33967.673117]create_synth_event+0x40a/0x880 [33967.673117]trace_run_command+0x54/0x60 [33967.673117]trace_parse_run_command+0xc4/0x160 [33967.673117]synth_events_write+0x10/0x20 [33967.673117]__vfs_write+0x28/0x120 [33967.673117]vfs_write+0xc7/0x1b0 [33967.673117]SyS_write+0x49/0xa0 [33967.673117]entry_SYSCALL_64_fastpath+0x1f/0xbe [33967.673117] [33967.673117] -> #0 (trace_types_lock){+.+.+.}: [33967.673117]__lock_acquire+0x1026/0x11d0 [33967.673117]lock_acquire+0xe3/0x1d0 [33967.673117]__mutex_lock+0x81/0x950 [33967.673117]mutex_lock_nested+0x1b/0x20 [33967.673117]tracing_set_clock+0x66/0xe0 [33967.673117]hist_register_trigger+0x209/0x2a0 [33967.673117]event_hist_trigger_func+0xc6b/0x2610 [33967.673117]event_trigger_write+0xfa/0x1a0 [33967.673117]__vfs_write+0x28/0x120 [33967.673117]vfs_write+0xc7/0x1b0 [33967.673117]SyS_write+0x49/0xa0 [33967.673117]entry_SYSCALL_64_fastpath+0x1f/0xbe [33967.673117] [33967.673117] other info that might help us debug this: [33967.673117] [33967.673117] Chain exists of: [33967.673117] trace_types_lock --> event_mutex --> trigger_cmd_mutex [33967.673117] [33967.673117] Possible unsafe locking scenario: [33967.673117] [33967.673117]CPU0CPU1 [33967.673117] [33967.673117] lock(trigger_cmd_mutex); [33967.673117]lock(event_mutex); [33967.673117]lock(trigger_cmd_mutex); [33967.673117] lock(trace_types_lock); [33967.673117] [33967.673117] *** DEADLOCK *** [33967.673117] [33967.673117] 3 locks held by bash/708: [33967.673117] #0: (sb_writers#8){.+.+.+}, at: [] vfs_write+0x18f/0x1b0 [33967.673117] #1: (event_mutex){+.+.+.}, at: [] event_trigger_write+0x6c/0x1a0 [33967.673117] #2: (trigger_cmd_mutex){+.+.+.}, at: [] event_trigger_write+0xb4/0x1a0 [33967.673117] [33967.673117] stack backtrace:
Re: [PATCH 00/32] tracing: Inter-event (e.g. latency) support
On Mon, 26 Jun 2017 17:49:00 -0500 Tom Zanussiwrote: > This patchset adds support for 'inter-event' quantities to the trace > event subsystem. The most important example of inter-event quantities > are latencies, or the time differences between two events. > Hi Tom, Thanks for sending this out. I'm going to try to find some time this week to take a look at it. It's quite a lot to go over. I don't think it will be ready for this merge window, as I don't like rushing things like this, and we are already at rc7. But I want to get this in by 4.14. -- Steve
Re: [PATCH 00/32] tracing: Inter-event (e.g. latency) support
On Mon, 26 Jun 2017 17:49:00 -0500 Tom Zanussi wrote: > This patchset adds support for 'inter-event' quantities to the trace > event subsystem. The most important example of inter-event quantities > are latencies, or the time differences between two events. > Hi Tom, Thanks for sending this out. I'm going to try to find some time this week to take a look at it. It's quite a lot to go over. I don't think it will be ready for this merge window, as I don't like rushing things like this, and we are already at rc7. But I want to get this in by 4.14. -- Steve
[PATCH 00/32] tracing: Inter-event (e.g. latency) support
This patchset adds support for 'inter-event' quantities to the trace event subsystem. The most important example of inter-event quantities are latencies, or the time differences between two events. This is a follow-up to the initial RFC patchset and subsequent discussion at ELC. I think I've covered all the basic suggestions from both those sources - thanks to everyone who reviewed and contributed suggestions. Also, thanks to Vedang Patel for testing and providing real-user input and scripts - a number of the new patches such as extended error reporting, cpu fields, and aliases address his input directly. The current patchset has been fairly well tested and I'm not aware of any major problems with it, but there is one problem/feature to be aware of, which has to do with the fact that currently the tracing map can naturally have duplicate entries. Normally this is expected and isn't a problem for histograms without variables attached, as duplicates are merged on output. If a histogram has variables though and gets a duplicate entry, the variable associated with that record can't be trusted, since it can't be retrieved again. Luckily duplicates are normally very rare occurrences - I've never personally seen one in practice - but it does mean that you always need to check that the 'Duplicates' value on the histogram summary is 0 if that histogram has variables attached. We're currently working on a 'fix' for this that basically eliminates the possibility of duplicates, but for now the 'Duplicates' total will alert users to that rare situation should it occur. One of the main motivations for adding this capability is to provide a general-purpose base that existing existing tools such as the -RT latency_hist patchset can be built upon, while at the same time providing a simple way for users to track latencies (or any inter-event quantity) generically between any two events. Previous -RT latency_hist patchsets that take advantage of the trace event subsystem have been submitted, but they essentially hard-code special-case tracepoints and application logic in ways that can't be reused. It seemed to me that rather than providing a one-off patchset devoted specifically to generating the specific histograms in the latency_hist patchset, it should be possible to build the same functionality on top of a generic layer allowing users to do similar things for other non-latency_hist applications. In addition to preliminary patches that add some basic missing functionality such as a common ringbuffer-derived timestamp and dynamically-creatable tracepoints, the overall patchset is divided up into a few different areas that combine to produce the overall goal (The Documentation patch explains all the details): - variables and simple expressions required to calculate a latency In order to calculate a latency or any inter-event value, something from one event needs to be saved and later retrieved, and some operation such as subtraction or addition is performed on it. This means some minimal form of variables and expressions, which the first set of patches implements. Saving and retrieving events to use in a latency calculation is normally done using a hash table, and that's exactly what we have with trace event hist triggers, so that's where variables are instantiated, set, and retrieved. Basically, variables are set on one entry and retrieved and used by a 'matching' event. - 'synthetic' events, combining variables from other events The trace event interface is based on pseudo-files associated with individual events, so it wouldn't really make sense to have quantities derived from multiple events attached to any one of those events. For that reason, the patchset implements a means of combining variables from other events into a separate 'synthetic' event, which can be treated as if it were just like any other trace event in the system. - 'actions' generating synthetic events, among other things Variables and synthetic events provide the data and data structure for new events, but something still needs to actually generate an event using that data. 'Actions' are expanded to provide that capability. Though it hasn't been explicitly called as much before, the default 'action' currently for a hist trigger is to update the matching histogram entry's sum values. This patchset essentially expands that to provide a new 'onmatch.event(params)' action that can be used to have one event generate another. The mechanism is extensible to other actions, and in fact the patchset also includes another, 'onmax(var).save(field,...)' that can be used to save context whenever a value exceeds the previous maximum (something also needed by latency_hist). Here are some examples that should make things less abstract. Example - wakeup latency This basically implements the -RT
[PATCH 00/32] tracing: Inter-event (e.g. latency) support
This patchset adds support for 'inter-event' quantities to the trace event subsystem. The most important example of inter-event quantities are latencies, or the time differences between two events. This is a follow-up to the initial RFC patchset and subsequent discussion at ELC. I think I've covered all the basic suggestions from both those sources - thanks to everyone who reviewed and contributed suggestions. Also, thanks to Vedang Patel for testing and providing real-user input and scripts - a number of the new patches such as extended error reporting, cpu fields, and aliases address his input directly. The current patchset has been fairly well tested and I'm not aware of any major problems with it, but there is one problem/feature to be aware of, which has to do with the fact that currently the tracing map can naturally have duplicate entries. Normally this is expected and isn't a problem for histograms without variables attached, as duplicates are merged on output. If a histogram has variables though and gets a duplicate entry, the variable associated with that record can't be trusted, since it can't be retrieved again. Luckily duplicates are normally very rare occurrences - I've never personally seen one in practice - but it does mean that you always need to check that the 'Duplicates' value on the histogram summary is 0 if that histogram has variables attached. We're currently working on a 'fix' for this that basically eliminates the possibility of duplicates, but for now the 'Duplicates' total will alert users to that rare situation should it occur. One of the main motivations for adding this capability is to provide a general-purpose base that existing existing tools such as the -RT latency_hist patchset can be built upon, while at the same time providing a simple way for users to track latencies (or any inter-event quantity) generically between any two events. Previous -RT latency_hist patchsets that take advantage of the trace event subsystem have been submitted, but they essentially hard-code special-case tracepoints and application logic in ways that can't be reused. It seemed to me that rather than providing a one-off patchset devoted specifically to generating the specific histograms in the latency_hist patchset, it should be possible to build the same functionality on top of a generic layer allowing users to do similar things for other non-latency_hist applications. In addition to preliminary patches that add some basic missing functionality such as a common ringbuffer-derived timestamp and dynamically-creatable tracepoints, the overall patchset is divided up into a few different areas that combine to produce the overall goal (The Documentation patch explains all the details): - variables and simple expressions required to calculate a latency In order to calculate a latency or any inter-event value, something from one event needs to be saved and later retrieved, and some operation such as subtraction or addition is performed on it. This means some minimal form of variables and expressions, which the first set of patches implements. Saving and retrieving events to use in a latency calculation is normally done using a hash table, and that's exactly what we have with trace event hist triggers, so that's where variables are instantiated, set, and retrieved. Basically, variables are set on one entry and retrieved and used by a 'matching' event. - 'synthetic' events, combining variables from other events The trace event interface is based on pseudo-files associated with individual events, so it wouldn't really make sense to have quantities derived from multiple events attached to any one of those events. For that reason, the patchset implements a means of combining variables from other events into a separate 'synthetic' event, which can be treated as if it were just like any other trace event in the system. - 'actions' generating synthetic events, among other things Variables and synthetic events provide the data and data structure for new events, but something still needs to actually generate an event using that data. 'Actions' are expanded to provide that capability. Though it hasn't been explicitly called as much before, the default 'action' currently for a hist trigger is to update the matching histogram entry's sum values. This patchset essentially expands that to provide a new 'onmatch.event(params)' action that can be used to have one event generate another. The mechanism is extensible to other actions, and in fact the patchset also includes another, 'onmax(var).save(field,...)' that can be used to save context whenever a value exceeds the previous maximum (something also needed by latency_hist). Here are some examples that should make things less abstract. Example - wakeup latency This basically implements the -RT