Re: [PATCH v8 02/10] tracing: Add basic event trigger framework
On Thu, 2013-09-05 at 13:21 -0400, Steven Rostedt wrote:
> On Mon, 2 Sep 2013 22:52:18 -0500
> Tom Zanussi wrote:
>
> [...]
> > +{
> > +	struct event_trigger_data *data;
> > +
> > +	if (list_empty(&file->triggers))
> > +		return;
> > +
> > +	preempt_disable_notrace();
>
> What's the reason for preempt_disable()? This should be called with
> rcu_read_lock held, right?
>

You're right, it is called under rcu_read_lock() and the
preempt_disable() is completely pointless - I'll remove these.

> > +	list_for_each_entry_rcu(data, &file->triggers, list)
> > +		data->ops->func(data);
> > +	preempt_enable_notrace();
> > +}
> > +EXPORT_SYMBOL_GPL(event_triggers_call);
> > +
> > +static void *trigger_next(struct seq_file *m, void *t, loff_t *pos)
> > +{
> > +	struct ftrace_event_file *event_file = event_file_data(m->private);
> > +
> > +	return seq_list_next(t, &event_file->triggers, pos);
> > +}
> > +
> > +static void *trigger_start(struct seq_file *m, loff_t *pos)
> > +{
> > +	struct ftrace_event_file *event_file;
> > +
> > +	/* ->stop() is called even if ->start() fails */
> > +	mutex_lock(&event_mutex);
> > +	event_file = event_file_data(m->private);
> > +	if (unlikely(!event_file))
> > +		return ERR_PTR(-ENODEV);
> > +
> > +	return seq_list_start(&event_file->triggers, *pos);
> > +}
> > +
> > +static void trigger_stop(struct seq_file *m, void *t)
> > +{
> > +	mutex_unlock(&event_mutex);
> > +}
> > +
> > +static int trigger_show(struct seq_file *m, void *v)
> > +{
> > +	struct event_trigger_data *data;
> > +
> > +	data = list_entry(v, struct event_trigger_data, list);
> > +	data->ops->print(m, data->ops, data);
> > +
> > +	return 0;
> > +}
> > +
> > +static const struct seq_operations event_triggers_seq_ops = {
> > +	.start = trigger_start,
> > +	.next = trigger_next,
> > +	.stop = trigger_stop,
> > +	.show = trigger_show,
> > +};
> > +
> > +static int event_trigger_regex_open(struct inode *inode, struct file *file)
> > +{
> > +	int ret = 0;
> > +
> > +	mutex_lock(&event_mutex);
> > +
> > +	if (unlikely(!event_file_data(file))) {
> > +		mutex_unlock(&event_mutex);
> > +		return -ENODEV;
> > +	}
> > +
> > +	if (file->f_mode & FMODE_READ) {
> > +		ret = seq_open(file, &event_triggers_seq_ops);
> > +		if (!ret) {
> > +			struct seq_file *m = file->private_data;
> > +			m->private = file;
> > +		}
> > +	}
> > +
> > +	mutex_unlock(&event_mutex);
> > +
> > +	return ret;
> > +}
> > +
> > +static int trigger_process_regex(struct ftrace_event_file *file,
> > +				 char *buff, int enable)
> > +{
> > +	char *command, *next = buff;
> > +	struct event_command *p;
> > +	int ret = -EINVAL;
> > +
> > +	command = strsep(&next, ": \t");
> > +	command = (command[0] != '!') ? command : command + 1;
> > +
> > +	mutex_lock(&trigger_cmd_mutex);
>
> What exactly is trigger_cmd_mutex protecting? It is only called here,
> and the event_mutex() is already held by its only caller, so this mutex
> is basically doing nothing.
>

trigger_cmd_mutex is also passed into and used by
register/unregister_event_command() (as *cmd_list_mutex) and protects
writers setting trigger commands against changes in the command list.

The next question would be why pass the mutex and command list in to
the register/unregister command instead of having those functions use
them directly? It's because these functions were originally meant to
manage multiple command lists, for both ftrace and event commands, but
since it now handles only event commands, I'll remove those extra
params and just use them directly.

Thanks,

Tom

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
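
The command-splitting step quoted from trigger_process_regex() can be tried out in user space. The following is only a sketch of that one step, not the kernel code: struct parsed_trigger and parse_trigger() are hypothetical names invented for the illustration, and it assumes a libc that provides strsep() (glibc and the BSDs do).

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* asks glibc to declare strsep() */
#endif
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical result type for this sketch; it is not part of the patch. */
struct parsed_trigger {
	const char *command; /* command name with any leading '!' skipped */
	const char *params;  /* remainder of the line, or NULL if none */
	bool remove;         /* true if the command was prefixed with '!' */
};

/*
 * Split "command[:params]" the way the quoted code does: strsep() cuts
 * the buffer at the first ':', space or tab, and a leading '!' is
 * skipped. The buffer is modified in place.
 */
static struct parsed_trigger parse_trigger(char *buff)
{
	struct parsed_trigger out;
	char *next = buff;
	char *command;

	assert(buff != NULL);

	command = strsep(&next, ": \t");
	out.remove = (command[0] == '!');
	out.command = out.remove ? command + 1 : command;
	out.params = next; /* NULL when no delimiter was found */
	return out;
}
```

Passing a writable buffer holding "!foo:5" would yield the command "foo", the params "5", and remove set; "foo" here is a placeholder command name, since this patch registers no commands yet.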
Re: [PATCH v8 02/10] tracing: Add basic event trigger framework
On Mon, 2 Sep 2013 22:52:18 -0500
Tom Zanussi wrote:

>  include/linux/ftrace_event.h        |  11 ++
>  include/trace/ftrace.h              |   4 +
>  kernel/trace/Makefile               |   1 +
>  kernel/trace/trace.h                | 182
>  kernel/trace/trace_events.c         |  21 ++-
>  kernel/trace/trace_events_trigger.c | 268
>  kernel/trace/trace_syscalls.c       |   4 +
>  7 files changed, 486 insertions(+), 5 deletions(-)
>  create mode 100644 kernel/trace/trace_events_trigger.c
>
> diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
> index 5eaa746..34ae1d4 100644
> --- a/include/linux/ftrace_event.h
> +++ b/include/linux/ftrace_event.h
> @@ -255,6 +255,7 @@ enum {
>  	FTRACE_EVENT_FL_RECORDED_CMD_BIT,
>  	FTRACE_EVENT_FL_SOFT_MODE_BIT,
>  	FTRACE_EVENT_FL_SOFT_DISABLED_BIT,
> +	FTRACE_EVENT_FL_TRIGGER_MODE_BIT,
>  };
>
>  /*
> @@ -264,12 +265,14 @@ enum {
>   *  SOFT_MODE     - The event is enabled/disabled by SOFT_DISABLED
>   *  SOFT_DISABLED - When set, do not trace the event (even though its
>   *                   tracepoint may be enabled)
> + *  TRIGGER_MODE  - When set, invoke the triggers associated with the event
>   */
>  enum {
>  	FTRACE_EVENT_FL_ENABLED		= (1 << FTRACE_EVENT_FL_ENABLED_BIT),
>  	FTRACE_EVENT_FL_RECORDED_CMD	= (1 << FTRACE_EVENT_FL_RECORDED_CMD_BIT),
>  	FTRACE_EVENT_FL_SOFT_MODE	= (1 << FTRACE_EVENT_FL_SOFT_MODE_BIT),
>  	FTRACE_EVENT_FL_SOFT_DISABLED	= (1 << FTRACE_EVENT_FL_SOFT_DISABLED_BIT),
> +	FTRACE_EVENT_FL_TRIGGER_MODE	= (1 << FTRACE_EVENT_FL_TRIGGER_MODE_BIT),
>  };
>
>  struct ftrace_event_file {
> @@ -278,6 +281,7 @@ struct ftrace_event_file {
>  	struct dentry			*dir;
>  	struct trace_array		*tr;
>  	struct ftrace_subsystem_dir	*system;
> +	struct list_head		triggers;
>
>  	/*
>  	 * 32 bit flags:
> @@ -285,6 +289,7 @@ struct ftrace_event_file {
>  	 *   bit 1:		enabled cmd record
>  	 *   bit 2:		enable/disable with the soft disable bit
>  	 *   bit 3:		soft disabled
> +	 *   bit 4:		trigger enabled
>  	 *
>  	 * Note: The bits must be set atomically to prevent races
>  	 * from other writers. Reads of flags do not need to be in
> @@ -296,6 +301,7 @@ struct ftrace_event_file {
>  	 */
>  	unsigned long		flags;
>  	atomic_t		sm_ref;	/* soft-mode reference counter */
> +	atomic_t		tm_ref;	/* trigger-mode reference counter */
>  };
>
>  #define __TRACE_EVENT_FLAGS(name, value)				\
> @@ -310,12 +316,17 @@ struct ftrace_event_file {
>
>  #define MAX_FILTER_STR_VAL	256	/* Should handle KSYM_SYMBOL_LEN */
>
> +enum event_trigger_type {
> +	ETT_NONE		= (0),
> +};
> +
>  extern void destroy_preds(struct ftrace_event_call *call);
>  extern int filter_match_preds(struct event_filter *filter, void *rec);
>  extern int filter_current_check_discard(struct ring_buffer *buffer,
>  					struct ftrace_event_call *call,
>  					void *rec,
>  					struct ring_buffer_event *event);
> +extern void event_triggers_call(struct ftrace_event_file *file);
>
>  enum {
>  	FILTER_OTHER = 0,
> diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
> index 41a6643..326ba32 100644
> --- a/include/trace/ftrace.h
> +++ b/include/trace/ftrace.h
> @@ -526,6 +526,10 @@ ftrace_raw_event_##call(void *__data, proto) \
>  	int __data_size; \
>  	int pc; \
>  	\
> +	if (test_bit(FTRACE_EVENT_FL_TRIGGER_MODE_BIT, \
> +		     &ftrace_file->flags)) \
> +		event_triggers_call(ftrace_file); \
> +	\
>  	if (test_bit(FTRACE_EVENT_FL_SOFT_DISABLED_BIT, \
>  		     &ftrace_file->flags)) \
>  		return; \
> diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
> index d7e2068..1378e84 100644
> --- a/kernel/trace/Makefile
> +++ b/kernel/trace/Makefile
> @@ -50,6 +50,7 @@ ifeq ($(CONFIG_PERF_EVENTS),y)
>  obj-$(CONFIG_EVENT_TRACING) += trace_event_perf.o
>  endif
>  obj-$(CONFIG_EVENT_TRACING) += trace_events_filter.o
> +obj-$(CONFIG_EVENT_TRACING) += trace_events_trigger.o
>  obj-$(CONFIG_KPROBE_EVENT) += trace_kprobe.o
>  obj-$(CONFIG_TRACEPOINTS) += power-traces.o
>  ifeq ($(CONFIG_PM_RUNTIME),y)
> diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> index b1227b9..37b8ecf 100644
> ---
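
The ordering of the two flag checks in the quoted ftrace_raw_event hunk is what lets triggers run on a soft-disabled event. Below is a user-space model of just that ordering, offered as a sketch under assumptions: flag_test() is a non-atomic stand-in for the kernel's test_bit(), and struct event_model and event_fire() are hypothetical names that exist only here. The bit positions follow the enum and the "bit 4: trigger enabled" comment in the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit positions in the order given by the quoted enum. */
enum {
	FTRACE_EVENT_FL_ENABLED_BIT,
	FTRACE_EVENT_FL_RECORDED_CMD_BIT,
	FTRACE_EVENT_FL_SOFT_MODE_BIT,
	FTRACE_EVENT_FL_SOFT_DISABLED_BIT,
	FTRACE_EVENT_FL_TRIGGER_MODE_BIT,
};

/* Non-atomic user-space stand-in for the kernel's test_bit(). */
static bool flag_test(int bit, const unsigned long *flags)
{
	assert(bit >= 0 && bit < 32);
	return ((*flags >> bit) & 1UL) != 0;
}

/* Hypothetical model of one event; not a kernel structure. */
struct event_model {
	unsigned long flags;
	int trigger_calls; /* counts would-be event_triggers_call() calls */
};

/*
 * Mirrors the order of checks in the hunk: triggers are invoked first,
 * then a soft-disabled event returns before anything is recorded.
 * Returns true if the event would go on to be recorded.
 */
static bool event_fire(struct event_model *ev)
{
	if (flag_test(FTRACE_EVENT_FL_TRIGGER_MODE_BIT, &ev->flags))
		ev->trigger_calls++;

	if (flag_test(FTRACE_EVENT_FL_SOFT_DISABLED_BIT, &ev->flags))
		return false;

	return true;
}
```

With both TRIGGER_MODE and SOFT_DISABLED set, event_fire() still bumps trigger_calls but reports that nothing is recorded.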
[PATCH v8 02/10] tracing: Add basic event trigger framework
Add a 'trigger' file for each trace event, enabling 'trace event
triggers' to be set for trace events.

'trace event triggers' are patterned after the existing 'ftrace
function triggers' implementation except that triggers are written to
per-event 'trigger' files instead of to a single file such as the
'set_ftrace_filter' used for ftrace function triggers.

The implementation is meant to be entirely separate from ftrace
function triggers, in order to keep the respective implementations
relatively simple and to allow them to diverge.

The event trigger functionality is built on top of SOFT_DISABLE
functionality. It adds a TRIGGER_MODE bit to the ftrace_event_file
flags which is checked when any trace event fires. Triggers set for a
particular event need to be checked regardless of whether that event
is actually enabled or not - getting an event to fire even if it's not
enabled is what's already implemented by SOFT_DISABLE mode, so trigger
mode directly reuses that. Event triggers essentially inherit the soft
disable logic in __ftrace_event_enable_disable() while adding a bit of
logic and trigger reference counting via tm_ref on top of that in a
new trace_event_trigger_enable_disable() function. Because the base
__ftrace_event_enable_disable() code now needs to be invoked from
outside trace_events.c, a wrapper is also added for those usages.

The triggers for an event are actually invoked via a new function,
event_triggers_call(), and code is also added to invoke them for
ftrace_raw_event calls as well as syscall events.

The main part of the patch creates a new trace_events_trigger.c file
to contain the trace event triggers implementation.

The standard open, read, and release file operations are implemented
here.

The open() implementation sets up for the various open modes of the
'trigger' file. It creates and attaches the trigger iterator and sets
up the command parser. If the file is opened for reading, it also sets
up the trigger seq_ops.
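
The tm_ref counting described above can be modelled in user space. This is a sketch under assumptions rather than the body of trace_event_trigger_enable_disable(): it uses C11 atomics in place of the kernel's atomic_t and bit operations, leaves out the soft-disable call, and the names event_file_model, trigger_mode_get() and trigger_mode_put() are made up for the illustration. The idea shown is that only the first trigger added sets TRIGGER_MODE and only the last one removed clears it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TRIGGER_MODE_BIT 4 /* "bit 4: trigger enabled" in the flags word */

/* Hypothetical user-space model of the two fields involved. */
struct event_file_model {
	atomic_ulong flags;
	atomic_int tm_ref; /* trigger-mode reference counter */
};

static void event_file_model_init(struct event_file_model *f)
{
	atomic_init(&f->flags, 0UL);
	atomic_init(&f->tm_ref, 0);
}

/* First reference switches trigger mode on; later ones only count. */
static void trigger_mode_get(struct event_file_model *f)
{
	/* atomic_fetch_add() returns the value before the increment. */
	if (atomic_fetch_add(&f->tm_ref, 1) == 0)
		atomic_fetch_or(&f->flags, 1UL << TRIGGER_MODE_BIT);
}

/* Last reference dropped switches trigger mode off again. */
static void trigger_mode_put(struct event_file_model *f)
{
	int old = atomic_fetch_sub(&f->tm_ref, 1);

	assert(old > 0); /* a put without a matching get is a bug */
	if (old == 1)
		atomic_fetch_and(&f->flags, ~(1UL << TRIGGER_MODE_BIT));
}

static bool trigger_mode_on(struct event_file_model *f)
{
	return (atomic_load(&f->flags) & (1UL << TRIGGER_MODE_BIT)) != 0;
}
```

The counter update and the flag update are two separate steps here, so concurrent get/put callers would need an outer lock; this sketch assumes the caller provides one.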

The read() implementation parses the event trigger written to the
'trigger' file, looks up the trigger command, and passes it along to
that event_command's func() implementation for command-specific
processing.

The release() implementation does whatever cleanup is needed to
release the 'trigger' file, like releasing the parser and trigger
iterator, etc.

A couple of functions for event command registration and
unregistration are added, along with a list to add them to and a mutex
to protect them, as well as an (initially empty) registration function
to add the set of commands that will be added by future commits, and a
call to it from the trace event initialization code.

Also added are a couple of trigger-specific data structures needed for
these implementations, such as a trigger iterator and a struct for
trigger-specific data.

A couple of structs consisting mostly of functions meant to be
implemented in command-specific ways, event_command and
event_trigger_ops, are used by the generic event trigger command
implementations. They're being put into trace.h alongside the other
trace_event data structures and functions, in the expectation that
they'll be needed in several trace_event-related files such as
trace_events_trigger.c and trace_events.c.

The event_command.func() function is meant to be called by the trigger
parsing code in order to add a trigger instance to the corresponding
event. It essentially coordinates adding a live trigger instance to
the event, and arming the triggering event.
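
To make the split between a command's func() and its event_trigger_ops concrete, here is a small user-space sketch. It is an illustration under assumptions, not the definitions added to trace.h: the two structs are cut down to a single callback and a counter, every demo_* name is hypothetical, and the choice between a plain and a counting probe based on an optional numeric parameter follows the patch description's "number or count version" wording.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct event_trigger_data;

/* Simplified stand-in: only the 'probe' callback is modelled. */
struct event_trigger_ops {
	void (*func)(struct event_trigger_data *data);
};

/* Simplified stand-in for the per-trigger data. */
struct event_trigger_data {
	long count;                          /* firings left; -1 = unlimited */
	const struct event_trigger_ops *ops; /* chosen when the trigger is set */
	int hits;                            /* demo only: times the probe ran */
};

static void demo_trigger(struct event_trigger_data *data)
{
	data->hits++;
}

/* Counting variant: stops acting once the count reaches zero. */
static void demo_count_trigger(struct event_trigger_data *data)
{
	if (data->count == 0)
		return;
	if (data->count > 0)
		data->count--;
	demo_trigger(data);
}

static const struct event_trigger_ops demo_ops = { .func = demo_trigger };
static const struct event_trigger_ops demo_count_ops = { .func = demo_count_trigger };

/*
 * Hypothetical func()-style setup: choose the ops from the optional
 * parameter and fill in the trigger data. Returns 0 on success and -1
 * if the parameter is not a non-negative decimal number.
 */
static int demo_command_func(struct event_trigger_data *data, const char *param)
{
	char *end;

	assert(data != NULL);
	data->hits = 0;
	data->count = -1;
	data->ops = &demo_ops;

	if (param) {
		data->count = strtol(param, &end, 10);
		if (end == param || *end != '\0' || data->count < 0)
			return -1;
		data->ops = &demo_count_ops;
	}
	return 0;
}

/* What the event side does when it fires: call through the chosen ops. */
static void demo_event_fire(struct event_trigger_data *data)
{
	data->ops->func(data);
}
```

Setting a trigger with the parameter "2" and firing the event three times leaves hits at 2, while a trigger set without a parameter runs on every firing.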

Every event_command func() implementation essentially does the same
thing for any command:

  - choose ops - use the value of param to choose either a number or
    count version of event_trigger_ops specific to the command
  - do the register or unregister of those ops
  - associate a filter, if specified, with the triggering event

The reg() and unreg() ops allow command-specific implementations for
event_trigger_ops registration and unregistration, and the
get_trigger_ops() op allows command-specific event_trigger_ops
selection to be parameterized. When a trigger instance is added, the
reg() op essentially adds that trigger to the triggering event and
arms it, while unreg() does the opposite. The set_filter() function is
used to associate a filter with the trigger - if the command doesn't
specify a set_filter() implementation, the command will ignore
filters.

Each command has an associated trigger_type, which serves double duty,
both as a unique identifier for the command as well as a value that
can be used for setting a trigger mode bit during trigger invocation.

The signature of func() adds a pointer to the event_command struct,
used to invoke those functions, along with a command_data param that
can be passed to the reg/unreg functions. This allows func()
implementations to use command-specific blobs and supports code
re-use.

The event_trigger_ops.func() command corresponds to the trigger
'probe' function that gets called when the triggering event is
actually invoked. The other