From: Steven Rostedt (VMware)
Allow for multiple users to attach to function graph tracer at the same
time. Only 16 simultaneous users can attach to the tracer. This is because
there's an array that stores the pointers to the attached fgraph_ops. When
a function being traced is entered, each of the ftrace_ops entryfunc is
called and if it returns non zero, its index into the array will be added
to the shadow stack.
On exit of the function being traced, the shadow stack will contain the
indexes of the ftrace_ops on the array that want their retfunc to be
called.
Because a function may sleep for a long time (if a task sleeps itself),
the return of the function may be literally days later. If the ftrace_ops
is removed, its place on the array is replaced with a ftrace_ops that
contains the stub functions and that will be called when the function
finally returns.
If another ftrace_ops is added that happens to get the same index into the
array, its return function may be called. But that's actually the way
things current work with the old function graph tracer. If one tracer is
removed and another is added, the new one will get the return calls of the
function traced by the previous one, thus this is not a regression. This
can be fixed by adding a counter to each time the array item is updated and
save that on the shadow stack as well, such that it won't be called if the
index saved does not match the index on the array.
Note, being able to filter functions when both are called is not completely
handled yet, but that shouldn't be too hard to manage.
Signed-off-by: Steven Rostedt (VMware)
Signed-off-by: Masami Hiramatsu (Google)
---
Changes in v10:
- Rephrase all "index" on shadow stack to "offset" and some "ret_stack" to
"frame".
- Macros are renamed;
FGRAPH_RET_SIZE to FGRAPH_FRAME_SIZE
FGRAPH_RET_INDEX to FGRAPH_FRAME_OFFSET
FGRAPH_RET_INDEX_SIZE to FGRAPH_FRAME_OFFSET_SIZE
FGRAPH_RET_INDEX_MASK to FGRAPH_FRAME_OFFSET_MASK
SHADOW_STACK_INDEX to SHADOW_STACK_IN_WORD
SHADOW_STACK_MAX_INDEX to SHADOW_STACK_MAX_OFFSET
- Remove unused FGRAPH_MAX_INDEX macro.
- Fix wrong explanation of the shadow stack entry.
Changes in v7:
- Fix max limitation check in ftrace_graph_push_return().
- Rewrite the shadow stack implementation using bitmap entry. This allows
us to detect recursive call/tail call easier. (this implementation is
moved from later patch in the series.
Changes in v2:
- Check return value of the ftrace_pop_return_trace() instead of 'ret'
since 'ret' is set to the address of panic().
- Fix typo and make lines shorter than 76 chars in description.
---
include/linux/ftrace.h |3
kernel/trace/fgraph.c | 378 +++-
2 files changed, 310 insertions(+), 71 deletions(-)
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index d5df5f8fc35a..0a1a7316de7b 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -1066,6 +1066,7 @@ extern int ftrace_graph_entry_stub(struct
ftrace_graph_ent *trace);
struct fgraph_ops {
trace_func_graph_ent_t entryfunc;
trace_func_graph_ret_t retfunc;
+ int idx;
};
/*
@@ -1100,7 +1101,7 @@ function_graph_enter(unsigned long ret, unsigned long
func,
unsigned long frame_pointer, unsigned long *retp);
struct ftrace_ret_stack *
-ftrace_graph_get_ret_stack(struct task_struct *task, int idx);
+ftrace_graph_get_ret_stack(struct task_struct *task, int skip);
unsigned long ftrace_graph_ret_addr(struct task_struct *task, int *idx,
unsigned long ret, unsigned long *retp);
diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
index 3f9dd213e7d8..6b06962657fe 100644
--- a/kernel/trace/fgraph.c
+++ b/kernel/trace/fgraph.c
@@ -7,6 +7,7 @@
*
* Highly modified by Steven Rostedt (VMware).
*/
+#include
#include
#include
#include
@@ -25,25 +26,157 @@
#define ASSIGN_OPS_HASH(opsname, val)
#endif
-#define FGRAPH_RET_SIZE sizeof(struct ftrace_ret_stack)
-#define FGRAPH_RET_INDEX DIV_ROUND_UP(FGRAPH_RET_SIZE, sizeof(long))
+#define FGRAPH_FRAME_SIZE sizeof(struct ftrace_ret_stack)
+#define FGRAPH_FRAME_OFFSET DIV_ROUND_UP(FGRAPH_FRAME_SIZE, sizeof(long))
+
+/*
+ * On entry to a function (via function_graph_enter()), a new ftrace_ret_stack
+ * is allocated on the task's ret_stack with bitmap entry, then each
+ * fgraph_ops on the fgraph_array[]'s entryfunc is called and if that returns
+ * non-zero, the index into the fgraph_array[] for that fgraph_ops is recorded
+ * on the bitmap entry as a bit flag.
+ *
+ * The top of the ret_stack (when not empty) will always have a reference
+ * to the last ftrace_ret_stack saved. All references to the
+ * ftrace_ret_stack has the format of:
+ *
+ * bits: 0 - 9 offset in words from the previous ftrace_ret_stack
+ * (bitmap type should have