Re: [PATCH v10 07/36] function_graph: Allow multiple users to attach to function graph

2024-05-25 Thread Google
On Fri, 24 May 2024 21:32:08 -0400
Steven Rostedt  wrote:

> On Tue,  7 May 2024 23:09:22 +0900
> "Masami Hiramatsu (Google)"  wrote:
> 
> > @@ -109,6 +244,21 @@ ftrace_push_return_trace(unsigned long ret, unsigned long func,
> > if (!current->ret_stack)
> > return -EBUSY;
> >  
> > +   /*
> > +* At first, check whether the previous fgraph callback is pushed by
> > +* the fgraph on the same function entry.
> > +* But if @func is the self tail-call function, we also need to ensure
> > +* the ret_stack is not for the previous call by checking whether the
> > +* bit of @fgraph_idx is set or not.
> > +*/
> > +   ret_stack = get_ret_stack(current, current->curr_ret_stack, &offset);
> > +   if (ret_stack && ret_stack->func == func &&
> > +   get_fgraph_type(current, offset + FGRAPH_FRAME_OFFSET) == FGRAPH_TYPE_BITMAP &&
> > +   !is_fgraph_index_set(current, offset + FGRAPH_FRAME_OFFSET, fgraph_idx))
> > +   return offset + FGRAPH_FRAME_OFFSET;
> > +
> > +   val = (FGRAPH_TYPE_RESERVED << FGRAPH_TYPE_SHIFT) | FGRAPH_FRAME_OFFSET;
> > +
> > BUILD_BUG_ON(SHADOW_STACK_SIZE % sizeof(long));
> 
> I'm trying to figure out what the above is trying to do. This gets called
> once in function_graph_enter() (or function_graph_enter_ops()). What
> exactly are you trying to catch here?

Aah, good catch! This check was originally meant to catch the self
tail-call case with multiple fgraph callbacks on the same function, but
it was based on my misreading.
In a later patch ([12/36]) we introduce function_graph_enter_ops() so
that we can skip the hash table check and pass the fgraph_ops directly
to the user callback. I thought function_graph_enter_ops() was used even
when multiple fgraphs are set on the same function; in that case we
would always need to check whether the stack frame can be reused (i.e.
whether it was pushed by another fgraph_ops on the same function entry).
But as we discussed, function_graph_enter_ops() is used only when a
single fgraph is set on the function (if multiple fgraphs are set on the
same function, function_graph_enter() is used instead), so we are sure
that ftrace_push_return_trace() is called only once when hooking the
function entry.
Thus we don't need to reuse the frame.

> 
> Is it from this email:
> 
>   https://lore.kernel.org/all/20231110105154.df937bf9f200a0c16806c...@kernel.org/
> 
> As that's the last version before you added the above code.
> 
> But you also noticed it may not be needed, but triggered a crash without it
> in v3:
> 
>   https://lore.kernel.org/all/20231205234511.3839128259dfec153ea7d...@kernel.org/
> 
> I removed this code in my version and it runs just fine. Perhaps there was
> another bug that this was hiding that you fixed in later versions?

No problem. I think we can remove this block safely.

Thank you,

> 
> -- Steve
> 


-- 
Masami Hiramatsu (Google) 



Re: [PATCH v10 07/36] function_graph: Allow multiple users to attach to function graph

2024-05-24 Thread Steven Rostedt
On Tue,  7 May 2024 23:09:22 +0900
"Masami Hiramatsu (Google)"  wrote:

> @@ -109,6 +244,21 @@ ftrace_push_return_trace(unsigned long ret, unsigned long func,
>   if (!current->ret_stack)
>   return -EBUSY;
>  
> + /*
> +  * At first, check whether the previous fgraph callback is pushed by
> +  * the fgraph on the same function entry.
> +  * But if @func is the self tail-call function, we also need to ensure
> +  * the ret_stack is not for the previous call by checking whether the
> +  * bit of @fgraph_idx is set or not.
> +  */
> + ret_stack = get_ret_stack(current, current->curr_ret_stack, &offset);
> + if (ret_stack && ret_stack->func == func &&
> + get_fgraph_type(current, offset + FGRAPH_FRAME_OFFSET) == FGRAPH_TYPE_BITMAP &&
> + !is_fgraph_index_set(current, offset + FGRAPH_FRAME_OFFSET, fgraph_idx))
> + return offset + FGRAPH_FRAME_OFFSET;
> +
> + val = (FGRAPH_TYPE_RESERVED << FGRAPH_TYPE_SHIFT) | FGRAPH_FRAME_OFFSET;
> +
>   BUILD_BUG_ON(SHADOW_STACK_SIZE % sizeof(long));

I'm trying to figure out what the above is trying to do. This gets called
once in function_graph_enter() (or function_graph_enter_ops()). What
exactly are you trying to catch here?

Is it from this email:

  https://lore.kernel.org/all/20231110105154.df937bf9f200a0c16806c...@kernel.org/

As that's the last version before you added the above code.

But you also noticed it may not be needed, but triggered a crash without it
in v3:

  https://lore.kernel.org/all/20231205234511.3839128259dfec153ea7d...@kernel.org/

I removed this code in my version and it runs just fine. Perhaps there was
another bug that this was hiding that you fixed in later versions?

-- Steve



[PATCH v10 07/36] function_graph: Allow multiple users to attach to function graph

2024-05-07 Thread Masami Hiramatsu (Google)
From: Steven Rostedt (VMware) 

Allow for multiple users to attach to the function graph tracer at the
same time. Only 16 simultaneous users can attach to the tracer. This is
because there's an array that stores the pointers to the attached
fgraph_ops. When a function being traced is entered, each of the
ftrace_ops entryfuncs is called, and if it returns non-zero, its index
into the array will be added to the shadow stack.

On exit of the function being traced, the shadow stack will contain the
indexes of the ftrace_ops on the array that want their retfunc to be
called.

Because a function may sleep for a long time (if a task sleeps itself),
the return of the function may be literally days later. If the ftrace_ops
is removed, its place on the array is replaced with a ftrace_ops that
contains the stub functions and that will be called when the function
finally returns.

If another ftrace_ops is added that happens to get the same index into
the array, its return function may be called. But that's actually the
way things currently work with the old function graph tracer: if one
tracer is removed and another is added, the new one will get the return
calls for functions traced by the previous one, so this is not a
regression. This can be fixed by adding a counter that is incremented
each time the array item is updated, saving that counter on the shadow
stack as well, and skipping the call if the saved counter does not
match the one on the array.

Note, being able to filter functions when both are called is not completely
handled yet, but that shouldn't be too hard to manage.

Signed-off-by: Steven Rostedt (VMware) 
Signed-off-by: Masami Hiramatsu (Google) 
---
 Changes in v10:
  - Rephrase all "index" on shadow stack to "offset" and some "ret_stack" to
   "frame".
  - Macros are renamed;
FGRAPH_RET_SIZE to FGRAPH_FRAME_SIZE
FGRAPH_RET_INDEX to FGRAPH_FRAME_OFFSET
FGRAPH_RET_INDEX_SIZE to FGRAPH_FRAME_OFFSET_SIZE
FGRAPH_RET_INDEX_MASK to FGRAPH_FRAME_OFFSET_MASK
SHADOW_STACK_INDEX to SHADOW_STACK_IN_WORD
SHADOW_STACK_MAX_INDEX to SHADOW_STACK_MAX_OFFSET
  - Remove unused FGRAPH_MAX_INDEX macro.
  - Fix wrong explanation of the shadow stack entry.
 Changes in v7:
  - Fix max limitation check in ftrace_graph_push_return().
  - Rewrite the shadow stack implementation using a bitmap entry. This
    allows us to detect recursive calls/tail calls more easily. (This
    implementation is moved from a later patch in the series.)
 Changes in v2:
  - Check return value of the ftrace_pop_return_trace() instead of 'ret'
since 'ret' is set to the address of panic().
  - Fix typo and make lines shorter than 76 chars in description.
---
 include/linux/ftrace.h |3 
 kernel/trace/fgraph.c  |  378 +++-
 2 files changed, 310 insertions(+), 71 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index d5df5f8fc35a..0a1a7316de7b 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -1066,6 +1066,7 @@ extern int ftrace_graph_entry_stub(struct ftrace_graph_ent *trace);
 struct fgraph_ops {
trace_func_graph_ent_t  entryfunc;
trace_func_graph_ret_t  retfunc;
+   int idx;
 };
 
 /*
@@ -1100,7 +1101,7 @@ function_graph_enter(unsigned long ret, unsigned long func,
 unsigned long frame_pointer, unsigned long *retp);
 
 struct ftrace_ret_stack *
-ftrace_graph_get_ret_stack(struct task_struct *task, int idx);
+ftrace_graph_get_ret_stack(struct task_struct *task, int skip);
 
 unsigned long ftrace_graph_ret_addr(struct task_struct *task, int *idx,
unsigned long ret, unsigned long *retp);
diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
index 3f9dd213e7d8..6b06962657fe 100644
--- a/kernel/trace/fgraph.c
+++ b/kernel/trace/fgraph.c
@@ -7,6 +7,7 @@
  *
  * Highly modified by Steven Rostedt (VMware).
  */
+#include 
 #include 
 #include 
 #include 
@@ -25,25 +26,157 @@
 #define ASSIGN_OPS_HASH(opsname, val)
 #endif
 
-#define FGRAPH_RET_SIZE sizeof(struct ftrace_ret_stack)
-#define FGRAPH_RET_INDEX DIV_ROUND_UP(FGRAPH_RET_SIZE, sizeof(long))
+#define FGRAPH_FRAME_SIZE sizeof(struct ftrace_ret_stack)
+#define FGRAPH_FRAME_OFFSET DIV_ROUND_UP(FGRAPH_FRAME_SIZE, sizeof(long))
+
+/*
+ * On entry to a function (via function_graph_enter()), a new ftrace_ret_stack
+ * is allocated on the task's ret_stack with bitmap entry, then each
+ * fgraph_ops on the fgraph_array[]'s entryfunc is called and if that returns
+ * non-zero, the index into the fgraph_array[] for that fgraph_ops is recorded
+ * on the bitmap entry as a bit flag.
+ *
+ * The top of the ret_stack (when not empty) will always have a reference
+ * to the last ftrace_ret_stack saved. All references to the
+ * ftrace_ret_stack has the format of:
+ *
+ * bits:  0 -  9   offset in words from the previous ftrace_ret_stack
+ * (bitmap type should have