Re: [RFC PATCH v2 1/4] arm64: Implement infrastructure for stack trace reliability checks

2021-04-09 Thread Mark Brown
On Thu, Apr 08, 2021 at 06:30:22PM -0500, Madhavan T. Venkataraman wrote:
> On 4/8/21 2:30 PM, Madhavan T. Venkataraman wrote:

> > 1. Create a common section (I will have to come up with an appropriate
> >    name) and put all such functions in that one section.

> > 2. Create one section for each logical type (exception section, ftrace
> >    section and kprobe section) or some such.

> For now, I will start with idea 2. I will create a special section for each
> class of functions (EL1 exception handlers, FTRACE trampolines, KPROBE
> trampolines). Instead of a special functions array, I will implement a
> special_sections array. The rest of the code should just fall into place.

> Let me know if you prefer something different.

It might be safer to start off by just putting all SYM_CODE into a
section then pulling bits we know to be safe out of the section as
needed - we know that anything that's SYM_CODE is doing something
non-standard and needs checking to verify that the unwinder will be
happy with it, and that should cover most if not all of the cases above
as well as anything else we didn't explicitly think of.




Re: [RFC PATCH v2 1/4] arm64: Implement infrastructure for stack trace reliability checks

2021-04-08 Thread Madhavan T. Venkataraman



On 4/8/21 2:30 PM, Madhavan T. Venkataraman wrote:
> 
> 
> On 4/8/21 12:17 PM, Mark Brown wrote:
>> On Mon, Apr 05, 2021 at 03:43:10PM -0500, madve...@linux.microsoft.com wrote:
>>
>>> These checks will involve checking the return PC to see if it falls inside
>>> any special functions where the stack trace is considered unreliable.
>>> Implement the infrastructure needed for this.
>>
>> Following up again based on an off-list discussion with Mark Rutland:
>> while I think this is a reasonable implementation for specifically
>> listing functions that cause problems we could make life easier for
>> ourselves by instead using annotations at the call sites to put things
>> into sections which indicate that they're unsafe for unwinding, we can
>> then check for any address in one of those sections (or possibly do the
>> reverse and check for any address in a section we specifically know is
>> safe) rather than having to enumerate problematic functions in the
>> unwinder.  This also has the advantage of not having a list that's
>> separate to the functions themselves so it's less likely that the
>> unwinder will get out of sync with the rest of the code as things evolve.
>>
>> We already have SYM_CODE_START() annotations in the code for assembly
>> functions that aren't using the standard calling convention which should
>> help a lot here, we could add a variant of that for things that we know
>> are safe on stacks (like those we expect to find at the bottom of
>> stacks).
>>
> 
> As I already mentioned before, I like the idea of sections. The only reason
> that I did not try it was that I have to address FTRACE trampolines and the
> kretprobe_trampoline (and optprobes in the future).
> 
> I have the following options:
> 
> 1. Create a common section (I will have to come up with an appropriate name)
>    and put all such functions in that one section.
> 
> 2. Create one section for each logical type (exception section, ftrace
>    section and kprobe section) or some such.
> 

For now, I will start with idea 2. I will create a special section for each
class of functions (EL1 exception handlers, FTRACE trampolines, KPROBE
trampolines). Instead of a special functions array, I will implement a
special_sections array. The rest of the code should just fall into place.
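
A minimal userspace sketch of what such a special_sections array could look
like, not the actual patch. The struct layout, the section labels, the helper
names (find_special_section(), is_reliable_pc()) and the addresses are all
hypothetical; in the kernel the bounds would presumably come from symbols
that mark the start and end of each section rather than from constants.

```c
#include <stdbool.h>
#include <stddef.h>

/* One entry per class of code the unwinder cannot trust. */
struct special_section {
	const char	*name;	/* hypothetical label, for illustration only */
	unsigned long	start;	/* first address in the section */
	unsigned long	end;	/* one past the last address */
};

/*
 * Made-up address ranges. Real bounds would be supplied by the linker,
 * not hard-coded.
 */
static const struct special_section special_sections[] = {
	{ "el1_exception",	0x1000, 0x1800 },
	{ "ftrace_trampoline",	0x2000, 0x2100 },
	{ "kprobe_trampoline",	0x3000, 0x3040 },
};

#define NR_SPECIAL_SECTIONS \
	(sizeof(special_sections) / sizeof(special_sections[0]))

/* Return the section containing pc, or NULL if pc is in none of them. */
static const struct special_section *find_special_section(unsigned long pc)
{
	size_t i;

	for (i = 0; i < NR_SPECIAL_SECTIONS; i++) {
		if (pc >= special_sections[i].start &&
		    pc < special_sections[i].end)
			return &special_sections[i];
	}
	return NULL;
}

/* A return PC is treated as reliable only if no special section holds it. */
static bool is_reliable_pc(unsigned long pc)
{
	return find_special_section(pc) == NULL;
}
```

With these made-up ranges, is_reliable_pc(0x2050) is false because the
address lies in the hypothetical ftrace range, while is_reliable_pc(0x1800)
is true because the end bound is exclusive.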

Let me know if you prefer something different.

Thanks.

Madhavan

> 3. Use the section idea only for the el1 exceptions. For the others use the
>    current special_functions[] approach.
> 
> Which one do you and Mark Rutland prefer? Or, is there another choice?
> 
> Madhavan
> 


Re: [RFC PATCH v2 1/4] arm64: Implement infrastructure for stack trace reliability checks

2021-04-08 Thread Madhavan T. Venkataraman



On 4/8/21 12:17 PM, Mark Brown wrote:
> On Mon, Apr 05, 2021 at 03:43:10PM -0500, madve...@linux.microsoft.com wrote:
> 
>> These checks will involve checking the return PC to see if it falls inside
>> any special functions where the stack trace is considered unreliable.
>> Implement the infrastructure needed for this.
> 
> Following up again based on an off-list discussion with Mark Rutland:
> while I think this is a reasonable implementation for specifically
> listing functions that cause problems we could make life easier for
> ourselves by instead using annotations at the call sites to put things
> into sections which indicate that they're unsafe for unwinding, we can
> then check for any address in one of those sections (or possibly do the
> reverse and check for any address in a section we specifically know is
> safe) rather than having to enumerate problematic functions in the
> unwinder.  This also has the advantage of not having a list that's
> separate to the functions themselves so it's less likely that the
> unwinder will get out of sync with the rest of the code as things evolve.
> 
> We already have SYM_CODE_START() annotations in the code for assembly
> functions that aren't using the standard calling convention which should
> help a lot here, we could add a variant of that for things that we know
> are safe on stacks (like those we expect to find at the bottom of
> stacks).
> 

As I already mentioned before, I like the idea of sections. The only reason
that I did not try it was that I have to address FTRACE trampolines and the
kretprobe_trampoline (and optprobes in the future).

I have the following options:

1. Create a common section (I will have to come up with an appropriate name)
   and put all such functions in that one section.

2. Create one section for each logical type (exception section, ftrace
   section and kprobe section) or some such.

3. Use the section idea only for the el1 exceptions. For the others use the
   current special_functions[] approach.

Which one do you and Mark Rutland prefer? Or, is there another choice?

Madhavan


Re: [RFC PATCH v2 1/4] arm64: Implement infrastructure for stack trace reliability checks

2021-04-08 Thread Mark Brown
On Mon, Apr 05, 2021 at 03:43:10PM -0500, madve...@linux.microsoft.com wrote:

> These checks will involve checking the return PC to see if it falls inside
> any special functions where the stack trace is considered unreliable.
> Implement the infrastructure needed for this.

Following up again based on an off-list discussion with Mark Rutland:
while I think this is a reasonable implementation for specifically
listing functions that cause problems we could make life easier for
ourselves by instead using annotations at the call sites to put things
into sections which indicate that they're unsafe for unwinding, we can
then check for any address in one of those sections (or possibly do the
reverse and check for any address in a section we specifically know is
safe) rather than having to enumerate problematic functions in the
unwinder.  This also has the advantage of not having a list that's
separate to the functions themselves so it's less likely that the
unwinder will get out of sync with the rest of the code as things evolve.

We already have SYM_CODE_START() annotations in the code for assembly
functions that aren't using the standard calling convention which should
help a lot here, we could add a variant of that for things that we know
are safe on stacks (like those we expect to find at the bottom of
stacks).
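
To make the two polarities concrete, here is a small userspace sketch, not
kernel code. The struct, the function names and the address ranges are all
hypothetical. It contrasts checking for an address in an unsafe section with
the reverse, checking for an address in a section known to be safe.

```c
#include <stdbool.h>

struct text_range {
	unsigned long start;	/* inclusive */
	unsigned long end;	/* exclusive */
};

/* Made-up bounds standing in for linker-provided section symbols. */
static const struct text_range safe_text   = { 0x1000, 0x4000 };
static const struct text_range unsafe_text = { 0x4000, 0x5000 };

static bool in_range(const struct text_range *r, unsigned long pc)
{
	return pc >= r->start && pc < r->end;
}

/* Deny-list policy: a PC is unreliable only if it lies in the unsafe section. */
static bool reliable_by_denylist(const struct text_range *unsafe,
				 unsigned long pc)
{
	return !in_range(unsafe, pc);
}

/* Allow-list policy: a PC is reliable only if it lies in the safe section. */
static bool reliable_by_allowlist(const struct text_range *safe,
				  unsigned long pc)
{
	return in_range(safe, pc);
}
```

The two policies differ for an address that is in neither range, such as
0x9000 here: the deny-list check reports it as reliable, while the allow-list
check reports it as unreliable. That default for code nobody annotated is the
main trade-off between the two approaches.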




Re: [RFC PATCH v2 1/4] arm64: Implement infrastructure for stack trace reliability checks

2021-04-08 Thread Mark Brown
On Mon, Apr 05, 2021 at 03:43:10PM -0500, madve...@linux.microsoft.com wrote:
> From: "Madhavan T. Venkataraman" 
> 
> Implement a check_reliability() function that will contain checks for the
> presence of various features and conditions that can render the stack trace
> unreliable.

Reviewed-by: Mark Brown 




[RFC PATCH v2 1/4] arm64: Implement infrastructure for stack trace reliability checks

2021-04-05 Thread madvenka
From: "Madhavan T. Venkataraman" 

Implement a check_reliability() function that will contain checks for the
presence of various features and conditions that can render the stack trace
unreliable.

Introduce the first reliability check - If a return PC encountered in a
stack trace is not a valid kernel text address, the stack trace is
considered unreliable. It could be some generated code.

Other reliability checks will be added in the future.

These checks will involve checking the return PC to see if it falls inside
any special functions where the stack trace is considered unreliable.
Implement the infrastructure needed for this.

Signed-off-by: Madhavan T. Venkataraman 
---
 arch/arm64/include/asm/stacktrace.h |  2 +
 arch/arm64/kernel/stacktrace.c  | 80 +
 2 files changed, 82 insertions(+)

diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index eb29b1fe8255..684f65808394 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -59,6 +59,7 @@ struct stackframe {
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 	int graph;
 #endif
+	bool reliable;
 };
 
 extern int unwind_frame(struct task_struct *tsk, struct stackframe *frame);
@@ -169,6 +170,7 @@ static inline void start_backtrace(struct stackframe *frame,
 	bitmap_zero(frame->stacks_done, __NR_STACK_TYPES);
 	frame->prev_fp = 0;
 	frame->prev_type = STACK_TYPE_UNKNOWN;
+	frame->reliable = true;
 }
 
 #endif /* __ASM_STACKTRACE_H */
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index ad20981dfda4..557657d6e6bd 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -18,6 +18,84 @@
 #include 
 #include 
 
+struct function_range {
+	unsigned long	start;
+	unsigned long	end;
+};
+
+/*
+ * Special functions where the stack trace is unreliable.
+ */
+static struct function_range	special_functions[] = {
+	{ /* sentinel */ }
+};
+
+static bool is_reliable_function(unsigned long pc)
+{
+	static bool inited = false;
+	struct function_range *func;
+
+	if (!inited) {
+		static char sym[KSYM_NAME_LEN];
+		unsigned long size, offset;
+
+		for (func = special_functions; func->start; func++) {
+			if (kallsyms_lookup(func->start, &size, &offset,
+					    NULL, sym)) {
+				func->start -= offset;
+				func->end = func->start + size;
+			} else {
+				/*
+				 * This is just a label. So, we only need to
+				 * consider that particular location. So, size
+				 * is the size of one AArch64 instruction.
+				 */
+				func->end = func->start + 4;
+			}
+		}
+		inited = true;
+	}
+
+	for (func = special_functions; func->start; func++) {
+		if (pc >= func->start && pc < func->end)
+			return false;
+	}
+	return true;
+}
+
+/*
+ * Check for the presence of features and conditions that render the stack
+ * trace unreliable.
+ *
+ * Once all such cases have been addressed, this function can aid live
+ * patching (and this comment can be removed).
+ */
+static void check_reliability(struct stackframe *frame)
+{
+	/*
+	 * If the stack trace has already been marked unreliable, just return.
+	 */
+	if (!frame->reliable)
+		return;
+
+	/*
+	 * First, make sure that the return address is a proper kernel text
+	 * address. A NULL or invalid return address probably means there's
+	 * some generated code which __kernel_text_address() doesn't know
+	 * about. Mark the stack trace as not reliable.
+	 */
+	if (!__kernel_text_address(frame->pc)) {
+		frame->reliable = false;
+		return;
+	}
+
+	/*
+	 * Check the reliability of the return PC's function.
+	 */
+	if (!is_reliable_function(frame->pc))
+		frame->reliable = false;
+}
+
 /*
  * AArch64 PCS assigns the frame pointer to x29.
  *
@@ -108,6 +186,8 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
 
 	frame->pc = ptrauth_strip_insn_pac(frame->pc);
 
+	check_reliability(frame);
+
 	return 0;
 }
 NOKPROBE_SYMBOL(unwind_frame);
-- 
2.25.1