Re: [RFC PATCH v2 1/8] arm64: Implement stack trace termination record

2021-03-23 Thread Madhavan T. Venkataraman



On 3/23/21 5:24 AM, Mark Rutland wrote:
> On Fri, Mar 19, 2021 at 05:03:09PM -0500, Madhavan T. Venkataraman wrote:
>> I solved this by using existing functions logically instead of inventing a
>> dummy function. I initialize pt_regs->stackframe[1] to an existing function
>> so that the stack trace will not show a 0x0 entry as well as the kernel and
>> gdb will show identical stack traces.
>>
>> For all task stack traces including the idle tasks, the stack trace will
>> end at copy_thread() as copy_thread() is the function that initializes the
>> pt_regs and the first stack frame for a task.
> 
> I don't think this is a good idea, as it will mean that copy_thread()
> will appear to be live in every thread, and therefore will not be
> patchable.
> 
> There are other things people need to be aware of when using an external
> debugger (e.g. during EL0<->ELx transitions there are periods when X29
> and X30 contain the EL0 values, and cannot be used to unwind), so I
> don't think there's a strong need to make this look prettier to an
> external debugger.
> 

OK.

>> For EL0 exceptions, the stack trace will end with vectors() as vectors
>> entries call the EL0 handlers.
>>
>> Here are sample stack traces (I only show the ending of each trace):
>>
>> Idle task on primary CPU
>> 
>>
>>   ...
>> [0.022557]   start_kernel+0x5b8/0x5f4
>> [0.022570]   __primary_switched+0xa8/0xb8
>> [0.022578]   copy_thread+0x0/0x188
>>
>> Idle task on secondary CPU
>> ==
>>
>>   ...
>> [0.023397]   secondary_start_kernel+0x188/0x1e0
>> [0.023406]   __secondary_switched+0x40/0x88
>> [0.023415]   copy_thread+0x0/0x188
>>
>> All other kernel threads
>> 
>>
>>   ...
>> [   13.501062]   ret_from_fork+0x10/0x18
>> [   13.507998]   copy_thread+0x0/0x188
>>
>> User threads (EL0 exception)
>> 
>>
>> write(2) system call example:
>>
>>   ...
>> [  521.686148]   vfs_write+0xc8/0x2c0
>> [  521.686156]   ksys_write+0x74/0x108
>> [  521.686161]   __arm64_sys_write+0x24/0x30
>> [  521.686166]   el0_svc_common.constprop.0+0x70/0x1a8
>> [  521.686175]   do_el0_svc+0x2c/0x98
>> [  521.686180]   el0_svc+0x2c/0x70
>> [  521.686188]   el0_sync_handler+0xb0/0xb8
>> [  521.686193]   el0_sync+0x17c/0x180
>> [  521.686198]   vectors+0x0/0x7d8
> 
> [...]
> 
>> If you approve, the above will become RFC Patch v3 1/8 in the next version.
> 
> As above, I don't think we should repurpose an existing function here,
> and my preference is to use 0x0.
> 

OK.

>> Let me know.
>>
>> Also, I could introduce an extra frame in the EL1 exception stack trace that
>> includes vectors so the stack trace would look like this (timer interrupt 
>> example):
>>
>> call_timer_fn
>> run_timer_softirq
>> __do_softirq
>> irq_exit
>> __handle_domain_irq
>> gic_handle_irq
>> el1_irq
>> vectors
>>
>> This way, if the unwinder finds vectors, it knows that it is an exception 
>> frame.
> 
> I can see this might make it simpler to detect exception boundaries, but
> I suspect that we need other information anyway, so this doesn't become
> all that helpful. For EL0<->EL1 exception boundaries we want to
> successfully terminate a robust stacktrace whereas for EL1<->EL1
> exception boundaries we want to fail a robust stacktrace.
> 
> I reckon we have to figure that out from the el1_* and el0_* entry
> points (which I am working to reduce/simplify as part of the entry
> assembly conversion to C). With that we can terminate unwind at the
> el0_* parts, and reject unwinding across any other bit of .entry.text.
> 

OK. That is fine.

Thanks.

Madhavan


Re: [RFC PATCH v2 1/8] arm64: Implement stack trace termination record

2021-03-23 Thread Mark Rutland
On Fri, Mar 19, 2021 at 05:03:09PM -0500, Madhavan T. Venkataraman wrote:
> I solved this by using existing functions logically instead of inventing a
> dummy function. I initialize pt_regs->stackframe[1] to an existing function
> so that the stack trace will not show a 0x0 entry as well as the kernel and
> gdb will show identical stack traces.
> 
> For all task stack traces including the idle tasks, the stack trace will
> end at copy_thread() as copy_thread() is the function that initializes the
> pt_regs and the first stack frame for a task.

I don't think this is a good idea, as it will mean that copy_thread()
will appear to be live in every thread, and therefore will not be
patchable.

There are other things people need to be aware of when using an external
debugger (e.g. during EL0<->ELx transitions there are periods when X29
and X30 contain the EL0 values, and cannot be used to unwind), so I
don't think there's a strong need to make this look prettier to an
external debugger.

> For EL0 exceptions, the stack trace will end with vectors() as vectors
> entries call the EL0 handlers.
> 
> Here are sample stack traces (I only show the ending of each trace):
> 
> Idle task on primary CPU
> 
> 
>...
> [0.022557]   start_kernel+0x5b8/0x5f4
> [0.022570]   __primary_switched+0xa8/0xb8
> [0.022578]   copy_thread+0x0/0x188
> 
> Idle task on secondary CPU
> ==
> 
>...
> [0.023397]   secondary_start_kernel+0x188/0x1e0
> [0.023406]   __secondary_switched+0x40/0x88
> [0.023415]   copy_thread+0x0/0x188
> 
> All other kernel threads
> 
> 
>...
> [   13.501062]   ret_from_fork+0x10/0x18
> [   13.507998]   copy_thread+0x0/0x188
> 
> User threads (EL0 exception)
> 
> 
> write(2) system call example:
> 
>...
> [  521.686148]   vfs_write+0xc8/0x2c0
> [  521.686156]   ksys_write+0x74/0x108
> [  521.686161]   __arm64_sys_write+0x24/0x30
> [  521.686166]   el0_svc_common.constprop.0+0x70/0x1a8
> [  521.686175]   do_el0_svc+0x2c/0x98
> [  521.686180]   el0_svc+0x2c/0x70
> [  521.686188]   el0_sync_handler+0xb0/0xb8
> [  521.686193]   el0_sync+0x17c/0x180
> [  521.686198]   vectors+0x0/0x7d8

[...]

> If you approve, the above will become RFC Patch v3 1/8 in the next version.

As above, I don't think we should repurpose an existing function here,
and my preference is to use 0x0.

> Let me know.
> 
> Also, I could introduce an extra frame in the EL1 exception stack trace that
> includes vectors so the stack trace would look like this (timer interrupt 
> example):
> 
> call_timer_fn
> run_timer_softirq
> __do_softirq
> irq_exit
> __handle_domain_irq
> gic_handle_irq
> el1_irq
> vectors
> 
> This way, if the unwinder finds vectors, it knows that it is an exception 
> frame.

I can see this might make it simpler to detect exception boundaries, but
I suspect that we need other information anyway, so this doesn't become
all that helpful. For EL0<->EL1 exception boundaries we want to
successfully terminate a robust stacktrace whereas for EL1<->EL1
exception boundaries we want to fail a robust stacktrace.

I reckon we have to figure that out from the el1_* and el0_* entry
points (which I am working to reduce/simplify as part of the entry
assembly conversion to C). With that we can terminate unwind at the
el0_* parts, and reject unwinding across any other bit of .entry.text.

Thanks,
Mark.



Re: [RFC PATCH v2 1/8] arm64: Implement stack trace termination record

2021-03-19 Thread Madhavan T. Venkataraman



On 3/19/21 1:19 PM, Madhavan T. Venkataraman wrote:
> 
> 
> On 3/19/21 9:29 AM, Madhavan T. Venkataraman wrote:
>>
>>
>> On 3/19/21 7:30 AM, Mark Brown wrote:
>>> On Thu, Mar 18, 2021 at 03:26:13PM -0500, Madhavan T. Venkataraman wrote:
 On 3/18/21 10:09 AM, Mark Brown wrote:
>>>
> If we are going to add the extra record there would probably be less
> potential for confusion if we pointed it at some sensibly named dummy
> function so anything or anyone that does see it on the stack doesn't get
> confused by a NULL.
>>>
 I agree. I will think about this some more. If no other solution presents
 itself, I will add the dummy function.
>>>
>>> After discussing this with Mark Rutland offlist he convinced me that so
>>> long as we ensure the kernel doesn't print the NULL record we're
>>> probably OK here, the effort setting the function pointer up correctly
>>> in all circumstances (especially when we're not in the normal memory
>>> map) is probably not worth it for the limited impact it's likely to have
>>> to see the NULL pointer (probably mainly a person working with some
>>> external debugger).  It should be noted in the changelog though, and/or
>>> merged in with the relevant change to the unwinder.
>>>
>>
>> OK. I will add a comment as well as note it in the changelog.
>>
>> Thanks to both of you.
>>
>> Madhavan
>>
> 
> I thought about this some more. I think I have a simple solution. I will
> prepare a patch and send it out. If you and Mark Rutland approve, I will
> include it in the next version of this RFC.
> 
> Madhavan
> 

I solved this by using existing functions logically instead of inventing a
dummy function. I initialize pt_regs->stackframe[1] to an existing function
so that the stack trace will not show a 0x0 entry as well as the kernel and
gdb will show identical stack traces.

For all task stack traces including the idle tasks, the stack trace will
end at copy_thread() as copy_thread() is the function that initializes the
pt_regs and the first stack frame for a task.

For EL0 exceptions, the stack trace will end with vectors() as vectors
entries call the EL0 handlers.

Here are sample stack traces (I only show the ending of each trace):

Idle task on primary CPU


 ...
[0.022557]   start_kernel+0x5b8/0x5f4
[0.022570]   __primary_switched+0xa8/0xb8
[0.022578]   copy_thread+0x0/0x188

Idle task on secondary CPU
==

 ...
[0.023397]   secondary_start_kernel+0x188/0x1e0
[0.023406]   __secondary_switched+0x40/0x88
[0.023415]   copy_thread+0x0/0x188

All other kernel threads


 ...
[   13.501062]   ret_from_fork+0x10/0x18
[   13.507998]   copy_thread+0x0/0x188

User threads (EL0 exception)


write(2) system call example:

 ...
[  521.686148]   vfs_write+0xc8/0x2c0
[  521.686156]   ksys_write+0x74/0x108
[  521.686161]   __arm64_sys_write+0x24/0x30
[  521.686166]   el0_svc_common.constprop.0+0x70/0x1a8
[  521.686175]   do_el0_svc+0x2c/0x98
[  521.686180]   el0_svc+0x2c/0x70
[  521.686188]   el0_sync_handler+0xb0/0xb8
[  521.686193]   el0_sync+0x17c/0x180
[  521.686198]   vectors+0x0/0x7d8

Here are the code changes:


diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index a31a0a713c85..514307e80b79 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -261,16 +261,19 @@ alternative_else_nop_endif
stp lr, x21, [sp, #S_LR]
 
/*
-* For exceptions from EL0, terminate the callchain here.
+* For exceptions from EL0, terminate the callchain here at
+* task_pt_regs(current)->stackframe.
+*
 * For exceptions from EL1, create a synthetic frame record so the
 * interrupted code shows up in the backtrace.
 */
.if \el == 0
-   mov x29, xzr
+   ldr x17, =vectors
+   stp xzr, x17, [sp, #S_STACKFRAME]
.else
stp x29, x22, [sp, #S_STACKFRAME]
-   add x29, sp, #S_STACKFRAME
.endif
+   add x29, sp, #S_STACKFRAME
 
 #ifdef CONFIG_ARM64_SW_TTBR0_PAN
 alternative_if_not ARM64_HAS_PAN
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 66b0e0b66e31..699e0dd313a1 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -393,6 +393,29 @@ SYM_FUNC_START_LOCAL(__create_page_tables)
ret x28
 SYM_FUNC_END(__create_page_tables)
 
+   /*
+* The boot task becomes the idle task for the primary CPU. The
+* CPU startup task on each secondary CPU becomes the idle task
+* for the secondary CPU.
+*
+* The idle task does not require pt_regs. But create a dummy
+* pt_regs so that task_pt_regs(idle_task)->stackframe can be
+* set up to be the last frame on the idle task stack just 

Re: [RFC PATCH v2 1/8] arm64: Implement stack trace termination record

2021-03-19 Thread Madhavan T. Venkataraman



On 3/19/21 9:29 AM, Madhavan T. Venkataraman wrote:
> 
> 
> On 3/19/21 7:30 AM, Mark Brown wrote:
>> On Thu, Mar 18, 2021 at 03:26:13PM -0500, Madhavan T. Venkataraman wrote:
>>> On 3/18/21 10:09 AM, Mark Brown wrote:
>>
 If we are going to add the extra record there would probably be less
 potential for confusion if we pointed it at some sensibly named dummy
 function so anything or anyone that does see it on the stack doesn't get
 confused by a NULL.
>>
>>> I agree. I will think about this some more. If no other solution presents
>>> itself, I will add the dummy function.
>>
>> After discussing this with Mark Rutland offlist he convinced me that so
>> long as we ensure the kernel doesn't print the NULL record we're
>> probably OK here, the effort setting the function pointer up correctly
>> in all circumstances (especially when we're not in the normal memory
>> map) is probably not worth it for the limited impact it's likely to have
>> to see the NULL pointer (probably mainly a person working with some
>> external debugger).  It should be noted in the changelog though, and/or
>> merged in with the relevant change to the unwinder.
>>
> 
> OK. I will add a comment as well as note it in the changelog.
> 
> Thanks to both of you.
> 
> Madhavan
> 

I thought about this some more. I think I have a simple solution. I will
prepare a patch and send it out. If you and Mark Rutland approve, I will
include it in the next version of this RFC.

Madhavan


Re: [RFC PATCH v2 1/8] arm64: Implement stack trace termination record

2021-03-19 Thread Madhavan T. Venkataraman



On 3/19/21 7:30 AM, Mark Brown wrote:
> On Thu, Mar 18, 2021 at 03:26:13PM -0500, Madhavan T. Venkataraman wrote:
>> On 3/18/21 10:09 AM, Mark Brown wrote:
> 
>>> If we are going to add the extra record there would probably be less
>>> potential for confusion if we pointed it at some sensibly named dummy
>>> function so anything or anyone that does see it on the stack doesn't get
>>> confused by a NULL.
> 
>> I agree. I will think about this some more. If no other solution presents
>> itself, I will add the dummy function.
> 
> After discussing this with Mark Rutland offlist he convinced me that so
> long as we ensure the kernel doesn't print the NULL record we're
> probably OK here, the effort setting the function pointer up correctly
> in all circumstances (especially when we're not in the normal memory
> map) is probably not worth it for the limited impact it's likely to have
> to see the NULL pointer (probably mainly a person working with some
> external debugger).  It should be noted in the changelog though, and/or
> merged in with the relevant change to the unwinder.
> 

OK. I will add a comment as well as note it in the changelog.

Thanks to both of you.

Madhavan


Re: [RFC PATCH v2 1/8] arm64: Implement stack trace termination record

2021-03-19 Thread Mark Brown
On Thu, Mar 18, 2021 at 03:26:13PM -0500, Madhavan T. Venkataraman wrote:
> On 3/18/21 10:09 AM, Mark Brown wrote:

> > If we are going to add the extra record there would probably be less
> > potential for confusion if we pointed it at some sensibly named dummy
> > function so anything or anyone that does see it on the stack doesn't get
> > confused by a NULL.

> I agree. I will think about this some more. If no other solution presents
> itself, I will add the dummy function.

After discussing this with Mark Rutland offlist he convinced me that so
long as we ensure the kernel doesn't print the NULL record we're
probably OK here, the effort setting the function pointer up correctly
in all circumstances (especially when we're not in the normal memory
map) is probably not worth it for the limited impact it's likely to have
to see the NULL pointer (probably mainly a person working with some
external debugger).  It should be noted in the changelog though, and/or
merged in with the relevant change to the unwinder.


signature.asc
Description: PGP signature


Re: [RFC PATCH v2 1/8] arm64: Implement stack trace termination record

2021-03-18 Thread Madhavan T. Venkataraman



On 3/18/21 10:09 AM, Mark Brown wrote:
> On Mon, Mar 15, 2021 at 11:57:53AM -0500, madve...@linux.microsoft.com wrote:
> 
>> In summary, task pt_regs->stackframe is where a successful stack trace ends.
> 
>> .if \el == 0
>> -   mov x29, xzr
>> +   stp xzr, xzr, [sp, #S_STACKFRAME]
>> .else
>> stp x29, x22, [sp, #S_STACKFRAME]
>> -   add x29, sp, #S_STACKFRAME
>> .endif
>> +   add x29, sp, #S_STACKFRAME
> 
> For both user and kernel threads this patch (at least by itself) results
> in an additional record being reported in stack traces with a NULL
> function pointer since it keeps the existing record where it is and adds
> this new fixed record below it.  This is addressed for the kernel later
> in the series, by "arm64: Terminate the stack trace at TASK_FRAME and
> EL0_FRAME", but will still be visible to other unwinders such as
> debuggers.  I'm not sure that this *matters* but it might and should at
> least be called out more explicitly.
> 
> If we are going to add the extra record there would probably be less
> potential for confusion if we pointed it at some sensibly named dummy
> function so anything or anyone that does see it on the stack doesn't get
> confused by a NULL.
> 

I agree. I will think about this some more. If no other solution presents
itself, I will add the dummy function.

Madhavan

> 
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 


Re: [RFC PATCH v2 1/8] arm64: Implement stack trace termination record

2021-03-18 Thread Mark Brown
On Mon, Mar 15, 2021 at 11:57:53AM -0500, madve...@linux.microsoft.com wrote:

> In summary, task pt_regs->stackframe is where a successful stack trace ends.

> .if \el == 0
> -   mov x29, xzr
> +   stp xzr, xzr, [sp, #S_STACKFRAME]
> .else
> stp x29, x22, [sp, #S_STACKFRAME]
> -   add x29, sp, #S_STACKFRAME
> .endif
> +   add x29, sp, #S_STACKFRAME

For both user and kernel threads this patch (at least by itself) results
in an additional record being reported in stack traces with a NULL
function pointer since it keeps the existing record where it is and adds
this new fixed record below it.  This is addressed for the kernel later
in the series, by "arm64: Terminate the stack trace at TASK_FRAME and
EL0_FRAME", but will still be visible to other unwinders such as
debuggers.  I'm not sure that this *matters* but it might and should at
least be called out more explicitly.

If we are going to add the extra record there would probably be less
potential for confusion if we pointed it at some sensibly named dummy
function so anything or anyone that does see it on the stack doesn't get
confused by a NULL.


signature.asc
Description: PGP signature