On Monday, February 22, 2021 at 7:30:31 AM UTC+2 [email protected] wrote:
> I think I have an explanation of what is going on. Before I present it let
> me recap the calling convention for aarch64:
>
> Caller:
>
> 1. If we need any of x0-x18 registers, save them. They are corruptible.
> 2. Move the first 8 parameters into registers x0-x7.
> 3. Push any additional parameters on the stack.
> 4. Use BL to call the function.
> 5. Evaluate the return code in x0.
> 6. Restore any of x0-x18 that we saved in step 1.
>
> Callee:
>
> 1. Push LR (x30) and x19-x29 onto the stack if used by this routine.
> 2. Do the work
> 3. Put return code in x0.
> 4. Pop LR and x19-x29 if pushed in step 1.
> 5. Use RET instruction to return execution to the caller (this will
> implicitly use LR (x30) as an address to return to).
>
> Now imagine the following scenario involving function F executing on
> thread T1 that calls thread::yield() or another function calling yield():
>
> 1. Function F pushes one of the callee saved registers - x23 (just an
> example) - on the T1 stack becuase it uses it for something and it must do
> it per the calling convention.
> 2. Function F stores some value in x23.
> 3. Function F calls thread::yield() directly or indirectly.
> 4. Eventually, reschedule_from_interrupt() is called and it calls
> switch_to() to switch stack pointer to the new thread T2 stack. The debug
> version of reschedule_from_interrupt() nor switch_to() stores x23 as they
> do not use this register (unlike the release version).
> 5. At some point, later reschedule_from_interrupt() is called again
> (not necessarily the very next time) and calls switch_to() that switches
> back to T1.
> 6. T1 resumes and eventually returns the control to the function F1
> right after it called yield().
> 7. The code in F1 after calling yield() reads the value of x23 ... and
> boom. The x23 quite likely contains garbage because it was never restored
> by F1 after calling yield() because per calling convention yield() or
> other
> callees should have saved and restored. But it did not, did it? Or rather
> different routines on different threads running on the same cpu in between
> ruined it.
>
> Why does it all work with the release version? It does because the
> reschedule_from_interrupt() compiled with -02 happens to use and save all
> callee-saved registers x19-x28. So they happen to be restored to correct
> values after the switch.
>
> So it seems that the right solution is to save and restore x19-x28 (callee
> saved registers) in switch_to() like so:
>
> diff --git a/arch/aarch64/arch-switch.hh b/arch/aarch64/arch-switch.hh
> index dff7467c..45aff4a7 100644
> --- a/arch/aarch64/arch-switch.hh
> +++ b/arch/aarch64/arch-switch.hh
> @@ -27,6 +27,7 @@ void thread::switch_to()
>
> asm volatile("\n"
> "str x29, %0 \n"
> + "sub sp, sp, #0x50\n"
> "mov x2, sp \n"
> "adr x1, 1f \n" /* address of label */
> "stp x2, x1, %1 \n"
> @@ -34,10 +35,23 @@ void thread::switch_to()
> "ldp x29, x0, %2 \n"
> "ldp x2, x1, %3 \n"
>
> + "stp x19, x20, [sp, #0]\n"
> + "stp x21, x22, [sp, #16]\n"
> + "stp x23, x24, [sp, #32]\n"
> + "stp x25, x26, [sp, #48]\n"
> + "stp x27, x28, [sp, #64]\n"
> +
> "mov sp, x2 \n"
> "blr x1 \n"
>
> "1: \n" /* label */
> +
> + "ldp x19, x20, [sp, #0]\n"
> + "ldp x21, x22, [sp, #16]\n"
> + "ldp x23, x24, [sp, #32]\n"
> + "ldp x25, x26, [sp, #48]\n"
> + "ldp x27, x28, [sp, #64]\n"
> + "add sp, sp, #0x50\n"
> :
> : "Q"(old->_state.fp), "Ump"(old->_state.sp),
> "Ump"(this->_state.fp), "Ump"(this->_state.sp)
>
> And indeed the crashes in both -00 and -O1 go away.
>
> Does my explanation have holes? Or am I completely wrong?
>
I think you're completely right, well spotted.
I think you're completely right. Here's the equivalent from Linux.
Here's the Linux equivalent:
SYM_FUNC_START(cpu_switch_to)
mov x10, #THREAD_CPU_CONTEXT
add x8, x0, x10
mov x9, sp
stp x19, x20, [x8], #16 // store callee-saved
registers
stp x21, x22, [x8], #16
stp x23, x24, [x8], #16
stp x25, x26, [x8], #16
stp x27, x28, [x8], #16
stp x29, x9, [x8], #16
str lr, [x8]
add x8, x1, x10
ldp x19, x20, [x8], #16 // restore callee-saved
registers
ldp x21, x22, [x8], #16
ldp x23, x24, [x8], #16
ldp x25, x26, [x8], #16
ldp x27, x28, [x8], #16
ldp x29, x9, [x8], #16
ldr lr, [x8]
mov sp, x9
msr sp_el0, x1
ptrauth_keys_install_kernel x1, x8, x9, x10
scs_save x0, x8
scs_load x1, x8
ret
SYM_FUNC_END(cpu_switch_to)
NOKPROBE(cpu_switch_to)
--
You received this message because you are subscribed to the Google Groups "OSv
Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/osv-dev/a6d7b7a4-4757-4959-9c6b-1e63c24997den%40googlegroups.com.