Go on, I'll be busy wrapping the syscalls.

On Mon, Aug 29, 2016 at 2:48 PM, Nadav Har'El <[email protected]> wrote:
> By the way, Benoit - with this patch, we save some of the registers more
> than once (rcx, rsp) and waste some space on the stack, all for the
> purpose of building something resembling a "signal frame". Is this a worthy
> cause? Why should the syscall_entry() function resemble a signal, when it
> can resemble a normal function call?
>
> Would you have any objection if you or I cleaned up all these duplicate
> saves in syscall_entry(), and did *not* emulate a signal frame?
>
> All the CFI stuff in this function is broken anyway, and will need to be
> reworked.
>
>
> --
> Nadav Har'El
> [email protected]
>
> On Mon, Aug 29, 2016 at 3:42 PM, Nadav Har'El <[email protected]> wrote:
>
>> Not saving and restoring the rbp register causes tst-syscall to crash in
>> the debug build. With this patch, the debug build of this test no longer
>> crashes.
>>
>> Once we do save %rbp, let's kill two birds with one stone, and also enable
>> backtrace_safe() (e.g., on abort) to go through the syscall_entry function
>> correctly. To do this, we need to set up the old-style frame pointer -
>> which means we need to push to the stack the return address (which we get
>> in %rcx), then the old %rbp, and then set %rbp to our %rsp.
>>
>> Now there's an extra complication: Adding an odd number of 8-byte items
>> to the stack makes it, in my debug build of the test, no longer 16-byte
>> aligned. According to the C ABI, the stack must be 16-byte aligned when
>> calling a C function (syscall_wrapper()) - and the debug build has some
>> FPU-saving code which makes this assumption, and crashes with #GP if not.
>>
>> So we add in this patch also code to align the stack to 16 bytes before
>> calling the C function. We use a nice trick to do that without using up
>> another register.
>>
>> Signed-off-by: Nadav Har'El <[email protected]>
>> ---
>>  arch/x64/entry.S | 28 ++++++++++++++++++++++++++++
>>  1 file changed, 28 insertions(+)
>>
>> diff --git a/arch/x64/entry.S b/arch/x64/entry.S
>> index 25f3cba..48a0a71 100644
>> --- a/arch/x64/entry.S
>> +++ b/arch/x64/entry.S
>> @@ -166,6 +166,13 @@ syscall_entry:
>>      .cfi_startproc simple
>>      # There is no ring transition and rflags are left unchanged.
>>
>> +    # We need to save and restore the caller's %rbp anyway, so let's also
>> +    # set it up properly for old-style frame-pointer backtracing to work
>> +    # (e.g., backtrace_safe()). Also need to push the return address before
>> +    # the rbp to get a normal frame. Our return address is in rcx.
>> +    pushq %rcx
>> +    pushq %rbp
>> +    movq %rsp, %rbp
>>      #
>>      # From http://stackoverflow.com/questions/2535989/what-are-the-calling-conventions-for-unix-linux-system-calls-on-x86-64:
>>      # "User-level applications use as integer registers for passing the sequence %rdi, %rsi, %rdx, %rcx, %r8 and %r9. The kernel interface uses %rdi, %rsi, %rdx, %r10, %r8 and %r9"
>> @@ -229,8 +236,26 @@ syscall_entry:
>>      # syscall number from rax as first argument
>>      movq %rax, %rdi
>>
>> +    # align stack to 16 bytes, as required by the ABI.
>> +    # Counting the pushes above is not enough because we don't know what was
>> +    # the stack alignment initially (syscall is not a function call so it can
>> +    # be called with any stack alignment). An additional complication is that
>> +    # we need to restore %rsp later without knowing how it was previously
>> +    # aligned. In the following trick, not using an additional register, the
>> +    # two pushes leave the stack with the same alignment it had originally,
>> +    # and a copy of the original %rsp at (%rsp) and 8(%rsp). The andq then
>> +    # aligns the stack - if it was already 16 byte aligned nothing changes, if
>> +    # it was 8 byte aligned then it subtracts 8 from %rsp, meaning that the
>> +    # original %rsp is now at 8(%rsp) and 16(%rsp). In both cases we can
>> +    # restore it from 8(%rsp).
>> +    pushq %rsp
>> +    pushq (%rsp)
>> +    andq $-0x10, %rsp
>> +
>>      callq syscall_wrapper
>>
>> +    movq 8(%rsp), %rsp
>> +
>>      popq %r9
>>      # in Linux user and kernel return value are in rax so we have nothing to do for return values
>>
>> @@ -251,6 +276,9 @@ syscall_entry:
>>      addq $8, %rsp # rip emplacement (rip cannot be popped)
>>      popq %rsp
>>
>> +    popq %rbp
>> +    popq %rcx
>> +
>>      # jump to rcx where the syscall instruction put rip
>>      # (sysret would leave rxc cloberred so we have nothing to do to restore it)
>>      jmpq *%rcx
>> --
>> 2.7.4
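
For anyone checking the alignment trick in the quoted patch by hand, here is a minimal sketch that simulates the three instructions and the later restore. It is not OSv code: the dict standing in for stack memory, the helper names push() and align_and_restore(), and the example addresses 0x7000 and 0x7008 are all assumptions made for illustration. It relies on two x86-64 behaviours: pushq %rsp stores the value %rsp had before the push, and pushq (%rsp) reads its memory operand before %rsp is decremented.

```python
# Simulation of the stack-alignment trick from the quoted patch (not OSv code).
# The dict-based "memory" and the example addresses are illustrative assumptions.

def push(mem, rsp, value):
    """Model pushq: decrement rsp by 8, then store value at the new rsp."""
    rsp -= 8
    mem[rsp] = value
    return rsp


def align_and_restore(original_rsp):
    """Return (rsp at the call, rsp after the restore) for a starting %rsp."""
    mem = {}
    rsp = original_rsp
    rsp = push(mem, rsp, rsp)        # pushq %rsp    (stores the pre-push value)
    rsp = push(mem, rsp, mem[rsp])   # pushq (%rsp)  (operand read before decrement)
    rsp &= ~0xF                      # andq $-0x10, %rsp
    rsp_at_call = rsp                # callq syscall_wrapper would run here
    rsp = mem[rsp + 8]               # movq 8(%rsp), %rsp
    return rsp_at_call, rsp


if __name__ == "__main__":
    # One 16-byte-aligned start and one that is only 8-byte aligned.
    for start in (0x7000, 0x7008):
        at_call, restored = align_and_restore(start)
        print(hex(start), hex(at_call), hex(restored))
```

In both cases the value at the call is a multiple of 16 and the restored value equals the starting one, which is the property the patch comment argues for: the saved copy of the original %rsp is always reachable at 8(%rsp), whether or not the andq moved the stack pointer.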
