Go ahead, I'll be busy wrapping the syscalls.

On Mon, Aug 29, 2016 at 2:48 PM, Nadav Har'El <[email protected]> wrote:

> By the way, Benoit - with this patch, we save some of the registers more
> than once (rcx, rsp) and waste some space on the stack, all for the
> purpose of building something resembling a "signal frame". Is this a worthy
> cause? Why should the syscall_entry() function resemble a signal, when it
> can resemble a normal function call?
>
> Would you have any objection if you or I cleaned up all this duplicate
> saved state in syscall_entry(), and did *not* emulate a signal frame?
>
> All the CFI stuff in this function is broken anyway, and will need to be
> reworked.
>
>
> --
> Nadav Har'El
> [email protected]
>
> On Mon, Aug 29, 2016 at 3:42 PM, Nadav Har'El <[email protected]> wrote:
>
>> Not saving and restoring the rbp register causes tst-syscall to crash in
>> the debug build. With this patch, the debug build of this test no longer
>> crashes.
>>
>> Once we do save %rbp, let's kill two birds with one stone, and also enable
>> backtrace_safe() (e.g., on abort) to go through the syscall_entry function
>> correctly. To do this, we need to set up the old-style frame pointer -
>> which means we need to push to the stack the return address (which we get
>> in %rcx), then the old %rbp, and then set %rbp to our %rsp.
>>
>> Now there's an extra complication: Adding an odd number of 8-byte items
>> to the stack makes it, in my debug build of the test, no longer 16-byte
>> aligned. According to the C ABI, the stack must be 16-byte aligned when
>> calling a C function (syscall_wrapper()) - and the debug build has some
>> FPU-saving code which makes this assumption, and crashes with #GP if not.
>>
>> So this patch also adds code to align the stack to 16 bytes before
>> calling the C function. We use a nice trick to do that without using up
>> another register.
>>
>> Signed-off-by: Nadav Har'El <[email protected]>
>> ---
>>  arch/x64/entry.S | 28 ++++++++++++++++++++++++++++
>>  1 file changed, 28 insertions(+)
>>
>> diff --git a/arch/x64/entry.S b/arch/x64/entry.S
>> index 25f3cba..48a0a71 100644
>> --- a/arch/x64/entry.S
>> +++ b/arch/x64/entry.S
>> @@ -166,6 +166,13 @@ syscall_entry:
>>          .cfi_startproc simple
>>         # There is no ring transition and rflags are left unchanged.
>>
>> +    # We need to save and restore the caller's %rbp anyway, so let's also
>> +    # set it up properly for old-style frame-pointer backtracing to work
>> +    # (e.g., backtrace_safe()). Also need to push the return address before
>> +    # the rbp to get a normal frame. Our return address is in rcx.
>> +    pushq %rcx
>> +    pushq %rbp
>> +    movq %rsp, %rbp
>>         #
>>         # From http://stackoverflow.com/questions/2535989/what-are-the-calling-conventions-for-unix-linux-system-calls-on-x86-64:
>>         # "User-level applications use as integer registers for passing the sequence %rdi, %rsi, %rdx, %rcx, %r8 and %r9. The kernel interface uses %rdi, %rsi, %rdx, %r10, %r8 and %r9"
>> @@ -229,8 +236,26 @@ syscall_entry:
>>         # syscall number from rax as first argument
>>         movq %rax, %rdi
>>
>> +    # align stack to 16 bytes, as required by the ABI.
>> +    # Counting the pushes above is not enough because we don't know what was
>> +    # the stack alignment initially (syscall is not a function call so it can
>> +    # be called with any stack alignment). An additional complication is that
>> +    # we need to restore %rsp later without knowing how it was previously
>> +    # aligned. In the following trick, not using an additional register, the
>> +    # two pushes leave the stack with the same alignment it had originally,
>> +    # and a copy of the original %rsp at (%rsp) and 8(%rsp). The andq then
>> +    # aligns the stack - if it was already 16 byte aligned nothing changes, if
>> +    # it was 8 byte aligned then it subtracts 8 from %rsp, meaning that the
>> +    # original %rsp is now at 8(%rsp) and 16(%rsp). In both cases we can
>> +    # restore it from 8(%rsp).
>> +    pushq %rsp
>> +    pushq (%rsp)
>> +    andq $-0x10, %rsp
>> +
>>         callq syscall_wrapper
>>
>> +    movq 8(%rsp), %rsp
>> +
>>         popq %r9
>>         # in Linux user and kernel return value are in rax so we have nothing to do for return values
>>
>> @@ -251,6 +276,9 @@ syscall_entry:
>>          addq $8, %rsp  # rip emplacement (rip cannot be popped)
>>         popq %rsp
>>
>> +    popq %rbp
>> +    popq %rcx
>> +
>>         # jump to rcx where the syscall instruction put rip
>>         # (sysret would leave rxc cloberred so we have nothing to do to restore it)
>>         jmpq *%rcx
>> --
>> 2.7.4
>>
>>
>
