It would also be nice to see what others think about the C++ code changes - 
the #ifdefs, the extra C functions, etc. Is there a better way to 
restructure the code?


On Friday, March 5, 2021 at 10:13:10 AM UTC-5 [email protected] 
wrote:

> On Wednesday, March 3, 2021 at 5:50:45 PM UTC-5 [email protected] wrote:
>
>> On Wed, Mar 3, 2021 at 4:57 PM Waldemar Kozaczuk <[email protected]> 
>> wrote:
>>
>>> As the issue #1124 and the thread 
>>> https://groups.google.com/g/osv-dev/c/ystZElFNdBI
>>> on the mailing list explains, some tests in the aarch64 debug build
>>> fail in a consistent manner with page faults suggesting that possibly
>>> the values of parameters passed from function to function get somehow
>>> corrupt. After painful research, I pinpointed the crashes to the
>>> reschedule_from_interrupt() method, which does not work correctly in
>>> voluntary (non-preemptive) context switch scenarios where the
>>> thread::yield() method is called by an app. It turns out that in those
>>> situations the so-called callee-saved registers used by an app function
>>> calling thread::yield() are not saved during the context switch and
>>> are not restored when such a thread is later resumed. The following
>>> steps illustrate such a scenario:
>>>
>>
> Thank you so much for all your work investigating this issue.
>  
>
>>
>>> 1. Function F pushes one of the callee-saved registers - x23 (just an 
>>> example) - on the T1
>>>    stack because it uses it for something and must do so per the 
>>> calling convention.
>>> 2. Function F stores some value in x23.
>>> 3. Function F calls thread::yield() directly or indirectly.
>>> 4. Eventually, reschedule_from_interrupt() is called and it calls 
>>> switch_to() to switch
>>>    stack pointer to the new thread T2 stack. The debug version of 
>>> neither reschedule_from_interrupt()
>>>    nor switch_to() stores x23 as they do not use this register (unlike 
>>> the release version).
>>> 5. At some point later, reschedule_from_interrupt() is called again (not 
>>> necessarily the very next time)
>>>    and calls switch_to(), which switches back to T1.
>>> 6. T1 resumes and eventually returns control to the function F 
>>> right after it called yield().
>>> 7. The code in F after calling yield() reads the value of x23 ... and 
>>> boom. x23 quite likely
>>>    contains garbage, because it was never restored for F after 
>>> yield() returned: per the calling
>>>    convention, yield() or its callees should have saved and restored 
>>> it. But they did not, did they?
>>>    Rather, different routines on different threads running on the same 
>>> cpu in between clobbered it.
>>>
>>> One way to solve this problem would be to add all callee-saved registers
>>> (x19-x28 and d8-d15 per 
>>> https://github.com/ARM-software/abi-aa/blob/f52e1ad3f81254497a83578dc102f6aac89e52d0/aapcs64/aapcs64.rst
>>> - see "6.1.1 General-purpose Registers" and "6.1.2 SIMD and Floating-Point 
>>> Registers" about the v8-v15 registers)
>>> to the list of clobbered registers of the inline assembly in
>>> thread::switch_to() in arch/aarch64/arch-switch.hh. But this does not
>>> always work, as some versions of gcc (see 
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99342), depending on
>>> the optimization level (-O0, -O2, etc), may or may not inline
>>> switch_to() into reschedule_from_interrupt() and generate machine code
>>> that does not do what we intend. On top of that, the code in 
>>> reschedule_from_interrupt()
>>>
>>
> So, after having attempted to file a bug with GCC and been proven wrong, 
> I've discovered that the following bit of GCC documentation is incorrect:
> "When the compiler selects which registers to use to represent input and 
> output operands, it does not use any of the clobbered registers."
> Reference: 
> https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Clobbers-and-Scratch-Registers
>
> In fact, the compiler does use clobbered registers to represent input 
> operands. For more details and an example of this, see the following 
> (invalid) bug report: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99342
>
> We just happened to be lucky that the compiler hadn't chosen any 
> clobbered registers to represent input operands, until we tried it with GCC 
> 10.2. While we could have solved this by using earlyclobbers (as was 
> suggested in the linked GCC bug report thread), we still wouldn't have 
> satisfied the following requirement from the GCC documentation: "the compiler 
> requires the value of the stack pointer to be the same after 
> an asm statement as it was on entry to the statement".
>

This is a pretty earth-shattering statement. It seems to imply that even 
for x86_64 we might no longer be lucky with which registers the compiler 
chooses, so we may need to re-write switch_to() in assembly there as well 
at some point. Clearly, the purpose of the inline assembly in switch_to() 
is to switch the stack.

>
>>> that destroys any terminated thread on the current cpu uses the
>>> thread-local storage (TLS) pointer register - tpidr_el0 - which is not
>>> re-read after switch_to() and causes interesting crashes, as issue
>>> #1121 describes.
>>>
>>> Given all of the above, and the fact that reschedule_from_interrupt()
>>> calls switch_to() to switch threads, which is very tricky to do right
>>> using pure C++ and some inline assembly, this patch implements a
>>> high-level version of reschedule_from_interrupt() in assembly for
>>> aarch64 (the x64 version stays as is).
>>>
>>> This patch modifies include/osv/sched.hh and core/sched.cc with a number
>>> of `#ifdef AARCH64_PORT_STUB`s to keep reschedule_from_interrupt() as is
>>> for x64 and to split it into 2 new methods - schedule_next_thread() and
>>> destroy_current_cpu_terminating_thread() - for aarch64. The two new
>>> functions correspond to the parts of reschedule_from_interrupt() up
>>> to the switch_to() call and after it. Please note that
>>> schedule_next_thread() returns a simple 2-field struct with pointers to
>>> the old and new thread state (pc, sp, tcb, etc.) that gets used by
>>> reschedule_from_interrupt.
>>>
>>> On top of this, this patch adds arch/aarch64/sched.S, which implements
>>> reschedule_from_interrupt() as a C function in assembly. In essence, 
>>> reschedule_from_interrupt
>>> calls the new schedule_next_thread(), saves all callee-saved
>>> registers on the old thread's stack, executes a context switch as the 
>>> old switch_to() 
>>> did, restores the callee-saved registers from the new thread's stack and
>>> finally calls destroy_current_cpu_terminating_thread().
>>>
>>> This patch has been tested with the -O0, -O1, -O2 optimization levels
>>> (release, debug 1 and 2) on real ARM hardware and in emulated TCG mode.
>>>
>>> Please note that the new code makes the context switch a little bit more
>>> expensive (~3%) per these numbers:
>>>
>>
> I personally favor correctness over performance...
>
>
>>> BEFORE:
>>> taskset -c 2-5 ./scripts/run.py -c 1 -e '/tests/misc-ctxsw.so'
>>> OSv v0.55.0-204-g837a9b5d
>>> getauxval() stubbed
>>> eth0: 192.168.122.15
>>> Booted up in 58.19 ms
>>> Cmdline: /tests/misc-ctxsw.so
>>>        664 colocated
>>>        665 apart
>>>        665 nopin
>>>
>>> AFTER:
>>> taskset -c 2-5 ./scripts/run.py -c 1 -e '/tests/misc-ctxsw.so'
>>> OSv v0.55.0-205-g7e7c49f7
>>> getauxval() stubbed
>>> eth0: 192.168.122.15
>>> Booted up in 58.20 ms
>>> Cmdline: /tests/misc-ctxsw.so
>>>        683 colocated
>>>        690 apart
>>>        690 nopin
>>>
>>
>> BTW the same test app compiled as pie (g++ tests/misc-ctxsw.cc -lpthread 
>> -o misc-ctxsw) to run on Linux host natively yields the following on the 
>> same hardware:
>>
>> taskset -c 2-2 ./misc-ctxsw 
>>       4075 colocated
>>       5319 apart
>>       1813 nopin
>>
>>>
>>> Fixes #1121
>>> Fixes #1124
>>>
>>> Signed-off-by: Waldemar Kozaczuk <[email protected]>
>>> ---
>>>  Makefile                          |  1 +
>>>  arch/aarch64/arch-switch.hh       | 38 +++--------------
>>>  arch/aarch64/arch-thread-state.hh |  1 +
>>>  arch/aarch64/sched.S              | 70 +++++++++++++++++++++++++++++++
>>>  core/sched.cc                     | 63 ++++++++++++++++++++++++++--
>>>  include/osv/sched.hh              | 34 ++++++++++++++-
>>>  6 files changed, 170 insertions(+), 37 deletions(-)
>>>  create mode 100644 arch/aarch64/sched.S
>>>
>>> diff --git a/Makefile b/Makefile
>>> index bbc57252..e1866c3b 100644
>>> --- a/Makefile
>>> +++ b/Makefile
>>> @@ -878,6 +878,7 @@ objects += arch/$(arch)/memset.o
>>>  objects += arch/$(arch)/memcpy.o
>>>  objects += arch/$(arch)/memmove.o
>>>  objects += arch/$(arch)/tlsdesc.o
>>> +objects += arch/$(arch)/sched.o
>>>  objects += $(libfdt)
>>>  endif
>>>
>>> diff --git a/arch/aarch64/arch-switch.hh b/arch/aarch64/arch-switch.hh
>>> index dff7467c..1a78d132 100644
>>> --- a/arch/aarch64/arch-switch.hh
>>> +++ b/arch/aarch64/arch-switch.hh
>>> @@ -20,32 +20,6 @@ void thread_main_c(sched::thread* t);
>>>
>>>  namespace sched {
>>>
>>> -void thread::switch_to()
>>> -{
>>> -    thread* old = current();
>>> -    asm volatile ("msr tpidr_el0, %0; isb; " :: "r"(_tcb) : "memory");
>>> -
>>> -    asm volatile("\n"
>>> -                 "str x29,     %0  \n"
>>> -                 "mov x2, sp       \n"
>>> -                 "adr x1, 1f       \n" /* address of label */
>>> -                 "stp x2, x1,  %1  \n"
>>> -
>>> -                 "ldp x29, x0, %2  \n"
>>> -                 "ldp x2, x1,  %3  \n"
>>> -
>>> -                 "mov sp, x2       \n"
>>> -                 "blr x1           \n"
>>> -
>>> -                 "1:               \n" /* label */
>>> -                 :
>>> -                 : "Q"(old->_state.fp), "Ump"(old->_state.sp),
>>> -                   "Ump"(this->_state.fp), "Ump"(this->_state.sp)
>>> -                 : "x0", "x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8",
>>> -                   "x9", "x10", "x11", "x12", "x13", "x14", "x15",
>>> -                   "x16", "x17", "x18", "x30", "memory");
>>> -}
>>> -
>>>  void thread::switch_to_first()
>>>  {
>>>      asm volatile ("msr tpidr_el0, %0; isb; " :: "r"(_tcb) : "memory");
>>> @@ -59,14 +33,13 @@ void thread::switch_to_first()
>>>
>>>      asm volatile("\n"
>>>                   "ldp x29, x0, %0  \n"
>>> -                 "ldp x2, x1, %1   \n"
>>> -                 "mov sp, x2       \n"
>>> -                 "blr x1           \n"
>>> +                 "ldp x22, x21, %1 \n"
>>> +                 "mov sp, x22      \n"
>>> +                 "blr x21          \n"
>>>                   :
>>>                   : "Ump"(this->_state.fp), "Ump"(this->_state.sp)
>>> -                 : "x0", "x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8",
>>> -                   "x9", "x10", "x11", "x12", "x13", "x14", "x15",
>>> -                   "x16", "x17", "x18", "x30", "memory");
>>> +                 : "x0", "x19", "x20", "x21", "x22", "x23", "x24",
>>> +                  "x25", "x26", "x27", "x28", "x29", "x30", "memory");
>>>
>>
> This fails when trying to compile with GCC 7.5.0:
> In file included from core/sched.cc:126:0:
> arch/aarch64/arch-switch.hh: In member function ‘void 
> sched::thread::switch_to_first()’:
> arch/aarch64/arch-switch.hh:43:1: error: x29 cannot be used in asm here
>  }
>  ^
>
> Removing x29 from the clobber list allows it to build again with GCC 7.5.0.
>
> You're using x21/x22 instead of x1/x2, and I'm guessing this is an attempt 
> to trick the compiler into not using the clobbered registers to represent 
> the input operands?
> Apparently the way to tell the compiler not to use clobbered registers to 
> represent input operands is to use earlyclobbers. Here's the obscure syntax 
> for turning them into earlyclobbers:
>
> asm volatile("\n"
>              "ldp x29, x0, %2  \n"
>              "ldp x22, x21, %3 \n"
>              "mov sp, x22      \n"
>              "blr x21          \n"
>              : // Dummy earlyclobber outputs - these exist only to
>                // designate the operands as earlyclobbers
>                "=&Ump"(this->_state.fp), "=&Ump"(this->_state.sp)
>              : "Ump"(this->_state.fp), "Ump"(this->_state.sp)
>              : "x0", "x19", "x20", "x21", "x22", "x23", "x24",
>                "x25", "x26", "x27", "x28", "x30", "memory");
>
> The operands apparently have to be listed as both outputs and inputs even 
> though they're only being read in the asm.
>
> It looks like there are other cases throughout the OSv codebase where this 
> could be an issue. I don't think those other potential issues should block 
> this patch from going in.
>
> We're still violating the requirement that the stack pointer should be the 
> same value after an asm statement as it was upon entry, so it seems that 
> switch_to_first() should also be moved to assembly (eventually). Again, I 
> don't think that should block this patch from going in.
>

Honestly, at this point, we might skip touching switch_to_first() 
altogether and, as you suggest, re-write it in assembly as well. 
switch_to_first() is only used once per cpu on startup, so I do not think 
it is as essential to fix it now. What do you think?

BTW I think I also missed adding d8-d15 registers to the list of clobbers.

>
>  }
>>>
>>>  void thread::init_stack()
>>> @@ -150,6 +123,7 @@ void thread::setup_tcb()
>>>      void* p = aligned_alloc(64, total_tls_size + sizeof(*_tcb));
>>>      _tcb = (thread_control_block *)p;
>>>      _tcb[0].tls_base = &_tcb[1];
>>> +    _state.tcb = p;
>>>      //
>>>      // First goes kernel TLS data
>>>      auto kernel_tls = _tcb[0].tls_base;
>>> diff --git a/arch/aarch64/arch-thread-state.hh 
>>> b/arch/aarch64/arch-thread-state.hh
>>> index af424472..6f1b680d 100644
>>> --- a/arch/aarch64/arch-thread-state.hh
>>> +++ b/arch/aarch64/arch-thread-state.hh
>>> @@ -14,6 +14,7 @@ struct thread_state {
>>>
>>>      void* sp;
>>>      void* pc;
>>> +    void* tcb;
>>>  };
>>>
>>>  #endif /* ARCH_THREAD_STATE_HH_ */
>>> diff --git a/arch/aarch64/sched.S b/arch/aarch64/sched.S
>>> new file mode 100644
>>> index 00000000..1e69f8fd
>>> --- /dev/null
>>> +++ b/arch/aarch64/sched.S
>>> @@ -0,0 +1,70 @@
>>> +#void cpu::reschedule_from_interrupt(bool called_from_yield,
>>> +#                                    thread_runtime::duration 
>>> preempt_after)
>>> +.global reschedule_from_interrupt
>>> +.type reschedule_from_interrupt, @function
>>> +reschedule_from_interrupt:
>>> +        #.cfi_startproc
>>> +        stp     x29, x30, [sp, #-176]!
>>> +        #.cfi_adjust_cfa_offset 176
>>> +        mov    x29, sp
>>> +        #.cfi_def_cfa_register x29
>>> +
>>> +        #Call cpu_schedule_next_thread() to determine next thread to 
>>> switch to if any
>>> +        bl cpu_schedule_next_thread
>>> +
>>> +       #The cpu_schedule_next_thread returns thread_switch_data in x0 
>>> and x1,
>>> +        #where x0 holds old_thread_state and x1 holds new_thread_state
>>> +        #If cpu_schedule_next_thread() returns thread_switch_data with 
>>> null pointers, exit 
>>> +        cmp x0, #0
>>> +        b.eq 2f
>>> +
>>> +        #Store all callee-save registers on the old thread stack
>>> +        stp    x19, x20, [sp, #16]
>>> +        stp    x21, x22, [sp, #32]
>>> +        stp    x23, x24, [sp, #48]
>>> +        stp    x25, x26, [sp, #64]
>>> +        stp    x27, x28, [sp, #80]
>>> +
>>> +        stp    d8, d9, [sp, #96]
>>> +        stp    d10, d11, [sp, #112]
>>> +        stp    d12, d13, [sp, #128]
>>> +        stp    d14, d15, [sp, #144]
>>> +
>>> +        #Switch to new thread
>>> +        ldr    x2, [x1, #32]     //Fetch _tcb of the new thread
>>> +        msr    tpidr_el0, x2     //Set thread pointer 
>>> +        isb
>>> +
>>> +        str     x29, [x0, #0]     //Save frame pointer of the old thread
>>> +        mov     x3, sp            //Fetch old thread stack pointer
>>> +        adr     x4, 1f            //Fetch old thread instruction point
>>> +        stp     x3, x4, [x0, #16] //Save old thread sp and pc
>>> +
>>> +        ldp     x29, x0, [x1, #0] //Set frame pointer of the new thread 
>>> and this (x0) of the new thread
>>> +                                  //Please note that the pc may point 
>>> to thread_main_c(thread*) which is
>>> +                                  //why we have to set x0 (1st 
>>> argument) to new thread object
>>> +        ldp     x3, x4, [x1, #16] //Fetch new thread sp and pc
>>> +        mov     sp, x3            //Set new thread stack pointer
>>> +        blr     x4                //Jump to the new thread pc
>>> +
>>> +1:
>>> +        #Restore all callee-save registers from the new thread stack
>>> +        ldp     x19, x20, [sp, #16]
>>> +        ldp     x21, x22, [sp, #32]
>>> +        ldp     x23, x24, [sp, #48]
>>> +        ldp     x25, x26, [sp, #64]
>>> +        ldp     x27, x28, [sp, #80]
>>> +
>>> +        ldp    d8, d9, [sp, #96]
>>> +        ldp    d10, d11, [sp, #112]
>>> +        ldp    d12, d13, [sp, #128]
>>> +        ldp    d14, d15, [sp, #144]
>>> +
>>> +        #Call destroy_current_cpu_terminating_thread()
>>> +        bl destroy_current_cpu_terminating_thread
>>> +
>>> +2:
>>> +        ldp     x29, x30, [sp], #176
>>> +        #.cfi_def_cfa sp, 0
>>> +        ret
>>> +        #.cfi_endproc
>>> diff --git a/core/sched.cc b/core/sched.cc
>>> index 06f849d1..74b4ab95 100644
>>> --- a/core/sched.cc
>>> +++ b/core/sched.cc
>>> @@ -225,13 +225,34 @@ void thread::cputime_estimator_get(
>>>  void cpu::schedule()
>>>  {
>>>      WITH_LOCK(irq_lock) {
>>> +#ifdef AARCH64_PORT_STUB
>>> +        reschedule_from_interrupt(sched::cpu::current(), false, thyst);
>>> +#else
>>>          current()->reschedule_from_interrupt();
>>> -    }
>>> -}
>>> -
>>> +#endif
>>> +    }
>>> +}
>>> +
>>> +// In aarch64 port, the reschedule_from_interrupt() needs to be 
>>> implemented
>>> +// in assembly (please see arch/aarch64/sched.S) to give us better 
>>> control
>>> +// which registers are used and which ones are saved and restored during
>>> +// the context switch. In essence, most of the original 
>>> reschedule_from_interrupt()
>>> +// code up to switch_to() is shared with aarch64-specific 
>>> cpu::schedule_next_thread()
>>> +// that is called from reschedule_from_interrupt() in 
>>> arch/arch64/sched.S assembly.
>>> +// The logic executed after switch_to() that makes up 
>>> destroy_current_cpu_terminating_thread()
>>> +// is called from arch/arch64/sched.S as well.
>>> +// At the end, we define reschedule_from_interrupt() in C++ for x86_64 
>>> and schedule_next_thread()
>>> +// and destroy_current_cpu_terminating_thread() in C++ for aarch64.
>>> +#ifdef AARCH64_PORT_STUB
>>> +inline thread_switch_data cpu::schedule_next_thread(bool 
>>> called_from_yield,
>>> +                                                    
>>> thread_runtime::duration preempt_after)
>>> +{
>>> +    thread_switch_data switch_data;
>>> +#else
>>>  void cpu::reschedule_from_interrupt(bool called_from_yield,
>>>                                      thread_runtime::duration 
>>> preempt_after)
>>>  {
>>> +#endif
>>>      trace_sched_sched();
>>>      assert(sched::exception_depth <= 1);
>>>      need_reschedule = false;
>>> @@ -259,7 +280,11 @@ void cpu::reschedule_from_interrupt(bool 
>>> called_from_yield,
>>>          // lowest runtime, and update the timer until the next thread's 
>>> turn.
>>>          if (runqueue.empty()) {
>>>              preemption_timer.cancel();
>>> +#ifdef AARCH64_PORT_STUB
>>> +            return switch_data;
>>> +#else
>>>              return;
>>> +#endif
>>>          } else if (!called_from_yield) {
>>>              auto &t = *runqueue.begin();
>>>              if (p->_runtime.get_local() < t._runtime.get_local()) {
>>> @@ -268,7 +293,11 @@ void cpu::reschedule_from_interrupt(bool 
>>> called_from_yield,
>>>                  if (delta > 0) {
>>>                      preemption_timer.set(now + delta);
>>>                  }
>>> +#ifdef AARCH64_PORT_STUB
>>> +                return switch_data;
>>> +#else
>>>                  return;
>>> +#endif
>>>              }
>>>          }
>>>          // If we're here, p no longer has the lowest runtime. Before 
>>> queuing
>>> @@ -336,19 +365,39 @@ void cpu::reschedule_from_interrupt(bool 
>>> called_from_yield,
>>>      if (lazy_flush_tlb.exchange(false, std::memory_order_seq_cst)) {
>>>          mmu::flush_tlb_local();
>>>      }
>>> +#ifdef AARCH64_PORT_STUB
>>> +    switch_data.old_thread_state = &(p->_state);
>>> +    switch_data.new_thread_state = &(n->_state);
>>> +    return switch_data;
>>> +}
>>> +#else
>>>      n->switch_to();
>>> +#endif
>>>
>>>      // Note: after the call to n->switch_to(), we should no longer use 
>>> any of
>>>      // the local variables, nor "this" object, because we just switched 
>>> to n's
>>>      // stack and the values we can access now are those that existed in 
>>> the
>>>      // reschedule call which scheduled n out, and will now be returning.
>>>      // So to get the current cpu, we must use cpu::current(), not 
>>> "this".
>>> +#ifdef AARCH64_PORT_STUB
>>> +extern "C" void destroy_current_cpu_terminating_thread()
>>> +{
>>> +#endif
>>>      if (cpu::current()->terminating_thread) {
>>>          cpu::current()->terminating_thread->destroy();
>>>          cpu::current()->terminating_thread = nullptr;
>>>      }
>>>  }
>>>
>>> +#ifdef AARCH64_PORT_STUB
>>> +extern "C" thread_switch_data cpu_schedule_next_thread(cpu* cpu,
>>> +                                                       bool 
>>> called_from_yield,
>>> +                                                      
>>>  thread_runtime::duration preempt_after)
>>> +{
>>> +    return cpu->schedule_next_thread(called_from_yield, preempt_after);
>>> +}
>>> +#endif
>>> +
>>>  void cpu::timer_fired()
>>>  {
>>>      // nothing to do, preemption will happen if needed
>>> @@ -521,7 +570,11 @@ void thread::pin(cpu *target_cpu)
>>>          // Note that wakeme is on the same CPU, and irq is disabled,
>>>          // so it will not actually run until we stop running.
>>>          wakeme->wake_with([&] { do_wakeme = true; });
>>> +#ifdef AARCH64_PORT_STUB
>>> +        reschedule_from_interrupt(source_cpu, false, thyst);
>>> +#else
>>>          source_cpu->reschedule_from_interrupt();
>>> +#endif
>>>      }
>>>      // wakeme will be implicitly join()ed here.
>>>  }
>>> @@ -760,7 +813,11 @@ void thread::yield(thread_runtime::duration 
>>> preempt_after)
>>>      }
>>>      trace_sched_yield_switch();
>>>
>>> +#ifdef AARCH64_PORT_STUB
>>> +    reschedule_from_interrupt(cpu::current(), true, preempt_after);
>>> +#else
>>>      cpu::current()->reschedule_from_interrupt(true, preempt_after);
>>> +#endif
>>>  }
>>>
>>>  void thread::set_priority(float priority)
>>> diff --git a/include/osv/sched.hh b/include/osv/sched.hh
>>> index 0fb44f77..1dacd623 100644
>>> --- a/include/osv/sched.hh
>>> +++ b/include/osv/sched.hh
>>> @@ -31,6 +31,9 @@ typedef float runtime_t;
>>>
>>>  extern "C" {
>>>  void smp_main();
>>> +#ifdef AARCH64_PORT_STUB
>>> +void destroy_current_cpu_terminating_thread();
>>> +#endif
>>>  };
>>>  void smp_launch();
>>>
>>> @@ -321,7 +324,12 @@ constexpr thread_runtime::duration thyst = 
>>> std::chrono::milliseconds(5);
>>>  constexpr thread_runtime::duration context_switch_penalty =
>>>                                             
>>> std::chrono::microseconds(10);
>>>
>>> -
>>> +#ifdef AARCH64_PORT_STUB
>>> +struct thread_switch_data {
>>> +   thread_state* old_thread_state = nullptr;
>>> +   thread_state* new_thread_state = nullptr;
>>> +};
>>> +#endif
>>>  /**
>>>   * OSv thread
>>>   */
>>> @@ -600,7 +608,9 @@ private:
>>>              unsigned allowed_initial_states_mask = 1 << 
>>> unsigned(status::waiting));
>>>      static void sleep_impl(timer &tmr);
>>>      void main();
>>> +#ifndef AARCH64_PORT_STUB
>>>      void switch_to();
>>> +#endif
>>>      void switch_to_first();
>>>      void prepare_wait();
>>>      void wait();
>>> @@ -702,7 +712,12 @@ private:
>>>      std::vector<char*> _tls;
>>>      bool _app;
>>>      std::shared_ptr<osv::application_runtime> _app_runtime;
>>> +public: //TODO: For whatever reason we need to make this public even 
>>> though destroy_current_cpu_terminating_thread is defined as friend below, 
>>> so compiler still complains.
>>>      void destroy();
>>> +private:
>>> +#ifdef AARCH64_PORT_STUB
>>> +    friend void ::destroy_current_cpu_terminating_thread();
>>> +#endif
>>>      friend class thread_ref_guard;
>>>      friend void thread_main_c(thread* t);
>>>      friend class wait_guard;
>>> @@ -884,8 +899,13 @@ struct cpu : private timer_base::client {
>>>       * @param preempt_after fire the preemption timer after this time 
>>> period
>>>       *                      (relevant only when "called_from_yield" is 
>>> TRUE)
>>>       */
>>> +#ifdef AARCH64_PORT_STUB
>>> +    thread_switch_data schedule_next_thread(bool called_from_yield = 
>>> false,
>>> +                                            thread_runtime::duration 
>>> preempt_after = thyst);
>>> +#else
>>>      void reschedule_from_interrupt(bool called_from_yield = false,
>>> -                                thread_runtime::duration preempt_after 
>>> = thyst);
>>> +                                   thread_runtime::duration 
>>> preempt_after = thyst);
>>> +#endif
>>>      void enqueue(thread& t);
>>>      void init_idle_thread();
>>>      virtual void timer_fired() override;
>>> @@ -895,6 +915,12 @@ struct cpu : private timer_base::client {
>>>      int renormalize_count;
>>>  };
>>>
>>> +#ifdef AARCH64_PORT_STUB
>>> +extern "C" void reschedule_from_interrupt(cpu* cpu,
>>> +                                          bool called_from_yield,
>>> +                                          thread_runtime::duration 
>>> preempt_after);
>>> +#endif
>>> +
>>>  class cpu::notifier {
>>>  public:
>>>      explicit notifier(std::function<void ()> cpu_up);
>>> @@ -996,7 +1022,11 @@ inline bool preemptable()
>>>  inline void preempt()
>>>  {
>>>      if (preemptable()) {
>>> +#ifdef AARCH64_PORT_STUB
>>> +        reschedule_from_interrupt(sched::cpu::current(), false, thyst);
>>> +#else
>>>          sched::cpu::current()->reschedule_from_interrupt();
>>> +#endif
>>>      } else {
>>>          // preempt_enable() will pick this up eventually
>>>          need_reschedule = true;
>>> -- 
>>> 2.27.0
>>>
>>>

-- 
You received this message because you are subscribed to the Google Groups "OSv 
Development" group.