Thanks for a very thorough explanation.

> On Nov 28, 2018, at 03:58, Nadav Har'El <[email protected]> wrote:
>
>> On Wed, Nov 28, 2018 at 5:54 AM Waldek Kozaczuk <[email protected]> wrote:
>> I do not think I understand the mechanics of how a context switch is handled
>> in OSv but fundamentally whenever current thread is preempted for whatever
>> reason (timer expired, etc), OSv thread scheduler must somehow save all 16
>> registers, flags and everything else but FPU state somewhere and then upon
>> selecting new thread for execution it must restore values of its 16
>> registers plus everything else so that the newly switched thread sees "the
>> world" as it was when it was last running. I think more or less it is done
>> through TSS (task state segment) for each cpu and saving/restoring happens
>> automatically (or not?).
>
> A context switch (from one thread to another thread) can happen in two ways.
> The first is voluntary - the first thread calls some blocking function
> (sleep(), read(), etc.) and goes to sleep and is replaced by another thread.
> The second is involuntary - we get an asynchronous event during the run of the
> thread - an interrupt, exception, or signal - and the scheduler decides to do
> a context switch at that point.
>
> In the first case (voluntary switching) our life is easy. The user thread
> called a *function*. The ABI requires that when calling a function, the
> caller must save many of the registers (known as "caller-saved") himself
> because these can be used by the called function. It also requires that the
> caller cannot assume anything about the FPU state (because the callee is
> allowed to use the FPU too!). For this reason the thread context switch code
> (switching between user threads) only needs to save some of the registers,
> and does NOT need to save the FPU state.
>
> The situation is different with involuntary context switches. When an
> asynchronous event, e.g., an interrupt, occurs, the user thread is in a
> random position in the code.
> It may be using all its registers, and the FPU
> state (including the old-style FPU and the new SSE and AVX registers).
> Because our interrupt handler (which may do anything from running the
> scheduler to reading a page from disk on page fault) may need to use any of
> these registers, all of them, including the FPU, need to be saved at
> interrupt time. The interrupt has a separate stack, and the FPU is saved on
> this stack (see fpu_lock use in interrupt()). When the interrupt finishes,
> this FPU is restored. This includes involuntary context switching: thread A
> receives an interrupt, saves the FPU, does something and decides to switch to
> thread B, and a while later we switch back to thread A at which point the
> interrupt handler "returns" and restores the FPU state.

Does the involuntary case include the scenario where the time slice the
scheduler gave the current thread expires? I would imagine this would qualify
as an interrupt?

>
> I can't think of a case where the user thread (doing that printf loop) can
> get interrupted in the middle but we *don't* save the FPU. So if we do, there
> remains the possibility that we manage to ruin it before saving (this was
> issue 4349bfd3583df03d44cda480049e630081d3a20b) or that something else
> entirely overwrites and corrupts the saved state.
>
> An idea worth trying: We can verify that the latter (corruption of the saved
> state) is not the problem by adding a simple checksum to the saved state in
> fpu_lock, and before restoring it, check that the checksum is intact.

I will give it a try.

>
>>
>> In any case as far as FPU state goes there are a number of places (interrupts,
>> page fault, syscall handler) where fpu_lock construct is used to isolate any
>> changes to FPU registers made by OSv code executed by corresponding code for
>> these interrupts, etc. That way OSv code should not corrupt any FPU state.
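
For the checksum experiment suggested above, here is a minimal sketch of what
I have in mind. It is not OSv's actual fpu_lock code: saved_fpu_state, seal(),
intact() and verify_before_restore() are hypothetical names, the 512-byte
buffer only stands in for the legacy FXSAVE area (an XSAVE area with AVX state
would be larger), and FNV-1a is just one cheap checksum that could be used.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the FPU save area kept on the interrupt stack.
// 512 bytes matches the legacy FXSAVE region; a real XSAVE area is larger.
struct saved_fpu_state {
    std::array<std::uint8_t, 512> area{};
    std::uint32_t checksum = 0;
};

// 32-bit FNV-1a hash over a byte range.
inline std::uint32_t fnv1a(const std::uint8_t* p, std::size_t n)
{
    std::uint32_t h = 2166136261u;
    for (std::size_t i = 0; i < n; ++i) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

// Call right after the FPU state has been saved.
inline void seal(saved_fpu_state& s)
{
    s.checksum = fnv1a(s.area.data(), s.area.size());
}

// Returns true if the saved bytes still match the recorded checksum.
inline bool intact(const saved_fpu_state& s)
{
    return s.checksum == fnv1a(s.area.data(), s.area.size());
}

// Call right before restoring; aborts if the saved area was overwritten.
inline void verify_before_restore(const saved_fpu_state& s)
{
    assert(intact(s) && "saved FPU state was corrupted before restore");
}
```

If the assertion ever fires, something overwrote the saved area between save
and restore. If it never fires while the hang still reproduces, the state was
more likely damaged before it was saved.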
>>
>> But what mechanism does exist in OSv to make sure that concurrent
>> application threads that use floating point registers (like vfprintf used by
>> httpserver thread or any other floating point arithmetic used by ffmpeg
>> threads for example) do NOT corrupt each other's FPU state when preempted?
>>
>> Here is what I found on internet (https://wiki.osdev.org/Context_Switching):
>>
>> "The FPU/MMX and SSE state could be saved and reloaded, but the CPU can also
>> be tricked into generating an exception the first time that an FPU/MMX or
>> SSE instruction is used by copying the hardware context switch mechanism
>> (setting the TS flag in CR0)."
>
> OSv does not use the lazy FPU trick. This trick has rather gone out of style
> when use of the FPU became much more common, especially with modern
> compilers generating SSE instructions for things that have nothing to do with
> floating point.
>
> The way this trick worked was to say, on context switch, that the processor
> does *not* have an FPU (by setting the TS bit in the CR0 register), and not
> save (yet) the outgoing thread's FPU state. When, hopefully a long time
> later, another thread uses the first FPU instruction, it will get a #NM
> exception (no-math, i.e., no FPU available). The OS catches this exception,
> and only *then* saves the old FPU state, loads the current thread's saved FPU
> state, and restarts the instruction (and this time it will work).
>
> But OSv does not do this; it would be rather pointless when a lot of code -
> including the kernel's scheduler - uses FPU instructions.
>
>>
>> Does this suggest it should be possible to register an exception handler
>> that would be called when first FPU operation is executed to mark to save
>> FPU state when preempted and then restore when coming back to this thread?
>> But I do not see where such code exists in OSv. Or have I missed it?
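
To check my understanding of the lazy trick described above (which OSv does
not use), here is a toy user-space model of it. Nothing here touches real
hardware: the cpu and thread structs, the ts flag standing in for CR0.TS, the
single double standing in for the whole FPU register file, and handle_nm()
standing in for the #NM handler are all made up for illustration.

```cpp
#include <cassert>

// Toy model of a thread: only its saved FPU contents matter here.
struct thread {
    double saved_fpu = 0.0;
};

// Toy model of one CPU using lazy FPU switching.
struct cpu {
    bool ts = false;             // models the TS bit in CR0
    double fpu = 0.0;            // models the live FPU registers
    thread* fpu_owner = nullptr; // thread whose state is live in the FPU
    thread* current = nullptr;   // thread currently running
    int nm_faults = 0;           // how many #NM exceptions were taken

    // Lazy context switch: mark the FPU unavailable, save nothing yet.
    void context_switch(thread& next)
    {
        current = &next;
        ts = true;
    }

    // Models the #NM handler: only now save the old owner's state and
    // load the current thread's state, then make the FPU available.
    void handle_nm()
    {
        ++nm_faults;
        if (fpu_owner != current) {
            if (fpu_owner) {
                fpu_owner->saved_fpu = fpu;
            }
            fpu = current->saved_fpu;
            fpu_owner = current;
        }
        ts = false;
    }

    // Models executing an FPU instruction: faults first if TS is set.
    double& fpu_access()
    {
        assert(current != nullptr);
        if (ts) {
            handle_nm();
        }
        return fpu;
    }
};
```

In this model, switching from thread A to thread B leaves A's value sitting in
the live registers until B first touches the FPU. Only at that point is A's
state saved. A thread that never uses the FPU costs nothing, which is the whole
appeal, and also why it stops paying off once nearly every thread uses SSE.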
>>
>> Waldek
>>
>>> On Tuesday, November 27, 2018 at 6:30:03 PM UTC-5, Waldek Kozaczuk wrote:
>>> I was also reading about "lazy" vs "eager" FPU state save/restore in Linux
>>> and how due to some security reasons they are advocating users to switch to
>>> the eager. I think the eager means that on each context switch FPU state
>>> gets saved/restored regardless if FPU registers are used.
>>>
>>> Is OSv using "eager" or "lazy" strategy? I am guessing probably the lazy
>>> one.
>>>
>>> Also I am not sure about cassandra (and original older bug) but ffmpeg is
>>> very heavy on floating point arithmetic so maybe that exposes FPU bugs more
>>> easily.
>>>
>>>> On Tue, Nov 27, 2018 at 6:07 PM Waldek Kozaczuk <[email protected]>
>>>> wrote:
>>>> I also checked if y == 0 in Gdb when I connected after the crash and
>>>> indeed it was true.
>>>>
>>>> What about eflags which are set by FUCOMI? Are we saving those?
>>>>
>>>> Sent from my iPhone
>>>>
>>>>> On Nov 27, 2018, at 17:53, Nadav Har'El <[email protected]> wrote:
>>>>>
>>>>>
>>>>>> On Tue, Nov 27, 2018 at 10:11 PM Nadav Har'El <[email protected]> wrote:
>>>>>> Indeed, seems like a loop that works on fpu registers and stack. The
>>>>>> actual loop's test, while(y) is the "fucomi" instruction which compares
>>>>>> two floating point values one of which being a zero created by "fldz".
>>>>>> My completely unproven suspicion is that in the middle of this loop we
>>>>>> get an interrupt (possibly also leading to a context switch, running
>>>>>> another thread, and only much later returning to this thread), and for
>>>>>> some reason the floating point state (which includes the register stack,
>>>>>> etc.) is not saved correctly - or not restored correctly (perhaps
>>>>>> restored from a corrupted array?). If after such corruption, "y" (in
>>>>>> whatever register it sits) becomes, for example, NaN, the loop will
>>>>>> never finish.
>>>>>> I wonder if we can print these registers from gdb to see
>>>>>> if perhaps gdb showing "y=0" isn't really correct.
>>>>>
>>>>>
>>>>> Ok, so I started theorizing what might cause this...
>>>>> If I remember correctly, OSv currently always saves the FPU state on some
>>>>> stack, using the fpu_lock type.
>>>>> Could we possibly be using stacks which are too small to hold this FPU
>>>>> state?
>>>>> In arch/x64/arch-cpu.hh we set a 4096 byte stack for nested exceptions,
>>>>> 4096 byte stack for interrupts, and 4096*4 byte stack for normal
>>>>> exceptions. Maybe one of these is too small? If you can easily reproduce
>>>>> this bug, can you add a zero to all of these and see if maybe the bug
>>>>> goes away with bigger stacks?
>>>>>
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "OSv Development" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
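
On the earlier theory that a corrupted "y" (for example NaN) would keep the
while(y) loop from ever finishing: a small sketch that shows the effect in
isolation. count_iterations() is a hypothetical helper, only loosely modeled on
a digit-extraction loop of the form do { ... } while (y). It is not the actual
vfprintf code, and the iteration cap exists only so that the demonstration
terminates.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Runs a loop of the form `do { ... } while (y);` that peels off one hex
// digit per iteration, but gives up after max_iters iterations.
// Returns the number of iterations performed.
// Precondition: y is NaN, or 0 <= y < 16.
inline int count_iterations(long double y, int max_iters)
{
    assert(std::isnan(y) || (y >= 0 && y < 16));
    int n = 0;
    do {
        // Converting NaN to int is undefined behavior, so use 0 for it.
        int digit = std::isnan(y) ? 0 : static_cast<int>(y);
        y = (y - digit) * 16;  // NaN stays NaN here
        ++n;
    } while (y != 0 && n < max_iters);  // NaN != 0 is always true
    return n;
}

// Convenience for the demonstration below.
inline long double make_nan()
{
    return std::numeric_limits<long double>::quiet_NaN();
}
```

With an ordinary value such as 1.5 the loop ends after two iterations, whereas
count_iterations(make_nan(), 1000) runs into the cap, because every comparison
of NaN against zero reports "not equal". That would be consistent with a thread
spinning forever if its FPU state were restored incorrectly.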
