Thanks for a very thorough explanation.  

> On Nov 28, 2018, at 03:58, Nadav Har'El <[email protected]> wrote:
> 
>> On Wed, Nov 28, 2018 at 5:54 AM Waldek Kozaczuk <[email protected]> wrote:
>> I do not think I understand the mechanics of how a context switch is handled 
>> in OSv, but fundamentally, whenever the current thread is preempted for 
>> whatever reason (timer expired, etc.), the OSv thread scheduler must somehow 
>> save all 16 registers, flags and everything else except the FPU state 
>> somewhere, and then upon selecting a new thread for execution it must restore 
>> the values of its 16 registers plus everything else, so that the newly 
>> switched thread sees "the world" as it was when it was last running. I think 
>> more or less it is done through the TSS (task state segment) for each CPU, 
>> and saving/restoring happens automatically (or not?).
> 
> A context switch (from one thread to another thread) can happen in two ways. 
> The first is voluntary - the first thread calls some blocking function 
> (sleep(), read(), etc.), goes to sleep, and is replaced by another thread. The 
> second is involuntary - we get an asynchronous event during the run of the 
> thread - an interrupt, exception, or signal - and the scheduler decides to do 
> a context switch at that point.
> 
> In the first case (voluntary switching) our life is easy. The user thread 
> called a *function*. The ABI requires that when calling a function, the 
> caller must save many of the registers (known as "caller-saved") itself 
> because these can be used by the called function. It also requires that the 
> caller cannot assume anything about the FPU state (because the callee is 
> allowed to use the FPU too!). For this reason the thread context switch code 
> (switching between user threads) only needs to save some of the registers, 
> and does NOT need to save the FPU state.
> 
> The situation is different with involuntary context switches. When an 
> asynchronous event, e.g., an interrupt, occurs, the user thread is in a 
> random position in the code. It may be using all its registers, and the FPU 
> state (including the old-style FPU and the new SSE and AVX registers). 
> Because our interrupt handler (which may do anything from running the 
> scheduler to reading a page from disk on page fault) may need to use any of 
> these registers, all of them, including the FPU state, need to be saved at 
> interrupt time. The interrupt has a separate stack, and the FPU state is 
> saved on this stack (see the use of fpu_lock in interrupt()). When the 
> interrupt finishes, this FPU state is restored. This includes involuntary 
> context switching: thread A receives an interrupt, saves the FPU state, does 
> something and decides to switch to thread B, and a while later we switch back 
> to thread A, at which point the interrupt handler "returns" and restores the 
> FPU state.
Does the involuntary case include the scenario where the time designated for the 
current thread by the scheduler expires? I would imagine this would qualify as 
an interrupt?
> 
> I can't think of a case where the user thread (doing that printf loop) can 
> get interrupted in the middle but we *don't* save the FPU. So if we do, there 
> remains the possibility that we manage to ruin it before saving (this was 
> issue 4349bfd3583df03d44cda480049e630081d3a20b) or that something else 
> entirely overwrites and corrupts the saved state.
> 
> An idea worth trying: We can verify that the latter (corruption of the saved 
> state) is not the problem by adding a simple checksum to the saved state in 
> fpu_lock, and before restoring it, check that the checksum is intact.
I will give it a try. 
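
A minimal sketch of what such a check could look like. Everything here is an assumption for illustration: saved_fpu_state is a hypothetical stand-in for the area that fpu_lock saves into (the real layout and size in OSv may differ), and FNV-1a is picked only because it is short and uses integer arithmetic.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the saved FPU area; not the real OSv layout.
struct saved_fpu_state {
    alignas(64) unsigned char area[512];
    std::uint32_t checksum;
};

// FNV-1a hash over the saved bytes (integer-only arithmetic).
inline std::uint32_t fpu_checksum(const unsigned char* p, std::size_t n) {
    std::uint32_t h = 2166136261u;
    for (std::size_t i = 0; i < n; ++i) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

// Call right after the FPU state has been saved into s.area.
inline void seal(saved_fpu_state& s) {
    s.checksum = fpu_checksum(s.area, sizeof(s.area));
}

// Call right before restoring; false means the saved bytes changed
// (or the checksum field itself was overwritten).
inline bool intact(const saved_fpu_state& s) {
    return s.checksum == fpu_checksum(s.area, sizeof(s.area));
}
```

The idea would be to call seal() right after the save and assert(intact(...)) right before the restore, so that a failed assert points at corruption of the saved state rather than at the save itself.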
>  
>> 
>> In any case, as far as FPU state goes, there are a number of places 
>> (interrupts, page faults, the syscall handler) where the fpu_lock construct 
>> is used to isolate any changes to FPU registers made by the OSv code that 
>> handles them. That way OSv code should not corrupt any FPU state.
>> 
>> But what mechanism exists in OSv to make sure that concurrent 
>> application threads that use floating point registers (like vfprintf used by 
>> the httpserver thread, or any other floating point arithmetic used by ffmpeg 
>> threads, for example) do NOT corrupt each other's FPU state when preempted? 
>> 
>> Here is what I found on the internet (https://wiki.osdev.org/Context_Switching):
>> 
>> "The FPU/MMX and SSE state could be saved and reloaded, but the CPU can also 
>> be tricked into generating an exception the first time that an FPU/MMX or 
>> SSE instruction is used by copying the hardware context switch mechanism 
>> (setting the TS flag in CR0)."
> 
> OSv does not use the lazy FPU trick. This trick rather went out of style 
> when use of the FPU became much more common, especially with modern 
> compilers generating SSE instructions for things that have nothing to do with 
> floating point.
> 
> The way this trick worked was to say, on context switch, that the processor 
> does *not* have an FPU (by setting the TS bit in the CR0 register), and not 
> save (yet) the outgoing thread's FPU state. When, hopefully a long time 
> later, another thread uses the first FPU instruction, it will get a #NM 
> exception (no-math, i.e., no FPU available). The OS catches this exception, 
> and only *then* saves the old FPU state, loads the current thread's saved FPU 
> state, and restarts the instruction (and this time it will work).
> 
> But OSv does not do this; it would be rather pointless when a lot of code - 
> including the kernel's scheduler - uses FPU instructions.
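
To make the lazy scheme described above concrete, here is a toy user-space model of it. This is not OSv code (OSv does not do this), and every name in it is hypothetical: the cpu and thread structs, the ts flag standing in for CR0.TS, and handle_nm() standing in for the #NM exception handler.

```cpp
#include <array>
#include <cassert>

struct fpu_regs { std::array<double, 8> st{}; };

struct thread {
    fpu_regs saved;              // FPU state while this thread does not own the FPU
};

struct cpu {
    fpu_regs fpu;                // the single "hardware" FPU
    bool ts = false;             // models the CR0.TS bit
    thread* current = nullptr;   // thread currently running
    thread* fpu_owner = nullptr; // thread whose state is live in `fpu`
    int nm_traps = 0;            // how many #NM exceptions were taken

    // Context switch: do NOT save the FPU, just arm the trap.
    void switch_to(thread& t) {
        current = &t;
        ts = true;
    }

    // Models the #NM handler: only now save the old state and load the new.
    void handle_nm() {
        assert(current != nullptr);
        ++nm_traps;
        if (fpu_owner != current) {
            if (fpu_owner) fpu_owner->saved = fpu;
            fpu = current->saved;
            fpu_owner = current;
        }
        ts = false;              // the "restarted" instruction will now succeed
    }

    // Every FPU access goes through here, like an FPU instruction would.
    double& st0() {
        if (ts) handle_nm();
        return fpu.st[0];
    }
};
```

In this model a thread that never touches st0() after being switched in costs no save or restore at all, which is the whole point of the trick. Once nearly every thread touches the FPU, every switch pays for a trap on top of the save and restore.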
> 
>> 
>> Does this suggest it should be possible to register an exception handler 
>> that would be called when the first FPU operation is executed, to mark that 
>> the FPU state must be saved when the thread is preempted and then restored 
>> when coming back to this thread? But I do not see where such code exists in 
>> OSv. Or have I missed it?
>> 
>> Waldek
>> 
>>> On Tuesday, November 27, 2018 at 6:30:03 PM UTC-5, Waldek Kozaczuk wrote:
>>> I was also reading about "lazy" vs "eager" FPU state save/restore in Linux 
>>> and how, for security reasons, they are advising users to switch to 
>>> eager. I think eager means that on each context switch the FPU state 
>>> gets saved/restored regardless of whether FPU registers are used.
>>> 
>>> Is OSv using the "eager" or the "lazy" strategy? I am guessing probably the lazy 
>>> one. 
>>> 
>>> Also I am not sure about cassandra (and the original older bug), but ffmpeg 
>>> is very heavy on floating point arithmetic, so maybe that exposes FPU bugs 
>>> more easily. 
>>> 
>>>> On Tue, Nov 27, 2018 at 6:07 PM Waldek Kozaczuk <[email protected]> 
>>>> wrote:
>>>> I also checked if y == 0 in gdb when I connected after the crash, and 
>>>> indeed it was true. 
>>>> 
>>>> What about eflags which are set by FUCOMI? Are we saving those?
>>>> 
>>>>> On Nov 27, 2018, at 17:53, Nadav Har'El <[email protected]> wrote:
>>>>> 
>>>>> 
>>>>>> On Tue, Nov 27, 2018 at 10:11 PM Nadav Har'El <[email protected]> wrote:
>>>>>> Indeed, it seems like a loop that works on FPU registers and stack. The 
>>>>>> actual loop's test, while(y), is the "fucomi" instruction, which compares 
>>>>>> two floating point values, one of which is a zero created by "fldz". 
>>>>>> My completely unproven suspicion is that in the middle of this loop we 
>>>>>> get an interrupt (possibly also leading to a context switch, running 
>>>>>> another thread, and only much later returning to this thread), and for 
>>>>>> some reason the floating point state (which includes the register stack, 
>>>>>> etc.) is not saved correctly - or not restored correctly (perhaps 
>>>>>> restored from a corrupted array?). If after such corruption, "y" (in 
>>>>>> whatever register it sits) becomes, for example, NaN, the loop will 
>>>>>> never finish. I wonder if we can print these registers from gdb to see 
>>>>>> if perhaps gdb showing "y=0" isn't really correct.
>>>>> 
>>>>> 
>>>>> Ok, so I started theorizing what might cause this...
>>>>> If I remember correctly, OSv currently always saves the FPU state on some 
>>>>> stack, using the fpu_lock type.
>>>>> Could we possibly be using stacks which are too small to hold this FPU 
>>>>> state?
>>>>> In arch/x64/arch-cpu.hh we set a 4096 byte stack for nested exceptions, 
>>>>> 4096 byte stack for interrupts, and 4096*4 byte stack for normal 
>>>>> exceptions. Maybe one of these is too small? If you can easily reproduce 
>>>>> this bug, can you add a zero to all of these and see if maybe the bug 
>>>>> goes away with bigger stacks?
>>>>> 
>> 
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "OSv Development" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
