[osv-dev] Re: [PATCH] Reset FPU state after saving it and upon context switch

Waldek Kozaczuk Fri, 25 Jan 2019 15:31:41 -0800

Thanks for reviewing this patch.

On Wednesday, January 23, 2019 at 5:10:56 PM UTC-5, Nadav Har'El wrote:
>
> On Wed, Dec 19, 2018 at 2:26 PM Waldemar Kozaczuk <[email protected] 
> <javascript:>> wrote:
>
>> This patch adds "emms" instruction in critical places
>>
>
> Thank you so much for working on this, and sorry for having delayed my 
> review for SO long.
> I'm basically happy with this patch except one thing that nags me - I 
> don't understand why the
> *second* EMMS call, the one in switch_to(), is necessary. And I'd really 
> like to understand.
> I have a *guess* below why it might be necessary, and I would be really 
> grateful if you could
> check my guess. I also have a couple of other small comments below.
>  
>
>> of OSv thread scheduler code to reset FPU state after it is saved
>> upon involuntary context switch (for example interrupt) and right
>> after executing context switch on new thread to clear FPU status word.
>> Clearing FPU status word is necessary because old thread we are switching
>> from, might have used MMX instructions and any code executing after that
>> - new thread we are switching to - on the same CPU that needs to execute
>> traditional FPU instructions would fail with FPU overflow exception 
>> leading
>> to all kind of mysterious and nasty bugs. For specific description look
>> at the comments added to each line adding emms instruction.
>>
>> This patch also adds new unit test that verifies that FPU status
>> word is cleared upon involuntary context switch from thread using MMX
>> instruction to a thread that uses traditional FPU instructions -
>> printf of a float number.
>>
>> Please note there are cases other than involuntary context switch
>> where FPU gets into non-clean state right after context switch which
>> may lead to other kinds of bugs like infinite loop in printf leading
>> to a page fault (see #536, #1010 and #1018). This patch fixes those as 
>> well
>> however we have not been able to provide a unit test that would 
>> demonstrate
>> this.
>>
>> Fixes #1019
>> Fixes #1020
>>
>> Fixes #536 - based on around 10 successful test runs where it would crash 
>> every other
>> time before the patch
>>
>> Fixes #1010 - based on over 100 successful test runs where it would crash 
>> almost every time
>>
>> Fixes #1018 - based on over successful 50 test runs with /os/threads API 
>> being used
>> in tight loop and before patch would crash sporadically
>>
>> Signed-off-by: Waldemar Kozaczuk <[email protected] <javascript:>>
>> ---
>>  arch/x64/arch-cpu.cc    | 10 ++++++
>>  arch/x64/arch-switch.hh |  8 +++++
>>  modules/tests/Makefile  |  4 +--
>>  tests/tst-mmx-fpu.cc    | 69 +++++++++++++++++++++++++++++++++++++++++
>>  4 files changed, 89 insertions(+), 2 deletions(-)
>>  create mode 100644 tests/tst-mmx-fpu.cc
>>
>> diff --git a/arch/x64/arch-cpu.cc b/arch/x64/arch-cpu.cc
>> index bb96af3d..f3bab022 100644
>> --- a/arch/x64/arch-cpu.cc
>> +++ b/arch/x64/arch-cpu.cc
>> @@ -61,11 +61,21 @@ void fpu_state_init_fxsave(processor::fpu_state *s) {
>>  extern "C"
>>  void fpu_state_save_xsave(processor::fpu_state *s) {
>>      processor::xsave(s, -1);
>> +    // Reset FPU state after saving it as the thread we are about
>> +    // to switch from might have been using MMX registers and
>> +    // the thread scheduler that does FPU calculations needs
>> +    // FPU to be a clean state
>> +    asm volatile("emms");
>>  }
>>
>>  extern "C"
>>  void fpu_state_save_fxsave(processor::fpu_state *s) {
>>      processor::fxsave(s);
>> +    // Reset FPU state after saving it as the thread we are about
>> +    // to switch from might have been using MMX registers and
>> +    // the thread scheduler that does FPU calculations needs
>> +    // FPU to be a clean state
>> +    asm volatile("emms");
>>
>
> Nitpick: Here you duplicate this code (and comment) twice, but this was 
> supposed to be a low-level
> function that does either xsave or fxsave, only. Maybe it makes more sense 
> to add the extra "emms" 
> call (and comment) in arch/x64/arch-cpu.hh, in the save_fpu::save() 
> function, so it will only need to
> appear once?
>
I do not remember why I did not do this. Will try to remove this 
duplication.


>
>  }
>>
>>  extern "C"
>> diff --git a/arch/x64/arch-switch.hh b/arch/x64/arch-switch.hh
>> index e3769540..08db318a 100644
>> --- a/arch/x64/arch-switch.hh
>> +++ b/arch/x64/arch-switch.hh
>> @@ -103,6 +103,14 @@ void thread::switch_to()
>>             [rip]"i"(offsetof(thread_state, rip))
>>           : "rbx", "rdx", "rsi", "rdi", "r8", "r9",
>>             "r10", "r11", "r12", "r13", "r14", "r15", "memory");
>> +    // As the catch-all solution, reset FPU state and more specifically
>> +    // its status word as the old thread we have just switched from due 
>> to voluntary
>> +    // or involuntary reason may have been using MMX or caused FPU to get
>> +    // in a state where it needs to be cleared. Otherwise the new thread 
>> we have
>> +    // just switched to, may experience FPU overflow exceptions leading
>> +    // to very nasty bugs like infinite loop or printing wrong floating
>> +    // point number
>>
>
> I'm not sure I fully understand this comment, or why this "emms" is needed
> at all. I tried to explain in 
> https://github.com/cloudius-systems/osv/issues/1020 that
> it was *not* needed, but if I remember correctly you discovered that it 
> *is* needed
> so my explanation was obviously wrong somehow.
>
> So I'm really curious *why* this EMMS is needed beyond the one we added to 
> the
> FPU save) because it's possible that the actual scheduler code used the 
> FPU and
> then *inlined *the thread::switch_to() causing switch_to() to begin with 
> an unclear
> FPU stack. Can you please help me verify if this is (or isn't) the reason 
> why this
> second EMMS is needed? You can check this by adding  __attribute__ 
> ((noinline)
> to thread::switch_to(). If this makes this EMMS call unnecessary, we found 
> the
> reason (if this EMMS is still necessary, then we still don't understand 
> something...).
>

This patch did not help fix the issue #1010 (I did not test #1018 and #536 
this time but given the stack trace looks identical I think I am guessing 
"noinline" fix would not help there either. But who knows. 

Here is how I changed the code on top of what this patch does:
diff --git a/arch/x64/arch-switch.hh b/arch/x64/arch-switch.hh
index 08db318a..8d685ebd 100644
--- a/arch/x64/arch-switch.hh
+++ b/arch/x64/arch-switch.hh
@@ -74,7 +74,8 @@ void (*resolve_set_fsbase(void))(u64 v)

 void set_fsbase(u64 v) __attribute__((ifunc("resolve_set_fsbase")));

-void thread::switch_to()
+__attribute__((noinline)) void thread::switch_to()
+//void thread::switch_to()
 {
     thread* old = current();
     // writing to fs_base invalidates memory accesses, so surround with
@@ -110,7 +111,7 @@ void thread::switch_to()
     // just switched to, may experience FPU overflow exceptions leading
     // to very nasty bugs like infinite loop or printing wrong floating
     // point number
-    asm volatile ("emms");
+    //asm volatile ("emms");
     processor::fldcw(fpucw);
     processor::ldmxcsr(mxcsr);
 }

Also running this before and after applying your suggestion did not change 
anything (it seems to me that switch_to was not inlined in first place, is 
my thinking wrong?). Otherwise the "switch_to" symbol should not show up in 
the elf, should it?
readelf -s build/release/loader-stripped.elf | grep switch
  1566: 00000000003f69c0   124 FUNC    GLOBAL DEFAULT    2 
_ZN5sched6thread15switch_
  9593: 00000000003f6910   172 FUNC    GLOBAL DEFAULT    2 
_ZN5sched6thread9switch_t
 14840: 00000000003a2330    21 FUNC    GLOBAL DEFAULT    2 
_ZN3mmu29switch_to_runtim

Finally OSv scheduler for sure does do floating point calculation, but is 
it a way that corrupts FPU, I do not know. I believe 
thread_runtime::time_until() 
calls taulog() that uses floats to calculate some expression. See 
https://github.com/cloudius-systems/osv/blob/a3cd022fcda2c88eae89476aa6c29e3c4be04926/core/sched.cc#L1759-L1773.
 
The time_until() is called by cpu::reschedule_from_interrupt() which 
eventually calls switch_to().

>
> By the way, which of the various tests you ran allows you to check that 
> this EMMS
> here is necessary?
>
The second EMMS in switch_to() is necessary to fix #536, #1010 and #1018. 
The #1010 is easiest to reproduce - I added original steps there. I you 
have time I am sending a patch with some shell scripts that let you 
replicate this issue (look 
at fpu_test_scripts/run_ffmpeg_extract_frames.sh).
 

>
> P.S. instead of a long comment which I'm not sure is 100% accurate, you
> can also just point to issue 1020 in the comment.
>
> Sure. 

>
> +    asm volatile ("emms");
>>      processor::fldcw(fpucw);
>>      processor::ldmxcsr(mxcsr);
>>  }
>> diff --git a/modules/tests/Makefile b/modules/tests/Makefile
>> index ece5f846..5c94e549 100644
>> --- a/modules/tests/Makefile
>> +++ b/modules/tests/Makefile
>> @@ -116,8 +116,8 @@ tests := tst-pthread.so misc-ramdisk.so tst-vblk.so 
>> tst-bsd-evh.so \
>>         tst-ifaddrs.so tst-pthread-affinity-inherit.so 
>> tst-sem-timed-wait.so \
>>         tst-ttyname.so tst-pthread-barrier.so tst-feexcept.so tst-math.so 
>> \
>>         tst-sigaltstack.so tst-fread.so tst-tcp-cork.so tst-tcp-v6.so \
>> -       tst-calloc.so tst-crypt.so tst-non-fpic.so tst-small-malloc.so
>> -
>> +       tst-calloc.so tst-crypt.so tst-non-fpic.so tst-small-malloc.so \
>> +       tst-mmx-fpu.so
>>  #      libstatic-thread-variable.so tst-static-thread-variable.so \
>>
>>  tests += testrunner.so
>> diff --git a/tests/tst-mmx-fpu.cc b/tests/tst-mmx-fpu.cc
>> new file mode 100644
>> index 00000000..163d9f26
>> --- /dev/null
>> +++ b/tests/tst-mmx-fpu.cc
>> @@ -0,0 +1,69 @@
>> +#include <stdio.h>
>> +#include <unistd.h>
>> +#include <pthread.h>
>> +#include <math.h>
>> +#include <assert.h>
>> +#include <string.h>
>> +#include <stdlib.h>
>> +#include <time.h>
>> +#include <atomic>
>> +
>> +static std::atomic<bool> mmx_finished = { false };
>> +
>> +static void *mmx(void *arg) {
>> +    printf("MMX START: %f\n", 10.5);
>> +
>> +    // Execute some non-sensical code using MMX instructions
>> +    // in long-enough loop to be preempted by OSv thread scheduler
>> +    for (unsigned long i = 0; i < 500000000l; i++) {
>> +        asm volatile ("movq $0x0fffffff, %rdx \n\t");
>> +
>> +        asm volatile ("movq %rdx, %mm0 \n\t");
>> +        asm volatile ("movq %rdx, %mm1 \n\t");
>> +        asm volatile ("paddsb %mm0, %mm1 \n\t");
>> +    }
>> +    asm volatile ("emms");
>> +
>> +    printf("MMX END: %f\n", 11.5);
>> +    return NULL;
>> +}
>> +
>> +static void *fpu(void *arg) {
>> +    printf("FPU START: %f\n", 10.5);
>> +
>> +    struct timespec ts;
>> +    ts.tv_sec = 0;
>> +    ts.tv_nsec = 1000;
>> +
>> +    char buffer[10];
>> +    while (!mmx_finished.load()) {
>> +        sprintf(buffer, "%f", 0.5);
>> +        assert(!strcmp(buffer, "0.500000"));
>> +        nanosleep(&ts, NULL);
>> +    }
>> +
>> +    printf("FPU END: %f\n", 11.5);
>> +    return NULL;
>> +}
>> +
>> +// This test verifies that two threads - one using MMX instructions
>> +// and another one using FPU and yielding (see sleep()) - can execute
>> +// correctly without affecting each other. To that end OSv
>> +// needs to make sure that CPU is properly reset after MMX code
>> +// gets preempted so that code on new thread can safely execute
>> +// code using traditional FPU instructions
>>
>
> Would be nice to refer to the relevant issue and say something about what
> happened before it was fixed (i.e., did this test crash? fail on an 
> assertion
> above? or what?).
>
By the way it is enough to have emms instruction added to  fpu_state_save_* 
functions  
to have this unit test pass. But it is not enough to fix #536, #1010 and 
#1018 -> the second emms in switch_to() is necessary.

>
> +//
>> +int main() {
>> +    printf("Hello from C code, %f\n", NAN);
>> +
>> +    pthread_t mmx_thread, fpu_thread;
>> +
>> +    pthread_create(&mmx_thread, NULL, mmx, NULL);
>> +    pthread_create(&fpu_thread, NULL, fpu, NULL);
>> +
>> +    pthread_join(mmx_thread, NULL);
>> +    mmx_finished.store(true);
>> +    pthread_join(fpu_thread, NULL);
>> +
>> +    return 0;
>> +}
>> -- 
>> 2.19.1
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "OSv Development" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected] <javascript:>.
>> For more options, visit https://groups.google.com/d/optout.
>>
>

-- 
You received this message because you are subscribed to the Google Groups "OSv 
Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

[osv-dev] Re: [PATCH] Reset FPU state after saving it and upon context switch

Reply via email to