Please note that adding emms instruction right after saving FPU state is 
enough to make the new unit test pass, however it is not enough to fix 
issues #536, #1010 and #1018. Which means that somehow sometimes in 
voluntary task switch scenarios (sleep, etc) FPU gets in a non-clean state. 
So adding emms right after the context switch on new thread (arch-cpu.cc) 
cleans FPU status word and sets new thread in proper state to execute FPU 
instructions. 

I do not understand what would cause FPU to get in such state in voluntary 
task switch scenarios - possibly some bugs in user code, GCC or some 
underlying assumption that kernel typically does not use FPU and OSv does. 
So even though this patch fixes scenarios like #536 and #101, it leaves me 
worrying a bit that there may be cases when FPU is in a non-clean state 
when our thread scheduler calculates time duration 
(see 
https://github.com/cloudius-systems/osv/blob/master/core/sched.cc#L94-L117) 
and ends up with wrong results line NaN. 

Waldek

On Sunday, December 16, 2018 at 5:05:01 PM UTC-5, Waldek Kozaczuk wrote:
>
> This patch add "emms" instruction in critical places 
> of OSv thread scheduler code to reset FPU state after it is saved 
> upon involuntary context switch (for example interrupt) and right 
> after executing context switch on new thread to clear FPU status word. 
> Clearing FPU status word is necessary because old thread we are swiching 
> from, might have used MMX instructions and any code executing after that 
> - new thread we are switching to - on the same CPU that needs to execute 
> traditional FPU instructions would fail with FPU overflow exception 
> leading 
> to all kind of mysterious and nasty bugs. For specific description look 
> at the comments added to each line adding emms instruction. 
>
> This patch also adds new unit test that verifies that FPU status 
> word is cleared upon involuntary context switch from thread using MMX 
> instruction to a thread that uses traditional FPU instructions - 
> printf of a float number. 
>
> Please note there are cases other than involuntary context switch 
> where FPU gets into non-clean state right after context switch which 
> may lead to other kinds of bugs like infinite loop in printf leading 
> to a page fault (see #536, #1010 and #1018). This patch fixes those as 
> well 
> however we have not been able to provide a unit test that would 
> demonstrate 
> this. 
>
> Fixes #1019 
> Fixes #1020 
>
> Fixes #536 - based on around 10 successful test runs where it would crash 
> every other 
> time before the patch 
>
> Fixes #1010 - based on over 100 successful test runs where it would crash 
> almost every time 
>
> Fixes #1018 - based on over successful 50 test runs with /os/threads API 
> being used 
> in tight loop and before patch would crash sporadically 
>
> Signed-off-by: Waldemar Kozaczuk <[email protected]> 
> --- 
>  arch/x64/arch-cpu.cc    | 10 ++++++ 
>  arch/x64/arch-switch.hh | 10 +++++- 
>  modules/tests/Makefile  |  4 +-- 
>  tests/tst-mmx-fpu.cc    | 69 +++++++++++++++++++++++++++++++++++++++++ 
>  4 files changed, 90 insertions(+), 3 deletions(-) 
>  create mode 100644 tests/tst-mmx-fpu.cc 
>
> diff --git a/arch/x64/arch-cpu.cc b/arch/x64/arch-cpu.cc 
> index bb96af3d..f3bab022 100644 
> --- a/arch/x64/arch-cpu.cc 
> +++ b/arch/x64/arch-cpu.cc 
> @@ -61,11 +61,21 @@ void fpu_state_init_fxsave(processor::fpu_state *s) { 
>  extern "C" 
>  void fpu_state_save_xsave(processor::fpu_state *s) { 
>      processor::xsave(s, -1); 
> +    // Reset FPU state after saving it as the thread we are about 
> +    // to switch from might have been using MMX registers and 
> +    // the thread scheduler that does FPU calculations needs 
> +    // FPU to be a clean state 
> +    asm volatile("emms"); 
>  } 
>   
>  extern "C" 
>  void fpu_state_save_fxsave(processor::fpu_state *s) { 
>      processor::fxsave(s); 
> +    // Reset FPU state after saving it as the thread we are about 
> +    // to switch from might have been using MMX registers and 
> +    // the thread scheduler that does FPU calculations needs 
> +    // FPU to be a clean state 
> +    asm volatile("emms"); 
>  } 
>   
>  extern "C" 
> diff --git a/arch/x64/arch-switch.hh b/arch/x64/arch-switch.hh 
> index e3769540..c859ba89 100644 
> --- a/arch/x64/arch-switch.hh 
> +++ b/arch/x64/arch-switch.hh 
> @@ -102,7 +102,15 @@ void thread::switch_to() 
>             [rbp]"i"(offsetof(thread_state, rbp)), 
>             [rip]"i"(offsetof(thread_state, rip)) 
>           : "rbx", "rdx", "rsi", "rdi", "r8", "r9", 
> -           "r10", "r11", "r12", "r13", "r14", "r15", "memory"); 
> +           "r10", "r11", "r12", "r13", "r14", "r15", "memory") 
> +    // As the catch-all solution, reset FPU state and more specifically 
> +    // its status word as the old thread we have just switched from due 
> to voluntary 
> +    // or involuntary reason may have been using MMX or caused FPU to get 
> +    // in a state where it needs to be cleared. Otherwise the new thread 
> we have 
> +    // just switched to, may experience FPU overflow exceptions leading 
> +    // to very nasty bugs like infinite loop or printing wrong floating 
> +    // point number 
> +    asm volatile ("emms"); 
>      processor::fldcw(fpucw); 
>      processor::ldmxcsr(mxcsr); 
>  } 
> diff --git a/modules/tests/Makefile b/modules/tests/Makefile 
> index ece5f846..5c94e549 100644 
> --- a/modules/tests/Makefile 
> +++ b/modules/tests/Makefile 
> @@ -116,8 +116,8 @@ tests := tst-pthread.so misc-ramdisk.so tst-vblk.so 
> tst-bsd-evh.so \ 
>          tst-ifaddrs.so tst-pthread-affinity-inherit.so 
> tst-sem-timed-wait.so \ 
>          tst-ttyname.so tst-pthread-barrier.so tst-feexcept.so tst-math.so 
> \ 
>          tst-sigaltstack.so tst-fread.so tst-tcp-cork.so tst-tcp-v6.so \ 
> -        tst-calloc.so tst-crypt.so tst-non-fpic.so tst-small-malloc.so 
> - 
> +        tst-calloc.so tst-crypt.so tst-non-fpic.so tst-small-malloc.so \ 
> +        tst-mmx-fpu.so 
>  #        libstatic-thread-variable.so tst-static-thread-variable.so \ 
>   
>  tests += testrunner.so 
> diff --git a/tests/tst-mmx-fpu.cc b/tests/tst-mmx-fpu.cc 
> new file mode 100644 
> index 00000000..163d9f26 
> --- /dev/null 
> +++ b/tests/tst-mmx-fpu.cc 
> @@ -0,0 +1,69 @@ 
> +#include <stdio.h> 
> +#include <unistd.h> 
> +#include <pthread.h> 
> +#include <math.h> 
> +#include <assert.h> 
> +#include <string.h> 
> +#include <stdlib.h> 
> +#include <time.h> 
> +#include <atomic> 
> + 
> +static std::atomic<bool> mmx_finished = { false }; 
> + 
> +static void *mmx(void *arg) { 
> +    printf("MMX START: %f\n", 10.5); 
> + 
> +    // Execute some non-sensical code using MMX instructions 
> +    // in long-enough loop to be preempted by OSv thread scheduler 
> +    for (unsigned long i = 0; i < 500000000l; i++) { 
> +        asm volatile ("movq $0x0fffffff, %rdx \n\t"); 
> + 
> +        asm volatile ("movq %rdx, %mm0 \n\t"); 
> +        asm volatile ("movq %rdx, %mm1 \n\t"); 
> +        asm volatile ("paddsb %mm0, %mm1 \n\t"); 
> +    } 
> +    asm volatile ("emms"); 
> + 
> +    printf("MMX END: %f\n", 11.5); 
> +    return NULL; 
> +} 
> + 
> +static void *fpu(void *arg) { 
> +    printf("FPU START: %f\n", 10.5); 
> + 
> +    struct timespec ts; 
> +    ts.tv_sec = 0; 
> +    ts.tv_nsec = 1000; 
> + 
> +    char buffer[10]; 
> +    while (!mmx_finished.load()) { 
> +        sprintf(buffer, "%f", 0.5); 
> +        assert(!strcmp(buffer, "0.500000")); 
> +        nanosleep(&ts, NULL); 
> +    } 
> + 
> +    printf("FPU END: %f\n", 11.5); 
> +    return NULL; 
> +} 
> + 
> +// This test verifies that two threads - one using MMX instructions 
> +// and another one using FPU and yielding (see sleep()) - can execute 
> +// correctly without affecting each other. To that end OSv 
> +// needs to make sure that CPU is properly reset after MMX code 
> +// gets preempted so that code on new thread can safely execute 
> +// code using traditional FPU instructions 
> +// 
> +int main() { 
> +    printf("Hello from C code, %f\n", NAN); 
> + 
> +    pthread_t mmx_thread, fpu_thread; 
> + 
> +    pthread_create(&mmx_thread, NULL, mmx, NULL); 
> +    pthread_create(&fpu_thread, NULL, fpu, NULL); 
> + 
> +    pthread_join(mmx_thread, NULL); 
> +    mmx_finished.store(true); 
> +    pthread_join(fpu_thread, NULL); 
> + 
> +    return 0; 
> +} 
> -- 
> 2.19.1 
>
>

-- 
You received this message because you are subscribed to the Google Groups "OSv 
Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

Reply via email to