Thanks! I just committed your patch. It has to be a record on the number of
issues fixed by a single patch :-)

--
Nadav Har'El
[email protected]


On Tue, Jan 29, 2019 at 7:16 AM Waldemar Kozaczuk <[email protected]>
wrote:

> This patch adds "emms" instruction in critical places
> of OSv thread scheduler code to reset FPU state after it is saved
> upon involuntary context switch (for example interrupt) and right
> after executing context switch on new thread to clear FPU status word.
> Clearing FPU status word is necessary because old thread we are switching
> from, might have used MMX instructions and any code executing after that
> - new thread we are switching to - on the same CPU that needs to execute
> traditional FPU instructions would fail with FPU overflow exception leading
> to all kind of mysterious and nasty bugs. For more detailed explanation
> please look at the issues #1019 and #1020.
>
> This patch also adds new unit test that verifies that FPU status
> word is cleared upon involuntary context switch from thread using MMX
> instruction to a thread that uses traditional FPU instructions -
> printf of a float number.
>
> Please note there are some voluntary context switch scenarios
> where FPU gets into non-clean state right after context switch which
> may lead to other kinds of bugs like infinite loop in printf leading
> to a page fault (see #536, #1010 and #1018). This patch fixes those as well
> however we have not been able to provide a unit test that would demonstrate
> this nor we have a clear understanding why FPU status word gets corrupted.
>
> Fixes #1019
> Fixes #1020
>
> Fixes #536 - based on around 10 successful test runs where it would crash
> every other
> time before the patch
>
> Fixes #1010 - based on over 100 successful test runs where it would crash
> almost every time
>
> Fixes #1018 - based on over successful 50 test runs with /os/threads API
> being used
> in tight loop and before patch would crash sporadically
>
> Signed-off-by: Waldemar Kozaczuk <[email protected]>
> ---
>  arch/x64/arch-cpu.hh    |  5 +++
>  arch/x64/arch-switch.hh |  3 ++
>  modules/tests/Makefile  |  4 +--
>  tests/tst-mmx-fpu.cc    | 71 +++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 81 insertions(+), 2 deletions(-)
>  create mode 100644 tests/tst-mmx-fpu.cc
>
> diff --git a/arch/x64/arch-cpu.hh b/arch/x64/arch-cpu.hh
> index b1cb0bf6..978217d0 100644
> --- a/arch/x64/arch-cpu.hh
> +++ b/arch/x64/arch-cpu.hh
> @@ -71,6 +71,11 @@ struct save_fpu {
>      T state;
>      void save() {
>          fpu_state_save(state.addr());
> +        // Reset FPU state after saving it as the thread we are about
> +        // to switch from might have been using MMX registers and
> +        // the thread scheduler that does FPU calculations needs
> +        // FPU to be a clean state
> +        asm volatile("emms");
>      }
>      void restore() {
>          fpu_state_restore(state.addr());
> diff --git a/arch/x64/arch-switch.hh b/arch/x64/arch-switch.hh
> index e3769540..88bef949 100644
> --- a/arch/x64/arch-switch.hh
> +++ b/arch/x64/arch-switch.hh
> @@ -103,6 +103,9 @@ void thread::switch_to()
>             [rip]"i"(offsetof(thread_state, rip))
>           : "rbx", "rdx", "rsi", "rdi", "r8", "r9",
>             "r10", "r11", "r12", "r13", "r14", "r15", "memory");
> +    // As the catch-all solution, reset FPU state and more specifically
> +    // its status word. For details why we need it please see issue #1020.
> +    asm volatile ("emms");
>      processor::fldcw(fpucw);
>      processor::ldmxcsr(mxcsr);
>  }
> diff --git a/modules/tests/Makefile b/modules/tests/Makefile
> index ece5f846..5c94e549 100644
> --- a/modules/tests/Makefile
> +++ b/modules/tests/Makefile
> @@ -116,8 +116,8 @@ tests := tst-pthread.so misc-ramdisk.so tst-vblk.so
> tst-bsd-evh.so \
>         tst-ifaddrs.so tst-pthread-affinity-inherit.so
> tst-sem-timed-wait.so \
>         tst-ttyname.so tst-pthread-barrier.so tst-feexcept.so tst-math.so \
>         tst-sigaltstack.so tst-fread.so tst-tcp-cork.so tst-tcp-v6.so \
> -       tst-calloc.so tst-crypt.so tst-non-fpic.so tst-small-malloc.so
> -
> +       tst-calloc.so tst-crypt.so tst-non-fpic.so tst-small-malloc.so \
> +       tst-mmx-fpu.so
>  #      libstatic-thread-variable.so tst-static-thread-variable.so \
>
>  tests += testrunner.so
> diff --git a/tests/tst-mmx-fpu.cc b/tests/tst-mmx-fpu.cc
> new file mode 100644
> index 00000000..7175a02b
> --- /dev/null
> +++ b/tests/tst-mmx-fpu.cc
> @@ -0,0 +1,71 @@
> +#include <stdio.h>
> +#include <unistd.h>
> +#include <pthread.h>
> +#include <math.h>
> +#include <assert.h>
> +#include <string.h>
> +#include <stdlib.h>
> +#include <time.h>
> +#include <atomic>
> +
> +static std::atomic<bool> mmx_finished = { false };
> +
> +static void *mmx(void *arg) {
> +    printf("MMX START: %f\n", 10.5);
> +
> +    // Execute some non-sensical code using MMX instructions
> +    // in long-enough loop to be preempted by OSv thread scheduler
> +    for (unsigned long i = 0; i < 500000000l; i++) {
> +        asm volatile ("movq $0x0fffffff, %rdx \n\t");
> +
> +        asm volatile ("movq %rdx, %mm0 \n\t");
> +        asm volatile ("movq %rdx, %mm1 \n\t");
> +        asm volatile ("paddsb %mm0, %mm1 \n\t");
> +    }
> +    asm volatile ("emms");
> +
> +    printf("MMX END: %f\n", 11.5);
> +    return NULL;
> +}
> +
> +static void *fpu(void *arg) {
> +    printf("FPU START: %f\n", 10.5);
> +
> +    struct timespec ts;
> +    ts.tv_sec = 0;
> +    ts.tv_nsec = 1000;
> +
> +    char buffer[10];
> +    while (!mmx_finished.load()) {
> +        sprintf(buffer, "%f", 0.5);
> +        // Verify that sprintf, that uses FPU, prints
> +        // correct representation of 0.5
> +        assert(!strcmp(buffer, "0.500000"));
> +        nanosleep(&ts, NULL);
> +    }
> +
> +    printf("FPU END: %f\n", 11.5);
> +    return NULL;
> +}
> +
> +// This test verifies that two threads - one using MMX instructions
> +// and another one using FPU and yielding (see sleep()) - can execute
> +// correctly without affecting each other. To that end OSv
> +// needs to make sure that CPU is properly reset after MMX code
> +// gets preempted so that code on new thread can safely execute
> +// code using traditional FPU instructions. For details please
> +// see issue #1019.
> +int main() {
> +    printf("Hello from C code, %f\n", NAN);
> +
> +    pthread_t mmx_thread, fpu_thread;
> +
> +    pthread_create(&mmx_thread, NULL, mmx, NULL);
> +    pthread_create(&fpu_thread, NULL, fpu, NULL);
> +
> +    pthread_join(mmx_thread, NULL);
> +    mmx_finished.store(true);
> +    pthread_join(fpu_thread, NULL);
> +
> +    return 0;
> +}
> --
> 2.19.1
>
> --
> You received this message because you are subscribed to the Google Groups
> "OSv Development" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups "OSv 
Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

Reply via email to