Hey Guys, I know, it's not a useful comment from my part but where is the button to "like" or "love" this commit? :)
While it's off topic, I'm still planning to submit patches for tomcat 9 support Keep Rocking/Kind Regards, Geraldo Netto Sapere Aude => Non dvcor, dvco http://exdev.sf.net/ On Tue, 29 Jan 2019 at 08:13, Nadav Har'El <[email protected]> wrote: > > Thanks! I just committed your patch. It has to be a record on the number of > issues fixed by a single patch :-) > > -- > Nadav Har'El > [email protected] > > > On Tue, Jan 29, 2019 at 7:16 AM Waldemar Kozaczuk <[email protected]> > wrote: >> >> This patch adds "emms" instruction in critical places >> of OSv thread scheduler code to reset FPU state after it is saved >> upon involuntary context switch (for example interrupt) and right >> after executing context switch on new thread to clear FPU status word. >> Clearing FPU status word is necessary because old thread we are switching >> from, might have used MMX instructions and any code executing after that >> - new thread we are switching to - on the same CPU that needs to execute >> traditional FPU instructions would fail with FPU overflow exception leading >> to all kind of mysterious and nasty bugs. For more detailed explanation >> please look at the issues #1019 and #1020. >> >> This patch also adds new unit test that verifies that FPU status >> word is cleared upon involuntary context switch from thread using MMX >> instruction to a thread that uses traditional FPU instructions - >> printf of a float number. >> >> Please note there are some voluntary context switch scenarios >> where FPU gets into non-clean state right after context switch which >> may lead to other kinds of bugs like infinite loop in printf leading >> to a page fault (see #536, #1010 and #1018). This patch fixes those as well >> however we have not been able to provide a unit test that would demonstrate >> this nor we have a clear understanding why FPU status word gets corrupted. >> >> Fixes #1019 >> Fixes #1020 >> >> Fixes #536 - based on around 10 successful test runs where it would crash >> every other >> time before the patch >> >> Fixes #1010 - based on over 100 successful test runs where it would crash >> almost every time >> >> Fixes #1018 - based on over successful 50 test runs with /os/threads API >> being used >> in tight loop and before patch would crash sporadically >> >> Signed-off-by: Waldemar Kozaczuk <[email protected]> >> --- >> arch/x64/arch-cpu.hh | 5 +++ >> arch/x64/arch-switch.hh | 3 ++ >> modules/tests/Makefile | 4 +-- >> tests/tst-mmx-fpu.cc | 71 +++++++++++++++++++++++++++++++++++++++++ >> 4 files changed, 81 insertions(+), 2 deletions(-) >> create mode 100644 tests/tst-mmx-fpu.cc >> >> diff --git a/arch/x64/arch-cpu.hh b/arch/x64/arch-cpu.hh >> index b1cb0bf6..978217d0 100644 >> --- a/arch/x64/arch-cpu.hh >> +++ b/arch/x64/arch-cpu.hh >> @@ -71,6 +71,11 @@ struct save_fpu { >> T state; >> void save() { >> fpu_state_save(state.addr()); >> + // Reset FPU state after saving it as the thread we are about >> + // to switch from might have been using MMX registers and >> + // the thread scheduler that does FPU calculations needs >> + // FPU to be a clean state >> + asm volatile("emms"); >> } >> void restore() { >> fpu_state_restore(state.addr()); >> diff --git a/arch/x64/arch-switch.hh b/arch/x64/arch-switch.hh >> index e3769540..88bef949 100644 >> --- a/arch/x64/arch-switch.hh >> +++ b/arch/x64/arch-switch.hh >> @@ -103,6 +103,9 @@ void thread::switch_to() >> [rip]"i"(offsetof(thread_state, rip)) >> : "rbx", "rdx", "rsi", "rdi", "r8", "r9", >> "r10", "r11", "r12", "r13", "r14", "r15", "memory"); >> + // As the catch-all solution, reset FPU state and more specifically >> + // its status word. For details why we need it please see issue #1020. >> + asm volatile ("emms"); >> processor::fldcw(fpucw); >> processor::ldmxcsr(mxcsr); >> } >> diff --git a/modules/tests/Makefile b/modules/tests/Makefile >> index ece5f846..5c94e549 100644 >> --- a/modules/tests/Makefile >> +++ b/modules/tests/Makefile >> @@ -116,8 +116,8 @@ tests := tst-pthread.so misc-ramdisk.so tst-vblk.so >> tst-bsd-evh.so \ >> tst-ifaddrs.so tst-pthread-affinity-inherit.so tst-sem-timed-wait.so >> \ >> tst-ttyname.so tst-pthread-barrier.so tst-feexcept.so tst-math.so \ >> tst-sigaltstack.so tst-fread.so tst-tcp-cork.so tst-tcp-v6.so \ >> - tst-calloc.so tst-crypt.so tst-non-fpic.so tst-small-malloc.so >> - >> + tst-calloc.so tst-crypt.so tst-non-fpic.so tst-small-malloc.so \ >> + tst-mmx-fpu.so >> # libstatic-thread-variable.so tst-static-thread-variable.so \ >> >> tests += testrunner.so >> diff --git a/tests/tst-mmx-fpu.cc b/tests/tst-mmx-fpu.cc >> new file mode 100644 >> index 00000000..7175a02b >> --- /dev/null >> +++ b/tests/tst-mmx-fpu.cc >> @@ -0,0 +1,71 @@ >> +#include <stdio.h> >> +#include <unistd.h> >> +#include <pthread.h> >> +#include <math.h> >> +#include <assert.h> >> +#include <string.h> >> +#include <stdlib.h> >> +#include <time.h> >> +#include <atomic> >> + >> +static std::atomic<bool> mmx_finished = { false }; >> + >> +static void *mmx(void *arg) { >> + printf("MMX START: %f\n", 10.5); >> + >> + // Execute some non-sensical code using MMX instructions >> + // in long-enough loop to be preempted by OSv thread scheduler >> + for (unsigned long i = 0; i < 500000000l; i++) { >> + asm volatile ("movq $0x0fffffff, %rdx \n\t"); >> + >> + asm volatile ("movq %rdx, %mm0 \n\t"); >> + asm volatile ("movq %rdx, %mm1 \n\t"); >> + asm volatile ("paddsb %mm0, %mm1 \n\t"); >> + } >> + asm volatile ("emms"); >> + >> + printf("MMX END: %f\n", 11.5); >> + return NULL; >> +} >> + >> +static void *fpu(void *arg) { >> + printf("FPU START: %f\n", 10.5); >> + >> + struct timespec ts; >> + ts.tv_sec = 0; >> + ts.tv_nsec = 1000; >> + >> + char buffer[10]; >> + while (!mmx_finished.load()) { >> + sprintf(buffer, "%f", 0.5); >> + // Verify that sprintf, that uses FPU, prints >> + // correct representation of 0.5 >> + assert(!strcmp(buffer, "0.500000")); >> + nanosleep(&ts, NULL); >> + } >> + >> + printf("FPU END: %f\n", 11.5); >> + return NULL; >> +} >> + >> +// This test verifies that two threads - one using MMX instructions >> +// and another one using FPU and yielding (see sleep()) - can execute >> +// correctly without affecting each other. To that end OSv >> +// needs to make sure that CPU is properly reset after MMX code >> +// gets preempted so that code on new thread can safely execute >> +// code using traditional FPU instructions. For details please >> +// see issue #1019. >> +int main() { >> + printf("Hello from C code, %f\n", NAN); >> + >> + pthread_t mmx_thread, fpu_thread; >> + >> + pthread_create(&mmx_thread, NULL, mmx, NULL); >> + pthread_create(&fpu_thread, NULL, fpu, NULL); >> + >> + pthread_join(mmx_thread, NULL); >> + mmx_finished.store(true); >> + pthread_join(fpu_thread, NULL); >> + >> + return 0; >> +} >> -- >> 2.19.1 >> >> -- >> You received this message because you are subscribed to the Google Groups >> "OSv Development" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to [email protected]. >> For more options, visit https://groups.google.com/d/optout. > > -- > You received this message because you are subscribed to the Google Groups > "OSv Development" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to [email protected]. > For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to the Google Groups "OSv Development" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. For more options, visit https://groups.google.com/d/optout.
