From: Waldemar Kozaczuk <[email protected]>
Committer: Nadav Har'El <[email protected]>
Branch: master
Reset FPU state after saving it and upon context switch
This patch adds "emms" instruction in critical places
of OSv thread scheduler code to reset FPU state after it is saved
upon involuntary context switch (for example interrupt) and right
after executing context switch on new thread to clear FPU status word.
Clearing FPU status word is necessary because old thread we are switching
from, might have used MMX instructions and any code executing after that
- new thread we are switching to - on the same CPU that needs to execute
traditional FPU instructions would fail with FPU overflow exception leading
to all kind of mysterious and nasty bugs. For more detailed explanation
please look at the issues #1019 and #1020.
This patch also adds new unit test that verifies that FPU status
word is cleared upon involuntary context switch from thread using MMX
instruction to a thread that uses traditional FPU instructions -
printf of a float number.
Please note there are some voluntary context switch scenarios
where FPU gets into non-clean state right after context switch which
may lead to other kinds of bugs like infinite loop in printf leading
to a page fault (see #536, #1010 and #1018). This patch fixes those as well
however we have not been able to provide a unit test that would demonstrate
this nor we have a clear understanding why FPU status word gets corrupted.
Fixes #1019
Fixes #1020
Fixes #536 - based on around 10 successful test runs where it would crash
every other
time before the patch
Fixes #1010 - based on over 100 successful test runs where it would crash
almost every time
Fixes #1018 - based on over successful 50 test runs with /os/threads API
being used
in tight loop and before patch would crash sporadically
Signed-off-by: Waldemar Kozaczuk <[email protected]>
Message-Id: <[email protected]>
---
diff --git a/arch/x64/arch-cpu.hh b/arch/x64/arch-cpu.hh
--- a/arch/x64/arch-cpu.hh
+++ b/arch/x64/arch-cpu.hh
@@ -71,6 +71,11 @@ struct save_fpu {
T state;
void save() {
fpu_state_save(state.addr());
+ // Reset FPU state after saving it as the thread we are about
+ // to switch from might have been using MMX registers and
+ // the thread scheduler that does FPU calculations needs
+ // FPU to be a clean state
+ asm volatile("emms");
}
void restore() {
fpu_state_restore(state.addr());
diff --git a/arch/x64/arch-switch.hh b/arch/x64/arch-switch.hh
--- a/arch/x64/arch-switch.hh
+++ b/arch/x64/arch-switch.hh
@@ -103,6 +103,9 @@ void thread::switch_to()
[rip]"i"(offsetof(thread_state, rip))
: "rbx", "rdx", "rsi", "rdi", "r8", "r9",
"r10", "r11", "r12", "r13", "r14", "r15", "memory");
+ // As the catch-all solution, reset FPU state and more specifically
+ // its status word. For details why we need it please see issue #1020.
+ asm volatile ("emms");
processor::fldcw(fpucw);
processor::ldmxcsr(mxcsr);
}
diff --git a/modules/tests/Makefile b/modules/tests/Makefile
--- a/modules/tests/Makefile
+++ b/modules/tests/Makefile
@@ -116,8 +116,8 @@ tests := tst-pthread.so misc-ramdisk.so tst-vblk.so
tst-bsd-evh.so \
tst-ifaddrs.so tst-pthread-affinity-inherit.so tst-sem-timed-wait.so \
tst-ttyname.so tst-pthread-barrier.so tst-feexcept.so tst-math.so \
tst-sigaltstack.so tst-fread.so tst-tcp-cork.so tst-tcp-v6.so \
- tst-calloc.so tst-crypt.so tst-non-fpic.so tst-small-malloc.so
-
+ tst-calloc.so tst-crypt.so tst-non-fpic.so tst-small-malloc.so \
+ tst-mmx-fpu.so
# libstatic-thread-variable.so tst-static-thread-variable.so \
tests += testrunner.so
diff --git a/tests/tst-mmx-fpu.cc b/tests/tst-mmx-fpu.cc
--- a/tests/tst-mmx-fpu.cc
+++ b/tests/tst-mmx-fpu.cc
@@ -0,0 +1,71 @@
+#include <stdio.h>
+#include <unistd.h>
+#include <pthread.h>
+#include <math.h>
+#include <assert.h>
+#include <string.h>
+#include <stdlib.h>
+#include <time.h>
+#include <atomic>
+
+static std::atomic<bool> mmx_finished = { false };
+
+static void *mmx(void *arg) {
+ printf("MMX START: %f\n", 10.5);
+
+ // Execute some non-sensical code using MMX instructions
+ // in long-enough loop to be preempted by OSv thread scheduler
+ for (unsigned long i = 0; i < 500000000l; i++) {
+ asm volatile ("movq $0x0fffffff, %rdx \n\t");
+
+ asm volatile ("movq %rdx, %mm0 \n\t");
+ asm volatile ("movq %rdx, %mm1 \n\t");
+ asm volatile ("paddsb %mm0, %mm1 \n\t");
+ }
+ asm volatile ("emms");
+
+ printf("MMX END: %f\n", 11.5);
+ return NULL;
+}
+
+static void *fpu(void *arg) {
+ printf("FPU START: %f\n", 10.5);
+
+ struct timespec ts;
+ ts.tv_sec = 0;
+ ts.tv_nsec = 1000;
+
+ char buffer[10];
+ while (!mmx_finished.load()) {
+ sprintf(buffer, "%f", 0.5);
+ // Verify that sprintf, that uses FPU, prints
+ // correct representation of 0.5
+ assert(!strcmp(buffer, "0.500000"));
+ nanosleep(&ts, NULL);
+ }
+
+ printf("FPU END: %f\n", 11.5);
+ return NULL;
+}
+
+// This test verifies that two threads - one using MMX instructions
+// and another one using FPU and yielding (see sleep()) - can execute
+// correctly without affecting each other. To that end OSv
+// needs to make sure that CPU is properly reset after MMX code
+// gets preempted so that code on new thread can safely execute
+// code using traditional FPU instructions. For details please
+// see issue #1019.
+int main() {
+ printf("Hello from C code, %f\n", NAN);
+
+ pthread_t mmx_thread, fpu_thread;
+
+ pthread_create(&mmx_thread, NULL, mmx, NULL);
+ pthread_create(&fpu_thread, NULL, fpu, NULL);
+
+ pthread_join(mmx_thread, NULL);
+ mmx_finished.store(true);
+ pthread_join(fpu_thread, NULL);
+
+ return 0;
+}
--
You received this message because you are subscribed to the Google Groups "OSv
Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
For more options, visit https://groups.google.com/d/optout.