On Friday, January 25, 2019 at 6:31:11 PM UTC-5, Waldek Kozaczuk wrote:
>
> Thanks for reviewing this patch.
>
> On Wednesday, January 23, 2019 at 5:10:56 PM UTC-5, Nadav Har'El wrote:
>>
>> On Wed, Dec 19, 2018 at 2:26 PM Waldemar Kozaczuk <[email protected]>
>> wrote:
>>
>>> This patch adds "emms" instruction in critical places
>>>
>>
>> Thank you so much for working on this, and sorry for having delayed my
>> review for SO long.
>> I'm basically happy with this patch except one thing that nags me - I
>> don't understand why the
>> *second* EMMS call, the one in switch_to(), is necessary. And I'd really
>> like to understand.
>> I have a *guess* below why it might be necessary, and I would be really
>> grateful if you could
>> check my guess. I also have a couple of other small comments below.
>>
>>
>>> of OSv thread scheduler code to reset FPU state after it is saved
>>> upon involuntary context switch (for example interrupt) and right
>>> after executing context switch on new thread to clear FPU status word.
>>> Clearing FPU status word is necessary because old thread we are switching
>>> from, might have used MMX instructions and any code executing after that
>>> - new thread we are switching to - on the same CPU that needs to execute
>>> traditional FPU instructions would fail with FPU overflow exception
>>> leading
>>> to all kind of mysterious and nasty bugs. For specific description look
>>> at the comments added to each line adding emms instruction.
>>>
>>> This patch also adds new unit test that verifies that FPU status
>>> word is cleared upon involuntary context switch from thread using MMX
>>> instruction to a thread that uses traditional FPU instructions -
>>> printf of a float number.
>>>
>>> Please note there are cases other than involuntary context switch
>>> where FPU gets into non-clean state right after context switch which
>>> may lead to other kinds of bugs like infinite loop in printf leading
>>> to a page fault (see #536, #1010 and #1018). This patch fixes those as
>>> well
>>> however we have not been able to provide a unit test that would
>>> demonstrate
>>> this.
>>>
>>> Fixes #1019
>>> Fixes #1020
>>>
>>> Fixes #536 - based on around 10 successful test runs where it would
>>> crash every other
>>> time before the patch
>>>
>>> Fixes #1010 - based on over 100 successful test runs where it would
>>> crash almost every time
>>>
>>> Fixes #1018 - based on over successful 50 test runs with /os/threads API
>>> being used
>>> in tight loop and before patch would crash sporadically
>>>
>>> Signed-off-by: Waldemar Kozaczuk <[email protected]>
>>> ---
>>> arch/x64/arch-cpu.cc | 10 ++++++
>>> arch/x64/arch-switch.hh | 8 +++++
>>> modules/tests/Makefile | 4 +--
>>> tests/tst-mmx-fpu.cc | 69 +++++++++++++++++++++++++++++++++++++++++
>>> 4 files changed, 89 insertions(+), 2 deletions(-)
>>> create mode 100644 tests/tst-mmx-fpu.cc
>>>
>>> diff --git a/arch/x64/arch-cpu.cc b/arch/x64/arch-cpu.cc
>>> index bb96af3d..f3bab022 100644
>>> --- a/arch/x64/arch-cpu.cc
>>> +++ b/arch/x64/arch-cpu.cc
>>> @@ -61,11 +61,21 @@ void fpu_state_init_fxsave(processor::fpu_state *s) {
>>> extern "C"
>>> void fpu_state_save_xsave(processor::fpu_state *s) {
>>> processor::xsave(s, -1);
>>> + // Reset FPU state after saving it as the thread we are about
>>> + // to switch from might have been using MMX registers and
>>> + // the thread scheduler that does FPU calculations needs
>>> + // FPU to be a clean state
>>> + asm volatile("emms");
>>> }
>>>
>>> extern "C"
>>> void fpu_state_save_fxsave(processor::fpu_state *s) {
>>> processor::fxsave(s);
>>> + // Reset FPU state after saving it as the thread we are about
>>> + // to switch from might have been using MMX registers and
>>> + // the thread scheduler that does FPU calculations needs
>>> + // FPU to be a clean state
>>> + asm volatile("emms");
>>>
>>
>> Nitpick: Here you duplicate this code (and comment) twice, but this was
>> supposed to be a low-level
>> function that does either xsave or fxsave, only. Maybe it makes more
>> sense to add the extra "emms"
>> call (and comment) in arch/x64/arch-cpu.hh, in the save_fpu::save()
>> function, so it will only need to
>> appear once?
>>
> I do not remember why I did not do this. Will try to remove this
> duplication.
>
>>
>> }
>>>
>>> extern "C"
>>> diff --git a/arch/x64/arch-switch.hh b/arch/x64/arch-switch.hh
>>> index e3769540..08db318a 100644
>>> --- a/arch/x64/arch-switch.hh
>>> +++ b/arch/x64/arch-switch.hh
>>> @@ -103,6 +103,14 @@ void thread::switch_to()
>>> [rip]"i"(offsetof(thread_state, rip))
>>> : "rbx", "rdx", "rsi", "rdi", "r8", "r9",
>>> "r10", "r11", "r12", "r13", "r14", "r15", "memory");
>>> + // As the catch-all solution, reset FPU state and more specifically
>>> + // its status word as the old thread we have just switched from due
>>> to voluntary
>>> + // or involuntary reason may have been using MMX or caused FPU to
>>> get
>>> + // in a state where it needs to be cleared. Otherwise the new
>>> thread we have
>>> + // just switched to, may experience FPU overflow exceptions leading
>>> + // to very nasty bugs like infinite loop or printing wrong floating
>>> + // point number
>>>
>>
>> I'm not sure I fully understand this comment, or why this "emms" is needed
>> at all. I tried to explain in
>> https://github.com/cloudius-systems/osv/issues/1020 that
>> it was *not* needed, but if I remember correctly you discovered that it
>> *is* needed
>> so my explanation was obviously wrong somehow.
>>
>> So I'm really curious *why* this EMMS is needed beyond the one we added
>> to the
>> FPU save) because it's possible that the actual scheduler code used the
>> FPU and
>> then *inlined *the thread::switch_to() causing switch_to() to begin with
>> an unclear
>> FPU stack. Can you please help me verify if this is (or isn't) the reason
>> why this
>> second EMMS is needed? You can check this by adding __attribute__
>> ((noinline)
>> to thread::switch_to(). If this makes this EMMS call unnecessary, we
>> found the
>> reason (if this EMMS is still necessary, then we still don't understand
>> something...).
>>
>
> This patch did not help fix the issue #1010 (I did not test #1018 and #536
> this time but given the stack trace looks identical I think I am guessing
> "noinline" fix would not help there either. But who knows.
>
I meant your "__attribute__ ((noinline)" suggestion on top of my patch
(see the diff below) did not fix #1010.
>
> Here is how I changed the code on top of what this patch does:
> diff --git a/arch/x64/arch-switch.hh b/arch/x64/arch-switch.hh
> index 08db318a..8d685ebd 100644
> --- a/arch/x64/arch-switch.hh
> +++ b/arch/x64/arch-switch.hh
> @@ -74,7 +74,8 @@ void (*resolve_set_fsbase(void))(u64 v)
>
> void set_fsbase(u64 v) __attribute__((ifunc("resolve_set_fsbase")));
>
> -void thread::switch_to()
> +__attribute__((noinline)) void thread::switch_to()
> +//void thread::switch_to()
> {
> thread* old = current();
> // writing to fs_base invalidates memory accesses, so surround with
> @@ -110,7 +111,7 @@ void thread::switch_to()
> // just switched to, may experience FPU overflow exceptions leading
> // to very nasty bugs like infinite loop or printing wrong floating
> // point number
> - asm volatile ("emms");
> + //asm volatile ("emms");
> processor::fldcw(fpucw);
> processor::ldmxcsr(mxcsr);
> }
>
> Also running this before and after applying your suggestion did not change
> anything (it seems to me that switch_to was not inlined in first place, is
> my thinking wrong?). Otherwise the "switch_to" symbol should not show up in
> the elf, should it?
> readelf -s build/release/loader-stripped.elf | grep switch
> 1566: 00000000003f69c0 124 FUNC GLOBAL DEFAULT 2
> _ZN5sched6thread15switch_
> 9593: 00000000003f6910 172 FUNC GLOBAL DEFAULT 2
> _ZN5sched6thread9switch_t
> 14840: 00000000003a2330 21 FUNC GLOBAL DEFAULT 2
> _ZN3mmu29switch_to_runtim
>
> Finally OSv scheduler for sure does do floating point calculation, but is
> it a way that corrupts FPU, I do not know. I believe
> thread_runtime::time_until()
> calls taulog() that uses floats to calculate some expression. See
> https://github.com/cloudius-systems/osv/blob/a3cd022fcda2c88eae89476aa6c29e3c4be04926/core/sched.cc#L1759-L1773.
>
> The time_until() is called by cpu::reschedule_from_interrupt() which
> eventually calls switch_to().
>
>>
>> By the way, which of the various tests you ran allows you to check that
>> this EMMS
>> here is necessary?
>>
> The second EMMS in switch_to() is necessary to fix #536, #1010 and #1018.
> The #1010 is easiest to reproduce - I added original steps there. I you
> have time I am sending a patch with some shell scripts that let you
> replicate this issue (look
> at fpu_test_scripts/run_ffmpeg_extract_frames.sh).
>
>
>>
>> P.S. instead of a long comment which I'm not sure is 100% accurate, you
>> can also just point to issue 1020 in the comment.
>>
>> Sure.
>
>>
>> + asm volatile ("emms");
>>> processor::fldcw(fpucw);
>>> processor::ldmxcsr(mxcsr);
>>> }
>>> diff --git a/modules/tests/Makefile b/modules/tests/Makefile
>>> index ece5f846..5c94e549 100644
>>> --- a/modules/tests/Makefile
>>> +++ b/modules/tests/Makefile
>>> @@ -116,8 +116,8 @@ tests := tst-pthread.so misc-ramdisk.so tst-vblk.so
>>> tst-bsd-evh.so \
>>> tst-ifaddrs.so tst-pthread-affinity-inherit.so
>>> tst-sem-timed-wait.so \
>>> tst-ttyname.so tst-pthread-barrier.so tst-feexcept.so
>>> tst-math.so \
>>> tst-sigaltstack.so tst-fread.so tst-tcp-cork.so tst-tcp-v6.so \
>>> - tst-calloc.so tst-crypt.so tst-non-fpic.so tst-small-malloc.so
>>> -
>>> + tst-calloc.so tst-crypt.so tst-non-fpic.so tst-small-malloc.so \
>>> + tst-mmx-fpu.so
>>> # libstatic-thread-variable.so tst-static-thread-variable.so \
>>>
>>> tests += testrunner.so
>>> diff --git a/tests/tst-mmx-fpu.cc b/tests/tst-mmx-fpu.cc
>>> new file mode 100644
>>> index 00000000..163d9f26
>>> --- /dev/null
>>> +++ b/tests/tst-mmx-fpu.cc
>>> @@ -0,0 +1,69 @@
>>> +#include <stdio.h>
>>> +#include <unistd.h>
>>> +#include <pthread.h>
>>> +#include <math.h>
>>> +#include <assert.h>
>>> +#include <string.h>
>>> +#include <stdlib.h>
>>> +#include <time.h>
>>> +#include <atomic>
>>> +
>>> +static std::atomic<bool> mmx_finished = { false };
>>> +
>>> +static void *mmx(void *arg) {
>>> + printf("MMX START: %f\n", 10.5);
>>> +
>>> + // Execute some non-sensical code using MMX instructions
>>> + // in long-enough loop to be preempted by OSv thread scheduler
>>> + for (unsigned long i = 0; i < 500000000l; i++) {
>>> + asm volatile ("movq $0x0fffffff, %rdx \n\t");
>>> +
>>> + asm volatile ("movq %rdx, %mm0 \n\t");
>>> + asm volatile ("movq %rdx, %mm1 \n\t");
>>> + asm volatile ("paddsb %mm0, %mm1 \n\t");
>>> + }
>>> + asm volatile ("emms");
>>> +
>>> + printf("MMX END: %f\n", 11.5);
>>> + return NULL;
>>> +}
>>> +
>>> +static void *fpu(void *arg) {
>>> + printf("FPU START: %f\n", 10.5);
>>> +
>>> + struct timespec ts;
>>> + ts.tv_sec = 0;
>>> + ts.tv_nsec = 1000;
>>> +
>>> + char buffer[10];
>>> + while (!mmx_finished.load()) {
>>> + sprintf(buffer, "%f", 0.5);
>>> + assert(!strcmp(buffer, "0.500000"));
>>> + nanosleep(&ts, NULL);
>>> + }
>>> +
>>> + printf("FPU END: %f\n", 11.5);
>>> + return NULL;
>>> +}
>>> +
>>> +// This test verifies that two threads - one using MMX instructions
>>> +// and another one using FPU and yielding (see sleep()) - can execute
>>> +// correctly without affecting each other. To that end OSv
>>> +// needs to make sure that CPU is properly reset after MMX code
>>> +// gets preempted so that code on new thread can safely execute
>>> +// code using traditional FPU instructions
>>>
>>
>> Would be nice to refer to the relevant issue and say something about what
>> happened before it was fixed (i.e., did this test crash? fail on an
>> assertion
>> above? or what?).
>>
> By the way it is enough to have emms instruction added to
> fpu_state_save_* functions to have this unit test pass. But it is not
> enough to fix #536, #1010 and #1018 -> the second emms in switch_to() is
> necessary.
>
>>
>> +//
>>> +int main() {
>>> + printf("Hello from C code, %f\n", NAN);
>>> +
>>> + pthread_t mmx_thread, fpu_thread;
>>> +
>>> + pthread_create(&mmx_thread, NULL, mmx, NULL);
>>> + pthread_create(&fpu_thread, NULL, fpu, NULL);
>>> +
>>> + pthread_join(mmx_thread, NULL);
>>> + mmx_finished.store(true);
>>> + pthread_join(fpu_thread, NULL);
>>> +
>>> + return 0;
>>> +}
>>> --
>>> 2.19.1
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "OSv Development" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
--
You received this message because you are subscribed to the Google Groups "OSv
Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
For more options, visit https://groups.google.com/d/optout.