Implemented checksum logic (please let me know if I have made any stupid
mistakes).
diff --git a/arch/x64/arch-cpu.cc b/arch/x64/arch-cpu.cc
index bb96af3d..34bf6081 100644
--- a/arch/x64/arch-cpu.cc
+++ b/arch/x64/arch-cpu.cc
@@ -42,6 +42,14 @@ exception_guard::~exception_guard()
sched::cpu::current()->arch.exit_exception();
}
+static inline u64 calculate_fpu_state_checksum(processor::fpu_state *s) {
+ u64 val = 0;
+ char* data = s->legacy;
+ for (u64 i = 0; i < sizeof(processor::fpu_state) - sizeof(u64); i++)
+ val += data[i];
+ return val;
+}
+
extern "C"
[[gnu::target("no-sse")]]
void fpu_state_init_xsave(processor::fpu_state *s) {
@@ -61,21 +69,25 @@ void fpu_state_init_fxsave(processor::fpu_state *s) {
extern "C"
void fpu_state_save_xsave(processor::fpu_state *s) {
processor::xsave(s, -1);
+ s->checksum = calculate_fpu_state_checksum(s);
}
extern "C"
void fpu_state_save_fxsave(processor::fpu_state *s) {
processor::fxsave(s);
+ s->checksum = calculate_fpu_state_checksum(s);
}
extern "C"
void fpu_state_restore_xsave(processor::fpu_state *s) {
processor::xrstor(s, -1);
+ assert(s->checksum == calculate_fpu_state_checksum(s));
}
extern "C"
void fpu_state_restore_fxsave(processor::fpu_state *s) {
processor::fxrstor(s);
+ assert(s->checksum == calculate_fpu_state_checksum(s));
}
extern "C"
diff --git a/arch/x64/processor.hh b/arch/x64/processor.hh
index 6b1fbf66..a26fb028 100644
--- a/arch/x64/processor.hh
+++ b/arch/x64/processor.hh
@@ -343,6 +343,7 @@ struct fpu_state {
char xsavehdr[24];
char reserved[40];
char ymm[256];
+ u64 checksum;
} __attribute__((packed));
inline void fxsave(fpu_state* s)
:-( Unfortunately no fpu_state corruption was detected, and I am still
getting the same faults in the same place.
On Thursday, November 29, 2018 at 12:28:53 PM UTC-5, Waldek Kozaczuk wrote:
>
> Nadav,
>
> I have found this email thread around patches adding AVX support -
> https://groups.google.com/forum/#!msg/osv-dev/DIbzudXr0jo/glTrLgjFOssJ;context-place=forum/osv-dev
> .
>
> There is a comment you left specifically about "per-thread fpu save area".
> Maybe that is something we need to pursue once we confirm that the
> fpu_state on the interrupt stack somehow gets corrupted. I still need to code
> that checksum logic to verify that it indeed happens.
>
> "When we leave a thread during an involuntary context switch (e.g.,
> interrupt), we need to XSAVEOPT its state in the same place we last
> XRESTORed it from when entering this thread - so this can't be a random
> location on the stack chosen at interrupt time (as in inplace_fpu_state)
> because then it won't be the same address used last time and the
> optimization won't work. So we need to have a per-thread fpu save area, and
> save the state *there* on interrupt. Funny enough, we already had such a
> tread->_fpu area in the past, but removed it (commit
> 202b2cccc14a43d088002a80f7add51e3e4bc4ce)."
>
> On Wednesday, November 28, 2018 at 5:03:31 PM UTC-5, Waldek Kozaczuk wrote:
>>
>>
>>
>> On Wednesday, November 28, 2018 at 4:51:08 PM UTC-5, Nadav Har'El wrote:
>>>
>>> On Wed, Nov 28, 2018 at 9:20 PM Waldek Kozaczuk <[email protected]>
>>> wrote:
>>>
>>>> The title of this thread indicates that the crash happens sporadically,
>>>> and indeed that is the case with the scenario described in the 1st and 2nd
>>>> emails of this thread. The symptom (a stack trace with the fmt_fp()
>>>> function captured by gdb) looks identical to what was reported in
>>>> https://github.com/cloudius-systems/osv/issues/1010 and
>>>> https://github.com/cloudius-systems/osv/issues/536. We cannot be
>>>> certain whether the root cause is the same. However, it is worth
>>>> mentioning that the crash happens with the same symptom almost every time
>>>> I run the scenario described in issue #1010, which is different from (but
>>>> similar to) the scenario described in the 1st and 2nd emails of this
>>>> thread. I am emphasizing this because we have a way to replicate one of
>>>> these crashes (*possibly the same root cause*) in a very repeatable way,
>>>> in case we have ideas of things to try.
>>>>
>>>
>>> That is good. If we give up on this effort (at least temporarily),
>>> please summarize what you know about reproducing this issue in a new
>>> bug-tracker issue (and reference 1010 and 536) so we could return to it
>>> later.
>>>
>>>
>>>> Regarding your excellent explanations of how non-FPU and FPU state is
>>>> saved/restored by OSv, I wanted to run through a scenario to see if I
>>>> understand it well (please forgive my ignorance if I am missing something
>>>> obvious). Imagine we have 2 user threads (uT1 and uT2) executing some
>>>> *logic that involves floating point operations in a loop* (like the one
>>>> in the disassembled code). Ffmpeg is a good example as it must be very
>>>> CPU intensive and do all kinds of FP computations to transcode video.
>>>> Let us say on a given cpu this sequence happens in time:
>>>>
>>>> ->uT1 is running (executing long FP logic loop)
>>>> --> timer triggered interrupt happens because allotted time slice
>>>> expired [some FPU registers had some inflight data = FPU has state S1)
>>>> ----> OSv saves FPU state and other registers using fpu_lock construct
>>>>
>>>
>>> So far, right. This state is saved inside the interrupt stack, which
>>> while thread uT1 is running, is uT1->_arch->interrupt_stack.
>>> By the way, note that the regular (non-FPU) registers are also on this
>>> stack - the code in entry.S saves them on the stack and passes the pointer
>>> to it when it calls interrupt().
>>>
>>> -------> OSv identifies next thread to run which is uT2
>>>> ---------> OSv restores FPU to state S1 (fpu_lock goes out of scope)
>>>> and other registers as well and switches to uT2
>>>>
>>>
>>> No, this is not what happens now (at least, not if you mean that "S1"
>>> was uT1's state...). Before interrupt() ends or fpu_lock runs out of scope,
>>> interrupt calls sched::preempt().
>>> This is a weird function. It doesn't do something quickly and return :-)
>>> Rather, it calls the scheduler to decide which thread to run next. If it
>>> decides to run uT1 again, it just returns normally (and the FPU state is
>>> restored and the interrupt returns).
>>> But if it decides to run uT2 it doesn't quite return... Rather, it
>>> *switches* to thread uT2. This means we switch to uT2's stack, and jump to
>>> the program counter where uT2 was before. And where was uT2 running before?
>>> One of the likely options was that it was itself in interrupt() calling
>>> preempt() :-) So now, finally, when uT2 resumes, its call to preempt()
>>> "returns", and we reach the end of the interrupt() function. Remember that
>>> at this point we're running uT2, with uT2's stack. So when interrupt()
>>> ends, the fpu_lock goes out of scope, and the FPU state is popped from
>>> *uT2*'s stack. This was uT2's FPU state, saved long ago, and now we're
>>> restoring it, and when interrupt() returns, uT2's correct FPU state will
>>> have been restored.
>>>
>> Thanks. I get it now. The key thing is that the FPU state is saved (pushed)
>> to the current thread's stack and restored from (popped off) the new
>> thread's stack.
>>
>>>
>>>
>>>> --------------> UT2 is running (executing long FP logic loop)
>>>> --> timer triggered interrupt happens again because allotted time slice
>>>> expired [some FPU registers had some inflight data = FPU has state S2)
>>>> ... same thing
>>>> ... same thing
>>>> ---------> OSv restores FPU to state S2 and other registers as well and
>>>> switches back to uT1
>>>> *--------------> UT1 is resuming where it left above and instead of FPU
>>>> be in expected S1 state it sees FPU in state S2 -> crash????*
>>>>
>>>> What am I missing in my scenario?
>>>>
>>>> Lastly I am adding the disassembled portion of fmt_fp() with applied
>>>> patch like below that makes the repeatable crash (from issue #1010) go
>>>> away
>>>> or at least way less frequent.
>>>>
>>>> diff --git a/libc/stdio/vfprintf.c b/libc/stdio/vfprintf.c
>>>> index aac790c0..1e116038 100644
>>>> --- a/libc/stdio/vfprintf.c
>>>> +++ b/libc/stdio/vfprintf.c
>>>> @@ -8,6 +8,7 @@
>>>> #include <inttypes.h>
>>>> #include <math.h>
>>>> #include <float.h>
>>>> +#include <assert.h>
>>>> /* Some useful macros */
>>>> @@ -296,9 +297,14 @@ static int fmt_fp(FILE *f, long double y, int w,
>>>> int p, int fl, int t)
>>>> if (e2<0) a=r=z=big;
>>>> else a=r=z=big+sizeof(big)/sizeof(*big) - LDBL_MANT_DIG - 1;
>>>> + int steps = 0;
>>>> do {
>>>> *z = y;
>>>> y = 1000000000*(y-*z++);
>>>> + steps++;
>>>> + if(steps > 2000) {
>>>> + assert(0);
>>>> + }
>>>> } while (y);
>>>> while (e2>0) {
>>>>
>>>> Your explanation was that because of the added assert() function call we
>>>> force the compiler to generate code to save/restore any used FPU
>>>> registers.
>>>>
>>>
>>> Yes, my thesis (and I'm too tired now to verify this in the assembler
>>> code below) was that because there is this function call (and the compiler
>>> doesn't know if it is commonly called or not), the compiler doesn't want to
>>> leave y in a register, so it will load and store it from memory. But this
>>> is just a guess.
>>> You can try instead of assert(0), do
>>>
>>> asm volatile ("cli; hlt" : : : "memory");
>>>
>>> This is not a function call, so it will not change the compiled code. It
>>> will allow you to connect with a debugger to print more stuff at that
>>> point. Note that this cli; hlt stuff will only hang the single CPU that ran
>>> this command (if you can reproduce this with -c1, this will be the only cpu
>>> so things will be easier).
>>>
>>>
>>>
>>>> But if we do not see the crash happen any more with this patch applied,
>>>> then the assert function does not get called and the FPU is not
>>>> saved/restored, so how does this help make the crash go away?
>>>>
>>>> Portion of disassembled code:
>>>> if (e2<0) a=r=z=big;
>>>> else a=r=z=big+sizeof(big)/sizeof(*big) - LDBL_MANT_DIG - 1;
>>>>
>>>> int steps = 0;
>>>> do {
>>>> *z = y;
>>>> 501: d9 bd 0e e3 ff ff fnstcw -0x1cf2(%rbp)
>>>> if (e2<0) a=r=z=big;
>>>> 507: 48 8d 85 40 e3 ff ff lea -0x1cc0(%rbp),%rax
>>>> 50e: 45 85 c9 test %r9d,%r9d
>>>> 511: 48 8d 95 cc fe ff ff lea -0x134(%rbp),%rdx
>>>> 518: 48 89 c1 mov %rax,%rcx
>>>> *z = y;
>>>> 51b: 0f b7 85 0e e3 ff ff movzwl -0x1cf2(%rbp),%eax
>>>> if (e2<0) a=r=z=big;
>>>> 522: 48 0f 49 ca cmovns %rdx,%rcx
>>>> *z = y;
>>>> 526: 80 cc 0c or $0xc,%ah
>>>> if (e2<0) a=r=z=big;
>>>> 529: 48 89 8d d8 e2 ff ff mov %rcx,-0x1d28(%rbp)
>>>> y = 1000000000*(y-*z++);
>>>> 530: 48 8d 59 04 lea 0x4(%rcx),%rbx
>>>> 534: 48 8d 91 44 1f 00 00 lea 0x1f44(%rcx),%rdx
>>>> *z = y;
>>>> 53b: 66 89 85 0c e3 ff ff mov %ax,-0x1cf4(%rbp)
>>>> 542: d9 c0 fld %st(0)
>>>> 544: d9 ad 0c e3 ff ff fldcw -0x1cf4(%rbp)
>>>> 54a: df bd 00 e3 ff ff fistpll -0x1d00(%rbp)
>>>> 550: d9 ad 0e e3 ff ff fldcw -0x1cf2(%rbp)
>>>> 556: 48 8b 85 00 e3 ff ff mov -0x1d00(%rbp),%rax
>>>> 55d: 89 01 mov %eax,(%rcx)
>>>> y = 1000000000*(y-*z++);
>>>> 55f: 89 c0 mov %eax,%eax
>>>> 561: 48 89 85 f8 e2 ff ff mov %rax,-0x1d08(%rbp)
>>>> 568: df ad f8 e2 ff ff fildll -0x1d08(%rbp)
>>>> 56e: de e9 fsubrp %st,%st(1)
>>>> 570: d9 05 00 00 00 00 flds 0x0(%rip) # 576
>>>> <fmt_fp+0x176>
>>>> 576: dc c9 fmul %st,%st(1)
>>>> steps++;
>>>> if(steps > 2000) {
>>>> 578: eb 48 jmp 5c2 <fmt_fp+0x1c2>
>>>> 57a: d9 c9 fxch %st(1)
>>>> 57c: eb 04 jmp 582 <fmt_fp+0x182>
>>>> 57e: 66 90 xchg %ax,%ax
>>>> 580: d9 c9 fxch %st(1)
>>>> *z = y;
>>>> 582: d9 c0 fld %st(0)
>>>> 584: d9 ad 0c e3 ff ff fldcw -0x1cf4(%rbp)
>>>> 58a: df bd 00 e3 ff ff fistpll -0x1d00(%rbp)
>>>> 590: d9 ad 0e e3 ff ff fldcw -0x1cf2(%rbp)
>>>> y = 1000000000*(y-*z++);
>>>> 596: 48 83 c3 04 add $0x4,%rbx
>>>> *z = y;
>>>> 59a: 48 8b 85 00 e3 ff ff mov -0x1d00(%rbp),%rax
>>>> 5a1: 89 43 fc mov %eax,-0x4(%rbx)
>>>> y = 1000000000*(y-*z++);
>>>> 5a4: 89 c0 mov %eax,%eax
>>>> 5a6: 48 89 85 f8 e2 ff ff mov %rax,-0x1d08(%rbp)
>>>> 5ad: df ad f8 e2 ff ff fildll -0x1d08(%rbp)
>>>> 5b3: de e9 fsubrp %st,%st(1)
>>>> 5b5: d8 c9 fmul %st(1),%st
>>>> if(steps > 2000) {
>>>> 5b7: 48 39 d3 cmp %rdx,%rbx
>>>> 5ba: 0f 84 99 10 00 00 je 1659 <fmt_fp+0x1259>
>>>> 5c0: d9 c9 fxch %st(1)
>>>> assert(0);
>>>> }
>>>> } while (y);
>>>> 5c2: d9 ee fldz
>>>> 5c4: d9 ca fxch %st(2)
>>>> 5c6: db ea fucomi %st(2),%st
>>>> 5c8: dd da fstp %st(2)
>>>> 5ca: 7a ae jp 57a <fmt_fp+0x17a>
>>>> 5cc: 75 b2 jne 580 <fmt_fp+0x180>
>>>> 5ce: dd d8 fstp %st(0)
>>>> 5d0: dd d8 fstp %st(0)
>>>>
>>>> On Wednesday, November 28, 2018 at 8:14:05 AM UTC-5, Nadav Har'El wrote:
>>>>>
>>>>>
>>>>> On Wed, Nov 28, 2018 at 2:18 PM Waldek Kozaczuk <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> On Nov 28, 2018, at 03:58, Nadav Har'El <[email protected]> wrote:
>>>>>>
>>>>>>
>>>>>> The situation is different with involuntary context switches. When an
>>>>>> asynchronous event, e.g., an interrupt, occurs, the user thread is in a
>>>>>> random position in the code. It may be using all its registers, and the
>>>>>> FPU
>>>>>> state (including the old-style FPU and the new SSE and AVX registers).
>>>>>> Because our interrupt handler (which may do anything from running the
>>>>>> scheduler to reading a page from disk on page fault) may need to use any
>>>>>> of
>>>>>> these registers, all of them, including the FPU, need to be saved on
>>>>>> interrupt time. The interrupt has a separate stack, and the FPU is saved
>>>>>> on
>>>>>> this stack (see fpu_lock use in interrupt()). When the interrupt
>>>>>> finishes,
>>>>>> this FPU is restored. This includes involuntary context switching:
>>>>>> thread A
>>>>>> receives an interrupt, saves the FPU, does something and decides to
>>>>>> switch
>>>>>> to thread B, and a while later we switch back to thread A at which point
>>>>>> the interrupt handler "returns" and restores the FPU state.
>>>>>>
>>>>>> Does the involuntary case include the scenario when the time designated
>>>>>> for the current thread by the scheduler expires? I would imagine this
>>>>>> would qualify as an interrupt?
>>>>>>
>>>>>
>>>>> Indeed. We set a timer to when this thread's runtime quota will
>>>>> expire, and this timer generates an interrupt. Check out interrupt() -
>>>>> after saving the FPU state and acknowledging the interrupt (EOI), it
>>>>> calls
>>>>> the scheduler (sched::preempt()). This will decide which thread to run
>>>>> next
>>>>> - it may run the same thread again, or a different thread. When
>>>>> sched::preempt() returns - possibly a long time later, it means the
>>>>> scheduler decided to run *this* thread again. At that point, interrupt()
>>>>> returns and just before returning it restores the FPU state
>>>>> automatically.
>>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "OSv Development" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>