The title of this thread indicates that the crash happens sporadically and
indeed it is the case with the scenario described in 1st and 2nd email of
this thread. The symptom (stack trace with fmt_fp() function captured by
gdb) looks identical to what was reported in
https://github.com/cloudius-systems/osv/issues/1010 and this
https://github.com/cloudius-systems/osv/issues/536. We cannot be certain
about whether the root cause is the same. However it is worth mentioning
that crash happens with same symptom almost every time if I run the
scenario described in issue #1010 which is different (but similar) than
scenario described in 1st and 2nd email of this thread. I am emphasizing
this because we have a way to replicate one of these crashes (*possibly
same root cause*) in very repeatable way in case we have ideas of things to
try.
Regarding your excellent explanations how non-FPU and FPU state is
saved/restored by OSv I wanted to run a scenario to see if I understand it
well (please forgive my ignorance if I am missing something obvious).
Imagine we have 2 user threads (uT1 and uT2) executing some *logic that
involves floating point operations in a loop* (like the one in the
disassembled code). Ffmpeg is a good example as it must be very CPU
intensive and doing all kind of FP computations to transcode video. Let us
say on given cpu this sequence happens in time:
->uT1 is running (executing long FP logic loop)
--> timer triggered interrupt happens because allotted time slice expired
[some FPU registers had some inflight data = FPU has state S1)
----> OSv saves FPU state and other registers using fpu_lock construct
-------> OSv identifies next thread to run which is uT2
---------> OSv restores FPU to state S1 (fpu_lock goes out of scope) and
other registers as well and switches to uT2
--------------> UT2 is running (executing long FP logic loop)
--> timer triggered interrupt happens again because allotted time slice
expired [some FPU registers had some inflight data = FPU has state S2)
... same thing
... same thing
---------> OSv restores FPU to state S2 and other registers as well and
switches back to uT1
*--------------> UT1 is resuming where it left above and instead of FPU be
in expected S1 state it sees FPU in state S2 -> crash????*
What am I missing in my scenario?
Lastly I am adding the disassembled portion of fmt_fp() with applied patch
like below that makes the repeatable crash (from issue #1010) go away or at
least way less frequent.
diff --git a/libc/stdio/vfprintf.c b/libc/stdio/vfprintf.c
index aac790c0..1e116038 100644
--- a/libc/stdio/vfprintf.c
+++ b/libc/stdio/vfprintf.c
@@ -8,6 +8,7 @@
#include <inttypes.h>
#include <math.h>
#include <float.h>
+#include <assert.h>
/* Some useful macros */
@@ -296,9 +297,14 @@ static int fmt_fp(FILE *f, long double y, int w, int p,
int fl, int t)
if (e2<0) a=r=z=big;
else a=r=z=big+sizeof(big)/sizeof(*big) - LDBL_MANT_DIG - 1;
+ int steps = 0;
do {
*z = y;
y = 1000000000*(y-*z++);
+ steps++;
+ if(steps > 2000) {
+ assert(0);
+ }
} while (y);
while (e2>0) {
Your explanation was that because of added assert() function call we force
compiler to generate code to save/restore any used FPU registers. But if we
do not see the crash with this patch applied happeny any more, then the
assert function does not get called and FPU not saved/restored, so how this
helps make this crash go away?
Portion of disassembled code:
if (e2<0) a=r=z=big;
else a=r=z=big+sizeof(big)/sizeof(*big) - LDBL_MANT_DIG - 1;
int steps = 0;
do {
*z = y;
501: d9 bd 0e e3 ff ff fnstcw -0x1cf2(%rbp)
if (e2<0) a=r=z=big;
507: 48 8d 85 40 e3 ff ff lea -0x1cc0(%rbp),%rax
50e: 45 85 c9 test %r9d,%r9d
511: 48 8d 95 cc fe ff ff lea -0x134(%rbp),%rdx
518: 48 89 c1 mov %rax,%rcx
*z = y;
51b: 0f b7 85 0e e3 ff ff movzwl -0x1cf2(%rbp),%eax
if (e2<0) a=r=z=big;
522: 48 0f 49 ca cmovns %rdx,%rcx
*z = y;
526: 80 cc 0c or $0xc,%ah
if (e2<0) a=r=z=big;
529: 48 89 8d d8 e2 ff ff mov %rcx,-0x1d28(%rbp)
y = 1000000000*(y-*z++);
530: 48 8d 59 04 lea 0x4(%rcx),%rbx
534: 48 8d 91 44 1f 00 00 lea 0x1f44(%rcx),%rdx
*z = y;
53b: 66 89 85 0c e3 ff ff mov %ax,-0x1cf4(%rbp)
542: d9 c0 fld %st(0)
544: d9 ad 0c e3 ff ff fldcw -0x1cf4(%rbp)
54a: df bd 00 e3 ff ff fistpll -0x1d00(%rbp)
550: d9 ad 0e e3 ff ff fldcw -0x1cf2(%rbp)
556: 48 8b 85 00 e3 ff ff mov -0x1d00(%rbp),%rax
55d: 89 01 mov %eax,(%rcx)
y = 1000000000*(y-*z++);
55f: 89 c0 mov %eax,%eax
561: 48 89 85 f8 e2 ff ff mov %rax,-0x1d08(%rbp)
568: df ad f8 e2 ff ff fildll -0x1d08(%rbp)
56e: de e9 fsubrp %st,%st(1)
570: d9 05 00 00 00 00 flds 0x0(%rip) # 576
<fmt_fp+0x176>
576: dc c9 fmul %st,%st(1)
steps++;
if(steps > 2000) {
578: eb 48 jmp 5c2 <fmt_fp+0x1c2>
57a: d9 c9 fxch %st(1)
57c: eb 04 jmp 582 <fmt_fp+0x182>
57e: 66 90 xchg %ax,%ax
580: d9 c9 fxch %st(1)
*z = y;
582: d9 c0 fld %st(0)
584: d9 ad 0c e3 ff ff fldcw -0x1cf4(%rbp)
58a: df bd 00 e3 ff ff fistpll -0x1d00(%rbp)
590: d9 ad 0e e3 ff ff fldcw -0x1cf2(%rbp)
y = 1000000000*(y-*z++);
596: 48 83 c3 04 add $0x4,%rbx
*z = y;
59a: 48 8b 85 00 e3 ff ff mov -0x1d00(%rbp),%rax
5a1: 89 43 fc mov %eax,-0x4(%rbx)
y = 1000000000*(y-*z++);
5a4: 89 c0 mov %eax,%eax
5a6: 48 89 85 f8 e2 ff ff mov %rax,-0x1d08(%rbp)
5ad: df ad f8 e2 ff ff fildll -0x1d08(%rbp)
5b3: de e9 fsubrp %st,%st(1)
5b5: d8 c9 fmul %st(1),%st
if(steps > 2000) {
5b7: 48 39 d3 cmp %rdx,%rbx
5ba: 0f 84 99 10 00 00 je 1659 <fmt_fp+0x1259>
5c0: d9 c9 fxch %st(1)
assert(0);
}
} while (y);
5c2: d9 ee fldz
5c4: d9 ca fxch %st(2)
5c6: db ea fucomi %st(2),%st
5c8: dd da fstp %st(2)
5ca: 7a ae jp 57a <fmt_fp+0x17a>
5cc: 75 b2 jne 580 <fmt_fp+0x180>
5ce: dd d8 fstp %st(0)
5d0: dd d8 fstp %st(0)
On Wednesday, November 28, 2018 at 8:14:05 AM UTC-5, Nadav Har'El wrote:
>
>
> On Wed, Nov 28, 2018 at 2:18 PM Waldek Kozaczuk <[email protected]
> <javascript:>> wrote:
>
>> On Nov 28, 2018, at 03:58, Nadav Har'El <[email protected] <javascript:>>
>> wrote:
>>
>>
>> The situation is different with involuntary context switches. When an
>> asynchronous event, e.g., an interrupt, occurs, the user thread is in a
>> random position in the code. It may be using all its registers, and the FPU
>> state (including the old-style FPU and the new SSE and AVX registers).
>> Because our interrupt handler (which may do anything from running the
>> scheduler to reading a page from disk on page fault) may need to use any of
>> these registers, all of them, including the FPU, need to be saved on
>> interrupt time. The interrupt has a separate stack, and the FPU is saved on
>> this stack (see fpu_lock use in interrupt()). When the interrupt finishes,
>> this FPU is restored. This includes involuntary context switching: thread A
>> receives an interrupt, saves the FPU, does something and decides to switch
>> to thread B, and a while later we switch back to thread A at which point
>> the interrupt handler "returns" and restores the FPU state.
>>
>> Does involuntary case include scenario when enough time designated for
>> current thread by scheduler expires? I would imaging this would qualify as
>> interrupt?
>>
>
> Indeed. We set a timer to when this thread's runtime quota will expire,
> and this timer generates an interrupt. Check out interrupt() - after saving
> the FPU state and acknowledging the interrupt (EOI), it calls the scheduler
> (sched::preempt()). This will decide which thread to run next - it may run
> the same thread again, or a different thread. When sched::preempt() returns
> - possibly a long time later, it means the scheduler decided to run *this*
> thread again. At that point, interrupt() returns and just before returning
> it restores the FPU state automatically.
>
--
You received this message because you are subscribed to the Google Groups "OSv
Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
For more options, visit https://groups.google.com/d/optout.