On Tue, Nov 27, 2018 at 8:36 PM Waldek Kozaczuk wrote:
> Here is the disassembled portion of the code in vfprintf.c:
> else a=r=z=big+sizeof(big)/sizeof(*big) - LDBL_MANT_DIG - 1;
>
> do {
>     *z = y;
>  540: d9 c0                   fld    %st(0)
>  542: d9 ad 0c e3 ff ff       fldcw  -0x1cf4(%rbp)
>  548: df bd 00 e3 ff ff       fistpll -0x1d00(%rbp)
>  54e: d9 ad 0e e3 ff ff       fldcw  -0x1cf2(%rbp)
>     y = 1000000000*(y-*z++);
>  554: 48 83 c3 04             add    $0x4,%rbx
>     *z = y;
>  558: 48 8b 85 00 e3 ff ff    mov    -0x1d00(%rbp),%rax
>  55f: 89 43 fc                mov    %eax,-0x4(%rbx)
>     y = 1000000000*(y-*z++);
>  562: 89 c0                   mov    %eax,%eax
>  564: 48 89 85 f0 e2 ff ff    mov    %rax,-0x1d10(%rbp)
>  56b: df ad f0 e2 ff ff       fildll -0x1d10(%rbp)
>  571: de e9                   fsubrp %st,%st(1)
>  573: d8 c9                   fmul   %st(1),%st
> } while (y);
>  575: d9 ee                   fldz
>  577: d9 c9                   fxch   %st(1)
>  579: db e9                   fucomi %st(1),%st
>  57b: dd d9                   fstp   %st(1)
>  57d: 7a c1                   jp     540 <fmt_fp+0x140>
>  57f: 75 bf                   jne    540 <fmt_fp+0x140>
>  581: dd d8                   fstp   %st(0)
>  583: dd d8                   fstp   %st(0)
>
> while (e2>0) {
>  585: 4c 8b a5 e0 e2 ff ff    mov    -0x1d20(%rbp),%r12
>
> Obviously it does use FP registers.
>
Indeed, this looks like a loop that works on FPU registers and the stack. The
loop's test, while (y), is the "fucomi" instruction, which compares two
floating-point values, one of them a zero created by "fldz". My
completely unproven suspicion is that in the middle of this loop we get an
interrupt (possibly also leading to a context switch, running another
thread, and only much later returning to this thread), and for some reason
the floating-point state (which includes the register stack, etc.) is not
saved correctly, or not restored correctly (perhaps restored from a
corrupted array?). If after such corruption "y" (in whatever register it
sits) becomes, for example, NaN, the loop will never finish: comparing NaN
with zero is unordered, so "fucomi" sets the parity flag and the "jp 540"
jumps back to the top every time. I wonder if we can print these registers
from gdb to check whether the "y=0" that gdb shows is really correct.
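
To illustrate the NaN scenario, here is a minimal C sketch of the same loop
shape. It is not the musl code: rounds_until_zero() and its cap parameter are
made up for this example, and the cap exists only so the demonstration
terminates. Converting NaN to an integer is undefined in C, so the sketch
stores 0 as the digit in that case instead of reproducing what "fistpll" does.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Hypothetical helper: runs a loop shaped like "do { *z = y;
 * y = 1000000000*(y-*z++); } while (y);" for at most `cap` rounds.
 * Returns the number of rounds needed for y to reach zero,
 * or -1 if y was still nonzero when the cap was hit. */
int rounds_until_zero(long double y, int cap)
{
    assert(cap > 0);
    int n = 0;
    while (y && n < cap) {
        /* Assumption for this sketch: use 0 when y is not finite,
         * because (uint32_t)NaN is undefined behavior in C. */
        uint32_t digit = isfinite(y) ? (uint32_t)y : 0;
        y = 1000000000 * (y - digit);  /* NaN in, NaN out */
        n++;
    }
    return y ? -1 : n;  /* NaN is "true" here, like jp/jne above */
}
```

With a finite input such as 0.5L the loop ends after two rounds. With NAN it
runs until the cap and returns -1, because NaN never compares equal to zero.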
> I wonder if there is some kind of arithmetic exception encountered
> (overflow when multiplying) which we do not handle correctly. I think we
> have a handler that would simply abort in this case.
>
I don't think this is the case. Floating point exceptions are rarely
enabled (see also https://github.com/cloudius-systems/osv/issues/855 where
we discovered qemu doesn't even support them correctly), and usually
overflow when multiplying will just create an "inf" value.
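
A small C sketch of that last point, assuming the default floating-point
environment in which exceptions are masked. The function name
overflow_is_silent() is made up for this example, and the volatile qualifiers
are only there to keep the compiler from folding the multiplication at
compile time.

```c
#include <float.h>
#include <math.h>

/* Hypothetical helper: multiplies the largest finite long double by 10.
 * With exceptions masked (the usual default), this does not trap;
 * the result is simply +inf and execution continues.
 * Returns 1 if that is what happened, 0 otherwise. */
int overflow_is_silent(void)
{
    volatile long double big = LDBL_MAX;
    volatile long double r = big * 10.0L;  /* overflows */
    return isinf(r) && r > 0;
}
```

If floating-point exceptions were unmasked, the multiplication could raise a
trap instead of returning, which is the case the quoted question asks about.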