#2724: foreign functions called from optimized Haskell code receive non-empty
fpu
stack
-----------------------+----------------------------------------------------
Reporter: aruiz | Owner:
Type: bug | Status: new
Priority: normal | Component: Compiler
Version: 6.8.3 | Severity: normal
Keywords: | Testcase:
Architecture: x86 | Os: Linux
-----------------------+----------------------------------------------------
It has been observed that wrong NaN's are produced by foreign functions in
complex and not easily reproducible situations:
http://hackage.haskell.org/trac/ghc/ticket/1944
http://thread.gmane.org/gmane.comp.lang.haskell.cafe/31675
I also had this problem in the hmatrix library. But now it can be easily
reproduced and traced to a particular point in foreign C code, where we
observe that the product of two innocent looking doubles evaluates to NaN.
Using gdb we find that the fpu state received by the foreign function,
which is normally like this:
{{{
(gdb) info float
R7: Empty 0x3ffcaaf881cab3b3b800
R6: Empty 0x3fff8000000000000000
R5: Empty 0x00000000000000000000
R4: Empty 0x3fff8000000000000000
R3: Empty 0xbffb88bf61ae9d22828a
R2: Empty 0x80000000000000000000
R1: Empty 0x3ffe86490c2ce3303000
=>R0: Empty 0xbffcfb4ceea7f26db95c
Status Word: 0x4032 DE UE PE C3
TOP: 0
Control Word: 0x037f IM DM ZM OM UM PM
PC: Extended Precision (64-bits)
RC: Round to nearest
Tag Word: 0xffff
Instruction Pointer: 0x73:0xb7e284b0
Operand Pointer: 0x7b:0xb6e7e1f8
Opcode: 0xdd1a
}}}
has some "valid" data in the calls which produce the NaN:
{{{
(gdb) info float
R7: Empty 0x3fcb9e8b62d59977a000
R6: Empty 0x3fcabc5cf9a9be1cd800
R5: Empty 0x401f8000000000000000
R4: Empty 0x401e9b97f4a800000000
R3: Empty 0xbffcaaf881cab3b3b800
R2: Empty 0x00000000000000000000
R1: Valid 0x3fdddbe6fecebdedd800 +1.000000000000000036e-10
=>R0: Valid 0x3fcb9e8b62d59977a000 +2.750308259657520213e-16
Status Word: 0x0132 DE UE PE C0
TOP: 0
Control Word: 0x037f IM DM ZM OM UM PM
PC: Extended Precision (64-bits)
RC: Round to nearest
Tag Word: 0xfff0
Instruction Pointer: 0x73:0x08063cb3
Operand Pointer: 0x7b:0x00000000
Opcode: 0xd8da
}}}
The attached tarball showbug.tar.gz contains a modified version of the
library which exposes the problem. Detailed information can be found in
the enclosed file README_BUG.
A simple workaround is calling asm("finit") before any work is done in the
foreign functions. However, there is strong evidence that the problem is
caused in the Haskell side because asm("finit") does not avoid the problem
if called '''after''' each foreign function. See README_BUG for details.
In any case, I also attach a version (showbug2) without lapack (please try
the other one first).
It only happens with '''optimized -fasm''' Haskell code, and with certain
foreign functions. Interestingly, some implementations (e.g. Intel's MKL)
don't produce any NaN, so at first it could be thought that there was some
subtle bug in some versions of blas/lapack. But this new result about the
fpu state suggests that this is probably not the case.
The wrong NaN's appear in all the 32bit machines that I have tested (older
Pentium, newer Dual Core, and even on an ultraportable with Atom, and with
different Ubuntu versions, Fedora, Arch, and Debian). Probably it also
happens in Windows, but I have not tested it recently. I experienced the
problem with ghc 6.8.x, 6.10-rc, (and probably 6.6). I cannot reproduce
the problem on 64bit systems (but I have only tested one single 64bit
machine)
Finally, I hope that this problem is not caused by some error in the
hmatrix library...
Alberto Ruiz
--
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/2724>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler_______________________________________________
Glasgow-haskell-bugs mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs