#4391: forkIO threads do not properly save/restore the floating point environment
---------------------------------+------------------------------------------
Reporter: draconx | Owner:
Type: bug | Status: merge
Priority: normal | Milestone: 7.2.1
Component: Runtime System | Version: 6.12.3
Keywords: | Testcase:
Blockedby: | Difficulty:
Os: Unknown/Multiple | Blocking:
Architecture: x86_64 (amd64) | Failure: None/Unknown
---------------------------------+------------------------------------------
Comment(by draconx):
Replying to [comment:14 simonmar]:
> Replying to [comment:13 draconx]:
> > Also, this IMO needs to be in the Control.Concurrent manual, because
> > this has everything to do with threads and nothing to do with the FFI.
>
> I don't think I agree. You have to use the FFI to access the `fenv.h`
> functions
Sure, but the `fenv.h` functions aren't the only way to access the
floating-point unit (the fact that they're standard C helps with
portability, though).
Maybe it's impossible to access it without using the FFI, but that's
incidental: I don't think you can access the internet without using the
FFI either, but that doesn't mean we should put issues related to IP
fragmentation in the FFI section of the user guide.
> and if you do, you have problems even without threads.
The reason why this is purely a threading issue is that it affects *all*
floating point, not just the built-in floating-point ops. You will get no
argument from me that the built-in ops are problematic. To illustrate
this, let's consider an example. Suppose we had a floating-point API which
captures all impurity. For simplicity, we'll do this by putting everything
in IO. Our API therefore looks something like the following:
{{{
dblAdd :: Double -> Double -> IO Double
dblMul :: Double -> Double -> IO Double
fpSetRoundingMode :: FPRoundingMode -> IO ()
fpTestExceptions :: IO [FPException]
-- etc.
}}}
We can easily implement this API, today, using the FFI (ignoring all
issues related to the marshalling of floating-point data). Further suppose
that the program never uses any of the built-in floating-point ops, so
that there are trivially no problems related to impurity. This API might
even come from a library which hides the fact that it uses the FFI
internally.
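For what it's worth, the fenv.h half of such a library really is only a
few lines of FFI. A minimal sketch (the Haskell names are invented, and
hard-coding FE_TONEAREST as 0 assumes the glibc/x86 encoding; a real
binding would let hsc2hs read the constant from <fenv.h>):

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Foreign.C.Types (CInt(..))

-- Hypothetical bindings to the C99 <fenv.h> interface.
-- fesetround returns 0 on success, non-zero on failure.
foreign import ccall unsafe "fenv.h fesetround"
  c_fesetround :: CInt -> IO CInt

foreign import ccall unsafe "fenv.h fegetround"
  c_fegetround :: IO CInt

-- FE_TONEAREST is 0 in the glibc/x86 encoding (a platform assumption).
feToNearest :: CInt
feToNearest = 0

main :: IO ()
main = do
  r <- c_fesetround feToNearest   -- set the rounding mode...
  m <- c_fegetround               -- ...and read it back
  print (r == 0 && m == feToNearest)
```

Note that nothing here mentions threads, yet the state being set is
per-CPU, which is exactly why the problem surfaces under forkIO.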
A well-intentioned application developer has a correct single-threaded
program using this API. She realizes that she can make it faster by using
threads, so she turns to the threading documentation. The thread
documentation tells her that forkOS threads are needed if you use "thread
local state", and that otherwise forkIO threads are much faster (the docs
emphasize this last point *very* strongly). It makes no mention of
floating point, so she (quite reasonably) assumes that floating point
(which doesn't depend on "thread local state" in the usual sense of that
term) is OK to use with forkIO.
Little does she know that her program is now subject to subtle, rare
races. Despite extensive testing, these races are never encountered in
the lab. The issue remains hidden until the maiden launch of the
spacecraft on which her code is running, at which point a mishandled
floating-point exception causes the craft to break apart shortly after
takeoff.
> > About purity: the thing is, even if we had the perfect pure API for
> > floating point, you'd _still_ be bitten by this issue! That's because
> > the issue is not about purity at all: it's about the runtime
> > clobbering CPU registers on context switches. Note that integer
> > operations on a certain popular CPU architecture are just as "impure"
> > as floating point
> Are you claiming we have a problem with the overflow flag, or other CPU
> state?
No, I didn't mean to suggest that there was any problem with the handling
of the integer overflow flag. I just wanted to draw a parallel to show
that the issues are the same: floating point is not somehow special in
this regard.
> I get the impression from your comments that you think GHC preempts
> threads at arbitrary points
From the perspective of the application developer, this is exactly what
happens, since it's essentially impossible to know in advance when memory
allocations will or will not occur. Furthermore, the wording in the docs
suggests that it's not even safe to rely on this. Statements such as
"threads are interleaved in a random fashion" and "GHC doesn't
*currently* attempt [to preempt tight loops]" (emphasis mine) suggest
that threads might be preempted for other reasons in the future.
> We don't do that - threads are preempted at safe points only, and we
> know exactly what state needs to be saved and restored (it doesn't
> include the overflow flag, for instance, because we know that a safe
> point never occurs between an instruction that sets the overflow flag
> and one that tests it).
That's fine, but AFAIK there's no way for an application developer to
make the same guarantee: that a safe point never occurs between an
instruction that sets the floating-point overflow flag and one that
tests it. Please correct me if I'm wrong.
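To make the window concrete in terms of the hypothetical API above: any
allocation between the set and the test is a potential safe point. The
sketch below compiles by giving the API stub definitions (all names
invented for illustration); the comments mark where the switch can
happen:

```haskell
import Control.Concurrent (forkIO)

data FPRoundingMode = RoundNearest | RoundUpward
data FPException = FPInexact | FPOverflow deriving Show

-- Stubs standing in for the FFI-backed implementations.
fpSetRoundingMode :: FPRoundingMode -> IO ()
fpSetRoundingMode _ = return ()

dblAdd :: Double -> Double -> IO Double
dblAdd x y = return (x + y)

fpTestExceptions :: IO [FPException]
fpTestExceptions = return []

-- The window that Haskell code cannot close: the allocation performed
-- by dblAdd is a potential safe point, so the scheduler may run another
-- forkIO thread between the set and the test.
computeUp :: IO Double
computeUp = do
  fpSetRoundingMode RoundUpward
  x <- dblAdd 0.1 0.2      -- allocation here => possible context switch
  _ <- fpTestExceptions    -- may observe another thread's FP state
  return x

main :: IO ()
main = do
  _ <- forkIO (fpSetRoundingMode RoundNearest)  -- the racing thread
  computeUp >>= print
```

The point is that no reordering or bracketing available to the developer
removes the allocation in the middle, so she cannot re-establish the
invariant the RTS relies on for the integer overflow flag.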
--
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/4391#comment:15>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
_______________________________________________
Glasgow-haskell-bugs mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs