#5926: Add strict versions of modifyIORef and atomicModifyIORef
-------------------------------------+--------------------------------------
Reporter: joeyadams | Owner:
Type: feature request | Status: new
Priority: normal | Component: libraries/base
Version: 7.4.1 | Keywords:
Os: Unknown/Multiple | Architecture: Unknown/Multiple
Failure: Runtime performance bug | Testcase:
Blockedby: | Blocking:
Related: |
-------------------------------------+--------------------------------------
It is easy to misuse modifyIORef and atomicModifyIORef due to their lack
of strictness. For example, if you use an IORef as a counter, use
modifyIORef to increment it, and never evaluate the value until the very
end, your program will leak memory, and you may even get a stack overflow:
{{{
ref <- newIORef 0
replicateM_ 1000000 $ modifyIORef ref (+1)
readIORef ref >>= print
}}}
Today, I found a [https://github.com/vincenthz/hs-tls/pull/8 space leak in
the tls package]. Repeatedly calling sendData would leak memory. It
didn't take me long to find the cause once I noticed a module named
"Measurement" used for gathering connection statistics. It used
modifyIORef to update the Measurement structure. When I changed it to
this:
{{{
x <- readIORef (ctxMeasurement ctx)
writeIORef (ctxMeasurement ctx) $! f x
}}}
the space leak went away.
A more subtle mistake can be made using atomicModifyIORef. Can you spot
the problem with this code?
{{{
atomicModifyIORef ref (\_ -> (new_value, ()))
}}}
It's not incrementing anything, it's just replacing the value. However,
it's still deferring evaluation of the function. This mistake was pointed
out in [http://themonadreader.files.wordpress.com/2011/10/issue19.pdf The
Monad.Reader Issue 19], where they suggested the following idiom:
{{{
x <- atomicModifyIORef ref (\_ -> (new_value, ()))
x `seq` return ()
}}}
Thus, I believe there should be strict variants of modifyIORef and
atomicModifyIORef, if only to warn programmers of these pitfalls.
{{{modifyIORef'}}} is pretty straightforward: force the result of applying
the function:
{{{
modifyIORef' ref f = do
x <- readIORef ref
let x' = f x
x' `seq` writeIORef ref x'
}}}
The only question is: would it be better to force `x'` after `writeIORef`
instead of before it, in case a caller is trying to share the IORef among
threads (which they shouldn't be)?
{{{atomicModifyIORef}}} is less straightforward. Should we force the
values themselves? It is possible to avoid the space leak above without
forcing either the new value or the return value:
{{{
atomicModifyIORef' :: IORef a -> (a -> (a,b)) -> IO b
atomicModifyIORef' ref f = do
p <- atomicModifyIORef ref (\a -> let p = f a in (fst p, p))
p `seq` return (snd p)
}}}
It also allows {{{f}}} to decide what to force. For example, with this
definition of atomicModifyIORef', the following program prints 10000000
and does not leak memory:
{{{
ref <- newIORef 0
replicateM_ 10000000 $
atomicModifyIORef' ref
(\n -> let n' = n + 1
in n' `seq` (n', undefined))
readIORef ref >>= print
}}}
In the attached patch, I didn't implement atomicModifyIORef' this way.
Instead, I made it force both the old and new values, and added a separate
function called atomicWriteIORef that has the same signature as
writeIORef, but is based on atomicModifyIORef.
I believe the real value of such functions is to warn programmers about
these pitfalls.
--
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/5926>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
_______________________________________________
Glasgow-haskell-bugs mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs