#5749: GHC 7.0.4 Performance Regression (Possibly Vector)
-------------------------------------+--------------------------------------
Reporter: sanketr | Owner:
Type: bug | Status: new
Priority: normal | Component: Compiler
Version: 7.0.4 | Keywords: performance, vector
Os: Linux | Architecture: x86_64 (amd64)
Failure: Runtime performance bug | Testcase:
Blockedby: | Blocking:
Related: 5623 |
-------------------------------------+--------------------------------------
I have noticed ~100% performance degradation for my code when I switched
from 6.12.3 to 7.0.4. This might be related to vector performance ticket
[http://hackage.haskell.org/trac/ghc/ticket/5623 5623] but I noticed it
was filed for performance regression of 7.2.1 relative to 7.0.4, and
6.12.1 vs 7.0.4 performance was reported as ok. So, I am filing it as new
bug report.
I am attaching an edited version of my code below which reproduces the
issue. It is from actual production code which is used for db driver.
Relevant performance benchmarks (~95% degradation):
'''GHC 6.12.3 MUT Time:''' 0.48s
'''GHC 7.0.4 MUT Time:''' 0.95s
In actual code, performance degrades by ~100%, from ~1.3s to ~2.6s. So, I
can't move from 6.12.3 to 7.0.4 or 7.4+ if I want to keep the performance
:(
Code below - the comment block at the end shows how to compile, and
reproduce the issue - I will be happy to provide more information to fix
the issue:
{{{
{-# LANGUAGE BangPatterns #-}
import qualified Data.Vector.Storable as SV
import qualified Data.Vector.Storable.Mutable as MSV
import qualified Data.Vector as V
import Foreign (sizeOf)
import Foreign.C.Types (CChar)
import GHC.Int
import System.IO.Unsafe (unsafePerformIO)
import Control.Exception (evaluate)
data Elems = IV {-# UNPACK #-} !(SV.Vector Int32)
| SV {-#UNPACK #-} !Int {-# UNPACK #-} !(SV.Vector CChar) --
Int stores the number of null-terminated C Strings
| T {-# UNPACK #-} !Int {-# UNPACK #-} !(V.Vector Elems) --
Int stores total bytes needed to copy vectors to ByteString
| L {-# UNPACK #-} !(V.Vector Elems) -- General list of
elements
deriving (Show)
-- | Function to return total size in bytes taken to store the data from
Elems
size :: Elems -> Int
size (IV x) = 6 + (sizeOf (undefined :: Int32)) * (SV.length x)
size (SV _ x) = 6 + (sizeOf (undefined :: CChar)) * (SV.length x)
size (T n _) = n
size (L x) = V.foldl' (\x y -> x + size y) 6 x
{-# INLINE size #-}
fillS :: [[CChar]] -> Elems
fillS x = let (x',y') = createS x
in SV x' y'
{-# INLINE fillS #-}
createS :: [[CChar]] -> (Int, SV.Vector CChar)
createS cl = unsafePerformIO $ do
v <- MSV.new (Prelude.length . Prelude.concat $ cl)
fill v 0 $ Prelude.concat cl
SV.unsafeFreeze v >>= \x -> return (Prelude.length cl,x)
where
fill v _ [] = return ()
fill v i (x:xs) = MSV.unsafeWrite v i x >> fill v (i + 1) xs
{-# INLINE createS #-}
-- | Constructor for T - a db table - we must always build it using this
function
fillT :: V.Vector Elems -> Elems
fillT !xs = T (V.foldl' (\x y -> x + size y) 3 xs) xs -- 2 bytes for table
header + 1 additional byte for dict type header => 3 bytes additional
overhead
{-# INLINE fillT #-}
main = do
let il1 = IV $ SV.enumFromN 1 50000000
il2 = IV $ SV.enumFromN 1 50000000
il3 = IV $ SV.enumFromN 1 50000000
l1 = L (V.fromList [il1,il2,il3])
sl1 = fillS [[97,0],[98,0],[99,0]]
evaluate $ fillT (V.fromList [sl1,l1])
return ()
{-- GHC 6.12.3:
$ ghc -O2 --make test.hs -fforce-recomp -rtsopts -fasm && ./test +RTS -s
[1 of 1] Compiling Main ( test.hs, test.o )
Linking test ...
./test +RTS -s
600,843,536 bytes allocated in the heap
8,336 bytes copied during GC
200,002,504 bytes maximum residency (2 sample(s))
793,936 bytes maximum slop
574 MB total memory in use (9 MB lost due to fragmentation)
Generation 0: 2 collections, 0 parallel, 0.00s, 0.00s elapsed
Generation 1: 2 collections, 0 parallel, 0.00s, 0.00s elapsed
INIT time 0.00s ( 0.00s elapsed)
MUT time 0.48s ( 0.97s elapsed)
GC time 0.00s ( 0.00s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 0.48s ( 0.97s elapsed)
%GC time 0.2% (0.1% elapsed)
Alloc rate 1,259,822,857 bytes per MUT second
Productivity 99.6% of total user, 49.0% of total elapsed
-----
GHC 7.0.4:
$ ghc -O2 --make test.hs -fforce-recomp -rtsopts -fasm && ./test +RTS -s
[1 of 1] Compiling Main ( test.hs, test.o )
Linking test ...
./test +RTS -s
600,836,872 bytes allocated in the heap
7,664 bytes copied during GC
200,002,224 bytes maximum residency (2 sample(s))
794,216 bytes maximum slop
574 MB total memory in use (0 MB lost due to fragmentation)
Generation 0: 2 collections, 0 parallel, 0.00s, 0.00s elapsed
Generation 1: 2 collections, 0 parallel, 0.11s, 0.11s elapsed
INIT time 0.00s ( 0.00s elapsed)
MUT time 0.94s ( 1.01s elapsed)
GC time 0.11s ( 0.11s elapsed)
EXIT time 0.00s ( 0.11s elapsed)
Total time 1.05s ( 1.12s elapsed)
%GC time 10.2% (9.7% elapsed)
Alloc rate 638,055,951 bytes per MUT second
Productivity 89.4% of total user, 83.4% of total elapsed
--}
}}}
--
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/5749>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
_______________________________________________
Glasgow-haskell-bugs mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs