#2289: Needless reboxing of values when returning from a tight loop
-------------------------------------------+--------------------------------
Reporter: dons | Owner:
Type: run-time performance bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 6.8.2
Severity: normal | Resolution:
Keywords: boxing, loops, performance | Testcase:
Architecture: Unknown | Os: Unknown
-------------------------------------------+--------------------------------
Comment (by dons):
Here's a simple test case:
{{{
{-# LANGUAGE TypeOperators #-}
{-# OPTIONS -funbox-strict-fields #-}
import System.Environment
import Text.Printf
data P a b = P !a !b
mean :: Double -> Double -> P Double Int
mean n m = go n 0 0
where
go :: Double -> Int -> Double -> P Double Int
go x l s | x > m = P s l
| otherwise = go (x+1) (l+1) (s+x)
main = do
[d] <- map read `fmap` getArgs
printf "%f\n" (case mean 1 d of
(P x y) -> x / fromIntegral y )
}}}
Yields:
{{{
$wgo_s1az :: Double# -> Int# -> Double# -> (# Double, Int #)
$wgo_s1az =
\ (ww_X1au :: Double#)
(ww1_X1az :: Int#)
(ww2_X1aE :: Double#) ->
case >## ww_X1au y_a178 of wild4_X1g {
False ->
$wgo_s1az
(+## ww_X1au 1.0)
(+# ww1_X1az 1)
(+## ww2_X1aE ww_X1au);
True -> (# D# ww2_X1aE, I# ww1_X1az #)
};
} in
case $wgo_s1az 2.0 1 1.0 of ww_s1a2 { (# ww1_s1a4, ww2_s1a5 #) ->
case ww1_s1a4 of wild4_a17r { D# x_a17t ->
case ww2_s1a5 of wild5_aEV { I# x1_aEX ->
case /## x_a17t (int2Double# x1_aEX)
of wild21_a17z { __DEFAULT ->
D# wild21_a17z
}}}
Exhibiting the reboxing.
Running this:
{{{
$ time ./G 1e9
500000000.067109
./G 1e9 2.21s user 0.00s system 99% cpu 2.225 total
}}}
While if we prevent the reboxing, by moving the division inside the loop:
{{{
$wgo_s1a0 :: Double# -> Int# -> Double# -> Double#
$wgo_s1a0 =
\ (ww_X19X :: Double#)
(ww1_X1a2 :: Int#)
(ww2_X1a7 :: Double#) ->
case >## ww_X19X y_a16C of wild4_X1g {
False ->
$wgo_s1a0
(+## ww_X19X 1.0)
(+# ww1_X1a2 1)
(+## ww2_X1a7 ww_X19X);
True -> /## ww2_X1a7 (int2Double# ww1_X1a2
}}}
We get faster code:
{{{
$ time ./G 1e9
500000000.067109
./G 1e9 1.84s user 0.01s system 99% cpu 1.861 total
}}}
So I suspect the boxing causes a heap check to end up inside the loop (run
on every iteration?),
and thus a performance loss.
--
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/2289#comment:1>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler_______________________________________________
Glasgow-haskell-bugs mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs