#5237: Inefficient code generated for x^2
-------------------------------+--------------------------------------------
    Reporter:  scpmw           |        Owner:                         
        Type:  bug             |       Status:  patch                  
    Priority:  normal          |    Milestone:  7.4.1                  
   Component:  libraries/base  |      Version:  7.0.3                  
    Keywords:                  |     Testcase:                         
   Blockedby:                  |   Difficulty:                         
          Os:  Linux           |     Blocking:                         
Architecture:  x86_64 (amd64)  |      Failure:  Runtime performance bug
-------------------------------+--------------------------------------------

Comment(by daniel.is.fischer):

 The problem is that if you write `expr ^ 2` in the source, that's exactly
 what you get. But it's not what you want.

 Consider the programme
 {{{
 {-# LANGUAGE BangPatterns #-}
 module Main (main) where

 import System.Environment (getArgs)

 fun :: Double -> Double
 fun x = go 0 0.5
   where
     go !acc z
       | x < z   = acc
       | otherwise = go (acc + z^2) (z+0.25)

 main :: IO ()
 main = getArgs >>= mapM_ (print . fun . read)
 }}}
 Compiling it with ghc-7.2.1, I get nearly 32K of Core and an executable
 delivering
 {{{
 dafis@schwartz:~/Haskell/BeginnersTesting> ./squareTest721 +RTS -s -RTS
 1.2e7
 2.3040000720013998e21
    2,304,132,792 bytes allocated in the heap
          186,544 bytes copied during GC
           28,992 bytes maximum residency (1 sample(s))
           26,288 bytes maximum slop
                1 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max
 pause
   Gen  0      4411 colls,     0 par    0.01s    0.01s     0.0000s
 0.0000s
   Gen  1         1 colls,     0 par    0.00s    0.00s     0.0004s
 0.0004s

   INIT    time    0.00s  (  0.00s elapsed)
   MUT     time    2.53s  (  2.53s elapsed)
   GC      time    0.02s  (  0.01s elapsed)
   EXIT    time    0.00s  (  0.00s elapsed)
   Total   time    2.54s  (  2.54s elapsed)

   %GC     time       0.6%  (0.6% elapsed)

   Alloc rate    910,944,778 bytes per MUT second

   Productivity  99.4% of total user, 99.4% of total elapsed
 }}}
 Compiling it with ghc-7.3.20110926 with rewrite rules, I get 6.6K of Core
 and an executable delivering
 {{{
 dafis@schwartz:~/Haskell/BeginnersTesting> ./squareTest73R +RTS -s -RTS
 1.2e7
 2.3040000720013998e21
          132,184 bytes allocated in the heap
            3,304 bytes copied during GC
           44,200 bytes maximum residency (1 sample(s))
           17,240 bytes maximum slop
                1 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max
 pause
   Gen  0         0 colls,     0 par    0.00s    0.00s     0.0000s
 0.0000s
   Gen  1         1 colls,     0 par    0.00s    0.00s     0.0001s
 0.0001s

   INIT    time    0.00s  (  0.00s elapsed)
   MUT     time    0.07s  (  0.07s elapsed)
   GC      time    0.00s  (  0.00s elapsed)
   EXIT    time    0.00s  (  0.00s elapsed)
   Total   time    0.07s  (  0.07s elapsed)

   %GC     time       0.1%  (0.1% elapsed)

   Alloc rate    1,767,189 bytes per MUT second

   Productivity  99.5% of total user, 103.1% of total elapsed
 }}}

 Since `(^)` is `{-# INLINABLE #-}`, GHC will most of the time create a
 type-specialised version of the exponentiation-by-repeated-squaring
 algorithm (a wrapper to test that the exponent isn't negative and to
 unpack the arguments if applicable, and two loops for the work). That
 function is then called with an exponent of 2. This creates a lot of code,
 much of which is never used (one loop runs twice, the other not at all).
 And it is an out-of-line function call, which can cost a lot of time in a
 loop.

 Having GHC rewrite `expr ^ 2` into `expr * expr`, a) no code for `(^)` has
 to be generated (or linked), b) for many types you get an inline
 multiplication instead of a function call.

 Theoretically, a compiler could in such a situation, when one argument is
 statically known at compile time, try to evaluate the function a few steps
 to see what gives (the speculative loop unrolling mentioned by scpmw in
 the ticket). In this case, it'd find
 {{{
 2 < 0 ? No => calculate f x 2
 even 2 ? Yes => calculate f (x*x) 1
 even 1 ? No => 1 == 1 ? Yes => result is (x*x)
 }}}
 I would expect it to be tremendously hard to implement such a speculative
 evaluation in a way that would often yield useful results and not unduly
 increase compile times, though.
 So while no true magic is available, let's add a few rewrite rules to
 catch the cases where using `(^)` hurts most:
 {{{
 {-# RULES
 "^2/Integer"  forall x. x ^ (2 :: Integer) = x*x
 "^3/Integer"  forall x. x ^ (3 :: Integer) = (x*x)*x
 "^4/Integer"  forall x. x ^ (4 :: Integer) = let y = x*x in y*y
   #-}
 }}}
 Fine. Now scpmw's `2.0 ^ 2` gets rewritten to `2.0 * 2.0` (and that is
 then evaluated to `4.0`), generally, occurrences of `expr ^ 2` get
 rewritten as desired.

 Unless the type of `expr` is `Integer`, because in `GHC.Real`, `(^)` is
 specialised for the type
 `Integer -> Integer -> Integer`. Then GHC has two rules to choose from,
 `^2/Integer` with matching exponent (and type), and the specialisation
 with matching argument and exponent type.
 Specialisation wins in this case. Since `Integer` multiplication isn't a
 primop but a function call, the effect isn't nearly as dramatic, but for
 the analogue of the above programme, I find a reduction of allocation and
 running time by a factor of roughly 2.3.

 The more specific rewrite rule with type of base and exponent specified as
 `Integer` fires, though, since it's more specific than the specialisation.

 Now, if you're compiling with `-Wall`, GHC will tell you
 {{{
     Warning: Defaulting the following constraint(s) to type `Integer'
                (Integral b0) arising from a use of `^' at
 squareTest.hs:11:32
                (Num b0) arising from the literal `2' at
 squareTest.hs:11:33
 }}}
 and you might make it `^(2 :: Int)` to have warning-free code and besides,
 `Int` is more efficient than `Integer`, isn't it? At least I have done
 that.

 But now the above rules don't match and you get the expensive `(^)` again.
 So add rules for `Int` exponents. Since there are also specialisations of
 `(^)` for the types `Integer -> Int -> Integer` and `Int -> Int -> Int`,
 we also need extra rules for these types to win over specialisation.

 Finally, what if the defaulting is resolved by making it `^(2 :: Word)`,
 after all, the exponent mustn't be negative, so why use a type with
 negative values?
 Therefore the first patch included rules for `Word` exponents. Since
 `Word` isn't available in `GHC.Real`, I added them to `GHC.Word`, but when
 testing, the only way to get them to fire was giving the exponent in the
 form `W# n##`, so I left them out of the second patch.

 Afterthought: What about `(^^)`?

-- 
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/5237#comment:9>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
Glasgow-haskell-bugs mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs

Reply via email to