#3065: Reorder tests in quot to improve code
---------------------------------------+------------------------------------
  Reporter:  simonpj                   |          Owner:                  
      Type:  run-time performance bug  |         Status:  new             
  Priority:  normal                    |      Milestone:  6.12 branch     
 Component:  libraries/base            |        Version:  6.10.1          
  Severity:  normal                    |       Keywords:                  
Difficulty:  Unknown                   |       Testcase:                  
        Os:  Unknown/Multiple          |   Architecture:  Unknown/Multiple
---------------------------------------+------------------------------------
 [SLPJ: capturing an email thread in a ticket]

 Krasimir Angelov found that this function:
 {{{
 a `quot` b
     | b == 0                     = divZeroError
     | a == minBound && b == (-1) = overflowError
     | otherwise                  =  a `quotInt` b
 }}}
 is expanded to:
 {{{
 a `quot` b =
    if b == 0
      then divZeroError
      else if a == minBound
               then if b == (-1)
                        then overflowError
                        else a `quotInt` b
               else a `quotInt` b
 }}}
 Then the compiler sees that b is a constant and computes that b == 0
 is False and b == (-1) is also False so it could eliminate two If
 statements. The result is:
 {{{
 a `quot` b =
    if a == minBound
      then a `quotInt` b
      else a `quotInt` b
 }}}
 and this is exactly what we get. I bet that if the original function was:
 {{{
 a `quot` b
     | b == 0                     = divZeroError
     | b == (-1) && a == minBound = overflowError   -- Note the changed
 order here
     | otherwise                  =  a `quotInt` b
 }}}
 then we would get what we want. I think that it is much more often to
 have division where the divisor is known so we will get the best code
 in this case.

 Shortly afterwards, Bertram Felgenhauer tried it out.  It works, and it
 has the desired effect. It's not always a win though; the nofib prime
 sieve benchmark suffers.

 For a patch, see
   http://int-e.home.tlink.de/haskell/ghc-libraries-base-tune-
 division.patch
 (SLPJ: I've attached it to the ticket too)

 Nofib results extract:
 {{{
 ------------------------------------------------------------------------
         Program           Size    Allocs   Runtime   Elapsed
 ------------------------------------------------------------------------
            fish          -0.7%     -5.3%      0.05      0.04
          primes          -0.0%    +28.5%    +25.6%    +25.5%
    wheel-sieve2          -0.0%     -0.3%    -17.9%    -18.6%
 ------------------------------------------------------------------------
             Min          -0.9%     -5.3%    -17.9%    -18.6%
             Max          +0.1%    +28.5%    +25.6%    +25.5%
  Geometric Mean          -0.2%     +0.2%     -0.0%     +0.2%
 }}}
 'primes' is an outlier - the other slowdowns are below 3%

 What happens in 'primes' is that 'mod' no longer gets inlined;
 apparently it now looks bigger to the compiler than before.

 Using -funfolding-use-threshold=10 brings the benchmark back to its
 original speed, despite the extra comparison before doing the
 division.

 In 'wheel-sieve2', the first different optimization choice I see is
 again a 'mod' that is no longer inlined; this leads to a change in other
 inlining choices that result in a speedup for reasons that I have not
 investigated.

-- 
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/3065>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
_______________________________________________
Glasgow-haskell-bugs mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs

Reply via email to