#2289: Needless reboxing of values when returning from a tight loop
-------------------------------------------+--------------------------------
Reporter: dons | Owner:
Type: bug | Status: new
Priority: lowest | Milestone: 7.6.1
Component: Compiler | Version: 6.8.2
Keywords: boxing, loops, performance | Os: Unknown/Multiple
Architecture: Unknown/Multiple | Failure: Runtime performance bug
Difficulty: Unknown | Testcase:
Blockedby: | Blocking:
-------------------------------------------+--------------------------------
Comment(by simonpj@…):
commit 4fa3f16ddb9fa8e5d59bde5354918a39e0430a74
{{{
Author: Simon Peyton Jones <[email protected]>
Date: Mon May 28 17:33:42 2012 +0100
Be less aggressive about the result discount
This patch fixes Trac #6099 by reducing the result discount in
CoreUnfold.conSize.
See Note [Constructor size and result discount] in CoreUnfold.
The existing version is definitely too aggressive. Simon M found it an
"unambiguous win" but it is definitely what led to the bloat. In a function
with a lot of case branches, all returning a constructor, the discount could
grow arbitrarily large.
I also had to increase the -funfolding-creation-threshold from 450 to 750,
otherwise some functions that should inline simply never get an unfolding.
(The massive result discount was allowing the unfolding to appear before.)
The nofib results are these, picking a handful of outliers to show.
        Program           Size    Allocs   Runtime   Elapsed  TotalMem
--------------------------------------------------------------------------------
         fulsom          -0.5%     -1.6%     -2.8%     -2.6%    +31.1%
       maillist          -0.2%     -0.0%      0.09      0.09     -3.7%
         mandel          -0.4%     +6.6%      0.12      0.12     +0.0%
       nucleic2          -0.2%    +18.5%      0.11      0.11     +0.0%
        parstof          -0.4%     +4.0%      0.00      0.00     +0.0%
--------------------------------------------------------------------------------
            Min          -0.9%     -1.6%    -19.7%    -19.7%     -3.7%
            Max          +0.3%    +18.5%     +2.7%     +2.7%    +31.1%
 Geometric Mean          -0.3%     +0.4%     -3.0%     -3.0%     +0.2%
Turns out that nucleic2 has a function

  Main.$wabsolute_pos =
    \ (ww_s4oj :: Types.Tfo) (ww1_s4oo :: Types.FloatT)
      (ww2_s4op :: Types.FloatT) (ww3_s4oq :: Types.FloatT) ->
      case ww_s4oj
      of _
      { Types.Tfo a_a1sS b_a1sT c_a1sU d_a1sV e_a1sW f_a1sX g_a1sY
                  h_a1sZ i_a1t0 tx_a1t1 ty_a1t2 tz_a1t3 ->
      (# case ww1_s4oo of _ { GHC.Types.F# x_a2sO ->
         case a_a1sS of _ { GHC.Types.F# y_a2sS ->
         case ww2_s4op of _ { GHC.Types.F# x1_X2y9 ->
         case d_a1sV of _ { GHC.Types.F# y1_X2yh ->
         case ww3_s4oq of _ { GHC.Types.F# x2_X2yj ->
         case g_a1sY of _ { GHC.Types.F# y2_X2yr ->
         case tx_a1t1 of _ { GHC.Types.F# y3_X2yn ->
         GHC.Types.F#
           (GHC.Prim.plusFloat#
              (GHC.Prim.plusFloat#
                 (GHC.Prim.plusFloat#
                    (GHC.Prim.timesFloat# x_a2sO y_a2sS)
                    (GHC.Prim.timesFloat# x1_X2y9 y1_X2yh))
                 (GHC.Prim.timesFloat# x2_X2yj y2_X2yr))
              y3_X2yn)
         } } } } } } },
         <similar>,
         <similar> )

This is pretty big, but inlining it does get rid of that F# allocation.
But we'll also get rid of it with deep CPR: Trac #2289. For now we just
accept the change.
 compiler/coreSyn/CoreUnfold.lhs |   73 ++++++++++++++++++++++-----------------
 compiler/main/StaticFlags.hs    |    7 +++-
2 files changed, 47 insertions(+), 33 deletions(-)
}}}
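
To make the "deep CPR" remark in the commit message concrete, here is a rough sketch of the difference between a worker that returns boxed components inside an unboxed tuple (as $wabsolute_pos does above) and one that returns the raw Float# values. Everything in it is an assumption made for illustration: the names dotPair, wShallow, wDeep, viaShallow and viaDeep are hypothetical, the function is a cut-down two-component stand-in for the nucleic2 code with strict unboxed arguments, and the workers are written by hand rather than taken from GHC output.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts (Float (F#), Float#, plusFloat#, timesFloat#)

-- Hypothetical source-level function: two products plus an offset,
-- returned as a pair of boxed Floats.
dotPair :: Float -> Float -> Float -> Float -> (Float, Float)
dotPair x y z t = (x * y + t, x * z + t)

-- Hand-written worker in the style of shallow CPR: the pair itself is
-- returned as an unboxed tuple, but each component is still wrapped in
-- F#, so a box is allocated for every component.
wShallow :: Float# -> Float# -> Float# -> Float# -> (# Float, Float #)
wShallow x y z t =
  (# F# (plusFloat# (timesFloat# x y) t)
   , F# (plusFloat# (timesFloat# x z) t) #)

-- Hand-written worker in the style a deep (nested) CPR transformation
-- might aim for: the components are returned unboxed as well.
wDeep :: Float# -> Float# -> Float# -> Float# -> (# Float#, Float# #)
wDeep x y z t =
  (# plusFloat# (timesFloat# x y) t
   , plusFloat# (timesFloat# x z) t #)

-- Wrapper around the shallow worker: only the pair is rebuilt here.
viaShallow :: Float -> Float -> Float -> Float -> (Float, Float)
viaShallow (F# x) (F# y) (F# z) (F# t) =
  case wShallow x y z t of (# a, b #) -> (a, b)

-- Wrapper around the deep worker: the F# boxes are built here, so a
-- caller that inlines the wrapper and immediately unpacks the results
-- can have them cancelled.
viaDeep :: Float -> Float -> Float -> Float -> (Float, Float)
viaDeep (F# x) (F# y) (F# z) (F# t) =
  case wDeep x y z t of (# a, b #) -> (F# a, F# b)

main :: IO ()
main = do
  -- Both variants should agree with the plain definition.
  print (viaShallow 1 2 3 4 == dotPair 1 2 3 4)
  print (viaDeep 1 2 3 4 == dotPair 1 2 3 4)
```

In the shallow form the F# boxes live inside the worker, so they only go away if the whole worker is inlined, which is what the larger result discount used to encourage. In the deep form the boxing sits in the small wrapper, which is cheap to inline regardless of the worker's size.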
--
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/2289#comment:26>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
_______________________________________________
Glasgow-haskell-bugs mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs