#5996: fix CSE
-------------------------------------+--------------------------------------
Reporter: michalt | Owner:
Type: bug | Status: new
Priority: normal | Component: Compiler
Version: 7.5 | Keywords:
Os: Unknown/Multiple | Architecture: Unknown/Multiple
Failure: Runtime performance bug | Testcase:
Blockedby: | Blocking:
Related: |
-------------------------------------+--------------------------------------
The current version of CSE is slightly broken -- contrary to the comments
explaining it, it does not add an additional mapping when it substitutes
the RHS
of some binding. Taking the example from ```CSE.lhs```
{{{
module Test2 where
data C a b = C a b
a = undefined
{-# NOINLINE a #-}
b = undefined
{-# NOINLINE b #-}
x1 = C a b
x2 = C x1 b
y1 = C a b
y2 = C y1 b
}}}
We want to detect that ```y1``` is the same as ```x1``` and so rewrite it
to
```y1 = x1```. But at this point we also want to add a new substitution
that
changes ```y1``` to ```x1```. So that we can get ```y2 = C x1 b``` and
then
```y2 = x2```. This is '''not''' done by the current code:
{{{
> ghc -O2 -fforce-recomp Test2.hs -ddump-simpl -dsuppress-all -rtsopts
[1 of 1] Compiling Test2 ( Test2.hs, Test2.o )
==================== Tidy Core ====================
Result size = 40
b = undefined
a = b
x1 = \\ @ a_aaf @ b_aag -> C (a) (b)
x2 = \\ @ b_aao @ a_aap @ b1_aaq -> C (x1) (b)
y1 = x1
y2 = \\ @ b_aaG @ a_aaH @ b1_aaI -> C (x1) (b)
}}}
I wrote a patch to fix that and we get the desired result:
{{{
> ~/dev/ghc-work/inplace/bin/ghc-stage2 -O2 -fforce-recomp Test2.hs
-ddump-simpl -dsuppress-all -rtsopts
[1 of 1] Compiling Test2 ( Test2.hs, Test2.o )
==================== Tidy Core ====================
Result size = 30
b = undefined
a = b
x1 = \\ @ a_aal @ b_aam -> C (b) (b)
x2 = \\ @ b_aau @ a_aav @ b1_aaw -> C (x1) (b)
y1 = x1
y2 = x2
}}}
The fix seems quite easy but it made nofib unhappy -- see nofib1
attachment. Apparently there is quite a bit additional alloctation
happening in
a few benchmarks. I managed to narrow it down to the
```GHC.IO.Encoding.*```
modules. Adding a simple ```GHC_OPTIONS -fno-cse``` seems to improve the
performance quite a bit (and above the current HEAD!) -- see nofib2
attachment.
There seems to be a bit more code bloat, but the performance looks worth
it. I
haven't yet looked into exactly what causes the excessive allocation with
CSE in
the ```GHC.IO.Encoding.*``` modules (and I'm also a bit surprised by that
-- I
thought the main issue with CSE would be bigger memory usage). So any
suggestions are more than welcome. ;)
All the patches are in https://github.com/michalt/ghc (branch "cse") and
https://github.com/michalt/packages-base (branch "cse").
--
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/5996>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
_______________________________________________
Glasgow-haskell-bugs mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs