#4428: Local functions lose their unfoldings
--------------------------------------+-------------------------------------
Reporter: rl | Owner:
Type: bug | Status: new
Priority: normal | Milestone: 7.0.2
Component: Compiler | Version: 7.1
Resolution: | Keywords:
Testcase: | Blockedby:
Difficulty: | Os: Unknown/Multiple
Blocking: | Architecture: Unknown/Multiple
Failure: Runtime performance bug |
--------------------------------------+-------------------------------------
Comment(by simonpj):
Replying to [comment:10 rl]:
> NOINLINE doesn't work because `step` might get floated out and then not
> be inlined. That's what prompted the original bug report, in fact. It
> ''must'' have an INLINE pragma.
I don't understand that. In your example there is precisely one call to
`step2`, by design, so it'll be inlined at its (single) call site
regardless of whether it has been floated.
> I don't understand how not inlining until phase 0 would prevent this
> from happening. I think I do mean "optimise as if there was no pragma, and
> then inline whatever you have in phase 0".
Suppose we had no INLINE pragma on `step`. Then we'd get this:
{{{
mapM <\x.BIG> (Stream step s n)
= Stream step1 s n
where
step1 = ...<\x.BIG>...
}}}
On the other hand, if `step1` does have an INLINE pragma you'd get
{{{
mapM <\x.BIG> (Stream step s n)
= Stream step1 s n
where
f = <\x.BIG>
{-# INLINE[0] step1 #-}
[Tmpl = ...f....]
step1 = ...f...
}}}
In order to preserve the template in source form we must not inline f into
the template, so we let-bind it instead. Now we can inline `step1` at
multiple call sites without duplicating `<\x.BIG>`. Does that explain the
difference? Certainly, the person writing the INLINE pragma on `step` is
saying "duplicate the code for `step`", but he is ''absolutely not'' saying
"duplicate the code for whatever function `mapM` is applied to as well".
That's what the INLINE pragma achieves.
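
To make the shape of the example concrete, here is a minimal source-level sketch. It is not the code from the ticket or from any library: the `Step` type, its `Skip` constructor, the `Int` size field, and the names `mapMS`, `fromList` and `toListM` are all assumptions made for illustration. It only shows where the `INLINE [0]` pragma on the local `step1` sits and that `step1` mentions `f` by name; it cannot show what the simplifier actually does with the unfolding.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- Hypothetical, simplified stream types (assumed for this sketch only).
data Step s a = Yield a s | Skip s | Done

-- A stepper function, a seed state, and a size hint, mirroring the
-- `Stream step s n` shape used in the discussion above.
data Stream m a = forall s. Stream (s -> m (Step s a)) s Int

-- Monadic map over a stream. `f` plays the role of <\x.BIG>.
{-# INLINE mapMS #-}
mapMS :: Monad m => (a -> m b) -> Stream m a -> Stream m b
mapMS f (Stream step s n) = Stream step1 s n
  where
    -- The local stepper is inlined only from phase 0 onwards. Its body
    -- refers to `f` by name, so inlining `step1` at several call sites
    -- need not copy the body of `f` itself.
    {-# INLINE [0] step1 #-}
    step1 st = do
      r <- step st
      case r of
        Yield x st' -> do
          y <- f x
          return (Yield y st')
        Skip st' -> return (Skip st')
        Done     -> return Done

-- Hypothetical helper: build a stream from a list.
fromList :: Monad m => [a] -> Stream m a
fromList xs = Stream step xs (length xs)
  where
    step []       = return Done
    step (y : ys) = return (Yield y ys)

-- Hypothetical helper: run a stream and collect its elements.
toListM :: Monad m => Stream m a -> m [a]
toListM (Stream step s0 _) = go s0
  where
    go st = do
      r <- step st
      case r of
        Yield x st' -> do
          rest <- go st'
          return (x : rest)
        Skip st' -> go st'
        Done     -> return []
```

Loaded into GHCi, `toListM (mapMS (\x -> Just (x + 1)) (fromList [1, 2, 3]))` evaluates to `Just [2,3,4]`. Note that `step1` is passed as an argument to the `Stream` constructor here, which is the situation in which it may never be inlined at all.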
Your original complaint was (a) compilation is slow, and (b) you generate
worse code. I can see why (a) might happen, but I still don't understand
(b) at all. Nor do I understand why (a) is worse than before.
For (a), why does GHC optimise the original body? It's in case `step` is
not inlined at all; then we still need some optimised code for it. This
happens precisely in the `mapM` case, where `step` is returned as an
argument to the `Stream` data constructor.
--
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/4428#comment:14>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
_______________________________________________
Glasgow-haskell-bugs mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs