#4428: Local functions lose their unfoldings
--------------------------------------+-------------------------------------
Reporter: rl | Owner:
Type: bug | Status: new
Priority: normal | Milestone: 7.0.2
Component: Compiler | Version: 7.1
Resolution: | Keywords:
Testcase: | Blockedby:
Difficulty: | Os: Unknown/Multiple
Blocking: | Architecture: Unknown/Multiple
Failure: Runtime performance bug |
--------------------------------------+-------------------------------------
Comment(by rl):
Replying to [comment:14 simonpj]:
>
> I don't understand that. In your example there is precisely one call to
> `step2`, by design, so it'll be inlined at its (single) call site
> regardless of whether it has been floated.
I don't have the original example here. It was floated out and then not
inlined. There is no guarantee that `step` will only be used once anyway.
> Certainly, the person writing the INLINE pragma on `step` is saying
> "duplicate the code for `step`", but he is ''absolutely not'' saying
> "duplicate the code for whatever function `mapM` is applied to as well".
> That's what the INLINE pragma achieves.
I see. Yes, that is a good point. I wouldn't be too concerned even if that
happened occasionally in vector code but I understand that this might be
bad in general.
> Your original complaint was (a) compilation is slow, and (b) you
> generate worse code. I can see why (a) might happen, but I still don't
> understand (b) at all. Nor do I understand why (a) is worse than before.
As to (b), I haven't had time to investigate. The slowdown is minimal (a
couple of % at worst) but seems quite consistent across the vector
benchmarks. Maybe it's something different altogether but I doubt it. I'll
find out over the weekend.
(a) is caused by recording unfoldings for local functions. My example from
before explains why it happens. We first optimise `local`'s rhs. Then we
inline `local`, i.e., its unoptimised rhs (the unfolding), and optimise that
again. Before, we would inline the already optimised rhs. In effect, we
are now optimising `local`'s rhs twice where before we only optimised it
once.
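To make the shape of the problem concrete, here is a minimal sketch of the kind of code under discussion. It is not the original example (which I don't have here) and it is not taken from vector; the names `Stream`, `Step`, `mapS`, `fromList` and `toList` are made up for illustration. The local function `step'` carries an INLINE pragma, so its unoptimised rhs is what gets recorded as the unfolding, while the binding itself is optimised separately.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
module Main where

-- Hypothetical, simplified stream type in the style of stream fusion:
-- a step function together with a seed state.
data Step s a = Done | Yield a s

data Stream a = forall s. Stream (s -> Step s a) s

mapS :: (a -> b) -> Stream a -> Stream b
{-# INLINE mapS #-}
mapS f (Stream step s0) = Stream step' s0
  where
    -- Local function with an INLINE pragma. The compiler keeps the rhs
    -- as written as the unfolding and also optimises the binding itself,
    -- in case step' ends up not being inlined (here it is passed to the
    -- Stream constructor).
    {-# INLINE step' #-}
    step' s = case step s of
      Done       -> Done
      Yield x s' -> Yield (f x) s'

-- Turn a list into a stream; the state is the remaining list.
fromList :: [a] -> Stream a
fromList xs0 = Stream next xs0
  where
    next []       = Done
    next (x : xs) = Yield x xs

-- Run a stream to completion, collecting the elements.
toList :: Stream a -> [a]
toList (Stream step s0) = go s0
  where
    go s = case step s of
      Done       -> []
      Yield x s' -> x : go s'

main :: IO ()
main = print (toList (mapS (+ 1) (fromList [1, 2, 3 :: Int])))
```

Compiling something like this with `-ddump-simpl` at different optimisation levels is one way to check whether `step'` was inlined or floated out.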
> For (a), why does GHC optimise the original body? It's in case `step`
> is not inlined at all; then we need something at all. This happens
> precisely in the `mapM` case, where `step` is returned as an argument to
> the `Stream` data constructor.
I understand that (although this never happens in vector code - if it
does, it's a bug in the library). But usually, we know that local
functions won't be used in this way because they will certainly be inlined
once we reach the right simplifier phase. This is something that we can't
know for global functions.
Perhaps we shouldn't optimise local functions with `INLINE[n]` until phase
n? That way, we avoid duplicating work if they get inlined in that phase.
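For reference, this is what a phase-controlled pragma on a local function looks like. The sketch only shows the existing pragma syntax; it does not implement the proposal above, and `sumMapped`, `local` and `go` are hypothetical names. With `INLINE [1]`, the function is not inlined before simplifier phase 1 and is eagerly inlined from phase 1 onwards (phases count down, ending at 0).

```haskell
module Main where

sumMapped :: (Int -> Int) -> [Int] -> Int
sumMapped f = go 0
  where
    -- Phase-controlled INLINE on a local binding: no inlining of `local`
    -- before phase 1, eager inlining from phase 1 on. The suggestion in
    -- this comment is that its rhs need not be optimised before then.
    {-# INLINE [1] local #-}
    local x = f x * 2

    -- Accumulating loop that calls the local helper on each element.
    go acc []       = acc
    go acc (x : xs) = go (acc + local x) xs

main :: IO ()
main = print (sumMapped (+ 1) [1, 2, 3])
```

The pragma has no effect on the result of the program, only on when the simplifier is allowed to inline `local`.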
--
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/4428#comment:15>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler