Simon Peyton-Jones wrote:
| I guess I was assuming that a function like sum is "simple enough" to
| get inlined at the call site - and hence possibly optimised at that
| site. Apparently not.

There is no reason in principle why not.  It's just that GHC doesn't do it at 
the moment.

On one hand, it looks like something that will often be a win. On the other hand, how do you prevent a combinatorial explosion of specialised functions that hardly ever get called? Hmm... maybe this is why I'm not a compiler designer? ;-)

| By the way, is there some way of discovering what
| GHC has "really" done with your code, rather than what you "think" it's
| done?

Yes: -ddump-simpl.

Quoting the GHC manual: "HACKER TERRITORY. HACKER TERRITORY. (You were warned.)" Ah, that made me smile...

Boy, does this thing generate a lot of output! Ah well, I was warned. ;-) I suppose to really make any sense of this lot, I would need to have a very deep understanding of the low-level internal workings of GHC. (I have read a few papers about GHC, but most of them were far too technical for me to comprehend.)

Well anyway, I see why adding -O2 makes it run so much faster. When I add that flag, the output goes from (+) :: Vector3 -> Vector3 -> Vector3 to something like

(+) :: GHC.Prim.Double#
         -> GHC.Prim.Double#
         -> GHC.Prim.Double#
         -> GHC.Prim.Double#
         -> GHC.Prim.Double#
         -> GHC.Prim.Double#
         -> (# GHC.Float.Double, GHC.Float.Double, GHC.Float.Double #)

and the body went from being an incomprehensible tangle of deeply nested case expressions to being a recognisable expression involving GHC.Prim.+##. (Anybody know what the difference between GHC.Prim.Double# and GHC.Float.Double is?) This in spite of the fact that the definition is actually

 (+) = vzip (+)
vzip f (Vector3 x0 y0 z0) (Vector3 x1 y1 z1) = Vector2 (f x0 x1) (f y0 y1) (f z0 z1)

So GHC has "unrolled" several levels of indirection here, and replaced an algebraic datatype with a bunch of primitive arguments. In other words, GHC has done what I wanted it to do. :-D

There is an absolutely *huge* chunk of code for the derived instance of Read Colour. (Why did I add that again? Hmm, I think I just included the standard library version of my code. Ah well, it's not hurting anything.)

Also moderately puzzled by things like "Str: DmdType U(U(L)U(L)U(L))U(U(L)U(L)U(L))m". Surely that's hacker territory if ever I saw it. ;-)

Ah well, I doubt I'm going to come up with any new ideas for how to make my code go faster, but it's mildly entertaining wading through over 200 KB of textual output trying to guess what it means.

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to