On February 18, 2009 04:29:42 Max Bolingbroke wrote: > > The part of the code under the first lambda in digit is as follows (I > > didn't keep the original dump, so the uniques have changed here). It's > > the second part of the Int overflow bounds check (i.e., y <= > > (maxBound-x)`div`10), and, indeed, something you don't want to compute > > unless the easy check fails. > > Yes - GHC wants to share the work of (maxBound-x)`div`10 between > several partial applications of "digit". This is usually a good idea, > but in this case it sucks because it's resulted in a massively > increased arity. IMHO GHC should fix this by: > * Marking divInt# INLINE in the base library. This would result in > your code would just containing uses of quotInt# > * Making some operations cheap even if they may fail > (PrimOp.primpOpIsCheap should change). Though this might mean that we > turn non-terminating programs into terminating ones (such operations > get pushed inside lambdas) but this is consistent with our treatment > of lambdas generally.
Reading your response got me thinking, "quot and div, I thought I had changed everything to quot because earlier I found that GHC was leaving evaluation of constant expressions like "maxBound-9`div`10" until runtime if I used div." Turns out I had missed that one in the second bounds check, and changing it from "y <= (maxBound-x)`div`10" to "y <= (maxBound-x)`qout`10" resulted in GHC "just doing the right thing". I presume this is because quot must be either marked INLINE and/or primOpIsCheap like you said div should be. I am guess this is because quot maps directly onto the x86 idiv instruction due to both of them truncating towards zero, while div, with its truncation towards negative infinity, does not. Running with -ddump-asm seems to back this up as quotInt# compiles down to an idiv instruction and divInt# to a call through base_GHCziBase_divIntzh_info. Unfortunately for me, I always seem to instinctively go with div and mod ahead of quot and rem. > Actually, your divInt# call wouldn't even usually be floated out to > between two lambdas, but at the time FloatOut runs there is something > in between the \x lambda and the lambdas from the state monad - the > monadic bind operator! So FloatOut feels free to move the computation > for "x" up even though that >>= will go away as soon as we run the > simplifier. What a disaster! I would like to try combining the GHC optimizer with a genetic algorithm so you could set it to pound away on your core loops for an hour or so to find the right sequence of ghc optimization steps to generate the tightest code. Maybe it could then write out an optimization hint file that regular GHC could optionally take in for use alongside the standard rules of thumb to produce great code. All I need is more time and more knowledge. Maybe someday. : ) > For me, making digit INLINE fixes all this, but that's probably > because the context of my code is not the same as yours (I had to > invent parts of a program to bind the bs, q and n variables). Sorry about that. I've put the entire routine at http://www.sharcnet.ca/~tyson/haskell/Example1.hs I should have done this in the first place as it allows me to provide the full code while also stripping it down in the emails to make them more readable. > For your > immediate problem, I would suggest this bit of GHC witchdoctory: > > where digit :: Int -> ParseInt Int Int > digit !x = do !y <- lift get > ( if y <= (maxBound-9)`quot`10 || > y <= ghc_hacks > then let !y' = y*10+x in (lift > $ put y') >> s2 > else throwError "integer overflow" ) > where {-# INLINE ghc_hacks #-} > ghc_hacks = (maxBound-x)`div`10 > > With luck, this should make the loop-invariant cheap in GHC's eyes, > preventing it from trying to share it. It works for me - but like I > say things may differ in your program. I tried that as well, and am pleased to report that it works great. I think I actually understand what is going on here and will hopefully be able to wield it in the future to my advantage. Generally I've had very little luck with the INLINE hammer when the function is at all complex. GHC always seems to finds a way to thwart my pathetic attempts and punish me with even worse code (like then not inlining the monad bind operation in the code resulting from the inlining). : ) > If you are interested in trying my script, it's available via Git at > http://github.com/batterseapower/scripts/blob/da1f24ba16c27e3994aa66f9db352 >ec1102c39d2/ghc-dump-split and is called as "ghc-dump-split ghc-dump-file", > where ghc-dump-file results from redirection of GHC stdout+stderr to a > file. I'll grab a copy of that. : ) > Hope all that helps, It sure did. Thanks very much for all your time! I'm really impressed with GHC and the people hacking on it. I've actually managed to get it to produce some truly tight assembler. It's also great, unlike a lot of projects, having papers that you can read on many details. Thanks again! -Tyson
signature.asc
Description: This is a digitally signed message part.
_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users