The relief comes when we can confirm, explain, and hopefully avoid it :)

On Jun 19, 2013 3:20 PM, "Jan Stolarek" <[email protected]> wrote:
> Nicolas, I kinda like that explanation, because it relieves me of any
> responsibility for this problem :) Still, I have reasons to suspect that
> this might actually be my fault. Generated Core is slightly different -
> the generated worker function accepts parameters in different order - and
> I don't know why that happens. I also don't see why this would impact
> performance. Looks like I will need to become familiar with the profiling
> tools that you mentioned.
>
> Janek
>
> On Wednesday, 19 June 2013, Nicolas Frisby wrote:
> > I'm also seeing performance regressions in the shootout benchmarks that
> > I can't identify in the asm. The new asm looks better but performs
> > worse, with a ~15% slowdown.
> >
> > I fired up the performance counters in my CPU and the free Intel code
> > for inspecting them showed that my CPU utilization took about a 10% hit,
> > even while executing fewer total instructions.
> >
> > 1) Jan, perhaps we're seeing the same sort of behavior -- the shootout
> > benchmarks have extremely hot loops (hundreds of millions of iterations
> > IIRC). I used ticky profiling too, and saw no suspicious changes in any
> > counters.
> >
> > 2) Dear Low-level Gurus: How feasible is it that a ~15% slowdown in a
> > program with a very hot loop is due to incidentally inhibiting some
> > caching behavior (instr? data?)? Or perhaps affecting alignment? FTR my
> > CPU is a Core i7-2620M, Sandy Bridge.
> >
> > Thanks all.
> >
> > On Wed, Jun 19, 2013 at 9:27 AM, Jan Stolarek <[email protected]> wrote:
> > > > If it's not sorted out, can you open a ticket, put in the relevant
> > > > info (so we don't need to look at the email trail), and we can
> > > > tackle it when you get here.
> > >
> > > Currently there's a temporary workaround: I'm using new folding rules
> > > for all primitive types, except for Integer, in which case I left the
> > > old folding rules unchanged. This of course should be modified to make
> > > all rules uniform, but for now it at least passes validation. I didn't
> > > file the ticket, because the bug does not exist yet :) It only
> > > manifests itself in my patches, which have not been applied yet. I'll
> > > add all the information from this discussion to my github fork of GHC
> > > and then move it to Trac once the bug makes it to HEAD.
> > >
> > > What worries me more about my patches is the performance regression in
> > > kahan, because I see no obvious differences in the generated assembly.
> > >
> > > Janek
> > >
> > > > Simon
> > > >
> > > > -----Original Message-----
> > > > From: [email protected]
> > > > [mailto:[email protected]] On Behalf Of Jan Stolarek
> > > > Sent: 20 May 2013 12:35
> > > > To: Ian Lynagh
> > > > Cc: [email protected]
> > > > Subject: Re: Integer constant folding in the presence of new primops
> > > >
> > > > > If you remove everything but the quotInteger test from
> > > > > integerConstantFolding and compile with -ddump-rule-rewrites then
> > > > > you'll see that the eqInteger rule fires before quotInteger. This
> > > > > is presumably comparing against 0, as the definition of quot for
> > > > > Integer (in GHC.Real) is
> > > > >     _ `quot` 0 = divZeroError
> > > > >     n `quot` d = n `quotInteger` d
> > > >
> > > > Yes, I noticed these two rules firing together - perhaps that's the
> > > > explanation why. I created a small program for testing:
> > > >
> > > >     main = print quotInt
> > > >     quotInt :: Integer
> > > >     quotInt = 100063 `quot` 156
> > > >
> > > > I noticed that when I define eqInteger wrapper to be NOINLINE, the
> > > > call to quot is translated to Core as:
> > > >
> > > >     Main.quotInt =
> > > >       GHC.Real.$fIntegralInteger_$cquot
> > > >         (__integer 100063) (__integer 156)
> > > >
> > > > but when I change the wrapper to INLINE I get:
> > > >
> > > >     Main.quotInt =
> > > >       GHC.Real.$fNumRatio_$cquot   <-------- NumRatio instead of IntegralInteger
> > > >         (__integer 100063) (__integer 156)
> > > >
> > > > All rule firing happens later (I used -ddump-simpl-iterations
> > > > -ddump-rule-firings), except that for $fNumRatio_$cquot the quot
> > > > rules don't fire.
> > > >
> > > > > Do you also still have eqInteger wired in? It sounds like you
> > > > > might have given them both the same unique?
> > > >
> > > > No, they didn't have the same unique. I modified the existing rules
> > > > to work on the new primops and ignore their wrappers. At the moment
> > > > I reverted these changes so that I can make progress and leave this
> > > > problem for later.
> > > >
> > > > Janek
> > > >
> > > > _______________________________________________
> > > > ghc-devs mailing list
> > > > [email protected]
> > > > http://www.haskell.org/mailman/listinfo/ghc-devs
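
For anyone wanting to reproduce the rule-firing observation from the 20 May message, here is a minimal sketch of that test program as a complete module. The `quotLike` function is a hypothetical stand-in, not GHC's source: it only mirrors the shape of the GHC.Real definition quoted in the thread (divisor checked against 0 first), which is the presumed reason an equality rule fires before the quot rule.

```haskell
module Main where

-- Test program from the thread: a constant Integer division that the
-- constant-folding rules are expected to reduce at compile time.
quotInt :: Integer
quotInt = 100063 `quot` 156

-- Hypothetical stand-in with the same shape as the quoted GHC.Real
-- definition: the divisor is matched against 0 before dividing.
-- `error` is used here in place of GHC's internal divZeroError.
quotLike :: Integer -> Integer -> Integer
_ `quotLike` 0 = error "divide by zero"
n `quotLike` d = n `quot` d

main :: IO ()
main = do
  print quotInt                  -- prints 641
  print (100063 `quotLike` 156)  -- prints 641
```

Compiling this with the flags named in the thread (for example `ghc -O -ddump-rule-firings -ddump-simpl-iterations Main.hs`, where `-O` is an assumption about the optimisation level used) should show which rules fire and in what order. The exact output will depend on the GHC version and on whether the patches under discussion are applied.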
