Thanks Austin.

The program exhibiting these behaviors is shootout/reverse-complement.

The performance monitoring I used was Intel's pcm from
http://software.intel.com/en-us/articles/intel-performance-counter-monitor-a-better-way-to-measure-cpu-utilization

I've been working only on my MBP, so no perfmon yet. I plan to investigate
this with different architectures/machines when this issue percolates back
up my todo list.

On Wed, Jun 19, 2013 at 11:39 AM, Austin Seipp <[email protected]> wrote:

> I mean, it certainly *seems* reasonable a 15% hit could come from
> pipelining changes or cache behavior or something. I don't think
> alignment would really be a huge issue; post-Nehalem I believe
> non-aligned writes/reads are extremely cheap. Non-intuitive behavior
> can totally happen too: I've seen cases of adding instructions to a
> loop which speeds things up (e.g. by taking the extra step, you may
> mitigate a dependency stall, which massively helps pipelining across
> the loop body etc.)
>
> Nicolas, can I ask what benchmark you're looking at? And what
> performance tools are you using, Intel's? If you're on Linux, the
> 'perf' tool on a modern kernel can be used to quickly get an overview
> of how many cache misses/hits your process has, how many pipeline
> stalls occur, etc. You can then use it to drill down a bit into the
> assembly that's problematic.
>
> That might not give you an exact culprit (it could be many changes and
> accumulative hits,) but it's a start.
>
> On Wed, Jun 19, 2013 at 10:43 AM, Nicolas Frisby
> <[email protected]> wrote:
> > I'm also seeing performance regressions in the shootout benchmarks that I
> > can't identify in the asm. The new asm looks better but performs worse,
> > with a ~15% slowdown.
> >
> > I fired up the performance counters in my CPU and the free Intel code for
> > inspecting them showed that my CPU utilization took about a 10% hit, even
> > while executing fewer total instructions.
> >
> > 1) Jan, perhaps we're seeing the same sort of behavior; the shootout
> > benchmarks have extremely hot loops (hundreds of millions of iterations
> > IIRC). I used ticky profiling too, and saw no suspicious changes in any
> > counters.
> >
> > 2) Dear Low-level Gurus: How feasible is it that a ~15% slowdown in a
> > program with a very hot loop is due to incidentally inhibiting some
> > caching behavior (instr? data?)? Or perhaps affecting alignment? FTR my
> > CPU is a Core i7-2620M, Sandy Bridge.
> >
> > Thanks all.
> >
> > On Wed, Jun 19, 2013 at 9:27 AM, Jan Stolarek <[email protected]>
> > wrote:
> >>
> >> > If it's not sorted out, can you open a ticket, put in the relevant info
> >> > (so we don't need to look at the email trail), and we can tackle it when
> >> > you get here.
> >>
> >> Currently there's a temporary workaround: I'm using new folding rules for
> >> all primitive types, except for Integer, in which case I left the old
> >> folding rules unchanged. This of course should be modified to make all
> >> rules uniform, but for now it at least passes validation. I didn't file
> >> the ticket, because the bug does not exist yet :) It only manifests itself
> >> in my patches, which have not been applied yet. I'll add all the
> >> information from this discussion to my github fork of GHC and then move it
> >> to Trac once the bug makes it to HEAD.
> >>
> >> What worries me more about my patches is the performance regression in
> >> kahan, because I see no obvious differences in the generated assembly.
> >>
> >> Janek
> >>
> >> >
> >> > Simon
> >> >
> >> > -----Original Message-----
> >> > From: [email protected] [mailto:[email protected]]
> >> > On Behalf Of Jan Stolarek
> >> > Sent: 20 May 2013 12:35
> >> > To: Ian Lynagh
> >> > Cc: [email protected]
> >> > Subject: Re: Integer constant folding in the presence of new primops
> >> >
> >> > > If you remove everything but the quotInteger test from
> >> > > integerConstantFolding and compile with -ddump-rule-rewrites then
> >> > > you'll see that the eqInteger rule fires before quotInteger.
> >> > > This is presumably comparing against 0, as the definition of quot
> >> > > for Integer (in GHC.Real) is
> >> > >     _ `quot` 0 = divZeroError
> >> > >     n `quot` d = n `quotInteger` d
> >> >
> >> > Yes, I noticed these two rules firing together - perhaps that's the
> >> > explanation why. I created a small program for testing:
> >> >
> >> >   main = print quotInt
> >> >   quotInt :: Integer
> >> >   quotInt = 100063 `quot` 156
> >> >
> >> > I noticed that when I define eqInteger wrapper to be NOINLINE, the call
> >> > to quot is translated to Core as:
> >> >
> >> >   Main.quotInt =
> >> >     GHC.Real.$fIntegralInteger_$cquot
> >> >       (__integer 100063) (__integer 156)
> >> >
> >> > but when I change the wrapper to INLINE I get:
> >> >
> >> >   Main.quotInt =
> >> >     GHC.Real.$fNumRatio_$cquot   <-------- NumRatio instead of IntegralInteger
> >> >       (__integer 100063) (__integer 156)
> >> >
> >> > All rule firing happens later (I used -ddump-simpl-iterations
> >> > -ddump-rule-firings), except that for $fNumRatio_$cquot the quot rules
> >> > don't fire.
> >> >
> >> > > Do you also still have eqInteger wired in? It sounds like you might
> >> > > have given them both the same unique?
> >> >
> >> > No, they didn't have the same unique. I modified the existing rules to
> >> > work on the new primops and ignore their wrappers. At the moment I
> >> > reverted these changes so that I can make progress and leave this
> >> > problem for later.
> >> >
> >> > Janek
> >> >
> >> > _______________________________________________
> >> > ghc-devs mailing list
> >> > [email protected]
> >> > http://www.haskell.org/mailman/listinfo/ghc-devs
>
> --
> Regards,
> Austin - PGP: 4096R/0x91384671
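For anyone wanting to poke at the Integer folding question quoted above, here is a minimal, self-contained sketch of Janek's test program. It is only an illustration: quotSketch is a hypothetical local stand-in that mirrors the shape of the GHC.Real definition Ian quoted, and it throws DivideByZero directly instead of calling divZeroError. The point it tries to show is that the literal pattern 0 is matched through an equality test, so the zero check is seen before the division itself, which would be consistent with the eqInteger rule firing before quotInteger.

```haskell
module Main (main) where

import Control.Exception (ArithException (DivideByZero), throw)

-- Hypothetical local stand-in for the Integer `quot` definition quoted
-- from GHC.Real in the thread; the name quotSketch is not part of GHC.
-- The literal pattern 0 is matched via an equality test, so the zero
-- check comes before the actual division.
quotSketch :: Integer -> Integer -> Integer
_ `quotSketch` 0 = throw DivideByZero
n `quotSketch` d = n `quot` d

-- The constant expression from Janek's test program.
quotInt :: Integer
quotInt = 100063 `quotSketch` 156

main :: IO ()
main = print quotInt
```

Compiling this with optimisation and the flags already mentioned in the thread (-ddump-rule-firings, -ddump-simpl-iterations) should list whichever rules fire for it; the exact output depends on the GHC version and on how the Integer wrappers are inlined, so treat it as a starting point and not as a reproduction of the bug.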
