Re: [perfmon2] deterministic event on 8-core Intel i7 processor

stephane eranian Thu, 18 Mar 2010 09:44:22 -0700

On Thu, Mar 18, 2010 at 4:59 PM, Vince Weaver <vweav...@eecs.utk.edu> wrote:
> On Thu, 18 Mar 2010, stephane eranian wrote:
>
>> Also something else I am wondering about is: hardware page walker.
>>
>> Does that influence instruction_retired somehow?
>
> I hadn't thought about the hardware walked case.  Page faults definitely
> affect the count; they tend to be rare enough in a long running program
> that they are lost in the noise compared to other hardware interrupts;
> but if there's a lot of memory churn (or if you're on a busy system and
> the TLB gets flushed) you'll see more of them.
>
> About 2 years ago I spent a large amount of time trying to get an exact
> equation for what causes the nondeterminisms in retired_instructions.
> Part of the problem is at the time there was no good way to quantify
> interrupts at a per-thread level.  It might be possible now using
> perf_events.  AMD machines have an "interrupts_taken" hardware counter but
> I was never able to get it to match up with the results I saw.  Also, the
> per-process major/minor page fault values reported by the kernel were
> sometimes close, but never exactly equal to, the number of extra retired
> instructions you'd expect for page fault heavy microbenchmarks.


What if you pin your thread and run it at real-time priority? Make sure
it is non-blocking and makes minimal syscalls. Compare the output of
cat /proc/interrupts before and after for that CPU.
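
A minimal sketch of that experiment, assuming Linux and Python 3. The
helper names (parse_interrupts, interrupt_delta, measure) are made up for
illustration, and SCHED_FIFO normally needs root or CAP_SYS_NICE, so the
sketch falls back to the default scheduler when that is refused.

```python
import os

def parse_interrupts(text, cpu):
    """Return {irq_label: count} for one CPU column of /proc/interrupts text."""
    lines = text.strip().splitlines()
    header = lines[0].split()                 # e.g. ['CPU0', 'CPU1', ...]
    col = header.index(f"CPU{cpu}")
    counts = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        # Rows with fewer fields than CPU columns (e.g. ERR or MIS on a
        # multi-CPU system) hold a single global total, so skip them.
        if len(fields) >= len(header) and fields[col].isdigit():
            counts[label.strip()] = int(fields[col])
    return counts

def interrupt_delta(before, after):
    """Keep only the interrupt sources whose count changed."""
    return {k: after[k] - before.get(k, 0)
            for k in after if after[k] != before.get(k, 0)}

def measure(workload, cpu=0, rt_priority=1):
    """Pin to `cpu`, try real-time priority, and diff /proc/interrupts."""
    os.sched_setaffinity(0, {cpu})            # pin the calling thread
    try:
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(rt_priority))
    except PermissionError:
        pass                                  # assumption: continue unprivileged
    with open("/proc/interrupts") as f:
        before = parse_interrupts(f.read(), cpu)
    workload()
    with open("/proc/interrupts") as f:
        after = parse_interrupts(f.read(), cpu)
    return interrupt_delta(before, after)
```

Calling measure(some_function, cpu=2) would return the per-source interrupt
counts that CPU 2 took while some_function ran. Reading /proc/interrupts is
itself a syscall, so the delta includes a little measurement overhead.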

But I think what we are after is the number of transitions in and out of
privilege level 3. Those could be interrupts, syscalls, or traps. I believe
the hardware page walker runs at the current privilege level.
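
As a rough illustration only: the kernel does not expose an exact count of
privilege-level transitions this way, but getrusage() reports a few per-process
counters (page faults and context switches) that each imply at least one
kernel entry. The sketch below, assuming Linux and Python 3, samples them
around a workload. The function names are hypothetical, and RUSAGE_SELF
covers the whole process rather than a single thread.

```python
import resource

def kernel_entry_proxies():
    """Sample counters that each imply at least one entry into the kernel."""
    ru = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "minflt": ru.ru_minflt,   # minor page faults
        "majflt": ru.ru_majflt,   # major page faults
        "nvcsw": ru.ru_nvcsw,     # voluntary context switches
        "nivcsw": ru.ru_nivcsw,   # involuntary context switches
    }

def delta_around(workload):
    """Return how much each counter grew while `workload` ran."""
    before = kernel_entry_proxies()
    workload()
    after = kernel_entry_proxies()
    return {k: after[k] - before[k] for k in before}
```

These deltas are a lower bound at best, since syscalls and most hardware
interrupts do not show up in any of the four counters.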


>
> And as a side note on my previous post, using static compiled binaries
> might not be as helpful as I thought.  When I ran the test code enough
> times there was still some small variation; I think for a small
> microbenchmark the difference was that the execution time was much
> shorter so that there was less time for interrupts to happen.  Hash
> tables, especially ones that depend on virtual addresses, are a prime
> source of non-determinism, and that's what makes me suspicious of the
> dynamic linking code.
>
> Vince
> vweav...@eecs.utk.edu
>

