Thank you very much. It is a really nice piece of work and very helpful.

BTW, I have a question about this work. Did you include OS (kernel-mode)
instructions in your measurements?
In section 3, you mention that you enabled counting the OS. Does that mean
kernel-mode counting?

Heechul


On Fri, Mar 12, 2010 at 8:32 AM, Vince Weaver <vweav...@eecs.utk.edu> wrote:

>
> On Thu, 11 Mar 2010, stephane eranian wrote:
> >
> > There are several things you could do to try and narrow down a cause:
> > - write a simple program which is deterministic (e.g., matrix add)
> > - use the Intel PIN tool to count the exact number of instructions
> >   retired.
> > - then compare the PIN count with the PMU count, that's the error margin
> > - try changing the duration of the program to see how it impacts the
> >   wobbling
> >
> > I suspect there may be PMU leaks when you enter the kernel for an
> > interrupt.
>
> One good way to see if it's an interrupt problem is to extend your test
> case so it takes multiple seconds.
>
> Then take the actual count, subtract the expected count.  If it's
> interrupt based, then the overhead should be roughly
>   TIME(seconds) *  HZ
> where HZ is the timer interrupt frequency (this is assuming you are
> running on Linux).   HZ is configurable at kernel compile time, the
> default these days for x86 is 250 I think.  You can even try a few
> different values of HZ to make sure it's the variable you are seeing.
> Since the timer interrupt is the most common interrupt, it should roughly
> correspond.
>
> You can see plots showing this interrupt overhead for the spec2k benchmarks
> on various x86 processors with the retired_instruction counter on page
> 12 of the paper located here:
>   http://www.csl.cornell.edu/~vince/papers/iiswc08/tr1051_08.pdf
>
> Unfortunately for counters like this, you can never know how deterministic
> they are without testing yourself.  Originally the interrupt issue with
> retired_instructions wasn't widely known, I had to find it the hard way.
> Recently though the effect has started turning up in Intel and AMD
> documentation.  I wouldn't be surprised if all of the various retired
> counters have issues like this; it might just have been luck that
> retired_stores on core2 behaved so nicely.
>
> As always when determinism comes up, I am obligated to plug my work on the
> issue (found in the paper mentioned earlier here).  There are many many
> issues that can cause non-determinism in performance counters, and some of
> them are very non-intuitive.
>
> Vince
>
>
>
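
The interrupt check described in the quoted message (actual count minus
expected count, compared against TIME * HZ) can be sketched as below. This
is only an illustration: the function name and all of the numbers are made
up, and it assumes roughly one extra count per timer interrupt, which is
what "should be roughly TIME(seconds) * HZ" suggests. The expected count
would come from something like the PIN run mentioned earlier.

```python
def interrupt_overhead(actual_count, expected_count, seconds, hz=250):
    """Compare the excess PMU count with the number of timer interrupts.

    actual_count   -- value read from the PMU (hypothetical input)
    expected_count -- exact count, e.g. from a PIN run (hypothetical input)
    seconds        -- wall-clock run time of the test program
    hz             -- kernel timer frequency (250 is only the guess from
                      the thread; check your own kernel config)
    """
    excess = actual_count - expected_count
    timer_ticks = seconds * hz
    # Excess counts per timer tick; close to 1.0 would be consistent
    # with the timer-interrupt explanation.
    per_tick = excess / timer_ticks if timer_ticks else float("nan")
    return excess, timer_ticks, per_tick


# Made-up example: a 10 second run that overcounts by 2500.
excess, ticks, per_tick = interrupt_overhead(
    actual_count=10_000_002_500,
    expected_count=10_000_000_000,
    seconds=10.0,
    hz=250,
)
print(f"excess={excess} timer_ticks={ticks:.0f} excess_per_tick={per_tick:.2f}")
```

If the ratio stays roughly constant when the run time or HZ is changed, the
timer interrupt is a likely cause; if it does not, something else is
probably contributing.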
