Re: [perfmon2] deterministic event on 8-core Intel i7 processor

2010-03-18 Thread stephane eranian
Hi, So I ran the same tests on my Intel Core 2 Quad (Q6600) and there is fluctuation. There is, in general, over-counting compared to PIN. However, it varies depending on the tool you use. It seems perf has more fluctuations. $ pin -t obj-intel64/inscount2_mt.so -o pin.log -- /home/eranian/perfmon/p

Re: [perfmon2] deterministic event on 8-core Intel i7 processor

2010-03-18 Thread stephane eranian
Hi, Well, I was wrong: both perf and task use enable_on_exec. The pipe stuff is needed to avoid a race between fork() and exec. The perf_event API needs to know the pid to attach an event to. On Thu, Mar 18, 2010 at 8:29 AM, stephane eranian wrote: > Hi, > > So I ran the same tests on my Intel Co
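
The fork()/exec race mentioned above can be sketched as follows. This is only an illustration of the pipe handshake, not the code used by perf or task: the helper name `run_with_go_pipe` and the `attach` callback are hypothetical, and the callback merely stands in for the point where a tool would attach an event (with enable_on_exec set) to the child's pid.

```python
import os
import sys


def run_with_go_pipe(argv, attach):
    """Fork, call attach(pid) in the parent, then let the child exec argv.

    Hypothetical helper: attach() stands in for whatever the measuring
    tool does once it knows the child's pid.
    """
    go_r, go_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: block on the pipe so exec cannot happen before the
        # parent has had a chance to attach to this pid.
        os.close(go_w)
        os.read(go_r, 1)
        os.close(go_r)
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed
    # Parent: the pid is known and the child is still waiting.
    os.close(go_r)
    try:
        attach(pid)
    finally:
        os.write(go_w, b"x")  # release the child
        os.close(go_w)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


if __name__ == "__main__":
    code = run_with_go_pipe(
        [sys.executable, "-c", "print('child ran')"],
        lambda pid: print("would attach to pid", pid),
    )
    print("child exit code:", code)
```

Without the pipe, the child could reach exec before the parent has attached anything, so the start of the program would go uncounted.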

[perfmon2] [PATCH] perf_events: fix ordering bug in perf_output_sample()

2010-03-18 Thread Stephane Eranian
In order to parse a sample correctly based on the information requested via sample_type, the kernel needs to save each component in a known order. There is no type value saved with each component. The current convention is that each component is saved according to
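
To see why an untyped record needs a fixed component order, here is a toy sketch. It assumes a made-up layout: the field bits and names in `FIELDS` are hypothetical and are not the kernel's PERF_SAMPLE_* values or its actual record layout. Each requested component is a 64-bit value, written and read back in the same table order.

```python
import struct

# Hypothetical field bits, for illustration only.
FIELDS = [
    (1 << 0, "ip"),
    (1 << 1, "time"),
    (1 << 2, "addr"),
    (1 << 3, "cpu"),
]


def pack_sample(sample_type, values):
    """Writer side: emit each requested field in table order."""
    out = b""
    for bit, name in FIELDS:
        if sample_type & bit:
            out += struct.pack("<Q", values[name])
    return out


def parse_sample(sample_type, buf):
    """Reader side: walk the same table; no per-field type tag exists."""
    result = {}
    offset = 0
    for bit, name in FIELDS:
        if sample_type & bit:
            (result[name],) = struct.unpack_from("<Q", buf, offset)
            offset += 8
    return result
```

If the writer emitted two fields in a different order than the reader expects, the reader would still decode without error but would assign the values to the wrong names, which is why the order has to be agreed on.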

Re: [perfmon2] deterministic event on 8-core Intel i7 processor

2010-03-18 Thread stephane eranian
Vince, Good points. Also something else I am wondering about is: hardware page walker. Does that influence instruction_retired somehow? On Thu, Mar 18, 2010 at 4:11 PM, Vince Weaver wrote: > >> I wrote a simple Fibonacci and counted the #of instructions (inst_retired) >> using both pin and per

Re: [perfmon2] deterministic event on 8-core Intel i7 processor

2010-03-18 Thread Vince Weaver
> I wrote a simple Fibonacci and counted the #of instructions (inst_retired) > using both pin and performance counter. > As you can see, it seems like perf_counter undercount the #of instructions > and the result is non-deterministic (sometimes 94730 but sometimes 94729) > Any reason for this? Di

Re: [perfmon2] deterministic event on 8-core Intel i7 processor

2010-03-18 Thread Vince Weaver
On Wed, 17 Mar 2010, heechul Yun wrote: > Thank you very much. It is really nice piece of work and very helpful. > > BTW, I have a question on this work. Did you include OS instructions (kernel > mode) in your measurement? > In section 3, you mention that you enabled counting the OS. Does that m

Re: [perfmon2] deterministic event on 8-core Intel i7 processor

2010-03-18 Thread Vince Weaver
On Thu, 18 Mar 2010, stephane eranian wrote: > Also something else I am wondering about is: hardware page walker. > > Does that influence instruction_retired somehow? I hadn't thought about the hardware walked case. Page faults definitely affect the count; they tend to be rare enough in a long

Re: [perfmon2] deterministic event on 8-core Intel i7 processor

2010-03-18 Thread heechul Yun
On Thu, Mar 18, 2010 at 10:11 AM, Vince Weaver wrote: > > > I wrote a simple Fibonacci and counted the #of instructions > (inst_retired) > > using both pin and performance counter. > > As you can see, it seems like perf_counter undercount the #of > instructions > > and the result is non-determinis

Re: [perfmon2] deterministic event on 8-core Intel i7 processor

2010-03-18 Thread stephane eranian
On Thu, Mar 18, 2010 at 4:59 PM, Vince Weaver wrote: > On Thu, 18 Mar 2010, stephane eranian wrote: > >> Also something else I am wondering about is: hardware page walker. >> >> Does that influence instruction_retired somehow? > > I hadn't thought about the hardware walked case. Page faults defin

Re: [perfmon2] [PATCH] perf: fix stat attach bogus counts

2010-03-18 Thread Ingo Molnar
* Stephane Eranian wrote: > When perf stat -p pid is used, the events must be enabled > immediately as there is no exec and thus no enable_on_exec. > > Signed-off-by: Stephane Eranian > > -- > builtin-stat.c |6 -- > 1 file changed, 4 insertions(+), 2 deletions(-) >

Re: [perfmon2] [PATCH] perf: fix stat attach bogus counts

2010-03-18 Thread stephane eranian
On Thu, Mar 18, 2010 at 6:36 PM, Ingo Molnar wrote: > > * Stephane Eranian wrote: > >> When perf stat -p pid is used, the events must be enabled >> immediately as there is no exec and thus no enable_on_exec. >> >> Signed-off-by: Stephane Eranian >> >> -- >> builtin-stat.c |

Re: [perfmon2] [PATCH] perf_events: fix ordering bug in perf_output_sample()

2010-03-18 Thread Peter Zijlstra
On Thu, 2010-03-18 at 14:42 +0200, Stephane Eranian wrote: > In order to parse a sample correctly based on the information > requested via sample_type, the kernel needs to save each component > in a known order. There is no type value saved with each component. > The current

Re: [perfmon2] [PATCH] perf_events: fix ordering bug in perf_output_sample()

2010-03-18 Thread Stephane Eranian
On Thu, Mar 18, 2010 at 7:33 PM, Peter Zijlstra wrote: > On Thu, 2010-03-18 at 14:42 +0200, Stephane Eranian wrote: >> In order to parse a sample correctly based on the information >> requested via sample_type, the kernel needs to save each component >> in a known order. There is

Re: [perfmon2] IBS on Opteron

2010-03-18 Thread Drongowski, Paul
Hi Ram -- The examples in: http://developer.amd.com/assets/AMD_IBS_paper_EN.pdf were produced on a revision B3 Opteron. Page 7 shows the distribution of IBS op samples across the body of a matrix multiply routine. In a truly random ("fair") selection of ops, one would expect the distribution

Re: [perfmon2] [PATCH] perf_events: fix ordering bug in perf_output_sample()

2010-03-18 Thread Peter Zijlstra
On Thu, 2010-03-18 at 22:29 +0100, Stephane Eranian wrote: > On Thu, Mar 18, 2010 at 7:33 PM, Peter Zijlstra wrote: > > On Thu, 2010-03-18 at 14:42 +0200, Stephane Eranian wrote: > >> In order to parse a sample correctly based on the information > >> requested via sample_type, the kern