Hi,
So I ran the same tests on my Intel Core 2 Quad (Q6600) and there is
fluctuation. There is in general over-counting compared to PIN. However
it varies depending on the tool you use. It seems perf has more fluctuations.
$ pin -t obj-intel64/inscount2_mt.so -o pin.log --
/home/eranian/perfmon/p
Hi,
Well, I was wrong: both perf and task use enable_on_exec.
The pipe stuff is needed to avoid a race between fork() and
exec. The perf_event API needs to know the pid to attach
an event to.
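
To make the ordering above concrete, here is a minimal sketch of the pattern, not the actual perf or task source. The helper name count_child_instructions() and the choice of a user-level instructions counter are assumptions made for illustration. The child blocks on a pipe until the parent, which now knows the pid, has tried to attach a disabled event with enable_on_exec set; only then does the child call exec. If the counter cannot be opened (for example for lack of permission), the sketch still releases the child and leaves *count untouched.

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* glibc has no wrapper for perf_event_open, so go through syscall(2). */
static long perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu,
                            int group_fd, unsigned long flags)
{
    return syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/*
 * Hypothetical helper: run argv[] and count its user-level instructions.
 * Returns the child's exit status, or -1 on a setup error.
 */
static int count_child_instructions(char *const argv[], uint64_t *count)
{
    int go[2];      /* parent -> child "you may exec now" pipe */
    int status;
    pid_t pid;

    assert(argv != NULL && argv[0] != NULL);
    if (pipe(go) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(go[0]);
        close(go[1]);
        return -1;
    }
    if (pid == 0) {
        char c;
        close(go[1]);
        /* Wait here until the parent has had a chance to attach. */
        if (read(go[0], &c, 1) < 0)
            _exit(126);
        close(go[0]);
        execvp(argv[0], argv);
        _exit(127);     /* exec failed */
    }
    close(go[0]);

    /* The pid is known now, so the event can be attached to the child. */
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = PERF_TYPE_HARDWARE;
    attr.config = PERF_COUNT_HW_INSTRUCTIONS;
    attr.disabled = 1;          /* stay off while the child waits ... */
    attr.enable_on_exec = 1;    /* ... and start counting at exec */
    attr.exclude_kernel = 1;
    attr.exclude_hv = 1;
    int fd = (int)perf_event_open(&attr, pid, -1, -1, 0);

    /* Release the child whether or not the attach succeeded. */
    ssize_t w = write(go[1], "x", 1);
    (void)w;
    close(go[1]);

    if (waitpid(pid, &status, 0) < 0)
        status = -1;
    if (fd >= 0) {
        /* With read_format == 0 the kernel returns a single u64 value. */
        if (read(fd, count, sizeof(*count)) != (ssize_t)sizeof(*count))
            *count = 0;
        close(fd);
    }
    return (status != -1 && WIFEXITED(status)) ? WEXITSTATUS(status) : -1;
}
```

Without the pipe, the child could reach exec before the parent has attached the event, and enable_on_exec would then never fire for that exec.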
On Thu, Mar 18, 2010 at 8:29 AM, stephane eranian
wrote:
> Hi,
>
> So I ran the same tests on my Intel Co
In order to parse a sample correctly based on the information
requested via sample_type, the kernel needs to save each component
in a known order. There is no type value saved with each component.
The current convention is that each component is saved according to
Vince,
Good points.
Also something else I am wondering about is: hardware page walker.
Does that influence instruction_retired somehow?
On Thu, Mar 18, 2010 at 4:11 PM, Vince Weaver wrote:
>
>> I wrote a simple Fibonacci and counted the #of instructions (inst_retired)
>> using both pin and per
> I wrote a simple Fibonacci and counted the #of instructions (inst_retired)
> using both pin and performance counter.
> As you can see, it seems like perf_counter undercount the #of instructions
> and the result is non-deterministic (sometimes 94730 but sometimes 94729)
> Any reason for this?
Di
On Wed, 17 Mar 2010, heechul Yun wrote:
> Thank you very much. It is really nice piece of work and very helpful.
>
> BTW, I have a question on this work. Did you include OS instructions (kernel
> mode) in your measurement?
> In section 3, you mention that you enabled counting the OS. Does that m
On Thu, 18 Mar 2010, stephane eranian wrote:
> Also something else I am wondering about is: hardware page walker.
>
> Does that influence instruction_retired somehow?
I hadn't thought about the hardware walked case. Page faults definitely
affect the count; they tend to be rare enough in a long
On Thu, Mar 18, 2010 at 10:11 AM, Vince Weaver wrote:
>
> > I wrote a simple Fibonacci and counted the #of instructions
> (inst_retired)
> > using both pin and performance counter.
> > As you can see, it seems like perf_counter undercount the #of
> instructions
> > and the result is non-determinis
On Thu, Mar 18, 2010 at 4:59 PM, Vince Weaver wrote:
> On Thu, 18 Mar 2010, stephane eranian wrote:
>
>> Also something else I am wondering about is: hardware page walker.
>>
>> Does that influence instruction_retired somehow?
>
> I hadn't thought about the hardware walked case. Page faults defin
* Stephane Eranian wrote:
> When perf stat -p pid is used, the events must be enabled
> immediately as there is no exec and thus no enable_on_exec.
>
> Signed-off-by: Stephane Eranian
>
> --
> builtin-stat.c |    6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
On Thu, Mar 18, 2010 at 6:36 PM, Ingo Molnar wrote:
>
> * Stephane Eranian wrote:
>
>> When perf stat -p pid is used, the events must be enabled
>> immediately as there is no exec and thus no enable_on_exec.
>>
>> Signed-off-by: Stephane Eranian
>>
>> --
>> builtin-stat.c |
On Thu, 2010-03-18 at 14:42 +0200, Stephane Eranian wrote:
> In order to parse a sample correctly based on the information
> requested via sample_type, the kernel needs to save each component
> in a known order. There is no type value saved with each component.
> The current
On Thu, Mar 18, 2010 at 7:33 PM, Peter Zijlstra wrote:
> On Thu, 2010-03-18 at 14:42 +0200, Stephane Eranian wrote:
>> In order to parse a sample correctly based on the information
>> requested via sample_type, the kernel needs to save each component
>> in a known order. There is
Hi Ram --
The examples in:
http://developer.amd.com/assets/AMD_IBS_paper_EN.pdf
were produced on a revision B3 Opteron. Page 7 shows the
distribution of IBS op samples across the body of a matrix
multiply routine. In a truly random ("fair") selection of
ops, one would expect the distribution
On Thu, 2010-03-18 at 22:29 +0100, Stephane Eranian wrote:
> On Thu, Mar 18, 2010 at 7:33 PM, Peter Zijlstra wrote:
> > On Thu, 2010-03-18 at 14:42 +0200, Stephane Eranian wrote:
> >> In order to parse a sample correctly based on the information
> >> requested via sample_type, the kern