Re: [perfmon2] deterministic event on 8-core Intel i7 processor

stephane eranian Thu, 18 Mar 2010 01:53:37 -0700

Hi,

Well, I was wrong: both perf and task use enable_on_exec.
The pipe stuff is needed to avoid a race between fork() and
exec(). The perf_event API needs to know the pid to attach
an event to.
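
To make the sequence concrete, here is a minimal sketch, assuming Linux
with perf_event support. It is not the code from task.c or perf; the
helper names (spawn_paused, attach_instructions_user, count_instructions)
are hypothetical. The child blocks on a pipe after fork(), the parent
attaches a user-level instructions event to the now-known pid with
enable_on_exec set, and only then lets the child call exec().

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* glibc provides no wrapper for perf_event_open(), so go through syscall(). */
static int perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu,
                           int group_fd, unsigned long flags)
{
    return (int)syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/* Hypothetical helper: fork a child that waits on a pipe before exec().
 * Returns the child's pid and stores the "go" write end in *go_fd. */
static pid_t spawn_paused(char *const argv[], int *go_fd)
{
    int fds[2];
    assert(argv != NULL && argv[0] != NULL && go_fd != NULL);
    if (pipe(fds) == -1)
        return -1;
    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        char c;
        close(fds[1]);
        /* Block here until the parent has attached its event. */
        if (read(fds[0], &c, 1) != 1)
            _exit(126);
        close(fds[0]);
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }
    close(fds[0]);
    *go_fd = fds[1];
    return pid;
}

/* Hypothetical helper: user-level instruction counter, armed at exec(). */
static int attach_instructions_user(pid_t pid)
{
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = PERF_TYPE_HARDWARE;
    attr.config = PERF_COUNT_HW_INSTRUCTIONS;
    attr.disabled = 1;        /* start stopped ...                   */
    attr.enable_on_exec = 1;  /* ... the kernel enables it at exec() */
    attr.exclude_kernel = 1;
    attr.exclude_hv = 1;
    return perf_event_open(&attr, pid, -1, -1, 0);
}

/* Run argv under the counter. Returns 0 and fills *count on success,
 * -1 on any failure (including perf_event_open() being unavailable). */
static int count_instructions(char *const argv[], uint64_t *count, int *status)
{
    int go_fd = -1;
    pid_t pid = spawn_paused(argv, &go_fd);
    if (pid == -1)
        return -1;
    /* The pid is known and the child has not called exec() yet. */
    int fd = attach_instructions_user(pid);
    /* Release the child even if the attach failed, so it can be reaped. */
    ssize_t w = write(go_fd, "x", 1);
    close(go_fd);
    if (waitpid(pid, status, 0) != pid || w != 1 || fd == -1) {
        if (fd != -1)
            close(fd);
        return -1;
    }
    ssize_t n = read(fd, count, sizeof(*count));
    close(fd);
    return n == (ssize_t)sizeof(*count) ? 0 : -1;
}
```

A caller would pass an argv such as { "./fib", "30", NULL } to
count_instructions() and print *count when it returns 0. Without the
pipe, the child could reach exec() before the parent has attached the
event, which is the race mentioned above.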


On Thu, Mar 18, 2010 at 8:29 AM, stephane eranian
<eran...@googlemail.com> wrote:
> Hi,
>
> So I ran the same tests on my Intel Core 2 Quad (Q6600) and there is
> fluctuation. There is in general over-counting compared to PIN. However
> it varies depending on the tool you use. It seems perf has more fluctuations.
>
> $ pin -t obj-intel64/inscount2_mt.so -o pin.log -- /home/eranian/perfmon/pfmon/tests/fib 30;cat pin.log
> fib(30)=1664080 fib calls=2692537
> Number of threads ever exist = 1
> Count[0]= 59705995
>
> If I use task and measure user level only (task is part of libpfm4 examples):
>
> $ ./task -e instructions_retired ~/perfmon/pfmon/tests/fib 30
> fib(30)=1664080 fib calls=2692537
>            59706177 instructions_retired
> $ ./task -e instructions_retired ~/perfmon/pfmon/tests/fib 30
> fib(30)=1664080 fib calls=2692537
>            59706177 instructions_retired
>
> If I use perf at the user level only:
> $ perf stat -e instructions:u /home/eranian/perfmon/pfmon/tests/fib 30
> fib(30)=1664080 fib calls=2692537
>
>       59705952  instructions             #      0.000 IPC
>
>    0.030017751  seconds time elapsed
>
> $ perf stat -e instructions:u /home/eranian/perfmon/pfmon/tests/fib 30
> fib(30)=1664080 fib calls=2692537
>
>       59705948  instructions             #      0.000 IPC
>
>    0.025740707  seconds time elapsed
>
>
> I am wondering if the way the activation is done does not play some role
> in the fluctuation. Here task and perf use a different approach to activate
> monitoring. The former uses a pipe and may be subject to counting a bit
> before exec(). The latter uses the enable_on_exec feature which is handled
> by the kernel and thus at priv level 0, i.e., not counting. I will try to
> update task.c to see if that has some influence.
>
>
> On Thu, Mar 18, 2010 at 6:36 AM, heechul Yun <heechul....@gmail.com> wrote:
>>
>>> >
>>> > Do you mean that even though I exclude kernel level events
>>> > (exclude_kernel = 1) the interrupt handler portion of the events is
>>> > counted?  Could you briefly explain what kind of interruptions destroy
>>> > determinism?
>>>
>>> There are several things you could do to try and narrow down a cause:
>>> - write a simple program which is deterministic (e.g., matrix add)
>>> - use the Intel PIN tool to count the exact number of instructions retired.
>>> - then compare the PIN count with the PMU count, that's the error margin
>>> - try changing the duration of the program to see how it impacts the wobbling
>>>
>>> I suspect there may be PMU leaks when you enter the kernel for an interrupt.
>>
>>
>> I wrote a simple Fibonacci and counted the # of instructions (inst_retired)
>> using both pin and the performance counter.
>> As you can see, it seems like perf_counter undercounts the # of instructions
>> and the result is non-deterministic (sometimes 94730 but sometimes 94729).
>> Any reason for this?
>>
>> $ pin -t obj-ia32/inscount2.so -o pin.log -- ./a.out; cat pin.log
>> Count 94768
>>
>> $ task -e "instructions_retired" ./a.out
>> [0x5100c0 event_sel=0xc0 umask=0x0 os=0 usr=1 en=1 int=1 inv=0 edge=0 cnt_mask=0 any=0] INSTRUCTION_RETIRED:k=0:u=1:e=0:i=0:c=0:t=0
>> PERF[type=4 val=0x5100c0 e_u=0 e_k=1 e_hv=1] INSTRUCTION_RETIRED:k=0:u=1:e=0:i=0:c=0:t=0
>>                94730 instructions_retired
>>            or 94729
>>
>> The code I ran is shown below.
>>
>> unsigned long
>> fib(unsigned long n)
>> {
>>         if (n == 0)
>>                 return 0;
>>         if (n == 1)
>>                 return 2;
>>         return fib(n-1)+fib(n-2);
>> }
>>
>> int main(void)
>> {
>>     fib(10);
>> }
>>
>>
>>
>

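The "error margin" in the thread is just the PMU count minus the PIN
count. A small sketch of that arithmetic, using only the numbers quoted
above; the function names (pmu_delta, pmu_delta_ppm, report) are
hypothetical helpers, not part of any tool mentioned here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Signed difference: positive means the PMU over-counts relative to PIN. */
static int64_t pmu_delta(uint64_t pmu, uint64_t pin)
{
    return (int64_t)pmu - (int64_t)pin;
}

/* Same difference, expressed in parts per million of the PIN count. */
static double pmu_delta_ppm(uint64_t pmu, uint64_t pin)
{
    assert(pin != 0); /* the reference count must be non-zero */
    return 1e6 * (double)pmu_delta(pmu, pin) / (double)pin;
}

/* Print one comparison line for a given tool and run. */
static void report(const char *tool, uint64_t pmu, uint64_t pin)
{
    printf("%-5s pmu=%llu pin=%llu delta=%+lld (%+.2f ppm)\n",
           tool, (unsigned long long)pmu, (unsigned long long)pin,
           (long long)pmu_delta(pmu, pin), pmu_delta_ppm(pmu, pin));
}
```

With the Q6600 numbers, pmu_delta(59706177, 59705995) is 182 for task,
and the two perf runs give -43 and -47. For the fib(10) run,
pmu_delta(94730, 94768) is -38.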
_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel
