Hi,

So I ran the same tests on my Intel Core 2 Quad (Q6600), and the counts do
fluctuate. How they compare to PIN depends on the tool you use: task
over-counts (by 182 instructions here), while perf slightly under-counts
(by 43 to 47). perf also shows more run-to-run fluctuation than task.

$ pin -t obj-intel64/inscount2_mt.so -o pin.log --
/home/eranian/perfmon/pfmon/tests/fib 30;cat pin.log
fib(30)=1664080 fib calls=2692537
Number of threads ever exist = 1
Count[0]= 59705995

If I use task and measure user level only (task is part of libpfm4 examples):

$ ./task -e instructions_retired ~/perfmon/pfmon/tests/fib 30
fib(30)=1664080 fib calls=2692537
            59706177 instructions_retired
$ ./task -e instructions_retired ~/perfmon/pfmon/tests/fib 30
fib(30)=1664080 fib calls=2692537
            59706177 instructions_retired

If I use perf at the user level only:
$ perf stat -e instructions:u /home/eranian/perfmon/pfmon/tests/fib 30
fib(30)=1664080 fib calls=2692537

       59705952  instructions             #      0.000 IPC

    0.030017751  seconds time elapsed

$ perf stat -e instructions:u /home/eranian/perfmon/pfmon/tests/fib 30
fib(30)=1664080 fib calls=2692537

       59705948  instructions             #      0.000 IPC

    0.025740707  seconds time elapsed
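
To put numbers on the runs above, here is a small sketch of the comparison
against the PIN count. The helper names (count_delta, count_error_ppm) are
made up for illustration and are not part of pin, task, or perf.

```c
/* Signed difference between a PMU count and the PIN reference count.
 * Positive means the PMU over-counted, negative means it under-counted. */
long long count_delta(long long measured, long long reference)
{
    return measured - reference;
}

/* Same difference expressed in parts per million of the reference count. */
double count_error_ppm(long long measured, long long reference)
{
    return 1e6 * (double)(measured - reference) / (double)reference;
}
```

With the PIN count 59705995 as the reference, count_delta() gives +182 for
task (59706177) and -43 and -47 for the two perf runs (59705952, 59705948).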


I am wondering whether the way monitoring is activated plays a role in the
fluctuation. task and perf use different approaches here. task uses a pipe
and may count a few instructions before exec(). perf uses the enable_on_exec
feature, which is handled by the kernel and thus runs at priv level 0, so the
activation itself is not counted at the user level. I will try to update
task.c to see if that has some influence.
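
For reference, a minimal sketch of the enable_on_exec setup, assuming the
Linux perf_event_open(2) interface. The helper names (setup_user_insn_attr,
open_counter) are hypothetical and are not taken from perf or task.c.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fill attr for user-level-only counting of retired instructions.
 * The counter starts disabled and is armed by the kernel at exec(). */
void setup_user_insn_attr(struct perf_event_attr *attr)
{
    memset(attr, 0, sizeof(*attr));
    attr->type = PERF_TYPE_HARDWARE;
    attr->size = sizeof(*attr);
    attr->config = PERF_COUNT_HW_INSTRUCTIONS;
    attr->disabled = 1;        /* do not count until enabled */
    attr->enable_on_exec = 1;  /* kernel enables the event on exec() */
    attr->exclude_kernel = 1;  /* user level only */
    attr->exclude_hv = 1;      /* no hypervisor counts */
}

/* glibc provides no wrapper for perf_event_open, so use syscall(2).
 * Attaches to pid on any CPU, with no group leader and no flags.
 * Returns a file descriptor, or -1 with errno set on failure. */
long open_counter(struct perf_event_attr *attr, pid_t pid)
{
    return syscall(__NR_perf_event_open, attr, pid, -1, -1, 0UL);
}
```

The idea would be for the parent to fork, open the counter on the child's pid
while the child is still waiting, and then let the child call exec(). The
final count can then be read from the file descriptor with read(2).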


On Thu, Mar 18, 2010 at 6:36 AM, heechul Yun <heechul....@gmail.com> wrote:
>
>> >
>> > Do you mean that even though I exclude kernel-level events
>> > (exclude_kernel = 1), the interrupt-handler portion of the events is
>> > counted? Could you briefly explain what kind of interrupts destroy
>> > determinism?
>>
>> There are several things you could do to try and narrow down a cause:
>> - write a simple program which is deterministic (e.g., matrix add)
>> - use the Intel PIN tool to count the exact number of instructions
>> retired.
>> - then compare the PIN count with the PMU count, that's the error margin
>> - try changing the duration of the program to see how it impacts the
>> wobbling
>>
>> I suspect there may be PMU leaks when you enter the kernel for an
>> interrupt.
>
>
> I wrote a simple Fibonacci program and counted the number of instructions
> (inst_retired) using both pin and the performance counter.
> As you can see, perf_counter seems to undercount the number of instructions,
> and the result is non-deterministic (sometimes 94730, sometimes 94729).
> Any reason for this?
>
> $ pin -t obj-ia32/inscount2.so -o pin.log -- ./a.out; cat pin.log
> Count 94768
>
> $ task -e "instructions_retired" ./a.out
> [0x5100c0 event_sel=0xc0 umask=0x0 os=0 usr=1 en=1 int=1 inv=0 edge=0
> cnt_mask=0 any=0] INSTRUCTION_RETIRED:k=0:u=1:e=0:i=0:c=0:t=0
> PERF[type=4 val=0x5100c0 e_u=0 e_k=1 e_hv=1]
> INSTRUCTION_RETIRED:k=0:u=1:e=0:i=0:c=0:t=0
>                94730 instructions_retired
>            or 94729
>
> The code I ran is shown below.
>
> unsigned long
> fib(unsigned long n)
> {
>         if (n == 0)
>                 return 0;
>         if (n == 1)
>                 return 2;
>         return fib(n-1)+fib(n-2);
> }
>
> int
> main(void)
> {
>     fib(10);
>     return 0;
> }
>
>
>
