Hi Jin Yao

This happens because every time the clk counter overflows, dtrace cpc resets the rma counter to its initial value (MAX-100), so the rma counter only rarely gets the chance to overflow.
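As a rough sanity check, the figures from the first 5-second sample of each of your runs can be put together. This is only a back-of-the-envelope sketch in Python: it assumes the standalone test_rma.d run reflects the true rma event rate and that the events are spread evenly over time and across CPUs, which a real workload will not do. The function and constant names are made up for illustration.

```python
# Figures taken from the first 5-second sample of each run in the quoted output.
RMA_OVERFLOWS_ALONE = 1531   # rma firings reported by test_rma.d
RMA_THRESHOLD = 100          # events per rma overflow (the "-100" in the probe name)
CLK_OVERFLOWS = 309547       # clk firings reported by test_clk_rma.d


def rma_events_per_clk_interval(rma_overflows, threshold, clk_overflows):
    """Average number of rma events between two consecutive clk overflows.

    Assumption: events are uniformly distributed, which is a simplification.
    """
    return rma_overflows * threshold / clk_overflows


avg = rma_events_per_clk_interval(RMA_OVERFLOWS_ALONE, RMA_THRESHOLD, CLK_OVERFLOWS)
print(f"average rma events per clk interval: {avg:.2f}")  # prints 0.49
```

On average, then, well under one rma event arrives between two clk overflows, so the rma counter would reach 100 within a single interval only during a burst of remote accesses. That is consistent with the handful of rma firings you see when both probes are enabled.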

You should be able to observe this happening through the following dtrace script.
==========
#!/usr/sbin/dtrace -s

sdt:pcbe.GenuineIntel.6.*:core_pcbe_*:wrmsr,
sdt:pcbe.GenuineIntel.6.*:core_pcbe_*:rdmsr
{
        printf("%llX       %lu(0x%llX)", arg0, (unsigned long)arg1, arg1);
}

sdt:pcbe.GenuineIntel.6.*:core_pcbe_sample:core-pcbe-sample
{
        printf("%llX %lu(0x%lX) %lu(0x%lX) %lu(0x%lX)", arg0,
            (unsigned long)arg1, arg1, (unsigned long)arg2, arg2,
            (unsigned long)arg3, arg3);
}
==========

collect(1), the other user of overflow profiling, samples the counters on overflow and, when it next programs them, writes the last sampled value rather than the initial value to the counter that did not overflow.
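The difference between the two policies can be illustrated with a toy model. This is a minimal sketch in Python, not the actual kcpc or collect code: the thresholds and event rate are invented, the counters are modelled simply as "progress since last programming", and I am assuming the reset is symmetric (an rma overflow also resets clk under the dtrace cpc policy).

```python
def simulate(steps, clk_threshold, rma_threshold, rma_every, preserve_other):
    """Count overflows of two counters that share one overflow interrupt.

    preserve_other=False: on any overflow, the counter that did NOT overflow
                          is reset to its initial value (dtrace cpc, as
                          described above).
    preserve_other=True:  the counter that did not overflow keeps its last
                          sampled value (collect(1), as described above).
    All parameters are hypothetical toy values.
    """
    clk = rma = 0                      # progress towards overflow
    clk_overflows = rma_overflows = 0
    for step in range(1, steps + 1):
        clk += 1                       # one clk event per step
        if step % rma_every == 0:
            rma += 1                   # one rma event every rma_every steps
        if clk >= clk_threshold:
            clk_overflows += 1
            clk = 0
            if not preserve_other:
                rma = 0                # rma progress is thrown away
        if rma >= rma_threshold:
            rma_overflows += 1
            rma = 0
            if not preserve_other:
                clk = 0                # assumed symmetric behaviour
    return clk_overflows, rma_overflows


# 50 rma events arrive per clk interval, fewer than the rma threshold of 100.
print(simulate(100000, 1000, 100, 20, preserve_other=False))  # (100, 0)
print(simulate(100000, 1000, 100, 20, preserve_other=True))   # (100, 50)
```

With the reset policy the rma counter never accumulates 100 events before the next clk overflow wipes its progress, so it never fires. With the preserve policy the same event stream yields one rma overflow per 100 events.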

Jonathan, is the current behaviour what is expected of dtrace cpc or should I file a bug?

/kuriakose

On 06/16/10 23:29, Jin Yao wrote:
I wrote two dtrace scripts to measure "clk" and "rma" for the system workload
on nhm-ex (32 cores) while specjbb2005 runs.

r...@shz-os:~# ./test_clk_rma.d
......................
clk
            309547
rma
                32
kcpc_int
            309579
......................
clk
            309931
rma
                57
kcpc_int
            309985
......................
clk
            310158
rma
                25
kcpc_int
            310183
^C

r...@shz-os:~# ./test_rma.d
............
rma
              1531
kcpc_int
              1531
............
rma
              1645
kcpc_int
              1645
............
rma
              1537
kcpc_int
              1537

I find that the "rma" count in the test_clk_rma.d output is much smaller than
the "rma" count from test_rma.d. I guess some overflow interrupts are lost
when the dtrace script test_clk_rma.d runs. But if that is true, why is most
of the time spent in user space rather than in system space (according to the
output of vmstat and mpstat)?

The script sources are below. I also changed the setting to
"dcpc-min-overflow=100;" in "/kernel/drv/dcpc.conf" before running the tests.

r...@shz-os:~# cat test_clk_rma.d
#!/usr/sbin/dtrace -s

#pragma D option quiet

cpc:::cpu_clk_unhalted.ref-all-1000000
{
         @clk = count();
}

cpc:::mem_uncore_retired.remote_dram-all-100
{
         @rma = count();
}

kcpc_hw_overflow_intr:entry
{
         @kcpc_int = count();
}

tick-5s
{
         printf("......................\n");
         printf("clk");
         printa(@clk);
         trunc(@clk);
         printf("rma");
         printa(@rma);
         trunc(@rma);
         printf("kcpc_int");
         printa(@kcpc_int);
         trunc(@kcpc_int);
}

r...@shz-os:~# cat test_rma.d
#!/usr/sbin/dtrace -s

#pragma D option quiet

cpc:::mem_uncore_retired.remote_dram-all-100
{
         @rma = count();
}

kcpc_hw_overflow_intr:entry
{
         @kcpc_int = count();
}

tick-5s
{
         printf("............\n");
         printf("rma");
         printa(@rma);
         trunc(@rma);
         printf("kcpc_int");
         printa(@kcpc_int);
         trunc(@kcpc_int);
}

Can anybody give me some suggestions?

Thanks
Jin Yao
_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org
