Re: [perf metricgroup] fcc9c5243c: perf-sanity-tests.Parse_and_process_metrics.fail

2020-11-03 Thread kajoljain



On 11/3/20 10:24 PM, John Garry wrote:
> On 03/11/2020 16:05, Ian Rogers wrote:
>> On Tue, Nov 3, 2020 at 6:43 AM John Garry  wrote:
>>> On 20/10/2020 17:53, Ian Rogers wrote:
>>>>>> Thanks for taking a look John. If you want help you can send the
>>>>>> output of "perf test 67 -vvv" to me. It is possible Broadwell has
>>>>>> similar glitches in the json to Skylake. I tested the original test on
>>>>>> server parts as I can access them as cloud machines.
>>>>>>
>>>>>>> I will have a look, but I was hoping that Ian would have a proper fix
>>>>>>> for this on top of ("perf metricgroup: Fix uncore metric expressions"),
>>>>>>> which now looks to be merged.
>>>>>> I still have these changes to look at in my inbox but I'm assuming
>>>>>> they're good:-)  Sorry for not getting to them, but it's good they are
>>>>>> merged.
>>>>> Hi Ian,
>>>>>  Checked in upstream kernel with your fix patch, in powerpc also test
>>>>> case 67 is passing.
>>>>> But I am getting issue in test 10 for powerpc
>>>>>
>>>>> [command]# ./perf test 10
>>>>> 10: PMU events  :
>>>>> 10.1: PMU event table sanity    : Ok
>>>>> 10.2: PMU event map aliases : Ok
>>>>> 10.3: Parsing of PMU event table metrics    : Skip (some metrics failed)
>>>>> 10.4: Parsing of PMU event table metrics with fake PMUs : FAILED!
>>>>>
>>>>> Was debugging it, issue is with commit e1c92a7fbbc5 perf tests: Add
>>>>> another metric parsing test.
>>>>>
>>>>> So, there we are passing different runtime parameter value in
>>>>> "expr__find_other and expr__parse"
>>>>> in function `metric_parse_fake`. I believe we need to send same value.
>>>>> I will send fix patch for the same.
>>> Just wondering, was a patch ever submitted for this? Something still
>>> broken? I can't see any recent relevant changes to tests/pmu-events.c
>> The test itself shouldn't have changed, but the json files parsed by
>> jevents and turned into C code that the test exercises should have
>> changed. Jin Yao has sent two patch sets fixing a metric issue on SKL
>> (Skylake non-server) that should hopefully fix the issue there - I'll
>> check the status on these. Are you testing on Skylake?
> 
> So I have re-read this thread, and it seems that 2x different things are 
> being discussed:
> a. some breakage for test #10 on skylake
> b. test #67 being broken
> 
> It seems that a. has been addressed. That's what I was asking about just now.

Hi Ian/John,
The breakage for test #10 that I mentioned is on a power9 machine, if that
is what you were asking about.
I still need to send the fix patch out. I will send it soon.

Thanks,
Kajol Jain

> 
> So about b., which I thought may be broken for some other reason apart from 
> my hacky patch. But it seems not the case, and a proper patch is needed there.
> 
> Ian, have you had a chance to consider this issue in b.? That is, we have 
> breakage for metrics using uncore alias expressions for when multiple uncore 
> PMUs associated exist in the system? As before, looks broken by ded80bda8bc9 
> ("perf expr: Migrate expr ids table to a hashmap")
> 
> Thanks,
> John
> 
> 


Re: [perf metricgroup] fcc9c5243c: perf-sanity-tests.Parse_and_process_metrics.fail

2020-11-03 Thread John Garry

On 03/11/2020 16:05, Ian Rogers wrote:

> On Tue, Nov 3, 2020 at 6:43 AM John Garry  wrote:
>> On 20/10/2020 17:53, Ian Rogers wrote:
>>>>> Thanks for taking a look John. If you want help you can send the
>>>>> output of "perf test 67 -vvv" to me. It is possible Broadwell has
>>>>> similar glitches in the json to Skylake. I tested the original test on
>>>>> server parts as I can access them as cloud machines.
>>>>>
>>>>>> I will have a look, but I was hoping that Ian would have a proper fix
>>>>>> for this on top of ("perf metricgroup: Fix uncore metric expressions"),
>>>>>> which now looks to be merged.
>>>>>
>>>>> I still have these changes to look at in my inbox but I'm assuming
>>>>> they're good:-)  Sorry for not getting to them, but it's good they are
>>>>> merged.
>>>>
>>>> Hi Ian,
>>>>  Checked in upstream kernel with your fix patch, in powerpc also test case
>>>> 67 is passing.
>>>> But I am getting issue in test 10 for powerpc
>>>>
>>>> [command]# ./perf test 10
>>>> 10: PMU events  :
>>>> 10.1: PMU event table sanity: Ok
>>>> 10.2: PMU event map aliases : Ok
>>>> 10.3: Parsing of PMU event table metrics: Skip (some metrics failed)
>>>> 10.4: Parsing of PMU event table metrics with fake PMUs : FAILED!
>>>>
>>>> Was debugging it, issue is with commit e1c92a7fbbc5 perf tests: Add another
>>>> metric parsing test.
>>>>
>>>> So, there we are passing different runtime parameter value in
>>>> "expr__find_other and expr__parse"
>>>> in function `metric_parse_fake`. I believe we need to send same value.
>>>> I will send fix patch for the same.
>>
>> Just wondering, was a patch ever submitted for this? Something still
>> broken? I can't see any recent relevant changes to tests/pmu-events.c
>
> The test itself shouldn't have changed, but the json files parsed by
> jevents and turned into C code that the test exercises should have
> changed. Jin Yao has sent two patch sets fixing a metric issue on SKL
> (Skylake non-server) that should hopefully fix the issue there - I'll
> check the status on these. Are you testing on Skylake?


So I have re-read this thread, and it seems that two different things are
being discussed:

a. some breakage for test #10 on Skylake
b. test #67 being broken

It seems that a. has been addressed. That's what I was asking about just
now.

As for b., I thought it might be broken for some reason other than my
hacky patch. That does not seem to be the case, and a proper patch is
needed there.

Ian, have you had a chance to consider the issue in b.? That is, metrics
that use uncore alias expressions break when multiple associated uncore
PMUs exist in the system. As before, it looks to have been broken by
ded80bda8bc9 ("perf expr: Migrate expr ids table to a hashmap").


Thanks,
John




Re: [perf metricgroup] fcc9c5243c: perf-sanity-tests.Parse_and_process_metrics.fail

2020-11-03 Thread Ian Rogers
On Tue, Nov 3, 2020 at 6:43 AM John Garry  wrote:
>
> On 20/10/2020 17:53, Ian Rogers wrote:
> >>> Thanks for taking a look John. If you want help you can send the
> >>> output of "perf test 67 -vvv" to me. It is possible Broadwell has
> >>> similar glitches in the json to Skylake. I tested the original test on
> >>> server parts as I can access them as cloud machines.
> >>>
> >>>> I will have a look, but I was hoping that Ian would have a proper fix
> >>>> for this on top of ("perf metricgroup: Fix uncore metric expressions"),
> >>>> which now looks to be merged.
> >>> I still have these changes to look at in my inbox but I'm assuming
> >>> they're good:-)  Sorry for not getting to them, but it's good they are
> >>> merged.
> >> Hi Ian,
> >> Checked in upstream kernel with your fix patch, in powerpc also test 
> >> case 67 is passing.
> >> But I am getting issue in test 10 for powerpc
> >>
> >> [command]# ./perf test 10
> >> 10: PMU events  :
> >> 10.1: PMU event table sanity: Ok
> >> 10.2: PMU event map aliases : Ok
> >> 10.3: Parsing of PMU event table metrics: Skip 
> >> (some metrics failed)
> >> 10.4: Parsing of PMU event table metrics with fake PMUs : 
> >> FAILED!
> >>
> >> Was debugging it, issue is with commit e1c92a7fbbc5 perf tests: Add 
> >> another metric parsing test.
> >>
> >> So, there we are passing different runtime parameter value in 
> >> "expr__find_other and expr__parse"
> >> in function `metric_parse_fake`. I believe we need to send same value.
> >> I will send fix patch for the same.
>
> Just wondering, was a patch ever submitted for this? Something still
> broken? I can't see any recent relevant changes to tests/pmu-events.c

The test itself shouldn't have changed, but the json files parsed by
jevents and turned into C code that the test exercises should have
changed. Jin Yao has sent two patch sets fixing a metric issue on SKL
(Skylake non-server) that should hopefully fix the issue there - I'll
check the status on these. Are you testing on Skylake?

Thanks,
Ian

> Thanks,
> John


Re: [perf metricgroup] fcc9c5243c: perf-sanity-tests.Parse_and_process_metrics.fail

2020-11-03 Thread John Garry

On 20/10/2020 17:53, Ian Rogers wrote:

>>> Thanks for taking a look John. If you want help you can send the
>>> output of "perf test 67 -vvv" to me. It is possible Broadwell has
>>> similar glitches in the json to Skylake. I tested the original test on
>>> server parts as I can access them as cloud machines.
>>>
>>>> I will have a look, but I was hoping that Ian would have a proper fix
>>>> for this on top of ("perf metricgroup: Fix uncore metric expressions"),
>>>> which now looks to be merged.
>>>
>>> I still have these changes to look at in my inbox but I'm assuming
>>> they're good:-)  Sorry for not getting to them, but it's good they are
>>> merged.
>>
>> Hi Ian,
>> Checked in upstream kernel with your fix patch, in powerpc also test case
>> 67 is passing.
>> But I am getting issue in test 10 for powerpc
>>
>> [command]# ./perf test 10
>> 10: PMU events  :
>> 10.1: PMU event table sanity: Ok
>> 10.2: PMU event map aliases : Ok
>> 10.3: Parsing of PMU event table metrics: Skip (some metrics failed)
>> 10.4: Parsing of PMU event table metrics with fake PMUs : FAILED!
>>
>> Was debugging it, issue is with commit e1c92a7fbbc5 perf tests: Add another
>> metric parsing test.
>>
>> So, there we are passing different runtime parameter value in
>> "expr__find_other and expr__parse"
>> in function `metric_parse_fake`. I believe we need to send same value.
>> I will send fix patch for the same.


Just wondering, was a patch ever submitted for this? Something still 
broken? I can't see any recent relevant changes to tests/pmu-events.c


Thanks,
John


Re: [perf metricgroup] fcc9c5243c: perf-sanity-tests.Parse_and_process_metrics.fail

2020-10-20 Thread Ian Rogers
On Tue, Oct 20, 2020 at 1:56 AM kajoljain  wrote:
>
>
>
> On 10/19/20 9:50 PM, Ian Rogers wrote:
> > On Mon, Oct 19, 2020 at 2:51 AM John Garry  wrote:
> >>
> >> On 19/10/2020 00:30, Ian Rogers wrote:
> >>> On Sun, Oct 18, 2020 at 1:51 AM kernel test robot  
> >>> wrote:
> 
> >>>> Greeting,
> >>>>
> >>>> FYI, we noticed the following commit (built with gcc-9):
> >>>>
> >>>> commit: fcc9c5243c478f104014daf4d23db86098d2aef0 ("perf metricgroup:
> >>>> Hack a fix for aliases when covering multiple PMUs")
> >>>> url:
> >>>> https://github.com/0day-ci/linux/commits/John-Garry/perf-pmu-events-Support-event-aliasing-for-system-PMUs/20201008-182049
> >>>>
> >>>>
> >>>> in testcase: perf-sanity-tests
> >>>> version: perf-x86_64-c85fb28b6f99-1_20201008
> >>>> with following parameters:
> >>>>
> >>>>  perf_compiler: gcc
> >>>>  ucode: 0xdc
> >>>>
> >>>>
> >>>>
> >>>> on test machine: 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz with
> >>>> 32G memory
> >>>>
> >>>> caused below changes (please refer to attached dmesg/kmsg for entire
> >>>> log/backtrace):
> >>>
> >>> I believe this is a Skylake and there is a known bug in the Skylake
> >>> metric DRAM_Parallel_Reads as described here:
> >>> https://lore.kernel.org/lkml/CAP-5=fxejvaqa9qfw66cy77qb962+jbe8tt5bslooocfmod...@mail.gmail.com/
> >>> Fixing the bug needs more knowledge than what is available in manuals.
> >>> Hopefully Intel can take a look.
> >>>
> >>> Thanks,
> >>> Ian
> >>
> >> So this named patch ("perf metricgroup: Hack a fix for aliases...") is
> >> breaking test #67 on my machine also, which is a broadwell.
> >
> > Thanks for taking a look John. If you want help you can send the
> > output of "perf test 67 -vvv" to me. It is possible Broadwell has
> > similar glitches in the json to Skylake. I tested the original test on
> > server parts as I can access them as cloud machines.
> >
> >> I will have a look, but I was hoping that Ian would have a proper fix
> >> for this on top of ("perf metricgroup: Fix uncore metric expressions"),
> >> which now looks to be merged.
> >
> > I still have these changes to look at in my inbox but I'm assuming
> > they're good :-) Sorry for not getting to them, but it's good they are
> > merged.
>
> Hi Ian,
>Checked in upstream kernel with your fix patch, in powerpc also test case 
> 67 is passing.
> But I am getting issue in test 10 for powerpc
>
> [command]# ./perf test 10
> 10: PMU events  :
> 10.1: PMU event table sanity: Ok
> 10.2: PMU event map aliases : Ok
> 10.3: Parsing of PMU event table metrics: Skip 
> (some metrics failed)
> 10.4: Parsing of PMU event table metrics with fake PMUs : FAILED!
>
> Was debugging it, issue is with commit e1c92a7fbbc5 perf tests: Add another 
> metric parsing test.
>
> So, there we are passing different runtime parameter value in 
> "expr__find_other and expr__parse"
> in function `metric_parse_fake`. I believe we need to send same value.
> I will send fix patch for the same.
>
> Thanks,
> Kajol Jain

Thanks, the fake support was done by Jiri. I do try to test on Power
8. The awesome thing, aside from the testing nit fixes, is that the
metrics will actually work once the test is passing :-). They may of
course report junk.

Thanks,
Ian

> >
> > Thanks,
> > Ian
> >
> >> Thanks!
> >>
> >>>
> 
> 
> >>>> If you fix the issue, kindly add following tag
> >>>> Reported-by: kernel test robot 
> >>>>
> >>>>
> >>>> 2020-10-16 19:31:52 sudo
> >>>> /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf
> >>>>  test 67
> >>>> 67: Parse and process metrics : FAILED!
> >>>> 2020-10-16 19:31:52 sudo
> >>>> /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf
> >>>>  test 68
> >>>> 68: x86 rdpmc : Ok
> >>>> 2020-10-16 19:31:52 sudo
> >>>> /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf
> >>>>  test 69
> >>>> 69: Convert perf time to TSC  : Ok
> >>>> 2020-10-16 19:31:52 sudo
> >>>> /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf
> >>>>  test 70
> >>>> 70: DWARF unwind  : Ok
> >>>> 2020-10-16 19:31:52 sudo
> >>>> /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf
> >>>>  test 71
> >>>> 71: x86 instruction decoder - new instructions: Ok
> >>>> 2020-10-16 19:31:52 sudo
> >>>> /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf
> >>>>  test 72
> >>>> 72: Intel PT packet decoder   : Ok
> >>>> 2020-10-16 19:31:52 sudo
> >>>>

Re: [perf metricgroup] fcc9c5243c: perf-sanity-tests.Parse_and_process_metrics.fail

2020-10-20 Thread kajoljain



On 10/19/20 9:50 PM, Ian Rogers wrote:
> On Mon, Oct 19, 2020 at 2:51 AM John Garry  wrote:
>>
>> On 19/10/2020 00:30, Ian Rogers wrote:
>>> On Sun, Oct 18, 2020 at 1:51 AM kernel test robot  
>>> wrote:

>>>> Greeting,
>>>>
>>>> FYI, we noticed the following commit (built with gcc-9):
>>>>
>>>> commit: fcc9c5243c478f104014daf4d23db86098d2aef0 ("perf metricgroup: Hack
>>>> a fix for aliases when covering multiple PMUs")
>>>> url:
>>>> https://github.com/0day-ci/linux/commits/John-Garry/perf-pmu-events-Support-event-aliasing-for-system-PMUs/20201008-182049
>>>>
>>>>
>>>> in testcase: perf-sanity-tests
>>>> version: perf-x86_64-c85fb28b6f99-1_20201008
>>>> with following parameters:
>>>>
>>>>  perf_compiler: gcc
>>>>  ucode: 0xdc
>>>>
>>>>
>>>>
>>>> on test machine: 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz with
>>>> 32G memory
>>>>
>>>> caused below changes (please refer to attached dmesg/kmsg for entire
>>>> log/backtrace):
>>>
>>> I believe this is a Skylake and there is a known bug in the Skylake
>>> metric DRAM_Parallel_Reads as described here:
>>> https://lore.kernel.org/lkml/CAP-5=fxejvaqa9qfw66cy77qb962+jbe8tt5bslooocfmod...@mail.gmail.com/
>>> Fixing the bug needs more knowledge than what is available in manuals.
>>> Hopefully Intel can take a look.
>>>
>>> Thanks,
>>> Ian
>>
>> So this named patch ("perf metricgroup: Hack a fix for aliases...") is
>> breaking test #67 on my machine also, which is a broadwell.
> 
> Thanks for taking a look John. If you want help you can send the
> output of "perf test 67 -vvv" to me. It is possible Broadwell has
> similar glitches in the json to Skylake. I tested the original test on
> server parts as I can access them as cloud machines.
> 
>> I will have a look, but I was hoping that Ian would have a proper fix
>> for this on top of ("perf metricgroup: Fix uncore metric expressions"),
>> which now looks to be merged.
> 
> I still have these changes to look at in my inbox but I'm assuming
> they're good :-) Sorry for not getting to them, but it's good they are
> merged.

Hi Ian,
   I checked the upstream kernel with your fix patch, and test case 67 also
passes on powerpc.
But I am seeing a failure in test 10 on powerpc:

[command]# ./perf test 10
10: PMU events  :
10.1: PMU event table sanity: Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics: Skip (some metrics failed)
10.4: Parsing of PMU event table metrics with fake PMUs : FAILED!

I debugged it, and the issue is with commit e1c92a7fbbc5 ("perf tests: Add
another metric parsing test").

In function `metric_parse_fake` we pass different runtime parameter values to
expr__find_other() and expr__parse(). I believe we need to pass the same value
to both.
I will send a fix patch for this.

Thanks,
Kajol Jain

> 
> Thanks,
> Ian
> 
>> Thanks!
>>
>>>


>>>> If you fix the issue, kindly add following tag
>>>> Reported-by: kernel test robot 
>>>>
>>>>
>>>> 2020-10-16 19:31:52 sudo
>>>> /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf
>>>>  test 67
>>>> 67: Parse and process metrics : FAILED!
>>>> 2020-10-16 19:31:52 sudo
>>>> /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf
>>>>  test 68
>>>> 68: x86 rdpmc : Ok
>>>> 2020-10-16 19:31:52 sudo
>>>> /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf
>>>>  test 69
>>>> 69: Convert perf time to TSC  : Ok
>>>> 2020-10-16 19:31:52 sudo
>>>> /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf
>>>>  test 70
>>>> 70: DWARF unwind  : Ok
>>>> 2020-10-16 19:31:52 sudo
>>>> /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf
>>>>  test 71
>>>> 71: x86 instruction decoder - new instructions: Ok
>>>> 2020-10-16 19:31:52 sudo
>>>> /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf
>>>>  test 72
>>>> 72: Intel PT packet decoder   : Ok
>>>> 2020-10-16 19:31:52 sudo
>>>> /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf
>>>>  test 73
>>>> 73: x86 bp modify : Ok
>>>> 2020-10-16 19:31:53 sudo
>>>> /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf
>>>>  test 74
>>>> 74: probe libc's inet_pton & backtrace it with ping   : Ok
>>>> 2020-10-16 19:31:54 sudo
>>>> /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf
>>>>  test 75

Re: [perf metricgroup] fcc9c5243c: perf-sanity-tests.Parse_and_process_metrics.fail

2020-10-19 Thread John Garry

On 19/10/2020 17:20, Ian Rogers wrote:

So this named patch ("perf metricgroup: Hack a fix for aliases...") is
breaking test #67 on my machine also, which is a broadwell.

Thanks for taking a look John. If you want help you can send the
output of "perf test 67 -vvv" to me. It is possible Broadwell has
similar glitches in the json to Skylake. I tested the original test on
server parts as I can access them as cloud machines.


Here it is:

john@localhost:~/kernel-dev7/tools/perf> ./perf test -vv 67
Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF 
maps, etc

67: Parse and process metrics :
--- start ---
test child forked, pid 24433
metric expr inst_retired.any / cpu_clk_unhalted.thread for IPC
parsing metric: inst_retired.any / cpu_clk_unhalted.thread
found event inst_retired.any
found event cpu_clk_unhalted.thread
adding {inst_retired.any,cpu_clk_unhalted.thread}:W
Attempting to add event pmu 'inst_retired.any' with '' that may result 
in non-fatal errors
Attempting to add event pmu 'cpu_clk_unhalted.thread' with '' that may 
result in non-fatal errors

parsing metric: inst_retired.any / cpu_clk_unhalted.thread
lookup: is_ref 0, counted 0, val 300.00: inst_retired.any
lookup: is_ref 0, counted 101, val 200.00: cpu_clk_unhalted.thread
metric expr idq_uops_not_delivered.core / (4 * (( ( 
cpu_clk_unhalted.thread / 2 ) * ( 1 + cpu_clk_unhalted.one_thread_active 
/ cpu_clk_unhalted.ref_xclk ) ))) for Frontend_Bound_SMT
parsing metric: idq_uops_not_delivered.core / (4 * (( ( 
cpu_clk_unhalted.thread / 2 ) * ( 1 + cpu_clk_unhalted.one_thread_active 
/ cpu_clk_unhalted.ref_xclk ) )))

found event cpu_clk_unhalted.one_thread_active
found event cpu_clk_unhalted.ref_xclk
found event idq_uops_not_delivered.core
found event cpu_clk_unhalted.thread
adding 
{cpu_clk_unhalted.one_thread_active,cpu_clk_unhalted.ref_xclk,idq_uops_not_delivered.core,cpu_clk_unhalted.thread}:W
Attempting to add event pmu 'cpu_clk_unhalted.one_thread_active' with '' 
that may result in non-fatal errors
Attempting to add event pmu 'cpu_clk_unhalted.ref_xclk' with '' that may 
result in non-fatal errors
Attempting to add event pmu 'idq_uops_not_delivered.core' with '' that 
may result in non-fatal errors
Attempting to add event pmu 'cpu_clk_unhalted.thread' with '' that may 
result in non-fatal errors
parsing metric: idq_uops_not_delivered.core / (4 * (( ( 
cpu_clk_unhalted.thread / 2 ) * ( 1 + cpu_clk_unhalted.one_thread_active 
/ cpu_clk_unhalted.ref_xclk ) )))

lookup: is_ref 0, counted 46, val 300.00: idq_uops_not_delivered.core
lookup: is_ref 0, counted 0, val 200.00: cpu_clk_unhalted.thread
lookup: is_ref 0, counted 216, val 400.00: 
cpu_clk_unhalted.one_thread_active

lookup: is_ref 0, counted 46, val 600.00: cpu_clk_unhalted.ref_xclk
metric expr (dcache_miss_cpi + icache_miss_cycles) for cache_miss_cycles
parsing metric: (dcache_miss_cpi + icache_miss_cycles)
metric expr l1d\-loads\-misses / inst_retired.any for dcache_miss_cpi
parsing metric: l1d\-loads\-misses / inst_retired.any
metric expr l1i\-loads\-misses / inst_retired.any for icache_miss_cycles
parsing metric: l1i\-loads\-misses / inst_retired.any
found event inst_retired.any
found event l1i-loads-misses
found event l1d-loads-misses
adding {inst_retired.any,l1i-loads-misses,l1d-loads-misses}:W
Attempting to add event pmu 'inst_retired.any' with '' that may result 
in non-fatal errors

adding ref metric icache_miss_cycles: l1i\-loads\-misses / inst_retired.any
adding ref metric dcache_miss_cpi: l1d\-loads\-misses / inst_retired.any
parsing metric: (dcache_miss_cpi + icache_miss_cycles)
lookup: is_ref 1, counted 0, val 0.00: dcache_miss_cpi
processing metric: dcache_miss_cpi ENTRY
parsing metric: l1d\-loads\-misses / inst_retired.any
lookup: is_ref 0, counted 105, val 300.00: l1d-loads-misses
lookup: is_ref 0, counted 46, val 400.00: inst_retired.any
processing metric: dcache_miss_cpi EXIT: 0.75
lookup: is_ref 1, counted 0, val 0.00: icache_miss_cycles
processing metric: icache_miss_cycles ENTRY
parsing metric: l1i\-loads\-misses / inst_retired.any
lookup: is_ref 0, counted 216, val 200.00: l1i-loads-misses
lookup: is_ref 0, counted 46, val 400.00: inst_retired.any
processing metric: icache_miss_cycles EXIT: 0.50
metric expr d_ratio(dcache_l2_all_hits, dcache_l2_all) for DCache_L2_Hits
parsing metric: d_ratio(dcache_l2_all_hits, dcache_l2_all)
metric expr l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + 
l2_rqsts.rfo_hit for DCache_L2_All_Hits
parsing metric: l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + 
l2_rqsts.rfo_hit

metric expr dcache_l2_all_hits + dcache_l2_all_miss for DCache_L2_All
parsing metric: dcache_l2_all_hits + dcache_l2_all_miss
metric expr l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + 
l2_rqsts.rfo_hit for DCache_L2_All_Hits
parsing metric: l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + 
l2_rqsts.rfo_hit
metric expr max(l2_rqsts.all_demand_data_rd - 
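
The metric values in the log above can be checked by hand. The Python sketch
below re-evaluates three of the logged metric expressions using the "val"
figures printed in the "lookup:" lines. The function names and the dictionary
layout are assumptions chosen for illustration and are not part of perf.

```python
def ipc(v):
    # inst_retired.any / cpu_clk_unhalted.thread
    return v["inst_retired.any"] / v["cpu_clk_unhalted.thread"]


def frontend_bound_smt(v):
    # idq_uops_not_delivered.core /
    #   (4 * ((cpu_clk_unhalted.thread / 2) *
    #         (1 + cpu_clk_unhalted.one_thread_active / cpu_clk_unhalted.ref_xclk)))
    cycles = (v["cpu_clk_unhalted.thread"] / 2) * (
        1 + v["cpu_clk_unhalted.one_thread_active"] / v["cpu_clk_unhalted.ref_xclk"]
    )
    return v["idq_uops_not_delivered.core"] / (4 * cycles)


def cache_miss_cycles(v):
    # (dcache_miss_cpi + icache_miss_cycles), each a ratio to inst_retired.any
    dcache_miss_cpi = v["l1d-loads-misses"] / v["inst_retired.any"]
    icache_miss_cycles = v["l1i-loads-misses"] / v["inst_retired.any"]
    return dcache_miss_cpi, icache_miss_cycles, dcache_miss_cpi + icache_miss_cycles


# Values copied from the "lookup:" lines. Each metric gets its own table
# because the log shows the same event with different values per metric.
ipc_vals = {"inst_retired.any": 300.0, "cpu_clk_unhalted.thread": 200.0}
fe_vals = {
    "idq_uops_not_delivered.core": 300.0,
    "cpu_clk_unhalted.thread": 200.0,
    "cpu_clk_unhalted.one_thread_active": 400.0,
    "cpu_clk_unhalted.ref_xclk": 600.0,
}
cache_vals = {
    "l1d-loads-misses": 300.0,
    "l1i-loads-misses": 200.0,
    "inst_retired.any": 400.0,
}

dcache, icache, total = cache_miss_cycles(cache_vals)
print("IPC:", ipc(ipc_vals))
print("Frontend_Bound_SMT:", round(frontend_bound_smt(fe_vals), 4))
print("dcache_miss_cpi:", dcache, "icache_miss_cycles:", icache, "sum:", total)
```

The dcache_miss_cpi and icache_miss_cycles results agree with the 0.75 and
0.50 printed in the "EXIT" lines of the log.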

Re: [perf metricgroup] fcc9c5243c: perf-sanity-tests.Parse_and_process_metrics.fail

2020-10-19 Thread Ian Rogers
On Mon, Oct 19, 2020 at 2:51 AM John Garry  wrote:
>
> On 19/10/2020 00:30, Ian Rogers wrote:
> > On Sun, Oct 18, 2020 at 1:51 AM kernel test robot  
> > wrote:
> >>
> >> Greeting,
> >>
> >> FYI, we noticed the following commit (built with gcc-9):
> >>
> >> commit: fcc9c5243c478f104014daf4d23db86098d2aef0 ("perf metricgroup: Hack 
> >> a fix for aliases when covering multiple PMUs")
> >> url: 
> >> https://github.com/0day-ci/linux/commits/John-Garry/perf-pmu-events-Support-event-aliasing-for-system-PMUs/20201008-182049
> >>
> >>
> >> in testcase: perf-sanity-tests
> >> version: perf-x86_64-c85fb28b6f99-1_20201008
> >> with following parameters:
> >>
> >>  perf_compiler: gcc
> >>  ucode: 0xdc
> >>
> >>
> >>
> >> on test machine: 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz with 
> >> 32G memory
> >>
> >> caused below changes (please refer to attached dmesg/kmsg for entire 
> >> log/backtrace):
> >
> > I believe this is a Skylake and there is a known bug in the Skylake
> > metric DRAM_Parallel_Reads as described here:
> > https://lore.kernel.org/lkml/CAP-5=fxejvaqa9qfw66cy77qb962+jbe8tt5bslooocfmod...@mail.gmail.com/
> > Fixing the bug needs more knowledge than what is available in manuals.
> > Hopefully Intel can take a look.
> >
> > Thanks,
> > Ian
>
> So this named patch ("perf metricgroup: Hack a fix for aliases...") is
> breaking test #67 on my machine also, which is a broadwell.

Thanks for taking a look John. If you want help you can send the
output of "perf test 67 -vvv" to me. It is possible Broadwell has
similar glitches in the json to Skylake. I tested the original test on
server parts as I can access them as cloud machines.

> I will have a look, but I was hoping that Ian would have a proper fix
> for this on top of ("perf metricgroup: Fix uncore metric expressions"),
> which now looks to be merged.

I still have these changes to look at in my inbox but I'm assuming
they're good :-) Sorry for not getting to them, but it's good they are
merged.

Thanks,
Ian

> Thanks!
>
> >
> >>
> >>
> >> If you fix the issue, kindly add following tag
> >> Reported-by: kernel test robot 
> >>
> >>
> >> 2020-10-16 19:31:52 sudo 
> >> /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf
> >>  test 67
> >> 67: Parse and process metrics : FAILED!
> >> 2020-10-16 19:31:52 sudo 
> >> /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf
> >>  test 68
> >> 68: x86 rdpmc : Ok
> >> 2020-10-16 19:31:52 sudo 
> >> /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf
> >>  test 69
> >> 69: Convert perf time to TSC  : Ok
> >> 2020-10-16 19:31:52 sudo 
> >> /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf
> >>  test 70
> >> 70: DWARF unwind  : Ok
> >> 2020-10-16 19:31:52 sudo 
> >> /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf
> >>  test 71
> >> 71: x86 instruction decoder - new instructions: Ok
> >> 2020-10-16 19:31:52 sudo 
> >> /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf
> >>  test 72
> >> 72: Intel PT packet decoder   : Ok
> >> 2020-10-16 19:31:52 sudo 
> >> /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf
> >>  test 73
> >> 73: x86 bp modify : Ok
> >> 2020-10-16 19:31:53 sudo 
> >> /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf
> >>  test 74
> >> 74: probe libc's inet_pton & backtrace it with ping   : Ok
> >> 2020-10-16 19:31:54 sudo 
> >> /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf
> >>  test 75
> >> 75: Zstd perf.data compression/decompression  : Ok
> >>
> >>
> >>
> >> To reproduce:
> >>
> >>  git clone https://github.com/intel/lkp-tests.git
> >>  cd lkp-tests
> >>  bin/lkp install job.yaml  # job file is attached in this email
> >>  bin/lkp run job.yaml
> >>
> >>
> >>
> >> Thanks,
> >> Rong Chen
> >>
> > .
> >
>


Re: [perf metricgroup] fcc9c5243c: perf-sanity-tests.Parse_and_process_metrics.fail

2020-10-19 Thread Jin, Yao

Hi Garry, Hi Ian,

On 10/19/2020 5:48 PM, John Garry wrote:

On 19/10/2020 00:30, Ian Rogers wrote:

On Sun, Oct 18, 2020 at 1:51 AM kernel test robot  wrote:


Greeting,

FYI, we noticed the following commit (built with gcc-9):

commit: fcc9c5243c478f104014daf4d23db86098d2aef0 ("perf metricgroup: Hack a fix for aliases when 
covering multiple PMUs")
url: 
https://github.com/0day-ci/linux/commits/John-Garry/perf-pmu-events-Support-event-aliasing-for-system-PMUs/20201008-182049 




in testcase: perf-sanity-tests
version: perf-x86_64-c85fb28b6f99-1_20201008
with following parameters:

 perf_compiler: gcc
 ucode: 0xdc



on test machine: 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz with 32G 
memory

caused below changes (please refer to attached dmesg/kmsg for entire 
log/backtrace):


I believe this is a Skylake and there is a known bug in the Skylake
metric DRAM_Parallel_Reads as described here:
https://lore.kernel.org/lkml/CAP-5=fxejvaqa9qfw66cy77qb962+jbe8tt5bslooocfmod...@mail.gmail.com/
Fixing the bug needs more knowledge than what is available in manuals.
Hopefully Intel can take a look.

Thanks,
Ian


So this named patch ("perf metricgroup: Hack a fix for aliases...") is breaking test #67 on my 
machine also, which is a broadwell.


I will have a look, but I was hoping that Ian would have a proper fix for this on top of ("perf 
metricgroup: Fix uncore metric expressions"), which now looks to be merged.


Thanks!



I think these are two different issues.

On my KBL client, perf test #67 passes.

But DRAM_Parallel_Reads does have an issue.

root@kbl-ppc:~# perf stat -M DRAM_Parallel_Reads -- sleep 1
event syntax error: '{arb/event=0x80,umask=0x2/,arb/event=0x80,umask=0x2,thresh=1/}:W'
 \___ unknown term 'thresh' for pmu 'uncore_arb'

valid terms: event,edge,inv,umask,cmask,config,config1,config2,name,period,percore

Initial error:
event syntax error: '..umask=0x2/,arb/event=0x80,umask=0x2,thresh=1/}:W'
  \___ Cannot find PMU `arb'. Missing kernel support?

 Usage: perf stat [<options>] [<command>]

-M, --metrics <metric/metric group list>
  monitor specified metrics or metric groups (separated by ,)

I have a patch to fix DRAM_Parallel_Reads.

After:

root@kbl-ppc:~# perf stat -M MEM_Parallel_Reads -- sleep 1

 Performance counter stats for 'system wide':

 3,043,952  arb/event=0x80,umask=0x2/ # 1.00 MEM_Parallel_Reads

   1.000879932 seconds time elapsed

I will post the patch later.

Thanks
Jin Yao






If you fix the issue, kindly add following tag
Reported-by: kernel test robot 


2020-10-16 19:31:52 sudo 
/usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf 
test 67

67: Parse and process metrics : FAILED!
2020-10-16 19:31:52 sudo 
/usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf 
test 68

68: x86 rdpmc : Ok
2020-10-16 19:31:52 sudo 
/usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf 
test 69

69: Convert perf time to TSC  : Ok
2020-10-16 19:31:52 sudo 
/usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf 
test 70

70: DWARF unwind  : Ok
2020-10-16 19:31:52 sudo 
/usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf 
test 71

71: x86 instruction decoder - new instructions    : Ok
2020-10-16 19:31:52 sudo 
/usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf 
test 72

72: Intel PT packet decoder   : Ok
2020-10-16 19:31:52 sudo 
/usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf 
test 73

73: x86 bp modify : Ok
2020-10-16 19:31:53 sudo 
/usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf 
test 74

74: probe libc's inet_pton & backtrace it with ping   : Ok
2020-10-16 19:31:54 sudo 
/usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf 
test 75

75: Zstd perf.data compression/decompression  : Ok



To reproduce:

 git clone https://github.com/intel/lkp-tests.git
 cd lkp-tests
 bin/lkp install job.yaml  # job file is attached in this email
 bin/lkp run job.yaml



Thanks,
Rong Chen







Re: [perf metricgroup] fcc9c5243c: perf-sanity-tests.Parse_and_process_metrics.fail

2020-10-19 Thread John Garry

On 19/10/2020 00:30, Ian Rogers wrote:
> On Sun, Oct 18, 2020 at 1:51 AM kernel test robot  wrote:
>>
>> Greeting,
>>
>> FYI, we noticed the following commit (built with gcc-9):
>>
>> commit: fcc9c5243c478f104014daf4d23db86098d2aef0 ("perf metricgroup: Hack a fix for aliases when covering multiple PMUs")
>> url: https://github.com/0day-ci/linux/commits/John-Garry/perf-pmu-events-Support-event-aliasing-for-system-PMUs/20201008-182049
>>
>>
>> in testcase: perf-sanity-tests
>> version: perf-x86_64-c85fb28b6f99-1_20201008
>> with following parameters:
>>
>> perf_compiler: gcc
>> ucode: 0xdc
>>
>>
>>
>> on test machine: 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz with 32G memory
>>
>> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>
> I believe this is a Skylake and there is a known bug in the Skylake
> metric DRAM_Parallel_Reads as described here:
> https://lore.kernel.org/lkml/CAP-5=fxejvaqa9qfw66cy77qb962+jbe8tt5bslooocfmod...@mail.gmail.com/
> Fixing the bug needs more knowledge than what is available in manuals.
> Hopefully Intel can take a look.
>
> Thanks,
> Ian

So this named patch ("perf metricgroup: Hack a fix for aliases...") is 
also breaking test #67 on my machine, which is a Broadwell.


I will have a look, but I was hoping that Ian would have a proper fix 
for this on top of ("perf metricgroup: Fix uncore metric expressions"), 
which now looks to be merged.


Thanks!






>> If you fix the issue, kindly add following tag
>> Reported-by: kernel test robot 
>>
>>
>> 2020-10-16 19:31:52 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf test 67
>> 67: Parse and process metrics : FAILED!
>> 2020-10-16 19:31:52 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf test 68
>> 68: x86 rdpmc : Ok
>> 2020-10-16 19:31:52 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf test 69
>> 69: Convert perf time to TSC  : Ok
>> 2020-10-16 19:31:52 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf test 70
>> 70: DWARF unwind  : Ok
>> 2020-10-16 19:31:52 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf test 71
>> 71: x86 instruction decoder - new instructions: Ok
>> 2020-10-16 19:31:52 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf test 72
>> 72: Intel PT packet decoder   : Ok
>> 2020-10-16 19:31:52 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf test 73
>> 73: x86 bp modify : Ok
>> 2020-10-16 19:31:53 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf test 74
>> 74: probe libc's inet_pton & backtrace it with ping   : Ok
>> 2020-10-16 19:31:54 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf test 75
>> 75: Zstd perf.data compression/decompression  : Ok
>>
>>
>>
>> To reproduce:
>>
>> git clone https://github.com/intel/lkp-tests.git
>> cd lkp-tests
>> bin/lkp install job.yaml  # job file is attached in this email
>> bin/lkp run job.yaml
>>
>>
>>
>> Thanks,
>> Rong Chen







Re: [perf metricgroup] fcc9c5243c: perf-sanity-tests.Parse_and_process_metrics.fail

2020-10-19 Thread Jin, Yao




On 10/19/2020 9:52 AM, Andi Kleen wrote:
>> I believe this is a Skylake and there is a known bug in the Skylake
>> metric DRAM_Parallel_Reads as described here:
>> https://lore.kernel.org/lkml/CAP-5=fxejvaqa9qfw66cy77qb962+jbe8tt5bslooocfmod...@mail.gmail.com/
>> Fixing the bug needs more knowledge than what is available in manuals.
>> Hopefully Intel can take a look.
>
> Oh I missed the original mail for some reason.  Yes it should be cmask
> instead of thresh for client.  I think thresh is used on the server
> uncore only, not on the client.
>
> Jin Yao, can you send a patch please?
>
> -Andi



Yes, DRAM_Parallel_Reads works on server but fails on client.

I will post a patch to fix that.

Thanks
Jin Yao
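
The change Andi describes (use cmask rather than thresh on client parts) amounts to renaming one event term in the metric expression. The sketch below is illustrative only: the helper name `thresh_to_cmask` is hypothetical, and the sample expression (the `arb/event=0x80,umask=0x2/` event with an added `thresh=1` term) is an assumed stand-in, not the actual text of the Skylake JSON metric or of the patch that was later posted.

```python
import re


def thresh_to_cmask(metric_expr):
    """Rename every 'thresh' event term to 'cmask' in a metric expression.

    Handles both the plain form 'thresh=1' and the backslash-escaped form
    'thresh\\=1' (assumption: these are the only two spellings needed).
    """
    # The lookahead keeps the '=' (or '\=') and the value untouched.
    return re.sub(r"\bthresh(?=\\?=)", "cmask", metric_expr)


# Hypothetical client-side expression using the server-only 'thresh' term.
before = "arb/event=0x80,umask=0x2/ / arb/event=0x80,umask=0x2,thresh=1/"
after = thresh_to_cmask(before)
print(after)
```

Only the term name changes; the event code, umask and the term's value are left as they were.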



Re: [perf metricgroup] fcc9c5243c: perf-sanity-tests.Parse_and_process_metrics.fail

2020-10-18 Thread Andi Kleen
> I believe this is a Skylake and there is a known bug in the Skylake
> metric DRAM_Parallel_Reads as described here:
> https://lore.kernel.org/lkml/CAP-5=fxejvaqa9qfw66cy77qb962+jbe8tt5bslooocfmod...@mail.gmail.com/
> Fixing the bug needs more knowledge than what is available in manuals.
> Hopefully Intel can take a look.

Oh I missed the original mail for some reason.  Yes it should be cmask
instead of thresh for client.  I think thresh is used on the server
uncore only, not on the client.

Jin Yao, can you send a patch please?

-Andi



Re: [perf metricgroup] fcc9c5243c: perf-sanity-tests.Parse_and_process_metrics.fail

2020-10-18 Thread Ian Rogers
On Sun, Oct 18, 2020 at 1:51 AM kernel test robot  wrote:
>
> Greeting,
>
> FYI, we noticed the following commit (built with gcc-9):
>
> commit: fcc9c5243c478f104014daf4d23db86098d2aef0 ("perf metricgroup: Hack a fix for aliases when covering multiple PMUs")
> url: https://github.com/0day-ci/linux/commits/John-Garry/perf-pmu-events-Support-event-aliasing-for-system-PMUs/20201008-182049
>
>
> in testcase: perf-sanity-tests
> version: perf-x86_64-c85fb28b6f99-1_20201008
> with following parameters:
>
> perf_compiler: gcc
> ucode: 0xdc
>
>
>
> on test machine: 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz with 32G memory
>
> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):

I believe this is a Skylake and there is a known bug in the Skylake
metric DRAM_Parallel_Reads as described here:
https://lore.kernel.org/lkml/CAP-5=fxejvaqa9qfw66cy77qb962+jbe8tt5bslooocfmod...@mail.gmail.com/
Fixing the bug needs more knowledge than what is available in manuals.
Hopefully Intel can take a look.

Thanks,
Ian

>
>
> If you fix the issue, kindly add following tag
> Reported-by: kernel test robot 
>
>
> 2020-10-16 19:31:52 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf test 67
> 67: Parse and process metrics : FAILED!
> 2020-10-16 19:31:52 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf test 68
> 68: x86 rdpmc : Ok
> 2020-10-16 19:31:52 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf test 69
> 69: Convert perf time to TSC  : Ok
> 2020-10-16 19:31:52 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf test 70
> 70: DWARF unwind  : Ok
> 2020-10-16 19:31:52 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf test 71
> 71: x86 instruction decoder - new instructions: Ok
> 2020-10-16 19:31:52 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf test 72
> 72: Intel PT packet decoder   : Ok
> 2020-10-16 19:31:52 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf test 73
> 73: x86 bp modify : Ok
> 2020-10-16 19:31:53 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf test 74
> 74: probe libc's inet_pton & backtrace it with ping   : Ok
> 2020-10-16 19:31:54 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf test 75
> 75: Zstd perf.data compression/decompression  : Ok
>
>
>
> To reproduce:
>
> git clone https://github.com/intel/lkp-tests.git
> cd lkp-tests
> bin/lkp install job.yaml  # job file is attached in this email
> bin/lkp run job.yaml
>
>
>
> Thanks,
> Rong Chen
>