Re: perf: Add support for full Intel event lists v7

2014-07-18 Thread Michael Ellerman
On Fri, 2014-07-11 at 16:59 -0700, Andi Kleen wrote:
> All feedback addressed. Hopefully ready for merge now.
...

> The JSON format and perf parser has some minor Intelisms, but they
> are simple and small and optional. It's easy to extend, so it would be
> possible to use it for other CPUs too, add different pmu attributes, and
> add new download sites to the downloader tool.

Yeah I agree.

We'll do some follow-up patches to get it working for the IBM powerpc chips.

cheers


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


perf: Add support for full Intel event lists v7

2014-07-11 Thread Andi Kleen
All feedback addressed. Hopefully ready for merge now.

[v2: Review feedback addressed and some minor improvements]
[v3: More review feedback addressed and handle test failures better.
Ported to latest tip/core.]
[v4: Addressed Namhyung's feedback]
[v5: Rebase to latest tree. Minor description update.]
[v6: Rebase. Add acked by from Namhyung and address feedback. Some minor
fixes. Should be good to go now I hope. The period patch was dropped,
as that is already handled. I added an extra patch for a --quiet argument
for perf list]
[v7: Address Jiri's feedback. Various changes and some patches
were split. perf download uses curl now instead of wget.]

perf has high-level events which are useful in many cases. However,
there are some tuning situations where low-level events in the CPU
are needed. Traditionally this required specifying the event in
raw form (very awkward), using non-standard frontends like ocperf,
or patching in libpfm.

Intel CPUs can have very large event files (Haswell has ~336 core events,
many more if you add uncore or all the offcore combinations), which are too
large to describe through the kernel interface. Doing so would tie up
significant amounts of unswappable memory.

oprofile always had separate event list files that were maintained by 
the CPU vendors. The oprofile events were shipped with the tool.
The Intel events get updated regularly, for example to add references
to the specification updates or add new events.

Unfortunately oprofile usually did not keep up with these updates,
so the events in oprofile were often out of date. In addition
it ties up quite a bit of disk space, mostly for CPUs you don't have.

This patch kit implements another mechanism that avoids these problems.
Intel releases the event lists for CPUs in a standardized JSON format
on a download server.

I implemented an automatic downloader to get the event file for the
current CPU.  The events are stored in ~/.cache/pmu-events.
Then perf adds a parser that converts the JSON format into perf event
aliases, which then can be used directly as any other perf event.
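
As a rough illustration of the JSON-to-alias step, here is a small Python
sketch (not the C parser from the patch kit). It turns event entries into
lower-case alias names mapped to a perf term string. The UMask field, the
event=/umask= term names and the function name json_to_aliases are
assumptions made for illustration; the second sample entry is made up.

```python
import json

def json_to_aliases(text):
    """Map lower-cased event names to (perf term string, description)."""
    aliases = {}
    for entry in json.loads(text):
        # Codes are strings in the file; base 0 accepts "2" as well as "0x1E".
        terms = ["event=%#x" % int(entry["EventCode"], 0)]
        # UMask is assumed to be optional (hypothetical field handling).
        if "UMask" in entry:
            terms.append("umask=%#x" % int(entry["UMask"], 0))
        name = entry["EventName"].lower()
        aliases[name] = (",".join(terms), entry.get("BriefDescription", ""))
    return aliases

sample = """[
  {"EventCode": "0x1E", "EventName": "PM_CYC",
   "BriefDescription": "Cycles completed"},
  {"EventCode": "0x10", "UMask": "0x01", "EventName": "DEMO.EVENT"}
]"""
for name, (terms, desc) in sorted(json_to_aliases(sample).items()):
    print(name, terms, desc)
```

Keeping the result as a plain name-to-terms mapping mirrors the idea that an
alias is only a shorthand for a raw event specification.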

The parsing is done using a simple existing JSON library.

The events are still abstracted for perf, but the abstraction mechanism is
through the downloaded file instead of through the kernel.

The JSON format and perf parser have some minor Intelisms, but they
are simple, small and optional. The format is easy to extend, so it would be
possible to use it for other CPUs too, add different pmu attributes, and
add new download sites to the downloader tool.

Currently only core events are supported; uncore may come at a later
point. There are no kernel changes; all code is in the perf user tools.

Some of the parser files are partially shared with a separate event parser
library and are thus 2-clause BSD licensed.

Patches also available from
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc perf/json

Example output:

% perf download 
Downloading models file
Downloading readme.txt
Downloading events file
% perf list
...
  br_inst_exec.all_branches      [Speculative and retired
                                  branches]
  br_inst_exec.all_conditional   [Speculative and retired
                                  macro-conditional
                                  branches]
  br_inst_exec.all_direct_jmp    [Speculative and retired
                                  macro-unconditional
                                  branches excluding
                                  calls and indirects]
... 333 more new events ...

% perf stat -e br_inst_exec.all_direct_jmp true

 Performance counter stats for 'true':

 6,817  cpu/br_inst_exec.all_direct_jmp/
   

   0.003503212 seconds time elapsed

One nice feature is that a pointer to the specification update is now
included in the description, which will hopefully clear up many problems:

% perf list
...
  mem_load_uops_l3_hit_retired.xsnp_hit  [Retired load uops which
                                          data sources were L3
                                          and cross-core snoop
                                          hits in on-pkg core
                                          cache. Supports address
                                          when precise. Spec
                                          update: HSM26, HSM30
                                          (Precise event)]
...


-Andi


Re: perf: Add support for full Intel event lists v7

2014-07-09 Thread Jiri Olsa
On Fri, Jun 27, 2014 at 04:15:55PM -0700, Andi Kleen wrote:
> Should be ready for merge now. Please consider.
> 
> [v2: Review feedback addressed and some minor improvements]
> [v3: More review feedback addressed and handle test failures better.
> Ported to latest tip/core.]
> [v4: Addressed Namhyung's feedback]
> [v5: Rebase to latest tree. Minor description update.]
> [v6: Rebase. Add acked by from Namhyung and address feedback. Some minor
> fixes. Should be good to go now I hope. The period patch was dropped,
> as that is already handled. I added an extra patch for a --quiet argument
> for perf list]
> [v7: Just rebase to latest tip/core. Should be ready to merge.]

please remove 'vX' suffixes from patch subjects

jirka


Re: perf: Add support for full Intel event lists v7

2014-07-08 Thread Sukadev Bhattiprolu
Andi Kleen [a...@firstfloor.org] wrote:
| Works for me with your input file:
| 
| % perf list --events-file t.json
| ...
|   pm_cyc [Cycles completed]
|   pm_inst_cmpl   [Instructions completed]

Ah, lower case. 

| 
| > With the above events file, I get "invalid event" for 'PM_INST_CMPL:u'
| 
| It works in new style syntax, like
| 
| perf stat --events-file t.json  -e cpu/pm_inst_cmpl/u  ls

Yes, thanks.

Sukadev



Re: perf: Add support for full Intel event lists v7

2014-07-08 Thread Andi Kleen
Thanks for testing.

On Tue, Jul 08, 2014 at 11:43:11AM -0700, Sukadev Bhattiprolu wrote:
> | The JSON format and perf parser has some minor Intelisms, but they
> | are simple and small and optional. It's easy to extend, so it would be
> | possible to use it for other CPUs too, add different pmu attributes, and
> | add new download sites to the downloader tool.
> 
> Is there a minimal set of JSON entries an architecture would need ?

That should be enough, assuming the EventCode is enough to select
the event.

> 
> I tried the following on Power
>   [
>     {
>       "EventCode": "2",
>       "EventName": "PM_INST_CMPL",
>       "BriefDescription": "Instructions completed",
>       "PublicDescription": "Number of PPC instructions finished",
>     },
>     {
>       "EventCode": "0x1E",
>       "EventName": "PM_CYC",
>       "BriefDescription": "Cycles completed",
>       "PublicDescription": "Number of PPC cycles finished",
>     }
>   ]
> 
>   /tmp/perf record --events-file=/tmp/power8.json -e PM_INST_CMPL sleep 1
> 
> works, but for some TBD reason,
> 
>   /tmp/perf list --events-file=/tmp/power8.json doesn't list PM_INST_CMPL.

Works for me with your input file:

% perf list --events-file t.json
...
  pm_cyc [Cycles completed]
  pm_inst_cmpl   [Instructions completed]
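
The lower-case names in this listing can be reproduced with a small Python
sketch. It assumes, based only on the output above, that names are
lower-cased and sorted and that the brief description is shown in brackets.
format_listing is a hypothetical helper, not perf code.

```python
def format_listing(events):
    """Return 'perf list'-style lines for a list of event dicts."""
    rows = sorted((e["EventName"].lower(), e["BriefDescription"]) for e in events)
    width = max(len(name) for name, _ in rows)
    # Pad names to a common width so the bracketed descriptions line up.
    return ["  %-*s [%s]" % (width, name, desc) for name, desc in rows]

events = [
    {"EventName": "PM_INST_CMPL", "BriefDescription": "Instructions completed"},
    {"EventName": "PM_CYC", "BriefDescription": "Cycles completed"},
]
print("\n".join(format_listing(events)))
```

Searching such a listing for the upper-case name from the JSON file finds
nothing, which would explain an event that records fine but seems to be
missing from the list.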


> Another observation was that the order of --events-file and -e is significant.
> Maybe worth a note in the man page.

Will add.

> Can you specify the qualifiers like ':k' or ':ku' with the events on
> Intel ? to only monitor kernel or user ? Or do they need some additional
> JSON entries ?

They can all be specified.


> With the above events file, I get "invalid event" for 'PM_INST_CMPL:u'

It works in new style syntax, like

perf stat --events-file t.json  -e cpu/pm_inst_cmpl/u  ls
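
To make the difference between the two spellings concrete, here is a small
Python sketch that rewrites an old-style NAME:modifiers string into the
new-style cpu/name/modifiers form used in the command above. The helper name
to_new_style and the default "cpu" PMU name are assumptions for illustration;
this is not how perf itself parses events.

```python
def to_new_style(spec, pmu="cpu"):
    """Rewrite 'NAME:mods' as 'pmu/name/mods' with a lower-cased name."""
    name, _, mods = spec.partition(":")
    # JSON event aliases are listed in lower case, so normalise the name.
    return "%s/%s/%s" % (pmu, name.lower(), mods)

print(to_new_style("PM_INST_CMPL:u"))  # prints cpu/pm_inst_cmpl/u
```

A spec without modifiers simply yields an empty modifier part, for example
cpu/pm_cyc/.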


Thanks,
-Andi

-- 
a...@linux.intel.com -- Speaking for myself only.


Re: perf: Add support for full Intel event lists v7

2014-07-08 Thread Sukadev Bhattiprolu
Andi Kleen [a...@firstfloor.org] wrote:
| Should be ready for merge now. Please consider.

Overall I think it is a cool feature.  I was able to run some simple
tests on Power8 (by explicitly specifying the JSON file). I have a couple
of questions below.

| 
| [v2: Review feedback addressed and some minor improvements]
| [v3: More review feedback addressed and handle test failures better.
| Ported to latest tip/core.]
| [v4: Addressed Namhyung's feedback]
| [v5: Rebase to latest tree. Minor description update.]
| [v6: Rebase. Add acked by from Namhyung and address feedback. Some minor
| fixes. Should be good to go now I hope. The period patch was dropped,
| as that is already handled. I added an extra patch for a --quiet argument
| for perf list]
| [v7: Just rebase to latest tip/core. Should be ready to merge.]
| 
| perf has high level events which are useful in many cases. However
| there are some tuning situations where low level events in the CPU
| are needed. Traditionally this required specifying the event in 
| raw form (very awkward) or using non standard frontends
| like ocperf or patching in libpfm.
| 
| Intel CPUs can have very large event files (Haswell has ~336 core events,
| much more if you add uncore or all the offcore combinations), which is too
| large to describe through the kernel interface. It would require tying up
| significant amounts of unswappable memory for this.
| 
| oprofile always had separate event list files that were maintained by 
| the CPU vendors. The oprofile events were shipped with the tool.
| The Intel events get updated regularly, for example to add references
| to the specification updates or add new events.
| 
| Unfortunately oprofile usually did not keep up with these updates,
| so the events in oprofile were often out of date. In addition
| it ties up quite a bit of disk space, mostly for CPUs you don't have.
| 
| This patch kit implements another mechanism that avoids these problems.
| Intel releases the event lists for CPUs in a standardized JSON format
| on a download server.
| 
| I implemented an automatic downloader to get the event file for the
| current CPU.  The events are stored in ~/.cache/pmu-events.
| Then perf adds a parser that converts the JSON format into perf event
| aliases, which then can be used directly as any other perf event.
| 
| The parsing is done using a simple existing JSON library.
| 
| The events are still abstracted for perf, but the abstraction mechanism is
| through the downloaded file instead of through the kernel.
| 
| The JSON format and perf parser has some minor Intelisms, but they
| are simple and small and optional. It's easy to extend, so it would be
| possible to use it for other CPUs too, add different pmu attributes, and
| add new download sites to the downloader tool.

Is there a minimal set of JSON entries an architecture would need ?

I tried the following on Power
[
  {
    "EventCode": "2",
    "EventName": "PM_INST_CMPL",
    "BriefDescription": "Instructions completed",
    "PublicDescription": "Number of PPC instructions finished",
  },
  {
    "EventCode": "0x1E",
    "EventName": "PM_CYC",
    "BriefDescription": "Cycles completed",
    "PublicDescription": "Number of PPC cycles finished",
  }
]

/tmp/perf record --events-file=/tmp/power8.json -e PM_INST_CMPL sleep 1

works, but for some TBD reason,

/tmp/perf list --events-file=/tmp/power8.json doesn't list PM_INST_CMPL.

Another observation was that the order of --events-file and -e is significant.
Maybe worth a note in the man page.

| 
| Currently only core events are supported, uncore may come at a later
| point. No kernel changes, all code in perf user tools only.
| 
| Some of the parser files are partially shared with separate event parser
| library and are thus 2-clause BSD licensed.
| 
| Patches also available from
| git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc perf/json
| 
| Example output:
| 
| % perf download 
| Downloading models file
| Downloading readme.txt
| 2014-03-05 10:39:33 URL:https://download.01.org/perfmon/readme.txt [10320/10320] -> "readme.txt" [1]
| 2014-03-05 10:39:34 URL:https://download.01.org/perfmon/mapfile.csv [1207/1207] -> "mapfile.csv" [1]
| Downloading events file
| % perf list
| ...
|   br_inst_exec.all_branches      [Speculative and retired
|                                   branches]
|   br_inst_exec.all_conditional   [Speculative and retired
|                                   macro-conditional
|                                   branches]
|   br_inst_exec.all_direct_jmp    [Speculative and retired
|                                   macro-unconditional
|                                   branches excluding
|   

perf: Add support for full Intel event lists v7

2014-06-27 Thread Andi Kleen
Should be ready for merge now. Please consider.

[v2: Review feedback addressed and some minor improvements]
[v3: More review feedback addressed and handle test failures better.
Ported to latest tip/core.]
[v4: Addressed Namhyung's feedback]
[v5: Rebase to latest tree. Minor description update.]
[v6: Rebase. Add acked by from Namhyung and address feedback. Some minor
fixes. Should be good to go now I hope. The period patch was dropped,
as that is already handled. I added an extra patch for a --quiet argument
for perf list]
[v7: Just rebase to latest tip/core. Should be ready to merge.]

perf has high-level events which are useful in many cases. However,
there are some tuning situations where low-level events in the CPU
are needed. Traditionally this required specifying the event in
raw form (very awkward), using non-standard frontends like ocperf,
or patching in libpfm.

Intel CPUs can have very large event files (Haswell has ~336 core events,
many more if you add uncore or all the offcore combinations), which are too
large to describe through the kernel interface. Doing so would tie up
significant amounts of unswappable memory.

oprofile always had separate event list files that were maintained by 
the CPU vendors. The oprofile events were shipped with the tool.
The Intel events get updated regularly, for example to add references
to the specification updates or add new events.

Unfortunately oprofile usually did not keep up with these updates,
so the events in oprofile were often out of date. In addition
it ties up quite a bit of disk space, mostly for CPUs you don't have.

This patch kit implements another mechanism that avoids these problems.
Intel releases the event lists for CPUs in a standardized JSON format
on a download server.

I implemented an automatic downloader to get the event file for the
current CPU.  The events are stored in ~/.cache/pmu-events.
Then perf adds a parser that converts the JSON format into perf event
aliases, which then can be used directly as any other perf event.

The parsing is done using a simple existing JSON library.

The events are still abstracted for perf, but the abstraction mechanism is
through the downloaded file instead of through the kernel.

The JSON format and perf parser have some minor Intelisms, but they
are simple, small and optional. The format is easy to extend, so it would be
possible to use it for other CPUs too, add different pmu attributes, and
add new download sites to the downloader tool.

Currently only core events are supported; uncore may come at a later
point. There are no kernel changes; all code is in the perf user tools.

Some of the parser files are partially shared with a separate event parser
library and are thus 2-clause BSD licensed.

Patches also available from
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc perf/json

Example output:

% perf download 
Downloading models file
Downloading readme.txt
2014-03-05 10:39:33 URL:https://download.01.org/perfmon/readme.txt [10320/10320] -> "readme.txt" [1]
2014-03-05 10:39:34 URL:https://download.01.org/perfmon/mapfile.csv [1207/1207] -> "mapfile.csv" [1]
Downloading events file
% perf list
...
  br_inst_exec.all_branches      [Speculative and retired
                                  branches]
  br_inst_exec.all_conditional   [Speculative and retired
                                  macro-conditional
                                  branches]
  br_inst_exec.all_direct_jmp    [Speculative and retired
                                  macro-unconditional
                                  branches excluding
                                  calls and indirects]
... 333 more new events ...

% perf stat -e br_inst_exec.all_direct_jmp true

 Performance counter stats for 'true':

 6,817  cpu/br_inst_exec.all_direct_jmp/
   

   0.003503212 seconds time elapsed

One nice feature is that a pointer to the specification update is now
included in the description, which will hopefully clear up many problems:

% perf list
...
  mem_load_uops_l3_hit_retired.xsnp_hit  [Retired load uops which
                                          data sources were L3
                                          and cross-core snoop
                                          hits in on-pkg core
                                          cache. Supports address
                                          when precise. Spec
                                          update: HSM26, HSM30
                                          (Precise event)]
...


-Andi
