Re: [PATCH 2/4] perf: jevents: Program to convert JSON file to C style file

2015-05-31 Thread Andi Kleen

Ok I did some scripting to add these topics you requested to the Intel JSON
files, and changed perf list to group events by them.

I'll redirect any questions on their value to you.  
And I certainly hope this is the last of your improvements for now.

The updated event lists are available in

git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc perf/intel-json-files-3

The updated patches are available in 

git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc perf/builtin-json-6

Also posted separately.

The output looks like this

% perf list
...
Cache:
  l1d.replacement   
   [L1D data line replacements]
  l1d_pend_miss.pending 
   [L1D miss oustandings duration in cycles]
  l1d_pend_miss.pending_cycles  
   [Cycles with L1D load Misses outstanding]
...
Floating point:
  fp_assist.any 
   [Cycles with any input/output SSE or FP assist]
  fp_assist.simd_input  
   [Number of SIMD FP assists due to input values]
  fp_assist.simd_output 
   [Number of SIMD FP assists due to Output values]
...
Memory:
  machine_clears.memory_ordering
   [Counts the number of machine clears due to memory order conflicts]
  mem_trans_retired.load_latency_gt_128 
   [Loads with latency value being above 128 (Must be precise)]
  mem_trans_retired.load_latency_gt_16  
   [Loads with latency value being above 16 (Must be precise)]
...
Pipeline:
  arith.fpu_div 
   [Divide operations executed]
  arith.fpu_div_active  
   [Cycles when divider is busy executing divide operations]
  baclears.any  
   [Counts the total number when the front end is resteered, mainly when
    the BPU cannot provide a correct prediction and this is corrected by
    other branch handling mechanisms at the front end]
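
As a rough illustration of the grouping shown above (this is not the actual
perf list code; the struct and function names are invented for this sketch),
sorting the events by topic and emitting one header per topic could look
like this:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical event record; the field names are assumptions, not perf's. */
struct topic_event {
	const char *topic;
	const char *name;
	const char *desc;
};

/* Order by topic first, then by event name within a topic. */
static int cmp_topic_event(const void *a, const void *b)
{
	const struct topic_event *x = a, *y = b;
	int c = strcmp(x->topic, y->topic);

	return c ? c : strcmp(x->name, y->name);
}

/*
 * Sort the events in place and print them grouped under "Topic:" headers.
 * Returns the number of topic headers printed.
 */
int print_by_topic(struct topic_event *ev, size_t n, FILE *out)
{
	const char *last = NULL;
	int headers = 0;
	size_t i;

	assert(ev || n == 0);
	qsort(ev, n, sizeof(*ev), cmp_topic_event);
	for (i = 0; i < n; i++) {
		if (!last || strcmp(last, ev[i].topic)) {
			fprintf(out, "%s:\n", ev[i].topic);
			last = ev[i].topic;
			headers++;
		}
		fprintf(out, "  %s\n       [%s]\n", ev[i].name, ev[i].desc);
	}
	return headers;
}
```

Calling print_by_topic(events, count, stdout) on an unsorted array prints
each topic header once, followed by that topic's events.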


-Andi

P.S.: You may want to look up the definition of logical fallacy in wikipedia.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 2/4] perf: jevents: Program to convert JSON file to C style file

2015-05-29 Thread Ingo Molnar

* Andi Kleen a...@linux.intel.com wrote:

  So instead of this flat structure, there should at minimum be broad
  categorization of the various parts of the hardware they relate to: whether
  they relate to the branch predictor, memory caches, TLB caches, memory ops,
  offcore, decoders, execution units, FPU ops, etc., etc. - so that they can be
  queried via 'perf list'.
 
 The categorization is generally on the stem name, which already works fine
 with the existing perf list wildcard support. So for example you only want
 branches.

 perf list br*
 ...
   br_inst_exec.all_branches
    [Speculative and retired branches]
   br_inst_exec.all_conditional
    [Speculative and retired macro-conditional branches]
   br_inst_exec.all_direct_jmp
    [Speculative and retired macro-unconditional branches excluding calls
     and indirects]
   br_inst_exec.all_direct_near_call
    [Speculative and retired direct near calls]
   br_inst_exec.all_indirect_jump_non_call_ret
    [Speculative and retired indirect branches excluding calls and returns]
   br_inst_exec.all_indirect_near_return
    [Speculative and retired indirect return branches]
 ...
 
 Or mid level cache events:
 
 perf list l2*
 ...
   l2_l1d_wb_rqsts.all
    [Not rejected writebacks from L1D to L2 cache lines in any state]
   l2_l1d_wb_rqsts.hit_e
    [Not rejected writebacks from L1D to L2 cache lines in E state]
   l2_l1d_wb_rqsts.hit_m
    [Not rejected writebacks from L1D to L2 cache lines in M state]
   l2_l1d_wb_rqsts.miss
    [Count the number of modified Lines evicted from L1 and missed L2.
     (Non-rejected WBs from the DCU.)]
   l2_lines_in.all
    [L2 cache lines filling L2]
 ...
 
 There are some exceptions, but generally it works this way.

You are missing my point in several ways:

1)

Firstly, there are _tons_ of 'exceptions' to the 'stem name' grouping, to the 
level that makes it unusable for high level grouping of events.

Here's the 'stem name' histogram on the SandyBridge event list:

  $ grep EventName pmu-events/arch/x86/SandyBridge_core.json | cut -d\. -f1 |
      cut -d\" -f4 | cut -d\_ -f1 | sort | uniq -c | sort -n

  1 AGU
  1 BACLEARS
  1 EPT
  1 HW
  1 ICACHE
  1 INSTS
  1 PAGE
  1 ROB
  1 RS
  1 SQ
  2 ARITH
  2 DSB2MITE
  2 ILD
  2 LOAD
  2 LOCK
  2 LONGEST
  2 MISALIGN
  2 SIMD
  2 TLB
  3 CPL
  3 DSB
  3 INST
  3 INT
  3 LSD
  3 MACHINE
  4 CPU
  4 OTHER
  4 PARTIAL
  5 CYCLE
  5 ITLB
  6 LD
  7 L1D
  8 DTLB
 10 FP
 12 RESOURCE
 21 UOPS
 24 IDQ
 25 MEM
 37 BR
 37 L2
131 OFFCORE

Out of 386 events. This grouping has the following severe problems:

  - that's 41 'stem name' groups, way too many as a first hop high level
    structure. We want the kind of high level categorization I suggested:
    cache, decoding, branches, execution pipeline, memory events, vector unit
    events - which broad categories exist in all CPUs and are microarchitecture
    independent.

  - even these 'stem names' are mostly unstructured and unreadable. The two
    examples you cited are the best case that are borderline readable, but they
    cover less than 20% of all events.

  - the 'stem name' concept is not even used consistently, the names are
    essentially a random collection of Intel internal acronyms, which
    occasionally match up with high level concepts. These vendor defined names
    have very poor high level structure.

  - the 'stem names' are totally imbalanced: there's one 'super' category 'stem
    name': OFFCORE_RESPONSE, with 131 events in it and then there are super
    small groups in the list above. Not well suited to get a good overview
    about what measurement capabilities the hardware has.

So forget about using 'stem names' as the high level structure. These events
have no high level structure and we should provide that, instead of dumping
380+ events on the unsuspecting user.
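
To make the 'stem name' used in the histogram above concrete: the shell
pipeline keeps the text before the first '.', and then the text before the
first '_'. A minimal C sketch of that rule follows; the function name
event_stem is invented for illustration and is not part of perf.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the stem of an event name into buf: everything before the first
 * '.' or '_', whichever comes first. This matches cutting at '.' and then
 * at '_'. The result is truncated to fit buflen and always NUL-terminated.
 * Returns the length of the stem stored in buf.
 */
size_t event_stem(const char *name, char *buf, size_t buflen)
{
	size_t n;

	assert(name && buf && buflen > 0);
	n = strcspn(name, "._");
	if (n >= buflen)
		n = buflen - 1;
	memcpy(buf, name, n);
	buf[n] = '\0';
	return n;
}
```

With this rule IDQ.DSB_CYCLES yields IDQ and INST_RETIRED.ANY yields INST,
so the stem depends entirely on where the vendor happened to put the first
separator.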

2)

Secondly, categorization and higher level hierarchy should be used to keep the
list manageable. The fact that if _you_ know what to search for you can list
just a subset does not mean anything to the new user trying to discover events.

A simple 'perf list' should list the high level categories by default, with a 
count displayed that shows how many further events are within that category. 
(compacted tree output would be usable as well.)
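
A sketch of what such a category summary could compute, assuming each event
already carries a category string (the struct, the function names and the
output format are all hypothetical, not existing perf code):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical event record with a high level category attached. */
struct cat_event {
	const char *category;
	const char *name;
};

/* Count how many events belong to the given category. */
size_t count_category(const struct cat_event *ev, size_t n,
		      const char *category)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (!strcmp(ev[i].category, category))
			count++;
	return count;
}

/*
 * Print one "Category (N events)" line per distinct category, in order of
 * first appearance. Quadratic in n, which is fine for a few hundred events.
 * Returns the number of distinct categories.
 */
int print_category_summary(const struct cat_event *ev, size_t n, FILE *out)
{
	int categories = 0;
	size_t i, j;

	assert(ev || n == 0);
	for (i = 0; i < n; i++) {
		/* Skip categories that were already printed. */
		for (j = 0; j < i; j++)
			if (!strcmp(ev[j].category, ev[i].category))
				break;
		if (j < i)
			continue;
		fprintf(out, "%s (%zu events)\n", ev[i].category,
			count_category(ev, n, ev[i].category));
		categories++;
	}
	return categories;
}
```

A plain 'perf list' could then show only these summary lines, and list the
individual events once the user asks for a specific category.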

 The stem could be put into a separate header, but it would seem redundant to
 me.

Higher level categories simply don't exist in these names in any usable form,
so they have to be created. Just redundantly 

Re: [PATCH 2/4] perf: jevents: Program to convert JSON file to C style file

2015-05-28 Thread Jiri Olsa
On Wed, May 27, 2015 at 11:59:04PM +0900, Namhyung Kim wrote:
 Hi Andi,
 
 On Wed, May 27, 2015 at 11:40 PM, Andi Kleen a...@linux.intel.com wrote:
  So we build tables of all models in the architecture, and choose
  matching one when compiling perf, right?  Can't we do that when
  building the tables?  IOW, why don't we check the VFM and discard
  non-matching tables?  Those non-matching tables are also needed?
 
  We build it for all cpus in an architecture, not all architectures.
  So e.g. for an x86 binary power is not included, and vice versa.
 
 OK.
 
  It always includes all CPUs for a given architecture, so it's possible
  to use the perf binary on other systems than just the one it was
  build on.
 
 So it selects one at run-time not build-time, good.  But I worry about
 the size of the intel tables.  How large are they?  Maybe we can make
 it dynamic-loadable if needed..

just compiled Sukadev's new version with Andi's events list
and stripped binary size is:

[jolsa@krava perf]$ ls -l perf
-rwxrwxr-x 1 jolsa jolsa 2772640 May 28 13:49 perf


while perf on Arnaldo's perf/core is:

[jolsa@krava perf]$ ls -l perf
-rwxrwxr-x 1 jolsa jolsa 2334816 May 28 13:49 perf


seems not that bad

jirka

Re: [PATCH 2/4] perf: jevents: Program to convert JSON file to C style file

2015-05-28 Thread Ingo Molnar

* Ingo Molnar mi...@kernel.org wrote:

 
 * Jiri Olsa jo...@redhat.com wrote:
 
  On Wed, May 27, 2015 at 11:59:04PM +0900, Namhyung Kim wrote:
   Hi Andi,
   
   On Wed, May 27, 2015 at 11:40 PM, Andi Kleen a...@linux.intel.com wrote:
So we build tables of all models in the architecture, and choose
matching one when compiling perf, right?  Can't we do that when
building the tables?  IOW, why don't we check the VFM and discard
non-matching tables?  Those non-matching tables are also needed?
   
We build it for all cpus in an architecture, not all architectures.
So e.g. for an x86 binary power is not included, and vice versa.
   
   OK.
   
It always includes all CPUs for a given architecture, so it's possible
to use the perf binary on other systems than just the one it was
build on.
   
   So it selects one at run-time not build-time, good.  But I worry about
   the size of the intel tables.  How large are they?  Maybe we can make
   it dynamic-loadable if needed..
  
  just compiled Sukadev's new version with Andi's events list
  and stripped binary size is:
  
  [jolsa@krava perf]$ ls -l perf
  -rwxrwxr-x 1 jolsa jolsa 2772640 May 28 13:49 perf
  
  
  while perf on Arnaldo's perf/core is:
  
  [jolsa@krava perf]$ ls -l perf
  -rwxrwxr-x 1 jolsa jolsa 2334816 May 28 13:49 perf
  
  seems not that bad
 
 It's not bad at all.
 
 Do you have a Git tree URI where I could take a look at its current state? A 
 tree would be nice that has as many of these patches integrated as possible.

A couple of observations:

1)

The x86 JSON files are unnecessarily large, and for no good reason, for example:

 triton:~/tip/tools/perf/pmu-events/arch/x86> grep -h EdgeDetect * | sort | uniq -c
    5534 "EdgeDetect": "0",
      57 "EdgeDetect": "1",
it's ridiculous to repeat EdgeDetect: 0 more than 5 thousand times, just so 
that in 57 cases we can say '1'. Those lines should be omitted, and the default 
value should be 0.
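
If defaulted fields were omitted as suggested, whatever reads the files has
to supply the default itself. A minimal sketch of such a lookup, using a
hypothetical flat key/value view of one event (this is not how jevents
stores fields):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical key/value pair as read from one JSON event object. */
struct json_field {
	const char *key;
	const char *value;
};

/*
 * Return the value stored for key, or def when the event does not carry
 * that field at all.
 */
const char *field_or_default(const struct json_field *fields, size_t n,
			     const char *key, const char *def)
{
	size_t i;

	assert(key);
	for (i = 0; i < n; i++)
		if (!strcmp(fields[i].key, key))
			return fields[i].value;
	return def;
}
```

An event that never mentions EdgeDetect would then read as "0", and only the
57 events that need it would spell out "1".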

This would reduce the source code line count of the JSON files by 40% already:

 triton:~/tip/tools/perf/pmu-events/arch/x86> grep ': "0",' * | wc -l
 42127
 triton:~/tip/tools/perf/pmu-events/arch/x86> cat * | wc -l
 103702

And no, I don't care if manufacturers release crappy JSON files - they need to
be fixed/stripped before being applied to our source tree.

2)

Also, the JSON files should carry more high level structure than they do today.
Let's take SandyBridge_core.json as an example: it defines 386 events, but they
are all in a 'flat' hierarchy, which is almost impossible for all but the most
expert users to overview.

So instead of this flat structure, there should at minimum be broad
categorization of the various parts of the hardware they relate to: whether
they relate to the branch predictor, memory caches, TLB caches, memory ops,
offcore, decoders, execution units, FPU ops, etc., etc. - so that they can be
queried via 'perf list'.

We don't just want to import the unstructured mess that these event files are -
we want to turn them into real structure. We can still keep the messy vendor
names as well, like IDQ.DSB_CYCLES, but we want to impose structure as well.

3)

There should be good 'perf list' visualization for these events: grouping,
individual names, with a good interface to query details if needed. I.e. it
should be possible to browse and discover events relevant to the CPU the tool
is executing on.

Thanks,

Ingo

Re: [PATCH 2/4] perf: jevents: Program to convert JSON file to C style file

2015-05-28 Thread Ingo Molnar

* Jiri Olsa jo...@redhat.com wrote:

 On Wed, May 27, 2015 at 11:59:04PM +0900, Namhyung Kim wrote:
  Hi Andi,
  
  On Wed, May 27, 2015 at 11:40 PM, Andi Kleen a...@linux.intel.com wrote:
   So we build tables of all models in the architecture, and choose
   matching one when compiling perf, right?  Can't we do that when
   building the tables?  IOW, why don't we check the VFM and discard
   non-matching tables?  Those non-matching tables are also needed?
  
   We build it for all cpus in an architecture, not all architectures.
   So e.g. for an x86 binary power is not included, and vice versa.
  
  OK.
  
   It always includes all CPUs for a given architecture, so it's possible
   to use the perf binary on other systems than just the one it was
   build on.
  
  So it selects one at run-time not build-time, good.  But I worry about
  the size of the intel tables.  How large are they?  Maybe we can make
  it dynamic-loadable if needed..
 
 just compiled Sukadev's new version with Andi's events list
 and stripped binary size is:
 
 [jolsa@krava perf]$ ls -l perf
 -rwxrwxr-x 1 jolsa jolsa 2772640 May 28 13:49 perf
 
 
 while perf on Arnaldo's perf/core is:
 
 [jolsa@krava perf]$ ls -l perf
 -rwxrwxr-x 1 jolsa jolsa 2334816 May 28 13:49 perf
 
 seems not that bad

It's not bad at all.

Do you have a Git tree URI where I could take a look at its current state? A
tree would be nice that has as many of these patches integrated as possible.

Thanks,

Ingo

Re: [PATCH 2/4] perf: jevents: Program to convert JSON file to C style file

2015-05-28 Thread Andi Kleen
 So instead of this flat structure, there should at minimum be broad
 categorization of the various parts of the hardware they relate to: whether
 they relate to the branch predictor, memory caches, TLB caches, memory ops,
 offcore, decoders, execution units, FPU ops, etc., etc. - so that they can be
 queried via 'perf list'.

The categorization is generally on the stem name, which already works fine with
the existing perf list wildcard support. So, for example, if you only want
branches:

perf list br*
...
  br_inst_exec.all_branches 
   [Speculative and retired branches]
  br_inst_exec.all_conditional  
   [Speculative and retired macro-conditional branches]
  br_inst_exec.all_direct_jmp   
   [Speculative and retired macro-unconditional branches excluding calls
    and indirects]
  br_inst_exec.all_direct_near_call 
   [Speculative and retired direct near calls]
  br_inst_exec.all_indirect_jump_non_call_ret   
   [Speculative and retired indirect branches excluding calls and returns]
  br_inst_exec.all_indirect_near_return 
   [Speculative and retired indirect return branches]
...

Or mid level cache events:

perf list l2*
...
  l2_l1d_wb_rqsts.all   
   [Not rejected writebacks from L1D to L2 cache lines in any state]
  l2_l1d_wb_rqsts.hit_e 
   [Not rejected writebacks from L1D to L2 cache lines in E state]
  l2_l1d_wb_rqsts.hit_m 
   [Not rejected writebacks from L1D to L2 cache lines in M state]
  l2_l1d_wb_rqsts.miss  
   [Count the number of modified Lines evicted from L1 and missed L2.
    (Non-rejected WBs from the DCU.)]
  l2_lines_in.all   
   [L2 cache lines filling L2]
...

There are some exceptions, but generally it works this way.

The stem could be put into a separate header, but it would seem redundant to
me.

 We don't just want to import the unstructured mess that these event files
 are - we want to turn them into real structure. We can still keep the messy
 vendor names as well, like IDQ.DSB_CYCLES, but we want to impose structure as
 well.

The vendor names directly map to the micro architecture, which is the whole
point of the events. IDQ is a part of the CPU, and is described in the
CPU manuals. One of the main motivations for adding event lists is to make
perf match that documentation.

 
 3)
 
 There should be good 'perf list' visualization for these events: grouping,
 individual names, with a good interface to query details if needed. I.e. it
 should be possible to browse and discover events relevant to the CPU the tool
 is executing on.

I suppose we could change perf list to give the stem names as section headers
to make the long list a bit more readable.

Generally you need to have some knowledge of the micro architecture to use
these events. There is no way around that.

-Andi
-- 
a...@linux.intel.com -- Speaking for myself only

Re: [PATCH 2/4] perf: jevents: Program to convert JSON file to C style file

2015-05-27 Thread Namhyung Kim
Hi Sukadev,

On Tue, May 19, 2015 at 05:02:08PM -0700, Sukadev Bhattiprolu wrote:
 From: Andi Kleen a...@linux.intel.com
 
 This is a modified version of an earlier patch by Andi Kleen.
 
 We expect architectures to describe the performance monitoring events
 for each CPU in a corresponding JSON file, which look like:
 
   [
   {
   "EventCode": "0x00",
   "UMask": "0x01",
   "EventName": "INST_RETIRED.ANY",
   "BriefDescription": "Instructions retired from execution.",
   "PublicDescription": "Instructions retired from execution.",
   "Counter": "Fixed counter 1",
   "CounterHTOff": "Fixed counter 1",
   "SampleAfterValue": "203",
   "MSRIndex": "0",
   "MSRValue": "0",
   "TakenAlone": "0",
   "CounterMask": "0",
   "Invert": "0",
   "AnyThread": "0",
   "EdgeDetect": "0",
   "PEBS": "0",
   "PRECISE_STORE": "0",
   "Errata": "null",
   "Offcore": "0"
   }
   ]
 
 We also expect the architectures to provide a mapping between individual
 CPUs to their JSON files. Eg:
 
   GenuineIntel-6-1E,V1,/NHM-EP/NehalemEP_core_V1.json,core
 
 which maps each CPU, identified by [vendor, family, model, version, type]
 to a JSON file.
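
For illustration only, one line of such a mapfile could be split into its
four comma-separated fields as sketched below. The struct layout, the field
sizes and the function name are assumptions made for this example, not the
code in the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory form of one mapfile line. */
struct map_entry {
	char cpuid[64];		/* e.g. vendor-family-model */
	char version[16];
	char path[256];		/* JSON file, relative to the arch directory */
	char type[16];
};

/*
 * Parse "cpuid,version,path,type" into e.
 * Returns 0 on success and -1 if the line does not have four fields.
 */
int parse_map_line(const char *line, struct map_entry *e)
{
	int n;

	assert(line && e);
	/* Clear first so every field is NUL-terminated even on failure. */
	memset(e, 0, sizeof(*e));
	n = sscanf(line, "%63[^,],%15[^,],%255[^,],%15[^,\n]",
		   e->cpuid, e->version, e->path, e->type);
	return n == 4 ? 0 : -1;
}
```

The width limits in the format string keep sscanf() from writing past the
fixed-size buffers; overlong fields make the parse fail or truncate rather
than overflow.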
 
 Given these files, the program, jevents:
   - locates all JSON files for the architecture,
   - parses each JSON file and generates a C-style PMU-events table
     (pmu-events.c),
   - locates a mapfile for the architecture,
   - builds a global table, mapping each model of CPU to the
     corresponding PMU-events table.

So we build tables of all models in the architecture, and choose
matching one when compiling perf, right?  Can't we do that when
building the tables?  IOW, why don't we check the VFM and discard
non-matching tables?  Those non-matching tables are also needed?

Sorry if I missed something..


 
 The 'pmu-events.c' is generated when building perf and added to libperf.a.
 The global table pmu_events_map[] in this pmu-events.c will be used
 in perf in a follow-on patch.
 
 If the architecture does not have any JSON files or there is an error in
 processing them, an empty mapping file is created. This would allow the
 build of perf to proceed even if we are not able to provide aliases for
 events.
 
 The parser for JSON files allows parsing Intel style JSON event files. This
 allows an Intel event list to be used directly with perf. The Intel event
 lists can be quite large and are too big to store in unswappable kernel memory.
 
 The conversion from JSON to C-style is straightforward.  The parser knows
 (very little) Intel specific information, and can be easily extended to
 handle fields for other CPUs.
 
 The parser code is partially shared with an independent parsing library,
 which is 2-clause BSD licenced. To avoid any conflicts I marked those
 files as BSD licenced too. As part of perf they become GPLv2.
 
 Signed-off-by: Andi Kleen a...@linux.intel.com
 Signed-off-by: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
 
 v2: Address review feedback. Rename option to --event-files
 v3: Add JSON example
 v4: Update manpages.
 v5: Don't remove dot in fixname. Fix compile error. Add include
   protection. Comment realloc.
 v6: Include debug/util.h
 v7: (Sukadev Bhattiprolu)
   Rebase to 4.0 and fix some conflicts.
 v8: (Sukadev Bhattiprolu)
   Move jevents.[hc] to tools/perf/pmu-events/
   Rewrite to locate and process arch specific JSON and map files;
   and generate a C file.
   (Removed acked-by Namhyung Kim due to modest changes to patch)
   Compile the generated pmu-events.c and add the pmu-events.o to
   libperf.a
 ---

[SNIP]
 +/* Call func with each event in the json file */
 +int json_events(const char *fn,
 +	  int (*func)(void *data, char *name, char *event, char *desc),
 +	  void *data)
 +{
 +	int err = -EIO;
 +	size_t size;
 +	jsmntok_t *tokens, *tok;
 +	int i, j, len;
 +	char *map;
 +
 +	if (!fn)
 +		return -ENOENT;
 +
 +	tokens = parse_json(fn, &map, &size, &len);
 +	if (!tokens)
 +		return -EIO;
 +	EXPECT(tokens->type == JSMN_ARRAY, tokens, "expected top level array");
 +	tok = tokens + 1;
 +	for (i = 0; i < tokens->size; i++) {
 +		char *event = NULL, *desc = NULL, *name = NULL;
 +		struct msrmap *msr = NULL;
 +		jsmntok_t *msrval = NULL;
 +		jsmntok_t *precise = NULL;
 +		jsmntok_t *obj = tok++;
 +
 +		EXPECT(obj->type == JSMN_OBJECT, obj, "expected object");
 +		for (j = 0; j < obj->size; j += 2) {
 +			jsmntok_t *field, *val;
 +			int nz;
 +
 +			field = tok + j;
 +			EXPECT(field->type == JSMN_STRING, tok + j,
 +			       "Expected field name");
 +			val = tok + j + 1;
 +			EXPECT(val->type == JSMN_STRING, tok + j + 1,
 +			       "Expected string value");
 +
 +   

Re: [PATCH 2/4] perf: jevents: Program to convert JSON file to C style file

2015-05-27 Thread Andi Kleen
 So we build tables of all models in the architecture, and choose
 matching one when compiling perf, right?  Can't we do that when
 building the tables?  IOW, why don't we check the VFM and discard
 non-matching tables?  Those non-matching tables are also needed?

We build it for all cpus in an architecture, not all architectures.
So e.g. for an x86 binary power is not included, and vice versa.
It always includes all CPUs for a given architecture, so it's possible
to use the perf binary on other systems than just the one it was
built on.
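
Roughly, the run-time selection amounts to a lookup like the sketch below.
The struct fields, the NULL-terminated table convention and the function
name are assumptions for illustration; only the name pmu_events_map comes
from the patch description.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical shape of one generated event entry. */
struct pmu_event {
	const char *name;
	const char *event;
	const char *desc;
};

/* Hypothetical shape of one entry in the per-architecture map. */
struct pmu_events_map {
	const char *cpuid;		/* NULL marks the end of the map */
	const struct pmu_event *table;
};

/*
 * Walk the map built for this architecture and return the event table
 * whose cpuid matches the running CPU, or NULL if this CPU is not listed.
 */
const struct pmu_event *find_event_table(const struct pmu_events_map *map,
					 const char *cpuid)
{
	assert(map && cpuid);
	for (; map->cpuid; map++)
		if (!strcmp(map->cpuid, cpuid))
			return map->table;
	return NULL;
}
```

Because every table for the architecture is linked in, the same binary can
be copied to a different model and still find its events; an unknown model
simply gets no aliases.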

-andi

-- 
a...@linux.intel.com -- Speaking for myself only

Re: [PATCH 2/4] perf: jevents: Program to convert JSON file to C style file

2015-05-27 Thread Namhyung Kim
Hi Andi,

On Wed, May 27, 2015 at 11:40 PM, Andi Kleen a...@linux.intel.com wrote:
 So we build tables of all models in the architecture, and choose
 matching one when compiling perf, right?  Can't we do that when
 building the tables?  IOW, why don't we check the VFM and discard
 non-matching tables?  Those non-matching tables are also needed?

 We build it for all cpus in an architecture, not all architectures.
 So e.g. for an x86 binary power is not included, and vice versa.

OK.

 It always includes all CPUs for a given architecture, so it's possible
 to use the perf binary on other systems than just the one it was
 build on.

So it selects one at run-time not build-time, good.  But I worry about
the size of the intel tables.  How large are they?  Maybe we can make
it dynamic-loadable if needed..

Thanks,
Namhyung

Re: [PATCH 2/4] perf: jevents: Program to convert JSON file to C style file

2015-05-22 Thread Andi Kleen
 Sure, but shouldn't we allow JSON files to be in subdirs
 
   pmu-events/arch/x86/HSX/Haswell_core.json
 
 and this could go to arbitrary levels?

I used a flat hierarchy. Should be good enough.

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only

Re: [PATCH 2/4] perf: jevents: Program to convert JSON file to C style file

2015-05-22 Thread Jiri Olsa
On Tue, May 19, 2015 at 05:02:08PM -0700, Sukadev Bhattiprolu wrote:

SNIP

 +int main(int argc, char *argv[])
 +{
 + int rc;
 + int flags;

SNIP

 +
 +	rc = uname(&uts);
 +	if (rc < 0) {
 +		printf("%s: uname() failed: %s\n", argv[0], strerror(errno));
 +		goto empty_map;
 +	}
 +
 +	/* TODO: Add other flavors of machine type here */
 +	if (!strcmp(uts.machine, "ppc64"))
 +		arch = "powerpc";
 +	else if (!strcmp(uts.machine, "i686"))
 +		arch = "x86";
 +	else if (!strcmp(uts.machine, "x86_64"))
 +		arch = "x86";
 +	else {
 +		printf("%s: Unknown architecture %s\n", argv[0], uts.machine);
 +		goto empty_map;
 +	}

hum, wouldn't it be easier to pass the arch directly from the Makefile,
we should have it ready in the $(ARCH) variable..
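
something along these lines maybe, just a sketch: the helper names are made
up, and how the Makefile would hand $(ARCH) over (e.g. as argv[1]) is an
assumption, not something the patch does today

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same uname-machine mapping as in the quoted code, as a function. */
const char *arch_from_machine(const char *machine)
{
	assert(machine);
	if (!strcmp(machine, "ppc64"))
		return "powerpc";
	if (!strcmp(machine, "i686") || !strcmp(machine, "x86_64"))
		return "x86";
	return NULL;
}

/*
 * Prefer the arch handed in by the build system; only fall back to
 * guessing from the uname machine string when none was given.
 * Returns NULL when neither source yields a known arch.
 */
const char *pick_arch(const char *build_arch, const char *machine)
{
	if (build_arch && *build_arch)
		return build_arch;
	return arch_from_machine(machine);
}
```

with the arch passed in, the uname() mapping is only a fallback and the
TODO list of machine types does not need to grow for every new arch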

jirka

Re: [PATCH 2/4] perf: jevents: Program to convert JSON file to C style file

2015-05-22 Thread Jiri Olsa
On Tue, May 19, 2015 at 05:02:08PM -0700, Sukadev Bhattiprolu wrote:

SNIP

 ---
  tools/perf/Build                   |   1 +
  tools/perf/Makefile.perf           |   4 +-
  tools/perf/pmu-events/Build        |  38 ++
  tools/perf/pmu-events/README       |  67
  tools/perf/pmu-events/jevents.c    | 700
  tools/perf/pmu-events/jevents.h    |  17 +
  tools/perf/pmu-events/pmu-events.h |  39 ++
  7 files changed, 865 insertions(+), 1 deletion(-)
  create mode 100644 tools/perf/pmu-events/Build
  create mode 100644 tools/perf/pmu-events/README
  create mode 100644 tools/perf/pmu-events/jevents.c
  create mode 100644 tools/perf/pmu-events/jevents.h
  create mode 100644 tools/perf/pmu-events/pmu-events.h
 
 diff --git a/tools/perf/Build b/tools/perf/Build
 index b77370e..40bffa0 100644
 --- a/tools/perf/Build
 +++ b/tools/perf/Build
 @@ -36,6 +36,7 @@ CFLAGS_builtin-help.o  += $(paths)
  CFLAGS_builtin-timechart.o += $(paths)
  CFLAGS_perf.o  += -DPERF_HTML_PATH=BUILD_STR($(htmldir_SQ)) -include $(OUTPUT)PERF-VERSION-FILE
  
 +libperf-y += pmu-events/

there's no concept (yet) in the new build system to trigger
another binary build as a dependency for an object file.. I'd
rather do this the framework way, please check the attached patch

also currently the pmu-events.c is generated every time,
so we need to add the event json data files as dependency

jirka


---
diff --git a/tools/build/Makefile.build b/tools/build/Makefile.build
index 10df57237a66..f6e7fd868892 100644
--- a/tools/build/Makefile.build
+++ b/tools/build/Makefile.build
@@ -41,6 +41,7 @@ include $(build-file)
 
 quiet_cmd_flex  = FLEX $@
 quiet_cmd_bison = BISON$@
+quiet_cmd_gen   = GEN  $@
 
 # Create directory unless it exists
 quiet_cmd_mkdir = MKDIR$(dir $@)
diff --git a/tools/perf/Build b/tools/perf/Build
index 40bffa0b6ee1..b77370ef7005 100644
--- a/tools/perf/Build
+++ b/tools/perf/Build
@@ -36,7 +36,6 @@ CFLAGS_builtin-help.o  += $(paths)
 CFLAGS_builtin-timechart.o += $(paths)
 CFLAGS_perf.o  += -DPERF_HTML_PATH=BUILD_STR($(htmldir_SQ)) -include $(OUTPUT)PERF-VERSION-FILE
 
-libperf-y += pmu-events/
 libperf-y += util/
 libperf-y += arch/
 libperf-y += ui/
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 57e46a541686..a4ba451cffa2 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -272,14 +272,29 @@ strip: $(PROGRAMS) $(OUTPUT)perf
 
 PERF_IN := $(OUTPUT)perf-in.o
 
+JEVENTS   := $(OUTPUT)pmu-events/jevents
+JEVENTS_IN:= $(OUTPUT)pmu-events/jevents-in.o
+PMU_EVENTS_IN := $(OUTPUT)pmu-events/pmu-events-in.o
+
+export JEVENTS
+
 export srctree OUTPUT RM CC LD AR CFLAGS V BISON FLEX
 build := -f $(srctree)/tools/build/Makefile.build dir=. obj
 
 $(PERF_IN): $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)common-cmds.h FORCE
$(Q)$(MAKE) $(build)=perf
 
-$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN)
-   $(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) $(PERF_IN) $(LIBS) -o $@
+$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN)
+   $(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) $(PERF_IN) $(PMU_EVENTS_IN) $(LIBS) -o $@
+
+$(JEVENTS_IN): FORCE
+   $(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=$(OUTPUT)pmu-events obj=jevents
+
+$(JEVENTS): $(JEVENTS_IN)
+   $(QUIET_LINK)$(CC) $(JEVENTS_IN) -o $@
+
+$(PMU_EVENTS_IN): $(JEVENTS) FORCE
+   $(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=$(OUTPUT)pmu-events obj=pmu-events
 
 $(GTK_IN): FORCE
$(Q)$(MAKE) $(build)=gtk
@@ -538,7 +553,7 @@ clean: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean config-clean
    $(Q)find . -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete
    $(Q)$(RM) .config-detected
    $(call QUIET_CLEAN, core-progs) $(RM) $(ALL_PROGRAMS) perf perf-read-vdso32 perf-read-vdsox32 $(OUTPUT)pmu-events/jevents
-   $(call QUIET_CLEAN, core-gen)   $(RM)  *.spec *.pyc *.pyo */*.pyc */*.pyo $(OUTPUT)common-cmds.h TAGS tags cscope* $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)FEATURE-DUMP $(OUTPUT)util/*-bison* $(OUTPUT)util/*-flex*
+   $(call QUIET_CLEAN, core-gen)   $(RM)  *.spec *.pyc *.pyo */*.pyc */*.pyo $(OUTPUT)common-cmds.h TAGS tags cscope* $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)FEATURE-DUMP $(OUTPUT)util/*-bison* $(OUTPUT)util/*-flex* $(OUTPUT)pmu-events/pmu-events.c
$(QUIET_SUBDIR0)Documentation $(QUIET_SUBDIR1) clean
$(python-clean)
 
diff --git a/tools/perf/pmu-events/Build b/tools/perf/pmu-events/Build
index 7a2aaafa05e5..c35eeec2674c 100644
--- a/tools/perf/pmu-events/Build
+++ b/tools/perf/pmu-events/Build
@@ -1,26 +1,13 @@
-.SUFFIXES:
-
-libperf-y += pmu-events.o
-
-JEVENTS =  $(OUTPUT)pmu-events/jevents
-JEVENTS_OBJS = $(OUTPUT)pmu-events/json.o $(OUTPUT)pmu-events/jsmn.o \
-   $(OUTPUT)pmu-events/jevents.o
-
-PMU_EVENTS =   $(srctree)/tools/perf/pmu-events/
-
-all: $(OUTPUT)pmu-events.o
-
-$(OUTPUT)pmu-events/jevents: $(JEVENTS_OBJS)
-   

Re: [PATCH 2/4] perf: jevents: Program to convert JSON file to C style file

2015-05-22 Thread Sukadev Bhattiprolu
Jiri Olsa [jo...@redhat.com] wrote:
| On Tue, May 19, 2015 at 05:02:08PM -0700, Sukadev Bhattiprolu wrote:
| 
| SNIP
| 
|  ---
|   tools/perf/Build                   |   1 +
|   tools/perf/Makefile.perf           |   4 +-
|   tools/perf/pmu-events/Build        |  38 ++
|   tools/perf/pmu-events/README       |  67
|   tools/perf/pmu-events/jevents.c    | 700
|   tools/perf/pmu-events/jevents.h    |  17 +
|   tools/perf/pmu-events/pmu-events.h |  39 ++
|   7 files changed, 865 insertions(+), 1 deletion(-)
|   create mode 100644 tools/perf/pmu-events/Build
|   create mode 100644 tools/perf/pmu-events/README
|   create mode 100644 tools/perf/pmu-events/jevents.c
|   create mode 100644 tools/perf/pmu-events/jevents.h
|   create mode 100644 tools/perf/pmu-events/pmu-events.h
|  
|  diff --git a/tools/perf/Build b/tools/perf/Build
|  index b77370e..40bffa0 100644
|  --- a/tools/perf/Build
|  +++ b/tools/perf/Build
|  @@ -36,6 +36,7 @@ CFLAGS_builtin-help.o  += $(paths)
|   CFLAGS_builtin-timechart.o += $(paths)
|   CFLAGS_perf.o  += -DPERF_HTML_PATH=BUILD_STR($(htmldir_SQ)) -include $(OUTPUT)PERF-VERSION-FILE
|   
|  +libperf-y += pmu-events/
| 
| there's no concept (yet) in the new build system to trigger
| another binary build as a dependency for an object file.. I'd
| rather do this the framework way, please check the attached patch
| 
| also currently the pmu-events.c is generated every time,
| so we need to add the event json data files as dependency

pmu-events.c depends only on JSON files relevant to the arch perf is
being built on and there could be several JSON files per arch. So it
would complicate the Makefiles.

Besides, didn't we conclude that the cost of generating pmu-events.c
during build is negligible ?

| 
| jirka
| 
| 
| ---
| diff --git a/tools/build/Makefile.build b/tools/build/Makefile.build
| index 10df57237a66..f6e7fd868892 100644
| --- a/tools/build/Makefile.build
| +++ b/tools/build/Makefile.build
| @@ -41,6 +41,7 @@ include $(build-file)
|  
|  quiet_cmd_flex  = FLEX $@
|  quiet_cmd_bison = BISON$@
| +quiet_cmd_gen   = GEN  $@
|  
|  # Create directory unless it exists
|  quiet_cmd_mkdir = MKDIR$(dir $@)
| diff --git a/tools/perf/Build b/tools/perf/Build
| index 40bffa0b6ee1..b77370ef7005 100644
| --- a/tools/perf/Build
| +++ b/tools/perf/Build
| @@ -36,7 +36,6 @@ CFLAGS_builtin-help.o  += $(paths)
|  CFLAGS_builtin-timechart.o += $(paths)
|  CFLAGS_perf.o  += -DPERF_HTML_PATH=BUILD_STR($(htmldir_SQ)) -include $(OUTPUT)PERF-VERSION-FILE
|  
| -libperf-y += pmu-events/
|  libperf-y += util/
|  libperf-y += arch/
|  libperf-y += ui/
| diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
| index 57e46a541686..a4ba451cffa2 100644
| --- a/tools/perf/Makefile.perf
| +++ b/tools/perf/Makefile.perf
| @@ -272,14 +272,29 @@ strip: $(PROGRAMS) $(OUTPUT)perf
|  
|  PERF_IN := $(OUTPUT)perf-in.o
|  
| +JEVENTS   := $(OUTPUT)pmu-events/jevents
| +JEVENTS_IN:= $(OUTPUT)pmu-events/jevents-in.o
| +PMU_EVENTS_IN := $(OUTPUT)pmu-events/pmu-events-in.o

I will try this out, but why not just add pmu-events.o to libperf?

| +
| +export JEVENTS
| +
|  export srctree OUTPUT RM CC LD AR CFLAGS V BISON FLEX
|  build := -f $(srctree)/tools/build/Makefile.build dir=. obj
|  
|  $(PERF_IN): $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)common-cmds.h FORCE
|   $(Q)$(MAKE) $(build)=perf
|  
| -$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN)
| - $(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) $(PERF_IN) $(LIBS) -o $@
| +$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN)
| + $(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) $(PERF_IN) $(PMU_EVENTS_IN) $(LIBS) -o $@
| +
| +$(JEVENTS_IN): FORCE
| + $(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=$(OUTPUT)pmu-events obj=jevents
| +
| +$(JEVENTS): $(JEVENTS_IN)
| + $(QUIET_LINK)$(CC) $(JEVENTS_IN) -o $@
| +
| +$(PMU_EVENTS_IN): $(JEVENTS) FORCE
| + $(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=$(OUTPUT)pmu-events obj=pmu-events
|  
|  $(GTK_IN): FORCE
|   $(Q)$(MAKE) $(build)=gtk
| @@ -538,7 +553,7 @@ clean: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean config-clean
|   $(Q)find . -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete
|   $(Q)$(RM) .config-detected
|   $(call QUIET_CLEAN, core-progs) $(RM) $(ALL_PROGRAMS) perf perf-read-vdso32 perf-read-vdsox32 $(OUTPUT)pmu-events/jevents
| - $(call QUIET_CLEAN, core-gen)   $(RM)  *.spec *.pyc *.pyo */*.pyc */*.pyo $(OUTPUT)common-cmds.h TAGS tags cscope* $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)FEATURE-DUMP $(OUTPUT)util/*-bison* $(OUTPUT)util/*-flex*
| + $(call QUIET_CLEAN, core-gen)   $(RM)  *.spec *.pyc *.pyo */*.pyc */*.pyo $(OUTPUT)common-cmds.h TAGS tags cscope* $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)FEATURE-DUMP $(OUTPUT)util/*-bison* $(OUTPUT)util/*-flex* $(OUTPUT)pmu-events/pmu-events.c
|   $(QUIET_SUBDIR0)Documentation $(QUIET_SUBDIR1) clean
|   

Re: [PATCH 2/4] perf: jevents: Program to convert JSON file to C style file

2015-05-22 Thread Sukadev Bhattiprolu
Jiri Olsa [jo...@redhat.com] wrote:
| On Tue, May 19, 2015 at 05:02:08PM -0700, Sukadev Bhattiprolu wrote:
| 
| SNIP
| 
|  +int main(int argc, char *argv[])
|  +{
|  +   int rc;
|  +   int flags;
| 
| SNIP
| 
|  +
|  +   rc = uname(&uts);
|  +   if (rc < 0) {
|  +           printf("%s: uname() failed: %s\n", argv[0], strerror(errno));
|  +           goto empty_map;
|  +   }
|  +
|  +   /* TODO: Add other flavors of machine type here */
|  +   if (!strcmp(uts.machine, "ppc64"))
|  +           arch = "powerpc";
|  +   else if (!strcmp(uts.machine, "i686"))
|  +           arch = "x86";
|  +   else if (!strcmp(uts.machine, "x86_64"))
|  +           arch = "x86";
|  +   else {
|  +           printf("%s: Unknown architecture %s\n", argv[0], uts.machine);
|  +           goto empty_map;
|  +   }
| 
| hum, wouldn't it be easier to pass the arch directly from the Makefile?
| we should have it ready in the $(ARCH) variable..

Yes, I will do that and make all three args (arch, start_dir, output_file)
mandatory (jevents won't be run from command line often, it doesn't need
default args).
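
A minimal sketch of what mandatory arguments could look like, assuming the Makefile invokes jevents with the value of $(ARCH), the start directory and the output file in that order. The struct `jevents_args` and the function `parse_args()` are hypothetical names used only for illustration; they are not taken from the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for the three mandatory arguments. */
struct jevents_args {
	const char *arch;        /* e.g. "x86" or "powerpc", passed from $(ARCH) */
	const char *start_dir;   /* top of the directory tree holding JSON files */
	const char *output_file; /* path of the C file to generate */
};

/*
 * Fill in *args from argv. All three arguments are required, so there
 * are no defaults. Returns 0 on success, -1 on a wrong argument count.
 */
int parse_args(int argc, char *argv[], struct jevents_args *args)
{
	assert(args != NULL);

	if (argc != 4) {
		fprintf(stderr, "Usage: %s <arch> <start_dir> <output_file>\n",
			argc > 0 ? argv[0] : "jevents");
		return -1;
	}

	args->arch = argv[1];
	args->start_dir = argv[2];
	args->output_file = argv[3];
	return 0;
}
```

With this shape the uname() lookup is no longer needed in the tool itself, since the build system already knows the target architecture.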


Re: [PATCH 2/4] perf: jevents: Program to convert JSON file to C style file

2015-05-22 Thread Jiri Olsa
On Fri, May 22, 2015 at 08:58:22AM -0700, Sukadev Bhattiprolu wrote:

SNIP

 | 
 | there's no conception (yet) in the new build system to trigger
 | another binary build as a dependency for an object file.. I'd
 | rather do this the framework way, please check the attached patch
 | 
 | also currently the pmu-events.c is generated every time,
 | so we need to add the event json data files as dependency
 
 pmu-events.c depends only on JSON files relevant to the arch perf is
 being built on and there could be several JSON files per arch. So it
 would complicate the Makefiles.
 
 Besides, didn't we conclude that the cost of generating pmu-events.c
 during build is negligible ?

yes, but only when it's necessary.. if there's no change in definitions
and we already have pmu-events.o built.. why rebuild?

 |  
 | -libperf-y += pmu-events/
 |  libperf-y += util/
 |  libperf-y += arch/
 |  libperf-y += ui/
 | diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
 | index 57e46a541686..a4ba451cffa2 100644
 | --- a/tools/perf/Makefile.perf
 | +++ b/tools/perf/Makefile.perf
 | @@ -272,14 +272,29 @@ strip: $(PROGRAMS) $(OUTPUT)perf
 |  
 |  PERF_IN := $(OUTPUT)perf-in.o
 |  
 | +JEVENTS   := $(OUTPUT)pmu-events/jevents
 | +JEVENTS_IN:= $(OUTPUT)pmu-events/jevents-in.o
 | +PMU_EVENTS_IN := $(OUTPUT)pmu-events/pmu-events-in.o
 
 I will try this out, but why not just add pmu-events.o to libperf?

this is related to my first comment:

 | there's no conception (yet) in the new build system to trigger
 | another binary build as a dependency for an object file.. I'd
 | rather do this the framework way, please check the attached patch

it's not possible to trigger the application build from within the Build file
in the way the framework was designed.. so it cannot easily display commands,
handle dependencies etc.. it just allows the simple/hacky solution you did ;-)

so I separated out pmu-events.o, so that libperf does not depend
on the jevents application, and treated it as a separate object

jirka

Re: [PATCH 2/4] perf: jevents: Program to convert JSON file to C style file

2015-05-22 Thread Sukadev Bhattiprolu
Andi Kleen [a...@linux.intel.com] wrote:
|  pmu-events.c depends only on JSON files relevant to the arch perf is
|  being built on and there could be several JSON files per arch. So it
|  would complicate the Makefiles.
| 
| Could just use a wildcard dependency on */$(ARCH)/*.json 

Sure, but shouldn't we allow JSON files to be in subdirs

pmu-events/arch/x86/HSX/Haswell_core.json

and this could go to arbitrary levels?
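
To illustrate the point about arbitrary nesting, here is a rough sketch of a recursive directory walk that finds every *.json file below a start directory, whatever its depth. The names `is_json_file()` and `walk_json()` are made up for this example, and the use of opendir()/readdir() is an assumption; the patch may locate files differently.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Return 1 if path ends in ".json" and is longer than the suffix itself. */
int is_json_file(const char *path)
{
	size_t len;

	assert(path != NULL);
	len = strlen(path);
	return len > 5 && strcmp(path + len - 5, ".json") == 0;
}

/*
 * Visit dir recursively and call cb(path) (if non-NULL) for every regular
 * *.json file. Symbolic links are not followed. Returns the number of
 * JSON files found, or -1 if dir itself cannot be opened.
 */
int walk_json(const char *dir, void (*cb)(const char *path))
{
	char path[4096];
	struct dirent *de;
	struct stat st;
	int count = 0, n;
	DIR *d = opendir(dir);

	if (!d)
		return -1;

	while ((de = readdir(d)) != NULL) {
		if (!strcmp(de->d_name, ".") || !strcmp(de->d_name, ".."))
			continue;
		n = snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
		if (n < 0 || (size_t)n >= sizeof(path) || lstat(path, &st) < 0)
			continue;	/* skip over-long or unreadable entries */
		if (S_ISDIR(st.st_mode)) {
			n = walk_json(path, cb);	/* descend one level */
			if (n > 0)
				count += n;
		} else if (S_ISREG(st.st_mode) && is_json_file(path)) {
			if (cb)
				cb(path);
			count++;
		}
	}
	closedir(d);
	return count;
}
```

A single-level wildcard such as */$(ARCH)/*.json would miss a file like pmu-events/arch/x86/HSX/Haswell_core.json, which is the case a walk of this kind covers.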

| 
| Also it would be good to move the generated file into the object
| directory. I tried it but it needs some more changes to the Makefiles.
| 
| -Andi


Re: [PATCH 2/4] perf: jevents: Program to convert JSON file to C style file

2015-05-22 Thread Andi Kleen
 pmu-events.c depends only on JSON files relevant to the arch perf is
 being built on and there could be several JSON files per arch. So it
 would complicate the Makefiles.

Could just use a wildcard dependency on */$(ARCH)/*.json 

Also it would be good to move the generated file into the object
directory. I tried it but it needs some more changes to the Makefiles.

-Andi


[PATCH 2/4] perf: jevents: Program to convert JSON file to C style file

2015-05-19 Thread Sukadev Bhattiprolu
From: Andi Kleen a...@linux.intel.com

This is a modified version of an earlier patch by Andi Kleen.

We expect architectures to describe the performance monitoring events
for each CPU in a corresponding JSON file, which looks like:

[
{
"EventCode": "0x00",
"UMask": "0x01",
"EventName": "INST_RETIRED.ANY",
"BriefDescription": "Instructions retired from execution.",
"PublicDescription": "Instructions retired from execution.",
"Counter": "Fixed counter 1",
"CounterHTOff": "Fixed counter 1",
"SampleAfterValue": "203",
"MSRIndex": "0",
"MSRValue": "0",
"TakenAlone": "0",
"CounterMask": "0",
"Invert": "0",
"AnyThread": "0",
"EdgeDetect": "0",
"PEBS": "0",
"PRECISE_STORE": "0",
"Errata": "null",
"Offcore": "0"
}
]

We also expect the architectures to provide a mapping from individual
CPUs to their JSON files. E.g.:

GenuineIntel-6-1E,V1,/NHM-EP/NehalemEP_core_V1.json,core

which maps each CPU, identified by [vendor, family, model, version, type]
to a JSON file.
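
As a rough illustration of how such a mapfile line might be split, the sketch below breaks one comma-separated line into four fields in place. It assumes the four-field layout of the example line above (CPU id, version, JSON path, type); `struct map_entry` and `parse_map_line()` are hypothetical names, not part of this patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical view of one mapfile line; fields point into the line buffer. */
struct map_entry {
	char *cpuid;     /* e.g. "GenuineIntel-6-1E" */
	char *version;   /* e.g. "V1" */
	char *json_path; /* e.g. "/NHM-EP/NehalemEP_core_V1.json" */
	char *type;      /* e.g. "core" */
};

/*
 * Split line in place at commas. Returns 0 on success, or -1 if the
 * line does not contain exactly four fields.
 */
int parse_map_line(char *line, struct map_entry *e)
{
	char *fields[4];
	char *p = line;
	int i;

	assert(line != NULL && e != NULL);

	line[strcspn(line, "\r\n")] = '\0';	/* drop trailing newline */

	for (i = 0; i < 4; i++) {
		fields[i] = p;
		p = strchr(p, ',');
		if (i < 3) {
			if (!p)
				return -1;	/* too few fields */
			*p++ = '\0';
		} else if (p) {
			return -1;		/* too many fields */
		}
	}

	e->cpuid = fields[0];
	e->version = fields[1];
	e->json_path = fields[2];
	e->type = fields[3];
	return 0;
}
```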

Given these files, the program, jevents:
- locates all JSON files for the architecture,
- parses each JSON file and generates a C-style PMU-events table
  (pmu-events.c)
- locates a mapfile for the architecture
- builds a global table, mapping each model of CPU to the
  corresponding PMU-events table.

The 'pmu-events.c' file is generated when building perf, and the resulting
object is added to libperf.a. The global pmu_events_map[] table in this
pmu-events.c will be used in perf in a follow-on patch.
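
For orientation, a generated file of this kind could have roughly the following shape. Only the name pmu_events_map[] comes from this description; the struct layouts, the field names, the event encoding string and the lookup helper `find_events()` are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed layout of one event entry. */
struct pmu_event {
	const char *name;
	const char *event; /* illustrative encoding, not the real format */
	const char *desc;
};

/* Assumed layout of one map entry: CPU id -> events table. */
struct pmu_events_map {
	const char *cpuid;
	const char *version;
	const char *type;
	const struct pmu_event *table;
};

/* Per-CPU table, terminated by an all-NULL entry. */
static const struct pmu_event nhm_ep_core[] = {
	{ "INST_RETIRED.ANY", "event=0x00,umask=0x01",
	  "Instructions retired from execution." },
	{ NULL, NULL, NULL },
};

/* Global table, terminated by an all-NULL entry. */
const struct pmu_events_map pmu_events_map[] = {
	{ "GenuineIntel-6-1E", "V1", "core", nhm_ep_core },
	{ NULL, NULL, NULL, NULL },
};

/* Return the events table for cpuid, or NULL if the CPU is unknown. */
const struct pmu_event *find_events(const char *cpuid)
{
	const struct pmu_events_map *m;

	assert(cpuid != NULL);
	for (m = pmu_events_map; m->cpuid; m++)
		if (!strcmp(m->cpuid, cpuid))
			return m->table;
	return NULL;
}
```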

If the architecture does not have any JSON files or there is an error in
processing them, an empty mapping file is created. This would allow the
build of perf to proceed even if we are not able to provide aliases for
events.
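
A sketch of the fallback path, assuming the empty mapping is simply a table that holds only its terminating entry. The function name `write_empty_map()` and the exact text it emits are hypothetical; the real generator may emit something different.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write a C file whose pmu_events_map[] holds only the terminating
 * entry, so perf still links when no event tables are available.
 * Returns 0 on success, -1 on a write error.
 */
int write_empty_map(FILE *out)
{
	assert(out != NULL);

	if (fprintf(out,
		    "#include \"pmu-events.h\"\n"
		    "\n"
		    "struct pmu_events_map pmu_events_map[] = {\n"
		    "\t{ 0, 0, 0, 0 },\n"
		    "};\n") < 0)
		return -1;
	return 0;
}
```

A consumer that stops at the first entry with a null CPU id finds no aliases and carries on, which matches the intent that the perf build should not fail for lack of event files.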

The parser for JSON files can parse Intel-style JSON event files. This
allows an Intel event list to be used directly with perf. The Intel event
lists can be quite large and are too big to store in unswappable kernel memory.

The conversion from JSON to C-style is straightforward. The parser knows
very little Intel-specific information, and can easily be extended to
handle fields for other CPUs.

The parser code is partially shared with an independent parsing library,
which is 2-clause BSD licenced. To avoid any conflicts I marked those
files as BSD licenced too. As part of perf they become GPLv2.

Signed-off-by: Andi Kleen a...@linux.intel.com
Signed-off-by: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com

v2: Address review feedback. Rename option to --event-files
v3: Add JSON example
v4: Update manpages.
v5: Don't remove dot in fixname. Fix compile error. Add include
protection. Comment realloc.
v6: Include debug/util.h
v7: (Sukadev Bhattiprolu)
Rebase to 4.0 and fix some conflicts.
v8: (Sukadev Bhattiprolu)
Move jevents.[hc] to tools/perf/pmu-events/
Rewrite to locate and process arch specific JSON and map files;
and generate a C file.
(Removed acked-by Namhyung Kim due to modest changes to patch)
Compile the generated pmu-events.c and add the pmu-events.o to
libperf.a
---
 tools/perf/Build   |1 +
 tools/perf/Makefile.perf   |4 +-
 tools/perf/pmu-events/Build|   38 ++
 tools/perf/pmu-events/README   |   67 
 tools/perf/pmu-events/jevents.c|  700 
 tools/perf/pmu-events/jevents.h|   17 +
 tools/perf/pmu-events/pmu-events.h |   39 ++
 7 files changed, 865 insertions(+), 1 deletion(-)
 create mode 100644 tools/perf/pmu-events/Build
 create mode 100644 tools/perf/pmu-events/README
 create mode 100644 tools/perf/pmu-events/jevents.c
 create mode 100644 tools/perf/pmu-events/jevents.h
 create mode 100644 tools/perf/pmu-events/pmu-events.h

diff --git a/tools/perf/Build b/tools/perf/Build
index b77370e..40bffa0 100644
--- a/tools/perf/Build
+++ b/tools/perf/Build
@@ -36,6 +36,7 @@ CFLAGS_builtin-help.o  += $(paths)
 CFLAGS_builtin-timechart.o += $(paths)
 CFLAGS_perf.o  += -DPERF_HTML_PATH=BUILD_STR($(htmldir_SQ)) -include $(OUTPUT)PERF-VERSION-FILE
 
+libperf-y += pmu-events/
 libperf-y += util/
 libperf-y += arch/
 libperf-y += ui/
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index c43a205..d078c71 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -306,6 +306,8 @@ perf.spec $(SCRIPTS) \
 ifneq ($(OUTPUT),)
 %.o: $(OUTPUT)%.o
 	@echo "# Redirected target $@ => $(OUTPUT)$@"
+pmu-events/%.o: $(OUTPUT)pmu-events/%.o
+	@echo "# Redirected target $@ => $(OUTPUT)$@"
 util/%.o: $(OUTPUT)util/%.o
 	@echo "# Redirected target $@ => $(OUTPUT)$@"
 bench/%.o: $(OUTPUT)bench/%.o
@@ -529,7 +531,7 @@ clean: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean config-clean
$(call QUIET_CLEAN, core-objs)  $(RM) $(LIB_FILE) $(OUTPUT)perf-archive