date:20230125

Re: [PATCH 3/3] mm, arch: add generic implementation of pfn_valid() for FLATMEM

2023-01-25 Thread Arnd Bergmann

On Wed, Jan 25, 2023, at 20:07, Mike Rapoport wrote:
> From: "Mike Rapoport (IBM)" 
>
> Every architecture that supports FLATMEM memory model defines its own
> version of pfn_valid() that essentially compares a pfn to max_mapnr.
>
> Use mips/powerpc version implemented as static inline as a generic
> implementation of pfn_valid() and drop its per-architecture definitions
>
> Signed-off-by: Mike Rapoport (IBM) 

Acked-by: Arnd Bergmann 

I assume this can best go through the mm tree, let me know if I should
pick it up in the asm-generic tree instead.

 Arnd
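
For readers following the thread, the check the quoted changelog describes can be modelled in a few lines. This is a Python sketch rather than kernel code: the parameter names are assumptions, with pfn_offset standing in for the kernel's ARCH_PFN_OFFSET and max_mapnr treated as the number of frames covered by the flat memory map.

```python
def pfn_valid(pfn: int, max_mapnr: int, pfn_offset: int = 0) -> bool:
    """Model of a FLATMEM-style page frame number validity check.

    pfn_offset is the first frame number backed by the flat memory map
    (an assumed stand-in for ARCH_PFN_OFFSET); max_mapnr is the number
    of frames the map covers. Both are plain parameters here so that
    the sketch stays self-contained.
    """
    # A pfn is valid when it is not below the start of the map and its
    # index into the map is below the map size.
    return pfn >= pfn_offset and (pfn - pfn_offset) < max_mapnr
```

With a 16-frame map starting at frame 4, frames 4 through 19 pass the check and everything else is rejected.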

Re: [PATCH] powerpc/kasan/book3s_64: warn when running with hash MMU

2023-01-25 Thread Christophe Leroy

On 11/10/2022 at 12:25, Christophe Leroy wrote:

On 11/10/2022 at 12:00, Michael Ellerman wrote:

Nathan Lynch writes:

Michael Ellerman writes:

Christophe Leroy writes:

+ KASAN list

On 06/10/2022 at 06:10, Michael Ellerman wrote:

Nathan Lynch writes:

kasan is known to crash at boot on book3s_64 with non-radix MMU. As
noted in commit 41b7a347bf14 ("powerpc: Book3S 64-bit outline-only
KASAN support"):

A kernel with CONFIG_KASAN=y will crash during boot on a machine
using HPT translation because not all the entry points to the
generic KASAN code are protected with a call to kasan_arch_is_ready().

I guess I thought there was some plan to fix that.

I was thinking the same.

Do we have a list of the said entry points to the generic code that are
lacking a call to kasan_arch_is_ready() ?

Typically, the BUG dump below shows that kasan_byte_accessible() is
lacking the check. It should be straight forward to add
kasan_arch_is_ready() check to kasan_byte_accessible(), shouldn't it ?

Yes :)

And one other spot, but the patch below boots OK for me. I'll leave it
running for a while just in case there's a path I've missed.

It works for me too, thanks (p8 pseries qemu).

It works but I still see the kasan shadow getting mapped, which we would
ideally avoid.

From PTDUMP:

---[ kasan shadow mem start ]---
0xc00f-0xc00f0006 0x045e 448K r w pte valid present dirty accessed
0xc00f3ffe-0xc00f3fff 0x04d8 128K r w pte valid present dirty accessed

I haven't worked out how those are getting mapped.

Alternative patch proposed at
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/150768c55722311699fdcf8f5379e8256749f47d.1674716617.git.christophe.le...@csgroup.eu/

Christophe

[PATCH] kasan: Fix Oops due to missing calls to kasan_arch_is_ready()

2023-01-25 Thread Christophe Leroy

On powerpc64, you can build a kernel with KASAN as soon as you build it
with RADIX MMU support. However if the CPU doesn't have RADIX MMU,
KASAN isn't enabled at init and the following Oops is encountered.

  [0.00][T0] KASAN not enabled as it requires radix!

  [4.484295][   T26] BUG: Unable to handle kernel data access at 0xc00e00804a04
  [4.485270][   T26] Faulting instruction address: 0xc062ec6c
  [4.485748][   T26] Oops: Kernel access of bad area, sig: 11 [#1]
  [4.485920][   T26] BE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
  [4.486259][   T26] Modules linked in:
  [4.486637][   T26] CPU: 0 PID: 26 Comm: kworker/u2:2 Not tainted 6.2.0-rc3-02590-gf8a023b0a805 #249
  [4.486907][   T26] Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1200 0xf05 of:SLOF,HEAD pSeries
  [4.487445][   T26] Workqueue: eval_map_wq .tracer_init_tracefs_work_func
  [4.488744][   T26] NIP:  c062ec6c LR: c062bb84 CTR: c02ebcd0
  [4.488867][   T26] REGS: c49175c0 TRAP: 0380   Not tainted  (6.2.0-rc3-02590-gf8a023b0a805)
  [4.489028][   T26] MSR:  82009032   CR: 44002808  XER:
  [4.489584][   T26] CFAR: c062bb80 IRQMASK: 0
  [4.489584][   T26] GPR00: c05624d4 c4917860 c1cfc000 18804a04
  [4.489584][   T26] GPR04: c03a2650 0cc0 c000d3d8 c000d3d8
  [4.489584][   T26] GPR08: c49175b0 a80e  17d78400
  [4.489584][   T26] GPR12: 44002204 c379 c435003c c43f1c40
  [4.489584][   T26] GPR16: c43f1c68 c43501a0 c2106138 c43f1c08
  [4.489584][   T26] GPR20: c43f1c10 c43f1c20 c4146c40 c2fdb7f8
  [4.489584][   T26] GPR24: c2fdb834 c3685e00 c4025030 c3522e90
  [4.489584][   T26] GPR28: 0cc0 c03a2650 c4025020 c4025020
  [4.491201][   T26] NIP [c062ec6c] .kasan_byte_accessible+0xc/0x20
  [4.491430][   T26] LR [c062bb84] .__kasan_check_byte+0x24/0x90
  [4.491767][   T26] Call Trace:
  [4.491941][   T26] [c4917860] [c062ae70] .__kasan_kmalloc+0xc0/0x110 (unreliable)
  [4.492270][   T26] [c49178f0] [c05624d4] .krealloc+0x54/0x1c0
  [4.492453][   T26] [c4917990] [c03a2650] .create_trace_option_files+0x280/0x530
  [4.492613][   T26] [c4917a90] [c2050d90] .tracer_init_tracefs_work_func+0x274/0x2c0
  [4.492771][   T26] [c4917b40] [c01f9948] .process_one_work+0x578/0x9f0
  [4.492927][   T26] [c4917c30] [c01f9ebc] .worker_thread+0xfc/0x950
  [4.493084][   T26] [c4917d60] [c020be84] .kthread+0x1a4/0x1b0
  [4.493232][   T26] [c4917e10] [c000d3d8] .ret_from_kernel_thread+0x58/0x60
  [4.495642][   T26] Code: 6000 7cc802a6 38a0 4bfffc78 6000 7cc802a6 38a1 4bfffc68 6000 3d20a80e 7863e8c2 792907c6 <7c6348ae> 20630007 78630fe0 68630001
  [4.496704][   T26] ---[ end trace  ]---

The Oops is due to kasan_byte_accessible() not checking the readiness
of KASAN. Add missing call to kasan_arch_is_ready() and bail out when
not ready. The same problem is observed with kasan_kfree_large(),
so fix it the same way.

Also, as KASAN is not available and no shadow area is allocated for
linear memory mapping, there is no point in allocating shadow mem for
vmalloc memory as shown below in /sys/kernel/debug/kernel_page_tables

  ---[ kasan shadow mem start ]---
  0xc00f-0xc00f0006  0x040f   448K  r  w   pte  valid  present  dirty  accessed
  0xc00f0086-0xc00f0086  0x0ac1    64K  r  w   pte  valid  present  dirty  accessed
  0xc00f3ffe-0xc00f3fff  0x04d1   128K  r  w   pte  valid  present  dirty  accessed
  ---[ kasan shadow mem end ]---

So, also verify KASAN readiness before allocating and poisoning
shadow mem for VMAs.

Reported-by: Nathan Lynch 
Suggested-by: Michael Ellerman 
Signed-off-by: Christophe Leroy 
---
 mm/kasan/common.c  |  3 +++
 mm/kasan/generic.c |  7 ++-
 mm/kasan/shadow.c  | 12 
 3 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/mm/kasan/common.c b/mm/kasan/common.c
index 833bf2cfd2a3..21e66d7f261d 100644
--- a/mm/kasan/common.c
+++ b/mm/kasan/common.c
@@ -246,6 +246,9 @@ bool __kasan_slab_free(struct kmem_cache *cache, void *object,
 
 static inline bool kasan_kfree_large(void *ptr, unsigned long ip)
 {
+	if (!kasan_arch_is_ready())
+		return false;
+
 	if (ptr != page_address(virt_to_head_page(ptr))) {
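
The guard pattern the patch describes can be illustrated with a toy model. This is a Python sketch, not KASAN's implementation: the class, its fields, the shift by 3 and the choice to report a byte as accessible when the arch is not ready are all assumptions made for illustration.

```python
class ShadowModel:
    """Toy stand-in for shadow memory; hypothetical, not the kernel's design."""

    def __init__(self, arch_ready: bool):
        self.arch_ready = arch_ready
        # The shadow map exists only when the arch finished its setup.
        self.shadow = {} if arch_ready else None

    def byte_accessible_unguarded(self, addr: int) -> bool:
        # No readiness check: touches a shadow map that may not exist,
        # the analogue of the faulting access in the Oops.
        return self.shadow.get(addr >> 3, 0) == 0

    def byte_accessible(self, addr: int) -> bool:
        # Guarded variant: bail out before any shadow lookup when the
        # arch is not ready (returning True here is an assumption).
        if not self.arch_ready:
            return True
        return self.shadow.get(addr >> 3, 0) == 0
```

In this model the unguarded lookup fails on a machine that never set up the shadow map, while the guarded one returns early and never touches it.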

Re: [PATCH] powerpc/vdso: Filter clang's auto var init zero enabler when linking

2023-01-25 Thread Masahiro Yamada

On Wed, Jan 25, 2023 at 1:20 AM Nathan Chancellor  wrote:
>
> After commit 7bbf02b875b5 ("kbuild: Stop using '-Qunused-arguments' with
> clang"), the PowerPC vDSO shows the following error with clang-13 and
> older when CONFIG_INIT_STACK_ALL_ZERO is enabled:
>
>   clang: error: argument unused during compilation: '-enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang' [-Werror,-Wunused-command-line-argument]
>
> clang-14 added a change to make sure this flag never triggers
> -Wunused-command-line-argument, so it is fixed with newer releases. For
> older releases that the kernel still supports building with, just filter
> out this flag, as has been done for other flags.
>
> Fixes: b174f4c26aa3 ("powerpc/vdso: Improve linker flags")
> Fixes: 7bbf02b875b5 ("kbuild: Stop using '-Qunused-arguments' with clang")
> Link: https://github.com/llvm/llvm-project/commit/ca6d5813d17598cd180995fb3bdfca00f364475f
> Signed-off-by: Nathan Chancellor 
> ---
> This should be the last flag that needs to be filtered (famous last
> words...) but if any more come up, we should really just explore
> switching the PowerPC vDSO to linking with $(LD) like every other part
> of the kernel; for now, I hope this is fine.
>
> Cheers,
> Nathan


Applied to linux-kbuild. Thanks.

Since I rebased the branch, the tags have been updated
accordingly.

powerpc/vdso: Filter clang's auto var init zero enabler when linking

After commit 8d9acfce3332 ("kbuild: Stop using '-Qunused-arguments' with
clang"), the PowerPC vDSO shows the following error with clang-13 and
older when CONFIG_INIT_STACK_ALL_ZERO is enabled:

  clang: error: argument unused during compilation: '-enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang' [-Werror,-Wunused-command-line-argument]

clang-14 added a change to make sure this flag never triggers
-Wunused-command-line-argument, so it is fixed with newer releases. For
older releases that the kernel still supports building with, just filter
out this flag, as has been done for other flags.

Fixes: f0a42fbab447 ("powerpc/vdso: Improve linker flags")
Fixes: 8d9acfce3332 ("kbuild: Stop using '-Qunused-arguments' with clang")
Link: https://github.com/llvm/llvm-project/commit/ca6d5813d17598cd180995fb3bdfca00f364475f
Signed-off-by: Nathan Chancellor 
Signed-off-by: Masahiro Yamada 



-- 
Best Regards
Masahiro Yamada

Re: [PATCH v2 05/14] powerpc: Remove linker flag from KBUILD_AFLAGS

2023-01-25 Thread Masahiro Yamada

On Thu, Jan 26, 2023 at 11:07 AM Nathan Chancellor  wrote:
>
> On Thu, Jan 26, 2023 at 10:29:54AM +0900, Masahiro Yamada wrote:
> > On Wed, Jan 25, 2023 at 1:11 PM Michael Ellerman wrote:
> > >
> > > Nathan Chancellor  writes:
> > > > When clang's -Qunused-arguments is dropped from KBUILD_CPPFLAGS, it
> > > > points out that KBUILD_AFLAGS contains a linker flag, which will be
> > > > used:
> > >
> > > Should that say "unused" ?
> >
> >
> >
> > Nathan, shall I fix it up locally?
> > (it will change the commit hash, though.)
>
> Yes please, if you would not mind. Sorry about that and thank you for
> spotting it Michael!
>
> Since you have to rebase to fix it, you can include Michael's acks?
>
> Cheers,
> Nathan



Done.

-- 
Best Regards
Masahiro Yamada

Re: [PATCH v2 05/14] powerpc: Remove linker flag from KBUILD_AFLAGS

2023-01-25 Thread Nathan Chancellor

On Thu, Jan 26, 2023 at 10:29:54AM +0900, Masahiro Yamada wrote:
> On Wed, Jan 25, 2023 at 1:11 PM Michael Ellerman  wrote:
> >
> > Nathan Chancellor  writes:
> > > When clang's -Qunused-arguments is dropped from KBUILD_CPPFLAGS, it
> > > points out that KBUILD_AFLAGS contains a linker flag, which will be
> > > used:
> >
> > Should that say "unused" ?
> 
> 
> 
> Nathan, shall I fix it up locally?
> (it will change the commit hash, though.)

Yes please, if you would not mind. Sorry about that and thank you for
spotting it Michael!

Since you have to rebase to fix it, you can include Michael's acks?

Cheers,
Nathan

> > > >   clang: error: -Wl,-a32: 'linker' input unused [-Werror,-Wunused-command-line-argument]
> > >
> > > This was likely supposed to be '-Wa,-a$(BITS)'. However, this change is
> > > unnecessary, as all supported versions of clang and gcc will pass '-a64'
> > > or '-a32' to GNU as based on the value of '-m'; the behavior of the
> > > latest stable release of the oldest supported major version of each
> > > compiler is shown below and each compiler's latest release exhibits the
> > > same behavior (GCC 12.2.0 and Clang 15.0.6).
> > >
> > >   $ powerpc64-linux-gcc --version | head -1
> > >   powerpc64-linux-gcc (GCC) 5.5.0
> > >
> > > >   $ powerpc64-linux-gcc -m64 -### -x assembler-with-cpp -c -o /dev/null /dev/null |& grep 'as '
> > > >   .../as -a64 -mppc64 -many -mbig -o /dev/null /tmp/cctwuBzZ.s
> > > >
> > > >   $ powerpc64-linux-gcc -m32 -### -x assembler-with-cpp -c -o /dev/null /dev/null |& grep 'as '
> > > >   .../as -a32 -mppc -many -mbig -o /dev/null /tmp/ccaZP4mF.sg
> > >
> > >   $ clang --version | head -1
> > >   Ubuntu clang version 
> > > 11.1.0-++20211011094159+1fdec59bffc1-1~exp1~20211011214622.5
> > >
> > > >   $ clang --target=powerpc64-linux-gnu -fno-integrated-as -m64 -### \
> > > >     -x assembler-with-cpp -c -o /dev/null /dev/null |& grep gnu-as
> > > >    "/usr/bin/powerpc64-linux-gnu-as" "-a64" "-mppc64" "-many" "-o" "/dev/null" "/tmp/null-80267c.s"
> > > >
> > > >   $ clang --target=powerpc64-linux-gnu -fno-integrated-as -m32 -### \
> > > >     -x assembler-with-cpp -c -o /dev/null /dev/null |& grep gnu-as
> > > >    "/usr/bin/powerpc64-linux-gnu-as" "-a32" "-mppc" "-many" "-o" "/dev/null" "/tmp/null-ab8f8d.s"
> > >
> > > Remove this flag altogether to avoid future issues.
> > >
> > > > Fixes: 1421dc6d4829 ("powerpc/kbuild: Use flags variables rather than overriding LD/CC/AS")
> > > Signed-off-by: Nathan Chancellor 
> > > Reviewed-by: Nick Desaulniers 
> > > ---
> > > Cc: m...@ellerman.id.au
> >
> > Acked-by: Michael Ellerman  (powerpc)
> >
> > cheers
> 
> 
> 
> -- 
> Best Regards
> Masahiro Yamada

Re: [PATCH 3/3] mm, arch: add generic implementation of pfn_valid() for FLATMEM

2023-01-25 Thread Andrew Morton

On Wed, 25 Jan 2023 21:07:57 +0200 Mike Rapoport  wrote:

> Every architecture that supports FLATMEM memory model defines its own
> version of pfn_valid() that essentially compares a pfn to max_mapnr.
> 
> Use mips/powerpc version implemented as static inline as a generic
> implementation of pfn_valid() and drop its per-architecture definitions

arm allnoconfig:

./include/asm-generic/memory_model.h:23:19: error: static declaration of 'pfn_valid' follows non-static declaration
   23 | static inline int pfn_valid(unsigned long pfn)
      |                   ^
./arch/arm/include/asm/page.h:160:12: note: previous declaration of 'pfn_valid' with type 'int(long unsigned int)'
  160 | extern int pfn_valid(unsigned long);
      |            ^


I thought of doing

--- a/arch/arm/include/asm/page.h~mm-arch-add-generic-implementation-of-pfn_valid-for-flatmem-fix
+++ a/arch/arm/include/asm/page.h
@@ -156,10 +156,6 @@ extern void copy_page(void *to, const vo
 
 typedef struct page *pgtable_t;
 
-#ifdef CONFIG_HAVE_ARCH_PFN_VALID
-extern int pfn_valid(unsigned long);
-#endif
-
 #include 
 
 #endif /* !__ASSEMBLY__ */
_

but I'm seeing a pfn_valid declaration in arch/arc/include/asm/page.h
which might be a problem.

v2, please ;)

Re: [PATCH v3 1/7] kernel/fork: convert vma assignment to a memcpy

2023-01-25 Thread Andrew Morton

On Wed, 25 Jan 2023 16:50:01 -0800 Suren Baghdasaryan  wrote:

> On Wed, Jan 25, 2023 at 4:22 PM Andrew Morton wrote:
> >
> > On Wed, 25 Jan 2023 15:35:48 -0800 Suren Baghdasaryan wrote:
> >
> > > Convert vma assignment in vm_area_dup() to a memcpy() to prevent compiler
> > > errors when we add a const modifier to vma->vm_flags.
> > >
> > > ...
> > >
> > > --- a/kernel/fork.c
> > > +++ b/kernel/fork.c
> > > @@ -482,7 +482,7 @@ struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
> > >  		 * orig->shared.rb may be modified concurrently, but the clone
> > >  		 * will be reinitialized.
> > >  		 */
> > > -		*new = data_race(*orig);
> > > +		memcpy(new, orig, sizeof(*new));
> >
> > The data_race() removal is unchangelogged?
> 
> True. I'll add a note in the changelog about that. Ideally I would
> like to preserve it but I could not find a way to do that.
> 

Perhaps Paul can comment?

I wonder if KCSAN knows how to detect this race, given that it's now in
a memcpy.  I assume so.

Re: [PATCH v2 05/14] powerpc: Remove linker flag from KBUILD_AFLAGS

2023-01-25 Thread Masahiro Yamada

On Wed, Jan 25, 2023 at 1:11 PM Michael Ellerman  wrote:
>
> Nathan Chancellor  writes:
> > When clang's -Qunused-arguments is dropped from KBUILD_CPPFLAGS, it
> > points out that KBUILD_AFLAGS contains a linker flag, which will be
> > used:
>
> Should that say "unused" ?



Nathan, shall I fix it up locally?
(it will change the commit hash, though.)

>
> >   clang: error: -Wl,-a32: 'linker' input unused [-Werror,-Wunused-command-line-argument]
> >
> > This was likely supposed to be '-Wa,-a$(BITS)'. However, this change is
> > unnecessary, as all supported versions of clang and gcc will pass '-a64'
> > or '-a32' to GNU as based on the value of '-m'; the behavior of the
> > latest stable release of the oldest supported major version of each
> > compiler is shown below and each compiler's latest release exhibits the
> > same behavior (GCC 12.2.0 and Clang 15.0.6).
> >
> >   $ powerpc64-linux-gcc --version | head -1
> >   powerpc64-linux-gcc (GCC) 5.5.0
> >
> >   $ powerpc64-linux-gcc -m64 -### -x assembler-with-cpp -c -o /dev/null /dev/null |& grep 'as '
> >   .../as -a64 -mppc64 -many -mbig -o /dev/null /tmp/cctwuBzZ.s
> >
> >   $ powerpc64-linux-gcc -m32 -### -x assembler-with-cpp -c -o /dev/null /dev/null |& grep 'as '
> >   .../as -a32 -mppc -many -mbig -o /dev/null /tmp/ccaZP4mF.sg
> >
> >   $ clang --version | head -1
> >   Ubuntu clang version 
> > 11.1.0-++20211011094159+1fdec59bffc1-1~exp1~20211011214622.5
> >
> >   $ clang --target=powerpc64-linux-gnu -fno-integrated-as -m64 -### \
> >     -x assembler-with-cpp -c -o /dev/null /dev/null |& grep gnu-as
> >    "/usr/bin/powerpc64-linux-gnu-as" "-a64" "-mppc64" "-many" "-o" "/dev/null" "/tmp/null-80267c.s"
> >
> >   $ clang --target=powerpc64-linux-gnu -fno-integrated-as -m32 -### \
> >     -x assembler-with-cpp -c -o /dev/null /dev/null |& grep gnu-as
> >    "/usr/bin/powerpc64-linux-gnu-as" "-a32" "-mppc" "-many" "-o" "/dev/null" "/tmp/null-ab8f8d.s"
> >
> > Remove this flag altogether to avoid future issues.
> >
> > Fixes: 1421dc6d4829 ("powerpc/kbuild: Use flags variables rather than overriding LD/CC/AS")
> > Signed-off-by: Nathan Chancellor 
> > Reviewed-by: Nick Desaulniers 
> > ---
> > Cc: m...@ellerman.id.au
>
> Acked-by: Michael Ellerman  (powerpc)
>
> cheers



-- 
Best Regards
Masahiro Yamada

[PATCH v4 12/12] perf pmu-events: Fix testing with JEVENTS_ARCH=all

2023-01-25 Thread Ian Rogers

The #slots literal returns NAN when not on ARM64, which causes a perf
test failure on other architectures for a JEVENTS_ARCH=all build:
..
 10.4: Parsing of PMU event table metrics with fake PMUs : FAILED!
..
Add an is_test boolean so that the failure can be avoided when running
as a test.

Fixes: acef233b7ca7 ("perf pmu: Add #slots literal support for arm64")
Signed-off-by: Ian Rogers 
---
 tools/perf/tests/pmu-events.c | 1 +
 tools/perf/util/expr.h| 1 +
 tools/perf/util/expr.l| 8 +---
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/tools/perf/tests/pmu-events.c b/tools/perf/tests/pmu-events.c
index 962c3c0d53ba..accf44b3d968 100644
--- a/tools/perf/tests/pmu-events.c
+++ b/tools/perf/tests/pmu-events.c
@@ -950,6 +950,7 @@ static int metric_parse_fake(const char *metric_name, const char *str)
 		pr_debug("expr__ctx_new failed");
 		return TEST_FAIL;
 	}
+	ctx->sctx.is_test = true;
 	if (expr__find_ids(str, NULL, ctx) < 0) {
 		pr_err("expr__find_ids failed\n");
 		return -1;
diff --git a/tools/perf/util/expr.h b/tools/perf/util/expr.h
index 029271540fb0..eaa44b24c555 100644
--- a/tools/perf/util/expr.h
+++ b/tools/perf/util/expr.h
@@ -9,6 +9,7 @@ struct expr_scanner_ctx {
 	char *user_requested_cpu_list;
 	int runtime;
 	bool system_wide;
+	bool is_test;
 };
 
 struct expr_parse_ctx {
diff --git a/tools/perf/util/expr.l b/tools/perf/util/expr.l
index 0168a9637330..72ff4f3d6d4b 100644
--- a/tools/perf/util/expr.l
+++ b/tools/perf/util/expr.l
@@ -84,9 +84,11 @@ static int literal(yyscan_t scanner, const struct expr_scanner_ctx *sctx)
 	YYSTYPE *yylval = expr_get_lval(scanner);
 
 	yylval->num = expr__get_literal(expr_get_text(scanner), sctx);
-	if (isnan(yylval->num))
-		return EXPR_ERROR;
-
+	if (isnan(yylval->num)) {
+		if (!sctx->is_test)
+			return EXPR_ERROR;
+		yylval->num = 1;
+	}
 	return LITERAL;
 }
 %}
-- 
2.39.1.456.gfc5497dd1b-goog
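
The decision the patched scanner action makes can be restated as a small model. This is a Python sketch of the logic only: the function name and the tuple return shape are assumptions for illustration, and the token names are borrowed from the diff.

```python
import math


def literal_token(value: float, is_test: bool):
    """Mirror the literal() decision: error on NaN unless running as a test."""
    if math.isnan(value):
        if not is_test:
            # Outside of tests an unresolvable literal is a parse error.
            return ("EXPR_ERROR", None)
        # In test mode substitute a placeholder so parsing can continue.
        value = 1.0
    return ("LITERAL", value)
```

The placeholder only matters for the fake-PMU parsing test, where the goal is to check that the expression parses, not to compute a meaningful result.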

[PATCH v4 11/12] perf jevents: Add model list option

2023-01-25 Thread Ian Rogers

This allows the set of generated jevents events and metrics to be limited
to a subset of the model names. This is appropriate when trying to minimize
the binary size and only a known set of models is possible.

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/Build  |  3 ++-
 tools/perf/pmu-events/jevents.py | 14 ++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/tools/perf/pmu-events/Build b/tools/perf/pmu-events/Build
index 15b9e8fdbffa..a14de24ecb69 100644
--- a/tools/perf/pmu-events/Build
+++ b/tools/perf/pmu-events/Build
@@ -10,6 +10,7 @@ JEVENTS_PY=  pmu-events/jevents.py
 ifeq ($(JEVENTS_ARCH),)
 JEVENTS_ARCH=$(SRCARCH)
 endif
+JEVENTS_MODEL ?= all
 
 #
 # Locate/process JSON files in pmu-events/arch/
@@ -23,5 +24,5 @@ $(OUTPUT)pmu-events/pmu-events.c: pmu-events/empty-pmu-events.c
 else
 $(OUTPUT)pmu-events/pmu-events.c: $(JSON) $(JSON_TEST) $(JEVENTS_PY) pmu-events/metric.py
 	$(call rule_mkdir)
-	$(Q)$(call echo-cmd,gen)$(PYTHON) $(JEVENTS_PY) $(JEVENTS_ARCH) pmu-events/arch $@
+	$(Q)$(call echo-cmd,gen)$(PYTHON) $(JEVENTS_PY) $(JEVENTS_ARCH) $(JEVENTS_MODEL) pmu-events/arch $@
 endif
diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 627ee817f57f..2bcd07ce609f 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -599,6 +599,8 @@ const struct pmu_events_map pmu_events_map[] = {
 else:
   metric_tblname = 'NULL'
   metric_size = '0'
+if event_size == '0' and metric_size == '0':
+  continue
 cpuid = row[0].replace('\\', '')
 _args.output_file.write(f"""{{
 \t.arch = "{arch}",
@@ -888,12 +890,24 @@ def main() -> None:
   action: Callable[[Sequence[str], os.DirEntry], None]) -> None:
 """Replicate the directory/file walking behavior of C's file tree walk."""
     for item in os.scandir(path):
+      if _args.model != 'all' and item.is_dir():
+        # Check if the model matches one in _args.model.
+        if len(parents) == _args.model.split(',')[0].count('/'):
+          # We're testing the correct directory.
+          item_path = '/'.join(parents) + ('/' if len(parents) > 0 else '') + item.name
+          if 'test' not in item_path and item_path not in _args.model.split(','):
+            continue
       action(parents, item)
       if item.is_dir():
         ftw(item.path, parents + [item.name], action)
 
   ap = argparse.ArgumentParser()
   ap.add_argument('arch', help='Architecture name like x86')
+  ap.add_argument('model', help='''Select a model such as skylake to
+reduce the code size.  Normally set to "all". For architectures like
+ARM64 with an implementor/model, the model must include the implementor
+such as "arm/cortex-a34".''',
+  default='all')
   ap.add_argument(
   'starting_dir',
   type=dir_path,
-- 
2.39.1.456.gfc5497dd1b-goog
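
The directory filter added to ftw() can be pulled out into a pure function to make the rule easier to follow. This is a sketch: keep_dir and its parameters are hypothetical names, and it restates only the condition visible in the diff, with parents taken relative to the architecture directory.

```python
from typing import Sequence


def keep_dir(parents: Sequence[str], name: str, model: str) -> bool:
    """Return True when a directory should still be walked for this model list."""
    if model == 'all':
        return True
    models = model.split(',')
    # Filtering applies only at the depth where model directories live;
    # that depth is the number of '/' in the first requested model.
    if len(parents) != models[0].count('/'):
        return True
    item_path = '/'.join(list(parents) + [name])
    # Test directories are always kept; others must be explicitly listed.
    return 'test' in item_path or item_path in models
```

For example, with the model "arm/cortex-a34" the top-level "arm" directory is still walked because filtering only starts one level down, where "cortex-a34" is kept and its siblings are skipped.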

[PATCH v4 10/12] perf jevents: Generate metrics and events as separate tables

2023-01-25 Thread Ian Rogers

Turn a perf json event into an event, metric or both. This reduces the
number of entries that must be scanned to find an event or metric. As events
no longer need the relatively seldom used metric fields, 4 bytes are
saved per event. This reduces the big C string's size by 335kb (14.8%)
on x86.

Note, for the test PMU architecture pme_test_soc_cpu is renamed
pmu_events__test_soc_cpu for consistency with the event vs metric
naming convention.

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/jevents.py | 244 +++
 tools/perf/tests/pmu-events.c|   3 +-
 2 files changed, 189 insertions(+), 58 deletions(-)

diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index d83cc94af51f..627ee817f57f 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -13,28 +13,40 @@ import collections
 
 # Global command line arguments.
 _args = None
+# List of regular event tables.
+_event_tables = []
 # List of event tables generated from "/sys" directories.
 _sys_event_tables = []
+# List of regular metric tables.
+_metric_tables = []
+# List of metric tables generated from "/sys" directories.
+_sys_metric_tables = []
+# Mapping between sys event table names and sys metric table names.
+_sys_event_table_to_metric_table_mapping = {}
 # Map from an event name to an architecture standard
 # JsonEvent. Architecture standard events are in json files in the top
 # f'{_args.starting_dir}/{_args.arch}' directory.
 _arch_std_events = {}
 # Events to write out when the table is closed
 _pending_events = []
-# Name of table to be written out
+# Name of events table to be written out
 _pending_events_tblname = None
+# Metrics to write out when the table is closed
+_pending_metrics = []
+# Name of metrics table to be written out
+_pending_metrics_tblname = None
 # Global BigCString shared by all structures.
 _bcs = None
 # Order specific JsonEvent attributes will be visited.
 _json_event_attributes = [
 # cmp_sevent related attributes.
-'name', 'pmu', 'topic', 'desc', 'metric_name', 'metric_group',
+'name', 'pmu', 'topic', 'desc',
 # Seems useful, put it early.
 'event',
 # Short things in alphabetical order.
 'aggr_mode', 'compat', 'deprecated', 'perpkg', 'unit',
 # Longer things (the last won't be iterated over during decompress).
-'metric_constraint', 'metric_expr', 'long_desc'
+'long_desc'
 ]
 
 # Attributes that are in pmu_metric rather than pmu_event.
@@ -52,14 +64,16 @@ def removesuffix(s: str, suffix: str) -> str:
   return s[0:-len(suffix)] if s.endswith(suffix) else s
 
 
-def file_name_to_table_name(parents: Sequence[str], dirname: str) -> str:
+def file_name_to_table_name(prefix: str, parents: Sequence[str],
+dirname: str) -> str:
   """Generate a C table name from directory names."""
-  tblname = 'pme'
+  tblname = prefix
   for p in parents:
 tblname += '_' + p
   tblname += '_' + dirname
   return tblname.replace('-', '_')
 
+
 def c_len(s: str) -> int:
   """Return the length of s a C string
 
@@ -277,7 +291,7 @@ class JsonEvent:
 self.metric_constraint = jd.get('MetricConstraint')
 self.metric_expr = None
 if 'MetricExpr' in jd:
-   self.metric_expr = metric.ParsePerfJson(jd['MetricExpr']).Simplify()
+  self.metric_expr = metric.ParsePerfJson(jd['MetricExpr']).Simplify()
 
 arch_std = jd.get('ArchStdEvent')
 if precise and self.desc and '(Precise Event)' not in self.desc:
@@ -326,23 +340,24 @@ class JsonEvent:
 s += f'\t{attr} = {value},\n'
 return s + '}'
 
-  def build_c_string(self) -> str:
+  def build_c_string(self, metric: bool) -> str:
 s = ''
-for attr in _json_event_attributes:
+for attr in _json_metric_attributes if metric else _json_event_attributes:
   x = getattr(self, attr)
-  if x and attr == 'metric_expr':
+  if metric and x and attr == 'metric_expr':
 # Convert parsed metric expressions into a string. Slashes
 # must be doubled in the file.
 x = x.ToPerfJson().replace('\\', '')
   s += f'{x}\\000' if x else '\\000'
 return s
 
-  def to_c_string(self) -> str:
+  def to_c_string(self, metric: bool) -> str:
 """Representation of the event as a C struct initializer."""
 
-s = self.build_c_string()
+s = self.build_c_string(metric)
 return f'{{ { _bcs.offsets[s] } }}, /* {s} */\n'
 
+
 @lru_cache(maxsize=None)
 def read_json_events(path: str, topic: str) -> Sequence[JsonEvent]:
   """Read json events from the specified file."""
@@ -381,7 +396,10 @@ def preprocess_arch_std_files(archpath: str) -> None:
 def add_events_table_entries(item: os.DirEntry, topic: str) -> None:
   """Add contents of file to _pending_events table."""
   for e in read_json_events(item.path, topic):
-_pending_events.append(e)
+if e.name:
+  _pending_events.append(e)
+if e.metric_name:
+  _pending_metrics.append(e)
 
 
 def print_pending_events()
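
The two ideas in this patch, routing each JSON entry to an events list, a metrics list or both, and deriving table names from a prefix, can be tried in isolation. This is a sketch: split_entries and the dictionary-based entries are hypothetical, and the 'pmu_events_' prefix is inferred from the rename mentioned in the changelog rather than taken from the truncated diff.

```python
from typing import Dict, List, Sequence, Tuple


def file_name_to_table_name(prefix: str, parents: Sequence[str],
                            dirname: str) -> str:
    """Generate a C table name from a prefix and directory names."""
    tblname = prefix
    for p in parents:
        tblname += '_' + p
    tblname += '_' + dirname
    return tblname.replace('-', '_')


def split_entries(entries: List[Dict[str, str]]) -> Tuple[list, list]:
    """Route entries by which of 'name' and 'metric_name' they carry."""
    events = [e for e in entries if e.get('name')]
    metrics = [e for e in entries if e.get('metric_name')]
    return events, metrics
```

An entry that carries both a name and a metric_name lands in both lists, which matches the "event, metric or both" wording of the changelog.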

[PATCH v4 09/12] perf pmu-events: Introduce pmu_metrics_table

2023-01-25 Thread Ian Rogers

Add a metrics table that is just a cast from pmu_events_table. This
changes the APIs so that event and metric usage of the underlying
table is different. For the no jevents case the tables are already
separate; later changes will separate the tables for the jevents case.

Signed-off-by: Ian Rogers 
---
 tools/perf/arch/arm64/util/pmu.c | 11 -
 tools/perf/pmu-events/empty-pmu-events.c | 21 -
 tools/perf/pmu-events/jevents.py | 21 ++---
 tools/perf/pmu-events/pmu-events.h   | 10 +++--
 tools/perf/tests/expand-cgroup.c |  2 +-
 tools/perf/tests/parse-metric.c  |  2 +-
 tools/perf/tests/pmu-events.c|  5 ++-
 tools/perf/util/metricgroup.c| 54 
 tools/perf/util/metricgroup.h|  2 +-
 tools/perf/util/pmu.c|  5 +++
 tools/perf/util/pmu.h|  1 +
 11 files changed, 78 insertions(+), 56 deletions(-)

diff --git a/tools/perf/arch/arm64/util/pmu.c b/tools/perf/arch/arm64/util/pmu.c
index 801bf52e2ea6..2779840d8896 100644
--- a/tools/perf/arch/arm64/util/pmu.c
+++ b/tools/perf/arch/arm64/util/pmu.c
@@ -22,7 +22,14 @@ static struct perf_pmu *pmu__find_core_pmu(void)
return NULL;
 
return pmu;
-   }
+}
+
+const struct pmu_metrics_table *pmu_metrics_table__find(void)
+{
+   struct perf_pmu *pmu = pmu__find_core_pmu();
+
+   if (pmu)
+   return perf_pmu__find_metrics_table(pmu);
 
return NULL;
 }
@@ -32,7 +39,7 @@ const struct pmu_events_table *pmu_events_table__find(void)
struct perf_pmu *pmu = pmu__find_core_pmu();
 
if (pmu)
-   return perf_pmu__find_table(pmu);
+   return perf_pmu__find_events_table(pmu);
 
return NULL;
 }
diff --git a/tools/perf/pmu-events/empty-pmu-events.c 
b/tools/perf/pmu-events/empty-pmu-events.c
index 10bd4943ebf8..a938b74cf487 100644
--- a/tools/perf/pmu-events/empty-pmu-events.c
+++ b/tools/perf/pmu-events/empty-pmu-events.c
@@ -278,13 +278,11 @@ int pmu_events_table_for_each_event(const struct 
pmu_events_table *table, pmu_ev
return 0;
 }
 
-int pmu_events_table_for_each_metric(const struct pmu_events_table *etable, 
pmu_metric_iter_fn fn,
-void *data)
+int pmu_metrics_table_for_each_metric(const struct pmu_metrics_table *table, 
pmu_metric_iter_fn fn,
+ void *data)
 {
-   struct pmu_metrics_table *table = (struct pmu_metrics_table *)etable;
-
for (const struct pmu_metric *pm = >entries[0]; pm->metric_expr; 
pm++) {
-   int ret = fn(pm, etable, data);
+   int ret = fn(pm, table, data);
 
if (ret)
return ret;
@@ -320,9 +318,9 @@ const struct pmu_events_table 
*perf_pmu__find_events_table(struct perf_pmu *pmu)
return table;
 }
 
-const struct pmu_events_table *perf_pmu__find_metrics_table(struct perf_pmu 
*pmu)
+const struct pmu_metrics_table *perf_pmu__find_metrics_table(struct perf_pmu 
*pmu)
 {
-   const struct pmu_events_table *table = NULL;
+   const struct pmu_metrics_table *table = NULL;
char *cpuid = perf_pmu__getcpuid(pmu);
int i;
 
@@ -340,7 +338,7 @@ const struct pmu_events_table 
*perf_pmu__find_metrics_table(struct perf_pmu *pmu
break;
 
if (!strcmp_cpuid_str(map->cpuid, cpuid)) {
-   table = (const struct pmu_events_table 
*)>metric_table;
+   table = >metric_table;
break;
}
}
@@ -359,13 +357,13 @@ const struct pmu_events_table 
*find_core_events_table(const char *arch, const ch
return NULL;
 }
 
-const struct pmu_events_table *find_core_metrics_table(const char *arch, const 
char *cpuid)
+const struct pmu_metrics_table *find_core_metrics_table(const char *arch, 
const char *cpuid)
 {
for (const struct pmu_events_map *tables = _events_map[0];
 tables->arch;
 tables++) {
if (!strcmp(tables->arch, arch) && 
!strcmp_cpuid_str(tables->cpuid, cpuid))
-   return (const struct pmu_events_table 
*)>metric_table;
+   return >metric_table;
}
return NULL;
 }
@@ -386,8 +384,7 @@ int pmu_for_each_core_metric(pmu_metric_iter_fn fn, void 
*data)
for (const struct pmu_events_map *tables = _events_map[0];
 tables->arch;
 tables++) {
-   int ret = pmu_events_table_for_each_metric(
-   (const struct pmu_events_table *)>metric_table, 
fn, data);
+   int ret = 
pmu_metrics_table_for_each_metric(>metric_table, fn, data);
 
if (ret)
return ret;
diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 5f8d490c7269..d83cc94af51f 100755
--- a/tools/perf/pmu-events/jevents.py
+++

[PATCH v4 08/12] perf jevents: Combine table prefix and suffix writing

2023-01-25 Thread Ian Rogers

Combine into a single function to simplify, in a later change, writing
metrics separately.

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/jevents.py | 36 +---
 1 file changed, 14 insertions(+), 22 deletions(-)

diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 4cdbf34b7298..5f8d490c7269 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -19,10 +19,10 @@ _sys_event_tables = []
 # JsonEvent. Architecture standard events are in json files in the top
 # f'{_args.starting_dir}/{_args.arch}' directory.
 _arch_std_events = {}
-# Track whether an events table is currently being defined and needs closing.
-_close_table = False
 # Events to write out when the table is closed
 _pending_events = []
+# Name of table to be written out
+_pending_events_tblname = None
 # Global BigCString shared by all structures.
 _bcs = None
 # Order specific JsonEvent attributes will be visited.
@@ -378,24 +378,13 @@ def preprocess_arch_std_files(archpath: str) -> None:
   _arch_std_events[event.metric_name.lower()] = event
 
 
-def print_events_table_prefix(tblname: str) -> None:
-  """Called when a new events table is started."""
-  global _close_table
-  if _close_table:
-    raise IOError('Printing table prefix but last table has no suffix')
-  _args.output_file.write(f'static const struct compact_pmu_event {tblname}[] = {{\n')
-  _close_table = True
-
-
 def add_events_table_entries(item: os.DirEntry, topic: str) -> None:
   """Add contents of file to _pending_events table."""
-  if not _close_table:
-    raise IOError('Table entries missing prefix')
   for e in read_json_events(item.path, topic):
     _pending_events.append(e)
 
 
-def print_events_table_suffix() -> None:
+def print_pending_events() -> None:
   """Optionally close events table."""
 
   def event_cmp_key(j: JsonEvent) -> Tuple[bool, str, str, str, str]:
@@ -407,17 +396,19 @@ def print_events_table_suffix() -> None:
     return (j.desc is not None, fix_none(j.topic), fix_none(j.name), fix_none(j.pmu),
             fix_none(j.metric_name))
 
-  global _close_table
-  if not _close_table:
+  global _pending_events
+  if not _pending_events:
     return
 
-  global _pending_events
+  global _pending_events_tblname
+  _args.output_file.write(
+      f'static const struct compact_pmu_event {_pending_events_tblname}[] = {{\n')
+
   for event in sorted(_pending_events, key=event_cmp_key):
     _args.output_file.write(event.to_c_string())
-    _pending_events = []
+  _pending_events = []
+  _pending_events = []
 
   _args.output_file.write('};\n\n')
-  _close_table = False
 
 def get_topic(topic: str) -> str:
   if topic.endswith('metrics.json'):
@@ -455,12 +446,13 @@ def process_one_file(parents: Sequence[str], item: os.DirEntry) -> None:
 
   # model directory, reset topic
   if item.is_dir() and is_leaf_dir(item.path):
-    print_events_table_suffix()
+    print_pending_events()
 
     tblname = file_name_to_table_name(parents, item.name)
     if item.name == 'sys':
       _sys_event_tables.append(tblname)
-    print_events_table_prefix(tblname)
+    global _pending_events_tblname
+    _pending_events_tblname = tblname
     return
 
   # base dir or too deep
@@ -809,7 +801,7 @@ struct compact_pmu_event {
   for arch in archs:
     arch_path = f'{_args.starting_dir}/{arch}'
     ftw(arch_path, [], process_one_file)
-    print_events_table_suffix()
+    print_pending_events()
 
   print_mapping_table(archs)
   print_system_mapping_table()
-- 
2.39.1.456.gfc5497dd1b-goog

[PATCH v4 07/12] perf stat: Remove evsel metric_name/expr

2023-01-25 Thread Ian Rogers

Metrics are their own unit and these variables held broken metrics
previously and now just hold the value NULL. Remove code that used
these variables.

Reviewed-by: John Garry 
Signed-off-by: Ian Rogers 
---
 tools/perf/builtin-stat.c |   1 -
 tools/perf/util/cgroup.c  |   1 -
 tools/perf/util/evsel.c   |   2 -
 tools/perf/util/evsel.h   |   2 -
 tools/perf/util/python.c  |   7 ---
 tools/perf/util/stat-shadow.c | 112 --
 tools/perf/util/stat.h|   1 -
 7 files changed, 126 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 9f3e4b257516..5d18a5a6f662 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -2524,7 +2524,6 @@ int cmd_stat(int argc, const char **argv)
_config.metric_events);
zfree();
}
-   perf_stat__collect_metric_expr(evsel_list);
perf_stat__init_shadow_stats();
 
if (add_default_attributes())
diff --git a/tools/perf/util/cgroup.c b/tools/perf/util/cgroup.c
index cd978c240e0d..bfb13306d82c 100644
--- a/tools/perf/util/cgroup.c
+++ b/tools/perf/util/cgroup.c
@@ -481,7 +481,6 @@ int evlist__expand_cgroup(struct evlist *evlist, const char *str,
nr_cgroups++;
 
if (metric_events) {
-   perf_stat__collect_metric_expr(tmp_list);
if (metricgroup__copy_metric_events(tmp_list, cgrp,
metric_events,

_metric_events) < 0)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 8550638587e5..a90e998826e0 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -285,8 +285,6 @@ void evsel__init(struct evsel *evsel,
evsel->sample_size = __evsel__sample_size(attr->sample_type);
evsel__calc_id_pos(evsel);
evsel->cmdline_group_boundary = false;
-   evsel->metric_expr   = NULL;
-   evsel->metric_name   = NULL;
evsel->metric_events = NULL;
evsel->per_pkg_mask  = NULL;
evsel->collect_stat  = false;
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index d572be41b960..24cb807ef6ce 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -105,8 +105,6 @@ struct evsel {
 * metric fields are similar, but needs more care as they can have
 * references to other metric (evsel).
 */
-   const char *metric_expr;
-   const char *metric_name;
struct evsel**metric_events;
struct evsel*metric_leader;
 
diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c
index 9e5d881b0987..42e8b813d010 100644
--- a/tools/perf/util/python.c
+++ b/tools/perf/util/python.c
@@ -76,13 +76,6 @@ const char *perf_env__arch(struct perf_env *env __maybe_unused)
return NULL;
 }
 
-/*
- * Add this one here not to drag util/stat-shadow.c
- */
-void perf_stat__collect_metric_expr(struct evlist *evsel_list)
-{
-}
-
 /*
  * These ones are needed not to drag the PMU bandwagon, jevents generated
  * pmu_sys_event_tables, etc and evsel__find_pmu() is used so far just for
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index cadb2df23c87..35ea4813f468 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -346,114 +346,6 @@ static const char *get_ratio_color(enum grc_type type, double ratio)
return color;
 }
 
-static struct evsel *perf_stat__find_event(struct evlist *evsel_list,
-   const char *name)
-{
-   struct evsel *c2;
-
-   evlist__for_each_entry (evsel_list, c2) {
-   if (!strcasecmp(c2->name, name) && !c2->collect_stat)
-   return c2;
-   }
-   return NULL;
-}
-
-/* Mark MetricExpr target events and link events using them to them. */
-void perf_stat__collect_metric_expr(struct evlist *evsel_list)
-{
-   struct evsel *counter, *leader, **metric_events, *oc;
-   bool found;
-   struct expr_parse_ctx *ctx;
-   struct hashmap_entry *cur;
-   size_t bkt;
-   int i;
-
-   ctx = expr__ctx_new();
-   if (!ctx) {
-   pr_debug("expr__ctx_new failed");
-   return;
-   }
-   evlist__for_each_entry(evsel_list, counter) {
-   bool invalid = false;
-
-   leader = evsel__leader(counter);
-   if (!counter->metric_expr)
-   continue;
-
-   expr__ctx_clear(ctx);
-   metric_events = counter->metric_events;
-   if (!metric_events) {
-   if (expr__find_ids(counter->metric_expr,
-  counter->name,
-  ctx) < 0)
-   continue;
-
-

[PATCH v4 06/12] perf pmu-events: Remove now unused event and metric variables

2023-01-25 Thread Ian Rogers

Previous changes separated the uses of pmu_event and pmu_metric,
however, both structures contained all the variables of event and
metric. This change removes the event variables from metric and the
metric variables from event.

Note, this change removes the setting of evsel's metric_name/expr as
these fields are no longer part of struct pmu_event. The metric
remains but is no longer implicitly requested when the event is. This
impacts a few Intel uncore events, however, as the ScaleUnit is shared
by the event and the metric this utility is questionable. Also the
MetricNames look broken (contain spaces) in some cases and when trying
to use the functionality with '-e' the metrics fail but regular
metrics with '-M' work. For example, on SkylakeX '-M' works:

```
$ perf stat -M LLC_MISSES.PCIE_WRITE -a sleep 1

 Performance counter stats for 'system wide':

 0  UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 #  57896.0 
Bytes  LLC_MISSES.PCIE_WRITE  (49.84%)
 7,174  UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 
   (49.85%)
 0  UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 
   (50.16%)
63  UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 
   (50.15%)

   1.004576381 seconds time elapsed
```

whilst the event '-e' version is broken even with --group/-g (fwiw, we
should also remove -g [1]):

```
$ perf stat -g -e LLC_MISSES.PCIE_WRITE -g -a sleep 1
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE

 Performance counter stats for 'system wide':

27,316 Bytes LLC_MISSES.PCIE_WRITE

   1.004505469 seconds time elapsed
```

The code also carries warnings where the user is supposed to select
events for metrics [2] but given the lack of use of such a feature,
let's clean up the code and just remove it.

[1] https://lore.kernel.org/lkml/20220707195610.303254-1-irog...@google.com/
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/stat-shadow.c?id=01b8957b738f42f96a130079bc951b3cc78c5b8a#n425

Reviewed-by: John Garry 
Signed-off-by: Ian Rogers 
---
 tools/perf/builtin-list.c  | 20 ++---
 tools/perf/pmu-events/jevents.py   | 20 +
 tools/perf/pmu-events/pmu-events.h | 22 +--

[PATCH v4 05/12] perf pmu-events: Separate the metrics from events for no jevents

2023-01-25 Thread Ian Rogers

Separate the event and metric table when building without jevents. Add
find_core_metrics_table and perf_pmu__find_metrics_table while
renaming existing utilities to be event specific, so that users can
find the right table for their need.

Reviewed-by: John Garry 
Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/empty-pmu-events.c | 88 ++--
 tools/perf/pmu-events/jevents.py |  7 +-
 tools/perf/pmu-events/pmu-events.h   |  4 +-
 tools/perf/tests/expand-cgroup.c |  2 +-
 tools/perf/tests/parse-metric.c  |  2 +-
 tools/perf/util/pmu.c|  4 +-
 6 files changed, 79 insertions(+), 28 deletions(-)

diff --git a/tools/perf/pmu-events/empty-pmu-events.c b/tools/perf/pmu-events/empty-pmu-events.c
index 4e39d1a8d6d6..10bd4943ebf8 100644
--- a/tools/perf/pmu-events/empty-pmu-events.c
+++ b/tools/perf/pmu-events/empty-pmu-events.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 
-static const struct pmu_event pme_test_soc_cpu[] = {
+static const struct pmu_event pmu_events__test_soc_cpu[] = {
{
.name = "l3_cache_rd",
.event = "event=0x40",
@@ -105,6 +105,14 @@ static const struct pmu_event pme_test_soc_cpu[] = {
.desc = "L2 BTB Correction",
.topic = "branch",
},
+   {
+   .name = 0,
+   .event = 0,
+   .desc = 0,
+   },
+};
+
+static const struct pmu_metric pmu_metrics__test_soc_cpu[] = {
{
.metric_expr= "1 / IPC",
.metric_name= "CPI",
@@ -170,9 +178,8 @@ static const struct pmu_event pme_test_soc_cpu[] = {
.metric_name= "L1D_Cache_Fill_BW",
},
{
-   .name = 0,
-   .event = 0,
-   .desc = 0,
+   .metric_expr = 0,
+   .metric_name = 0,
},
 };
 
@@ -197,7 +204,8 @@ struct pmu_metrics_table {
 struct pmu_events_map {
const char *arch;
const char *cpuid;
-   const struct pmu_events_table table;
+   const struct pmu_events_table event_table;
+   const struct pmu_metrics_table metric_table;
 };
 
 /*
@@ -208,12 +216,14 @@ static const struct pmu_events_map pmu_events_map[] = {
{
.arch = "testarch",
.cpuid = "testcpu",
-   .table = { pme_test_soc_cpu },
+   .event_table = { pmu_events__test_soc_cpu },
+   .metric_table = { pmu_metrics__test_soc_cpu },
},
{
.arch = 0,
.cpuid = 0,
-   .table = { 0 },
+   .event_table = { 0 },
+   .metric_table = { 0 },
},
 };
 
@@ -259,12 +269,9 @@ static const struct pmu_sys_events pmu_sys_event_tables[] = {
 int pmu_events_table_for_each_event(const struct pmu_events_table *table, pmu_event_iter_fn fn,
void *data)
 {
-   for (const struct pmu_event *pe = &table->entries[0]; pe->name || pe->metric_expr; pe++) {
-   int ret;
+   for (const struct pmu_event *pe = &table->entries[0]; pe->name; pe++) {
+   int ret = fn(pe, table, data);
 
-   if (!pe->name)
-   continue;
-   ret = fn(pe, table, data);
if (ret)
return ret;
}
@@ -276,19 +283,44 @@ int pmu_events_table_for_each_metric(const struct pmu_events_table *etable, pmu_
 {
struct pmu_metrics_table *table = (struct pmu_metrics_table *)etable;
 
-   for (const struct pmu_metric *pm = &table->entries[0]; pm->name || pm->metric_expr; pm++) {
-   int ret;
+   for (const struct pmu_metric *pm = &table->entries[0]; pm->metric_expr; pm++) {
+   int ret = fn(pm, etable, data);
 
-   if (!pm->metric_expr)
-   continue;
-   ret = fn(pm, etable, data);
if (ret)
return ret;
}
return 0;
 }
 
-const struct pmu_events_table *perf_pmu__find_table(struct perf_pmu *pmu)
+const struct pmu_events_table *perf_pmu__find_events_table(struct perf_pmu *pmu)
+{
+   const struct pmu_events_table *table = NULL;
+   char *cpuid = perf_pmu__getcpuid(pmu);
+   int i;
+
+   /* on some platforms which uses cpus map, cpuid can be NULL for
+* PMUs other than CORE PMUs.
+*/
+   if (!cpuid)
+   return NULL;
+
+   i = 0;
+   for (;;) {
+   const struct pmu_events_map *map = &pmu_events_map[i++];
+
+   if (!map->cpuid)
+   break;
+
+   if (!strcmp_cpuid_str(map->cpuid, cpuid)) {
+   table = &map->event_table;
+   break;
+   }
+   }
+   free(cpuid);
+   return table;
+}
+
+const struct pmu_events_table *perf_pmu__find_metrics_table(struct perf_pmu *pmu)
 {
const struct pmu_events_table *table = NULL;

[PATCH v4 04/12] perf pmu-events: Add separate metric from pmu_event

2023-01-25 Thread Ian Rogers

Create a new pmu_metric for the metric related variables from
pmu_event but that is initially just a clone of pmu_event. Add
iterators for pmu_metric and use in places that metrics are desired
rather than events. Make the event iterator skip metric only events,
and the metric iterator skip event only events.

Reviewed-by: John Garry 
Signed-off-by: Ian Rogers 
---
 tools/perf/arch/powerpc/util/header.c|   4 +-
 tools/perf/pmu-events/empty-pmu-events.c |  49 ++-
 tools/perf/pmu-events/jevents.py |  62 -
 tools/perf/pmu-events/pmu-events.h   |  26 
 tools/perf/tests/pmu-events.c|  35 +++--
 tools/perf/util/metricgroup.c| 161 +++
 tools/perf/util/metricgroup.h|   2 +-
 7 files changed, 228 insertions(+), 111 deletions(-)

diff --git a/tools/perf/arch/powerpc/util/header.c b/tools/perf/arch/powerpc/util/header.c
index e8fe36b10d20..78eef77d8a8d 100644
--- a/tools/perf/arch/powerpc/util/header.c
+++ b/tools/perf/arch/powerpc/util/header.c
@@ -40,11 +40,11 @@ get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
return bufp;
 }
 
-int arch_get_runtimeparam(const struct pmu_event *pe)
+int arch_get_runtimeparam(const struct pmu_metric *pm)
 {
int count;
char path[PATH_MAX] = "/devices/hv_24x7/interface/";
 
-   atoi(pe->aggr_mode) == PerChip ? strcat(path, "sockets") : strcat(path, "coresperchip");
+   atoi(pm->aggr_mode) == PerChip ? strcat(path, "sockets") : strcat(path, "coresperchip");
return sysfs__read_int(path, &count) < 0 ? 1 : count;
 }
diff --git a/tools/perf/pmu-events/empty-pmu-events.c b/tools/perf/pmu-events/empty-pmu-events.c
index 480e8f0d30c8..4e39d1a8d6d6 100644
--- a/tools/perf/pmu-events/empty-pmu-events.c
+++ b/tools/perf/pmu-events/empty-pmu-events.c
@@ -181,6 +181,11 @@ struct pmu_events_table {
const struct pmu_event *entries;
 };
 
+/* Struct used to make the PMU metric table implementation opaque to callers. */
+struct pmu_metrics_table {
+   const struct pmu_metric *entries;
+};
+
 /*
  * Map a CPU to its table of PMU events. The CPU is identified by the
  * cpuid field, which is an arch-specific identifier for the CPU.
@@ -254,11 +259,29 @@ static const struct pmu_sys_events pmu_sys_event_tables[] = {
 int pmu_events_table_for_each_event(const struct pmu_events_table *table, pmu_event_iter_fn fn,
void *data)
 {
-   for (const struct pmu_event *pe = &table->entries[0];
-pe->name || pe->metric_group || pe->metric_name;
-pe++) {
-   int ret = fn(pe, table, data);
+   for (const struct pmu_event *pe = &table->entries[0]; pe->name || pe->metric_expr; pe++) {
+   int ret;
 
+   if (!pe->name)
+   continue;
+   ret = fn(pe, table, data);
+   if (ret)
+   return ret;
+   }
+   return 0;
+}
+
+int pmu_events_table_for_each_metric(const struct pmu_events_table *etable, pmu_metric_iter_fn fn,
+void *data)
+{
+   struct pmu_metrics_table *table = (struct pmu_metrics_table *)etable;
+
+   for (const struct pmu_metric *pm = &table->entries[0]; pm->name || pm->metric_expr; pm++) {
+   int ret;
+
+   if (!pm->metric_expr)
+   continue;
+   ret = fn(pm, etable, data);
if (ret)
return ret;
}
@@ -305,11 +328,22 @@ const struct pmu_events_table *find_core_events_table(const char *arch, const ch
 }
 
 int pmu_for_each_core_event(pmu_event_iter_fn fn, void *data)
+{
+   for (const struct pmu_events_map *tables = &pmu_events_map[0]; tables->arch; tables++) {
+   int ret = pmu_events_table_for_each_event(&tables->table, fn, data);
+
+   if (ret)
+   return ret;
+   }
+   return 0;
+}
+
+int pmu_for_each_core_metric(pmu_metric_iter_fn fn, void *data)
 {
for (const struct pmu_events_map *tables = &pmu_events_map[0];
 tables->arch;
 tables++) {
-   int ret = pmu_events_table_for_each_event(&tables->table, fn, data);
+   int ret = pmu_events_table_for_each_metric(&tables->table, fn, data);
 
if (ret)
return ret;
@@ -340,3 +374,8 @@ int pmu_for_each_sys_event(pmu_event_iter_fn fn, void *data)
}
return 0;
 }
+
+int pmu_for_each_sys_metric(pmu_metric_iter_fn fn __maybe_unused, void *data __maybe_unused)
+{
+   return 0;
+}
diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 15a1671740cc..858787a12302 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -564,7 +564,19 @@ static const struct pmu_sys_events pmu_sys_event_tables[] = {
 \t},
 };
 
-static void decompress(int offset, struct pmu_event *pe)
+static void decompress_event(int offset, struct pmu_event *pe)
+{

[PATCH v4 03/12] perf jevents: Rewrite metrics in the same file with each other

2023-01-25 Thread Ian Rogers

Rewrite metrics within the same file in terms of each other. For
example, on Power8 other_stall_cpi is rewritten from:
"PM_CMPLU_STALL / PM_RUN_INST_CMPL - PM_CMPLU_STALL_BRU_CRU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_FXU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_VSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_LSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NTCG_FLUSH / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NO_NTF / PM_RUN_INST_CMPL"
to:
"stall_cpi - bru_cru_stall_cpi - fxu_stall_cpi - vsu_stall_cpi - lsu_stall_cpi - ntcg_flush_cpi - no_ntf_stall_cpi"
which more closely matches the definition on Power9.

To avoid recomputation decorate the function with a cache.
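
As a minimal sketch of what the decorator provides, assuming nothing beyond the standard library: `read_events` and the `calls` list below are hypothetical stand-ins, not the jevents.py function. A second call with the same arguments returns the stored result without running the body again.

```python
from functools import lru_cache

calls = []


@lru_cache(maxsize=None)
def read_events(path: str, topic: str) -> tuple:
  """Hypothetical stand-in for an expensive parse of one JSON file."""
  calls.append((path, topic))
  return (f'{topic}:{path}',)


first = read_events('cache.json', 'cache')
second = read_events('cache.json', 'cache')   # served from the cache
third = read_events('cache.json', 'memory')   # different arguments, new call
```

A cache hit hands back the very same object as the first call, so callers of a cached function should treat the result as read-only.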

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/jevents.py | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 0416b7442171..15a1671740cc 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -3,6 +3,7 @@
 """Convert directories of JSON events to C code."""
 import argparse
 import csv
+from functools import lru_cache
 import json
 import metric
 import os
@@ -337,18 +338,28 @@ class JsonEvent:
     s = self.build_c_string()
     return f'{{ { _bcs.offsets[s] } }}, /* {s} */\n'
 
-
+@lru_cache(maxsize=None)
 def read_json_events(path: str, topic: str) -> Sequence[JsonEvent]:
   """Read json events from the specified file."""
-
   try:
-    result = json.load(open(path), object_hook=JsonEvent)
+    events = json.load(open(path), object_hook=JsonEvent)
   except BaseException as err:
     print(f"Exception processing {path}")
     raise
-  for event in result:
+  metrics: list[Tuple[str, metric.Expression]] = []
+  for event in events:
     event.topic = topic
-  return result
+    if event.metric_name and '-' not in event.metric_name:
+      metrics.append((event.metric_name, event.metric_expr))
+  updates = metric.RewriteMetricsInTermsOfOthers(metrics)
+  if updates:
+    for event in events:
+      if event.metric_name in updates:
+        # print(f'Updated {event.metric_name} from\n"{event.metric_expr}"\n'
+        #       f'to\n"{updates[event.metric_name]}"')
+        event.metric_expr = updates[event.metric_name]
+
+  return events
 
 def preprocess_arch_std_files(archpath: str) -> None:
   """Read in all architecture standard events."""
-- 
2.39.1.456.gfc5497dd1b-goog

[PATCH v4 02/12] perf jevents metric: Add ability to rewrite metrics in terms of others

2023-01-25 Thread Ian Rogers

Add RewriteMetricsInTermsOfOthers that iterates over pairs of names
and expressions trying to replace an expression, within the current
expression, with its name.
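
The substitution step can be illustrated with a much smaller tree than the Expression classes in the diff below. This is a sketch under stated assumptions, not the patch's implementation: the tuple encoding and the `substitute` helper are hypothetical, and the rewrite loop of RewriteMetricsInTermsOfOthers is not reproduced here. A subtree that is structurally equal to another metric's expression is replaced by that metric's name, and leaves are returned unchanged.

```python
from typing import Union

# Hypothetical encoding: a leaf is a string, an operator node is a
# tuple of (operator, lhs, rhs).
Tree = Union[str, tuple]


def substitute(tree: Tree, name: str, target: Tree) -> Tree:
  """Replace every subtree equal to target with the metric name."""
  if not isinstance(tree, tuple):
    return tree  # leaves are never replaced
  if tree == target:
    return name
  op, lhs, rhs = tree
  return (op, substitute(lhs, name, target), substitute(rhs, name, target))


stall_cpi = ('/', 'PM_CMPLU_STALL', 'PM_RUN_INST_CMPL')
fxu_stall_cpi = ('/', 'PM_CMPLU_STALL_FXU', 'PM_RUN_INST_CMPL')
other_stall_cpi = ('-', stall_cpi, fxu_stall_cpi)

rewritten = other_stall_cpi
for metric_name, expression in [('stall_cpi', stall_cpi),
                                ('fxu_stall_cpi', fxu_stall_cpi)]:
  rewritten = substitute(rewritten, metric_name, expression)
```

Equality is checked on whole subtrees only, so the result depends on how the expression was parsed into a tree.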

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/metric.py  | 73 +++-
 tools/perf/pmu-events/metric_test.py | 10 
 2 files changed, 81 insertions(+), 2 deletions(-)

diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index 2f2fd220e843..ed13efac7389 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -4,7 +4,7 @@ import ast
 import decimal
 import json
 import re
-from typing import Dict, List, Optional, Set, Union
+from typing import Dict, List, Optional, Set, Tuple, Union
 
 
 class Expression:
@@ -26,6 +26,9 @@ class Expression:
 """Returns true when two expressions are the same."""
     raise NotImplementedError()
 
+  def Substitute(self, name: str, expression: 'Expression') -> 'Expression':
+    raise NotImplementedError()
+
   def __str__(self) -> str:
     return self.ToPerfJson()
 
@@ -186,6 +189,15 @@ class Operator(Expression):
           other.lhs) and self.rhs.Equals(other.rhs)
     return False
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+    if self.Equals(expression):
+      return Event(name)
+    lhs = self.lhs.Substitute(name, expression)
+    rhs = None
+    if self.rhs:
+      rhs = self.rhs.Substitute(name, expression)
+    return Operator(self.operator, lhs, rhs)
+
 
 class Select(Expression):
   """Represents a select ternary in the parse tree."""
@@ -225,6 +237,14 @@ class Select(Expression):
           other.false_val) and self.true_val.Equals(other.true_val)
     return False
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+    if self.Equals(expression):
+      return Event(name)
+    true_val = self.true_val.Substitute(name, expression)
+    cond = self.cond.Substitute(name, expression)
+    false_val = self.false_val.Substitute(name, expression)
+    return Select(true_val, cond, false_val)
+
 
 class Function(Expression):
   """A function in an expression like min, max, d_ratio."""
@@ -267,6 +287,15 @@ class Function(Expression):
       return result
     return False
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+    if self.Equals(expression):
+      return Event(name)
+    lhs = self.lhs.Substitute(name, expression)
+    rhs = None
+    if self.rhs:
+      rhs = self.rhs.Substitute(name, expression)
+    return Function(self.fn, lhs, rhs)
+
 
 def _FixEscapes(s: str) -> str:
   s = re.sub(r'([^\\]),', r'\1\\,', s)
@@ -293,6 +322,9 @@ class Event(Expression):
   def Equals(self, other: Expression) -> bool:
     return isinstance(other, Event) and self.name == other.name
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+    return self
+
 
 class Constant(Expression):
   """A constant within the expression tree."""
@@ -317,6 +349,9 @@ class Constant(Expression):
   def Equals(self, other: Expression) -> bool:
     return isinstance(other, Constant) and self.value == other.value
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+    return self
+
 
 class Literal(Expression):
   """A runtime literal within the expression tree."""
@@ -336,6 +371,9 @@ class Literal(Expression):
   def Equals(self, other: Expression) -> bool:
     return isinstance(other, Literal) and self.value == other.value
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+    return self
+
 
 def min(lhs: Union[int, float, Expression], rhs: Union[int, float,
                                                        Expression]) -> Function:
@@ -461,6 +499,7 @@ class MetricGroup:
 
 
 class _RewriteIfExpToSelect(ast.NodeTransformer):
+  """Transformer to convert if-else nodes to Select expressions."""
 
   def visit_IfExp(self, node):
     # pylint: disable=invalid-name
@@ -498,7 +537,37 @@ def ParsePerfJson(orig: str) -> Expression:
   for kw in keywords:
     py = re.sub(rf'Event\(r"{kw}"\)', kw, py)
 
-  parsed = ast.parse(py, mode='eval')
+  try:
+    parsed = ast.parse(py, mode='eval')
+  except SyntaxError as e:
+    raise SyntaxError(f'Parsing expression:\n{orig}') from e
   _RewriteIfExpToSelect().visit(parsed)
   parsed = ast.fix_missing_locations(parsed)
   return _Constify(eval(compile(parsed, orig, 'eval')))
+
+
+def RewriteMetricsInTermsOfOthers(metrics: list[Tuple[str, Expression]]
+                                  )-> Dict[str, Expression]:
+  """Shorten metrics by rewriting in terms of others.
+
+  Args:
+    metrics (list): pairs of metric names and their expressions.
+  Returns:
+    Dict: mapping from a metric name to a shortened expression.
+  """
+  updates: Dict[str, Expression] = dict()
+  for outer_name, outer_expression in metrics:
+    updated = outer_expression
+    while True:
+      for inner_name, inner_expression in metrics:
+        if

[PATCH v4 01/12] perf jevents metric: Correct Function equality

2023-01-25 Thread Ian Rogers

rhs may not be defined, say for source_count, so add a guard.
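
A minimal sketch of the guard, using hypothetical `Leaf` and `Func` classes rather than the ones in metric.py: a one-argument function stores no right-hand side, so comparing it must not call a method on `None`.

```python
class Leaf:
  """Hypothetical leaf node, standing in for an event name."""

  def __init__(self, name: str) -> None:
    self.name = name

  def equals(self, other: object) -> bool:
    return isinstance(other, Leaf) and self.name == other.name


class Func:
  """Hypothetical function node whose second argument is optional."""

  def __init__(self, fn: str, lhs: Leaf, rhs: Leaf = None) -> None:
    self.fn = fn
    self.lhs = lhs
    self.rhs = rhs

  def equals(self, other: object) -> bool:
    if not isinstance(other, Func):
      return False
    result = self.fn == other.fn and self.lhs.equals(other.lhs)
    # As in the patch, only compare rhs when this node actually has one.
    if self.rhs:
      result = result and self.rhs.equals(other.rhs)
    return result


one_arg = Func('source_count', Leaf('EVENT_A'))
same = Func('source_count', Leaf('EVENT_A'))
different = Func('source_count', Leaf('EVENT_B'))
two_arg = Func('min', Leaf('EVENT_A'), Leaf('EVENT_B'))
```

Without the `if self.rhs:` check, `one_arg.equals(same)` would raise an AttributeError on `None`.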

Reviewed-by: Kajol Jain
Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/metric.py | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index 4797ed4fd817..2f2fd220e843 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -261,8 +261,10 @@ class Function(Expression):
 
   def Equals(self, other: Expression) -> bool:
     if isinstance(other, Function):
-      return self.fn == other.fn and self.lhs.Equals(
-          other.lhs) and self.rhs.Equals(other.rhs)
+      result = self.fn == other.fn and self.lhs.Equals(other.lhs)
+      if self.rhs:
+        result = result and self.rhs.Equals(other.rhs)
+      return result
     return False
 
 
-- 
2.39.1.456.gfc5497dd1b-goog

[PATCH v4 00/12] jevents/pmu-events improvements

2023-01-25 Thread Ian Rogers

Add an optimization to jevents using the metric code: rewrite metrics
in terms of each other in order to minimize size and improve
readability. For example, on Power8 other_stall_cpi is rewritten from:
"PM_CMPLU_STALL / PM_RUN_INST_CMPL - PM_CMPLU_STALL_BRU_CRU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_FXU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_VSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_LSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NTCG_FLUSH / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NO_NTF / PM_RUN_INST_CMPL"
to:
"stall_cpi - bru_cru_stall_cpi - fxu_stall_cpi - vsu_stall_cpi - lsu_stall_cpi - ntcg_flush_cpi - no_ntf_stall_cpi"
which more closely matches the definition on Power9.

A limitation of the substitutions is that they depend on strict
equality and the shape of the tree. This means that for "a + b + c"
a substitution of "a + b" will succeed while "b + c" will fail
(the LHS for "+ c" is "a + b", not just "b").
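
The tree shape can be checked with Python's own parser, which metric.py also relies on through ast.parse. This is a small sketch, not code from the series, and the `show` helper is hypothetical. It lists every "+" subtree of "a + b + c" with explicit parentheses.

```python
import ast


def show(node: ast.AST) -> str:
  """Render a tree of names and '+' operators with explicit parentheses."""
  if isinstance(node, ast.BinOp):
    return f'({show(node.left)} + {show(node.right)})'
  return node.id  # an ast.Name leaf


root = ast.parse('a + b + c', mode='eval').body

# Every operator node in the tree, i.e. every candidate for substitution.
subtrees = [show(n) for n in ast.walk(root) if isinstance(n, ast.BinOp)]
```

The list holds the whole expression and "(a + b)" but never "(b + c)", because "+" groups to the left.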

Separate out the events and metrics in the pmu-events tables, saving
14.8% in the table size and meaning that metrics no longer need to
iterate over all events and vice versa. These changes remove evsel's
direct metric support as the pmu_event no longer has a metric to
populate it. This is a minor issue as the code wasn't working
properly; metrics for this are rare and can still be properly run
using '-M'.

Add an ability to just build certain models into the jevents generated
pmu-metrics.c code. This functionality is appropriate for operating
systems like ChromeOS, that aim to minimize binary size and know all
the target CPU models.

v4. Better support the implementor/model style --model argument for
jevents.py. Add #slots test fix. On some patches add reviewed-by
John Garry and Kajol Jain.
v3. Rebase and incorporate review comments from John Garry, in
particular breaking apart patch 4 into 3 patches. The no jevents
breakage and then later fix is avoided in this series too.
v2. Rebase. Modify the code that skips rewriting a metric with the
same name with itself, to make the name check case insensitive.

Ian Rogers (12):
  perf jevents metric: Correct Function equality
  perf jevents metric: Add ability to rewrite metrics in terms of others
  perf jevents: Rewrite metrics in the same file with each other
  perf pmu-events: Add separate metric from pmu_event
  perf pmu-events: Separate the metrics from events for no jevents
  perf pmu-events: Remove now unused event and metric variables
  perf stat: Remove evsel metric_name/expr
  perf jevents: Combine table prefix and suffix writing
  perf pmu-events: Introduce pmu_metrics_table
  perf jevents: Generate metrics and events as separate tables
  perf jevents: Add model list option
  perf pmu-events: Fix testing with JEVENTS_ARCH=all

 tools/perf/arch/arm64/util/pmu.c |  11 +-
 tools/perf/arch/powerpc/util/header.c|   4 +-
 tools/perf/builtin-list.c|  20 +-
 tools/perf/builtin-stat.c|   1 -
 tools/perf/pmu-events/Build  |   3 +-
 tools/perf/pmu-events/empty-pmu-events.c | 108 ++-
 tools/perf/pmu-events/jevents.py | 357 +++
 tools/perf/pmu-events/metric.py  |  79 -
 tools/perf/pmu-events/metric_test.py |  10 +
 tools/perf/pmu-events/pmu-events.h   |  26 +-
 tools/perf/tests/expand-cgroup.c |   4 +-
 tools/perf/tests/parse-metric.c  |   4 +-
 tools/perf/tests/pmu-events.c|  69 ++---
 tools/perf/util/cgroup.c |   1 -
 tools/perf/util/evsel.c  |   2 -
 tools/perf/util/evsel.h  |   2 -
 tools/perf/util/expr.h   |   1 +
 tools/perf/util/expr.l   |   8 +-
 tools/perf/util/metricgroup.c| 207 +++--
 tools/perf/util/metricgroup.h|   4 +-
 tools/perf/util/parse-events.c   |   2 -
 tools/perf/util/pmu.c|  44 +--
 tools/perf/util/pmu.h|  10 +-
 tools/perf/util/print-events.c   |  32 +-
 tools/perf/util/print-events.h   |   3 +-
 tools/perf/util/python.c |   7 -
 tools/perf/util/stat-shadow.c| 112 ---
 tools/perf/util/stat.h   |   1 -
 28 files changed, 666 insertions(+), 466 deletions(-)

-- 
2.39.1.456.gfc5497dd1b-goog
