Re: [PATCH] perf stat: Add support for s390 transaction counters
> Displaying these different types for s390 is important from my point of > view. Of course, I could create a mapping of TX_NC_TABORT/TX_NC_TEND > to tx-commit/tx-abort. The remaining events would still appear to be specific > to the cpum_cf. If you want more generic metrics you can just use the new json generic metrics / metric group support in the eventlist. In fact if we did it today I wouldn't add perf stat -T at all, but just use that. -Andi
Re: [PATCH] perf stat: Add support for s390 transaction counters
Hi, On Wed, Mar 14, 2018 at 08:43:17AM -0700, Andi Kleen wrote: > > S390 has no support for Elision and uses transaction begin/end/abort > > instructions. The CPU measurement counter facility provides counters for > > transaction end and transaction abort. > > You don't need to implement the el-* events. > > > I have used this table (taken from arch/x86/events/intel/core.c) as > > giudeline: > > /* Haswell special events */ > > EVENT_ATTR_STR(tx-start,tx_start, "event=0xc9,umask=0x1"); > > EVENT_ATTR_STR(tx-commit, tx_commit, "event=0xc9,umask=0x2"); > > EVENT_ATTR_STR(tx-abort,tx_abort, "event=0xc9,umask=0x4"); > > EVENT_ATTR_STR(tx-capacity, tx_capacity,"event=0x54,umask=0x2"); > > EVENT_ATTR_STR(tx-conflict, tx_conflict,"event=0x54,umask=0x1"); > > EVENT_ATTR_STR(el-start,el_start, "event=0xc8,umask=0x1"); > > EVENT_ATTR_STR(el-commit, el_commit, "event=0xc8,umask=0x2"); > > EVENT_ATTR_STR(el-abort,el_abort, "event=0xc8,umask=0x4"); > > EVENT_ATTR_STR(el-capacity, el_capacity,"event=0x54,umask=0x2"); > > EVENT_ATTR_STR(el-conflict, el_conflict,"event=0x54,umask=0x1"); > > EVENT_ATTR_STR(cycles-t,cycles_t, "event=0x3c,in_tx=1"); > > EVENT_ATTR_STR(cycles-ct, cycles_ct, > > "event=0x3c,in_tx=1,in_tx_cp=1"); > > > > > > So s390 can only support tx_commit and tx-abort symbolic names. In detail, for s390 we have: cpum_cf/TX_C_TABORT_NO_SPECIAL/ cpum_cf/TX_C_TABORT_SPECIAL/ cpum_cf/TX_C_TEND/ cpum_cf/TX_NC_TABORT/ cpum_cf/TX_NC_TEND/ The mapping of the above is not that easy. As s390 have counters for non-constraint (TC_NC_*) and contraint (TX_C_*) transaction commits (TEND) and aborts (TABORTs). > We could change perf stat to fall back to only tx commit and tx abort. > We already did that for one limited case. Displaying these different types for s390 is important from my point of view. Of course, I could create a mapping of TX_NC_TABORT/TX_NC_TEND to tx-commit/tx-abort. The remaining events would still appear to be specific to the cpum_cf. So I would propose to go with adding the cpum_cf/ specific ones first. If necessary, they could go into the perf/arch/s390/ directory and included in builtin-stat. I put a todo on my list to provide at least a tx-commit/abort for the nonconstraint transactions. (The other would still be specific). Kind regards, Hendrik -- Hendrik Brueckner brueck...@linux.vnet.ibm.com | IBM Deutschland Research & Development GmbH Linux on z Systems Development| Schoenaicher Str. 220, 71032 Boeblingen IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martina Koederitz Geschaeftsfuehrung: Dirk Wittkopp Sitz der Gesellschaft: Boeblingen Registergericht: Amtsgericht Stuttgart, HRB 243294
Re: [PATCH] perf stat: Add support for s390 transaction counters
> S390 has no support for Elision and uses transaction begin/end/abort > instructions. The CPU measurement counter facility provides counters for > transaction end and transaction abort. You don't need to implement the el-* events. > I have used this table (taken from arch/x86/events/intel/core.c) as giudeline: > /* Haswell special events */ > EVENT_ATTR_STR(tx-start,tx_start, "event=0xc9,umask=0x1"); > EVENT_ATTR_STR(tx-commit, tx_commit, "event=0xc9,umask=0x2"); > EVENT_ATTR_STR(tx-abort,tx_abort, "event=0xc9,umask=0x4"); > EVENT_ATTR_STR(tx-capacity, tx_capacity,"event=0x54,umask=0x2"); > EVENT_ATTR_STR(tx-conflict, tx_conflict,"event=0x54,umask=0x1"); > EVENT_ATTR_STR(el-start,el_start, "event=0xc8,umask=0x1"); > EVENT_ATTR_STR(el-commit, el_commit, "event=0xc8,umask=0x2"); > EVENT_ATTR_STR(el-abort,el_abort, "event=0xc8,umask=0x4"); > EVENT_ATTR_STR(el-capacity, el_capacity,"event=0x54,umask=0x2"); > EVENT_ATTR_STR(el-conflict, el_conflict,"event=0x54,umask=0x1"); > EVENT_ATTR_STR(cycles-t,cycles_t, "event=0x3c,in_tx=1"); > EVENT_ATTR_STR(cycles-ct, cycles_ct, > "event=0x3c,in_tx=1,in_tx_cp=1"); > > > So s390 can only support tx_commit and tx-abort symbolic names. > None of them show up in the transactions_attrs and transaction_limited_attrs > string variables used in builtin-stat.c file. We could change perf stat to fall back to only tx commit and tx abort. We already did that for one limited case. -Andi
Re: [PATCH] perf stat: Add support for s390 transaction counters
On 03/14/2018 02:18 PM, Arnaldo Carvalho de Melo wrote: > Em Wed, Mar 14, 2018 at 09:34:48AM +0100, Thomas-Mich Richter escreveu: >> On 03/13/2018 04:23 AM, Andi Kleen wrote: >>> Thomas Richter writes: > Right now there is only hard coded support for x86. > >>> That's not true. There is support for generic transaction events in perf. > >>> As far as I can tell your events would map 1:1 to the generic tx-* events. > >> I might be wrong, but when I look at function add_default_attributes() >> in file buildin-stat.c the string variables transaction_attrs >> and transaction_limited_attrs are used when flag T is specified on command >> line: > >> /* Default events used for perf stat -T */ >> static const char *transaction_attrs = { >> "task-clock," >> "{" >> "instructions," >> "cycles," >> "cpu/cycles-t/," >> "cpu/tx-start/," >> "cpu/el-start/," >> "cpu/cycles-ct/" >> "}" >> }; > >> These PMU events show up on my x86 notebook but no on the s390. >> That's why I came to this conclusion. I have not tried other architectures. > > So, I think Andi is saying that the s/390 kernel should map the generic > transaction events (cpu/cycles-t/, cpu/tx-start/, etc) to the events you > want to make 'perf stat -T' use, that way 'perf stat' doesn't have to be > changed, it will continue asking for the generic transaction event names > and then the kernel does the translation. > > Just like we map "cycles" to different underlying events in different > architectures, right? > > - Arnaldo S390 has no support for Elision and uses transaction begin/end/abort instructions. The CPU measurement counter facility provides counters for transaction end and transaction abort. This means s390 counter facility device driver in arch/s390/kernel/perf_cpum_cf.c could support the tx_abort and tx_commit symbolic counter names. I have used this table (taken from arch/x86/events/intel/core.c) as giudeline: /* Haswell special events */ EVENT_ATTR_STR(tx-start,tx_start, "event=0xc9,umask=0x1"); EVENT_ATTR_STR(tx-commit, tx_commit, "event=0xc9,umask=0x2"); EVENT_ATTR_STR(tx-abort,tx_abort, "event=0xc9,umask=0x4"); EVENT_ATTR_STR(tx-capacity, tx_capacity,"event=0x54,umask=0x2"); EVENT_ATTR_STR(tx-conflict, tx_conflict,"event=0x54,umask=0x1"); EVENT_ATTR_STR(el-start,el_start, "event=0xc8,umask=0x1"); EVENT_ATTR_STR(el-commit, el_commit, "event=0xc8,umask=0x2"); EVENT_ATTR_STR(el-abort,el_abort, "event=0xc8,umask=0x4"); EVENT_ATTR_STR(el-capacity, el_capacity,"event=0x54,umask=0x2"); EVENT_ATTR_STR(el-conflict, el_conflict,"event=0x54,umask=0x1"); EVENT_ATTR_STR(cycles-t,cycles_t, "event=0x3c,in_tx=1"); EVENT_ATTR_STR(cycles-ct, cycles_ct, "event=0x3c,in_tx=1,in_tx_cp=1"); So s390 can only support tx_commit and tx-abort symbolic names. None of them show up in the transactions_attrs and transaction_limited_attrs string variables used in builtin-stat.c file. That is the reason why I introduced the patch set v2. -- Thomas Richter, Dept 3303, IBM LTC Boeblingen Germany -- Vorsitzende des Aufsichtsrats: Martina Koederitz Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294
Re: [PATCH] perf stat: Add support for s390 transaction counters
Em Wed, Mar 14, 2018 at 09:34:48AM +0100, Thomas-Mich Richter escreveu: > On 03/13/2018 04:23 AM, Andi Kleen wrote: > > Thomas Richter writes: > >> Right now there is only hard coded support for x86. > > That's not true. There is support for generic transaction events in perf. > > As far as I can tell your events would map 1:1 to the generic tx-* events. > I might be wrong, but when I look at function add_default_attributes() > in file buildin-stat.c the string variables transaction_attrs > and transaction_limited_attrs are used when flag T is specified on command > line: > /* Default events used for perf stat -T */ > static const char *transaction_attrs = { > "task-clock," > "{" > "instructions," > "cycles," > "cpu/cycles-t/," > "cpu/tx-start/," > "cpu/el-start/," > "cpu/cycles-ct/" > "}" > }; > These PMU events show up on my x86 notebook but no on the s390. > That's why I came to this conclusion. I have not tried other architectures. So, I think Andi is saying that the s/390 kernel should map the generic transaction events (cpu/cycles-t/, cpu/tx-start/, etc) to the events you want to make 'perf stat -T' use, that way 'perf stat' doesn't have to be changed, it will continue asking for the generic transaction event names and then the kernel does the translation. Just like we map "cycles" to different underlying events in different architectures, right? - Arnaldo
Re: [PATCH] perf stat: Add support for s390 transaction counters
On 03/13/2018 04:23 AM, Andi Kleen wrote: > Thomas Richter writes: > >> Right now there is only hard coded support for x86. > > That's not true. There is support for generic transaction events in perf. > > As far as I can tell your events would map 1:1 to the generic tx-* events. > > -Andi > I might be wrong, but when I look at function add_default_attributes() in file buildin-stat.c the string variables transaction_attrs and transaction_limited_attrs are used when flag T is specified on command line: /* Default events used for perf stat -T */ static const char *transaction_attrs = { "task-clock," "{" "instructions," "cycles," "cpu/cycles-t/," "cpu/tx-start/," "cpu/el-start/," "cpu/cycles-ct/" "}" }; These PMU events show up on my x86 notebook but no on the s390. That's why I came to this conclusion. I have not tried other architectures. -- Thomas Richter, Dept 3303, IBM LTC Boeblingen Germany -- Vorsitzende des Aufsichtsrats: Martina Koederitz Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294
Re: [PATCH] perf stat: Add support for s390 transaction counters
Thomas Richter writes: > Right now there is only hard coded support for x86. That's not true. There is support for generic transaction events in perf. As far as I can tell your events would map 1:1 to the generic tx-* events. -Andi
Re: [PATCH] perf stat: Add support for s390 transaction counters
Em Mon, Mar 12, 2018 at 11:38:06AM +0100, Thomas Richter escreveu: > This patch introduces support for s390 transaction counters > displayed with command 'perf stat -T -- sleep 2' > > Right now there is only hard coded support for x86. > > This patch introduces architecture specfic counter > tables for x86 and s390. The architecture is queried > and the event string for transaction counters is constructed > depending on the architecture and the CPU measurement > facility counter list. > > Output Before: > [root@s35lp76 perf]# perf stat -T -- sleep 1 > Cannot set up transaction events > [root@s35lp76 perf]# > > Output after: > [root@s35lp76 perf]# ./perf stat -T -- sleep 1 > > Performance counter stats for 'sleep 1': > >0.939985 task-clock (msec) #0.001 CPUs utilized > 2,557,145 instructions #0.53 insn per cycle > 4,785,929 cycles#5.091 GHz > 0 cpum_cf/TX_C_TABORT_NO_SPECIAL/ #0.000 K/sec > 0 cpum_cf/TX_C_TABORT_SPECIAL/ #0.000 K/sec > 0 cpum_cf/TX_C_TEND/#0.000 K/sec > 0 cpum_cf/TX_NC_TABORT/ #0.000 K/sec > 0 cpum_cf/TX_NC_TEND/ #0.000 K/sec > > 1.001934710 seconds time elapsed > > [root@s35lp76 perf]# > > Output on x86 is unchanged. Please break this patch in multiple ones, for instance, that part that converts the hardcoded string to a table with events, etc, should be done first, just for x86, then when you add the s/390 support that patch will have mostly + lines, i.e. will be _just_ for s/390. I.e. the logic that discovers which events should be used instead of trying out of two possible sets ("full featured" and that limited one) is an improvement that works for x86 or any other arch where there are processors with different sets of transaction counters. See other comments below. > Signed-off-by: Thomas Richter > Reviewed--by: Hendrik Brueckner > --- > tools/perf/builtin-stat.c | 162 > ++ > 1 file changed, 135 insertions(+), 27 deletions(-) > > diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c > index 86c8c8a9229c..75b4f1c9cd78 100644 > --- a/tools/perf/builtin-stat.c > +++ b/tools/perf/builtin-stat.c > @@ -95,22 +95,8 @@ static const char *transaction_attrs = { > "task-clock," > "{" > "instructions," > - "cycles," > - "cpu/cycles-t/," > - "cpu/tx-start/," > - "cpu/el-start/," > - "cpu/cycles-ct/" > - "}" > -}; > - > -/* More limited version when the CPU does not have all events. */ > -static const char * transaction_limited_attrs = { > - "task-clock," > - "{" > - "instructions," > - "cycles," > - "cpu/cycles-t/," > - "cpu/tx-start/" > + "cycles" > + "%s" > "}" > }; > > @@ -2149,13 +2135,123 @@ __weak void arch_topdown_group_warn(void) > { > } > > +struct pmu_tx_events { /* Define transaction counters > */ > + const char *pmu;/* PMU name */ > + const char *name; /* Counter name */ > +}; > + > +static struct pmu_tx_events x86_tx_events[] = {/* x86 transaction counters */ > + { > + .pmu = "cpu", > + .name = "cycles-t", > + }, > + { > + .pmu = "cpu", > + .name = "tx-start", > + }, > + { > + .pmu = "cpu", > + .name = "el-start", > + }, > + { > + .pmu = "cpu", > + .name = "cycles-ct", > + }, > + { > + .pmu = 0 > + } > +}; > + > +static struct pmu_tx_events s390_tx_events[] = {/* s390 transaction counters > */ > + { > + .pmu = "cpum_cf", > + .name = "TX_C_TABORT_NO_SPECIAL", > + }, > + { > + .pmu = "cpum_cf", > + .name = "TX_C_TABORT_SPECIAL", > + }, > + { > + .pmu = "cpum_cf", > + .name = "TX_C_TEND", > + }, > + { > + .pmu = "cpum_cf", > + .name = "TX_NC_TABORT", > + }, > + { > + .pmu = "cpum_cf", > + .name = "TX_NC_TEND", > + }, > + { > + .pmu = 0 > + } > +}; > + > +struct arch_pmu_tx_events { > + const char *archname; /* Architecture name */ > + struct pmu_tx_events *tx; /* Architecture specific counters */ > +} arch_pmu_tx_events[] = { > + { > + .archname = "s390", > + .tx = s390_tx_events > + }, > + { > + .archname = "x86", > + .tx = x86_tx_events > + }, > + { > + .archname = 0 > + } > +}; > + > +static struct pmu_tx_events *pmu_tx_find_arch(const char *name) > +{ > + struct arch_pmu_tx_events *p = arch_pmu_tx_events; > + > + for (; p->archname; ++p) > + if (!strcmp(name, p->archname)) > + return p->tx; > + return N
[PATCH] perf stat: Add support for s390 transaction counters
This patch introduces support for s390 transaction counters displayed with command 'perf stat -T -- sleep 2' Right now there is only hard coded support for x86. This patch introduces architecture specfic counter tables for x86 and s390. The architecture is queried and the event string for transaction counters is constructed depending on the architecture and the CPU measurement facility counter list. Output Before: [root@s35lp76 perf]# perf stat -T -- sleep 1 Cannot set up transaction events [root@s35lp76 perf]# Output after: [root@s35lp76 perf]# ./perf stat -T -- sleep 1 Performance counter stats for 'sleep 1': 0.939985 task-clock (msec) #0.001 CPUs utilized 2,557,145 instructions #0.53 insn per cycle 4,785,929 cycles#5.091 GHz 0 cpum_cf/TX_C_TABORT_NO_SPECIAL/ #0.000 K/sec 0 cpum_cf/TX_C_TABORT_SPECIAL/ #0.000 K/sec 0 cpum_cf/TX_C_TEND/#0.000 K/sec 0 cpum_cf/TX_NC_TABORT/ #0.000 K/sec 0 cpum_cf/TX_NC_TEND/ #0.000 K/sec 1.001934710 seconds time elapsed [root@s35lp76 perf]# Output on x86 is unchanged. Signed-off-by: Thomas Richter Reviewed--by: Hendrik Brueckner --- tools/perf/builtin-stat.c | 162 ++ 1 file changed, 135 insertions(+), 27 deletions(-) diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c index 86c8c8a9229c..75b4f1c9cd78 100644 --- a/tools/perf/builtin-stat.c +++ b/tools/perf/builtin-stat.c @@ -95,22 +95,8 @@ static const char *transaction_attrs = { "task-clock," "{" "instructions," - "cycles," - "cpu/cycles-t/," - "cpu/tx-start/," - "cpu/el-start/," - "cpu/cycles-ct/" - "}" -}; - -/* More limited version when the CPU does not have all events. */ -static const char * transaction_limited_attrs = { - "task-clock," - "{" - "instructions," - "cycles," - "cpu/cycles-t/," - "cpu/tx-start/" + "cycles" + "%s" "}" }; @@ -2149,13 +2135,123 @@ __weak void arch_topdown_group_warn(void) { } +struct pmu_tx_events { /* Define transaction counters */ + const char *pmu;/* PMU name */ + const char *name; /* Counter name */ +}; + +static struct pmu_tx_events x86_tx_events[] = {/* x86 transaction counters */ + { + .pmu = "cpu", + .name = "cycles-t", + }, + { + .pmu = "cpu", + .name = "tx-start", + }, + { + .pmu = "cpu", + .name = "el-start", + }, + { + .pmu = "cpu", + .name = "cycles-ct", + }, + { + .pmu = 0 + } +}; + +static struct pmu_tx_events s390_tx_events[] = {/* s390 transaction counters */ + { + .pmu = "cpum_cf", + .name = "TX_C_TABORT_NO_SPECIAL", + }, + { + .pmu = "cpum_cf", + .name = "TX_C_TABORT_SPECIAL", + }, + { + .pmu = "cpum_cf", + .name = "TX_C_TEND", + }, + { + .pmu = "cpum_cf", + .name = "TX_NC_TABORT", + }, + { + .pmu = "cpum_cf", + .name = "TX_NC_TEND", + }, + { + .pmu = 0 + } +}; + +struct arch_pmu_tx_events { + const char *archname; /* Architecture name */ + struct pmu_tx_events *tx; /* Architecture specific counters */ +} arch_pmu_tx_events[] = { + { + .archname = "s390", + .tx = s390_tx_events + }, + { + .archname = "x86", + .tx = x86_tx_events + }, + { + .archname = 0 + } +}; + +static struct pmu_tx_events *pmu_tx_find_arch(const char *name) +{ + struct arch_pmu_tx_events *p = arch_pmu_tx_events; + + for (; p->archname; ++p) + if (!strcmp(name, p->archname)) + return p->tx; + return NULL; +} + +/* Search the list of transaction events and test if they are valid for + * this PMU. The events are named cpu/cycles-t/ or cpum_cf/TX_NC_TEND/ + * that is the PMU name followed by the event name surrounded by slashes. + * + * The function returns a string which contains the tested (and existing) + * PMU transaction events. The caller must free this string. + * + * If no PMU transaction events are found, a NULL pointer is returned. + */ +static char *build_tx_string(const char *fmt, struct pmu_tx_events *txp) +{ + struct pmu_tx_events *p = txp; + char buffer[512], *result; + int rc, i = 0; + + memset(buffer, 0, sizeof(buffer)); + for (p = txp; p->pmu; ++p) { + if (pmu_have_event(p->pmu, p->name)) { + rc = s