Re: [PATCH 1/5] perf/x86/intel/uncore: Parse uncore discovery tables

2021-03-16 Thread Liang, Kan




On 3/16/2021 10:05 AM, Peter Zijlstra wrote:
> On Tue, Mar 16, 2021 at 08:42:25AM -0400, Liang, Kan wrote:
> >
> > On 3/16/2021 7:43 AM, Peter Zijlstra wrote:
> > > On Fri, Mar 12, 2021 at 08:34:34AM -0800, kan.li...@linux.intel.com wrote:
> > > > From: Kan Liang
> > > >
> > > > A self-describing mechanism for the uncore PerfMon hardware has been
> > > > introduced with the latest Intel platforms. By reading through an MMIO
> > > > page worth of information, perf can 'discover' all the standard uncore
> > > > PerfMon registers in a machine.
> > > >
> > > > The discovery mechanism relies on BIOS's support. With a proper BIOS,
> > > > a PCI device with the unique capability ID 0x23 can be found on each
> > > > die. Perf can retrieve the information of all available uncore PerfMons
> > > > from the device via MMIO. The information is composed of one global
> > > > discovery table and several unit discovery tables.
> > >
> > > > If a BIOS doesn't support the 'discovery' mechanism, nothing changes.
> > >
> > > What if the BIOS got it wrong? Will the driver still get it correct if
> > > it is a known platform?
> >
> > Yes, I will submit a platform-specific patch to fix this case.
> >
> > > Do we need a chicken flag to kill the discovery? uncore_no_discover?
> >
> > Yes, I plan to introduce a .use_discovery_tables flag to indicate whether to
> > use the discovery tables for the known platform.
> >
> > The code below is part of the upcoming SPR uncore patches.
> > The first SPR uncore patch will still rely on the BIOS discovery tables,
> > because some uncore block information hasn't been published yet. We have to
> > retrieve the information from the tables. Once all the information is
> > published, we can kill the discovery by removing the ".use_discovery_tables
> > = true".
>
> I was thinking of a module parameter, such that we can tell it to skip
> discovery on module load time etc.

Sure, I will add a module parameter, uncore_no_discover.
If users don't want the discovery feature, they can set
uncore_no_discover=true.


Thanks,
Kan
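
To make the fallback discussed above concrete, here is a minimal user-space sketch of how a per-platform use_discovery_tables flag and an uncore_no_discover switch might interact. The init_fun struct, select_init(), and the choice to return NULL (standing in for -ENODEV) when a platform needs tables that are disabled or absent are all assumptions for illustration, not the driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct intel_uncore_init_fun. */
struct init_fun {
	const char *name;
	bool use_discovery_tables;	/* platform still depends on the tables */
};

/* Fallback descriptor for CPUs that are not in the match table. */
static const struct init_fun generic_init = { "generic", true };

/*
 * Pick an init descriptor.
 *   matched:     descriptor of a known CPU model, or NULL if unknown
 *   no_discover: value of the proposed uncore_no_discover parameter
 *   has_tables:  whether discovery tables were found on this machine
 * Returns NULL when no uncore support can be offered.
 */
static const struct init_fun *
select_init(const struct init_fun *matched, bool no_discover, bool has_tables)
{
	bool tables_usable = !no_discover && has_tables;

	if (!matched)
		return tables_usable ? &generic_init : NULL;

	/* A known platform that needs the tables cannot work without them. */
	if (matched->use_discovery_tables && !tables_usable)
		return NULL;

	return matched;
}

/* Human-readable result, handy for logging in a caller. */
static const char *init_name(const struct init_fun *init)
{
	assert(init != NULL);
	return init->name;
}
```

With this shape, clearing use_discovery_tables for a platform once its registers are documented makes that platform immune to both a broken BIOS table and the kill switch.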


Re: [PATCH 1/5] perf/x86/intel/uncore: Parse uncore discovery tables

2021-03-16 Thread Peter Zijlstra
On Tue, Mar 16, 2021 at 08:42:25AM -0400, Liang, Kan wrote:
> 
> 
> On 3/16/2021 7:43 AM, Peter Zijlstra wrote:
> > On Fri, Mar 12, 2021 at 08:34:34AM -0800, kan.li...@linux.intel.com wrote:
> > > From: Kan Liang 
> > > 
> > > A self-describing mechanism for the uncore PerfMon hardware has been
> > > introduced with the latest Intel platforms. By reading through an MMIO
> > > page worth of information, perf can 'discover' all the standard uncore
> > > PerfMon registers in a machine.
> > > 
> > > The discovery mechanism relies on BIOS's support. With a proper BIOS,
> > > a PCI device with the unique capability ID 0x23 can be found on each
> > > die. Perf can retrieve the information of all available uncore PerfMons
> > > from the device via MMIO. The information is composed of one global
> > > discovery table and several unit discovery tables.
> > 
> > > If a BIOS doesn't support the 'discovery' mechanism, nothing changes.
> > 
> > What if the BIOS got it wrong? Will the driver still get it correct if
> > it is a known platform?
> 
> Yes, I will submit a platform specific patch to fix this case.
> 
> > 
> > Do we need a chicken flag to kill the discovery? uncore_no_discover?
> > 
> 
> Yes, I plan to introduce a .use_discovery_tables flag to indicate whether to
> use the discovery tables for the known platform.
> 
> The code below is part of the upcoming SPR uncore patches.
> The first SPR uncore patch will still rely on the BIOS discovery tables,
> because some uncore block information hasn't been published yet. We have to
> retrieve the information from the tables. Once all the information is
> published, we can kill the discovery by removing the ".use_discovery_tables
> = true".

I was thinking of a module parameter, such that we can tell it to skip
discovery on module load time etc.


Re: [PATCH 1/5] perf/x86/intel/uncore: Parse uncore discovery tables

2021-03-16 Thread Liang, Kan




On 3/16/2021 7:43 AM, Peter Zijlstra wrote:
> On Fri, Mar 12, 2021 at 08:34:34AM -0800, kan.li...@linux.intel.com wrote:
> > From: Kan Liang
> >
> > A self-describing mechanism for the uncore PerfMon hardware has been
> > introduced with the latest Intel platforms. By reading through an MMIO
> > page worth of information, perf can 'discover' all the standard uncore
> > PerfMon registers in a machine.
> >
> > The discovery mechanism relies on BIOS's support. With a proper BIOS,
> > a PCI device with the unique capability ID 0x23 can be found on each
> > die. Perf can retrieve the information of all available uncore PerfMons
> > from the device via MMIO. The information is composed of one global
> > discovery table and several unit discovery tables.
>
> > If a BIOS doesn't support the 'discovery' mechanism, nothing changes.
>
> What if the BIOS got it wrong? Will the driver still get it correct if
> it is a known platform?

Yes, I will submit a platform-specific patch to fix this case.

> Do we need a chicken flag to kill the discovery? uncore_no_discover?

Yes, I plan to introduce a .use_discovery_tables flag to indicate
whether to use the discovery tables for the known platform.

The code below is part of the upcoming SPR uncore patches.
The first SPR uncore patch will still rely on the BIOS discovery tables,
because some uncore block information hasn't been published yet. We have
to retrieve the information from the tables. Once all the information is
published, we can kill the discovery by removing the
".use_discovery_tables = true".

+static const struct intel_uncore_init_fun spr_uncore_init __initconst = {
+	.cpu_init = spr_uncore_cpu_init,
+	.pci_init = spr_uncore_pci_init,
+	.mmio_init = spr_uncore_mmio_init,
+	.use_discovery_tables = true,
+};

Thanks,
Kan


Re: [PATCH 1/5] perf/x86/intel/uncore: Parse uncore discovery tables

2021-03-16 Thread Liang, Kan




On 3/16/2021 7:40 AM, Peter Zijlstra wrote:
> On Fri, Mar 12, 2021 at 08:34:34AM -0800, kan.li...@linux.intel.com wrote:
> > +static struct intel_uncore_discovery_type *
> > +search_uncore_discovery_type(u16 type_id)
> > +{
> > +	struct rb_node *node = discovery_tables.rb_node;
> > +	struct intel_uncore_discovery_type *type;
> > +
> > +	while (node) {
> > +		type = rb_entry(node, struct intel_uncore_discovery_type, node);
> > +
> > +		if (type->type > type_id)
> > +			node = node->rb_left;
> > +		else if (type->type < type_id)
> > +			node = node->rb_right;
> > +		else
> > +			return type;
> > +	}
> > +
> > +	return NULL;
> > +}
> > +
> > +static struct intel_uncore_discovery_type *
> > +add_uncore_discovery_type(struct uncore_unit_discovery *unit)
> > +{
> > +	struct intel_uncore_discovery_type *type, *cur;
> > +	struct rb_node **node = &discovery_tables.rb_node;
> > +	struct rb_node *parent = *node;
> > +
> > +	if (unit->access_type >= UNCORE_ACCESS_MAX) {
> > +		pr_warn("Unsupported access type %d\n", unit->access_type);
> > +		return NULL;
> > +	}
> > +
> > +	type = kzalloc(sizeof(struct intel_uncore_discovery_type), GFP_KERNEL);
> > +	if (!type)
> > +		return NULL;
> > +
> > +	type->box_ctrl_die = kcalloc(__uncore_max_dies, sizeof(u64), GFP_KERNEL);
> > +	if (!type->box_ctrl_die)
> > +		goto free_type;
> > +
> > +	type->access_type = unit->access_type;
> > +	num_discovered_types[type->access_type]++;
> > +	type->type = unit->box_type;
> > +
> > +	while (*node) {
> > +		parent = *node;
> > +		cur = rb_entry(parent, struct intel_uncore_discovery_type, node);
> > +
> > +		if (cur->type > type->type)
> > +			node = &parent->rb_left;
> > +		else
> > +			node = &parent->rb_right;
> > +	}
> > +
> > +	rb_link_node(&type->node, parent, node);
> > +	rb_insert_color(&type->node, &discovery_tables);
> > +
> > +	return type;
> > +
> > +free_type:
> > +	kfree(type);
> > +
> > +	return NULL;
> > +
> > +}
>
> I'm thinking this can use some of this:
>
>   2d24dd5798d0 ("rbtree: Add generic add and find helpers")

Sure, I will use the generic rbtree framework in V2.

Thanks,
Kan
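
As a rough illustration of what moving to comparator-based helpers buys, the sketch below factors the open-coded descent loops above into one generic add and one generic find that take callbacks. It is a user-space approximation under stated assumptions: it uses a plain unbalanced binary search tree instead of the kernel's red-black tree, and tree_add(), tree_find(), struct node, and struct disc_type are hypothetical names, not the kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Unbalanced BST link; the kernel would embed a struct rb_node instead. */
struct node {
	struct node *left, *right;
};

/* Hypothetical stand-in for struct intel_uncore_discovery_type. */
struct disc_type {
	uint16_t type;
	struct node node;
};

/* Recover the containing object from its embedded node. */
#define node_to_type(n) \
	((struct disc_type *)((char *)(n) - offsetof(struct disc_type, node)))

/* Generic insert: the caller supplies the ordering via less(). */
static void tree_add(struct node *n, struct node **root,
		     bool (*less)(const struct node *, const struct node *))
{
	struct node **link = root;

	assert(n != NULL && root != NULL);
	while (*link)
		link = less(n, *link) ? &(*link)->left : &(*link)->right;
	n->left = n->right = NULL;
	*link = n;
}

/* Generic lookup: cmp() returns <0, 0 or >0 for key versus node. */
static struct node *tree_find(const void *key, struct node *root,
			      int (*cmp)(const void *, const struct node *))
{
	while (root) {
		int c = cmp(key, root);

		if (c < 0)
			root = root->left;
		else if (c > 0)
			root = root->right;
		else
			return root;
	}
	return NULL;
}

static bool type_less(const struct node *a, const struct node *b)
{
	return node_to_type(a)->type < node_to_type(b)->type;
}

static int type_cmp(const void *key, const struct node *n)
{
	uint16_t id = *(const uint16_t *)key;
	uint16_t t = node_to_type(n)->type;

	return (id > t) - (id < t);
}

/* Type-specific wrapper, analogous to search_uncore_discovery_type(). */
static struct disc_type *search_type(struct node *root, uint16_t id)
{
	struct node *n = tree_find(&id, root, type_cmp);

	return n ? node_to_type(n) : NULL;
}
```

The type-specific code shrinks to the two small callbacks, and the descent logic lives in one place.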


Re: [PATCH 1/5] perf/x86/intel/uncore: Parse uncore discovery tables

2021-03-16 Thread Peter Zijlstra
On Fri, Mar 12, 2021 at 08:34:34AM -0800, kan.li...@linux.intel.com wrote:
> From: Kan Liang 
> 
> A self-describing mechanism for the uncore PerfMon hardware has been
> introduced with the latest Intel platforms. By reading through an MMIO
> page worth of information, perf can 'discover' all the standard uncore
> PerfMon registers in a machine.
> 
> The discovery mechanism relies on BIOS's support. With a proper BIOS,
> a PCI device with the unique capability ID 0x23 can be found on each
> die. Perf can retrieve the information of all available uncore PerfMons
> from the device via MMIO. The information is composed of one global
> discovery table and several unit discovery tables.

> If a BIOS doesn't support the 'discovery' mechanism, nothing changes.

What if the BIOS got it wrong? Will the driver still get it correct if
it is a known platform?

Do we need a chicken flag to kill the discovery? uncore_no_discover?


Re: [PATCH 1/5] perf/x86/intel/uncore: Parse uncore discovery tables

2021-03-16 Thread Peter Zijlstra
On Fri, Mar 12, 2021 at 08:34:34AM -0800, kan.li...@linux.intel.com wrote:
> +static struct intel_uncore_discovery_type *
> +search_uncore_discovery_type(u16 type_id)
> +{
> +	struct rb_node *node = discovery_tables.rb_node;
> +	struct intel_uncore_discovery_type *type;
> +
> +	while (node) {
> +		type = rb_entry(node, struct intel_uncore_discovery_type, node);
> +
> +		if (type->type > type_id)
> +			node = node->rb_left;
> +		else if (type->type < type_id)
> +			node = node->rb_right;
> +		else
> +			return type;
> +	}
> +
> +	return NULL;
> +}
> +
> +static struct intel_uncore_discovery_type *
> +add_uncore_discovery_type(struct uncore_unit_discovery *unit)
> +{
> +	struct intel_uncore_discovery_type *type, *cur;
> +	struct rb_node **node = &discovery_tables.rb_node;
> +	struct rb_node *parent = *node;
> +
> +	if (unit->access_type >= UNCORE_ACCESS_MAX) {
> +		pr_warn("Unsupported access type %d\n", unit->access_type);
> +		return NULL;
> +	}
> +
> +	type = kzalloc(sizeof(struct intel_uncore_discovery_type), GFP_KERNEL);
> +	if (!type)
> +		return NULL;
> +
> +	type->box_ctrl_die = kcalloc(__uncore_max_dies, sizeof(u64), GFP_KERNEL);
> +	if (!type->box_ctrl_die)
> +		goto free_type;
> +
> +	type->access_type = unit->access_type;
> +	num_discovered_types[type->access_type]++;
> +	type->type = unit->box_type;
> +
> +	while (*node) {
> +		parent = *node;
> +		cur = rb_entry(parent, struct intel_uncore_discovery_type, node);
> +
> +		if (cur->type > type->type)
> +			node = &parent->rb_left;
> +		else
> +			node = &parent->rb_right;
> +	}
> +
> +	rb_link_node(&type->node, parent, node);
> +	rb_insert_color(&type->node, &discovery_tables);
> +
> +	return type;
> +
> +free_type:
> +	kfree(type);
> +
> +	return NULL;
> +
> +}

I'm thinking this can use some of this:

  2d24dd5798d0 ("rbtree: Add generic add and find helpers")


[PATCH 1/5] perf/x86/intel/uncore: Parse uncore discovery tables

2021-03-12 Thread kan . liang
From: Kan Liang 

A self-describing mechanism for the uncore PerfMon hardware has been
introduced with the latest Intel platforms. By reading through an MMIO
page worth of information, perf can 'discover' all the standard uncore
PerfMon registers in a machine.

The discovery mechanism relies on BIOS's support. With a proper BIOS,
a PCI device with the unique capability ID 0x23 can be found on each
die. Perf can retrieve the information of all available uncore PerfMons
from the device via MMIO. The information is composed of one global
discovery table and several unit discovery tables.
- The global discovery table includes global uncore information of the
  die, e.g., the address of the global control register, the offset of
  the global status register, the number of uncore units, the offset of
  unit discovery tables, etc.
- The unit discovery table includes generic uncore unit information,
  e.g., the access type, the counter width, the address of counters,
  the address of the counter control, the unit ID, the unit type, etc.
  The unit is also called "box" in the code.
Perf can provide basic uncore support based on this information
with the following patches.
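
As a rough picture of the two-level layout described above, the following user-space sketch walks a global table followed by an array of unit tables and keeps only the usable units. The struct layouts, field names, and validity checks are simplified assumptions; the real tables are bit-packed and read via MMIO.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical access types; ACCESS_MAX marks the first invalid value. */
enum access_type { ACCESS_MSR, ACCESS_MMIO, ACCESS_PCI, ACCESS_MAX };

/* Simplified global discovery table: one per die. */
struct global_info {
	uint64_t global_ctl;	/* address of the global control register */
	uint32_t num_units;	/* number of unit tables that follow */
};

/* Simplified unit discovery table: one per uncore unit ("box"). */
struct unit_info {
	uint8_t  access_type;
	uint8_t  counter_width;
	uint16_t box_type;
	uint16_t box_id;
	uint64_t box_ctl;	/* address of the unit control register */
	uint64_t counter_base;	/* address of the first counter */
};

/*
 * Copy every usable unit into out[] (at most cap entries) and return
 * how many were kept. Units with no control address or an unknown
 * access type are skipped; both checks are assumptions of this sketch.
 */
static size_t collect_units(const struct global_info *g,
			    const struct unit_info *units,
			    struct unit_info *out, size_t cap)
{
	size_t n = 0;

	assert(g != NULL && units != NULL && out != NULL);
	for (uint32_t i = 0; i < g->num_units && n < cap; i++) {
		if (units[i].box_ctl == 0)
			continue;
		if (units[i].access_type >= ACCESS_MAX)
			continue;
		out[n++] = units[i];
	}
	return n;
}
```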

To locate the PCI device with the discovery tables, check the generic
PCI ID first. If it doesn't match, go through the entire PCI device tree
and locate the device with the unique capability ID.
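
The two-step lookup described above can be sketched in user space as follows. The pci_dev_info struct and find_discovery_dev() are hypothetical, the generic device ID is passed in rather than assumed, and only the capability ID 0x23 comes from the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DISCOVERY_CAP_ID 0x23	/* capability ID named in the description */

/* Hypothetical, minimal view of a PCI device. */
struct pci_dev_info {
	uint16_t device;	/* PCI device ID */
	const uint16_t *caps;	/* capability IDs the device exposes */
	size_t num_caps;
};

static bool has_cap(const struct pci_dev_info *dev, uint16_t cap_id)
{
	for (size_t i = 0; i < dev->num_caps; i++)
		if (dev->caps[i] == cap_id)
			return true;
	return false;
}

/*
 * Step 1: look for a device with the generic device ID.
 * Step 2: if none matches, walk every device and look for the
 *         discovery capability instead.
 * Returns NULL if the platform exposes no discovery device.
 */
static const struct pci_dev_info *
find_discovery_dev(const struct pci_dev_info *devs, size_t n,
		   uint16_t generic_id)
{
	assert(devs != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		if (devs[i].device == generic_id)
			return &devs[i];

	for (size_t i = 0; i < n; i++)
		if (has_cap(&devs[i], DISCOVERY_CAP_ID))
			return &devs[i];

	return NULL;
}
```

The first pass is cheap; the full walk only runs on platforms where the generic ID is absent.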

The uncore information is similar among dies. To save parsing time and
space, only completely parse and store the discovery tables on the first
die and the first box of each die. The parsed information is stored in an
RB tree structure, intel_uncore_discovery_type. The size of the stored
discovery tables varies among platforms. It's around 4KB for a Sapphire
Rapids server.

If a BIOS doesn't support the 'discovery' mechanism, nothing changes.

Signed-off-by: Kan Liang 
---
 arch/x86/events/intel/Makefile   |   2 +-
 arch/x86/events/intel/uncore.c   |  27 ++-
 arch/x86/events/intel/uncore_discovery.c | 322 +++
 arch/x86/events/intel/uncore_discovery.h | 105 ++
 4 files changed, 448 insertions(+), 8 deletions(-)
 create mode 100644 arch/x86/events/intel/uncore_discovery.c
 create mode 100644 arch/x86/events/intel/uncore_discovery.h

diff --git a/arch/x86/events/intel/Makefile b/arch/x86/events/intel/Makefile
index e67a588..10bde6c 100644
--- a/arch/x86/events/intel/Makefile
+++ b/arch/x86/events/intel/Makefile
@@ -3,6 +3,6 @@ obj-$(CONFIG_CPU_SUP_INTEL) += core.o bts.o
 obj-$(CONFIG_CPU_SUP_INTEL)+= ds.o knc.o
 obj-$(CONFIG_CPU_SUP_INTEL)+= lbr.o p4.o p6.o pt.o
 obj-$(CONFIG_PERF_EVENTS_INTEL_UNCORE) += intel-uncore.o
-intel-uncore-objs  := uncore.o uncore_nhmex.o uncore_snb.o uncore_snbep.o
+intel-uncore-objs  := uncore.o uncore_nhmex.o uncore_snb.o uncore_snbep.o uncore_discovery.o
 obj-$(CONFIG_PERF_EVENTS_INTEL_CSTATE) += intel-cstate.o
 intel-cstate-objs  := cstate.o
diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index 33c8180..f5b5b8b 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -4,6 +4,7 @@
 #include <asm/cpu_device_id.h>
 #include <asm/intel-family.h>
 #include "uncore.h"
+#include "uncore_discovery.h"
 
 static struct intel_uncore_type *empty_uncore[] = { NULL, };
 struct intel_uncore_type **uncore_msr_uncores = empty_uncore;
@@ -1637,6 +1638,9 @@ static const struct intel_uncore_init_fun snr_uncore_init __initconst = {
 	.mmio_init = snr_uncore_mmio_init,
 };
 
+static const struct intel_uncore_init_fun generic_uncore_init __initconst = {
+};
+
 static const struct x86_cpu_id intel_uncore_match[] __initconst = {
 	X86_MATCH_INTEL_FAM6_MODEL(NEHALEM_EP,	&nhm_uncore_init),
 	X86_MATCH_INTEL_FAM6_MODEL(NEHALEM,	&nhm_uncore_init),
@@ -1684,17 +1688,21 @@ static int __init intel_uncore_init(void)
 	struct intel_uncore_init_fun *uncore_init;
 	int pret = 0, cret = 0, mret = 0, ret;
 
-	id = x86_match_cpu(intel_uncore_match);
-	if (!id)
-		return -ENODEV;
-
 	if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
 		return -ENODEV;
 
 	__uncore_max_dies =
 		topology_max_packages() * topology_max_die_per_package();
 
-	uncore_init = (struct intel_uncore_init_fun *)id->driver_data;
+	id = x86_match_cpu(intel_uncore_match);
+	if (!id) {
+		if (intel_uncore_has_discovery_tables())
+			uncore_init = (struct intel_uncore_init_fun *)&generic_uncore_init;
+		else
+			return -ENODEV;
+	} else
+		uncore_init = (struct intel_uncore_init_fun *)id->driver_data;
+
 	if (uncore_init->pci_init) {
 		pret = uncore_init->pci_init();
 		if (!pret)
@@ -1711,8 +1719,10 @@ static int __init intel_uncore_init(void)
mret = uncore_mmio_init();
}
 
-