On 3/7/2018 3:33 PM, Kroening, Gary wrote:
For systems with a single PCI segment, it is sufficient to look for the
bus number to change in order to determine that all of the CHAs have
been counted for a single socket.

However, for multi PCI segment systems, each socket is given a new
segment and the bus number does NOT change.  So looking only for the
bus number to change ends up counting all of the CHAs on all sockets
in the system.  This leads to writing CPU MSRs beyond a valid range and
causes an error in ivbep_uncore_msr_init_box().

The fix is to check for either the bus number or segment number to change.


Hi Gary,

The uncore documentation recommends a way to query the number of CHAs on Skylake server.
I have a patch that implements it.

Could you please take a look at the patch and see if it can fix your issue?


Thanks,
Kan

------
From 55f54b2fa3021c691c2fd4f5cfc8f441fd104e91 Mon Sep 17 00:00:00 2001
From: Kan Liang <kan.li...@linux.intel.com>
Date: Mon, 12 Mar 2018 13:03:40 -0700
Subject: [PATCH] perf/x86/intel/uncore: Querying number of CHAs from CAPID6 register

The number of CHAs is miscalculated on multi PCI domain systems on
Skylake server.

(From Kroening, Gary:

For systems with a single PCI segment, it is sufficient to look for the
bus number to change in order to determine that all of the CHAs have
been counted for a single socket.
However, for multi PCI segment systems, each socket is given a new
segment and the bus number does NOT change.  So looking only for the
bus number to change ends up counting all of the CHAs on all sockets
in the system.  This leads to writing CPU MSRs beyond a valid range and
causes an error in ivbep_uncore_msr_init_box().)

To determine the number of CHAs, the driver should read bits 27:0 of
the CAPID6 register located at Device 30, Function 3, Offset 0x9C.
These 28 bits form a bit vector of available LLC slices and the CHAs
that manage those slices, so the number of set bits is the number of
CHAs.

Fixes: cd34cd97b7b4 ("perf/x86/intel/uncore: Add Skylake server uncore support")
Reported-by: Kroening, Gary <gary.kroen...@hpe.com>
Signed-off-by: Kan Liang <kan.li...@linux.intel.com>
---
 arch/x86/events/intel/uncore_snbep.c | 24 ++++++++++--------------
 1 file changed, 10 insertions(+), 14 deletions(-)

diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
index d4672ed..a42715b 100644
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -3575,24 +3575,20 @@ static struct intel_uncore_type *skx_msr_uncores[] = {
        NULL,
 };

+#define SKX_CAPID6             0x9c
+#define SKX_CHA_BIT_WIDTH      28
+
 static int skx_count_chabox(void)
 {
-       struct pci_dev *chabox_dev = NULL;
-       int bus, count = 0;
+       struct pci_dev *dev = NULL;
+       u32 val = 0;

-       while (1) {
-               chabox_dev = pci_get_device(PCI_VENDOR_ID_INTEL, 0x208d, chabox_dev);
-               if (!chabox_dev)
-                       break;
-               if (count == 0)
-                       bus = chabox_dev->bus->number;
-               if (bus != chabox_dev->bus->number)
-                       break;
-               count++;
-       }
+       dev = pci_get_device(PCI_VENDOR_ID_INTEL, 0x2083, dev);
+       if (!dev)
+               return 0;

-       pci_dev_put(chabox_dev);
-       return count;
+       pci_read_config_dword(dev, SKX_CAPID6, &val);
+       return bitmap_weight((unsigned long *)&val, SKX_CHA_BIT_WIDTH);
 }

 void skx_uncore_cpu_init(void)
--
2.7.4
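
For readers who want to check the counting step outside the kernel, here
is a minimal userspace sketch of what the commit message describes: mask
a raw CAPID6 value down to bits 27:0 and count the set bits. The names
capid6_count_chas, CAPID6_CHA_WIDTH and CAPID6_CHA_MASK are hypothetical
and are not part of the patch; the function takes the register value as
an argument rather than reading PCI config space.

```c
#include <stdint.h>

/* Hypothetical names for illustration; not taken from the kernel tree. */
#define CAPID6_CHA_WIDTH 28u
#define CAPID6_CHA_MASK  ((UINT32_C(1) << CAPID6_CHA_WIDTH) - 1u)

/*
 * Count the CHAs advertised by a raw CAPID6 value.
 * Only bits 27:0 are part of the bit vector; bits 31:28 are ignored.
 */
static unsigned int capid6_count_chas(uint32_t capid6)
{
	uint32_t v = capid6 & CAPID6_CHA_MASK;
	unsigned int n = 0;

	while (v) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}
```

With this sketch, a value with all 28 low bits set yields 28, and any
bits set above bit 27 do not affect the result.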
