Re: [PATCH] MAINTAINERS: cxl: update maintainership
Acked-by: Ian Munsie <imun...@au1.ibm.com>

Excerpts from andrew.donnellan's message of 2017-06-28 17:22:30 +1000:
> As Ian's stepping down from his maintainer role now that he's leaving IBM,
> Frederic has asked me to add myself to the cxl maintainer list. Updating
> accordingly.
>
> Cc: Frederic Barrat <fbar...@linux.vnet.ibm.com>
> Cc: Ian Munsie <imun...@au1.ibm.com>
> Signed-off-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com>
>
> ---
>
> Applies on top of http://patchwork.ozlabs.org/patch/781464
> ---
>  MAINTAINERS | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 6547fc1b5299..b1fd5a02dd28 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -3722,6 +3722,7 @@ F:	drivers/net/ethernet/chelsio/cxgb4vf/
>
>  CXL (IBM Coherent Accelerator Processor Interface CAPI) DRIVER
>  M:	Frederic Barrat <fbar...@linux.vnet.ibm.com>
> +M:	Andrew Donnellan <andrew.donnel...@au1.ibm.com>
>  L:	linuxppc-dev@lists.ozlabs.org
>  S:	Supported
>  F:	arch/powerpc/platforms/powernv/pci-cxl.c
[PATCH] MAINTAINERS: Remove myself as cxl maintainer
From: Ian Munsie <imun...@au1.ibm.com>

I am no longer employed by IBM and will no longer have access to cxl
hardware, so remove myself as a cxl maintainer.

If anyone needs to contact me in the future, please use my personal
email address darkstarsw...@gmail.com

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
Cc: Frederic Barrat <fbar...@linux.vnet.ibm.com>
Cc: Andrew Donnellan <andrew.donnel...@au1.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
---
 MAINTAINERS | 1 -
 1 file changed, 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 9e984645c4b0..1e8a915577fa 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3721,7 +3721,6 @@ S:	Supported
 F:	drivers/net/ethernet/chelsio/cxgb4vf/
 
 CXL (IBM Coherent Accelerator Processor Interface CAPI) DRIVER
-M:	Ian Munsie <imun...@au1.ibm.com>
 M:	Frederic Barrat <fbar...@linux.vnet.ibm.com>
 L:	linuxppc-dev@lists.ozlabs.org
 S:	Supported
-- 
2.11.0
Re: [PATCH v2] cxl: fix build when CONFIG_DEBUG_FS=n
Acked-by: Ian Munsie <imun...@au1.ibm.com>
Re: [PATCH] cxl: prevent read/write to AFU config space while AFU not configured
Acked-by: Ian Munsie <imun...@au1.ibm.com>

Looks like a reasonable solution

> Pradipta found this while doing testing for cxlflash. I've tested this
> patch and I'm satisfied that it solves the issue, but I've asked Pradipta
> to test it a bit further.

:)
Re: [PATCH] cxl: drop duplicate header sched.h
Acked-by: Ian Munsie <imun...@au1.ibm.com>
Re: [PATCH] cxl: fix coccinelle warnings
Excerpts from andrew.donnellan's message of 2016-11-23 18:06:59 +1100:
> On 23/11/16 17:49, Ian Munsie wrote:
> > Most of these look fine
> >
> >> -return debugfs_create_file(name, mode, parent, (void __force *)value,
> >> &fops_io_x64);
> >> +return debugfs_create_file_unsafe(name, mode, parent,
> >> + (void __force *)value, &fops_io_x64);
> >
> > Just wondering what this one is about?
>
> See explanation at https://lkml.org/lkml/2016/3/6/75 - when we use
> DEFINE_DEBUGFS_ATTRIBUTE() rather than DEFINE_SIMPLE_ATTRIBUTE(), we
> don't need the "lifetime managing proxy" that debugfs_create_file() sets up.
>
> coccinelle proposed that change based on
> scripts/coccinelle/api/debugfs/debugfs_simple_attr.cocci

Thanks, that makes sense :)

Acked-by: Ian Munsie <imun...@au1.ibm.com>
Re: [PATCH] cxl: fix coccinelle warnings
Most of these look fine

> -return debugfs_create_file(name, mode, parent, (void __force *)value,
> &fops_io_x64);
> +return debugfs_create_file_unsafe(name, mode, parent,
> + (void __force *)value, &fops_io_x64);

Just wondering what this one is about?

Cheers,
-Ian
Re: [PATCH] cxl: Fix memory allocation failure test
Acked-by: Ian Munsie <imun...@au1.ibm.com>
Re: [PATCH] cxl: Fix error handling
Acked-by: Ian Munsie <imun...@au1.ibm.com>
Re: [PATCH] cxl: Fix error handling
Acked-by: Ian Munsie <imun...@au1.ibm.com>
Re: [PATCH] powerpc/mm/coproc: Handle bad address on coproc slb fault
Reviewed-by: Ian Munsie <imun...@au1.ibm.com>
Re: [PATCH v2] cxl: Flush PSL cache before resetting the adapter
Acked-by: Ian Munsie <imun...@au1.ibm.com>
Re: [PATCH] cxl: Flush PSL cache before resetting the adapter
Acked-by: Ian Munsie <imun...@au1.ibm.com>
Re: [PATCH] cxl: use pcibios_free_controller_deferred() when removing vPHBs
Acked-by: Ian Munsie <imun...@au1.ibm.com>
Re: [PATCH] cxl: fix NULL dereference in cxl_context_init() on PowerVM guests
Whoops! Acked-by: Ian Munsie <imun...@au1.ibm.com> ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH] cxl: add device ID for Mellanox ConnectX-4
Acked-by: Ian Munsie <imun...@au1.ibm.com>
Re: [PATCH] cxl: fix sparse warnings
Acked-by: Ian Munsie <imun...@au1.ibm.com>
[PATCH v2] powerpc/powernv: fix pci-cxl.c build when CONFIG_MODULES=n
From: Ian Munsie <imun...@au1.ibm.com>

pnv_cxl_enable_phb_kernel_api() grabs a reference to the cxl module to
prevent it from being unloaded after the PHB has been switched to CX4
mode. This breaks the build when CONFIG_MODULES=n as module_mutex doesn't
exist.

However, if we don't have modules, we don't need to protect against the
case of the cxl module being unloaded. As such, split the relevant
code out into a function surrounded with #if IS_MODULE(CXL) so we don't try
to compile it if cxl isn't being compiled as a module.

Fixes: 5918dbc9b4ec ("powerpc/powernv: Add support for the cxl kernel api on the real phb")
Reported-by: Michael Ellerman <m...@ellerman.id.au>
Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
Signed-off-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com>
---
Changes since v1:
 - Actually tested now our systems are back online
 - Passed error code back to caller
 - Fixed IS_MODULE(cxl) to IS_MODULE(CONFIG_CXL)
 - Face was introduced to palm, several times
---
 arch/powerpc/platforms/powernv/pci-cxl.c | 30 ++++++++++++++++++++++--------
 1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-cxl.c b/arch/powerpc/platforms/powernv/pci-cxl.c
index 3f34207..2d67be0 100644
--- a/arch/powerpc/platforms/powernv/pci-cxl.c
+++ b/arch/powerpc/platforms/powernv/pci-cxl.c
@@ -166,6 +166,24 @@ int pnv_cxl_ioda_msi_setup(struct pci_dev *dev, unsigned int hwirq,
 }
 EXPORT_SYMBOL(pnv_cxl_ioda_msi_setup);
 
+#if IS_MODULE(CONFIG_CXL)
+static inline int get_cxl_module(void)
+{
+	struct module *cxl_module;
+
+	mutex_lock(&module_mutex);
+	cxl_module = find_module("cxl");
+	if (cxl_module)
+		__module_get(cxl_module);
+	mutex_unlock(&module_mutex);
+	if (!cxl_module)
+		return -ENODEV;
+	return 0;
+}
+#else
+static inline int get_cxl_module(void) { return 0; }
+#endif
+
 /*
  * Sets flags and switches the controller ops to enable the cxl kernel api.
  * Originally the cxl kernel API operated on a virtual PHB, but certain cards
@@ -175,7 +193,7 @@ EXPORT_SYMBOL(pnv_cxl_ioda_msi_setup);
 int pnv_cxl_enable_phb_kernel_api(struct pci_controller *hose, bool enable)
 {
 	struct pnv_phb *phb = hose->private_data;
-	struct module *cxl_module;
+	int rc;
 
 	if (!enable) {
 		/*
@@ -194,13 +212,9 @@ int pnv_cxl_enable_phb_kernel_api(struct pci_controller *hose, bool enable)
 	 * long as we are in this mode (and since we can't safely disable this
 	 * mode once enabled...).
 	 */
-	mutex_lock(&module_mutex);
-	cxl_module = find_module("cxl");
-	if (cxl_module)
-		__module_get(cxl_module);
-	mutex_unlock(&module_mutex);
-	if (!cxl_module)
-		return -ENODEV;
+	rc = get_cxl_module();
+	if (rc)
+		return rc;
 
 	phb->flags |= PNV_PHB_FLAG_CXL;
 	hose->controller_ops = pnv_cxl_cx4_ioda_controller_ops;
-- 
2.8.1
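The pattern in this patch (a real helper when the code can be unloaded, an empty stub otherwise, and one unconditional call site) can be sketched in plain userspace C. This is only an illustration under assumptions: DEMO_CXL_IS_MODULE stands in for IS_MODULE(CONFIG_CXL), the demo_ registry stands in for find_module()/__module_get(), and none of the demo_ names exist in the kernel.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for IS_MODULE(CONFIG_CXL). Build with
 * -DDEMO_CXL_IS_MODULE=1 to model cxl being built as a module.
 */
#ifndef DEMO_CXL_IS_MODULE
#define DEMO_CXL_IS_MODULE 0
#endif

#define DEMO_PHB_FLAG_CXL 0x1u

struct demo_phb {
	unsigned int flags;
};

#if DEMO_CXL_IS_MODULE
/* Toy module registry standing in for find_module()/__module_get(). */
struct demo_module {
	const char *name;
	int refcount;
};

static struct demo_module demo_modules[] = {
	{ "cxl", 0 },
};

static int demo_get_cxl_module(void)
{
	size_t i;

	for (i = 0; i < sizeof(demo_modules) / sizeof(demo_modules[0]); i++) {
		if (strcmp(demo_modules[i].name, "cxl") == 0) {
			demo_modules[i].refcount++;	/* pin the module */
			return 0;
		}
	}
	return -ENODEV;
}
#else
/* Not a module: nothing can be unloaded, so there is nothing to pin. */
static inline int demo_get_cxl_module(void) { return 0; }
#endif

/* The call site is identical in both configurations and propagates rc. */
int demo_enable_phb_kernel_api(struct demo_phb *phb)
{
	int rc = demo_get_cxl_module();

	if (rc)
		return rc;

	phb->flags |= DEMO_PHB_FLAG_CXL;
	return 0;
}
```

Keeping the #if around the helper rather than around the call site means the caller needs no conditional code, which is what lets the v2 patch pass the error code straight back.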
Re: [PATCH] cxl: add option to enable -DDEBUG
Acked-by: Ian Munsie <imun...@au1.ibm.com>
Re: [PATCH -next] cxl: Use for_each_compatible_node() macro
Acked-by: Ian Munsie <imun...@au1.ibm.com>
[PATCH 15/15] cxl: Add cxl_check_and_switch_mode() API to switch bi-modal cards
From: Andrew Donnellan <andrew.donnel...@au1.ibm.com> Add a new API, cxl_check_and_switch_mode() to allow for switching of bi-modal CAPI cards, such as the Mellanox CX-4 network card. When a driver requests to switch a card to CAPI mode, use PCI hotplug infrastructure to remove all PCI devices underneath the slot. We then write an updated mode control register to the CAPI VSEC, hot reset the card, and reprobe the card. As the card may present a different set of PCI devices after the mode switch, use the infrastructure provided by the pnv_php driver and the OPAL PCI slot management facilities to ensure that: * the old devices are removed from both the OPAL and Linux device trees * the new devices are probed by OPAL and added to the OPAL device tree * the new devices are added to the Linux device tree and probed through the regular PCI device probe path As such, introduce a new option, CONFIG_CXL_BIMODAL, with a dependency on the pnv_php driver. Refactor existing code that touches the mode control register in the regular single mode case into a new function, setup_cxl_protocol_area(). Co-authored-by: Ian Munsie <imun...@au1.ibm.com> Cc: Gavin Shan <gws...@linux.vnet.ibm.com> Signed-off-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com> Signed-off-by: Ian Munsie <imun...@au1.ibm.com> Reviewed-by: Gavin Shan <gws...@linux.vnet.ibm.com> --- V1->V2: - Added comments around pci_dev_put() - suggested by Frederic Barrat - Added new error label for error paths calling pci_dev_put() - suggested by Ian Munsie - Added newline at end of Kconfig - Removed extraneous comment in setup_cxl_protocol_area() --- drivers/misc/cxl/Kconfig | 8 ++ drivers/misc/cxl/pci.c | 236 +++ include/misc/cxl.h | 25 + 3 files changed, 251 insertions(+), 18 deletions(-) diff --git a/drivers/misc/cxl/Kconfig b/drivers/misc/cxl/Kconfig index 560412c..8d76770 100644 --- a/drivers/misc/cxl/Kconfig +++ b/drivers/misc/cxl/Kconfig @@ -38,3 +38,11 @@ config CXL CAPI adapters are found in POWER8 based systems. 
If unsure, say N. + +config CXL_BIMODAL + bool "Support for bi-modal CAPI cards" + depends on HOTPLUG_PCI_POWERNV = y && CXL || HOTPLUG_PCI_POWERNV = m && CXL = m + default y + help + Select this option to enable support for bi-modal CAPI cards, such as + the Mellanox CX-4. diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c index efe202f..d152e2d 100644 --- a/drivers/misc/cxl/pci.c +++ b/drivers/misc/cxl/pci.c @@ -55,6 +55,8 @@ pci_read_config_byte(dev, vsec + 0xa, dest) #define CXL_WRITE_VSEC_MODE_CONTROL(dev, vsec, val) \ pci_write_config_byte(dev, vsec + 0xa, val) +#define CXL_WRITE_VSEC_MODE_CONTROL_BUS(bus, devfn, vsec, val) \ + pci_bus_write_config_byte(bus, devfn, vsec + 0xa, val) #define CXL_VSEC_PROTOCOL_MASK 0xe0 #define CXL_VSEC_PROTOCOL_1024TB 0x80 #define CXL_VSEC_PROTOCOL_512TB 0x40 @@ -614,36 +616,234 @@ static int setup_cxl_bars(struct pci_dev *dev) return 0; } -/* pciex node: ibm,opal-m64-window = <0x3d058 0x0 0x3d058 0x0 0x8 0x0>; */ -static int switch_card_to_cxl(struct pci_dev *dev) -{ +#ifdef CONFIG_CXL_BIMODAL + +struct cxl_switch_work { + struct pci_dev *dev; + struct work_struct work; int vsec; + int mode; +}; + +static void switch_card_to_cxl(struct work_struct *work) +{ + struct cxl_switch_work *switch_work = + container_of(work, struct cxl_switch_work, work); + struct pci_dev *dev = switch_work->dev; + struct pci_bus *bus = dev->bus; + struct pci_controller *hose = pci_bus_to_host(bus); + struct pci_dev *bridge; + struct pnv_php_slot *php_slot; + unsigned int devfn; u8 val; int rc; - dev_info(>dev, "switch card to CXL\n"); + dev_info(>dev, "cxl: Preparing for mode switch...\n"); + bridge = list_first_entry_or_null(>bus->devices, struct pci_dev, + bus_list); + if (!bridge) { + dev_WARN(>dev, "cxl: Couldn't find root port!\n"); + goto err_dev_put; + } - if (!(vsec = find_cxl_vsec(dev))) { - dev_err(>dev, "ABORTING: CXL VSEC not found!\n"); + php_slot = pnv_php_find_slot(pci_device_to_OF_node(bridge)); + if (!php_slot) { + 
dev_err(>dev, "cxl: Failed to find slot hotplug " + "information. You may need to upgrade " + "skiboot. Aborting.\n"); + goto err_dev_put; + } + + rc = CXL_READ_VSEC_MODE_CONTROL(dev, switch_work->vsec, ); + if (rc) { + dev_err(>
[PATCH 14/15] PCI/hotplug: pnv_php: handle OPAL_PCI_SLOT_OFFLINE power state
From: Andrew Donnellan <andrew.donnel...@au1.ibm.com>

When calling pnv_php_set_slot_power_state() with state ==
OPAL_PCI_SLOT_OFFLINE, remove devices from the device tree as if we're
dealing with OPAL_PCI_SLOT_POWER_OFF.

Cc: Gavin Shan <gws...@linux.vnet.ibm.com>
Cc: linux-...@vger.kernel.org
Cc: Bjorn Helgaas <bhelg...@google.com>
Signed-off-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com>
Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
Acked-by: Gavin Shan <gws...@linux.vnet.ibm.com>
---
 drivers/pci/hotplug/pnv_php.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
index 2d2f704..e6245b0 100644
--- a/drivers/pci/hotplug/pnv_php.c
+++ b/drivers/pci/hotplug/pnv_php.c
@@ -317,7 +317,7 @@ int pnv_php_set_slot_power_state(struct hotplug_slot *slot,
 		return ret;
 	}
 
-	if (state == OPAL_PCI_SLOT_POWER_OFF)
+	if (state == OPAL_PCI_SLOT_POWER_OFF || state == OPAL_PCI_SLOT_OFFLINE)
 		pnv_php_rmv_devtree(php_slot);
 	else
 		ret = pnv_php_add_devtree(php_slot);
-- 
2.8.1
[PATCH 13/15] PCI/hotplug: pnv_php: export symbols and move struct types needed by cxl
From: Andrew Donnellan <andrew.donnel...@au1.ibm.com> The cxl driver will use infrastructure from pnv_php to handle device tree updates when switching bi-modal CAPI cards into CAPI mode. To enable this, export pnv_php_find_slot() and pnv_php_set_slot_power_state(), and add corresponding declarations, as well as the definition of struct pnv_php_slot, to asm/pnv-pci.h. Cc: Gavin Shan <gws...@linux.vnet.ibm.com> Cc: linux-...@vger.kernel.org Cc: Bjorn Helgaas <bhelg...@google.com> Signed-off-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com> Signed-off-by: Ian Munsie <imun...@au1.ibm.com> Acked-by: Gavin Shan <gws...@linux.vnet.ibm.com> --- V1->V2: - Dropped extraneous "select HOTPLUG_PCI_POWERNV_BASE" in Kconfig, which was accidentally left in from an earlier non-public revision. Thanks to Gavin Shan for pointing it out --- arch/powerpc/include/asm/pnv-pci.h | 28 drivers/pci/hotplug/pnv_php.c | 32 +--- 2 files changed, 33 insertions(+), 27 deletions(-) diff --git a/arch/powerpc/include/asm/pnv-pci.h b/arch/powerpc/include/asm/pnv-pci.h index c47097f..0cbd813 100644 --- a/arch/powerpc/include/asm/pnv-pci.h +++ b/arch/powerpc/include/asm/pnv-pci.h @@ -11,6 +11,7 @@ #define _ASM_PNV_PCI_H #include +#include #include #include @@ -47,4 +48,31 @@ void pnv_cxl_phb_set_peer_afu(struct pci_dev *dev, struct cxl_afu *afu); #endif +struct pnv_php_slot { + struct hotplug_slot slot; + struct hotplug_slot_infoslot_info; + uint64_tid; + char*name; + int slot_no; + struct kref kref; +#define PNV_PHP_STATE_INITIALIZED 0 +#define PNV_PHP_STATE_REGISTERED 1 +#define PNV_PHP_STATE_POPULATED2 +#define PNV_PHP_STATE_OFFLINE 3 + int state; + struct device_node *dn; + struct pci_dev *pdev; + struct pci_bus *bus; + boolpower_state_check; + void*fdt; + void*dt; + struct of_changeset ocs; + struct pnv_php_slot *parent; + struct list_headchildren; + struct list_headlink; +}; +extern struct pnv_php_slot *pnv_php_find_slot(struct device_node *dn); +extern int pnv_php_set_slot_power_state(struct 
hotplug_slot *slot, + uint8_t state); + #endif diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c index 6086db6..2d2f704 100644 --- a/drivers/pci/hotplug/pnv_php.c +++ b/drivers/pci/hotplug/pnv_php.c @@ -22,30 +22,6 @@ #define DRIVER_AUTHOR "Gavin Shan, IBM Corporation" #define DRIVER_DESC"PowerPC PowerNV PCI Hotplug Driver" -struct pnv_php_slot { - struct hotplug_slot slot; - struct hotplug_slot_infoslot_info; - uint64_tid; - char*name; - int slot_no; - struct kref kref; -#define PNV_PHP_STATE_INITIALIZED 0 -#define PNV_PHP_STATE_REGISTERED 1 -#define PNV_PHP_STATE_POPULATED2 -#define PNV_PHP_STATE_OFFLINE 3 - int state; - struct device_node *dn; - struct pci_dev *pdev; - struct pci_bus *bus; - boolpower_state_check; - void*fdt; - void*dt; - struct of_changeset ocs; - struct pnv_php_slot *parent; - struct list_headchildren; - struct list_headlink; -}; - static LIST_HEAD(pnv_php_slot_list); static DEFINE_SPINLOCK(pnv_php_lock); @@ -91,7 +67,7 @@ static struct pnv_php_slot *pnv_php_match(struct device_node *dn, return NULL; } -static struct pnv_php_slot *pnv_php_find_slot(struct device_node *dn) +struct pnv_php_slot *pnv_php_find_slot(struct device_node *dn) { struct pnv_php_slot *php_slot, *tmp; unsigned long flags; @@ -108,6 +84,7 @@ static struct pnv_php_slot *pnv_php_find_slot(struct device_node *dn) return NULL; } +EXPORT_SYMBOL_GPL(pnv_php_find_slot); /* * Remove pdn for all children of the indicated device node. @@ -316,8 +293,8 @@ out: return ret; } -static int pnv_php_set_slot_power_state(struct hotplug_slot *slot, - uint8_t state) +int pnv_php_set_slot_power_state(struct hotplug_slot *slot, +uint8_t state) { struc
[PATCH 11/15] cxl: Add support for interrupts on the Mellanox CX4
From: Ian Munsie <imun...@au1.ibm.com> The Mellanox CX4 in cxl mode uses a hybrid interrupt model, where interrupts are routed from the networking hardware to the XSL using the MSIX table, and from there will be transformed back into an MSIX interrupt using the cxl style interrupts (i.e. using IVTE entries and ranges to map a PE and AFU interrupt number to an MSIX address). We want to hide the implementation details of cxl interrupts as much as possible. To this end, we use a special version of the MSI setup & teardown routines in the PHB while in cxl mode to allocate the cxl interrupts and configure the IVTE entries in the process element. This function does not configure the MSIX table - the CX4 card uses a custom format in that table and it would not be appropriate to fill that out in generic code. The rest of the functionality is similar to the "Full MSI-X mode" described in the CAIA, and this could be easily extended to support other adapters that use that mode in the future. The interrupts will be associated with the default context. If the maximum number of interrupts per context has been limited (e.g. by the mlx5 driver), it will automatically allocate additional kernel contexts to associate extra interrupts as required. These contexts will be started using the same WED that was used to start the default context. Signed-off-by: Ian Munsie <imun...@au1.ibm.com> --- V1->V2: - Handle error case if cxl_next_msi_hwirq returns 0 signifying that an AFU IRQ is not mapped to a hardware interrupt. 
--- arch/powerpc/platforms/powernv/pci-cxl.c | 84 +++ arch/powerpc/platforms/powernv/pci-ioda.c | 4 ++ arch/powerpc/platforms/powernv/pci.h | 2 + drivers/misc/cxl/api.c| 71 ++ drivers/misc/cxl/base.c | 31 drivers/misc/cxl/cxl.h| 4 ++ drivers/misc/cxl/main.c | 2 + include/misc/cxl-base.h | 4 ++ 8 files changed, 202 insertions(+) diff --git a/arch/powerpc/platforms/powernv/pci-cxl.c b/arch/powerpc/platforms/powernv/pci-cxl.c index 831bbfb..3f34207 100644 --- a/arch/powerpc/platforms/powernv/pci-cxl.c +++ b/arch/powerpc/platforms/powernv/pci-cxl.c @@ -8,6 +8,7 @@ */ #include +#include #include #include #include @@ -281,3 +282,86 @@ void pnv_cxl_disable_device(struct pci_dev *dev) cxl_pci_disable_device(dev); cxl_afu_put(afu); } + +/* + * This is a special version of pnv_setup_msi_irqs for cards in cxl mode. This + * function handles setting up the IVTE entries for the XSL to use. + * + * We are currently not filling out the MSIX table, since the only currently + * supported adapter (CX4) uses a custom MSIX table format in cxl mode and it + * is up to their driver to fill that out. In the future we may fill out the + * MSIX table (and change the IVTE entries to be an index to the MSIX table) + * for adapters implementing the Full MSI-X mode described in the CAIA. 
+ */ +int pnv_cxl_cx4_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type) +{ + struct pci_controller *hose = pci_bus_to_host(pdev->bus); + struct pnv_phb *phb = hose->private_data; + struct msi_desc *entry; + struct cxl_context *ctx = NULL; + unsigned int virq; + int hwirq; + int afu_irq = 0; + int rc; + + if (WARN_ON(!phb) || !phb->msi_bmp.bitmap) + return -ENODEV; + + if (pdev->no_64bit_msi && !phb->msi32_support) + return -ENODEV; + + rc = cxl_cx4_setup_msi_irqs(pdev, nvec, type); + if (rc) + return rc; + + for_each_pci_msi_entry(entry, pdev) { + if (!entry->msi_attrib.is_64 && !phb->msi32_support) { + pr_warn("%s: Supports only 64-bit MSIs\n", + pci_name(pdev)); + return -ENXIO; + } + + hwirq = cxl_next_msi_hwirq(pdev, , _irq); + if (WARN_ON(hwirq <= 0)) + return (hwirq ? hwirq : -ENOMEM); + + virq = irq_create_mapping(NULL, hwirq); + if (virq == NO_IRQ) { + pr_warn("%s: Failed to map cxl mode MSI to linux irq\n", + pci_name(pdev)); + return -ENOMEM; + } + + rc = pnv_cxl_ioda_msi_setup(pdev, hwirq, virq); + if (rc) { + pr_warn("%s: Failed to setup cxl mode MSI\n", pci_name(pdev)); + irq_dispose_mapping(virq); + return rc; + } + + irq_set_msi_desc(virq, entry); + } + + return 0; +} + +void pnv_cxl_cx4_teardown_msi_irqs(struct pci_dev *pdev) +{ + struct pci_controller *hose = pci_bus_t
[PATCH 12/15] cxl: Workaround PE=0 hardware limitation in Mellanox CX4
From: Ian Munsie <imun...@au1.ibm.com>

The CX4 card cannot cope with a context with PE=0 due to a hardware
limitation, resulting in:

[ 34.166577] command failed, status limits exceeded(0x8), syndrome 0x5a7939
[ 34.166580] mlx5_core :01:00.1: Failed allocating uar, aborting

Since the kernel API allocates a default context very early during
device init that will almost certainly get Process Element ID 0 there is
no easy way for us to extend the API to allow the Mellanox to inform us
of this limitation ahead of time.

Instead, work around the issue by extending the XSL structure to include
a minimum PE to allocate. Although the bug is not in the XSL, it is the
easiest place to work around this limitation given that the CX4 is
currently the only card that uses an XSL.

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>
---
 drivers/misc/cxl/context.c | 3 ++-
 drivers/misc/cxl/cxl.h     | 1 +
 drivers/misc/cxl/pci.c     | 1 +
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/cxl/context.c b/drivers/misc/cxl/context.c
index 2616cddb..bdee9a0 100644
--- a/drivers/misc/cxl/context.c
+++ b/drivers/misc/cxl/context.c
@@ -90,7 +90,8 @@ int cxl_context_init(struct cxl_context *ctx, struct cxl_afu *afu, bool master,
 	 */
 	mutex_lock(&afu->contexts_lock);
 	idr_preload(GFP_KERNEL);
-	i = idr_alloc(&ctx->afu->contexts_idr, ctx, 0,
+	i = idr_alloc(&ctx->afu->contexts_idr, ctx,
+		      ctx->afu->adapter->native->sl_ops->min_pe,
 		      ctx->afu->num_procs, GFP_NOWAIT);
 	idr_preload_end();
 	mutex_unlock(&afu->contexts_lock);
diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index d50cdb1..de09053 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -561,6 +561,7 @@ struct cxl_service_layer_ops {
 	u64 (*timebase_read)(struct cxl *adapter);
 	int capi_mode;
 	bool needs_reset_before_disable;
+	int min_pe;
 };
 
 struct cxl_native {
diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index cb5d172..efe202f 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -1321,6 +1321,7 @@ static const struct cxl_service_layer_ops xsl_ops = {
 	.write_timebase_ctrl = write_timebase_ctrl_xsl,
 	.timebase_read = timebase_read_xsl,
 	.capi_mode = OPAL_PHB_CAPI_MODE_DMA,
+	.min_pe = 1, /* Workaround for Mellanox CX4 HW bug */
 };
 
 static void set_sl_ops(struct cxl *adapter, struct pci_dev *dev)
-- 
2.8.1
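The effect of passing min_pe as the start of the allocation range can be shown with a toy allocator. This is a userspace sketch under assumptions, not the kernel IDR: demo_pe_table and demo_pe_alloc() are hypothetical and mirror only the "lowest free ID in [start, end)" behaviour that the patch relies on.

```c
#include <errno.h>
#include <stdbool.h>

#define DEMO_MAX_PE 8

/* Hypothetical stand-in for the per-AFU IDR of process elements. */
struct demo_pe_table {
	bool used[DEMO_MAX_PE];
};

/*
 * Return the lowest free ID in [start, end), or a negative errno.
 * Only the range semantics of idr_alloc() are modelled here.
 */
int demo_pe_alloc(struct demo_pe_table *t, int start, int end)
{
	int i;

	if (start < 0 || end > DEMO_MAX_PE)
		return -EINVAL;

	for (i = start; i < end; i++) {
		if (!t->used[i]) {
			t->used[i] = true;
			return i;
		}
	}
	return -ENOSPC;
}
```

With start set to 1, as xsl_ops does through .min_pe, the first context created gets PE 1 and PE 0 is never handed out, at the cost of one unused process element per AFU.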
[PATCH 10/15] cxl: Add preliminary workaround for CX4 interrupt limitation
From: Ian Munsie <imun...@au1.ibm.com> The Mellanox CX4 has a hardware limitation where only 4 bits of the AFU interrupt number can be passed to the XSL when sending an interrupt, limiting it to only 15 interrupts per context (AFU interrupt number 0 is invalid). In order to overcome this, we will allocate additional contexts linked to the default context as extra address space for the extra interrupts - this will be implemented in the next patch. This patch adds the preliminary support to allow this, by way of adding a linked list in the context structure that we use to keep track of the contexts dedicated to interrupts, and an API to simultaneously iterate over the related context structures, AFU interrupt numbers and hardware interrupt numbers. The point of using a single API to iterate these is to hide some of the details of the iteration from external code, and to reduce the number of APIs that need to be exported via base.c to allow built in code to call. Signed-off-by: Ian Munsie <imun...@au1.ibm.com> Reviewed-by: Frederic Barrat <fbar...@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com> --- V1->V2: - Fixed typo spotted by Fred --- drivers/misc/cxl/api.c | 15 +++ drivers/misc/cxl/base.c| 17 + drivers/misc/cxl/context.c | 1 + drivers/misc/cxl/cxl.h | 10 ++ drivers/misc/cxl/main.c| 1 + include/misc/cxl.h | 9 + 6 files changed, 53 insertions(+) diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c index 1e2c0d9..f02a859 100644 --- a/drivers/misc/cxl/api.c +++ b/drivers/misc/cxl/api.c @@ -97,6 +97,21 @@ static irq_hw_number_t cxl_find_afu_irq(struct cxl_context *ctx, int num) return 0; } +int _cxl_next_msi_hwirq(struct pci_dev *pdev, struct cxl_context **ctx, int *afu_irq) +{ + if (*ctx == NULL || *afu_irq == 0) { + *afu_irq = 1; + *ctx = cxl_get_context(pdev); + } else { + (*afu_irq)++; + if (*afu_irq > cxl_get_max_irqs_per_process(pdev)) { + *ctx = list_next_entry(*ctx, extra_irq_contexts); + *afu_irq = 1; + } + } + return 
cxl_find_afu_irq(*ctx, *afu_irq); +} +/* Exported via cxl_base */ int cxl_set_priv(struct cxl_context *ctx, void *priv) { diff --git a/drivers/misc/cxl/base.c b/drivers/misc/cxl/base.c index 1c3e737f..d7d9d02 100644 --- a/drivers/misc/cxl/base.c +++ b/drivers/misc/cxl/base.c @@ -141,6 +141,23 @@ void cxl_pci_disable_device(struct pci_dev *dev) } EXPORT_SYMBOL_GPL(cxl_pci_disable_device); +int cxl_next_msi_hwirq(struct pci_dev *pdev, struct cxl_context **ctx, int *afu_irq) +{ + int ret; + struct cxl_calls *calls; + + calls = cxl_calls_get(); + if (!calls) + return -EBUSY; + + ret = calls->cxl_next_msi_hwirq(pdev, ctx, afu_irq); + + cxl_calls_put(calls); + + return ret; +} +EXPORT_SYMBOL_GPL(cxl_next_msi_hwirq); + static int __init cxl_base_init(void) { struct device_node *np = NULL; diff --git a/drivers/misc/cxl/context.c b/drivers/misc/cxl/context.c index edbb99e..2616cddb 100644 --- a/drivers/misc/cxl/context.c +++ b/drivers/misc/cxl/context.c @@ -68,6 +68,7 @@ int cxl_context_init(struct cxl_context *ctx, struct cxl_afu *afu, bool master, ctx->pending_afu_err = false; INIT_LIST_HEAD(>irq_names); + INIT_LIST_HEAD(>extra_irq_contexts); /* * When we have to destroy all contexts in cxl_context_detach_all() we diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h index b81f476..73b9a55 100644 --- a/drivers/misc/cxl/cxl.h +++ b/drivers/misc/cxl/cxl.h @@ -537,6 +537,14 @@ struct cxl_context { atomic_t afu_driver_events; struct rcu_head rcu; + + /* +* Only used when more interrupts are allocated via +* pci_enable_msix_range than are supported in the default context, to +* use additional contexts to overcome the limitation. i.e. 
Mellanox +* CX4 only: +*/ + struct list_head extra_irq_contexts; }; struct cxl_service_layer_ops { @@ -722,11 +730,13 @@ ssize_t cxl_pci_afu_read_err_buffer(struct cxl_afu *afu, char *buf, /* Internal functions wrapped in cxl_base to allow PHB to call them */ bool _cxl_pci_associate_default_context(struct pci_dev *dev, struct cxl_afu *afu); void _cxl_pci_disable_device(struct pci_dev *dev); +int _cxl_next_msi_hwirq(struct pci_dev *pdev, struct cxl_context **ctx, int *afu_irq); struct cxl_calls { void (*cxl_slbia)(struct mm_struct *mm); bool (*cxl_pci_associate_default_context)(struct pci_dev *dev, struct cxl_afu *afu); void (*cxl_pci_disable_device)(struct pci_dev *dev); + int (*cxl_next_msi_hwirq)(struct pci_dev *pdev, struct cxl_context **ctx, int *afu_irq); struct module *owner; }; diff --git a/
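The cursor-style iteration that _cxl_next_msi_hwirq() performs over (context, AFU interrupt number) pairs can be sketched in userspace as follows. This is a simplified model under assumptions: a singly linked chain replaces the kernel list_head, the per-context limit is hard-coded to 15 instead of being read with cxl_get_max_irqs_per_process(), and the lookup of the hardware interrupt number is left out. All demo_ names are hypothetical.

```c
#include <stddef.h>

#define DEMO_MAX_IRQS_PER_CTX 15	/* 4-bit AFU IRQ number, 0 is invalid */

/* Hypothetical context: next_extra replaces the extra_irq_contexts list. */
struct demo_ctx {
	int id;
	struct demo_ctx *next_extra;	/* next context holding extra IRQs */
};

/*
 * Advance the (context, AFU IRQ) cursor by one interrupt.
 * Start with *ctx == NULL and *afu_irq == 0. Returns 0 on success, or -1
 * once the chain has run out of contexts.
 */
int demo_next_irq(struct demo_ctx *default_ctx, struct demo_ctx **ctx,
		  int *afu_irq)
{
	if (*ctx == NULL || *afu_irq == 0) {
		/* First call: begin at IRQ 1 of the default context. */
		*ctx = default_ctx;
		*afu_irq = 1;
	} else if (++(*afu_irq) > DEMO_MAX_IRQS_PER_CTX) {
		/* Context is full: continue at IRQ 1 of the next one. */
		*ctx = (*ctx)->next_extra;
		*afu_irq = 1;
	}
	return *ctx ? 0 : -1;
}
```

A caller simply invokes the function once per MSI entry; the sixteenth call moves on to the first extra context without the caller needing to know how interrupts are spread across contexts.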
[PATCH 09/15] cxl: Add kernel APIs to get & set the max irqs per context
From: Ian Munsie <imun...@au1.ibm.com>

These APIs will be used by the Mellanox CX4 support. While they function
standalone to configure existing behaviour, their primary purpose is to
allow the Mellanox driver to inform the cxl driver of a hardware
limitation, which will be used in a future patch.

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com>
---
 drivers/misc/cxl/api.c | 27 +++++++++++++++++++++++++++
 include/misc/cxl.h     | 10 ++++++++++
 2 files changed, 37 insertions(+)

diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
index 6a030bf..1e2c0d9 100644
--- a/drivers/misc/cxl/api.c
+++ b/drivers/misc/cxl/api.c
@@ -447,3 +447,30 @@ ssize_t cxl_read_adapter_vpd(struct pci_dev *dev, void *buf, size_t count)
 	return cxl_ops->read_adapter_vpd(afu->adapter, buf, count);
 }
 EXPORT_SYMBOL_GPL(cxl_read_adapter_vpd);
+
+int cxl_set_max_irqs_per_process(struct pci_dev *dev, int irqs)
+{
+	struct cxl_afu *afu = cxl_pci_to_afu(dev);
+	if (IS_ERR(afu))
+		return -ENODEV;
+
+	if (irqs > afu->adapter->user_irqs)
+		return -EINVAL;
+
+	/* Limit user_irqs to prevent the user increasing this via sysfs */
+	afu->adapter->user_irqs = irqs;
+	afu->irqs_max = irqs;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cxl_set_max_irqs_per_process);
+
+int cxl_get_max_irqs_per_process(struct pci_dev *dev)
+{
+	struct cxl_afu *afu = cxl_pci_to_afu(dev);
+	if (IS_ERR(afu))
+		return -ENODEV;
+
+	return afu->irqs_max;
+}
+EXPORT_SYMBOL_GPL(cxl_get_max_irqs_per_process);
diff --git a/include/misc/cxl.h b/include/misc/cxl.h
index dd9eebb..fc07ed4 100644
--- a/include/misc/cxl.h
+++ b/include/misc/cxl.h
@@ -166,6 +166,16 @@ void cxl_psa_unmap(void __iomem *addr);
 /* Get the process element for this context */
 int cxl_process_element(struct cxl_context *ctx);
 
+/*
+ * Limit the number of interrupts that a single context can allocate via
+ * cxl_start_work. If using the api with a real phb, this may be used to
+ * request that additional default contexts be created when allocating
+ * interrupts via pci_enable_msix_range. These will be set to the same running
+ * state as the default context, and if that is running it will reuse the
+ * parameters previously passed to cxl_start_context for the default context.
+ */
+int cxl_set_max_irqs_per_process(struct pci_dev *dev, int irqs);
+int cxl_get_max_irqs_per_process(struct pci_dev *dev);
 
 /*
  * These calls allow drivers to create their own file descriptors and make them
-- 
2.8.1
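One property of the setter worth spelling out is that it can only lower the limit, because it also lowers the adapter-wide user_irqs value it checks against. A minimal userspace model, assuming hypothetical demo_adapter and demo_afu structures in place of struct cxl and struct cxl_afu (and a NULL check in place of the IS_ERR() test):

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for struct cxl and struct cxl_afu. */
struct demo_adapter {
	int user_irqs;
};

struct demo_afu {
	struct demo_adapter *adapter;
	int irqs_max;
};

int demo_set_max_irqs(struct demo_afu *afu, int irqs)
{
	if (afu == NULL)
		return -ENODEV;

	/* Requests above the current adapter-wide limit are rejected. */
	if (irqs > afu->adapter->user_irqs)
		return -EINVAL;

	/* Lowering user_irqs too stops the limit being raised again later. */
	afu->adapter->user_irqs = irqs;
	afu->irqs_max = irqs;
	return 0;
}

int demo_get_max_irqs(const struct demo_afu *afu)
{
	if (afu == NULL)
		return -ENODEV;

	return afu->irqs_max;
}
```

After a driver lowers the limit to 15, a later request for 16 fails with -EINVAL, which matches the comment in the patch about preventing an increase via sysfs.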
[PATCH 08/15] cxl: Add support for using the kernel API with a real PHB
From: Ian Munsie <imun...@au1.ibm.com> This hooks up support for using the kernel API with a real PHB. After the AFU initialisation has completed it calls into the PHB code to pass it the AFU that will be used by other peer physical functions on the adapter. The cxl_pci_to_afu API is extended to work with peer PCI devices, retrieving the peer AFU from the PHB. This API may also now return an error if it is called on a PCI device that is not associated with either a cxl vPHB or a peer PCI device to an AFU, and this error is propagated down. Signed-off-by: Ian Munsie <imun...@au1.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com> --- V1->V2: - Removed change to skip participating in EEH without a vPHB - moved out into an earlier patch. --- drivers/misc/cxl/api.c | 5 + drivers/misc/cxl/pci.c | 3 +++ drivers/misc/cxl/vphb.c | 16 ++-- 3 files changed, 22 insertions(+), 2 deletions(-) diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c index 7707055..6a030bf 100644 --- a/drivers/misc/cxl/api.c +++ b/drivers/misc/cxl/api.c @@ -13,6 +13,7 @@ #include #include #include +#include #include "cxl.h" @@ -24,6 +25,8 @@ struct cxl_context *cxl_dev_context_init(struct pci_dev *dev) int rc; afu = cxl_pci_to_afu(dev); + if (IS_ERR(afu)) + return ERR_CAST(afu); ctx = cxl_context_alloc(); if (IS_ERR(ctx)) { @@ -438,6 +441,8 @@ EXPORT_SYMBOL_GPL(cxl_perst_reloads_same_image); ssize_t cxl_read_adapter_vpd(struct pci_dev *dev, void *buf, size_t count) { struct cxl_afu *afu = cxl_pci_to_afu(dev); + if (IS_ERR(afu)) + return -ENODEV; return cxl_ops->read_adapter_vpd(afu->adapter, buf, count); } diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c index dd7ff22..cb5d172 100644 --- a/drivers/misc/cxl/pci.c +++ b/drivers/misc/cxl/pci.c @@ -1502,6 +1502,9 @@ static int cxl_probe(struct pci_dev *dev, const struct pci_device_id *id) dev_err(>dev, "AFU %i failed to start: %i\n", slice, rc); } + if (pnv_pci_on_cxl_phb(dev) && adapter->slices >= 1) + 
pnv_cxl_phb_set_peer_afu(dev, adapter->afu[0]); + return 0; } diff --git a/drivers/misc/cxl/vphb.c b/drivers/misc/cxl/vphb.c index 8865e8d..dee8def 100644 --- a/drivers/misc/cxl/vphb.c +++ b/drivers/misc/cxl/vphb.c @@ -9,6 +9,7 @@ #include #include +#include #include "cxl.h" static int cxl_dma_set_mask(struct pci_dev *pdev, u64 dma_mask) @@ -258,13 +259,18 @@ void cxl_pci_vphb_remove(struct cxl_afu *afu) pcibios_free_controller(phb); } +static bool _cxl_pci_is_vphb_device(struct pci_controller *phb) +{ + return (phb->ops == _pcie_pci_ops); +} + bool cxl_pci_is_vphb_device(struct pci_dev *dev) { struct pci_controller *phb; phb = pci_bus_to_host(dev->bus); - return (phb->ops == _pcie_pci_ops); + return _cxl_pci_is_vphb_device(phb); } struct cxl_afu *cxl_pci_to_afu(struct pci_dev *dev) @@ -273,7 +279,13 @@ struct cxl_afu *cxl_pci_to_afu(struct pci_dev *dev) phb = pci_bus_to_host(dev->bus); - return (struct cxl_afu *)phb->private_data; + if (_cxl_pci_is_vphb_device(phb)) + return (struct cxl_afu *)phb->private_data; + + if (pnv_pci_on_cxl_phb(dev)) + return pnv_cxl_phb_to_afu(phb); + + return ERR_PTR(-ENODEV); } EXPORT_SYMBOL_GPL(cxl_pci_to_afu); -- 2.8.1 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
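For readers unfamiliar with the error-pointer convention this patch leans on, here is a minimal userspace sketch of the three-way lookup that cxl_pci_to_afu() now performs. It is not the driver code: the ERR_PTR helpers are simplified stand-ins for the kernel's, and struct phb, enum phb_kind, pci_to_afu() and afu_id_or_error() are hypothetical names made up for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct afu { int id; };

/* Hypothetical model of a host bridge, not the kernel's pci_controller. */
enum phb_kind { PHB_PLAIN, PHB_VIRTUAL, PHB_PEER };
struct phb {
	enum phb_kind kind;
	struct afu *afu;	/* private_data (vPHB) or peer AFU (real PHB) */
};

/* Three-way lookup: vPHB, peer PHB, otherwise an encoded -ENODEV. */
static struct afu *pci_to_afu(const struct phb *phb)
{
	if (phb->kind == PHB_VIRTUAL)
		return phb->afu;
	if (phb->kind == PHB_PEER)
		return phb->afu;
	return ERR_PTR(-ENODEV);
}

/* Caller in the style of cxl_read_adapter_vpd(): check, then dereference. */
static long afu_id_or_error(const struct phb *phb)
{
	struct afu *afu = pci_to_afu(phb);

	if (IS_ERR(afu))
		return PTR_ERR(afu);
	return afu->id;
}
```

The point of the convention is that a single return value carries either a valid pointer or an errno, so every caller has to test with IS_ERR() before touching the result, which is what the api.c hunks add.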
[PATCH 07/15] powerpc/powernv: Add support for the cxl kernel api on the real phb
From: Ian Munsie <imun...@au1.ibm.com>

This adds support for the peer model of the cxl kernel api to the
PowerNV PHB, in which physical function 0 represents the cxl function on
the card (an XSL in the case of the CX4), which other physical functions
will use for memory access and interrupt services. It is referred to as
the peer model as these functions are peers of one another, as opposed
to the Virtual PHB model which forms a hierarchy.

This patch exports APIs to enable the peer mode, check if a PCI device
is attached to a PHB in this mode, and to set and get the peer AFU for
this mode.

The cxl driver will enable this mode for supported cards by calling
pnv_cxl_enable_phb_kernel_api(). This will set a flag in the PHB to note
that this mode is enabled, and switch out its controller_ops for the cxl
version.

The cxl version of the controller_ops struct implements its own versions
of the enable_device_hook and release_device to handle refcounting on
the peer AFU and to allocate a default context for the device.

Once enabled, the cxl kernel API may not be disabled on a PHB. Currently
there is no safe way to disable cxl mode short of a reboot, so until
that changes there is no reason to support the disable path.

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com>
---
V1->V2:
	- Add an explanation of the peer model to the commit message,
	  and a comment above the pnv_cxl_enable_device_hook function.
V2->V3 Addressed comments from Andrew Donnellan:
	- Fix typo in comment
	- Changed two exported symbols to EXPORT_SYMBOL_GPL
	- Undid change to remove static from pnv_pci_release_device and
	  pci_controller_ops and declare them in the header, both of
	  which were left over from an earlier cut.
--- arch/powerpc/include/asm/pnv-pci.h| 7 ++ arch/powerpc/platforms/powernv/pci-cxl.c | 120 ++ arch/powerpc/platforms/powernv/pci-ioda.c | 18 - arch/powerpc/platforms/powernv/pci.h | 14 4 files changed, 158 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/pnv-pci.h b/arch/powerpc/include/asm/pnv-pci.h index 791db1b..c47097f 100644 --- a/arch/powerpc/include/asm/pnv-pci.h +++ b/arch/powerpc/include/asm/pnv-pci.h @@ -38,6 +38,13 @@ int pnv_cxl_alloc_hwirq_ranges(struct cxl_irq_ranges *irqs, struct pci_dev *dev, int num); void pnv_cxl_release_hwirq_ranges(struct cxl_irq_ranges *irqs, struct pci_dev *dev); + +/* Support for the cxl kernel api on the real PHB (instead of vPHB) */ +int pnv_cxl_enable_phb_kernel_api(struct pci_controller *hose, bool enable); +bool pnv_pci_on_cxl_phb(struct pci_dev *dev); +struct cxl_afu *pnv_cxl_phb_to_afu(struct pci_controller *hose); +void pnv_cxl_phb_set_peer_afu(struct pci_dev *dev, struct cxl_afu *afu); + #endif #endif diff --git a/arch/powerpc/platforms/powernv/pci-cxl.c b/arch/powerpc/platforms/powernv/pci-cxl.c index e0eeb00..831bbfb 100644 --- a/arch/powerpc/platforms/powernv/pci-cxl.c +++ b/arch/powerpc/platforms/powernv/pci-cxl.c @@ -7,8 +7,11 @@ * 2 of the License, or (at your option) any later version. */ +#include +#include #include #include +#include #include "pci.h" @@ -161,3 +164,120 @@ int pnv_cxl_ioda_msi_setup(struct pci_dev *dev, unsigned int hwirq, return 0; } EXPORT_SYMBOL(pnv_cxl_ioda_msi_setup); + +/* + * Sets flags and switches the controller ops to enable the cxl kernel api. + * Originally the cxl kernel API operated on a virtual PHB, but certain cards + * such as the Mellanox CX4 use a peer model instead and for these cards the + * cxl kernel api will operate on the real PHB. 
+ */ +int pnv_cxl_enable_phb_kernel_api(struct pci_controller *hose, bool enable) +{ + struct pnv_phb *phb = hose->private_data; + struct module *cxl_module; + + if (!enable) { + /* +* Once cxl mode is enabled on the PHB, there is currently no +* known safe method to disable it again, and trying risks a +* checkstop. If we can find a way to safely disable cxl mode +* in the future we can revisit this, but for now the only sane +* thing to do is to refuse to disable cxl mode: +*/ + return -EPERM; + } + + /* +* Hold a reference to the cxl module since several PHB operations now +* depend on it, and it would be insane to allow it to be removed so +* long as we are in this mode (and since we can't safely disable this +* mode once enabled...). +*/ + mutex_lock(_mutex); + cxl_module = find_module("cxl"); + if (cxl_module) + __module_get(cxl_module); + mutex_unlock(_mutex); + if (!cxl_module) + return -E
[PATCH 06/15] cxl: Do not create vPHB if there are no AFU configuration records
From: Ian Munsie <imun...@au1.ibm.com>

The vPHB model of the cxl kernel API is a hierarchy where the AFU is
represented by the vPHB, and its AFU configuration records are exposed
as functions under that vPHB. If there are no AFU configuration records
we will create a vPHB with nothing under it, which is a waste of
resources and will opt us into EEH handling despite not having anything
special to handle.

This also does not make sense for cards using the peer model of the cxl
kernel API, where the other functions of the device are exposed via
additional peer physical functions rather than AFU configuration
records. This model will also not work with the existing EEH handling in
the cxl driver, as that is designed around the vPHB model.

Skip creating the vPHB for AFUs without any AFU configuration records,
and opt out of EEH handling for them.

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com>
---
 drivers/misc/cxl/pci.c  |  3 +++
 drivers/misc/cxl/vphb.c | 11 +++++++++++
 2 files changed, 14 insertions(+)

diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index deef9c7..dd7ff22 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -1572,6 +1572,9 @@ static pci_ers_result_t cxl_pci_error_detected(struct pci_dev *pdev,
 	 */
 	for (i = 0; i < adapter->slices; i++) {
 		afu = adapter->afu[i];
+		/* Only participate in EEH if we are on a virtual PHB */
+		if (afu->phb == NULL)
+			return PCI_ERS_RESULT_NONE;
 		cxl_vphb_error_detected(afu, state);
 	}
 	return PCI_ERS_RESULT_DISCONNECT;
diff --git a/drivers/misc/cxl/vphb.c b/drivers/misc/cxl/vphb.c
index c8a759f..8865e8d 100644
--- a/drivers/misc/cxl/vphb.c
+++ b/drivers/misc/cxl/vphb.c
@@ -188,6 +188,17 @@ int cxl_pci_vphb_add(struct cxl_afu *afu)
 	struct device_node *vphb_dn;
 	struct device *parent;
 
+	/*
+	 * If there are no AFU configuration records we won't have anything to
+	 * expose under the vPHB, so skip creating one, returning success since
+	 * this is still a valid case. This will also opt us out of EEH
+	 * handling since we won't have anything special to do if there are no
+	 * kernel drivers attached to the vPHB, and EEH handling is not yet
+	 * supported in the peer model.
+	 */
+	if (!afu->crs_num)
+		return 0;
+
 	/* The parent device is the adapter. Reuse the device node of
 	 * the adapter.
 	 * We don't seem to care what device node is used for the vPHB,
-- 
2.8.1
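As a rough illustration of how the two hunks in this patch fit together, the sketch below models an AFU whose vPHB pointer stays NULL when it has no configuration records, and an error path that keys off that pointer. The types and the names vphb_add() and participates_in_eeh() are hypothetical simplifications, not the driver's own.

```c
#include <stdbool.h>
#include <stddef.h>

struct vphb { int unused; };	/* placeholder for a virtual PHB object */

struct afu {
	int crs_num;		/* number of AFU configuration records */
	struct vphb *phb;	/* stays NULL when no vPHB was created */
};

static struct vphb the_vphb;

/* Returns 0 on success; creating nothing at all still counts as success. */
static int vphb_add(struct afu *afu)
{
	if (!afu->crs_num)
		return 0;
	afu->phb = &the_vphb;
	return 0;
}

/* Error handling only involves AFUs that actually have a vPHB. */
static bool participates_in_eeh(const struct afu *afu)
{
	return afu->phb != NULL;
}
```

Returning success from the early exit matters: an AFU with nothing to expose is a valid configuration, so the caller should not treat it as a probe failure.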
[PATCH 05/15] cxl: Allow a default context to be associated with an external pci_dev
From: Ian Munsie <imun...@au1.ibm.com> The cxl kernel API has a concept of a default context associated with each PCI device under the virtual PHB. The Mellanox CX4 will also use the cxl kernel API, but it does not use a virtual PHB - rather, the AFU appears as a physical function as a peer to the networking functions. In order to allow the kernel API to work with those networking functions, we will need to associate a default context with them as well. To this end, refactor the corresponding code to do this in vphb.c and export it so that it can be called from the PHB code. Signed-off-by: Ian Munsie <imun...@au1.ibm.com> Reviewed-by: Frederic Barrat <fbar...@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com> --- V2->V3: Addressed feedback from Andrew Donnellan: - Fixed typo in comment - Moved _cxl_pci_associate_default_context and _cxl_pci_disable_device from vphb.c to a new file phb.c since they are used by both the vPHB and peer models. --- drivers/misc/cxl/Makefile | 2 +- drivers/misc/cxl/base.c | 35 +++ drivers/misc/cxl/cxl.h| 6 ++ drivers/misc/cxl/main.c | 2 ++ drivers/misc/cxl/phb.c| 44 drivers/misc/cxl/vphb.c | 30 +++--- include/misc/cxl-base.h | 6 ++ 7 files changed, 97 insertions(+), 28 deletions(-) create mode 100644 drivers/misc/cxl/phb.c diff --git a/drivers/misc/cxl/Makefile b/drivers/misc/cxl/Makefile index 8a55c1a..56e9a47 100644 --- a/drivers/misc/cxl/Makefile +++ b/drivers/misc/cxl/Makefile @@ -3,7 +3,7 @@ ccflags-$(CONFIG_PPC_WERROR)+= -Werror cxl-y += main.o file.o irq.o fault.o native.o cxl-y += context.o sysfs.o debugfs.o pci.o trace.o -cxl-y += vphb.o api.o +cxl-y += vphb.o phb.o api.o cxl-$(CONFIG_PPC_PSERIES) += flash.o guest.o of.o hcalls.o obj-$(CONFIG_CXL) += cxl.o obj-$(CONFIG_CXL_BASE) += base.o diff --git a/drivers/misc/cxl/base.c b/drivers/misc/cxl/base.c index d7dcf5b..1c3e737f 100644 --- a/drivers/misc/cxl/base.c +++ b/drivers/misc/cxl/base.c @@ -106,6 +106,41 @@ int cxl_update_properties(struct 
device_node *dn, } EXPORT_SYMBOL_GPL(cxl_update_properties); +/* + * API calls into the driver that may be called from the PHB code and must be + * built in. + */ +bool cxl_pci_associate_default_context(struct pci_dev *dev, struct cxl_afu *afu) +{ + bool ret; + struct cxl_calls *calls; + + calls = cxl_calls_get(); + if (!calls) + return false; + + ret = calls->cxl_pci_associate_default_context(dev, afu); + + cxl_calls_put(calls); + + return ret; +} +EXPORT_SYMBOL_GPL(cxl_pci_associate_default_context); + +void cxl_pci_disable_device(struct pci_dev *dev) +{ + struct cxl_calls *calls; + + calls = cxl_calls_get(); + if (!calls) + return; + + calls->cxl_pci_disable_device(dev); + + cxl_calls_put(calls); +} +EXPORT_SYMBOL_GPL(cxl_pci_disable_device); + static int __init cxl_base_init(void) { struct device_node *np = NULL; diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h index d4aae6f..b81f476 100644 --- a/drivers/misc/cxl/cxl.h +++ b/drivers/misc/cxl/cxl.h @@ -719,9 +719,15 @@ static inline u64 cxl_p2n_read(struct cxl_afu *afu, cxl_p2n_reg_t reg) ssize_t cxl_pci_afu_read_err_buffer(struct cxl_afu *afu, char *buf, loff_t off, size_t count); +/* Internal functions wrapped in cxl_base to allow PHB to call them */ +bool _cxl_pci_associate_default_context(struct pci_dev *dev, struct cxl_afu *afu); +void _cxl_pci_disable_device(struct pci_dev *dev); struct cxl_calls { void (*cxl_slbia)(struct mm_struct *mm); + bool (*cxl_pci_associate_default_context)(struct pci_dev *dev, struct cxl_afu *afu); + void (*cxl_pci_disable_device)(struct pci_dev *dev); + struct module *owner; }; int register_cxl_calls(struct cxl_calls *calls); diff --git a/drivers/misc/cxl/main.c b/drivers/misc/cxl/main.c index ae68c32..4e5474b 100644 --- a/drivers/misc/cxl/main.c +++ b/drivers/misc/cxl/main.c @@ -110,6 +110,8 @@ static inline void cxl_slbia_core(struct mm_struct *mm) static struct cxl_calls cxl_calls = { .cxl_slbia = cxl_slbia_core, + .cxl_pci_associate_default_context = 
_cxl_pci_associate_default_context, + .cxl_pci_disable_device = _cxl_pci_disable_device, .owner = THIS_MODULE, }; diff --git a/drivers/misc/cxl/phb.c b/drivers/misc/cxl/phb.c new file mode 100644 index 000..0935d44 --- /dev/null +++ b/drivers/misc/cxl/phb.c @@ -0,0 +1,44 @@ +/* + * Copyright 2014-2016 IBM Corp. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU Genera
[PATCH 03/15] cxl: Enable bus mastering for devices using CAPP DMA mode
From: Ian Munsie <imun...@au1.ibm.com>

Devices that use CAPP DMA mode (such as the Mellanox CX4) require bus
master to be enabled in order for the CAPI traffic to flow. This should
be harmless to enable for other cxl devices, so unconditionally enable
it in the adapter init flow.

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>
---
 drivers/misc/cxl/pci.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index 6ac6b05..deef9c7 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -1264,6 +1264,9 @@ static int cxl_configure_adapter(struct cxl *adapter, struct pci_dev *dev)
 	if ((rc = adapter->native->sl_ops->adapter_regs_init(adapter, dev)))
 		goto err;
 
+	/* Required for devices using CAPP DMA mode, harmless for others */
+	pci_set_master(dev);
+
 	if ((rc = pnv_phb_to_cxl_mode(dev, adapter->native->sl_ops->capi_mode)))
 		goto err;
 
-- 
2.8.1
[PATCH 04/15] cxl: Move cxl_afu_get / cxl_afu_put to base
From: Ian Munsie <imun...@au1.ibm.com>

The Mellanox CX4 uses a model where the AFU is one physical function of
the device, and is used by other peer physical functions of the same
device. This will require those other devices to grab a reference on the
AFU when they are initialised to make sure that it does not go away
during their lifetime.

Move the AFU refcount functions to base.c so they can be called from
the PHB code.

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>
---
 drivers/misc/cxl/base.c | 13 +++++++++++++
 drivers/misc/cxl/cxl.h  | 12 ------------
 include/misc/cxl-base.h |  4 ++++
 3 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/drivers/misc/cxl/base.c b/drivers/misc/cxl/base.c
index e6f49ac..d7dcf5b 100644
--- a/drivers/misc/cxl/base.c
+++ b/drivers/misc/cxl/base.c
@@ -54,6 +54,19 @@ static inline void cxl_calls_put(struct cxl_calls *calls) { }
 
 #endif /* CONFIG_CXL_MODULE */
 
+/* AFU refcount management */
+struct cxl_afu *cxl_afu_get(struct cxl_afu *afu)
+{
+	return (get_device(&afu->dev) == NULL) ? NULL : afu;
+}
+EXPORT_SYMBOL_GPL(cxl_afu_get);
+
+void cxl_afu_put(struct cxl_afu *afu)
+{
+	put_device(&afu->dev);
+}
+EXPORT_SYMBOL_GPL(cxl_afu_put);
+
 void cxl_slbia(struct mm_struct *mm)
 {
 	struct cxl_calls *calls;
diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index 36b3237..d4aae6f 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -440,18 +440,6 @@ struct cxl_afu {
 
 	bool enabled;
 };
 
-/* AFU refcount management */
-static inline struct cxl_afu *cxl_afu_get(struct cxl_afu *afu)
-{
-
-	return (get_device(&afu->dev) == NULL) ? NULL : afu;
-}
-
-static inline void cxl_afu_put(struct cxl_afu *afu)
-{
-	put_device(&afu->dev);
-}
-
 struct cxl_irq_name {
 	struct list_head list;
diff --git a/include/misc/cxl-base.h b/include/misc/cxl-base.h
index 5ae9625..f53808f 100644
--- a/include/misc/cxl-base.h
+++ b/include/misc/cxl-base.h
@@ -36,11 +36,15 @@ static inline void cxl_ctx_put(void)
 	atomic_dec(&cxl_use_count);
 }
 
+struct cxl_afu *cxl_afu_get(struct cxl_afu *afu);
+void cxl_afu_put(struct cxl_afu *afu);
 void cxl_slbia(struct mm_struct *mm);
 
 #else /* CONFIG_CXL_BASE */
 
 static inline bool cxl_ctx_in_use(void) { return false; }
+static inline struct cxl_afu *cxl_afu_get(struct cxl_afu *afu) { return NULL; }
+static inline void cxl_afu_put(struct cxl_afu *afu) {}
 static inline void cxl_slbia(struct mm_struct *mm) {}
 
 #endif /* CONFIG_CXL_BASE */
-- 
2.8.1
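To make the lifetime rule in the commit message concrete, here is a small userspace sketch of peer functions pinning a shared AFU. It is only a model: a plain integer stands in for the reference count that get_device()/put_device() maintain, it is not thread-safe, and afu_get(), afu_put(), struct peer_fn, peer_enable() and peer_release() are hypothetical names. Unlike the real cxl_afu_get(), the sketch also tolerates a NULL AFU.

```c
#include <assert.h>
#include <stddef.h>

struct afu {
	int refcount;	/* stands in for the embedded struct device refcount */
};

/* Take a reference; returns NULL if there is no AFU to pin. */
static struct afu *afu_get(struct afu *afu)
{
	if (afu == NULL)
		return NULL;
	afu->refcount++;
	return afu;
}

/* Drop a reference previously taken with afu_get(). */
static void afu_put(struct afu *afu)
{
	assert(afu->refcount > 0);
	afu->refcount--;
}

/* A peer function pins the AFU when enabled and unpins it on release. */
struct peer_fn { struct afu *afu; };

static int peer_enable(struct peer_fn *fn, struct afu *afu)
{
	fn->afu = afu_get(afu);
	return fn->afu ? 0 : -1;
}

static void peer_release(struct peer_fn *fn)
{
	if (fn->afu) {
		afu_put(fn->afu);
		fn->afu = NULL;
	}
}
```

As long as any peer function holds its reference, the count stays above zero and the AFU cannot be torn down underneath it, which is the guarantee the PHB code needs from these helpers.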
[PATCH 02/15] cxl: Add cxl_slot_is_supported API
From: Ian Munsie <imun...@au1.ibm.com> This extends the check that the adapter is in a CAPI capable slot so that it may be called by external users in the kernel API. This will be used by the upcoming Mellanox CX4 support, which needs to know ahead of time if the card can be switched to cxl mode so that it can leave it in PCI mode if it is not. This API takes a parameter to check if CAPP DMA mode is supported, which it currently only allows on P8NVL systems, since that mode currently has issues accessing memory < 4GB on P8, and we cannot realistically avoid that. This API does not currently check if a CAPP unit is available (i.e. not already assigned to another PHB) on P8. Doing so would be racy since it is assigned on a first come first serve basis, and so long as CAPP DMA mode is not supported on P8 we don't need this, since the only anticipated user of this API requires CAPP DMA mode. Cc: Philippe Bergheaud <fe...@linux.vnet.ibm.com> Signed-off-by: Ian Munsie <imun...@au1.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com> Reviewed-by: Frederic Barrat <fbar...@linux.vnet.ibm.com> --- V1->V2: - Fixed typos in comments spotted by Andrew --- drivers/misc/cxl/pci.c | 37 + include/misc/cxl.h | 15 +++ 2 files changed, 52 insertions(+) diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c index 3a5f980..6ac6b05 100644 --- a/drivers/misc/cxl/pci.c +++ b/drivers/misc/cxl/pci.c @@ -1426,6 +1426,43 @@ static int cxl_slot_is_switched(struct pci_dev *dev) return (depth > CXL_MAX_PCIEX_PARENT); } +bool cxl_slot_is_supported(struct pci_dev *dev, int flags) +{ + if (!cpu_has_feature(CPU_FTR_HVMODE)) + return false; + + if ((flags & CXL_SLOT_FLAG_DMA) && (!pvr_version_is(PVR_POWER8NVL))) { + /* +* CAPP DMA mode is technically supported on regular P8, but +* will EEH if the card attempts to access memory < 4GB, which +* we cannot realistically avoid. 
We might be able to work +* around the issue, but until then return unsupported: +*/ + return false; + } + + if (cxl_slot_is_switched(dev)) + return false; + + /* +* XXX: This gets a little tricky on regular P8 (not POWER8NVL) since +* the CAPP can be connected to PHB 0, 1 or 2 on a first come first +* served basis, which is racy to check from here. If we need to +* support this in future we might need to consider having this +* function effectively reserve it ahead of time. +* +* Currently, the only user of this API is the Mellanox CX4, which is +* only supported on P8NVL due to the above mentioned limitation of +* CAPP DMA mode and therefore does not need to worry about this. If the +* issue with CAPP DMA mode is later worked around on P8 we might need +* to revisit this. +*/ + + return true; +} +EXPORT_SYMBOL_GPL(cxl_slot_is_supported); + + static int cxl_probe(struct pci_dev *dev, const struct pci_device_id *id) { struct cxl *adapter; diff --git a/include/misc/cxl.h b/include/misc/cxl.h index b6d040f..dd9eebb 100644 --- a/include/misc/cxl.h +++ b/include/misc/cxl.h @@ -24,6 +24,21 @@ * generic PCI API. This API is agnostic to the actual AFU. */ +#define CXL_SLOT_FLAG_DMA 0x1 + +/* + * Checks if the given card is in a cxl capable slot. Pass CXL_SLOT_FLAG_DMA if + * the card requires CAPP DMA mode to also check if the system supports it. + * This is intended to be used by bi-modal devices to determine if they can use + * cxl mode or if they should continue running in PCI mode. + * + * Note that this only checks if the slot is cxl capable - it does not + * currently check if the CAPP is currently available for chips where it can be + * assigned to different PHBs on a first come first serve basis (i.e. 
P8) + */ +bool cxl_slot_is_supported(struct pci_dev *dev, int flags); + + /* Get the AFU associated with a pci_dev */ struct cxl_afu *cxl_pci_to_afu(struct pci_dev *dev); -- 2.8.1 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
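The order of checks in cxl_slot_is_supported(), and the way a bi-modal driver might use the answer, can be sketched in userspace as follows. This is an illustration under stated assumptions: struct platform and its fields replace the live CPU and PCI topology queries the kernel makes, and SLOT_FLAG_DMA, slot_is_supported() and choose_mode() are hypothetical names.

```c
#include <stdbool.h>

#define SLOT_FLAG_DMA 0x1	/* plays the role of CXL_SLOT_FLAG_DMA */

/* Hypothetical platform description; the kernel queries these live. */
struct platform {
	bool hv_mode;		/* running bare metal (CPU_FTR_HVMODE) */
	bool is_p8nvl;		/* POWER8NVL processor */
	bool slot_switched;	/* card sits too deep behind PCIe bridges */
};

static bool slot_is_supported(const struct platform *p, int flags)
{
	if (!p->hv_mode)
		return false;
	if ((flags & SLOT_FLAG_DMA) && !p->is_p8nvl)
		return false;	/* CAPP DMA mode is only allowed on P8NVL */
	if (p->slot_switched)
		return false;
	return true;
}

enum card_mode { MODE_PCI, MODE_CXL };

/* A bi-modal driver stays in PCI mode unless the slot check passes. */
static enum card_mode choose_mode(const struct platform *p)
{
	return slot_is_supported(p, SLOT_FLAG_DMA) ? MODE_CXL : MODE_PCI;
}
```

Note that, as the commit message says, a positive answer does not reserve a CAPP unit; on chips where the CAPP is assigned first come first served, the later mode switch can still fail.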
[PATCH 01/15] powerpc/powernv: Split cxl code out into a separate file
From: Ian Munsie <imun...@au1.ibm.com> The support for using the Mellanox CX4 in cxl mode will require additions to the PHB code. In preparation for this, move the existing cxl code out of pci-ioda.c into a separate pci-cxl.c file to keep things more organised. Signed-off-by: Ian Munsie <imun...@au1.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com> Reviewed-by: Frederic Barrat <fbar...@linux.vnet.ibm.com> --- V1 -> V2: Changed copyright message in new file to 2014-2016, since most of the code originated in other files written in 2014, and will be adding new code shortly. --- arch/powerpc/platforms/powernv/Makefile | 1 + arch/powerpc/platforms/powernv/pci-cxl.c | 163 ++ arch/powerpc/platforms/powernv/pci-ioda.c | 159 + arch/powerpc/platforms/powernv/pci.h | 6 ++ 4 files changed, 173 insertions(+), 156 deletions(-) create mode 100644 arch/powerpc/platforms/powernv/pci-cxl.c diff --git a/arch/powerpc/platforms/powernv/Makefile b/arch/powerpc/platforms/powernv/Makefile index cd9711e..b5d98cb 100644 --- a/arch/powerpc/platforms/powernv/Makefile +++ b/arch/powerpc/platforms/powernv/Makefile @@ -6,6 +6,7 @@ obj-y += opal-kmsg.o obj-$(CONFIG_SMP) += smp.o subcore.o subcore-asm.o obj-$(CONFIG_PCI) += pci.o pci-ioda.o npu-dma.o +obj-$(CONFIG_CXL_BASE) += pci-cxl.o obj-$(CONFIG_EEH) += eeh-powernv.o obj-$(CONFIG_PPC_SCOM) += opal-xscom.o obj-$(CONFIG_MEMORY_FAILURE) += opal-memory-errors.o diff --git a/arch/powerpc/platforms/powernv/pci-cxl.c b/arch/powerpc/platforms/powernv/pci-cxl.c new file mode 100644 index 000..e0eeb00 --- /dev/null +++ b/arch/powerpc/platforms/powernv/pci-cxl.c @@ -0,0 +1,163 @@ +/* + * Copyright 2014-2016 IBM Corp. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. 
+ */ + +#include +#include + +#include "pci.h" + +struct device_node *pnv_pci_get_phb_node(struct pci_dev *dev) +{ + struct pci_controller *hose = pci_bus_to_host(dev->bus); + + return of_node_get(hose->dn); +} +EXPORT_SYMBOL(pnv_pci_get_phb_node); + +int pnv_phb_to_cxl_mode(struct pci_dev *dev, uint64_t mode) +{ + struct pci_controller *hose = pci_bus_to_host(dev->bus); + struct pnv_phb *phb = hose->private_data; + struct pnv_ioda_pe *pe; + int rc; + + pe = pnv_ioda_get_pe(dev); + if (!pe) + return -ENODEV; + + pe_info(pe, "Switching PHB to CXL\n"); + + rc = opal_pci_set_phb_cxl_mode(phb->opal_id, mode, pe->pe_number); + if (rc == OPAL_UNSUPPORTED) + dev_err(>dev, "Required cxl mode not supported by firmware - update skiboot\n"); + else if (rc) + dev_err(>dev, "opal_pci_set_phb_cxl_mode failed: %i\n", rc); + + return rc; +} +EXPORT_SYMBOL(pnv_phb_to_cxl_mode); + +/* Find PHB for cxl dev and allocate MSI hwirqs? + * Returns the absolute hardware IRQ number + */ +int pnv_cxl_alloc_hwirqs(struct pci_dev *dev, int num) +{ + struct pci_controller *hose = pci_bus_to_host(dev->bus); + struct pnv_phb *phb = hose->private_data; + int hwirq = msi_bitmap_alloc_hwirqs(>msi_bmp, num); + + if (hwirq < 0) { + dev_warn(>dev, "Failed to find a free MSI\n"); + return -ENOSPC; + } + + return phb->msi_base + hwirq; +} +EXPORT_SYMBOL(pnv_cxl_alloc_hwirqs); + +void pnv_cxl_release_hwirqs(struct pci_dev *dev, int hwirq, int num) +{ + struct pci_controller *hose = pci_bus_to_host(dev->bus); + struct pnv_phb *phb = hose->private_data; + + msi_bitmap_free_hwirqs(>msi_bmp, hwirq - phb->msi_base, num); +} +EXPORT_SYMBOL(pnv_cxl_release_hwirqs); + +void pnv_cxl_release_hwirq_ranges(struct cxl_irq_ranges *irqs, + struct pci_dev *dev) +{ + struct pci_controller *hose = pci_bus_to_host(dev->bus); + struct pnv_phb *phb = hose->private_data; + int i, hwirq; + + for (i = 1; i < CXL_IRQ_RANGES; i++) { + if (!irqs->range[i]) + continue; + pr_devel("cxl release irq range 0x%x: offset: 0x%lx limit: 
%ld\n", +i, irqs->offset[i], +irqs->range[i]); + hwirq = irqs->offset[i] - phb->msi_base; + msi_bitmap_free_hwirqs(>msi_bmp, hwirq, + irqs->range[i]); + } +} +EXPORT_SYMBOL(pnv_cxl_release_hwirq_ranges); + +int pnv_cxl_alloc_hwirq_ranges(struct cxl_irq_ranges *irqs, + st
[PATCH v3] powerpc / cxl: Add support for the Mellanox CX4 in cxl mode
This series adds support for the Mellanox CX4 network adapter operating
in cxl mode to the cxl driver and the PowerNV PHB code. The Mellanox
developers will submit a separate patch series that makes use of this in
the mlx5 driver.

The CX4 card can operate in either pci mode or cxl mode. In cxl mode,
memory accesses from the card go through the XSL (Translation Service
Layer, essentially a stripped down version of the Power Service Layer),
allowing it to transparently access unpinned memory with the cxl driver
handling faulting in pages as necessary, etc.

Most of the support for the XSL is already upstream, though this series
does include a bug fix to enable bus mastering for this (patch 3).

Patch 2 in this series provides an API which the mlx5 driver can query
to check if it is in a cxl capable slot.

The card will come up in pci mode, and the mlx5 driver can choose to
switch it to cxl mode, wherein it will reappear with an additional
physical function representing the XSL that the cxl driver will bind to.
Patches 13-15 add support for switching the card's mode, including using
the PCI hotplug support to re-enumerate the device tree and re-probe the
card.

Unlike previous users of the cxl kernel API where we used a virtual PHB
and exposed PCI devices under it, the Mellanox CX4 uses a peer model
where cxl binds to one of the physical functions of the card and the
mlx5_core driver binds to the other networking physical functions.

Patch 6 skips creating a vPHB for AFUs without any AFU configuration
records (including devices using the peer model) and opts out of EEH
handling. Patches 7 and 8 add support for using the cxl kernel API with
the real PHB to enable this peer model. Patches 4 and 5 are preparatory
patches exposing some APIs that the PHB will need to call.

While in cxl mode, interrupts from the CX4 are a little unusual - they
are neither pci interrupts, nor cxl interrupts, but rather a hybrid of
the two. The interrupts are passed from the networking hardware to the
XSL using a custom format in the MSIX table, and from there are treated
as cxl interrupts. These are configured mostly transparently using the
standard msix APIs - the PHB handles allocating and configuring the cxl
interrupts, associating them with the default context, and the mlx5
driver handles filling out the MSIX table with their custom format (not
included in this series). See patch 11.

Additionally, the CX4 has a hard limit on the number of interrupts that
can be associated with a given context, so to overcome this patches 9
and 10 expose an API to allow the mlx5 driver to inform us of the limit,
and the interrupt allocation code in patch 11 will allocate additional
contexts to associate these with.

Patch 1 is a preparatory cleanup patch to reorganise cxl code in
arch/powerpc into a separate file.

Patch 12 is a workaround for a hardware limitation in the CX4 where a
context with PE=0 cannot be used.

The entire series is bisectable.

Changes since v2:
Addressed feedback from Andrew Donnellan:
 - Fixed typos in several comments
 - Moved _cxl_pci_associate_default_context and _cxl_pci_disable_device
   from vphb.c to a new file phb.c since they are used by both the vPHB
   and peer models. (Patch 5)
 - Changed two exported symbols to EXPORT_SYMBOL_GPL (Patch 7)
 - Undid change to remove static from pnv_pci_release_device and
   pci_controller_ops and declare them in the header, both of which
   were left over from an earlier cut. (Patch 7)

Changes since v1:
 - New patch 6 to skip creating a vPHB if there are no AFU configuration
   records, and opt out of EEH handling (partially split from patch 8).
 - Updated comments in various patches (1, 2, 7, 10, 15) with feedback
   from Andrew Donnellan and Frederic Barrat
 - Handle error case if cxl_next_msi_hwirq returns 0 signifying that an
   AFU IRQ is not mapped to a hardware interrupt (Patch 11)
 - Dropped extraneous "select HOTPLUG_PCI_POWERNV_BASE" in Kconfig,
   which was accidentally left in from an earlier non-public revision.
   Thanks to Gavin Shan for pointing it out (Patch 13)
 - Added new error label for error paths calling pci_dev_put() -
   suggested by Ian Munsie (Patch 15)
 - Added newline at end of Kconfig (Patch 15)
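The per-context interrupt limit mentioned in the cover letter comes down to a rounding-up division. The sketch below only shows that arithmetic, assuming interrupts are packed into contexts up to the limit; it is not the allocation code from patch 11, and contexts_needed() and additional_contexts() are hypothetical names.

```c
/*
 * Number of contexts needed to carry nvec interrupts when each context
 * can hold at most max_per_ctx of them. A limit of 0 is treated here as
 * "no limit", so a single context is enough.
 */
static unsigned int contexts_needed(unsigned int nvec, unsigned int max_per_ctx)
{
	if (nvec == 0)
		return 0;
	if (max_per_ctx == 0)
		return 1;
	return (nvec + max_per_ctx - 1) / max_per_ctx;	/* round up */
}

/* Contexts to allocate beyond the default context the device already has. */
static unsigned int additional_contexts(unsigned int nvec,
					unsigned int max_per_ctx)
{
	unsigned int total = contexts_needed(nvec, max_per_ctx);

	return total > 1 ? total - 1 : 0;
}
```

For instance, with a made-up limit of 4 interrupts per context, a request for 17 vectors needs 5 contexts in total, so 4 would be allocated on top of the default one.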
Re: [PATCH 05/15] cxl: Allow a default context to be associated with an external pci_dev
Excerpts from andrew.donnellan's message of 2016-07-13 15:52:45 +1000:
> > +bool _cxl_pci_associate_default_context(struct pci_dev *dev, struct cxl_afu *afu)
>
> If we're sharing these functions between the vPHB and peer models, do we
> have a better place than vphb.c for them?

Sure, I might split them out into a new phb.c for V3. It just seemed a
little pointless to create a new file for two functions at the time, but
you are right that they don't really belong in vphb.c. I guess an
alternative would be to rename vphb.c to phb.c, but 90% of that file is
vphb specific... I'll split these out.

> > +{
> > +	struct cxl_context *ctx;
> > +
> > +	/*
> > +	 * Allocate a context to do cxl things too. This is used for interrupts
>
> s/too/to/?

Heh, the one part of the comment that I didn't change from Mikey's code
;-P

Will fix.

Cheers,
-Ian
Re: [PATCH 07/15] powerpc/powernv: Add support for the cxl kernel api on the real phb
Excerpts from andrew.donnellan's message of 2016-07-12 20:39:13 +1000: > Some comments below - with those addressed: > > Reviewed-by: Andrew DonnellanThanks for the review :) > > V1->V2: > > - Add an explanation of the peer model to the commit message, > > and a comment above the pnv_cxl_enable_device_hook function. > > The comments are good! Thanks :) > > +/* > > + * Sets flags and switches the controller ops to enable the cxl kernel api. > > + * Original the cxl kernel API operated on a virtual PHB, but certain cards > > Originally Will fix. > > +EXPORT_SYMBOL(pnv_cxl_enable_phb_kernel_api); ... > > +EXPORT_SYMBOL(pnv_pci_on_cxl_phb); > > Should these two exports be _GPL as well? Yep, they should be - will fix in V3. Looks like we have several more in this file that should be _GPL as well, but since they weren't introduced in this series (just moved from another file) I might send a separate patch for those after this has been merged. > > -static void pnv_pci_release_device(struct pci_dev *pdev) > > +void pnv_pci_release_device(struct pci_dev *pdev) > > Why is this being unstatic-ified? I don't see us introducing any new > uses of it. Left over from an earlier cut - will undo this for V3. > > -static const struct pci_controller_ops pnv_pci_ioda_controller_ops = { > > +const struct pci_controller_ops pnv_pci_ioda_controller_ops = { > > It looks like we don't currently use this - is this being unstatic-ised > in view of allowing a switch back to regular mode in future? Oops, again left over from an earlier cut (for that reason). Will undo this for V3 - if we allow switching back in the future we can remove the static then. > > +extern void pnv_pci_release_device(struct pci_dev *pdev); Will also remove this declaration of pnv_pci_release_device. > > +extern const struct pci_controller_ops pnv_pci_ioda_controller_ops; > > See applicable comments about static. Will remove. 
Cheers, -Ian
[PATCH 15/15] cxl: Add cxl_check_and_switch_mode() API to switch bi-modal cards
From: Andrew Donnellan <andrew.donnel...@au1.ibm.com> Add a new API, cxl_check_and_switch_mode() to allow for switching of bi-modal CAPI cards, such as the Mellanox CX-4 network card. When a driver requests to switch a card to CAPI mode, use PCI hotplug infrastructure to remove all PCI devices underneath the slot. We then write an updated mode control register to the CAPI VSEC, hot reset the card, and reprobe the card. As the card may present a different set of PCI devices after the mode switch, use the infrastructure provided by the pnv_php driver and the OPAL PCI slot management facilities to ensure that: * the old devices are removed from both the OPAL and Linux device trees * the new devices are probed by OPAL and added to the OPAL device tree * the new devices are added to the Linux device tree and probed through the regular PCI device probe path As such, introduce a new option, CONFIG_CXL_BIMODAL, with a dependency on the pnv_php driver. Refactor existing code that touches the mode control register in the regular single mode case into a new function, setup_cxl_protocol_area(). Co-authored-by: Ian Munsie <imun...@au1.ibm.com> Cc: Gavin Shan <gws...@linux.vnet.ibm.com> Signed-off-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com> Signed-off-by: Ian Munsie <imun...@au1.ibm.com> Reviewed-by: Gavin Shan <gws...@linux.vnet.ibm.com> --- V1->V2: - Added comments around pci_dev_put() - suggested by Frederic Barrat - Added new error label for error paths calling pci_dev_put() - suggested by Ian Munsie - Added newline at end of Kconfig - Removed extraneous comment in setup_cxl_protocol_area() --- drivers/misc/cxl/Kconfig | 8 ++ drivers/misc/cxl/pci.c | 236 +++ include/misc/cxl.h | 25 + 3 files changed, 251 insertions(+), 18 deletions(-) diff --git a/drivers/misc/cxl/Kconfig b/drivers/misc/cxl/Kconfig index 560412c..8d76770 100644 --- a/drivers/misc/cxl/Kconfig +++ b/drivers/misc/cxl/Kconfig @@ -38,3 +38,11 @@ config CXL CAPI adapters are found in POWER8 based systems. 
If unsure, say N. + +config CXL_BIMODAL + bool "Support for bi-modal CAPI cards" + depends on HOTPLUG_PCI_POWERNV = y && CXL || HOTPLUG_PCI_POWERNV = m && CXL = m + default y + help + Select this option to enable support for bi-modal CAPI cards, such as + the Mellanox CX-4. diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c index efe202f..d152e2d 100644 --- a/drivers/misc/cxl/pci.c +++ b/drivers/misc/cxl/pci.c @@ -55,6 +55,8 @@ pci_read_config_byte(dev, vsec + 0xa, dest) #define CXL_WRITE_VSEC_MODE_CONTROL(dev, vsec, val) \ pci_write_config_byte(dev, vsec + 0xa, val) +#define CXL_WRITE_VSEC_MODE_CONTROL_BUS(bus, devfn, vsec, val) \ + pci_bus_write_config_byte(bus, devfn, vsec + 0xa, val) #define CXL_VSEC_PROTOCOL_MASK 0xe0 #define CXL_VSEC_PROTOCOL_1024TB 0x80 #define CXL_VSEC_PROTOCOL_512TB 0x40 @@ -614,36 +616,234 @@ static int setup_cxl_bars(struct pci_dev *dev) return 0; } -/* pciex node: ibm,opal-m64-window = <0x3d058 0x0 0x3d058 0x0 0x8 0x0>; */ -static int switch_card_to_cxl(struct pci_dev *dev) -{ +#ifdef CONFIG_CXL_BIMODAL + +struct cxl_switch_work { + struct pci_dev *dev; + struct work_struct work; int vsec; + int mode; +}; + +static void switch_card_to_cxl(struct work_struct *work) +{ + struct cxl_switch_work *switch_work = + container_of(work, struct cxl_switch_work, work); + struct pci_dev *dev = switch_work->dev; + struct pci_bus *bus = dev->bus; + struct pci_controller *hose = pci_bus_to_host(bus); + struct pci_dev *bridge; + struct pnv_php_slot *php_slot; + unsigned int devfn; u8 val; int rc; - dev_info(>dev, "switch card to CXL\n"); + dev_info(>dev, "cxl: Preparing for mode switch...\n"); + bridge = list_first_entry_or_null(>bus->devices, struct pci_dev, + bus_list); + if (!bridge) { + dev_WARN(>dev, "cxl: Couldn't find root port!\n"); + goto err_dev_put; + } - if (!(vsec = find_cxl_vsec(dev))) { - dev_err(>dev, "ABORTING: CXL VSEC not found!\n"); + php_slot = pnv_php_find_slot(pci_device_to_OF_node(bridge)); + if (!php_slot) { + 
dev_err(>dev, "cxl: Failed to find slot hotplug " + "information. You may need to upgrade " + "skiboot. Aborting.\n"); + goto err_dev_put; + } + + rc = CXL_READ_VSEC_MODE_CONTROL(dev, switch_work->vsec, ); + if (rc) { + dev_err(>
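
For readers unfamiliar with the pattern switch_card_to_cxl() uses to recover its arguments, the following is a minimal userspace sketch of embedding a work item in a request structure and getting back to the request with container_of(). It is not the driver code: struct work_item, struct switch_request, switch_handler() and run_switch() are hypothetical stand-ins, the container_of() macro here is a simplified version without the kernel's type checking, and the work runs synchronously instead of on a workqueue.

```c
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct work_struct. */
struct work_item {
	void (*func)(struct work_item *work);
};

/* Same shape as struct cxl_switch_work in the patch, minus the pci_dev. */
struct switch_request {
	struct work_item work;
	int vsec;
	int mode;
	int done;	/* illustration only: records the mode the handler saw */
};

/* The handler only receives the embedded work item... */
static void switch_handler(struct work_item *work)
{
	/* ...and walks back to the enclosing request to find its arguments. */
	struct switch_request *req =
		container_of(work, struct switch_request, work);

	req->done = req->mode;
}

/* Runs the work item immediately; a real workqueue would defer it. */
static int run_switch(int vsec, int mode)
{
	struct switch_request req = {
		.work = { .func = switch_handler },
		.vsec = vsec,
		.mode = mode,
		.done = 0,
	};

	req.work.func(&req.work);
	return req.done;
}
```

Deferring the switch to a work item matters in the real driver because the hotplug path removes the very device whose driver requested the switch; the sketch only shows how the arguments travel with the work item.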
[PATCH 14/15] PCI/hotplug: pnv_php: handle OPAL_PCI_SLOT_OFFLINE power state
From: Andrew Donnellan <andrew.donnel...@au1.ibm.com>

When calling pnv_php_set_slot_power_state() with state ==
OPAL_PCI_SLOT_OFFLINE, remove devices from the device tree as if we're
dealing with OPAL_PCI_SLOT_POWER_OFF.

Cc: Gavin Shan <gws...@linux.vnet.ibm.com>
Cc: linux-...@vger.kernel.org
Cc: Bjorn Helgaas <bhelg...@google.com>
Signed-off-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com>
Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
Acked-by: Gavin Shan <gws...@linux.vnet.ibm.com>
---
 drivers/pci/hotplug/pnv_php.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
index 2d2f704..e6245b0 100644
--- a/drivers/pci/hotplug/pnv_php.c
+++ b/drivers/pci/hotplug/pnv_php.c
@@ -317,7 +317,7 @@ int pnv_php_set_slot_power_state(struct hotplug_slot *slot,
 		return ret;
 	}
 
-	if (state == OPAL_PCI_SLOT_POWER_OFF)
+	if (state == OPAL_PCI_SLOT_POWER_OFF || state == OPAL_PCI_SLOT_OFFLINE)
 		pnv_php_rmv_devtree(php_slot);
 	else
 		ret = pnv_php_add_devtree(php_slot);
-- 
2.8.1
[PATCH 10/15] cxl: Add preliminary workaround for CX4 interrupt limitation
From: Ian Munsie <imun...@au1.ibm.com>

The Mellanox CX4 has a hardware limitation where only 4 bits of the
AFU interrupt number can be passed to the XSL when sending an interrupt,
limiting it to only 15 interrupts per context (AFU interrupt number 0 is
invalid).

In order to overcome this, we will allocate additional contexts linked
to the default context as extra address space for the extra interrupts -
this will be implemented in the next patch.

This patch adds the preliminary support to allow this, by way of adding
a linked list in the context structure that we use to keep track of the
contexts dedicated to interrupts, and an API to simultaneously iterate
over the related context structures, AFU interrupt numbers and hardware
interrupt numbers. The point of using a single API to iterate these is
to hide some of the details of the iteration from external code, and to
reduce the number of APIs that need to be exported via base.c to allow
built in code to call.

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>
---
V1->V2:
	- Fixed typo spotted by Fred
---
 drivers/misc/cxl/api.c     | 15 +++
 drivers/misc/cxl/base.c    | 17 +
 drivers/misc/cxl/context.c | 1 +
 drivers/misc/cxl/cxl.h     | 10 ++
 drivers/misc/cxl/main.c    | 1 +
 include/misc/cxl.h         | 9 +
 6 files changed, 53 insertions(+)

diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
index 1e2c0d9..f02a859 100644
--- a/drivers/misc/cxl/api.c
+++ b/drivers/misc/cxl/api.c
@@ -97,6 +97,21 @@ static irq_hw_number_t cxl_find_afu_irq(struct cxl_context *ctx, int num)
 	return 0;
 }
 
+int _cxl_next_msi_hwirq(struct pci_dev *pdev, struct cxl_context **ctx, int *afu_irq)
+{
+	if (*ctx == NULL || *afu_irq == 0) {
+		*afu_irq = 1;
+		*ctx = cxl_get_context(pdev);
+	} else {
+		(*afu_irq)++;
+		if (*afu_irq > cxl_get_max_irqs_per_process(pdev)) {
+			*ctx = list_next_entry(*ctx, extra_irq_contexts);
+			*afu_irq = 1;
+		}
+	}
+	return cxl_find_afu_irq(*ctx, *afu_irq);
+}
+/* Exported via cxl_base */
 
 int cxl_set_priv(struct cxl_context *ctx, void *priv)
 {
diff --git a/drivers/misc/cxl/base.c b/drivers/misc/cxl/base.c
index 1c3e737f..d7d9d02 100644
--- a/drivers/misc/cxl/base.c
+++ b/drivers/misc/cxl/base.c
@@ -141,6 +141,23 @@ void cxl_pci_disable_device(struct pci_dev *dev)
 }
 EXPORT_SYMBOL_GPL(cxl_pci_disable_device);
 
+int cxl_next_msi_hwirq(struct pci_dev *pdev, struct cxl_context **ctx, int *afu_irq)
+{
+	int ret;
+	struct cxl_calls *calls;
+
+	calls = cxl_calls_get();
+	if (!calls)
+		return -EBUSY;
+
+	ret = calls->cxl_next_msi_hwirq(pdev, ctx, afu_irq);
+
+	cxl_calls_put(calls);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(cxl_next_msi_hwirq);
+
 static int __init cxl_base_init(void)
 {
 	struct device_node *np = NULL;
diff --git a/drivers/misc/cxl/context.c b/drivers/misc/cxl/context.c
index edbb99e..2616cddb 100644
--- a/drivers/misc/cxl/context.c
+++ b/drivers/misc/cxl/context.c
@@ -68,6 +68,7 @@ int cxl_context_init(struct cxl_context *ctx, struct cxl_afu *afu, bool master,
 	ctx->pending_afu_err = false;
 
 	INIT_LIST_HEAD(&ctx->irq_names);
+	INIT_LIST_HEAD(&ctx->extra_irq_contexts);
 
 	/*
 	 * When we have to destroy all contexts in cxl_context_detach_all() we
diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index b81f476..73b9a55 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -537,6 +537,14 @@ struct cxl_context {
 	atomic_t afu_driver_events;
 
 	struct rcu_head rcu;
+
+	/*
+	 * Only used when more interrupts are allocated via
+	 * pci_enable_msix_range than are supported in the default context, to
+	 * use additional contexts to overcome the limitation. i.e. Mellanox
+	 * CX4 only:
+	 */
+	struct list_head extra_irq_contexts;
 };
 
 struct cxl_service_layer_ops {
@@ -722,11 +730,13 @@ ssize_t cxl_pci_afu_read_err_buffer(struct cxl_afu *afu, char *buf,
 /* Internal functions wrapped in cxl_base to allow PHB to call them */
 bool _cxl_pci_associate_default_context(struct pci_dev *dev, struct cxl_afu *afu);
 void _cxl_pci_disable_device(struct pci_dev *dev);
+int _cxl_next_msi_hwirq(struct pci_dev *pdev, struct cxl_context **ctx, int *afu_irq);
 
 struct cxl_calls {
 	void (*cxl_slbia)(struct mm_struct *mm);
 	bool (*cxl_pci_associate_default_context)(struct pci_dev *dev, struct cxl_afu *afu);
 	void (*cxl_pci_disable_device)(struct pci_dev *dev);
+	int (*cxl_next_msi_hwirq)(struct pci_dev *pdev, struct cxl_context **ctx, int *afu_irq);
 
 	struct module *owner;
 };
diff --git a/drivers/misc/cxl/main.c b/drivers/misc/cxl/main.c
index 4e5474b..6
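
The iteration described in the commit message above (walk AFU interrupt numbers 1..15 in one context, then move to the next linked context and start again at 1) can be sketched in plain C as follows. This is only an illustration under stated assumptions: struct ctx, next_irq_slot() and ctx_for_nth_irq() are hypothetical names, the contexts are chained with a NULL-terminated singly linked list rather than the kernel's circular list_head, and the per-context limit is hard-coded to 15 instead of being queried with cxl_get_max_irqs_per_process().

```c
#include <stddef.h>

/* 4-bit AFU interrupt number with 0 invalid leaves 15 usable per context. */
#define MAX_IRQS_PER_CTX 15

/* Hypothetical stand-in for a cxl context holding interrupts. */
struct ctx {
	int id;
	struct ctx *next;	/* next interrupt-only context, or NULL */
};

/*
 * Advance the (*ctx, *afu_irq) cursor to the next interrupt slot.
 * Returns 1 if the cursor points at a valid slot, 0 once the chain of
 * contexts is exhausted.
 */
static int next_irq_slot(struct ctx *first, struct ctx **ctx, int *afu_irq)
{
	if (*ctx == NULL || *afu_irq == 0) {
		/* First call: start at AFU interrupt 1 of the default context. */
		*ctx = first;
		*afu_irq = 1;
	} else if (++(*afu_irq) > MAX_IRQS_PER_CTX) {
		/* This context is full: move on and restart numbering at 1. */
		*ctx = (*ctx)->next;
		*afu_irq = 1;
	}
	return *ctx != NULL;
}

/* Returns the id of the context that holds the n-th interrupt (1-based), or -1. */
static int ctx_for_nth_irq(struct ctx *first, int n)
{
	struct ctx *c = NULL;
	int irq = 0;

	for (int i = 0; i < n; i++)
		if (!next_irq_slot(first, &c, &irq))
			return -1;
	return c ? c->id : -1;
}
```

With a default context and one extra context, interrupts 1 to 15 land in the first and 16 to 30 in the second. Unlike this sketch, the patch itself does not check for the end of the list in the iterator; the caller is expected to have allocated enough contexts beforehand.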
[PATCH 08/15] cxl: Add support for using the kernel API with a real PHB
From: Ian Munsie <imun...@au1.ibm.com>

This hooks up support for using the kernel API with a real PHB. After
the AFU initialisation has completed it calls into the PHB code to pass
it the AFU that will be used by other peer physical functions on the
adapter.

The cxl_pci_to_afu API is extended to work with peer PCI devices,
retrieving the peer AFU from the PHB. This API may also now return an
error if it is called on a PCI device that is not associated with either
a cxl vPHB or a peer PCI device to an AFU, and this error is propagated
down.

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
---
V1->V2:
	- Removed change to skip participating in EEH without a vPHB -
	  moved out into an earlier patch.
---
 drivers/misc/cxl/api.c  | 5 +
 drivers/misc/cxl/pci.c  | 3 +++
 drivers/misc/cxl/vphb.c | 16 ++--
 3 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
index 7707055..6a030bf 100644
--- a/drivers/misc/cxl/api.c
+++ b/drivers/misc/cxl/api.c
@@ -13,6 +13,7 @@
 #include
 #include
 #include
+#include
 
 #include "cxl.h"
 
@@ -24,6 +25,8 @@ struct cxl_context *cxl_dev_context_init(struct pci_dev *dev)
 	int rc;
 
 	afu = cxl_pci_to_afu(dev);
+	if (IS_ERR(afu))
+		return ERR_CAST(afu);
 
 	ctx = cxl_context_alloc();
 	if (IS_ERR(ctx)) {
@@ -438,6 +441,8 @@ EXPORT_SYMBOL_GPL(cxl_perst_reloads_same_image);
 ssize_t cxl_read_adapter_vpd(struct pci_dev *dev, void *buf, size_t count)
 {
 	struct cxl_afu *afu = cxl_pci_to_afu(dev);
+	if (IS_ERR(afu))
+		return -ENODEV;
 
 	return cxl_ops->read_adapter_vpd(afu->adapter, buf, count);
 }
diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index dd7ff22..cb5d172 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -1502,6 +1502,9 @@ static int cxl_probe(struct pci_dev *dev, const struct pci_device_id *id)
 			dev_err(&dev->dev, "AFU %i failed to start: %i\n", slice, rc);
 	}
 
+	if (pnv_pci_on_cxl_phb(dev) && adapter->slices >= 1)
+		pnv_cxl_phb_set_peer_afu(dev, adapter->afu[0]);
+
 	return 0;
 }
 
diff --git a/drivers/misc/cxl/vphb.c b/drivers/misc/cxl/vphb.c
index 4b81f0f..3d0e791 100644
--- a/drivers/misc/cxl/vphb.c
+++ b/drivers/misc/cxl/vphb.c
@@ -9,6 +9,7 @@
 
 #include
 #include
+#include
 #include "cxl.h"
 
 static int cxl_dma_set_mask(struct pci_dev *pdev, u64 dma_mask)
@@ -291,13 +292,18 @@ void cxl_pci_vphb_remove(struct cxl_afu *afu)
 	pcibios_free_controller(phb);
 }
 
+static bool _cxl_pci_is_vphb_device(struct pci_controller *phb)
+{
+	return (phb->ops == &cxl_pcie_pci_ops);
+}
+
 bool cxl_pci_is_vphb_device(struct pci_dev *dev)
 {
 	struct pci_controller *phb;
 
 	phb = pci_bus_to_host(dev->bus);
 
-	return (phb->ops == &cxl_pcie_pci_ops);
+	return _cxl_pci_is_vphb_device(phb);
 }
 
 struct cxl_afu *cxl_pci_to_afu(struct pci_dev *dev)
@@ -306,7 +312,13 @@ struct cxl_afu *cxl_pci_to_afu(struct pci_dev *dev)
 
 	phb = pci_bus_to_host(dev->bus);
 
-	return (struct cxl_afu *)phb->private_data;
+	if (_cxl_pci_is_vphb_device(phb))
+		return (struct cxl_afu *)phb->private_data;
+
+	if (pnv_pci_on_cxl_phb(dev))
+		return pnv_cxl_phb_to_afu(phb);
+
+	return ERR_PTR(-ENODEV);
 }
 EXPORT_SYMBOL_GPL(cxl_pci_to_afu);
-- 
2.8.1
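
The patch above makes cxl_pci_to_afu() return an error encoded in the pointer, which callers must now test before dereferencing. A rough userspace sketch of that idiom, assuming nothing beyond standard C: err_ptr(), ptr_err() and is_err() are simplified stand-ins for the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(), and struct phb, struct afu, phb_to_afu() and read_slice() are hypothetical names used only for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Errors are encoded as the top MAX_ERRNO values of the address space. */
#define MAX_ERRNO 4095

static inline void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline long ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical model of the three kinds of host bridge a device may sit on. */
enum phb_kind { PHB_VIRTUAL, PHB_CXL_PEER, PHB_PLAIN };

struct afu {
	int slice;
};

struct phb {
	enum phb_kind kind;
	struct afu *afu;
};

/* Returns the AFU for vPHB and peer-model bridges, an encoded -ENODEV otherwise. */
static struct afu *phb_to_afu(struct phb *phb)
{
	if (phb->kind == PHB_VIRTUAL || phb->kind == PHB_CXL_PEER)
		return phb->afu;
	return err_ptr(-ENODEV);
}

/* A caller that propagates the error instead of dereferencing it. */
static long read_slice(struct phb *phb)
{
	struct afu *afu = phb_to_afu(phb);

	if (is_err(afu))
		return ptr_err(afu);
	return afu->slice;
}
```

The point of the idiom is that a single return value carries either a valid pointer or an errno, so every existing caller of the lookup needs a check like the one in read_slice().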
[PATCH 13/15] PCI/hotplug: pnv_php: export symbols and move struct types needed by cxl
From: Andrew Donnellan <andrew.donnel...@au1.ibm.com>

The cxl driver will use infrastructure from pnv_php to handle device tree
updates when switching bi-modal CAPI cards into CAPI mode.

To enable this, export pnv_php_find_slot() and
pnv_php_set_slot_power_state(), and add corresponding declarations, as well
as the definition of struct pnv_php_slot, to asm/pnv-pci.h.

Cc: Gavin Shan <gws...@linux.vnet.ibm.com>
Cc: linux-...@vger.kernel.org
Cc: Bjorn Helgaas <bhelg...@google.com>
Signed-off-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com>
Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
Acked-by: Gavin Shan <gws...@linux.vnet.ibm.com>
---
V1->V2:
	- Dropped extraneous "select HOTPLUG_PCI_POWERNV_BASE" in Kconfig,
	  which was accidentally left in from an earlier non-public revision.
	  Thanks to Gavin Shan for pointing it out
---
 arch/powerpc/include/asm/pnv-pci.h | 28
 drivers/pci/hotplug/pnv_php.c      | 32 +---
 2 files changed, 33 insertions(+), 27 deletions(-)

diff --git a/arch/powerpc/include/asm/pnv-pci.h b/arch/powerpc/include/asm/pnv-pci.h
index c47097f..0cbd813 100644
--- a/arch/powerpc/include/asm/pnv-pci.h
+++ b/arch/powerpc/include/asm/pnv-pci.h
@@ -11,6 +11,7 @@
 #define _ASM_PNV_PCI_H
 
 #include
+#include
 #include
 #include
 
@@ -47,4 +48,31 @@ void pnv_cxl_phb_set_peer_afu(struct pci_dev *dev, struct cxl_afu *afu);
 
 #endif
 
+struct pnv_php_slot {
+	struct hotplug_slot	slot;
+	struct hotplug_slot_info	slot_info;
+	uint64_t	id;
+	char	*name;
+	int	slot_no;
+	struct kref	kref;
+#define PNV_PHP_STATE_INITIALIZED	0
+#define PNV_PHP_STATE_REGISTERED	1
+#define PNV_PHP_STATE_POPULATED	2
+#define PNV_PHP_STATE_OFFLINE	3
+	int	state;
+	struct device_node	*dn;
+	struct pci_dev	*pdev;
+	struct pci_bus	*bus;
+	bool	power_state_check;
+	void	*fdt;
+	void	*dt;
+	struct of_changeset	ocs;
+	struct pnv_php_slot	*parent;
+	struct list_head	children;
+	struct list_head	link;
+};
+extern struct pnv_php_slot *pnv_php_find_slot(struct device_node *dn);
+extern int pnv_php_set_slot_power_state(struct hotplug_slot *slot,
+					uint8_t state);
+
 #endif
diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
index 6086db6..2d2f704 100644
--- a/drivers/pci/hotplug/pnv_php.c
+++ b/drivers/pci/hotplug/pnv_php.c
@@ -22,30 +22,6 @@
 #define DRIVER_AUTHOR	"Gavin Shan, IBM Corporation"
 #define DRIVER_DESC	"PowerPC PowerNV PCI Hotplug Driver"
 
-struct pnv_php_slot {
-	struct hotplug_slot	slot;
-	struct hotplug_slot_info	slot_info;
-	uint64_t	id;
-	char	*name;
-	int	slot_no;
-	struct kref	kref;
-#define PNV_PHP_STATE_INITIALIZED	0
-#define PNV_PHP_STATE_REGISTERED	1
-#define PNV_PHP_STATE_POPULATED	2
-#define PNV_PHP_STATE_OFFLINE	3
-	int	state;
-	struct device_node	*dn;
-	struct pci_dev	*pdev;
-	struct pci_bus	*bus;
-	bool	power_state_check;
-	void	*fdt;
-	void	*dt;
-	struct of_changeset	ocs;
-	struct pnv_php_slot	*parent;
-	struct list_head	children;
-	struct list_head	link;
-};
-
 static LIST_HEAD(pnv_php_slot_list);
 static DEFINE_SPINLOCK(pnv_php_lock);
 
@@ -91,7 +67,7 @@ static struct pnv_php_slot *pnv_php_match(struct device_node *dn,
 	return NULL;
 }
 
-static struct pnv_php_slot *pnv_php_find_slot(struct device_node *dn)
+struct pnv_php_slot *pnv_php_find_slot(struct device_node *dn)
 {
 	struct pnv_php_slot *php_slot, *tmp;
 	unsigned long flags;
@@ -108,6 +84,7 @@ static struct pnv_php_slot *pnv_php_find_slot(struct device_node *dn)
 
 	return NULL;
 }
+EXPORT_SYMBOL_GPL(pnv_php_find_slot);
 
 /*
  * Remove pdn for all children of the indicated device node.
@@ -316,8 +293,8 @@ out:
 	return ret;
 }
 
-static int pnv_php_set_slot_power_state(struct hotplug_slot *slot,
-					uint8_t state)
+int pnv_php_set_slot_power_state(struct hotplug_slot *slot,
+				 uint8_t state)
 {
 	struc
[PATCH 12/15] cxl: Workaround PE=0 hardware limitation in Mellanox CX4
From: Ian Munsie <imun...@au1.ibm.com>

The CX4 card cannot cope with a context with PE=0 due to a hardware
limitation, resulting in:

[ 34.166577] command failed, status limits exceeded(0x8), syndrome 0x5a7939
[ 34.166580] mlx5_core :01:00.1: Failed allocating uar, aborting

Since the kernel API allocates a default context very early during
device init that will almost certainly get Process Element ID 0 there is
no easy way for us to extend the API to allow the Mellanox to inform us
of this limitation ahead of time.

Instead, work around the issue by extending the XSL structure to include
a minimum PE to allocate. Although the bug is not in the XSL, it is the
easiest place to work around this limitation given that the CX4 is
currently the only card that uses an XSL.

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>
---
 drivers/misc/cxl/context.c | 3 ++-
 drivers/misc/cxl/cxl.h     | 1 +
 drivers/misc/cxl/pci.c     | 1 +
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/cxl/context.c b/drivers/misc/cxl/context.c
index 2616cddb..bdee9a0 100644
--- a/drivers/misc/cxl/context.c
+++ b/drivers/misc/cxl/context.c
@@ -90,7 +90,8 @@ int cxl_context_init(struct cxl_context *ctx, struct cxl_afu *afu, bool master,
 	 */
 	mutex_lock(&afu->contexts_lock);
 	idr_preload(GFP_KERNEL);
-	i = idr_alloc(&ctx->afu->contexts_idr, ctx, 0,
+	i = idr_alloc(&ctx->afu->contexts_idr, ctx,
+		      ctx->afu->adapter->native->sl_ops->min_pe,
 		      ctx->afu->num_procs, GFP_NOWAIT);
 	idr_preload_end();
 	mutex_unlock(&afu->contexts_lock);
diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index d50cdb1..de09053 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -561,6 +561,7 @@ struct cxl_service_layer_ops {
 	u64 (*timebase_read)(struct cxl *adapter);
 	int capi_mode;
 	bool needs_reset_before_disable;
+	int min_pe;
 };
 
 struct cxl_native {
diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index cb5d172..efe202f 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -1321,6 +1321,7 @@ static const struct cxl_service_layer_ops xsl_ops = {
 	.write_timebase_ctrl = write_timebase_ctrl_xsl,
 	.timebase_read = timebase_read_xsl,
 	.capi_mode = OPAL_PHB_CAPI_MODE_DMA,
+	.min_pe = 1, /* Workaround for Mellanox CX4 HW bug */
 };
 
 static void set_sl_ops(struct cxl *adapter, struct pci_dev *dev)
-- 
2.8.1
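
The effect of the min_pe workaround above is simply that the ID allocator never hands out values below a configured floor. A toy illustration, assuming a fixed-size table in place of the kernel IDR: struct pe_table, pe_alloc() and NUM_PROCS are hypothetical names, and the sketch returns -1 when the range is full where idr_alloc() would return a negative errno.

```c
#include <stdbool.h>

#define NUM_PROCS 8	/* illustrative upper bound, not a real hardware value */

/* Hypothetical stand-in for the per-AFU IDR of process elements. */
struct pe_table {
	bool used[NUM_PROCS];
	int min_pe;	/* lowest PE the card can cope with */
};

/* Returns the lowest free PE in [min_pe, NUM_PROCS), or -1 if none is left. */
static int pe_alloc(struct pe_table *t)
{
	for (int i = t->min_pe; i < NUM_PROCS; i++) {
		if (!t->used[i]) {
			t->used[i] = true;
			return i;
		}
	}
	return -1;
}
```

With min_pe set to 1 the first allocation, which corresponds to the default context created during device init, gets PE 1 rather than PE 0, at the cost of one unusable slot.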
[PATCH 05/15] cxl: Allow a default context to be associated with an external pci_dev
From: Ian Munsie <imun...@au1.ibm.com>

The cxl kernel API has a concept of a default context associated with
each PCI device under the virtual PHB. The Mellanox CX4 will also use
the cxl kernel API, but it does not use a virtual PHB - rather, the AFU
appears as a physical function as a peer to the networking functions.

In order to allow the kernel API to work with those networking
functions, we will need to associate a default context with them as
well. To this end, refactor the corresponding code to do this in vphb.c
and export it so that it can be called from the PHB code.

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>
---
 drivers/misc/cxl/base.c | 35 +++
 drivers/misc/cxl/cxl.h  | 6 ++
 drivers/misc/cxl/main.c | 2 ++
 drivers/misc/cxl/vphb.c | 37 +++--
 include/misc/cxl-base.h | 6 ++
 5 files changed, 72 insertions(+), 14 deletions(-)

diff --git a/drivers/misc/cxl/base.c b/drivers/misc/cxl/base.c
index d7dcf5b..1c3e737f 100644
--- a/drivers/misc/cxl/base.c
+++ b/drivers/misc/cxl/base.c
@@ -106,6 +106,41 @@ int cxl_update_properties(struct device_node *dn,
 }
 EXPORT_SYMBOL_GPL(cxl_update_properties);
 
+/*
+ * API calls into the driver that may be called from the PHB code and must be
+ * built in.
+ */
+bool cxl_pci_associate_default_context(struct pci_dev *dev, struct cxl_afu *afu)
+{
+	bool ret;
+	struct cxl_calls *calls;
+
+	calls = cxl_calls_get();
+	if (!calls)
+		return false;
+
+	ret = calls->cxl_pci_associate_default_context(dev, afu);
+
+	cxl_calls_put(calls);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(cxl_pci_associate_default_context);
+
+void cxl_pci_disable_device(struct pci_dev *dev)
+{
+	struct cxl_calls *calls;
+
+	calls = cxl_calls_get();
+	if (!calls)
+		return;
+
+	calls->cxl_pci_disable_device(dev);
+
+	cxl_calls_put(calls);
+}
+EXPORT_SYMBOL_GPL(cxl_pci_disable_device);
+
 static int __init cxl_base_init(void)
 {
 	struct device_node *np = NULL;
diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index d4aae6f..b81f476 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -719,9 +719,15 @@ static inline u64 cxl_p2n_read(struct cxl_afu *afu, cxl_p2n_reg_t reg)
 ssize_t cxl_pci_afu_read_err_buffer(struct cxl_afu *afu, char *buf,
 				loff_t off, size_t count);
 
+/* Internal functions wrapped in cxl_base to allow PHB to call them */
+bool _cxl_pci_associate_default_context(struct pci_dev *dev, struct cxl_afu *afu);
+void _cxl_pci_disable_device(struct pci_dev *dev);
 
 struct cxl_calls {
 	void (*cxl_slbia)(struct mm_struct *mm);
+	bool (*cxl_pci_associate_default_context)(struct pci_dev *dev, struct cxl_afu *afu);
+	void (*cxl_pci_disable_device)(struct pci_dev *dev);
+
 	struct module *owner;
 };
 int register_cxl_calls(struct cxl_calls *calls);
diff --git a/drivers/misc/cxl/main.c b/drivers/misc/cxl/main.c
index ae68c32..4e5474b 100644
--- a/drivers/misc/cxl/main.c
+++ b/drivers/misc/cxl/main.c
@@ -110,6 +110,8 @@ static inline void cxl_slbia_core(struct mm_struct *mm)
 
 static struct cxl_calls cxl_calls = {
 	.cxl_slbia = cxl_slbia_core,
+	.cxl_pci_associate_default_context = _cxl_pci_associate_default_context,
+	.cxl_pci_disable_device = _cxl_pci_disable_device,
 	.owner = THIS_MODULE,
 };
 
diff --git a/drivers/misc/cxl/vphb.c b/drivers/misc/cxl/vphb.c
index 012b6aa..c5b9c201 100644
--- a/drivers/misc/cxl/vphb.c
+++ b/drivers/misc/cxl/vphb.c
@@ -40,11 +40,28 @@ static void cxl_teardown_msi_irqs(struct pci_dev *pdev)
 	 */
 }
 
+bool _cxl_pci_associate_default_context(struct pci_dev *dev, struct cxl_afu *afu)
+{
+	struct cxl_context *ctx;
+
+	/*
+	 * Allocate a context to do cxl things too. This is used for interrupts
+	 * in the peer model using a real phb, and if we eventually do DMA ops
+	 * in the virtual phb, we'll need a default context to attach them to.
+	 */
+	ctx = cxl_dev_context_init(dev);
+	if (!ctx)
+		return false;
+	dev->dev.archdata.cxl_ctx = ctx;
+
+	return (cxl_ops->afu_check_and_enable(afu) == 0);
+}
+/* exported via cxl_base */
+
 static bool cxl_pci_enable_device_hook(struct pci_dev *dev)
 {
 	struct pci_controller *phb;
 	struct cxl_afu *afu;
-	struct cxl_context *ctx;
 
 	phb = pci_bus_to_host(dev->bus);
 	afu = (struct cxl_afu *)phb->private_data;
@@ -57,19 +74,10 @@ static bool cxl_pci_enable_device_hook(struct pci_dev *dev)
 	set_dma_ops(&dev->dev, &dma_direct_ops);
 	set_dma_offset(&dev->dev, PAGE_OFFSET);
 
-	/*
-	 * Allocate a context to do cxl things too. If we eventually do real
-	 * DMA ops, we'll need a default context to attach them to
-
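
The wrappers added to base.c above follow one pattern: built-in code calls a small stub, and the stub forwards through a table of function pointers that only exists while the cxl module is loaded. A simplified userspace sketch of that indirection follows. All names in it (struct calls, register_calls(), unregister_calls(), module_associate()) are hypothetical, an int stands in for the pci_dev, and the reference counting and locking that cxl_calls_get()/cxl_calls_put() provide in the kernel are deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Table of entry points the loadable part provides to built-in code. */
struct calls {
	bool (*associate_default_context)(int dev_id);
};

/* NULL until the "module" registers its table. */
static struct calls *registered;

static int register_calls(struct calls *c)
{
	if (registered)
		return -1;	/* only one provider at a time */
	registered = c;
	return 0;
}

static void unregister_calls(void)
{
	registered = NULL;
}

/* Built-in stub: safe to call whether or not the module is present. */
static bool associate_default_context(int dev_id)
{
	struct calls *c = registered;

	if (!c)
		return false;
	return c->associate_default_context(dev_id);
}

/* "Module" side implementation and the table it would register at init. */
static bool module_associate(int dev_id)
{
	return dev_id >= 0;
}

static struct calls module_calls = {
	.associate_default_context = module_associate,
};
```

The stub returns a failure value when no table is registered, which is what lets the PHB code be built in while the driver proper remains a module.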
[PATCH 11/15] cxl: Add support for interrupts on the Mellanox CX4
From: Ian Munsie <imun...@au1.ibm.com>

The Mellanox CX4 in cxl mode uses a hybrid interrupt model, where
interrupts are routed from the networking hardware to the XSL using the
MSIX table, and from there will be transformed back into an MSIX
interrupt using the cxl style interrupts (i.e. using IVTE entries and
ranges to map a PE and AFU interrupt number to an MSIX address).

We want to hide the implementation details of cxl interrupts as much as
possible. To this end, we use a special version of the MSI setup &
teardown routines in the PHB while in cxl mode to allocate the cxl
interrupts and configure the IVTE entries in the process element.

This function does not configure the MSIX table - the CX4 card uses a
custom format in that table and it would not be appropriate to fill that
out in generic code. The rest of the functionality is similar to the
"Full MSI-X mode" described in the CAIA, and this could be easily
extended to support other adapters that use that mode in the future.

The interrupts will be associated with the default context. If the
maximum number of interrupts per context has been limited (e.g. by the
mlx5 driver), it will automatically allocate additional kernel contexts
to associate extra interrupts as required. These contexts will be
started using the same WED that was used to start the default context.

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
---
V1->V2:
	- Handle error case if cxl_next_msi_hwirq returns 0 signifying that an
	  AFU IRQ is not mapped to a hardware interrupt.
---
 arch/powerpc/platforms/powernv/pci-cxl.c  | 84 +++
 arch/powerpc/platforms/powernv/pci-ioda.c | 4 ++
 arch/powerpc/platforms/powernv/pci.h      | 2 +
 drivers/misc/cxl/api.c                    | 71 ++
 drivers/misc/cxl/base.c                   | 31
 drivers/misc/cxl/cxl.h                    | 4 ++
 drivers/misc/cxl/main.c                   | 2 +
 include/misc/cxl-base.h                   | 4 ++
 8 files changed, 202 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/pci-cxl.c b/arch/powerpc/platforms/powernv/pci-cxl.c
index 3c4caf0..0e6bd0a 100644
--- a/arch/powerpc/platforms/powernv/pci-cxl.c
+++ b/arch/powerpc/platforms/powernv/pci-cxl.c
@@ -8,6 +8,7 @@
  */
 
 #include
+#include
 #include
 #include
 #include
@@ -281,3 +282,86 @@ void pnv_cxl_disable_device(struct pci_dev *dev)
 	cxl_pci_disable_device(dev);
 	cxl_afu_put(afu);
 }
+
+/*
+ * This is a special version of pnv_setup_msi_irqs for cards in cxl mode. This
+ * function handles setting up the IVTE entries for the XSL to use.
+ *
+ * We are currently not filling out the MSIX table, since the only currently
+ * supported adapter (CX4) uses a custom MSIX table format in cxl mode and it
+ * is up to their driver to fill that out. In the future we may fill out the
+ * MSIX table (and change the IVTE entries to be an index to the MSIX table)
+ * for adapters implementing the Full MSI-X mode described in the CAIA.
+ */
+int pnv_cxl_cx4_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
+{
+	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
+	struct pnv_phb *phb = hose->private_data;
+	struct msi_desc *entry;
+	struct cxl_context *ctx = NULL;
+	unsigned int virq;
+	int hwirq;
+	int afu_irq = 0;
+	int rc;
+
+	if (WARN_ON(!phb) || !phb->msi_bmp.bitmap)
+		return -ENODEV;
+
+	if (pdev->no_64bit_msi && !phb->msi32_support)
+		return -ENODEV;
+
+	rc = cxl_cx4_setup_msi_irqs(pdev, nvec, type);
+	if (rc)
+		return rc;
+
+	for_each_pci_msi_entry(entry, pdev) {
+		if (!entry->msi_attrib.is_64 && !phb->msi32_support) {
+			pr_warn("%s: Supports only 64-bit MSIs\n",
+				pci_name(pdev));
+			return -ENXIO;
+		}
+
+		hwirq = cxl_next_msi_hwirq(pdev, &ctx, &afu_irq);
+		if (WARN_ON(hwirq <= 0))
+			return (hwirq ? hwirq : -ENOMEM);
+
+		virq = irq_create_mapping(NULL, hwirq);
+		if (virq == NO_IRQ) {
+			pr_warn("%s: Failed to map cxl mode MSI to linux irq\n",
+				pci_name(pdev));
+			return -ENOMEM;
+		}
+
+		rc = pnv_cxl_ioda_msi_setup(pdev, hwirq, virq);
+		if (rc) {
+			pr_warn("%s: Failed to setup cxl mode MSI\n", pci_name(pdev));
+			irq_dispose_mapping(virq);
+			return rc;
+		}
+
+		irq_set_msi_desc(virq, entry);
+	}
+
+	return 0;
+}
+
+void pnv_cxl_cx4_teardown_msi_irqs(struct pci_dev *pdev)
+{
+	struct pci_controller *hose = pci_bus_t
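
As a worked example of the "additional kernel contexts" mentioned in the commit message above: if each context can carry at most max_per_ctx interrupts and a driver requests nvec vectors, the number of contexts needed beyond the default one is ceil(nvec / max_per_ctx) - 1. The helper below is a hypothetical illustration of that arithmetic only. It assumes each context is filled completely before another is allocated, which the truncated patch text does not show explicitly.

```c
/*
 * Number of contexts needed in addition to the default context to hold
 * nvec interrupts when each context can carry at most max_per_ctx.
 * Returns 0 for non-positive inputs.
 */
static int extra_contexts_needed(int nvec, int max_per_ctx)
{
	if (nvec <= 0 || max_per_ctx <= 0)
		return 0;
	/* Integer ceiling division, minus the default context itself. */
	return (nvec + max_per_ctx - 1) / max_per_ctx - 1;
}
```

With the CX4 limit of 15 interrupts per context, a request for 16 vectors needs one extra context and a request for 64 needs four.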
[PATCH 09/15] cxl: Add kernel APIs to get & set the max irqs per context
From: Ian Munsie <imun...@au1.ibm.com>

These APIs will be used by the Mellanox CX4 support. While they function
standalone to configure existing behaviour, their primary purpose is to
allow the Mellanox driver to inform the cxl driver of a hardware
limitation, which will be used in a future patch.

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>
---
 drivers/misc/cxl/api.c | 27 +++
 include/misc/cxl.h     | 10 ++
 2 files changed, 37 insertions(+)

diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
index 6a030bf..1e2c0d9 100644
--- a/drivers/misc/cxl/api.c
+++ b/drivers/misc/cxl/api.c
@@ -447,3 +447,30 @@ ssize_t cxl_read_adapter_vpd(struct pci_dev *dev, void *buf, size_t count)
 	return cxl_ops->read_adapter_vpd(afu->adapter, buf, count);
 }
 EXPORT_SYMBOL_GPL(cxl_read_adapter_vpd);
+
+int cxl_set_max_irqs_per_process(struct pci_dev *dev, int irqs)
+{
+	struct cxl_afu *afu = cxl_pci_to_afu(dev);
+	if (IS_ERR(afu))
+		return -ENODEV;
+
+	if (irqs > afu->adapter->user_irqs)
+		return -EINVAL;
+
+	/* Limit user_irqs to prevent the user increasing this via sysfs */
+	afu->adapter->user_irqs = irqs;
+	afu->irqs_max = irqs;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cxl_set_max_irqs_per_process);
+
+int cxl_get_max_irqs_per_process(struct pci_dev *dev)
+{
+	struct cxl_afu *afu = cxl_pci_to_afu(dev);
+	if (IS_ERR(afu))
+		return -ENODEV;
+
+	return afu->irqs_max;
+}
+EXPORT_SYMBOL_GPL(cxl_get_max_irqs_per_process);
diff --git a/include/misc/cxl.h b/include/misc/cxl.h
index dd9eebb..fc07ed4 100644
--- a/include/misc/cxl.h
+++ b/include/misc/cxl.h
@@ -166,6 +166,16 @@ void cxl_psa_unmap(void __iomem *addr);
 /* Get the process element for this context */
 int cxl_process_element(struct cxl_context *ctx);
 
+/*
+ * Limit the number of interrupts that a single context can allocate via
+ * cxl_start_work. If using the api with a real phb, this may be used to
+ * request that additional default contexts be created when allocating
+ * interrupts via pci_enable_msix_range. These will be set to the same running
+ * state as the default context, and if that is running it will reuse the
+ * parameters previously passed to cxl_start_context for the default context.
+ */
+int cxl_set_max_irqs_per_process(struct pci_dev *dev, int irqs);
+int cxl_get_max_irqs_per_process(struct pci_dev *dev);
 
 /*
  * These calls allow drivers to create their own file descriptors and make them
-- 
2.8.1
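
One property of cxl_set_max_irqs_per_process() in the patch above is easy to miss: because it also lowers the adapter-wide user_irqs ceiling, the limit can only ever be tightened, never raised again. The following standalone sketch mirrors that logic with plain structs. struct adapter, struct afu_limits and set_max_irqs() are hypothetical stand-ins, and the cxl_pci_to_afu() lookup is omitted.

```c
#include <errno.h>

/* Hypothetical stand-ins for the adapter and AFU fields the patch touches. */
struct adapter {
	int user_irqs;	/* adapter-wide ceiling, also adjustable via sysfs */
};

struct afu_limits {
	struct adapter *adapter;
	int irqs_max;	/* per-context limit */
};

/* Mirrors the body of cxl_set_max_irqs_per_process() after the AFU lookup. */
static int set_max_irqs(struct afu_limits *afu, int irqs)
{
	if (irqs > afu->adapter->user_irqs)
		return -EINVAL;

	/* Lowering the ceiling too stops later requests from raising the limit. */
	afu->adapter->user_irqs = irqs;
	afu->irqs_max = irqs;

	return 0;
}
```

In this model a driver that sets the limit to 15 and later asks for 32 gets -EINVAL, and both fields stay at 15.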
[PATCH 07/15] powerpc/powernv: Add support for the cxl kernel api on the real phb
From: Ian Munsie <imun...@au1.ibm.com>

This adds support for the peer model of the cxl kernel api to the
PowerNV PHB, in which physical function 0 represents the cxl function on
the card (an XSL in the case of the CX4), which other physical functions
will use for memory access and interrupt services. It is referred to as
the peer model as these functions are peers of one another, as opposed
to the Virtual PHB model which forms a hierarchy.

This patch exports APIs to enable the peer mode, check if a PCI device
is attached to a PHB in this mode, and to set and get the peer AFU for
this mode.

The cxl driver will enable this mode for supported cards by calling
pnv_cxl_enable_phb_kernel_api(). This will set a flag in the PHB to note
that this mode is enabled, and switch out it's controller_ops for the
cxl version.

The cxl version of the controller_ops struct implements it's own
versions of the enable_device_hook and release_device to handle
refcounting on the peer AFU and to allocate a default context for the
device.

Once enabled, the cxl kernel API may not be disabled on a PHB. Currently
there is no safe way to disable cxl mode short of a reboot, so until
that changes there is no reason to support the disable path.

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
---
V1->V2:
	- Add an explanation of the peer model to the commit message,
	  and a comment above the pnv_cxl_enable_device_hook function.
---
 arch/powerpc/include/asm/pnv-pci.h        | 7 ++
 arch/powerpc/platforms/powernv/pci-cxl.c  | 120 ++
 arch/powerpc/platforms/powernv/pci-ioda.c | 22 +-
 arch/powerpc/platforms/powernv/pci.h      | 16
 4 files changed, 162 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/pnv-pci.h b/arch/powerpc/include/asm/pnv-pci.h
index 791db1b..c47097f 100644
--- a/arch/powerpc/include/asm/pnv-pci.h
+++ b/arch/powerpc/include/asm/pnv-pci.h
@@ -38,6 +38,13 @@ int pnv_cxl_alloc_hwirq_ranges(struct cxl_irq_ranges *irqs,
 			       struct pci_dev *dev, int num);
 void pnv_cxl_release_hwirq_ranges(struct cxl_irq_ranges *irqs,
 				  struct pci_dev *dev);
+
+/* Support for the cxl kernel api on the real PHB (instead of vPHB) */
+int pnv_cxl_enable_phb_kernel_api(struct pci_controller *hose, bool enable);
+bool pnv_pci_on_cxl_phb(struct pci_dev *dev);
+struct cxl_afu *pnv_cxl_phb_to_afu(struct pci_controller *hose);
+void pnv_cxl_phb_set_peer_afu(struct pci_dev *dev, struct cxl_afu *afu);
+
 #endif
 
 #endif
diff --git a/arch/powerpc/platforms/powernv/pci-cxl.c b/arch/powerpc/platforms/powernv/pci-cxl.c
index e0eeb00..3c4caf0 100644
--- a/arch/powerpc/platforms/powernv/pci-cxl.c
+++ b/arch/powerpc/platforms/powernv/pci-cxl.c
@@ -7,8 +7,11 @@
  * 2 of the License, or (at your option) any later version.
  */
 
+#include
+#include
 #include
 #include
+#include
 
 #include "pci.h"
 
@@ -161,3 +164,120 @@ int pnv_cxl_ioda_msi_setup(struct pci_dev *dev, unsigned int hwirq,
 	return 0;
 }
 EXPORT_SYMBOL(pnv_cxl_ioda_msi_setup);
+
+/*
+ * Sets flags and switches the controller ops to enable the cxl kernel api.
+ * Original the cxl kernel API operated on a virtual PHB, but certain cards
+ * such as the Mellanox CX4 use a peer model instead and for these cards the
+ * cxl kernel api will operate on the real PHB.
+ */
+int pnv_cxl_enable_phb_kernel_api(struct pci_controller *hose, bool enable)
+{
+	struct pnv_phb *phb = hose->private_data;
+	struct module *cxl_module;
+
+	if (!enable) {
+		/*
+		 * Once cxl mode is enabled on the PHB, there is currently no
+		 * known safe method to disable it again, and trying risks a
+		 * checkstop. If we can find a way to safely disable cxl mode
+		 * in the future we can revisit this, but for now the only sane
+		 * thing to do is to refuse to disable cxl mode:
+		 */
+		return -EPERM;
+	}
+
+	/*
+	 * Hold a reference to the cxl module since several PHB operations now
+	 * depend on it, and it would be insane to allow it to be removed so
+	 * long as we are in this mode (and since we can't safely disable this
+	 * mode once enabled...).
+	 */
+	mutex_lock(&module_mutex);
+	cxl_module = find_module("cxl");
+	if (cxl_module)
+		__module_get(cxl_module);
+	mutex_unlock(&module_mutex);
+	if (!cxl_module)
+		return -ENODEV;
+
+	phb->flags |= PNV_PHB_FLAG_CXL;
+	hose->controller_ops = pnv_cxl_cx4_ioda_controller_ops;
+
+	return 0;
+}
+EXPORT_SYMBOL(pnv_cxl_enable_phb_kernel_api);
+
+bool pnv_pci_on_cxl_phb(struct pci_dev *dev)
+{
+	struct pci_controller *hose = pci_bus_to_host(dev->bus);
+	struct pnv_phb *phb = hose->private_data;
+
+	return !!(phb->flags & PNV_PHB_FL
[PATCH 06/15] cxl: Do not create vPHB if there are no AFU configuration records
From: Ian Munsie <imun...@au1.ibm.com>

The vPHB model of the cxl kernel API is a hierarchy where the AFU is represented by the vPHB, and its AFU configuration records are exposed as functions under that vPHB. If there are no AFU configuration records we will create a vPHB with nothing under it, which is a waste of resources and will opt us into EEH handling despite not having anything special to handle.

This also does not make sense for cards using the peer model of the cxl kernel API, where the other functions of the device are exposed via additional peer physical functions rather than AFU configuration records. This model will also not work with the existing EEH handling in the cxl driver, as that is designed around the vPHB model.

Skip creating the vPHB for AFUs without any AFU configuration records, and opt out of EEH handling for them.

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
---
 drivers/misc/cxl/pci.c  |  3 +++
 drivers/misc/cxl/vphb.c | 11 +++
 2 files changed, 14 insertions(+)

diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index deef9c7..dd7ff22 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -1572,6 +1572,9 @@ static pci_ers_result_t cxl_pci_error_detected(struct pci_dev *pdev,
 	 */
 	for (i = 0; i < adapter->slices; i++) {
 		afu = adapter->afu[i];
+		/* Only participate in EEH if we are on a virtual PHB */
+		if (afu->phb == NULL)
+			return PCI_ERS_RESULT_NONE;
 		cxl_vphb_error_detected(afu, state);
 	}
 	return PCI_ERS_RESULT_DISCONNECT;
diff --git a/drivers/misc/cxl/vphb.c b/drivers/misc/cxl/vphb.c
index c5b9c201..4b81f0f 100644
--- a/drivers/misc/cxl/vphb.c
+++ b/drivers/misc/cxl/vphb.c
@@ -221,6 +221,17 @@ int cxl_pci_vphb_add(struct cxl_afu *afu)
 	struct device_node *vphb_dn;
 	struct device *parent;
 
+	/*
+	 * If there are no AFU configuration records we won't have anything to
+	 * expose under the vPHB, so skip creating one, returning success since
+	 * this is still a valid case. This will also opt us out of EEH
+	 * handling since we won't have anything special to do if there are no
+	 * kernel drivers attached to the vPHB, and EEH handling is not yet
+	 * supported in the peer model.
+	 */
+	if (!afu->crs_num)
+		return 0;
+
 	/* The parent device is the adapter. Reuse the device node of
 	 * the adapter.
 	 * We don't seem to care what device node is used for the vPHB,
-- 
2.8.1
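The interaction between the two hunks of this patch can be modelled in a few lines of user-space C. This is only a sketch: `model_afu`, `model_vphb_add()` and `model_participates_in_eeh()` are hypothetical stand-ins, not the kernel structures or functions, and the only behaviour taken from the patch is "no configuration records means no vPHB, and no vPHB means no EEH participation".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cxl_afu; only the two fields the patch tests. */
struct model_afu {
	int crs_num;	/* number of AFU configuration records */
	void *phb;	/* non-NULL once a vPHB has been created */
};

static int model_phb_object;	/* placeholder for an allocated vPHB */

/*
 * Models the early return added to cxl_pci_vphb_add(): with no
 * configuration records there is nothing to expose, so report success
 * without creating a vPHB.
 */
int model_vphb_add(struct model_afu *afu)
{
	if (!afu->crs_num)
		return 0;

	afu->phb = &model_phb_object;
	return 0;
}

/* Models the check added to cxl_pci_error_detected(). */
bool model_participates_in_eeh(const struct model_afu *afu)
{
	return afu->phb != NULL;
}
```

An AFU with `crs_num == 0` (for example one using the peer model) comes out of `model_vphb_add()` with `phb` still NULL, which is what makes the EEH check skip it.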
[PATCH v2] powerpc / cxl: Add support for the Mellanox CX4 in cxl mode
This series adds support for the Mellanox CX4 network adapter operating in cxl mode to the cxl driver and the PowerNV PHB code. The Mellanox developers will submit a separate patch series that makes use of this in the mlx5 driver.

The CX4 card can operate in either pci mode or cxl mode. In cxl mode, memory accesses from the card go through the XSL (Translation Service Layer, essentially a stripped down version of the Power Service Layer), allowing it to transparently access unpinned memory with the cxl driver handling faulting in pages as necessary, etc. Most of the support for the XSL is already upstream, though this series does include a bug fix to enable bus mastering for this (patch 3).

Patch 2 in this series provides an API which the mlx5 driver can query to check if it is in a cxl capable slot.

The card will come up in pci mode, and the mlx5 driver can choose to switch it to cxl mode, wherein it will reappear with an additional physical function representing the XSL that the cxl driver will bind to. Patches 13-15 add support for switching the card's mode, including using the PCI hotplug support to re-enumerate the device tree and re-probe the card.

Unlike previous users of the cxl kernel API where we used a virtual PHB and exposed PCI devices under it, the Mellanox CX4 uses a peer model where cxl binds to one of the physical functions of the card and the mlx5_core driver binds to the other networking physical functions. Patch 6 skips creating a vPHB for AFUs without any AFU configuration records (including devices using the peer model) and opts out of EEH handling. Patches 7 and 8 add support for using the cxl kernel API with the real PHB to enable this peer model. Patches 4 and 5 are preparatory patches exposing some APIs that the PHB will need to call.

While in cxl mode, interrupts from the CX4 are a little unusual - they are neither pci interrupts, nor cxl interrupts, but rather a hybrid of the two. The interrupts are passed from the networking hardware to the XSL using a custom format in the MSIX table, and from there are treated as cxl interrupts. These are configured mostly transparently using the standard msix APIs - the PHB handles allocating and configuring the cxl interrupts, associating them with the default context, and the mlx5 driver handles filling out the MSIX table with their custom format (not included in this series). See patch 11.

Additionally, the CX4 has a hard limit on the number of interrupts that can be associated with a given context, so to overcome this patches 9 and 10 expose an API to allow the mlx5 driver to inform us of the limit, and the interrupt allocation code in patch 11 will allocate additional contexts to associate these with.

Patch 1 is a preparatory cleanup patch to reorganise cxl code in arch/powerpc into a separate file.

Patch 12 is a workaround for a hardware limitation in the CX4 where a context with PE=0 cannot be used.

The entire series is bisectable.

Changes since v1:
- New patch 6 to skip creating a vPHB if there are no AFU configuration records, and opt out of EEH handling (partially split from patch 8).
- Updated comments in various patches (1, 2, 7, 10, 15) with feedback from Andrew Donnellan and Frederic Barrat
- Handle error case if cxl_next_msi_hwirq returns 0 signifying that an AFU IRQ is not mapped to a hardware interrupt (Patch 11)
- Dropped extraneous "select HOTPLUG_PCI_POWERNV_BASE" in Kconfig, which was accidentally left in from an earlier non-public revision. Thanks to Gavin Shan for pointing it out (Patch 13)
- Added new error label for error paths calling pci_dev_put() - suggested by Ian Munsie (Patch 15)
- Added newline at end of Kconfig (Patch 15)
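The per-context interrupt limit described in the cover letter comes down to a rounding-up division. The sketch below shows only that arithmetic, assuming every context (including the default one) can take up to the same limit. `extra_contexts_needed()` is a hypothetical helper name, and the real allocation code in the series is not reproduced here.

```c
/*
 * Hypothetical helper: given the number of interrupts a driver requests
 * and the per-context limit, return how many contexts are needed in
 * addition to the default context. Returns -1 for invalid input.
 */
int extra_contexts_needed(int nvec, int max_irqs_per_ctx)
{
	int total_contexts;

	if (nvec < 0 || max_irqs_per_ctx <= 0)
		return -1;
	if (nvec == 0)
		return 0;

	/* Round up without risking overflow in the addition. */
	total_contexts = (nvec - 1) / max_irqs_per_ctx + 1;

	/* The default context already exists, so it is not "extra". */
	return total_contexts - 1;
}
```

Under that assumption, a request for 9 interrupts with a limit of 4 per context would need the default context plus two more.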
[PATCH 04/15] cxl: Move cxl_afu_get / cxl_afu_put to base
From: Ian Munsie <imun...@au1.ibm.com>

The Mellanox CX4 uses a model where the AFU is one physical function of the device, and is used by other peer physical functions of the same device. This will require those other devices to grab a reference on the AFU when they are initialised to make sure that it does not go away during their lifetime.

Move the AFU refcount functions to base.c so they can be called from the PHB code.

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>
---
 drivers/misc/cxl/base.c | 13 +
 drivers/misc/cxl/cxl.h  | 12
 include/misc/cxl-base.h |  4
 3 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/drivers/misc/cxl/base.c b/drivers/misc/cxl/base.c
index e6f49ac..d7dcf5b 100644
--- a/drivers/misc/cxl/base.c
+++ b/drivers/misc/cxl/base.c
@@ -54,6 +54,19 @@ static inline void cxl_calls_put(struct cxl_calls *calls) { }
 
 #endif /* CONFIG_CXL_MODULE */
 
+/* AFU refcount management */
+struct cxl_afu *cxl_afu_get(struct cxl_afu *afu)
+{
+	return (get_device(&afu->dev) == NULL) ? NULL : afu;
+}
+EXPORT_SYMBOL_GPL(cxl_afu_get);
+
+void cxl_afu_put(struct cxl_afu *afu)
+{
+	put_device(&afu->dev);
+}
+EXPORT_SYMBOL_GPL(cxl_afu_put);
+
 void cxl_slbia(struct mm_struct *mm)
 {
 	struct cxl_calls *calls;
diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index 36b3237..d4aae6f 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -440,18 +440,6 @@ struct cxl_afu {
 	bool enabled;
 };
 
-/* AFU refcount management */
-static inline struct cxl_afu *cxl_afu_get(struct cxl_afu *afu)
-{
-
-	return (get_device(&afu->dev) == NULL) ? NULL : afu;
-}
-
-static inline void cxl_afu_put(struct cxl_afu *afu)
-{
-	put_device(&afu->dev);
-}
-
 
 struct cxl_irq_name {
 	struct list_head list;
diff --git a/include/misc/cxl-base.h b/include/misc/cxl-base.h
index 5ae9625..f53808f 100644
--- a/include/misc/cxl-base.h
+++ b/include/misc/cxl-base.h
@@ -36,11 +36,15 @@ static inline void cxl_ctx_put(void)
 	atomic_dec(_use_count);
 }
 
+struct cxl_afu *cxl_afu_get(struct cxl_afu *afu);
+void cxl_afu_put(struct cxl_afu *afu);
 void cxl_slbia(struct mm_struct *mm);
 
 #else /* CONFIG_CXL_BASE */
 
 static inline bool cxl_ctx_in_use(void) { return false; }
+static inline struct cxl_afu *cxl_afu_get(struct cxl_afu *afu) { return NULL; }
+static inline void cxl_afu_put(struct cxl_afu *afu) {}
 static inline void cxl_slbia(struct mm_struct *mm) {}
 
 #endif /* CONFIG_CXL_BASE */
-- 
2.8.1
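The get/put pairing that peer functions are expected to follow can be illustrated with a small user-space model. Everything here is a stand-in: `model_device`, `model_get_device()` and `model_put_device()` only imitate the counting behaviour of the driver core's reference helpers with a plain integer, and the `model_afu_*` wrappers copy the shape of the functions moved by this patch.

```c
#include <stddef.h>

/* Hypothetical stand-ins; a plain counter replaces the kernel's kobject refcount. */
struct model_device {
	int refcount;
};

struct model_afu {
	struct model_device dev;
};

/* Imitates get_device(): take a reference and return the device (NULL stays NULL). */
struct model_device *model_get_device(struct model_device *dev)
{
	if (dev)
		dev->refcount++;
	return dev;
}

/* Imitates put_device(): drop a reference. */
void model_put_device(struct model_device *dev)
{
	if (dev)
		dev->refcount--;
}

/* Same shape as cxl_afu_get() in the patch. */
struct model_afu *model_afu_get(struct model_afu *afu)
{
	return (model_get_device(&afu->dev) == NULL) ? NULL : afu;
}

/* Same shape as cxl_afu_put() in the patch. */
void model_afu_put(struct model_afu *afu)
{
	model_put_device(&afu->dev);
}
```

A peer function would take the reference when it is initialised and drop it when it is released, so the count only returns to its starting value once every peer has gone away.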
[PATCH 03/15] cxl: Enable bus mastering for devices using CAPP DMA mode
From: Ian Munsie <imun...@au1.ibm.com> Devices that use CAPP DMA mode (such as the Mellanox CX4) require bus master to be enabled in order for the CAPI traffic to flow. This should be harmless to enable for other cxl devices, so unconditionally enable it in the adapter init flow. Signed-off-by: Ian Munsie <imun...@au1.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com> Reviewed-by: Frederic Barrat <fbar...@linux.vnet.ibm.com> --- drivers/misc/cxl/pci.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c index 6ac6b05..deef9c7 100644 --- a/drivers/misc/cxl/pci.c +++ b/drivers/misc/cxl/pci.c @@ -1264,6 +1264,9 @@ static int cxl_configure_adapter(struct cxl *adapter, struct pci_dev *dev) if ((rc = adapter->native->sl_ops->adapter_regs_init(adapter, dev))) goto err; + /* Required for devices using CAPP DMA mode, harmless for others */ + pci_set_master(dev); + if ((rc = pnv_phb_to_cxl_mode(dev, adapter->native->sl_ops->capi_mode))) goto err; -- 2.8.1 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 02/15] cxl: Add cxl_slot_is_supported API
From: Ian Munsie <imun...@au1.ibm.com> This extends the check that the adapter is in a CAPI capable slot so that it may be called by external users in the kernel API. This will be used by the upcoming Mellanox CX4 support, which needs to know ahead of time if the card can be switched to cxl mode so that it can leave it in PCI mode if it is not. This API takes a parameter to check if CAPP DMA mode is supported, which it currently only allows on P8NVL systems, since that mode currently has issues accessing memory < 4GB on P8, and we cannot realistically avoid that. This API does not currently check if a CAPP unit is available (i.e. not already assigned to another PHB) on P8. Doing so would be racy since it is assigned on a first come first serve basis, and so long as CAPP DMA mode is not supported on P8 we don't need this, since the only anticipated user of this API requires CAPP DMA mode. Cc: Philippe Bergheaud <fe...@linux.vnet.ibm.com> Signed-off-by: Ian Munsie <imun...@au1.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com> Reviewed-by: Frederic Barrat <fbar...@linux.vnet.ibm.com> --- V1->V2: - Fixed typos in comments spotted by Andrew --- drivers/misc/cxl/pci.c | 37 + include/misc/cxl.h | 15 +++ 2 files changed, 52 insertions(+) diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c index 3a5f980..6ac6b05 100644 --- a/drivers/misc/cxl/pci.c +++ b/drivers/misc/cxl/pci.c @@ -1426,6 +1426,43 @@ static int cxl_slot_is_switched(struct pci_dev *dev) return (depth > CXL_MAX_PCIEX_PARENT); } +bool cxl_slot_is_supported(struct pci_dev *dev, int flags) +{ + if (!cpu_has_feature(CPU_FTR_HVMODE)) + return false; + + if ((flags & CXL_SLOT_FLAG_DMA) && (!pvr_version_is(PVR_POWER8NVL))) { + /* +* CAPP DMA mode is technically supported on regular P8, but +* will EEH if the card attempts to access memory < 4GB, which +* we cannot realistically avoid. 
We might be able to work +* around the issue, but until then return unsupported: +*/ + return false; + } + + if (cxl_slot_is_switched(dev)) + return false; + + /* +* XXX: This gets a little tricky on regular P8 (not POWER8NVL) since +* the CAPP can be connected to PHB 0, 1 or 2 on a first come first +* served basis, which is racy to check from here. If we need to +* support this in future we might need to consider having this +* function effectively reserve it ahead of time. +* +* Currently, the only user of this API is the Mellanox CX4, which is +* only supported on P8NVL due to the above mentioned limitation of +* CAPP DMA mode and therefore does not need to worry about this. If the +* issue with CAPP DMA mode is later worked around on P8 we might need +* to revisit this. +*/ + + return true; +} +EXPORT_SYMBOL_GPL(cxl_slot_is_supported); + + static int cxl_probe(struct pci_dev *dev, const struct pci_device_id *id) { struct cxl *adapter; diff --git a/include/misc/cxl.h b/include/misc/cxl.h index b6d040f..dd9eebb 100644 --- a/include/misc/cxl.h +++ b/include/misc/cxl.h @@ -24,6 +24,21 @@ * generic PCI API. This API is agnostic to the actual AFU. */ +#define CXL_SLOT_FLAG_DMA 0x1 + +/* + * Checks if the given card is in a cxl capable slot. Pass CXL_SLOT_FLAG_DMA if + * the card requires CAPP DMA mode to also check if the system supports it. + * This is intended to be used by bi-modal devices to determine if they can use + * cxl mode or if they should continue running in PCI mode. + * + * Note that this only checks if the slot is cxl capable - it does not + * currently check if the CAPP is currently available for chips where it can be + * assigned to different PHBs on a first come first serve basis (i.e. 
P8) + */ +bool cxl_slot_is_supported(struct pci_dev *dev, int flags); + + /* Get the AFU associated with a pci_dev */ struct cxl_afu *cxl_pci_to_afu(struct pci_dev *dev); -- 2.8.1 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
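The decision logic of cxl_slot_is_supported() can be restated as a user-space model, which may make the order of the checks easier to follow. This is a sketch under stated assumptions: `model_system` and its three booleans are hypothetical stand-ins for the results of cpu_has_feature(CPU_FTR_HVMODE), pvr_version_is(PVR_POWER8NVL) and cxl_slot_is_switched(dev), and `MODEL_SLOT_FLAG_DMA` mirrors CXL_SLOT_FLAG_DMA.

```c
#include <stdbool.h>

#define MODEL_SLOT_FLAG_DMA 0x1	/* mirrors CXL_SLOT_FLAG_DMA */

/* Hypothetical snapshot of the platform queries made by the real function. */
struct model_system {
	bool hv_mode;		/* running in hypervisor mode */
	bool is_p8nvl;		/* CPU is a POWER8NVL */
	bool slot_is_switched;	/* card sits behind a PCIe switch */
};

bool model_slot_is_supported(const struct model_system *sys, int flags)
{
	if (!sys->hv_mode)
		return false;

	/* CAPP DMA mode is only reported as supported on P8NVL. */
	if ((flags & MODEL_SLOT_FLAG_DMA) && !sys->is_p8nvl)
		return false;

	if (sys->slot_is_switched)
		return false;

	return true;
}
```

A bi-modal driver would be expected to stay in PCI mode whenever this kind of check returns false. As the patch notes, a true result does not guarantee that a CAPP unit is still free on regular P8.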
[PATCH 01/15] powerpc/powernv: Split cxl code out into a separate file
From: Ian Munsie <imun...@au1.ibm.com> The support for using the Mellanox CX4 in cxl mode will require additions to the PHB code. In preparation for this, move the existing cxl code out of pci-ioda.c into a separate pci-cxl.c file to keep things more organised. Signed-off-by: Ian Munsie <imun...@au1.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com> Reviewed-by: Frederic Barrat <fbar...@linux.vnet.ibm.com> --- V1 -> V2: Changed copyright message in new file to 2014-2016, since most of the code originated in other files written in 2014, and will be adding new code shortly. --- arch/powerpc/platforms/powernv/Makefile | 1 + arch/powerpc/platforms/powernv/pci-cxl.c | 163 ++ arch/powerpc/platforms/powernv/pci-ioda.c | 159 + arch/powerpc/platforms/powernv/pci.h | 6 ++ 4 files changed, 173 insertions(+), 156 deletions(-) create mode 100644 arch/powerpc/platforms/powernv/pci-cxl.c diff --git a/arch/powerpc/platforms/powernv/Makefile b/arch/powerpc/platforms/powernv/Makefile index cd9711e..b5d98cb 100644 --- a/arch/powerpc/platforms/powernv/Makefile +++ b/arch/powerpc/platforms/powernv/Makefile @@ -6,6 +6,7 @@ obj-y += opal-kmsg.o obj-$(CONFIG_SMP) += smp.o subcore.o subcore-asm.o obj-$(CONFIG_PCI) += pci.o pci-ioda.o npu-dma.o +obj-$(CONFIG_CXL_BASE) += pci-cxl.o obj-$(CONFIG_EEH) += eeh-powernv.o obj-$(CONFIG_PPC_SCOM) += opal-xscom.o obj-$(CONFIG_MEMORY_FAILURE) += opal-memory-errors.o diff --git a/arch/powerpc/platforms/powernv/pci-cxl.c b/arch/powerpc/platforms/powernv/pci-cxl.c new file mode 100644 index 000..e0eeb00 --- /dev/null +++ b/arch/powerpc/platforms/powernv/pci-cxl.c @@ -0,0 +1,163 @@ +/* + * Copyright 2014-2016 IBM Corp. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. 
+ */ + +#include +#include + +#include "pci.h" + +struct device_node *pnv_pci_get_phb_node(struct pci_dev *dev) +{ + struct pci_controller *hose = pci_bus_to_host(dev->bus); + + return of_node_get(hose->dn); +} +EXPORT_SYMBOL(pnv_pci_get_phb_node); + +int pnv_phb_to_cxl_mode(struct pci_dev *dev, uint64_t mode) +{ + struct pci_controller *hose = pci_bus_to_host(dev->bus); + struct pnv_phb *phb = hose->private_data; + struct pnv_ioda_pe *pe; + int rc; + + pe = pnv_ioda_get_pe(dev); + if (!pe) + return -ENODEV; + + pe_info(pe, "Switching PHB to CXL\n"); + + rc = opal_pci_set_phb_cxl_mode(phb->opal_id, mode, pe->pe_number); + if (rc == OPAL_UNSUPPORTED) + dev_err(>dev, "Required cxl mode not supported by firmware - update skiboot\n"); + else if (rc) + dev_err(>dev, "opal_pci_set_phb_cxl_mode failed: %i\n", rc); + + return rc; +} +EXPORT_SYMBOL(pnv_phb_to_cxl_mode); + +/* Find PHB for cxl dev and allocate MSI hwirqs? + * Returns the absolute hardware IRQ number + */ +int pnv_cxl_alloc_hwirqs(struct pci_dev *dev, int num) +{ + struct pci_controller *hose = pci_bus_to_host(dev->bus); + struct pnv_phb *phb = hose->private_data; + int hwirq = msi_bitmap_alloc_hwirqs(>msi_bmp, num); + + if (hwirq < 0) { + dev_warn(>dev, "Failed to find a free MSI\n"); + return -ENOSPC; + } + + return phb->msi_base + hwirq; +} +EXPORT_SYMBOL(pnv_cxl_alloc_hwirqs); + +void pnv_cxl_release_hwirqs(struct pci_dev *dev, int hwirq, int num) +{ + struct pci_controller *hose = pci_bus_to_host(dev->bus); + struct pnv_phb *phb = hose->private_data; + + msi_bitmap_free_hwirqs(>msi_bmp, hwirq - phb->msi_base, num); +} +EXPORT_SYMBOL(pnv_cxl_release_hwirqs); + +void pnv_cxl_release_hwirq_ranges(struct cxl_irq_ranges *irqs, + struct pci_dev *dev) +{ + struct pci_controller *hose = pci_bus_to_host(dev->bus); + struct pnv_phb *phb = hose->private_data; + int i, hwirq; + + for (i = 1; i < CXL_IRQ_RANGES; i++) { + if (!irqs->range[i]) + continue; + pr_devel("cxl release irq range 0x%x: offset: 0x%lx limit: 
%ld\n", +i, irqs->offset[i], +irqs->range[i]); + hwirq = irqs->offset[i] - phb->msi_base; + msi_bitmap_free_hwirqs(>msi_bmp, hwirq, + irqs->range[i]); + } +} +EXPORT_SYMBOL(pnv_cxl_release_hwirq_ranges); + +int pnv_cxl_alloc_hwirq_ranges(struct cxl_irq_ranges *irqs, + st
Re: [PATCH 14/14] cxl: Add cxl_check_and_switch_mode() API to switch bi-modal cards
Excerpts from andrew.donnellan's message of 2016-07-07 18:15:06 +1000:
> On 07/07/16 16:44, Andrew Donnellan wrote:
> > We can match the vendor, device ID *and* class code - unfortunately
> > there isn't a macro for this, which makes it a little bit less
> > aesthetically pleasing, but I'm pretty sure this works.
>
> Something like the below, which works fine:

I like this solution, but I'm not going to include it in v2 of this series and would rather it be submitted separately.

The reason is that this series will work as is, and I'd like to see this undergo some regression testing separate to the cx4 work, and a bit of scrutiny from the hardware team just in case we are missing any device IDs that would no longer be matched (I'm not aware of any, but you never know).

Cheers,
-Ian
Re: [PATCH 07/14] cxl: Add support for using the kernel API with a real PHB
Excerpts from Frederic Barrat's message of 2016-07-06 20:30:41 +0200: > > > @@ -1572,6 +1575,9 @@ static pci_ers_result_t cxl_pci_error_detected(struct > > pci_dev *pdev, > >*/ > > for (i = 0; i < adapter->slices; i++) { > > afu = adapter->afu[i]; > > +/* Only participate in EEH if we are on a virtual PHB */ > > +if (afu->phb == NULL) > > +return PCI_ERS_RESULT_NONE; > > cxl_vphb_error_detected(afu, state); > > } > > > Sorry, I had my notes out of order, something is bugging me here. Don't > we always define afu->phb, though for Mellanox (or if there's no config > record in the general case), we don't have any devices attached to it? I think you're right. I'll change the vPHB code to skip it if there are no configuration records. > Which raises the question of the handling of slot_reset and resume > callbacks... We aren't going to support EEH (at least not yet) - the vPHB model makes this (relatively) easy since we can notify the AFU drivers when we get notified, but in the peer model it will be the real PHB notifying us and the networking drivers. If we do end up supporting that, it will come later. Cheers, -Ian ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 06/14] powerpc/powernv: Add support for the cxl kernel api on the real phb
Excerpts from Frederic Barrat's message of 2016-07-06 19:38:18 +0200:
> > +/* No special handling for cxl function: */
> > +if (PCI_FUNC(dev->devfn) == 0)
> > +return true;
>
> I believe that is the first time we're getting a hint of the black magic
> which is going to occur when the card is switched to cxl mode and the
> appearance of a new pci function. I think a general comment explaining
> it is needed somewhere. In this patch or a later one. Also "peer model"
> is used several times in the commit messages, though it's not clear to
> the novice what it really means.
>
> At this point of the review, I was a bit overwhelmed by all the new
> APIs, wondering how everything would end up working together. By the
> last patch, it's understandable, but a few extra comments would help.
> For the vPHB model, pretty much all the relevant code is in one file,
> which helps grabbing the full picture. But here it's spread between the
> phb platform code and the cxl driver.
>
>    Fred

Ok, will see what I can do to clarify this.

-Ian
Re: [PATCH 14/14] cxl: Add cxl_check_and_switch_mode() API to switch bi-modal cards
Excerpts from andrew.donnellan's message of 2016-07-07 11:18:37 +1000:
> > This is to balance the 'get' done in cxl_check_and_switch_mode(), right?
> > A comment wouldn't hurt. I think we're missing the 'put' on the first
> > error path above (!bridge).
>
> Yep, it's to balance the pci_dev_get() in cxl_check_and_switch_mode() -
> you're right, a comment to that effect wouldn't hurt.
>
> You're also right about the error path. Will fix in V2.

We could probably use a dedicated error label for all the error paths before the pci_dev_put in the main function so we don't need it in every error path.

> > I was half-expecting to see a new entry in the cxl_pci_tbl pci ID table
> > for the Mellanox entry, but no such thing. By what magic is cxl_probe()
> > called after the switch? Because of the device class?
>
> It matches against the class, as function 0 of the device after reset
> comes up as a class 1200 processing accelerator.
>
> Perhaps we should be a bit more explicit though...

If we explicitly match the Vendor + Device ID we will also match the networking functions, which we can't do, because before the mode switch there *IS* a CAPI VSEC in one of the networking functions and our driver would mistake it for a generic accelerator and try to initialise it.

We could add a comment to this effect to the PCI ID table.

Cheers,
-Ian
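The point about matching on class rather than on Vendor + Device ID alone can be shown with a small user-space model of PCI ID table matching. This is a sketch, not the cxl_pci_tbl contents: the struct names and `model_match()` are hypothetical, and the vendor, device and class values used with it are placeholders chosen only so that two functions share a vendor and device ID but differ in class.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_ANY_ID 0xffffffffu	/* wildcard, in the spirit of PCI_ANY_ID */

/* Hypothetical ID table entry: class_mask selects which class bits must match. */
struct model_pci_id {
	uint32_t vendor;
	uint32_t device;
	uint32_t class_code;
	uint32_t class_mask;
};

/* Hypothetical PCI function as seen on the bus. */
struct model_pci_func {
	uint32_t vendor;
	uint32_t device;
	uint32_t class_code;
};

/* An entry matches if vendor and device match (or are wildcards) and the masked class bits agree. */
bool model_match(const struct model_pci_id *id, const struct model_pci_func *fn)
{
	return (id->vendor == MODEL_ANY_ID || id->vendor == fn->vendor) &&
	       (id->device == MODEL_ANY_ID || id->device == fn->device) &&
	       !((id->class_code ^ fn->class_code) & id->class_mask);
}
```

With a zero class mask, an entry keyed on vendor and device matches every function of the card, including the networking ones. Adding the class and a mask over its upper bits narrows the match to the accelerator function only.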
Re: [PATCH 10/14] cxl: Add support for interrupts on the Mellanox CX4
Excerpts from Frederic Barrat's message of 2016-07-06 20:41:42 +0200:
> I think we want:
> if (WARN_ON(hwirq <= 0))
> cxl_find_afu_irq() returns 0 if doesn't find the irq, which is not
> supposed to happen here.

Good catch - will fix in v2.

Cheers,
-Ian
Re: [PATCH 08/14] cxl: Add kernel APIs to get & set the max irqs per context
Excerpts from Frederic Barrat's message of 2016-07-06 20:11:48 +0200: > > Le 04/07/2016 15:22, Ian Munsie a écrit : > > From: Ian Munsie <imun...@au1.ibm.com> > > > > These APIs will be used by the Mellanox CX4 support. While they function > > standalone to configure existing behaviour, their primary purpose is to > > allow the Mellanox driver to inform the cxl driver of a hardware > > limitation, which will be used in a future patch. > > > > Signed-off-by: Ian Munsie <imun...@au1.ibm.com> > > Any way to add a check that the "set max" API is called before the > interrupts are allocated? I don't think there is any real need - if the set max API has not been called then we use the maximum number of interrupts possible on the PHB, which is the correct thing to do if we don't need the workaround. We could try adding a WARN in the set max API if interrupts have previously been allocated, but realistically - if a driver developer needs to use this they already know it and will be testing for it. Cheers, -Ian ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v3] cxl: Refine slice error debug messages
Acked-by: Ian Munsie <imun...@au1.ibm.com>
Re: [PATCH v2] cxl: Refine slice error debug messages
I agree with Mikey - this needs a description. But otherwise it looks good to me, and I'll be happy if it stops any more AFU developers from reporting their bugs to us, so happy to add this now: Acked-by: Ian Munsie <imun...@au1.ibm.com> Excerpts from Philippe Bergheaud's message of 2016-07-04 17:07:36 +0200: > Signed-off-by: Philippe Bergheaud <fe...@linux.vnet.ibm.com> > --- > Changes since v1: > - Rebased on Ian's patch > "cxl: Abstract the differences between the PSL and XSL" > > drivers/misc/cxl/cxl.h| 15 +++ > drivers/misc/cxl/guest.c | 9 ++--- > drivers/misc/cxl/irq.c| 29 + > drivers/misc/cxl/native.c | 12 +++- > 4 files changed, 57 insertions(+), 8 deletions(-) > > diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h > index 7745252..d928a8c 100644 > --- a/drivers/misc/cxl/cxl.h > +++ b/drivers/misc/cxl/cxl.h > @@ -188,6 +188,18 @@ static const cxl_p2n_reg_t CXL_PSL_WED_An = {0x0A0}; > #define CXL_PSL_ID_An_F(1ull << (63-31)) > #define CXL_PSL_ID_An_L(1ull << (63-30)) > > +/** CXL_PSL_SERR_An / > +#define CXL_PSL_SERR_An_afuto(1ull << (63-0)) > +#define CXL_PSL_SERR_An_afudis(1ull << (63-1)) > +#define CXL_PSL_SERR_An_afuov(1ull << (63-2)) > +#define CXL_PSL_SERR_An_badsrc(1ull << (63-3)) > +#define CXL_PSL_SERR_An_badctx(1ull << (63-4)) > +#define CXL_PSL_SERR_An_llcmdis(1ull << (63-5)) > +#define CXL_PSL_SERR_An_llcmdto(1ull << (63-6)) > +#define CXL_PSL_SERR_An_afupar(1ull << (63-7)) > +#define CXL_PSL_SERR_An_afudup(1ull << (63-8)) > +#define CXL_PSL_SERR_An_AE(1ull << (63-30)) > + > /** CXL_PSL_SCNTL_An > / > #define CXL_PSL_SCNTL_An_CR (0x1ull << (63-15)) > /* Programming Modes: */ > @@ -905,4 +917,7 @@ extern const struct cxl_backend_ops *cxl_ops; > > /* check if the given pci_dev is on the the cxl vphb bus */ > bool cxl_pci_is_vphb_device(struct pci_dev *dev); > + > +/* decode AFU error bits in the PSL register PSL_SERR_An */ > +void cxl_afu_decode_psl_serr(struct cxl_afu *afu, u64 serr); > #endif > diff --git a/drivers/misc/cxl/guest.c 
b/drivers/misc/cxl/guest.c > index bc8d0b9..d516d0a 100644 > --- a/drivers/misc/cxl/guest.c > +++ b/drivers/misc/cxl/guest.c > @@ -196,15 +196,18 @@ static irqreturn_t guest_slice_irq_err(int irq, void > *data) > { > struct cxl_afu *afu = data; > int rc; > -u64 serr; > +u64 serr, afu_error, dsisr; > > -WARN(irq, "CXL SLICE ERROR interrupt %i\n", irq); > rc = cxl_h_get_fn_error_interrupt(afu->guest->handle, ); > if (rc) { > dev_crit(>dev, "Couldn't read PSL_SERR_An: %d\n", rc); > return IRQ_HANDLED; > } > -dev_crit(>dev, "PSL_SERR_An: 0x%.16llx\n", serr); > +afu_error = cxl_p2n_read(afu, CXL_AFU_ERR_An); > +dsisr = cxl_p2n_read(afu, CXL_PSL_DSISR_An); > +cxl_afu_decode_psl_serr(afu, serr); > +dev_crit(>dev, "AFU_ERR_An: 0x%.16llx\n", afu_error); > +dev_crit(>dev, "PSL_DSISR_An: 0x%.16llx\n", dsisr); > > rc = cxl_h_ack_fn_error_interrupt(afu->guest->handle, serr); > if (rc) > diff --git a/drivers/misc/cxl/irq.c b/drivers/misc/cxl/irq.c > index 8def455..40fffe4 100644 > --- a/drivers/misc/cxl/irq.c > +++ b/drivers/misc/cxl/irq.c > @@ -374,3 +374,32 @@ void afu_release_irqs(struct cxl_context *ctx, void > *cookie) > > ctx->irq_count = 0; > } > + > +void cxl_afu_decode_psl_serr(struct cxl_afu *afu, u64 serr) > +{ > +dev_crit(>dev, > + "PSL Slice error received. Check AFU for root cause.\n"); > +dev_crit(>dev, "PSL_SERR_An: 0x%016llx\n", serr); > +if (serr & CXL_PSL_SERR_An_afuto) > +dev_crit(>dev, "AFU MMIO Timeout\n"); > +if (serr & CXL_PSL_SERR_An_afudis) > +dev_crit(>dev, > + "MMIO targeted Accelerator that was not enabled\n"); > +if (serr & CXL_PSL_SERR_An_afuov) > +dev_crit(>dev, "AFU CTAG Overflow\n"); > +if (serr & CXL_PSL_SERR_An_badsrc) > +dev_crit(>dev, "Bad Interrupt Source\n"); > +if (serr & CXL_PSL_SERR_An_badctx) > +dev_crit(>dev, "Bad Context Handle\n"); > +if (serr & CXL_PSL_SERR_An_llcmdis) > +dev_crit(>dev, "LLCMD to Disabled AFU\n"); > +if (serr & CXL_PSL
[PATCH 10/14] cxl: Add support for interrupts on the Mellanox CX4
From: Ian Munsie <imun...@au1.ibm.com> The Mellanox CX4 in cxl mode uses a hybrid interrupt model, where interrupts are routed from the networking hardware to the XSL using the MSIX table, and from there will be transformed back into an MSIX interrupt using the cxl style interrupts (i.e. using IVTE entries and ranges to map a PE and AFU interrupt number to an MSIX address). We want to hide the implementation details of cxl interrupts as much as possible. To this end, we use a special version of the MSI setup & teardown routines in the PHB while in cxl mode to allocate the cxl interrupts and configure the IVTE entries in the process element. This function does not configure the MSIX table - the CX4 card uses a custom format in that table and it would not be appropriate to fill that out in generic code. The rest of the functionality is similar to the "Full MSI-X mode" described in the CAIA, and this could be easily extended to support other adapters that use that mode in the future. The interrupts will be associated with the default context. If the maximum number of interrupts per context has been limited (e.g. by the mlx5 driver), it will automatically allocate additional kernel contexts to associate extra interrupts as required. These contexts will be started using the same WED that was used to start the default context. 
Signed-off-by: Ian Munsie <imun...@au1.ibm.com> --- arch/powerpc/platforms/powernv/pci-cxl.c | 84 +++ arch/powerpc/platforms/powernv/pci-ioda.c | 4 ++ arch/powerpc/platforms/powernv/pci.h | 2 + drivers/misc/cxl/api.c| 71 ++ drivers/misc/cxl/base.c | 31 drivers/misc/cxl/cxl.h| 4 ++ drivers/misc/cxl/main.c | 2 + include/misc/cxl-base.h | 4 ++ 8 files changed, 202 insertions(+) diff --git a/arch/powerpc/platforms/powernv/pci-cxl.c b/arch/powerpc/platforms/powernv/pci-cxl.c index 2f386f5..1559ca2 100644 --- a/arch/powerpc/platforms/powernv/pci-cxl.c +++ b/arch/powerpc/platforms/powernv/pci-cxl.c @@ -8,6 +8,7 @@ */ #include +#include #include #include #include @@ -273,3 +274,86 @@ void pnv_cxl_disable_device(struct pci_dev *dev) cxl_pci_disable_device(dev); cxl_afu_put(afu); } + +/* + * This is a special version of pnv_setup_msi_irqs for cards in cxl mode. This + * function handles setting up the IVTE entries for the XSL to use. + * + * We are currently not filling out the MSIX table, since the only currently + * supported adapter (CX4) uses a custom MSIX table format in cxl mode and it + * is up to their driver to fill that out. In the future we may fill out the + * MSIX table (and change the IVTE entries to be an index to the MSIX table) + * for adapters implementing the Full MSI-X mode described in the CAIA. 
+ */ +int pnv_cxl_cx4_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type) +{ + struct pci_controller *hose = pci_bus_to_host(pdev->bus); + struct pnv_phb *phb = hose->private_data; + struct msi_desc *entry; + struct cxl_context *ctx = NULL; + unsigned int virq; + int hwirq; + int afu_irq = 0; + int rc; + + if (WARN_ON(!phb) || !phb->msi_bmp.bitmap) + return -ENODEV; + + if (pdev->no_64bit_msi && !phb->msi32_support) + return -ENODEV; + + rc = cxl_cx4_setup_msi_irqs(pdev, nvec, type); + if (rc) + return rc; + + for_each_pci_msi_entry(entry, pdev) { + if (!entry->msi_attrib.is_64 && !phb->msi32_support) { + pr_warn("%s: Supports only 64-bit MSIs\n", + pci_name(pdev)); + return -ENXIO; + } + + hwirq = cxl_next_msi_hwirq(pdev, , _irq); + if (WARN_ON(hwirq < 0)) + return hwirq; + + virq = irq_create_mapping(NULL, hwirq); + if (virq == NO_IRQ) { + pr_warn("%s: Failed to map cxl mode MSI to linux irq\n", + pci_name(pdev)); + return -ENOMEM; + } + + rc = pnv_cxl_ioda_msi_setup(pdev, hwirq, virq); + if (rc) { + pr_warn("%s: Failed to setup cxl mode MSI\n", pci_name(pdev)); + irq_dispose_mapping(virq); + return rc; + } + + irq_set_msi_desc(virq, entry); + } + + return 0; +} + +void pnv_cxl_cx4_teardown_msi_irqs(struct pci_dev *pdev) +{ + struct pci_controller *hose = pci_bus_to_host(pdev->bus); + struct pnv_phb *phb = hose->private_data; + struct msi_desc *entry; + irq_hw_number_t hwirq; + + if (WARN_ON(!phb)) +
[PATCH 04/14] cxl: Move cxl_afu_get / cxl_afu_put to base
From: Ian Munsie <imun...@au1.ibm.com>

The Mellanox CX4 uses a model where the AFU is one physical function of
the device, and is used by other peer physical functions of the same
device. This will require those other devices to grab a reference on the
AFU when they are initialised to make sure that it does not go away
during their lifetime.

Move the AFU refcount functions to base.c so they can be called from the
PHB code.

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
---
 drivers/misc/cxl/base.c | 13 +
 drivers/misc/cxl/cxl.h  | 12
 include/misc/cxl-base.h |  4
 3 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/drivers/misc/cxl/base.c b/drivers/misc/cxl/base.c
index 9b90ec6..c35a52f 100644
--- a/drivers/misc/cxl/base.c
+++ b/drivers/misc/cxl/base.c
@@ -54,6 +54,19 @@ static inline void cxl_calls_put(struct cxl_calls *calls) { }

 #endif /* CONFIG_CXL_MODULE */

+/* AFU refcount management */
+struct cxl_afu *cxl_afu_get(struct cxl_afu *afu)
+{
+	return (get_device(&afu->dev) == NULL) ? NULL : afu;
+}
+EXPORT_SYMBOL_GPL(cxl_afu_get);
+
+void cxl_afu_put(struct cxl_afu *afu)
+{
+	put_device(&afu->dev);
+}
+EXPORT_SYMBOL_GPL(cxl_afu_put);
+
 void cxl_slbia(struct mm_struct *mm)
 {
 	struct cxl_calls *calls;
diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index aafffa8..9e2621e 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -428,18 +428,6 @@ struct cxl_afu {
 	bool enabled;
 };

-/* AFU refcount management */
-static inline struct cxl_afu *cxl_afu_get(struct cxl_afu *afu)
-{
-
-	return (get_device(&afu->dev) == NULL) ? NULL : afu;
-}
-
-static inline void cxl_afu_put(struct cxl_afu *afu)
-{
-	put_device(&afu->dev);
-}
-
 struct cxl_irq_name {
 	struct list_head list;
diff --git a/include/misc/cxl-base.h b/include/misc/cxl-base.h
index 5ae9625..f53808f 100644
--- a/include/misc/cxl-base.h
+++ b/include/misc/cxl-base.h
@@ -36,11 +36,15 @@ static inline void cxl_ctx_put(void)
 	atomic_dec(&cxl_use_count);
 }

+struct cxl_afu *cxl_afu_get(struct cxl_afu *afu);
+void cxl_afu_put(struct cxl_afu *afu);
 void cxl_slbia(struct mm_struct *mm);

 #else /* CONFIG_CXL_BASE */

 static inline bool cxl_ctx_in_use(void) { return false; }
+static inline struct cxl_afu *cxl_afu_get(struct cxl_afu *afu) { return NULL; }
+static inline void cxl_afu_put(struct cxl_afu *afu) {}
 static inline void cxl_slbia(struct mm_struct *mm) {}

 #endif /* CONFIG_CXL_BASE */
-- 
2.8.1

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 02/14] cxl: Add cxl_slot_is_supported API
From: Ian Munsie <imun...@au1.ibm.com> This extends the check that the adapter is in a CAPI capable slot so that it may be called by external users in the kernel API. This will be used by the upcoming Mellanox CX4 support, which needs to know ahead of time if the card can be switched to cxl mode so that it can leave it in PCI mode if it is not. This API takes a parameter to check if CAPP DMA mode is supported, which it currently only allows on P8NVL systems, since that mode currently has issues accessing memory < 4GB on P8, and we cannot realistically avoid that. This API does not currently check if a CAPP unit is available (i.e. not already assigned to another PHB) on P8. Doing so would be racy since it is assigned on a first come first serve basis, and so long as CAPP DMA mode is not supported on P8 we don't need this, since the only anticipated user of this API requires CAPP DMA mode. Cc: Philippe Bergheaud <fe...@linux.vnet.ibm.com> Signed-off-by: Ian Munsie <imun...@au1.ibm.com> --- drivers/misc/cxl/pci.c | 37 + include/misc/cxl.h | 15 +++ 2 files changed, 52 insertions(+) diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c index 3a5f980..9530280 100644 --- a/drivers/misc/cxl/pci.c +++ b/drivers/misc/cxl/pci.c @@ -1426,6 +1426,43 @@ static int cxl_slot_is_switched(struct pci_dev *dev) return (depth > CXL_MAX_PCIEX_PARENT); } +bool cxl_slot_is_supported(struct pci_dev *dev, int flags) +{ + if (!cpu_has_feature(CPU_FTR_HVMODE)) + return false; + + if ((flags & CXL_SLOT_FLAG_DMA) && (!pvr_version_is(PVR_POWER8NVL))) { + /* +* CAPP DMA mode is technically supported on regular P8, but +* will EEH if the card attempts to access memory < 4GB, which +* we cannot realistically avoid. 
We might be able to work +* around the issue, but until then return unsupported: +*/ + return false; + } + + if (cxl_slot_is_switched(dev)) + return false; + + /* +* XXX: This gets a little tricky on regular P8 (not POWER8NVL) since +* the CAPP can be connected to PHB 0, 1 or 2 on a first come first +* served basis, which is racy to check from here. If we need to +* support this in future we might need to consider having this +* function effectively reserve it ahead of time. +* +* Currently, the only user of this API is the Mellanox CX4, which is +* only supported on P8NVL due to the above mentioned limitation of +* CAPP DMA mode and therefore does not need to worry about this. If the +* issue with CAPP DMA mode is later worked around on P8 we might need +* to revisit this. +*/ + + return true; +} +EXPORT_SYMBOL_GPL(cxl_slot_is_supported); + + static int cxl_probe(struct pci_dev *dev, const struct pci_device_id *id) { struct cxl *adapter; diff --git a/include/misc/cxl.h b/include/misc/cxl.h index b6d040f..dd9eebb 100644 --- a/include/misc/cxl.h +++ b/include/misc/cxl.h @@ -24,6 +24,21 @@ * generic PCI API. This API is agnostic to the actual AFU. */ +#define CXL_SLOT_FLAG_DMA 0x1 + +/* + * Checks if the given card is in a cxl capable slot. Pass CXL_SLOT_FLAG_DMA if + * the card requires CAPP DMA mode to also check if the system supports it. + * This is intended to be used by bi-modal devices to determine if they can use + * cxl mode or if they should continue running in PCI mode. + * + * Note that this only checks if the slot is cxl capable - it does not + * currently check if the CAPP is currently available for chips where it can be + * assigned to different PHBs on a first come first serve basis (i.e. 
P8) + */ +bool cxl_slot_is_supported(struct pci_dev *dev, int flags); + + /* Get the AFU associated with a pci_dev */ struct cxl_afu *cxl_pci_to_afu(struct pci_dev *dev); -- 2.8.1
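The decision order in cxl_slot_is_supported() can be modelled outside the kernel as follows. This is a rough sketch for discussion only: `struct slot_facts` and `slot_is_supported()` are hypothetical stand-ins, with each field replacing one of the kernel checks named in its comment, and only the `CXL_SLOT_FLAG_DMA` value is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CXL_SLOT_FLAG_DMA 0x1

/* Hypothetical stand-in for the platform facts the kernel function queries. */
struct slot_facts {
	bool hv_mode;   /* models cpu_has_feature(CPU_FTR_HVMODE) */
	bool is_p8nvl;  /* models pvr_version_is(PVR_POWER8NVL) */
	bool switched;  /* models cxl_slot_is_switched(dev) */
};

/* Same order of checks as the patch: HV mode, CAPP DMA support, PCIe switch. */
static bool slot_is_supported(const struct slot_facts *f, int flags)
{
	assert(f != NULL);

	if (!f->hv_mode)
		return false;

	/* CAPP DMA mode is only allowed on P8NVL for now. */
	if ((flags & CXL_SLOT_FLAG_DMA) && !f->is_p8nvl)
		return false;

	if (f->switched)
		return false;

	return true;
}
```

A bi-modal driver would presumably call the real API before attempting a mode switch and stay in PCI mode when it returns false.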
[PATCH 14/14] cxl: Add cxl_check_and_switch_mode() API to switch bi-modal cards
From: Andrew Donnellan <andrew.donnel...@au1.ibm.com> Add a new API, cxl_check_and_switch_mode() to allow for switching of bi-modal CAPI cards, such as the Mellanox CX-4 network card. When a driver requests to switch a card to CAPI mode, use PCI hotplug infrastructure to remove all PCI devices underneath the slot. We then write an updated mode control register to the CAPI VSEC, hot reset the card, and reprobe the card. As the card may present a different set of PCI devices after the mode switch, use the infrastructure provided by the pnv_php driver and the OPAL PCI slot management facilities to ensure that: * the old devices are removed from both the OPAL and Linux device trees * the new devices are probed by OPAL and added to the OPAL device tree * the new devices are added to the Linux device tree and probed through the regular PCI device probe path As such, introduce a new option, CONFIG_CXL_BIMODAL, with a dependency on the pnv_php driver. Refactor existing code that touches the mode control register in the regular single mode case into a new function, setup_cxl_protocol_area(). Co-authored-by: Ian Munsie <imun...@au1.ibm.com> Cc: Gavin Shan <gws...@linux.vnet.ibm.com> Signed-off-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com> Reviewed-by: Gavin Shan <gws...@linux.vnet.ibm.com> --- drivers/misc/cxl/Kconfig | 8 ++ drivers/misc/cxl/pci.c | 234 +++ include/misc/cxl.h | 25 + 3 files changed, 249 insertions(+), 18 deletions(-) diff --git a/drivers/misc/cxl/Kconfig b/drivers/misc/cxl/Kconfig index 560412c..6859723 100644 --- a/drivers/misc/cxl/Kconfig +++ b/drivers/misc/cxl/Kconfig @@ -38,3 +38,11 @@ config CXL CAPI adapters are found in POWER8 based systems. If unsure, say N. + +config CXL_BIMODAL + bool "Support for bi-modal CAPI cards" + depends on HOTPLUG_PCI_POWERNV = y && CXL || HOTPLUG_PCI_POWERNV = m && CXL = m + default y + help + Select this option to enable support for bi-modal CAPI cards, such as + the Mellanox CX-4. 
\ No newline at end of file diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c index 090eee8..63abd26 100644 --- a/drivers/misc/cxl/pci.c +++ b/drivers/misc/cxl/pci.c @@ -55,6 +55,8 @@ pci_read_config_byte(dev, vsec + 0xa, dest) #define CXL_WRITE_VSEC_MODE_CONTROL(dev, vsec, val) \ pci_write_config_byte(dev, vsec + 0xa, val) +#define CXL_WRITE_VSEC_MODE_CONTROL_BUS(bus, devfn, vsec, val) \ + pci_bus_write_config_byte(bus, devfn, vsec + 0xa, val) #define CXL_VSEC_PROTOCOL_MASK 0xe0 #define CXL_VSEC_PROTOCOL_1024TB 0x80 #define CXL_VSEC_PROTOCOL_512TB 0x40 @@ -614,36 +616,232 @@ static int setup_cxl_bars(struct pci_dev *dev) return 0; } -/* pciex node: ibm,opal-m64-window = <0x3d058 0x0 0x3d058 0x0 0x8 0x0>; */ -static int switch_card_to_cxl(struct pci_dev *dev) -{ +#ifdef CONFIG_CXL_BIMODAL + +struct cxl_switch_work { + struct pci_dev *dev; + struct work_struct work; int vsec; + int mode; +}; + +static void switch_card_to_cxl(struct work_struct *work) +{ + struct cxl_switch_work *switch_work = + container_of(work, struct cxl_switch_work, work); + struct pci_dev *dev = switch_work->dev; + struct pci_bus *bus = dev->bus; + struct pci_controller *hose = pci_bus_to_host(bus); + struct pci_dev *bridge; + struct pnv_php_slot *php_slot; + unsigned int devfn; u8 val; int rc; - dev_info(>dev, "switch card to CXL\n"); + dev_info(>dev, "cxl: Preparing for mode switch...\n"); + bridge = list_first_entry_or_null(>bus->devices, struct pci_dev, + bus_list); + if (!bridge) { + dev_WARN(>dev, "cxl: Couldn't find root port!\n"); + goto err_free_work; + } - if (!(vsec = find_cxl_vsec(dev))) { - dev_err(>dev, "ABORTING: CXL VSEC not found!\n"); + php_slot = pnv_php_find_slot(pci_device_to_OF_node(bridge)); + if (!php_slot) { + dev_err(>dev, "cxl: Failed to find slot hotplug " + "information. You may need to upgrade " + "skiboot. 
Aborting.\n"); + pci_dev_put(dev); + goto err_free_work; + } + + rc = CXL_READ_VSEC_MODE_CONTROL(dev, switch_work->vsec, ); + if (rc) { + dev_err(>dev, "cxl: Failed to read CAPI mode control: %i\n", rc); + pci_dev_put(dev); + goto err_free_work; + } + devfn = dev->devfn; + pci_dev_put(dev); + + dev_dbg(>dev, "cxl: Removing PCI devices from kernel\n"); + pci_
[PATCH 13/14] PCI/hotplug: pnv_php: handle OPAL_PCI_SLOT_OFFLINE power state
From: Andrew Donnellan

When calling pnv_php_set_slot_power_state() with state ==
OPAL_PCI_SLOT_OFFLINE, remove devices from the device tree as if we're
dealing with OPAL_PCI_SLOT_POWER_OFF.

Cc: Gavin Shan
Cc: linux-...@vger.kernel.org
Cc: Bjorn Helgaas
Signed-off-by: Andrew Donnellan
Acked-by: Gavin Shan
---
 drivers/pci/hotplug/pnv_php.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
index 2d2f704..e6245b0 100644
--- a/drivers/pci/hotplug/pnv_php.c
+++ b/drivers/pci/hotplug/pnv_php.c
@@ -317,7 +317,7 @@ int pnv_php_set_slot_power_state(struct hotplug_slot *slot,
 		return ret;
 	}

-	if (state == OPAL_PCI_SLOT_POWER_OFF)
+	if (state == OPAL_PCI_SLOT_POWER_OFF || state == OPAL_PCI_SLOT_OFFLINE)
 		pnv_php_rmv_devtree(php_slot);
 	else
 		ret = pnv_php_add_devtree(php_slot);
-- 
2.8.1
[PATCH 12/14] PCI/hotplug: pnv_php: export symbols and move struct types needed by cxl
From: Andrew Donnellan The cxl driver will use infrastructure from pnv_php to handle device tree updates when switching bi-modal CAPI cards into CAPI mode. To enable this, export pnv_php_find_slot() and pnv_php_set_slot_power_state(), and add corresponding declarations, as well as the definition of struct pnv_php_slot, to asm/pnv-pci.h. Cc: Gavin Shan Cc: linux-...@vger.kernel.org Cc: Bjorn Helgaas Signed-off-by: Andrew Donnellan Acked-by: Gavin Shan --- arch/powerpc/include/asm/pnv-pci.h | 28 drivers/pci/hotplug/Kconfig | 1 + drivers/pci/hotplug/pnv_php.c | 32 +--- 3 files changed, 34 insertions(+), 27 deletions(-) diff --git a/arch/powerpc/include/asm/pnv-pci.h b/arch/powerpc/include/asm/pnv-pci.h index c47097f..0cbd813 100644 --- a/arch/powerpc/include/asm/pnv-pci.h +++ b/arch/powerpc/include/asm/pnv-pci.h @@ -11,6 +11,7 @@ #define _ASM_PNV_PCI_H #include +#include #include #include @@ -47,4 +48,31 @@ void pnv_cxl_phb_set_peer_afu(struct pci_dev *dev, struct cxl_afu *afu); #endif +struct pnv_php_slot { + struct hotplug_slot slot; + struct hotplug_slot_info slot_info; + uint64_t id; + char *name; + int slot_no; + struct kref kref; +#define PNV_PHP_STATE_INITIALIZED 0 +#define PNV_PHP_STATE_REGISTERED 1 +#define PNV_PHP_STATE_POPULATED 2 +#define PNV_PHP_STATE_OFFLINE 3 + int state; + struct device_node *dn; + struct pci_dev *pdev; + struct pci_bus *bus; + bool power_state_check; + void *fdt; + void *dt; + struct of_changeset ocs; + struct pnv_php_slot *parent; + struct list_head children; + struct list_head link; +}; +extern struct pnv_php_slot *pnv_php_find_slot(struct device_node *dn); +extern int pnv_php_set_slot_power_state(struct hotplug_slot *slot, + uint8_t state); + #endif diff --git a/drivers/pci/hotplug/Kconfig b/drivers/pci/hotplug/Kconfig index aadce45..b719a72 100644 --- a/drivers/pci/hotplug/Kconfig +++ b/drivers/pci/hotplug/Kconfig @@ -117,6 +117,7 @@ config HOTPLUG_PCI_POWERNV tristate "PowerPC PowerNV PCI Hotplug driver" depends on PPC_POWERNV && EEH select 
OF_DYNAMIC + select HOTPLUG_PCI_POWERNV_BASE help Say Y here if you run PowerPC PowerNV platform that supports PCI Hotplug diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c index 6086db6..2d2f704 100644 --- a/drivers/pci/hotplug/pnv_php.c +++ b/drivers/pci/hotplug/pnv_php.c @@ -22,30 +22,6 @@ #define DRIVER_AUTHOR "Gavin Shan, IBM Corporation" #define DRIVER_DESC "PowerPC PowerNV PCI Hotplug Driver" -struct pnv_php_slot { - struct hotplug_slot slot; - struct hotplug_slot_info slot_info; - uint64_t id; - char *name; - int slot_no; - struct kref kref; -#define PNV_PHP_STATE_INITIALIZED 0 -#define PNV_PHP_STATE_REGISTERED 1 -#define PNV_PHP_STATE_POPULATED 2 -#define PNV_PHP_STATE_OFFLINE 3 - int state; - struct device_node *dn; - struct pci_dev *pdev; - struct pci_bus *bus; - bool power_state_check; - void *fdt; - void *dt; - struct of_changeset ocs; - struct pnv_php_slot *parent; - struct list_head children; - struct list_head link; -}; - static LIST_HEAD(pnv_php_slot_list); static DEFINE_SPINLOCK(pnv_php_lock); @@ -91,7 +67,7 @@ static struct pnv_php_slot *pnv_php_match(struct device_node *dn, return NULL; } -static struct pnv_php_slot *pnv_php_find_slot(struct device_node *dn) +struct pnv_php_slot *pnv_php_find_slot(struct device_node *dn) { struct pnv_php_slot *php_slot, *tmp; unsigned long flags; @@ -108,6 +84,7 @@ static struct pnv_php_slot *pnv_php_find_slot(struct device_node *dn) return NULL; } +EXPORT_SYMBOL_GPL(pnv_php_find_slot); /* * Remove pdn for all children of the indicated device node. @@ -316,8 +293,8 @@ out: return ret; } -static int pnv_php_set_slot_power_state(struct hotplug_slot
[PATCH 11/14] cxl: Workaround PE=0 hardware limitation in Mellanox CX4
From: Ian Munsie <imun...@au1.ibm.com>

The CX4 card cannot cope with a context with PE=0 due to a hardware
limitation, resulting in:

[ 34.166577] command failed, status limits exceeded(0x8), syndrome 0x5a7939
[ 34.166580] mlx5_core :01:00.1: Failed allocating uar, aborting

Since the kernel API allocates a default context very early during device
init that will almost certainly get Process Element ID 0 there is no easy
way for us to extend the API to allow the Mellanox to inform us of this
limitation ahead of time.

Instead, work around the issue by extending the XSL structure to include
a minimum PE to allocate. Although the bug is not in the XSL, it is the
easiest place to work around this limitation given that the CX4 is
currently the only card that uses an XSL.

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
---
 drivers/misc/cxl/context.c | 3 ++-
 drivers/misc/cxl/cxl.h     | 1 +
 drivers/misc/cxl/pci.c     | 1 +
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/cxl/context.c b/drivers/misc/cxl/context.c
index 2616cddb..bdee9a0 100644
--- a/drivers/misc/cxl/context.c
+++ b/drivers/misc/cxl/context.c
@@ -90,7 +90,8 @@ int cxl_context_init(struct cxl_context *ctx, struct cxl_afu *afu, bool master,
 	 */
 	mutex_lock(&afu->contexts_lock);
 	idr_preload(GFP_KERNEL);
-	i = idr_alloc(&ctx->afu->contexts_idr, ctx, 0,
+	i = idr_alloc(&ctx->afu->contexts_idr, ctx,
+		      ctx->afu->adapter->native->sl_ops->min_pe,
 		      ctx->afu->num_procs, GFP_NOWAIT);
 	idr_preload_end();
 	mutex_unlock(&afu->contexts_lock);
diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index 078b268..19b132f 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -549,6 +549,7 @@ struct cxl_service_layer_ops {
 	u64 (*timebase_read)(struct cxl *adapter);
 	int capi_mode;
 	bool needs_reset_before_disable;
+	int min_pe;
 };

 struct cxl_native {
diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index 02242be..090eee8 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -1321,6 +1321,7 @@ static const struct cxl_service_layer_ops xsl_ops = {
 	.write_timebase_ctrl = write_timebase_ctrl_xsl,
 	.timebase_read = timebase_read_xsl,
 	.capi_mode = OPAL_PHB_CAPI_MODE_DMA,
+	.min_pe = 1, /* Workaround for Mellanox CX4 HW bug */
 };

 static void set_sl_ops(struct cxl *adapter, struct pci_dev *dev)
-- 
2.8.1
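To illustrate the effect of passing a non-zero start to the ID allocator, here is a small user-space model. It is a sketch under stated assumptions, not the kernel's IDR: `pe_table`, `pe_alloc()` and `MAX_PE` are hypothetical, and the function returns -1 on failure where the real idr_alloc() returns a negative errno.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PE 8 /* hypothetical table size for the example */

struct pe_table {
	bool used[MAX_PE];
};

/*
 * Hand out the lowest free ID in [min_pe, end), or -1 if none is free.
 * With min_pe = 1, ID 0 is never returned, which is the point of the
 * workaround described above.
 */
static int pe_alloc(struct pe_table *t, int min_pe, int end)
{
	int i;

	assert(t != NULL);
	if (min_pe < 0 || end > MAX_PE)
		return -1;

	for (i = min_pe; i < end; i++) {
		if (!t->used[i]) {
			t->used[i] = true;
			return i;
		}
	}
	return -1;
}
```

With a PSL-style minimum of 0 the first context would get PE 0. With the XSL minimum of 1 it gets PE 1 and slot 0 simply stays unused.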
[PATCH 09/14] cxl: Add preliminary workaround for CX4 interrupt limitation
From: Ian Munsie <imun...@au1.ibm.com> The Mellanox CX4 has a hardware limitation where only 4 bits of the AFU interrupt number can be passed to the XSL when sending an interrupt, limiting it to only 15 interrupts per context (AFU interrupt number 0 is invalid). In order to overcome this, we will allocate additional contexts linked to the default context as extra address space for the extra interrupts - this will be implemented in the next patch. This patch adds the preliminary support to allow this, by way of adding a linked list in the context structure that we use to keep track of the contexts dedicated to interrupts, and an API to simultaneously iterate over the related context structures, AFU interrupt numbers and hardware interrupt numbers. The point of using a single API to iterate these is to hide some of the details of the iteration from external code, and to reduce the number of APIs that need to be exported via base.c to allow built in code to call. Signed-off-by: Ian Munsie <imun...@au1.ibm.com> --- drivers/misc/cxl/api.c | 15 +++ drivers/misc/cxl/base.c| 17 + drivers/misc/cxl/context.c | 1 + drivers/misc/cxl/cxl.h | 10 ++ drivers/misc/cxl/main.c| 1 + include/misc/cxl.h | 9 + 6 files changed, 53 insertions(+) diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c index 1e2c0d9..f02a859 100644 --- a/drivers/misc/cxl/api.c +++ b/drivers/misc/cxl/api.c @@ -97,6 +97,21 @@ static irq_hw_number_t cxl_find_afu_irq(struct cxl_context *ctx, int num) return 0; } +int _cxl_next_msi_hwirq(struct pci_dev *pdev, struct cxl_context **ctx, int *afu_irq) +{ + if (*ctx == NULL || *afu_irq == 0) { + *afu_irq = 1; + *ctx = cxl_get_context(pdev); + } else { + (*afu_irq)++; + if (*afu_irq > cxl_get_max_irqs_per_process(pdev)) { + *ctx = list_next_entry(*ctx, extra_irq_contexts); + *afu_irq = 1; + } + } + return cxl_find_afu_irq(*ctx, *afu_irq); +} +/* Exported via cxl_base */ int cxl_set_priv(struct cxl_context *ctx, void *priv) { diff --git a/drivers/misc/cxl/base.c 
b/drivers/misc/cxl/base.c index af20b34..0f89ea9 100644 --- a/drivers/misc/cxl/base.c +++ b/drivers/misc/cxl/base.c @@ -141,6 +141,23 @@ void cxl_pci_disable_device(struct pci_dev *dev) } EXPORT_SYMBOL_GPL(cxl_pci_disable_device); +int cxl_next_msi_hwirq(struct pci_dev *pdev, struct cxl_context **ctx, int *afu_irq) +{ + int ret; + struct cxl_calls *calls; + + calls = cxl_calls_get(); + if (!calls) + return -EBUSY; + + ret = calls->cxl_next_msi_hwirq(pdev, ctx, afu_irq); + + cxl_calls_put(calls); + + return ret; +} +EXPORT_SYMBOL_GPL(cxl_next_msi_hwirq); + static int __init cxl_base_init(void) { struct device_node *np = NULL; diff --git a/drivers/misc/cxl/context.c b/drivers/misc/cxl/context.c index edbb99e..2616cddb 100644 --- a/drivers/misc/cxl/context.c +++ b/drivers/misc/cxl/context.c @@ -68,6 +68,7 @@ int cxl_context_init(struct cxl_context *ctx, struct cxl_afu *afu, bool master, ctx->pending_afu_err = false; INIT_LIST_HEAD(>irq_names); + INIT_LIST_HEAD(>extra_irq_contexts); /* * When we have to destroy all contexts in cxl_context_detach_all() we diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h index c94b54f..67464c9 100644 --- a/drivers/misc/cxl/cxl.h +++ b/drivers/misc/cxl/cxl.h @@ -525,6 +525,14 @@ struct cxl_context { atomic_t afu_driver_events; struct rcu_head rcu; + + /* +* Only used when more interrupts are allocated via +* pci_enable_msix_range than are supported in the default context, to +* use additional contexts to overcome the limitation. i.e. 
Mellanox +* CX4 only: +*/ + struct list_head extra_irq_contexts; }; struct cxl_service_layer_ops { @@ -710,11 +718,13 @@ ssize_t cxl_pci_afu_read_err_buffer(struct cxl_afu *afu, char *buf, /* Internal functions wrapped in cxl_base to allow PHB to call them */ bool _cxl_pci_associate_default_context(struct pci_dev *dev, struct cxl_afu *afu); void _cxl_pci_disable_device(struct pci_dev *dev); +int _cxl_next_msi_hwirq(struct pci_dev *pdev, struct cxl_context **ctx, int *afu_irq); struct cxl_calls { void (*cxl_slbia)(struct mm_struct *mm); bool (*cxl_pci_associate_default_context)(struct pci_dev *dev, struct cxl_afu *afu); void (*cxl_pci_disable_device)(struct pci_dev *dev); + int (*cxl_next_msi_hwirq)(struct pci_dev *pdev, struct cxl_context **ctx, int *afu_irq); struct module *owner; }; diff --git a/drivers/misc/cxl/main.c b/drivers/misc/cxl/main.c index 4e5474b..66fac71 100644 --- a/drivers/misc/cxl/main.c +++ b/drivers/misc/cxl/main.c @@ -112,6 +112,7 @@ static
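The iteration described in this patch walks AFU interrupt numbers 1..max within a context and then moves on to the next linked context. The same mapping can be written as plain arithmetic, as in the sketch below. The names `irq_slot`, `msi_to_slot()` and `contexts_needed()` are made up for illustration, and the limit of 15 used in the usage note follows from the 4-bit field mentioned in the commit message.

```c
#include <assert.h>

/* Hypothetical result type: which context, and which AFU irq within it. */
struct irq_slot {
	int ctx_index; /* 0 is the default context, 1.. are the extra ones */
	int afu_irq;   /* 1-based, since AFU interrupt number 0 is invalid */
};

/* Map the n-th MSI vector (0-based) onto a context and AFU irq number. */
static struct irq_slot msi_to_slot(int n, int max_irqs_per_ctx)
{
	struct irq_slot s;

	assert(n >= 0 && max_irqs_per_ctx > 0);
	s.ctx_index = n / max_irqs_per_ctx;
	s.afu_irq = n % max_irqs_per_ctx + 1;
	return s;
}

/* Number of contexts (default plus extra) needed to cover nvec vectors. */
static int contexts_needed(int nvec, int max_irqs_per_ctx)
{
	assert(nvec >= 0 && max_irqs_per_ctx > 0);
	return (nvec + max_irqs_per_ctx - 1) / max_irqs_per_ctx;
}
```

For example, with 15 interrupts per context, vector 14 lands on AFU irq 15 of the default context and vector 15 on AFU irq 1 of the first extra context.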
[PATCH 08/14] cxl: Add kernel APIs to get & set the max irqs per context
From: Ian Munsie <imun...@au1.ibm.com> These APIs will be used by the Mellanox CX4 support. While they function standalone to configure existing behaviour, their primary purpose is to allow the Mellanox driver to inform the cxl driver of a hardware limitation, which will be used in a future patch. Signed-off-by: Ian Munsie <imun...@au1.ibm.com> --- drivers/misc/cxl/api.c | 27 +++ include/misc/cxl.h | 10 ++ 2 files changed, 37 insertions(+) diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c index 6a030bf..1e2c0d9 100644 --- a/drivers/misc/cxl/api.c +++ b/drivers/misc/cxl/api.c @@ -447,3 +447,30 @@ ssize_t cxl_read_adapter_vpd(struct pci_dev *dev, void *buf, size_t count) return cxl_ops->read_adapter_vpd(afu->adapter, buf, count); } EXPORT_SYMBOL_GPL(cxl_read_adapter_vpd); + +int cxl_set_max_irqs_per_process(struct pci_dev *dev, int irqs) +{ + struct cxl_afu *afu = cxl_pci_to_afu(dev); + if (IS_ERR(afu)) + return -ENODEV; + + if (irqs > afu->adapter->user_irqs) + return -EINVAL; + + /* Limit user_irqs to prevent the user increasing this via sysfs */ + afu->adapter->user_irqs = irqs; + afu->irqs_max = irqs; + + return 0; +} +EXPORT_SYMBOL_GPL(cxl_set_max_irqs_per_process); + +int cxl_get_max_irqs_per_process(struct pci_dev *dev) +{ + struct cxl_afu *afu = cxl_pci_to_afu(dev); + if (IS_ERR(afu)) + return -ENODEV; + + return afu->irqs_max; +} +EXPORT_SYMBOL_GPL(cxl_get_max_irqs_per_process); diff --git a/include/misc/cxl.h b/include/misc/cxl.h index dd9eebb..fc07ed4 100644 --- a/include/misc/cxl.h +++ b/include/misc/cxl.h @@ -166,6 +166,16 @@ void cxl_psa_unmap(void __iomem *addr); /* Get the process element for this context */ int cxl_process_element(struct cxl_context *ctx); +/* + * Limit the number of interrupts that a single context can allocate via + * cxl_start_work. If using the api with a real phb, this may be used to + * request that additional default contexts be created when allocating + * interrupts via pci_enable_msix_range. 
These will be set to the same running + * state as the default context, and if that is running it will reuse the + * parameters previously passed to cxl_start_context for the default context. + */ +int cxl_set_max_irqs_per_process(struct pci_dev *dev, int irqs); +int cxl_get_max_irqs_per_process(struct pci_dev *dev); /* * These calls allow drivers to create their own file descriptors and make them -- 2.8.1
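The setter in this patch only ever lowers the limit: a request above the current user_irqs value is rejected, and a successful call updates both fields. A minimal user-space model of that behaviour, with a hypothetical `struct irq_limits` standing in for the adapter and AFU fields, might look like this:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for adapter->user_irqs and afu->irqs_max. */
struct irq_limits {
	int user_irqs;
	int irqs_max;
};

/* Mirrors the checks in cxl_set_max_irqs_per_process() from the patch. */
static int set_max_irqs(struct irq_limits *l, int irqs)
{
	assert(l != NULL);

	if (irqs > l->user_irqs)
		return -EINVAL;

	/* Lower user_irqs as well so the limit cannot be raised again later. */
	l->user_irqs = irqs;
	l->irqs_max = irqs;
	return 0;
}
```

Because user_irqs is lowered together with irqs_max, a later attempt to raise the limit fails, which matches the comment in the patch about preventing the user from increasing it via sysfs.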
[PATCH 07/14] cxl: Add support for using the kernel API with a real PHB
From: Ian Munsie <imun...@au1.ibm.com> This hooks up support for using the kernel API with a real PHB. After the AFU initialisation has completed it calls into the PHB code to pass it the AFU that will be used by other peer physical functions on the adapter. The cxl_pci_to_afu API is extended to work with peer PCI devices, retrieving the peer AFU from the PHB. This API may also now return an error if it is called on a PCI device that is not associated with either a cxl vPHB or a peer PCI device to an AFU, and this error is propagated down. Signed-off-by: Ian Munsie <imun...@au1.ibm.com> --- drivers/misc/cxl/api.c | 5 + drivers/misc/cxl/pci.c | 6 ++ drivers/misc/cxl/vphb.c | 16 ++-- 3 files changed, 25 insertions(+), 2 deletions(-) diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c index 7707055..6a030bf 100644 --- a/drivers/misc/cxl/api.c +++ b/drivers/misc/cxl/api.c @@ -13,6 +13,7 @@ #include #include #include +#include #include "cxl.h" @@ -24,6 +25,8 @@ struct cxl_context *cxl_dev_context_init(struct pci_dev *dev) int rc; afu = cxl_pci_to_afu(dev); + if (IS_ERR(afu)) + return ERR_CAST(afu); ctx = cxl_context_alloc(); if (IS_ERR(ctx)) { @@ -438,6 +441,8 @@ EXPORT_SYMBOL_GPL(cxl_perst_reloads_same_image); ssize_t cxl_read_adapter_vpd(struct pci_dev *dev, void *buf, size_t count) { struct cxl_afu *afu = cxl_pci_to_afu(dev); + if (IS_ERR(afu)) + return -ENODEV; return cxl_ops->read_adapter_vpd(afu->adapter, buf, count); } diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c index 6c0597d..02242be 100644 --- a/drivers/misc/cxl/pci.c +++ b/drivers/misc/cxl/pci.c @@ -1502,6 +1502,9 @@ static int cxl_probe(struct pci_dev *dev, const struct pci_device_id *id) dev_err(>dev, "AFU %i failed to start: %i\n", slice, rc); } + if (pnv_pci_on_cxl_phb(dev) && adapter->slices >= 1) + pnv_cxl_phb_set_peer_afu(dev, adapter->afu[0]); + return 0; } @@ -1572,6 +1575,9 @@ static pci_ers_result_t cxl_pci_error_detected(struct pci_dev *pdev, */ for (i = 0; i < 
adapter->slices; i++) { afu = adapter->afu[i]; + /* Only participate in EEH if we are on a virtual PHB */ + if (afu->phb == NULL) + return PCI_ERS_RESULT_NONE; cxl_vphb_error_detected(afu, state); } return PCI_ERS_RESULT_DISCONNECT; diff --git a/drivers/misc/cxl/vphb.c b/drivers/misc/cxl/vphb.c index c5b9c201..08e8db7 100644 --- a/drivers/misc/cxl/vphb.c +++ b/drivers/misc/cxl/vphb.c @@ -9,6 +9,7 @@ #include #include +#include #include "cxl.h" static int cxl_dma_set_mask(struct pci_dev *pdev, u64 dma_mask) @@ -280,13 +281,18 @@ void cxl_pci_vphb_remove(struct cxl_afu *afu) pcibios_free_controller(phb); } +static bool _cxl_pci_is_vphb_device(struct pci_controller *phb) +{ + return (phb->ops == &cxl_pcie_pci_ops); +} + bool cxl_pci_is_vphb_device(struct pci_dev *dev) { struct pci_controller *phb; phb = pci_bus_to_host(dev->bus); - return (phb->ops == &cxl_pcie_pci_ops); + return _cxl_pci_is_vphb_device(phb); } struct cxl_afu *cxl_pci_to_afu(struct pci_dev *dev) @@ -295,7 +301,13 @@ struct cxl_afu *cxl_pci_to_afu(struct pci_dev *dev) phb = pci_bus_to_host(dev->bus); - return (struct cxl_afu *)phb->private_data; + if (_cxl_pci_is_vphb_device(phb)) + return (struct cxl_afu *)phb->private_data; + + if (pnv_pci_on_cxl_phb(dev)) + return pnv_cxl_phb_to_afu(phb); + + return ERR_PTR(-ENODEV); } EXPORT_SYMBOL_GPL(cxl_pci_to_afu); -- 2.8.1
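Since cxl_pci_to_afu() can now return an error encoded in the pointer, callers have to check it with IS_ERR() before dereferencing. For readers less familiar with that idiom, here is a user-space approximation. `err_ptr()`, `is_err()` and `ptr_err()` imitate the kernel helpers but are not the kernel's definitions, and `dev_to_afu()` with its `enum dev_kind` is a hypothetical simplification of the three-way lookup in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno in the top 4095 values of the address space. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static inline bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline long ptr_err(const void *ptr)
{
	assert(is_err(ptr));
	return (long)(intptr_t)ptr;
}

struct afu { int id; };

/* Hypothetical classification of the PCI device being looked up. */
enum dev_kind { DEV_VPHB, DEV_PEER, DEV_OTHER };

/* vPHB device: AFU from the vPHB; peer device: AFU from the real PHB. */
static struct afu *dev_to_afu(enum dev_kind kind, struct afu *vphb_afu,
			      struct afu *peer_afu)
{
	switch (kind) {
	case DEV_VPHB:
		return vphb_afu;
	case DEV_PEER:
		return peer_afu;
	default:
		return err_ptr(-ENODEV);
	}
}
```

The callers touched by this patch follow the same shape: check the returned pointer first and bail out with an error if it is not a real AFU.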
[PATCH 06/14] powerpc/powernv: Add support for the cxl kernel api on the real phb
From: Ian Munsie <imun...@au1.ibm.com> This adds support for the peer model of the cxl kernel api to the PowerNV PHB, and exports APIs to enable the mode, check if a PCI device is attached to a PHB in this mode, and to set and get the peer AFU for this mode. The cxl driver will enable this mode for supported cards by calling pnv_cxl_enable_phb_kernel_api(). This will set a flag in the PHB to note that this mode is enabled, and switch out it's controller_ops for the cxl version. The cxl version of the controller_ops struct implements it's own versions of the enable_device_hook and release_device to handle refcounting on the peer AFU and to allocate a default context for the device. Once enabled, the cxl kernel API may not be disabled on a PHB. Currently there is no safe way to disable cxl mode short of a reboot, so until that changes there is no reason to support the disable path. Signed-off-by: Ian Munsie <imun...@au1.ibm.com> --- arch/powerpc/include/asm/pnv-pci.h| 7 ++ arch/powerpc/platforms/powernv/pci-cxl.c | 112 ++ arch/powerpc/platforms/powernv/pci-ioda.c | 22 +- arch/powerpc/platforms/powernv/pci.h | 16 + 4 files changed, 154 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/include/asm/pnv-pci.h b/arch/powerpc/include/asm/pnv-pci.h index 791db1b..c47097f 100644 --- a/arch/powerpc/include/asm/pnv-pci.h +++ b/arch/powerpc/include/asm/pnv-pci.h @@ -38,6 +38,13 @@ int pnv_cxl_alloc_hwirq_ranges(struct cxl_irq_ranges *irqs, struct pci_dev *dev, int num); void pnv_cxl_release_hwirq_ranges(struct cxl_irq_ranges *irqs, struct pci_dev *dev); + +/* Support for the cxl kernel api on the real PHB (instead of vPHB) */ +int pnv_cxl_enable_phb_kernel_api(struct pci_controller *hose, bool enable); +bool pnv_pci_on_cxl_phb(struct pci_dev *dev); +struct cxl_afu *pnv_cxl_phb_to_afu(struct pci_controller *hose); +void pnv_cxl_phb_set_peer_afu(struct pci_dev *dev, struct cxl_afu *afu); + #endif #endif diff --git a/arch/powerpc/platforms/powernv/pci-cxl.c 
b/arch/powerpc/platforms/powernv/pci-cxl.c index ea8171f..2f386f5 100644 --- a/arch/powerpc/platforms/powernv/pci-cxl.c +++ b/arch/powerpc/platforms/powernv/pci-cxl.c @@ -7,8 +7,11 @@ * 2 of the License, or (at your option) any later version. */ +#include +#include #include #include +#include #include "pci.h" @@ -161,3 +164,112 @@ int pnv_cxl_ioda_msi_setup(struct pci_dev *dev, unsigned int hwirq, return 0; } EXPORT_SYMBOL(pnv_cxl_ioda_msi_setup); + +/* + * Sets flags and switches the controller ops to enable the cxl kernel api. + * Original the cxl kernel API operated on a virtual PHB, but certain cards + * such as the Mellanox CX4 use a peer model instead and for these cards the + * cxl kernel api will operate on the real PHB. + */ +int pnv_cxl_enable_phb_kernel_api(struct pci_controller *hose, bool enable) +{ + struct pnv_phb *phb = hose->private_data; + struct module *cxl_module; + + if (!enable) { + /* +* Once cxl mode is enabled on the PHB, there is currently no +* known safe method to disable it again, and trying risks a +* checkstop. If we can find a way to safely disable cxl mode +* in the future we can revisit this, but for now the only sane +* thing to do is to refuse to disable cxl mode: +*/ + return -EPERM; + } + + /* +* Hold a reference to the cxl module since several PHB operations now +* depend on it, and it would be insane to allow it to be removed so +* long as we are in this mode (and since we can't safely disable this +* mode once enabled...). 
+*/ + mutex_lock(_mutex); + cxl_module = find_module("cxl"); + if (cxl_module) + __module_get(cxl_module); + mutex_unlock(_mutex); + if (!cxl_module) + return -ENODEV; + + phb->flags |= PNV_PHB_FLAG_CXL; + hose->controller_ops = pnv_cxl_cx4_ioda_controller_ops; + + return 0; +} +EXPORT_SYMBOL(pnv_cxl_enable_phb_kernel_api); + +bool pnv_pci_on_cxl_phb(struct pci_dev *dev) +{ + struct pci_controller *hose = pci_bus_to_host(dev->bus); + struct pnv_phb *phb = hose->private_data; + + return !!(phb->flags & PNV_PHB_FLAG_CXL); +} +EXPORT_SYMBOL(pnv_pci_on_cxl_phb); + +struct cxl_afu *pnv_cxl_phb_to_afu(struct pci_controller *hose) +{ + struct pnv_phb *phb = hose->private_data; + + return (struct cxl_afu *)phb->cxl_afu; +} +EXPORT_SYMBOL_GPL(pnv_cxl_phb_to_afu); + +void pnv_cxl_phb_set_peer_afu(struct pci_dev *dev, struct cxl_afu *afu) +{ + struct pci_controller *hose = pci_bus_to_host(dev->bus); + struct pnv_phb *phb = hose->private_data; + + phb->cx
[PATCH 05/14] cxl: Allow a default context to be associated with an external pci_dev
From: Ian Munsie <imun...@au1.ibm.com> The cxl kernel API has a concept of a default context associated with each PCI device under the virtual PHB. The Mellanox CX4 will also use the cxl kernel API, but it does not use a virtual PHB - rather, the AFU appears as a physical function as a peer to the networking functions. In order to allow the kernel API to work with those networking functions, we will need to associate a default context with them as well. To this end, refactor the corresponding code to do this in vphb.c and export it so that it can be called from the PHB code. Signed-off-by: Ian Munsie <imun...@au1.ibm.com> --- drivers/misc/cxl/base.c | 35 +++ drivers/misc/cxl/cxl.h | 6 ++ drivers/misc/cxl/main.c | 2 ++ drivers/misc/cxl/vphb.c | 37 +++-- include/misc/cxl-base.h | 6 ++ 5 files changed, 72 insertions(+), 14 deletions(-) diff --git a/drivers/misc/cxl/base.c b/drivers/misc/cxl/base.c index c35a52f..af20b34 100644 --- a/drivers/misc/cxl/base.c +++ b/drivers/misc/cxl/base.c @@ -106,6 +106,41 @@ int cxl_update_properties(struct device_node *dn, } EXPORT_SYMBOL_GPL(cxl_update_properties); +/* + * API calls into the driver that may be called from the PHB code and must be + * built in. 
+ */ +bool cxl_pci_associate_default_context(struct pci_dev *dev, struct cxl_afu *afu) +{ + bool ret; + struct cxl_calls *calls; + + calls = cxl_calls_get(); + if (!calls) + return false; + + ret = calls->cxl_pci_associate_default_context(dev, afu); + + cxl_calls_put(calls); + + return ret; +} +EXPORT_SYMBOL_GPL(cxl_pci_associate_default_context); + +void cxl_pci_disable_device(struct pci_dev *dev) +{ + struct cxl_calls *calls; + + calls = cxl_calls_get(); + if (!calls) + return; + + calls->cxl_pci_disable_device(dev); + + cxl_calls_put(calls); +} +EXPORT_SYMBOL_GPL(cxl_pci_disable_device); + static int __init cxl_base_init(void) { struct device_node *np = NULL; diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h index 9e2621e..c94b54f 100644 --- a/drivers/misc/cxl/cxl.h +++ b/drivers/misc/cxl/cxl.h @@ -707,9 +707,15 @@ static inline u64 cxl_p2n_read(struct cxl_afu *afu, cxl_p2n_reg_t reg) ssize_t cxl_pci_afu_read_err_buffer(struct cxl_afu *afu, char *buf, loff_t off, size_t count); +/* Internal functions wrapped in cxl_base to allow PHB to call them */ +bool _cxl_pci_associate_default_context(struct pci_dev *dev, struct cxl_afu *afu); +void _cxl_pci_disable_device(struct pci_dev *dev); struct cxl_calls { void (*cxl_slbia)(struct mm_struct *mm); + bool (*cxl_pci_associate_default_context)(struct pci_dev *dev, struct cxl_afu *afu); + void (*cxl_pci_disable_device)(struct pci_dev *dev); + struct module *owner; }; int register_cxl_calls(struct cxl_calls *calls); diff --git a/drivers/misc/cxl/main.c b/drivers/misc/cxl/main.c index ae68c32..4e5474b 100644 --- a/drivers/misc/cxl/main.c +++ b/drivers/misc/cxl/main.c @@ -110,6 +110,8 @@ static inline void cxl_slbia_core(struct mm_struct *mm) static struct cxl_calls cxl_calls = { .cxl_slbia = cxl_slbia_core, + .cxl_pci_associate_default_context = _cxl_pci_associate_default_context, + .cxl_pci_disable_device = _cxl_pci_disable_device, .owner = THIS_MODULE, }; diff --git a/drivers/misc/cxl/vphb.c 
b/drivers/misc/cxl/vphb.c index 012b6aa..c5b9c201 100644 --- a/drivers/misc/cxl/vphb.c +++ b/drivers/misc/cxl/vphb.c @@ -40,11 +40,28 @@ static void cxl_teardown_msi_irqs(struct pci_dev *pdev) */ } +bool _cxl_pci_associate_default_context(struct pci_dev *dev, struct cxl_afu *afu) +{ + struct cxl_context *ctx; + + /* +* Allocate a context to do cxl things too. This is used for interrupts +* in the peer model using a real phb, and if we eventually do DMA ops +* in the virtual phb, we'll need a default context to attach them to. +*/ + ctx = cxl_dev_context_init(dev); + if (!ctx) + return false; + dev->dev.archdata.cxl_ctx = ctx; + + return (cxl_ops->afu_check_and_enable(afu) == 0); +} +/* exported via cxl_base */ + static bool cxl_pci_enable_device_hook(struct pci_dev *dev) { struct pci_controller *phb; struct cxl_afu *afu; - struct cxl_context *ctx; phb = pci_bus_to_host(dev->bus); afu = (struct cxl_afu *)phb->private_data; @@ -57,19 +74,10 @@ static bool cxl_pci_enable_device_hook(struct pci_dev *dev) set_dma_ops(>dev, _direct_ops); set_dma_offset(>dev, PAGE_OFFSET); - /* -* Allocate a context to do cxl things too. If we eventually do real -* DMA ops, we'll need a default context to attach them to -*/ - ctx = cxl_dev_context_init(dev); - if
[PATCH 03/14] cxl: Enable bus mastering for devices using CAPP DMA mode
From: Ian Munsie <imun...@au1.ibm.com>

Devices that use CAPP DMA mode (such as the Mellanox CX4) require bus
master to be enabled in order for the CAPI traffic to flow. This should be
harmless to enable for other cxl devices, so unconditionally enable it in
the adapter init flow.

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
---
 drivers/misc/cxl/pci.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index 9530280..6c0597d 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -1264,6 +1264,9 @@ static int cxl_configure_adapter(struct cxl *adapter, struct pci_dev *dev)
 	if ((rc = adapter->native->sl_ops->adapter_regs_init(adapter, dev)))
 		goto err;
 
+	/* Required for devices using CAPP DMA mode, harmless for others */
+	pci_set_master(dev);
+
 	if ((rc = pnv_phb_to_cxl_mode(dev, adapter->native->sl_ops->capi_mode)))
 		goto err;
 
-- 
2.8.1
[PATCH 01/14] powerpc/powernv: Split cxl code out into a separate file
From: Ian Munsie <imun...@au1.ibm.com> The support for using the Mellanox CX4 in cxl mode will require additions to the PHB code. In preparation for this, move the existing cxl code out of pci-ioda.c into a separate pci-cxl.c file to keep things more organised. Signed-off-by: Ian Munsie <imun...@au1.ibm.com> --- arch/powerpc/platforms/powernv/Makefile | 1 + arch/powerpc/platforms/powernv/pci-cxl.c | 163 ++ arch/powerpc/platforms/powernv/pci-ioda.c | 159 + arch/powerpc/platforms/powernv/pci.h | 6 ++ 4 files changed, 173 insertions(+), 156 deletions(-) create mode 100644 arch/powerpc/platforms/powernv/pci-cxl.c diff --git a/arch/powerpc/platforms/powernv/Makefile b/arch/powerpc/platforms/powernv/Makefile index cd9711e..b5d98cb 100644 --- a/arch/powerpc/platforms/powernv/Makefile +++ b/arch/powerpc/platforms/powernv/Makefile @@ -6,6 +6,7 @@ obj-y += opal-kmsg.o obj-$(CONFIG_SMP) += smp.o subcore.o subcore-asm.o obj-$(CONFIG_PCI) += pci.o pci-ioda.o npu-dma.o +obj-$(CONFIG_CXL_BASE) += pci-cxl.o obj-$(CONFIG_EEH) += eeh-powernv.o obj-$(CONFIG_PPC_SCOM) += opal-xscom.o obj-$(CONFIG_MEMORY_FAILURE) += opal-memory-errors.o diff --git a/arch/powerpc/platforms/powernv/pci-cxl.c b/arch/powerpc/platforms/powernv/pci-cxl.c new file mode 100644 index 000..ea8171f --- /dev/null +++ b/arch/powerpc/platforms/powernv/pci-cxl.c @@ -0,0 +1,163 @@ +/* + * Copyright 2015 IBM Corp. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. 
+ */ + +#include +#include + +#include "pci.h" + +struct device_node *pnv_pci_get_phb_node(struct pci_dev *dev) +{ + struct pci_controller *hose = pci_bus_to_host(dev->bus); + + return of_node_get(hose->dn); +} +EXPORT_SYMBOL(pnv_pci_get_phb_node); + +int pnv_phb_to_cxl_mode(struct pci_dev *dev, uint64_t mode) +{ + struct pci_controller *hose = pci_bus_to_host(dev->bus); + struct pnv_phb *phb = hose->private_data; + struct pnv_ioda_pe *pe; + int rc; + + pe = pnv_ioda_get_pe(dev); + if (!pe) + return -ENODEV; + + pe_info(pe, "Switching PHB to CXL\n"); + + rc = opal_pci_set_phb_cxl_mode(phb->opal_id, mode, pe->pe_number); + if (rc == OPAL_UNSUPPORTED) + dev_err(>dev, "Required cxl mode not supported by firmware - update skiboot\n"); + else if (rc) + dev_err(>dev, "opal_pci_set_phb_cxl_mode failed: %i\n", rc); + + return rc; +} +EXPORT_SYMBOL(pnv_phb_to_cxl_mode); + +/* Find PHB for cxl dev and allocate MSI hwirqs? + * Returns the absolute hardware IRQ number + */ +int pnv_cxl_alloc_hwirqs(struct pci_dev *dev, int num) +{ + struct pci_controller *hose = pci_bus_to_host(dev->bus); + struct pnv_phb *phb = hose->private_data; + int hwirq = msi_bitmap_alloc_hwirqs(>msi_bmp, num); + + if (hwirq < 0) { + dev_warn(>dev, "Failed to find a free MSI\n"); + return -ENOSPC; + } + + return phb->msi_base + hwirq; +} +EXPORT_SYMBOL(pnv_cxl_alloc_hwirqs); + +void pnv_cxl_release_hwirqs(struct pci_dev *dev, int hwirq, int num) +{ + struct pci_controller *hose = pci_bus_to_host(dev->bus); + struct pnv_phb *phb = hose->private_data; + + msi_bitmap_free_hwirqs(>msi_bmp, hwirq - phb->msi_base, num); +} +EXPORT_SYMBOL(pnv_cxl_release_hwirqs); + +void pnv_cxl_release_hwirq_ranges(struct cxl_irq_ranges *irqs, + struct pci_dev *dev) +{ + struct pci_controller *hose = pci_bus_to_host(dev->bus); + struct pnv_phb *phb = hose->private_data; + int i, hwirq; + + for (i = 1; i < CXL_IRQ_RANGES; i++) { + if (!irqs->range[i]) + continue; + pr_devel("cxl release irq range 0x%x: offset: 0x%lx limit: 
%ld\n", +i, irqs->offset[i], +irqs->range[i]); + hwirq = irqs->offset[i] - phb->msi_base; + msi_bitmap_free_hwirqs(>msi_bmp, hwirq, + irqs->range[i]); + } +} +EXPORT_SYMBOL(pnv_cxl_release_hwirq_ranges); + +int pnv_cxl_alloc_hwirq_ranges(struct cxl_irq_ranges *irqs, + struct pci_dev *dev, int num) +{ + struct pci_controller *hose = pci_bus_to_host(dev->bus); + struct pnv_phb *phb = hose->private_data; + int i, hwirq, try; + + memset(irqs, 0, sizeof(struct cxl_irq_ranges)); + + /* 0 is reserved for the multiplexed PSL DSI interrupt */ + for (i = 1
powerpc / cxl: Add support for the Mellanox CX4 in cxl mode
This series adds support for the Mellanox CX4 network adapter operating in
cxl mode to the cxl driver and the PowerNV PHB code. The Mellanox
developers will submit a separate patch series that makes use of this in
the mlx5 driver.

The CX4 card can operate in either pci mode or cxl mode. In cxl mode,
memory accesses from the card go through the XSL (Translation Service
Layer, essentially a stripped down version of the Power Service Layer),
allowing it to transparently access unpinned memory with the cxl driver
handling faulting in pages as necessary, etc. Most of the support for the
XSL is already upstream, though this series does include a bug fix to
enable bus mastering for this (patch 3).

Patch 2 in this series provides an API which the mlx5 driver can query to
check if it is in a cxl capable slot.

The card will come up in pci mode, and the mlx5 driver can choose to
switch it to cxl mode, wherein it will reappear with an additional
physical function representing the XSL that the cxl driver will bind to.
Patches 12-14 add support for switching the card's mode, including using
the PCI hotplug support to re-enumerate the device tree and re-probe the
card.

Unlike previous users of the cxl kernel API where we used a virtual PHB
and exposed PCI devices under it, the Mellanox CX4 uses a peer model where
cxl binds to one of the physical functions of the card and the mlx5_core
driver binds to the other networking physical functions. Patches 6 and 7
add support for using the cxl kernel API with the real PHB to enable this
peer model. Patches 4 and 5 are preparatory patches exposing some APIs
that the PHB will need to call.

While in cxl mode, interrupts from the CX4 are a little unusual - they are
neither pci interrupts, nor cxl interrupts, but rather a hybrid of the
two. The interrupts are passed from the networking hardware to the XSL
using a custom format in the MSIX table, and from there are treated as cxl
interrupts. These are configured mostly transparently using the standard
msix APIs - the PHB handles allocating and configuring the cxl interrupts,
associating them with the default context, and the mlx5 driver handles
filling out the MSIX table with their custom format (not included in this
series). See patch 10.

Additionally, the CX4 has a hard limitation on the number of interrupts
that can be associated with a given context, so to overcome this patches 8
and 9 expose an API to allow the mlx5 driver to inform us of the limit,
and the interrupt allocation code in patch 10 will allocate additional
contexts to associate these with.

Patch 1 is a preparatory cleanup patch to reorganise cxl code in
arch/powerpc into a separate file.

Patch 11 is a workaround for a hardware limitation in the CX4 where a
context with PE=0 cannot be used.

Note that patch 2 depends on "cxl: Ignore CAPI adapters misplaced in
switched slot" by Philippe Bergheaud:
http://patchwork.ozlabs.org/patch/642920/

Additionally, the following stand-alone patches related to the CX4 are
also pending on the mailing list, but are *not* dependencies of this
series:
- cxl: Fix bug where AFU disable operation had no effect
- cxl: Workaround XSL bug that does not clear the RA bit after a reset
- cxl: Fix NULL pointer dereference on kernel contexts with no AFU
  interrupts

The entire series is bisectable.
Re: [PATCH] cxl: remove dead Kconfig options
Acked-by: Ian Munsie <imun...@au1.ibm.com>
Re: [PATCH v2] cxl: Ignore CAPI adapters misplaced in switched slots
Acked-by: Ian Munsie <imun...@au1.ibm.com>
Re: [PATCH] cxl: make base more explicitly non-modular
Acked-by: Ian Munsie <imun...@au1.ibm.com>
[PATCH] powerpc/fadump: Fix compile error due to missing semicolon
From: Ian Munsie <imun...@au1.ibm.com>

The commit "powerpc/fadump: trivial fix of spelling mistake, clean up
message" removed a semicolon causing the following compile failure:

  arch/powerpc/kernel/fadump.c: In function ‘fadump_invalidate_dump’:
  arch/powerpc/kernel/fadump.c:1014:2: error: expected ‘;’ before ‘}’ token
    }
    ^

Reported-by: Huy Nguyen <h...@mellanox.com>
Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
---
 arch/powerpc/kernel/fadump.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index f066486..b3a6633 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -1010,7 +1010,7 @@ static int fadump_invalidate_dump(struct fadump_mem_struct *fdm)
 
 	if (rc) {
 		pr_err("Failed to invalidate firmware-assisted dump registration. Unexpected error (%d).\n", rc);
-		return rc
+		return rc;
 	}
 	fw_dump.dump_active = 0;
 	fdm_active = NULL;
-- 
2.8.1
Re: [PATCH] cxl: Ignore CAPI adapters misplaced in switched slots
Thanks Philippe - this looks like a decent solution to the problem (and I
intend to use this for the upcoming cx4 support as well).

Acked-by: Ian Munsie <imun...@au1.ibm.com>

Excerpts from Philippe Bergheaud's message of 2016-06-30 13:45:37 +0200:
> One should not attempt to switch a PHB into CAPI mode if there is
> a switch between the PHB and the adapter. This patch modifies the
> cxl driver to ignore CAPI adapters misplaced in switched slots.
>
> Signed-off-by: Philippe Bergheaud <fe...@linux.vnet.ibm.com>
> ---
> This patch fixes Bz 142217.
>
>  drivers/misc/cxl/pci.c | 29 +
>  1 file changed, 29 insertions(+)
>
> diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
> index a08fcc8..2f978ed 100644
> --- a/drivers/misc/cxl/pci.c
> +++ b/drivers/misc/cxl/pci.c
> @@ -1280,6 +1280,30 @@ static void cxl_pci_remove_adapter(struct cxl *adapter)
>  	device_unregister(&adapter->dev);
>  }
>
> +#define CXL_MAX_PCIEX_PARENT 2
> +
> +static int cxl_slot_is_switched(struct pci_dev *dev)
> +{
> +	struct device_node *np;
> +	int depth = 0;
> +	const __be32 *prop;
> +
> +	if (!(np = pci_device_to_OF_node(dev))) {
> +		pr_err("cxl: np = NULL\n");
> +		return -ENODEV;
> +	}
> +	of_node_get(np);
> +	while (np) {
> +		np = of_get_next_parent(np);
> +		prop = of_get_property(np, "device_type", NULL);
> +		if (!prop || strcmp((char *)prop, "pciex"))
> +			break;
> +		depth++;
> +	}
> +	of_node_put(np);
> +	return (depth > CXL_MAX_PCIEX_PARENT);
> +}
> +
>  static int cxl_probe(struct pci_dev *dev, const struct pci_device_id *id)
>  {
>  	struct cxl *adapter;
> @@ -1291,6 +1315,11 @@ static int cxl_probe(struct pci_dev *dev, const struct pci_device_id *id)
>  		return -ENODEV;
>  	}
>
> +	if (cxl_slot_is_switched(dev)) {
> +		dev_dbg(&dev->dev, "cxl_init_adapter: Ignoring switched slot device\n");
> +		return -ENODEV;
> +	}
> +
>  	if (cxl_verbose)
>  		dump_cxl_config_space(dev);
>
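The depth check in the quoted patch can be modelled outside the kernel with a plain parent-linked tree. The following is only a sketch under assumptions: struct node, pciex_parent_depth() and slot_is_switched() are hypothetical stand-ins for the device tree node walk, and the topologies used to exercise it are invented for illustration rather than taken from real hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Same threshold as CXL_MAX_PCIEX_PARENT in the quoted patch. */
#define MAX_PCIEX_PARENT 2

/* Hypothetical stand-in for a device tree node. */
struct node {
	const char *device_type;	/* NULL if the property is absent */
	struct node *parent;
};

/* Count consecutive ancestors whose device_type is "pciex". */
static int pciex_parent_depth(const struct node *dev)
{
	const struct node *np;
	int depth = 0;

	assert(dev != NULL);

	for (np = dev->parent; np; np = np->parent) {
		if (!np->device_type || strcmp(np->device_type, "pciex") != 0)
			break;
		depth++;
	}
	return depth;
}

/* More "pciex" ancestors than expected implies an intervening switch. */
static bool slot_is_switched(const struct node *dev)
{
	return pciex_parent_depth(dev) > MAX_PCIEX_PARENT;
}
```

In this model a device with two "pciex" ancestors is treated as directly attached, while the extra upstream and downstream ports of a switch push the count past the threshold.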
[PATCH v2] cxl: Fix bug where AFU disable operation had no effect
From: Ian Munsie <imun...@au1.ibm.com>

The AFU disable operation has a bug where it will not clear the enable bit
and therefore will have no effect. To date this has likely been masked by
the fact that we perform an AFU reset before the disable, which also has
the effect of clearing the enable bit, making the following disable
operation effectively a noop on most hardware. This patch modifies the
afu_control function to take a parameter to clear from the AFU control
register so that the disable operation can clear the appropriate bit.

This bug was uncovered on the Mellanox CX4, which uses an XSL rather than
a PSL. On the XSL the reset operation will not complete while the AFU is
enabled, meaning the enable bit was still set at the start of the disable
and as a result this bug was hit and the disable also timed out.

Because of this difference in behaviour between the PSL and XSL, this
patch now makes the reset dependent on the card using a PSL to avoid
waiting for a timeout on the XSL. It is entirely possible that we may be
able to drop the reset altogether if it turns out we only ever needed it
due to this bug - however I am not willing to drop it without further
regression testing and have added comments to the code explaining the
background.

This also fixes a small issue where the AFU_Cntl register was read outside
of the lock that protects it.

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
---

Changes since v1:
- Modified comment to dedicated process disable path to explain the
  architected procedure, and the origin of our more heavyweight procedure.
- Modified comment to afu directed disable path with a note that the
  procedure is AFU specific, and explaining the origin of our heavyweight
  procedure.
- Removed needs_reset_before_disable condition from dedicated process
  disable path. The XSL in the CX4 does not use this mode (AFU directed
  only), and since the reset in this procedure is architected we *should*
  never need to skip it.
drivers/misc/cxl/cxl.h| 1 + drivers/misc/cxl/native.c | 58 +-- drivers/misc/cxl/pci.c| 1 + 3 files changed, 53 insertions(+), 7 deletions(-) diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h index ce2b9d5..bab8dfd 100644 --- a/drivers/misc/cxl/cxl.h +++ b/drivers/misc/cxl/cxl.h @@ -544,6 +544,7 @@ struct cxl_service_layer_ops { void (*write_timebase_ctrl)(struct cxl *adapter); u64 (*timebase_read)(struct cxl *adapter); int capi_mode; + bool needs_reset_before_disable; }; struct cxl_native { diff --git a/drivers/misc/cxl/native.c b/drivers/misc/cxl/native.c index 120c468..e774505 100644 --- a/drivers/misc/cxl/native.c +++ b/drivers/misc/cxl/native.c @@ -21,10 +21,10 @@ #include "cxl.h" #include "trace.h" -static int afu_control(struct cxl_afu *afu, u64 command, +static int afu_control(struct cxl_afu *afu, u64 command, u64 clear, u64 result, u64 mask, bool enabled) { - u64 AFU_Cntl = cxl_p2n_read(afu, CXL_AFU_Cntl_An); + u64 AFU_Cntl; unsigned long timeout = jiffies + (HZ * CXL_TIMEOUT); int rc = 0; @@ -33,7 +33,8 @@ static int afu_control(struct cxl_afu *afu, u64 command, trace_cxl_afu_ctrl(afu, command); - cxl_p2n_write(afu, CXL_AFU_Cntl_An, AFU_Cntl | command); + AFU_Cntl = cxl_p2n_read(afu, CXL_AFU_Cntl_An); + cxl_p2n_write(afu, CXL_AFU_Cntl_An, (AFU_Cntl & ~clear) | command); AFU_Cntl = cxl_p2n_read(afu, CXL_AFU_Cntl_An); while ((AFU_Cntl & mask) != result) { @@ -67,7 +68,7 @@ static int afu_enable(struct cxl_afu *afu) { pr_devel("AFU enable request\n"); - return afu_control(afu, CXL_AFU_Cntl_An_E, + return afu_control(afu, CXL_AFU_Cntl_An_E, 0, CXL_AFU_Cntl_An_ES_Enabled, CXL_AFU_Cntl_An_ES_MASK, true); } @@ -76,7 +77,8 @@ int cxl_afu_disable(struct cxl_afu *afu) { pr_devel("AFU disable request\n"); - return afu_control(afu, 0, CXL_AFU_Cntl_An_ES_Disabled, + return afu_control(afu, 0, CXL_AFU_Cntl_An_E, + CXL_AFU_Cntl_An_ES_Disabled, CXL_AFU_Cntl_An_ES_MASK, false); } @@ -85,7 +87,7 @@ static int native_afu_reset(struct cxl_afu *afu) { pr_devel("AFU 
reset request\n"); - return afu_control(afu, CXL_AFU_Cntl_An_RA, + return afu_control(afu, CXL_AFU_Cntl_An_RA, 0, CXL_AFU_Cntl_An_RS_Complete | CXL_AFU_Cntl_An_ES_Disabled, CXL_AFU_Cntl_An_RS_MASK | CXL_AFU_Cntl_An_ES_MASK, false); @@ -595,7 +597,33 @@ static int deactivate_afu_directed(struct cxl_afu *afu) cxl_sysfs_afu_m_remove(afu); cxl_chardev_afu_remove(afu); - cxl_ops-&
Re: [PATCH 1/2] cxl: Fix bug where AFU disable operation had no effect
Excerpts from Frederic Barrat's message of 2016-06-30 17:50:00 +0200:
>
> Le 30/06/2016 17:32, Ian Munsie a écrit :
> >> For dedicated mode, the CAIA recommends an explicit reset of the AFU
> >> (section 2.1.1).
> > True, I had forgotten that procedure was added to the document before it
> > was made public - I'll update the comment and resend.
> >
>
> Actually, my point was that for dedicated mode, we shouldn't have the
> "if" and always reset. It's only for dedicated mode, so it wouldn't
> impact cx4 and we would stay CAIA-compliant. If one day, there's a xsl
> with a dedicated mode AFU, they are expected to follow the spec.

Yeah, I thought of that as well while I was updating the patch and removed
that from the dedicated path. I still added a comment to that path to note
that though.

Cheers,
-Ian
Re: [PATCH 1/2] cxl: Fix bug where AFU disable operation had no effect
Excerpts from Frederic Barrat's message of 2016-06-30 16:19:54 +0200:
> I'm not a big fan of the new "clear" argument, which forces us to pass
> an extra 0 most of the time. Why not always clearing the "action" bits
> of the register before applying the command? They are mutually
> exclusive, so it should work. I.e. :
>
> +AFU_Cntl = cxl_p2n_read(afu, CXL_AFU_Cntl_An);
> +AFU_Cntl &= ~(CXL_AFU_Cntl_An_E | CXL_AFU_Cntl_An_RA);
> +AFU_Cntl |= command;

In theory that should be OK, but I'd want to test it on some PSL images
first just in case they don't behave quite how we expect since we've had
problems in this area in the past (although after discovering the bug in
the disable path that may provide the explanation for those problems).

The risk I see with that approach is setting the reset bit and clearing
the enable bit at the same time is commanding the hardware to do two
similar but subtly different operations simultaneously, which is not
something we have done before or tested - we can always clean this up in a
later patch after we've tested it on a couple of PSLs, cxlflash, etc and
are happy that it works with no ill effects. I don't see any reason that
needs to be done in this patch though since it's just churn inside the
driver and doesn't impact anyone else.

> >  static inline int detach_process_native_dedicated(struct cxl_context *ctx)
> >  {
> > -	cxl_ops->afu_reset(ctx->afu);
> > +	if (ctx->afu->adapter->native->sl_ops->needs_reset_before_disable) {
> > +		/*
> > +		 * XXX: We may be able to do away with this entirely - it is
> > +		 * possible that this was only ever needed due to a bug where
> > +		 * the disable operation did not clear the enable bit, however
> > +		 * I will only consider dropping this after more regression
> > +		 * testing on earlier PSL images.
> > +		 */
> > +		cxl_ops->afu_reset(ctx->afu);
> > +	}
>
> For dedicated mode, the CAIA recommends an explicit reset of the AFU
> (section 2.1.1).

True, I had forgotten that procedure was added to the document before it
was made public - I'll update the comment and resend.

> For directed mode, CAIA says it's AFU-specific.

Damnit, that should never have been architected as AFU specific - PSL
implementation specific would have been ok, but AFU specific is just
asking for trouble since this is all generic code with no way to identify
the AFU.

Oh well, in practice the AFU developers will be coding against our
implementation now, so I guess we effectively became the authority on how
this works by default... and therefore should probably continue to do the
reset here. Just hope there aren't too many more ASIC implementations that
implement some contradictory behaviour and are set in stone before anyone
realises.

I'll resend this with these two comments updated.

> So for xsl, we only disable the afu and purge the xsl. Are we getting
> rid of the reset because it's useless in that environment, or because
> it times out?
>
> If just because of timeout, would it make sense to switch the order
> and disable first, then reset? I don't see any afu-reset on the next
> activation.

It times out if the AFU is enabled when we attempt the reset - that's OK,
but is a bit of a waste of time and generates unnecessary noise in the
kernel log.

We could switch the order, but I don't think that the AFU reset is
necessary - the documentation certainly suggests that only a disable is
required (but then again, the XSL has been full of surprises already so I
reserve the right to send a later patch adding a reset if it has one more
in store for me :-p).

A reset is also pretty meaningless on the XSL as far as I can tell - it
looks like it will assert a bit to the CX4 hardware, so I assume the rest
of the card could choose to do something with it, but unlike the FPGA
cards I don't think it actually resets anything (and I doubt we even need
the one we do while initialising the card since the AFU descriptor is in
the CX4 hardware and should be readable without that).

Cheers,
-Ian
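The difference between the register updates discussed in this exchange can be seen with plain bit arithmetic. The sketch below is not driver code: CTL_E and CTL_RA are hypothetical bit positions standing in for CXL_AFU_Cntl_An_E and CXL_AFU_Cntl_An_RA (the real values are not given in this thread), and the three helper names are invented for the comparison.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen only for illustration. */
#define CTL_E	(UINT64_C(1) << 0)	/* stands in for the enable bit */
#define CTL_RA	(UINT64_C(1) << 56)	/* stands in for the reset bit */

/* Original behaviour: bits can only ever be set, never cleared. */
static uint64_t ctl_or_only(uint64_t reg, uint64_t command)
{
	return reg | command;
}

/* Behaviour in the patch: the caller names the bits to clear. */
static uint64_t ctl_with_clear(uint64_t reg, uint64_t command, uint64_t clear)
{
	/* A command must not be cleared in the same update. */
	assert((command & clear) == 0);
	return (reg & ~clear) | command;
}

/* Suggested alternative: always clear both action bits first. */
static uint64_t ctl_clear_actions(uint64_t reg, uint64_t command)
{
	return (reg & ~(CTL_E | CTL_RA)) | command;
}
```

With an enabled AFU (reg == CTL_E), ctl_or_only(reg, 0) leaves the enable bit set, which is the bug being fixed. ctl_with_clear(reg, CTL_RA, 0) keeps the enable bit while requesting a reset, whereas ctl_clear_actions(reg, CTL_RA) requests the reset and drops the enable bit in the same write, which is the combined operation the reply above wants tested on real PSL images first.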
Re: [PATCH] cxl: Fix NULL pointer dereference on kernel contexts with no AFU interrupts
Excerpts from andrew.donnellan's message of 2016-06-30 15:15:02 +1000:
> On 30/06/16 15:00, Michael Ellerman wrote:
> > On Thu, 2016-06-30 at 08:28 +1000, Andrew Donnellan wrote:
> >> On 30/06/16 04:55, Ian Munsie wrote:
> >>>
> >>> From: Ian Munsie <imun...@au1.ibm.com>
> >>>
> >>> If a kernel context is initialised and does not have any AFU interrupts
> >>> allocated it will cause a NULL pointer dereference when the context is
> >>> detached since the irq_names list will not have been initialised.
> >>>
> >>> Move the initialisation of the irq_names list into the cxl_context_init
> >>> routine so that it will be valid for the entire lifetime of the context
> >>> and will not cause a NULL pointer dereference.
> >>>
> >>> Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
> >
> >> As it's nice having your machine not crash on every shutdown...
> >
> > Fixes:
>
> Ian can correct me if I'm wrong, but I suspect this doesn't affect
> cxlflash (the only current user of the cxl kernel API) - this issue was
> hit while working on CAPI support for mlx5.

Correct - no current user hits this bug, but the upcoming mlx5 support does
because of the way it uses interrupts.

-Ian
[PATCH] cxl: Fix NULL pointer dereference on kernel contexts with no AFU interrupts
From: Ian Munsie <imun...@au1.ibm.com>

If a kernel context is initialised and does not have any AFU interrupts
allocated it will cause a NULL pointer dereference when the context is
detached since the irq_names list will not have been initialised.

Move the initialisation of the irq_names list into the cxl_context_init
routine so that it will be valid for the entire lifetime of the context
and will not cause a NULL pointer dereference.

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
---
 drivers/misc/cxl/context.c | 2 ++
 drivers/misc/cxl/irq.c     | 3 ---
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/misc/cxl/context.c b/drivers/misc/cxl/context.c
index 26d206b..edbb99e 100644
--- a/drivers/misc/cxl/context.c
+++ b/drivers/misc/cxl/context.c
@@ -67,6 +67,8 @@ int cxl_context_init(struct cxl_context *ctx, struct cxl_afu *afu, bool master,
 	ctx->pending_fault = false;
 	ctx->pending_afu_err = false;
 
+	INIT_LIST_HEAD(&ctx->irq_names);
+
 	/*
 	 * When we have to destroy all contexts in cxl_context_detach_all() we
 	 * end up with afu_release_irqs() called from inside a
diff --git a/drivers/misc/cxl/irq.c b/drivers/misc/cxl/irq.c
index 8def455..f3a7d4a 100644
--- a/drivers/misc/cxl/irq.c
+++ b/drivers/misc/cxl/irq.c
@@ -260,9 +260,6 @@ int afu_allocate_irqs(struct cxl_context *ctx, u32 count)
 	else
 		alloc_count = count + 1;
 
-	/* Initialize the list head to hold irq names */
-	INIT_LIST_HEAD(&ctx->irq_names);
-
 	if ((rc = cxl_ops->alloc_irq_ranges(&ctx->irqs, ctx->afu->adapter,
 							alloc_count)))
 		return rc;
-- 
2.8.1
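For readers following along, here is a minimal userspace sketch of why the initialisation has to happen at context creation. It is not the driver code: struct list_node, struct demo_context and the demo_* helpers are hypothetical stand-ins for the kernel's struct list_head, struct cxl_context and the release path, and the NULL check only exists so the failure can be observed without crashing.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct list_head. */
struct list_node {
	struct list_node *next, *prev;
};

/* An empty circular list points at itself, as INIT_LIST_HEAD() arranges. */
static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

static bool list_is_empty(const struct list_node *head)
{
	return head->next == head;
}

/* Hypothetical, cut-down context holding only the field relevant here. */
struct demo_context {
	struct list_node irq_names;
};

/* Initialise the list when the context is created, not when IRQs are allocated. */
static void demo_context_init(struct demo_context *ctx)
{
	list_init(&ctx->irq_names);
}

/*
 * Teardown walks the list unconditionally. Returns the number of entries
 * visited, or -1 where a never-initialised (zeroed) head would have led to
 * a NULL pointer dereference.
 */
static int demo_release_irq_names(struct demo_context *ctx)
{
	struct list_node *pos;
	int visited = 0;

	if (ctx->irq_names.next == NULL)
		return -1;

	for (pos = ctx->irq_names.next; pos != &ctx->irq_names; pos = pos->next)
		visited++;

	return visited;
}
```

A context that never allocates interrupts still goes through demo_context_init(), so the release path sees an empty list rather than a zeroed head.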
[PATCH 2/2] cxl: Workaround XSL bug that does not clear the RA bit after a reset
From: Ian Munsie <imun...@au1.ibm.com>

An issue was noted in our debug logs where the XSL would leave the RA
bit asserted after an AFU reset operation, which would effectively
prevent further AFU reset operations from working.

Workaround the issue by clearing the RA bit with an MMIO write if it is
still asserted after any AFU control operation.

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
---
 drivers/misc/cxl/native.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/misc/cxl/native.c b/drivers/misc/cxl/native.c
index 9479bfc..bc79be8 100644
--- a/drivers/misc/cxl/native.c
+++ b/drivers/misc/cxl/native.c
@@ -55,6 +55,16 @@ static int afu_control(struct cxl_afu *afu, u64 command, u64 clear,
 		cpu_relax();
 		AFU_Cntl = cxl_p2n_read(afu, CXL_AFU_Cntl_An);
 	};
+
+	if (AFU_Cntl & CXL_AFU_Cntl_An_RA) {
+		/*
+		 * Workaround for a bug in the XSL used in the Mellanox CX4
+		 * that fails to clear the RA bit after an AFU reset,
+		 * preventing subsequent AFU resets from working.
+		 */
+		cxl_p2n_write(afu, CXL_AFU_Cntl_An, AFU_Cntl & ~CXL_AFU_Cntl_An_RA);
+	}
+
 	pr_devel("AFU command complete: %llx\n", command);
 	afu->enabled = enabled;
 out:
-- 
2.8.1
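The workaround can be pictured with a small simulation, assuming a plain variable in place of the MMIO register. The DEMO_* bit positions and the demo_* accessors are made up for illustration and do not match the real CXL_AFU_Cntl_An layout or cxl_p2n_read/cxl_p2n_write.

```c
#include <stdint.h>

/* Hypothetical bit positions; the real definitions live in cxl.h. */
#define DEMO_CNTL_E  (1ull << 0)	/* enable */
#define DEMO_CNTL_RA (1ull << 1)	/* reset request */

/* Simulated AFU with a single control register. */
struct demo_afu {
	uint64_t cntl;
};

static uint64_t demo_read(const struct demo_afu *afu)
{
	return afu->cntl;
}

static void demo_write(struct demo_afu *afu, uint64_t val)
{
	afu->cntl = val;
}

/*
 * After a control operation has completed, write the register back with RA
 * cleared if the hardware left it asserted. Returns 1 if the workaround
 * was applied, 0 if RA was already clear.
 */
static int demo_clear_stale_ra(struct demo_afu *afu)
{
	uint64_t cntl = demo_read(afu);

	if (cntl & DEMO_CNTL_RA) {
		demo_write(afu, cntl & ~DEMO_CNTL_RA);
		return 1;
	}
	return 0;
}
```

Only the RA bit is touched, so whatever enable state the control operation produced is preserved.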
[PATCH 1/2] cxl: Fix bug where AFU disable operation had no effect
From: Ian Munsie <imun...@au1.ibm.com>

The AFU disable operation has a bug where it will not clear the enable
bit and therefore will have no effect. To date this has likely been
masked by the fact that we perform an AFU reset before the disable,
which also has the effect of clearing the enable bit, making the
following disable operation effectively a noop on most hardware. This
patch modifies the afu_control function to take a parameter to clear
from the AFU control register so that the disable operation can clear
the appropriate bit.

This bug was uncovered on the Mellanox CX4, which uses an XSL rather
than a PSL. On the XSL the reset operation will not complete while the
AFU is enabled, meaning the enable bit was still set at the start of the
disable and as a result this bug was hit and the disable also timed out.

Because of this difference in behaviour between the PSL and XSL, this
patch now makes the reset dependent on the card using a PSL to avoid
waiting for a timeout on the XSL. It is entirely possible that we may be
able to drop the reset altogether if it turns out we only ever needed it
due to this bug - however I am not willing to drop it without further
regression testing.

This also fixes a small issue where the AFU_Cntl register was read
outside of the lock that protects it.

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
---
 drivers/misc/cxl/cxl.h    |  1 +
 drivers/misc/cxl/native.c | 36
 drivers/misc/cxl/pci.c    |  1 +
 3 files changed, 30 insertions(+), 8 deletions(-)

diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index ce2b9d5..bab8dfd 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -544,6 +544,7 @@ struct cxl_service_layer_ops {
 	void (*write_timebase_ctrl)(struct cxl *adapter);
 	u64 (*timebase_read)(struct cxl *adapter);
 	int capi_mode;
+	bool needs_reset_before_disable;
 };
 
 struct cxl_native {
diff --git a/drivers/misc/cxl/native.c b/drivers/misc/cxl/native.c
index 120c468..9479bfc 100644
--- a/drivers/misc/cxl/native.c
+++ b/drivers/misc/cxl/native.c
@@ -21,10 +21,10 @@
 #include "cxl.h"
 #include "trace.h"
 
-static int afu_control(struct cxl_afu *afu, u64 command,
+static int afu_control(struct cxl_afu *afu, u64 command, u64 clear,
 		       u64 result, u64 mask, bool enabled)
 {
-	u64 AFU_Cntl = cxl_p2n_read(afu, CXL_AFU_Cntl_An);
+	u64 AFU_Cntl;
 	unsigned long timeout = jiffies + (HZ * CXL_TIMEOUT);
 	int rc = 0;
 
@@ -33,7 +33,8 @@ static int afu_control(struct cxl_afu *afu, u64 command,
 
 	trace_cxl_afu_ctrl(afu, command);
 
-	cxl_p2n_write(afu, CXL_AFU_Cntl_An, AFU_Cntl | command);
+	AFU_Cntl = cxl_p2n_read(afu, CXL_AFU_Cntl_An);
+	cxl_p2n_write(afu, CXL_AFU_Cntl_An, (AFU_Cntl & ~clear) | command);
 
 	AFU_Cntl = cxl_p2n_read(afu, CXL_AFU_Cntl_An);
 	while ((AFU_Cntl & mask) != result) {
@@ -67,7 +68,7 @@ static int afu_enable(struct cxl_afu *afu)
 {
 	pr_devel("AFU enable request\n");
 
-	return afu_control(afu, CXL_AFU_Cntl_An_E,
+	return afu_control(afu, CXL_AFU_Cntl_An_E, 0,
 			   CXL_AFU_Cntl_An_ES_Enabled,
 			   CXL_AFU_Cntl_An_ES_MASK, true);
 }
@@ -76,7 +77,8 @@ int cxl_afu_disable(struct cxl_afu *afu)
 {
 	pr_devel("AFU disable request\n");
 
-	return afu_control(afu, 0, CXL_AFU_Cntl_An_ES_Disabled,
+	return afu_control(afu, 0, CXL_AFU_Cntl_An_E,
+			   CXL_AFU_Cntl_An_ES_Disabled,
 			   CXL_AFU_Cntl_An_ES_MASK, false);
 }
 
@@ -85,7 +87,7 @@ static int native_afu_reset(struct cxl_afu *afu)
 {
 	pr_devel("AFU reset request\n");
 
-	return afu_control(afu, CXL_AFU_Cntl_An_RA,
+	return afu_control(afu, CXL_AFU_Cntl_An_RA, 0,
 			   CXL_AFU_Cntl_An_RS_Complete | CXL_AFU_Cntl_An_ES_Disabled,
 			   CXL_AFU_Cntl_An_RS_MASK | CXL_AFU_Cntl_An_ES_MASK,
 			   false);
@@ -595,7 +597,16 @@ static int deactivate_afu_directed(struct cxl_afu *afu)
 	cxl_sysfs_afu_m_remove(afu);
 	cxl_chardev_afu_remove(afu);
 
-	cxl_ops->afu_reset(afu);
+	if (afu->adapter->native->sl_ops->needs_reset_before_disable) {
+		/*
+		 * XXX: We may be able to do away with this entirely - it is
+		 * possible that this was only ever needed due to a bug where
+		 * the disable operation did not clear the enable bit, however
+		 * I will only consider dropping this after more regression
+		 * testing on earlier PSL images.
+		 */
+		cxl_ops->afu_reset(afu);
+	}
 	cxl_afu_disable(afu);
 	cxl_psl_purge(afu);
 
@@ -735,7 +746,16 @@ static int native_attach_process(struct cxl_context *ctx, bool kernel, stati
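The heart of the fix is the value written back to the control register. A small sketch of the old and new computations, using a made-up DEMO_CNTL_E bit rather than the real CXL_AFU_Cntl_An_E, shows why a disable with command == 0 used to be a no-op:

```c
#include <stdint.h>

/* Hypothetical enable bit, for illustration only. */
#define DEMO_CNTL_E (1ull << 0)

/* Old behaviour: bits can only be OR-ed in, never removed. */
static uint64_t demo_next_cntl_old(uint64_t current, uint64_t command)
{
	return current | command;
}

/* New behaviour: drop the 'clear' bits first, then set the 'command' bits. */
static uint64_t demo_next_cntl(uint64_t current, uint64_t command, uint64_t clear)
{
	return (current & ~clear) | command;
}
```

With the enable bit set, demo_next_cntl_old(DEMO_CNTL_E, 0) leaves it set, while demo_next_cntl(DEMO_CNTL_E, 0, DEMO_CNTL_E) clears it. An enable request passes clear == 0 and behaves as before.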
[PATCH 2/2] cxl: Fix allocating a minimum of 2 pages for the SPA
From: Ian Munsie <imun...@au1.ibm.com>

The Scheduled Process Area is allocated dynamically with enough pages to
fit at least as many processes as the AFU descriptor indicated. Since
the calculation is non-trivial, it does this by calculating how many
processes could fit in an allocation of a given order, and increasing
that order until it can fit enough processes or hits the maximum
supported size.

Currently, it will start this search using a SPA of 2 pages instead of
1. This can waste a page of memory if the AFU's maximum number of
supported processes was small enough to fit in one page.

Fix the algorithm to start the search at 1 page.

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
---
 drivers/misc/cxl/native.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/cxl/native.c b/drivers/misc/cxl/native.c
index e80d8f7..120c468 100644
--- a/drivers/misc/cxl/native.c
+++ b/drivers/misc/cxl/native.c
@@ -189,7 +189,7 @@ int cxl_alloc_spa(struct cxl_afu *afu)
 	unsigned spa_size;
 
 	/* Work out how many pages to allocate */
-	afu->native->spa_order = 0;
+	afu->native->spa_order = -1;
 	do {
 		afu->native->spa_order++;
 		spa_size = (1 << afu->native->spa_order) * PAGE_SIZE;
-- 
2.8.1
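To see the off-by-one concretely, here is a standalone sketch of the search loop. Several things are assumptions rather than quotes from the driver: 64K pages, a 1MB upper bound on the SPA, and the capacity formula in demo_spa_max_procs(), which was picked because it reproduces the figure of 958 processes for a 2 page SPA mentioned elsewhere in this series. Unlike the hunk above, the sketch uses a signed order and steps back when the limit is hit.

```c
#define DEMO_PAGE_SIZE 65536u		/* assumed 64K pages */
#define DEMO_MAX_SPA   0x100000u	/* assumed maximum SPA size */

/* Assumed capacity formula: how many process elements fit in spa_size bytes. */
static unsigned demo_spa_max_procs(unsigned spa_size)
{
	return ((spa_size / 8) - 96) / 17;
}

/*
 * Search for the smallest allocation order that fits num_procs, starting
 * just above start_order. Passing 0 mimics the old code (first candidate is
 * 2 pages), passing -1 mimics the fix (first candidate is 1 page).
 * The capacity of the chosen order is stored in *max_procs.
 */
static int demo_spa_order(unsigned num_procs, int start_order, unsigned *max_procs)
{
	int order = start_order;
	unsigned spa_size;

	*max_procs = 0;
	do {
		order++;
		spa_size = (1u << order) * DEMO_PAGE_SIZE;
		if (spa_size > DEMO_MAX_SPA) {
			/* Too large: fall back to the last order that fitted. */
			order--;
			break;
		}
		*max_procs = demo_spa_max_procs(spa_size);
	} while (*max_procs < num_procs);

	return order;
}
```

Under these assumptions an AFU asking for 100 processes gets order 1 (2 pages) with the old starting point and order 0 (1 page) with the new one.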
[PATCH 1/2] cxl: Fix allowing bogus AFU descriptors with 0 maximum processes
From: Ian Munsie <imun...@au1.ibm.com>

If the AFU descriptor of an AFU directed AFU indicates that it supports
0 maximum processes, we will accept that value and attempt to use it.
The SPA will still be allocated (with 2 pages due to another minor bug
and room for 958 processes), and when a context is allocated we will
pass the value of 0 to idr_alloc as the maximum. However, idr_alloc will
treat that as meaning no maximum and will allocate a context number and
we return a valid context.

Conceivably, this could lead to a buffer overflow of the SPA if more
than 958 contexts were allocated, however this is mitigated by the fact
that there are no known AFUs in the wild with a bogus AFU descriptor
like this, and that only the root user is allowed to flash an AFU image
to a card.

Add a check when validating the AFU descriptor to reject any with 0
maximum processes.

We do still allow a dedicated process only AFU to indicate that it
supports 0 contexts even though that is forbidden in the architecture,
as in that case we ignore the value and use 1 instead. This is just on
the off-chance that such a dedicated process AFU may exist (not that I
am aware of any), since their developers are less likely to have cared
about this value at all.

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
---
 drivers/misc/cxl/pci.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index 648817a..58d7d821 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -775,6 +775,21 @@ static int cxl_afu_descriptor_looks_ok(struct cxl_afu *afu)
 		}
 	}
 
+	if ((afu->modes_supported & ~CXL_MODE_DEDICATED) && afu->max_procs_virtualised == 0) {
+		/*
+		 * We could also check this for the dedicated process model
+		 * since the architecture indicates it should be set to 1, but
+		 * in that case we ignore the value and I'd rather not risk
+		 * breaking any existing dedicated process AFUs that left it as
+		 * 0 (not that I'm aware of any). It is clearly an error for an
+		 * AFU directed AFU to set this to 0, and would have previously
+		 * triggered a bug resulting in the maximum not being enforced
+		 * at all since idr_alloc treats 0 as no maximum.
+		 */
+		dev_err(&afu->dev, "AFU does not support any processes\n");
+		return -EINVAL;
+	}
+
 	return 0;
 }
 
-- 
2.8.1
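The condition in the new check is compact, so a standalone sketch may help. The DEMO_MODE_* values and demo_check_max_procs() are hypothetical stand-ins for the driver's mode flags and descriptor validation. The point is only the masking: any supported mode other than dedicated makes a zero maximum an error.

```c
#include <errno.h>

/* Hypothetical mode flags, for illustration only. */
#define DEMO_MODE_DEDICATED 0x1u
#define DEMO_MODE_DIRECTED  0x2u

/*
 * Reject a descriptor that advertises zero processes, unless the AFU is
 * dedicated-process only (where the value is ignored and 1 is used).
 */
static int demo_check_max_procs(unsigned modes_supported, unsigned max_procs)
{
	if ((modes_supported & ~DEMO_MODE_DEDICATED) && max_procs == 0)
		return -EINVAL;

	return 0;
}
```

A dedicated-only AFU with 0 passes, while an AFU supporting both modes with 0 is rejected because the directed bit survives the mask.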
Re: [v6, 1/2] cxl: Add mechanism for delivering AFU driver specific events
Excerpts from Vaibhav Jain's message of 2016-06-20 14:20:16 +0530:
> > +int cxl_unset_driver_ops(struct cxl_context *ctx)
> > +{
> > +	if (atomic_read(&ctx->afu_driver_events))
> > +		return -EBUSY;
> > +
> > +	ctx->afu_driver_ops = NULL;
> Need a write memory barrier so that afu_driver_ops isnt possibly called
> after this store.

What situation do you think this will help? I haven't looked closely at
the last few iterations of this patch set, but if you're in a situation
where you might be racing with some code doing e.g.

	if (ctx->afu_driver_ops)
		ctx->afu_driver_ops->something();

You have a race with or without a memory barrier. Ideally you would just
have the caller guarantee that it will only call cxl_unset_driver_ops if
no further calls to afu_driver_ops are possible, otherwise you may need
locking here which would be far from ideal.

What exactly is the use case for this API? I'd vote to drop it if we can
do without it.

-Ian
Re: [PATCH] powerpc/mm: Prevent unlikely crash in copro_calculate_slb()
Acked-by: Ian Munsie <imun...@au1.ibm.com>
Re: [PATCH] cxl: Make vPHB device node match adapter's
This could probably use a description in the commit message, perhaps
including output showing the before/after difference this makes to
lsvpd, but otherwise it looks fine to me.

@Mikey - this look OK to you?

Acked-by: Ian Munsie <imun...@au1.ibm.com>

Excerpts from Frederic Barrat's message of 2016-06-15 16:42:16 +0200:
> Tested by cxlflash on bare-metal and powerVM.
>
> Signed-off-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>
> ---
>  drivers/misc/cxl/vphb.c | 21 ++---
>  1 file changed, 10 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/misc/cxl/vphb.c b/drivers/misc/cxl/vphb.c
> index cdc7723..012b6aa 100644
> --- a/drivers/misc/cxl/vphb.c
> +++ b/drivers/misc/cxl/vphb.c
> @@ -208,20 +208,19 @@ static struct pci_controller_ops cxl_pci_controller_ops =
>
>  int cxl_pci_vphb_add(struct cxl_afu *afu)
>  {
> -	struct pci_dev *phys_dev;
> -	struct pci_controller *phb, *phys_phb;
> +	struct pci_controller *phb;
>  	struct device_node *vphb_dn;
>  	struct device *parent;
>
> -	if (cpu_has_feature(CPU_FTR_HVMODE)) {
> -		phys_dev = to_pci_dev(afu->adapter->dev.parent);
> -		phys_phb = pci_bus_to_host(phys_dev->bus);
> -		vphb_dn = phys_phb->dn;
> -		parent = &phys_dev->dev;
> -	} else {
> -		vphb_dn = afu->adapter->dev.parent->of_node;
> -		parent = afu->adapter->dev.parent;
> -	}
> +	/* The parent device is the adapter. Reuse the device node of
> +	 * the adapter.
> +	 * We don't seem to care what device node is used for the vPHB,
> +	 * but tools such as lsvpd walk up the device parents looking
> +	 * for a valid location code, so we might as well show devices
> +	 * attached to the adapter as being located on that adapter.
> +	 */
> +	parent = afu->adapter->dev.parent;
> +	vphb_dn = parent->of_node;
>
>  	/* Alloc and setup PHB data structure */
>  	phb = pcibios_alloc_controller(vphb_dn);
[PATCH, RFC] cxl: Add support for CAPP DMA mode
From: Ian Munsie <imun...@au1.ibm.com>

This adds support for using CAPP DMA mode, which is required for XSL
based cards such as the Mellanox CX4 to function.

This is currently an RFC as it depends on the corresponding support to
be merged into skiboot first, which was submitted here:
http://patchwork.ozlabs.org/patch/625582/

In the event that the skiboot on the system does not have the above
support, it will indicate as such in the kernel log and abort the init
process.

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
---
 arch/powerpc/include/asm/opal-api.h       | 1 +
 arch/powerpc/platforms/powernv/pci-ioda.c | 4 +++-
 drivers/misc/cxl/cxl.h                    | 1 +
 drivers/misc/cxl/pci.c                    | 4 +++-
 4 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
index 9bb8ddf..d29c584 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -825,6 +825,7 @@ enum {
 	OPAL_PHB_CAPI_MODE_CAPI = 1,
 	OPAL_PHB_CAPI_MODE_SNOOP_OFF= 2,
 	OPAL_PHB_CAPI_MODE_SNOOP_ON = 3,
+	OPAL_PHB_CAPI_MODE_DMA = 4,
 };
 
 /* OPAL I2C request */
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 3a5ea82..5a42e98 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2793,7 +2793,9 @@ int pnv_phb_to_cxl_mode(struct pci_dev *dev, uint64_t mode)
 	pe_info(pe, "Switching PHB to CXL\n");
 
 	rc = opal_pci_set_phb_cxl_mode(phb->opal_id, mode, pe->pe_number);
-	if (rc)
+	if (rc == OPAL_UNSUPPORTED)
+		dev_err(&dev->dev, "Required cxl mode not supported by firmware - update skiboot\n");
+	else if (rc)
 		dev_err(&dev->dev, "opal_pci_set_phb_cxl_mode failed: %i\n", rc);
 
 	return rc;
diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index 92e7f19..aa69a84 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -542,6 +542,7 @@ struct cxl_service_layer_ops {
 	void (*debugfs_stop_trace)(struct cxl *adapter);
 	void (*write_timebase_ctrl)(struct cxl *adapter);
 	u64 (*timebase_read)(struct cxl *adapter);
+	int capi_mode;
 };
 
 struct cxl_native {
diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index 556718d..648817a 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -1249,7 +1249,7 @@ static int cxl_configure_adapter(struct cxl *adapter, struct pci_dev *dev)
 	if ((rc = adapter->native->sl_ops->adapter_regs_init(adapter, dev)))
 		goto err;
 
-	if ((rc = pnv_phb_to_cxl_mode(dev, OPAL_PHB_CAPI_MODE_CAPI)))
+	if ((rc = pnv_phb_to_cxl_mode(dev, adapter->native->sl_ops->capi_mode)))
 		goto err;
 
 	/* If recovery happened, the last step is to turn on snooping.
@@ -1293,6 +1293,7 @@ static const struct cxl_service_layer_ops psl_ops = {
 	.debugfs_stop_trace = cxl_stop_trace,
 	.write_timebase_ctrl = write_timebase_ctrl_psl,
 	.timebase_read = timebase_read_psl,
+	.capi_mode = OPAL_PHB_CAPI_MODE_CAPI,
 };
 
 static const struct cxl_service_layer_ops xsl_ops = {
@@ -1300,6 +1301,7 @@ static const struct cxl_service_layer_ops xsl_ops = {
 	.debugfs_add_adapter_sl_regs = cxl_debugfs_add_adapter_xsl_regs,
 	.write_timebase_ctrl = write_timebase_ctrl_xsl,
 	.timebase_read = timebase_read_xsl,
+	.capi_mode = OPAL_PHB_CAPI_MODE_DMA,
 };
 
 static void set_sl_ops(struct cxl *adapter, struct pci_dev *dev)
-- 
2.8.1
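The pattern here is a per-service-layer ops table carrying a data member alongside the callbacks, so the common configure path never tests for PSL or XSL directly. A cut-down sketch follows, with hypothetical names throughout: the demo_* types, the firmware_has_dma flag and DEMO_UNSUPPORTED are stand-ins, the latter an arbitrary value rather than the real OPAL_UNSUPPORTED.

```c
/* Mode numbers mirror the enum in the patch; everything else is made up. */
enum demo_phb_mode {
	DEMO_MODE_CAPI = 1,
	DEMO_MODE_DMA = 4,
};

#define DEMO_UNSUPPORTED (-1)	/* arbitrary stand-in for a firmware error code */

/* Cut-down ops table: only the data member added by the patch. */
struct demo_sl_ops {
	const char *name;
	int capi_mode;
};

static const struct demo_sl_ops demo_psl_ops = {
	.name = "psl",
	.capi_mode = DEMO_MODE_CAPI,
};

static const struct demo_sl_ops demo_xsl_ops = {
	.name = "xsl",
	.capi_mode = DEMO_MODE_DMA,
};

/* Simulated firmware call: older firmware does not know about DMA mode. */
static int demo_firmware_set_mode(int firmware_has_dma, int mode)
{
	if (mode == DEMO_MODE_DMA && !firmware_has_dma)
		return DEMO_UNSUPPORTED;
	return 0;
}

/* The common path just forwards whatever mode the ops table asks for. */
static int demo_configure_adapter(const struct demo_sl_ops *ops, int firmware_has_dma)
{
	return demo_firmware_set_mode(firmware_has_dma, ops->capi_mode);
}
```

In this sketch a PSL card works on either firmware, while an XSL card on old firmware gets the unsupported error that the patch turns into the "update skiboot" message.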
[PATCH] cxl: Abstract the differences between the PSL and XSL
From: Frederic Barrat <fbar...@linux.vnet.ibm.com>

The XSL (Translation Service Layer) is a stripped down version of the
PSL (Power Service Layer) used in some cards such as the Mellanox CX4.

Like the PSL, it implements the CAIA architecture, but has a number of
differences, mostly in its implementation dependent registers. This
adds an ops structure to abstract these differences to bring initial
support for XSL CAPI devices.

The XSL does not implement the optional architected SERR register,
however while it treats it as a reserved register and should work with
no special treatment, attempting to access it will cause the XSL_FEC
(First Error Capture) register to be filled out, preventing it from
capturing any subsequent errors. Therefore, this patch also prevents the
kernel from trying to set up the SERR register so that the FEC register
may still be useful, and to save one interrupt.

The XSL also uses a special DMA cxl mode, which uses a slightly
different init sequence for the CAPP and PHB. The kernel support for
this will be in a future patch once the corresponding support has been
merged into skiboot.

Co-authored-by: Ian Munsie <imun...@au1.ibm.com>
Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
---
 drivers/misc/cxl/cxl.h     |  26
 drivers/misc/cxl/debugfs.c |  35 ---
 drivers/misc/cxl/native.c  |  53 +++-
 drivers/misc/cxl/pci.c     | 152 ++---
 4 files changed, 218 insertions(+), 48 deletions(-)

diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index d12a035..92e7f19 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -81,6 +81,7 @@ static const cxl_p1_reg_t CXL_PSL_TLBIA = {0x00A8};
 static const cxl_p1_reg_t CXL_PSL_AFUSEL = {0x00B0};
 
 /* 0x00C0:7EFF Implementation dependent area */
+/* PSL registers */
 static const cxl_p1_reg_t CXL_PSL_FIR1 = {0x0100};
 static const cxl_p1_reg_t CXL_PSL_FIR2 = {0x0108};
 static const cxl_p1_reg_t CXL_PSL_Timebase = {0x0110};
@@ -91,6 +92,11 @@ static const cxl_p1_reg_t CXL_PSL_FIR_CNTL = {0x0148};
 static const cxl_p1_reg_t CXL_PSL_DSNDCTL = {0x0150};
 static const cxl_p1_reg_t CXL_PSL_SNWRALLOC = {0x0158};
 static const cxl_p1_reg_t CXL_PSL_TRACE = {0x0170};
+/* XSL registers (Mellanox CX4) */
+static const cxl_p1_reg_t CXL_XSL_Timebase = {0x0100};
+static const cxl_p1_reg_t CXL_XSL_TB_CTLSTAT = {0x0108};
+static const cxl_p1_reg_t CXL_XSL_FEC = {0x0158};
+static const cxl_p1_reg_t CXL_XSL_DSNCTL= {0x0168};
 /* 0x7F00:7FFF Reserved PCIe MSI-X Pending Bit Array area */
 /* 0x8000: Reserved PCIe MSI-X Table Area */
 
@@ -524,6 +530,20 @@ struct cxl_context {
 	struct rcu_head rcu;
 };
 
+struct cxl_service_layer_ops {
+	int (*adapter_regs_init)(struct cxl *adapter, struct pci_dev *dev);
+	int (*afu_regs_init)(struct cxl_afu *afu);
+	int (*register_serr_irq)(struct cxl_afu *afu);
+	void (*release_serr_irq)(struct cxl_afu *afu);
+	void (*debugfs_add_adapter_sl_regs)(struct cxl *adapter, struct dentry *dir);
+	void (*debugfs_add_afu_sl_regs)(struct cxl_afu *afu, struct dentry *dir);
+	void (*psl_irq_dump_registers)(struct cxl_context *ctx);
+	void (*err_irq_dump_registers)(struct cxl *adapter);
+	void (*debugfs_stop_trace)(struct cxl *adapter);
+	void (*write_timebase_ctrl)(struct cxl *adapter);
+	u64 (*timebase_read)(struct cxl *adapter);
+};
+
 struct cxl_native {
 	u64 afu_desc_off;
 	u64 afu_desc_size;
@@ -532,6 +552,7 @@ struct cxl_native {
 	irq_hw_number_t err_hwirq;
 	unsigned int err_virq;
 	u64 ps_off;
+	const struct cxl_service_layer_ops *sl_ops;
 };
 
 struct cxl_guest {
@@ -804,6 +825,11 @@ int cxl_tlb_slb_invalidate(struct cxl *adapter);
 int cxl_afu_disable(struct cxl_afu *afu);
 int cxl_psl_purge(struct cxl_afu *afu);
 
+void cxl_debugfs_add_adapter_psl_regs(struct cxl *adapter, struct dentry *dir);
+void cxl_debugfs_add_adapter_xsl_regs(struct cxl *adapter, struct dentry *dir);
+void cxl_debugfs_add_afu_psl_regs(struct cxl_afu *afu, struct dentry *dir);
+void cxl_native_psl_irq_dump_regs(struct cxl_context *ctx);
+void cxl_native_err_irq_dump_regs(struct cxl *adapter);
 void cxl_stop_trace(struct cxl *cxl);
 int cxl_pci_vphb_add(struct cxl_afu *afu);
 void cxl_pci_vphb_remove(struct cxl_afu *afu);
diff --git a/drivers/misc/cxl/debugfs.c b/drivers/misc/cxl/debugfs.c
index 5751899..ec7b8a0 100644
--- a/drivers/misc/cxl/debugfs.c
+++ b/drivers/misc/cxl/debugfs.c
@@ -51,6 +51,19 @@ static struct dentry *debugfs_create_io_x64(const char *name, umode_t mode,
 	return debugfs_create_file(name, mode, parent, (void __force *)value, &fops_io_x64);
 }
 
+void cxl_debugfs_add_adapter_psl_regs(struct cxl *adapter, struct dentry *dir)
+{
+	debugfs_create_io_x64("fir1", S_IRUSR, dir, _cxl_p1_addr(adapter, CXL_PSL_FIR1));
+	debugfs_create_io_x64("fir2"
[PATCH] cxl: Update process element after allocating interrupts
From: Ian Munsie <imun...@au1.ibm.com>

In the kernel API, it is possible to attempt to allocate AFU interrupts
after already starting a context. Since the process element structure
used by the hardware is only filled out at the time the context is
started, it will not be updated with the interrupt numbers that have
just been allocated and therefore AFU interrupts will not work unless
they were allocated prior to starting the context.

This can present some difficulties as each CAPI enabled PCI device in
the kernel API has a default context, which may need to be started very
early to enable translations, potentially before interrupts can easily
be set up.

This patch makes the API more flexible to allow interrupts to be
allocated after a context has already been started and takes care of
updating the PE structure used by the hardware and notifying it to
discard any cached copy it may have.

The update is currently performed via a terminate/remove/add sequence.
This is necessary on some hardware such as the XSL that does not
properly support the update LLCMD.

Note that this is only supported on powernv at present - attempting to
perform this ordering on PowerVM will raise a warning.

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
---
 drivers/misc/cxl/api.c    | 12 +++-
 drivers/misc/cxl/cxl.h    |  1 +
 drivers/misc/cxl/guest.c  |  1 +
 drivers/misc/cxl/native.c | 74 +--
 4 files changed, 71 insertions(+), 17 deletions(-)

diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
index 6d228cc..99081b8 100644
--- a/drivers/misc/cxl/api.c
+++ b/drivers/misc/cxl/api.c
@@ -102,7 +102,10 @@ int cxl_allocate_afu_irqs(struct cxl_context *ctx, int num)
 	if (num == 0)
 		num = ctx->afu->pp_irqs;
 	res = afu_allocate_irqs(ctx, num);
-	if (!res && !cpu_has_feature(CPU_FTR_HVMODE)) {
+	if (res)
+		return res;
+
+	if (!cpu_has_feature(CPU_FTR_HVMODE)) {
 		/* In a guest, the PSL interrupt is not multiplexed. It was
 		 * allocated above, and we need to set its handler
 		 */
@@ -110,6 +113,13 @@ int cxl_allocate_afu_irqs(struct cxl_context *ctx, int num)
 		if (hwirq)
 			cxl_map_irq(ctx->afu->adapter, hwirq, cxl_ops->psl_interrupt, ctx, "psl");
 	}
+
+	if (ctx->status == STARTED) {
+		if (cxl_ops->update_ivtes)
+			cxl_ops->update_ivtes(ctx);
+		else WARN(1, "BUG: cxl_allocate_afu_irqs must be called prior to starting the context on this platform\n");
+	}
+
 	return res;
 }
 EXPORT_SYMBOL_GPL(cxl_allocate_afu_irqs);
diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index d23a3a5..d12a035 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -853,6 +853,7 @@ struct cxl_backend_ops {
 	int (*attach_process)(struct cxl_context *ctx, bool kernel,
 			u64 wed, u64 amr);
 	int (*detach_process)(struct cxl_context *ctx);
+	void (*update_ivtes)(struct cxl_context *ctx);
 	bool (*support_attributes)(const char *attr_name, enum cxl_attrs type);
 	bool (*link_ok)(struct cxl *cxl, struct cxl_afu *afu);
 	void (*release_afu)(struct device *dev);
diff --git a/drivers/misc/cxl/guest.c b/drivers/misc/cxl/guest.c
index bc8d0b9..1edba52 100644
--- a/drivers/misc/cxl/guest.c
+++ b/drivers/misc/cxl/guest.c
@@ -1182,6 +1182,7 @@ const struct cxl_backend_ops cxl_guest_ops = {
 	.ack_irq = guest_ack_irq,
 	.attach_process = guest_attach_process,
 	.detach_process = guest_detach_process,
+	.update_ivtes = NULL,
 	.support_attributes = guest_support_attributes,
 	.link_ok = guest_link_ok,
 	.release_afu = guest_release_afu,
diff --git a/drivers/misc/cxl/native.c b/drivers/misc/cxl/native.c
index 98f2cac..aa0be79 100644
--- a/drivers/misc/cxl/native.c
+++ b/drivers/misc/cxl/native.c
@@ -429,7 +429,6 @@ static int remove_process_element(struct cxl_context *ctx)
 	return rc;
 }
 
-
 void cxl_assign_psn_space(struct cxl_context *ctx)
 {
 	if (!ctx->afu->pp_size || ctx->master) {
@@ -506,10 +505,39 @@ static u64 calculate_sr(struct cxl_context *ctx)
 	return sr;
 }
 
+static void update_ivtes_directed(struct cxl_context *ctx)
+{
+	bool need_update = (ctx->status == STARTED);
+	int r;
+
+	if (need_update) {
+		WARN_ON(terminate_process_element(ctx));
+		WARN_ON(remove_process_element(ctx));
+	}
+
+	for (r = 0; r < CXL_IRQ_RANGES; r++) {
+		ctx->elem->ivte_offsets[r] = cpu_to_be16(ctx->irqs.offset[r]);
+		ctx->elem->ivte_ranges[r] = cpu_to_be16(ctx->irqs.range[r]);
+	}
+
+	/*
+	 * Theoretically we could use the update llcmd, instead of a
+	 * terminate/remove/
Re: [PATCH] cxl: Refine slice error debug messages.
Acked-by: Ian Munsie <imun...@au1.ibm.com>
[PATCH v2] cxl: Add kernel API to allow a context to operate with relocate disabled
From: Ian Munsie <imun...@au1.ibm.com>

cxl devices typically access memory using an MMU in much the same way as
the CPU, and each context includes a state register much like the MSR in
the CPU. Like the CPU, the state register includes a bit to enable
relocation, which we currently always enable.

In some cases, it may be desirable to allow a device to access memory
using real addresses instead of effective addresses, so this adds a new
API, cxl_set_translation_mode, that can be used to disable relocation
on a given kernel context. This can allow for the creation of a special
privileged context that the device can use if it needs relocation
disabled, and can use regular contexts at times when it needs relocation
enabled.

This interface is only available to users of the kernel API for obvious
reasons, and will never be supported in a virtualised environment.

This will be used by the upcoming cxl support in the mlx5 driver.

Signed-off-by: Ian Munsie <imun...@au1.ibm.com>
---

Changes since v1:
- Changed API to use a dedicated cxl_set_translation_mode() call instead
  of adding an extra parameter to cxl_start_context2() based on review
  feedback from Frederic Barrat
- Changed error code for attempting to use in PowerVM environment to
  -EPERM

 drivers/misc/cxl/api.c    | 19 +++
 drivers/misc/cxl/cxl.h    |  1 +
 drivers/misc/cxl/guest.c  |  3 +++
 drivers/misc/cxl/native.c |  5 +++--
 include/misc/cxl.h        |  8
 5 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
index 8075823..6d228cc 100644
--- a/drivers/misc/cxl/api.c
+++ b/drivers/misc/cxl/api.c
@@ -183,6 +183,7 @@ int cxl_start_context(struct cxl_context *ctx, u64 wed,
 		ctx->pid = get_task_pid(task, PIDTYPE_PID);
 		ctx->glpid = get_task_pid(task->group_leader, PIDTYPE_PID);
 		kernel = false;
+		ctx->real_mode = false;
 	}
 
 	cxl_ctx_get();
@@ -219,6 +220,24 @@ void cxl_set_master(struct cxl_context *ctx)
 }
 EXPORT_SYMBOL_GPL(cxl_set_master);
 
+int cxl_set_translation_mode(struct cxl_context *ctx, bool real_mode)
+{
+	if (ctx->status == STARTED) {
+		/*
+		 * We could potentially update the PE and issue an update LLCMD
+		 * to support this, but it doesn't seem to have a good use case
+		 * since it's trivial to just create a second kernel context
+		 * with different translation modes, so until someone convinces
+		 * me otherwise:
+		 */
+		return -EBUSY;
+	}
+
+	ctx->real_mode = real_mode;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cxl_set_translation_mode);
+
 /* wrappers around afu_* file ops which are EXPORTED */
 int cxl_fd_open(struct inode *inode, struct file *file)
 {
diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index dfdbfb0..6e3e485 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -523,6 +523,7 @@ struct cxl_context {
 	bool pe_inserted;
 	bool master;
 	bool kernel;
+	bool real_mode;
 	bool pending_irq;
 	bool pending_fault;
 	bool pending_afu_err;
diff --git a/drivers/misc/cxl/guest.c b/drivers/misc/cxl/guest.c
index 769971c..c2815b9 100644
--- a/drivers/misc/cxl/guest.c
+++ b/drivers/misc/cxl/guest.c
@@ -617,6 +617,9 @@ static int guest_attach_process(struct cxl_context *ctx, bool kernel, u64 wed, u
 {
 	pr_devel("in %s\n", __func__);
 
+	if (ctx->real_mode)
+		return -EPERM;
+
 	ctx->kernel = kernel;
 	if (ctx->afu->current_mode == CXL_MODE_DIRECTED)
 		return attach_afu_directed(ctx, wed, amr);
diff --git a/drivers/misc/cxl/native.c b/drivers/misc/cxl/native.c
index ef494ba..ba459a9 100644
--- a/drivers/misc/cxl/native.c
+++ b/drivers/misc/cxl/native.c
@@ -485,8 +485,9 @@ static u64 calculate_sr(struct cxl_context *ctx)
 	if (mfspr(SPRN_LPCR) & LPCR_TC)
 		sr |= CXL_PSL_SR_An_TC;
 	if (ctx->kernel) {
-		sr |= CXL_PSL_SR_An_R | (mfmsr() & MSR_SF);
-		sr |= CXL_PSL_SR_An_HV;
+		if (!ctx->real_mode)
+			sr |= CXL_PSL_SR_An_R;
+		sr |= (mfmsr() & MSR_SF) | CXL_PSL_SR_An_HV;
 	} else {
 		sr |= CXL_PSL_SR_An_PR | CXL_PSL_SR_An_R;
 		sr &= ~(CXL_PSL_SR_An_HV);
diff --git a/include/misc/cxl.h b/include/misc/cxl.h
index 7d5e261..56560c5 100644
--- a/include/misc/cxl.h
+++ b/include/misc/cxl.h
@@ -127,6 +127,14 @@ int cxl_afu_reset(struct cxl_context *ctx);
 void cxl_set_master(struct cxl_context *ctx);
 
 /*
+ * Sets the context to use real mode memory accesses to operate with
+ * translation disabled. Note that this only makes sense for kernel contexts
+ * under bare metal, and will not work with virtualisation. May only be
+ * performed
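A rough model of the two pieces this patch touches, the setter that refuses to act on a started context and the state register calculation that omits the relocate bit, might look like the following. The DEMO_SR_* bit positions, struct demo_ctx and both demo_* functions are invented for the sketch. The MSR_SF and LPCR_TC handling in the real calculate_sr() is left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state register bits, for illustration only. */
#define DEMO_SR_R  (1ull << 0)	/* relocate (translation enabled) */
#define DEMO_SR_PR (1ull << 1)	/* problem state (userspace) */
#define DEMO_SR_HV (1ull << 2)	/* hypervisor */

/* Cut-down context with only the fields the sketch needs. */
struct demo_ctx {
	bool started;
	bool kernel;
	bool real_mode;
};

/* The mode may only be changed before the context has been started. */
static int demo_set_translation_mode(struct demo_ctx *ctx, bool real_mode)
{
	if (ctx->started)
		return -EBUSY;

	ctx->real_mode = real_mode;
	return 0;
}

/* Kernel contexts drop the relocate bit in real mode; user contexts never do. */
static uint64_t demo_calculate_sr(const struct demo_ctx *ctx)
{
	uint64_t sr = 0;

	if (ctx->kernel) {
		if (!ctx->real_mode)
			sr |= DEMO_SR_R;
		sr |= DEMO_SR_HV;
	} else {
		sr |= DEMO_SR_PR | DEMO_SR_R;
		sr &= ~DEMO_SR_HV;
	}

	return sr;
}
```

In the sketch, as in the hunk above, the real_mode flag is only consulted on the kernel branch, so a user context always gets relocation enabled.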
Re: [PATCH] cxl: Add kernel API to allow a context to operate with relocate disabled
Sure thing, that actually simplifies things a great deal. Testing now and will resend shortly :)

-Ian