Re: [PATCH 11/15] cxl: Add support for interrupts on the Mellanox CX4

2016-07-13 Thread Andrew Donnellan

On 14/07/16 07:17, Ian Munsie wrote:

From: Ian Munsie 

The Mellanox CX4 in cxl mode uses a hybrid interrupt model, where
interrupts are routed from the networking hardware to the XSL using the
MSIX table, and from there will be transformed back into an MSIX
interrupt using the cxl style interrupts (i.e. using IVTE entries and
ranges to map a PE and AFU interrupt number to an MSIX address).

We want to hide the implementation details of cxl interrupts as much as
possible. To this end, we use a special version of the MSI setup &
teardown routines in the PHB while in cxl mode to allocate the cxl
interrupts and configure the IVTE entries in the process element.

This function does not configure the MSIX table - the CX4 card uses a
custom format in that table and it would not be appropriate to fill that
out in generic code. The rest of the functionality is similar to the
"Full MSI-X mode" described in the CAIA, and this could be easily
extended to support other adapters that use that mode in the future.

The interrupts will be associated with the default context. If the
maximum number of interrupts per context has been limited (e.g. by the
mlx5 driver), it will automatically allocate additional kernel contexts
to associate extra interrupts as required. These contexts will be
started using the same WED that was used to start the default context.

Signed-off-by: Ian Munsie 


Some minor nitpicks below, which shouldn't block acceptance.

Reviewed-by: Andrew Donnellan 



diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 104c040..530d4af 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -3465,6 +3465,10 @@ static const struct pci_controller_ops pnv_npu_ioda_controller_ops = {
 const struct pci_controller_ops pnv_cxl_cx4_ioda_controller_ops = {
.dma_dev_setup  = pnv_pci_dma_dev_setup,
.dma_bus_setup  = pnv_pci_dma_bus_setup,
+#ifdef CONFIG_PCI_MSI


If you've got CXL_BASE you've already got PCI_MSI.


diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
index f02a859..f3d34b9 100644
--- a/drivers/misc/cxl/api.c
+++ b/drivers/misc/cxl/api.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 

 #include "cxl.h"

@@ -489,3 +490,73 @@ int cxl_get_max_irqs_per_process(struct pci_dev *dev)
return afu->irqs_max;
 }
 EXPORT_SYMBOL_GPL(cxl_get_max_irqs_per_process);
+
+/*
+ * This is a special interrupt allocation routine called from the PHB's MSI
+ * setup function. When capi interrupts are allocated in this manner they must
+ * still be associated with a running context, but since the MSI APIs have no
+ * way to specify this we use the default context associated with the device.
+ *
+ * The Mellanox CX4 has a hardware limitation that restricts the maximum AFU
+ * interrupt number, so in order to overcome this their driver informs us of
+ * the restriction by setting the maximum interrupts per context, and we
+ * allocate additional contexts as necessary so that we can keep the AFU
+ * interrupt number within the supported range.
+ */
+int _cxl_cx4_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
+{
+	struct cxl_context *ctx, *new_ctx, *default_ctx;
+	int remaining;
+	int rc;
+
+	ctx = default_ctx = cxl_get_context(pdev);
+	if (WARN_ON(!default_ctx))
+		return -ENODEV;


I have a very slight preference for:

if (!default_ctx) {
	dev_WARN(&pdev->dev, "couldn't get default context");
	return -ENODEV;
}

(I see this in your arch/powerpc code too, but that's obviously copied 
from the regular powernv irq code. Also, why is there no dev_WARN_ON() 
function?)



+
+	remaining = nvec;
+	while (remaining > 0) {
+		rc = cxl_allocate_afu_irqs(ctx, min(remaining, ctx->afu->irqs_max));
+		if (rc) {
+			pr_warn("%s: Failed to find enough free MSIs\n", pci_name(pdev));


dev_warn(&pdev->dev, "failed to find enough free MSIs\n"); is more
common in the cxl code.
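
For readers following the loop quoted above, here is a minimal userspace sketch of how nvec vectors would be split across contexts when each context is capped (as with ctx->afu->irqs_max). It models only the arithmetic described in the commit message, not context allocation or start. The names split_irqs and MAX_CTXS are hypothetical and exist only for this illustration; they are not part of the cxl API.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CTXS 16 /* hypothetical bound, for this sketch only */

/*
 * Hand out at most irqs_max interrupts per context until nvec have been
 * placed. per_ctx[i] receives the count assigned to context i. Returns
 * the number of contexts used, or -1 if the inputs are invalid or more
 * than MAX_CTXS contexts would be needed.
 */
static int split_irqs(int nvec, int irqs_max, int per_ctx[MAX_CTXS])
{
	int remaining = nvec;
	int nctx = 0;

	assert(per_ctx != NULL);

	if (nvec <= 0 || irqs_max <= 0)
		return -1;

	while (remaining > 0) {
		/* Same idea as min(remaining, ctx->afu->irqs_max) */
		int chunk = remaining < irqs_max ? remaining : irqs_max;

		if (nctx == MAX_CTXS)
			return -1;
		per_ctx[nctx++] = chunk;
		remaining -= chunk;
	}
	return nctx;
}
```

With nvec = 10 and a cap of 4, this yields three contexts holding 4, 4 and 2 interrupts.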



--
Andrew Donnellan  OzLabs, ADL Canberra
andrew.donnel...@au1.ibm.com  IBM Australia Limited


[PATCH 11/15] cxl: Add support for interrupts on the Mellanox CX4

2016-07-13 Thread Ian Munsie
From: Ian Munsie 

The Mellanox CX4 in cxl mode uses a hybrid interrupt model, where
interrupts are routed from the networking hardware to the XSL using the
MSIX table, and from there will be transformed back into an MSIX
interrupt using the cxl style interrupts (i.e. using IVTE entries and
ranges to map a PE and AFU interrupt number to an MSIX address).

We want to hide the implementation details of cxl interrupts as much as
possible. To this end, we use a special version of the MSI setup &
teardown routines in the PHB while in cxl mode to allocate the cxl
interrupts and configure the IVTE entries in the process element.

This function does not configure the MSIX table - the CX4 card uses a
custom format in that table and it would not be appropriate to fill that
out in generic code. The rest of the functionality is similar to the
"Full MSI-X mode" described in the CAIA, and this could be easily
extended to support other adapters that use that mode in the future.

The interrupts will be associated with the default context. If the
maximum number of interrupts per context has been limited (e.g. by the
mlx5 driver), it will automatically allocate additional kernel contexts
to associate extra interrupts as required. These contexts will be
started using the same WED that was used to start the default context.

Signed-off-by: Ian Munsie 

---

V1->V2:
- Handle error case if cxl_next_msi_hwirq returns 0 signifying
  that an AFU IRQ is not mapped to a hardware interrupt.
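
As a rough illustration of the V2 change, the sketch below isolates the return-value handling in plain C. It assumes the convention implied by the patch: a positive value is a valid hardware IRQ, a negative value is an error code to pass through, and 0 means the AFU IRQ is not mapped. The helper name hwirq_to_rc is hypothetical and does not appear in the patch.

```c
#include <errno.h>

/*
 * Translate the result of a "next hwirq" lookup into an errno-style
 * return code: 0 on success, the original negative error if there was
 * one, or -ENOMEM when the lookup returned 0 (no mapping).
 */
static int hwirq_to_rc(int hwirq)
{
	if (hwirq > 0)
		return 0;
	return hwirq ? hwirq : -ENOMEM;
}
```

The point of the ternary is that a bare "return hwirq;" on the failure path would return 0 for the unmapped case, which the caller would read as success.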
---
 arch/powerpc/platforms/powernv/pci-cxl.c  | 84 +++
 arch/powerpc/platforms/powernv/pci-ioda.c |  4 ++
 arch/powerpc/platforms/powernv/pci.h  |  2 +
 drivers/misc/cxl/api.c| 71 ++
 drivers/misc/cxl/base.c   | 31 
 drivers/misc/cxl/cxl.h|  4 ++
 drivers/misc/cxl/main.c   |  2 +
 include/misc/cxl-base.h   |  4 ++
 8 files changed, 202 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/pci-cxl.c b/arch/powerpc/platforms/powernv/pci-cxl.c
index 831bbfb..3f34207 100644
--- a/arch/powerpc/platforms/powernv/pci-cxl.c
+++ b/arch/powerpc/platforms/powernv/pci-cxl.c
@@ -8,6 +8,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -281,3 +282,86 @@ void pnv_cxl_disable_device(struct pci_dev *dev)
 	cxl_pci_disable_device(dev);
 	cxl_afu_put(afu);
 }
+
+/*
+ * This is a special version of pnv_setup_msi_irqs for cards in cxl mode. This
+ * function handles setting up the IVTE entries for the XSL to use.
+ *
+ * We are currently not filling out the MSIX table, since the only currently
+ * supported adapter (CX4) uses a custom MSIX table format in cxl mode and it
+ * is up to their driver to fill that out. In the future we may fill out the
+ * MSIX table (and change the IVTE entries to be an index to the MSIX table)
+ * for adapters implementing the Full MSI-X mode described in the CAIA.
+ */
+int pnv_cxl_cx4_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
+{
+	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
+	struct pnv_phb *phb = hose->private_data;
+	struct msi_desc *entry;
+	struct cxl_context *ctx = NULL;
+	unsigned int virq;
+	int hwirq;
+	int afu_irq = 0;
+	int rc;
+
+	if (WARN_ON(!phb) || !phb->msi_bmp.bitmap)
+		return -ENODEV;
+
+	if (pdev->no_64bit_msi && !phb->msi32_support)
+		return -ENODEV;
+
+	rc = cxl_cx4_setup_msi_irqs(pdev, nvec, type);
+	if (rc)
+		return rc;
+
+	for_each_pci_msi_entry(entry, pdev) {
+		if (!entry->msi_attrib.is_64 && !phb->msi32_support) {
+			pr_warn("%s: Supports only 64-bit MSIs\n",
+				pci_name(pdev));
+			return -ENXIO;
+		}
+
+		hwirq = cxl_next_msi_hwirq(pdev, &ctx, &afu_irq);
+		if (WARN_ON(hwirq <= 0))
+			return (hwirq ? hwirq : -ENOMEM);
+
+		virq = irq_create_mapping(NULL, hwirq);
+		if (virq == NO_IRQ) {
+			pr_warn("%s: Failed to map cxl mode MSI to linux irq\n",
+				pci_name(pdev));
+			return -ENOMEM;
+		}
+
+		rc = pnv_cxl_ioda_msi_setup(pdev, hwirq, virq);
+		if (rc) {
+			pr_warn("%s: Failed to setup cxl mode MSI\n", pci_name(pdev));
+			irq_dispose_mapping(virq);
+			return rc;
+		}
+
+		irq_set_msi_desc(virq, entry);
+	}
+
+	return 0;
+}
+
+void pnv_cxl_cx4_teardown_msi_irqs(struct pci_dev *pdev)
+{
+	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
+	struct pnv_phb *phb = hose->private_data;
+	struct msi_desc *entry;
+   

[PATCH 11/15] cxl: Add support for interrupts on the Mellanox CX4

2016-07-11 Thread Ian Munsie
From: Ian Munsie 

The Mellanox CX4 in cxl mode uses a hybrid interrupt model, where
interrupts are routed from the networking hardware to the XSL using the
MSIX table, and from there will be transformed back into an MSIX
interrupt using the cxl style interrupts (i.e. using IVTE entries and
ranges to map a PE and AFU interrupt number to an MSIX address).

We want to hide the implementation details of cxl interrupts as much as
possible. To this end, we use a special version of the MSI setup &
teardown routines in the PHB while in cxl mode to allocate the cxl
interrupts and configure the IVTE entries in the process element.

This function does not configure the MSIX table - the CX4 card uses a
custom format in that table and it would not be appropriate to fill that
out in generic code. The rest of the functionality is similar to the
"Full MSI-X mode" described in the CAIA, and this could be easily
extended to support other adapters that use that mode in the future.

The interrupts will be associated with the default context. If the
maximum number of interrupts per context has been limited (e.g. by the
mlx5 driver), it will automatically allocate additional kernel contexts
to associate extra interrupts as required. These contexts will be
started using the same WED that was used to start the default context.

Signed-off-by: Ian Munsie 

---

V1->V2:
- Handle error case if cxl_next_msi_hwirq returns 0 signifying
  that an AFU IRQ is not mapped to a hardware interrupt.
---
 arch/powerpc/platforms/powernv/pci-cxl.c  | 84 +++
 arch/powerpc/platforms/powernv/pci-ioda.c |  4 ++
 arch/powerpc/platforms/powernv/pci.h  |  2 +
 drivers/misc/cxl/api.c| 71 ++
 drivers/misc/cxl/base.c   | 31 
 drivers/misc/cxl/cxl.h|  4 ++
 drivers/misc/cxl/main.c   |  2 +
 include/misc/cxl-base.h   |  4 ++
 8 files changed, 202 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/pci-cxl.c b/arch/powerpc/platforms/powernv/pci-cxl.c
index 3c4caf0..0e6bd0a 100644
--- a/arch/powerpc/platforms/powernv/pci-cxl.c
+++ b/arch/powerpc/platforms/powernv/pci-cxl.c
@@ -8,6 +8,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -281,3 +282,86 @@ void pnv_cxl_disable_device(struct pci_dev *dev)
 	cxl_pci_disable_device(dev);
 	cxl_afu_put(afu);
 }
+
+/*
+ * This is a special version of pnv_setup_msi_irqs for cards in cxl mode. This
+ * function handles setting up the IVTE entries for the XSL to use.
+ *
+ * We are currently not filling out the MSIX table, since the only currently
+ * supported adapter (CX4) uses a custom MSIX table format in cxl mode and it
+ * is up to their driver to fill that out. In the future we may fill out the
+ * MSIX table (and change the IVTE entries to be an index to the MSIX table)
+ * for adapters implementing the Full MSI-X mode described in the CAIA.
+ */
+int pnv_cxl_cx4_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
+{
+	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
+	struct pnv_phb *phb = hose->private_data;
+	struct msi_desc *entry;
+	struct cxl_context *ctx = NULL;
+	unsigned int virq;
+	int hwirq;
+	int afu_irq = 0;
+	int rc;
+
+	if (WARN_ON(!phb) || !phb->msi_bmp.bitmap)
+		return -ENODEV;
+
+	if (pdev->no_64bit_msi && !phb->msi32_support)
+		return -ENODEV;
+
+	rc = cxl_cx4_setup_msi_irqs(pdev, nvec, type);
+	if (rc)
+		return rc;
+
+	for_each_pci_msi_entry(entry, pdev) {
+		if (!entry->msi_attrib.is_64 && !phb->msi32_support) {
+			pr_warn("%s: Supports only 64-bit MSIs\n",
+				pci_name(pdev));
+			return -ENXIO;
+		}
+
+		hwirq = cxl_next_msi_hwirq(pdev, &ctx, &afu_irq);
+		if (WARN_ON(hwirq <= 0))
+			return (hwirq ? hwirq : -ENOMEM);
+
+		virq = irq_create_mapping(NULL, hwirq);
+		if (virq == NO_IRQ) {
+			pr_warn("%s: Failed to map cxl mode MSI to linux irq\n",
+				pci_name(pdev));
+			return -ENOMEM;
+		}
+
+		rc = pnv_cxl_ioda_msi_setup(pdev, hwirq, virq);
+		if (rc) {
+			pr_warn("%s: Failed to setup cxl mode MSI\n", pci_name(pdev));
+			irq_dispose_mapping(virq);
+			return rc;
+		}
+
+		irq_set_msi_desc(virq, entry);
+	}
+
+	return 0;
+}
+
+void pnv_cxl_cx4_teardown_msi_irqs(struct pci_dev *pdev)
+{
+	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
+	struct pnv_phb *phb = hose->private_data;
+	struct msi_desc *entry;
+