[PATCH 1/2 v2] powerpc/dma: Define map/unmap mmio resource callbacks

2020-04-30 Thread Max Gurtovoy
Define the map_resource/unmap_resource callbacks for the dma_iommu_ops
used by several powerpc platforms. The map_resource callback is called
when trying to map a mmio resource through the dma_map_resource()
driver API.

For now, the callback returns an invalid address for devices using
translations, but will "direct" map the resource when in bypass
mode. Previous behavior for dma_map_resource() was to always return an
invalid address.

We also call an optional platform-specific controller op in
case some setup is needed for the platform.

Signed-off-by: Frederic Barrat 
Signed-off-by: Max Gurtovoy 
---

changes from v1:
 - rename pci_controller_ops callback to
   dma_direct_map_resource/dma_direct_unmap_resource
 - cosmetic changes to make the code more readable

---
 arch/powerpc/include/asm/pci-bridge.h |  7 +++
 arch/powerpc/kernel/dma-iommu.c   | 31 +++
 2 files changed, 38 insertions(+)

diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
index 69f4cb3..aca3724 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -44,6 +44,13 @@ struct pci_controller_ops {
 #endif
 
void(*shutdown)(struct pci_controller *hose);
+   int (*dma_direct_map_resource)(struct pci_dev *pdev,
+   phys_addr_t phys_addr,
+   size_t size,
+   enum dma_data_direction dir);
+   void(*dma_direct_unmap_resource)(struct pci_dev *pdev,
+   dma_addr_t addr, size_t size,
+   enum dma_data_direction dir);
 };
 
 /*
diff --git a/arch/powerpc/kernel/dma-iommu.c b/arch/powerpc/kernel/dma-iommu.c
index e486d1d..049d000 100644
--- a/arch/powerpc/kernel/dma-iommu.c
+++ b/arch/powerpc/kernel/dma-iommu.c
@@ -108,6 +108,35 @@ static void dma_iommu_unmap_sg(struct device *dev, struct scatterlist *sglist,
dma_direct_unmap_sg(dev, sglist, nelems, direction, attrs);
 }
 
+static dma_addr_t dma_iommu_map_resource(struct device *dev,
+phys_addr_t phys_addr, size_t size,
+enum dma_data_direction dir,
+unsigned long attrs)
+{
+   struct pci_dev *pdev = to_pci_dev(dev);
+   struct pci_controller *phb = pci_bus_to_host(pdev->bus);
+   struct pci_controller_ops *ops = &phb->controller_ops;
+
+   if (!dma_iommu_map_bypass(dev, attrs) ||
+   !ops->dma_direct_map_resource ||
+   ops->dma_direct_map_resource(pdev, phys_addr, size, dir))
+   return DMA_MAPPING_ERROR;
+
+   return dma_direct_map_resource(dev, phys_addr, size, dir, attrs);
+}
+
+static void dma_iommu_unmap_resource(struct device *dev, dma_addr_t dma_handle,
+size_t size, enum dma_data_direction dir,
+unsigned long attrs)
+{
+   struct pci_dev *pdev = to_pci_dev(dev);
+   struct pci_controller *phb = pci_bus_to_host(pdev->bus);
+   struct pci_controller_ops *ops = &phb->controller_ops;
+
+   if (dma_iommu_map_bypass(dev, attrs) && ops->dma_direct_unmap_resource)
+   ops->dma_direct_unmap_resource(pdev, dma_handle, size, dir);
+}
+
 static bool dma_iommu_bypass_supported(struct device *dev, u64 mask)
 {
struct pci_dev *pdev = to_pci_dev(dev);
@@ -199,6 +228,8 @@ extern void dma_iommu_sync_sg_for_device(struct device *dev,
.free   = dma_iommu_free_coherent,
.map_sg = dma_iommu_map_sg,
.unmap_sg   = dma_iommu_unmap_sg,
+   .map_resource   = dma_iommu_map_resource,
+   .unmap_resource = dma_iommu_unmap_resource,
.dma_supported  = dma_iommu_dma_supported,
.map_page   = dma_iommu_map_page,
.unmap_page = dma_iommu_unmap_page,
-- 
1.8.3.1
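
For readers following the control flow of dma_iommu_map_resource() in the
patch above, here is a minimal userspace sketch of the same decision: the
mapping succeeds only when the device is in bypass mode, the platform
provides the optional hook, and that hook returns 0. Everything prefixed
model_ or MODEL_ is a hypothetical stand-in and not kernel API. The identity
mapping at the end is an assumption standing in for
dma_direct_map_resource(), and the error value mirrors the kernel's
DMA_MAPPING_ERROR.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t model_dma_addr_t;
typedef uint64_t model_phys_addr_t;

/* Mirrors the kernel's DMA_MAPPING_ERROR, i.e. an all-ones address. */
#define MODEL_DMA_MAPPING_ERROR (~(model_dma_addr_t)0)

/* Hypothetical stand-in for pci_controller_ops: only the new optional hook. */
struct model_ops {
	int (*dma_direct_map_resource)(model_phys_addr_t phys_addr, size_t size);
};

/* Hypothetical stand-in for a device: bypass state plus its controller ops. */
struct model_dev {
	bool bypass;
	const struct model_ops *ops;
};

/* Example platform hooks: 0 means the platform setup succeeded. */
static int model_hook_accept(model_phys_addr_t phys_addr, size_t size)
{
	(void)phys_addr;
	(void)size;
	return 0;
}

static int model_hook_reject(model_phys_addr_t phys_addr, size_t size)
{
	(void)phys_addr;
	(void)size;
	return -1;
}

/*
 * Same three-way check as the patch: fail unless the device bypasses the
 * IOMMU, the hook exists, and the hook succeeds. Only then "direct" map.
 */
static model_dma_addr_t model_map_resource(const struct model_dev *dev,
					   model_phys_addr_t phys_addr,
					   size_t size)
{
	if (!dev->bypass ||
	    !dev->ops->dma_direct_map_resource ||
	    dev->ops->dma_direct_map_resource(phys_addr, size))
		return MODEL_DMA_MAPPING_ERROR;

	/* Assumption: the direct mapping is an identity mapping. */
	return (model_dma_addr_t)phys_addr;
}
```

Note that a platform which does not set the hook gets the error value even
in bypass mode, so with this patch alone the behavior is unchanged until a
platform wires up the callback.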



[PATCH 2/2 v2] powerpc/powernv: Enable and setup PCI P2P

2020-04-30 Thread Max Gurtovoy
Implement the generic dma_map_resource callback on the PCI controller
for powernv. This will enable PCI P2P on POWER9 architecture. It will
allow catching a cross-PHB mmio mapping, which needs to be setup in
hardware by calling opal. Both the initiator and target PHBs need to be
configured, so we look for which PHB owns the mmio address being mapped.

Signed-off-by: Frederic Barrat 
[maxg: added CONFIG_PCI_P2PDMA wrappers]
Signed-off-by: Max Gurtovoy 
---

changes from v1:
 - remove CONFIG_PCI_P2PDMA around opal_pci_set_p2p declaration
 - split pnv_pci_ioda_set_p2p into
   pnv_pci_ioda_enable_p2p/pnv_pci_ioda_disable_p2p
 - added pnv_pci_dma_dir_to_opal_p2p static helper

---
 arch/powerpc/include/asm/opal.h|   3 +-
 arch/powerpc/platforms/powernv/opal-call.c |   1 +
 arch/powerpc/platforms/powernv/pci-ioda.c  | 212 +++--
 arch/powerpc/platforms/powernv/pci.h   |   9 ++
 4 files changed, 213 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 9986ac3..362f54b 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -284,7 +284,8 @@ int64_t opal_xive_set_queue_state(uint64_t vp, uint32_t prio,
  uint32_t qtoggle,
  uint32_t qindex);
 int64_t opal_xive_get_vp_state(uint64_t vp, __be64 *out_w01);
-
+int64_t opal_pci_set_p2p(uint64_t phb_init, uint64_t phb_target,
+uint64_t desc, uint16_t pe_number);
 int64_t opal_imc_counters_init(uint32_t type, uint64_t address,
uint64_t cpu_pir);
 int64_t opal_imc_counters_start(uint32_t type, uint64_t cpu_pir);
diff --git a/arch/powerpc/platforms/powernv/opal-call.c b/arch/powerpc/platforms/powernv/opal-call.c
index 5cd0f52..442d5445c 100644
--- a/arch/powerpc/platforms/powernv/opal-call.c
+++ b/arch/powerpc/platforms/powernv/opal-call.c
@@ -273,6 +273,7 @@ int64_t name(int64_t a0, int64_t a1, int64_t a2, int64_t a3,\
 OPAL_CALL(opal_imc_counters_init,  OPAL_IMC_COUNTERS_INIT);
 OPAL_CALL(opal_imc_counters_start, OPAL_IMC_COUNTERS_START);
 OPAL_CALL(opal_imc_counters_stop,  OPAL_IMC_COUNTERS_STOP);
+OPAL_CALL(opal_pci_set_p2p,OPAL_PCI_SET_P2P);
 OPAL_CALL(opal_get_powercap,   OPAL_GET_POWERCAP);
 OPAL_CALL(opal_set_powercap,   OPAL_SET_POWERCAP);
 OPAL_CALL(opal_get_power_shift_ratio,  OPAL_GET_POWER_SHIFT_RATIO);
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 57d3a6a..9ecc576 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -3706,18 +3706,208 @@ static void pnv_pci_ioda_dma_bus_setup(struct pci_bus *bus)
}
 }
 
+#ifdef CONFIG_PCI_P2PDMA
+static DEFINE_MUTEX(p2p_mutex);
+
+static bool pnv_pci_controller_owns_addr(struct pci_controller *hose,
+phys_addr_t addr, size_t size)
+{
+   int i;
+
+   /*
+* It seems safe to assume the full range is under the same PHB, so we
+* can ignore the size.
+*/
+   for (i = 0; i < ARRAY_SIZE(hose->mem_resources); i++) {
+   struct resource *res = &hose->mem_resources[i];
+
+   if (res->flags && addr >= res->start && addr < res->end)
+   return true;
+   }
+   return false;
+}
+
+/*
+ * find the phb owning a mmio address if not owned locally
+ */
+static struct pnv_phb *pnv_pci_find_owning_phb(struct pci_dev *pdev,
+  phys_addr_t addr, size_t size)
+{
+   struct pci_controller *hose;
+
+   /* fast path */
+   if (pnv_pci_controller_owns_addr(pdev->bus->sysdata, addr, size))
+   return NULL;
+
+   list_for_each_entry(hose, &hose_list, list_node) {
+   struct pnv_phb *phb = hose->private_data;
+
+   if (phb->type != PNV_PHB_NPU_NVLINK &&
+   phb->type != PNV_PHB_NPU_OCAPI) {
+   if (pnv_pci_controller_owns_addr(hose, addr, size))
+   return phb;
+   }
+   }
+   return NULL;
+}
+
+static u64 pnv_pci_dma_dir_to_opal_p2p(enum dma_data_direction dir)
+{
+   if (dir == DMA_TO_DEVICE)
+   return OPAL_PCI_P2P_STORE;
+   else if (dir == DMA_FROM_DEVICE)
+   return OPAL_PCI_P2P_LOAD;
+   else if (dir == DMA_BIDIRECTIONAL)
+   return OPAL_PCI_P2P_LOAD | OPAL_PCI_P2P_STORE;
+   else
+   return 0;
+}
+
+static int pnv_pci_ioda_enable_p2p(struct pci_dev *initiator,
+  struct pnv_phb *phb_target,
+  enum dma_data_direction dir)
+{
+   struct pci_controller *hose;
+ 
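
The patch text is cut off at this point in the archive. As an aid to reading
the lookup it introduces (pnv_pci_controller_owns_addr() and
pnv_pci_find_owning_phb()), the following userspace sketch models the same
idea: return NULL when the initiator's own PHB owns the address, otherwise
return the first other PHB whose memory window contains it. All model_ names
are hypothetical. The sketch treats the window end as inclusive, which is
how the kernel's struct resource defines it, so it compares with <= where
the patch uses <. It also leaves out the NPU type filter.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_WINDOWS 3

/* Hypothetical stand-in for struct resource; end is inclusive. */
struct model_window {
	uint64_t start;
	uint64_t end;
	unsigned long flags;	/* 0 means the slot is unused */
};

/* Hypothetical stand-in for a PHB with a fixed set of memory windows. */
struct model_phb {
	int id;
	struct model_window mem[MODEL_WINDOWS];
};

/* True if any populated window of this PHB contains addr. */
static bool model_phb_owns_addr(const struct model_phb *phb, uint64_t addr)
{
	for (size_t i = 0; i < MODEL_WINDOWS; i++) {
		const struct model_window *w = &phb->mem[i];

		if (w->flags && addr >= w->start && addr <= w->end)
			return true;
	}
	return false;
}

/*
 * Returns the remote PHB owning addr, or NULL when the local PHB owns it
 * (nothing to configure) or when no PHB in the list owns it.
 */
static const struct model_phb *
model_find_owning_phb(const struct model_phb *local,
		      const struct model_phb *all, size_t count,
		      uint64_t addr)
{
	/* fast path: address is behind the initiator's own PHB */
	if (model_phb_owns_addr(local, addr))
		return NULL;

	for (size_t i = 0; i < count; i++) {
		if (model_phb_owns_addr(&all[i], addr))
			return &all[i];
	}
	return NULL;
}
```

A NULL result covers two different cases here, as it does in the patch, so
a caller cannot tell "local" from "unowned" using this helper alone.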

Re: [PATCH 1/3] powerpc/powernv: remove the unused pnv_pci_set_p2p function

2019-07-09 Thread Max Gurtovoy



On 7/9/2019 5:40 PM, Christoph Hellwig wrote:

On Tue, Jul 09, 2019 at 05:37:18PM +0300, Max Gurtovoy wrote:

On 7/9/2019 5:32 PM, Christoph Hellwig wrote:

On Tue, Jul 09, 2019 at 05:31:37PM +0300, Max Gurtovoy wrote:

Are we ok with working on a solution during kernel-5.3 cycle ?

You can start working on it any time, no need to ask for permission.

I just want to make sure we don't remove it from the kernel before we send
a general API solution.

The code is gone in this merge window.


Ok, so we must fix it to kernel-5.3 to make sure we're covered.

Understood.




This way we'll make sure that all the kernel versions have this
functionality...

Again, we do not provide functionality for out of tree modules.  We've
had the p2p API for about a year now, it's not like you didn't have
plenty of time.


I didn't know about the intention to remove this code...

Also this code was merged before the p2p API for p2pmem.



Re: [PATCH 1/3] powerpc/powernv: remove the unused pnv_pci_set_p2p function

2019-07-09 Thread Max Gurtovoy



On 7/9/2019 5:32 PM, Christoph Hellwig wrote:

On Tue, Jul 09, 2019 at 05:31:37PM +0300, Max Gurtovoy wrote:

Are we ok with working on a solution during kernel-5.3 cycle ?

You can start working on it any time, no need to ask for permission.


I just want to make sure we don't remove it from the kernel before we 
send a general API solution.


This way we'll make sure that all the kernel versions have this 
functionality...




Re: [PATCH 1/3] powerpc/powernv: remove the unused pnv_pci_set_p2p function

2019-07-09 Thread Max Gurtovoy



On 7/9/2019 4:59 PM, Christoph Hellwig wrote:

On Tue, Jul 09, 2019 at 01:49:04PM +, Max Gurtovoy wrote:

Hi Greg/Christoph,
Can we leave it meanwhile till we'll find a general solution (for the upcoming 
kernel) ?
I guess we can somehow generalize the P2P initialization process for PPC and 
leave it empty for now for other archs.
Or maybe we can find some other solution (sysfs/configfs/module param), but it 
will take time since we'll need to work closely with the IBM pci guys that 
wrote this code.

We do not keep code without in-tree users around, especially not if
we have a better API with in-tree users.

AFAICS the only thing you'll need is to wire up the enable/disable
calls.


I guess you're right, but we still need to know the time frame we have 
here since this should be tested carefully on the P9 hardware.


Are we ok with working on a solution during kernel-5.3 cycle ?



RE: [PATCH 1/3] powerpc/powernv: remove the unused pnv_pci_set_p2p function

2019-07-09 Thread Max Gurtovoy
Hi Greg/Christoph,
Can we leave it meanwhile till we'll find a general solution (for the upcoming 
kernel) ?
I guess we can somehow generalize the P2P initialization process for PPC and 
leave it empty for now for other archs.
Or maybe we can find some other solution (sysfs/configfs/module param), but it 
will take time since we'll need to work closely with the IBM pci guys that 
wrote this code.

-Max.


-----Original Message-----
From: Christoph Hellwig  
Sent: Thursday, May 23, 2019 10:53 AM
To: Frederic Barrat 
Cc: Christoph Hellwig ; Benjamin Herrenschmidt 
; Paul Mackerras ; Michael Ellerman 
; linuxppc-dev@lists.ozlabs.org; Max Gurtovoy 

Subject: Re: [PATCH 1/3] powerpc/powernv: remove the unused pnv_pci_set_p2p 
function

On Mon, May 06, 2019 at 10:46:11AM +0200, Frederic Barrat wrote:
> Hi,
>
> The PCI p2p and tunnel code is used by the Mellanox CX5 driver, at 
> least their latest, out of tree version, which is used for CORAL. My 
> understanding is that they'll upstream it at some point, though I 
> don't know what their schedule is like.

FYI, Max, who wrote (at least larger parts of) that code and is on Cc, agreed
that all P2P code should go through the kernel P2P infrastructure and might be
able to spend some cycles on it.

Which still doesn't change anything about the fact that we [1] generally don't 
add infrastructure for anything that is not in the tree.

[1] well, powernv seems to have handled this a little oddly, and now is on my 
special watchlist.