Re: [PATCH v4 01/14] PCI/P2PDMA: Support peer-to-peer memory
Thanks for the review. I'll apply all of these changes for the next
version of the set.

>> +/*
>> + * If a device is behind a switch, we try to find the upstream bridge
>> + * port of the switch. This requires two calls to pci_upstream_bridge():
>> + * one for the upstream port on the switch, one on the upstream port
>> + * for the next level in the hierarchy. Because of this, devices connected
>> + * to the root port will be rejected.
>> + */
>> +static struct pci_dev *get_upstream_bridge_port(struct pci_dev *pdev)
>
> This function doesn't seem to be used anymore. Thanks for all your hard
> work to get rid of it!

Oops, I thought I had gotten rid of it entirely, but I guess I messed it
up a bit and it gets removed in patch 4. I'll fix it for v5.

Logan
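For readers following the thread, the two-step lookup described in the quoted comment can be sketched in user space. This is only a toy model, not the kernel code: struct toy_dev, the toy_* helpers, and the sample topology are all hypothetical stand-ins, and the reference counting done by pci_dev_get() in the patch is left out. It shows why a device attached directly to the root port ends up with no usable upstream port, while two devices behind the same switch resolve to the same one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev: only the parent link is kept. */
struct toy_dev {
	const char *name;
	struct toy_dev *upstream;	/* bridge directly above; NULL on the root bus */
};

/* Toy analogue of pci_upstream_bridge(): one hop up, or NULL. */
static struct toy_dev *toy_upstream_bridge(struct toy_dev *dev)
{
	return dev ? dev->upstream : NULL;
}

/* Two hops up, mirroring the two calls described in the quoted comment. */
static struct toy_dev *toy_upstream_bridge_port(struct toy_dev *dev)
{
	struct toy_dev *up1 = toy_upstream_bridge(dev);

	if (!up1)
		return NULL;

	return toy_upstream_bridge(up1);
}

/* True only if both devices resolve to the same non-NULL upstream port. */
static bool toy_same_upstream_port(struct toy_dev *a, struct toy_dev *b)
{
	struct toy_dev *pa = toy_upstream_bridge_port(a);
	struct toy_dev *pb = toy_upstream_bridge_port(b);

	return pa && pa == pb;
}

/* Assumed sample topology: one root port, one switch with two downstream ports. */
static struct toy_dev root_port  = { "root_port",  NULL };
static struct toy_dev switch_up  = { "switch_up",  &root_port };
static struct toy_dev switch_dn0 = { "switch_dn0", &switch_up };
static struct toy_dev switch_dn1 = { "switch_dn1", &switch_up };
static struct toy_dev behind_sw0 = { "behind_sw0", &switch_dn0 };
static struct toy_dev behind_sw1 = { "behind_sw1", &switch_dn1 };
static struct toy_dev on_root    = { "on_root",    &root_port };

/* Checks the behaviour the comment describes, using the toy topology above. */
static void toy_selfcheck(void)
{
	/* Devices behind the switch both resolve to its upstream port. */
	assert(toy_upstream_bridge_port(&behind_sw0) == &switch_up);
	assert(toy_upstream_bridge_port(&behind_sw1) == &switch_up);
	assert(toy_same_upstream_port(&behind_sw0, &behind_sw1));

	/* A device directly on the root port has no second hop: rejected. */
	assert(toy_upstream_bridge_port(&on_root) == NULL);
	assert(!toy_same_upstream_port(&behind_sw0, &on_root));
}
```

Calling toy_selfcheck() from a main() exercises both cases. The real function additionally takes and drops device references, which this sketch does not attempt to model.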
Re: [PATCH v4 01/14] PCI/P2PDMA: Support peer-to-peer memory
On Mon, Apr 23, 2018 at 05:30:33PM -0600, Logan Gunthorpe wrote:
> Some PCI devices may have memory mapped in a BAR space that's
> intended for use in peer-to-peer transactions. In order to enable
> such transactions the memory must be registered with ZONE_DEVICE pages
> so it can be used by DMA interfaces in existing drivers.
>
> Add an interface for other subsystems to find and allocate chunks of P2P
> memory as necessary to facilitate transfers between two PCI peers:
>
>   int pci_p2pdma_add_client();
>   struct pci_dev *pci_p2pmem_find();
>   void *pci_alloc_p2pmem();
>
> The new interface requires a driver to collect a list of client devices
> involved in the transaction with the pci_p2pmem_add_client*() functions
> then call pci_p2pmem_find() to obtain any suitable P2P memory. Once
> this is done the list is bound to the memory and the calling driver is
> free to add and remove clients as necessary (adding incompatible clients
> will fail). With a suitable p2pmem device, memory can then be
> allocated with pci_alloc_p2pmem() for use in DMA transactions.
>
> Depending on hardware, using peer-to-peer memory may reduce the bandwidth
> of the transfer but can significantly reduce pressure on system memory.
> This may be desirable in many cases: for example a system could be designed
> with a small CPU connected to a PCI switch by a small number of lanes

s/PCI/PCIe/

> which would maximize the number of lanes available to connect to NVMe
> devices.
>
> The code is designed to only utilize the p2pmem device if all the devices
> involved in a transfer are behind the same root port (typically through

s/root port/PCI bridge/

> a network of PCIe switches). This is because we have no way of knowing
> whether peer-to-peer routing between PCIe Root Ports is supported
> (PCIe r4.0, sec 1.3.1). Additionally, the benefits of P2P transfers that
> go through the RC is limited to only reducing DRAM usage and, in some
> cases, coding convenience. The PCI-SIG may be exploring adding a new
> capability bit to advertise whether this is possible for future
> hardware.
>
> This commit includes significant rework and feedback from Christoph
> Hellwig.
>
> Signed-off-by: Christoph Hellwig
> Signed-off-by: Logan Gunthorpe
> ---
>  drivers/pci/Kconfig        |  17 ++
>  drivers/pci/Makefile       |   1 +
>  drivers/pci/p2pdma.c       | 694 +
>  include/linux/memremap.h   |  18 ++
>  include/linux/pci-p2pdma.h | 100 +++
>  include/linux/pci.h        |   4 +
>  6 files changed, 834 insertions(+)
>  create mode 100644 drivers/pci/p2pdma.c
>  create mode 100644 include/linux/pci-p2pdma.h
>
> diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
> index 34b56a8f8480..b2396c22b53e 100644
> --- a/drivers/pci/Kconfig
> +++ b/drivers/pci/Kconfig
> @@ -124,6 +124,23 @@ config PCI_PASID
>
>  	  If unsure, say N.
>
> +config PCI_P2PDMA
> +	bool "PCI peer-to-peer transfer support"
> +	depends on PCI && ZONE_DEVICE && EXPERT
> +	select GENERIC_ALLOCATOR
> +	help
> +	  Enables drivers to do PCI peer-to-peer transactions to and from
> +	  BARs that are exposed in other devices that are the part of
> +	  the hierarchy where peer-to-peer DMA is guaranteed by the PCI
> +	  specification to work (ie. anything below a single PCI bridge).
> +
> +	  Many PCIe root complexes do not support P2P transactions and
> +	  it's hard to tell which support it at all, so at this time, DMA
> +	  transactions must be between devices behind the same root port.

s/DMA transactions/PCIe DMA transactions/

(Theoretically P2P should work on conventional PCI, and this sentence
only applies to PCIe.)

> +	  (Typically behind a network of PCIe switches).

Not sure this last sentence adds useful information.

> +++ b/drivers/pci/p2pdma.c
> @@ -0,0 +1,694 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * PCI Peer 2 Peer DMA support.
> + *
> + * Copyright (c) 2016-2018, Logan Gunthorpe
> + * Copyright (c) 2016-2017, Microsemi Corporation
> + * Copyright (c) 2017, Christoph Hellwig
> + * Copyright (c) 2018, Eideticom Inc.
> + *

Nit: unnecessary blank line.

> +/*
> + * If a device is behind a switch, we try to find the upstream bridge
> + * port of the switch. This requires two calls to pci_upstream_bridge():
> + * one for the upstream port on the switch, one on the upstream port
> + * for the next level in the hierarchy. Because of this, devices connected
> + * to the root port will be rejected.
> + */
> +static struct pci_dev *get_upstream_bridge_port(struct pci_dev *pdev)

This function doesn't seem to be used anymore. Thanks for all your hard
work to get rid of it!

> +{
> +	struct pci_dev *up1, *up2;
> +
> +	if (!pdev)
> +		return NULL;
> +
> +	up1 = pci_dev_get(pci_upstream_bridge(pdev));
> +	if (!up1)
> +		return NULL;
> +
> +	up2 = pci_dev_get(pci_upstream_bridge(up1));
> +
[PATCH v4 01/14] PCI/P2PDMA: Support peer-to-peer memory
Some PCI devices may have memory mapped in a BAR space that's
intended for use in peer-to-peer transactions. In order to enable
such transactions the memory must be registered with ZONE_DEVICE pages
so it can be used by DMA interfaces in existing drivers.

Add an interface for other subsystems to find and allocate chunks of P2P
memory as necessary to facilitate transfers between two PCI peers:

  int pci_p2pdma_add_client();
  struct pci_dev *pci_p2pmem_find();
  void *pci_alloc_p2pmem();

The new interface requires a driver to collect a list of client devices
involved in the transaction with the pci_p2pmem_add_client*() functions
then call pci_p2pmem_find() to obtain any suitable P2P memory. Once
this is done the list is bound to the memory and the calling driver is
free to add and remove clients as necessary (adding incompatible clients
will fail). With a suitable p2pmem device, memory can then be
allocated with pci_alloc_p2pmem() for use in DMA transactions.

Depending on hardware, using peer-to-peer memory may reduce the bandwidth
of the transfer but can significantly reduce pressure on system memory.
This may be desirable in many cases: for example a system could be designed
with a small CPU connected to a PCI switch by a small number of lanes
which would maximize the number of lanes available to connect to NVMe
devices.

The code is designed to only utilize the p2pmem device if all the devices
involved in a transfer are behind the same root port (typically through
a network of PCIe switches). This is because we have no way of knowing
whether peer-to-peer routing between PCIe Root Ports is supported
(PCIe r4.0, sec 1.3.1). Additionally, the benefits of P2P transfers that
go through the RC is limited to only reducing DRAM usage and, in some
cases, coding convenience. The PCI-SIG may be exploring adding a new
capability bit to advertise whether this is possible for future
hardware.

This commit includes significant rework and feedback from Christoph
Hellwig.
Signed-off-by: Christoph Hellwig
Signed-off-by: Logan Gunthorpe
---
 drivers/pci/Kconfig        |  17 ++
 drivers/pci/Makefile       |   1 +
 drivers/pci/p2pdma.c       | 694 +
 include/linux/memremap.h   |  18 ++
 include/linux/pci-p2pdma.h | 100 +++
 include/linux/pci.h        |   4 +
 6 files changed, 834 insertions(+)
 create mode 100644 drivers/pci/p2pdma.c
 create mode 100644 include/linux/pci-p2pdma.h

diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index 34b56a8f8480..b2396c22b53e 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -124,6 +124,23 @@ config PCI_PASID

 	  If unsure, say N.

+config PCI_P2PDMA
+	bool "PCI peer-to-peer transfer support"
+	depends on PCI && ZONE_DEVICE && EXPERT
+	select GENERIC_ALLOCATOR
+	help
+	  Enables drivers to do PCI peer-to-peer transactions to and from
+	  BARs that are exposed in other devices that are the part of
+	  the hierarchy where peer-to-peer DMA is guaranteed by the PCI
+	  specification to work (ie. anything below a single PCI bridge).
+
+	  Many PCIe root complexes do not support P2P transactions and
+	  it's hard to tell which support it at all, so at this time, DMA
+	  transactions must be between devices behind the same root port.
+	  (Typically behind a network of PCIe switches).
+
+	  If unsure, say N.
+
 config PCI_LABEL
 	def_bool y if (DMI || ACPI)
 	depends on PCI
diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
index 952addc7bacf..050c1e19a1de 100644
--- a/drivers/pci/Makefile
+++ b/drivers/pci/Makefile
@@ -25,6 +25,7 @@ obj-$(CONFIG_X86_INTEL_MID) += pci-mid.o
 obj-$(CONFIG_PCI_SYSCALL) += syscall.o
 obj-$(CONFIG_PCI_STUB) += pci-stub.o
 obj-$(CONFIG_PCI_ECAM) += ecam.o
+obj-$(CONFIG_PCI_P2PDMA) += p2pdma.o
 obj-$(CONFIG_XEN_PCIDEV_FRONTEND) += xen-pcifront.o

 obj-y += host/
diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
new file mode 100644
index ..e524a12eca1f
--- /dev/null
+++ b/drivers/pci/p2pdma.c
@@ -0,0 +1,694 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * PCI Peer 2 Peer DMA support.
+ *
+ * Copyright (c) 2016-2018, Logan Gunthorpe
+ * Copyright (c) 2016-2017, Microsemi Corporation
+ * Copyright (c) 2017, Christoph Hellwig
+ * Copyright (c) 2018, Eideticom Inc.
+ *
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+struct pci_p2pdma {
+	struct percpu_ref devmap_ref;
+	struct completion devmap_ref_done;
+	struct gen_pool *pool;
+	bool p2pmem_published;
+};
+
+static void pci_p2pdma_percpu_release(struct percpu_ref *ref)
+{
+	struct pci_p2pdma *p2p =
+		container_of(ref, struct pci_p2pdma, devmap_ref);
+
+	complete_all(&p2p->devmap_ref_done);
+}
+
+static void pci_p2pdma_percpu_kill(void *data)
+{
+	struct