Re: [PATCH v8 13/13] nvmet: Optionally use PCI P2P memory

2018-09-27 Thread Logan Gunthorpe
On 2018-09-27 11:12 AM, Keith Busch wrote: > Reviewed-by: Keith Busch Thanks for the reviews Keith! Logan

[PATCH v8 12/13] nvmet: Introduce helper functions to allocate and free request SGLs

2018-09-27 Thread Logan Gunthorpe
and cleared on any error. It also seems to be unnecessary to accumulate the length as the map_sgl functions should only ever be called once per request. Signed-off-by: Logan Gunthorpe Acked-by: Sagi Grimberg Cc: Christoph Hellwig --- drivers/nvme/target/core.c | 18 ++ drivers

[PATCH v8 03/13] PCI/P2PDMA: Add PCI p2pmem DMA mappings to adjust the bus offset

2018-09-27 Thread Logan Gunthorpe
. Signed-off-by: Logan Gunthorpe Acked-by: Bjorn Helgaas --- drivers/pci/p2pdma.c | 43 ++ include/linux/memremap.h | 1 + include/linux/pci-p2pdma.h | 7 +++ 3 files changed, 51 insertions(+) diff --git a/drivers/pci/p2pdma.c b/drivers/pci

[PATCH v8 08/13] IB/core: Ensure we map P2P memory correctly in rdma_rw_ctx_[init|destroy]()

2018-09-27 Thread Logan Gunthorpe
is P2P the entire SGL should be P2P. Signed-off-by: Logan Gunthorpe Reviewed-by: Christoph Hellwig Reviewed-by: Sagi Grimberg --- drivers/infiniband/core/rw.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/core/rw.c b/drivers/infiniband/core

[PATCH v8 02/13] PCI/P2PDMA: Add sysfs group to display p2pmem stats

2018-09-27 Thread Logan Gunthorpe
Add a sysfs group to display statistics about P2P memory that is registered in each PCI device. Attributes in the group display the total amount of P2P memory, the amount available and whether it is published or not. Signed-off-by: Logan Gunthorpe Acked-by: Bjorn Helgaas --- Documentation/ABI

[PATCH v8 07/13] block: Add PCI P2P flag for request queue and check support for requests

2018-09-27 Thread Logan Gunthorpe
QUEUE_FLAG_PCI_P2P is introduced meaning a driver's request queue supports targeting P2P memory. This will be used by P2P providers and orchestrators (in subsequent patches) to ensure block devices can support P2P memory before submitting P2P backed pages to submit_bio(). Signed-off-by: Logan

[PATCH v8 06/13] PCI/P2PDMA: Add P2P DMA driver writer's documentation

2018-09-27 Thread Logan Gunthorpe
converted to restructured text at this time. Signed-off-by: Logan Gunthorpe Acked-by: Bjorn Helgaas Cc: Jonathan Corbet --- Documentation/driver-api/pci/index.rst | 1 + Documentation/driver-api/pci/p2pdma.rst | 170 2 files changed, 171 insertions(+) create mode

[PATCH v8 13/13] nvmet: Optionally use PCI P2P memory

2018-09-27 Thread Logan Gunthorpe
code] Signed-off-by: Christoph Hellwig Signed-off-by: Logan Gunthorpe --- drivers/nvme/target/configfs.c| 36 drivers/nvme/target/core.c| 138 +- drivers/nvme/target/io-cmd-bdev.c | 3 + drivers/nvme/target/nvmet.h | 13 +++ drivers/nvme

[PATCH v8 11/13] nvme-pci: Add a quirk for a pseudo CMB

2018-09-27 Thread Logan Gunthorpe
Introduce a quirk to use CMB-like memory on older devices that have an exposed BAR but do not advertise support for using CMBLOC and CMBSIZE. We'd like to use some of these older cards to test P2P memory. Signed-off-by: Logan Gunthorpe Reviewed-by: Sagi Grimberg --- drivers/nvme/host/nvme.h

[PATCH v8 05/13] docs-rst: Add a new directory for PCI documentation

2018-09-27 Thread Logan Gunthorpe
Add a new directory in the driver API guide for PCI specific documentation. This is in preparation for adding a new PCI P2P DMA driver writers guide which will go in this directory. Signed-off-by: Logan Gunthorpe Cc: Jonathan Corbet Cc: Mauro Carvalho Chehab Cc: Greg Kroah-Hartman Cc: Vinod

[PATCH v8 04/13] PCI/P2PDMA: Introduce configfs/sysfs enable attribute helpers

2018-09-27 Thread Logan Gunthorpe
for attributes which take a boolean or a PCI device. Any boolean as accepted by strtobool() turns P2P on or off (such as 'y', 'n', '1', '0', etc). Specifying a full PCI device name/BDF will select the specific device. Signed-off-by: Logan Gunthorpe Acked-by: Bjorn Helgaas --- drivers/pci/p2pdma.c
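The attribute format described in this snippet can be modelled in userspace roughly as follows. This is only a sketch, not the helper from the patch: the names parse_enable and P2PSetting are hypothetical, the boolean matching is a simplified stand-in for strtobool() that accepts a few exact spellings, and treating a named device as "P2P on" is an assumption.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Syntactic check only: domain:bus:device.function, e.g. "0000:01:00.0".
_BDF = re.compile(r"[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]")

# Simplified stand-in for strtobool(): exact spellings only (assumption).
_TRUE = {"y", "yes", "1", "on"}
_FALSE = {"n", "no", "0", "off"}


@dataclass(frozen=True)
class P2PSetting:
    use_p2pmem: bool
    device: Optional[str] = None  # None means no specific provider was named


def parse_enable(page: str) -> P2PSetting:
    """Interpret an attribute write as a boolean or a full PCI device name."""
    text = page.strip().lower()
    # Try the device form first so "0000:01:00.0" is never read as boolean '0'.
    if _BDF.fullmatch(text):
        return P2PSetting(True, text)  # assumption: naming a device enables P2P
    if text in _TRUE:
        return P2PSetting(True)
    if text in _FALSE:
        return P2PSetting(False)
    raise ValueError(f"not a boolean or PCI device name: {page!r}")
```

Checking the device form before the boolean form matters because a BDF string begins with a digit that a first-character boolean parser could mistake for '0' or '1'.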

[PATCH v8 00/13] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-09-27 Thread Logan Gunthorpe
spin) using switches from both Microsemi and Broadcom. -- Logan Gunthorpe (13): PCI/P2PDMA: Support peer-to-peer memory PCI/P2PDMA: Add sysfs group to display p2pmem stats PCI/P2PDMA: Add PCI p2pmem DMA mappings to adjust the bus offset PCI/P2PDMA: Introduce configfs/sysfs enable attribute

[PATCH v8 01/13] PCI/P2PDMA: Support peer-to-peer memory

2018-09-27 Thread Logan Gunthorpe
capability bit to advertise whether this is possible for future hardware. This commit includes significant rework and feedback from Christoph Hellwig. Signed-off-by: Christoph Hellwig Signed-off-by: Logan Gunthorpe Acked-by: Bjorn Helgaas # PCI pieces --- drivers/pci/Kconfig| 17

[PATCH v8 09/13] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-09-27 Thread Logan Gunthorpe
will not be supported by memremap() and therefore will not support PCI P2P and will have no support for CMB. Signed-off-by: Logan Gunthorpe --- drivers/nvme/host/pci.c | 80 +++-- 1 file changed, 45 insertions(+), 35 deletions(-) diff --git a/drivers/nvme/host/pci.c b/drivers/nvme

[PATCH v8 10/13] nvme-pci: Add support for P2P memory in requests

2018-09-27 Thread Logan Gunthorpe
-by: Logan Gunthorpe Reviewed-by: Sagi Grimberg Reviewed-by: Christoph Hellwig --- drivers/nvme/host/core.c | 4 drivers/nvme/host/nvme.h | 1 + drivers/nvme/host/pci.c | 17 + 3 files changed, 18 insertions(+), 4 deletions(-) diff --git a/drivers/nvme/host/core.c b/drivers

Re: [PATCH v4 06/14] PCI/P2PDMA: Add P2P DMA driver writer's documentation

2018-05-22 Thread Logan Gunthorpe
Thanks for the review Randy! I'll make the changes for the next time we post the series. On 22/05/18 03:24 PM, Randy Dunlap wrote: >> +The first task an orchestrator driver must do is compile a list of >> +all client drivers that will be involved in a given transaction. For >> +example, the NVMe

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-11 Thread Logan Gunthorpe
On 5/11/2018 4:24 PM, Stephen Bates wrote: All  Alex (or anyone else) can you point to where IOVA addresses are generated? A case of RTFM perhaps (though a pointer to the code would still be appreciated). https://www.kernel.org/doc/Documentation/Intel-IOMMU.txt Some exceptions to IOVA

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-11 Thread Logan Gunthorpe
On 5/11/2018 2:52 AM, Christian König wrote: This only works when the IOVA and the PCI bus addresses never overlap. I'm not sure how the IOVA allocation works but I don't think we guarantee that on Linux. I find this hard to believe. There's always the possibility that some part of the

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-10 Thread Logan Gunthorpe
On 10/05/18 12:41 PM, Stephen Bates wrote: > Hi Jerome > >>Note on GPU we do would not rely on ATS for peer to peer. Some part >>of the GPU (DMA engines) do not necessarily support ATS. Yet those >>are the part likely to be use in peer to peer. > > OK this is good to know. I agree

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-10 Thread Logan Gunthorpe
On 10/05/18 11:11 AM, Stephen Bates wrote: >> Not to me. In the p2pdma code we specifically program DMA engines with >> the PCI bus address. > > Ah yes of course. Brain fart on my part. We are not programming the P2PDMA > initiator with an IOVA but with the PCI bus address... > >> So

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-10 Thread Logan Gunthorpe
On 10/05/18 08:16 AM, Stephen Bates wrote: > Hi Christian > >> Why would a switch not identify that as a peer address? We use the PASID >>together with ATS to identify the address space which a transaction >>should use. > > I think you are conflating two types of TLPs here. If the

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-09 Thread Logan Gunthorpe
On 09/05/18 07:40 AM, Christian König wrote: > The key takeaway is that when any device has ATS enabled you can't > disable ACS without breaking it (even if you unplug and replug it). I don't follow how you came to this conclusion... The ACS bits we'd be turning off are the ones that force

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-08 Thread Logan Gunthorpe
On 08/05/18 05:11 PM, Alex Williamson wrote: > On to the implementation details... I already mentioned the BDF issue > in my other reply. If we had a way to persistently identify a device, > would we specify the downstream points at which we want to disable ACS > or the endpoints that we want

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-08 Thread Logan Gunthorpe
On 08/05/18 05:00 PM, Dan Williams wrote: >> I'd advise caution with a user supplied BDF approach, we have no >> guaranteed persistence for a device's PCI address. Adding a device >> might renumber the buses, replacing a device with one that consumes >> more/less bus numbers can renumber the

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-08 Thread Logan Gunthorpe
On 08/05/18 04:03 PM, Alex Williamson wrote: > If IOMMU grouping implies device assignment (because nobody else uses > it to the same extent as device assignment) then the build-time option > falls to pieces, we need a single kernel that can do both. I think we > need to get more clever about

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-08 Thread Logan Gunthorpe
On 08/05/18 02:43 PM, Alex Williamson wrote: > Yes, GPUs seem to be leading the pack in implementing ATS. So now the > dumb question, why not simply turn off the IOMMU and thus ACS? The > argument of using the IOMMU for security is rather diminished if we're > specifically enabling devices to

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-08 Thread Logan Gunthorpe
On 08/05/18 02:13 PM, Alex Williamson wrote: > Well, I'm a bit confused, this patch series is specifically disabling > ACS on switches, but per the spec downstream switch ports implementing > ACS MUST implement direct translated P2P. So it seems the only > potential gap here is the endpoint,

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-08 Thread Logan Gunthorpe
On 08/05/18 01:34 PM, Alex Williamson wrote: > They are not so unrelated, see the ACS Direct Translated P2P > capability, which in fact must be implemented by switch downstream > ports implementing ACS and works specifically with ATS. This appears to > be the way the PCI SIG would intend for

Re: [PATCH v4 00/14] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-05-08 Thread Logan Gunthorpe
On 08/05/18 10:57 AM, Alex Williamson wrote: > AIUI from previously questioning this, the change is hidden behind a > build-time config option and only custom kernels or distros optimized > for this sort of support would enable that build option. I'm more than > a little dubious though that

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-08 Thread Logan Gunthorpe
On 08/05/18 10:50 AM, Christian König wrote: > E.g. transactions are initially send to the root complex for > translation, that's for sure. But at least for AMD GPUs the root complex > answers with the translated address which is then cached in the device. > > So further transactions for the

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-08 Thread Logan Gunthorpe
On 08/05/18 01:17 AM, Christian König wrote: > AMD APUs mandatory need the ACS flag set for the GPU integrated in the > CPU when IOMMU is enabled or otherwise you will break SVM. Well, given that the current set only disables ACS bits on bridges (previous versions were only on switches) this

Re: [PATCH v4 00/14] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-05-07 Thread Logan Gunthorpe
> How do you envison merging this? There's a big chunk in drivers/pci, but > really no opportunity for conflicts there, and there's significant stuff in > block and nvme that I don't really want to merge. > > If Alex is OK with the ACS situation, I can ack the PCI parts and you could > merge it

Re: [PATCH v4 01/14] PCI/P2PDMA: Support peer-to-peer memory

2018-05-07 Thread Logan Gunthorpe
Thanks for the review. I'll apply all of these changes for the next version of the set. >> +/* >> + * If a device is behind a switch, we try to find the upstream bridge >> + * port of the switch. This requires two calls to pci_upstream_bridge(): >> + * one for the upstream port on the switch,

Re: [PATCH v4 00/14] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-05-04 Thread Logan Gunthorpe
On 04/05/18 08:27 AM, Christian König wrote: > Are you sure that this is more convenient? At least on first glance it > feels overly complicated. > > I mean what's the difference between the two approaches? > >     sum = pci_p2pdma_distance(target, [A, B, C, target]); > > and > >     sum

Re: [PATCH v4 00/14] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-05-03 Thread Logan Gunthorpe
On 03/05/18 11:29 AM, Christian König wrote: > Ok, that is the point where I'm stuck. Why do we need that in one > function call in the PCIe subsystem? > > The problem at least with GPUs is that we seriously don't have that > information here, cause the PCI subsystem might not be aware of all

Re: [PATCH v4 00/14] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-05-03 Thread Logan Gunthorpe
On 03/05/18 03:05 AM, Christian König wrote: > Ok, I'm still missing the big picture here. First question is what is > the P2PDMA provider? Well there's some pretty good documentation in the patchset for this, but in short, a provider is a device that provides some kind of P2P resource (ie.

Re: [PATCH v4 00/14] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-05-02 Thread Logan Gunthorpe
Hi Christian, On 5/2/2018 5:51 AM, Christian König wrote: it would be rather nice to have if you could separate out the functions to detect if peer2peer is possible between two devices. This would essentially be pci_p2pdma_distance() in the existing patchset. It returns the sum of the

[PATCH v4 01/14] PCI/P2PDMA: Support peer-to-peer memory

2018-04-23 Thread Logan Gunthorpe
. The PCI-SIG may be exploring adding a new capability bit to advertise whether this is possible for future hardware. This commit includes significant rework and feedback from Christoph Hellwig. Signed-off-by: Christoph Hellwig <h...@lst.de> Signed-off-by: Logan Gunthorpe <log...@del

[PATCH v4 03/14] PCI/P2PDMA: Add PCI p2pmem dma mappings to adjust the bus offset

2018-04-23 Thread Logan Gunthorpe
The DMA address used when mapping PCI P2P memory must be the PCI bus address. Thus, introduce pci_p2pmem_[un]map_sg() to map the correct addresses when using P2P memory. For this, we assume that an SGL passed to these functions contains all P2P memory or no P2P memory. Signed-off-by: Logan
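The address adjustment this snippet describes can be illustrated with a small arithmetic model. It is a sketch under stated assumptions and not the code from the patch: Bar, bus_address and map_sg are hypothetical names, and a scatterlist is reduced to a list of (cpu_addr, length) pairs. The model translates each CPU physical address inside a BAR to the address the same byte has on the PCI bus, and rejects a list that strays outside the BAR, mirroring the all-or-nothing assumption above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Bar:
    cpu_start: int  # CPU physical address where the BAR is mapped
    bus_start: int  # address of the same BAR as seen on the PCI bus
    size: int       # BAR length in bytes


def bus_address(bar: Bar, cpu_addr: int, length: int = 1) -> int:
    """Translate a CPU physical address inside the BAR to a PCI bus address."""
    if not (bar.cpu_start <= cpu_addr
            and cpu_addr + length <= bar.cpu_start + bar.size):
        raise ValueError("segment is not entirely inside the P2P BAR")
    return cpu_addr - bar.cpu_start + bar.bus_start


def map_sg(bar: Bar, segments: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Map every (cpu_addr, length) segment; a mixed list raises ValueError."""
    return [(bus_address(bar, addr, length), length) for addr, length in segments]
```

When the BAR's bus address and CPU address are equal the translation is the identity, which is why the difference only becomes visible on platforms where the two address spaces are offset from each other.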

[PATCH v4 10/14] nvme-pci: Add support for P2P memory in requests

2018-04-23 Thread Logan Gunthorpe
-off-by: Logan Gunthorpe <log...@deltatee.com> Reviewed-by: Sagi Grimberg <s...@grimberg.me> Reviewed-by: Christoph Hellwig <h...@lst.de> --- drivers/nvme/host/core.c | 4 drivers/nvme/host/nvme.h | 1 + drivers/nvme/host/pci.c | 19 +++ 3 files changed, 2

[PATCH v4 12/14] nvmet: Introduce helper functions to allocate and free request SGLs

2018-04-23 Thread Logan Gunthorpe
drivers. The presently unused 'sq' argument in the alloc function will be necessary to decide whether to use peer-to-peer memory and obtain the correct provider to allocate the memory. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> Cc: Christoph Hellwig <h...@lst.de> Cc: Sagi

[PATCH v4 02/14] PCI/P2PDMA: Add sysfs group to display p2pmem stats

2018-04-23 Thread Logan Gunthorpe
Add a sysfs group to display statistics about P2P memory that is registered in each PCI device. Attributes in the group display the total amount of P2P memory, the amount available and whether it is published or not. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> --- Documentati

[PATCH v4 07/14] block: Introduce PCI P2P flags for request and request queue

2018-04-23 Thread Logan Gunthorpe
flag set. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> Reviewed-by: Sagi Grimberg <s...@grimberg.me> Reviewed-by: Christoph Hellwig <h...@lst.de> --- block/blk-core.c | 3 +++ include/linux/blk_types.h | 18 +- include/linux/blkdev.h| 3 ++

[PATCH v4 06/14] PCI/P2PDMA: Add P2P DMA driver writer's documentation

2018-04-23 Thread Logan Gunthorpe
converted to restructured text at this time. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> Cc: Jonathan Corbet <cor...@lwn.net> --- Documentation/PCI/index.rst | 14 +++ Documentation/driver-api/pci/index.rst | 1 + Documentation/driver-api/pci/p2pd

[PATCH v4 13/14] nvmet-rdma: Use new SGL alloc/free helper for requests

2018-04-23 Thread Logan Gunthorpe
be called once. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> Cc: Christoph Hellwig <h...@lst.de> Cc: Sagi Grimberg <s...@grimberg.me> --- drivers/nvme/target/rdma.c | 20 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/drivers/nvme/target/rdm

[PATCH v4 14/14] nvmet: Optionally use PCI P2P memory

2018-04-23 Thread Logan Gunthorpe
se <sw...@opengridcomputing.com> [hch: partial rewrite of the initial code] Signed-off-by: Christoph Hellwig <h...@lst.de> Signed-off-by: Logan Gunthorpe <log...@deltatee.com> --- drivers/nvme/target/configfs.c | 67 ++ drivers/nvme

[PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-04-23 Thread Logan Gunthorpe
transactions. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> --- drivers/pci/Kconfig| 9 + drivers/pci/p2pdma.c | 45 ++--- drivers/pci/pci.c | 6 ++ include/linux/pci-p2pdma.h | 5 + 4 files chang

[PATCH v4 08/14] IB/core: Ensure we map P2P memory correctly in rdma_rw_ctx_[init|destroy]()

2018-04-23 Thread Logan Gunthorpe
is P2P the entire SGL should be P2P. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> Reviewed-by: Christoph Hellwig <h...@lst.de> --- drivers/infiniband/core/rw.c | 13 +++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/core/rw

[PATCH v4 09/14] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-04-23 Thread Logan Gunthorpe
, devm_memremap_pages() allocates regular memory without side effects that's accessible without the iomem accessors. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> --- drivers/nvme/host/pci.c | 75 +++-- 1 file changed, 41 insertions(+), 34 del

[PATCH v4 05/14] docs-rst: Add a new directory for PCI documentation

2018-04-23 Thread Logan Gunthorpe
Add a new directory in the driver API guide for PCI specific documentation. This is in preparation for adding a new PCI P2P DMA driver writers guide which will go in this directory. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> Cc: Jonathan Corbet <cor...@lwn.net> Cc: Ma

[PATCH v4 11/14] nvme-pci: Add a quirk for a pseudo CMB

2018-04-23 Thread Logan Gunthorpe
Introduce a quirk to use CMB-like memory on older devices that have an exposed BAR but do not advertise support for using CMBLOC and CMBSIZE. We'd like to use some of these older cards to test P2P memory. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> Reviewed-by: Sagi Grimb

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-27 Thread Logan Gunthorpe
On 27/03/18 02:47 AM, Jonathan Cameron wrote: > I'll see if I can get our PCI SIG people to follow this through and see if > it is just an omission or as Bjorn suggested, there is some reason we > aren't thinking of that makes it hard. That would be great! Thanks! Logan

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-26 Thread Logan Gunthorpe
On 26/03/18 01:35 PM, Jason Gunthorpe wrote: > I think this is another case of the HW can do it but the SW support is > missing. IOMMU configuration and maybe firmware too, for instance. Nope, not sure how you can make this leap. We've been specifically told that peer-to-peer PCIe DMA is not

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-26 Thread Logan Gunthorpe
On 26/03/18 10:41 AM, Jason Gunthorpe wrote: > On Mon, Mar 26, 2018 at 12:11:38PM +0100, Jonathan Cameron wrote: >> On Tue, 13 Mar 2018 10:43:55 -0600 >> Logan Gunthorpe <log...@deltatee.com> wrote: >> >>> On 12/03/18 09:28 PM, Sinan Kaya wrote: >>>

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-26 Thread Logan Gunthorpe
On 26/03/18 08:01 AM, Bjorn Helgaas wrote: > On Mon, Mar 26, 2018 at 12:11:38PM +0100, Jonathan Cameron wrote: >> On Tue, 13 Mar 2018 10:43:55 -0600 >> Logan Gunthorpe <log...@deltatee.com> wrote: >>> It turns out that root ports that support P2P are far less commo

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-26 Thread Logan Gunthorpe
On 24/03/18 09:28 AM, Stephen Bates wrote: > 1. There is no requirement for a single function to support internal DMAs but > in the case of NVMe we do have a protocol specific way for a NVMe function to > indicate it supports via the CMB BAR. Other protocols may also have such > methods but

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-23 Thread Logan Gunthorpe
On 23/03/18 03:50 PM, Bjorn Helgaas wrote: > Popping way up the stack, my original point was that I'm trying to > remove restrictions on what devices can participate in peer-to-peer > DMA. I think it's fairly clear that in conventional PCI, any devices > in the same PCI hierarchy, i.e., below

Re: [PATCH v3 11/11] nvmet: Optionally use PCI P2P memory

2018-03-21 Thread Logan Gunthorpe
On 21/03/18 03:27 AM, Christoph Hellwig wrote: >> + const char *page, size_t count) >> +{ >> +struct nvmet_port *port = to_nvmet_port(item); >> +struct device *dev; >> +struct pci_dev *p2p_dev = NULL; >> +bool use_p2pmem; >> + >> +switch (page[0])

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-14 Thread Logan Gunthorpe
On 14/03/18 01:28 PM, Dan Williams wrote: > P2P over PCI/PCI-X is quite common in devices like raid controllers. > It would be useful if those configurations were not left behind so > that Linux could feasibly deploy offload code to a controller in the > PCI domain. Thanks for the note. Neat.

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-14 Thread Logan Gunthorpe
On 14/03/18 12:51 PM, Bjorn Helgaas wrote: > You are focused on PCIe systems, and in those systems, most topologies > do have an upstream switch, which means two upstream bridges. I'm > trying to remove that assumption because I don't think there's a > requirement for it in the spec. Enforcing

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-14 Thread Logan Gunthorpe
On 14/03/18 06:16 AM, David Laight wrote: > That surprises me (unless I missed something last time I read the spec). > While P2P writes are relatively easy to handle, reads and any other TLP that > require acks are a completely different proposition. > There are no additional fields that can be

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-14 Thread Logan Gunthorpe
On 13/03/18 08:56 PM, Bjorn Helgaas wrote: > I assume you want to exclude Root Ports because of multi-function > devices and the "route to self" error. I was hoping for a reference > to that so I could learn more about it. I haven't been able to find where in the spec it forbids route to self.

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-13 Thread Logan Gunthorpe
On 13/03/18 05:19 PM, Sinan Kaya wrote: > It is still a switch it can move packets but, maybe it can move data at > 100kbps speed. As Stephen pointed out, it's a requirement of the PCIe spec that a switch supports P2P. If you want to sell a switch that does P2P with bad performance then that's

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-13 Thread Logan Gunthorpe
On 13/03/18 05:08 PM, Bjorn Helgaas wrote: > On Tue, Mar 13, 2018 at 10:31:55PM +, Stephen Bates wrote: > If it *is* necessary because Root Ports and devices below them behave > differently than in conventional PCI, I think you should include a > reference to the relevant section of the

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-13 Thread Logan Gunthorpe
On 13/03/18 04:29 PM, Sinan Kaya wrote: > If hardware doesn't support it, blacklisting should have been the right > path and I still think that you should remove all switch business from the > code. > I did not hear enough justification for having a switch requirement > for P2P. I disagree. >

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-13 Thread Logan Gunthorpe
On 13/03/18 03:22 PM, Sinan Kaya wrote: > It sounds like you have very tight hardware expectations for this to work > at this moment. You also don't want to generalize this code for others and > address the shortcomings. No, that's the way the community has pushed this work. Our original work

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-13 Thread Logan Gunthorpe
On 13/03/18 01:53 PM, Sinan Kaya wrote: > I agree disabling globally would be bad. Somebody can always say I have > ten switches on my system. I want to do peer-to-peer on one switch only. Now, > this change weakened security for the other switches that I had no intention > with doing P2P. > >

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-13 Thread Logan Gunthorpe
On 13/03/18 01:10 PM, Sinan Kaya wrote: > I was thinking of this for the pci_p2pdma_add_client() case for the > parent pointer. > > +struct pci_p2pdma_client { > + struct list_head list; > + struct pci_dev *client; > + struct pci_dev *provider; > +}; Yeah, that structure only

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-13 Thread Logan Gunthorpe
On 13/03/18 11:49 AM, Sinan Kaya wrote: And there's also the ACS problem which means if you want to use P2P on the root ports you'll have to disable ACS on the entire system. (Or preferably, the IOMMU groups need to get more sophisticated to allow for dynamic changes). Do you think you

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-13 Thread Logan Gunthorpe
On 12/03/18 09:28 PM, Sinan Kaya wrote: Maybe, dev parameter should also be struct pci_dev so that you can get rid of all to_pci_dev() calls in this code including find_parent_pci_dev() function. No, this was mentioned in v2. find_parent_pci_dev is necessary because the calling drivers

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-13 Thread Logan Gunthorpe
On 12/03/18 09:28 PM, Sinan Kaya wrote: On 3/12/2018 3:35 PM, Logan Gunthorpe wrote: Regarding the switch business, It is amazing how much trouble you went into limit this functionality into very specific hardware. I thought that we reached to an agreement that code would not impose any

Re: [PATCH v3 05/11] PCI/P2PDMA: Add P2P DMA driver writer's documentation

2018-03-12 Thread Logan Gunthorpe
On 3/12/2018 1:41 PM, Jonathan Corbet wrote: This all seems good, but...could we consider moving this documentation to driver-api/PCI as it's converted to RST? That would keep it together with similar materials and bring a bit more coherence to Documentation/ as a whole. Yup, I'll change this

[PATCH v3 04/11] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-03-12 Thread Logan Gunthorpe
-off-by: Logan Gunthorpe <log...@deltatee.com> --- drivers/pci/Kconfig| 9 + drivers/pci/p2pdma.c | 44 drivers/pci/pci.c | 6 ++ include/linux/pci-p2pdma.h | 5 + 4 files changed, 64 insertions(+) diff

[PATCH v3 06/11] block: Introduce PCI P2P flags for request and request queue

2018-03-12 Thread Logan Gunthorpe
flag set. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> Reviewed-by: Sagi Grimberg <s...@grimberg.me> --- block/blk-core.c | 3 +++ include/linux/blk_types.h | 18 +- include/linux/blkdev.h| 3 +++ 3 files changed, 23 insertions(+), 1 deletion(-)

[PATCH v3 07/11] IB/core: Ensure we map P2P memory correctly in rdma_rw_ctx_[init|destroy]()

2018-03-12 Thread Logan Gunthorpe
is P2P the entire SGL should be P2P. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> --- drivers/infiniband/core/rw.c | 13 +++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/core/rw.c b/drivers/infiniband/core/rw.c index c8963e

[PATCH v3 11/11] nvmet: Optionally use PCI P2P memory

2018-03-12 Thread Logan Gunthorpe
hch: partial rewrite of the initial code] Signed-off-by: Christoph Hellwig <h...@lst.de> Signed-off-by: Logan Gunthorpe <log...@deltatee.com> --- drivers/nvme/target/configfs.c | 67 ++ drivers/nvme/target/core.c | 106 -

[PATCH v3 08/11] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-03-12 Thread Logan Gunthorpe
Register the CMB buffer as p2pmem and use the appropriate allocation functions to create and destroy the IO SQ. If the CMB supports WDS and RDS, publish it for use as P2P memory by other devices. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> --- drivers/nvme/host/pci.

[PATCH v3 03/11] PCI/P2PDMA: Add PCI p2pmem dma mappings to adjust the bus offset

2018-03-12 Thread Logan Gunthorpe
The DMA address used when mapping PCI P2P memory must be the PCI bus address. Thus, introduce pci_p2pmem_[un]map_sg() to map the correct addresses when using P2P memory. For this, we assume that an SGL passed to these functions contains all P2P memory or no P2P memory. Signed-off-by: Logan

[PATCH v3 05/11] PCI/P2PDMA: Add P2P DMA driver writer's documentation

2018-03-12 Thread Logan Gunthorpe
converted to restructured text at this time. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> Cc: Jonathan Corbet <cor...@lwn.net> --- Documentation/PCI/index.rst | 14 Documentation/PCI/p2pdma.rst | 164 +++ Documentation/index.rst

[PATCH v3 00/11] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-03-12 Thread Logan Gunthorpe
new namespaces that are not supported by that memory will fail. Logan Gunthorpe (11): PCI/P2PDMA: Support peer-to-peer memory PCI/P2PDMA: Add sysfs group to display p2pmem stats PCI/P2PDMA: Add PCI p2pmem dma mappings to adjust the bus offset PCI/P2PDMA: Clear ACS P2P flags for all devices

[PATCH v3 10/11] nvme-pci: Add a quirk for a pseudo CMB

2018-03-12 Thread Logan Gunthorpe
Introduce a quirk to use CMB-like memory on older devices that have an exposed BAR but do not advertise support for using CMBLOC and CMBSIZE. We'd like to use some of these older cards to test P2P memory. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> Reviewed-by: Sagi Grimb

[PATCH v3 02/11] PCI/P2PDMA: Add sysfs group to display p2pmem stats

2018-03-12 Thread Logan Gunthorpe
Add a sysfs group to display statistics about P2P memory that is registered in each PCI device. Attributes in the group display the total amount of P2P memory, the amount available and whether it is published or not. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> --- Documentati

[PATCH v3 09/11] nvme-pci: Add support for P2P memory in requests

2018-03-12 Thread Logan Gunthorpe
-off-by: Logan Gunthorpe <log...@deltatee.com> Reviewed-by: Sagi Grimberg <s...@grimberg.me> --- drivers/nvme/host/core.c | 4 drivers/nvme/host/nvme.h | 1 + drivers/nvme/host/pci.c | 19 +++ 3 files changed, 20 insertions(+), 4 deletions(-) diff --git a/drive

[PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-12 Thread Logan Gunthorpe
and feedback from Christoph Hellwig. Signed-off-by: Christoph Hellwig <h...@lst.de> Signed-off-by: Logan Gunthorpe <log...@deltatee.com> --- drivers/pci/Kconfig| 16 ++ drivers/pci/Makefile | 1 + drivers/pci/p2pdma.c | 679 ++

Re: [PATCH v2 07/10] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-03-05 Thread Logan Gunthorpe
On 05/03/18 05:49 PM, Oliver wrote: It's in arch/powerpc/kernel/io.c as _memcpy_toio() and it has two full barriers! Awesome! Our io.h indicates that our iomem accessors are designed to provide x86ish strong ordering of accesses to MMIO space. The git log indicates arch/powerpc/kernel/io.c

Re: [PATCH v2 04/10] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-03-05 Thread Logan Gunthorpe
On 05/03/18 03:28 PM, Bjorn Helgaas wrote: If you put the #ifdef right here, then it's easier to read because we can see that "oh, this is a special and uncommon case that I can probably ignore". Makes sense. I'll do that. Thanks, Logan

Re: [PATCH v2 07/10] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-03-05 Thread Logan Gunthorpe
On 05/03/18 01:10 PM, Jason Gunthorpe wrote: So when reading the above mlx code, we see the first wmb() being used to ensure that CPU stores to cachable memory are visible to the DMA triggered by the doorbell ring. Oh, yes, that makes sense. Disregard my previous email as I was wrong. Logan
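The ordering point in the snippet above (stores to the queue entry must be visible before the doorbell is rung) can be sketched in plain userspace C. This is only an analogy under stated assumptions: `demo_queue`, `demo_submit` and `SQ_DEPTH` are hypothetical names, and a C11 release fence stands in for the kernel's `wmb()`. A C11 fence orders memory as seen by other CPU threads. It does not by itself guarantee visibility to a DMA-capable device, which is what the kernel barrier is for.

```c
#include <stdatomic.h>
#include <stdint.h>

#define SQ_DEPTH 8 /* hypothetical queue depth, for illustration only */

/* Hypothetical stand-in for a submission queue plus its doorbell. */
struct demo_queue {
    uint64_t sq[SQ_DEPTH];     /* stand-in for submission queue entries */
    _Atomic uint16_t doorbell; /* stand-in for the device doorbell register */
    uint16_t tail;             /* next free slot in sq[] */
};

/*
 * Write the entry first, then fence, then publish the new tail through the
 * doorbell. The release fence keeps the entry store from being reordered
 * after the doorbell store, mirroring the wmb()-then-doorbell pattern
 * discussed in the thread. Returns the new tail value.
 */
uint16_t demo_submit(struct demo_queue *q, uint64_t cmd)
{
    q->sq[q->tail] = cmd;
    q->tail = (uint16_t)((q->tail + 1) % SQ_DEPTH);

    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&q->doorbell, q->tail, memory_order_relaxed);

    return q->tail;
}
```

A consumer thread that reads `doorbell` with acquire semantics would then be guaranteed to see the entry written before it. Real driver code would use the kernel's own barrier and MMIO accessors rather than C11 atomics.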

Re: [PATCH v2 07/10] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-03-05 Thread Logan Gunthorpe
On 05/03/18 12:57 PM, Sagi Grimberg wrote: Keith, while we're on this, regardless of cmb, is SQE memcopy and DB update ordering always guaranteed? If you look at mlx4 (rdma device driver) that works exactly the same as nvme you will find: -- qp->sq.head += nreq;

Re: [PATCH v2 07/10] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-03-05 Thread Logan Gunthorpe
On 05/03/18 11:02 AM, Sinan Kaya wrote: writel has a barrier inside on ARM64. https://elixir.bootlin.com/linux/latest/source/arch/arm64/include/asm/io.h#L143 Yes, and no barrier inside memcpy_toio as it uses __raw_writes. This should be sufficient as we are only accessing addresses that

Re: [PATCH v2 07/10] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-03-05 Thread Logan Gunthorpe
On 05/03/18 09:00 AM, Keith Busch wrote: On Mon, Mar 05, 2018 at 12:33:29PM +1100, Oliver wrote: On Thu, Mar 1, 2018 at 10:40 AM, Logan Gunthorpe <log...@deltatee.com> wrote: @@ -429,10 +429,7 @@ static void __nvme_submit_cmd(struct nvme_queue *nvmeq, { u16 tail = nvmeq->

Re: [PATCH v2 00/10] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-03-02 Thread Logan Gunthorpe
On 02/03/18 02:44 PM, Benjamin Herrenschmidt wrote: All right, so, I think I have a plan to fix this, but it will take a little bit of time. Basically the idea is to have firmware pass to Linux a region that's known to not have anything in it that it can use for the vmalloc space rather than

Re: [PATCH v2 10/10] nvmet: Optionally use PCI P2P memory

2018-03-02 Thread Logan Gunthorpe
On 02/03/18 09:18 AM, Jason Gunthorpe wrote: This allocator already seems not useful for the P2P target memory on a Mellanox NIC due to the way it has a special allocation flow (windowing) and special usage requirements. Nor can it be useful for the doorbell memory in the NIC. No one

Re: [PATCH v2 02/10] PCI/P2PDMA: Add sysfs group to display p2pmem stats

2018-03-01 Thread Logan Gunthorpe
On 01/03/18 05:36 PM, Dan Williams wrote: On Thu, Mar 1, 2018 at 4:15 PM, Logan Gunthorpe <log...@deltatee.com> wrote: On 01/03/18 10:44 AM, Bjorn Helgaas wrote: I think these two statements are out of order, since the attributes dereference pdev->p2pdma. And it looks like you s

Re: [PATCH v2 02/10] PCI/P2PDMA: Add sysfs group to display p2pmem stats

2018-03-01 Thread Logan Gunthorpe
On 01/03/18 10:44 AM, Bjorn Helgaas wrote: I think these two statements are out of order, since the attributes dereference pdev->p2pdma. And it looks like you set "error" unnecessarily, since you return immediately looking at it. Per the previous series, sysfs_create_group is must_check for

Re: [PATCH v2 10/10] nvmet: Optionally use PCI P2P memory

2018-03-01 Thread Logan Gunthorpe
On 01/03/18 04:57 PM, Stephen Bates wrote: We don't want to lump these all together without knowing which region you're allocating from, right? In all seriousness I do agree with you on these Keith in the long term. We would consider adding property flags for the memory as it is added to

Re: [PATCH v2 04/10] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-03-01 Thread Logan Gunthorpe
On 01/03/18 04:15 PM, Bjorn Helgaas wrote: The question is what the relevant switch is. We call pci_enable_acs() on every PCI device, including Root Ports. It looks like this relies on get_upstream_bridge_port() to filter out some things. I don't think get_upstream_bridge_port() is doing

Re: [PATCH v2 00/10] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-03-01 Thread Logan Gunthorpe
On 01/03/18 04:26 PM, Benjamin Herrenschmidt wrote: The big problem is not the vmemmap, it's the linear mapping. Ah, yes, ok. Logan

Re: [PATCH v2 10/10] nvmet: Optionally use PCI P2P memory

2018-03-01 Thread Logan Gunthorpe
On 01/03/18 04:49 PM, Keith Busch wrote: On Thu, Mar 01, 2018 at 11:00:51PM +, Stephen Bates wrote: P2P is about offloading the memory and PCI subsystem of the host CPU and this is achieved no matter which p2p_dev is used. Even within a device, memory attributes for its various

Re: [PATCH v2 10/10] nvmet: Optionally use PCI P2P memory

2018-03-01 Thread Logan Gunthorpe
On 01/03/18 04:20 PM, Jason Gunthorpe wrote: On Thu, Mar 01, 2018 at 11:00:51PM +, Stephen Bates wrote: No, locality matters. If you have a bunch of NICs and bunch of drives and the allocator chooses to put all P2P memory on a single drive your performance will suck horribly even if

Re: [PATCH v2 00/10] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-03-01 Thread Logan Gunthorpe
On 01/03/18 04:00 PM, Benjamin Herrenschmidt wrote: We use only 52 in practice but yes. That's 64PB. If you need a sparse vmemmap for the entire space it will take 16TB which leaves you with 63.98PB of address space left. (Similar calculations for other numbers of address bits.) We
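The arithmetic in the snippet above can be checked with a few lines of C. This is a sketch under assumptions the snippet does not state: that "64PB" means 2^56 bytes and that the units are binary (1 PB = 2^50 bytes, 1 TB = 2^40 bytes). The 16TB vmemmap figure is taken as given from the message, not derived here, and the helper names are hypothetical.

```c
#include <stdint.h>

#define TIB (UINT64_C(1) << 40) /* binary terabyte, assumed unit */
#define PIB (UINT64_C(1) << 50) /* binary petabyte, assumed unit */

/* Bytes addressable with the given number of address bits (must be < 64). */
uint64_t address_space_bytes(unsigned int addr_bits)
{
    return UINT64_C(1) << addr_bits;
}

/* Bytes left once the vmemmap reservation is carved out of the space. */
uint64_t bytes_left(unsigned int addr_bits, uint64_t vmemmap_bytes)
{
    return address_space_bytes(addr_bits) - vmemmap_bytes;
}

/* Convert a byte count to binary petabytes, e.g. for printing. */
double to_pib(uint64_t bytes)
{
    return (double)bytes / (double)PIB;
}
```

With these assumptions, `to_pib(bytes_left(56, 16 * TIB))` evaluates to 63.984375, which matches the 63.98PB quoted in the message. Passing a different `addr_bits` value repeats the calculation for other address widths.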
