Re: [PATCH 2/3] iopmem : Add a block device driver for PCIe attached IO memory.

2016-10-28 Thread Logan Gunthorpe
Hi Christoph, Thanks so much for the detailed review of the code! Even though by the sounds of things we will be moving to device dax and most of this is moot. Still, it's great to get some feedback and learn a few things. I've given some responses below. On 28/10/16 12:45 AM, Christoph Hellwig

BUG: Hung task timeouts in for-4.10/dio

2016-11-08 Thread Logan Gunthorpe
Hi guys, We were looking at testing the new IO polling improvements and we built a kernel from the 'for-4.10/dio' (64ead7d) branch in linux-block. However this branch seems to cause hung tasks when booted. Most noticeably, dhclient seems to always hang as it tries to read from its leases file,

Re: BUG: Hung task timeouts in for-4.10/dio

2016-11-08 Thread Logan Gunthorpe
Hey, I haven't checked 82a78cd, but when I tried reverting the commit in yesterday's version there were conflicts, as a subsequent patch removed the defines that the specific patch operated on. Logan On 08/11/16 12:01 PM, Jens Axboe wrote: > On 11/08/2016 11:59 AM, Logan Gunthorpe wrote: >

Re: BUG: Hung task timeouts in for-4.10/dio

2016-11-08 Thread Logan Gunthorpe
Hey, On 08/11/16 12:19 PM, Jens Axboe wrote: > Can you try and boot for-4.10/block instead? Yup. I'm seeing the same issue with that branch too. (b57d74a) Thanks, Logan

Re: BUG: Hung task timeouts in for-4.10/dio

2016-11-09 Thread Logan Gunthorpe
Hey, I just tested with the latest for-4.10/block branch and it looks like it fixed our problem. Thanks! Logan On 08/11/16 07:55 PM, Damien Le Moal wrote: > > Jens, > > On 11/9/16 11:45, Jens Axboe wrote: >> I just committed the work-around. But yes, let's have a logical revert >> and

[PATCH 3/3] block: order /proc/devices by major number

2017-06-16 Thread Logan Gunthorpe
each major number in the correct order, regardless of where they are stored in the hash table. In order to do this, we introduce BLKDEV_MAJOR_MAX as an artificial limit (chosen to be 512). It will then print all devices in major order number from 0 to the maximum. Signed-off-by: Logan Gunthorpe
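The ordering idea in this snippet can be sketched in userspace C. This is only an illustration under stated assumptions, not the patch itself: the bucket count HASH_SIZE, the struct layout, and the helper names add_major() and list_in_major_order() are hypothetical, and MAJOR_MAX simply mirrors the 512 limit mentioned above. The point is that iterating over major numbers, rather than over hash buckets, gives sorted output even when several majors share a bucket.

```c
#include <stddef.h>

#define HASH_SIZE 255   /* assumed bucket count, for illustration only */
#define MAJOR_MAX 512   /* mirrors the artificial limit described above */

/* Hypothetical stand-in for an entry in the major-number hash table. */
struct major_name {
    struct major_name *next;
    unsigned int major;
    const char *name;
};

static struct major_name *table[HASH_SIZE];

/* Insert an entry at the head of its hash bucket. */
static void add_major(struct major_name *m)
{
    unsigned int idx = m->major % HASH_SIZE;

    m->next = table[idx];
    table[idx] = m;
}

/*
 * Walk major numbers 0..MAJOR_MAX-1 in order and, for each one, scan only
 * the bucket it hashes to. Stores matching majors in out[] (up to cap)
 * and returns how many were stored.
 */
static size_t list_in_major_order(unsigned int *out, size_t cap)
{
    size_t n = 0;

    for (unsigned int major = 0; major < MAJOR_MAX; major++) {
        struct major_name *m;

        for (m = table[major % HASH_SIZE]; m; m = m->next) {
            if (m->major == major && n < cap)
                out[n++] = m->major;
        }
    }
    return n;
}
```

In this sketch, majors 4 and 259 land in the same bucket but still come out in numeric order, because the outer loop is driven by the major number and not by the bucket index.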

[RFC PATCH 06/16] scatterlist: convert page_link to pfn_t

2017-05-24 Thread Logan Gunthorpe
effect is that, on 32-bit systems, the sgl entry will be 32 bits larger, since pfn_t is always 64 bits and the unsigned long it replaced was 32 bits. However, it should still fit the same SG_CHUNK_SIZE entries into a single page so this probably isn't a huge issue. Signed-off-by: Logan Gunthorpe <

[RFC PATCH 09/16] bvec: introduce bvec_page and bvec_set_page accessors

2017-05-24 Thread Logan Gunthorpe
Introduce two accessor functions for bv_page: bvec_page to return the page and bvec_set_page. A follow on patch will mechanically convert all the individual uses within the kernel. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> Signed-off-by: Stephen Bates <sba...@raithlin.com>
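As a rough sketch of what such a pair of accessors might look like (the struct definitions here are simplified userspace stand-ins, not the kernel's, and the exact signatures in the patch are an assumption):

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel types, for illustration only. */
struct page { int id; };

struct bio_vec {
    struct page *bv_page;
    unsigned int bv_len;
    unsigned int bv_offset;
};

/* Return the page backing this bio_vec. */
static inline struct page *bvec_page(const struct bio_vec *bvec)
{
    return bvec->bv_page;
}

/* Set the page backing this bio_vec. */
static inline void bvec_set_page(struct bio_vec *bvec, struct page *page)
{
    bvec->bv_page = page;
}
```

Routing every access through helpers like these is what makes a later change of the underlying field (for example to a pfn-based representation) a mechanical conversion.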

[RFC PATCH 15/16] dma-mapping: introduce and use unmappable safe sg_virt call

2017-05-24 Thread Logan Gunthorpe
-by: Logan Gunthorpe <log...@deltatee.com> Signed-off-by: Stephen Bates <sba...@raithlin.com> --- include/linux/dma-mapping.h | 9 +++-- include/linux/scatterlist.h | 16 2 files changed, 23 insertions(+), 2 deletions(-) diff --git a/include/linux/dma-mapping

[RFC PATCH 07/16] scatterlist: support unmappable memory in the scatterlist

2017-05-24 Thread Logan Gunthorpe
to io memory, memory that is unmappable, or memory that doesn't have struct page backings. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> Signed-off-by: Stephen Bates <sba...@raithlin.com> --- include/linux/pfn_t.h | 37 - include/linux/s

[RFC PATCH 00/16] Unmappable memory in SGLs for p2p transfers

2017-05-24 Thread Logan Gunthorpe
. This series is based on v4.12-rc2 and a git tree is available here: https://github.com/sbates130272/linux-p2pmem.git io_pfn_t Thanks for your time, Logan [1] https://lkml.org/lkml/2017/4/25/738 Logan Gunthorpe (16): dmaengine: ste_dma40, imx-dma: Cleanup scatterlist layering violations staging

[RFC PATCH 05/16] tile: provide default ioremap declaration

2017-05-24 Thread Logan Gunthorpe
Add a default ioremap function which was not provided in all circumstances. (Only when CONFIG_PCI and CONFIG_TILEGX were set). I have designs to use them in scatterlist.c where they'd likely never be called with this architecture, but it is needed to compile. Signed-off-by: Logan Gunthorpe <

[RFC PATCH 04/16] um: add dummy ioremap and iounmap functions

2017-05-24 Thread Logan Gunthorpe
need to compile. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> Signed-off-by: Stephen Bates <sba...@raithlin.com> --- arch/um/include/asm/io.h | 17 + 1 file changed, 17 insertions(+) create mode 100644 arch/um/include/asm/io.h diff --git a/arch/um/include/asm/io

[RFC PATCH 14/16] block: bio: go straight from pfn_t to phys instead of through page

2017-05-24 Thread Logan Gunthorpe
Going straight from pfn_t to physical address is cheaper and avoids the potential BUG_ON in bvec_page for unmappable memory. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> Signed-off-by: Stephen Bates <sba...@raithlin.com> --- include/linux/bio.h | 7 +-- 1 file changed,

[RFC PATCH 03/16] kfifo: Cleanup example to not use page_link

2017-05-24 Thread Logan Gunthorpe
This is a layering violation so we replace the uses with calls to sg_page(). This is a prep patch for replacing page_link and this is one of the very few uses outside of scatterlist.h. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> Signed-off-by: Stephen Bates <sba...@rai

[RFC PATCH 02/16] staging: ccree: Cleanup: remove references to page_link

2017-05-24 Thread Logan Gunthorpe
This is a layering violation so we replace it with calls to sg_page. This is a prep patch for replacing page_link and this is one of the very few uses outside of scatterlist.h. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> Signed-off-by: Stephen Bates <sba...@raithlin.com>

[RFC PATCH 12/16] bvec: use sg_set_pfn when mapping a bio to an sgl

2017-05-24 Thread Logan Gunthorpe
len; expression offset; @@ -sg_set_page(sg, bvec_page(&bv), len, offset); +sg_set_pfn(sg, bv.bv_pfn, len, offset); @@ expression sg; expression bv; expression len; expression offset; @@ -sg_set_page(sg, bvec_page(bv), len, offset); +sg_set_pfn(sg, bv->bv_pfn, len, offset); Signed-off-by: Lo

[RFC PATCH 16/16] nvmet: use unmappable sgl in rdma target

2017-05-24 Thread Logan Gunthorpe
/ Signed-off-by: Logan Gunthorpe <log...@deltatee.com> Signed-off-by: Stephen Bates <sba...@raithlin.com> --- drivers/nvme/host/pci.c | 3 ++- drivers/nvme/target/Kconfig | 12 drivers/nvme/target/io-cmd.c | 2 +- drivers/nvme/target/r

[RFC PATCH 01/16] dmaengine: ste_dma40, imx-dma: Cleanup scatterlist layering violations

2017-05-24 Thread Logan Gunthorpe
Two dma engine drivers directly access page_link assuming knowledge that should be contained only in scatterlist.h. We replace these with calls to sg_chain and sg_assign_page. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> Signed-off-by: Stephen Bates <sba...@raithlin.com>

[RFC PATCH 11/16] bvec: convert to using pfn_t internally

2017-05-24 Thread Logan Gunthorpe
is actually a struct. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> Signed-off-by: Stephen Bates <sba...@raithlin.com> --- block/blk-integrity.c | 4 ++-- block/blk-merge.c | 6 +++--- include/linux/bvec.h | 13 + 3 files changed, 14 insertions(+), 9 deletions(-)

[RFC PATCH 08/16] scatterlist: add iomem support to sg_miter and sg_copy_*

2017-05-24 Thread Logan Gunthorpe
to support IO memory by simply calling the appropriate memcpy when required. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> Signed-off-by: Stephen Bates <sba...@raithlin.com> --- include/linux/scatterlist.h | 3 +++ lib/scatterlist.c | 35 +---

Re: [PATCH v4 00/14] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-05-04 Thread Logan Gunthorpe
On 04/05/18 08:27 AM, Christian König wrote: > Are you sure that this is more convenient? At least on first glance it > feels overly complicated. > > I mean what's the difference between the two approaches? > >     sum = pci_p2pdma_distance(target, [A, B, C, target]); > > and > >     sum

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-08 Thread Logan Gunthorpe
On 08/05/18 05:11 PM, Alex Williamson wrote: > On to the implementation details... I already mentioned the BDF issue > in my other reply. If we had a way to persistently identify a device, > would we specify the downstream points at which we want to disable ACS > or the endpoints that we want

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-08 Thread Logan Gunthorpe
On 08/05/18 10:50 AM, Christian König wrote: > E.g. transactions are initially sent to the root complex for > translation, that's for sure. But at least for AMD GPUs the root complex > answers with the translated address which is then cached in the device. > > So further transactions for the

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-08 Thread Logan Gunthorpe
On 08/05/18 04:03 PM, Alex Williamson wrote: > If IOMMU grouping implies device assignment (because nobody else uses > it to the same extent as device assignment) then the build-time option > falls to pieces, we need a single kernel that can do both. I think we > need to get more clever about

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-08 Thread Logan Gunthorpe
On 08/05/18 02:13 PM, Alex Williamson wrote: > Well, I'm a bit confused, this patch series is specifically disabling > ACS on switches, but per the spec downstream switch ports implementing > ACS MUST implement direct translated P2P. So it seems the only > potential gap here is the endpoint,

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-08 Thread Logan Gunthorpe
On 08/05/18 05:00 PM, Dan Williams wrote: >> I'd advise caution with a user supplied BDF approach, we have no >> guaranteed persistence for a device's PCI address. Adding a device >> might renumber the buses, replacing a device with one that consumes >> more/less bus numbers can renumber the

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-08 Thread Logan Gunthorpe
On 08/05/18 02:43 PM, Alex Williamson wrote: > Yes, GPUs seem to be leading the pack in implementing ATS. So now the > dumb question, why not simply turn off the IOMMU and thus ACS? The > argument of using the IOMMU for security is rather diminished if we're > specifically enabling devices to

Re: [PATCH v4 00/14] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-05-08 Thread Logan Gunthorpe
On 08/05/18 10:57 AM, Alex Williamson wrote: > AIUI from previously questioning this, the change is hidden behind a > build-time config option and only custom kernels or distros optimized > for this sort of support would enable that build option. I'm more than > a little dubious though that

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-08 Thread Logan Gunthorpe
On 08/05/18 01:34 PM, Alex Williamson wrote: > They are not so unrelated, see the ACS Direct Translated P2P > capability, which in fact must be implemented by switch downstream > ports implementing ACS and works specifically with ATS. This appears to > be the way the PCI SIG would intend for

Re: [PATCH v4 00/14] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-05-07 Thread Logan Gunthorpe
> How do you envison merging this? There's a big chunk in drivers/pci, but > really no opportunity for conflicts there, and there's significant stuff in > block and nvme that I don't really want to merge. > > If Alex is OK with the ACS situation, I can ack the PCI parts and you could > merge it

Re: [PATCH v4 01/14] PCI/P2PDMA: Support peer-to-peer memory

2018-05-07 Thread Logan Gunthorpe
Thanks for the review. I'll apply all of these for the changes for next version of the set. >> +/* >> + * If a device is behind a switch, we try to find the upstream bridge >> + * port of the switch. This requires two calls to pci_upstream_bridge(): >> + * one for the upstream port on the switch,

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-09 Thread Logan Gunthorpe
On 09/05/18 07:40 AM, Christian König wrote: > The key takeaway is that when any device has ATS enabled you can't > disable ACS without breaking it (even if you unplug and replug it). I don't follow how you came to this conclusion... The ACS bits we'd be turning off are the ones that force

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-10 Thread Logan Gunthorpe
On 10/05/18 11:11 AM, Stephen Bates wrote: >> Not to me. In the p2pdma code we specifically program DMA engines with >> the PCI bus address. > > Ah yes of course. Brain fart on my part. We are not programming the P2PDMA > initiator with an IOVA but with the PCI bus address... > >> So

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-10 Thread Logan Gunthorpe
On 10/05/18 08:16 AM, Stephen Bates wrote: > Hi Christian > >> Why would a switch not identify that as a peer address? We use the PASID >>together with ATS to identify the address space which a transaction >>should use. > > I think you are conflating two types of TLPs here. If the

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-10 Thread Logan Gunthorpe
On 10/05/18 12:41 PM, Stephen Bates wrote: > Hi Jerome > >>Note on GPU we do would not rely on ATS for peer to peer. Some part >>of the GPU (DMA engines) do not necessarily support ATS. Yet those >>are the part likely to be use in peer to peer. > > OK this is good to know. I agree

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-11 Thread Logan Gunthorpe
On 5/11/2018 2:52 AM, Christian König wrote: This only works when the IOVA and the PCI bus addresses never overlap. I'm not sure how the IOVA allocation works but I don't think we guarantee that on Linux. I find this hard to believe. There's always the possibility that some part of the

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-11 Thread Logan Gunthorpe
On 5/11/2018 4:24 PM, Stephen Bates wrote: All  Alex (or anyone else) can you point to where IOVA addresses are generated? A case of RTFM perhaps (though a pointer to the code would still be appreciated). https://www.kernel.org/doc/Documentation/Intel-IOMMU.txt Some exceptions to IOVA

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-08 Thread Logan Gunthorpe
On 08/05/18 01:17 AM, Christian König wrote: > AMD APUs mandatory need the ACS flag set for the GPU integrated in the > CPU when IOMMU is enabled or otherwise you will break SVM. Well, given that the current set only disables ACS bits on bridges (previous versions were only on switches) this

Re: [PATCH v4 06/14] PCI/P2PDMA: Add P2P DMA driver writer's documentation

2018-05-22 Thread Logan Gunthorpe
Thanks for the review Randy! I'll make the changes for the next time we post the series. On 22/05/18 03:24 PM, Randy Dunlap wrote: >> +The first task an orchestrator driver must do is compile a list of >> +all client drivers that will be involved in a given transaction. For >> +example, the NVMe

Re: [PATCH v4 00/14] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-05-02 Thread Logan Gunthorpe
Hi Christian, On 5/2/2018 5:51 AM, Christian König wrote: it would be rather nice to have if you could separate out the functions to detect if peer2peer is possible between two devices. This would essentially be pci_p2pdma_distance() in the existing patchset. It returns the sum of the

Re: [PATCH v4 00/14] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-05-03 Thread Logan Gunthorpe
On 03/05/18 11:29 AM, Christian König wrote: > Ok, that is the point where I'm stuck. Why do we need that in one > function call in the PCIe subsystem? > > The problem at least with GPUs is that we seriously don't have that > information here, cause the PCI subsystem might not be aware of all

Re: [PATCH v4 00/14] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-05-03 Thread Logan Gunthorpe
On 03/05/18 03:05 AM, Christian König wrote: > Ok, I'm still missing the big picture here. First question is what is > the P2PDMA provider? Well there's some pretty good documentation in the patchset for this, but in short, a provider is a device that provides some kind of P2P resource (ie.

Re: [PATCH 09/12] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-01-05 Thread Logan Gunthorpe
On 05/01/18 11:11 AM, Keith Busch wrote: On Thu, Jan 04, 2018 at 12:01:34PM -0700, Logan Gunthorpe wrote: Register the CMB buffer as p2pmem and use the appropriate allocation functions to create and destroy the IO SQ. If the CMB supports WDS and RDS, publish it for use as p2p memory by other

Re: [PATCH 04/12] pci-p2p: Clear ACS P2P flags for all client devices

2018-01-05 Thread Logan Gunthorpe
On 04/01/18 08:33 PM, Alex Williamson wrote: That's exactly what IOMMU groups represent, the smallest set of devices which have DMA isolation from other devices. By poking this hole, the IOMMU group is invalid. We cannot turn off ACS only for a specific device, in order to enable p2p it

Re: [PATCH 09/12] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-01-05 Thread Logan Gunthorpe
On 05/01/18 08:30 AM, Marta Rybczynska wrote: @@ -429,10 +429,7 @@ static void __nvme_submit_cmd(struct nvme_queue *nvmeq, { u16 tail = nvmeq->sq_tail; - if (nvmeq->sq_cmds_io) - memcpy_toio(&nvmeq->sq_cmds_io[tail], cmd, sizeof(*cmd)); - else -

Re: [PATCH 09/12] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-01-05 Thread Logan Gunthorpe
On 05/01/18 12:01 PM, Keith Busch wrote: On Fri, Jan 05, 2018 at 11:19:28AM -0700, Logan Gunthorpe wrote: Although it is not explicitly stated anywhere, pci_alloc_p2pmem() should always be at least 4k aligned. This is because the gen_pool that implements it is created with PAGE_SHIFT for its

Re: [PATCH 06/12] IB/core: Add optional PCI P2P flag to rdma_rw_ctx_[init|destroy]()

2018-01-04 Thread Logan Gunthorpe
On 04/01/18 03:13 PM, Jason Gunthorpe wrote: On Thu, Jan 04, 2018 at 12:52:24PM -0700, Logan Gunthorpe wrote: We tried things like this in an earlier iteration[1] which assumed the SG was homogenous (all P2P or all regular memory). This required serious ugliness to try and ensure SGs were

Re: [PATCH 01/12] pci-p2p: Support peer to peer memory

2018-01-04 Thread Logan Gunthorpe
On 04/01/18 02:59 PM, Bjorn Helgaas wrote: On Thu, Jan 04, 2018 at 12:01:26PM -0700, Logan Gunthorpe wrote: Some PCI devices may have memory mapped in a BAR space that's intended for use in Peer-to-Peer transactions. In order to enable such transactions the memory must be registered

Re: [PATCH 04/12] pci-p2p: Clear ACS P2P flags for all client devices

2018-01-04 Thread Logan Gunthorpe
On 04/01/18 05:00 PM, Logan Gunthorpe wrote: > > > On 04/01/18 03:35 PM, Alex Williamson wrote: >> Yep, flipping these ACS bits invalidates any IOMMU groups that depend >> on the isolation of that downstream port and I suspect also any peers >> within the

Re: [PATCH 01/12] pci-p2p: Support peer to peer memory

2018-01-04 Thread Logan Gunthorpe
Thanks for the speedy review! On 04/01/18 02:40 PM, Bjorn Helgaas wrote: Run "git log --oneline drivers/pci" and follow the convention. I think it would make sense to add a new tag like "PCI/P2P", although "P2P" has historically also been used in the "PCI-to-PCI bridge" context, so maybe

Re: [PATCH 04/12] pci-p2p: Clear ACS P2P flags for all client devices

2018-01-04 Thread Logan Gunthorpe
On 04/01/18 03:35 PM, Alex Williamson wrote: Yep, flipping these ACS bits invalidates any IOMMU groups that depend on the isolation of that downstream port and I suspect also any peers within the same PCI slot of that port and their downstream devices. The entire sub-hierarchy grouping needs

Re: [PATCH 02/12] pci-p2p: Add sysfs group to display p2pmem stats

2018-01-04 Thread Logan Gunthorpe
On 04/01/18 02:50 PM, Bjorn Helgaas wrote: On Thu, Jan 04, 2018 at 12:01:27PM -0700, Logan Gunthorpe wrote: Attributes display the total amount of P2P memory, the ammount available and whether it is published or not. s/ammount/amount/ (also below) Will fix. I wonder if "p2pdma"

[PATCH 07/12] nvme-pci: clean up CMB initialization

2018-01-04 Thread Logan Gunthorpe
From: Christoph Hellwig Refactor the call to nvme_map_cmb, and change the conditions for probing for the CMB. First remove the version check as NVMe TPs always apply to earlier versions of the spec as well. Second check for the whole CMBSZ register for support of the CMB feature

Re: [PATCH 08/12] nvme-pci: clean up SMBSZ bit definitions

2018-01-04 Thread Logan Gunthorpe
On 04/01/18 12:01 PM, Logan Gunthorpe wrote: From: Christoph Hellwig <h...@lst.de> Define the bit positions instead of macros using the magic values, and move the expanded helpers to calculate the size and size unit into the implementation C file. Signed-off-by: Christoph Hell

[PATCH 05/12] block: Introduce PCI P2P flags for request and request queue

2018-01-04 Thread Logan Gunthorpe
flag set. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> --- block/blk-core.c | 3 +++ include/linux/blk_types.h | 18 +- include/linux/blkdev.h| 2 ++ 3 files changed, 22 insertions(+), 1 deletion(-) diff --git a/block/blk-core.c b/block/blk-core.c

[PATCH 02/12] pci-p2p: Add sysfs group to display p2pmem stats

2018-01-04 Thread Logan Gunthorpe
Attributes display the total amount of P2P memory, the ammount available and whether it is published or not. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> --- Documentation/ABI/testing/sysfs-bus-pci | 25 drivers/pci/p2p.c

Re: [PATCH 07/12] nvme-pci: clean up CMB initialization

2018-01-04 Thread Logan Gunthorpe
On 04/01/18 12:01 PM, Logan Gunthorpe wrote: From: Christoph Hellwig <h...@lst.de> Refactor the call to nvme_map_cmb, and change the conditions for probing for the CMB. First remove the version check as NVMe TPs always apply to earlier versions of the spec as well. Second

[PATCH 11/12] nvme-pci: Add a quirk for a pseudo CMB

2018-01-04 Thread Logan Gunthorpe
Introduce a quirk to use CMB-like memory on older devices that have an exposed BAR but do not advertise support for using CMBLOC and CMBSIZE. We'd like to use some of these older cards to test P2P memory. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> --- drivers/nvme/host/nvme.

[PATCH 01/12] pci-p2p: Support peer to peer memory

2018-01-04 Thread Logan Gunthorpe
supports P2P transfers is non-trivial. Additionally, the benefits of P2P transfers that go through the RC are limited to only reducing DRAM usage. This commit includes significant rework and feedback from Christoph Hellwig. Signed-off-by: Christoph Hellwig <h...@lst.de> Signed-off-by: Logan Gunthorp

[PATCH 12/12] nvmet: Optionally use PCI P2P memory

2018-01-04 Thread Logan Gunthorpe
hch: partial rewrite of the initial code] Signed-off-by: Christoph Hellwig <h...@lst.de> Signed-off-by: Logan Gunthorpe <log...@deltatee.com> --- drivers/nvme/target/configfs.c | 29 + drivers/nvme/target/core.c | 95 +- drivers/nvm

[PATCH 03/12] pci-p2p: Add PCI p2pmem dma mappings to adjust the bus offset

2018-01-04 Thread Logan Gunthorpe
The DMA address used when mapping PCI P2P memory must be the PCI bus address. Thus, introduce pci_p2pmem_[un]map_sg() to map the correct addresses when using P2P memory. For this, we assume that an SGL passed to these functions contains all p2p memory or no p2p memory. Signed-off-by: Logan
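To illustrate the bus-offset adjustment described here, a minimal userspace sketch follows. It assumes the offset is the difference between the CPU physical address of the BAR and its PCI bus address; the struct names and map_p2p_segments() are hypothetical and are not the pci_p2pmem_[un]map_sg() implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a P2P memory region (e.g. a BAR). */
struct p2p_region {
    uint64_t cpu_phys_start; /* where the CPU sees the region */
    uint64_t bus_start;      /* where peers on the PCI bus see it */
};

/* Hypothetical, simplified scatterlist segment. */
struct seg {
    uint64_t phys;     /* CPU physical address of the segment */
    uint32_t len;
    uint64_t dma_addr; /* address to program into the DMA engine */
};

/*
 * Translate each segment's CPU physical address into a PCI bus address
 * by subtracting the region's bus offset. Unsigned arithmetic keeps this
 * correct even when the bus address is above the CPU physical address.
 * As in the description above, all segments are assumed to be P2P memory.
 */
static void map_p2p_segments(const struct p2p_region *r,
                             struct seg *segs, size_t n)
{
    uint64_t bus_offset = r->cpu_phys_start - r->bus_start;

    for (size_t i = 0; i < n; i++)
        segs[i].dma_addr = segs[i].phys - bus_offset;
}
```

On platforms where the CPU physical and PCI bus addresses coincide, the offset is zero and the mapping is an identity.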

[PATCH 08/12] nvme-pci: clean up SMBSZ bit definitions

2018-01-04 Thread Logan Gunthorpe
From: Christoph Hellwig Define the bit positions instead of macros using the magic values, and move the expanded helpers to calculate the size and size unit into the implementation C file. Signed-off-by: Christoph Hellwig --- drivers/nvme/host/pci.c | 23

Re: [PATCH 06/12] IB/core: Add optional PCI P2P flag to rdma_rw_ctx_[init|destroy]()

2018-01-04 Thread Logan Gunthorpe
On 04/01/18 12:22 PM, Jason Gunthorpe wrote: This seems really clunky since we are going to want to do this same logic all over the place. I'd be much happier if dma_map_sg can tell the memory is P2P or not from the scatterlist or dir arguments and not require the callers to have this. We

[PATCH 06/12] IB/core: Add optional PCI P2P flag to rdma_rw_ctx_[init|destroy]()

2018-01-04 Thread Logan Gunthorpe
In order to use PCI P2P memory pci_p2pmem_[un]map_sg() functions must be called to map the correct DMA address. To do this, we add a flags variable and the RDMA_RW_CTX_FLAG_PCI_P2P flag. When the flag is specified use the appropriate map function. Signed-off-by: Logan Gunthorpe <

[PATCH 10/12] nvme-pci: Add support for P2P memory in requests

2018-01-04 Thread Logan Gunthorpe
-by: Logan Gunthorpe <log...@deltatee.com> --- drivers/nvme/host/core.c | 4 drivers/nvme/host/nvme.h | 1 + drivers/nvme/host/pci.c | 18 ++ 3 files changed, 19 insertions(+), 4 deletions(-) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c

[PATCH 09/12] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-01-04 Thread Logan Gunthorpe
Register the CMB buffer as p2pmem and use the appropriate allocation functions to create and destroy the IO SQ. If the CMB supports WDS and RDS, publish it for use as p2p memory by other devices. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> --- drivers/nvme/host/pci.

[PATCH 00/11] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-01-04 Thread Logan Gunthorpe
. This series is based off of Christoph's v3 series to revamp dev_pagemap. A git repo of the patches is available here[2]. Logan Christoph Hellwig (2): nvme-pci: clean up CMB initialization nvme-pci: clean up SMBSZ bit definitions Logan Gunthorpe (10): pci-p2p: Support peer to peer memory pci-p2p

[PATCH 04/12] pci-p2p: Clear ACS P2P flags for all client devices

2018-01-04 Thread Logan Gunthorpe
devices involved in transactions with the p2pmem. A count of the number of requests to disable the flags is maintained. When the count transitions from 1 to 0, the old flags are restored. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> --- drivers/pci/p2p.c
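The counting scheme described in this snippet can be sketched as follows. This is a userspace illustration under assumptions: struct acs_state, the P2P_FLAGS mask value, and both function names are hypothetical, and a plain integer stands in for the device's ACS control register.

```c
#include <stdint.h>

/* Hypothetical mask of the ACS bits that block peer-to-peer traffic. */
#define P2P_FLAGS 0x000cu

/* Hypothetical per-device state; ctrl stands in for the ACS control register. */
struct acs_state {
    uint16_t ctrl;              /* current flags */
    uint16_t saved_ctrl;        /* flags saved at the first disable request */
    unsigned int disable_count; /* outstanding disable requests */
};

/* Request that the P2P-blocking flags be cleared. */
static void acs_p2p_disable(struct acs_state *s)
{
    /* Only the 0 -> 1 transition saves and clears the flags. */
    if (s->disable_count++ == 0) {
        s->saved_ctrl = s->ctrl;
        s->ctrl &= (uint16_t)~P2P_FLAGS;
    }
}

/* Drop one disable request; restore the old flags on the 1 -> 0 transition. */
static void acs_p2p_enable(struct acs_state *s)
{
    if (s->disable_count == 0)
        return; /* unbalanced call: nothing to undo */

    if (--s->disable_count == 0)
        s->ctrl = s->saved_ctrl;
}
```

Counting requests this way lets several clients share one device without the first client to finish re-enabling the flags underneath the others.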

Re: [PATCH 06/12] IB/core: Add optional PCI P2P flag to rdma_rw_ctx_[init|destroy]()

2018-01-08 Thread Logan Gunthorpe
On 08/01/18 11:34 AM, Christoph Hellwig wrote: But P2P is _not_ a factor of the dma_ops implementation at all, it is something that happens behind the dma_map implementation. Think about what the dma mapping routines do: (a) translate from host address to bus addresses and (b) flush

Re: [PATCH 06/12] IB/core: Add optional PCI P2P flag to rdma_rw_ctx_[init|destroy]()

2018-01-08 Thread Logan Gunthorpe
On 08/01/18 11:57 AM, Christoph Hellwig wrote: It does, sort of - but in a different way than the normal DMA map ops. And only to work around the fact that we need to map our P2P space into struct pages. Without that we could just pass the bus address around, but the Linux stack and VM isn't

Re: [PATCH 06/12] IB/core: Add optional PCI P2P flag to rdma_rw_ctx_[init|destroy]()

2018-01-08 Thread Logan Gunthorpe
On 08/01/18 11:09 AM, Jason Gunthorpe wrote: It could, if we had a DMA op for p2p then the drivers that provide their own ops can implement it appropriately or not at all. I was thinking of doing something like this. I'll probably rough out a patch and send it along today or tomorrow. If

Re: [PATCH 1/2] char_dev: Fix off-by-one bugs in find_dynamic_major()

2018-02-06 Thread Logan Gunthorpe
undant if condition ("cd->major != i"), as it will never be true. Signed-off-by: Srivatsa S. Bhat <sriva...@csail.mit.edu> Reviewed-by: Logan Gunthorpe <log...@deltatee.com> Logan

Re: [PATCH 2/2] block, char_dev: Use correct format specifier for unsigned ints

2018-02-06 Thread Logan Gunthorpe
is greater than the maximum (511) ..." (and also fix off-by-one bugs in the error prints). While at it, also update the comment describing register_blkdev(). Signed-off-by: Srivatsa S. Bhat <sriva...@csail.mit.edu> Reviewed-by: Logan Gunthorpe <log...@deltatee.com>

[PATCH v3 00/11] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-03-12 Thread Logan Gunthorpe
new namespaces that are not supported by that memory will fail. Logan Gunthorpe (11): PCI/P2PDMA: Support peer-to-peer memory PCI/P2PDMA: Add sysfs group to display p2pmem stats PCI/P2PDMA: Add PCI p2pmem dma mappings to adjust the bus offset PCI/P2PDMA: Clear ACS P2P flags for all devices

[PATCH v3 10/11] nvme-pci: Add a quirk for a pseudo CMB

2018-03-12 Thread Logan Gunthorpe
Introduce a quirk to use CMB-like memory on older devices that have an exposed BAR but do not advertise support for using CMBLOC and CMBSIZE. We'd like to use some of these older cards to test P2P memory. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> Reviewed-by: Sagi Grimb

[PATCH v3 02/11] PCI/P2PDMA: Add sysfs group to display p2pmem stats

2018-03-12 Thread Logan Gunthorpe
Add a sysfs group to display statistics about P2P memory that is registered in each PCI device. Attributes in the group display the total amount of P2P memory, the amount available and whether it is published or not. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> --- Documentati

[PATCH v3 09/11] nvme-pci: Add support for P2P memory in requests

2018-03-12 Thread Logan Gunthorpe
-off-by: Logan Gunthorpe <log...@deltatee.com> Reviewed-by: Sagi Grimberg <s...@grimberg.me> --- drivers/nvme/host/core.c | 4 drivers/nvme/host/nvme.h | 1 + drivers/nvme/host/pci.c | 19 +++ 3 files changed, 20 insertions(+), 4 deletions(-) diff --git a/drive

[PATCH v3 04/11] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-03-12 Thread Logan Gunthorpe
-off-by: Logan Gunthorpe <log...@deltatee.com> --- drivers/pci/Kconfig| 9 + drivers/pci/p2pdma.c | 44 drivers/pci/pci.c | 6 ++ include/linux/pci-p2pdma.h | 5 + 4 files changed, 64 insertions(+) diff

[PATCH v3 06/11] block: Introduce PCI P2P flags for request and request queue

2018-03-12 Thread Logan Gunthorpe
flag set. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> Reviewed-by: Sagi Grimberg <s...@grimberg.me> --- block/blk-core.c | 3 +++ include/linux/blk_types.h | 18 +- include/linux/blkdev.h| 3 +++ 3 files changed, 23 insertions(+), 1 deletion(-)

[PATCH v3 07/11] IB/core: Ensure we map P2P memory correctly in rdma_rw_ctx_[init|destroy]()

2018-03-12 Thread Logan Gunthorpe
is P2P the entire SGL should be P2P. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> --- drivers/infiniband/core/rw.c | 13 +++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/core/rw.c b/drivers/infiniband/core/rw.c index c8963e

[PATCH v3 11/11] nvmet: Optionally use PCI P2P memory

2018-03-12 Thread Logan Gunthorpe
hch: partial rewrite of the initial code] Signed-off-by: Christoph Hellwig <h...@lst.de> Signed-off-by: Logan Gunthorpe <log...@deltatee.com> --- drivers/nvme/target/configfs.c | 67 ++ drivers/nvme/target/core.c | 106 -

[PATCH v3 03/11] PCI/P2PDMA: Add PCI p2pmem dma mappings to adjust the bus offset

2018-03-12 Thread Logan Gunthorpe
The DMA address used when mapping PCI P2P memory must be the PCI bus address. Thus, introduce pci_p2pmem_[un]map_sg() to map the correct addresses when using P2P memory. For this, we assume that an SGL passed to these functions contains all P2P memory or no P2P memory. Signed-off-by: Logan

[PATCH v3 08/11] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-03-12 Thread Logan Gunthorpe
Register the CMB buffer as p2pmem and use the appropriate allocation functions to create and destroy the IO SQ. If the CMB supports WDS and RDS, publish it for use as P2P memory by other devices. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> --- drivers/nvme/host/pci.

[PATCH v3 05/11] PCI/P2PDMA: Add P2P DMA driver writer's documentation

2018-03-12 Thread Logan Gunthorpe
converted to restructured text at this time. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> Cc: Jonathan Corbet <cor...@lwn.net> --- Documentation/PCI/index.rst | 14 Documentation/PCI/p2pdma.rst | 164 +++ Documentation/index.rst

[PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-12 Thread Logan Gunthorpe
and feedback from Christoph Hellwig. Signed-off-by: Christoph Hellwig <h...@lst.de> Signed-off-by: Logan Gunthorpe <log...@deltatee.com> --- drivers/pci/Kconfig| 16 ++ drivers/pci/Makefile | 1 + drivers/pci/p2pdma.c | 679 ++

[PATCH v2 09/10] nvme-pci: Add a quirk for a pseudo CMB

2018-02-28 Thread Logan Gunthorpe
Introduce a quirk to use CMB-like memory on older devices that have an exposed BAR but do not advertise support for using CMBLOC and CMBSIZE. We'd like to use some of these older cards to test P2P memory. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> --- drivers/nvme/host/nvme.

[PATCH v2 07/10] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-02-28 Thread Logan Gunthorpe
Register the CMB buffer as p2pmem and use the appropriate allocation functions to create and destroy the IO SQ. If the CMB supports WDS and RDS, publish it for use as p2p memory by other devices. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> --- drivers/nvme/host/pci.

[PATCH v2 08/10] nvme-pci: Add support for P2P memory in requests

2018-02-28 Thread Logan Gunthorpe
-by: Logan Gunthorpe <log...@deltatee.com> --- drivers/nvme/host/core.c | 4 drivers/nvme/host/nvme.h | 1 + drivers/nvme/host/pci.c | 19 +++ 3 files changed, 20 insertions(+), 4 deletions(-) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c

[PATCH v2 02/10] PCI/P2PDMA: Add sysfs group to display p2pmem stats

2018-02-28 Thread Logan Gunthorpe
Attributes display the total amount of P2P memory, the amount available and whether it is published or not. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> --- Documentation/ABI/testing/sysfs-bus-pci | 25 + drivers/pci/p2pdma.c

[PATCH v2 06/10] IB/core: Add optional PCI P2P flag to rdma_rw_ctx_[init|destroy]()

2018-02-28 Thread Logan Gunthorpe
In order to use PCI P2P memory, the pci_p2pmem_[un]map_sg() functions must be called to map the correct DMA address. To do this, we add a flags variable and the RDMA_RW_CTX_FLAG_PCI_P2P flag. When the flag is specified, use the appropriate map function. Signed-off-by: Logan Gunthorpe <

[PATCH v2 01/10] PCI/P2PDMA: Support peer to peer memory

2018-02-28 Thread Logan Gunthorpe
supports P2P transfers is non-trivial. Additionally, the benefits of P2P transfers that go through the RC are limited to reducing DRAM usage. This commit includes significant rework and feedback from Christoph Hellwig. Signed-off-by: Christoph Hellwig <h...@lst.de> Signed-off-by: Logan Gunthorp

[PATCH v2 05/10] block: Introduce PCI P2P flags for request and request queue

2018-02-28 Thread Logan Gunthorpe
flag set. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> --- block/blk-core.c | 3 +++ include/linux/blk_types.h | 18 +- include/linux/blkdev.h| 3 +++ 3 files changed, 23 insertions(+), 1 deletion(-) diff --git a/block/blk-core.c b/block/blk-core.c

[PATCH v2 00/10] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-02-28 Thread Logan Gunthorpe
will fail. Logan Gunthorpe (10): PCI/P2PDMA: Support peer to peer memory PCI/P2PDMA: Add sysfs group to display p2pmem stats PCI/P2PDMA: Add PCI p2pmem dma mappings to adjust the bus offset PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches block: Introduce PCI P2P flags

[PATCH v2 10/10] nvmet: Optionally use PCI P2P memory

2018-02-28 Thread Logan Gunthorpe
hch: partial rewrite of the initial code] Signed-off-by: Christoph Hellwig <h...@lst.de> Signed-off-by: Logan Gunthorpe <log...@deltatee.com> --- drivers/nvme/target/configfs.c | 29 + drivers/nvme/target/core.c | 95 +- drivers/nvm

[PATCH v2 03/10] PCI/P2PDMA: Add PCI p2pmem dma mappings to adjust the bus offset

2018-02-28 Thread Logan Gunthorpe
The DMA address used when mapping PCI P2P memory must be the PCI bus address. Thus, introduce pci_p2pmem_[un]map_sg() to map the correct addresses when using P2P memory. For this, we assume that an SGL passed to these functions contains all p2p memory or no p2p memory. Signed-off-by: Logan

[PATCH v2 04/10] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-02-28 Thread Logan Gunthorpe
. This effectively means that if CONFIG_PCI_P2PDMA is selected then all devices behind any switch will be in the same IOMMU group. Signed-off-by: Logan Gunthorpe <log...@deltatee.com> --- drivers/pci/Kconfig| 4 drivers/pci/p2pdma.c | 44 drive

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-13 Thread Logan Gunthorpe
On 12/03/18 09:28 PM, Sinan Kaya wrote: Maybe, dev parameter should also be struct pci_dev so that you can get rid of all to_pci_dev() calls in this code including find_parent_pci_dev() function. No, this was mentioned in v2. find_parent_pci_dev is necessary because the calling drivers

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-13 Thread Logan Gunthorpe
On 13/03/18 11:49 AM, Sinan Kaya wrote: And there's also the ACS problem which means if you want to use P2P on the root ports you'll have to disable ACS on the entire system. (Or preferably, the IOMMU groups need to get more sophisticated to allow for dynamic changes). Do you think you

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-13 Thread Logan Gunthorpe
On 13/03/18 01:53 PM, Sinan Kaya wrote: > I agree disabling globally would be bad. Somebody can always say I have > ten switches on my system. I want to do peer-to-peer on one switch only. Now, > this change weakened security for the other switches that I had no intention > with doing P2P. > >
