Enabling peer to peer device transactions for PCIe devices

2016-11-30 Thread Jason Gunthorpe
On Wed, Nov 30, 2016 at 12:45:58PM +0200, Haggai Eran wrote: > > That just forces applications to handle horrible unexpected > > failures. If this sort of thing is needed for correctness then OOM > > kill the offending process, don't corrupt its operation. > Yes, that sounds fine. Can we simply

Enabling peer to peer device transactions for PCIe devices

2017-01-05 Thread Jason Gunthorpe
On Thu, Jan 05, 2017 at 01:39:29PM -0500, Jerome Glisse wrote: > 1) peer-to-peer because of userspace specific API like NVidia GPU > direct (AMD is pushing its own similar API i just can't remember > marketing name). This does not happen through a vma, this happens > through

Enabling peer to peer device transactions for PCIe devices

2017-01-05 Thread Jason Gunthorpe
On Thu, Jan 05, 2017 at 06:23:52PM -0500, Jerome Glisse wrote: > > I still don't understand what you driving at - you've said in both > > cases a user VMA exists. > > In the former case no, there is no VMA directly but if you want one than > a device can provide one. But such VMA is useless as

Enabling peer to peer device transactions for PCIe devices

2017-01-05 Thread Jason Gunthorpe
On Thu, Jan 05, 2017 at 02:54:24PM -0500, Jerome Glisse wrote: > Mellanox and NVidia support peer to peer with what they market a > GPUDirect. It only works without IOMMU. It is probably not upstream : > > https://www.mail-archive.com/linux-rdma at vger.kernel.org/msg21402.html > > I thought it

Enabling peer to peer device transactions for PCIe devices

2017-01-05 Thread Jason Gunthorpe
On Thu, Jan 05, 2017 at 03:19:36PM -0500, Jerome Glisse wrote: > > Always having a VMA changes the discussion - the question is how to > > create a VMA that represents IO device memory, and how do DMA > > consumers extract the correct information from that VMA to pass to the > > kernel DMA API

Enabling peer to peer device transactions for PCIe devices

2017-01-06 Thread Jason Gunthorpe
On Fri, Jan 06, 2017 at 12:37:22PM -0500, Jerome Glisse wrote: > On Fri, Jan 06, 2017 at 11:56:30AM -0500, Serguei Sagalovitch wrote: > > On 2017-01-05 08:58 PM, Jerome Glisse wrote: > > > On Thu, Jan 05, 2017 at 05:30:34PM -0700, Jason Gunthorpe wrote: > > > > On T

Enabling peer to peer device transactions for PCIe devices

2016-11-23 Thread Jason Gunthorpe
On Wed, Nov 23, 2016 at 02:42:12PM -0800, Dan Williams wrote: > > The crucial part for this discussion is the ability to fence and block > > DMA for a specific range. This is the hardware capability that lets > > page migration happen: fence DMA, migrate page, update page > > table in HCA, unblock
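
The sequence quoted in this message (fence DMA, migrate the page, update the page table in the HCA, unblock) can be sketched as a toy userspace model in C. This is not driver code: `struct dev_mapping` and `migrate_mapping()` are hypothetical names invented for illustration, and a real device would perform the fence and the table update in hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one device mapping; not a real kernel structure. */
struct dev_mapping {
    bool dma_fenced;   /* true while device DMA to the range is blocked */
    uint64_t dma_addr; /* address the device page table currently points at */
};

/* Move the backing page of a mapping in the order named in the thread. */
static void migrate_mapping(struct dev_mapping *m, uint64_t new_addr)
{
    m->dma_fenced = true;   /* 1. fence and block DMA for the range */
    /* 2. the page contents would be copied to new_addr here */
    m->dma_addr = new_addr; /* 3. update the device page table */
    m->dma_fenced = false;  /* 4. unblock DMA */
}
```

The point of the ordering is that the device never sees the old address once the copy has started, which is only possible on hardware that can block DMA for a specific range.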

Enabling peer to peer device transactions for PCIe devices

2016-11-23 Thread Jason Gunthorpe
On Wed, Nov 23, 2016 at 10:13:03AM -0700, Logan Gunthorpe wrote: > an MR would be very tricky. The MR may be relied upon by another host > and the kernel would have to inform user-space the MR was invalid then > user-space would have to tell the remote application. As Bart says, it would be best

Enabling peer to peer device transactions for PCIe devices

2016-11-23 Thread Jason Gunthorpe
On Wed, Nov 23, 2016 at 02:14:40PM -0500, Serguei Sagalovitch wrote: > > On 2016-11-23 02:05 PM, Jason Gunthorpe wrote: > >As Bart says, it would be best to be combined with something like > >Mellanox's ODP MRs, which allows a page to be evicted and then trigger > >

Enabling peer to peer device transactions for PCIe devices

2016-11-23 Thread Jason Gunthorpe
On Wed, Nov 23, 2016 at 02:11:29PM -0700, Logan Gunthorpe wrote: > > As I said, there is no possible special handling. Standard IB hardware > > does not support changing the DMA address once a MR is created. Forget > > about doing that. > > Yeah, that's essentially the point I was trying to make.

Enabling peer to peer device transactions for PCIe devices

2016-11-23 Thread Jason Gunthorpe
On Wed, Nov 23, 2016 at 10:40:47AM -0800, Dan Williams wrote: > I don't think that was designed for the case where the backing memory > is a special/static physical address range rather than anonymous > "System RAM", right? The hardware doesn't care where the memory is. ODP is just a generic

Enabling peer to peer device transactions for PCIe devices

2016-11-23 Thread Jason Gunthorpe
On Wed, Nov 23, 2016 at 02:58:38PM -0500, Serguei Sagalovitch wrote: >We do not want to have "highly" dynamic translation due to >performance cost. We need to support "overcommit" but would >like to minimize impact. To support RDMA MRs for GPU/VRAM/PCIe >device memory (which is

Enabling peer to peer device transactions for PCIe devices

2016-11-24 Thread Jason Gunthorpe
On Thu, Nov 24, 2016 at 10:45:18AM +0100, Christian König wrote: > Am 24.11.2016 um 00:25 schrieb Jason Gunthorpe: > >There is certainly nothing about the hardware that cares > >about ZONE_DEVICE vs System memory. > Well that is clearly not so simple. When your ZONE_DEVICE page

Enabling peer to peer device transactions for PCIe devices

2016-11-24 Thread Jason Gunthorpe
On Wed, Nov 23, 2016 at 06:25:21PM -0700, Logan Gunthorpe wrote: > > > On 23/11/16 02:55 PM, Jason Gunthorpe wrote: > >>> Only ODP hardware allows changing the DMA address on the fly, and it > >>> works at the page table level. We do not need special handling for

Enabling peer to peer device transactions for PCIe devices

2016-11-24 Thread Jason Gunthorpe
On Thu, Nov 24, 2016 at 12:40:37AM +0000, Sagalovitch, Serguei wrote: > On Wed, Nov 23, 2016 at 02:11:29PM -0700, Logan Gunthorpe wrote: > > > Perhaps I am not following what Serguei is asking for, but I > > understood the desire was for a complex GPU allocator that could > > migrate pages

Enabling peer to peer device transactions for PCIe devices

2016-11-25 Thread Jason Gunthorpe
On Fri, Nov 25, 2016 at 02:22:17PM +0100, Christian König wrote: > >Like you say below we have to handle short lived in the usual way, and > >that covers basically every device except IB MRs, including the > >command queue on a NVMe drive. > > Well a problem which wasn't mentioned so far is

Enabling peer to peer device transactions for PCIe devices

2016-11-25 Thread Jason Gunthorpe
On Thu, Nov 24, 2016 at 11:58:17PM -0800, Christoph Hellwig wrote: > On Thu, Nov 24, 2016 at 11:11:34AM -0700, Logan Gunthorpe wrote: > > * Regular DAX in the FS doesn't work at this time because the FS can > > move the file you think your transfer to out from under you. Though I > > understand

Enabling peer to peer device transactions for PCIe devices

2016-11-25 Thread Jason Gunthorpe
On Fri, Nov 25, 2016 at 12:16:30PM -0500, Serguei Sagalovitch wrote: > b) Allocation may not have CPU address at all - only GPU one. But you don't expect RDMA to work in the case, right? GPU people need to stop doing this windowed memory stuff :) Jason

Enabling peer to peer device transactions for PCIe devices

2016-11-25 Thread Jason Gunthorpe
On Fri, Nov 25, 2016 at 09:40:10PM +0100, Christian König wrote: > We call this "userptr" and it's just a combination of get_user_pages() on > command submission and making sure the returned list of pages stays valid > using a MMU notifier. Doesn't that still pin the page? > The "big" problem

Enabling peer to peer device transactions for PCIe devices

2016-11-25 Thread Jason Gunthorpe
On Fri, Nov 25, 2016 at 02:49:50PM -0500, Serguei Sagalovitch wrote: > GPU could perfectly access all VRAM. It is only issue for p2p without > special interconnect and CPU access. Strictly speaking as long as we > have "bus address" we could have RDMA but I agreed that for > RDMA we

Enabling peer to peer device transactions for PCIe devices

2016-11-28 Thread Jason Gunthorpe
On Mon, Nov 28, 2016 at 06:19:40PM +0000, Haggai Eran wrote: > > > GPU memory. We create a non-ODP MR pointing to VRAM but rely on > > > user-space and the GPU not to migrate it. If they do, the MR gets > > > destroyed immediately. > > That sounds horrible. How can that possibly work? What if the

Enabling peer to peer device transactions for PCIe devices

2016-11-28 Thread Jason Gunthorpe
On Sun, Nov 27, 2016 at 04:02:16PM +0200, Haggai Eran wrote: > > Like in ODP, MMU notifiers/HMM are used to monitor for translation > > changes. If a change comes in the GPU driver checks if an executing > > command is touching those pages and blocks the MMU notifier until the > > command

Enabling peer to peer device transactions for PCIe devices

2016-11-28 Thread Jason Gunthorpe
On Mon, Nov 28, 2016 at 04:55:23PM -0500, Serguei Sagalovitch wrote: > >We haven't touch this in a long time and perhaps it changed, but there > >definitely was a call back in the PeerDirect API to allow the GPU to > >invalidate the mapping. That's what we don't want. > I assume that you are

Enabling peer to peer device transactions for PCIe devices

2016-12-05 Thread Jason Gunthorpe
On Mon, Dec 05, 2016 at 10:48:58AM -0800, Dan Williams wrote: > On Mon, Dec 5, 2016 at 10:39 AM, Logan Gunthorpe > wrote: > > On 05/12/16 11:08 AM, Dan Williams wrote: > >> > >> I've already recommended that iopmem not be a block device and instead > >> be a device-dax instance. I also don't

Enabling peer to peer device transactions for PCIe devices

2016-12-05 Thread Jason Gunthorpe
On Sun, Dec 04, 2016 at 07:23:00AM -0600, Stephen Bates wrote: > Hi All > > This has been a great thread (thanks to Alex for kicking it off) and I > wanted to jump in and maybe try and put some summary around the > discussion. I also wanted to propose we include this as a topic for LFS/MM >

Enabling peer to peer device transactions for PCIe devices

2016-12-05 Thread Jason Gunthorpe
On Mon, Dec 05, 2016 at 12:27:20PM -0700, Logan Gunthorpe wrote: > > > On 05/12/16 12:14 PM, Jason Gunthorpe wrote: > >But CMB sounds much more like the GPU case where there is a > >specialized allocator handing out the BAR to consumers, so I'm not > >sure a general

Enabling peer to peer device transactions for PCIe devices

2016-12-05 Thread Jason Gunthorpe
On Mon, Dec 05, 2016 at 09:40:38AM -0800, Dan Williams wrote: > > If it is kernel only with physical addresses we don't need a uAPI for > > it, so I'm not sure #1 is at all related to iopmem. > > > > Most people who want #1 probably can just mmap > > /sys/../pci/../resourceX to get a user handle

Enabling peer to peer device transactions for PCIe devices

2016-12-06 Thread Jason Gunthorpe
On Tue, Dec 06, 2016 at 09:51:15AM -0700, Logan Gunthorpe wrote: > Hey, > > On 06/12/16 09:38 AM, Jason Gunthorpe wrote: > >>> I'm not opposed to mapping /dev/nvmeX. However, the lookup is trivial > >>> to accomplish in sysfs through /sys/dev/char to find the s

Enabling peer to peer device transactions for PCIe devices

2016-12-06 Thread Jason Gunthorpe
> > I'm not opposed to mapping /dev/nvmeX. However, the lookup is trivial > > to accomplish in sysfs through /sys/dev/char to find the sysfs path of the > > device-dax instance under the nvme device, or if you already have the nvme > > sysfs path the dax instance(s) will appear under the "dax"

Re: Enabling peer to peer device transactions for PCIe devices

2017-01-12 Thread Jason Gunthorpe
On Thu, Jan 12, 2017 at 10:11:29AM -0500, Jerome Glisse wrote: > On Wed, Jan 11, 2017 at 10:54:39PM -0600, Stephen Bates wrote: > > > What we want is for RDMA, O_DIRECT, etc to just work with special VMAs > > > (ie. at least those backed with ZONE_DEVICE memory). Then > > > GPU/NVME/DAX/whatever

Re: [PATCH v2 01/21] scatterlist: Introduce sg_map helper functions

2017-04-27 Thread Jason Gunthorpe
On Thu, Apr 27, 2017 at 08:53:38AM +0200, Christoph Hellwig wrote: > > The main difficulty we > > have now is that neither of those functions are expected to fail and we > > need them to be able to in cases where the page doesn't map to system > > RAM. This patch series is trying to address it

Re: [PATCH v2 15/21] xen-blkfront: Make use of the new sg_map helper function

2017-04-27 Thread Jason Gunthorpe
On Thu, Apr 27, 2017 at 03:53:37PM -0600, Logan Gunthorpe wrote: > On 27/04/17 02:53 PM, Jason Gunthorpe wrote: > > blkfront is one of the drivers I looked at, and it appears to only be > > memcpying with the bvec_data pointer, so I wonder why it does not use > > sg_

Re: [PATCH v2 15/21] xen-blkfront: Make use of the new sg_map helper function

2017-04-27 Thread Jason Gunthorpe
On Thu, Apr 27, 2017 at 02:19:24PM -0600, Logan Gunthorpe wrote: > > > On 26/04/17 01:37 AM, Roger Pau Monné wrote: > > On Tue, Apr 25, 2017 at 12:21:02PM -0600, Logan Gunthorpe wrote: > >> Straightforward conversion to the new helper, except due to the lack > >> of error path, we have to use

Re: [PATCH v2 15/21] xen-blkfront: Make use of the new sg_map helper function

2017-04-27 Thread Jason Gunthorpe
On Thu, Apr 27, 2017 at 05:03:45PM -0600, Logan Gunthorpe wrote: > > > On 27/04/17 04:11 PM, Jason Gunthorpe wrote: > > On Thu, Apr 27, 2017 at 03:53:37PM -0600, Logan Gunthorpe wrote: > > Well, that is in the current form, with more users it would make sense > > to o

Re: [PATCH rdma-next 01/21] drm/i915: Move u64-to-ptr helpers to general header

2018-05-15 Thread Jason Gunthorpe
On Thu, May 03, 2018 at 04:36:55PM +0300, Leon Romanovsky wrote: > From: Leon Romanovsky > > The macro u64_to_ptr() and function ptr_to_u64() are useful enough > to be part of general header, so move them there and allow RDMA > subsystem reuse them. > > Signed-off-by: Leon

Re: [PATCH v2 05/17] compat_ioctl: move more drivers to generic_compat_ioctl_ptrarg

2018-09-25 Thread Jason Gunthorpe
On Mon, Sep 24, 2018 at 10:18:52PM +0200, Arnd Bergmann wrote: > On Tue, Sep 18, 2018 at 7:59 PM Jason Gunthorpe wrote: > > > > On Tue, Sep 18, 2018 at 10:51:08AM -0700, Darren Hart wrote: > > > On Fri, Sep 14, 2018 at 09:57:48PM +0100, Al Viro wrote: > > > >

Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-17 Thread Jason Gunthorpe
On Wed, Jan 16, 2019 at 05:11:34PM +0100, h...@lst.de wrote: > On Tue, Jan 15, 2019 at 02:25:01PM -0700, Jason Gunthorpe wrote: > > RDMA needs something similar as well, in this case drivers take a > > struct page * from get_user_pages() and need to have the DMA map fail > >

Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-14 Thread Jason Gunthorpe
On Sat, Jan 12, 2019 at 01:03:05PM -0600, Shiraz Saleem wrote: > On Sat, Jan 12, 2019 at 06:37:58PM +0000, Jason Gunthorpe wrote: > > On Sat, Jan 12, 2019 at 12:27:05PM -0600, Shiraz Saleem wrote: > > > On Fri, Jan 04, 2019 at 10:35:43PM +0000, Jason Gunthorpe wrote: > >

Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-16 Thread Jason Gunthorpe
On Tue, Jan 15, 2019 at 02:17:26PM +0000, Thomas Hellstrom wrote: > Hi, Christoph, > > On Mon, 2019-01-14 at 10:48 +0100, Christoph Hellwig wrote: > > On Thu, Jan 10, 2019 at 04:42:18PM -0700, Jason Gunthorpe wrote: > > > > Changes since the RFC: > > > >

Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-18 Thread Jason Gunthorpe
On Thu, Jan 17, 2019 at 10:30:01AM +0100, h...@lst.de wrote: > On Wed, Jan 16, 2019 at 10:24:36AM -0700, Jason Gunthorpe wrote: > > The fact is there is 0 industry interest in using RDMA on platforms > > that can't do HW DMA cache coherency - the kernel syscalls required to &

Re: [PATCH v2 1/3] mm/mmu_notifier: use structure for invalidate_range_start/end callback

2018-12-07 Thread Jason Gunthorpe
21 ++-- > virt/kvm/kvm_main.c | 14 +++- > 12 files changed, 102 insertions(+), 113 deletions(-) The changes to drivers/infiniband look mechanical and fine to me. It even looks like this avoids merge conflicts with the other changes

[PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-07 Thread Jason Gunthorpe
gly mixing accessors and iterators. Signed-off-by: Jason Gunthorpe --- drivers/gpu/drm/vmwgfx/vmwgfx_drv.h| 26 ++-- drivers/gpu/drm/vmwgfx/vmwgfx_mob.c| 26 +++- drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c | 42 +-- drivers/media/pci/intel/ipu3/i

Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-07 Thread Jason Gunthorpe
please drop us a note to > help improve the system] > > url: > https://github.com/0day-ci/linux/commits/Jason-Gunthorpe/lib-scatterlist-Provide-a-DMA-page-iterator/20190105-081739 > config: x86_64-randconfig-x017-201900 (attached as .config) > compiler: gcc-7 (Debian 7.3.0

Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-11 Thread Jason Gunthorpe
On Fri, Jan 04, 2019 at 03:35:31PM -0700, Jason Gunthorpe wrote: > Commit 2db76d7c3c6d ("lib/scatterlist: sg_page_iter: support sg lists w/o > backing pages") introduced the sg_page_iter_dma_address() function without > providing a way to use it in the general case

Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-13 Thread Jason Gunthorpe
On Sat, Jan 12, 2019 at 12:27:05PM -0600, Shiraz Saleem wrote: > On Fri, Jan 04, 2019 at 10:35:43PM +0000, Jason Gunthorpe wrote: > > Commit 2db76d7c3c6d ("lib/scatterlist: sg_page_iter: support sg lists w/o > > backing pages") introduced the sg_page_iter_dma_a

Re: [PATCH v2 05/17] compat_ioctl: move more drivers to generic_compat_ioctl_ptrarg

2018-09-13 Thread Jason Gunthorpe
const struct file_operations uverbs_mmap_fops = { > .release = ib_uverbs_close, > .llseek = no_llseek, > .unlocked_ioctl = ib_uverbs_ioctl, > - .compat_ioctl = ib_uverbs_ioctl, > + .compat_ioctl = generic_compat_ioctl_ptrarg, > }; > > static struct ib_cl

Re: [PATCH v2 05/17] compat_ioctl: move more drivers to generic_compat_ioctl_ptrarg

2018-09-19 Thread Jason Gunthorpe
On Tue, Sep 18, 2018 at 10:51:08AM -0700, Darren Hart wrote: > On Fri, Sep 14, 2018 at 09:57:48PM +0100, Al Viro wrote: > > On Fri, Sep 14, 2018 at 01:35:06PM -0700, Darren Hart wrote: > > > > > Acked-by: Darren Hart (VMware) > > > > > > As for a longer term solution, would it be possible to

[PATCH v2] lib/scatterlist: Provide a DMA page iterator

2019-02-08 Thread Jason Gunthorpe
st wrongly mixing accessors and iterators. Acked-by: Christoph Hellwig (for scatterlist) Signed-off-by: Jason Gunthorpe --- .clang-format | 1 + drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c | 8 +++- drivers/media/pci/intel/ipu3/ipu3-cio2.c | 4 +- include/linux/sca

Re: [PATCH v2] lib/scatterlist: Provide a DMA page iterator

2019-02-09 Thread Jason Gunthorpe
On Thu, Feb 07, 2019 at 03:26:47PM -0700, Jason Gunthorpe wrote: > diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c > b/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c > index 31786b200afc47..e84f6aaee778f0 100644 > +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c > @@ -311,7 +3

Re: [PATCH V2 3/7] mm/gup: Change GUP fast to use flags rather than a write 'bool'

2019-02-14 Thread Jason Gunthorpe
On Wed, Feb 13, 2019 at 03:04:51PM -0800, ira.we...@intel.com wrote: > From: Ira Weiny > > To facilitate additional options to get_user_pages_fast() change the > singular write parameter to be gup_flags. So now we have: long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
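
The change described in this patch, replacing a single write 'bool' with a flags word, can be illustrated with a small userspace sketch in C. The flag names and values below are placeholders invented for this example (the kernel's own flags are the `FOLL_*` family, such as `FOLL_WRITE`), and neither helper is a kernel function.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder flag bits for this sketch only; not the kernel's FOLL_* values. */
#define SKETCH_GUP_WRITE    (1u << 0)
#define SKETCH_GUP_LONGTERM (1u << 1)

/* Old style: a bool parameter can only express "write" or "not write". */
static unsigned int flags_from_write_bool(bool write)
{
    return write ? SKETCH_GUP_WRITE : 0u;
}

/* New style: the callee tests the bit it cares about and ignores the rest. */
static bool wants_write(unsigned int gup_flags)
{
    return (gup_flags & SKETCH_GUP_WRITE) != 0;
}
```

Existing callers can be converted mechanically with a mapping like `flags_from_write_bool()`, while new callers are free to OR in further options without another signature change.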

Re: [PATCH v2] lib/scatterlist: Provide a DMA page iterator

2019-02-12 Thread Jason Gunthorpe
On Thu, Feb 07, 2019 at 10:26:52PM +0000, Jason Gunthorpe wrote: > Commit 2db76d7c3c6d ("lib/scatterlist: sg_page_iter: support sg lists w/o > backing pages") introduced the sg_page_iter_dma_address() function without > providing a way to use it in the general case

Re: [PATCH v5 0/9] mmu notifier provide context informations

2019-02-20 Thread Jason Gunthorpe
On Tue, Feb 19, 2019 at 03:30:33PM -0500, Jerome Glisse wrote: > On Tue, Feb 19, 2019 at 12:15:55PM -0800, Dan Williams wrote: > > On Tue, Feb 19, 2019 at 12:04 PM wrote: > > > > > > From: Jérôme Glisse > > > > > > Since last version [4] i added the extra bits needed for the change_pte > > >

Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

2019-01-30 Thread Jason Gunthorpe
On Tue, Jan 29, 2019 at 06:17:43PM -0700, Logan Gunthorpe wrote: > This isn't answering my question at all... I specifically asked what is > backing the VMA when we are *not* using HMM. At least for RDMA what backs the VMA today is non-struct-page BAR memory filled in with io_remap_pfn. And we

Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

2019-01-30 Thread Jason Gunthorpe
On Tue, Jan 29, 2019 at 02:50:55PM -0500, Jerome Glisse wrote: > GPU driver do want more control :) GPU driver are moving things around > all the time and they have more memory than bar space (on newer platform > AMD GPU do resize the bar but it is not the rule for all GPUs). So > GPU driver do

Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

2019-01-30 Thread Jason Gunthorpe
On Tue, Jan 29, 2019 at 02:11:23PM -0500, Jerome Glisse wrote: > On Tue, Jan 29, 2019 at 11:36:29AM -0700, Logan Gunthorpe wrote: > > > > > > On 2019-01-29 10:47 a.m., jgli...@redhat.com wrote: > > > > > + /* > > > + * Optional for device driver that want to allow peer to peer (p2p) > > > + *

Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

2019-01-30 Thread Jason Gunthorpe
On Tue, Jan 29, 2019 at 03:44:00PM -0500, Jerome Glisse wrote: > > But this API doesn't seem to offer any control - I thought that > > control was all coming from the mm/hmm notifiers triggering p2p_unmaps? > > The control is within the driver implementation of those callbacks. Seems like what

Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

2019-01-30 Thread Jason Gunthorpe
On Tue, Jan 29, 2019 at 07:08:06PM -0500, Jerome Glisse wrote: > On Tue, Jan 29, 2019 at 11:02:25PM +0000, Jason Gunthorpe wrote: > > On Tue, Jan 29, 2019 at 03:44:00PM -0500, Jerome Glisse wrote: > > > > > > But this API doesn't seem to offer any control - I thought t

Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

2019-01-30 Thread Jason Gunthorpe
On Tue, Jan 29, 2019 at 01:39:49PM -0700, Logan Gunthorpe wrote: > implement the mapping. And I don't think we should have 'special' vma's > for this (though we may need something to ensure we don't get mapping > requests mixed with different types of pages...). I think Jerome explained the

Re: [PATCH v4 9/9] RDMA/umem_odp: optimize out the case when a range is updated to read only

2019-01-24 Thread Jason Gunthorpe
off-by: Jérôme Glisse > Cc: Christian König > Cc: Jan Kara > Cc: Felix Kuehling > Cc: Jason Gunthorpe > Cc: Andrew Morton > Cc: Matthew Wilcox > Cc: Ross Zwisler > Cc: Dan Williams > Cc: Paolo Bonzini > Cc: Radim Krčmář > Cc: Michal Hocko > Cc: Ralph

Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

2019-02-01 Thread Jason Gunthorpe
On Thu, Jan 31, 2019 at 02:35:14PM -0500, Jerome Glisse wrote: > > Basically invert the API flow - the DMA map would be done close to > > GUP, not buried in the driver. This absolutely doesn't work for every > > flow we have, but it does enable the ones that people seem to care > > about when

Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

2019-02-01 Thread Jason Gunthorpe
On Thu, Jan 31, 2019 at 09:13:55AM +0100, Christoph Hellwig wrote: > On Wed, Jan 30, 2019 at 03:52:13PM -0700, Logan Gunthorpe wrote: > > > *shrug* so what if the special GUP called a VMA op instead of > > > traversing the VMA PTEs today? Why does it really matter? It could > > > easily change to

Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

2019-02-01 Thread Jason Gunthorpe
On Thu, Jan 31, 2019 at 12:19:31PM -0700, Logan Gunthorpe wrote: > > > On 2019-01-31 12:02 p.m., Jason Gunthorpe wrote: > > I still think the right direction is to build on what Logan has done - > > realize that he created a DMA-only SGL - make that a formal type of >

Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

2019-01-31 Thread Jason Gunthorpe
On Wed, Jan 30, 2019 at 03:43:32PM -0500, Jerome Glisse wrote: > On Wed, Jan 30, 2019 at 08:11:19PM +0000, Jason Gunthorpe wrote: > > On Wed, Jan 30, 2019 at 01:00:02PM -0700, Logan Gunthorpe wrote: > > > > > We never changed SGLs. We still use them to pass p2pdma p

Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

2019-01-31 Thread Jason Gunthorpe
On Wed, Jan 30, 2019 at 12:45:46PM -0700, Logan Gunthorpe wrote: > > > On 2019-01-30 12:06 p.m., Jason Gunthorpe wrote: > >> Way less problems than not having struct page for doing anything > >> non-trivial. If you map the BAR to userspace with remap_pfn_range &

Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

2019-01-31 Thread Jason Gunthorpe
On Wed, Jan 30, 2019 at 03:52:13PM -0700, Logan Gunthorpe wrote: > > > On 2019-01-30 2:50 p.m., Jason Gunthorpe wrote: > > On Wed, Jan 30, 2019 at 02:01:35PM -0700, Logan Gunthorpe wrote: > > > >> And I feel the GUP->SGL->DMA flow should still be what we

Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

2019-01-31 Thread Jason Gunthorpe
On Wed, Jan 30, 2019 at 09:00:06AM +0100, Christoph Hellwig wrote: > On Wed, Jan 30, 2019 at 04:18:48AM +0000, Jason Gunthorpe wrote: > > Every attempt to give BAR memory to struct page has run into major > > trouble, IMHO, so I like that this approach avoids that. > > W

Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

2019-01-31 Thread Jason Gunthorpe
On Wed, Jan 30, 2019 at 05:47:05PM -0500, Jerome Glisse wrote: > On Wed, Jan 30, 2019 at 10:33:04PM +0000, Jason Gunthorpe wrote: > > On Wed, Jan 30, 2019 at 05:30:27PM -0500, Jerome Glisse wrote: > > > > > > What is the problem in the HMM mirror tha

Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

2019-01-31 Thread Jason Gunthorpe
On Wed, Jan 30, 2019 at 06:26:53PM +0100, Christoph Hellwig wrote: > On Wed, Jan 30, 2019 at 10:55:43AM -0500, Jerome Glisse wrote: > > Even outside GPU driver, device driver like RDMA just want to share their > > doorbell to other device and they do not want to see those doorbell page > > use in

Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

2019-01-31 Thread Jason Gunthorpe
On Wed, Jan 30, 2019 at 10:17:27AM -0700, Logan Gunthorpe wrote: > > > On 2019-01-29 9:18 p.m., Jason Gunthorpe wrote: > > Every attempt to give BAR memory to struct page has run into major > > trouble, IMHO, so I like that this approach avoids that. > > > >

Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

2019-01-31 Thread Jason Gunthorpe
On Wed, Jan 30, 2019 at 02:01:35PM -0700, Logan Gunthorpe wrote: > And I feel the GUP->SGL->DMA flow should still be what we are aiming > for. Even if we need a special GUP for special pages, and a special DMA > map; and the SGL still has to be homogenous *shrug* so what if the special GUP

Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

2019-01-31 Thread Jason Gunthorpe
On Wed, Jan 30, 2019 at 12:48:33PM -0700, Logan Gunthorpe wrote: > > > On 2019-01-30 12:19 p.m., Jason Gunthorpe wrote: > > On Wed, Jan 30, 2019 at 11:13:11AM -0700, Logan Gunthorpe wrote: > >> > >> > >> On 2019-01-30 10:44 a.m., Jason Gunthorpe w

Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

2019-01-31 Thread Jason Gunthorpe
On Wed, Jan 30, 2019 at 09:02:08AM +0100, Christoph Hellwig wrote: > On Tue, Jan 29, 2019 at 08:58:35PM +0000, Jason Gunthorpe wrote: > > On Tue, Jan 29, 2019 at 01:39:49PM -0700, Logan Gunthorpe wrote: > > > > > implement the mapping. And I don't think we sho

Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

2019-01-31 Thread Jason Gunthorpe
On Wed, Jan 30, 2019 at 02:22:34PM -0500, Jerome Glisse wrote: > For GPU it would not work, GPU might want to use main memory (because > it is running out of BAR space) it is a lot easier if the p2p_map > callback calls the right dma map function (for page or io) rather than > having to define

Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

2019-01-31 Thread Jason Gunthorpe
On Wed, Jan 30, 2019 at 04:45:25PM -0500, Jerome Glisse wrote: > On Wed, Jan 30, 2019 at 08:50:00PM +0000, Jason Gunthorpe wrote: > > On Wed, Jan 30, 2019 at 03:43:32PM -0500, Jerome Glisse wrote: > > > On Wed, Jan 30, 2019 at 08:11:19PM +0000, Jason Gunthorpe wrote: > >

Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

2019-01-31 Thread Jason Gunthorpe
On Wed, Jan 30, 2019 at 05:30:27PM -0500, Jerome Glisse wrote: > > What is the problem in the HMM mirror that it needs this restriction? > > No restriction at all here. I think i just wasn't understood. Are you talking about from the exporting side - where the thing creating the VMA can

Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

2019-01-31 Thread Jason Gunthorpe
On Wed, Jan 30, 2019 at 01:00:02PM -0700, Logan Gunthorpe wrote: > We never changed SGLs. We still use them to pass p2pdma pages, only we > need to be a bit careful where we send the entire SGL. I see no reason > why we can't continue to be careful once their in userspace if there's > something

Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

2019-01-31 Thread Jason Gunthorpe
On Wed, Jan 30, 2019 at 11:13:11AM -0700, Logan Gunthorpe wrote: > > > On 2019-01-30 10:44 a.m., Jason Gunthorpe wrote: > > I don't see why a special case with a VMA is really that different. > > Well one *really* big difference is the VMA changes necessarily exp

Re: RFC: Run a dedicated hmm.git for 5.3

2019-05-27 Thread Jason Gunthorpe
On Sat, May 25, 2019 at 03:52:10PM -0700, Andrew Morton wrote: > On Fri, 24 May 2019 09:44:55 -0300 Jason Gunthorpe wrote: > > > Now that -mm merged the basic hmm API skeleton I think running like > > this would get us quickly to the place we all want: comprehensive in tre

[PATCH v2 hmm 07/11] mm/hmm: Use lockdep instead of comments

2019-06-07 Thread Jason Gunthorpe
From: Jason Gunthorpe So we can check locking at runtime. Signed-off-by: Jason Gunthorpe Reviewed-by: Jérôme Glisse --- v2 - Fix missing & in lockdeps (Jason) --- mm/hmm.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/hmm.c b/mm/hmm.c index f67ba32983

[PATCH v2 hmm 08/11] mm/hmm: Remove racy protection against double-unregistration

2019-06-07 Thread Jason Gunthorpe
From: Jason Gunthorpe No other register/unregister kernel API attempts to provide this kind of protection as it is inherently racy, so just drop it. Callers should provide their own protection, it appears nouveau already does, but just in case drop a debugging POISON. Signed-off-by: Jason

Re: [PATCH 0/2] Two bug-fixes for HMM

2019-06-07 Thread Jason Gunthorpe
On Thu, Jun 06, 2019 at 07:04:46PM +0000, Kuehling, Felix wrote: > On 2019-06-06 11:11 a.m., Jason Gunthorpe wrote: > > On Fri, May 10, 2019 at 07:53:21PM +0000, Kuehling, Felix wrote: > >> These problems were found in AMD-internal testing as we're working on > >> ado

[PATCH v2 hmm 02/11] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register

2019-06-07 Thread Jason Gunthorpe
From: Jason Gunthorpe Ralph observes that hmm_range_register() can only be called by a driver while a mirror is registered. Make this clear in the API by passing in the mirror structure as a parameter. This also simplifies understanding the lifetime model for struct hmm, as the hmm pointer must

[PATCH v2 hmm 03/11] mm/hmm: Hold a mmgrab from hmm to mm

2019-06-07 Thread Jason Gunthorpe
From: Jason Gunthorpe So long as a struct hmm pointer exists, so should the struct mm it is linked to. Hold the mmgrab() as soon as a hmm is created, and mmdrop() it once the hmm refcount goes to zero. Since mmdrop() (ie a 0 kref on struct mm) is now impossible with a !NULL mm->hmm del
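
The lifetime rule in this patch description (take the mm reference when the hmm is created, release it when the hmm refcount reaches zero) can be modelled in plain userspace C. This is a sketch under stated assumptions, not the code in mm/hmm.c: `struct sketch_mm`, `struct sketch_hmm`, and the three helpers are hypothetical, and simple integer counters stand in for the kernel's `mmgrab()`, `mmdrop()`, and kref.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct mm_struct and struct hmm. */
struct sketch_mm {
    int mm_count;            /* models the count that mmgrab()/mmdrop() adjust */
};

struct sketch_hmm {
    int refcount;            /* models the hmm kref */
    struct sketch_mm *mm;    /* valid for as long as refcount > 0 */
};

/* Creating the hmm pins the mm immediately (models mmgrab()). */
static void sketch_hmm_init(struct sketch_hmm *hmm, struct sketch_mm *mm)
{
    hmm->refcount = 1;
    hmm->mm = mm;
    mm->mm_count++;
}

static void sketch_hmm_get(struct sketch_hmm *hmm)
{
    hmm->refcount++;
}

/* Only the final put releases the mm (models mmdrop()). */
static void sketch_hmm_put(struct sketch_hmm *hmm)
{
    assert(hmm->refcount > 0);
    if (--hmm->refcount == 0) {
        hmm->mm->mm_count--;
        hmm->mm = NULL;
    }
}
```

Because the mm count is only dropped on the final put, any holder of an hmm reference can dereference `hmm->mm` without a separate check that the mm is still allocated.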

[PATCH v2 hmm 06/11] mm/hmm: Hold on to the mmget for the lifetime of the range

2019-06-07 Thread Jason Gunthorpe
From: Jason Gunthorpe Range functions like hmm_range_snapshot() and hmm_range_fault() call find_vma, which requires holding the mmget() and the mmap_sem for the mm. Make this simpler for the callers by holding the mmget() inside the range for the lifetime of the range. Other functions

[PATCH v2 hmm 05/11] mm/hmm: Remove duplicate condition test before wait_event_timeout

2019-06-07 Thread Jason Gunthorpe
From: Jason Gunthorpe The wait_event_timeout macro already tests the condition as its first action, so there is no reason to open code another version of this, all that does is skip the might_sleep() debugging in common cases, which is not helpful. Further, based on prior patches, we can

[PATCH v2 hmm 10/11] mm/hmm: Do not use list*_rcu() for hmm->ranges

2019-06-07 Thread Jason Gunthorpe
From: Jason Gunthorpe This list is always read and written while holding hmm->lock so there is no need for the confusing _rcu annotations. Signed-off-by: Jason Gunthorpe Reviewed-by: Jérôme Glisse --- mm/hmm.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/hmm.

[PATCH v2 hmm 00/11] Various revisions from a locking/code review

2019-06-07 Thread Jason Gunthorpe
From: Jason Gunthorpe For hmm.git: This patch series arose out of discussions with Jerome when looking at the ODP changes, particularly informed by use after free races we have already found and fixed in the ODP code (thanks to syzkaller) working with mmu notifiers, and the discussion

[PATCH v2 hmm 04/11] mm/hmm: Simplify hmm_get_or_create and make it reliable

2019-06-07 Thread Jason Gunthorpe
From: Jason Gunthorpe As coded this function can false-fail in various racy situations. Make it reliable by running only under the write side of the mmap_sem and avoiding the false-failing compare/exchange pattern. Also make the locking very easy to understand by only ever reading or writing mm

[PATCH v2 hmm 11/11] mm/hmm: Remove confusing comment and logic from hmm_release

2019-06-07 Thread Jason Gunthorpe
From: Jason Gunthorpe hmm_release() is called exactly once per hmm. ops->release() cannot accidentally trigger any action that would recurse back onto hmm->mirrors_sem. This fixes a use after-free race of the form: CPU0

Re: [PATCH 0/2] Two bug-fixes for HMM

2019-06-07 Thread Jason Gunthorpe
On Fri, May 10, 2019 at 07:53:21PM +, Kuehling, Felix wrote: > These problems were found in AMD-internal testing as we're working on > adopting HMM. They are rebased against glisse/hmm-5.2-v3. We'd like to get > them applied to a mainline Linux kernel as well as drm-next and >

[PATCH v2 hmm 01/11] mm/hmm: fix use after free with struct hmm in the mmu notifiers

2019-06-07 Thread Jason Gunthorpe
From: Jason Gunthorpe mmu_notifier_unregister_no_release() is not a fence and the mmu_notifier system will continue to reference hmm->mn until the srcu grace period expires. Resulting in use after free races like this: CPU0 C

[PATCH v2 hmm 09/11] mm/hmm: Poison hmm_range during unregister

2019-06-07 Thread Jason Gunthorpe
From: Jason Gunthorpe Trying to misuse a range outside its lifetime is a kernel bug. Use WARN_ON and poison bytes to detect this condition. Signed-off-by: Jason Gunthorpe Reviewed-by: Jérôme Glisse --- v2 - Keep range start/end valid after unregistration (Jerome) --- mm/hmm.c | 7 +-- 1

Re: [PATCH] drm/nouveau: Fix DEVICE_PRIVATE dependencies

2019-05-30 Thread Jason Gunthorpe
On Thu, May 30, 2019 at 11:31:12PM +0800, Yuehaibing wrote: > Hi all, > > Friendly ping: > > Who can take this? > > On 2019/4/17 22:26, Yue Haibing wrote: > > From: YueHaibing > > > > During randconfig builds, I occasionally run into an invalid configuration > > > > WARNING: unmet direct

Re: [PATCH v16 12/16] IB, arm64: untag user pointers in ib_uverbs_(re)reg_mr()

2019-06-04 Thread Jason Gunthorpe
On Mon, Jun 03, 2019 at 06:55:14PM +0200, Andrey Konovalov wrote: > This patch is a part of a series that extends arm64 kernel ABI to allow to > pass tagged user pointers (with the top byte set to something else other > than 0x00) as syscall arguments. > > ib_uverbs_(re)reg_mr() use provided user

Re: RFC: Run a dedicated hmm.git for 5.3

2019-06-06 Thread Jason Gunthorpe
On Mon, May 27, 2019 at 04:12:47PM -0300, Jason Gunthorpe wrote: > On Sat, May 25, 2019 at 03:52:10PM -0700, Andrew Morton wrote: > > On Fri, 24 May 2019 09:44:55 -0300 Jason Gunthorpe wrote: > > > > > Now that -mm merged the basic hmm API skeleton I think running like

Re: [PATCH v2 hmm 02/11] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register

2019-06-12 Thread Jason Gunthorpe
e same locking scheme using some optional helpers linked to the mmu notifier? (just a sketch, still needs a lot more thinking) Jason From 5a91d17bc3b8fcaa685abddaaae5c5aea6f82dca Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 11 Jun 2019 16:33:33 -0300 Subject: [PATCH] RFC mm: Provide

Re: [PATCH v2 hmm 00/11] Various revisions from a locking/code review

2019-06-12 Thread Jason Gunthorpe
On Thu, Jun 06, 2019 at 03:44:27PM -0300, Jason Gunthorpe wrote: > From: Jason Gunthorpe > > For hmm.git: > > This patch series arose out of discussions with Jerome when looking at the > ODP changes, particularly informed by use after free races we have already > found

Re: [PATCH v2 hmm 02/11] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register

2019-06-10 Thread Jason Gunthorpe
On Fri, Jun 07, 2019 at 03:39:06PM -0700, Ralph Campbell wrote: > > > +    range->hmm = hmm; > > > +    kref_get(&hmm->kref); > > >   /* Initialize range to track CPU page table updates. */ > > >   mutex_lock(&hmm->lock); > > > > > I forgot to add that I think you can delete the duplicate >

Re: [PATCH v2 hmm 11/11] mm/hmm: Remove confusing comment and logic from hmm_release

2019-06-10 Thread Jason Gunthorpe
On Fri, Jun 07, 2019 at 02:37:07PM -0700, Ralph Campbell wrote: > > On 6/6/19 11:44 AM, Jason Gunthorpe wrote: > > From: Jason Gunthorpe > > > > hmm_release() is called exactly once per hmm. ops->release() cannot > > accidentally trigger any action tha

Re: [PATCH v2 hmm 08/11] mm/hmm: Remove racy protection against double-unregistration

2019-06-09 Thread Jason Gunthorpe
On Thu, Jun 06, 2019 at 08:29:10PM -0700, John Hubbard wrote: > On 6/6/19 11:44 AM, Jason Gunthorpe wrote: > > From: Jason Gunthorpe > > > > No other register/unregister kernel API attempts to provide this kind of > > protection as it is inherently racy, so just drop
