Re: [PATCH v3 5/8] mm: Device exclusive memory access

2021-03-03 Thread Alistair Popple
On Tuesday, 2 March 2021 11:41:52 PM AEDT Jason Gunthorpe wrote:
> > However try_to_protect() scans the PTEs again under the PTL so checking 
the 
> > mapping of interest actually gets replaced during the rmap walk seems like 
a 
> > reasonable solution. Thanks for the comments.
> 
> It does seem cleaner if you can manage it, the notifier will still be
> needd to program the HW though

Checking during the rmap walk wasn't hard but ultimately pointless. As you say 
a range notifier and lock is required to program the hardware, which requires 
checking the mappings with a mmu notifier sequence anyway.

 - Alistair



___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH v3 5/8] mm: Device exclusive memory access

2021-03-02 Thread Jason Gunthorpe
On Tue, Mar 02, 2021 at 07:57:58PM +1100, Alistair Popple wrote:

> The intent was a driver could use HMM or some other mechanism to keep PTEs 
> synchronised if required. However I just looked at patch 8 in the series 
> again 
> and it appears I got this wrong when converting from the old migration 
> approach:
> 
> +   mutex_unlock(>mutex);
> +   ret = nouveau_atomic_range_fault(svmm, drm, args,
> +   size, hmm_flags, mm);
> 
> The mutex needs to be unlocked after the range fault to ensure the PTE hasn't 
> changed. But this ends up being a problem because try_to_protect() calls 
> notifiers which need to take that mutex and hence deadlocks.

you have to check the notifier sequence under the mutex and loop
again. The mutex should only cover programming the HW to use the
pages, nothing else.

> However try_to_protect() scans the PTEs again under the PTL so checking the 
> mapping of interest actually gets replaced during the rmap walk seems like a 
> reasonable solution. Thanks for the comments.

It does seem cleaner if you can manage it, the notifier will still be
needd to program the HW though

Jason
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH v3 5/8] mm: Device exclusive memory access

2021-03-02 Thread Alistair Popple
On Tuesday, 2 March 2021 11:05:59 AM AEDT Jason Gunthorpe wrote:
> On Fri, Feb 26, 2021 at 06:18:29PM +1100, Alistair Popple wrote:
> 
> > +/**
> > + * make_device_exclusive_range() - Mark a range for exclusive use by a 
device
> > + * @mm: mm_struct of assoicated target process
> > + * @start: start of the region to mark for exclusive device access
> > + * @end: end address of region
> > + * @pages: returns the pages which were successfully mark for exclusive 
acces
> > + *
> > + * Returns: number of pages successfully marked for exclusive access
> > + *
> > + * This function finds the ptes mapping page(s) to the given address 
range and
> > + * replaces them with special swap entries preventing userspace CPU 
access. On
> > + * fault these entries are replaced with the original mapping after 
calling MMU
> > + * notifiers.
> > + */
> > +int make_device_exclusive_range(struct mm_struct *mm, unsigned long 
start,
> > +   unsigned long end, struct page **pages)
> > +{
> > +   long npages = (end - start) >> PAGE_SHIFT;
> > +   long i;
> > +
> > +   npages = get_user_pages_remote(mm, start, npages,
> > +  FOLL_GET | FOLL_WRITE | FOLL_SPLIT_PMD,
> > +  pages, NULL, NULL);
> > +   for (i = 0; i < npages; i++) {
> > +   if (!trylock_page(pages[i])) {
> > +   put_page(pages[i]);
> > +   pages[i] = NULL;
> > +   continue;
> > +   }
> > +
> > +   if (!try_to_protect(pages[i])) {
> 
> Isn't this racy? get_user_pages returns the ptes at an instant in
> time, they could have already been changed to something else?

Right. On it's own this does not guarantee that the page is mapped at the 
given location, only that a mapping won't get established without an mmu 
notifier callback to clear the swap entry.

The intent was a driver could use HMM or some other mechanism to keep PTEs 
synchronised if required. However I just looked at patch 8 in the series again 
and it appears I got this wrong when converting from the old migration 
approach:

+   mutex_unlock(>mutex);
+   ret = nouveau_atomic_range_fault(svmm, drm, args,
+   size, hmm_flags, mm);

The mutex needs to be unlocked after the range fault to ensure the PTE hasn't 
changed. But this ends up being a problem because try_to_protect() calls 
notifiers which need to take that mutex and hence deadlocks.

> I would think you'd want to switch to the swap entry atomically under
> th PTLs?

That is one approach, but the reuse of get_user_pages() to walk the page 
tables and fault/gather the pages is a nice simplification and adding a new 
FOLL flag/mode to atomically swap entries doesn't seem right.

However try_to_protect() scans the PTEs again under the PTL so checking the 
mapping of interest actually gets replaced during the rmap walk seems like a 
reasonable solution. Thanks for the comments.

 - Alistair

> Jason
> 




___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH v3 5/8] mm: Device exclusive memory access

2021-03-01 Thread Jason Gunthorpe
On Fri, Feb 26, 2021 at 06:18:29PM +1100, Alistair Popple wrote:

> +/**
> + * make_device_exclusive_range() - Mark a range for exclusive use by a device
> + * @mm: mm_struct of assoicated target process
> + * @start: start of the region to mark for exclusive device access
> + * @end: end address of region
> + * @pages: returns the pages which were successfully mark for exclusive acces
> + *
> + * Returns: number of pages successfully marked for exclusive access
> + *
> + * This function finds the ptes mapping page(s) to the given address range 
> and
> + * replaces them with special swap entries preventing userspace CPU access. 
> On
> + * fault these entries are replaced with the original mapping after calling 
> MMU
> + * notifiers.
> + */
> +int make_device_exclusive_range(struct mm_struct *mm, unsigned long start,
> + unsigned long end, struct page **pages)
> +{
> + long npages = (end - start) >> PAGE_SHIFT;
> + long i;
> +
> + npages = get_user_pages_remote(mm, start, npages,
> +FOLL_GET | FOLL_WRITE | FOLL_SPLIT_PMD,
> +pages, NULL, NULL);
> + for (i = 0; i < npages; i++) {
> + if (!trylock_page(pages[i])) {
> + put_page(pages[i]);
> + pages[i] = NULL;
> + continue;
> + }
> +
> + if (!try_to_protect(pages[i])) {

Isn't this racy? get_user_pages returns the ptes at an instant in
time, they could have already been changed to something else?

I would think you'd want to switch to the swap entry atomically under
th PTLs?

Jason
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


RE: [PATCH v3 5/8] mm: Device exclusive memory access

2021-03-01 Thread Ralph Campbell
> From: Alistair Popple 
> Sent: Thursday, February 25, 2021 11:18 PM
> To: linux...@kvack.org; nouv...@lists.freedesktop.org;
> bske...@redhat.com; a...@linux-foundation.org
> Cc: linux-...@vger.kernel.org; linux-ker...@vger.kernel.org; dri-
> de...@lists.freedesktop.org; John Hubbard ; Ralph
> Campbell ; jgli...@redhat.com; Jason Gunthorpe
> ; h...@infradead.org; dan...@ffwll.ch; Alistair Popple
> 
> Subject: [PATCH v3 5/8] mm: Device exclusive memory access
> 
> Some devices require exclusive write access to shared virtual memory (SVM)
> ranges to perform atomic operations on that memory. This requires CPU page
> tables to be updated to deny access whilst atomic operations are occurring.
> 
> In order to do this introduce a new swap entry type (SWP_DEVICE_EXCLUSIVE).
> When a SVM range needs to be marked for exclusive access by a device all page
> table mappings for the particular range are replaced with device exclusive 
> swap
> entries. This causes any CPU access to the page to result in a fault.
> 
> Faults are resovled by replacing the faulting entry with the original 
> mapping. This
> results in MMU notifiers being called which a driver uses to update access
> permissions such as revoking atomic access. After notifiers have been called 
> the
> device will no longer have exclusive access to the region.
> 
> Signed-off-by: Alistair Popple 
> ---
>  Documentation/vm/hmm.rst |  15 
>  include/linux/rmap.h |   3 +
>  include/linux/swap.h |   4 +-
>  include/linux/swapops.h  |  44 ++-
>  mm/hmm.c |   5 ++
>  mm/memory.c  | 108 +-
>  mm/mprotect.c|   8 ++
>  mm/page_vma_mapped.c |   9 ++-
>  mm/rmap.c| 163 +++
>  9 files changed, 352 insertions(+), 7 deletions(-)
...
> +int make_device_exclusive_range(struct mm_struct *mm, unsigned long start,
> + unsigned long end, struct page **pages) {
> + long npages = (end - start) >> PAGE_SHIFT;
> + long i;

Nit: you should use unsigned long for 'i' and 'npages' to match start/end.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH v3 5/8] mm: Device exclusive memory access

2021-03-01 Thread Jason Gunthorpe
On Fri, Feb 26, 2021 at 06:18:29PM +1100, Alistair Popple wrote:
> Some devices require exclusive write access to shared virtual
> memory (SVM) ranges to perform atomic operations on that memory. This
> requires CPU page tables to be updated to deny access whilst atomic
> operations are occurring.
> 
> In order to do this introduce a new swap entry
> type (SWP_DEVICE_EXCLUSIVE). When a SVM range needs to be marked for
> exclusive access by a device all page table mappings for the particular
> range are replaced with device exclusive swap entries. This causes any
> CPU access to the page to result in a fault.
> 
> Faults are resovled by replacing the faulting entry with the original
> mapping. This results in MMU notifiers being called which a driver uses
> to update access permissions such as revoking atomic access. After
> notifiers have been called the device will no longer have exclusive
> access to the region.

This makes alot more sense than the prior versions!

I don't know the migration area especially well, but nothing caught my
eye in here

Jason
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel