Re: [Nouveau] [PATCH v9 07/10] mm: Device exclusive memory access

2021-05-26 Thread Alistair Popple
On Thursday, 27 May 2021 5:28:32 AM AEST Peter Xu wrote:
> On Mon, May 24, 2021 at 11:27:22PM +1000, Alistair Popple wrote:
> > Some devices require exclusive write access to shared virtual
> > memory (SVM) ranges to perform atomic operations on that memory. This
> > requires CPU page tables to be updated to deny access whilst atomic
> > operations are occurring.
> > 
> > In order to do this introduce a new swap entry
> > type (SWP_DEVICE_EXCLUSIVE). When a SVM range needs to be marked for
> > exclusive access by a device all page table mappings for the particular
> > range are replaced with device exclusive swap entries. This causes any
> > CPU access to the page to result in a fault.
> > 
> > Faults are resovled by replacing the faulting entry with the original
> > mapping. This results in MMU notifiers being called which a driver uses
> > to update access permissions such as revoking atomic access. After
> > notifiers have been called the device will no longer have exclusive
> > access to the region.
> > 
> > Walking of the page tables to find the target pages is handled by
> > get_user_pages() rather than a direct page table walk. A direct page
> > table walk similar to what migrate_vma_collect()/unmap() does could also
> > have been utilised. However this resulted in more code similar in
> > functionality to what get_user_pages() provides as page faulting is
> > required to make the PTEs present and to break COW.
> > 
> > Signed-off-by: Alistair Popple 
> > Reviewed-by: Christoph Hellwig 
> > 
> > ---
> > 
> > v9:
> > * Split rename of migrate_pgmap_owner into a separate patch.
> > * Added comments explaining SWP_DEVICE_EXCLUSIVE_* entries.
> > * Renamed try_to_protect{_one} to page_make_device_exclusive{_one} based
> > 
> >   somewhat on a suggestion from Peter Xu. I was never particularly happy
> >   with try_to_protect() as a name so think this is better.
> > 
> > * Removed unneccesary code and reworded some comments based on feedback
> > 
> >   from Peter Xu.
> > 
> > * Removed the VMA walk when restoring PTEs for device-exclusive entries.
> > * Simplified implementation of copy_pte_range() to fail if the page
> > 
> >   cannot be locked. This might lead to occasional fork() failures but at
> >   this stage we don't think that will be an issue.
> > 
> > v8:
> > * Remove device exclusive entries on fork rather than copy them.
> > 
> > v7:
> > * Added Christoph's Reviewed-by.
> > * Minor cosmetic cleanups suggested by Christoph.
> > * Replace mmu_notifier_range_init_migrate/exclusive with
> > 
> >   mmu_notifier_range_init_owner as suggested by Christoph.
> > 
> > * Replaced lock_page() with lock_page_retry() when handling faults.
> > * Restrict to anonymous pages for now.
> > 
> > v6:
> > * Fixed a bisectablity issue due to incorrectly applying the rename of
> > 
> >   migrate_pgmap_owner to the wrong patches for Nouveau and hmm_test.
> > 
> > v5:
> > * Renamed range->migrate_pgmap_owner to range->owner.
> > * Added MMU_NOTIFY_EXCLUSIVE to allow passing of a driver cookie which
> > 
> >   allows notifiers called as a result of make_device_exclusive_range() to
> >   be ignored.
> > 
> > * Added a check to try_to_protect_one() to detect if the pages originally
> > 
> >   returned from get_user_pages() have been unmapped or not.
> > 
> > * Removed check_device_exclusive_range() as it is no longer required with
> > 
> >   the other changes.
> > 
> > * Documentation update.
> > 
> > v4:
> > * Add function to check that mappings are still valid and exclusive.
> > * s/long/unsigned long/ in make_device_exclusive_entry().
> > ---
> > 
> >  Documentation/vm/hmm.rst |  17 
> >  include/linux/mmu_notifier.h |   6 ++
> >  include/linux/rmap.h |   4 +
> >  include/linux/swap.h |   7 +-
> >  include/linux/swapops.h  |  44 -
> >  mm/hmm.c |   5 +
> >  mm/memory.c  | 128 +++-
> >  mm/mprotect.c|   8 ++
> >  mm/page_vma_mapped.c |   9 +-
> >  mm/rmap.c| 186 +++
> >  10 files changed, 405 insertions(+), 9 deletions(-)
> > 
> > diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst
> > index 3df79307a797..a14c2938e7af 100644
> > --- a/Documentation/vm/hmm.rst
> > +++ b/Documentation/vm/hmm.rst
> > 
> > @@ -405,6 +405,23 @@ between device driver specific code and shared common 
code:
> > The lock can now be released.
> > 
> > +Exclusive access memory
> > +===
> > +
> > +Some devices have features such as atomic PTE bits that can be used to
> > implement +atomic access to system memory. To support atomic operations
> > to a shared virtual +memory page such a device needs access to that page
> > which is exclusive of any +userspace access from the CPU. The
> > ``make_device_exclusive_range()`` function +can be used to make a memory
> > range inaccessible from userspace.
> > +
> > +This replaces all mappings for pages in 

Re: [Nouveau] [PATCH v9 06/10] mm/memory.c: Allow different return codes for copy_nonpresent_pte()

2021-05-26 Thread Alistair Popple
On Thursday, 27 May 2021 5:50:05 AM AEST Peter Xu wrote:
> On Mon, May 24, 2021 at 11:27:21PM +1000, Alistair Popple wrote:
> > Currently if copy_nonpresent_pte() returns a non-zero value it is
> > assumed to be a swap entry which requires further processing outside the
> > loop in copy_pte_range() after dropping locks. This prevents other
> > values being returned to signal conditions such as failure which a
> > subsequent change requires.
> > 
> > Instead make copy_nonpresent_pte() return an error code if further
> > processing is required and read the value for the swap entry in the main
> > loop under the ptl.
> > 
> > Signed-off-by: Alistair Popple 
> > 
> > ---
> > 
> > v9:
> > 
> > New for v9 to allow device exclusive handling to occur in
> > copy_nonpresent_pte().
> > ---
> > 
> >  mm/memory.c | 12 +++-
> >  1 file changed, 7 insertions(+), 5 deletions(-)
> > 
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 2fb455c365c2..e061cfa18c11 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -718,7 +718,7 @@ copy_nonpresent_pte(struct mm_struct *dst_mm, struct
> > mm_struct *src_mm,> 
> >   if (likely(!non_swap_entry(entry))) {
> >   
> >   if (swap_duplicate(entry) < 0)
> > 
> > - return entry.val;
> > + return -EAGAIN;
> > 
> >   /* make sure dst_mm is on swapoff's mmlist. */
> >   if (unlikely(list_empty(_mm->mmlist))) {
> > 
> > @@ -974,11 +974,13 @@ copy_pte_range(struct vm_area_struct *dst_vma,
> > struct vm_area_struct *src_vma,> 
> >   continue;
> >   
> >   }
> >   if (unlikely(!pte_present(*src_pte))) {
> > 
> > - entry.val = copy_nonpresent_pte(dst_mm, src_mm,
> > - dst_pte, src_pte,
> > - src_vma, addr, rss);
> > - if (entry.val)
> > + ret = copy_nonpresent_pte(dst_mm, src_mm,
> > + dst_pte, src_pte,
> > + src_vma, addr, rss);
> > + if (ret == -EAGAIN) {
> > + entry = pte_to_swp_entry(*src_pte);
> > 
> >   break;
> > 
> > + }
> > 
> >   progress += 8;
> >   continue;
> >   
> >   }
> 
> Note that -EAGAIN was previously used by copy_present_page() for early cow
> use.  Here later although we check entry.val first:
> 
> if (entry.val) {
> if (add_swap_count_continuation(entry, GFP_KERNEL) < 0) {
> ret = -ENOMEM;
> goto out;
> }
> entry.val = 0;
> } else if (ret) {
> WARN_ON_ONCE(ret != -EAGAIN);
> prealloc = page_copy_prealloc(src_mm, src_vma, addr);
> if (!prealloc)
> return -ENOMEM;
> /* We've captured and resolved the error. Reset, try again.
> */ ret = 0;
> }
> 
> We didn't reset "ret" in entry.val case (maybe we should?). Then in the next
> round of "goto again" if "ret" is unluckily untouched, it could reach the
> 2nd if check, and I think it could cause an unexpected
> page_copy_prealloc().

Thanks, I had considered that but saw "ret" was always set either by 
copy_nonpresent_pte() or copy_present_pte(). However missed the "unlucky" case 
at the start of the loop:

if (progress >= 32) {
progress = 0;
if (need_resched() ||
spin_needbreak(src_ptl) || 
pin_needbreak(dst_ptl))
break;

Looking at this again though checking different variables to figure out what 
to do outside the locks and reusing error codes seems error prone. I reused -
EAGAIN for copy_nonpresent_pte() simply because that seemed the most sensible 
error code, but I don't think that aids readability and it might be better to 
use a unique error code for each case needing extra handling.

So it might be better if I update this patch to:
1) Use unique error codes for each case requiring special handling outside the 
lock.
2) Only check "ret" to determine what to do outside locks (ie. not entry.val)
3) Document these.
4) Always reset ret after handling.

Thoughts?

 - Alistair

> --
> Peter Xu




___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH][next] nouveau/svm: Fix missing failure check on call to make_device_exclusive_range

2021-05-26 Thread Alistair Popple
On Thursday, 27 May 2021 12:04:59 AM AEST Colin King wrote:
> The call to make_device_exclusive_range can potentially fail leaving
> pointer page not initialized that leads to an uninitialized pointer
> read issue. Fix this by adding a check to see if the call failed and
> returning the error code.
> 
> Addresses-Coverity: ("Uninitialized pointer read")
> Fixes: c620bba9828c ("nouveau/svm: implement atomic SVM access")
> Signed-off-by: Colin Ian King 
> ---
>  drivers/gpu/drm/nouveau/nouveau_svm.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c
> b/drivers/gpu/drm/nouveau/nouveau_svm.c index 84726a89e665..b913b4907088
> 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_svm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
> @@ -609,8 +609,10 @@ static int nouveau_atomic_range_fault(struct
> nouveau_svmm *svmm,
> 
> notifier_seq = mmu_interval_read_begin(>notifier);
> mmap_read_lock(mm);
> -   make_device_exclusive_range(mm, start, start + PAGE_SIZE,
> -   , drm->dev);
> +   ret = make_device_exclusive_range(mm, start, start +
> PAGE_SIZE, + ,
> drm->dev); +   if (ret < 0)
> +   goto out;

Thanks for spotting, this is fixing get_user_pages() inside 
make_device_exclusive_range() returning an error. However the check needs to 
happen after dropping mmap_lock below:

> mmap_read_unlock(mm);
> if (!page) {
> ret = -EINVAL;
> --
> 2.31.1




___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH v9 06/10] mm/memory.c: Allow different return codes for copy_nonpresent_pte()

2021-05-26 Thread Peter Xu
On Mon, May 24, 2021 at 11:27:21PM +1000, Alistair Popple wrote:
> Currently if copy_nonpresent_pte() returns a non-zero value it is
> assumed to be a swap entry which requires further processing outside the
> loop in copy_pte_range() after dropping locks. This prevents other
> values being returned to signal conditions such as failure which a
> subsequent change requires.
> 
> Instead make copy_nonpresent_pte() return an error code if further
> processing is required and read the value for the swap entry in the main
> loop under the ptl.
> 
> Signed-off-by: Alistair Popple 
> 
> ---
> 
> v9:
> 
> New for v9 to allow device exclusive handling to occur in
> copy_nonpresent_pte().
> ---
>  mm/memory.c | 12 +++-
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 2fb455c365c2..e061cfa18c11 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -718,7 +718,7 @@ copy_nonpresent_pte(struct mm_struct *dst_mm, struct 
> mm_struct *src_mm,
>  
>   if (likely(!non_swap_entry(entry))) {
>   if (swap_duplicate(entry) < 0)
> - return entry.val;
> + return -EAGAIN;
>  
>   /* make sure dst_mm is on swapoff's mmlist. */
>   if (unlikely(list_empty(_mm->mmlist))) {
> @@ -974,11 +974,13 @@ copy_pte_range(struct vm_area_struct *dst_vma, struct 
> vm_area_struct *src_vma,
>   continue;
>   }
>   if (unlikely(!pte_present(*src_pte))) {
> - entry.val = copy_nonpresent_pte(dst_mm, src_mm,
> - dst_pte, src_pte,
> - src_vma, addr, rss);
> - if (entry.val)
> + ret = copy_nonpresent_pte(dst_mm, src_mm,
> + dst_pte, src_pte,
> + src_vma, addr, rss);
> + if (ret == -EAGAIN) {
> + entry = pte_to_swp_entry(*src_pte);
>   break;
> + }
>   progress += 8;
>   continue;
>   }

Note that -EAGAIN was previously used by copy_present_page() for early cow
use.  Here later although we check entry.val first:

if (entry.val) {
if (add_swap_count_continuation(entry, GFP_KERNEL) < 0) {
ret = -ENOMEM;
goto out;
}
entry.val = 0;
} else if (ret) {
WARN_ON_ONCE(ret != -EAGAIN);
prealloc = page_copy_prealloc(src_mm, src_vma, addr);
if (!prealloc)
return -ENOMEM;
/* We've captured and resolved the error. Reset, try again. */
ret = 0;
}

We didn't reset "ret" in entry.val case (maybe we should?). Then in the next
round of "goto again" if "ret" is unluckily untouched, it could reach the 2nd
if check, and I think it could cause an unexpected page_copy_prealloc().

-- 
Peter Xu

___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH v9 05/10] mm: Rename migrate_pgmap_owner

2021-05-26 Thread Peter Xu
On Mon, May 24, 2021 at 11:27:20PM +1000, Alistair Popple wrote:
> @@ -521,14 +521,14 @@ static inline void mmu_notifier_range_init(struct 
> mmu_notifier_range *range,
>   range->flags = flags;
>  }
>  
> -static inline void mmu_notifier_range_init_migrate(
> - struct mmu_notifier_range *range, unsigned int flags,
> +static inline void mmu_notifier_range_init_owner(
> + struct mmu_notifier_range *range,
> + enum mmu_notifier_event event, unsigned int flags,
>   struct vm_area_struct *vma, struct mm_struct *mm,
> - unsigned long start, unsigned long end, void *pgmap)
> + unsigned long start, unsigned long end, void *owner)
>  {
> - mmu_notifier_range_init(range, MMU_NOTIFY_MIGRATE, flags, vma, mm,
> - start, end);
> - range->migrate_pgmap_owner = pgmap;
> + mmu_notifier_range_init(range, event, flags, vma, mm, start, end);
> + range->owner = owner;
>  }

mmu_notifier_range_init_migrate() can even be kept to just call the new helper,
then existing callers are unaffected.  Not a big deal, though:

Reviewed-by: Peter Xu 

Thanks,

-- 
Peter Xu

___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH v9 07/10] mm: Device exclusive memory access

2021-05-26 Thread Peter Xu
On Mon, May 24, 2021 at 11:27:22PM +1000, Alistair Popple wrote:
> Some devices require exclusive write access to shared virtual
> memory (SVM) ranges to perform atomic operations on that memory. This
> requires CPU page tables to be updated to deny access whilst atomic
> operations are occurring.
> 
> In order to do this introduce a new swap entry
> type (SWP_DEVICE_EXCLUSIVE). When a SVM range needs to be marked for
> exclusive access by a device all page table mappings for the particular
> range are replaced with device exclusive swap entries. This causes any
> CPU access to the page to result in a fault.
> 
> Faults are resovled by replacing the faulting entry with the original
> mapping. This results in MMU notifiers being called which a driver uses
> to update access permissions such as revoking atomic access. After
> notifiers have been called the device will no longer have exclusive
> access to the region.
> 
> Walking of the page tables to find the target pages is handled by
> get_user_pages() rather than a direct page table walk. A direct page
> table walk similar to what migrate_vma_collect()/unmap() does could also
> have been utilised. However this resulted in more code similar in
> functionality to what get_user_pages() provides as page faulting is
> required to make the PTEs present and to break COW.
> 
> Signed-off-by: Alistair Popple 
> Reviewed-by: Christoph Hellwig 
> 
> ---
> 
> v9:
> * Split rename of migrate_pgmap_owner into a separate patch.
> * Added comments explaining SWP_DEVICE_EXCLUSIVE_* entries.
> * Renamed try_to_protect{_one} to page_make_device_exclusive{_one} based
>   somewhat on a suggestion from Peter Xu. I was never particularly happy
>   with try_to_protect() as a name so think this is better.
> * Removed unneccesary code and reworded some comments based on feedback
>   from Peter Xu.
> * Removed the VMA walk when restoring PTEs for device-exclusive entries.
> * Simplified implementation of copy_pte_range() to fail if the page
>   cannot be locked. This might lead to occasional fork() failures but at
>   this stage we don't think that will be an issue.
> 
> v8:
> * Remove device exclusive entries on fork rather than copy them.
> 
> v7:
> * Added Christoph's Reviewed-by.
> * Minor cosmetic cleanups suggested by Christoph.
> * Replace mmu_notifier_range_init_migrate/exclusive with
>   mmu_notifier_range_init_owner as suggested by Christoph.
> * Replaced lock_page() with lock_page_retry() when handling faults.
> * Restrict to anonymous pages for now.
> 
> v6:
> * Fixed a bisectablity issue due to incorrectly applying the rename of
>   migrate_pgmap_owner to the wrong patches for Nouveau and hmm_test.
> 
> v5:
> * Renamed range->migrate_pgmap_owner to range->owner.
> * Added MMU_NOTIFY_EXCLUSIVE to allow passing of a driver cookie which
>   allows notifiers called as a result of make_device_exclusive_range() to
>   be ignored.
> * Added a check to try_to_protect_one() to detect if the pages originally
>   returned from get_user_pages() have been unmapped or not.
> * Removed check_device_exclusive_range() as it is no longer required with
>   the other changes.
> * Documentation update.
> 
> v4:
> * Add function to check that mappings are still valid and exclusive.
> * s/long/unsigned long/ in make_device_exclusive_entry().
> ---
>  Documentation/vm/hmm.rst |  17 
>  include/linux/mmu_notifier.h |   6 ++
>  include/linux/rmap.h |   4 +
>  include/linux/swap.h |   7 +-
>  include/linux/swapops.h  |  44 -
>  mm/hmm.c |   5 +
>  mm/memory.c  | 128 +++-
>  mm/mprotect.c|   8 ++
>  mm/page_vma_mapped.c |   9 +-
>  mm/rmap.c| 186 +++
>  10 files changed, 405 insertions(+), 9 deletions(-)
> 
> diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst
> index 3df79307a797..a14c2938e7af 100644
> --- a/Documentation/vm/hmm.rst
> +++ b/Documentation/vm/hmm.rst
> @@ -405,6 +405,23 @@ between device driver specific code and shared common 
> code:
>  
> The lock can now be released.
>  
> +Exclusive access memory
> +===
> +
> +Some devices have features such as atomic PTE bits that can be used to 
> implement
> +atomic access to system memory. To support atomic operations to a shared 
> virtual
> +memory page such a device needs access to that page which is exclusive of any
> +userspace access from the CPU. The ``make_device_exclusive_range()`` function
> +can be used to make a memory range inaccessible from userspace.
> +
> +This replaces all mappings for pages in the given range with special swap
> +entries. Any attempt to access the swap entry results in a fault which is
> +resovled by replacing the entry with the original mapping. A driver gets
> +notified that the mapping has been changed by MMU notifiers, after which 
> point
> +it will no longer have exclusive access to the page. 

[Nouveau] [PATCH 00/34] Rid W=1 warnings from GPU

2021-05-26 Thread Lee Jones
This set is part of a larger effort attempting to clean-up W=1
kernel builds, which are currently overwhelmingly riddled with
niggly little warnings.

Lee Jones (34):
  drm/amd/pm/inc/smu_v13_0: Move table into the only source file that
uses it
  drm/amd/pm/swsmu/smu13/aldebaran_ppt: Remove unused variable 'ret'
  drm/amd/pm/powerplay/hwmgr/smu7_thermal: Provide function name for
'smu7_fan_ctrl_set_default_mode()'
  drm/amd/pm/powerplay/hwmgr/vega12_thermal: Provide function name
  drm/amd/pm/powerplay/hwmgr/vega12_hwmgr: Provide
'vega12_init_smc_table()' function name
  drm/amd/pm/powerplay/hwmgr/vega10_hwmgr: Kernel-doc headers must
contain function names
  drm/amd/pm/powerplay/hwmgr/vega20_hwmgr: Provide function name
'vega20_init_smc_table()'
  drm/amd/display/dc/bios/command_table_helper: Fix function name for
'dal_cmd_table_helper_transmitter_bp_to_atom()'
  drm/amd/display/dc/bios/command_table_helper2: Fix function name
'dal_cmd_table_helper_transmitter_bp_to_atom2()'
  drm/amd/display/dc/bios/bios_parser: Fix formatting and misnaming
issues
  drm/nouveau/nvkm/subdev/mc/tu102: Make functions called by reference
static
  drm/amd/display/amdgpu_dm/amdgpu_dm: Functions must directly follow
their headers
  drm/amd/display/dc/dce/dmub_outbox: Convert over to kernel-doc
  drm/amd/display/dc/gpio/gpio_service: Pass around correct
dce_{version,environment} types
  drm/amd/display/dc/dce110/dce110_hw_sequencer: Include our own header
  drm/amd/display/dc/dce/dce_transform: Remove superfluous
re-initialisation of DCFE_MEM_LIGHT_SLEEP_CNTL,
  drm/amd/display/dc/dce/dce_mem_input: Remove duplicate initialisation
of GRPH_CONTROL__GRPH_NUM_BANKS_{SHIFT,MASK}
  drm/amd/display/dc/dce/dce_mem_input: Remove duplicate initialisation
of GRPH_CONTROL__GRPH_NUM_BANKS_{SHIFT,MASK
  drm/amd/amdgpu/amdgpu_device: Make local function static
  drm/amd/display/amdgpu_dm/amdgpu_dm: Fix kernel-doc formatting issue
  drm/amd/display/dc/dce110/dce110_hw_sequencer: Include header
containing our prototypes
  drm/amd/display/dc/core/dc: Convert function headers to kernel-doc
  drm/amd/display/dmub/src/dmub_srv_stat: Convert function header to
kernel-doc
  drm/amd/display/modules/hdcp/hdcp_psp: Remove unused function
'mod_hdcp_hdcp1_get_link_encryption_status()'
  drm/xlnx/zynqmp_disp: Fix incorrectly named enum
'zynqmp_disp_layer_id'
  drm/xlnx/zynqmp_dp: Fix incorrectly name function 'zynqmp_dp_train()'
  drm/ttm/ttm_tt: Demote non-conformant kernel-doc header
  drm/panel/panel-raspberrypi-touchscreen: Demote kernel-doc abuse
  drm/panel/panel-sitronix-st7701: Demote kernel-doc abuse
  drm/vgem/vgem_drv: Standard comment blocks should not use kernel-doc
format
  drm/exynos/exynos7_drm_decon: Fix incorrect naming of
'decon_shadow_protect_win()'
  drm/exynos/exynos_drm_ipp: Fix documentation for
'exynos_drm_ipp_get_{caps,res}_ioctl()'
  drm/vboxvideo/hgsmi_base: Place function names into headers
  drm/vboxvideo/modesetting: Provide function names for prototype
headers

 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c|  2 +-
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  4 +-
 .../gpu/drm/amd/display/dc/bios/bios_parser.c |  6 +--
 .../display/dc/bios/command_table_helper.c|  2 +-
 .../display/dc/bios/command_table_helper2.c   |  2 +-
 drivers/gpu/drm/amd/display/dc/core/dc.c  | 46 +--
 .../drm/amd/display/dc/dce/dce_mem_input.h|  2 -
 .../drm/amd/display/dc/dce/dce_transform.h|  3 +-
 .../gpu/drm/amd/display/dc/dce/dmub_outbox.c  | 17 ++-
 .../display/dc/dce110/dce110_hw_sequencer.c   |  3 ++
 .../drm/amd/display/dc/gpio/gpio_service.c| 12 ++---
 .../drm/amd/display/dmub/src/dmub_srv_stat.c  | 19 +++-
 .../display/include/gpio_service_interface.h  |  4 +-
 .../drm/amd/display/modules/hdcp/hdcp_psp.c   | 13 --
 drivers/gpu/drm/amd/pm/inc/smu_v13_0.h|  6 ---
 .../drm/amd/pm/powerplay/hwmgr/smu7_thermal.c |  8 ++--
 .../drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c | 26 ++-
 .../drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c |  2 +-
 .../amd/pm/powerplay/hwmgr/vega12_thermal.c   |  3 +-
 .../drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c |  2 +-
 .../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c|  9 +++-
 drivers/gpu/drm/exynos/exynos7_drm_decon.c|  2 +-
 drivers/gpu/drm/exynos/exynos_drm_ipp.c   |  4 +-
 .../gpu/drm/nouveau/nvkm/subdev/mc/tu102.c|  6 +--
 .../drm/panel/panel-raspberrypi-touchscreen.c |  2 +-
 drivers/gpu/drm/panel/panel-sitronix-st7701.c |  2 +-
 drivers/gpu/drm/ttm/ttm_tt.c  |  2 +-
 drivers/gpu/drm/vboxvideo/hgsmi_base.c| 19 +---
 drivers/gpu/drm/vboxvideo/modesetting.c   | 20 
 drivers/gpu/drm/vgem/vgem_drv.c   |  2 +-
 drivers/gpu/drm/xlnx/zynqmp_disp.c|  2 +-
 drivers/gpu/drm/xlnx/zynqmp_dp.c  |  2 +-
 32 files changed, 107 insertions(+), 147 deletions(-)

Cc: Adam Jackson 
Cc: Ajay Kumar 

[Nouveau] [PATCH][next] nouveau/svm: Fix missing failure check on call to make_device_exclusive_range

2021-05-26 Thread Colin King
From: Colin Ian King 

The call to make_device_exclusive_range can potentially fail leaving
pointer page not initialized that leads to an uninitialized pointer
read issue. Fix this by adding a check to see if the call failed and
returning the error code.

Addresses-Coverity: ("Uninitialized pointer read")
Fixes: c620bba9828c ("nouveau/svm: implement atomic SVM access")
Signed-off-by: Colin Ian King 
---
 drivers/gpu/drm/nouveau/nouveau_svm.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c 
b/drivers/gpu/drm/nouveau/nouveau_svm.c
index 84726a89e665..b913b4907088 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -609,8 +609,10 @@ static int nouveau_atomic_range_fault(struct nouveau_svmm 
*svmm,
 
notifier_seq = mmu_interval_read_begin(>notifier);
mmap_read_lock(mm);
-   make_device_exclusive_range(mm, start, start + PAGE_SIZE,
-   , drm->dev);
+   ret = make_device_exclusive_range(mm, start, start + PAGE_SIZE,
+ , drm->dev);
+   if (ret < 0)
+   goto out;
mmap_read_unlock(mm);
if (!page) {
ret = -EINVAL;
-- 
2.31.1

___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH v9 07/10] mm: Device exclusive memory access

2021-05-26 Thread Alistair Popple
On Wednesday, 26 May 2021 5:17:18 PM AEST John Hubbard wrote:
> On 5/25/21 4:51 AM, Balbir Singh wrote:
> ...
> 
> >> How beneficial is this code to nouveau users?  I see that it permits a
> >> part of OpenCL to be implemented, but how useful/important is this in
> >> the real world?
> > 
> > That is a very good question! I've not reviewed the code, but a sample
> > program with the described use case would make things easy to parse.
> > I suspect that is not easy to build at the moment?
> 
> The cover letter says this:
> 
> This has been tested with upstream Mesa 21.1.0 and a simple OpenCL program
> which checks that GPU atomic accesses to system memory are atomic. Without
> this series the test fails as there is no way of write-protecting the page
> mapping which results in the device clobbering CPU writes. For reference
> the test is available at https://ozlabs.org/~apopple/opencl_svm_atomics/
> 
> Further testing has been performed by adding support for testing exclusive
> access to the hmm-tests kselftests.
> 
> ...so that seems to cover the "sample program" request, at least.

It is also sufficiently easy to build, assuming of course you have the 
appropriate Mesa/LLVM/OpenCL libraries installed :-)

If you are interested I have some scripts which may help with building Mesa, 
etc. Not that that is especially hard either, it's just there are a couple of 
different dependencies required.

> > I wonder how we co-ordinate all the work the mm is doing, page migration,
> > reclaim with device exclusive access? Do we have any numbers for the worst
> > case page fault latency when something is marked away for exclusive
> > access?
>
> CPU page fault latency is approximately "terrible", if a page is resident on
> the GPU. We have to spin up a DMA engine on the GPU and have it copy the
> page over the PCIe bus, after all.

Although for clarity that describes latency for CPU faults to device private 
pages which are always resident on the GPU. A CPU fault to a page being 
exclusively accessed will be slightly less terrible as it only requires the 
GPU MMU/TLB mappings to be taken down in much the same as for any other MMU 
notifier callback as the page is mapped by the GPU rather than resident there.

> > I presume for now this is anonymous memory only? SWP_DEVICE_EXCLUSIVE
> > would
> 
> Yes, for now.
> 
> > only impact the address space of programs using the GPU. Should the
> > exclusively marked range live in the unreclaimable list and recycled back
> > to active/in-active to account for the fact that
> > 
> > 1. It is not reclaimable and reclaim will only hurt via page faults?
> > 2. It ages the page correctly or at-least allows for that possibility when
> > the> 
> > page is used by the GPU.
> 
> I'm not sure that that is *necessarily* something we can conclude. It
> depends upon access patterns of each program. For example, a "reduction"
> parallel program sends over lots of data to the GPU, and only a tiny bit of
> (reduced!) data comes back to the CPU. In that case, freeing the physical
> page on the CPU is actually the best decision for the OS to make (if the OS
> is sufficiently prescient).
> 
> thanks,




___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH v4 0/7] drm: Clean up mmap for TTM-based GEM drivers

2021-05-26 Thread Christian König

Patches #1-#4 are Reviewed-by: Christian König 

Patches #5 and #6 are Acked-by: Christian König 

Patch #7 already has my rb.

I would say push that to drm-misc-next ASAP.

Regards,
Christian.

Am 25.05.21 um 17:10 schrieb Thomas Zimmermann:

Implement mmap via struct drm_gem_object_functions.mmap in amdgpu,
radeon and nouveau. This allows for using common DRM helpers for
the mmap-related callbacks in struct file_operations and struct
drm_driver. The drivers have their own vm_ops, which are now set
automatically by the DRM core functions. The code in each driver's
verify_access becomes part of the driver's new mmap implementation.

With the GEM drivers converted, vmwgfx is the only user of
ttm_bo_mmap() and related infrastructure. So move everything into
vmwgfx and delete the rsp code from TTM.

This touches several drivers. Preferably everything would be merged
at once via drm-misc-next.

v4:
* rebase on top of amdgpu hot-unplug changes
v3:
* tidy up the new mmap functions in amdgpu and radeon (Christian)
v2:
* removal of amdgpu fbdev mmap already merged (Christian)
* rebase on top of amdgpu fixes [1] (Felix)
* replace pr_err() with drm_err() in vmwgfx patch (Zack)
* several typos

[1] 
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.freedesktop.org%2Fseries%2F88822%2Fdata=04%7C01%7Cchristian.koenig%40amd.com%7C782d195c5e8e4ca6351b08d91f8f4e90%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637575522662233381%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=ZJ%2BIx9hFBXzQubYpEKcOngREtjjMKwaChyIonKSACFY%3Dreserved=0

Thomas Zimmermann (7):
   drm/ttm: Don't override vm_ops callbacks, if set
   drm/amdgpu: Implement mmap as GEM object function
   drm/radeon: Implement mmap as GEM object function
   drm/nouveau: Implement mmap as GEM object function
   drm/vmwgfx: Inline ttm_bo_mmap() into vmwgfx driver
   drm/vmwgfx: Inline vmw_verify_access()
   drm/ttm: Remove ttm_bo_mmap() and friends

  drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 46 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.h |  2 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  4 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 55 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 75 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h |  1 -
  drivers/gpu/drm/nouveau/nouveau_bo.c| 10 ---
  drivers/gpu/drm/nouveau/nouveau_drm.c   |  3 +-
  drivers/gpu/drm/nouveau/nouveau_gem.c   | 36 ++
  drivers/gpu/drm/nouveau/nouveau_ttm.c   | 49 --
  drivers/gpu/drm/nouveau/nouveau_ttm.h   |  1 -
  drivers/gpu/drm/radeon/radeon_drv.c |  3 +-
  drivers/gpu/drm/radeon/radeon_gem.c | 49 ++
  drivers/gpu/drm/radeon/radeon_ttm.c | 65 --
  drivers/gpu/drm/radeon/radeon_ttm.h |  1 -
  drivers/gpu/drm/ttm/ttm_bo_vm.c | 60 ++---
  drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c  |  9 ---
  drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c| 53 ++-
  include/drm/ttm/ttm_bo_api.h| 13 
  include/drm/ttm/ttm_device.h| 15 -
  20 files changed, 202 insertions(+), 348 deletions(-)


base-commit: 28dddc0c90bc6464be4c5e3224a293c022564a4e
prerequisite-patch-id: c2b2f08f0eccc9f5df0c0da49fa1d36267deb11d
prerequisite-patch-id: c67e5d886a47b7d0266d81100837557fda34cb24
--
2.31.1



___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH 11/34] drm/nouveau/nvkm/subdev/mc/tu102: Make functions called by reference static

2021-05-26 Thread Karol Herbst
On Wed, May 26, 2021 at 10:47 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/nouveau/nvkm/subdev/mc/tu102.c:50:1: warning: no previous 
> prototype for ‘tu102_mc_intr_unarm’ [-Wmissing-prototypes]
>  drivers/gpu/drm/nouveau/nvkm/subdev/mc/tu102.c:62:1: warning: no previous 
> prototype for ‘tu102_mc_intr_rearm’ [-Wmissing-prototypes]
>  drivers/gpu/drm/nouveau/nvkm/subdev/mc/tu102.c:74:1: warning: no previous 
> prototype for ‘tu102_mc_intr_mask’ [-Wmissing-prototypes]
>
> Cc: Ben Skeggs 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: Alistair Popple 
> Cc: dri-de...@lists.freedesktop.org
> Cc: nouveau@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/nouveau/nvkm/subdev/mc/tu102.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mc/tu102.c 
> b/drivers/gpu/drm/nouveau/nvkm/subdev/mc/tu102.c
> index 58db83ebadc5f..a96084b34a788 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mc/tu102.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mc/tu102.c
> @@ -46,7 +46,7 @@ tu102_mc_intr_update(struct tu102_mc *mc)
> nvkm_wr32(device, 0xb81610, 0x6);
>  }
>
> -void
> +static void
>  tu102_mc_intr_unarm(struct nvkm_mc *base)
>  {
> struct tu102_mc *mc = tu102_mc(base);
> @@ -58,7 +58,7 @@ tu102_mc_intr_unarm(struct nvkm_mc *base)
> spin_unlock_irqrestore(>lock, flags);
>  }
>
> -void
> +static void
>  tu102_mc_intr_rearm(struct nvkm_mc *base)
>  {
> struct tu102_mc *mc = tu102_mc(base);
> @@ -70,7 +70,7 @@ tu102_mc_intr_rearm(struct nvkm_mc *base)
> spin_unlock_irqrestore(>lock, flags);
>  }
>
> -void
> +static void
>  tu102_mc_intr_mask(struct nvkm_mc *base, u32 mask, u32 intr)
>  {
> struct tu102_mc *mc = tu102_mc(base);
> --
> 2.31.1
>
> ___
> Nouveau mailing list
> Nouveau@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau

Reviewed-by: Karol Herbst 

___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


[Nouveau] [PATCH 11/34] drm/nouveau/nvkm/subdev/mc/tu102: Make functions called by reference static

2021-05-26 Thread Lee Jones
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/nouveau/nvkm/subdev/mc/tu102.c:50:1: warning: no previous 
prototype for ‘tu102_mc_intr_unarm’ [-Wmissing-prototypes]
 drivers/gpu/drm/nouveau/nvkm/subdev/mc/tu102.c:62:1: warning: no previous 
prototype for ‘tu102_mc_intr_rearm’ [-Wmissing-prototypes]
 drivers/gpu/drm/nouveau/nvkm/subdev/mc/tu102.c:74:1: warning: no previous 
prototype for ‘tu102_mc_intr_mask’ [-Wmissing-prototypes]

Cc: Ben Skeggs 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Alistair Popple 
Cc: dri-de...@lists.freedesktop.org
Cc: nouveau@lists.freedesktop.org
Signed-off-by: Lee Jones 
---
 drivers/gpu/drm/nouveau/nvkm/subdev/mc/tu102.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mc/tu102.c 
b/drivers/gpu/drm/nouveau/nvkm/subdev/mc/tu102.c
index 58db83ebadc5f..a96084b34a788 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mc/tu102.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mc/tu102.c
@@ -46,7 +46,7 @@ tu102_mc_intr_update(struct tu102_mc *mc)
nvkm_wr32(device, 0xb81610, 0x6);
 }
 
-void
+static void
 tu102_mc_intr_unarm(struct nvkm_mc *base)
 {
struct tu102_mc *mc = tu102_mc(base);
@@ -58,7 +58,7 @@ tu102_mc_intr_unarm(struct nvkm_mc *base)
spin_unlock_irqrestore(>lock, flags);
 }
 
-void
+static void
 tu102_mc_intr_rearm(struct nvkm_mc *base)
 {
struct tu102_mc *mc = tu102_mc(base);
@@ -70,7 +70,7 @@ tu102_mc_intr_rearm(struct nvkm_mc *base)
spin_unlock_irqrestore(>lock, flags);
 }
 
-void
+static void
 tu102_mc_intr_mask(struct nvkm_mc *base, u32 mask, u32 intr)
 {
struct tu102_mc *mc = tu102_mc(base);
-- 
2.31.1

___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH v9 07/10] mm: Device exclusive memory access

2021-05-26 Thread John Hubbard

On 5/25/21 4:51 AM, Balbir Singh wrote:
...

How beneficial is this code to nouveau users?  I see that it permits a
part of OpenCL to be implemented, but how useful/important is this in
the real world?


That is a very good question! I've not reviewed the code, but a sample
program with the described use case would make things easy to parse.
I suspect that is not easy to build at the moment?



The cover letter says this:

This has been tested with upstream Mesa 21.1.0 and a simple OpenCL program
which checks that GPU atomic accesses to system memory are atomic. Without
this series the test fails as there is no way of write-protecting the page
mapping which results in the device clobbering CPU writes. For reference
the test is available at https://ozlabs.org/~apopple/opencl_svm_atomics/

Further testing has been performed by adding support for testing exclusive
access to the hmm-tests kselftests.

...so that seems to cover the "sample program" request, at least.


I wonder how we co-ordinate all the work the mm is doing, page migration,
reclaim with device exclusive access? Do we have any numbers for the worst
case page fault latency when something is marked away for exclusive access?


CPU page fault latency is approximately "terrible", if a page is resident on
the GPU. We have to spin up a DMA engine on the GPU and have it copy the page
over the PCIe bus, after all.


I presume for now this is anonymous memory only? SWP_DEVICE_EXCLUSIVE would


Yes, for now.


only impact the address space of programs using the GPU. Should the exclusively
marked range live in the unreclaimable list and recycled back to 
active/in-active
to account for the fact that

1. It is not reclaimable and reclaim will only hurt via page faults?
2. It ages the page correctly or at-least allows for that possibility when the
page is used by the GPU.


I'm not sure that that is *necessarily* something we can conclude. It depends 
upon
access patterns of each program. For example, a "reduction" parallel program 
sends
over lots of data to the GPU, and only a tiny bit of (reduced!) data comes back
to the CPU. In that case, freeing the physical page on the CPU is actually the
best decision for the OS to make (if the OS is sufficiently prescient).

thanks,
--
John Hubbard
NVIDIA
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau