Re: [Intel-gfx] [PATCH v5 2/2] drm/i915/dgfx: Release mmap on rpm suspend
On 21/09/2022 06:29, Gupta, Anshuman wrote:
> > -Original Message-
> > From: Matthew Auld
> > Sent: Tuesday, September 20, 2022 7:30 PM
> > To: Gupta, Anshuman
> > Cc: intel-gfx@lists.freedesktop.org; ch...@chris-wilson.co.uk; Auld, Matthew; Vivi, Rodrigo
> > Subject: Re: [Intel-gfx] [PATCH v5 2/2] drm/i915/dgfx: Release mmap on rpm suspend
> >
> > On Tue, 13 Sept 2022 at 16:27, Anshuman Gupta wrote:
> > >
> > > Release all mmap mapping for all lmem objects which are associated
> > > with userfault such that, while pcie function in D3hot, any access
> > > to memory mappings will raise a userfault.
> > >
> > > Runtime resume the dgpu(when gem object lies in lmem).
> > > This will transition the dgpu graphics function to D0
> > > state if it was in D3 in order to access the mmap memory
> > > mappings.
> > >
> > > v2:
> > > - Squashes the patches. [Matt Auld]
> > > - Add adequate locking for lmem_userfault_list addition. [Matt Auld]
> > > - Reused obj->userfault_count to avoid double addition. [Matt Auld]
> > > - Added i915_gem_object_lock to check
> > >   i915_gem_object_is_lmem. [Matt Auld]
> > >
> > > v3:
> > > - Use i915_ttm_cpu_maps_iomem. [Matt Auld]
> > > - Fix 'ret == 0 to ret == VM_FAULT_NOPAGE'. [Matt Auld]
> > > - Reuse obj->userfault_count as a bool 0 or 1. [Matt Auld]
> > > - Delete the mmaped obj from lmem_userfault_list in obj
> > >   destruction path. [Matt Auld]
> > > - Get a wakeref for object destruction patch. [Matt Auld]
> > > - Use intel_wakeref_auto to delay runtime PM. [Matt Auld]
> > >
> > > v4:
> > > - Avoid using mmo offset to get the vma_node. [Matt Auld]
> > > - Added comment to use the lmem_userfault_lock. [Matt Auld]
> > > - Get lmem_userfault_lock in i915_gem_object_release_mmap_offset.
> > >   [Matt Auld]
> > > - Fixed kernel test robot generated warning.
> > >
> > > v5:
> > > - Addressed the cosmetics comments. [Andi]
> > > - Changed i915_gem_runtime_pm_object_release_mmap_offset() name to
> > >   i915_gem_object_runtime_pm_release_mmap_offset() to be rhythmic.
> > >
> > > PCIe Specs 5.3.1.4.1
> > >
> > > Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6331
> > > Cc: Matthew Auld
> > > Cc: Rodrigo Vivi
> > > Signed-off-by: Anshuman Gupta
> > > Reviewed-by: Andi Shyti
> > > ---
> > >  drivers/gpu/drm/i915/gem/i915_gem_mman.c         | 21 +++
> > >  drivers/gpu/drm/i915/gem/i915_gem_mman.h         |  1 +
> > >  drivers/gpu/drm/i915/gem/i915_gem_object.c       |  2 +-
> > >  .../gpu/drm/i915/gem/i915_gem_object_types.h     |  3 +-
> > >  drivers/gpu/drm/i915/gem/i915_gem_ttm.c          | 36 +--
> > >  drivers/gpu/drm/i915/gt/intel_gt.c               |  2 ++
> > >  drivers/gpu/drm/i915/gt/intel_gt_types.h         | 14
> > >  drivers/gpu/drm/i915/i915_gem.c                  |  4 +++
> > >  8 files changed, 79 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> > > index b55befda3387..73d9eda1d6b7 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> > > @@ -550,6 +550,20 @@ void i915_gem_object_release_mmap_gtt(struct drm_i915_gem_object *obj)
> > >  	intel_runtime_pm_put(&i915->runtime_pm, wakeref);
> > >  }
> > >
> > > +void i915_gem_object_runtime_pm_release_mmap_offset(struct drm_i915_gem_object *obj)
> > > +{
> > > +	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
> > > +	struct ttm_device *bdev = bo->bdev;
> > > +
> > > +	drm_vma_node_unmap(&bo->base.vma_node, bdev->dev_mapping);
> > > +
> > > +	if (obj->userfault_count) {
> > > +		/* rpm wakeref provide exclusive access */
> > > +		list_del(&obj->userfault_link);
> > > +		obj->userfault_count = 0;
> > > +	}
> > > +}
> > > +
> > >  void i915_gem_object_release_mmap_offset(struct drm_i915_gem_object *obj)
> > >  {
> > >  	struct i915_mmap_offset *mmo, *mn;
> > > @@ -573,6 +587,13 @@ void i915_gem_object_release_mmap_offset(struct drm_i915_gem_object *obj)
> > >  		spin_lock(&obj->mmo.lock);
> > >  	}
> > >  	spin_unlock(&obj->mmo.lock);
> > > +
> > > +	if (obj->userfault_count) {
> > > +		mutex_lock(&to_gt(to_i915(obj->base.dev))->lmem_userfault_lock);
> > > +		list_del(&obj->userfault_link);
> > > +		mutex_unlock(&to_gt(to_i915(obj->base.dev))->lmem_userfault_lock);
> > > +		obj->userfault_count = 0;
> > > +	}
> >
> > Sorry for the late reply, I was out last week. This looks like it's
> > missing holding the runtime pm AFAICT. We are holding the runtime pm for
> > object destruction, but this is also called when a move is triggered
> > (very common). If so, this can race against the runtime pm also touching
> > the list concurrently. We are chasing some crazy looking NULL deref
> > bugs, so wondering if this is somehow related...
>
> Yes it is called from i915_ttm_move_notify(), I missed it thinking of
> __i915_gem_object_pages_fini
> Would be sufficient to protect against runtime PM.
> Having said that, it ok to remove the wakeref from i915_ttm_delete_mem_notify
> and having only in one place in i915_gem_object_release_mmap_offset ?
> If that is the case then is it safer to use the i915_gem_object_is_lmem() or
> we should use i915_ttm_cpu_maps_iomem() here ?

Yeah, maybe we should just throw this into i915_ttm_unmap_virtual(). Someth
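
The concern raised above is that one path removes an object from the userfault list under lmem_userfault_lock, while the runtime PM path edits the same list and relies on the wakeref for exclusion instead. The following is a minimal userspace sketch of the general idea of making every remover agree on one lock. It assumes a pthread mutex in place of the kernel mutex and a hand-rolled list in place of list_head. All names in it (track, untrack, run_model, list_is_empty) are hypothetical and are not i915 symbols, and it is not the change that was eventually made to the driver.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define NOBJ 64

/* Hypothetical stand-ins for list_head and the gem object. */
struct node { struct node *prev, *next; };
struct obj {
	struct node link;
	bool on_list;	/* plays the role of userfault_count used as 0/1 */
};

static struct node userfault_list = { &userfault_list, &userfault_list };
static pthread_mutex_t userfault_lock = PTHREAD_MUTEX_INITIALIZER;
static struct obj objs[NOBJ];
static int removals;	/* only modified with userfault_lock held */

static void node_add(struct node *n, struct node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

static void node_del(struct node *n)
{
	assert(n->prev && n->next);	/* a second delete would trip here */
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = NULL;
}

/* Fault path in this model: add the object once. */
static void track(struct obj *o)
{
	pthread_mutex_lock(&userfault_lock);
	if (!o->on_list) {
		node_add(&o->link, &userfault_list);
		o->on_list = true;
	}
	pthread_mutex_unlock(&userfault_lock);
}

/* Shared by every remover; the flag is tested and cleared under the lock. */
static void untrack(struct obj *o)
{
	pthread_mutex_lock(&userfault_lock);
	if (o->on_list) {
		node_del(&o->link);
		o->on_list = false;
		removals++;
	}
	pthread_mutex_unlock(&userfault_lock);
}

static void *untrack_all(void *arg)
{
	(void)arg;
	for (int i = 0; i < NOBJ; i++)
		untrack(&objs[i]);
	return NULL;
}

static bool list_is_empty(void)
{
	return userfault_list.next == &userfault_list &&
	       userfault_list.prev == &userfault_list;
}

/* Two concurrent removers model "runtime suspend" and "move/destroy". */
static int run_model(void)
{
	pthread_t suspend_path, move_path;

	removals = 0;
	for (int i = 0; i < NOBJ; i++)
		track(&objs[i]);

	pthread_create(&suspend_path, NULL, untrack_all, NULL);
	pthread_create(&move_path, NULL, untrack_all, NULL);
	pthread_join(suspend_path, NULL);
	pthread_join(move_path, NULL);

	return removals;
}
```

Calling run_model() and then list_is_empty() checks that each object was unlinked exactly once. Note that this model tests the flag inside the lock, whereas the hunk quoted above tests obj->userfault_count before taking lmem_userfault_lock.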
Re: [Intel-gfx] [PATCH v5 2/2] drm/i915/dgfx: Release mmap on rpm suspend
On Tue, 13 Sept 2022 at 16:27, Anshuman Gupta wrote:
>
> Release all mmap mapping for all lmem objects which are associated
> with userfault such that, while pcie function in D3hot, any access
> to memory mappings will raise a userfault.
>
> Runtime resume the dgpu(when gem object lies in lmem).
> This will transition the dgpu graphics function to D0
> state if it was in D3 in order to access the mmap memory
> mappings.
>
> v2:
> - Squashes the patches. [Matt Auld]
> - Add adequate locking for lmem_userfault_list addition. [Matt Auld]
> - Reused obj->userfault_count to avoid double addition. [Matt Auld]
> - Added i915_gem_object_lock to check
>   i915_gem_object_is_lmem. [Matt Auld]
>
> v3:
> - Use i915_ttm_cpu_maps_iomem. [Matt Auld]
> - Fix 'ret == 0 to ret == VM_FAULT_NOPAGE'. [Matt Auld]
> - Reuse obj->userfault_count as a bool 0 or 1. [Matt Auld]
> - Delete the mmaped obj from lmem_userfault_list in obj
>   destruction path. [Matt Auld]
> - Get a wakeref for object destruction patch. [Matt Auld]
> - Use intel_wakeref_auto to delay runtime PM. [Matt Auld]
>
> v4:
> - Avoid using mmo offset to get the vma_node. [Matt Auld]
> - Added comment to use the lmem_userfault_lock. [Matt Auld]
> - Get lmem_userfault_lock in i915_gem_object_release_mmap_offset.
>   [Matt Auld]
> - Fixed kernel test robot generated warning.
>
> v5:
> - Addressed the cosmetics comments. [Andi]
> - Changed i915_gem_runtime_pm_object_release_mmap_offset() name to
>   i915_gem_object_runtime_pm_release_mmap_offset() to be rhythmic.
>
> PCIe Specs 5.3.1.4.1
>
> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6331
> Cc: Matthew Auld
> Cc: Rodrigo Vivi
> Signed-off-by: Anshuman Gupta
> Reviewed-by: Andi Shyti
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_mman.c         | 21 +++
>  drivers/gpu/drm/i915/gem/i915_gem_mman.h         |  1 +
>  drivers/gpu/drm/i915/gem/i915_gem_object.c       |  2 +-
>  .../gpu/drm/i915/gem/i915_gem_object_types.h     |  3 +-
>  drivers/gpu/drm/i915/gem/i915_gem_ttm.c          | 36 +--
>  drivers/gpu/drm/i915/gt/intel_gt.c               |  2 ++
>  drivers/gpu/drm/i915/gt/intel_gt_types.h         | 14
>  drivers/gpu/drm/i915/i915_gem.c                  |  4 +++
>  8 files changed, 79 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> index b55befda3387..73d9eda1d6b7 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> @@ -550,6 +550,20 @@ void i915_gem_object_release_mmap_gtt(struct drm_i915_gem_object *obj)
>  	intel_runtime_pm_put(&i915->runtime_pm, wakeref);
>  }
>
> +void i915_gem_object_runtime_pm_release_mmap_offset(struct drm_i915_gem_object *obj)
> +{
> +	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
> +	struct ttm_device *bdev = bo->bdev;
> +
> +	drm_vma_node_unmap(&bo->base.vma_node, bdev->dev_mapping);
> +
> +	if (obj->userfault_count) {
> +		/* rpm wakeref provide exclusive access */
> +		list_del(&obj->userfault_link);
> +		obj->userfault_count = 0;
> +	}
> +}
> +
>  void i915_gem_object_release_mmap_offset(struct drm_i915_gem_object *obj)
>  {
>  	struct i915_mmap_offset *mmo, *mn;
> @@ -573,6 +587,13 @@ void i915_gem_object_release_mmap_offset(struct drm_i915_gem_object *obj)
>  		spin_lock(&obj->mmo.lock);
>  	}
>  	spin_unlock(&obj->mmo.lock);
> +
> +	if (obj->userfault_count) {
> +		mutex_lock(&to_gt(to_i915(obj->base.dev))->lmem_userfault_lock);
> +		list_del(&obj->userfault_link);
> +		mutex_unlock(&to_gt(to_i915(obj->base.dev))->lmem_userfault_lock);
> +		obj->userfault_count = 0;
> +	}

Sorry for the late reply, I was out last week. This looks like it's
missing holding the runtime pm AFAICT. We are holding the runtime pm for
object destruction, but this is also called when a move is triggered
(very common). If so, this can race against the runtime pm also touching
the list concurrently. We are chasing some crazy looking NULL deref
bugs, so wondering if this is somehow related...

>  }
>
>  static struct i915_mmap_offset *
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.h b/drivers/gpu/drm/i915/gem/i915_gem_mman.h
> index efee9e0d2508..1fa91b3033b3 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.h
> @@ -27,6 +27,7 @@ int i915_gem_dumb_mmap_offset(struct drm_file *file_priv,
>  void __i915_gem_object_release_mmap_gtt(struct drm_i915_gem_object *obj);
>  void i915_gem_object_release_mmap_gtt(struct drm_i915_gem_object *obj);
>
> +void i915_gem_object_runtime_pm_release_mmap_offset(struct drm_i915_gem_object *obj);
>  void i915_gem_object_release_mmap_offset(struct drm_i915_gem_object *obj);
>
>  #endif
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i9
[Intel-gfx] [PATCH v5 2/2] drm/i915/dgfx: Release mmap on rpm suspend
Release all mmap mapping for all lmem objects which are associated
with userfault such that, while pcie function in D3hot, any access
to memory mappings will raise a userfault.

Runtime resume the dgpu(when gem object lies in lmem).
This will transition the dgpu graphics function to D0
state if it was in D3 in order to access the mmap memory
mappings.

v2:
- Squashes the patches. [Matt Auld]
- Add adequate locking for lmem_userfault_list addition. [Matt Auld]
- Reused obj->userfault_count to avoid double addition. [Matt Auld]
- Added i915_gem_object_lock to check
  i915_gem_object_is_lmem. [Matt Auld]

v3:
- Use i915_ttm_cpu_maps_iomem. [Matt Auld]
- Fix 'ret == 0 to ret == VM_FAULT_NOPAGE'. [Matt Auld]
- Reuse obj->userfault_count as a bool 0 or 1. [Matt Auld]
- Delete the mmaped obj from lmem_userfault_list in obj
  destruction path. [Matt Auld]
- Get a wakeref for object destruction patch. [Matt Auld]
- Use intel_wakeref_auto to delay runtime PM. [Matt Auld]

v4:
- Avoid using mmo offset to get the vma_node. [Matt Auld]
- Added comment to use the lmem_userfault_lock. [Matt Auld]
- Get lmem_userfault_lock in i915_gem_object_release_mmap_offset.
  [Matt Auld]
- Fixed kernel test robot generated warning.

v5:
- Addressed the cosmetics comments. [Andi]
- Changed i915_gem_runtime_pm_object_release_mmap_offset() name to
  i915_gem_object_runtime_pm_release_mmap_offset() to be rhythmic.

PCIe Specs 5.3.1.4.1

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6331
Cc: Matthew Auld
Cc: Rodrigo Vivi
Signed-off-by: Anshuman Gupta
Reviewed-by: Andi Shyti
---
 drivers/gpu/drm/i915/gem/i915_gem_mman.c         | 21 +++
 drivers/gpu/drm/i915/gem/i915_gem_mman.h         |  1 +
 drivers/gpu/drm/i915/gem/i915_gem_object.c       |  2 +-
 .../gpu/drm/i915/gem/i915_gem_object_types.h     |  3 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c          | 36 +--
 drivers/gpu/drm/i915/gt/intel_gt.c               |  2 ++
 drivers/gpu/drm/i915/gt/intel_gt_types.h         | 14
 drivers/gpu/drm/i915/i915_gem.c                  |  4 +++
 8 files changed, 79 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index b55befda3387..73d9eda1d6b7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -550,6 +550,20 @@ void i915_gem_object_release_mmap_gtt(struct drm_i915_gem_object *obj)
 	intel_runtime_pm_put(&i915->runtime_pm, wakeref);
 }

+void i915_gem_object_runtime_pm_release_mmap_offset(struct drm_i915_gem_object *obj)
+{
+	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
+	struct ttm_device *bdev = bo->bdev;
+
+	drm_vma_node_unmap(&bo->base.vma_node, bdev->dev_mapping);
+
+	if (obj->userfault_count) {
+		/* rpm wakeref provide exclusive access */
+		list_del(&obj->userfault_link);
+		obj->userfault_count = 0;
+	}
+}
+
 void i915_gem_object_release_mmap_offset(struct drm_i915_gem_object *obj)
 {
 	struct i915_mmap_offset *mmo, *mn;
@@ -573,6 +587,13 @@ void i915_gem_object_release_mmap_offset(struct drm_i915_gem_object *obj)
 		spin_lock(&obj->mmo.lock);
 	}
 	spin_unlock(&obj->mmo.lock);
+
+	if (obj->userfault_count) {
+		mutex_lock(&to_gt(to_i915(obj->base.dev))->lmem_userfault_lock);
+		list_del(&obj->userfault_link);
+		mutex_unlock(&to_gt(to_i915(obj->base.dev))->lmem_userfault_lock);
+		obj->userfault_count = 0;
+	}
 }

 static struct i915_mmap_offset *
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.h b/drivers/gpu/drm/i915/gem/i915_gem_mman.h
index efee9e0d2508..1fa91b3033b3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.h
@@ -27,6 +27,7 @@ int i915_gem_dumb_mmap_offset(struct drm_file *file_priv,
 void __i915_gem_object_release_mmap_gtt(struct drm_i915_gem_object *obj);
 void i915_gem_object_release_mmap_gtt(struct drm_i915_gem_object *obj);

+void i915_gem_object_runtime_pm_release_mmap_offset(struct drm_i915_gem_object *obj);
 void i915_gem_object_release_mmap_offset(struct drm_i915_gem_object *obj);

 #endif
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 85482a04d158..7ff9c7877bec 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -238,7 +238,7 @@ static void __i915_gem_object_free_mmaps(struct drm_i915_gem_object *obj)
 {
 	/* Skip serialisation and waking the device if known to be not used. */

-	if (obj->userfault_count)
+	if (obj->userfault_count && !IS_DGFX(to_i915(obj->base.dev)))
 		i915_gem_object_release_mmap_gtt(obj);

 	if (!RB_EMPTY_ROOT(&obj->mmo.offsets)) {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 9f
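
The lifecycle described in the commit message (mappings are dropped on runtime suspend, so the next CPU access faults and wakes the device) can be pictured with a small single-threaded userspace model. This is only a sketch under stated assumptions: the types and functions below (model_dev, model_obj, model_access, model_fault, model_runtime_suspend) are hypothetical, a plain singly linked list stands in for lmem_userfault_list, and no locking or real power management is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum power_state { D0, D3HOT };

/* Hypothetical model types; not i915 structures. */
struct model_obj {
	bool mapped;		/* CPU mapping currently valid */
	bool on_fault_list;	/* like userfault_count used as 0/1 */
	struct model_obj *next;
};

struct model_dev {
	enum power_state state;
	struct model_obj *fault_list;	/* objects with live CPU mappings */
	int resumes;			/* how often a fault woke the device */
};

/* Fault handler: wake the device if needed, map, and track the object. */
static void model_fault(struct model_dev *dev, struct model_obj *obj)
{
	assert(dev && obj);

	if (dev->state == D3HOT) {
		dev->state = D0;
		dev->resumes++;
	}

	obj->mapped = true;
	if (!obj->on_fault_list) {
		obj->next = dev->fault_list;
		dev->fault_list = obj;
		obj->on_fault_list = true;
	}
}

/* A CPU access only reaches the fault handler when no mapping exists. */
static void model_access(struct model_dev *dev, struct model_obj *obj)
{
	if (!obj->mapped)
		model_fault(dev, obj);
}

/* Runtime suspend: drop every tracked mapping, then enter D3hot. */
static void model_runtime_suspend(struct model_dev *dev)
{
	while (dev->fault_list) {
		struct model_obj *obj = dev->fault_list;

		dev->fault_list = obj->next;
		obj->next = NULL;
		obj->mapped = false;
		obj->on_fault_list = false;
	}
	dev->state = D3HOT;
}
```

In this model, an access after model_runtime_suspend() always goes through model_fault(), which is the point at which the device is brought back to D0 before the mapping is re-established.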