On Wed, Oct 12, 2016 at 12:16:33PM +0100, Chris Wilson wrote:
> During rpm resume we restore the fences, but we do not have the
> protection of struct_mutex. This rules out updating the activity
> tracking on the fences, and requires us to rely on the rpm as the
> serialisation barrier instead.
> 
> [  350.298052] [drm:intel_runtime_resume [i915]] Resuming device
> [  350.308606]
> [  350.310520] ===============================
> [  350.315560] [ INFO: suspicious RCU usage. ]
> [  350.320554] 4.8.0-rc8-bsw-rapl+ #3133 Tainted: G     U  W
> [  350.327208] -------------------------------
> [  350.331977] ../drivers/gpu/drm/i915/i915_gem_request.h:371 suspicious 
> rcu_dereference_protected() usage!
> [  350.342619]
> [  350.342619] other info that might help us debug this:
> [  350.342619]
> [  350.351593]
> [  350.351593] rcu_scheduler_active = 1, debug_locks = 0
> [  350.358952] 3 locks held by Xorg/320:
> [  350.363077]  #0:  (&dev->mode_config.mutex){+.+.+.}, at: 
> [<ffffffffa030589c>] drm_modeset_lock_all+0x3c/0xd0 [drm]
> [  350.375162]  #1:  (crtc_ww_class_acquire){+.+.+.}, at: 
> [<ffffffffa03058a6>] drm_modeset_lock_all+0x46/0xd0 [drm]
> [  350.387022]  #2:  (crtc_ww_class_mutex){+.+.+.}, at: [<ffffffffa0305056>] 
> drm_modeset_lock+0x36/0x110 [drm]
> [  350.398236]
> [  350.398236] stack backtrace:
> [  350.403196] CPU: 1 PID: 320 Comm: Xorg Tainted: G     U  W       
> 4.8.0-rc8-bsw-rapl+ #3133
> [  350.412457] Hardware name: Intel Corporation CHERRYVIEW C0 
> PLATFORM/Braswell CRB, BIOS BRAS.X64.X088.R00.1510270350 10/27/2015
> [  350.425212]  0000000000000000 ffff8801680a78c8 ffffffff81332187 
> ffff88016c5c5000
> [  350.433611]  0000000000000001 ffff8801680a78f8 ffffffff810ca6da 
> ffff88016cc8b0f0
> [  350.442012]  ffff88016cc80000 ffff88016cc80000 ffff880177ad0000 
> ffff8801680a7948
> [  350.450409] Call Trace:
> [  350.453165]  [<ffffffff81332187>] dump_stack+0x67/0x90
> [  350.458931]  [<ffffffff810ca6da>] lockdep_rcu_suspicious+0xea/0x120
> [  350.466002]  [<ffffffffa039e8dd>] fence_update+0xbd/0x670 [i915]
> [  350.472766]  [<ffffffffa039efe2>] i915_gem_restore_fences+0x52/0x70 [i915]
> [  350.480496]  [<ffffffffa0368f42>] vlv_resume_prepare+0x72/0x570 [i915]
> [  350.487839]  [<ffffffffa0369802>] intel_runtime_resume+0x102/0x210 [i915]
> [  350.495442]  [<ffffffff8137f26f>] pci_pm_runtime_resume+0x7f/0xb0
> [  350.502274]  [<ffffffff8137f1f0>] ? pci_restore_standard_config+0x40/0x40
> [  350.509883]  [<ffffffff814401c5>] __rpm_callback+0x35/0x70
> [  350.516037]  [<ffffffff8137f1f0>] ? pci_restore_standard_config+0x40/0x40
> [  350.523646]  [<ffffffff81440224>] rpm_callback+0x24/0x80
> [  350.529604]  [<ffffffff8137f1f0>] ? pci_restore_standard_config+0x40/0x40
> [  350.537212]  [<ffffffff814417bd>] rpm_resume+0x4ad/0x740
> [  350.543161]  [<ffffffff81441aa1>] __pm_runtime_resume+0x51/0x80
> [  350.549824]  [<ffffffffa03889c8>] intel_runtime_pm_get+0x28/0x90 [i915]
> [  350.557265]  [<ffffffffa0388a53>] intel_display_power_get+0x23/0x50 [i915]
> [  350.565001]  [<ffffffffa03ef23d>] intel_atomic_commit_tail+0xdfd/0x10b0 
> [i915]
> [  350.573106]  [<ffffffffa034b2e9>] ? 
> drm_atomic_helper_swap_state+0x159/0x300 [drm_kms_helper]
> [  350.582659]  [<ffffffff81615091>] ? _raw_spin_unlock+0x31/0x50
> [  350.589205]  [<ffffffffa034b2e9>] ? 
> drm_atomic_helper_swap_state+0x159/0x300 [drm_kms_helper]
> [  350.598787]  [<ffffffffa03ef8a5>] intel_atomic_commit+0x3b5/0x500 [i915]
> [  350.606319]  [<ffffffffa03061dc>] ? 
> drm_atomic_set_crtc_for_connector+0xcc/0x100 [drm]
> [  350.615209]  [<ffffffffa0306b49>] drm_atomic_commit+0x49/0x50 [drm]
> [  350.622242]  [<ffffffffa034dee8>] drm_atomic_helper_set_config+0x88/0xc0 
> [drm_kms_helper]
> [  350.631419]  [<ffffffffa02f94ac>] drm_mode_set_config_internal+0x6c/0x120 
> [drm]
> [  350.639623]  [<ffffffffa02fa94c>] drm_mode_setcrtc+0x22c/0x4d0 [drm]
> [  350.646760]  [<ffffffffa02f0f19>] drm_ioctl+0x209/0x460 [drm]
> [  350.653217]  [<ffffffffa02fa720>] ? drm_mode_getcrtc+0x150/0x150 [drm]
> [  350.660536]  [<ffffffff810c984a>] ? __lock_is_held+0x4a/0x70
> [  350.666885]  [<ffffffff81202303>] do_vfs_ioctl+0x93/0x6b0
> [  350.672939]  [<ffffffff8120f843>] ? __fget+0x113/0x200
> [  350.678797]  [<ffffffff8120f735>] ? __fget+0x5/0x200
> [  350.684361]  [<ffffffff81202964>] SyS_ioctl+0x44/0x80
> [  350.690030]  [<ffffffff81001deb>] do_syscall_64+0x5b/0x120
> [  350.696184]  [<ffffffff81615ada>] entry_SYSCALL64_slow_path+0x25/0x25
> 
> Note we also have to remember the lesson from commit 4fc788f5ee3d
> ("drm/i915: Flush delayed fence releases after reset") where we have to
> flush any changes to the fence on restore.
> 
> v2: Replace call to release user mmaps with an assertion that they have
> already been zapped.
> 
> Fixes: 49ef5294cda2 ("drm/i915: Move fence tracking from object to vma")
> Reported-by: Ville Syrjälä <ville.syrj...@linux.intel.com>
> Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
> Cc: Ville Syrjälä <ville.syrj...@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahti...@linux.intel.com>
> Cc: Mika Kuoppala <mika.kuopp...@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_fence.c | 21 +++++++++++++++++++--
>  1 file changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_fence.c 
> b/drivers/gpu/drm/i915/i915_gem_fence.c
> index 8df1fa7234e8..f7081f4b5d22 100644
> --- a/drivers/gpu/drm/i915/i915_gem_fence.c
> +++ b/drivers/gpu/drm/i915/i915_gem_fence.c
> @@ -290,6 +290,8 @@ i915_vma_put_fence(struct i915_vma *vma)
>  {
>       struct drm_i915_fence_reg *fence = vma->fence;
>  
> +     assert_rpm_wakelock_held(to_i915(vma->vm->dev));
> +
>       if (!fence)
>               return 0;
>  
> @@ -341,6 +343,8 @@ i915_vma_get_fence(struct i915_vma *vma)
>       struct drm_i915_fence_reg *fence;
>       struct i915_vma *set = i915_gem_object_is_tiled(vma->obj) ? vma : NULL;
>  
> +     assert_rpm_wakelock_held(to_i915(vma->vm->dev));
> +
>       /* Just update our place in the LRU if our fence is getting reused. */
>       if (vma->fence) {
>               fence = vma->fence;
> @@ -371,6 +375,12 @@ void i915_gem_restore_fences(struct drm_device *dev)
>       struct drm_i915_private *dev_priv = to_i915(dev);
>       int i;
>  
> +     /* Note that this may be called outside of struct_mutex, by
> +      * runtime suspend/resume. The barrier we require is enforced by
> +      * rpm itself - all access to fences/GTT are only within an rpm
> +      * wakeref, and to acquire that wakeref you must pass through here.
> +      */
> +
>       for (i = 0; i < dev_priv->num_fence_regs; i++) {
>               struct drm_i915_fence_reg *reg = &dev_priv->fence_regs[i];
>               struct i915_vma *vma = reg->vma;
> @@ -379,10 +389,17 @@ void i915_gem_restore_fences(struct drm_device *dev)
>                * Commit delayed tiling changes if we have an object still
>                * attached to the fence, otherwise just clear the fence.
>                */
> -             if (vma && !i915_gem_object_is_tiled(vma->obj))
> +             if (vma && !i915_gem_object_is_tiled(vma->obj)) {
> +                     GEM_BUG_ON(!reg->dirty);
> +                     GEM_BUG_ON(!list_empty(&vma->obj->userfault_link));
> +
> +                     list_move(&reg->link, &dev_priv->mm.fence_list);
> +                     vma->fence = NULL;
>                       vma = NULL;
> +             }
>  
> -             fence_update(reg, vma);
> +             fence_write(reg, vma);
> +             reg->vma = vma;

Same comments as with the userfault_list: Using rpm ordering to enforce
consistency causes mild panic attacks here with me ;-)

Is the above (delayed tiling change commit) even possible here, at least
for rpm resume? Same for system s/r (both s3 and s4) since the pagetables
won't survive anyway. Can't we simply make this an impossibility?
-Daniel

>       }
>  }
>  
> -- 
> 2.9.3
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Reply via email to