Re: [PATCH 0/4] Backported amdgpu ttm deadlock fixes for 4.14

2017-12-04 Thread Greg KH
On Thu, Nov 30, 2017 at 07:23:02PM -0500, Lyude Paul wrote:
> I haven't gone to see where it started, but as of late a good number of
> pretty nasty deadlock issues have appeared with the kernel. Easy
> reproduction recipe on a laptop with i915/amdgpu prime with lockdep enabled:
> 
> DRI_PRIME=1 glxinfo
> 
> Additionally, some more race conditions exist that I've managed to
> trigger with piglit and lockdep enabled after applying these patches:
> 
> =============================
> WARNING: suspicious RCU usage
> 4.14.3Lyude-Test+ #2 Not tainted
> -----------------------------
> ./include/linux/reservation.h:216 suspicious rcu_dereference_protected() usage!
> 
> other info that might help us debug this:
> 
> rcu_scheduler_active = 2, debug_locks = 1
> 1 lock held by ext_image_dma_b/27451:
>  #0:  (reservation_ww_class_mutex){+.+.}, at: [] ttm_bo_unref+0x9f/0x3c0 [ttm]
> 
> stack backtrace:
> CPU: 0 PID: 27451 Comm: ext_image_dma_b Not tainted 4.14.3Lyude-Test+ #2
> Hardware name: HP HP ZBook 15 G4/8275, BIOS P70 Ver. 01.02 06/09/2017
> Call Trace:
>  dump_stack+0x8e/0xce
>  lockdep_rcu_suspicious+0xc5/0x100
>  reservation_object_copy_fences+0x292/0x2b0
>  ? ttm_bo_unref+0x9f/0x3c0 [ttm]
>  ttm_bo_unref+0xbd/0x3c0 [ttm]
>  amdgpu_bo_unref+0x2a/0x50 [amdgpu]
>  amdgpu_gem_object_free+0x4b/0x50 [amdgpu]
>  drm_gem_object_free+0x1f/0x40 [drm]
>  drm_gem_object_put_unlocked+0x40/0xb0 [drm]
>  drm_gem_object_handle_put_unlocked+0x6c/0xb0 [drm]
>  drm_gem_object_release_handle+0x51/0x90 [drm]
>  drm_gem_handle_delete+0x5e/0x90 [drm]
>  ? drm_gem_handle_create+0x40/0x40 [drm]
>  drm_gem_close_ioctl+0x20/0x30 [drm]
>  drm_ioctl_kernel+0x5d/0xb0 [drm]
>  drm_ioctl+0x2f7/0x3b0 [drm]
>  ? drm_gem_handle_create+0x40/0x40 [drm]
>  ? trace_hardirqs_on_caller+0xf4/0x190
>  ? trace_hardirqs_on+0xd/0x10
>  amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
>  do_vfs_ioctl+0x93/0x670
>  ? __fget+0x108/0x1f0
>  SyS_ioctl+0x79/0x90
>  entry_SYSCALL_64_fastpath+0x23/0xc2
> 
> I've also added the relevant fixes for the issue mentioned above.
> 
> Christian König (3):
>   drm/ttm: fix ttm_bo_cleanup_refs_or_queue once more
>   dma-buf: make reservation_object_copy_fences rcu save
>   drm/amdgpu: reserve root PD while releasing it
> 
> Michel Dänzer (1):
>   drm/ttm: Always and only destroy bo->ttm_resv in ttm_bo_release_list

All now queued up, thanks.

greg k-h
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 0/4] Backported amdgpu ttm deadlock fixes for 4.14

2017-12-01 Thread Christian König

On 01.12.2017 at 01:23, Lyude Paul wrote:

> I haven't gone to see where it started, but as of late a good number of
> pretty nasty deadlock issues have appeared with the kernel. Easy
> reproduction recipe on a laptop with i915/amdgpu prime with lockdep enabled:
>
> DRI_PRIME=1 glxinfo


Acked-by: Christian König 

Thanks for taking care of this,
Christian.



> Additionally, some more race conditions exist that I've managed to
> trigger with piglit and lockdep enabled after applying these patches:
>
> =============================
> WARNING: suspicious RCU usage
> 4.14.3Lyude-Test+ #2 Not tainted
> -----------------------------
> ./include/linux/reservation.h:216 suspicious rcu_dereference_protected() usage!
>
> other info that might help us debug this:
>
> rcu_scheduler_active = 2, debug_locks = 1
> 1 lock held by ext_image_dma_b/27451:
>  #0:  (reservation_ww_class_mutex){+.+.}, at: [] ttm_bo_unref+0x9f/0x3c0 [ttm]
>
> stack backtrace:
> CPU: 0 PID: 27451 Comm: ext_image_dma_b Not tainted 4.14.3Lyude-Test+ #2
> Hardware name: HP HP ZBook 15 G4/8275, BIOS P70 Ver. 01.02 06/09/2017
> Call Trace:
>  dump_stack+0x8e/0xce
>  lockdep_rcu_suspicious+0xc5/0x100
>  reservation_object_copy_fences+0x292/0x2b0
>  ? ttm_bo_unref+0x9f/0x3c0 [ttm]
>  ttm_bo_unref+0xbd/0x3c0 [ttm]
>  amdgpu_bo_unref+0x2a/0x50 [amdgpu]
>  amdgpu_gem_object_free+0x4b/0x50 [amdgpu]
>  drm_gem_object_free+0x1f/0x40 [drm]
>  drm_gem_object_put_unlocked+0x40/0xb0 [drm]
>  drm_gem_object_handle_put_unlocked+0x6c/0xb0 [drm]
>  drm_gem_object_release_handle+0x51/0x90 [drm]
>  drm_gem_handle_delete+0x5e/0x90 [drm]
>  ? drm_gem_handle_create+0x40/0x40 [drm]
>  drm_gem_close_ioctl+0x20/0x30 [drm]
>  drm_ioctl_kernel+0x5d/0xb0 [drm]
>  drm_ioctl+0x2f7/0x3b0 [drm]
>  ? drm_gem_handle_create+0x40/0x40 [drm]
>  ? trace_hardirqs_on_caller+0xf4/0x190
>  ? trace_hardirqs_on+0xd/0x10
>  amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
>  do_vfs_ioctl+0x93/0x670
>  ? __fget+0x108/0x1f0
>  SyS_ioctl+0x79/0x90
>  entry_SYSCALL_64_fastpath+0x23/0xc2
>
> I've also added the relevant fixes for the issue mentioned above.
>
> Christian König (3):
>   drm/ttm: fix ttm_bo_cleanup_refs_or_queue once more
>   dma-buf: make reservation_object_copy_fences rcu save
>   drm/amdgpu: reserve root PD while releasing it
>
> Michel Dänzer (1):
>   drm/ttm: Always and only destroy bo->ttm_resv in ttm_bo_release_list
>
>  drivers/dma-buf/reservation.c  | 56 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 13 ++--
>  drivers/gpu/drm/ttm/ttm_bo.c   | 43 +-
>  3 files changed, 74 insertions(+), 38 deletions(-)
>
> --
> 2.14.3
>
> ___
> amd-gfx mailing list
> amd-...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx




[PATCH 0/4] Backported amdgpu ttm deadlock fixes for 4.14

2017-11-30 Thread Lyude Paul
I haven't gone to see where it started, but as of late a good number of
pretty nasty deadlock issues have appeared with the kernel. Easy
reproduction recipe on a laptop with i915/amdgpu prime with lockdep enabled:

DRI_PRIME=1 glxinfo

Additionally, some more race conditions exist that I've managed to
trigger with piglit and lockdep enabled after applying these patches:

=============================
WARNING: suspicious RCU usage
4.14.3Lyude-Test+ #2 Not tainted
-----------------------------
./include/linux/reservation.h:216 suspicious rcu_dereference_protected() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
1 lock held by ext_image_dma_b/27451:
 #0:  (reservation_ww_class_mutex){+.+.}, at: [] ttm_bo_unref+0x9f/0x3c0 [ttm]

stack backtrace:
CPU: 0 PID: 27451 Comm: ext_image_dma_b Not tainted 4.14.3Lyude-Test+ #2
Hardware name: HP HP ZBook 15 G4/8275, BIOS P70 Ver. 01.02 06/09/2017
Call Trace:
 dump_stack+0x8e/0xce
 lockdep_rcu_suspicious+0xc5/0x100
 reservation_object_copy_fences+0x292/0x2b0
 ? ttm_bo_unref+0x9f/0x3c0 [ttm]
 ttm_bo_unref+0xbd/0x3c0 [ttm]
 amdgpu_bo_unref+0x2a/0x50 [amdgpu]
 amdgpu_gem_object_free+0x4b/0x50 [amdgpu]
 drm_gem_object_free+0x1f/0x40 [drm]
 drm_gem_object_put_unlocked+0x40/0xb0 [drm]
 drm_gem_object_handle_put_unlocked+0x6c/0xb0 [drm]
 drm_gem_object_release_handle+0x51/0x90 [drm]
 drm_gem_handle_delete+0x5e/0x90 [drm]
 ? drm_gem_handle_create+0x40/0x40 [drm]
 drm_gem_close_ioctl+0x20/0x30 [drm]
 drm_ioctl_kernel+0x5d/0xb0 [drm]
 drm_ioctl+0x2f7/0x3b0 [drm]
 ? drm_gem_handle_create+0x40/0x40 [drm]
 ? trace_hardirqs_on_caller+0xf4/0x190
 ? trace_hardirqs_on+0xd/0x10
 amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
 do_vfs_ioctl+0x93/0x670
 ? __fget+0x108/0x1f0
 SyS_ioctl+0x79/0x90
 entry_SYSCALL_64_fastpath+0x23/0xc2
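
For anyone who wants to compare this trace against other reports, the call chain can be pulled out of the text mechanically. What follows is only a minimal sketch in Python, assuming the trace is available as plain text in the "symbol+0xoffset/0xsize [module]" form shown above. The names `Frame`, `FRAME_RE` and `parse_call_trace` are hypothetical and are not part of any kernel tooling.

```python
import re
from typing import List, NamedTuple, Optional


class Frame(NamedTuple):
    symbol: str            # function name, e.g. "ttm_bo_unref"
    offset: int            # offset into the function
    size: int              # total size of the function
    module: Optional[str]  # module name from the brackets, None if absent
    reliable: bool         # False for entries the kernel printed with "?"


# Matches lines such as " ? ttm_bo_unref+0x9f/0x3c0 [ttm]".
FRAME_RE = re.compile(
    r"^(?P<guess>\?\s+)?"
    r"(?P<symbol>[A-Za-z_][\w.]*)"
    r"\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?$"
)


def parse_call_trace(text: str) -> List[Frame]:
    """Return every stack frame found in text, in the order printed."""
    frames = []
    for raw in text.splitlines():
        # Drop email quote markers and indentation before matching.
        line = raw.lstrip("> \t").rstrip()
        match = FRAME_RE.match(line)
        if match is None:
            continue  # header lines such as "Call Trace:" are skipped
        frames.append(Frame(
            symbol=match["symbol"],
            offset=int(match["offset"], 16),
            size=int(match["size"], 16),
            module=match["module"],
            reliable=match["guess"] is None,
        ))
    return frames
```

Filtering the result on `reliable` leaves only the frames the unwinder could verify, and filtering on `module == "ttm"` isolates the TTM frames.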

I've also added the relevant fixes for the issue mentioned above.

Christian König (3):
  drm/ttm: fix ttm_bo_cleanup_refs_or_queue once more
  dma-buf: make reservation_object_copy_fences rcu save
  drm/amdgpu: reserve root PD while releasing it

Michel Dänzer (1):
  drm/ttm: Always and only destroy bo->ttm_resv in ttm_bo_release_list

 drivers/dma-buf/reservation.c  | 56 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 13 ++--
 drivers/gpu/drm/ttm/ttm_bo.c   | 43 +-
 3 files changed, 74 insertions(+), 38 deletions(-)

--
2.14.3
