[PATCH] dma-buf: Use fence_get_rcu_safe() for retrieving the exclusive fence

2016-11-15 Thread Daniel Vetter
On Mon, Nov 14, 2016 at 04:37:09PM +0100, Christian König wrote:
> Am 14.11.2016 um 12:55 schrieb Chris Wilson:
> > The current code is subject to a race where we may try to acquire a
> > reference on a stale fence:
> > 
> > [13703.335118] WARNING: CPU: 1 PID: 14975 at ./include/linux/kref.h:46 
> > i915_gem_object_wait+0x1a3/0x1c0
> > [13703.335184] Modules linked in:
> > [13703.335202] CPU: 1 PID: 14975 Comm: gem_concurrent_ Not tainted 
> > 4.9.0-rc4+ #26
> > [13703.335216] Hardware name:  /, BIOS 
> > PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015
> > [13703.335233]  c90002f5bcc8 812807de  
> > 
> > [13703.335257]  c90002f5bd08 81073811 002e8000 
> > 88026bf7c780
> > [13703.335279]  7fff 0001 88027045a550 
> > 88026bf7c780
> > [13703.335301] Call Trace:
> > [13703.335316]  [] dump_stack+0x4d/0x6f
> > [13703.335331]  [] __warn+0xc1/0xe0
> > [13703.335343]  [] warn_slowpath_null+0x18/0x20
> > [13703.335355]  [] i915_gem_object_wait+0x1a3/0x1c0
> > [13703.335367]  [] i915_gem_set_domain_ioctl+0xcc/0x330
> > [13703.335386]  [] drm_ioctl+0x1cb/0x410
> > [13703.335400]  [] ? 
> > i915_gem_obj_prepare_shmem_write+0x1d0/0x1d0
> > [13703.335416]  [] ? drm_ioctl+0x2bb/0x410
> > [13703.335429]  [] do_vfs_ioctl+0x8f/0x5c0
> > [13703.335442]  [] SyS_ioctl+0x3c/0x70
> > [13703.335456]  [] entry_SYSCALL_64_fastpath+0x17/0x98
> > [13703.335558] ---[ end trace fd24176416ba6981 ]---
> > [13703.382778] general protection fault:  [#1] SMP
> > [13703.382802] Modules linked in:
> > [13703.382816] CPU: 1 PID: 14967 Comm: gem_concurrent_ Tainted: GW  
> >  4.9.0-rc4+ #26
> > [13703.382828] Hardware name:  /, BIOS 
> > PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015
> > [13703.382841] task: 880275458000 task.stack: c90002f18000
> > [13703.382849] RIP: 0010:[]  [] 
> > i915_gem_request_retire+0x2b4/0x320
> > [13703.382870] RSP: 0018:c90002f1bbc8  EFLAGS: 00010293
> > [13703.382878] RAX: dead0200 RBX: 88026bf7dce8 RCX: 
> > dead0100
> > [13703.382887] RDX: dead0100 RSI: 88026bf7c930 RDI: 
> > 88026bf7dd00
> > [13703.382897] RBP: c90002f1bbf8 R08:  R09: 
> > 88026b89a000
> > [13703.382905] R10: 0001 R11: 88026bbe8fe0 R12: 
> > 88026bf7c000
> > [13703.382913] R13: 880275af8000 R14: 88026bf7c180 R15: 
> > dead0200
> > [13703.382922] FS:  7f89e787d740() GS:88027fd0() 
> > knlGS:
> > [13703.382934] CS:  0010 DS:  ES:  CR0: 80050033
> > [13703.382942] CR2: 7f9053d2e000 CR3: 00026d414000 CR4: 
> > 001006e0
> > [13703.382951] Stack:
> > [13703.382958]  880275413000 c90002f1bde8 880275af8000 
> > 880274e8a600
> > [13703.382976]  880276a06000 c90002f1bde8 c90002f1bc38 
> > 813b48c5
> > [13703.382995]  c90002f1bc00 c90002f1bde8 88026972a440 
> > 
> > [13703.383021] Call Trace:
> > [13703.383032]  [] i915_gem_request_alloc+0xa5/0x350
> > [13703.383043]  [] 
> > i915_gem_do_execbuffer.isra.41+0x7b3/0x18b0
> > [13703.383055]  [] ? i915_gem_object_get_sg+0x25c/0x2b0
> > [13703.383065]  [] ? i915_gem_object_get_page+0x1d/0x50
> > [13703.383076]  [] ? i915_gem_pwrite_ioctl+0x66c/0x6d0
> > [13703.383086]  [] i915_gem_execbuffer2+0x95/0x1e0
> > [13703.383096]  [] drm_ioctl+0x1cb/0x410
> > [13703.383105]  [] ? i915_gem_execbuffer+0x2d0/0x2d0
> > [13703.383117]  [] ? hrtimer_start_range_ns+0x1a0/0x310
> > [13703.383128]  [] do_vfs_ioctl+0x8f/0x5c0
> > [13703.383140]  [] ? SyS_timer_settime+0x118/0x1a0
> > [13703.383150]  [] SyS_ioctl+0x3c/0x70
> > [13703.383162]  [] entry_SYSCALL_64_fastpath+0x17/0x98
> > [13703.383172] Code: 49 39 c6 48 8d 70 e8 48 8d 5f e8 75 16 eb 47 48 8d 43 
> > 18 48 8b 53 18 48 89 de 49 39 c6 48 8d 5a e8 74 33 48 8b 56 08 48 8b 46 10 
> > <48> 89 42 08 48 89 10 f6 46 38 01 48 89 4e 08 4c 89 7e 10 74 cf
> > [13703.383557] RIP  [] i915_gem_request_retire+0x2b4/0x320
> > [13703.383570]  RSP 
> > [13703.383586] ---[ end trace fd24176416ba6982 ]---
> > 
> > This is fixed by using the kref_get_unless_zero() as a full memory
> > barrier to validate the fence is still the current exclusive fence before
> > returning it back to the caller. (Note the fix only requires using
> > dma_fence_get_rcu() and correct handling, but we may as well use the
> > helper rather than inline equivalent code.)
> > 
> > Signed-off-by: Chris Wilson 
> > Cc: Sumit Semwal  
> Reviewed-by: Christian König .

Applied to drm-misc, thanks.
-Daniel

> 
> > ---
> >   include/linux/reservation.h | 15 ++-
> >   1 file changed, 6 insertions(+), 9 deletions(-)
> > 
> > diff --git a/include/linux/reservation.h b/include/linux/reservation.h
> > index 2e313cca08f0..d9706a6f5ae2 100644
> > --- a/include/linux/reservation.h
> > +++ b/include/linux/reservation.h
> > @@ 

[PATCH] dma-buf: Use fence_get_rcu_safe() for retrieving the exclusive fence

2016-11-14 Thread Christian König
Am 14.11.2016 um 12:55 schrieb Chris Wilson:
> The current code is subject to a race where we may try to acquire a
> reference on a stale fence:
>
> [13703.335118] WARNING: CPU: 1 PID: 14975 at ./include/linux/kref.h:46 
> i915_gem_object_wait+0x1a3/0x1c0
> [13703.335184] Modules linked in:
> [13703.335202] CPU: 1 PID: 14975 Comm: gem_concurrent_ Not tainted 4.9.0-rc4+ 
> #26
> [13703.335216] Hardware name:  /, BIOS 
> PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015
> [13703.335233]  c90002f5bcc8 812807de  
> 
> [13703.335257]  c90002f5bd08 81073811 002e8000 
> 88026bf7c780
> [13703.335279]  7fff 0001 88027045a550 
> 88026bf7c780
> [13703.335301] Call Trace:
> [13703.335316]  [] dump_stack+0x4d/0x6f
> [13703.335331]  [] __warn+0xc1/0xe0
> [13703.335343]  [] warn_slowpath_null+0x18/0x20
> [13703.335355]  [] i915_gem_object_wait+0x1a3/0x1c0
> [13703.335367]  [] i915_gem_set_domain_ioctl+0xcc/0x330
> [13703.335386]  [] drm_ioctl+0x1cb/0x410
> [13703.335400]  [] ? 
> i915_gem_obj_prepare_shmem_write+0x1d0/0x1d0
> [13703.335416]  [] ? drm_ioctl+0x2bb/0x410
> [13703.335429]  [] do_vfs_ioctl+0x8f/0x5c0
> [13703.335442]  [] SyS_ioctl+0x3c/0x70
> [13703.335456]  [] entry_SYSCALL_64_fastpath+0x17/0x98
> [13703.335558] ---[ end trace fd24176416ba6981 ]---
> [13703.382778] general protection fault:  [#1] SMP
> [13703.382802] Modules linked in:
> [13703.382816] CPU: 1 PID: 14967 Comm: gem_concurrent_ Tainted: GW
>4.9.0-rc4+ #26
> [13703.382828] Hardware name:  /, BIOS 
> PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015
> [13703.382841] task: 880275458000 task.stack: c90002f18000
> [13703.382849] RIP: 0010:[]  [] 
> i915_gem_request_retire+0x2b4/0x320
> [13703.382870] RSP: 0018:c90002f1bbc8  EFLAGS: 00010293
> [13703.382878] RAX: dead0200 RBX: 88026bf7dce8 RCX: 
> dead0100
> [13703.382887] RDX: dead0100 RSI: 88026bf7c930 RDI: 
> 88026bf7dd00
> [13703.382897] RBP: c90002f1bbf8 R08:  R09: 
> 88026b89a000
> [13703.382905] R10: 0001 R11: 88026bbe8fe0 R12: 
> 88026bf7c000
> [13703.382913] R13: 880275af8000 R14: 88026bf7c180 R15: 
> dead0200
> [13703.382922] FS:  7f89e787d740() GS:88027fd0() 
> knlGS:
> [13703.382934] CS:  0010 DS:  ES:  CR0: 80050033
> [13703.382942] CR2: 7f9053d2e000 CR3: 00026d414000 CR4: 
> 001006e0
> [13703.382951] Stack:
> [13703.382958]  880275413000 c90002f1bde8 880275af8000 
> 880274e8a600
> [13703.382976]  880276a06000 c90002f1bde8 c90002f1bc38 
> 813b48c5
> [13703.382995]  c90002f1bc00 c90002f1bde8 88026972a440 
> 
> [13703.383021] Call Trace:
> [13703.383032]  [] i915_gem_request_alloc+0xa5/0x350
> [13703.383043]  [] 
> i915_gem_do_execbuffer.isra.41+0x7b3/0x18b0
> [13703.383055]  [] ? i915_gem_object_get_sg+0x25c/0x2b0
> [13703.383065]  [] ? i915_gem_object_get_page+0x1d/0x50
> [13703.383076]  [] ? i915_gem_pwrite_ioctl+0x66c/0x6d0
> [13703.383086]  [] i915_gem_execbuffer2+0x95/0x1e0
> [13703.383096]  [] drm_ioctl+0x1cb/0x410
> [13703.383105]  [] ? i915_gem_execbuffer+0x2d0/0x2d0
> [13703.383117]  [] ? hrtimer_start_range_ns+0x1a0/0x310
> [13703.383128]  [] do_vfs_ioctl+0x8f/0x5c0
> [13703.383140]  [] ? SyS_timer_settime+0x118/0x1a0
> [13703.383150]  [] SyS_ioctl+0x3c/0x70
> [13703.383162]  [] entry_SYSCALL_64_fastpath+0x17/0x98
> [13703.383172] Code: 49 39 c6 48 8d 70 e8 48 8d 5f e8 75 16 eb 47 48 8d 43 18 
> 48 8b 53 18 48 89 de 49 39 c6 48 8d 5a e8 74 33 48 8b 56 08 48 8b 46 10 <48> 
> 89 42 08 48 89 10 f6 46 38 01 48 89 4e 08 4c 89 7e 10 74 cf
> [13703.383557] RIP  [] i915_gem_request_retire+0x2b4/0x320
> [13703.383570]  RSP 
> [13703.383586] ---[ end trace fd24176416ba6982 ]---
>
> This is fixed by using the kref_get_unless_zero() as a full memory
> barrier to validate the fence is still the current exclusive fence before
> returning it back to the caller. (Note the fix only requires using
> dma_fence_get_rcu() and correct handling, but we may as well use the
> helper rather than inline equivalent code.)
>
> Signed-off-by: Chris Wilson 
> Cc: Sumit Semwal .

> ---
>   include/linux/reservation.h | 15 ++-
>   1 file changed, 6 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/reservation.h b/include/linux/reservation.h
> index 2e313cca08f0..d9706a6f5ae2 100644
> --- a/include/linux/reservation.h
> +++ b/include/linux/reservation.h
> @@ -177,17 +177,14 @@ static inline struct dma_fence *
>   reservation_object_get_excl_rcu(struct reservation_object *obj)
>   {
>   struct dma_fence *fence;
> - unsigned seq;
> -retry:
> - seq = read_seqcount_begin(>seq);
> +
> + if (!rcu_access_pointer(obj->fence_excl))
> + return NULL;
> +
>   rcu_read_lock();
> -  

[PATCH] dma-buf: Use fence_get_rcu_safe() for retrieving the exclusive fence

2016-11-14 Thread Chris Wilson
On Mon, Nov 14, 2016 at 11:55:40AM +, Chris Wilson wrote:
> The current code is subject to a race where we may try to acquire a
> reference on a stale fence:

>From i915.ko pov, this
Fixes: d07f0e59b2c7 ("drm/i915: Move GEM activity tracking into a common struct 
reservation_object")
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


[PATCH] dma-buf: Use fence_get_rcu_safe() for retrieving the exclusive fence

2016-11-14 Thread Chris Wilson
The current code is subject to a race where we may try to acquire a
reference on a stale fence:

[13703.335118] WARNING: CPU: 1 PID: 14975 at ./include/linux/kref.h:46 
i915_gem_object_wait+0x1a3/0x1c0
[13703.335184] Modules linked in:
[13703.335202] CPU: 1 PID: 14975 Comm: gem_concurrent_ Not tainted 4.9.0-rc4+ 
#26
[13703.335216] Hardware name:  /, BIOS 
PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015
[13703.335233]  c90002f5bcc8 812807de  

[13703.335257]  c90002f5bd08 81073811 002e8000 
88026bf7c780
[13703.335279]  7fff 0001 88027045a550 
88026bf7c780
[13703.335301] Call Trace:
[13703.335316]  [] dump_stack+0x4d/0x6f
[13703.335331]  [] __warn+0xc1/0xe0
[13703.335343]  [] warn_slowpath_null+0x18/0x20
[13703.335355]  [] i915_gem_object_wait+0x1a3/0x1c0
[13703.335367]  [] i915_gem_set_domain_ioctl+0xcc/0x330
[13703.335386]  [] drm_ioctl+0x1cb/0x410
[13703.335400]  [] ? 
i915_gem_obj_prepare_shmem_write+0x1d0/0x1d0
[13703.335416]  [] ? drm_ioctl+0x2bb/0x410
[13703.335429]  [] do_vfs_ioctl+0x8f/0x5c0
[13703.335442]  [] SyS_ioctl+0x3c/0x70
[13703.335456]  [] entry_SYSCALL_64_fastpath+0x17/0x98
[13703.335558] ---[ end trace fd24176416ba6981 ]---
[13703.382778] general protection fault:  [#1] SMP
[13703.382802] Modules linked in:
[13703.382816] CPU: 1 PID: 14967 Comm: gem_concurrent_ Tainted: GW  
 4.9.0-rc4+ #26
[13703.382828] Hardware name:  /, BIOS 
PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015
[13703.382841] task: 880275458000 task.stack: c90002f18000
[13703.382849] RIP: 0010:[]  [] 
i915_gem_request_retire+0x2b4/0x320
[13703.382870] RSP: 0018:c90002f1bbc8  EFLAGS: 00010293
[13703.382878] RAX: dead0200 RBX: 88026bf7dce8 RCX: dead0100
[13703.382887] RDX: dead0100 RSI: 88026bf7c930 RDI: 88026bf7dd00
[13703.382897] RBP: c90002f1bbf8 R08:  R09: 88026b89a000
[13703.382905] R10: 0001 R11: 88026bbe8fe0 R12: 88026bf7c000
[13703.382913] R13: 880275af8000 R14: 88026bf7c180 R15: dead0200
[13703.382922] FS:  7f89e787d740() GS:88027fd0() 
knlGS:
[13703.382934] CS:  0010 DS:  ES:  CR0: 80050033
[13703.382942] CR2: 7f9053d2e000 CR3: 00026d414000 CR4: 001006e0
[13703.382951] Stack:
[13703.382958]  880275413000 c90002f1bde8 880275af8000 
880274e8a600
[13703.382976]  880276a06000 c90002f1bde8 c90002f1bc38 
813b48c5
[13703.382995]  c90002f1bc00 c90002f1bde8 88026972a440 

[13703.383021] Call Trace:
[13703.383032]  [] i915_gem_request_alloc+0xa5/0x350
[13703.383043]  [] i915_gem_do_execbuffer.isra.41+0x7b3/0x18b0
[13703.383055]  [] ? i915_gem_object_get_sg+0x25c/0x2b0
[13703.383065]  [] ? i915_gem_object_get_page+0x1d/0x50
[13703.383076]  [] ? i915_gem_pwrite_ioctl+0x66c/0x6d0
[13703.383086]  [] i915_gem_execbuffer2+0x95/0x1e0
[13703.383096]  [] drm_ioctl+0x1cb/0x410
[13703.383105]  [] ? i915_gem_execbuffer+0x2d0/0x2d0
[13703.383117]  [] ? hrtimer_start_range_ns+0x1a0/0x310
[13703.383128]  [] do_vfs_ioctl+0x8f/0x5c0
[13703.383140]  [] ? SyS_timer_settime+0x118/0x1a0
[13703.383150]  [] SyS_ioctl+0x3c/0x70
[13703.383162]  [] entry_SYSCALL_64_fastpath+0x17/0x98
[13703.383172] Code: 49 39 c6 48 8d 70 e8 48 8d 5f e8 75 16 eb 47 48 8d 43 18 
48 8b 53 18 48 89 de 49 39 c6 48 8d 5a e8 74 33 48 8b 56 08 48 8b 46 10 <48> 89 
42 08 48 89 10 f6 46 38 01 48 89 4e 08 4c 89 7e 10 74 cf
[13703.383557] RIP  [] i915_gem_request_retire+0x2b4/0x320
[13703.383570]  RSP 
[13703.383586] ---[ end trace fd24176416ba6982 ]---

This is fixed by using the kref_get_unless_zero() as a full memory
barrier to validate the fence is still the current exclusive fence before
returning it back to the caller. (Note the fix only requires using
dma_fence_get_rcu() and correct handling, but we may as well use the
helper rather than inline equivalent code.)

Signed-off-by: Chris Wilson 
Cc: Sumit Semwal seq);
+
+   if (!rcu_access_pointer(obj->fence_excl))
+   return NULL;
+
rcu_read_lock();
-   fence = rcu_dereference(obj->fence_excl);
-   if (read_seqcount_retry(>seq, seq)) {
-   rcu_read_unlock();
-   goto retry;
-   }
-   fence = dma_fence_get(fence);
+   fence = dma_fence_get_rcu_safe(>fence_excl);
rcu_read_unlock();
+
return fence;
 }

-- 
2.10.2