[PATCH] dma-buf: Use fence_get_rcu_safe() for retrieving the exclusive fence
On Mon, Nov 14, 2016 at 04:37:09PM +0100, Christian König wrote: > Am 14.11.2016 um 12:55 schrieb Chris Wilson: > > The current code is subject to a race where we may try to acquire a > > reference on a stale fence: > > > > [13703.335118] WARNING: CPU: 1 PID: 14975 at ./include/linux/kref.h:46 > > i915_gem_object_wait+0x1a3/0x1c0 > > [13703.335184] Modules linked in: > > [13703.335202] CPU: 1 PID: 14975 Comm: gem_concurrent_ Not tainted > > 4.9.0-rc4+ #26 > > [13703.335216] Hardware name: /, BIOS > > PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015 > > [13703.335233] c90002f5bcc8 812807de > > > > [13703.335257] c90002f5bd08 81073811 002e8000 > > 88026bf7c780 > > [13703.335279] 7fff 0001 88027045a550 > > 88026bf7c780 > > [13703.335301] Call Trace: > > [13703.335316] [] dump_stack+0x4d/0x6f > > [13703.335331] [] __warn+0xc1/0xe0 > > [13703.335343] [] warn_slowpath_null+0x18/0x20 > > [13703.335355] [] i915_gem_object_wait+0x1a3/0x1c0 > > [13703.335367] [] i915_gem_set_domain_ioctl+0xcc/0x330 > > [13703.335386] [] drm_ioctl+0x1cb/0x410 > > [13703.335400] [] ? > > i915_gem_obj_prepare_shmem_write+0x1d0/0x1d0 > > [13703.335416] [] ? drm_ioctl+0x2bb/0x410 > > [13703.335429] [] do_vfs_ioctl+0x8f/0x5c0 > > [13703.335442] [] SyS_ioctl+0x3c/0x70 > > [13703.335456] [] entry_SYSCALL_64_fastpath+0x17/0x98 > > [13703.335558] ---[ end trace fd24176416ba6981 ]--- > > [13703.382778] general protection fault: [#1] SMP > > [13703.382802] Modules linked in: > > [13703.382816] CPU: 1 PID: 14967 Comm: gem_concurrent_ Tainted: GW > > 4.9.0-rc4+ #26 > > [13703.382828] Hardware name: /, BIOS > > PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015 > > [13703.382841] task: 880275458000 task.stack: c90002f18000 > > [13703.382849] RIP: 0010:[] [] > > i915_gem_request_retire+0x2b4/0x320 > > [13703.382870] RSP: 0018:c90002f1bbc8 EFLAGS: 00010293 > > [13703.382878] RAX: dead0200 RBX: 88026bf7dce8 RCX: > > dead0100 > > [13703.382887] RDX: dead0100 RSI: 88026bf7c930 RDI: > > 88026bf7dd00 > > [13703.382897] RBP: c90002f1bbf8 R08: R09: > > 88026b89a000 > > [13703.382905] R10: 0001 R11: 88026bbe8fe0 R12: > > 88026bf7c000 > > [13703.382913] R13: 880275af8000 R14: 88026bf7c180 R15: > > dead0200 > > [13703.382922] FS: 7f89e787d740() GS:88027fd0() > > knlGS: > > [13703.382934] CS: 0010 DS: ES: CR0: 80050033 > > [13703.382942] CR2: 7f9053d2e000 CR3: 00026d414000 CR4: > > 001006e0 > > [13703.382951] Stack: > > [13703.382958] 880275413000 c90002f1bde8 880275af8000 > > 880274e8a600 > > [13703.382976] 880276a06000 c90002f1bde8 c90002f1bc38 > > 813b48c5 > > [13703.382995] c90002f1bc00 c90002f1bde8 88026972a440 > > > > [13703.383021] Call Trace: > > [13703.383032] [] i915_gem_request_alloc+0xa5/0x350 > > [13703.383043] [] > > i915_gem_do_execbuffer.isra.41+0x7b3/0x18b0 > > [13703.383055] [] ? i915_gem_object_get_sg+0x25c/0x2b0 > > [13703.383065] [] ? i915_gem_object_get_page+0x1d/0x50 > > [13703.383076] [] ? i915_gem_pwrite_ioctl+0x66c/0x6d0 > > [13703.383086] [] i915_gem_execbuffer2+0x95/0x1e0 > > [13703.383096] [] drm_ioctl+0x1cb/0x410 > > [13703.383105] [] ? i915_gem_execbuffer+0x2d0/0x2d0 > > [13703.383117] [] ? hrtimer_start_range_ns+0x1a0/0x310 > > [13703.383128] [] do_vfs_ioctl+0x8f/0x5c0 > > [13703.383140] [] ? SyS_timer_settime+0x118/0x1a0 > > [13703.383150] [] SyS_ioctl+0x3c/0x70 > > [13703.383162] [] entry_SYSCALL_64_fastpath+0x17/0x98 > > [13703.383172] Code: 49 39 c6 48 8d 70 e8 48 8d 5f e8 75 16 eb 47 48 8d 43 > > 18 48 8b 53 18 48 89 de 49 39 c6 48 8d 5a e8 74 33 48 8b 56 08 48 8b 46 10 > > <48> 89 42 08 48 89 10 f6 46 38 01 48 89 4e 08 4c 89 7e 10 74 cf > > [13703.383557] RIP [] i915_gem_request_retire+0x2b4/0x320 > > [13703.383570] RSP > > [13703.383586] ---[ end trace fd24176416ba6982 ]--- > > > > This is fixed by using the kref_get_unless_zero() as a full memory > > barrier to validate the fence is still the current exclusive fence before > > returning it back to the caller. (Note the fix only requires using > > dma_fence_get_rcu() and correct handling, but we may as well use the > > helper rather than inline equivalent code.) > > > > Signed-off-by: Chris Wilson > > Cc: Sumit Semwal > Reviewed-by: Christian König . Applied to drm-misc, thanks. -Daniel > > > --- > > include/linux/reservation.h | 15 ++- > > 1 file changed, 6 insertions(+), 9 deletions(-) > > > > diff --git a/include/linux/reservation.h b/include/linux/reservation.h > > index 2e313cca08f0..d9706a6f5ae2 100644 > > --- a/include/linux/reservation.h > > +++ b/include/linux/reservation.h > > @@
[PATCH] dma-buf: Use fence_get_rcu_safe() for retrieving the exclusive fence
Am 14.11.2016 um 12:55 schrieb Chris Wilson: > The current code is subject to a race where we may try to acquire a > reference on a stale fence: > > [13703.335118] WARNING: CPU: 1 PID: 14975 at ./include/linux/kref.h:46 > i915_gem_object_wait+0x1a3/0x1c0 > [13703.335184] Modules linked in: > [13703.335202] CPU: 1 PID: 14975 Comm: gem_concurrent_ Not tainted 4.9.0-rc4+ > #26 > [13703.335216] Hardware name: /, BIOS > PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015 > [13703.335233] c90002f5bcc8 812807de > > [13703.335257] c90002f5bd08 81073811 002e8000 > 88026bf7c780 > [13703.335279] 7fff 0001 88027045a550 > 88026bf7c780 > [13703.335301] Call Trace: > [13703.335316] [] dump_stack+0x4d/0x6f > [13703.335331] [] __warn+0xc1/0xe0 > [13703.335343] [] warn_slowpath_null+0x18/0x20 > [13703.335355] [] i915_gem_object_wait+0x1a3/0x1c0 > [13703.335367] [] i915_gem_set_domain_ioctl+0xcc/0x330 > [13703.335386] [] drm_ioctl+0x1cb/0x410 > [13703.335400] [] ? > i915_gem_obj_prepare_shmem_write+0x1d0/0x1d0 > [13703.335416] [] ? drm_ioctl+0x2bb/0x410 > [13703.335429] [] do_vfs_ioctl+0x8f/0x5c0 > [13703.335442] [] SyS_ioctl+0x3c/0x70 > [13703.335456] [] entry_SYSCALL_64_fastpath+0x17/0x98 > [13703.335558] ---[ end trace fd24176416ba6981 ]--- > [13703.382778] general protection fault: [#1] SMP > [13703.382802] Modules linked in: > [13703.382816] CPU: 1 PID: 14967 Comm: gem_concurrent_ Tainted: GW >4.9.0-rc4+ #26 > [13703.382828] Hardware name: /, BIOS > PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015 > [13703.382841] task: 880275458000 task.stack: c90002f18000 > [13703.382849] RIP: 0010:[] [] > i915_gem_request_retire+0x2b4/0x320 > [13703.382870] RSP: 0018:c90002f1bbc8 EFLAGS: 00010293 > [13703.382878] RAX: dead0200 RBX: 88026bf7dce8 RCX: > dead0100 > [13703.382887] RDX: dead0100 RSI: 88026bf7c930 RDI: > 88026bf7dd00 > [13703.382897] RBP: c90002f1bbf8 R08: R09: > 88026b89a000 > [13703.382905] R10: 0001 R11: 88026bbe8fe0 R12: > 88026bf7c000 > [13703.382913] R13: 880275af8000 R14: 88026bf7c180 R15: > dead0200 > [13703.382922] FS: 7f89e787d740() GS:88027fd0() > knlGS: > [13703.382934] CS: 0010 DS: ES: CR0: 80050033 > [13703.382942] CR2: 7f9053d2e000 CR3: 00026d414000 CR4: > 001006e0 > [13703.382951] Stack: > [13703.382958] 880275413000 c90002f1bde8 880275af8000 > 880274e8a600 > [13703.382976] 880276a06000 c90002f1bde8 c90002f1bc38 > 813b48c5 > [13703.382995] c90002f1bc00 c90002f1bde8 88026972a440 > > [13703.383021] Call Trace: > [13703.383032] [] i915_gem_request_alloc+0xa5/0x350 > [13703.383043] [] > i915_gem_do_execbuffer.isra.41+0x7b3/0x18b0 > [13703.383055] [] ? i915_gem_object_get_sg+0x25c/0x2b0 > [13703.383065] [] ? i915_gem_object_get_page+0x1d/0x50 > [13703.383076] [] ? i915_gem_pwrite_ioctl+0x66c/0x6d0 > [13703.383086] [] i915_gem_execbuffer2+0x95/0x1e0 > [13703.383096] [] drm_ioctl+0x1cb/0x410 > [13703.383105] [] ? i915_gem_execbuffer+0x2d0/0x2d0 > [13703.383117] [] ? hrtimer_start_range_ns+0x1a0/0x310 > [13703.383128] [] do_vfs_ioctl+0x8f/0x5c0 > [13703.383140] [] ? SyS_timer_settime+0x118/0x1a0 > [13703.383150] [] SyS_ioctl+0x3c/0x70 > [13703.383162] [] entry_SYSCALL_64_fastpath+0x17/0x98 > [13703.383172] Code: 49 39 c6 48 8d 70 e8 48 8d 5f e8 75 16 eb 47 48 8d 43 18 > 48 8b 53 18 48 89 de 49 39 c6 48 8d 5a e8 74 33 48 8b 56 08 48 8b 46 10 <48> > 89 42 08 48 89 10 f6 46 38 01 48 89 4e 08 4c 89 7e 10 74 cf > [13703.383557] RIP [] i915_gem_request_retire+0x2b4/0x320 > [13703.383570] RSP > [13703.383586] ---[ end trace fd24176416ba6982 ]--- > > This is fixed by using the kref_get_unless_zero() as a full memory > barrier to validate the fence is still the current exclusive fence before > returning it back to the caller. (Note the fix only requires using > dma_fence_get_rcu() and correct handling, but we may as well use the > helper rather than inline equivalent code.) > > Signed-off-by: Chris Wilson > Cc: Sumit Semwal . > --- > include/linux/reservation.h | 15 ++- > 1 file changed, 6 insertions(+), 9 deletions(-) > > diff --git a/include/linux/reservation.h b/include/linux/reservation.h > index 2e313cca08f0..d9706a6f5ae2 100644 > --- a/include/linux/reservation.h > +++ b/include/linux/reservation.h > @@ -177,17 +177,14 @@ static inline struct dma_fence * > reservation_object_get_excl_rcu(struct reservation_object *obj) > { > struct dma_fence *fence; > - unsigned seq; > -retry: > - seq = read_seqcount_begin(>seq); > + > + if (!rcu_access_pointer(obj->fence_excl)) > + return NULL; > + > rcu_read_lock(); > -
[PATCH] dma-buf: Use fence_get_rcu_safe() for retrieving the exclusive fence
On Mon, Nov 14, 2016 at 11:55:40AM +, Chris Wilson wrote: > The current code is subject to a race where we may try to acquire a > reference on a stale fence: >From i915.ko pov, this Fixes: d07f0e59b2c7 ("drm/i915: Move GEM activity tracking into a common struct reservation_object") -Chris -- Chris Wilson, Intel Open Source Technology Centre
[PATCH] dma-buf: Use fence_get_rcu_safe() for retrieving the exclusive fence
The current code is subject to a race where we may try to acquire a reference on a stale fence: [13703.335118] WARNING: CPU: 1 PID: 14975 at ./include/linux/kref.h:46 i915_gem_object_wait+0x1a3/0x1c0 [13703.335184] Modules linked in: [13703.335202] CPU: 1 PID: 14975 Comm: gem_concurrent_ Not tainted 4.9.0-rc4+ #26 [13703.335216] Hardware name: /, BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015 [13703.335233] c90002f5bcc8 812807de [13703.335257] c90002f5bd08 81073811 002e8000 88026bf7c780 [13703.335279] 7fff 0001 88027045a550 88026bf7c780 [13703.335301] Call Trace: [13703.335316] [] dump_stack+0x4d/0x6f [13703.335331] [] __warn+0xc1/0xe0 [13703.335343] [] warn_slowpath_null+0x18/0x20 [13703.335355] [] i915_gem_object_wait+0x1a3/0x1c0 [13703.335367] [] i915_gem_set_domain_ioctl+0xcc/0x330 [13703.335386] [] drm_ioctl+0x1cb/0x410 [13703.335400] [] ? i915_gem_obj_prepare_shmem_write+0x1d0/0x1d0 [13703.335416] [] ? drm_ioctl+0x2bb/0x410 [13703.335429] [] do_vfs_ioctl+0x8f/0x5c0 [13703.335442] [] SyS_ioctl+0x3c/0x70 [13703.335456] [] entry_SYSCALL_64_fastpath+0x17/0x98 [13703.335558] ---[ end trace fd24176416ba6981 ]--- [13703.382778] general protection fault: [#1] SMP [13703.382802] Modules linked in: [13703.382816] CPU: 1 PID: 14967 Comm: gem_concurrent_ Tainted: GW 4.9.0-rc4+ #26 [13703.382828] Hardware name: /, BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015 [13703.382841] task: 880275458000 task.stack: c90002f18000 [13703.382849] RIP: 0010:[] [] i915_gem_request_retire+0x2b4/0x320 [13703.382870] RSP: 0018:c90002f1bbc8 EFLAGS: 00010293 [13703.382878] RAX: dead0200 RBX: 88026bf7dce8 RCX: dead0100 [13703.382887] RDX: dead0100 RSI: 88026bf7c930 RDI: 88026bf7dd00 [13703.382897] RBP: c90002f1bbf8 R08: R09: 88026b89a000 [13703.382905] R10: 0001 R11: 88026bbe8fe0 R12: 88026bf7c000 [13703.382913] R13: 880275af8000 R14: 88026bf7c180 R15: dead0200 [13703.382922] FS: 7f89e787d740() GS:88027fd0() knlGS: [13703.382934] CS: 0010 DS: ES: CR0: 80050033 [13703.382942] CR2: 7f9053d2e000 CR3: 00026d414000 CR4: 001006e0 [13703.382951] Stack: [13703.382958] 880275413000 c90002f1bde8 880275af8000 880274e8a600 [13703.382976] 880276a06000 c90002f1bde8 c90002f1bc38 813b48c5 [13703.382995] c90002f1bc00 c90002f1bde8 88026972a440 [13703.383021] Call Trace: [13703.383032] [] i915_gem_request_alloc+0xa5/0x350 [13703.383043] [] i915_gem_do_execbuffer.isra.41+0x7b3/0x18b0 [13703.383055] [] ? i915_gem_object_get_sg+0x25c/0x2b0 [13703.383065] [] ? i915_gem_object_get_page+0x1d/0x50 [13703.383076] [] ? i915_gem_pwrite_ioctl+0x66c/0x6d0 [13703.383086] [] i915_gem_execbuffer2+0x95/0x1e0 [13703.383096] [] drm_ioctl+0x1cb/0x410 [13703.383105] [] ? i915_gem_execbuffer+0x2d0/0x2d0 [13703.383117] [] ? hrtimer_start_range_ns+0x1a0/0x310 [13703.383128] [] do_vfs_ioctl+0x8f/0x5c0 [13703.383140] [] ? SyS_timer_settime+0x118/0x1a0 [13703.383150] [] SyS_ioctl+0x3c/0x70 [13703.383162] [] entry_SYSCALL_64_fastpath+0x17/0x98 [13703.383172] Code: 49 39 c6 48 8d 70 e8 48 8d 5f e8 75 16 eb 47 48 8d 43 18 48 8b 53 18 48 89 de 49 39 c6 48 8d 5a e8 74 33 48 8b 56 08 48 8b 46 10 <48> 89 42 08 48 89 10 f6 46 38 01 48 89 4e 08 4c 89 7e 10 74 cf [13703.383557] RIP [] i915_gem_request_retire+0x2b4/0x320 [13703.383570] RSP [13703.383586] ---[ end trace fd24176416ba6982 ]--- This is fixed by using the kref_get_unless_zero() as a full memory barrier to validate the fence is still the current exclusive fence before returning it back to the caller. (Note the fix only requires using dma_fence_get_rcu() and correct handling, but we may as well use the helper rather than inline equivalent code.) Signed-off-by: Chris Wilson Cc: Sumit Semwal seq); + + if (!rcu_access_pointer(obj->fence_excl)) + return NULL; + rcu_read_lock(); - fence = rcu_dereference(obj->fence_excl); - if (read_seqcount_retry(>seq, seq)) { - rcu_read_unlock(); - goto retry; - } - fence = dma_fence_get(fence); + fence = dma_fence_get_rcu_safe(>fence_excl); rcu_read_unlock(); + return fence; } -- 2.10.2