Re: [Intel-gfx] [PATCH 1/2] drm/i915: Move priority bumping for flips earlier

2016-11-29 Thread Tvrtko Ursulin


On 28/11/2016 14:36, Chris Wilson wrote:

David found another issue with priority bumping from mmioflips, where we
are accessing the requests concurrently to them being retired and freed.
Whilst we are skipping the dependency if it has been submitted, that is
not sufficient to stop the dependency from disappearing if another
thread retires that request. To prevent that, we can employ the
struct_mutex (or a request mutex in the future) to serialise retirement
against our access, so that the request is not freed underneath us.
Alternatively, we need to keep the dependencies alive using RCU whilst
they are being accessed via the DFS.

Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.698111] general protection fault: 
 [#1] PREEMPT SMP
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.698305] Modules linked in: 
snd_hda_intel snd_hda_codec snd_hwdep x86_pkg_temp_thermal snd_hda_core 
coretemp crct10dif_pclmul crc32_pclmul snd_pcm ghash_clmulni_intel mei_me mei 
i915 e1000e ptp pps_core i2c_hid
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.698750] CPU: 1 PID: 6716 Comm: 
kworker/u8:2 Not tainted 4.9.0-rc6-CI-Nightly_816+ #1
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.698871] Hardware name: GIGABYTE 
GB-BKi7A-7500/MFLP7AP-00, BIOS F1 07/27/2016
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.699125] Workqueue: events_unbound 
intel_mmio_flip_work_func [i915]
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.699266] task: 880260a5e800 
task.stack: c9f6c000
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.699361] RIP: 0010:[]  
[] execlists_schedule+0x8d/0x300 [i915]
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.699632] RSP: 0018:c9f6fcd8  
EFLAGS: 00010206
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.699724] RAX: dead00f8 RBX: 
8801f64b2bf0 RCX: 8801f64b2c10
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.699842] RDX: dead0100 RSI: 
 RDI: 8801f64b0458
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.699972] RBP: c9f6fd68 R08: 
88026488dc00 R09: 0002
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.700090] R10:  R11: 
 R12: 0400
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.700195] R13: c9f6fcf0 R14: 
88020955aa40 R15: 88020955aa68
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.700307] FS:  () 
GS:88026dc8() knlGS:
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.700435] CS:  0010 DS:  ES:  
CR0: 80050033
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.700532] CR2: 02a69e90 CR3: 
02c07000 CR4: 003406e0
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.700635] Stack:
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.700682]  880260a5e880 
c9f6fd50 810af69a c9f6fd28
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.700827]  88020955a628 
8801e1eaebf0 0020 
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.700947]  0196af1edc96 
88025dfa4000 8801f0b030a8 c9f6fcf0
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.701071] Call Trace:
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.701117]  [] ? 
dequeue_entity+0x25a/0xb50
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.701260]  [] 
fence_set_priority+0x7e/0x80 [i915]
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.701406]  [] 
i915_gem_object_wait_priority+0x85/0x160 [i915]
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.701599]  [] 
intel_mmio_flip_work_func+0x47/0x2b0 [i915]
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.701717]  [] 
process_one_work+0x14d/0x470
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.701809]  [] 
worker_thread+0x43/0x4e0
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.701888]  [] ? 
process_one_work+0x470/0x470
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.701969]  [] ? 
process_one_work+0x470/0x470
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.702072]  [] 
kthread+0xc5/0xe0
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.702152]  [] ? 
_raw_spin_unlock_irq+0x9/0x10
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.702234]  [] ? 
kthread_park+0x60/0x60
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.702318]  [] 
ret_from_fork+0x22/0x30
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.702387] Code: 89 42 08 48 8b 45 88 48 89 55 
c0 4c 89 6d c8 4c 8d 70 d8 4d 8d 7e 28 4d 39 ef 74 72 49 8b 1e 48 8b 13 48 39 d3 48 
8d 42 f8 74 3e <48> 8b 10 8b 52 38 41 39 d4 7e 26 48 8b 50 30 48 8b 78 28 48 8d
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.702921] RIP  [] 
execlists_schedule+0x8d/0x300 [i915]
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.703027]  RSP 
Nov 25 21:44:11 kbl-gbbki7 kernel: [ 1746.711015] ---[ end trace 
4ecf3ae63087e670 ]---
Nov 25 21:44:11 kbl-gbbki7 kernel: [ 1746.711023] BUG: unable to handle kernel 
NULL pointer dereference at 000b
Nov 25 21:44:11 kbl-gbbki7 kernel: [ 1746.711070] IP: [] 
__wake_up_common+0x26/0x80
Nov 25 21:44:11 kbl-gbbki7 kernel: [ 1746.711104] PGD 25df92067
Nov 25 21:44:11 kbl-gbbki7 kernel: [ 1746.79] PUD 25b1f0067
Nov 25 21:44:11 kbl-gbbki7 kernel: [ 1746.711134] PMD 0
Nov 25 21:44:11 kbl-gbbki7 kernel: [ 1746.711140]
Nov 25 

[Intel-gfx] [PATCH 1/2] drm/i915: Move priority bumping for flips earlier

2016-11-28 Thread Chris Wilson
David found another issue with priority bumping from mmioflips, where we
are accessing the requests concurrently to them being retired and freed.
Whilst we are skipping the dependency if it has been submitted, that is
not sufficient to stop the dependency from disappearing if another
thread retires that request. To prevent that, we can employ the
struct_mutex (or a request mutex in the future) to serialise retirement
against our access, so that the request is not freed underneath us.
Alternatively, we need to keep the dependencies alive using RCU whilst
they are being accessed via the DFS.
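
As a rough illustration of the first option (serialising the walk against
retirement with one lock), here is a minimal userspace sketch. It is not
the i915 code: struct request, big_lock, request_create(),
retire_dependency() and bump_priority() are all hypothetical stand-ins,
with a pthread mutex playing the role of struct_mutex and a singly linked
chain standing in for the dependency graph.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a request; not the i915 structure. */
struct request {
	int priority;
	struct request *dep;	/* single dependency, for brevity */
};

/* Plays the role of struct_mutex in this sketch. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

static struct request *request_create(int prio, struct request *dep)
{
	struct request *rq = calloc(1, sizeof(*rq));

	if (!rq)
		return NULL;
	rq->priority = prio;
	rq->dep = dep;
	return rq;
}

/* Unlink and free the first dependency of @waiter, under the lock. */
static void retire_dependency(struct request *waiter)
{
	struct request *dep;

	assert(waiter);
	pthread_mutex_lock(&big_lock);
	dep = waiter->dep;
	if (dep) {
		waiter->dep = dep->dep;	/* splice out before freeing */
		free(dep);
	}
	pthread_mutex_unlock(&big_lock);
}

/*
 * Raise the priority of @rq and everything it depends on. The lock is
 * held for the whole walk, so no node can be freed while we look at it.
 * Returns the number of requests whose priority was raised.
 */
static int bump_priority(struct request *rq, int prio)
{
	int bumped = 0;

	pthread_mutex_lock(&big_lock);
	for (struct request *it = rq; it; it = it->dep) {
		if (it->priority < prio) {
			it->priority = prio;
			bumped++;
		}
	}
	pthread_mutex_unlock(&big_lock);
	return bumped;
}
```

The point of the sketch is only that the walker and the retirer take the
same lock, so a dependency is either still linked and valid or already
unlinked, never freed mid-walk. The RCU alternative would instead defer
the free until all readers had left their read-side critical sections.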

Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.698111] general protection fault: 
 [#1] PREEMPT SMP
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.698305] Modules linked in: 
snd_hda_intel snd_hda_codec snd_hwdep x86_pkg_temp_thermal snd_hda_core 
coretemp crct10dif_pclmul crc32_pclmul snd_pcm ghash_clmulni_intel mei_me mei 
i915 e1000e ptp pps_core i2c_hid
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.698750] CPU: 1 PID: 6716 Comm: 
kworker/u8:2 Not tainted 4.9.0-rc6-CI-Nightly_816+ #1
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.698871] Hardware name: GIGABYTE 
GB-BKi7A-7500/MFLP7AP-00, BIOS F1 07/27/2016
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.699125] Workqueue: events_unbound 
intel_mmio_flip_work_func [i915]
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.699266] task: 880260a5e800 
task.stack: c9f6c000
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.699361] RIP: 
0010:[]  [] execlists_schedule+0x8d/0x300 
[i915]
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.699632] RSP: 0018:c9f6fcd8  
EFLAGS: 00010206
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.699724] RAX: dead00f8 RBX: 
8801f64b2bf0 RCX: 8801f64b2c10
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.699842] RDX: dead0100 RSI: 
 RDI: 8801f64b0458
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.699972] RBP: c9f6fd68 R08: 
88026488dc00 R09: 0002
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.700090] R10:  R11: 
 R12: 0400
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.700195] R13: c9f6fcf0 R14: 
88020955aa40 R15: 88020955aa68
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.700307] FS:  () 
GS:88026dc8() knlGS:
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.700435] CS:  0010 DS:  ES:  
CR0: 80050033
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.700532] CR2: 02a69e90 CR3: 
02c07000 CR4: 003406e0
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.700635] Stack:
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.700682]  880260a5e880 
c9f6fd50 810af69a c9f6fd28
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.700827]  88020955a628 
8801e1eaebf0 0020 
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.700947]  0196af1edc96 
88025dfa4000 8801f0b030a8 c9f6fcf0
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.701071] Call Trace:
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.701117]  [] ? 
dequeue_entity+0x25a/0xb50
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.701260]  [] 
fence_set_priority+0x7e/0x80 [i915]
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.701406]  [] 
i915_gem_object_wait_priority+0x85/0x160 [i915]
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.701599]  [] 
intel_mmio_flip_work_func+0x47/0x2b0 [i915]
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.701717]  [] 
process_one_work+0x14d/0x470
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.701809]  [] 
worker_thread+0x43/0x4e0
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.701888]  [] ? 
process_one_work+0x470/0x470
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.701969]  [] ? 
process_one_work+0x470/0x470
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.702072]  [] 
kthread+0xc5/0xe0
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.702152]  [] ? 
_raw_spin_unlock_irq+0x9/0x10
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.702234]  [] ? 
kthread_park+0x60/0x60
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.702318]  [] 
ret_from_fork+0x22/0x30
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.702387] Code: 89 42 08 48 8b 45 88 48 
89 55 c0 4c 89 6d c8 4c 8d 70 d8 4d 8d 7e 28 4d 39 ef 74 72 49 8b 1e 48 8b 13 
48 39 d3 48 8d 42 f8 74 3e <48> 8b 10 8b 52 38 41 39 d4 7e 26 48 8b 50 30 48 8b 
78 28 48 8d
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.702921] RIP  [] 
execlists_schedule+0x8d/0x300 [i915]
Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.703027]  RSP 
Nov 25 21:44:11 kbl-gbbki7 kernel: [ 1746.711015] ---[ end trace 
4ecf3ae63087e670 ]---
Nov 25 21:44:11 kbl-gbbki7 kernel: [ 1746.711023] BUG: unable to handle kernel 
NULL pointer dereference at 000b
Nov 25 21:44:11 kbl-gbbki7 kernel: [ 1746.711070] IP: [] 
__wake_up_common+0x26/0x80
Nov 25 21:44:11 kbl-gbbki7 kernel: [ 1746.711104] PGD 25df92067
Nov 25 21:44:11 kbl-gbbki7 kernel: [ 1746.79] PUD 25b1f0067
Nov 25 21:44:11 kbl-gbbki7 kernel: [ 1746.711134] PMD 0
Nov 25 21:44:11 kbl-gbbki7 kernel: [ 1746.711140]
Nov 25 21:44:11 kbl-gbbki7 kernel: [ 1746.711151] Oops: