Re: [Intel-gfx] [PATCH 1/2] drm/i915: Move priority bumping for flips earlier
On 28/11/2016 14:36, Chris Wilson wrote:
> David found another issue with priority bumping from mmioflips, where
> we are accessing the requests concurrently with them being retired and
> freed. Whilst we skip a dependency if it has already been submitted,
> that is not sufficient to stop the dependency from disappearing if
> another thread retires that request. To prevent this, we can either
> employ the struct_mutex (or a request mutex in the future) to serialise
> retiring, so that a request cannot be freed while it is being accessed,
> or keep the dependencies alive using RCU whilst they are being accessed
> via the DFS.
>
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.698111] general protection fault: [#1] PREEMPT SMP
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.698305] Modules linked in: snd_hda_intel snd_hda_codec snd_hwdep x86_pkg_temp_thermal snd_hda_core coretemp crct10dif_pclmul crc32_pclmul snd_pcm ghash_clmulni_intel mei_me mei i915 e1000e ptp pps_core i2c_hid
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.698750] CPU: 1 PID: 6716 Comm: kworker/u8:2 Not tainted 4.9.0-rc6-CI-Nightly_816+ #1
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.698871] Hardware name: GIGABYTE GB-BKi7A-7500/MFLP7AP-00, BIOS F1 07/27/2016
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.699125] Workqueue: events_unbound intel_mmio_flip_work_func [i915]
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.699266] task: 880260a5e800 task.stack: c9f6c000
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.699361] RIP: 0010:[] [] execlists_schedule+0x8d/0x300 [i915]
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.699632] RSP: 0018:c9f6fcd8 EFLAGS: 00010206
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.699724] RAX: dead00f8 RBX: 8801f64b2bf0 RCX: 8801f64b2c10
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.699842] RDX: dead0100 RSI: RDI: 8801f64b0458
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.699972] RBP: c9f6fd68 R08: 88026488dc00 R09: 0002
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.700090] R10: R11: R12: 0400
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.700195] R13: c9f6fcf0 R14: 88020955aa40 R15: 88020955aa68
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.700307] FS: () GS:88026dc8() knlGS:
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.700435] CS: 0010 DS: ES: CR0: 80050033
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.700532] CR2: 02a69e90 CR3: 02c07000 CR4: 003406e0
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.700635] Stack:
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.700682] 880260a5e880 c9f6fd50 810af69a c9f6fd28
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.700827] 88020955a628 8801e1eaebf0 0020
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.700947] 0196af1edc96 88025dfa4000 8801f0b030a8 c9f6fcf0
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.701071] Call Trace:
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.701117] [] ? dequeue_entity+0x25a/0xb50
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.701260] [] fence_set_priority+0x7e/0x80 [i915]
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.701406] [] i915_gem_object_wait_priority+0x85/0x160 [i915]
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.701599] [] intel_mmio_flip_work_func+0x47/0x2b0 [i915]
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.701717] [] process_one_work+0x14d/0x470
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.701809] [] worker_thread+0x43/0x4e0
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.701888] [] ? process_one_work+0x470/0x470
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.701969] [] ? process_one_work+0x470/0x470
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.702072] [] kthread+0xc5/0xe0
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.702152] [] ? _raw_spin_unlock_irq+0x9/0x10
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.702234] [] ? kthread_park+0x60/0x60
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.702318] [] ret_from_fork+0x22/0x30
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.702387] Code: 89 42 08 48 8b 45 88 48 89 55 c0 4c 89 6d c8 4c 8d 70 d8 4d 8d 7e 28 4d 39 ef 74 72 49 8b 1e 48 8b 13 48 39 d3 48 8d 42 f8 74 3e <48> 8b 10 8b 52 38 41 39 d4 7e 26 48 8b 50 30 48 8b 78 28 48 8d
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.702921] RIP [] execlists_schedule+0x8d/0x300 [i915]
> Nov 25 21:42:54 kbl-gbbki7 kernel: [ 1746.703027] RSP
> Nov 25 21:44:11 kbl-gbbki7 kernel: [ 1746.711015] ---[ end trace 4ecf3ae63087e670 ]---
> Nov 25 21:44:11 kbl-gbbki7 kernel: [ 1746.711023] BUG: unable to handle kernel NULL pointer dereference at 000b
> Nov 25 21:44:11 kbl-gbbki7 kernel: [ 1746.711070] IP: [] __wake_up_common+0x26/0x80
> Nov 25 21:44:11 kbl-gbbki7 kernel: [ 1746.711104] PGD 25df92067
> Nov 25 21:44:11 kbl-gbbki7 kernel: [ 1746.79] PUD 25b1f0067
> Nov 25 21:44:11 kbl-gbbki7 kernel: [ 1746.711134] PMD 0
> Nov 25 21:44:11 kbl-gbbki7 kernel: [ 1746.711140]
> Nov 25
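
The first remedy described in the quoted message, serialising the dependency walk against retirement with a single lock, can be sketched in userspace C. This is only a rough analogue built on pthreads and is not the i915 code: `struct request`, `big_lock` (standing in for struct_mutex), `retire_oldest()`, `bump_priority()` and `run_demo()` are all hypothetical names, and each request has a single dependency to keep the walk short.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a request; not the i915 structure. */
struct request {
	int priority;
	struct request *dep;	/* the one request this request depends on */
};

/* Plays the role of struct_mutex: one lock shared by walker and retirer. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static struct request *head;	/* newest request; older ones hang off ->dep */

/* Build a chain of n requests, each depending on the previous one. */
static int build_chain(int n)
{
	for (int i = 0; i < n; i++) {
		struct request *rq = calloc(1, sizeof(*rq));

		if (!rq)
			return -1;
		rq->dep = head;
		head = rq;
	}
	return 0;
}

/* Unlink and free the oldest request. Returns 0 once nothing is left. */
static int retire_oldest(void)
{
	int freed = 0;

	pthread_mutex_lock(&big_lock);
	struct request **link = &head;
	if (*link) {
		while ((*link)->dep)
			link = &(*link)->dep;
		free(*link);	/* safe: no walker can be inside the chain */
		*link = NULL;
		freed = 1;
	}
	pthread_mutex_unlock(&big_lock);
	return freed;
}

/* Walk the dependencies and raise their priority. Returns nodes visited. */
static int bump_priority(int prio)
{
	int visited = 0;

	pthread_mutex_lock(&big_lock);
	for (struct request *rq = head; rq; rq = rq->dep) {
		if (rq->priority < prio)
			rq->priority = prio;
		visited++;
	}
	pthread_mutex_unlock(&big_lock);
	return visited;
}

static void *retire_thread(void *arg)
{
	(void)arg;
	while (retire_oldest())
		;
	return NULL;
}

/* Race a retiring thread against repeated priority bumps. 0 on success. */
int run_demo(int n)
{
	pthread_t tid;

	if (build_chain(n))
		return -1;
	if (pthread_create(&tid, NULL, retire_thread, NULL))
		return -1;
	while (bump_priority(1024) > 0)	/* 1024 is an arbitrary priority */
		;
	pthread_join(tid, NULL);
	return head == NULL ? 0 : -1;
}
```

Calling `run_demo(1000)` from `main` (built with `-pthread`) should return 0 once every request has been retired. Dropping the lock from either side reopens the window in which the walker dereferences a request that the other thread has just freed, which is the shape of the race described above. The RCU alternative would instead leave the walk lock-free and defer the free until all readers have left their read-side section.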