Re: [Intel-gfx] Oops with i915
On Mon, Jun 18, 2018 at 01:29:02PM +0100, Sudip Mukherjee wrote:
> Hi Ville,
>
> On Mon, Jun 18, 2018 at 03:09:15PM +0300, Ville Syrjälä wrote:
> > On Thu, Jun 07, 2018 at 11:06:33AM +0100, Sudip Mukherjee wrote:
> > > [...]
> >
> > Presumably a use after free in atomic. Possibly 21a01abbe32a
> > ("drm/atomic: Fix freeing connector/plane state too early by tracking
> > commits, v3.") But there may have been other similar fixes.
>
> Thanks for your reply. I also thought so, as the stack trace showed it was
> using invalid memory for the old_state. So I applied:
> 21a01abbe32a ("drm/atomic: Fix freeing connector/plane state too early by
> tracking commits, v3.")
> on top of v4.14.47. It also needed:
> 1) f46640b931e5 ("drm/atomic: Return commit in drm_crtc_commit_get for better
> annotation")
> 2) 163bcc2c74a2 ("drm/atomic: Move drm_crtc_commit to drm_crtc_state, v4.")
>
> to apply cleanly. But after that the occurrence rate increased.
> Did I miss something else?

No idea. I suggest a reverse bisect to find out when it got fixed in
upstream.

-- 
Ville Syrjälä
Intel
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] Oops with i915
Hi Ville,

On Mon, Jun 18, 2018 at 03:09:15PM +0300, Ville Syrjälä wrote:
> On Thu, Jun 07, 2018 at 11:06:33AM +0100, Sudip Mukherjee wrote:
> > [...]
>
> Presumably a use after free in atomic. Possibly 21a01abbe32a
> ("drm/atomic: Fix freeing connector/plane state too early by tracking
> commits, v3.") But there may have been other similar fixes.

Thanks for your reply. I also thought so, as the stack trace showed it was
using invalid memory for the old_state. So I applied:
21a01abbe32a ("drm/atomic: Fix freeing connector/plane state too early by
tracking commits, v3.")
on top of v4.14.47. It also needed:
1) f46640b931e5 ("drm/atomic: Return commit in drm_crtc_commit_get for better
annotation")
2) 163bcc2c74a2 ("drm/atomic: Move drm_crtc_commit to drm_crtc_state, v4.")

to apply cleanly. But after that the occurrence rate increased.
Did I miss something else?

I would appreciate your help in finding a fix for this.

-- 
Regards
Sudip
Re: [Intel-gfx] Oops with i915
On Thu, Jun 07, 2018 at 11:06:33AM +0100, Sudip Mukherjee wrote:
> Hi All,
>
> We are running a v4.14.47 kernel, and recently in one of our test cycles
> we saw the trace below. I know this is not the usual way to raise a
> bug report, but since this was seen only once, in one of the automated
> test cycles, I do not have anything else apart from this trace.
> Is this a known issue? I would appreciate any help in understanding what
> the problem might be.
>
> [...]

Presumably a use after free in atomic. Possibly 21a01abbe32a
("drm/atomic: Fix freeing connector/plane state too early by tracking
commits, v3.") But there may have been other similar fixes.

-- 
Ville Syrjälä
Intel
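One way to check the use-after-free theory against the trace is to look at what the faulting code was doing. If the Code: bytes are read as x86 instructions, the store at the fault (`<89> 10`) writes through a pointer computed from the tail field of the lock word, so a garbage tail would suggest that the lock itself lived in freed or reused memory. The sketch below decodes a 16-bit tail, assuming the v4.14 qspinlock layout for NR_CPUS < 16K (2 bits of node index, then CPU number + 1) and assuming the archived register dump only lost leading zeros, so that EBX is 0x00003ba0. The names `tail_info` and `decode_tail16` are hypothetical, not kernel functions.

```c
#include <stdint.h>

/*
 * Assumed layout of the upper 16 bits of a qspinlock word (NR_CPUS < 16K):
 *   bits 0-1  : index of the per-CPU MCS node
 *   bits 2-15 : CPU number + 1 (0 means "no tail")
 */
#define TAIL_IDX_BITS 2u
#define TAIL_IDX_MASK ((1u << TAIL_IDX_BITS) - 1u)

/* Hypothetical helper type: the decoded owner of the queue tail. */
struct tail_info {
	int cpu;          /* CPU number, or -1 if the tail is empty */
	unsigned int idx; /* MCS node index on that CPU */
};

/* Decode the 16-bit tail half-word of the lock word. */
struct tail_info decode_tail16(uint16_t tail16)
{
	struct tail_info t = {
		.cpu = (int)(tail16 >> TAIL_IDX_BITS) - 1,
		.idx = tail16 & TAIL_IDX_MASK,
	};
	return t;
}
```

Under these assumptions, any tail half-word from 0xee80 to 0xee83 reproduces EBX = 0x3ba0 and decodes to CPU 15263, which is implausible for this system. This reading of the registers is an interpretation of the trace, not something stated in the thread.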
Re: [Intel-gfx] Oops with i915
On Thu, Jun 07, 2018 at 11:06:33AM +0100, Sudip Mukherjee wrote:
> Hi All,
>
> We are running a v4.14.47 kernel, and recently in one of our test cycles
> we saw the trace below. I know this is not the usual way to raise a
> bug report, but since this was seen only once, in one of the automated
> test cycles, I do not have anything else apart from this trace.
> Is this a known issue? I would appreciate any help in understanding what
> the problem might be.
>
> [...]

A gentle ping on this issue. Can anyone please help me with this?

-- 
Regards
Sudip
[Intel-gfx] Oops with i915
Hi All,

We are running a v4.14.47 kernel, and recently in one of our test cycles
we saw the trace below. I know this is not the usual way to raise a
bug report, but since this was seen only once, in one of the automated
test cycles, I do not have anything else apart from this trace.
Is this a known issue? I would appreciate any help in understanding what
the problem might be.

[ 1176.909543] BUG: unable to handle kernel paging request at 8298fb0a
[ 1176.916565] IP: queued_spin_lock_slowpath+0xfc/0x142
[ 1176.922111] *pdpt = 3367a001 *pde =
[ 1176.928534] Oops: 0002 [#1] PREEMPT SMP
[ 1177.002434] CPU: 2 PID: 24688 Comm: kworker/u8:4 Tainted: G U O 4.14.47-20180606-a6b8390e8cc1de032b8314d1a5b193fe9e21f325 #1
[ 1177.024120] Workqueue: events_unbound intel_atomic_commit_work
[ 1177.030630] task: ef2ee200 task.stack: efbf4000
[ 1177.035685] EIP: queued_spin_lock_slowpath+0xfc/0x142
[ 1177.041327] EFLAGS: 00010087 CPU: 2
[ 1177.045212] EAX: 8298fb0a EBX: 3ba0 ECX: ee82489c EDX: f4656fc0
[ 1177.052215] ESI: 000c EDI: 0001 EBP: efbf5e88 ESP: efbf5e78
[ 1177.059217] DS: 007b ES: 007b FS: 00d8 GS: SS: 0068
[ 1177.065239] CR0: 80050033 CR2: 8298fb0a CR3: 2e8ed320 CR4: 001006f0
[ 1177.072240] Call Trace:
[ 1177.074973] _raw_spin_lock_irqsave+0x28/0x2d
[ 1177.079840] complete_all+0x12/0x36
[ 1177.083737] drm_atomic_helper_commit_hw_done+0x3c/0x43
[ 1177.089576] intel_atomic_commit_tail+0xa5f/0xbd9
[ 1177.094832] ? wait_woken+0x5a/0x5a
[ 1177.098727] ? wait_woken+0x5a/0x5a
[ 1177.102622] intel_atomic_commit_work+0xb/0xd
[ 1177.107489] ? intel_atomic_commit_work+0xb/0xd
[ 1177.112551] process_one_work+0x109/0x1ee
[ 1177.117029] worker_thread+0x1a4/0x257
[ 1177.121215] kthread+0xee/0xf3
[ 1177.124625] ? rescuer_thread+0x207/0x207
[ 1177.129103] ? kthread_create_on_node+0x1a/0x1a
[ 1177.134165] ret_from_fork+0x2e/0x38
[ 1177.138156] Code: 12 09 de 89 f0 89 75 f0 c1 e8 10 66 87 41 02 89 c3 c1 e3 10 74 51 83 e0 03 c1 eb 12 6b c0 0c 05 c0 1f 7e c1 03 04 9d d8 b1 6c c1 <89> 10 8b 42 04 85 c0 75 04 f3 90 eb f5 8b 1a 85 db 74 03 0f 0d
[ 1177.159204] EIP: queued_spin_lock_slowpath+0xfc/0x142 SS:ESP: 0068:efbf5e78
[ 1177.166983] CR2: 8298fb0a

-- 
Regards
Sudip
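The "Oops: 0002" line in the trace above carries the x86 page-fault error code. A minimal sketch of decoding its low three bits follows. The bit meanings are the hardware-defined ones; the names `pf_info` and `decode_pf_error` are made up for illustration and are not kernel APIs.

```c
#include <stdbool.h>

/* Low bits of the x86 page-fault error code. */
#define PF_PROT  (1u << 0) /* 0: page not present, 1: protection violation */
#define PF_WRITE (1u << 1) /* 0: read access,      1: write access */
#define PF_USER  (1u << 2) /* 0: kernel mode,      1: user mode */

/* Hypothetical helper type holding the decoded bits. */
struct pf_info {
	bool present; /* the page was present (protection fault) */
	bool write;   /* the access was a write */
	bool user;    /* the access came from user mode */
};

/* Split an error code such as 0x0002 into its three low flags. */
struct pf_info decode_pf_error(unsigned int code)
{
	struct pf_info info = {
		.present = (code & PF_PROT) != 0,
		.write   = (code & PF_WRITE) != 0,
		.user    = (code & PF_USER) != 0,
	};
	return info;
}
```

For the code reported here, `decode_pf_error(0x0002)` describes a kernel-mode write to a page that is not present, which is consistent with CR2 and EAX both holding 8298fb0a.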
Re: [Intel-gfx] [OOPS] drm/i915/execlists: Remove too-early assert
Quoting Chris Wilson (2018-02-16 15:32:10)
> We can't assert that the execlists are active before we set the flag. So
> perform the assert after we are expected to have marked the execlists
> active.
>
> Fixes: 339ccd35b42c ("drm/i915: Assert that we always complete a submission to guc/execlists")
> Signed-off-by: Chris Wilson
> Cc: Michał Winiarski
> Cc: Mika Kuoppala

From irc,
Acked-by: Tomi Sarvela
-Chris
[Intel-gfx] [OOPS] drm/i915/execlists: Remove too-early assert
We can't assert that the execlists are active before we set the flag. So
perform the assert after we are expected to have marked the execlists
active.

Fixes: 339ccd35b42c ("drm/i915: Assert that we always complete a submission to guc/execlists")
Signed-off-by: Chris Wilson
Cc: Michał Winiarski
Cc: Mika Kuoppala
---
 drivers/gpu/drm/i915/intel_lrc.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 6fbe1a8a37ad..9b6d781b22ec 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -644,8 +644,6 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	port_assign(port, last);
 
 	/* We must always keep the beast fed if we have work piled up */
-	GEM_BUG_ON(port_isset(execlists->port) &&
-		   !execlists_is_active(execlists, EXECLISTS_ACTIVE_USER));
 	GEM_BUG_ON(execlists->first && !port_isset(execlists->port));
 
 unlock:
@@ -655,6 +653,9 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		execlists_set_active(execlists, EXECLISTS_ACTIVE_USER);
 		execlists_submit_ports(engine);
 	}
+
+	GEM_BUG_ON(port_isset(execlists->port) &&
+		   !execlists_is_active(execlists, EXECLISTS_ACTIVE_USER));
 }
 
 void
-- 
2.16.1
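The ordering problem the patch fixes can be shown with a toy model. This is only a sketch: `toy_execlists` and the two `dequeue_check_*` functions are hypothetical stand-ins, not i915 code, and the model assumes that the port is assigned first and the active flag is set only afterwards in the submit path, as the diff context suggests. Each function returns whether the invariant held at the point where the assert sits.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two pieces of state the assert compares. */
struct toy_execlists {
	bool port_set;    /* models port_isset(execlists->port) */
	bool active_user; /* models the EXECLISTS_ACTIVE_USER flag */
};

/* The asserted invariant: a set port implies the active flag is set. */
bool invariant_holds(const struct toy_execlists *el)
{
	return !el->port_set || el->active_user;
}

/* Old ordering: the check runs before the flag is set. */
bool dequeue_check_early(struct toy_execlists *el, bool submit)
{
	bool ok;

	if (submit)
		el->port_set = true;    /* models port_assign() */
	ok = invariant_holds(el);       /* too early when submit is true */
	if (submit)
		el->active_user = true; /* models execlists_set_active() */
	return ok;
}

/* New ordering: the check runs after the flag is set. */
bool dequeue_check_late(struct toy_execlists *el, bool submit)
{
	if (submit) {
		el->port_set = true;
		el->active_user = true;
	}
	return invariant_holds(el);
}
```

Starting from an idle state, `dequeue_check_early(&el, true)` reports a violated invariant even though nothing is wrong, while `dequeue_check_late(&el, true)` reports that it holds.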