> Date: Wed, 12 Feb 2020 15:24:46 +0100
> From: Martin Pieuchot <[email protected]>
Haven't forgotten about these.

> Some warnings reported by WITNESS:
>
> witness: lock order reversal:
>  1st 0xffff800001332b38 &rq->lock (&rq->lock)
>  2nd 0xffff8000006a0050 rcs0 (&timeline->lock)
> lock order "&timeline->lock"(mutex) -> "&rq->lock"(mutex) first seen at:
> #0 witness_checkorder+0x449
> #1 mtx_enter+0x34
> #2 __i915_request_submit+0x5b
> #3 __execlists_submission_tasklet+0x1b9
> #4 execlists_submit_request+0x1d1
> #5 submit_notify+0x37
> #6 __i915_sw_fence_complete+0x40
> #7 i915_request_add+0x2d3
> #8 i915_gem_init+0x2b9
> #9 i915_driver_load+0x81b
> #10 inteldrm_attachhook+0x2c
> #11 config_process_deferred_mountroot+0x6b
> #12 main+0x755
> #13 longmode_hi+0x9c
> lock order "&rq->lock"(mutex) -> "&timeline->lock"(mutex) first seen at:
> #0 witness_checkorder+0x449
> #1 mtx_enter+0x34
> #2 execlists_submit_request+0x2a
> #3 submit_notify+0x37
> #4 __i915_sw_fence_complete+0x40
> #5 dma_i915_sw_fence_wake+0x1d
> #6 notify_ring+0x1a8
> #7 gen8_gt_irq_handler+0xba
> #8 gen8_irq_handler+0x114
> #9 intr_handler+0x6e
> #10 Xintr_ioapic_edge16_untramp+0x19f
> #11 acpicpu_idle+0x1d2
> #12 sched_idle+0x225
> #13 proc_trampoline+0x1c

This one smells like a bug in the 2nd codepath as I think the timeline
lock needs to be taken before the request lock. But I have a hard
time figuring out how this happens in that 2nd codepath.
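To make the reversal concrete, here is a minimal userland sketch of the kind of bookkeeping that produces such a report. It is not the kernel's WITNESS code; the names `lock_class`, `seen` and `order_check` are hypothetical and only model the two lock classes from the trace above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes, named after the two locks in the report. */
enum lock_class { LC_TIMELINE, LC_RQ, LC_COUNT };

/* seen[a][b] is true once class b was acquired while class a was held. */
static bool seen[LC_COUNT][LC_COUNT];

/*
 * Record that `next` is being acquired while `held` is already held.
 * Returns true if the opposite order was recorded earlier, which is
 * what a "lock order reversal" report amounts to in this toy model.
 */
static bool
order_check(enum lock_class held, enum lock_class next)
{
	assert(held < LC_COUNT && next < LC_COUNT);

	if (seen[next][held])
		return true;		/* reversal: next -> held seen before */

	seen[held][next] = true;	/* remember held -> next */
	return false;
}
```

In this model the first path (timeline lock held, request lock taken) records the order, and the second path (request lock held, timeline lock taken) is the one that gets flagged, simply because it came second. Which of the two orders is the correct one still has to be decided by reading the driver.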
> witness: lock order reversal:
>  1st 0xffff800001332678 &wqh->lock (&wqh->lock)
>  2nd 0xffff8000006a0050 rcs0 (&timeline->lock)
> lock order "&wqh->lock"(mutex) -> "&timeline->lock"(mutex) first seen at:
> #0 witness_checkorder+0x449
> #1 mtx_enter+0x34
> #2 execlists_submit_request+0x2a
> #3 submit_notify+0x37
> #4 __i915_sw_fence_complete+0x40
> #5 i915_sw_fence_wake+0x39
> #6 __i915_sw_fence_complete+0x131
> #7 dma_i915_sw_fence_wake+0x1d
> #8 notify_ring+0x1a8
> #9 gen8_gt_irq_handler+0xba
> #10 gen8_irq_handler+0x114
> #11 intr_handler+0x6e
> #12 Xintr_ioapic_edge16_untramp+0x19f
> #13 acpicpu_idle+0x1d2
> #14 sched_idle+0x225
> #15 proc_trampoline+0x1c

Not sure what this one means. But it is the same as the 2nd path above.

> witness: acquiring duplicate lock of same type: "&wqh->lock"
>  1st &wqh->lock
>  2nd &wqh->lock
> Starting stack trace...
> witness_checkorder(ffff800001333980,9,0) at witness_checkorder+0x6ba
> mtx_enter(ffff800001333970) at mtx_enter+0x34
> __i915_sw_fence_complete(ffff800001333970,ffff800022280270) at
> __i915_sw_fence_complete+0x58
> i915_sw_fence_wake(ffff8000013339c8,1,0,ffff800022280270) at
> i915_sw_fence_wake+0x39
> __i915_sw_fence_complete(ffff800001332668,0) at __i915_sw_fence_complete+0x131
> dma_i915_sw_fence_wake(ffff8000013322c8,ffff800001355b20) at
> dma_i915_sw_fence_wake+0x1d
> notify_ring(ffff800000a75000) at notify_ring+0x1a8
> gen8_gt_irq_handler(ffff800000154000,2,ffff8000222803b0) at
> gen8_gt_irq_handler+0xba
> gen8_irq_handler(0,ffff800000154078) at gen8_irq_handler+0x114
> intr_handler(ffff800022280450,ffff80000013fd00) at intr_handler+0x6e
> Xintr_ioapic_edge16_untramp() at Xintr_ioapic_edge16_untramp+0x19f
> acpicpu_idle() at acpicpu_idle+0x1d2
> sched_idle(ffffffff81effff0) at sched_idle+0x225
> end trace frame: 0x0, count: 244
> End of stack trace.

This may simply be a case of adding a DUPOK. But again it's the same
suspicious code path...
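For reference, a minimal userland sketch of what a DUPOK-style marking changes, assuming the checker warns when a second lock of the same type is taken and the new lock is not flagged. `toy_lock`, `TOY_DUPOK` and `dup_warn` are hypothetical names for illustration, not kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical flag standing in for a "duplicates are OK" marking. */
#define TOY_DUPOK	0x01

struct toy_lock {
	const char *type;	/* lock type name, e.g. "&wqh->lock" */
	int flags;		/* 0 or TOY_DUPOK */
};

/*
 * Returns true if taking `next` while `held` is held should produce a
 * "duplicate lock of same type" warning in this toy model: both locks
 * share a type and `next` was not flagged as safe to nest.
 */
static bool
dup_warn(const struct toy_lock *held, const struct toy_lock *next)
{
	if (strcmp(held->type, next->type) != 0)
		return false;	/* different types: not a duplicate */

	return (next->flags & TOY_DUPOK) == 0;
}
```

The flag only silences the check. It says nothing about whether nesting two wqh locks is actually safe here, so it would only be appropriate once the code path itself is understood.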
