Re: [Intel-gfx] [PATCH 14/27] drm/i915: Don't mark an execlists context-switch when idle

2017-04-20 Thread Joonas Lahtinen
On ke, 2017-04-19 at 10:41 +0100, Chris Wilson wrote:
> If we *know* that the engine is idle, i.e. we have not more contexts in
> lift, we can skip any spurious CSB idle interrupts. These spurious

in flight?

> interrupts seem to arrive long after we assert that the engines are
> completely idle, triggering later assertions:
> 
> [  178.896646] intel_engine_is_idle(bcs): interrupt not handled, irq_posted=2
> [  178.896655] [ cut here ]
> [  178.896658] kernel BUG at drivers/gpu/drm/i915/intel_engine_cs.c:226!
> [  178.896661] invalid opcode:  [#1] SMP
> [  178.896663] Modules linked in: i915(E) x86_pkg_temp_thermal(E) 
> crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) ghash_clmulni_intel(E) 
> nls_ascii(E) nls_cp437(E) vfat(E) fat(E) intel_gtt(E) i2c_algo_bit(E) 
> drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) 
> aesni_intel(E) prime_numbers(E) evdev(E) aes_x86_64(E) drm(E) crypto_simd(E) 
> cryptd(E) glue_helper(E) mei_me(E) mei(E) lpc_ich(E) efivars(E) mfd_core(E) 
> battery(E) video(E) acpi_pad(E) button(E) tpm_tis(E) tpm_tis_core(E) tpm(E) 
> autofs4(E) i2c_i801(E) fan(E) thermal(E) i2c_designware_platform(E) 
> i2c_designware_core(E)
> [  178.896694] CPU: 1 PID: 522 Comm: gem_exec_whispe Tainted: GE  
>  4.11.0-rc5+ #14
> [  178.896702] task: 88040aba8d40 task.stack: c93f
> [  178.896722] RIP: 0010:intel_engine_init_global_seqno+0x1db/0x1f0 [i915]
> [  178.896725] RSP: 0018:c93f3ab0 EFLAGS: 00010246
> [  178.896728] RAX:  RBX: 88040af54000 RCX: 
> 
> [  178.896731] RDX: 88041ec933e0 RSI: 88041ec8cc48 RDI: 
> 88041ec8cc48
> [  178.896734] RBP: c93f3ac8 R08:  R09: 
> 047d
> [  178.896736] R10: 0040 R11: 88040b344f80 R12: 
> 
> [  178.896739] R13: 88040bce R14: 88040bce52d8 R15: 
> 88040bce
> [  178.896742] FS:  7f22d8c0() GS:88041ec8() 
> knlGS:
> [  178.896746] CS:  0010 DS:  ES:  CR0: 80050033
> [  178.896749] CR2: 7f41ddd8f000 CR3: 00040bb03000 CR4: 
> 001406e0
> [  178.896752] Call Trace:
> [  178.896768]  reset_all_global_seqno.part.33+0x4e/0xd0 [i915]
> [  178.896782]  i915_gem_request_alloc+0x304/0x330 [i915]
> [  178.896795]  i915_gem_do_execbuffer+0x8a1/0x17d0 [i915]
> [  178.896799]  ? remove_wait_queue+0x48/0x50
> [  178.896812]  ? i915_wait_request+0x300/0x590 [i915]
> [  178.896816]  ? wake_up_q+0x70/0x70
> [  178.896819]  ? refcount_dec_and_test+0x11/0x20
> [  178.896823]  ? reservation_object_add_excl_fence+0xa5/0x100
> [  178.896835]  i915_gem_execbuffer2+0xab/0x1f0 [i915]
> [  178.896844]  drm_ioctl+0x1e6/0x460 [drm]
> [  178.896858]  ? i915_gem_execbuffer+0x260/0x260 [i915]
> [  178.896862]  ? dput+0xcf/0x250
> [  178.896866]  ? full_proxy_release+0x66/0x80
> [  178.896869]  ? mntput+0x1f/0x30
> [  178.896872]  do_vfs_ioctl+0x8f/0x5b0
> [  178.896875]  ? fput+0x9/0x10
> [  178.896878]  ? task_work_run+0x80/0xa0
> [  178.896881]  SyS_ioctl+0x3c/0x70
> [  178.896885]  entry_SYSCALL_64_fastpath+0x17/0x98
> [  178.896888] RIP: 0033:0x7f2ccb455ca7
> [  178.896890] RSP: 002b:7ffcabec72d8 EFLAGS: 0246 ORIG_RAX: 
> 0010
> [  178.896894] RAX: ffda RBX: 55f897a44b90 RCX: 
> 7f2ccb455ca7
> [  178.896897] RDX: 7ffcabec74a0 RSI: 40406469 RDI: 
> 0003
> [  178.896900] RBP: 7f2ccb70a440 R08: 7f2ccb70d0a4 R09: 
> 
> [  178.896903] R10:  R11: 0246 R12: 
> 
> [  178.896905] R13: 55f89782d71a R14: 7ffcabecf838 R15: 
> 0003
> [  178.896908] Code: 00 31 d2 4c 89 ef 8d 70 48 41 ff 95 f8 06 00 00 e9 68 fe 
> ff ff be 0f 00 00 00 48 c7 c7 48 dc 37 a0 e8 fa 33 d6 e0 e9 0b ff ff ff <0f> 
> 0b 0f 0b 0f 0b 0f 0b 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00
> 
> On the other hand, by ignoring the interrupt do we risk running out of
> space in CSB ring? Testing for a few hours suggests not, i.e. that we
> only seem to get the odd delayed CSB idle notification.
> 
> Signed-off-by: Chris Wilson 
> Cc: Tvrtko Ursulin 

Slap your Tested-by too.

Reviewed-by: Joonas Lahtinen 

Even with that, I dislike the port_count macro.

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 14/27] drm/i915: Don't mark an execlists context-switch when idle

2017-04-19 Thread Chris Wilson
If we *know* that the engine is idle, i.e. we have no more contexts in
flight, we can skip any spurious CSB idle interrupts. These spurious
interrupts seem to arrive long after we assert that the engines are
completely idle, triggering later assertions:

[  178.896646] intel_engine_is_idle(bcs): interrupt not handled, irq_posted=2
[  178.896655] [ cut here ]
[  178.896658] kernel BUG at drivers/gpu/drm/i915/intel_engine_cs.c:226!
[  178.896661] invalid opcode:  [#1] SMP
[  178.896663] Modules linked in: i915(E) x86_pkg_temp_thermal(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) ghash_clmulni_intel(E) nls_ascii(E) nls_cp437(E) vfat(E) fat(E) intel_gtt(E) i2c_algo_bit(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) aesni_intel(E) prime_numbers(E) evdev(E) aes_x86_64(E) drm(E) crypto_simd(E) cryptd(E) glue_helper(E) mei_me(E) mei(E) lpc_ich(E) efivars(E) mfd_core(E) battery(E) video(E) acpi_pad(E) button(E) tpm_tis(E) tpm_tis_core(E) tpm(E) autofs4(E) i2c_i801(E) fan(E) thermal(E) i2c_designware_platform(E) i2c_designware_core(E)
[  178.896694] CPU: 1 PID: 522 Comm: gem_exec_whispe Tainted: GE   4.11.0-rc5+ #14
[  178.896702] task: 88040aba8d40 task.stack: c93f
[  178.896722] RIP: 0010:intel_engine_init_global_seqno+0x1db/0x1f0 [i915]
[  178.896725] RSP: 0018:c93f3ab0 EFLAGS: 00010246
[  178.896728] RAX:  RBX: 88040af54000 RCX: 
[  178.896731] RDX: 88041ec933e0 RSI: 88041ec8cc48 RDI: 88041ec8cc48
[  178.896734] RBP: c93f3ac8 R08:  R09: 047d
[  178.896736] R10: 0040 R11: 88040b344f80 R12: 
[  178.896739] R13: 88040bce R14: 88040bce52d8 R15: 88040bce
[  178.896742] FS:  7f22d8c0() GS:88041ec8() knlGS:
[  178.896746] CS:  0010 DS:  ES:  CR0: 80050033
[  178.896749] CR2: 7f41ddd8f000 CR3: 00040bb03000 CR4: 001406e0
[  178.896752] Call Trace:
[  178.896768]  reset_all_global_seqno.part.33+0x4e/0xd0 [i915]
[  178.896782]  i915_gem_request_alloc+0x304/0x330 [i915]
[  178.896795]  i915_gem_do_execbuffer+0x8a1/0x17d0 [i915]
[  178.896799]  ? remove_wait_queue+0x48/0x50
[  178.896812]  ? i915_wait_request+0x300/0x590 [i915]
[  178.896816]  ? wake_up_q+0x70/0x70
[  178.896819]  ? refcount_dec_and_test+0x11/0x20
[  178.896823]  ? reservation_object_add_excl_fence+0xa5/0x100
[  178.896835]  i915_gem_execbuffer2+0xab/0x1f0 [i915]
[  178.896844]  drm_ioctl+0x1e6/0x460 [drm]
[  178.896858]  ? i915_gem_execbuffer+0x260/0x260 [i915]
[  178.896862]  ? dput+0xcf/0x250
[  178.896866]  ? full_proxy_release+0x66/0x80
[  178.896869]  ? mntput+0x1f/0x30
[  178.896872]  do_vfs_ioctl+0x8f/0x5b0
[  178.896875]  ? fput+0x9/0x10
[  178.896878]  ? task_work_run+0x80/0xa0
[  178.896881]  SyS_ioctl+0x3c/0x70
[  178.896885]  entry_SYSCALL_64_fastpath+0x17/0x98
[  178.896888] RIP: 0033:0x7f2ccb455ca7
[  178.896890] RSP: 002b:7ffcabec72d8 EFLAGS: 0246 ORIG_RAX: 0010
[  178.896894] RAX: ffda RBX: 55f897a44b90 RCX: 7f2ccb455ca7
[  178.896897] RDX: 7ffcabec74a0 RSI: 40406469 RDI: 0003
[  178.896900] RBP: 7f2ccb70a440 R08: 7f2ccb70d0a4 R09: 
[  178.896903] R10:  R11: 0246 R12: 
[  178.896905] R13: 55f89782d71a R14: 7ffcabecf838 R15: 0003
[  178.896908] Code: 00 31 d2 4c 89 ef 8d 70 48 41 ff 95 f8 06 00 00 e9 68 fe ff ff be 0f 00 00 00 48 c7 c7 48 dc 37 a0 e8 fa 33 d6 e0 e9 0b ff ff ff <0f> 0b 0f 0b 0f 0b 0f 0b 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00

On the other hand, by ignoring the interrupt do we risk running out of
space in CSB ring? Testing for a few hours suggests not, i.e. that we
only seem to get the odd delayed CSB idle notification.

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_irq.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index fd97fe00cd0d..fb2ac202dec5 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1359,8 +1359,10 @@ gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
 	bool tasklet = false;
 
 	if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift)) {
-		set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
-		tasklet = true;
+		if (port_count(&engine->execlist_port[0])) {
+			set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
+			tasklet = true;
+		}
 	}
 
 	if (iir & (GT_RENDER_USER_INTERRUPT << test_shift)) {
-- 
2.11.0
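
For anyone reading the thread without the tree to hand, the guard added
above can be modelled in user space roughly as follows. This is only a
sketch under stated assumptions: every mock_* and MOCK_* name is
hypothetical, the bit values are arbitrary, the in-flight count is a plain
integer rather than whatever port_count() actually reads, and a plain OR
stands in for the atomic set_bit(). It shows only the behaviour under
discussion: a context-switch interrupt is recorded, and the tasklet
requested, only while port 0 still has something in flight.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are not the driver's definitions. */
#define MOCK_CONTEXT_SWITCH_INTERRUPT (1u << 8)
#define MOCK_IRQ_EXECLIST             (1ul << 0)

struct mock_port {
	unsigned int count;		/* contexts still in flight on this port */
};

struct mock_engine {
	struct mock_port execlist_port[2];
	unsigned long irq_posted;	/* pending-interrupt flags */
};

/*
 * Returns true when the caller should schedule the tasklet.
 * A context-switch interrupt that arrives while port 0 is empty is
 * treated as spurious: nothing is posted and no tasklet is requested.
 */
static bool mock_cs_irq_handler(struct mock_engine *engine,
				uint32_t iir, int test_shift)
{
	bool tasklet = false;

	if (iir & (MOCK_CONTEXT_SWITCH_INTERRUPT << test_shift)) {
		if (engine->execlist_port[0].count) {
			/* Non-atomic here; the real handler uses set_bit(). */
			engine->irq_posted |= MOCK_IRQ_EXECLIST;
			tasklet = true;
		}
	}

	return tasklet;
}
```

With a zero-initialised mock_engine the handler returns false and leaves
irq_posted untouched; once execlist_port[0].count is non-zero, the same
interrupt sets the flag and returns true. Whether dropping the interrupt
can ever let the CSB ring fill up is outside what this model covers.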
