Re: [Intel-gfx] [RFC] Move BDW workarounds to ring init fn
On Tue, Jul 29, 2014 at 11:27:55PM +0100, Siluvery, Arun wrote:
> On 28/07/2014 18:26, Ville Syrjälä wrote:
>> On Mon, Jul 28, 2014 at 05:31:45PM +0100, arun.siluv...@linux.intel.com wrote:
>>> From: Arun Siluvery arun.siluv...@linux.intel.com
>>>
>>> This patch moves BDW workarounds from init_clock_gating() to render ring
>>> init fn otherwise they are lost when gpu is reset. In case of execlists,
>>> some of the workarounds modify registers that are part of register state
>>> context which doesn't get initialized until init_clock_gating(); this
>>> results in default context with incorrect values as it is restored and
>>> saved before updated by workarounds.
>>
>> I don't think it has to do with execlists. Many of the registers are
>> part of the context image even in ring buffer mode AFAIK.
>>
>>> Open issue:
>>> For Wa4x4STCOptimizationDisable, we set CACHE_MODE_1[6:6] = 1
>>> At the time when HW contexts are enabled after rings are initialized
>>> with default context this workaround is valid but followed by a context
>>> switch this is getting reset, please see below log snippet.
>>
>> This is a bit weird. The default context should have restore inhibit==1
>> so it shouldn't clobber the CACHE_MODE_1 register. There was a specific
>> magic dance you're supposed to do when accessing such registers with
>> mmio, but here we do the write even before the first context switch.
>>
>> Apparently there was some kind of problem with CACHE_MODE_0 on snb too:
>>
>> commit 3a69ddd6f872180b6f61fda87152b37202118fbc
>> Author: Kenneth Graunke kenn...@whitecape.org
>> Date:   Fri Apr 27 12:44:41 2012 -0700
>>
>>     drm/i915: Set the Stencil Cache eviction policy to non-LRA mode.
>>
>> but IIRC I wasn't able to reproduce it when I tried.
>
> Similar to this register I am also applying this in render ring init fn.
>
>> Maybe we need to delay these register writes until we've switched to the
>> default context?
>
> In its current state (WAs applied in init_clock_gating()) we are writing
> these registers after switching to default context.
> When a new hw context is created does all the registers part of context
> start with default values or they sample the current state? and at what
> point this sampling takes place?

We load each uninitialized context with restore inhibit=true so AFAIK
the current register values should stay intact.

> As a test I have updated CACHE_MODE_1 after mi_set_context() then the
> workaround was valid with every context switch but I think it may not be
> the right way otherwise we will have to update other WA registers also
> at this point with every context switch.

Maybe there's something special about the very first context switch?
Though I don't see why that would be the case.

> regards
> Arun
>
>>> ...
>>> [5.978209] [drm:i915_pages_create_for_stolen] offset=0x0, size=8294400
>>> [5.978213] [drm:intel_alloc_plane_obj] plane fb obj 8801472e
>>> [5.978215] [drm:i915_gem_setup_global_gtt] reserving preallocated space: 0 + 7e9000
>>> [5.978216] [drm:i915_gem_setup_global_gtt] clearing unused GTT space: [7e9000, f000]
>>> [5.979613] [drm:i915_gem_init] CACHE_MODE_1: 0x0180
>>> [5.981372] [drm:gen8_ppgtt_init] Allocated 4 pages for page directories (0 wasted)
>>> [5.981373] [drm:gen8_ppgtt_init] Allocated 2048 pages for page tables (0 wasted)
>>> [5.981376] [drm:i915_gem_context_init] HW context support initialized
>>> [5.981462] [drm:i915_gem_init_hw] CACHE_MODE_1: 0x0180
>>> [5.981467] [drm:i915_gem_init_rings] CACHE_MODE_1: 0x0180
>>> [5.981704] [drm:bdw_init_workarounds] CACHE_MODE_1: 0x01C0
>>> [5.981716] [drm:init_status_page] bsd ring hws offset: 0x0081e000
>>> [5.981792] [drm:init_status_page] blitter ring hws offset: 0x0083f000
>>> [5.981910] [drm:init_status_page] video enhancement ring hws offset: 0x0086
>>> [5.982001] [drm:i915_gem_init_hw] CACHE_MODE_1: 0x01C0
>>> [5.982104] [drm:i915_gem_context_enable] Switch render ring to default_context
>>> [5.982106] [drm:i915_gem_render_state_init] render ring: Render state init
>>> [5.982120] [drm:do_switch] render ring, CACHE_MODE_1: 0x01C0, uninitialized: 1
>>> [5.982121] [drm:i915_gem_context_enable] Switch bsd ring to default_context
>>> [5.982122] [drm:do_switch] bsd ring, CACHE_MODE_1: 0x01C0, uninitialized: 0
>>> [5.982123] [drm:i915_gem_context_enable] Switch blitter ring to default_context
>>> [5.982126] [drm:do_switch] blitter ring, CACHE_MODE_1: 0x01C0, uninitialized: 0
>>> [5.982126] [drm:i915_gem_context_enable] Switch video enhancement ring to default_context
>>> [5.982128] [drm:do_switch] video enhancement ring, CACHE_MODE_1: 0x01C0, uninitialized: 0
>>> [5.982133] [drm:i915_gem_init] CACHE_MODE_1: 0x01C0
>>> [5.982258] [drm:intel_init_clock_gating]
>>> ...
>>> [ 10.037019] [drm:do_switch] blitter ring, CACHE_MODE_1: 0x0180, uninitialized: 0
>>> ...
>>> [ 10.488145] [drm:do_switch] render ring,
Re: [Intel-gfx] [RFC] Move BDW workarounds to ring init fn
On 28/07/2014 18:26, Ville Syrjälä wrote:
> On Mon, Jul 28, 2014 at 05:31:45PM +0100, arun.siluv...@linux.intel.com wrote:
>> From: Arun Siluvery arun.siluv...@linux.intel.com
>>
>> This patch moves BDW workarounds from init_clock_gating() to render ring
>> init fn otherwise they are lost when gpu is reset. In case of execlists,
>> some of the workarounds modify registers that are part of register state
>> context which doesn't get initialized until init_clock_gating(); this
>> results in default context with incorrect values as it is restored and
>> saved before updated by workarounds.
>
> I don't think it has to do with execlists. Many of the registers are
> part of the context image even in ring buffer mode AFAIK.
>
>> Open issue:
>> For Wa4x4STCOptimizationDisable, we set CACHE_MODE_1[6:6] = 1
>> At the time when HW contexts are enabled after rings are initialized
>> with default context this workaround is valid but followed by a context
>> switch this is getting reset, please see below log snippet.
>
> This is a bit weird. The default context should have restore inhibit==1
> so it shouldn't clobber the CACHE_MODE_1 register. There was a specific
> magic dance you're supposed to do when accessing such registers with
> mmio, but here we do the write even before the first context switch.
>
> Apparently there was some kind of problem with CACHE_MODE_0 on snb too:
>
> commit 3a69ddd6f872180b6f61fda87152b37202118fbc
> Author: Kenneth Graunke kenn...@whitecape.org
> Date:   Fri Apr 27 12:44:41 2012 -0700
>
>     drm/i915: Set the Stencil Cache eviction policy to non-LRA mode.
>
> but IIRC I wasn't able to reproduce it when I tried.

Similar to this register I am also applying this in render ring init fn.

> Maybe we need to delay these register writes until we've switched to the
> default context?

In its current state (WAs applied in init_clock_gating()) we are writing
these registers after switching to default context.
When a new hw context is created does all the registers part of context
start with default values or they sample the current state? and at what
point this sampling takes place?

As a test I have updated CACHE_MODE_1 after mi_set_context() then the
workaround was valid with every context switch but I think it may not be
the right way otherwise we will have to update other WA registers also
at this point with every context switch.

regards
Arun

>> ...
>> [5.978209] [drm:i915_pages_create_for_stolen] offset=0x0, size=8294400
>> [5.978213] [drm:intel_alloc_plane_obj] plane fb obj 8801472e
>> [5.978215] [drm:i915_gem_setup_global_gtt] reserving preallocated space: 0 + 7e9000
>> [5.978216] [drm:i915_gem_setup_global_gtt] clearing unused GTT space: [7e9000, f000]
>> [5.979613] [drm:i915_gem_init] CACHE_MODE_1: 0x0180
>> [5.981372] [drm:gen8_ppgtt_init] Allocated 4 pages for page directories (0 wasted)
>> [5.981373] [drm:gen8_ppgtt_init] Allocated 2048 pages for page tables (0 wasted)
>> [5.981376] [drm:i915_gem_context_init] HW context support initialized
>> [5.981462] [drm:i915_gem_init_hw] CACHE_MODE_1: 0x0180
>> [5.981467] [drm:i915_gem_init_rings] CACHE_MODE_1: 0x0180
>> [5.981704] [drm:bdw_init_workarounds] CACHE_MODE_1: 0x01C0
>> [5.981716] [drm:init_status_page] bsd ring hws offset: 0x0081e000
>> [5.981792] [drm:init_status_page] blitter ring hws offset: 0x0083f000
>> [5.981910] [drm:init_status_page] video enhancement ring hws offset: 0x0086
>> [5.982001] [drm:i915_gem_init_hw] CACHE_MODE_1: 0x01C0
>> [5.982104] [drm:i915_gem_context_enable] Switch render ring to default_context
>> [5.982106] [drm:i915_gem_render_state_init] render ring: Render state init
>> [5.982120] [drm:do_switch] render ring, CACHE_MODE_1: 0x01C0, uninitialized: 1
>> [5.982121] [drm:i915_gem_context_enable] Switch bsd ring to default_context
>> [5.982122] [drm:do_switch] bsd ring, CACHE_MODE_1: 0x01C0, uninitialized: 0
>> [5.982123] [drm:i915_gem_context_enable] Switch blitter ring to default_context
>> [5.982126] [drm:do_switch] blitter ring, CACHE_MODE_1: 0x01C0, uninitialized: 0
>> [5.982126] [drm:i915_gem_context_enable] Switch video enhancement ring to default_context
>> [5.982128] [drm:do_switch] video enhancement ring, CACHE_MODE_1: 0x01C0, uninitialized: 0
>> [5.982133] [drm:i915_gem_init] CACHE_MODE_1: 0x01C0
>> [5.982258] [drm:intel_init_clock_gating]
>> ...
>> [ 10.037019] [drm:do_switch] blitter ring, CACHE_MODE_1: 0x0180, uninitialized: 0
>> ...
>> [ 10.488145] [drm:do_switch] render ring, CACHE_MODE_1: 0x0180, uninitialized: 0
>> ...
>>
>> I am currently testing this with an igt which triggers a gpu reset and
>> compares WA register contents before and after reset but the test fails
>> because of this register hence not sending it now. Please let me know
>> how to keep this WA valid after a context switch.
>>
>> Arun Siluvery (1):
>>   drm/i915/bdw: Initialize BDW workarounds in render ring init fn
>>
>>  drivers/gpu/drm/i915/i915_debugfs.c | 46 ++
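The message above describes an igt test that compares workaround register contents before and after a GPU reset. The comparison step could be sketched roughly as below. This is only an illustration and not the igt test itself: `struct wa_reg` and `wa_count_mismatches()` are hypothetical names, and the caller is assumed to have already read the registers back into an array by some other means.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one workaround register; names are illustrative only. */
struct wa_reg {
    uint32_t offset;  /* MMIO offset of the register (placeholder in this sketch) */
    uint32_t mask;    /* bits owned by the workaround */
    uint32_t value;   /* expected value of those bits */
};

/*
 * Count the workaround registers whose owned bits differ from the
 * expected value. readback[i] is the value read from wa[i].offset.
 */
static size_t wa_count_mismatches(const struct wa_reg *wa,
                                  const uint32_t *readback, size_t n)
{
    size_t bad = 0;

    assert(wa != NULL && readback != NULL);
    for (size_t i = 0; i < n; i++) {
        if ((readback[i] & wa[i].mask) != (wa[i].value & wa[i].mask))
            bad++;
    }
    return bad;
}
```

With one entry whose mask and value are both bit 6, a readback of 0x01C0 counts as intact and a readback of 0x0180 counts as one mismatch, which matches the CACHE_MODE_1 values shown in the log.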
[Intel-gfx] [RFC] Move BDW workarounds to ring init fn
From: Arun Siluvery arun.siluv...@linux.intel.com

This patch moves BDW workarounds from init_clock_gating() to render ring
init fn otherwise they are lost when gpu is reset. In case of execlists,
some of the workarounds modify registers that are part of register state
context which doesn't get initialized until init_clock_gating(); this
results in default context with incorrect values as it is restored and
saved before updated by workarounds.

Open issue:
For Wa4x4STCOptimizationDisable, we set CACHE_MODE_1[6:6] = 1
At the time when HW contexts are enabled after rings are initialized
with default context this workaround is valid but followed by a context
switch this is getting reset, please see below log snippet.

...
[5.978209] [drm:i915_pages_create_for_stolen] offset=0x0, size=8294400
[5.978213] [drm:intel_alloc_plane_obj] plane fb obj 8801472e
[5.978215] [drm:i915_gem_setup_global_gtt] reserving preallocated space: 0 + 7e9000
[5.978216] [drm:i915_gem_setup_global_gtt] clearing unused GTT space: [7e9000, f000]
[5.979613] [drm:i915_gem_init] CACHE_MODE_1: 0x0180
[5.981372] [drm:gen8_ppgtt_init] Allocated 4 pages for page directories (0 wasted)
[5.981373] [drm:gen8_ppgtt_init] Allocated 2048 pages for page tables (0 wasted)
[5.981376] [drm:i915_gem_context_init] HW context support initialized
[5.981462] [drm:i915_gem_init_hw] CACHE_MODE_1: 0x0180
[5.981467] [drm:i915_gem_init_rings] CACHE_MODE_1: 0x0180
[5.981704] [drm:bdw_init_workarounds] CACHE_MODE_1: 0x01C0
[5.981716] [drm:init_status_page] bsd ring hws offset: 0x0081e000
[5.981792] [drm:init_status_page] blitter ring hws offset: 0x0083f000
[5.981910] [drm:init_status_page] video enhancement ring hws offset: 0x0086
[5.982001] [drm:i915_gem_init_hw] CACHE_MODE_1: 0x01C0
[5.982104] [drm:i915_gem_context_enable] Switch render ring to default_context
[5.982106] [drm:i915_gem_render_state_init] render ring: Render state init
[5.982120] [drm:do_switch] render ring, CACHE_MODE_1: 0x01C0, uninitialized: 1
[5.982121] [drm:i915_gem_context_enable] Switch bsd ring to default_context
[5.982122] [drm:do_switch] bsd ring, CACHE_MODE_1: 0x01C0, uninitialized: 0
[5.982123] [drm:i915_gem_context_enable] Switch blitter ring to default_context
[5.982126] [drm:do_switch] blitter ring, CACHE_MODE_1: 0x01C0, uninitialized: 0
[5.982126] [drm:i915_gem_context_enable] Switch video enhancement ring to default_context
[5.982128] [drm:do_switch] video enhancement ring, CACHE_MODE_1: 0x01C0, uninitialized: 0
[5.982133] [drm:i915_gem_init] CACHE_MODE_1: 0x01C0
[5.982258] [drm:intel_init_clock_gating]
...
[ 10.037019] [drm:do_switch] blitter ring, CACHE_MODE_1: 0x0180, uninitialized: 0
...
[ 10.488145] [drm:do_switch] render ring, CACHE_MODE_1: 0x0180, uninitialized: 0
...

I am currently testing this with an igt which triggers a gpu reset and
compares WA register contents before and after reset but the test fails
because of this register hence not sending it now. Please let me know
how to keep this WA valid after a context switch.

Arun Siluvery (1):
  drm/i915/bdw: Initialize BDW workarounds in render ring init fn

 drivers/gpu/drm/i915/i915_debugfs.c     | 46 ++
 drivers/gpu/drm/i915/intel_pm.c         | 59
 drivers/gpu/drm/i915/intel_ringbuffer.c | 68 +
 3 files changed, 114 insertions(+), 59 deletions(-)

-- 
1.9.2
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
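For readers following the log, the change from 0x0180 to 0x01C0 is just bit 6 being set. The sketch below models that in plain C, assuming CACHE_MODE_1 uses the masked-write convention where the upper 16 bits of the written value select which of the lower 16 bits are updated. The macros mirror the shape of the driver's _MASKED_BIT_ENABLE() helper, but `masked_write()` and `apply_wa_4x4_stc()` are hypothetical software models and not driver code.

```c
#include <assert.h>
#include <stdint.h>

/* High half selects the bits to touch, low half carries their new values. */
#define MASKED_BIT_ENABLE(a)  (((uint32_t)(a) << 16) | (uint32_t)(a))
#define MASKED_BIT_DISABLE(a) ((uint32_t)(a) << 16)

/* Hypothetical model of a masked register write (assumption, not hardware). */
static uint16_t masked_write(uint16_t current, uint32_t value)
{
    uint16_t mask = (uint16_t)(value >> 16);
    uint16_t bits = (uint16_t)(value & 0xffff);

    /* Only bits selected by the mask change; all others keep their value. */
    return (uint16_t)((current & ~mask) | (bits & mask));
}

/* Set CACHE_MODE_1[6:6], as described for Wa4x4STCOptimizationDisable. */
static uint16_t apply_wa_4x4_stc(uint16_t cache_mode_1)
{
    uint16_t updated = masked_write(cache_mode_1, MASKED_BIT_ENABLE(1u << 6));

    assert(updated & (1u << 6));  /* bit 6 must now be set */
    return updated;
}
```

Feeding in the value logged before bdw_init_workarounds, apply_wa_4x4_stc(0x0180) gives 0x01C0, the value logged afterwards. The later readings of 0x0180 therefore mean bit 6 was cleared again.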
Re: [Intel-gfx] [RFC] Move BDW workarounds to ring init fn
On Mon, Jul 28, 2014 at 05:31:45PM +0100, arun.siluv...@linux.intel.com wrote:
> From: Arun Siluvery arun.siluv...@linux.intel.com
>
> This patch moves BDW workarounds from init_clock_gating() to render ring
> init fn otherwise they are lost when gpu is reset. In case of execlists,
> some of the workarounds modify registers that are part of register state
> context which doesn't get initialized until init_clock_gating(); this
> results in default context with incorrect values as it is restored and
> saved before updated by workarounds.

I don't think it has to do with execlists. Many of the registers are
part of the context image even in ring buffer mode AFAIK.

> Open issue:
> For Wa4x4STCOptimizationDisable, we set CACHE_MODE_1[6:6] = 1
> At the time when HW contexts are enabled after rings are initialized
> with default context this workaround is valid but followed by a context
> switch this is getting reset, please see below log snippet.

This is a bit weird. The default context should have restore inhibit==1
so it shouldn't clobber the CACHE_MODE_1 register. There was a specific
magic dance you're supposed to do when accessing such registers with
mmio, but here we do the write even before the first context switch.

Apparently there was some kind of problem with CACHE_MODE_0 on snb too:

commit 3a69ddd6f872180b6f61fda87152b37202118fbc
Author: Kenneth Graunke kenn...@whitecape.org
Date:   Fri Apr 27 12:44:41 2012 -0700

    drm/i915: Set the Stencil Cache eviction policy to non-LRA mode.

but IIRC I wasn't able to reproduce it when I tried.

Maybe we need to delay these register writes until we've switched to the
default context?

> ...
> [5.978209] [drm:i915_pages_create_for_stolen] offset=0x0, size=8294400
> [5.978213] [drm:intel_alloc_plane_obj] plane fb obj 8801472e
> [5.978215] [drm:i915_gem_setup_global_gtt] reserving preallocated space: 0 + 7e9000
> [5.978216] [drm:i915_gem_setup_global_gtt] clearing unused GTT space: [7e9000, f000]
> [5.979613] [drm:i915_gem_init] CACHE_MODE_1: 0x0180
> [5.981372] [drm:gen8_ppgtt_init] Allocated 4 pages for page directories (0 wasted)
> [5.981373] [drm:gen8_ppgtt_init] Allocated 2048 pages for page tables (0 wasted)
> [5.981376] [drm:i915_gem_context_init] HW context support initialized
> [5.981462] [drm:i915_gem_init_hw] CACHE_MODE_1: 0x0180
> [5.981467] [drm:i915_gem_init_rings] CACHE_MODE_1: 0x0180
> [5.981704] [drm:bdw_init_workarounds] CACHE_MODE_1: 0x01C0
> [5.981716] [drm:init_status_page] bsd ring hws offset: 0x0081e000
> [5.981792] [drm:init_status_page] blitter ring hws offset: 0x0083f000
> [5.981910] [drm:init_status_page] video enhancement ring hws offset: 0x0086
> [5.982001] [drm:i915_gem_init_hw] CACHE_MODE_1: 0x01C0
> [5.982104] [drm:i915_gem_context_enable] Switch render ring to default_context
> [5.982106] [drm:i915_gem_render_state_init] render ring: Render state init
> [5.982120] [drm:do_switch] render ring, CACHE_MODE_1: 0x01C0, uninitialized: 1
> [5.982121] [drm:i915_gem_context_enable] Switch bsd ring to default_context
> [5.982122] [drm:do_switch] bsd ring, CACHE_MODE_1: 0x01C0, uninitialized: 0
> [5.982123] [drm:i915_gem_context_enable] Switch blitter ring to default_context
> [5.982126] [drm:do_switch] blitter ring, CACHE_MODE_1: 0x01C0, uninitialized: 0
> [5.982126] [drm:i915_gem_context_enable] Switch video enhancement ring to default_context
> [5.982128] [drm:do_switch] video enhancement ring, CACHE_MODE_1: 0x01C0, uninitialized: 0
> [5.982133] [drm:i915_gem_init] CACHE_MODE_1: 0x01C0
> [5.982258] [drm:intel_init_clock_gating]
> ...
> [ 10.037019] [drm:do_switch] blitter ring, CACHE_MODE_1: 0x0180, uninitialized: 0
> ...
> [ 10.488145] [drm:do_switch] render ring, CACHE_MODE_1: 0x0180, uninitialized: 0
> ...
>
> I am currently testing this with an igt which triggers a gpu reset and
> compares WA register contents before and after reset but the test fails
> because of this register hence not sending it now. Please let me know
> how to keep this WA valid after a context switch.
>
> Arun Siluvery (1):
>   drm/i915/bdw: Initialize BDW workarounds in render ring init fn
>
>  drivers/gpu/drm/i915/i915_debugfs.c     | 46 ++
>  drivers/gpu/drm/i915/intel_pm.c         | 59
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 68 +
>  3 files changed, 114 insertions(+), 59 deletions(-)
>
> -- 
> 1.9.2

-- 
Ville Syrjälä
Intel OTC
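The expectation stated in this reply, that a context loaded with restore inhibit set should leave the live register values alone, can be written down as a toy model. The sketch below is only that: `struct hw_context`, `struct gpu_model` and `switch_context()` are hypothetical names, a single register stands in for the whole context image, and it says nothing about what the hardware actually does, which is the open question in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one saved register per context image. */
struct hw_context {
    uint16_t saved_cache_mode_1;  /* value stored in the context image */
    bool initialized;             /* false until the image has been saved once */
};

struct gpu_model {
    uint16_t cache_mode_1;        /* live register value */
    struct hw_context *current;   /* context currently loaded, or NULL */
};

static void switch_context(struct gpu_model *gpu, struct hw_context *to)
{
    assert(gpu != NULL && to != NULL);

    /* Save the live state into the outgoing context image. */
    if (gpu->current) {
        gpu->current->saved_cache_mode_1 = gpu->cache_mode_1;
        gpu->current->initialized = true;
    }

    /*
     * Restore inhibit: an uninitialized context does not load its image,
     * so the live register keeps whatever value it already had.
     */
    bool restore_inhibit = !to->initialized;
    if (!restore_inhibit)
        gpu->cache_mode_1 = to->saved_cache_mode_1;

    gpu->current = to;
}
```

In this model, a register written to 0x01C0 before the first switch keeps that value across switches between fresh contexts. It only drops back to 0x0180 if some context image was saved while the register still held 0x0180 and is restored later, which is one way the ordering concern raised above could show up.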