[Intel-gfx] linux-firmware-i915 pull request
Hi, Please consider pulling i915 to linux-firmware.git and let me know if there is anything else I should do differently on the pull request. Besides what I had requested already there is a new minor version of dmc available. Thanks in advance, Rodrigo. The following changes since commit 3161bfa479d5e9ed4f46b57df9bcecbbc4f8eb3c: linux-firmware: Update firmware patch for Intel Bluetooth 7260 (B3/B4) (2015-05-13 11:12:48 -0400) are available in the git repository at: git://people.freedesktop.org/~vivijim/linux-firmware-i915 master for you to fetch changes up to a750f4ee10f962640c479aca184fe65f83956295: linux-firmware: New minor DMC release for Skylake - ver1_18 (2015-06-19 17:29:18 -0700) Rodrigo Vivi (5): linux-firmware: Add i915 DMC firmware linux-firmware: Add i915 GuC firmware linux-firmware: Add i915 DMC firmware for Broxton linux-firmware: New minor DMC release for Skylake. linux-firmware: New minor DMC release for Skylake - ver1_18 LICENSE.i915 | 39 +++ WHENCE | 23 +++ i915/bxt_dmc_ver1.bin | 1 + i915/bxt_dmc_ver1_04.bin | Bin 0 -> 5872 bytes i915/skl_dmc_ver1.bin | 1 + i915/skl_dmc_ver1_04.bin | Bin 0 -> 8048 bytes i915/skl_dmc_ver1_16.bin | Bin 0 -> 7892 bytes i915/skl_dmc_ver1_18.bin | Bin 0 -> 7892 bytes i915/skl_guc_ver1.bin | 1 + i915/skl_guc_ver1_1059.bin | Bin 0 -> 109636 bytes 10 files changed, 65 insertions(+) create mode 100644 LICENSE.i915 create mode 12 i915/bxt_dmc_ver1.bin create mode 100644 i915/bxt_dmc_ver1_04.bin create mode 12 i915/skl_dmc_ver1.bin create mode 100644 i915/skl_dmc_ver1_04.bin create mode 100644 i915/skl_dmc_ver1_16.bin create mode 100644 i915/skl_dmc_ver1_18.bin create mode 12 i915/skl_guc_ver1.bin create mode 100644 i915/skl_guc_ver1_1059.bin ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 2/2] Set alignment value in drm_intel_add_validate_buffer()
+Ben On Fri, Apr 10, 2015 at 5:20 PM, Anuj Phogat wrote: > Signed-off-by: Anuj Phogat > --- > intel/intel_bufmgr_gem.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/intel/intel_bufmgr_gem.c b/intel/intel_bufmgr_gem.c > index 51d87ae..92701a5 100644 > --- a/intel/intel_bufmgr_gem.c > +++ b/intel/intel_bufmgr_gem.c > @@ -459,7 +459,7 @@ drm_intel_add_validate_buffer(drm_intel_bo *bo) > bufmgr_gem->exec_objects[index].handle = bo_gem->gem_handle; > bufmgr_gem->exec_objects[index].relocation_count = > bo_gem->reloc_count; > bufmgr_gem->exec_objects[index].relocs_ptr = (uintptr_t) > bo_gem->relocs; > - bufmgr_gem->exec_objects[index].alignment = 0; > + bufmgr_gem->exec_objects[index].alignment = bo->align; > bufmgr_gem->exec_objects[index].offset = 0; > bufmgr_gem->exec_bos[index] = bo; > bufmgr_gem->exec_count++; > @@ -501,7 +501,7 @@ drm_intel_add_validate_buffer2(drm_intel_bo *bo, int > need_fence) > bufmgr_gem->exec2_objects[index].handle = bo_gem->gem_handle; > bufmgr_gem->exec2_objects[index].relocation_count = > bo_gem->reloc_count; > bufmgr_gem->exec2_objects[index].relocs_ptr = > (uintptr_t)bo_gem->relocs; > - bufmgr_gem->exec2_objects[index].alignment = 0; > + bufmgr_gem->exec2_objects[index].alignment = bo->align; > bufmgr_gem->exec2_objects[index].offset = 0; > bufmgr_gem->exec_bos[index] = bo; > bufmgr_gem->exec2_objects[index].flags = 0; > -- > 2.3.4 > ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 1/2] i965/gen9: Pass alignment as function parameter in drm_intel_gem_bo_alloc_internal()
+Ben. On Fri, Apr 10, 2015 at 5:20 PM, Anuj Phogat wrote: > and use it to initialize the align variable in drm_intel_bo. > > In case of YF/YS tiled buffers libdrm need not know about the tiling > format because these buffers don't have hardware support to be tiled > or detiled through a fenced region. But, libdrm still need to know > about buffer alignment restrictions because kernel uses it when > resolving the relocation. > > Mesa uses drm_intel_gem_bo_alloc_for_render() to allocate Yf/Ys buffers. > So, use the passed alignment value in this function. Note that we continue > ignoring the alignment value passed to drm_intel_gem_bo_alloc() to follow > the previous behavior. > > Cc: Kristian Høgsberg > Cc: Damien Lespiau > Cc: Daniel Vetter > Signed-off-by: Anuj Phogat > --- > intel/intel_bufmgr_gem.c | 11 +++ > 1 file changed, 7 insertions(+), 4 deletions(-) > > diff --git a/intel/intel_bufmgr_gem.c b/intel/intel_bufmgr_gem.c > index 5a67f53..51d87ae 100644 > --- a/intel/intel_bufmgr_gem.c > +++ b/intel/intel_bufmgr_gem.c > @@ -655,7 +655,8 @@ drm_intel_gem_bo_alloc_internal(drm_intel_bufmgr *bufmgr, > unsigned long size, > unsigned long flags, > uint32_t tiling_mode, > - unsigned long stride) > + unsigned long stride, > + unsigned int alignment) > { > drm_intel_bufmgr_gem *bufmgr_gem = (drm_intel_bufmgr_gem *) bufmgr; > drm_intel_bo_gem *bo_gem; > @@ -754,6 +755,7 @@ retry: > return NULL; > } > bo_gem->bo.bufmgr = bufmgr; > + bo_gem->bo.align = alignment; > > bo_gem->tiling_mode = I915_TILING_NONE; > bo_gem->swizzle_mode = I915_BIT_6_SWIZZLE_NONE; > @@ -797,7 +799,8 @@ drm_intel_gem_bo_alloc_for_render(drm_intel_bufmgr > *bufmgr, > { > return drm_intel_gem_bo_alloc_internal(bufmgr, name, size, >BO_ALLOC_FOR_RENDER, > - I915_TILING_NONE, 0); > + I915_TILING_NONE, 0, > + alignment); > } > > static drm_intel_bo * > @@ -807,7 +810,7 @@ drm_intel_gem_bo_alloc(drm_intel_bufmgr *bufmgr, >unsigned int alignment) > { > return drm_intel_gem_bo_alloc_internal(bufmgr, name, size, 0, > - I915_TILING_NONE, 0); > + I915_TILING_NONE, 0, 0); > } > > static drm_intel_bo * > @@ -859,7 +862,7 @@ drm_intel_gem_bo_alloc_tiled(drm_intel_bufmgr *bufmgr, > const char *name, > stride = 0; > > return drm_intel_gem_bo_alloc_internal(bufmgr, name, size, flags, > - tiling, stride); > + tiling, stride, 0); > } > > static drm_intel_bo * > -- > 2.3.4 > ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v2 06/10] drm: Avoid atomic commit path for CRTC property (Gamma)
On Mon, Jun 08, 2015 at 05:58:23PM -0700, Matt Roper wrote: > On Sat, Jun 06, 2015 at 05:34:45PM +0530, Sharma, Shashank wrote: > > Regards > > Shashank > > > > On 6/6/2015 6:31 AM, Matt Roper wrote: > > >On Thu, Jun 04, 2015 at 07:12:37PM +0530, Kausal Malladi wrote: > > >>From: Kausal Malladi > > >> > > >>The atomic CRTC set infrastructure is not available yet. So, the atomic > > >>check path throws error for any non-plane property. > > >> > > >>This patch adds a diversion to avoid commit path for DRM atomic set Gamma > > >>property, until atomic CRTC infrastructure is ready. > > > > > >This doesn't look right, but I don't quite understand what you're trying > > >to do from the commit message. > > > > > >This function is what will implement legacy set_property ioctl's of a > > >CRTC on an atomic-based driver. The idea is that when the ioctl is > > >issued, we should get (i.e., make a duplicate of) the current CRTC > > >state, change some of the values in that state (which is what the > > >.atomic_set_property function takes care of), and then submit that > > >modified state to the driver for checking and hw-programming. > > > > > >Okay, I just took a quick peek ahead in your patch set and I think I > > >understand the confusion now...it looks like you're trying to actually > > >perform the full hardware update in the .atomic_set_property call chain, > > >which isn't what we want to be doing in an atomic driver. > > >.atomic_set_property() is just a matter of updating the state structures > > >to reflect the changes you *want* to make (but not actually performing > > >those changes right away). It isn't until drm_atomic_commit() gets > > >called that we validate the new state structure and then write it to the > > >hardware if it looks okay. > > > > > >Keep in mind that the overall goal behind atomic is that we want to be > > >able to supply several items to be updated (e.g., mode, CSC, gamma, > > >etc.) via the atomic ioctl and then have them all accepted (and > > >programmed) by the driver, or all rejected (so none get programmed) if > > >any of them are invalid. Also, if the collection of properties that > > >you're updating fall within the "nuclear pageflip" subset of > > >functionality, you'll also get a guarantee that those items all get > > >updated within the same vblank; updates that straddle a vblank could > > >cause unwanted flickering or other artifacts. The helper function > > >you're updating here only happens to update a single state value at a > > >time, but the same .atomic_set_property() is also used by the atomic > > >ioctl, so we need to make sure it's following the expected design. > > > > > >Finally, keep in mind that the function you're updating here is a DRM > > >core function. Even though i915 isn't quite ready for full atomic yet > > >and might have a bit of brain damage in areas, other vendors' drivers do > > >have complete atomic modesetting support (I think?), so adding > > >i915-specific workarounds like this to the core function could actually > > >hamper them. > > > > > > > > >Matt > > > > > I understood this point. But in this case, we will be blocked until > > set CRTC implementation comes in. Can you suggest a better way to > > handle this without getting blocked for set_crtc() ? > > Is the functionality you need still lacking with Maarten Lankhorst's > "convert to atomic, part 2" series that recently landed upstream? If > that series isn't sufficient, them I'm pretty sure that his "part 3" > series on the mailing list will fix it; that will hopefully be ready to > merge as soon as I get a chance to do a review of it. > > I'll add Maarten to the Cc here in case he wants to comment... > > > Matt Kausal/Shashank, have you guys had a chance to check out Maarten's latest i915 atomic work? His "part 2" series is already merged and "part 3" should be merged soon and those really help clean up a lot of the Intel-specific pain points caused by the atomic transition. I think we need to make sure you can handle properties properly in an atomic manner, so if you find you're still blocked on the current code, we should look into how to address that so that we don't have to bring any temporary Intel hackiness into the shared DRM core code. Matt > > > >> > > >>Signed-off-by: Shashank Sharma > > >>Signed-off-by: Kausal Malladi > > >>--- > > >> drivers/gpu/drm/drm_atomic_helper.c | 17 ++--- > > >> 1 file changed, 14 insertions(+), 3 deletions(-) > > >> > > >>diff --git a/drivers/gpu/drm/drm_atomic_helper.c > > >>b/drivers/gpu/drm/drm_atomic_helper.c > > >>index a900858..37dba55 100644 > > >>--- a/drivers/gpu/drm/drm_atomic_helper.c > > >>+++ b/drivers/gpu/drm/drm_atomic_helper.c > > >>@@ -1696,6 +1696,8 @@ drm_atomic_helper_crtc_set_property(struct drm_crtc > > >>*crtc, > > >> { > > >> struct drm_atomic_state *state; > > >> struct drm_crtc_state *crtc_state; > > >>+ struct drm_device *dev = crtc->dev; > > >>+ struct drm_mode_config
Re: [Intel-gfx] [PATCH v3 00/19] Convert to atomic, part 3.
On Mon, Jun 15, 2015 at 12:33:37PM +0200, Maarten Lankhorst wrote: > Requisites: > - "[PATCH] drm/atomic: pass old crtc state to atomic_begin/flush." > > This patch series converts plane updates and cdclk updates to atomic, > but still doesn't touch the hw readout code, which was regressing a lot. > > The fixes in this series are needed to support proper hw readout, and can > be applied on top of topic/atomic-conversion. > > In particular, this fixes the following bugs: > - https://bugs.freedesktop.org/show_bug.cgi?id=90874 > Needs atomic CDCLK as part of state, before any plane checks, > or scalers will not work correctly. > - https://bugs.freedesktop.org/show_bug.cgi?id=90868 > It shows a problem with plane visibility on resume. > This is fixed by calcing plane states correctly across modeset. > There's also a problem with DPLL 0 failing to lock, I hope that's > fixed by cdclk changes, but it might have been a bug in the reverted > atomic hw readout patch too. I've spent the last week looking over these and I don't see any major functional problems so I think these are probably ready for merging. A couple patches got into areas of the code I'm not super familiar with, but I've also done quite a bit of testing of this series (on IVB) and haven't run into issues, so that increases my confidence. I think there was one incorrectly identified function parameter I called out on one of the early scaler patches, but with a sensible parameter rename there, you can consider these Reviewed-by: Matt Roper Tested-by(IVB): Matt Roper Matt > > Maarten Lankhorst (19): > drm/i915: Use crtc state in intel_modeset_pipe_config > drm/i915: Clean up intel_atomic_setup_scalers slightly. > drm/i915: Add a simple atomic crtc check function, v2. > drm/i915: Move scaler setup to check crtc function, v2. > drm/i915: Assign a new pll from the crtc check function, v2. > drm/i915: Split skl_update_scaler, v3. > drm/i915: Split plane updates of crtc->atomic into a helper, v2. > drm/i915: clean up plane commit functions > drm/i915: clean up atomic plane check functions, v2. > drm/i915: remove force argument from disable_plane > drm/i915: move detaching scalers to begin_crtc_commit, v2. > drm/i915: Move crtc commit updates to separate functions. > drm/i915: Do not run most checks when there's no modeset. > drm/i915: Handle disabling planes better, v2. > drm/i915: atomic plane updates in a nutshell > drm/i915: Update less state during modeset. > drm/i915: Make setting color key atomic. > drm/i915: Remove transitional references from > intel_plane_atomic_check. > drm/i915: Make cdclk part of the atomic state. > > drivers/gpu/drm/i915/i915_drv.h |3 +- > drivers/gpu/drm/i915/intel_atomic.c | 47 +- > drivers/gpu/drm/i915/intel_atomic_plane.c | 41 +- > drivers/gpu/drm/i915/intel_display.c | 1480 > +++-- > drivers/gpu/drm/i915/intel_dp.c |2 +- > drivers/gpu/drm/i915/intel_drv.h | 27 +- > drivers/gpu/drm/i915/intel_sprite.c | 170 ++-- > 7 files changed, 874 insertions(+), 896 deletions(-) > > -- > 2.1.0 > > ___ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Matt Roper Graphics Software Engineer IoTG Platform Enabling & Development Intel Corporation (916) 356-2795 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 1/2] i965/gen9: Pass alignment as function parameter in drm_intel_gem_bo_alloc_internal()
On Wed, Jun 10, 2015 at 1:47 AM, Damien Lespiau wrote: > On Tue, Jun 09, 2015 at 02:59:33PM -0700, Anuj Phogat wrote: >> This patch is on the list for 8 weeks now. Please take a look so I can push >> it upstream. > > Could I suggest you nominate a mesa team member working on SKL as well? > that would be the ideal match I believe. I will request someone in Mesa time to take a look. But, I still expect "looks fine/incorrect", "acked/nacked", "I don't know this code" comments by people who reviewed the v1. +Ben: Since he's reviewing my mesa patch: "i965/gen9: Allocate YF/YS tiled buffer objects" > > -- > Damien ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 2/5] drm/i915: PSR: Remove Low Power HW tracking mask.
On Fri, Jun 19, 2015 at 1:32 PM, Daniel Vetter wrote: > On Thu, Jun 18, 2015 at 8:43 PM, Rodrigo Vivi wrote: >> By Spec we should just mask memup and hotplug detection >> for hardware tracking cases. However we always masked >> LPSP that is for low power tracking support because >> without it PSR was constantly exiting and never really >> getting activated. >> >> Now with runtime PM being enabled by default Matthew >> reported that he was facing missed screen updates. So >> let's remove this undesirable mask and let HW tracking >> take care of cases like this were power saving features >> are also running. >> >> WARNING: With this patch PSR depends on Audio and GPU >> runtime PM to be properly enabled, working on "auto". >> If either audio runtime PM or gpu runtime pm are not >> properly set PSR will constant Exit and Performance >> Counter will be 0. >> >> But the best thing of this patch is that with one more >> HW tracking working the risks of missed blank screen >> are minimized at most. >> >> This affects just core platforms where PSR exit are also >> helped by HW tracking: Haswell, Broadwell and Skylake >> for now. >> >> Cc: Daniel Vetter >> Cc: Matthew Garrett >> Signed-off-by: Rodrigo Vivi > > I guess I don't really understand your description, but it does sound > strange ... runtime pm enabling from my patch is only about D3, power > well changes are still done. And as long as we have anything enabled > (even with PSR) we'll prevent D3. > > So the only thing I can think of is that somehow D3 wreaks something > in the PSR setup and that's causing issues. Unfortunately I have no > idea about our hw details around PSR and D3, so no idea. Maybe Art has > some? I don't know this relation as well. When I found this LPSP maks that made PSR working it was totally by forcing all masks and start removing one by one up to the point that this Low Power something did the trick. At that time Artur had told about power well handling enabled, but now after Mathew reported that issue I noticed this Low power flag was also related to runtime PM... > >> --- >> drivers/gpu/drm/i915/intel_psr.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/drm/i915/intel_psr.c >> b/drivers/gpu/drm/i915/intel_psr.c >> index 5ee0fa5..6549d58 100644 >> --- a/drivers/gpu/drm/i915/intel_psr.c >> +++ b/drivers/gpu/drm/i915/intel_psr.c >> @@ -400,7 +400,7 @@ void intel_psr_enable(struct intel_dp *intel_dp) >> >> /* Avoid continuous PSR exit by masking memup and hpd */ > > Need to adjust the comment. not actually... memup and hpd are still there... comment matched spec but never matched the code... now it matches. > >> I915_WRITE(EDP_PSR_DEBUG_CTL(dev), EDP_PSR_DEBUG_MASK_MEMUP | >> - EDP_PSR_DEBUG_MASK_HPD | EDP_PSR_DEBUG_MASK_LPSP); >> + EDP_PSR_DEBUG_MASK_HPD); >> >> /* Enable PSR on the panel */ >> hsw_psr_enable_sink(intel_dp); >> -- >> 2.1.0 >> > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > +41 (0) 79 365 57 48 - http://blog.ffwll.ch > ___ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Rodrigo Vivi Blog: http://blog.vivi.eng.br ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 2/5] drm/i915: PSR: Remove Low Power HW tracking mask.
On Thu, Jun 18, 2015 at 8:43 PM, Rodrigo Vivi wrote: > By Spec we should just mask memup and hotplug detection > for hardware tracking cases. However we always masked > LPSP that is for low power tracking support because > without it PSR was constantly exiting and never really > getting activated. > > Now with runtime PM being enabled by default Matthew > reported that he was facing missed screen updates. So > let's remove this undesirable mask and let HW tracking > take care of cases like this were power saving features > are also running. > > WARNING: With this patch PSR depends on Audio and GPU > runtime PM to be properly enabled, working on "auto". > If either audio runtime PM or gpu runtime pm are not > properly set PSR will constant Exit and Performance > Counter will be 0. > > But the best thing of this patch is that with one more > HW tracking working the risks of missed blank screen > are minimized at most. > > This affects just core platforms where PSR exit are also > helped by HW tracking: Haswell, Broadwell and Skylake > for now. > > Cc: Daniel Vetter > Cc: Matthew Garrett > Signed-off-by: Rodrigo Vivi I guess I don't really understand your description, but it does sound strange ... runtime pm enabling from my patch is only about D3, power well changes are still done. And as long as we have anything enabled (even with PSR) we'll prevent D3. So the only thing I can think of is that somehow D3 wreaks something in the PSR setup and that's causing issues. Unfortunately I have no idea about our hw details around PSR and D3, so no idea. Maybe Art has some? > --- > drivers/gpu/drm/i915/intel_psr.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/intel_psr.c > b/drivers/gpu/drm/i915/intel_psr.c > index 5ee0fa5..6549d58 100644 > --- a/drivers/gpu/drm/i915/intel_psr.c > +++ b/drivers/gpu/drm/i915/intel_psr.c > @@ -400,7 +400,7 @@ void intel_psr_enable(struct intel_dp *intel_dp) > > /* Avoid continuous PSR exit by masking memup and hpd */ Need to adjust the comment. > I915_WRITE(EDP_PSR_DEBUG_CTL(dev), EDP_PSR_DEBUG_MASK_MEMUP | > - EDP_PSR_DEBUG_MASK_HPD | EDP_PSR_DEBUG_MASK_LPSP); > + EDP_PSR_DEBUG_MASK_HPD); > > /* Enable PSR on the panel */ > hsw_psr_enable_sink(intel_dp); > -- > 2.1.0 > -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Remove KMS Kconfig option
Since we only support modesetting by default (disabling modesetting on the command line prevents i915.ko from loading), having a parameter to disable modesstting by default is superfluous, i.e. saying CONFIG_DRM_I915_KMS=n is equivalent to CONFIG_DRM_I915=n. Signed-off-by: Chris Wilson Cc: Daniel Veter --- arch/x86/configs/x86_64_defconfig | 1 - drivers/gpu/drm/i915/Kconfig | 9 - drivers/gpu/drm/i915/i915_drv.c | 20 +++- 3 files changed, 7 insertions(+), 23 deletions(-) diff --git a/arch/x86/configs/x86_64_defconfig b/arch/x86/configs/x86_64_defconfig index 315b86106572..05630dfcb9f4 100644 --- a/arch/x86/configs/x86_64_defconfig +++ b/arch/x86/configs/x86_64_defconfig @@ -207,7 +207,6 @@ CONFIG_AGP_AMD64=y CONFIG_AGP_INTEL=y CONFIG_DRM=y CONFIG_DRM_I915=y -CONFIG_DRM_I915_KMS=y CONFIG_FB_MODE_HELPERS=y CONFIG_FB_TILEBLITTING=y CONFIG_FB_EFI=y diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig index 5f4b8c5d2bac..3cef5b18b9cb 100644 --- a/drivers/gpu/drm/i915/Kconfig +++ b/drivers/gpu/drm/i915/Kconfig @@ -37,15 +37,6 @@ config DRM_I915 i810 driver instead, and the Atom z5xx series has an entirely different implementation. -config DRM_I915_KMS - bool "Enable modesetting on intel by default" - depends on DRM_I915 - default y - help - Choose this option if you want kernel modesetting enabled by default. - - If in doubt, say "Y". - config DRM_I915_FBDEV bool "Enable legacy fbdev support for the modesetting intel driver" depends on DRM_I915 diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 6b87f949fc55..0ecd044d5d80 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1729,20 +1729,14 @@ static int __init i915_init(void) driver.num_ioctls = i915_max_ioctl; /* -* If CONFIG_DRM_I915_KMS is set, default to KMS unless -* explicitly disabled with the module pararmeter. -* -* Otherwise, just follow the parameter (defaulting to off). -* -* Allow optional vga_text_mode_force boot option to override -* the default behavior. +* Enable KMS by default, unless explicitly overriden by +* either the i915.modeset prarameter or by the +* vga_text_mode_force boot option. */ -#if defined(CONFIG_DRM_I915_KMS) - if (i915.modeset != 0) - driver.driver_features |= DRIVER_MODESET; -#endif - if (i915.modeset == 1) - driver.driver_features |= DRIVER_MODESET; + driver.driver_features |= DRIVER_MODESET; + + if (i915.modeset == 0) + driver.driver_features &= ~DRIVER_MODESET; #ifdef CONFIG_VGA_CONSOLE if (vgacon_text_force() && i915.modeset == -1) -- 2.1.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Officially give up on seqno coherency
On Fri, Jun 19, 2015 at 09:16:09PM +0200, Daniel Vetter wrote: > We've never figured out the magic trick to make irq vs. seqno > updates coherent, only tricks to make it work. And since > > commit 094f9a54e35500739da185cdb78f2e92fc379458 > Author: Chris Wilson > Date: Wed Sep 25 17:34:55 2013 +0100 > > drm/i915: Fix __wait_seqno to use true infinite timeouts > > we automatically fall back to an irq augmented with polling scheme > after the first missed interrupt. There's really nothing else we can > do, hence tune down the message to informational level. It's still > useful for users in case it reliable preceedes a hard system hang. > > v2: Use NOTICE since it might be of value for bug reports (Chris). > > Cc: Mark Janes > Cc: Chris Wilson > Cc: sta...@vger.kernel.org > Signed-off-by: Daniel Vetter Now all we need to is to save the GPU state to the pstore in the picoseconds before a hard hang, and we'll be sorted. Reviewed-by: Chris Wilson -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] Updated drm-intel-testing
Hi all, New -testing cycle with cool stuff: - refactoring hpd irq handlers (Jani) - polish skl dpll code a bit (Damien) - dynamic cdclk adjustement (Ville & Mika) - fix up 12bpc hdmi and enable it for real again (Ville) - extend hsw cmd parser to be useful for atomic configuration (Franscico Jerez) - even more atomic conversion and rolling state handling out across modeset code from Maarten & Ander - fix DRRS idleness detection (Ramalingam) - clean up dsp address alignment handling (Ville) - some fbc cleanup patches from Paulo - prevent hard-hangs when trying to reset the gpu on skl (Mika) Happy testing! Cheers, Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Officially give up on seqno coherency
We've never figured out the magic trick to make irq vs. seqno updates coherent, only tricks to make it work. And since commit 094f9a54e35500739da185cdb78f2e92fc379458 Author: Chris Wilson Date: Wed Sep 25 17:34:55 2013 +0100 drm/i915: Fix __wait_seqno to use true infinite timeouts we automatically fall back to an irq augmented with polling scheme after the first missed interrupt. There's really nothing else we can do, hence tune down the message to informational level. It's still useful for users in case it reliable preceedes a hard system hang. v2: Use NOTICE since it might be of value for bug reports (Chris). Cc: Mark Janes Cc: Chris Wilson Cc: sta...@vger.kernel.org Signed-off-by: Daniel Vetter --- drivers/gpu/drm/i915/i915_irq.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index e6bb72dca3ff..5072fb49367e 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -2946,8 +2946,8 @@ static void i915_hangcheck_elapsed(struct work_struct *work) /* Issue a wake-up to catch stuck h/w. */ if (!test_and_set_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings)) { if (!(dev_priv->gpu_error.test_irq_rings & intel_ring_flag(ring))) - DRM_ERROR("Hangcheck timer elapsed... %s idle\n", - ring->name); + DRM_NOTICE("Hangcheck timer elapsed... %s idle\n", + ring->name); else DRM_INFO("Fake missed irq on %s\n", ring->name); -- 2.1.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 13/15] drm/i915: Integrate GuC-based command submission
On 16/06/15 10:22, Chris Wilson wrote: > On Mon, Jun 15, 2015 at 07:36:31PM +0100, Dave Gordon wrote: >> From: Alex Dai >> >> GuC-based submission is mostly the same as execlist mode, up to >> intel_logical_ring_advance_and_submit(), where the context being >> dispatched would be added to the execlist queue; at this point >> we submit the context to the GuC backend instead. >> >> There are, however, a few other changes also required, notably: >> 1. Contexts must be pinned at GGTT addresses accessible by the GuC >> i.e. NOT in the range [0..WOPCM_SIZE), so we have to add the >> PIN_OFFSET_BIAS flag to the relevant GGTT-pinning calls. >> >> 2. The GuC's TLB must be invalidated after a context is pinned at >> a new GGTT address. >> >> 3. GuC firmware uses the one page before Ring Context as shared data. >> Therefore, whenever driver wants to get base address of LRC, we >> will offset one page for it. LRC_PPHWSP_PN is defined as the page >> number of LRCA. >> >> 4. In the work queue used to pass requests to the GuC, the GuC >> firmware requires the ring-tail-offset to be represented as an >> 11-bit value, expressed in QWords. Therefore, the ringbuffer >> size must be reduced to the representable range (4 pages). > > I don't like how this sabotages the existing execlists implementation > in order for i915_guc_submission (an interesting choice of file name, > since we go i915_gem_execbuffer (API) -> intel_execlists (HW) -> > i915_guc_submission (HW), not fitting into our, admittedly loose, naming > convention very well) to share a few functions. Even a couple of which > are already vfunc. > -Chris Not really "sabotages"; big ringbuffers are a waste of space anyway. Four pages is enough to have at least 64 batchbuffers queued up for the engine to process (1 second of video, or 0.00012 of a bitcoin). When the scheduler lands, it will generally reduce the number of batches in the h/w rings anyway, mostly to improve responsiveness and fair-sharing among different applications. I quite agree that the execlists implementation, which is mostly in a file called intel_lrc.c, should really be split into LRC-related code (common to execlists and GuC modes) with the rest moved into intel_execlist_submission.c. Or is that i915_execlist_submission.c? If we took contexts as primary, rather than "rings" (engines) we could also share a lot more between GuC/execlists and legacy ringbuffer mode. At least we should get some improvement with AntiOLR, so we'll have request->context<->ringbuffer->engine->submission mechanism :) .Dave. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v6 4/6] drm/i915/gen8: Add WaFlushCoherentL3CacheLinesAtContextSwitch workaround
On Fri, Jun 19, 2015 at 06:37:13PM +0100, Arun Siluvery wrote: > In Indirect context w/a batch buffer, > +WaFlushCoherentL3CacheLinesAtContextSwitch:bdw > > v2: Add LRI commands to set/reset bit that invalidates coherent lines, > update WA to include programming restrictions and exclude CHV as > it is not required (Ville) > > v3: Avoid unnecessary read when it can be done by reading register once > (Chris). > > Cc: Chris Wilson > Cc: Dave Gordon > Signed-off-by: Rafael Barbalho > Signed-off-by: Arun Siluvery Acked-by: Chris Wilson -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v6 3/6] drm/i915/gen8: Add WaDisableCtxRestoreArbitration workaround
On Fri, Jun 19, 2015 at 06:37:12PM +0100, Arun Siluvery wrote: > In Indirect and Per context w/a batch buffer, > +WaDisableCtxRestoreArbitration > > Cc: Chris Wilson > Cc: Dave Gordon > Signed-off-by: Rafael Barbalho > Signed-off-by: Arun Siluvery Acked-by: Chris Wilson # too lazy to verify hsd -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v6 5/6] drm/i915/gen8: Add WaClearSlmSpaceAtContextSwitch workaround
On Fri, Jun 19, 2015 at 06:37:14PM +0100, Arun Siluvery wrote: > In Indirect context w/a batch buffer, > WaClearSlmSpaceAtContextSwitch > > This WA performs writes to scratch page so it must be valid, this check > is performed before initializing the batch with this WA. > > v2: s/PIPE_CONTROL_FLUSH_RO_CACHES/PIPE_CONTROL_FLUSH_L3 (Ville) > > Cc: Chris Wilson > Cc: Dave Gordon > Signed-off-by: Rafael Barbalho > Signed-off-by: Arun Siluvery > --- > drivers/gpu/drm/i915/i915_reg.h | 1 + > drivers/gpu/drm/i915/intel_lrc.c | 16 > 2 files changed, 17 insertions(+) > > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h > index d14ad20..7637e64 100644 > --- a/drivers/gpu/drm/i915/i915_reg.h > +++ b/drivers/gpu/drm/i915/i915_reg.h > @@ -410,6 +410,7 @@ > #define DISPLAY_PLANE_A (0<<20) > #define DISPLAY_PLANE_B (1<<20) > #define GFX_OP_PIPE_CONTROL(len) ((0x3<<29)|(0x3<<27)|(0x2<<24)|(len-2)) > +#define PIPE_CONTROL_FLUSH_L3 (1<<27) > #define PIPE_CONTROL_GLOBAL_GTT_IVB(1<<24) /* > gen7+ */ > #define PIPE_CONTROL_MMIO_WRITE(1<<23) > #define PIPE_CONTROL_STORE_DATA_INDEX (1<<21) > diff --git a/drivers/gpu/drm/i915/intel_lrc.c > b/drivers/gpu/drm/i915/intel_lrc.c > index 3e7aaa9..664455c 100644 > --- a/drivers/gpu/drm/i915/intel_lrc.c > +++ b/drivers/gpu/drm/i915/intel_lrc.c > @@ -1137,6 +1137,7 @@ static int gen8_init_indirectctx_bb(struct > intel_engine_cs *ring, > uint32_t *const batch, > uint32_t *offset) > { > + uint32_t scratch_addr; > uint32_t index = wa_ctx_start(wa_ctx, *offset, CACHELINE_DWORDS); > > /* WaDisableCtxRestoreArbitration:bdw,chv */ > @@ -1165,6 +1166,21 @@ static int gen8_init_indirectctx_bb(struct > intel_engine_cs *ring, > wa_ctx_emit(batch, l3sqc4_flush & > ~GEN8_LQSC_FLUSH_COHERENT_LINES); > } > > + /* WaClearSlmSpaceAtContextSwitch:bdw,chv */ > + /* Actual scratch location is at 128 bytes offset */ > + scratch_addr = ring->scratch.gtt_offset + 2*CACHELINE_BYTES; > + scratch_addr |= PIPE_CONTROL_GLOBAL_GTT; I thought this bit was now mbz - that's how we treat it elsewhere e.g. gen8_emit_flush_render, and that the address has to be naturally aligned for the target write. (Similar bit in patch 6 fwiw.) -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 1/6] drm/i915/gen8: Add infrastructure to initialize WA batch buffers
Some of the WA are to be applied during context save but before restore and some at the end of context save/restore but before executing the instructions in the ring, WA batch buffers are created for this purpose and these WA cannot be applied using normal means. Each context has two registers to load the offsets of these batch buffers. If they are non-zero, HW understands that it need to execute these batches. v1: In this version two separate ring_buffer objects were used to load WA instructions for indirect and per context batch buffers and they were part of every context. v2: Chris suggested to include additional page in context and use it to load these WA instead of creating separate objects. This will simplify lot of things as we need not explicity pin/unpin them. Thomas Daniel further pointed that GuC is planning to use a similar setup to share data between GuC and driver and WA batch buffers can probably share that page. However after discussions with Dave who is implementing GuC changes, he suggested to use an independent page for the reasons - GuC area might grow and these WA are initialized only once and are not changed afterwards so we can share them share across all contexts. The page is updated with WA during render ring init. This has an advantage of not adding more special cases to default_context. We don't know upfront the number of WA we will applying using these batch buffers. For this reason the size was fixed earlier but it is not a good idea. To fix this, the functions that load instructions are modified to report the no of commands inserted and the size is now calculated after the batch is updated. A macro is introduced to add commands to these batch buffers which also checks for overflow and returns error. We have a full page dedicated for these WA so that should be sufficient for good number of WA, anything more means we have major issues. The list for Gen8 is small, same for Gen9 also, maybe few more gets added going forward but not close to filling entire page. Chris suggested a two-pass approach but we agreed to go with single page setup as it is a one-off routine and simpler code wins. One additional option is offset field which is helpful if we would like to have multiple batches at different offsets within the page and select them based on some criteria. This is not a requirement at this point but could help in future (Dave). Chris provided some helpful macros and suggestions which further simplified the code, they will also help in reducing code duplication when WA for other Gen are added. Add detailed comments explaining restrictions. Use do {} while(0) for wa_ctx_emit() macro. (Many thanks to Chris, Dave and Thomas for their reviews and inputs) Cc: Chris Wilson Cc: Dave Gordon Signed-off-by: Rafael Barbalho Signed-off-by: Arun Siluvery Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/intel_lrc.c| 223 +++- drivers/gpu/drm/i915/intel_ringbuffer.h | 21 +++ 2 files changed, 240 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 0413b8f..0585298 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -211,6 +211,7 @@ enum { FAULT_AND_CONTINUE /* Unsupported */ }; #define GEN8_CTX_ID_SHIFT 32 +#define CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT 0x17 static int intel_lr_context_pin(struct intel_engine_cs *ring, struct intel_context *ctx); @@ -1077,6 +1078,191 @@ static int intel_logical_ring_workarounds_emit(struct intel_engine_cs *ring, return 0; } +#define wa_ctx_emit(batch, cmd) \ + do {\ + if (WARN_ON(index >= (PAGE_SIZE / sizeof(uint32_t { \ + return -ENOSPC; \ + } \ + batch[index++] = (cmd); \ + } while (0) + +static inline uint32_t wa_ctx_start(struct i915_wa_ctx_bb *wa_ctx, + uint32_t offset, + uint32_t start_alignment) +{ + return wa_ctx->offset = ALIGN(offset, start_alignment); +} + +static inline int wa_ctx_end(struct i915_wa_ctx_bb *wa_ctx, +uint32_t offset, +uint32_t size_alignment) +{ + wa_ctx->size = offset - wa_ctx->offset; + + WARN(wa_ctx->size % size_alignment, +"wa_ctx_bb failed sanity checks: size %d is not aligned to %d\n", +wa_ctx->size, size_alignment); + return 0; +} + +/** + * gen8_init_indirectctx_bb() - initialize indirect ctx batch with WA + * + * @ring: only applicable for RCS + * @wa_ctx: structure representing wa_ctx + * offset: specifies start of the batch, should be cache-aligned. This is up
Re: [Intel-gfx] [PATCH v6 2/6] drm/i915/gen8: Re-order init pipe_control in lrc mode
On Fri, Jun 19, 2015 at 06:37:11PM +0100, Arun Siluvery wrote: > Some of the WA applied using WA batch buffers perform writes to scratch page. > In the current flow WA are initialized before scratch obj is allocated. > This patch reorders intel_init_pipe_control() to have a valid scratch obj > before we initialize WA. > > v2: Check for valid scratch page before initializing WA as some of them > perform writes to it. > > Cc: Chris Wilson > Cc: Dave Gordon > Signed-off-by: Michel Thierry > Signed-off-by: Arun Siluvery Ok, I'm not completely happy with the sequence of this, but it works. logical_ring_init() is the beast that initialises the intel_engine_cs, and it looks dubious to be setting fields (like dev and scratch_obj) before we do the main initialisation. I don't have a pretty solution (thinking of passing around the scratch object, or reordering everything still further with resulting code duplication for the other rings). Reviewed-by: Chris Wilson -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 11/15] drm/i915: Implementation of GuC client
On 15/06/15 22:55, Chris Wilson wrote: > On Mon, Jun 15, 2015 at 07:36:29PM +0100, Dave Gordon wrote: >> +/* Get valid workqueue item and return it back to offset */ >> +static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset) >> +{ >> +struct guc_process_desc *desc; >> +void *base; >> +u32 size = sizeof(struct guc_wq_item); >> +int ret = 0, timeout_counter = 200; >> +unsigned long flags; >> + >> +base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0)); >> +desc = base + gc->proc_desc_offset; >> + >> +while (timeout_counter-- > 0) { >> +spin_lock_irqsave(&gc->wq_lock, flags); >> + >> +ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail, desc->head, >> +gc->wq_size) >= size, 1); > > What is the point of this loop? Drop the spinlock 200 times? You already > have a timeout, the loop extends that by a factor or 200. You merely > allow gazzumping, however I haven't looked at the locking to see what > you intend to lock (since it is not described at all). > -Chris Hmmm .. I didn't write this code, so I'm guessing somewhat; but ... A 'wq_lock' must lock a 'wq', which from the name of the function is a workqueue, which is a circular buffer shared between the host and the GuC, where (like the main ringbuffers) the host (producer) advances the tail (gc->wq_tail) and the other partner (consumer, in this case the GuC) advances the head (desc->head). Presumably the GuC could take many (up to 200) ms to get round to making space available, in a worst-case scenario where it's busy servicing interrupts and doing other things. Now we certainly don't want to spin for up to 200ms with interrupts disabled, so spin_lock_irqsave(&gc->wq_lock, flags); ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size, *200*); spin_unlock_irqrestore(&gc->wq_lock, flags); would be a bad idea. OTOH I don't think there's any other lock held by anyone higher up in the callchain, so we /probably do/ need the spinlock to protect the updating of wq_tail when the wait_for_atomic succeeds. So yes, I think up-to-200 iterations of polling for freespace for up to 1ms each time is not too unreasonable, given that apparently we have to poll, at least for now (once the scheduler lands, we will always be able to predict how much space is available and avoid trying to launch batches when there isn't enough). .Dave. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v6 1/6] drm/i915/gen8: Add infrastructure to initialize WA batch buffers
On Fri, Jun 19, 2015 at 06:37:10PM +0100, Arun Siluvery wrote: > Some of the WA are to be applied during context save but before restore and > some at the end of context save/restore but before executing the instructions > in the ring, WA batch buffers are created for this purpose and these WA cannot > be applied using normal means. Each context has two registers to load the > offsets of these batch buffers. If they are non-zero, HW understands that it > need to execute these batches. > > v1: In this version two separate ring_buffer objects were used to load WA > instructions for indirect and per context batch buffers and they were part > of every context. > > v2: Chris suggested to include additional page in context and use it to load > these WA instead of creating separate objects. This will simplify lot of > things > as we need not explicity pin/unpin them. Thomas Daniel further pointed that > GuC > is planning to use a similar setup to share data between GuC and driver and > WA batch buffers can probably share that page. However after discussions with > Dave who is implementing GuC changes, he suggested to use an independent page > for the reasons - GuC area might grow and these WA are initialized only once > and > are not changed afterwards so we can share them share across all contexts. > > The page is updated with WA during render ring init. This has an advantage of > not adding more special cases to default_context. > > We don't know upfront the number of WA we will applying using these batch > buffers. > For this reason the size was fixed earlier but it is not a good idea. To fix > this, > the functions that load instructions are modified to report the no of commands > inserted and the size is now calculated after the batch is updated. A macro is > introduced to add commands to these batch buffers which also checks for > overflow > and returns error. > We have a full page dedicated for these WA so that should be sufficient for > good number of WA, anything more means we have major issues. > The list for Gen8 is small, same for Gen9 also, maybe few more gets added > going forward but not close to filling entire page. Chris suggested a two-pass > approach but we agreed to go with single page setup as it is a one-off routine > and simpler code wins. > > One additional option is offset field which is helpful if we would like to > have multiple batches at different offsets within the page and select them > based on some criteria. This is not a requirement at this point but could > help in future (Dave). > > Chris provided some helpful macros and suggestions which further simplified > the code, they will also help in reducing code duplication when WA for > other Gen are added. Add detailed comments explaining restrictions. > > (Many thanks to Chris, Dave and Thomas for their reviews and inputs) > > Cc: Chris Wilson > Cc: Dave Gordon > Signed-off-by: Rafael Barbalho > Signed-off-by: Arun Siluvery Sigh, after all that, I found one minor thing, but nevertheless Reviewed-by: Chris Wilson > +#define wa_ctx_emit(batch, cmd) {\ > + if (WARN_ON(index >= (PAGE_SIZE / sizeof(uint32_t { \ > + return -ENOSPC; \ > + } \ > + batch[index++] = (cmd); \ > + } We should have wrapped this in do { } while(0) - think of all those trialing semicolons we have in the code! Fortunately we haven't used this in a if (foo) wa_ctx_emit(bar); else wa_ctx_emit(baz); yet. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 2/6] drm/i915/gen8: Re-order init pipe_control in lrc mode
Some of the WA applied using WA batch buffers perform writes to scratch page. In the current flow WA are initialized before scratch obj is allocated. This patch reorders intel_init_pipe_control() to have a valid scratch obj before we initialize WA. v2: Check for valid scratch page before initializing WA as some of them perform writes to it. Cc: Chris Wilson Cc: Dave Gordon Signed-off-by: Michel Thierry Signed-off-by: Arun Siluvery --- drivers/gpu/drm/i915/intel_lrc.c | 14 +++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index c9255fc..0d350f6 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1223,6 +1223,12 @@ static int intel_init_workaround_bb(struct intel_engine_cs *ring) WARN_ON(ring->id != RCS); + /* some WA perform writes to scratch page, ensure it is valid */ + if (ring->scratch.obj == NULL) { + DRM_ERROR("scratch page not allocated for %s\n", ring->name); + return -EINVAL; + } + ret = lrc_setup_wa_ctx_obj(ring, PAGE_SIZE); if (ret) { DRM_DEBUG_DRIVER("Failed to setup context WA page: %d\n", ret); @@ -1657,7 +1663,8 @@ static int logical_render_ring_init(struct drm_device *dev) ring->emit_bb_start = gen8_emit_bb_start; ring->dev = dev; - ret = logical_ring_init(dev, ring); + + ret = intel_init_pipe_control(ring); if (ret) return ret; @@ -1672,9 +1679,10 @@ static int logical_render_ring_init(struct drm_device *dev) ret); } - ret = intel_init_pipe_control(ring); - if (ret) + ret = logical_ring_init(dev, ring); + if (ret) { lrc_destroy_wa_ctx_obj(ring); + } return ret; } -- 2.3.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 6/6] drm/i915/gen8: Add WaRsRestoreWithPerCtxtBb workaround
In Per context w/a batch buffer, WaRsRestoreWithPerCtxtBb This WA performs writes to scratch page so it must be valid, this check is performed before initializing the batch with this WA. v2: This patches modifies definitions of MI_LOAD_REGISTER_MEM and MI_LOAD_REGISTER_REG; Add GEN8 specific defines for these instructions so as to not break any future users of existing definitions (Michel) v3: Length defined in current definitions of LRM, LRR instructions was specified as 0. It seems it is common convention for instructions whose length vary between platforms. This is not an issue so far because they are not used anywhere except command parser; now that we use in this patch update them with correct length and also move them out of command parser placeholder to appropriate place. remove unnecessary padding and follow the WA programming sequence exactly as mentioned in spec which is essential for this WA (Dave). Cc: Chris Wilson Cc: Dave Gordon Signed-off-by: Rafael Barbalho Signed-off-by: Arun Siluvery --- drivers/gpu/drm/i915/i915_reg.h | 29 +++-- drivers/gpu/drm/i915/intel_lrc.c | 54 2 files changed, 81 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 7637e64..208620d 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -347,6 +347,31 @@ #define MI_INVALIDATE_BSD(1<<7) #define MI_FLUSH_DW_USE_GTT (1<<2) #define MI_FLUSH_DW_USE_PPGTT(0<<2) +#define MI_LOAD_REGISTER_MEMMI_INSTR(0x29, 1) +#define MI_LOAD_REGISTER_MEM_GEN8 MI_INSTR(0x29, 2) +#define MI_LRM_USE_GLOBAL_GTT (1<<22) +#define MI_LRM_ASYNC_MODE_ENABLE (1<<21) +#define MI_LOAD_REGISTER_REGMI_INSTR(0x2A, 1) +#define MI_ATOMIC(len) MI_INSTR(0x2F, (len-2)) +#define MI_ATOMIC_MEMORY_TYPE_GGTT (1<<22) +#define MI_ATOMIC_INLINE_DATA(1<<18) +#define MI_ATOMIC_CS_STALL (1<<17) +#define MI_ATOMIC_RETURN_DATA_CTL(1<<16) +#define MI_ATOMIC_OP_MASK(op) ((op) << 8) +#define MI_ATOMIC_AND MI_ATOMIC_OP_MASK(0x01) +#define MI_ATOMIC_OR MI_ATOMIC_OP_MASK(0x02) +#define MI_ATOMIC_XOR MI_ATOMIC_OP_MASK(0x03) +#define MI_ATOMIC_MOVE MI_ATOMIC_OP_MASK(0x04) +#define MI_ATOMIC_INC MI_ATOMIC_OP_MASK(0x05) +#define MI_ATOMIC_DEC MI_ATOMIC_OP_MASK(0x06) +#define MI_ATOMIC_ADD MI_ATOMIC_OP_MASK(0x07) +#define MI_ATOMIC_SUB MI_ATOMIC_OP_MASK(0x08) +#define MI_ATOMIC_RSUB MI_ATOMIC_OP_MASK(0x09) +#define MI_ATOMIC_IMAX MI_ATOMIC_OP_MASK(0x0A) +#define MI_ATOMIC_IMIN MI_ATOMIC_OP_MASK(0x0B) +#define MI_ATOMIC_UMAX MI_ATOMIC_OP_MASK(0x0C) +#define MI_ATOMIC_UMIN MI_ATOMIC_OP_MASK(0x0D) + #define MI_BATCH_BUFFERMI_INSTR(0x30, 1) #define MI_BATCH_NON_SECURE (1) /* for snb/ivb/vlv this also means "batch in ppgtt" when ppgtt is enabled. */ @@ -451,8 +476,6 @@ #define MI_CLFLUSH MI_INSTR(0x27, 0) #define MI_REPORT_PERF_COUNTMI_INSTR(0x28, 0) #define MI_REPORT_PERF_COUNT_GGTT (1<<0) -#define MI_LOAD_REGISTER_MEMMI_INSTR(0x29, 0) -#define MI_LOAD_REGISTER_REGMI_INSTR(0x2A, 0) #define MI_RS_STORE_DATA_IMMMI_INSTR(0x2B, 0) #define MI_LOAD_URB_MEM MI_INSTR(0x2C, 0) #define MI_STORE_URB_MEMMI_INSTR(0x2D, 0) @@ -1799,6 +1822,8 @@ enum skl_disp_power_wells { #define GEN8_RC_SEMA_IDLE_MSG_DISABLE(1 << 12) #define GEN8_FF_DOP_CLOCK_GATE_DISABLE (1<<10) +#define GEN8_RS_PREEMPT_STATUS 0x215C + /* Fuse readout registers for GT */ #define CHV_FUSE_GT(VLV_DISPLAY_BASE + 0x2168) #define CHV_FGT_DISABLE_SS0 (1 << 10) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 664455c..28198c4 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1215,11 +1215,65 @@ static int gen8_init_perctx_bb(struct intel_engine_cs *ring, uint32_t *const batch, uint32_t *offset) { + uint32_t scratch_addr; uint32_t index = wa_ctx_start(wa_ctx, *offset, CACHELINE_DWORDS); + /* Actual scratch location is at 128 bytes offset */ + scratch_addr = ring->scratch.gtt_offset + 2*CACHELINE_BYTES; + scratch_addr |= PIPE_CONTROL_GLOBAL_GTT; + /* WaDisableCtxRestoreArbitration:bdw,chv */ wa_ctx_emit(batch, MI_ARB_ON_OFF | MI_ARB_ENABLE); + /* +* As per Bspec, to workaround a known HW issue, SW must perform the +* below programming sequence prior to programming MI_BATCH_BUFFER_END. +* +* This is only applicable for Gen8. +*/ + + /* WaRsRestoreWithPerCtxtBb:bdw,chv */ + wa_ctx_emit(batch, MI_LOAD_REGISTER_IMM(1)); + wa_ctx_emit(batch, INSTPM); + wa_ctx_emit(batch, _MASKED_BIT_DISABLE(INSTPM_FORCE_ORDERING)); + + wa_ctx_emit(batch, (MI_ATOMIC(5) | +
[Intel-gfx] [PATCH v6 0/6] Add Per-context WA using WA batch buffers
From Gen8+ we have some workarounds that are applied Per context and they are applied using special batch buffers called as WA batch buffers. HW executes them at specific stages during context save/restore. The patches in this series adds this framework to i915. I did some basic testing on BDW by running glmark2 and didn't see any issues. These WA are mainly required when preemption is enabled. [v1] http://lists.freedesktop.org/archives/intel-gfx/2015-February/060707.html [v2] http://www.spinics.net/lists/intel-gfx/msg67804.html [v3] In v2, two separate ring_buffer objects were used to load WA instructions and they were part of every context which is not really required. Chris suggested a better approach of adding a page to context itself and using it for this purpose. Since GuC is also planning to do the same it can probably be shared with GuC. But after discussions it is agreed to use an independent page as GuC area might grow in future. Independent page also makes sense because these WA are only initialized once and not changed afterwards so we can share them across all contexts. [v4] Changes in this revision, In the previous version the size of batch buffers are fixed during initialization which is not a good idea. This is corrected by updating the functions that load WA to return the number of dwords written and caller updates the size once all WA are initialized. The functions now also accept offset field which allows us to have multiple batches so that required batch can be selected based on a criteria. This is not a requirement at this point but could be useful in future. WaFlushCoherentL3CacheLinesAtContextSwitch implementation was incomplete which is fixed and programming restrictions correctly applied. http://www.spinics.net/lists/intel-gfx/msg68947.html [v5] No major changes in this revision but switched to new revision as changes affected all patches. Introduced macro to add commands which also checks for page overflow. Moved code around to simplify, indentation fixes and other improvements suggested by Chris. Since we don't know the number of WA applied upfront, Chris suggested a two-pass approach but that brings additional complexity which is not necessary. Discussed with Chris and agreed upon on single page setup as simpler code wins and also single page is sufficient for our requirement. http://www.spinics.net/lists/intel-gfx/msg69176.html [v6] Changes made to make the code easy to modify for future Gen. It is much neater and contained now, big thanks to Chris Wilson for the review. Since these changes affect all patches I am sending as a separate revision. Please see the patches for more details. Arun Siluvery (6): drm/i915/gen8: Add infrastructure to initialize WA batch buffers drm/i915/gen8: Re-order init pipe_control in lrc mode drm/i915/gen8: Add WaDisableCtxRestoreArbitration workaround drm/i915/gen8: Add WaFlushCoherentL3CacheLinesAtContextSwitch workaround drm/i915/gen8: Add WaClearSlmSpaceAtContextSwitch workaround drm/i915/gen8: Add WaRsRestoreWithPerCtxtBb workaround drivers/gpu/drm/i915/i915_reg.h | 32 +++- drivers/gpu/drm/i915/intel_lrc.c| 328 +++- drivers/gpu/drm/i915/intel_ringbuffer.h | 21 ++ 3 files changed, 374 insertions(+), 7 deletions(-) -- 2.3.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 1/6] drm/i915/gen8: Add infrastructure to initialize WA batch buffers
Some of the WA are to be applied during context save but before restore and some at the end of context save/restore but before executing the instructions in the ring, WA batch buffers are created for this purpose and these WA cannot be applied using normal means. Each context has two registers to load the offsets of these batch buffers. If they are non-zero, HW understands that it need to execute these batches. v1: In this version two separate ring_buffer objects were used to load WA instructions for indirect and per context batch buffers and they were part of every context. v2: Chris suggested to include additional page in context and use it to load these WA instead of creating separate objects. This will simplify lot of things as we need not explicity pin/unpin them. Thomas Daniel further pointed that GuC is planning to use a similar setup to share data between GuC and driver and WA batch buffers can probably share that page. However after discussions with Dave who is implementing GuC changes, he suggested to use an independent page for the reasons - GuC area might grow and these WA are initialized only once and are not changed afterwards so we can share them share across all contexts. The page is updated with WA during render ring init. This has an advantage of not adding more special cases to default_context. We don't know upfront the number of WA we will applying using these batch buffers. For this reason the size was fixed earlier but it is not a good idea. To fix this, the functions that load instructions are modified to report the no of commands inserted and the size is now calculated after the batch is updated. A macro is introduced to add commands to these batch buffers which also checks for overflow and returns error. We have a full page dedicated for these WA so that should be sufficient for good number of WA, anything more means we have major issues. The list for Gen8 is small, same for Gen9 also, maybe few more gets added going forward but not close to filling entire page. Chris suggested a two-pass approach but we agreed to go with single page setup as it is a one-off routine and simpler code wins. One additional option is offset field which is helpful if we would like to have multiple batches at different offsets within the page and select them based on some criteria. This is not a requirement at this point but could help in future (Dave). Chris provided some helpful macros and suggestions which further simplified the code, they will also help in reducing code duplication when WA for other Gen are added. Add detailed comments explaining restrictions. (Many thanks to Chris, Dave and Thomas for their reviews and inputs) Cc: Chris Wilson Cc: Dave Gordon Signed-off-by: Rafael Barbalho Signed-off-by: Arun Siluvery --- drivers/gpu/drm/i915/intel_lrc.c| 222 +++- drivers/gpu/drm/i915/intel_ringbuffer.h | 21 +++ 2 files changed, 239 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 0413b8f..c9255fc 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -211,6 +211,7 @@ enum { FAULT_AND_CONTINUE /* Unsupported */ }; #define GEN8_CTX_ID_SHIFT 32 +#define CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT 0x17 static int intel_lr_context_pin(struct intel_engine_cs *ring, struct intel_context *ctx); @@ -1077,6 +1078,190 @@ static int intel_logical_ring_workarounds_emit(struct intel_engine_cs *ring, return 0; } +#define wa_ctx_emit(batch, cmd) { \ + if (WARN_ON(index >= (PAGE_SIZE / sizeof(uint32_t { \ + return -ENOSPC; \ + } \ + batch[index++] = (cmd); \ + } + +static inline uint32_t wa_ctx_start(struct i915_wa_ctx_bb *wa_ctx, + uint32_t offset, + uint32_t start_alignment) +{ + return wa_ctx->offset = ALIGN(offset, start_alignment); +} + +static inline int wa_ctx_end(struct i915_wa_ctx_bb *wa_ctx, +uint32_t offset, +uint32_t size_alignment) +{ + wa_ctx->size = offset - wa_ctx->offset; + + WARN(wa_ctx->size % size_alignment, +"wa_ctx_bb failed sanity checks: size %d is not aligned to %d\n", +wa_ctx->size, size_alignment); + return 0; +} + +/** + * gen8_init_indirectctx_bb() - initialize indirect ctx batch with WA + * + * @ring: only applicable for RCS + * @wa_ctx: structure representing wa_ctx + * offset: specifies start of the batch, should be cache-aligned. This is updated + *with the offset value received as input. + * size: size of the batch in DWORDS but HW expects in terms of cachelines + * @batch: page in which WA are loaded + * @offset: This field s
[Intel-gfx] [PATCH v6 3/6] drm/i915/gen8: Add WaDisableCtxRestoreArbitration workaround
In Indirect and Per context w/a batch buffer, +WaDisableCtxRestoreArbitration Cc: Chris Wilson Cc: Dave Gordon Signed-off-by: Rafael Barbalho Signed-off-by: Arun Siluvery --- drivers/gpu/drm/i915/intel_lrc.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 0d350f6..62c7eeb 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1139,8 +1139,8 @@ static int gen8_init_indirectctx_bb(struct intel_engine_cs *ring, { uint32_t index = wa_ctx_start(wa_ctx, *offset, CACHELINE_DWORDS); - /* FIXME: Replace me with WA */ - wa_ctx_emit(batch, MI_NOOP); + /* WaDisableCtxRestoreArbitration:bdw,chv */ + wa_ctx_emit(batch, MI_ARB_ON_OFF | MI_ARB_DISABLE); /* Pad to end of cacheline */ while (index % CACHELINE_DWORDS) @@ -1178,6 +1178,9 @@ static int gen8_init_perctx_bb(struct intel_engine_cs *ring, { uint32_t index = wa_ctx_start(wa_ctx, *offset, CACHELINE_DWORDS); + /* WaDisableCtxRestoreArbitration:bdw,chv */ + wa_ctx_emit(batch, MI_ARB_ON_OFF | MI_ARB_ENABLE); + wa_ctx_emit(batch, MI_BATCH_BUFFER_END); return wa_ctx_end(wa_ctx, *offset = index, 1); -- 2.3.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 5/6] drm/i915/gen8: Add WaClearSlmSpaceAtContextSwitch workaround
In Indirect context w/a batch buffer, WaClearSlmSpaceAtContextSwitch This WA performs writes to scratch page so it must be valid, this check is performed before initializing the batch with this WA. v2: s/PIPE_CONTROL_FLUSH_RO_CACHES/PIPE_CONTROL_FLUSH_L3 (Ville) Cc: Chris Wilson Cc: Dave Gordon Signed-off-by: Rafael Barbalho Signed-off-by: Arun Siluvery --- drivers/gpu/drm/i915/i915_reg.h | 1 + drivers/gpu/drm/i915/intel_lrc.c | 16 2 files changed, 17 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index d14ad20..7637e64 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -410,6 +410,7 @@ #define DISPLAY_PLANE_A (0<<20) #define DISPLAY_PLANE_B (1<<20) #define GFX_OP_PIPE_CONTROL(len) ((0x3<<29)|(0x3<<27)|(0x2<<24)|(len-2)) +#define PIPE_CONTROL_FLUSH_L3(1<<27) #define PIPE_CONTROL_GLOBAL_GTT_IVB (1<<24) /* gen7+ */ #define PIPE_CONTROL_MMIO_WRITE (1<<23) #define PIPE_CONTROL_STORE_DATA_INDEX(1<<21) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 3e7aaa9..664455c 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1137,6 +1137,7 @@ static int gen8_init_indirectctx_bb(struct intel_engine_cs *ring, uint32_t *const batch, uint32_t *offset) { + uint32_t scratch_addr; uint32_t index = wa_ctx_start(wa_ctx, *offset, CACHELINE_DWORDS); /* WaDisableCtxRestoreArbitration:bdw,chv */ @@ -1165,6 +1166,21 @@ static int gen8_init_indirectctx_bb(struct intel_engine_cs *ring, wa_ctx_emit(batch, l3sqc4_flush & ~GEN8_LQSC_FLUSH_COHERENT_LINES); } + /* WaClearSlmSpaceAtContextSwitch:bdw,chv */ + /* Actual scratch location is at 128 bytes offset */ + scratch_addr = ring->scratch.gtt_offset + 2*CACHELINE_BYTES; + scratch_addr |= PIPE_CONTROL_GLOBAL_GTT; + + wa_ctx_emit(batch, GFX_OP_PIPE_CONTROL(6)); + wa_ctx_emit(batch, (PIPE_CONTROL_FLUSH_L3 | + PIPE_CONTROL_GLOBAL_GTT_IVB | + PIPE_CONTROL_CS_STALL | + PIPE_CONTROL_QW_WRITE)); + wa_ctx_emit(batch, scratch_addr); + wa_ctx_emit(batch, 0); + wa_ctx_emit(batch, 0); + wa_ctx_emit(batch, 0); + /* Pad to end of cacheline */ while (index % CACHELINE_DWORDS) wa_ctx_emit(batch, MI_NOOP); -- 2.3.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 4/6] drm/i915/gen8: Add WaFlushCoherentL3CacheLinesAtContextSwitch workaround
In Indirect context w/a batch buffer, +WaFlushCoherentL3CacheLinesAtContextSwitch:bdw v2: Add LRI commands to set/reset bit that invalidates coherent lines, update WA to include programming restrictions and exclude CHV as it is not required (Ville) v3: Avoid unnecessary read when it can be done by reading register once (Chris). Cc: Chris Wilson Cc: Dave Gordon Signed-off-by: Rafael Barbalho Signed-off-by: Arun Siluvery --- drivers/gpu/drm/i915/i915_reg.h | 2 ++ drivers/gpu/drm/i915/intel_lrc.c | 23 +++ 2 files changed, 25 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 84af255..d14ad20 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -426,6 +426,7 @@ #define PIPE_CONTROL_INDIRECT_STATE_DISABLE (1<<9) #define PIPE_CONTROL_NOTIFY (1<<8) #define PIPE_CONTROL_FLUSH_ENABLE(1<<7) /* gen7+ */ +#define PIPE_CONTROL_DC_FLUSH_ENABLE (1<<5) #define PIPE_CONTROL_VF_CACHE_INVALIDATE (1<<4) #define PIPE_CONTROL_CONST_CACHE_INVALIDATE (1<<3) #define PIPE_CONTROL_STATE_CACHE_INVALIDATE (1<<2) @@ -5788,6 +5789,7 @@ enum skl_disp_power_wells { #define GEN8_L3SQCREG4 0xb118 #define GEN8_LQSC_RO_PERF_DIS (1<<27) +#define GEN8_LQSC_FLUSH_COHERENT_LINES(1<<21) /* GEN8 chicken */ #define HDC_CHICKEN0 0x7300 diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 62c7eeb..3e7aaa9 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1142,6 +1142,29 @@ static int gen8_init_indirectctx_bb(struct intel_engine_cs *ring, /* WaDisableCtxRestoreArbitration:bdw,chv */ wa_ctx_emit(batch, MI_ARB_ON_OFF | MI_ARB_DISABLE); + /* WaFlushCoherentL3CacheLinesAtContextSwitch:bdw */ + if (IS_BROADWELL(ring->dev)) { + struct drm_i915_private *dev_priv = to_i915(ring->dev); + uint32_t l3sqc4_flush = (I915_READ(GEN8_L3SQCREG4) | +GEN8_LQSC_FLUSH_COHERENT_LINES); + + wa_ctx_emit(batch, MI_LOAD_REGISTER_IMM(1)); + wa_ctx_emit(batch, GEN8_L3SQCREG4); + wa_ctx_emit(batch, l3sqc4_flush); + + wa_ctx_emit(batch, GFX_OP_PIPE_CONTROL(6)); + wa_ctx_emit(batch, (PIPE_CONTROL_CS_STALL | + PIPE_CONTROL_DC_FLUSH_ENABLE)); + wa_ctx_emit(batch, 0); + wa_ctx_emit(batch, 0); + wa_ctx_emit(batch, 0); + wa_ctx_emit(batch, 0); + + wa_ctx_emit(batch, MI_LOAD_REGISTER_IMM(1)); + wa_ctx_emit(batch, GEN8_L3SQCREG4); + wa_ctx_emit(batch, l3sqc4_flush & ~GEN8_LQSC_FLUSH_COHERENT_LINES); + } + /* Pad to end of cacheline */ while (index % CACHELINE_DWORDS) wa_ctx_emit(batch, MI_NOOP); -- 2.3.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 3/6] drm/i915/skl: Restrict the ring frequency table programming to SKL
From: Akash Goel Ring frequency table programming is not required on BXT. Added separate checks to enable the programming only for SKL & skip for BXT. Issue: VIZ-5144 Signed-off-by: Akash Goel Reviewed-by: Rodrigo Vivi --- drivers/gpu/drm/i915/intel_pm.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 89c1b73..31d2fa0 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -4696,7 +4696,7 @@ void gen6_update_ring_freq(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev->dev_private; - if (INTEL_INFO(dev)->gen < 6 || IS_VALLEYVIEW(dev)) + if (INTEL_INFO(dev)->gen < 6 || IS_VALLEYVIEW(dev) || IS_BROXTON(dev)) return; mutex_lock(&dev_priv->rps.hw_lock); @@ -5811,7 +5811,8 @@ static void intel_gen6_powersave_work(struct work_struct *work) } else if (INTEL_INFO(dev)->gen >= 9) { gen9_enable_rc6(dev); gen9_enable_rps(dev); - __gen6_update_ring_freq(dev); + if (IS_SKYLAKE(dev)) + __gen6_update_ring_freq(dev); } else if (IS_BROADWELL(dev)) { gen8_enable_rps(dev); __gen6_update_ring_freq(dev); -- 1.9.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 2/6] drm/i915/skl: Ring frequency table programming changes
From: Akash Goel Ring frequency table programming changes for SKL. No need for a floor on ring frequency, as the issue of performance impact with ring running below DDR frequency, is believed to be fixed on SKL v2: Removed the check for avoiding ring frequency programming for BXT (Rodrigo) Issue: VIZ-5144 Signed-off-by: Akash Goel Reviewed-by: Rodrigo Vivi --- drivers/gpu/drm/i915/intel_pm.c | 23 +++ 1 file changed, 19 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 8185a23..89c1b73 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -4613,6 +4613,7 @@ static void __gen6_update_ring_freq(struct drm_device *dev) int min_freq = 15; unsigned int gpu_freq; unsigned int max_ia_freq, min_ring_freq; + unsigned int max_gpu_freq, min_gpu_freq; int scaling_factor = 180; struct cpufreq_policy *policy; @@ -4637,17 +4638,31 @@ static void __gen6_update_ring_freq(struct drm_device *dev) /* convert DDR frequency from units of 266.6MHz to bandwidth */ min_ring_freq = mult_frac(min_ring_freq, 8, 3); + if (IS_SKYLAKE(dev)) { + /* Convert GT frequency to 50 HZ units */ + min_gpu_freq = dev_priv->rps.min_freq / GEN9_FREQ_SCALER; + max_gpu_freq = dev_priv->rps.max_freq / GEN9_FREQ_SCALER; + } else { + min_gpu_freq = dev_priv->rps.min_freq; + max_gpu_freq = dev_priv->rps.max_freq; + } + /* * For each potential GPU frequency, load a ring frequency we'd like * to use for memory access. We do this by specifying the IA frequency * the PCU should use as a reference to determine the ring frequency. */ - for (gpu_freq = dev_priv->rps.max_freq; gpu_freq >= dev_priv->rps.min_freq; -gpu_freq--) { - int diff = dev_priv->rps.max_freq - gpu_freq; + for (gpu_freq = max_gpu_freq; gpu_freq >= min_gpu_freq; gpu_freq--) { + int diff = max_gpu_freq - gpu_freq; unsigned int ia_freq = 0, ring_freq = 0; - if (INTEL_INFO(dev)->gen >= 8) { + if (IS_SKYLAKE(dev)) { + /* +* ring_freq = 2 * GT. ring_freq is in 100MHz units +* No floor required for ring frequency on SKL. +*/ + ring_freq = gpu_freq; + } else if (INTEL_INFO(dev)->gen >= 8) { /* max(2 * GT, DDR). NB: GT is 50MHz units */ ring_freq = max(min_ring_freq, gpu_freq); } else if (IS_HASWELL(dev)) { -- 1.9.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v2 0/6] Ring frequency & Rpe changes for SKL
From: Akash Goel This patch series adds the changes for supporting the Ring frequency table programming and retrieving the efficient frequency (aka RPe) value from the pcode for SKL. Addressed review comments from Ville, Daniel & added r-b tag from Rodrigo on the 3 patches Akash Goel (6): drm/i915/skl: Retrieve the Rpe value from Pcode drm/i915/skl: Ring frequency table programming changes drm/i915/skl: Restrict the ring frequency table programming to SKL drm/i915: Corrected the platform checks in i915_ring_freq_table function drm/i915/skl: Updated the i915_ring_freq_table debugfs function drm/i915: Added BXT check in i915_ring_freq_table function drivers/gpu/drm/i915/i915_debugfs.c | 22 + drivers/gpu/drm/i915/intel_pm.c | 47 ++--- 2 files changed, 50 insertions(+), 19 deletions(-) -- 1.9.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 1/6] drm/i915/skl: Retrieve the Rpe value from Pcode
From: Akash Goel Read the efficient frequency (aka RPe) value through the the mailbox command (0x1A) from the pcode, as done on Haswell and Broadwell. The turbo minimum frequency softlimit is not revised as per the efficient frequency value. v2: Replaced the conditional expression operator with 'if' statement (Tom) v3: Corrected the derivation of efficient frequency & shifted the GEN9_FREQ_SCALER multiplications downwards (Ville) Issue: VIZ-5143 Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/intel_pm.c | 19 +++ 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 32ff034..8185a23 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -4295,18 +4295,11 @@ static void gen6_init_rps_frequencies(struct drm_device *dev) dev_priv->rps.rp0_freq = (rp_state_cap >> 0) & 0xff; dev_priv->rps.rp1_freq = (rp_state_cap >> 8) & 0xff; dev_priv->rps.min_freq = (rp_state_cap >> 16) & 0xff; - if (IS_SKYLAKE(dev)) { - /* Store the frequency values in 16.66 MHZ units, which is - the natural hardware unit for SKL */ - dev_priv->rps.rp0_freq *= GEN9_FREQ_SCALER; - dev_priv->rps.rp1_freq *= GEN9_FREQ_SCALER; - dev_priv->rps.min_freq *= GEN9_FREQ_SCALER; - } /* hw_max = RP0 until we check for overclocking */ dev_priv->rps.max_freq = dev_priv->rps.rp0_freq; dev_priv->rps.efficient_freq = dev_priv->rps.rp1_freq; - if (IS_HASWELL(dev) || IS_BROADWELL(dev)) { + if (IS_HASWELL(dev) || IS_BROADWELL(dev) || IS_SKYLAKE(dev)) { ret = sandybridge_pcode_read(dev_priv, HSW_PCODE_DYNAMIC_DUTY_CYCLE_CONTROL, &ddcc_status); @@ -4318,6 +4311,16 @@ static void gen6_init_rps_frequencies(struct drm_device *dev) dev_priv->rps.max_freq); } + if (IS_SKYLAKE(dev)) { + /* Store the frequency values in 16.66 MHZ units, which is + the natural hardware unit for SKL */ + dev_priv->rps.rp0_freq *= GEN9_FREQ_SCALER; + dev_priv->rps.rp1_freq *= GEN9_FREQ_SCALER; + dev_priv->rps.min_freq *= GEN9_FREQ_SCALER; + dev_priv->rps.max_freq *= GEN9_FREQ_SCALER; + dev_priv->rps.efficient_freq *= GEN9_FREQ_SCALER; + } + dev_priv->rps.idle_freq = dev_priv->rps.min_freq; /* Preserve min/max settings in case of re-init */ -- 1.9.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 6/6] drm/i915: Added BXT check in i915_ring_freq_table function
From: Akash Goel Updated the i915_ring_freq_table debugfs function to add the broxton check, so as to disallow the read of ring frequency table for it. Issue: VIZ-5144 Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_debugfs.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 2666d8a..daf6bdc 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -1747,7 +1747,8 @@ static int i915_ring_freq_table(struct seq_file *m, void *unused) int gpu_freq, ia_freq; unsigned int max_gpu_freq, min_gpu_freq; - if ((INTEL_INFO(dev)->gen < 6) || IS_VALLEYVIEW(dev)) { + if ((INTEL_INFO(dev)->gen < 6) || IS_VALLEYVIEW(dev) || + IS_BROXTON(dev)) { seq_puts(m, "unsupported on this chipset\n"); return 0; } -- 1.9.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 4/6] drm/i915: Corrected the platform checks in i915_ring_freq_table function
From: Akash Goel Corrected the platform checks in i915_ring_freq_table debugfs function so as to allow the read of ring frequency table for BDW and disallow for VLV v2: Simplified the checks to avoid the double negation (Daniel) Issue: VIZ-5144 Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_debugfs.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index c49fe2a..438c10b 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -1746,7 +1746,8 @@ static int i915_ring_freq_table(struct seq_file *m, void *unused) int ret = 0; int gpu_freq, ia_freq; - if (!(IS_GEN6(dev) || IS_GEN7(dev))) { + if (!(IS_GEN6(dev) || (IS_GEN7(dev) && !IS_VALLEYVIEW(dev)) || + IS_BROADWELL(dev))) { seq_puts(m, "unsupported on this chipset\n"); return 0; } -- 1.9.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 5/6] drm/i915/skl: Updated the i915_ring_freq_table debugfs function
From: Akash Goel Updated the i915_ring_freq_table debugfs function to support the read of ring frequency table, through Punit interface, for SKL also. Issue: VIZ-5144 Signed-off-by: Akash Goel Reviewed-by: Rodrigo Vivi --- drivers/gpu/drm/i915/i915_debugfs.c | 22 -- 1 file changed, 16 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 438c10b..2666d8a 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -1745,9 +1745,9 @@ static int i915_ring_freq_table(struct seq_file *m, void *unused) struct drm_i915_private *dev_priv = dev->dev_private; int ret = 0; int gpu_freq, ia_freq; + unsigned int max_gpu_freq, min_gpu_freq; - if (!(IS_GEN6(dev) || (IS_GEN7(dev) && !IS_VALLEYVIEW(dev)) || - IS_BROADWELL(dev))) { + if ((INTEL_INFO(dev)->gen < 6) || IS_VALLEYVIEW(dev)) { seq_puts(m, "unsupported on this chipset\n"); return 0; } @@ -1760,17 +1760,27 @@ static int i915_ring_freq_table(struct seq_file *m, void *unused) if (ret) goto out; + if (IS_SKYLAKE(dev)) { + /* Convert GT frequency to 50 HZ units */ + min_gpu_freq = + dev_priv->rps.min_freq_softlimit / GEN9_FREQ_SCALER; + max_gpu_freq = + dev_priv->rps.max_freq_softlimit / GEN9_FREQ_SCALER; + } else { + min_gpu_freq = dev_priv->rps.min_freq_softlimit; + max_gpu_freq = dev_priv->rps.max_freq_softlimit; + } + seq_puts(m, "GPU freq (MHz)\tEffective CPU freq (MHz)\tEffective Ring freq (MHz)\n"); - for (gpu_freq = dev_priv->rps.min_freq_softlimit; -gpu_freq <= dev_priv->rps.max_freq_softlimit; -gpu_freq++) { + for (gpu_freq = min_gpu_freq; gpu_freq <= max_gpu_freq; gpu_freq++) { ia_freq = gpu_freq; sandybridge_pcode_read(dev_priv, GEN6_PCODE_READ_MIN_FREQ_TABLE, &ia_freq); seq_printf(m, "%d\t\t%d\t\t\t\t%d\n", - intel_gpu_freq(dev_priv, gpu_freq), + intel_gpu_freq(dev_priv, (gpu_freq * + (IS_SKYLAKE(dev) ? GEN9_FREQ_SCALER : 1))), ((ia_freq >> 0) & 0xff) * 100, ((ia_freq >> 8) & 0xff) * 100); } -- 1.9.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 09/15] drm/i915: GuC submission setup, phase 1
On 19/06/15 18:02, Dave Gordon wrote: > On 15/06/15 22:32, Chris Wilson wrote: [snip] >> Try to keep comments to explain why rather than what. Most of the >> comments here fall into the "i++; // postincrement i" category. >> -Chris > > Most of the "what" comments in this file are associated with accesses to > specific h/w registers, which therefore have semantic effect beyond what > is explicit in the code. For example this comment: > > /* tell all command streamers to forward interrupts and vblank to GuC */ > irqs = _MASKED_FIELD(GFX_FORWARD_VBLANK_MASK,GFX_FORWARD_VBLANK_ALWAYS); > irqs |= _MASKED_BIT_ENABLE(GFX_INTERRUPT_STEERING); > for_each_ring(ring, dev_priv, i) > I915_WRITE(RING_MODE_GEN7(ring), irqs); > > helps the reader what the /effect/ of the writes is intended to be. It's > quite different from: > > /* write bitmask to GEN7 ring mode register */ > I915_WRITE(RING_MODE_GEN7(ring),MASKED_BIT_ENABLE(GFX_INTERRUPT_STEERING)); > > and means you may not have to dig through the BSpec to find out what the > less helpfully-named bits actually do. And this: > > I915_WRITE(DE_GUCRMR, ~0); > > would be incomprehensible without reading the BSpec ... or the comment > > /* tell DE to send nothing to GuC */ > > .Dave. Oops, those comments aren't actually in this patch, they're in a later one. But they *will* be in this file by the end of the patchset ... .Dave. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 09/15] drm/i915: GuC submission setup, phase 1
On 15/06/15 22:32, Chris Wilson wrote: > On Mon, Jun 15, 2015 at 07:36:27PM +0100, Dave Gordon wrote: >> +static struct drm_i915_gem_object *gem_allocate_guc_obj(struct drm_device >> *dev, >> +u32 size) >> +{ >> +struct drm_i915_gem_object *obj; >> + >> +obj = i915_gem_alloc_object(dev, size); >> +if (!obj) >> +return NULL; > > Does it need to be a shmemfs object? It needs to be permanently in RAM, so probably not. But I don't see an allocator that gives you a GEM memory object /without/ backing store. Do we have one? >> +if (i915_gem_object_get_pages(obj)) { >> +drm_gem_object_unreference(&obj->base); >> +return NULL; >> +} > > This is a random function call. Which is? Unreferencing the newly-allocated object before returning? Otherwise it will leak :( Presumably if the object didn't have backing store, the get_pages() would be unnecessary as they would already be resident? Or would they not exist until the first get_pages call instantiated them? >> +if (i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, >> +PIN_OFFSET_BIAS | GUC_WOPCM_SIZE_VALUE)) { >> +drm_gem_object_unreference(&obj->base); >> +return NULL; > > How about reporting the right error code? Doesn't add anything. Allocation failure is going to be fatal anyway. And i915_gem_alloc_object() just returns NULL for failure, so we'd have to *make up* an error code for that case :( Oh, and ERR_PTR/PTR_ERR are ugly. >> +} >> + >> +return obj; >> +} >> + >> +/** >> + * gem_release_guc_obj() - Release gem object allocated for GuC usage >> + * @obj:gem obj to be released >> + */ >> +static void gem_release_guc_obj(struct drm_i915_gem_object *obj) >> +{ >> +if (!obj) >> +return; >> + >> +if (i915_gem_obj_is_pinned(obj)) >> +i915_gem_object_ggtt_unpin(obj); > > What? The object /should/ be pinned when we arrive here, thanks to the i915_gem_obj_ggtt_pin() call discussed above. We could just always unpin, and see whether we trigger this: WARN_ON(vma->pin_count == 0); inside i915_gem_object_ggtt_unpin(). The test is just in case there's an error path where the object being released wasn't in fact pinned. >> +drm_gem_object_unreference(&obj->base); >> +} >> + >> +/* >> + * Set up the memory resources to be shared with the GuC. At this point, >> + * we require just one object that can be mapped through the GGTT. >> + */ >> +int i915_guc_submission_init(struct drm_device *dev) >> +{ >> +struct drm_i915_private *dev_priv = dev->dev_private; > > Bleh. Cross-file interface, nonstatic, hence passes 'dev'; also it needs 'dev' anyway, so there's no gain in passing dev_priv. And dev_priv (i.e. struct drm_i915_private) isn't even in scope when this function is declared in the header file. >> +const size_t ctxsize = sizeof(struct guc_context_desc); >> +const size_t poolsize = MAX_GUC_GPU_CONTEXTS * ctxsize; >> +const size_t gemsize = round_up(poolsize, PAGE_SIZE); >> +struct intel_guc *guc = &dev_priv->guc; >> + >> +if (!i915.enable_guc_submission) >> +return 0; /* not enabled */ >> + >> +if (guc->ctx_pool_obj) >> +return 0; /* already allocated */ > > Eh? Where have you hooked into... So looking at that, it looks like you > want to move this into the device initialisation rather than guc > firmware load. To me at least they are conceptually separate stages, and > judging by the above combining them has resulted in very clumsy code. So ... we don't want to allocate any GuC-related structures unless we're going at least try to use the GuC, so this has to come /after/ firmware fetch and validation. OTOH we can't actually fire up the GuC until after these structures are allocated, because the GGTT address of the ctx_pool_obj has to be passed to the GuC firmware as one of its startup parameters. Thus, the only place to do this allocation is in between deciding to transfer the f/w to the GuC and actually doing so. Hence, the GuC loading code calls this each time we're about to squirt the f/w into the GuC; but, we don't need to allocate this more than once (OTOH it would be a violation of modularity for the loader to know that; only the submission code needs to know that little detail). So we end up with the above do-this-only-once code. >> +guc->ctx_pool_obj = gem_allocate_guc_obj(dev_priv->dev, gemsize); >> +if (!guc->ctx_pool_obj) >> +return -ENOMEM; >> + >> +spin_lock_init(&dev_priv->guc.host2guc_lock); >> + >> +ida_init(&guc->ctx_ids); >> + >> +memset(guc->doorbell_bitmap, 0, sizeof(guc->doorbell_bitmap)); >> +guc->db_cacheline = 0; > > Before you relied on guc being zeroed, and now you memset it again. Hmm ... perhaps we should rezero some of these things /every/ time instead? /me examines code ... Nope; it looks like everything is once again zero at the poi
Re: [Intel-gfx] [PATCH] drm/i915: Officially give up on seqno coherency
On Fri, Jun 19, 2015 at 06:43:04PM +0200, Daniel Vetter wrote: > We've never figured out the magic trick to make irq vs. seqno > updates coherent, only tricks to make it work. And since > > commit 094f9a54e35500739da185cdb78f2e92fc379458 > Author: Chris Wilson > Date: Wed Sep 25 17:34:55 2013 +0100 > > drm/i915: Fix __wait_seqno to use true infinite timeouts > > we automatically fall back to an irq augmented with polling scheme > after the first missed interrupt. There's really nothing else we can > do, hence tune down the message to informational level. It's still > useful for users in case it reliable preceedes a hard system hang. If I had a vote it would be DRM_NOTICE, #define KERN_NOTICE KERN_SOH "5"/* normal but significant condition */ #define KERN_INFO KERN_SOH "6"/* informational */ -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Officially give up on seqno coherency
We've never figured out the magic trick to make irq vs. seqno updates coherent, only tricks to make it work. And since commit 094f9a54e35500739da185cdb78f2e92fc379458 Author: Chris Wilson Date: Wed Sep 25 17:34:55 2013 +0100 drm/i915: Fix __wait_seqno to use true infinite timeouts we automatically fall back to an irq augmented with polling scheme after the first missed interrupt. There's really nothing else we can do, hence tune down the message to informational level. It's still useful for users in case it reliable preceedes a hard system hang. Cc: Mark Janes Cc: sta...@vger.kernel.org Signed-off-by: Daniel Vetter --- drivers/gpu/drm/i915/i915_irq.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index e6bb72dca3ff..9350f2e5cd04 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -2946,8 +2946,8 @@ static void i915_hangcheck_elapsed(struct work_struct *work) /* Issue a wake-up to catch stuck h/w. */ if (!test_and_set_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings)) { if (!(dev_priv->gpu_error.test_irq_rings & intel_ring_flag(ring))) - DRM_ERROR("Hangcheck timer elapsed... %s idle\n", - ring->name); + DRM_INFO("Hangcheck timer elapsed... %s idle\n", +ring->name); else DRM_INFO("Fake missed irq on %s\n", ring->name); -- 2.1.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: enable BIOS hang workaround for Lenovo T60 too
On Fri, 2015-06-19 at 17:44 +0200, Daniel Vetter wrote: > I wonder whether we shouldn't do this unconditionally for gen4 and earlier > for Lenovo ... Anyway this needs Cc: sta...@vger.kernel.org and is for > Jani to pick up. > > Thanks for figuring out what's been broken here. > -Daniel > > > pci_set_power_state(drm_dev->pdev, PCI_D3hot); > > > > return 0; Just two days ago we discussed this bug too, see https://lkml.org/lkml/2015/6/17/327 . That message contains a link to a patch I cobbled together for yet another ThinkPad hitting this, but with PCI_SUBVENDOR_ID_IBM involved. Paul Bolle ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 48/55] drm/i915: Add *_ring_begin() to request allocation
On 18/06/2015 14:29, Daniel Vetter wrote: On Thu, Jun 18, 2015 at 1:21 PM, John Harrison wrote: I'm still confused by what you are saying in the above referenced email. Part of it is about the sanity checks failing to handle the wrapping case correctly which has been fixed in the base reserve space patch (patch 2 in the series). The rest is either saying that you think we are potentially wrappping too early and wasting a few bytes of the ring buffer or that something is actually broken? Yeah I didn't realize that this change was meant to fix the ring->reserved_tail check since I didn't make that connection. It is correct with that change, but the problem I see is that the correctness of that debug aid isn't assured locally: No we both need that check _and_ the correct handling of the reservation tracking at wrap-around. If the check just handles wrapping it'll robustly stay in working shape even when the wrapping behaviour changes. Point 2: 100 bytes of reserve, 160 bytes of execbuf and 200 bytes remaining. You seem to think this will fail somehow? Why? The wait_for_space(160) in the execbuf code will cause a wrap because the the 100 bytes for the add_request reservation is added on and the wait is actually being done for 260 bytes. So yes, we wrap earlier than would otherwise have been necessary but that is the only way to absolutely guarantee that the add_request() call cannot fail when trying to do the wrap itself. There's no problem except that it's wasteful. And I tried to explain that no unconditionally force-wrapping for the entire reservation is actually not needed, since the additional space needed to account for the eventual wrapping is bounded by a factor of 2. It's much less in practice since we split up the final request bits into multiple smaller intel_ring_begin. And if feels a bit wasteful to throw that space away (and make the gpu eat through MI_NOP) just because it makes caring for the worst-case harder. And with GuC the 160 dwords is actually a fairly substantial part of the ring. Even more so when we completely switch to a transaction model for request, where we only need to wrap for individual commands and hence could place intel_ring_being per-cmd (which is mostly what we do already anyway). As Chris says, if the driver is attempting to create a single request that fills the entire ringbuffer then that is a bug that should be caught as soon as possible. Even with a Guc, the ring buffer is not small compared to the size of requests the driver currently produces. Part of the scheduler work is to limit the number of batch buffers that a given application/context can have outstanding in the ring buffer at any given time in order to prevent starvation of the rest of the system by one badly behaved app. Thus completely filling a large ring buffer becomes impossible anyway - the application will be blocked before it gets that far. My proposal for this reservation wrapping business would have been: - Increase the reservation by 31 dwords (to account for the worst-case wrap in pc_render_add_request). - Rework the reservation overflow WARN_ON in reserve_space_end to work correctly even when wrapping while the reservation has been in use. - Move the addition of reserved_space below the point where we wrap the ring and only check against total free space, neglecting wrapping. - Remove all other complications you've added. Result is no forced wrapping for reservation and a debug check which should even survive random changes by monkeys since the logic for that check is fully contained within reserve_space_end. And for the check we should be able to reuse __intel_free_space. If I'm reading things correctly this shouldn't have any effect outside of patch 2 and shouldn't cause any conflicts. -Daniel See new patch #2... ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands
From: John Harrison It is a bad idea for i915_add_request() to fail. The work will already have been send to the ring and will be processed, but there will not be any tracking or management of that work. The only way the add request call can fail is if it can't write its epilogue commands to the ring (cache flushing, seqno updates, interrupt signalling). The reasons for that are mostly down to running out of ring buffer space and the problems associated with trying to get some more. This patch prevents that situation from happening in the first place. When a request is created, it marks sufficient space as reserved for the epilogue commands. Thus guaranteeing that by the time the epilogue is written, there will be plenty of space for it. Note that a ring_begin() call is required to actually reserve the space (and do any potential waiting). However, that is not currently done at request creation time. This is because the ring_begin() code can allocate a request. Hence calling begin() from the request allocation code would lead to infinite recursion! Later patches in this series remove the need for begin() to do the allocate. At that point, it becomes safe for the allocate to call begin() and really reserve the space. Until then, there is a potential for insufficient space to be available at the point of calling i915_add_request(). However, that would only be in the case where the request was created and immediately submitted without ever calling ring_begin() and adding any work to that request. Which should never happen. And even if it does, and if that request happens to fall down the tiny window of opportunity for failing due to being out of ring space then does it really matter because the request wasn't doing anything in the first place? v2: Updated the 'reserved space too small' warning to include the offending sizes. Added a 'cancel' operation to clean up when a request is abandoned. Added re-initialisation of tracking state after a buffer wrap to keep the sanity checks accurate. v3: Incremented the reserved size to accommodate Ironlake (after finally managing to run on an ILK system). Also fixed missing wrap code in LRC mode. v4: Added extra comment and removed duplicate WARN (feedback from Tomas). v5: Re-write of wrap handling to prevent unnecessary early wraps (feedback from Daniel Vetter). For: VIZ-5115 CC: Tomas Elf CC: Daniel Vetter Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_drv.h |1 + drivers/gpu/drm/i915/i915_gem.c | 37 drivers/gpu/drm/i915/intel_lrc.c| 35 +-- drivers/gpu/drm/i915/intel_ringbuffer.c | 98 +-- drivers/gpu/drm/i915/intel_ringbuffer.h | 25 5 files changed, 186 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 0347eb9..eba1857 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2187,6 +2187,7 @@ struct drm_i915_gem_request { int i915_gem_request_alloc(struct intel_engine_cs *ring, struct intel_context *ctx); +void i915_gem_request_cancel(struct drm_i915_gem_request *req); void i915_gem_request_free(struct kref *req_ref); static inline uint32_t diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 81f3512..85fa27b 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2485,6 +2485,13 @@ int __i915_add_request(struct intel_engine_cs *ring, } else ringbuf = ring->buffer; + /* +* To ensure that this call will not fail, space for its emissions +* should already have been reserved in the ring buffer. Let the ring +* know that it is time to use that space up. +*/ + intel_ring_reserved_space_use(ringbuf); + request_start = intel_ring_get_tail(ringbuf); /* * Emit any outstanding flushes - execbuf can fail to emit the flush @@ -2567,6 +2574,9 @@ int __i915_add_request(struct intel_engine_cs *ring, round_jiffies_up_relative(HZ)); intel_mark_busy(dev_priv->dev); + /* Sanity check that the reserved size was large enough. */ + intel_ring_reserved_space_end(ringbuf); + return 0; } @@ -2666,6 +2676,26 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring, if (ret) goto err; + /* +* Reserve space in the ring buffer for all the commands required to +* eventually emit this request. This is to guarantee that the +* i915_add_request() call can't fail. Note that the reserve may need +* to be redone if the request is not actually submitted straight +* away, e.g. because a GPU scheduler has deferred it. +* +* Note further that this call merely notes the reserve request. A +* subsequent call to *_ring_begin() is required to actually
Re: [Intel-gfx] [PATCH] drm/i915: Reset request handling for gen8+
On Thu, Jun 18, 2015 at 04:58:06PM +0200, Daniel Vetter wrote: > On Thu, Jun 18, 2015 at 12:42:55PM +0100, Chris Wilson wrote: > > I understand the merit in trying the reset a few times before giving up, > > it would just need a bit of restructuring to try the reset before > > clearing gem state (trivial) and requeueing the hangcheck. I am just > > wary of feature creep before we get stuck into TDR, which promises to > > change how we think about resets entirely. > > My maintainer concern here is always that we should err on the side of not > killing the machine. If the reset failed, or if the gpu reinit failed then > marking the gpu as wedged has historically been the safe option. The > system will still run, display mostly works and there's a reasonable > chance you can gather debug data. One thing to bear in mind here is that it with this particular don't reset if not ready logic, repeating the attempt at reset after another hangcheck is equivalent to just using a slower hangcheck. (more or less, a couple of writes to one register difference) So it is no more likely to hang the machine than the original GPU hang. We can differentiate the cases here, between say EBUSY, ENODEV, and EIO, from the actual the reset request to determine which we want to retry (i.e. EBUSY). -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: enable BIOS hang workaround for Lenovo T60 too
On Fri, Jun 19, 2015 at 08:17:55AM +0200, Mikko Rapeli wrote: > When trying to hibernate a Lenovo T60 the half moon led keeps blinking and > devices does not power off since commit da2bc1b9db3. > > T60 chip details: > > 00:02.0 VGA compatible controller: Intel Corporation Mobile 945GM/GMS, > 943/940GM > L Express Integrated Graphics Controller (rev 03) (prog-if 00 [VGA > controller]) > Subsystem: Lenovo ThinkPad R60/T60/X60 series > Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- > Step > ping- SERR- FastB2B- DisINTx- > Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- > SERR- Latency: 0 > Interrupt: pin A routed to IRQ 16 > Region 0: Memory at ee10 (32-bit, non-prefetchable) [size=512K] > Region 1: I/O ports at 1800 [size=8] > Region 2: Memory at d000 (32-bit, prefetchable) [size=256M] > Region 3: Memory at ee20 (32-bit, non-prefetchable) [size=256K] > Expansion ROM at [disabled] > Capabilities: [90] MSI: Enable- Count=1/1 Maskable- 64bit- > Address: Data: > Capabilities: [d0] Power Management version 2 > Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA > PME(D0-,D1-,D2-,D3hot-,D3cold-) > Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME- > Kernel driver in use: i915 > > Signed-off-by: Mikko Rapeli > --- > drivers/gpu/drm/i915/i915_drv.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c > index ec4d932..36e311e 100644 > --- a/drivers/gpu/drm/i915/i915_drv.c > +++ b/drivers/gpu/drm/i915/i915_drv.c > @@ -641,11 +641,12 @@ static int i915_drm_suspend_late(struct drm_device > *drm_dev, bool hibernation) >* the device even though it's already in D3 and hang the machine. So >* leave the device in D0 on those platforms and hope the BIOS will >* power down the device properly. Platforms where this was seen: > - * Lenovo Thinkpad X301, X61s > + * Lenovo Thinkpad X301, X61s, T60 >*/ > if (!(hibernation && > drm_dev->pdev->subsystem_vendor == PCI_VENDOR_ID_LENOVO && > - INTEL_INFO(dev_priv)->gen == 4)) > + ((INTEL_INFO(dev_priv)->gen == 3) || > +(INTEL_INFO(dev_priv)->gen == 4))) I wonder whether we shouldn't do this unconditionally for gen4 and earlier for Lenovo ... Anyway this needs Cc: sta...@vger.kernel.org and is for Jani to pick up. Thanks for figuring out what's been broken here. -Daniel > pci_set_power_state(drm_dev->pdev, PCI_D3hot); > > return 0; > -- > 2.1.4 > > ___ > dri-devel mailing list > dri-de...@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/dri-devel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/atomic: Extract needs_modeset function
On Fri, Jun 19, 2015 at 06:08:08AM +0200, Maarten Lankhorst wrote: > Op 18-06-15 om 11:25 schreef Daniel Vetter: > > We use the same check already in the atomic core, so might as well > > make this official. And it's also reused in e.g. i915. > > > > Motivated by Maarten's idea to extract a connector_changed state out > > of mode_changed. > > > > Cc: Maarten Lankhorst > > Signed-off-by: Daniel Vetter > > > Reviewed-By: Maarten Lankhorst Thanks for the review, applied. For i915 I guess we could switch if we want to, but perhaps only once we look into the connector_changed idea to implement fastboot. Just to avoid needless churn. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PULL] drm-intel-next-fixes
On Fri, Jun 19, 2015 at 01:48:13PM +1000, Dave Airlie wrote: > On 18 June 2015 at 16:04, Jani Nikula wrote: > > > > Hi Dave, i915 fixes for drm-next/v4.2. > > > > BR, > > Jani. > > And my gcc says: > > /home/airlied/devel/kernel/drm-next/drivers/gpu/drm/i915/intel_display.c: > In function ‘__intel_set_mode’: > /home/airlied/devel/kernel/drm-next/drivers/gpu/drm/i915/intel_display.c:11850:14: > warning: ‘crtc_state’ may be used uninitialized in this function > [-Wmaybe-uninitialized] > return state->mode_changed || state->active_changed; > ^ > /home/airlied/devel/kernel/drm-next/drivers/gpu/drm/i915/intel_display.c:11860:25: > note: ‘crtc_state’ was declared here > struct drm_crtc_state *crtc_state; > ^ > /home/airlied/devel/kernel/drm-next/drivers/gpu/drm/i915/intel_display.c:11874:6: > warning: ‘crtc’ may be used uninitialized in this function > [-Wmaybe-uninitialized] >if (crtc != intel_encoder->base.crtc) > ^ > /home/airlied/devel/kernel/drm-next/drivers/gpu/drm/i915/intel_display.c:11859:19: > note: ‘crtc’ was declared here > struct drm_crtc *crtc; >^ > > No idea if this is true, but I don't think I've seen it before now. > > gcc 5.1.1 on fedora 22 Yeah this is new with Ander's patches. gcc Doesn't know that we have at least 1 crtc and hence crtc&crtc are guaranteed to be initiliazed. I think you should be able to shut it up with diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index e047105837c9..5ade250dc6d7 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -11856,8 +11856,8 @@ intel_modeset_update_state(struct drm_atomic_state *state) struct drm_device *dev = state->dev; struct drm_i915_private *dev_priv = dev->dev_private; struct intel_encoder *intel_encoder; - struct drm_crtc *crtc; - struct drm_crtc_state *crtc_state; + struct drm_crtc *crtc = NULL; + struct drm_crtc_state *crtc_state = NULL; struct drm_connector *connector; int i; But the entire Finland team is out of office (celebrating solstice), so might be better to wait for Monday for them to confirm. Otherwise just apply this fixup with my ack if you want to send out the merge window pull asap. Cheers, Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 05/15] drm/i915: GuC-specific firmware loader
On 18/06/15 21:12, Chris Wilson wrote: > On Thu, Jun 18, 2015 at 10:53:10AM -0700, Yu Dai wrote: >> >> >> On 06/15/2015 01:30 PM, Chris Wilson wrote: >>> On Mon, Jun 15, 2015 at 07:36:23PM +0100, Dave Gordon wrote: + /* Set the source address for the new blob */ + offset = i915_gem_obj_ggtt_offset(fw_obj); >>> >>> Why would it even have a GGTT vma? There's no precondition here to >>> assert that it should. >> It is pinned into GGTT inside gem_allocate_guc_obj. This particular object wasn't allocated with that function; that's only used for objects that need to be permanently accessible by the GuC (context pool, GuC logbuffer, per-client structure). As I already mentioned in another reply, /this/ one was pinned (and will be unpinned) by the *immediate caller* of this function. .Dave. > The basic rules when reviewing is pinning is: > - is there a reason for this pin? > - is the lifetime of the pin bound to the hardware access? > - are the pad-to-size/alignment correct? > - is the vma in the wrong location? > > Pinning early (and then not even stating in the function preamble that > you expect the object to be pinned) makes it hard to review both the > reason and check the lifetime. An easy solution to avoiding the > assumption of having a pinned object is to pass around the vma instead. > Though because you pin too early it is not clear the reason for the pin > nor that you only pin it for the lifetime of the hardware access, and > you have to scour the code to ensure that the pin isn't randomly dropped > or reused for another access. > -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Kill DRI1 cliprects
On Fri, Jun 19, 2015 at 02:45:34PM +0100, Chris Wilson wrote: > + /* Kernel clipping was a DRI1 misfeature */ > + if (exec->num_cliprects != 0 || exec->cliprects_ptr == 0) > + return false; In my attempt to keep Daniel happy, that test above for cliprects_ptr is backwards. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Kill DRI1 cliprects
Passing cliprects into the kernel for it to re-execute the batch buffer with different CMD_DRAWRECT died out long ago. As DRI1 support has been removed from the kernel, we can now simply reject any execbuf trying to use this "feature". To keep Daniel happy with the prospect of being able to reuse these fields in the next decade, continue to ensure that current userspace is not passing garbage in through the dead fields. Signed-off-by: Chris Wilson Cc: Daniel Vetter --- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 153 ++--- drivers/gpu/drm/i915/intel_lrc.c | 15 --- 2 files changed, 31 insertions(+), 137 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 3336e1c2c0a5..9a194b6e4641 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -933,7 +933,21 @@ i915_gem_check_execbuffer(struct drm_i915_gem_execbuffer2 *exec) if (exec->flags & __I915_EXEC_UNKNOWN_FLAGS) return false; - return ((exec->batch_start_offset | exec->batch_len) & 0x7) == 0; + /* Kernel clipping was a DRI1 misfeature */ + if (exec->num_cliprects != 0 || exec->cliprects_ptr == 0) + return false; + + if (exec->DR4 == 0x) { + DRM_DEBUG("UXA submitting garbage DR4, fixing up\n"); + exec->DR4 = 0; + } + if (exec->DR1 || exec->DR4) + return false; + + if ((exec->batch_start_offset | exec->batch_len) & 0x7) + return false; + + return true; } static int @@ -1096,46 +1110,6 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev, return 0; } -static int -i915_emit_box(struct intel_engine_cs *ring, - struct drm_clip_rect *box, - int DR1, int DR4) -{ - int ret; - - if (box->y2 <= box->y1 || box->x2 <= box->x1 || - box->y2 <= 0 || box->x2 <= 0) { - DRM_ERROR("Bad box %d,%d..%d,%d\n", - box->x1, box->y1, box->x2, box->y2); - return -EINVAL; - } - - if (INTEL_INFO(ring->dev)->gen >= 4) { - ret = intel_ring_begin(ring, 4); - if (ret) - return ret; - - intel_ring_emit(ring, GFX_OP_DRAWRECT_INFO_I965); - intel_ring_emit(ring, (box->x1 & 0x) | box->y1 << 16); - intel_ring_emit(ring, ((box->x2 - 1) & 0x) | (box->y2 - 1) << 16); - intel_ring_emit(ring, DR4); - } else { - ret = intel_ring_begin(ring, 6); - if (ret) - return ret; - - intel_ring_emit(ring, GFX_OP_DRAWRECT_INFO); - intel_ring_emit(ring, DR1); - intel_ring_emit(ring, (box->x1 & 0x) | box->y1 << 16); - intel_ring_emit(ring, ((box->x2 - 1) & 0x) | (box->y2 - 1) << 16); - intel_ring_emit(ring, DR4); - intel_ring_emit(ring, 0); - } - intel_ring_advance(ring); - - return 0; -} - static struct drm_i915_gem_object* i915_gem_execbuffer_parse(struct intel_engine_cs *ring, struct drm_i915_gem_exec_object2 *shadow_exec_entry, @@ -1198,63 +1172,19 @@ i915_gem_ringbuffer_submission(struct drm_device *dev, struct drm_file *file, struct drm_i915_gem_object *batch_obj, u64 exec_start, u32 dispatch_flags) { - struct drm_clip_rect *cliprects = NULL; struct drm_i915_private *dev_priv = dev->dev_private; u64 exec_len; int instp_mode; u32 instp_mask; - int i, ret = 0; - - if (args->num_cliprects != 0) { - if (ring != &dev_priv->ring[RCS]) { - DRM_DEBUG("clip rectangles are only valid with the render ring\n"); - return -EINVAL; - } - - if (INTEL_INFO(dev)->gen >= 5) { - DRM_DEBUG("clip rectangles are only valid on pre-gen5\n"); - return -EINVAL; - } - - if (args->num_cliprects > UINT_MAX / sizeof(*cliprects)) { - DRM_DEBUG("execbuf with %u cliprects\n", - args->num_cliprects); - return -EINVAL; - } - - cliprects = kcalloc(args->num_cliprects, - sizeof(*cliprects), - GFP_KERNEL); - if (cliprects == NULL) { - ret = -ENOMEM; - goto error; - } - - if (copy_from_user(cliprects, - to_user_ptr(args->cliprects_ptr), - sizeof(*cliprects)*args->num_cliprects)) { - ret = -EFAULT; -
Re: [Intel-gfx] [PATCH] drm/i915: Avoid fluctuating to UNKNOWN connector status during forced probes
On Mon, Jun 15, 2015 at 05:18:40PM +0200, Daniel Vetter wrote: > On Mon, Jun 15, 2015 at 02:37:58PM +0100, Chris Wilson wrote: > > On Mon, Jun 15, 2015 at 03:03:39PM +0200, Daniel Vetter wrote: > > > On Mon, Jun 15, 2015 at 01:35:21PM +0100, Chris Wilson wrote: > > > > On Mon, Jun 15, 2015 at 02:29:28PM +0200, Daniel Vetter wrote: > > > > > On Fri, Jun 05, 2015 at 08:46:19AM +0100, Chris Wilson wrote: > > > > > > For old-school component TV and CRT connectors, we require a > > > > > > heavyweight > > > > > > load-detection mechanism. This involves setting up a CRTC and > > > > > > sending a > > > > > > signal to the output, before reading back any response. As that is > > > > > > quite > > > > > > slow and CPU heavy, the process is only performed when the output > > > > > > detection is forced by user request. As it requires a driving CRTC, > > > > > > we > > > > > > often don't have the resources to complete the probe. This leaves > > > > > > us in > > > > > > a quandary where the unforced path just returns the old connector > > > > > > status, but the forced detection path elects to return UNKNOWN. If > > > > > > we > > > > > > have an active connection, we likely have the resources available to > > > > > > complete the probe - but if it is currently disconnected, then it > > > > > > becomes unknown and triggers a hotplug event, with often quite > > > > > > unfortunate > > > > > > userspace behaviour (e.g. one output is blanked and the spurious TV > > > > > > turned on). > > > > > > > > > > > > To reduce spurious hotplug events on older devices, we can prevent > > > > > > transitions between disconnected <-> unknown. > > > > > > > > > > > > v2: Convert tv_type to use proper unknown enum (Ville) > > > > > > > > > > > > References: https://bugs.freedesktop.org/show_bug.cgi?id=87049 > > > > > > Signed-off-by: Chris Wilson > > > > > > Cc: Ville Syrjälä [v1] > > > > > > Reviewed-by: Ville Syrjälä [v1] > > > > > > > > > > This solution is at odds with > > > > > > > > > > commit b7703726251191cd9f3ef3a80b2d9667901eec95 > > > > > Author: Daniel Vetter > > > > > Date: Wed Jan 21 08:45:22 2015 +0100 > > > > > > > > > > drm/probe-helper: clamp unknown connector status in the poll work > > > > > > > > > > We're missing this bit of logic from the hotplug handlers, but that > > > > > was > > > > > somewhat intentional since a hotplug should indicate that something > > > > > has > > > > > changed. And in the i915 hpd handler we filter by source. > > > > > > > > > > Given that the report is older than me having resurrect that patch > > > > > can we > > > > > close it as fixed or do I miss something? > > > > > > > > It's a different path. This also concerns the actual forced reprobe > > > > fluctuating between two states. > > > > > > In that case I still think it should be in the probe helpers. Which raises > > > the policy decision whether we should ever hand unkown back to userspace > > > since apparently it just can't cope sensible with it. But that call should > > > be done consistently accross drivers. > > > > Hmm, the probe helper is not the central arbiter here which is a problem > > in moving it entirely into its domain. UNKNOWN makes a sense form the hw > > perspective, and we still need to report that status before any probes > > are made at all (whether that is init/resume) and the problem in this > > case is not so much as userspace mishandling it, but the fluctuation > > between UNKNOWN/DISCONNECTED is causing activity and reconfiguration not > > entirely of its own fault. > > Well for the poll worker part of the probe helpers I went with > > if (new_status == unknown) > connector->status = old_status; > > which takes care of the init/resume time unknown->(dis)connected > transition correctly while filtering the noise. Adding that to the > synchronous users probe function (which is the only one that sets force == > true) would achieve what you want I think. But it also means that we > almost always filter out unknown and never let userspace known that the hw > detection is uncertain. I thought that connected->unknown was less likely and of more significance than disconnected<->unknown. But that was from thinking that the output was in use whilst it was connected, and so should always have the resources available to do the detection. On second thought, we should always just suppress the ->unknown transition. The best way to then report the unreliable detection would to be report an error trying to connect to the CRT/TV (by a vblank load detect cycle) and signal EREMOTEIO in enable -- but the modesetting code still does not report errors, despite their continued prevalence (such as training failures). > Maybe we instead need a drm helper module option to clamp unknown > uncondiotionally to disconnected to allow users to hack around silly > userspace? We have userspace that respects unknown, after it is what the fb_helper was copied from. Unknown is unkn
[Intel-gfx] [PATCH] drm/i915: Avoid GPU stalls from kswapd
Exclude active GPU pages from the purview of the background shrinker (kswapd), as these cause uncontrollable GPU stalls. Given that the shrinker is rerun until the freelists are satisfied, we should have opportunity in subsequent passes to recover the pages once idle. If the machine does run out of memory entirely, we have the forced idling in the oom-notifier as a means of releasing all the pages we can before an oom is prematurely executed. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/i915_gem_shrinker.c | 8 ++-- 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 71f4ca5088e2..e0dcd018379f 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3185,6 +3185,7 @@ unsigned long i915_gem_shrink(struct drm_i915_private *dev_priv, #define I915_SHRINK_PURGEABLE 0x1 #define I915_SHRINK_UNBOUND 0x2 #define I915_SHRINK_BOUND 0x4 +#define I915_SHRINK_ACTIVE 0x8 unsigned long i915_gem_shrink_all(struct drm_i915_private *dev_priv); void i915_gem_shrinker_init(struct drm_i915_private *dev_priv); diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c b/drivers/gpu/drm/i915/i915_gem_shrinker.c index bd1cf921aead..8d25ec8a6559 100644 --- a/drivers/gpu/drm/i915/i915_gem_shrinker.c +++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c @@ -123,6 +123,10 @@ i915_gem_shrink(struct drm_i915_private *dev_priv, obj->madv != I915_MADV_DONTNEED) continue; + if ((flags & I915_SHRINK_ACTIVE) == 0 && + obj->active) + continue; + drm_gem_object_reference(&obj->base); /* For the unbound phase, this should be a no-op! */ @@ -166,7 +170,7 @@ unsigned long i915_gem_shrink_all(struct drm_i915_private *dev_priv) i915_gem_retire_requests(dev_priv->dev); return i915_gem_shrink(dev_priv, LONG_MAX, - I915_SHRINK_BOUND | I915_SHRINK_UNBOUND); + I915_SHRINK_BOUND | I915_SHRINK_UNBOUND | I915_SHRINK_ACTIVE); } static bool i915_gem_shrinker_lock(struct drm_device *dev, bool *unlock) @@ -221,7 +225,7 @@ i915_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc) count += obj->base.size >> PAGE_SHIFT; list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { - if (obj->pages_pin_count == num_vma_bound(obj)) + if (!obj->active && obj->pages_pin_count == num_vma_bound(obj)) count += obj->base.size >> PAGE_SHIFT; } -- 2.1.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Enforce execobject.alignment to be a power-of-two
Internal requirement for the alignment is that it must be a power-of-two, so enforce rejection at the user interface to execbuffer (which allows the caller to specify a stricter-than-expected alignment criterion). Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index f4fb2bd33753..b14733908d8d 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -1189,6 +1189,9 @@ validate_exec_list(const struct drm_i915_gem_execbuffer2 *args, if (exec[i].flags & invalid_flags) return -EINVAL; + if (exec[i].alignment && !is_power_of_2(exec[i].alignment)) + return -EINVAL; + /* pad_to_size was once a reserved field, so sanitize it */ if (exec[i].flags & EXEC_OBJECT_PAD_TO_SIZE) { if (offset_in_page(exec[i].pad_to_size)) -- 2.1.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Ignore LVDS presence in VBT flag if the LVDS is enabled by BIOS
On older gen, pre-Ironlake, parts there is no hardwired pin to report the presence of an LVDS panel. Instead, we have to rely on the VBT to declare whether the machine has a panel or not. Though notoriously unreliable, so far we have erred on the side of false-positives and have required a list of machines which end up falsely reporting a panel as present. However, we now have reports of false-negatives, machines with an LVDS that are being ignored due to the VBT not declaring the panel. This patch ignores the VBT setting if the BIOS has already enabled the LVDS panel (and on Ironlake+ we also have the hardware presence pin). It fixes the Samsung NP680Z5E-X01FR in the bug report, but is likely to result in more false-positives, and since we rely on the BIOS to enable the panel, there are likely different circumstances where the BIOS will not enable that panel (and so we may see the same machine with and without a panel all on the whim of the BIOS). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90979 Reported-and-tested-by: lys...@gmail.com Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/intel_lvds.c | 16 ++-- 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_lvds.c b/drivers/gpu/drm/i915/intel_lvds.c index 161ab26f81fb..bf1702a6e33d 100644 --- a/drivers/gpu/drm/i915/intel_lvds.c +++ b/drivers/gpu/drm/i915/intel_lvds.c @@ -942,12 +942,6 @@ void intel_lvds_init(struct drm_device *dev) if (dmi_check_system(intel_no_lvds)) return; - pin = GMBUS_PIN_PANEL; - if (!lvds_is_present_in_vbt(dev, &pin)) { - DRM_DEBUG_KMS("LVDS is not present in VBT\n"); - return; - } - if (HAS_PCH_SPLIT(dev)) { if ((I915_READ(PCH_LVDS) & LVDS_DETECTED) == 0) return; @@ -957,6 +951,16 @@ void intel_lvds_init(struct drm_device *dev) } } + pin = GMBUS_PIN_PANEL; + if (!lvds_is_present_in_vbt(dev, &pin)) { + u32 reg = HAS_PCH_SPLIT(dev) ? PCH_LVDS : LVDS; + if ((I915_READ(reg) & LVDS_PORT_EN) == 0) { + DRM_DEBUG_KMS("LVDS is not present in VBT\n"); + return; + } + DRM_DEBUG_KMS("LVDS is not present in VBT, but enabled anyway\n"); + } + lvds_encoder = kzalloc(sizeof(*lvds_encoder), GFP_KERNEL); if (!lvds_encoder) return; -- 2.1.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 3/4] snd: add support for displayport multi-stream to hda codec.
At Fri, 19 Jun 2015 20:33:39 +1000, Dave Airlie wrote: > > On 19 June 2015 at 19:54, Lin, Mengdong wrote: > > Hi Takashi/Dave, > > > > Shall we move or cc this discussion on audio driver side to ALSA ML? > > Oops I thought I had cc'ed these patches to alsa-devel as well when I sent > them. > > > I think we also need to decide how to manage PCM devices for DP MST. > > Now the HD-A driver create a PCM device for each pin, and the substream > > number is 1 for each PCM. Now with DP MST enabled, each pin can support > > multiple streams (e.g. 3 on Intel HSW/BDW/SKL). > > > > There may be 2 options: > > -#1: Let an HDMI codec specify number of substreams, same as the number > > of device entries on a pin. We can specify 3 for HSW/BDW/SKL. Other > > vendors can also specify a value according to actual HW capabilities. > > > > So for HSW, we have 3x3 subtreams totally. But we only have 3 convertors > > (for 3 display pipelines), so we can open up to 3 substreams at the same > > time. When the audio driver finds all 3 convertors are used when opening > > a new substream, it will fail. > > One thing I noticed is the number of devices on a PIN is only updated when > the MST device is plugged in so normally pins 5,6,7 have 0 devices, and when > I plug in MST device, I get the 3 devices on port 6. So it seems dynamic > enough at this point, though I guess it'll always be 0 or 3. > > > > - #2: Create PCM device dynamically. Only create a PCM devices for a device > > entry which connects to monitor with audio support. When the monitor > > is removed, the PCM device will be disconnected, closed and removed, > > similar to the USB case. > > > > This will change ALSA core. But there will be less PCM devices and > > substreams, since the number of connected monitors Is decided by the > > actual GPU display pipeline. > > I like this option more, since I think it should be more like USB, but I've > no idea how much work it would be from the alsa side, this patch was > probably as deep into alsa as I've gone. Two things have to be considered for compatibility: - ELD, channel map and jack detection: these are created per PCM device, and extending to substream would confuse user space a lot. In theory, it can be extended using subdevice number, but in anyway this won't work with PulseAudio as is. - The per-pin assignment provides a more or less persistent route to a certain device. Changing the assignment method may break the previous setup. Also, the dynamic PCM creation / removal is an issue that has been discussed many times. Unfortunately it won't work as is, at lest for PA. Currently PA does probing of devices only at the card probe time. The hotplug of USB-audio works because it's always tied with the card. But in this case, the card remains while only the PCM devices will be created / removed, thus the probe in PA won't be triggered. Takashi > > Dave. > > >> Signed-off-by: Dave Airlie > >> --- > >> include/sound/hda_verbs.h | 2 + > >> sound/pci/hda/hda_codec.c | 1 + > >> sound/pci/hda/hda_proc.c | 5 +- > >> sound/pci/hda/patch_hdmi.c | 181 > >> +++-- > >> 4 files changed, 131 insertions(+), 58 deletions(-) > >> > >> diff --git a/include/sound/hda_verbs.h b/include/sound/hda_verbs.h index > >> d0509db..3b62ac5 100644 > >> --- a/include/sound/hda_verbs.h > >> +++ b/include/sound/hda_verbs.h > >> @@ -75,6 +75,7 @@ enum { > >> #define AC_VERB_GET_HDMI_CHAN_SLOT 0x0f34 > >> #define AC_VERB_GET_DEVICE_SEL 0xf35 > >> #define AC_VERB_GET_DEVICE_LIST 0xf36 > >> +#define AC_VERB_GET_DP_STREAM_ID 0xf3c > >> > >> /* > >> * SET verbs > >> @@ -115,6 +116,7 @@ enum { > >> #define AC_VERB_SET_HDMI_CP_CTRL 0x733 > >> #define AC_VERB_SET_HDMI_CHAN_SLOT 0x734 > >> #define AC_VERB_SET_DEVICE_SEL 0x735 > >> +#define AC_VERB_SET_DP_STREAM_ID 0x73C > >> > >> /* > >> * Parameter IDs > >> diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c index > >> 5645481..3981c63 100644 > >> --- a/sound/pci/hda/hda_codec.c > >> +++ b/sound/pci/hda/hda_codec.c > >> @@ -482,6 +482,7 @@ int snd_hda_get_devices(struct hda_codec *codec, > >> hda_nid_t nid, > >> } > >> return devices; > >> } > >> +EXPORT_SYMBOL_GPL(snd_hda_get_devices); > >> > >> /* > >> * destructor > >> diff --git a/sound/pci/hda/hda_proc.c b/sound/pci/hda/hda_proc.c index > >> baaf7ed0..39fac53 100644 > >> --- a/sound/pci/hda/hda_proc.c > >> +++ b/sound/pci/hda/hda_proc.c > >> @@ -644,10 +644,13 @@ static void print_device_list(struct snd_info_buffer > >> *buffer, > >> int i, curr = -1; > >> u8 dev_list[AC_MAX_DEV_LIST_LEN]; > >> int devlist_len; > >> + int dp_s_id; > >> > >> + dp_s_id = snd_hda_codec_read(codec, nid, 0, > >> + AC_VERB_GET_DP_STREAM_ID, 0); > >> devlist_len = snd_hda_get_devi
Re: [Intel-gfx] [RFC] drm/i915: Introduce documentation for register subfields
On Fri, Jun 19, 2015 at 10:52:15AM +0100, Chris Wilson wrote: > An interesting point was raised in some recent patches to document the > various widths of the register subfields. I disagreed with that patch in > that it transformed illegal values into some random potentially harmful > valid value (and not transforming an invalid value would end up writing > bits in other subfields, so equally bad). Most of our register usage is > with static values so unlikely to be of concern, but for those we can > document the register subfields and provide compile time checking. I've thought about things like this in the past as well, we can even automatically generate all of that from the specs! I had other per-register metadata in mind, like: - is the register is a masked one or not (and then warn if we're trying to write it wihtout the _MASKED_*() macros) - is the register part of the render context (and then warn if we try to write it with a MMIO write). #define FOO 0x1234 #define FOO_IS_MASKED 1 #define FOO_IS_IN_RENDER_CTX1 should be enough for the compile-time warns like you do here. -- Damien ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [alsa-devel] DP MST audio support
Just to fill in the info on this one. > We don't have Haswell Lenovo t440s atm, so could you share more info? > - Dell U2410 should support both HDMI and DP input. But I guess it cannot > support DP MST, right? > - Are you connecting this monitor a DP cable? > Which DDI port is used? DDI B, C or D? > - Does audio fail after i915 enables DP MST? > - Is patch "snd/hdmi: hack out haswell codec workaround" the only change > on audio driver side? Yes its a DP SST input on the U2410, and I'm using that for the audio. It's connected to the Lenovo dock with DP cable. The dock is an MST device. The dock is connected to DDI C I think, and if the dock is operated in SST mode audio works, but in MST mode audio fails. (operating the dock in SST mode isn't useful though since only one of the multiple outputs works then). >> > The graphics side patches are fairly trivial, also it would be good to >> > get a good explaination of how the hw works, >> > >> > from what I can see devices get connections not pins on this hw, and I >> > notice that I don't always get 3 devices, so I'm not sure if devices >> > are a dynamic thing we should be reprobing on some signal. > > Do you mean 3 PCM devices here, like pcmC0D3p, pcmC0D7p, pcmC0D8p? > Now the devices are not dynamic, a PCM device is created on each pin. > It seems we need to revise this for DP MST, since a pin can be used to send > up to 3 independent streams on Intel GPU which has 3 display pipelines. > No I mean the "Devices" from snd_hda_get_devices. Dave. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 3/4] snd: add support for displayport multi-stream to hda codec.
On 19 June 2015 at 19:54, Lin, Mengdong wrote: > Hi Takashi/Dave, > > Shall we move or cc this discussion on audio driver side to ALSA ML? Oops I thought I had cc'ed these patches to alsa-devel as well when I sent them. > I think we also need to decide how to manage PCM devices for DP MST. > Now the HD-A driver create a PCM device for each pin, and the substream > number is 1 for each PCM. Now with DP MST enabled, each pin can support > multiple streams (e.g. 3 on Intel HSW/BDW/SKL). > > There may be 2 options: > -#1: Let an HDMI codec specify number of substreams, same as the number > of device entries on a pin. We can specify 3 for HSW/BDW/SKL. Other > vendors can also specify a value according to actual HW capabilities. > > So for HSW, we have 3x3 subtreams totally. But we only have 3 convertors > (for 3 display pipelines), so we can open up to 3 substreams at the same > time. When the audio driver finds all 3 convertors are used when opening > a new substream, it will fail. One thing I noticed is the number of devices on a PIN is only updated when the MST device is plugged in so normally pins 5,6,7 have 0 devices, and when I plug in MST device, I get the 3 devices on port 6. So it seems dynamic enough at this point, though I guess it'll always be 0 or 3. > > - #2: Create PCM device dynamically. Only create a PCM devices for a device > entry which connects to monitor with audio support. When the monitor > is removed, the PCM device will be disconnected, closed and removed, > similar to the USB case. > > This will change ALSA core. But there will be less PCM devices and > substreams, since the number of connected monitors Is decided by the > actual GPU display pipeline. I like this option more, since I think it should be more like USB, but I've no idea how much work it would be from the alsa side, this patch was probably as deep into alsa as I've gone. Dave. >> Signed-off-by: Dave Airlie >> --- >> include/sound/hda_verbs.h | 2 + >> sound/pci/hda/hda_codec.c | 1 + >> sound/pci/hda/hda_proc.c | 5 +- >> sound/pci/hda/patch_hdmi.c | 181 >> +++-- >> 4 files changed, 131 insertions(+), 58 deletions(-) >> >> diff --git a/include/sound/hda_verbs.h b/include/sound/hda_verbs.h index >> d0509db..3b62ac5 100644 >> --- a/include/sound/hda_verbs.h >> +++ b/include/sound/hda_verbs.h >> @@ -75,6 +75,7 @@ enum { >> #define AC_VERB_GET_HDMI_CHAN_SLOT 0x0f34 >> #define AC_VERB_GET_DEVICE_SEL 0xf35 >> #define AC_VERB_GET_DEVICE_LIST 0xf36 >> +#define AC_VERB_GET_DP_STREAM_ID 0xf3c >> >> /* >> * SET verbs >> @@ -115,6 +116,7 @@ enum { >> #define AC_VERB_SET_HDMI_CP_CTRL 0x733 >> #define AC_VERB_SET_HDMI_CHAN_SLOT 0x734 >> #define AC_VERB_SET_DEVICE_SEL 0x735 >> +#define AC_VERB_SET_DP_STREAM_ID 0x73C >> >> /* >> * Parameter IDs >> diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c index >> 5645481..3981c63 100644 >> --- a/sound/pci/hda/hda_codec.c >> +++ b/sound/pci/hda/hda_codec.c >> @@ -482,6 +482,7 @@ int snd_hda_get_devices(struct hda_codec *codec, >> hda_nid_t nid, >> } >> return devices; >> } >> +EXPORT_SYMBOL_GPL(snd_hda_get_devices); >> >> /* >> * destructor >> diff --git a/sound/pci/hda/hda_proc.c b/sound/pci/hda/hda_proc.c index >> baaf7ed0..39fac53 100644 >> --- a/sound/pci/hda/hda_proc.c >> +++ b/sound/pci/hda/hda_proc.c >> @@ -644,10 +644,13 @@ static void print_device_list(struct snd_info_buffer >> *buffer, >> int i, curr = -1; >> u8 dev_list[AC_MAX_DEV_LIST_LEN]; >> int devlist_len; >> + int dp_s_id; >> >> + dp_s_id = snd_hda_codec_read(codec, nid, 0, >> + AC_VERB_GET_DP_STREAM_ID, 0); >> devlist_len = snd_hda_get_devices(codec, nid, dev_list, >> AC_MAX_DEV_LIST_LEN); >> - snd_iprintf(buffer, " Devices: %d\n", devlist_len); >> + snd_iprintf(buffer, " Devices: %d: 0x%x\n", devlist_len, dp_s_id); >> if (devlist_len <= 0) >> return; >> >> diff --git a/sound/pci/hda/patch_hdmi.c b/sound/pci/hda/patch_hdmi.c index >> 5f44f60..8272656 100644 >> --- a/sound/pci/hda/patch_hdmi.c >> +++ b/sound/pci/hda/patch_hdmi.c >> @@ -68,6 +68,17 @@ struct hdmi_spec_per_cvt { >> /* max. connections to a widget */ >> #define HDA_MAX_CONNECTIONS 32 >> >> +struct hdmi_spec_per_pin; >> +#define HDA_MAX_DEVICES 3 >> +struct hdmi_spec_per_device { >> + struct hdmi_spec_per_pin *pin; >> + int device_idx; >> + struct hdmi_eld sink_eld; >> +#ifdef CONFIG_PROC_FS >> + struct snd_info_entry *proc_entry; >> +#endif >> +}; >> + >> struct hdmi_spec_per_pin { >> hda_nid_t pin_nid; >> int num_mux_nids; >> @@ -76,7 +87,11 @@ struct hdmi_spec_per_pin { >> hda_nid_t cvt_nid; >> >> struct hda_codec *codec; >> - struct hdmi_eld sink_eld; >> + >> + int
Re: [Intel-gfx] [PATCH v5 1/6] drm/i915/gen8: Add infrastructure to initialize WA batch buffers
On Fri, Jun 19, 2015 at 10:48:24AM +0100, Siluvery, Arun wrote: > variable names were getting too long and caused difficulties in > indentation so tried to shorten them, will change this part. It's a trade off. I was thinking we wouldn't need to use the full form that often so the extra characters wouldn't be too much of an issue, but having the names be consistent is most valuable. You can always play around with temporary variables, using short structs, to make it readable in places if it gets too ugly. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 3/4] snd: add support for displayport multi-stream to hda codec.
Hi Takashi/Dave, Shall we move or cc this discussion on audio driver side to ALSA ML? I think we also need to decide how to manage PCM devices for DP MST. Now the HD-A driver create a PCM device for each pin, and the substream number is 1 for each PCM. Now with DP MST enabled, each pin can support multiple streams (e.g. 3 on Intel HSW/BDW/SKL). There may be 2 options: -#1: Let an HDMI codec specify number of substreams, same as the number of device entries on a pin. We can specify 3 for HSW/BDW/SKL. Other vendors can also specify a value according to actual HW capabilities. So for HSW, we have 3x3 subtreams totally. But we only have 3 convertors (for 3 display pipelines), so we can open up to 3 substreams at the same time. When the audio driver finds all 3 convertors are used when opening a new substream, it will fail. - #2: Create PCM device dynamically. Only create a PCM devices for a device entry which connects to monitor with audio support. When the monitor is removed, the PCM device will be disconnected, closed and removed, similar to the USB case. This will change ALSA core. But there will be less PCM devices and substreams, since the number of connected monitors Is decided by the actual GPU display pipeline. Could you share your preference on these 2 options or other suggestions? Thanks Mengdong > -Original Message- > From: Intel-gfx [mailto:intel-gfx-boun...@lists.freedesktop.org] On Behalf Of > Dave Airlie > Sent: Wednesday, June 17, 2015 12:02 PM > To: intel-gfx@lists.freedesktop.org > Subject: [Intel-gfx] [PATCH 3/4] snd: add support for displayport multi-stream > to hda codec. > > From: Dave Airlie > > Add new verbs GET_DP_STREAM_ID and SET_DP_STREAM_ID from Intel docs. > get stream id and print in proc > split ELD to be per device not per pin > handle pd/eldv per device not per pin > setup codec->dp_mst earlier. > > Signed-off-by: Dave Airlie > --- > include/sound/hda_verbs.h | 2 + > sound/pci/hda/hda_codec.c | 1 + > sound/pci/hda/hda_proc.c | 5 +- > sound/pci/hda/patch_hdmi.c | 181 > +++-- > 4 files changed, 131 insertions(+), 58 deletions(-) > > diff --git a/include/sound/hda_verbs.h b/include/sound/hda_verbs.h index > d0509db..3b62ac5 100644 > --- a/include/sound/hda_verbs.h > +++ b/include/sound/hda_verbs.h > @@ -75,6 +75,7 @@ enum { > #define AC_VERB_GET_HDMI_CHAN_SLOT 0x0f34 > #define AC_VERB_GET_DEVICE_SEL 0xf35 > #define AC_VERB_GET_DEVICE_LIST 0xf36 > +#define AC_VERB_GET_DP_STREAM_ID 0xf3c > > /* > * SET verbs > @@ -115,6 +116,7 @@ enum { > #define AC_VERB_SET_HDMI_CP_CTRL 0x733 > #define AC_VERB_SET_HDMI_CHAN_SLOT 0x734 > #define AC_VERB_SET_DEVICE_SEL 0x735 > +#define AC_VERB_SET_DP_STREAM_ID 0x73C > > /* > * Parameter IDs > diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c index > 5645481..3981c63 100644 > --- a/sound/pci/hda/hda_codec.c > +++ b/sound/pci/hda/hda_codec.c > @@ -482,6 +482,7 @@ int snd_hda_get_devices(struct hda_codec *codec, > hda_nid_t nid, > } > return devices; > } > +EXPORT_SYMBOL_GPL(snd_hda_get_devices); > > /* > * destructor > diff --git a/sound/pci/hda/hda_proc.c b/sound/pci/hda/hda_proc.c index > baaf7ed0..39fac53 100644 > --- a/sound/pci/hda/hda_proc.c > +++ b/sound/pci/hda/hda_proc.c > @@ -644,10 +644,13 @@ static void print_device_list(struct snd_info_buffer > *buffer, > int i, curr = -1; > u8 dev_list[AC_MAX_DEV_LIST_LEN]; > int devlist_len; > + int dp_s_id; > > + dp_s_id = snd_hda_codec_read(codec, nid, 0, > + AC_VERB_GET_DP_STREAM_ID, 0); > devlist_len = snd_hda_get_devices(codec, nid, dev_list, > AC_MAX_DEV_LIST_LEN); > - snd_iprintf(buffer, " Devices: %d\n", devlist_len); > + snd_iprintf(buffer, " Devices: %d: 0x%x\n", devlist_len, dp_s_id); > if (devlist_len <= 0) > return; > > diff --git a/sound/pci/hda/patch_hdmi.c b/sound/pci/hda/patch_hdmi.c index > 5f44f60..8272656 100644 > --- a/sound/pci/hda/patch_hdmi.c > +++ b/sound/pci/hda/patch_hdmi.c > @@ -68,6 +68,17 @@ struct hdmi_spec_per_cvt { > /* max. connections to a widget */ > #define HDA_MAX_CONNECTIONS 32 > > +struct hdmi_spec_per_pin; > +#define HDA_MAX_DEVICES 3 > +struct hdmi_spec_per_device { > + struct hdmi_spec_per_pin *pin; > + int device_idx; > + struct hdmi_eld sink_eld; > +#ifdef CONFIG_PROC_FS > + struct snd_info_entry *proc_entry; > +#endif > +}; > + > struct hdmi_spec_per_pin { > hda_nid_t pin_nid; > int num_mux_nids; > @@ -76,7 +87,11 @@ struct hdmi_spec_per_pin { > hda_nid_t cvt_nid; > > struct hda_codec *codec; > - struct hdmi_eld sink_eld; > + > + int num_devices; > + u8 dev_list[AC_MAX_DEV_LIST_LEN]; > + struct hdmi_spec_per_device devices[HD
[Intel-gfx] [RFC] drm/i915: Introduce documentation for register subfields
An interesting point was raised in some recent patches to document the various widths of the register subfields. I disagreed with that patch in that it transformed illegal values into some random potentially harmful valid value (and not transforming an invalid value would end up writing bits in other subfields, so equally bad). Most of our register usage is with static values so unlikely to be of concern, but for those we can document the register subfields and provide compile time checking. The resulting error message is fairly impententrable though: n file included from include/uapi/linux/stddef.h:1:0, from include/linux/stddef.h:4, from ./include/uapi/linux/posix_types.h:4, from include/uapi/linux/types.h:13, from include/linux/types.h:5, from include/linux/mod_devicetable.h:11, from include/linux/i2c.h:29, from drivers/gpu/drm/i915/intel_dp.c:28: In function ‘intel_dp_autotest_edid’, inlined from ‘intel_dp_handle_test_request’ at drivers/gpu/drm/i915/intel_dp.c:4230:14, inlined from ‘intel_dp_detect’ at drivers/gpu/drm/i915/intel_dp.c:4635:4: include/linux/compiler.h:429:38: error: call to ‘__compiletime_assert_4173’ declared with attribute error: BUILD_BUG_ON failed: (5) & -(1<<2) _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__) ^ include/linux/compiler.h:412:4: note: in definition of macro ‘__compiletime_assert’ prefix ## suffix();\ ^ include/linux/compiler.h:429:2: note: in expansion of macro ‘_compiletime_assert’ _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__) ^ include/linux/bug.h:50:37: note: in expansion of macro ‘compiletime_assert’ #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) ^ include/linux/bug.h:74:2: note: in expansion of macro ‘BUILD_BUG_ON_MSG’ BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition) ^ drivers/gpu/drm/i915/intel_dp.c:45:68: note: in expansion of macro ‘BUILD_BUG_ON’ #define __FIELD__(x, shift, width) ({ if (__builtin_constant_p(x)) BUILD_BUG_ON((x) & -(1
Re: [Intel-gfx] [PATCH v5 1/6] drm/i915/gen8: Add infrastructure to initialize WA batch buffers
On 19/06/2015 10:27, Chris Wilson wrote: On Thu, Jun 18, 2015 at 06:33:24PM +0100, Arun Siluvery wrote: Totally minor worries now. +/** + * gen8_init_indirectctx_bb() - initialize indirect ctx batch with WA + * + * @ring: only applicable for RCS + * @wa_ctx_batch: page in which WA are loaded + * @offset: This is for future use in case if we would like to have multiple + * batches at different offsets and select them based on a criteria. + * @num_dwords: The number of WA applied are known at the beginning, it returns + * the no of DWORDS written. This batch does not contain MI_BATCH_BUFFER_END + * so it adds padding to make it cacheline aligned. MI_BATCH_BUFFER_END will be + * added to perctx batch and both of them together makes a complete batch buffer. + * + * Return: non-zero if we exceed the PAGE_SIZE limit. + */ + +static int gen8_init_indirectctx_bb(struct intel_engine_cs *ring, + uint32_t **wa_ctx_batch, + uint32_t offset, + uint32_t *num_dwords) +{ + uint32_t index; + uint32_t *batch = *wa_ctx_batch; + + index = offset; I worry that offset need not be cacheline aligned on entry (for example if indirectctx and perctx were switched, or someone else stuffed more controls into the per-ring object). Like perctx, there is no mention of any alignment worries for the starting location, but here you tell use that the INDIRECT_CTX length is specified in cacheline, so I also presume the start needs to be aligned. offset need to be cachealigned, I will update the comments. +static int lrc_setup_wa_ctx_obj(struct intel_engine_cs *ring, u32 size) +{ + int ret; + + WARN_ON(ring->id != RCS); + + ring->wa_ctx.obj = i915_gem_alloc_object(ring->dev, PAGE_ALIGN(size)); + if (!ring->wa_ctx.obj) { + DRM_DEBUG_DRIVER("alloc LRC WA ctx backing obj failed.\n"); + return -ENOMEM; + } + + ret = i915_gem_obj_ggtt_pin(ring->wa_ctx.obj, PAGE_SIZE, 0); One day _pin() will return the vma being pinned and I will rejoice as it makes reviewing pinning much easier! Not a problem for you right now. +static int intel_init_workaround_bb(struct intel_engine_cs *ring) +{ + int ret = 0; + uint32_t *batch; + uint32_t num_dwords; + struct page *page; + struct i915_ctx_workarounds *wa_ctx = &ring->wa_ctx; + + WARN_ON(ring->id != RCS); + + if (ring->scratch.obj == NULL) { + DRM_ERROR("scratch page not allocated for %s\n", ring->name); + return -EINVAL; + } I haven't found the dependence upon scratch.obj, could you explain it? Does it appear later? yes it does in patch 2 which rearranges init_pipe_control(). I will move this check to that patch as per your comment. @@ -1754,15 +1934,26 @@ populate_lr_context(struct intel_context *ctx, struct drm_i915_gem_object *ctx_o reg_state[CTX_SECOND_BB_STATE] = ring->mmio_base + 0x118; reg_state[CTX_SECOND_BB_STATE+1] = 0; if (ring->id == RCS) { - /* TODO: according to BSpec, the register state context -* for CHV does not have these. OTOH, these registers do -* exist in CHV. I'm waiting for a clarification */ reg_state[CTX_BB_PER_CTX_PTR] = ring->mmio_base + 0x1c0; reg_state[CTX_BB_PER_CTX_PTR+1] = 0; reg_state[CTX_RCS_INDIRECT_CTX] = ring->mmio_base + 0x1c4; reg_state[CTX_RCS_INDIRECT_CTX+1] = 0; reg_state[CTX_RCS_INDIRECT_CTX_OFFSET] = ring->mmio_base + 0x1c8; reg_state[CTX_RCS_INDIRECT_CTX_OFFSET+1] = 0; + if (ring->wa_ctx.obj) { + reg_state[CTX_RCS_INDIRECT_CTX+1] = + (i915_gem_obj_ggtt_offset(ring->wa_ctx.obj) + +ring->wa_ctx.indctx_batch_offset * sizeof(uint32_t)) | + (ring->wa_ctx.indctx_batch_size / CACHELINE_DWORDS); Ok, this really does impose alignment conditions on indctx_batch_offset Oh, if I do get a chance to complain, spell out indirect_ctx, make it a struct or even just precalculate the reg value, just indctx's only value is that is the same length as perctx, but otherwise quite obtuse. variable names were getting too long and caused difficulties in indentation so tried to shorten them, will change this part. regards Arun Other than that, I couldn't punch any holes in its robustness, and the series is starting to look quite good and very neat. -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 07/15] drm/i915: Defer default hardware context initialisation until first open
On 16/06/15 10:35, Chris Wilson wrote: > On Mon, Jun 15, 2015 at 07:36:25PM +0100, Dave Gordon wrote: >> +static int i915_gem_context_first_open(struct drm_device *dev) >> +{ >> +struct drm_i915_private *dev_priv = dev->dev_private; >> +int ret; >> + >> +/* >> + * We can't enable contexts until all firmware is loaded. This >> + * call shouldn't return -EAGAIN because we pass wait=true, but >> + * it can still fail with code -EIO if the GuC doesn't respond, >> + * or -ENOEXEC if the GuC firmware image is invalid. >> + */ >> +ret = intel_guc_ucode_load(dev, true); >> +WARN_ON(ret == -EAGAIN); >> + >> +/* >> + * If an error occurred and GuC submission has been requested, we can >> + * attempt recovery by disabling GuC submission and reinitialising >> + * the GPU and driver. We then fail this open() anyway, but the next >> + * attempt will find that GuC submission is already disabled, and so >> + * proceed to complete context initialisation in non-GuC mode instead. >> + */ >> +if (ret && i915.enable_guc_submission) { >> +i915_handle_guc_error(dev, ret); >> +return ret; >> +} > > This is still backwards. What we wanted was for the submission process > to start up normally and then once the GuC loading succeeds, we then > start submitting the backlog to the GuC. If the loading fails, we can > then submit the backlog via execlists. It may be interesting to even > start userspace before GuC finishes loading. Absolutely. The latter is what this allows :) (And its a requirement for Android, as on those platforms the f/w won't become available until userspace is running). But we're not going to keep stuff queued up in the rings. It would add more complexity to manage the backlog and remember that we can accept calls to add_request but not actually submit them. Also there would be issue with any code that sent commands to the engines to program registers via LRIs and then EITHER assumed that they had taken effect OR waited for completion (because that would block indefinitely). Instead, we've split hw_init() into early (MMIO) and late (context and batch) phases, and deferred all of the latter iff we need GuC f/w to be loaded before batch submission. When the late phase runs, it can submit batches, and wait for completion if required, without blocking the entire system as it would if called from driver_load(). It might even make sense as an overall strategy to defer MORE of the driver initialisation process, so that the critical single-threaded driver load during system startup does as little as possible. > So this makes more sense as to why you have the tight integration with > execlists then. I still don't think that justifies changing gen8 without > reason. > -Chris It's tightly integrated because GuC submission *IS* execlist submission, only with the GuC doing the actual poking of the ELSP and fielding the resulting context-switch interrupts. Deferred initialisation doesn't apply to Gen8, or anything else that doesn't have and use GuC submission. The delayed-context-initialisation codepath is only reachable if there is GuC firmware to be loaded. .Dave. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC PATCH 2/7] drm/i915: Read hw state into an atomic state struct, try 2.
Hi, On 19 June 2015 at 09:02, Maarten Lankhorst wrote: > + if (crtc->state->enable) { > + intel_mode_from_pipe_config(&crtc->mode, > + to_intel_crtc_state(crtc->state)); > + > intel_mode_from_pipe_config(&crtc->state->adjusted_mode, > + to_intel_crtc_state(crtc->state)); > + > + if (drm_atomic_set_mode_for_crtc(crtc->state, > &crtc->mode) < 0) > + drm_mode_copy(&crtc->state->mode, > &crtc->mode); I've never really understood why this is here. There are only two ways in which set_mode_for_crtc can fail: either the mode fails validation (fixed in the patch I sent you the other day, which it seems like you've now pulled in), or -ENOMEM. The first one we shouldn't really be papering over - and I haven't seen it happen since that patch - and the second one isn't really recoverable anyway. Plus I would argue that breaking the state like that is pretty bad. Cheers, Daniel ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v5 6/6] drm/i915/gen8: Add WaRsRestoreWithPerCtxtBb workaround
On Thu, Jun 18, 2015 at 06:33:29PM +0100, Arun Siluvery wrote: > In Per context w/a batch buffer, > WaRsRestoreWithPerCtxtBb > > v2: This patches modifies definitions of MI_LOAD_REGISTER_MEM and > MI_LOAD_REGISTER_REG; Add GEN8 specific defines for these instructions > so as to not break any future users of existing definitions (Michel) > > v3: Length defined in current definitions of LRM, LRR instructions was > specified > as 0. It seems it is common convention for instructions whose length vary > between > platforms. This is not an issue so far because they are not used anywhere > except > command parser; now that we use in this patch update them with correct length > and also move them out of command parser placeholder to appropriate place. > remove unnecessary padding and follow the WA programming sequence exactly > as mentioned in spec which is essential for this WA (Dave). Similarly with the indirect ctx from patch 5, having the documentation that we need scratch_obj in the preamble for these workarounds I think is worth its weight in the extra typing. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v5 5/6] drm/i915/gen8: Add WaClearSlmSpaceAtContextSwitch workaround
On Thu, Jun 18, 2015 at 06:33:28PM +0100, Arun Siluvery wrote: > In Indirect context w/a batch buffer, > WaClearSlmSpaceAtContextSwitch > > v2: s/PIPE_CONTROL_FLUSH_RO_CACHES/PIPE_CONTROL_FLUSH_L3 (Ville) Ok, so this is the first use of scratch object. I would like to move the test for existence from patch 1 to here, and into indirectctx_bb so that we know why we need the obj. It's just trying to keep the patches and code as self-contained and as descriptive as possible (including their requirements). -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v5 1/6] drm/i915/gen8: Add infrastructure to initialize WA batch buffers
On Thu, Jun 18, 2015 at 06:33:24PM +0100, Arun Siluvery wrote: Totally minor worries now. > +/** > + * gen8_init_indirectctx_bb() - initialize indirect ctx batch with WA > + * > + * @ring: only applicable for RCS > + * @wa_ctx_batch: page in which WA are loaded > + * @offset: This is for future use in case if we would like to have multiple > + * batches at different offsets and select them based on a criteria. > + * @num_dwords: The number of WA applied are known at the beginning, it > returns > + * the no of DWORDS written. This batch does not contain MI_BATCH_BUFFER_END > + * so it adds padding to make it cacheline aligned. MI_BATCH_BUFFER_END will > be > + * added to perctx batch and both of them together makes a complete batch > buffer. > + * > + * Return: non-zero if we exceed the PAGE_SIZE limit. > + */ > + > +static int gen8_init_indirectctx_bb(struct intel_engine_cs *ring, > + uint32_t **wa_ctx_batch, > + uint32_t offset, > + uint32_t *num_dwords) > +{ > + uint32_t index; > + uint32_t *batch = *wa_ctx_batch; > + > + index = offset; I worry that offset need not be cacheline aligned on entry (for example if indirectctx and perctx were switched, or someone else stuffed more controls into the per-ring object). Like perctx, there is no mention of any alignment worries for the starting location, but here you tell use that the INDIRECT_CTX length is specified in cacheline, so I also presume the start needs to be aligned. > +static int lrc_setup_wa_ctx_obj(struct intel_engine_cs *ring, u32 size) > +{ > + int ret; > + > + WARN_ON(ring->id != RCS); > + > + ring->wa_ctx.obj = i915_gem_alloc_object(ring->dev, PAGE_ALIGN(size)); > + if (!ring->wa_ctx.obj) { > + DRM_DEBUG_DRIVER("alloc LRC WA ctx backing obj failed.\n"); > + return -ENOMEM; > + } > + > + ret = i915_gem_obj_ggtt_pin(ring->wa_ctx.obj, PAGE_SIZE, 0); One day _pin() will return the vma being pinned and I will rejoice as it makes reviewing pinning much easier! Not a problem for you right now. > +static int intel_init_workaround_bb(struct intel_engine_cs *ring) > +{ > + int ret = 0; > + uint32_t *batch; > + uint32_t num_dwords; > + struct page *page; > + struct i915_ctx_workarounds *wa_ctx = &ring->wa_ctx; > + > + WARN_ON(ring->id != RCS); > + > + if (ring->scratch.obj == NULL) { > + DRM_ERROR("scratch page not allocated for %s\n", ring->name); > + return -EINVAL; > + } I haven't found the dependence upon scratch.obj, could you explain it? Does it appear later? > @@ -1754,15 +1934,26 @@ populate_lr_context(struct intel_context *ctx, struct > drm_i915_gem_object *ctx_o > reg_state[CTX_SECOND_BB_STATE] = ring->mmio_base + 0x118; > reg_state[CTX_SECOND_BB_STATE+1] = 0; > if (ring->id == RCS) { > - /* TODO: according to BSpec, the register state context > - * for CHV does not have these. OTOH, these registers do > - * exist in CHV. I'm waiting for a clarification */ > reg_state[CTX_BB_PER_CTX_PTR] = ring->mmio_base + 0x1c0; > reg_state[CTX_BB_PER_CTX_PTR+1] = 0; > reg_state[CTX_RCS_INDIRECT_CTX] = ring->mmio_base + 0x1c4; > reg_state[CTX_RCS_INDIRECT_CTX+1] = 0; > reg_state[CTX_RCS_INDIRECT_CTX_OFFSET] = ring->mmio_base + > 0x1c8; > reg_state[CTX_RCS_INDIRECT_CTX_OFFSET+1] = 0; > + if (ring->wa_ctx.obj) { > + reg_state[CTX_RCS_INDIRECT_CTX+1] = > + (i915_gem_obj_ggtt_offset(ring->wa_ctx.obj) + > + ring->wa_ctx.indctx_batch_offset * > sizeof(uint32_t)) | > + (ring->wa_ctx.indctx_batch_size / > CACHELINE_DWORDS); Ok, this really does impose alignment conditions on indctx_batch_offset Oh, if I do get a chance to complain, spell out indirect_ctx, make it a struct or even just precalculate the reg value, just indctx's only value is that is the same length as perctx, but otherwise quite obtuse. Other than that, I couldn't punch any holes in its robustness, and the series is starting to look quite good and very neat. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 07/15] drm/i915: Defer default hardware context initialisation until first open
On 17/06/15 13:18, Daniel Vetter wrote: > On Mon, Jun 15, 2015 at 07:36:25PM +0100, Dave Gordon wrote: >> In order to fully initialise the default contexts, we have to execute >> batchbuffer commands on the GPU engines. But in the case of GuC-based >> batch submission, we can't do that until any required firmware has >> been loaded, which may not be possible during driver load, because the >> filesystem(s) containing the firmware may not be mounted until later. >> >> Therefore, we now allow the first call to the firmware-loading code to >> return -EAGAIN to indicate that it's not yet ready, and that it should >> be retried when the device is first opened from user code, by which >> time we expect that all required filesystems will have been mounted. >> The late-retry code will then re-attempt to load the firmware if the >> early attempt failed. >> >> If the late retry fails, the current open-in-progress will fail, but >> the recovery code will disable GuC submission and reset the GPU and >> driver. The next open will therefore be in non-GuC mode, and will be >> allowed to complete even if the GuC cannot be loaded or used. >> >> Issue: VIZ-4884 >> Signed-off-by: Dave Gordon >> Signed-off-by: Alex Dai > > I'm not really sold on this super-flexible fallback scheme implemented > here. Because such fallback schemes means more code to test (which no on > will do likely) or just even bigger fireworks when we actually hit them in > reality when something goes wrong. Imo if anything goes wrong in the setup > we just throw in the towel and fail the driver loading. Firstly, GuC submission is an OPTION. That means we already have code to work with or without a GuC. The fallback just allows us to keep going after finding that although GuC submission has been requested, and we do have a GuC, nonetheless the request cannot be satisfied. That's no different from automatically disabling PPGTT or execlist mode if they're requested on platforms where we don't support them. > There's only one exception: If something fails with GT init we declare the > gpu wedged but proceed with all the modeset setup. This makes sense > because we need all the code to handle a wedge gpu anyway, dead-on-boot > gpus happen occasionally and it's really not nice to greet the user with a > black screen. But more fallbacks are imo just headache. > > Hence when the guc fails we imo really shouldn't bother with fallbacks, > but instead just declare the thing wedged and carry on. So the strategy here is exactly the same as for GT init; declare the GPU wedged, but after disabling GuC mode. The recovery will then get us into the same state as if there were no GuC, or GuC mode had not been selected in the first place. We can't switch between GuC and execlists arbitrarily; the only switchover is from GuC to non-GuC, and it can only happen ONCE. To test this is easy; just rename your firmware blob so the driver can't find it and reboot. It should automatically run in execlist mode, with a log message telling you what went wrong (f/w file not found). Much nicer than your screen staying blank because you upgraded the driver and not the firmware, or vice versa. > That should also allow us to simplify the firmware loading: We can do that > in an async worker and if the blob isn't there in time then we just move > on. > -Daniel Under no circumstances can you ever load the firmware from an async worker thread, because Bad Things Will Happen if there is hardware activity already in progress when the GuC f/w starts up. .Dave. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 01/15] drm/i915: Add i915_gem_object_write() to i915_gem.c
On Thu, Jun 18, 2015 at 07:07:46PM +0100, Dave Gordon wrote: > On 18/06/15 13:10, Chris Wilson wrote: > > On Thu, Jun 18, 2015 at 12:49:55PM +0100, Dave Gordon wrote: > >> On 17/06/15 13:02, Daniel Vetter wrote: > >>> Domain handling is required for all gem objects, and the resulting bugs if > >>> you don't for one-off objects are absolutely no fun to track down. > >> > >> Is it not the case that the new object returned by > >> i915_gem_alloc_object() is > >> (a) of a type that can be mapped into the GTT, and > >> (b) initially in the CPU domain for both reading and writing? > >> > >> So AFAICS the allocate-and-fill function I'm describing (to appear in > >> next patch series respin) doesn't need any further domain handling. > > > > A i915_gem_object_create_from_data() is a reasonable addition, and I > > suspect it will make the code a bit more succinct. > > I shall adopt this name for it :) > > > Whilst your statement is true today, calling set_domain is then a no-op, > > and helps document how you use the object and so reduces the likelihood > > of us introducing bugs in the future. > > -Chris > > So here's the new function ... where should the set-to-cpu-domain go? > After the pin_pages and before the sg_copy_from_buffer? Either, since the domain will not change whilst you have the lock, but if you do it before get_pages() you will have a slightly easier error path. Part of the reason why I want a function like this is so that I can replace it with a stolen object and so need to write the data through a temporary GGTT mapping. Speak now if you need more flags to the function to prevent certain classes of objects being created. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 02/15] drm/i915: Embedded microcontroller (uC) firmware loading support
On 18/06/15 15:49, Daniel Vetter wrote: > On Thu, Jun 18, 2015 at 01:11:34PM +0100, Dave Gordon wrote: >> On 17/06/15 13:05, Daniel Vetter wrote: >>> On Mon, Jun 15, 2015 at 07:36:20PM +0100, Dave Gordon wrote: Current devices may contain one or more programmable microcontrollers that need to have a firmware image (aka "binary blob") loaded from an external medium and transferred to the device's memory. This file provides generic support functions for doing this; they can then be used by each uC-specific loader, thus reducing code duplication and testing effort. Signed-off-by: Dave Gordon Signed-off-by: Alex Dai >>> >>> Given that I'm just shredding the synchronization used by the dmc loader >>> I'm not convinced this is a good idea. Abstraction has cost, and a bit of >>> copy-paste for similar sounding but slightly different things doesn't >>> sound awful to me. And the critical bit in all the firmware loading I've >>> seen thus far is in synchronizing the loading with other operations, >>> hiding that isn't a good idea. Worse if we enforce stuff like requiring >>> dev->struct_mutex. >>> -Daniel >> >> It's precisely because it's in some sense "trivial-but-tricky" that we >> should write it once, get it right, and use it everywhere. Copypaste >> /does/ sound awful; I've seen how the code this was derived from had >> already been cloned into three flavours, all different and all wrong. >> >> It's a very simple abstraction: one early call to kick things off as >> early as possible, no locking required. One late call with the >> struct_mutex held to complete the synchronisation and actually do the >> work, thus guaranteeing that the transfer to the target uC is done in a >> controlled fashion, at a time of the caller's choice, and by the >> driver's mainline thread, NOT by an asynchronous thread racing with >> other activity (which was one of the things wrong with the original >> version). > > Yeah I've seen the origins of this in the display code, and that code gets > the syncing wrong. The only thing that one has do to is grab a runtime pm > reference for the appropriate power well to prevent dc5 entry, and release > it when the firmware is loaded and initialized. Agreed. > Which means any kind of firmware loader which requires/uses > dev->struct_mutex get stuff wrong and is not appropriate everywhere. BUT, the loading of the firmware into any uC MUST be done in a controlled manner i.e. at a time when no other thread is touching the h/w. Otherwise the f/w load and whatever else is concurrently accessing the h/w could in some cases interfere disastrously. Examples of interference might be: * interleaved accesses to the ELSP (in the case of the GuC) * incorrect handover of power management (DMC, GuC) * erroneous management of forcewake state In general the f/w that is just starting on the uC may have certain expectations about the initial state of the h/w, which may not be met if other threads are accessing various bits of h/w while the uC is booting up. So we absolutely need to guarantee that the f/w load is done by a thread which has exclusive ownership of any bit of the h/w that the f/w is going to make assumptions about. With the current locking structure of the driver, that means holding the struct_mutex (it shouldn't really, there should be a separate mutex for h/w register access vs. driver-private data structures, but there isn't). >> We should convert the DMC loader to use this too, so there need be only >> one bit of code in the whole driver that needs to understand how to use >> completions to get correct handover from a free-running no-locks-held >> thread to the properly disciplined environment of driver mainline for >> purposes of programming the h/w. > > Nack on using this for dmc, since I want them to convert it to the above > synchronization, since that's how all the other async power initialization > is done. > > Guc is different since we really must have it ready for execbuf, and for > that usecase a completion at drm_open time sounds like the right thing. > > As a rule of thumb for refactoring and share infastructure we use the > following recipe in drm: > - first driver implements things as straightforward as possible > - 2nd user copypastes > - 3rd one has the duty to figure out whether some refactoring is in order > or not. > Imo that approach leads a really good balance between avoiding > overengineering and having maintainable code. > -Daniel We've already been through these phases; the code has already been cloned twice (and then changed, but not enough to fix the problems with the original) and then when I found the issues with the GuC loader and noticed the hilarious ownership dance it was doing during handover I realised it was time to fix it in one place rather than several, and posted a patchset to the internal mailing list on 2015-02-24 with this commentary: > The GuC loader uses an asynchronous thread to fetch
[Intel-gfx] [RFC PATCH 2/7] drm/i915: Read hw state into an atomic state struct, try 2.
To make this work we load the new hardware state into the atomic_state, then swap it with the sw state. This lets us change the force restore path in setup_hw_state() to use a single call to intel_mode_set() to restore all the previous state. As a nice bonus this kills off encoder->new_encoder, connector->new_enabled and crtc->new_enabled. They were used only to restore the state after a modeset. Change since try1: - Fix the regressions caused by not updating drm core sw state. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/intel_atomic.c | 2 +- drivers/gpu/drm/i915/intel_display.c | 381 ++- drivers/gpu/drm/i915/intel_drv.h | 14 +- 3 files changed, 243 insertions(+), 154 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_atomic.c b/drivers/gpu/drm/i915/intel_atomic.c index 429677a111d5..4d87574a2032 100644 --- a/drivers/gpu/drm/i915/intel_atomic.c +++ b/drivers/gpu/drm/i915/intel_atomic.c @@ -398,7 +398,7 @@ int intel_atomic_setup_scalers(struct drm_device *dev, return 0; } -static void +void intel_atomic_duplicate_dpll_state(struct drm_i915_private *dev_priv, struct intel_shared_dpll_config *shared_dpll) { diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index b3d8d1a30a04..41193dab58c5 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -10231,7 +10231,7 @@ bool intel_get_load_detect_pipe(struct drm_connector *connector, retry: ret = drm_modeset_lock(&config->connection_mutex, ctx); if (ret) - goto fail_unlock; + goto fail; /* * Algorithm gets a little messy: @@ -10249,10 +10249,10 @@ retry: ret = drm_modeset_lock(&crtc->mutex, ctx); if (ret) - goto fail_unlock; + goto fail; ret = drm_modeset_lock(&crtc->primary->mutex, ctx); if (ret) - goto fail_unlock; + goto fail; old->dpms_mode = connector->dpms; old->load_detect_temp = false; @@ -10271,9 +10271,6 @@ retry: continue; if (possible_crtc->state->enable) continue; - /* This can occur when applying the pipe A quirk on resume. */ - if (to_intel_crtc(possible_crtc)->new_enabled) - continue; crtc = possible_crtc; break; @@ -10284,20 +10281,17 @@ retry: */ if (!crtc) { DRM_DEBUG_KMS("no pipe available for load-detect\n"); - goto fail_unlock; + goto fail; } ret = drm_modeset_lock(&crtc->mutex, ctx); if (ret) - goto fail_unlock; + goto fail; ret = drm_modeset_lock(&crtc->primary->mutex, ctx); if (ret) - goto fail_unlock; - intel_encoder->new_crtc = to_intel_crtc(crtc); - to_intel_connector(connector)->new_encoder = intel_encoder; + goto fail; intel_crtc = to_intel_crtc(crtc); - intel_crtc->new_enabled = true; old->dpms_mode = connector->dpms; old->load_detect_temp = true; old->release_fb = NULL; @@ -10365,9 +10359,7 @@ retry: intel_wait_for_vblank(dev, intel_crtc->pipe); return true; - fail: - intel_crtc->new_enabled = crtc->state->enable; -fail_unlock: +fail: drm_atomic_state_free(state); state = NULL; @@ -10413,10 +10405,6 @@ void intel_release_load_detect_pipe(struct drm_connector *connector, if (IS_ERR(crtc_state)) goto fail; - to_intel_connector(connector)->new_encoder = NULL; - intel_encoder->new_crtc = NULL; - intel_crtc->new_enabled = false; - connector_state->best_encoder = NULL; connector_state->crtc = NULL; @@ -11869,33 +11857,6 @@ static const struct drm_crtc_helper_funcs intel_helper_funcs = { .atomic_check = intel_crtc_atomic_check, }; -/** - * intel_modeset_update_staged_output_state - * - * Updates the staged output configuration state, e.g. after we've read out the - * current hw state. - */ -static void intel_modeset_update_staged_output_state(struct drm_device *dev) -{ - struct intel_crtc *crtc; - struct intel_encoder *encoder; - struct intel_connector *connector; - - for_each_intel_connector(dev, connector) { - connector->new_encoder = - to_intel_encoder(connector->base.encoder); - } - - for_each_intel_encoder(dev, encoder) { - encoder->new_crtc = - to_intel_crtc(encoder->base.crtc); - } - - for_each_intel_crtc(dev, crtc) { - crtc->new_enabled = cr
[Intel-gfx] [RFC PATCH 5/7] drm/i915: Update power domains only on affected crtc's.
Use for_each_crtc_state to only touch affected crtc's. In order to make sure that the initial power is still set correctly we make sure modeset_update_crtc_power_domains is called during the initial modeset. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/intel_display.c | 44 1 file changed, 29 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index de975ef09e23..942d25e7490a 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -5197,30 +5197,35 @@ static unsigned long get_crtc_power_domains(struct drm_crtc *crtc) return mask; } -static void modeset_update_crtc_power_domains(struct drm_atomic_state *state) +static void modeset_update_crtc_power_domains(struct drm_atomic_state *state, + bool power_only) { struct drm_device *dev = state->dev; struct drm_i915_private *dev_priv = dev->dev_private; - unsigned long pipe_domains[I915_MAX_PIPES] = { 0, }; - struct intel_crtc *crtc; + unsigned long pipe_domains[I915_MAX_PIPES] = { 0, }, domains; + struct drm_crtc_state *crtc_state; + struct drm_crtc *crtc; + int i; /* * First get all needed power domains, then put all unneeded, to avoid * any unnecessary toggling of the power wells. */ - for_each_intel_crtc(dev, crtc) { + for_each_crtc_in_state(state, crtc, crtc_state, i) { enum intel_display_power_domain domain; + enum pipe pipe = to_intel_crtc(crtc)->pipe; - if (!crtc->base.state->enable) + if (!crtc->state->active) continue; - pipe_domains[crtc->pipe] = get_crtc_power_domains(&crtc->base); + domains = pipe_domains[pipe] = get_crtc_power_domains(crtc); + domains &= ~to_intel_crtc(crtc)->enabled_power_domains; - for_each_power_domain(domain, pipe_domains[crtc->pipe]) + for_each_power_domain(domain, domains) intel_display_power_get(dev_priv, domain); } - if (dev_priv->display.modeset_commit_cdclk) { + if (!power_only && dev_priv->display.modeset_commit_cdclk) { unsigned int cdclk = to_intel_atomic_state(state)->cdclk; if (cdclk != dev_priv->cdclk_freq && @@ -5228,16 +5233,19 @@ static void modeset_update_crtc_power_domains(struct drm_atomic_state *state) dev_priv->display.modeset_commit_cdclk(state); } - for_each_intel_crtc(dev, crtc) { + for_each_crtc_in_state(state, crtc, crtc_state, i) { + struct intel_crtc *intel_crtc = to_intel_crtc(crtc); + enum pipe pipe = intel_crtc->pipe; enum intel_display_power_domain domain; - for_each_power_domain(domain, crtc->enabled_power_domains) - intel_display_power_put(dev_priv, domain); + domains = intel_crtc->enabled_power_domains; + domains &= ~pipe_domains[pipe]; - crtc->enabled_power_domains = pipe_domains[crtc->pipe]; - } + intel_crtc->enabled_power_domains = pipe_domains[pipe]; - intel_display_set_init_power(dev_priv, false); + for_each_power_domain(domain, domains) + intel_display_power_put(dev_priv, domain); + } } static void intel_update_max_cdclk(struct drm_device *dev) @@ -13115,7 +13123,7 @@ static int __intel_set_mode(struct drm_atomic_state *state) /* The state has been swaped above, so state actually contains the * old state now. */ if (any_ms) - modeset_update_crtc_power_domains(state); + modeset_update_crtc_power_domains(state, false); /* Now enable the clocks, plane, pipe, and connectors that we set up. */ for_each_crtc_in_state(state, crtc, crtc_state, i) { @@ -15606,6 +15614,7 @@ intel_modeset_setup_hw_state(struct drm_device *dev, bool force_restore) return NULL; } + /* swap sw/hw state */ drm_atomic_helper_swap_state(dev, state); if (force_restore) { @@ -15619,6 +15628,9 @@ intel_modeset_setup_hw_state(struct drm_device *dev, bool force_restore) to_intel_atomic_state(state)->dpll_set = false; } + /* update power state to match hw state */ + modeset_update_crtc_power_domains(state, true); + /* HW state is read out, now we need to sanitize this mess. */ for_each_intel_encoder(dev, encoder) { intel_sanitize_encoder(encoder); @@ -15697,6 +15709,8 @@ intel_modeset_setup_hw_state(struct drm_device *dev, bool force_restore) drm_atomic_state_free(state); } + intel_display_set_init_power(dev_priv, f
[Intel-gfx] [RFC PATCH 7/7] drm/i915: Make intel_display_suspend atomic, try 2.
Calculate all state using a normal transition, but afterwards fudge crtc->state->active back to its old value. This should still allow state restore in setup_hw_state to work properly. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/intel_display.c | 74 +++- drivers/gpu/drm/i915/intel_drv.h | 2 +- 2 files changed, 49 insertions(+), 27 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 796d08c4606e..9c25b080260c 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -6175,40 +6175,62 @@ static void i9xx_crtc_disable(struct drm_crtc *crtc) intel_set_cpu_fifo_underrun_reporting(dev_priv, pipe, false); } -static void intel_crtc_disable_noatomic(struct drm_crtc *crtc) +/* + * turn all crtc's off, but do not adjust state + * This has to be paired with a call to intel_modeset_setup_hw_state. + */ +int intel_display_suspend(struct drm_device *dev) { - struct intel_crtc *intel_crtc = to_intel_crtc(crtc); - struct drm_i915_private *dev_priv = to_i915(crtc->dev); - enum intel_display_power_domain domain; - unsigned long domains; + struct drm_mode_config *config = &dev->mode_config; + struct drm_modeset_acquire_ctx *ctx = config->acquire_ctx; + struct drm_atomic_state *state; + struct drm_crtc *crtc; + unsigned crtc_mask = 0; + int ret = 0; - if (!intel_crtc->active) - return; + if (WARN_ON(!ctx)) + return 0; - if (to_intel_plane_state(crtc->primary->state)->visible) { - intel_crtc_wait_for_pending_flips(crtc); - intel_pre_disable_primary(crtc); + lockdep_assert_held(&ctx->ww_ctx); + state = drm_atomic_state_alloc(dev); + if (WARN_ON(!state)) + return -ENOMEM; + + state->acquire_ctx = ctx; + state->allow_modeset = true; + + for_each_crtc(dev, crtc) { + struct drm_crtc_state *crtc_state = + drm_atomic_get_crtc_state(state, crtc); + + ret = PTR_ERR_OR_ZERO(crtc_state); + if (ret) + goto free; + + if (!crtc_state->active) + continue; + + crtc_state->active = false; + crtc_mask |= 1 << drm_crtc_index(crtc); } - intel_crtc_disable_planes(crtc, crtc->state->plane_mask); - dev_priv->display.crtc_disable(crtc); + if (crtc_mask) { + ret = intel_set_mode(state); - domains = intel_crtc->enabled_power_domains; - for_each_power_domain(domain, domains) - intel_display_power_put(dev_priv, domain); - intel_crtc->enabled_power_domains = 0; -} + if (!ret) { + for_each_crtc(dev, crtc) + if (crtc_mask & (1 << drm_crtc_index(crtc))) + crtc->state->active = true; -/* - * turn all crtc's off, but do not adjust state - * This has to be paired with a call to intel_modeset_setup_hw_state. - */ -void intel_display_suspend(struct drm_device *dev) -{ - struct drm_crtc *crtc; + return ret; + } + } - for_each_crtc(dev, crtc) - intel_crtc_disable_noatomic(crtc); +free: + if (ret) + DRM_ERROR("Suspending crtc's failed with %i\n", ret); + drm_atomic_state_free(state); + return ret; } void intel_display_resume(struct drm_device *dev) diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h index ea380b83eb5d..fb6c88ade842 100644 --- a/drivers/gpu/drm/i915/intel_drv.h +++ b/drivers/gpu/drm/i915/intel_drv.h @@ -992,7 +992,7 @@ int intel_pch_rawclk(struct drm_device *dev); void intel_mark_busy(struct drm_device *dev); void intel_mark_idle(struct drm_device *dev); void intel_crtc_restore_mode(struct drm_crtc *crtc); -void intel_display_suspend(struct drm_device *dev); +int intel_display_suspend(struct drm_device *dev); int intel_crtc_control(struct drm_crtc *crtc, bool enable); void intel_crtc_update_dpms(struct drm_crtc *crtc); void intel_encoder_destroy(struct drm_encoder *encoder); -- 2.1.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC PATCH 6/7] drm/i915: Always reset in intel_crtc_restore_mode
And get rid of things that are no longer true. This function is only used for forcing a modeset when encoder properties are changed. All the existing state is fine in this case, only setting mode_changed will force a full recalculation here, and take all the state needed. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/intel_display.c | 62 +++- 1 file changed, 18 insertions(+), 44 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 942d25e7490a..796d08c4606e 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -13174,66 +13174,40 @@ void intel_crtc_restore_mode(struct drm_crtc *crtc) { struct drm_device *dev = crtc->dev; struct drm_atomic_state *state; - struct intel_crtc *intel_crtc; - struct intel_encoder *encoder; - struct intel_connector *connector; - struct drm_connector_state *connector_state; - struct intel_crtc_state *crtc_state; + struct drm_crtc_state *crtc_state; int ret; state = drm_atomic_state_alloc(dev); if (!state) { - DRM_DEBUG_KMS("[CRTC:%d] mode restore failed, out of memory", + DRM_DEBUG_KMS("[CRTC:%d] crtc restore failed, out of memory", crtc->base.id); return; } - state->acquire_ctx = dev->mode_config.acquire_ctx; - - /* The force restore path in the HW readout code relies on the staged -* config still keeping the user requested config while the actual -* state has been overwritten by the configuration read from HW. We -* need to copy the staged config to the atomic state, otherwise the -* mode set will just reapply the state the HW is already in. */ - for_each_intel_encoder(dev, encoder) { - if (encoder->base.crtc != crtc) - continue; + state->acquire_ctx = drm_modeset_legacy_acquire_ctx(crtc); - for_each_intel_connector(dev, connector) { - if (connector->base.state->best_encoder != &encoder->base) - continue; - - connector_state = drm_atomic_get_connector_state(state, &connector->base); - if (IS_ERR(connector_state)) { - DRM_DEBUG_KMS("Failed to add [CONNECTOR:%d:%s] to state: %ld\n", - connector->base.base.id, - connector->base.name, - PTR_ERR(connector_state)); - continue; - } +retry: + crtc_state = drm_atomic_get_crtc_state(state, crtc); + ret = PTR_ERR_OR_ZERO(crtc_state); + if (!ret) { + if (!crtc_state->active) + goto out; - connector_state->crtc = crtc; - } + crtc_state->mode_changed = true; + ret = intel_modeset_compute_config(state); } - for_each_intel_crtc(dev, intel_crtc) { - crtc_state = intel_atomic_get_crtc_state(state, intel_crtc); - if (IS_ERR(crtc_state)) { - DRM_DEBUG_KMS("Failed to add [CRTC:%d] to state: %ld\n", - intel_crtc->base.base.id, - PTR_ERR(crtc_state)); - continue; - } + if (!ret) + ret = intel_set_mode_checked(state); - if (&intel_crtc->base == crtc) - drm_mode_copy(&crtc_state->base.mode, &crtc->mode); + if (ret == -EDEADLK) { + drm_atomic_state_clear(state); + drm_modeset_backoff(state->acquire_ctx); + goto retry; } - intel_modeset_setup_plane_state(state, crtc, &crtc->mode, - crtc->primary->fb, crtc->x, crtc->y); - - ret = intel_set_mode(state); if (ret) +out: drm_atomic_state_free(state); } -- 2.1.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC PATCH 0/7] Convert to atomic, part 4.
The only thing missing after part 3 is restoring atomic readout and making suspend/restore. With fixes for all identified causes of regressions from atomic suspend being done in convert to atomic, part 3 it's time to worry about getting atomic suspend/restore working again. I do this in a few steps, first I re-introduce the old commit for hw state readout, with fixes. After that I convert the atomic readout to a full initial modeset, and finally I try to re-enable fastboot optimizations again by default, in a less hacky way. This is a RFC because of the following possible issues: - mode->clock may not be read out correctly. It can differ slightly from the actually set value, which forces a modeset anyway when it could be avoided. - Not tested with DRRS support on DP, might be broken with fastboot? Maarten Lankhorst (7): drm/i915: Do not update pfit state when toggling crtc enabled. drm/i915: Read hw state into an atomic state struct, try 2. All changes from try2. drm/i915: enable fastboot for everyone drm/i915: Update power domains only on affected crtc's. drm/i915: Always reset in intel_crtc_restore_mode drm/i915: Make intel_display_suspend atomic, try 2. drivers/gpu/drm/i915/i915_dma.c | 12 +- drivers/gpu/drm/i915/i915_drv.c |2 +- drivers/gpu/drm/i915/i915_drv.h |7 +- drivers/gpu/drm/i915/i915_params.c |5 - drivers/gpu/drm/i915/intel_atomic.c | 23 +- drivers/gpu/drm/i915/intel_display.c | 1302 -- drivers/gpu/drm/i915/intel_dp.c |2 +- drivers/gpu/drm/i915/intel_drv.h | 21 +- drivers/gpu/drm/i915/intel_fbdev.c | 22 +- drivers/gpu/drm/i915/intel_lvds.c|2 +- 10 files changed, 786 insertions(+), 612 deletions(-) -- 2.1.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC PATCH 1/7] drm/i915: Do not update pfit state when toggling crtc enabled.
This must be done in advance, and during crtc_disable all scalers can be force disabled. This means intel_atomic_setup_scalers is only called in 1 place now, during crtc_check. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/intel_atomic.c | 14 ++-- drivers/gpu/drm/i915/intel_display.c | 67 ++-- drivers/gpu/drm/i915/intel_dp.c | 2 +- drivers/gpu/drm/i915/intel_drv.h | 2 +- 4 files changed, 47 insertions(+), 38 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_atomic.c b/drivers/gpu/drm/i915/intel_atomic.c index 0aeced82201e..429677a111d5 100644 --- a/drivers/gpu/drm/i915/intel_atomic.c +++ b/drivers/gpu/drm/i915/intel_atomic.c @@ -272,17 +272,12 @@ int intel_atomic_setup_scalers(struct drm_device *dev, struct drm_plane *plane = NULL; struct intel_plane *intel_plane; struct intel_plane_state *plane_state = NULL; - struct intel_crtc_scaler_state *scaler_state; - struct drm_atomic_state *drm_state; + struct intel_crtc_scaler_state *scaler_state = + &crtc_state->scaler_state; + struct drm_atomic_state *drm_state = crtc_state->base.state; int num_scalers_need; int i, j; - if (INTEL_INFO(dev)->gen < 9 || !intel_crtc || !crtc_state) - return 0; - - scaler_state = &crtc_state->scaler_state; - drm_state = crtc_state->base.state; - num_scalers_need = hweight32(scaler_state->scaler_users); DRM_DEBUG_KMS("crtc_state = %p need = %d avail = %d scaler_users = 0x%x\n", crtc_state, num_scalers_need, intel_crtc->num_scalers, @@ -327,9 +322,6 @@ int intel_atomic_setup_scalers(struct drm_device *dev, name = "PLANE"; idx = plane->base.id; - if (!drm_state) - continue; - /* plane scaler case: assign as a plane scaler */ /* find the plane that set the bit as scaler_user */ plane = drm_state->planes[i]; diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 59430e4c361f..b3d8d1a30a04 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -2898,29 +2898,32 @@ unsigned long intel_plane_obj_offset(struct intel_plane *intel_plane, return i915_gem_obj_ggtt_offset_view(obj, view); } +static void skl_detach_scaler(struct intel_crtc *intel_crtc, int id) +{ + struct drm_device *dev = intel_crtc->base.dev; + struct drm_i915_private *dev_priv = dev->dev_private; + + I915_WRITE(SKL_PS_CTRL(intel_crtc->pipe, id), 0); + I915_WRITE(SKL_PS_WIN_POS(intel_crtc->pipe, id), 0); + I915_WRITE(SKL_PS_WIN_SZ(intel_crtc->pipe, id), 0); + DRM_DEBUG_KMS("CRTC:%d Disabled scaler id %u.%u\n", + intel_crtc->base.base.id, intel_crtc->pipe, id); +} + /* * This function detaches (aka. unbinds) unused scalers in hardware */ static void skl_detach_scalers(struct intel_crtc *intel_crtc) { - struct drm_device *dev; - struct drm_i915_private *dev_priv; struct intel_crtc_scaler_state *scaler_state; int i; - dev = intel_crtc->base.dev; - dev_priv = dev->dev_private; scaler_state = &intel_crtc->config->scaler_state; /* loop through and disable scalers that aren't in use */ for (i = 0; i < intel_crtc->num_scalers; i++) { - if (!scaler_state->scalers[i].in_use) { - I915_WRITE(SKL_PS_CTRL(intel_crtc->pipe, i), 0); - I915_WRITE(SKL_PS_WIN_POS(intel_crtc->pipe, i), 0); - I915_WRITE(SKL_PS_WIN_SZ(intel_crtc->pipe, i), 0); - DRM_DEBUG_KMS("CRTC:%d Disabled scaler id %u.%u\n", - intel_crtc->base.base.id, intel_crtc->pipe, i); - } + if (!scaler_state->scalers[i].in_use) + skl_detach_scaler(intel_crtc, i); } } @@ -4351,13 +4354,12 @@ skl_update_scaler(struct intel_crtc_state *crtc_state, bool force_detach, * skl_update_scaler_crtc - Stages update to scaler state for a given crtc. * * @state: crtc's scaler state - * @force_detach: whether to forcibly disable scaler * * Return * 0 - scaler_usage updated successfully *error - requested scaling cannot be supported or other error condition */ -int skl_update_scaler_crtc(struct intel_crtc_state *state, int force_detach) +int skl_update_scaler_crtc(struct intel_crtc_state *state) { struct intel_crtc *intel_crtc = to_intel_crtc(state->base.crtc); struct drm_display_mode *adjusted_mode = @@ -4366,7 +4368,7 @@ int skl_update_scaler_crtc(struct intel_crtc_state *state, int force_detach) DRM_DEBUG_KMS("Updating scaler for [CRTC:%i] scaler_user index %u.%u\n", intel_crtc->base.base.id
[Intel-gfx] [RFC PATCH 3/7] All changes from try2.
Always read out plane state, and do an initial modeset in modeset_gem_init. --- drivers/gpu/drm/i915/i915_dma.c | 12 +- drivers/gpu/drm/i915/i915_drv.c | 2 +- drivers/gpu/drm/i915/i915_drv.h | 7 +- drivers/gpu/drm/i915/intel_atomic.c | 7 - drivers/gpu/drm/i915/intel_display.c | 501 ++- drivers/gpu/drm/i915/intel_drv.h | 3 +- drivers/gpu/drm/i915/intel_fbdev.c | 12 +- drivers/gpu/drm/i915/intel_lvds.c| 2 +- 8 files changed, 280 insertions(+), 266 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 88795d2f1819..ede07f6fe7f7 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -393,6 +393,7 @@ static const struct vga_switcheroo_client_ops i915_switcheroo_ops = { static int i915_load_modeset_init(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev->dev_private; + struct drm_atomic_state *state; int ret; ret = intel_parse_bios(dev); @@ -431,13 +432,18 @@ static int i915_load_modeset_init(struct drm_device *dev) /* Important: The output setup functions called by modeset_init need * working irqs for e.g. gmbus and dp aux transfers. */ - intel_modeset_init(dev); + state = intel_modeset_init(dev); ret = i915_gem_init(dev); - if (ret) + if (ret) { + if (state) { + drm_atomic_state_free(state); + drm_modeset_unlock_all(dev); + } goto cleanup_irq; + } - intel_modeset_gem_init(dev); + intel_modeset_gem_init(dev, state); /* Always safe in the mode setting case. */ /* FIXME: do pre/post-mode set stuff in core KMS code */ diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 78ef0bb53c36..a29b24bca25e 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -758,7 +758,7 @@ static int i915_drm_resume(struct drm_device *dev) spin_unlock_irq(&dev_priv->irq_lock); drm_modeset_lock_all(dev); - intel_modeset_setup_hw_state(dev, true); + intel_display_resume(dev); drm_modeset_unlock_all(dev); intel_dp_mst_resume(dev); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index d9410bd4ab59..e67d2f1e3ab8 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3233,13 +3233,12 @@ static inline void intel_unregister_dsm_handler(void) { return; } /* modesetting */ extern void intel_modeset_init_hw(struct drm_device *dev); -extern void intel_modeset_init(struct drm_device *dev); -extern void intel_modeset_gem_init(struct drm_device *dev); +extern struct drm_atomic_state *intel_modeset_init(struct drm_device *dev); +extern void intel_modeset_gem_init(struct drm_device *dev, struct drm_atomic_state *); extern void intel_modeset_cleanup(struct drm_device *dev); extern void intel_connector_unregister(struct intel_connector *); extern int intel_modeset_vga_set_state(struct drm_device *dev, bool state); -extern void intel_modeset_setup_hw_state(struct drm_device *dev, -bool force_restore); +extern void intel_display_resume(struct drm_device *dev); extern void i915_redisable_vga(struct drm_device *dev); extern void i915_redisable_vga_power_on(struct drm_device *dev); extern bool ironlake_set_drps(struct drm_device *dev, u8 val); diff --git a/drivers/gpu/drm/i915/intel_atomic.c b/drivers/gpu/drm/i915/intel_atomic.c index 4d87574a2032..f36fcc7ceecb 100644 --- a/drivers/gpu/drm/i915/intel_atomic.c +++ b/drivers/gpu/drm/i915/intel_atomic.c @@ -98,13 +98,6 @@ int intel_atomic_check(struct drm_device *dev, return -EINVAL; } - if (crtc_state && - crtc_state->quirks & PIPE_CONFIG_QUIRK_INITIAL_PLANES) { - ret = drm_atomic_add_affected_planes(state, &nuclear_crtc->base); - if (ret) - return ret; - } - ret = drm_atomic_helper_check_planes(dev, state); if (ret) return ret; diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 41193dab58c5..9b2acc7b4e29 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -109,6 +109,9 @@ static void skl_init_scalers(struct drm_device *dev, struct intel_crtc *intel_cr struct intel_crtc_state *crtc_state); static int i9xx_get_refclk(const struct intel_crtc_state *crtc_state, int num_connectors); +static void intel_pre_disable_primary(struct drm_crtc *crtc); +static struct drm_atomic_state * +intel_modeset_setup_hw_state(struct drm_device *dev, bool force_restore); static struct intel_encoder *intel_find_encoder(struct intel_connector *connector, int pipe) { @@ -2567,40 +2
[Intel-gfx] [RFC PATCH 4/7] drm/i915: enable fastboot for everyone
Now that we read out the full atomic state we can force fastboot without hacks. The only thing that we worry about is preventing a modeset. This can be easily done by calculating if sw state matches hw state, with exception for pfit and pipe size. Since the original fastboot code only touched pipe size and panel fitter we keep those, and compare full sw state against hw state. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/i915_params.c | 5 - drivers/gpu/drm/i915/intel_display.c | 271 ++- drivers/gpu/drm/i915/intel_fbdev.c | 10 +- 3 files changed, 169 insertions(+), 117 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c index 8ac5a1b29ac0..9b344a956ba6 100644 --- a/drivers/gpu/drm/i915/i915_params.c +++ b/drivers/gpu/drm/i915/i915_params.c @@ -41,7 +41,6 @@ struct i915_params i915 __read_mostly = { .preliminary_hw_support = IS_ENABLED(CONFIG_DRM_I915_PRELIMINARY_HW_SUPPORT), .disable_power_well = 1, .enable_ips = 1, - .fastboot = 0, .prefault_disable = 0, .load_detect_test = 0, .reset = true, @@ -137,10 +136,6 @@ MODULE_PARM_DESC(disable_power_well, module_param_named(enable_ips, i915.enable_ips, int, 0600); MODULE_PARM_DESC(enable_ips, "Enable IPS (default: true)"); -module_param_named(fastboot, i915.fastboot, bool, 0600); -MODULE_PARM_DESC(fastboot, - "Try to skip unnecessary mode sets at boot time (default: false)"); - module_param_named_unsafe(prefault_disable, i915.prefault_disable, bool, 0600); MODULE_PARM_DESC(prefault_disable, "Disable page prefaulting for pread/pwrite/reloc (default:false). " diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 9b2acc7b4e29..de975ef09e23 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -110,6 +110,7 @@ static void skl_init_scalers(struct drm_device *dev, struct intel_crtc *intel_cr static int i9xx_get_refclk(const struct intel_crtc_state *crtc_state, int num_connectors); static void intel_pre_disable_primary(struct drm_crtc *crtc); +static void skylake_pfit_enable(struct intel_crtc *crtc); static struct drm_atomic_state * intel_modeset_setup_hw_state(struct drm_device *dev, bool force_restore); @@ -3289,14 +3290,17 @@ static bool intel_crtc_has_pending_flip(struct drm_crtc *crtc) return pending; } -static void intel_update_pipe_size(struct intel_crtc *crtc) +static void intel_update_fastboot(struct intel_crtc *crtc, + struct intel_crtc_state *old_crtc_state) { struct drm_device *dev = crtc->base.dev; struct drm_i915_private *dev_priv = dev->dev_private; - const struct drm_display_mode *adjusted_mode; + struct intel_crtc_state *pipe_config = + to_intel_crtc_state(crtc->base.state); - if (!i915.fastboot) - return; + DRM_DEBUG_KMS("Applying fastboot quirks, %ix%i -> %ix%i\n", + old_crtc_state->pipe_src_w, old_crtc_state->pipe_src_h, + pipe_config->pipe_src_w, pipe_config->pipe_src_h); /* * Update pipe size and adjust fitter if needed: the reason for this is @@ -3305,27 +3309,27 @@ static void intel_update_pipe_size(struct intel_crtc *crtc) * fastboot case, we'll flip, but if we don't update the pipesrc and * pfit state, we'll end up with a big fb scanned out into the wrong * sized surface. -* -* To fix this properly, we need to hoist the checks up into -* compute_mode_changes (or above), check the actual pfit state and -* whether the platform allows pfit disable with pipe active, and only -* then update the pipesrc and pfit state, even on the flip path. */ - adjusted_mode = &crtc->config->base.adjusted_mode; - I915_WRITE(PIPESRC(crtc->pipe), - ((adjusted_mode->crtc_hdisplay - 1) << 16) | - (adjusted_mode->crtc_vdisplay - 1)); - if (!crtc->config->pch_pfit.enabled && - (intel_pipe_has_type(crtc, INTEL_OUTPUT_LVDS) || -intel_pipe_has_type(crtc, INTEL_OUTPUT_EDP))) { + ((pipe_config->pipe_src_w - 1) << 16) | + (pipe_config->pipe_src_h - 1)); + + /* on skylake this is done by detaching scalers */ + if (INTEL_INFO(dev)->gen == 9) { + skl_detach_scalers(crtc); + + if (pipe_config->pch_pfit.enabled) + skylake_pfit_enable(crtc); + } + else if (!pipe_config->pch_pfit.enabled && + old_crtc_state->pch_pfit.enabled) { + DRM_DEBUG_KMS("Disabling panel fitter\n"); + I915_WRITE(PF_CTL(crtc->pipe), 0); I915_WRITE(PF_WIN_POS(crtc->pipe), 0); I915_WRITE(PF_WIN_SZ(crtc->pipe), 0);
Re: [Intel-gfx] [PATCH 06/15] drm/i915: Debugfs interface to read GuC load status
On 16/06/15 10:40, Chris Wilson wrote: > On Mon, Jun 15, 2015 at 07:36:24PM +0100, Dave Gordon wrote: >> From: Alex Dai >> >> The new node provides access to the status of the common uC loader >> code and the GuC-specific loader; also the scratch registers used >> for communicatio between the i915 driver and the GuC firmware. >> >> Issue: VIZ-4884 >> Signed-off-by: Alex Dai >> Signed-off-by: Dave Gordon >> --- >> drivers/gpu/drm/i915/i915_debugfs.c | 37 >> +++ >> 1 file changed, 37 insertions(+) >> >> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c >> b/drivers/gpu/drm/i915/i915_debugfs.c >> index 47636f3..c52a745 100644 >> --- a/drivers/gpu/drm/i915/i915_debugfs.c >> +++ b/drivers/gpu/drm/i915/i915_debugfs.c >> @@ -2352,6 +2352,42 @@ static int i915_llc(struct seq_file *m, void *data) >> return 0; >> } >> >> +static void i915_uc_load_status_info(struct seq_file *m, struct intel_uc_fw >> *uc_fw) >> +{ >> +seq_printf(m, "%s firmware status:\n\tpath: <%s>\n\tfetch: %d\n\tload: >> %d\n", >> +uc_fw->uc_name, >> +uc_fw->uc_fw_path, >> +uc_fw->uc_fw_fetch_status, >> +uc_fw->uc_fw_load_status); > > If you made this one seq_printf() per line visualing the resulting > format would have been easier - and easier to modify. Done. > Don't use <%s>, that's just visual noise to make cutting and pasting > harder. My terminal doesn't include <> in word selections (but DOES include "/" and ".") so selecting a pathname just works :) But I've removed the anyway. > If you can decode numeric status values, do so. Done. I've added a _repr function for decoding the enum and used it everywhere. >> +} >> + >> +static int i915_guc_load_status_info(struct seq_file *m, void *data) >> +{ >> +struct drm_info_node *node = m->private; >> +struct drm_i915_private *dev_priv = node->minor->dev->dev_private; >> +u32 tmp, i; >> + >> +if (!HAS_GUC_UCODE(dev_priv->dev)) > > Here and elsewhere it should be return -ENODEV; There's only one other use of HAS_GUC_UCODE (in intel_guc_ucode_init()) and that one doesn't and mustn't trigger an error if false. I don't see why it should be an *error* here either; the caller hasn't done anything wrong, and there's no h/w or s/w failure. An empty result (EOF) is a nice way of saying that there's nothing to say, without making the user think something broke. In fact it may be perfectly meaningful to continue rather than returning; consider the case of a future GuC that comes with firmware preloaded, so HAS_GUC() is true but HAS_GUC_UCODE() is FALSE. We could still read and decode the GUC_STATUS even though we haven't loaded any firmware. >> +return 0; >> + >> +i915_uc_load_status_info(m, &dev_priv->guc.guc_fw); >> + >> +tmp = I915_READ(GUC_STATUS); >> + >> +seq_printf(m, "\nGuC status 0x%08x:\n", tmp); >> +seq_printf(m, "\tBootrom status = 0x%x\n", >> +(tmp & GS_BOOTROM_MASK) >> GS_BOOTROM_SHIFT); >> +seq_printf(m, "\tuKernel status = 0x%x\n", >> +(tmp & GS_UKERNEL_MASK) >> GS_UKERNEL_SHIFT); >> +seq_printf(m, "\tMIA Core status = 0x%x\n", >> +(tmp & GS_MIA_MASK) >> GS_MIA_SHIFT); >> +seq_puts(m, "\nScratch registers value:\n"); >> +for (i = 0; i < 16; i++) >> +seq_printf(m, "\t%2d: \t0x%x\n", i, I915_READ(SOFT_SCRATCH(i))); > > I have a feeling these probably don't want to be upstreamed. > -Chris It's just a register dump; nothing secret there. You could read them with IGT's register dumper anyway. .Dave. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx