Re: [Intel-gfx] [PATCH 2/2] drm/i915: Render decompression support for Gen9 and above
> -Original Message- > From: Ville Syrjälä [mailto:ville.syrj...@linux.intel.com] > Sent: Friday, March 18, 2016 10:42 PM > To: Kannan, Vandana <vandana.kan...@intel.com> > Cc: intel-gfx@lists.freedesktop.org > Subject: Re: [Intel-gfx] [PATCH 2/2] drm/i915: Render decompression > support for Gen9 and above [...] > > @@ -3573,7 +3576,7 @@ static void > skylake_update_primary_plane(struct drm_plane *plane, > > struct drm_framebuffer *fb = plane_state->base.fb; > > int pipe = intel_crtc->pipe; > > u32 plane_ctl; > > - unsigned int rotation = plane_state->base.rotation; > > + unsigned int rotation = plane_state->base.rotation, render_comp; > > u32 stride = skl_plane_stride(fb, 0, rotation); > > u32 surf_addr = plane_state->main.offset; > > int scaler_id = plane_state->scaler_id; @@ -3585,6 +3588,9 @@ > static > > void skylake_update_primary_plane(struct drm_plane *plane, > > int dst_y = plane_state->dst.y1; > > int dst_w = drm_rect_width(_state->dst); > > int dst_h = drm_rect_height(_state->dst); > > + unsigned long aux_dist = 0; > > + u32 aux_x_offset = 0, aux_y_offset = 0, aux_stride = 0; > > + u32 tile_row_adjustment = 0; > > > > plane_ctl = PLANE_CTL_ENABLE | > > PLANE_CTL_PIPE_GAMMA_ENABLE | > > @@ -3604,10 +3610,43 @@ static void > skylake_update_primary_plane(struct drm_plane *plane, > > intel_crtc->adjusted_x = src_x; > > intel_crtc->adjusted_y = src_y; > > > > + render_comp = plane_state->render_comp_enable; > > + if (render_comp) { > > + u32 tile_height = PAGE_SIZE / > > + intel_fb_stride_alignment(dev_priv, fb->modifier[0], > > + fb->pixel_format); > > + > > + u32 height_in_mem = (fb->offsets[1]/fb->pitches[0]); > > + tile_row_adjustment = height_in_mem % tile_height; > > + aux_dist = (fb->pitches[0] * > > + (height_in_mem - tile_row_adjustment)); > > + aux_stride = skl_plane_stride(fb, 1, rotation); > > Could you go read my fb offsets series [1] and see if there's something > there that doesn't agree with render compression? > > [1] https://lists.freedesktop.org/archives/intel-gfx/2016- > February/087660.html > [Vandana] I went through that.. intel_fb_pitch() return the fb->pitches[plane]. So for aux plane in the case of RC and UV plane in the case of NV12, plane value should be 1. I dint see any other dependency on first read. > > + plane_ctl |= PLANE_CTL_DECOMPRESSION_ENABLE; > > + } else { > > + plane_ctl &= ~PLANE_CTL_DECOMPRESSION_ENABLE; > > + } > > + > > I915_WRITE(PLANE_CTL(pipe, 0), plane_ctl); > > I915_WRITE(PLANE_OFFSET(pipe, 0), (src_y << 16) | src_x); > > I915_WRITE(PLANE_STRIDE(pipe, 0), stride); > > I915_WRITE(PLANE_SIZE(pipe, 0), (src_h << 16) | src_w); > > + I915_WRITE(PLANE_AUX_DIST(pipe, 0), aux_dist | aux_stride); > > + I915_WRITE(PLANE_AUX_OFFSET(pipe, 0), aux_y_offset<<16 | > > +aux_x_offset); > > + > > + /* > > +* Per bspec, for SKL C and BXT A steppings, when the plane source > pixel > > +* format is NV12, the CHICKEN_PIPESL register bit 22 must be set > to 1. > > +* When render compression is enabled, bit 22 must be set to 0. > > +*/ > > + if (IS_SKL_REVID(dev, 0, SKL_REVID_C0) || > > + IS_BXT_REVID(dev, 0, BXT_REVID_A1)) { > > Does anyone really care about these anymore? > [Vandana] It was required internally.. If all users will have the latest stepping, then this can be removed. > Also you really should fix your editor to wrap lines correctly. > > > + u32 temp = I915_READ(CHICKEN_PIPESL_1(pipe)); > > + if (render_comp) { > > + if ((temp & HSW_FBCQ_DIS) == HSW_FBCQ_DIS) > > + I915_WRITE(CHICKEN_PIPESL_1(pipe), > > + temp & ~HSW_FBCQ_DIS); > > + } > > + } > > > > if (scaler_id >= 0) { > > uint32_t ps_ctrl = 0; > > @@ -12333,6 +12372,17 @@ int intel_plane_atomic_calc_changes(struct > drm_crtc_state *crtc_state, > > to_intel_plane_state(plane_state)); > > if (ret) > > return ret; > > + > > + if (fb && to_intel_plane_state(plane_state)-> > > + rende
Re: [Intel-gfx] [PATCH 2/2] drm/i915: Render decompression support for Gen9 and above
I thought we had concluded previously that this change was acceptable, which is why I'm surprised that it's come up yet again. The performance cost of creating two fb objects per buffer is irrelevant - it will be a one shot hit on buffer creation, and from that point forward it's just selecting which of the two fb's to use at any point in time. Given that I'm told that the memory cost kernel side per fb is small, the number isn't a big deal either. Hence, I'm not sure why you were expecting a performance analysis. The two things I object to are: 1) Having to tell the kernel that there is no aux buffer on an allocation that actually has an aux buffer in order to indicate that at this point in time the buffer is uncompressed. This is completely non intuitive from a caller perspective - especially as it's called an aux buffer, not a compression buffer. 2) Having to use a fb object to manage dynamic state. The fb for a particular buffer will change over the course of a frame. Any debug for a frame at entry to the HWC will have a different fb to the debug at frame exit which makes it hard to track. Thanks Gary -Original Message- From: Daniel Stone [mailto:dan...@fooishbar.org] Sent: Tuesday, March 22, 2016 1:37 PM To: Smith, Gary KCc: Kannan, Vandana ; intel-gfx ; Daniel Vetter Subject: Re: [PATCH 2/2] drm/i915: Render decompression support for Gen9 and above On 22 March 2016 at 13:30, Daniel Stone wrote: > Exactly the same as the last time we discussed it I should add that I understand your previous objection that creating framebuffers on the fly is not performant enough, and you object to the effort of managing 100 rather than 50 framebuffers upfront (though honestly, if you get to 50 framebuffers you're already having to do some clever setup/management anyway). But in the last thread, Daniel Vetter asked for some performance numbers to bear out your objection that framebuffer creation is too costly, leading to getting it fixed if this is in fact the case (since other userspace relies on it being fast), but this performance analysis never arrived. I'd still be interested in seeing that. Cheers, Daniel - Intel Corporation (UK) Limited Registered No. 1134945 (England) Registered Office: Pipers Way, Swindon SN3 1RJ VAT No: 860 2173 47 This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 2/2] drm/i915: Render decompression support for Gen9 and above
On 22 March 2016 at 13:30, Daniel Stonewrote: > Exactly the same as the last time we discussed it I should add that I understand your previous objection that creating framebuffers on the fly is not performant enough, and you object to the effort of managing 100 rather than 50 framebuffers upfront (though honestly, if you get to 50 framebuffers you're already having to do some clever setup/management anyway). But in the last thread, Daniel Vetter asked for some performance numbers to bear out your objection that framebuffer creation is too costly, leading to getting it fixed if this is in fact the case (since other userspace relies on it being fast), but this performance analysis never arrived. I'd still be interested in seeing that. Cheers, Daniel ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 2/2] drm/i915: Render decompression support for Gen9 and above
Hi Gary, On 21 March 2016 at 12:20, Smith, Gary Kwrote: > What are the objections to this change? Exactly the same as the last time we discussed it: that you're abusing plane properties to contain fundamental properties of the buffer you want to scan out, i.e. that which naturally belongs in the framebuffer object rather than a separate plane property. And doing it in a way which is prone to failure when userspace is unaware of these properties too. That being said, I hadn't noticed that this had moved to being a purely Intel-private property, rather than touching Android core at all, so if you can get it past the Intel DRM maintainers, then fine. I was primarily watching out for core DRM. Cheers, Daniel ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 2/2] drm/i915: Render decompression support for Gen9 and above
Hi, What are the objections to this change? Thanks Gary -Original Message- From: Daniel Stone [mailto:dan...@fooishbar.org] Sent: Friday, March 18, 2016 5:05 PM To: Kannan, VandanaCc: intel-gfx ; Smith, Gary K ; Daniel Vetter Subject: Re: [PATCH 2/2] drm/i915: Render decompression support for Gen9 and above Hi Vandana, On 18 March 2016 at 16:50, Vandana Kannan wrote: > The reason for using a plane property instead of fb modifier:- In > Android, OGL passes a render compressed buffer to hardware composer > (HWC), which would then request a flip on that buffer after checking > if the target can support render compressed buffer. For example, only > planes 1 and 2 on pipes 1 and 2 can support RC. In case the target > cannot support it, HWC will revert back to OGL requesting for uncompressed > buffer. I still don't believe we can support this in non-Android userspace, so I don't have any other comment on this patch. Cheers, Daniel - Intel Corporation (UK) Limited Registered No. 1134945 (England) Registered Office: Pipers Way, Swindon SN3 1RJ VAT No: 860 2173 47 This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 2/2] drm/i915: Render decompression support for Gen9 and above
On Fri, Mar 18, 2016 at 10:20:53PM +0530, Vandana Kannan wrote: > This patch includes enabling render decompression (RC) after checking all the > requirements (format, tiling, rotation etc.). Along with this, the WAs > mentioned in BSpec Workaround page have been implemented. > > TODO: > 1. Disable stereo 3D when render decomp is enabled (bit 7:6) > 2. Render decompression must not be used in VTd pass-through mode > 3. Program hashing select CHICKEN_MISC1 bit 15 > 4. For Gen10, add support for RGB 1010102 > 5. RC-FBC workaround > 6. RC watermark calculation > > The reason for using a plane property instead of fb modifier:- > In Android, OGL passes a render compressed buffer to hardware composer (HWC), > which would then request a flip on that buffer after checking if the target > can support render compressed buffer. For example, only planes 1 and 2 on > pipes 1 and 2 can support RC. In case the target cannot support it, HWC will > revert back to OGL requesting for uncompressed buffer. > Here, > if plane property is used, OGL would send back the buffer (same ID) after > decompression, marked uncompressed. If fb modifier was used, a different > version of the buffer would have to be maintained for different combinations - > in the simple case of render compressed vs uncompressed buffer, there would be > 2 fbs with 2 IDs. So, in this case, OGL would give a reference to a fb with a > different ID. > To avoid the difficulty of keeping track of multiple fbs and the subsequent > complexity in debug, the architecture forum decided to go ahead with a plane > property for RC. > > [Mayuresh] Added the plane check in skl_check_compression() > > Signed-off-by: Vandana Kannan> Cc: Smith, Gary K > Cc: Daniel Stone > Cc: Daniel Vetter > --- > drivers/gpu/drm/i915/i915_drv.h | 1 + > drivers/gpu/drm/i915/i915_reg.h | 22 > drivers/gpu/drm/i915/intel_atomic_plane.c | 24 +++- > drivers/gpu/drm/i915/intel_display.c | 206 > +- > drivers/gpu/drm/i915/intel_drv.h | 24 +++- > drivers/gpu/drm/i915/intel_sprite.c | 42 +- > 6 files changed, 305 insertions(+), 14 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index f37ac12..bb47ee1 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -1929,6 +1929,7 @@ struct drm_i915_private { > > struct drm_property *broadcast_rgb_property; > struct drm_property *force_audio_property; > + struct drm_property *render_comp_property; > > /* hda/i915 audio component */ > struct i915_audio_component *audio_component; > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h > index 7dfc400..773c37f 100644 > --- a/drivers/gpu/drm/i915/i915_reg.h > +++ b/drivers/gpu/drm/i915/i915_reg.h > @@ -5756,6 +5756,28 @@ enum skl_disp_power_wells { > _ID(id, _PS_ECC_STAT_1A, _PS_ECC_STAT_2A), \ > _ID(id, _PS_ECC_STAT_1B, _PS_ECC_STAT_2B)) > > +#define PLANE_AUX_DIST_1_A 0x701c0 > +#define PLANE_AUX_DIST_2_A 0x702c0 > +#define PLANE_AUX_DIST_1_B 0x711c0 > +#define PLANE_AUX_DIST_2_B 0x712c0 > +#define _PLANE_AUX_DIST_1(pipe) \ > + _PIPE(pipe, PLANE_AUX_DIST_1_A, PLANE_AUX_DIST_1_B) > +#define _PLANE_AUX_DIST_2(pipe) \ > + _PIPE(pipe, PLANE_AUX_DIST_2_A, PLANE_AUX_DIST_2_B) > +#define PLANE_AUX_DIST(pipe, plane) \ > + _MMIO_PLANE(plane, _PLANE_AUX_DIST_1(pipe), > _PLANE_AUX_DIST_2(pipe)) > + > +#define PLANE_AUX_OFFSET_1_A 0x701c4 > +#define PLANE_AUX_OFFSET_2_A 0x702c4 > +#define PLANE_AUX_OFFSET_1_B 0x711c4 > +#define PLANE_AUX_OFFSET_2_B 0x712c4 > +#define _PLANE_AUX_OFFSET_1(pipe) \ > + _PIPE(pipe, PLANE_AUX_OFFSET_1_A, PLANE_AUX_OFFSET_1_B) > +#define _PLANE_AUX_OFFSET_2(pipe) \ > + _PIPE(pipe, PLANE_AUX_OFFSET_2_A, PLANE_AUX_OFFSET_2_B) > +#define PLANE_AUX_OFFSET(pipe, plane) \ > + _MMIO_PLANE(plane, _PLANE_AUX_OFFSET_1(pipe), > _PLANE_AUX_OFFSET_2(pipe)) > + > /* legacy palette */ > #define _LGC_PALETTE_A 0x4a000 > #define _LGC_PALETTE_B 0x4a800 > diff --git a/drivers/gpu/drm/i915/intel_atomic_plane.c > b/drivers/gpu/drm/i915/intel_atomic_plane.c > index e0b851a..c431333 100644 > --- a/drivers/gpu/drm/i915/intel_atomic_plane.c > +++ b/drivers/gpu/drm/i915/intel_atomic_plane.c > @@ -230,8 +230,16 @@ intel_plane_atomic_get_property(struct drm_plane *plane, > struct drm_property *property, > uint64_t *val) > { > - DRM_DEBUG_KMS("Unknown plane property '%s'\n", property->name); > - return -EINVAL; > + struct drm_i915_private *dev_priv =
[Intel-gfx] [PATCH 2/2] drm/i915: Render decompression support for Gen9 and above
This patch includes enabling render decompression (RC) after checking all the requirements (format, tiling, rotation etc.). Along with this, the WAs mentioned in BSpec Workaround page have been implemented. TODO: 1. Disable stereo 3D when render decomp is enabled (bit 7:6) 2. Render decompression must not be used in VTd pass-through mode 3. Program hashing select CHICKEN_MISC1 bit 15 4. For Gen10, add support for RGB 1010102 5. RC-FBC workaround 6. RC watermark calculation The reason for using a plane property instead of fb modifier:- In Android, OGL passes a render compressed buffer to hardware composer (HWC), which would then request a flip on that buffer after checking if the target can support render compressed buffer. For example, only planes 1 and 2 on pipes 1 and 2 can support RC. In case the target cannot support it, HWC will revert back to OGL requesting for uncompressed buffer. Here, if plane property is used, OGL would send back the buffer (same ID) after decompression, marked uncompressed. If fb modifier was used, a different version of the buffer would have to be maintained for different combinations - in the simple case of render compressed vs uncompressed buffer, there would be 2 fbs with 2 IDs. So, in this case, OGL would give a reference to a fb with a different ID. To avoid the difficulty of keeping track of multiple fbs and the subsequent complexity in debug, the architecture forum decided to go ahead with a plane property for RC. [Mayuresh] Added the plane check in skl_check_compression() Signed-off-by: Vandana KannanCc: Smith, Gary K Cc: Daniel Stone Cc: Daniel Vetter --- drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/i915_reg.h | 22 drivers/gpu/drm/i915/intel_atomic_plane.c | 24 +++- drivers/gpu/drm/i915/intel_display.c | 206 +- drivers/gpu/drm/i915/intel_drv.h | 24 +++- drivers/gpu/drm/i915/intel_sprite.c | 42 +- 6 files changed, 305 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index f37ac12..bb47ee1 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1929,6 +1929,7 @@ struct drm_i915_private { struct drm_property *broadcast_rgb_property; struct drm_property *force_audio_property; + struct drm_property *render_comp_property; /* hda/i915 audio component */ struct i915_audio_component *audio_component; diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 7dfc400..773c37f 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -5756,6 +5756,28 @@ enum skl_disp_power_wells { _ID(id, _PS_ECC_STAT_1A, _PS_ECC_STAT_2A), \ _ID(id, _PS_ECC_STAT_1B, _PS_ECC_STAT_2B)) +#define PLANE_AUX_DIST_1_A 0x701c0 +#define PLANE_AUX_DIST_2_A 0x702c0 +#define PLANE_AUX_DIST_1_B 0x711c0 +#define PLANE_AUX_DIST_2_B 0x712c0 +#define _PLANE_AUX_DIST_1(pipe) \ + _PIPE(pipe, PLANE_AUX_DIST_1_A, PLANE_AUX_DIST_1_B) +#define _PLANE_AUX_DIST_2(pipe) \ + _PIPE(pipe, PLANE_AUX_DIST_2_A, PLANE_AUX_DIST_2_B) +#define PLANE_AUX_DIST(pipe, plane) \ + _MMIO_PLANE(plane, _PLANE_AUX_DIST_1(pipe), _PLANE_AUX_DIST_2(pipe)) + +#define PLANE_AUX_OFFSET_1_A 0x701c4 +#define PLANE_AUX_OFFSET_2_A 0x702c4 +#define PLANE_AUX_OFFSET_1_B 0x711c4 +#define PLANE_AUX_OFFSET_2_B 0x712c4 +#define _PLANE_AUX_OFFSET_1(pipe) \ + _PIPE(pipe, PLANE_AUX_OFFSET_1_A, PLANE_AUX_OFFSET_1_B) +#define _PLANE_AUX_OFFSET_2(pipe) \ + _PIPE(pipe, PLANE_AUX_OFFSET_2_A, PLANE_AUX_OFFSET_2_B) +#define PLANE_AUX_OFFSET(pipe, plane) \ + _MMIO_PLANE(plane, _PLANE_AUX_OFFSET_1(pipe), _PLANE_AUX_OFFSET_2(pipe)) + /* legacy palette */ #define _LGC_PALETTE_A 0x4a000 #define _LGC_PALETTE_B 0x4a800 diff --git a/drivers/gpu/drm/i915/intel_atomic_plane.c b/drivers/gpu/drm/i915/intel_atomic_plane.c index e0b851a..c431333 100644 --- a/drivers/gpu/drm/i915/intel_atomic_plane.c +++ b/drivers/gpu/drm/i915/intel_atomic_plane.c @@ -230,8 +230,16 @@ intel_plane_atomic_get_property(struct drm_plane *plane, struct drm_property *property, uint64_t *val) { - DRM_DEBUG_KMS("Unknown plane property '%s'\n", property->name); - return -EINVAL; + struct drm_i915_private *dev_priv = state->plane->dev->dev_private; + struct intel_plane_state *intel_state = to_intel_plane_state(state); + + if (property == dev_priv->render_comp_property) { + *val = intel_state->render_comp_enable; + } else { +
Re: [Intel-gfx] [PATCH 2/2] drm/i915: Render decompression support for Gen9 and above
Hi Vandana, On 18 March 2016 at 16:50, Vandana Kannanwrote: > The reason for using a plane property instead of fb modifier:- > In Android, OGL passes a render compressed buffer to hardware composer (HWC), > which would then request a flip on that buffer after checking if the target > can support render compressed buffer. For example, only planes 1 and 2 on > pipes 1 and 2 can support RC. In case the target cannot support it, HWC will > revert back to OGL requesting for uncompressed buffer. I still don't believe we can support this in non-Android userspace, so I don't have any other comment on this patch. Cheers, Daniel ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx