Re: [Intel-gfx] [PATCH 2/2] drm/i915: Render decompression support for Gen9 and above

2016-04-24 Thread Kannan, Vandana

> -Original Message-
> From: Ville Syrjälä [mailto:ville.syrj...@linux.intel.com]
> Sent: Friday, March 18, 2016 10:42 PM
> To: Kannan, Vandana <vandana.kan...@intel.com>
> Cc: intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] [PATCH 2/2] drm/i915: Render decompression
> support for Gen9 and above

[...]
 
> > @@ -3573,7 +3576,7 @@ static void
> skylake_update_primary_plane(struct drm_plane *plane,
> > struct drm_framebuffer *fb = plane_state->base.fb;
> > int pipe = intel_crtc->pipe;
> > u32 plane_ctl;
> > -   unsigned int rotation = plane_state->base.rotation;
> > +   unsigned int rotation = plane_state->base.rotation, render_comp;
> > u32 stride = skl_plane_stride(fb, 0, rotation);
> > u32 surf_addr = plane_state->main.offset;
> > int scaler_id = plane_state->scaler_id; @@ -3585,6 +3588,9 @@
> static
> > void skylake_update_primary_plane(struct drm_plane *plane,
> > int dst_y = plane_state->dst.y1;
> > int dst_w = drm_rect_width(_state->dst);
> > int dst_h = drm_rect_height(_state->dst);
> > +   unsigned long aux_dist = 0;
> > +   u32 aux_x_offset = 0, aux_y_offset = 0, aux_stride = 0;
> > +   u32 tile_row_adjustment = 0;
> >
> > plane_ctl = PLANE_CTL_ENABLE |
> > PLANE_CTL_PIPE_GAMMA_ENABLE |
> > @@ -3604,10 +3610,43 @@ static void
> skylake_update_primary_plane(struct drm_plane *plane,
> > intel_crtc->adjusted_x = src_x;
> > intel_crtc->adjusted_y = src_y;
> >
> > +   render_comp = plane_state->render_comp_enable;
> > +   if (render_comp) {
> > +   u32 tile_height = PAGE_SIZE /
> > +   intel_fb_stride_alignment(dev_priv, fb->modifier[0],
> > +   fb->pixel_format);
> > +
> > +   u32 height_in_mem = (fb->offsets[1]/fb->pitches[0]);
> > +   tile_row_adjustment = height_in_mem % tile_height;
> > +   aux_dist = (fb->pitches[0] *
> > +   (height_in_mem - tile_row_adjustment));
> > +   aux_stride = skl_plane_stride(fb, 1, rotation);
> 
> Could you go read my fb offsets series [1] and see if there's something
> there that doesn't agree with render compression?
> 
> [1] https://lists.freedesktop.org/archives/intel-gfx/2016-
> February/087660.html
> 
[Vandana] 
I went through that..
intel_fb_pitch() return the fb->pitches[plane]. So for aux plane in the case of 
RC and UV plane in the case of NV12,
plane value should be 1.
I dint see any other dependency on first read.

> > +   plane_ctl |= PLANE_CTL_DECOMPRESSION_ENABLE;
> > +   } else {
> > +   plane_ctl &= ~PLANE_CTL_DECOMPRESSION_ENABLE;
> > +   }
> > +
> > I915_WRITE(PLANE_CTL(pipe, 0), plane_ctl);
> > I915_WRITE(PLANE_OFFSET(pipe, 0), (src_y << 16) | src_x);
> > I915_WRITE(PLANE_STRIDE(pipe, 0), stride);
> > I915_WRITE(PLANE_SIZE(pipe, 0), (src_h << 16) | src_w);
> > +   I915_WRITE(PLANE_AUX_DIST(pipe, 0), aux_dist | aux_stride);
> > +   I915_WRITE(PLANE_AUX_OFFSET(pipe, 0), aux_y_offset<<16 |
> > +aux_x_offset);
> > +
> > +   /*
> > +* Per bspec, for SKL C and BXT A steppings, when the plane source
> pixel
> > +* format is NV12, the CHICKEN_PIPESL register bit 22 must be set
> to 1.
> > +* When render compression is enabled, bit 22 must be set to 0.
> > +*/
> > +   if (IS_SKL_REVID(dev, 0, SKL_REVID_C0) ||
> > +   IS_BXT_REVID(dev, 0, BXT_REVID_A1)) {
> 
> Does anyone really care about these anymore?
> 
[Vandana] It was required internally.. If all users will have the latest 
stepping, then this can be removed.

> Also you really should fix your editor to wrap lines correctly.
> 
> > +   u32 temp = I915_READ(CHICKEN_PIPESL_1(pipe));
> > +   if (render_comp) {
> > +   if ((temp & HSW_FBCQ_DIS) == HSW_FBCQ_DIS)
> > +   I915_WRITE(CHICKEN_PIPESL_1(pipe),
> > +   temp & ~HSW_FBCQ_DIS);
> > +   }
> > +   }
> >
> > if (scaler_id >= 0) {
> > uint32_t ps_ctrl = 0;
> > @@ -12333,6 +12372,17 @@ int intel_plane_atomic_calc_changes(struct
> drm_crtc_state *crtc_state,
> > to_intel_plane_state(plane_state));
> > if (ret)
> > return ret;
> > +
> > +   if (fb && to_intel_plane_state(plane_state)->
> > +   rende

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Render decompression support for Gen9 and above

2016-03-22 Thread Smith, Gary K
I thought we had concluded previously that this change was acceptable, which is 
why I'm surprised that it's come up yet again.

The performance cost of creating two fb objects per buffer is irrelevant - it 
will be a one shot hit on buffer creation, and from that point forward it's 
just selecting which of the two fb's to use at any point in time. Given that 
I'm told that the memory cost kernel side per fb is small, the number isn't a 
big deal either. Hence, I'm not sure why you were expecting a performance 
analysis.

The two things I object to are:

1) Having to tell the kernel that there is no aux buffer on an allocation that 
actually has an aux buffer in order to indicate that at this point in time the 
buffer is uncompressed. This is completely non intuitive from a caller 
perspective - especially as it's called an aux buffer, not a compression buffer.

2) Having to use a fb object to manage dynamic state. The fb for a particular 
buffer will change over the course of a frame. Any debug for a frame at entry 
to the HWC will have a different fb to the debug at frame exit which makes it 
hard to track.

Thanks
Gary


-Original Message-
From: Daniel Stone [mailto:dan...@fooishbar.org] 
Sent: Tuesday, March 22, 2016 1:37 PM
To: Smith, Gary K 
Cc: Kannan, Vandana ; intel-gfx 
; Daniel Vetter 
Subject: Re: [PATCH 2/2] drm/i915: Render decompression support for Gen9 and 
above

On 22 March 2016 at 13:30, Daniel Stone  wrote:
> Exactly the same as the last time we discussed it

I should add that I understand your previous objection that creating 
framebuffers on the fly is not performant enough, and you object to the effort 
of managing 100 rather than 50 framebuffers upfront (though honestly, if you 
get to 50 framebuffers you're already having to do some clever setup/management 
anyway). But in the last thread, Daniel Vetter asked for some performance 
numbers to bear out your objection that framebuffer creation is too costly, 
leading to getting it fixed if this is in fact the case (since other userspace 
relies on it being fast), but this performance analysis never arrived.

I'd still be interested in seeing that.

Cheers,
Daniel
-
Intel Corporation (UK) Limited
Registered No. 1134945 (England)
Registered Office: Pipers Way, Swindon SN3 1RJ
VAT No: 860 2173 47

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Render decompression support for Gen9 and above

2016-03-22 Thread Daniel Stone
On 22 March 2016 at 13:30, Daniel Stone  wrote:
> Exactly the same as the last time we discussed it

I should add that I understand your previous objection that creating
framebuffers on the fly is not performant enough, and you object to
the effort of managing 100 rather than 50 framebuffers upfront (though
honestly, if you get to 50 framebuffers you're already having to do
some clever setup/management anyway). But in the last thread, Daniel
Vetter asked for some performance numbers to bear out your objection
that framebuffer creation is too costly, leading to getting it fixed
if this is in fact the case (since other userspace relies on it being
fast), but this performance analysis never arrived.

I'd still be interested in seeing that.

Cheers,
Daniel
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Render decompression support for Gen9 and above

2016-03-22 Thread Daniel Stone
Hi Gary,

On 21 March 2016 at 12:20, Smith, Gary K  wrote:
> What are the objections to this change?

Exactly the same as the last time we discussed it: that you're abusing
plane properties to contain fundamental properties of the buffer you
want to scan out, i.e. that which naturally belongs in the framebuffer
object rather than a separate plane property. And doing it in a way
which is prone to failure when userspace is unaware of these
properties too.

That being said, I hadn't noticed that this had moved to being a
purely Intel-private property, rather than touching Android core at
all, so if you can get it past the Intel DRM maintainers, then fine. I
was primarily watching out for core DRM.

Cheers,
Daniel
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Render decompression support for Gen9 and above

2016-03-21 Thread Smith, Gary K
Hi,

What are the objections to this change?

Thanks
Gary

-Original Message-
From: Daniel Stone [mailto:dan...@fooishbar.org] 
Sent: Friday, March 18, 2016 5:05 PM
To: Kannan, Vandana 
Cc: intel-gfx ; Smith, Gary K 
; Daniel Vetter 
Subject: Re: [PATCH 2/2] drm/i915: Render decompression support for Gen9 and 
above

Hi Vandana,

On 18 March 2016 at 16:50, Vandana Kannan  wrote:
> The reason for using a plane property instead of fb modifier:- In 
> Android, OGL passes a render compressed buffer to hardware composer 
> (HWC), which would then request a flip on that buffer after checking 
> if the target can support render compressed buffer. For example, only 
> planes 1 and 2 on pipes 1 and 2 can support RC. In case the target 
> cannot support it, HWC will revert back to OGL requesting for uncompressed 
> buffer.

I still don't believe we can support this in non-Android userspace, so I don't 
have any other comment on this patch.

Cheers,
Daniel
-
Intel Corporation (UK) Limited
Registered No. 1134945 (England)
Registered Office: Pipers Way, Swindon SN3 1RJ
VAT No: 860 2173 47

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Render decompression support for Gen9 and above

2016-03-20 Thread Ville Syrjälä
On Fri, Mar 18, 2016 at 10:20:53PM +0530, Vandana Kannan wrote:
> This patch includes enabling render decompression (RC) after checking all the
> requirements (format, tiling, rotation etc.). Along with this, the WAs
> mentioned in BSpec Workaround page have been implemented.
> 
> TODO:
> 1. Disable stereo 3D when render decomp is enabled (bit 7:6)
> 2. Render decompression must not be used in VTd pass-through mode
> 3. Program hashing select CHICKEN_MISC1 bit 15
> 4. For Gen10, add support for RGB 1010102
> 5. RC-FBC workaround
> 6. RC watermark calculation
> 
> The reason for using a plane property instead of fb modifier:-
> In Android, OGL passes a render compressed buffer to hardware composer (HWC),
> which would then request a flip on that buffer after checking if the target
> can support render compressed buffer. For example, only planes 1 and 2 on
> pipes 1 and 2 can support RC. In case the target cannot support it, HWC will
> revert back to OGL requesting for uncompressed buffer.
> Here,
> if plane property is used, OGL would send back the buffer (same ID) after
> decompression, marked uncompressed. If fb modifier was used, a different
> version of the buffer would have to be maintained for different combinations -
> in the simple case of render compressed vs uncompressed buffer, there would be
> 2 fbs with 2 IDs. So, in this case, OGL would give a reference to a fb with a
> different ID.
> To avoid the difficulty of keeping track of multiple fbs and the subsequent
> complexity in debug, the architecture forum decided to go ahead with a plane
> property for RC.
> 
> [Mayuresh] Added the plane check in skl_check_compression()
> 
> Signed-off-by: Vandana Kannan 
> Cc: Smith, Gary K 
> Cc: Daniel Stone 
> Cc: Daniel Vetter 
> ---
>  drivers/gpu/drm/i915/i915_drv.h   |   1 +
>  drivers/gpu/drm/i915/i915_reg.h   |  22 
>  drivers/gpu/drm/i915/intel_atomic_plane.c |  24 +++-
>  drivers/gpu/drm/i915/intel_display.c  | 206 
> +-
>  drivers/gpu/drm/i915/intel_drv.h  |  24 +++-
>  drivers/gpu/drm/i915/intel_sprite.c   |  42 +-
>  6 files changed, 305 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index f37ac12..bb47ee1 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1929,6 +1929,7 @@ struct drm_i915_private {
>  
>   struct drm_property *broadcast_rgb_property;
>   struct drm_property *force_audio_property;
> + struct drm_property *render_comp_property;
>  
>   /* hda/i915 audio component */
>   struct i915_audio_component *audio_component;
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 7dfc400..773c37f 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -5756,6 +5756,28 @@ enum skl_disp_power_wells {
>   _ID(id, _PS_ECC_STAT_1A, _PS_ECC_STAT_2A),   \
>   _ID(id, _PS_ECC_STAT_1B, _PS_ECC_STAT_2B))
>  
> +#define PLANE_AUX_DIST_1_A   0x701c0
> +#define PLANE_AUX_DIST_2_A   0x702c0
> +#define PLANE_AUX_DIST_1_B   0x711c0
> +#define PLANE_AUX_DIST_2_B   0x712c0
> +#define _PLANE_AUX_DIST_1(pipe) \
> + _PIPE(pipe, PLANE_AUX_DIST_1_A, PLANE_AUX_DIST_1_B)
> +#define _PLANE_AUX_DIST_2(pipe) \
> + _PIPE(pipe, PLANE_AUX_DIST_2_A, PLANE_AUX_DIST_2_B)
> +#define PLANE_AUX_DIST(pipe, plane) \
> + _MMIO_PLANE(plane, _PLANE_AUX_DIST_1(pipe), 
> _PLANE_AUX_DIST_2(pipe))
> +
> +#define PLANE_AUX_OFFSET_1_A 0x701c4
> +#define PLANE_AUX_OFFSET_2_A 0x702c4
> +#define PLANE_AUX_OFFSET_1_B 0x711c4
> +#define PLANE_AUX_OFFSET_2_B 0x712c4
> +#define _PLANE_AUX_OFFSET_1(pipe)   \
> + _PIPE(pipe, PLANE_AUX_OFFSET_1_A, PLANE_AUX_OFFSET_1_B)
> +#define _PLANE_AUX_OFFSET_2(pipe)   \
> + _PIPE(pipe, PLANE_AUX_OFFSET_2_A, PLANE_AUX_OFFSET_2_B)
> +#define PLANE_AUX_OFFSET(pipe, plane)   \
> + _MMIO_PLANE(plane, _PLANE_AUX_OFFSET_1(pipe), 
> _PLANE_AUX_OFFSET_2(pipe))
> +
>  /* legacy palette */
>  #define _LGC_PALETTE_A   0x4a000
>  #define _LGC_PALETTE_B   0x4a800
> diff --git a/drivers/gpu/drm/i915/intel_atomic_plane.c 
> b/drivers/gpu/drm/i915/intel_atomic_plane.c
> index e0b851a..c431333 100644
> --- a/drivers/gpu/drm/i915/intel_atomic_plane.c
> +++ b/drivers/gpu/drm/i915/intel_atomic_plane.c
> @@ -230,8 +230,16 @@ intel_plane_atomic_get_property(struct drm_plane *plane,
>   struct drm_property *property,
>   uint64_t *val)
>  {
> - DRM_DEBUG_KMS("Unknown plane property '%s'\n", property->name);
> - return -EINVAL;
> + struct drm_i915_private *dev_priv = 

[Intel-gfx] [PATCH 2/2] drm/i915: Render decompression support for Gen9 and above

2016-03-19 Thread Vandana Kannan
This patch includes enabling render decompression (RC) after checking all the
requirements (format, tiling, rotation etc.). Along with this, the WAs
mentioned in BSpec Workaround page have been implemented.

TODO:
1. Disable stereo 3D when render decomp is enabled (bit 7:6)
2. Render decompression must not be used in VTd pass-through mode
3. Program hashing select CHICKEN_MISC1 bit 15
4. For Gen10, add support for RGB 1010102
5. RC-FBC workaround
6. RC watermark calculation

The reason for using a plane property instead of fb modifier:-
In Android, OGL passes a render compressed buffer to hardware composer (HWC),
which would then request a flip on that buffer after checking if the target
can support render compressed buffer. For example, only planes 1 and 2 on
pipes 1 and 2 can support RC. In case the target cannot support it, HWC will
revert back to OGL requesting for uncompressed buffer.
Here,
if plane property is used, OGL would send back the buffer (same ID) after
decompression, marked uncompressed. If fb modifier was used, a different
version of the buffer would have to be maintained for different combinations -
in the simple case of render compressed vs uncompressed buffer, there would be
2 fbs with 2 IDs. So, in this case, OGL would give a reference to a fb with a
different ID.
To avoid the difficulty of keeping track of multiple fbs and the subsequent
complexity in debug, the architecture forum decided to go ahead with a plane
property for RC.

[Mayuresh] Added the plane check in skl_check_compression()

Signed-off-by: Vandana Kannan 
Cc: Smith, Gary K 
Cc: Daniel Stone 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/i915/i915_drv.h   |   1 +
 drivers/gpu/drm/i915/i915_reg.h   |  22 
 drivers/gpu/drm/i915/intel_atomic_plane.c |  24 +++-
 drivers/gpu/drm/i915/intel_display.c  | 206 +-
 drivers/gpu/drm/i915/intel_drv.h  |  24 +++-
 drivers/gpu/drm/i915/intel_sprite.c   |  42 +-
 6 files changed, 305 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f37ac12..bb47ee1 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1929,6 +1929,7 @@ struct drm_i915_private {
 
struct drm_property *broadcast_rgb_property;
struct drm_property *force_audio_property;
+   struct drm_property *render_comp_property;
 
/* hda/i915 audio component */
struct i915_audio_component *audio_component;
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 7dfc400..773c37f 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -5756,6 +5756,28 @@ enum skl_disp_power_wells {
_ID(id, _PS_ECC_STAT_1A, _PS_ECC_STAT_2A),   \
_ID(id, _PS_ECC_STAT_1B, _PS_ECC_STAT_2B))
 
+#define PLANE_AUX_DIST_1_A 0x701c0
+#define PLANE_AUX_DIST_2_A 0x702c0
+#define PLANE_AUX_DIST_1_B 0x711c0
+#define PLANE_AUX_DIST_2_B 0x712c0
+#define _PLANE_AUX_DIST_1(pipe) \
+   _PIPE(pipe, PLANE_AUX_DIST_1_A, PLANE_AUX_DIST_1_B)
+#define _PLANE_AUX_DIST_2(pipe) \
+   _PIPE(pipe, PLANE_AUX_DIST_2_A, PLANE_AUX_DIST_2_B)
+#define PLANE_AUX_DIST(pipe, plane) \
+   _MMIO_PLANE(plane, _PLANE_AUX_DIST_1(pipe), 
_PLANE_AUX_DIST_2(pipe))
+
+#define PLANE_AUX_OFFSET_1_A   0x701c4
+#define PLANE_AUX_OFFSET_2_A   0x702c4
+#define PLANE_AUX_OFFSET_1_B   0x711c4
+#define PLANE_AUX_OFFSET_2_B   0x712c4
+#define _PLANE_AUX_OFFSET_1(pipe)   \
+   _PIPE(pipe, PLANE_AUX_OFFSET_1_A, PLANE_AUX_OFFSET_1_B)
+#define _PLANE_AUX_OFFSET_2(pipe)   \
+   _PIPE(pipe, PLANE_AUX_OFFSET_2_A, PLANE_AUX_OFFSET_2_B)
+#define PLANE_AUX_OFFSET(pipe, plane)   \
+   _MMIO_PLANE(plane, _PLANE_AUX_OFFSET_1(pipe), 
_PLANE_AUX_OFFSET_2(pipe))
+
 /* legacy palette */
 #define _LGC_PALETTE_A   0x4a000
 #define _LGC_PALETTE_B   0x4a800
diff --git a/drivers/gpu/drm/i915/intel_atomic_plane.c 
b/drivers/gpu/drm/i915/intel_atomic_plane.c
index e0b851a..c431333 100644
--- a/drivers/gpu/drm/i915/intel_atomic_plane.c
+++ b/drivers/gpu/drm/i915/intel_atomic_plane.c
@@ -230,8 +230,16 @@ intel_plane_atomic_get_property(struct drm_plane *plane,
struct drm_property *property,
uint64_t *val)
 {
-   DRM_DEBUG_KMS("Unknown plane property '%s'\n", property->name);
-   return -EINVAL;
+   struct drm_i915_private *dev_priv = state->plane->dev->dev_private;
+   struct intel_plane_state *intel_state = to_intel_plane_state(state);
+
+   if (property == dev_priv->render_comp_property) {
+   *val = intel_state->render_comp_enable;
+   } else {
+ 

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Render decompression support for Gen9 and above

2016-03-19 Thread Daniel Stone
Hi Vandana,

On 18 March 2016 at 16:50, Vandana Kannan  wrote:
> The reason for using a plane property instead of fb modifier:-
> In Android, OGL passes a render compressed buffer to hardware composer (HWC),
> which would then request a flip on that buffer after checking if the target
> can support render compressed buffer. For example, only planes 1 and 2 on
> pipes 1 and 2 can support RC. In case the target cannot support it, HWC will
> revert back to OGL requesting for uncompressed buffer.

I still don't believe we can support this in non-Android userspace, so
I don't have any other comment on this patch.

Cheers,
Daniel
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx