Re: [Mesa-dev] [PATCH v3 13/17] uapi/drm/dg2: Format modifier for DG2 unified compression and clear color

2021-12-10 Thread Nanley Chery
Ping. I see that a v4 has been sent out without these comments being addressed.

-Nanley

On Tue, Dec 7, 2021 at 6:51 PM Nanley Chery  wrote:
>
> Hi Ramalingam,
>
> On Wed, Oct 27, 2021 at 5:22 PM Ramalingam C  wrote:
> >
> > From: Matt Roper 
> >
> > DG2 unifies render compression and media compression into a single
> > format for the first time.  The programming and buffer layout is
> > supposed to match compression on older gen12 platforms, but the
> > actual compression algorithm is different from any previous platform; as
> > such, we need a new framebuffer modifier to represent buffers in this
> > format, but otherwise we can re-use the existing gen12 compression driver
> > logic.
> >
> > DG2 clear color render compression uses Tile4 layout. Therefore, we need
> > to define a new format modifier for uAPI to support clear color rendering.
> >
>
> I left some feedback on the modifier texts below, but I think it also
> applies to this commit message.
>
> > v2: Rebased on new format modifier check [Ram]
> >
> > Signed-off-by: Matt Roper 
> > Signed-off-by: Mika Kahola  (v2)
> > Signed-off-by: Juha-Pekka Heikkilä 
> > Signed-off-by: Ramalingam C 
> > cc: Simon Ser 
> > Cc: Pekka Paalanen 
> > Cc: Jordan Justen 
> > Cc: Kenneth Graunke 
> > Cc: mesa-dev@lists.freedesktop.org
> > Cc: Tony Ye 
> > Cc: Slawomir Milczarek 
> > Acked-by: Simon Ser 
> > ---
> >  drivers/gpu/drm/i915/display/intel_fb.c   | 43 +++
> >  .../drm/i915/display/skl_universal_plane.c| 29 -
> >  include/uapi/drm/drm_fourcc.h | 30 +
> >  3 files changed, 101 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/display/intel_fb.c b/drivers/gpu/drm/i915/display/intel_fb.c
> > index 562d5244688d..484ae1fd0e94 100644
> > --- a/drivers/gpu/drm/i915/display/intel_fb.c
> > +++ b/drivers/gpu/drm/i915/display/intel_fb.c
> > @@ -106,6 +106,21 @@ static const struct drm_format_info gen12_ccs_cc_formats[] = {
> >   .hsub = 1, .vsub = 1, .has_alpha = true },
> >  };
> >
> > +static const struct drm_format_info gen12_flat_ccs_cc_formats[] = {
> > +   { .format = DRM_FORMAT_XRGB8888, .depth = 24, .num_planes = 2,
> > + .char_per_block = { 4, 0 }, .block_w = { 1, 2 }, .block_h = { 1, 1 },
> > + .hsub = 1, .vsub = 1, },
> > +   { .format = DRM_FORMAT_XBGR8888, .depth = 24, .num_planes = 2,
> > + .char_per_block = { 4, 0 }, .block_w = { 1, 2 }, .block_h = { 1, 1 },
> > + .hsub = 1, .vsub = 1, },
> > +   { .format = DRM_FORMAT_ARGB8888, .depth = 32, .num_planes = 2,
> > + .char_per_block = { 4, 0 }, .block_w = { 1, 2 }, .block_h = { 1, 1 },
> > + .hsub = 1, .vsub = 1, .has_alpha = true },
> > +   { .format = DRM_FORMAT_ABGR8888, .depth = 32, .num_planes = 2,
> > + .char_per_block = { 4, 0 }, .block_w = { 1, 2 }, .block_h = { 1, 1 },
> > + .hsub = 1, .vsub = 1, .has_alpha = true },
> > +};
> > +
> >  struct intel_modifier_desc {
> > u64 modifier;
> > struct {
> > @@ -166,6 +181,27 @@ static const struct intel_modifier_desc intel_modifiers[] = {
> > .ccs.packed_aux_planes = BIT(1),
> >
> > FORMAT_OVERRIDE(gen12_ccs_cc_formats),
> > +   }, {
> > +   .modifier = I915_FORMAT_MOD_4_TILED_DG2_RC_CCS,
> > +   .display_ver = { 12, 13 },
> > +   .tiling = I915_TILING_NONE,
> > +
> > +   .ccs.type = INTEL_CCS_RC,
> > +   }, {
> > +   .modifier = I915_FORMAT_MOD_4_TILED_DG2_MC_CCS,
> > +   .display_ver = { 12, 13 },
> > +   .tiling = I915_TILING_NONE,
> > +
> > +   .ccs.type = INTEL_CCS_MC,
> > +   }, {
> > +   .modifier = I915_FORMAT_MOD_4_TILED_DG2_RC_CCS_CC,
> > +   .display_ver = { 12, 13 },
> > +   .tiling = I915_TILING_NONE,
> > +
> > +   .ccs.type = INTEL_CCS_RC_CC,
> > +   .ccs.cc_planes = BIT(1),
> > +
> > +   FORMAT_OVERRIDE(gen12_flat_ccs_cc_formats),
> > }, {
> > .modifier = I915_FORMAT_MOD_Yf_TILED_CCS,
> > .display_ver = { 9, 11 },
> > @@ -582,6 +618,9 @@ intel_tile_width_bytes(const struct drm_framebuffer *fb, int color_plane)
> > return 

Re: [Mesa-dev] [PATCH v3 13/17] uapi/drm/dg2: Format modifier for DG2 unified compression and clear color

2021-12-08 Thread Nanley Chery
Hi Ramalingam,

On Wed, Oct 27, 2021 at 5:22 PM Ramalingam C  wrote:
>
> From: Matt Roper 
>
> DG2 unifies render compression and media compression into a single
> format for the first time.  The programming and buffer layout is
> supposed to match compression on older gen12 platforms, but the
> actual compression algorithm is different from any previous platform; as
> such, we need a new framebuffer modifier to represent buffers in this
> format, but otherwise we can re-use the existing gen12 compression driver
> logic.
>
> DG2 clear color render compression uses Tile4 layout. Therefore, we need
> to define a new format modifier for uAPI to support clear color rendering.
>

I left some feedback on the modifier texts below, but I think it also
applies to this commit message.
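
For readers less familiar with how a framebuffer modifier like the ones proposed here is encoded, below is a minimal sketch of the vendor/value packing used by the kernel's fourcc_mod_code() macro: the vendor id sits in the top 8 bits and the vendor-specific value in the low 56 bits. The sketch_* names are hypothetical, and the value 9 used in the usage note is purely illustrative, since the values this series assigns are not visible in the quoted text.

```c
#include <assert.h>
#include <stdint.h>

/* The vendor id occupies the top 8 bits of a 64-bit modifier. */
#define SKETCH_MOD_VENDOR_SHIFT 56
#define SKETCH_MOD_VALUE_MASK   0x00ffffffffffffffULL
#define SKETCH_MOD_VENDOR_INTEL 0x01ULL

/* Hypothetical helper mirroring the idea behind fourcc_mod_code(). */
static uint64_t sketch_mod_code(uint64_t vendor, uint64_t value)
{
   assert(vendor <= 0xff);
   return (vendor << SKETCH_MOD_VENDOR_SHIFT) |
          (value & SKETCH_MOD_VALUE_MASK);
}

/* Extract the vendor id from a packed modifier. */
static uint64_t sketch_mod_vendor(uint64_t modifier)
{
   return modifier >> SKETCH_MOD_VENDOR_SHIFT;
}

/* Extract the vendor-specific value from a packed modifier. */
static uint64_t sketch_mod_value(uint64_t modifier)
{
   return modifier & SKETCH_MOD_VALUE_MASK;
}
```

With these helpers, sketch_mod_code(SKETCH_MOD_VENDOR_INTEL, 9) packs an Intel modifier whose vendor and value can be read back with the two accessors. Each new compression layout gets its own value because userspace can only tell layouts apart by comparing the full 64-bit modifier.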

> v2: Rebased on new format modifier check [Ram]
>
> Signed-off-by: Matt Roper 
> Signed-off-by: Mika Kahola  (v2)
> Signed-off-by: Juha-Pekka Heikkilä 
> Signed-off-by: Ramalingam C 
> cc: Simon Ser 
> Cc: Pekka Paalanen 
> Cc: Jordan Justen 
> Cc: Kenneth Graunke 
> Cc: mesa-dev@lists.freedesktop.org
> Cc: Tony Ye 
> Cc: Slawomir Milczarek 
> Acked-by: Simon Ser 
> ---
>  drivers/gpu/drm/i915/display/intel_fb.c   | 43 +++
>  .../drm/i915/display/skl_universal_plane.c| 29 -
>  include/uapi/drm/drm_fourcc.h | 30 +
>  3 files changed, 101 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_fb.c b/drivers/gpu/drm/i915/display/intel_fb.c
> index 562d5244688d..484ae1fd0e94 100644
> --- a/drivers/gpu/drm/i915/display/intel_fb.c
> +++ b/drivers/gpu/drm/i915/display/intel_fb.c
> @@ -106,6 +106,21 @@ static const struct drm_format_info gen12_ccs_cc_formats[] = {
>   .hsub = 1, .vsub = 1, .has_alpha = true },
>  };
>
> +static const struct drm_format_info gen12_flat_ccs_cc_formats[] = {
> +   { .format = DRM_FORMAT_XRGB8888, .depth = 24, .num_planes = 2,
> + .char_per_block = { 4, 0 }, .block_w = { 1, 2 }, .block_h = { 1, 1 },
> + .hsub = 1, .vsub = 1, },
> +   { .format = DRM_FORMAT_XBGR8888, .depth = 24, .num_planes = 2,
> + .char_per_block = { 4, 0 }, .block_w = { 1, 2 }, .block_h = { 1, 1 },
> + .hsub = 1, .vsub = 1, },
> +   { .format = DRM_FORMAT_ARGB8888, .depth = 32, .num_planes = 2,
> + .char_per_block = { 4, 0 }, .block_w = { 1, 2 }, .block_h = { 1, 1 },
> + .hsub = 1, .vsub = 1, .has_alpha = true },
> +   { .format = DRM_FORMAT_ABGR8888, .depth = 32, .num_planes = 2,
> + .char_per_block = { 4, 0 }, .block_w = { 1, 2 }, .block_h = { 1, 1 },
> + .hsub = 1, .vsub = 1, .has_alpha = true },
> +};
> +
>  struct intel_modifier_desc {
> u64 modifier;
> struct {
> @@ -166,6 +181,27 @@ static const struct intel_modifier_desc intel_modifiers[] = {
> .ccs.packed_aux_planes = BIT(1),
>
> FORMAT_OVERRIDE(gen12_ccs_cc_formats),
> +   }, {
> +   .modifier = I915_FORMAT_MOD_4_TILED_DG2_RC_CCS,
> +   .display_ver = { 12, 13 },
> +   .tiling = I915_TILING_NONE,
> +
> +   .ccs.type = INTEL_CCS_RC,
> +   }, {
> +   .modifier = I915_FORMAT_MOD_4_TILED_DG2_MC_CCS,
> +   .display_ver = { 12, 13 },
> +   .tiling = I915_TILING_NONE,
> +
> +   .ccs.type = INTEL_CCS_MC,
> +   }, {
> +   .modifier = I915_FORMAT_MOD_4_TILED_DG2_RC_CCS_CC,
> +   .display_ver = { 12, 13 },
> +   .tiling = I915_TILING_NONE,
> +
> +   .ccs.type = INTEL_CCS_RC_CC,
> +   .ccs.cc_planes = BIT(1),
> +
> +   FORMAT_OVERRIDE(gen12_flat_ccs_cc_formats),
> }, {
> .modifier = I915_FORMAT_MOD_Yf_TILED_CCS,
> .display_ver = { 9, 11 },
> @@ -582,6 +618,9 @@ intel_tile_width_bytes(const struct drm_framebuffer *fb, int color_plane)
> return 128;
> else
> return 512;
> +   case I915_FORMAT_MOD_4_TILED_DG2_RC_CCS:
> +   case I915_FORMAT_MOD_4_TILED_DG2_MC_CCS:
> +   case I915_FORMAT_MOD_4_TILED_DG2_RC_CCS_CC:
> case I915_FORMAT_MOD_4_TILED:
> /*
>  * Each 4K tile consists of 64B(8*8) subtiles, with
> @@ -759,6 +798,10 @@ unsigned int intel_surf_alignment(const struct drm_framebuffer *fb,
> case I915_FORMAT_MOD_4_TILED:
> case I915_FORMAT_MOD_Yf_TILED:
> return 1 * 1024 * 1024;
> +   case I915_FORMAT_MOD_4_TILED_DG2_RC_CCS:
> +   case I915_FORMAT_MOD_4_TILED_DG2_RC_CCS_CC:
> +   case I915_FORMAT_MOD_4_TILED_DG2_MC_CCS:
> +   return 16 * 1024;
> default:
> MISSING_CASE(fb->modifier);
> return 0;
> diff --git a/drivers/gpu/drm/i915/display/skl_universal_plane.c b/drivers/gpu/drm/i915/display/skl_universal_plane.c
> index 

Re: [Mesa-dev] What does WIP really mean in an MR?

2019-07-01 Thread Nanley Chery
On Sat, Jun 29, 2019 at 4:40 PM Eric Engestrom  wrote:
>
> On Saturday, 2019-06-29 22:59:21 +0200, apinheiro wrote:
> >
> > On 29/6/19 2:30, Rob Clark wrote:
> > > I had interpreted it as literally the "block the gitlab merge button"
> > > option, ie. "I want to get feedback but it is not ready to merge and
> > > I'll drop the WIP tag when I think it is"..
> >
> >
> > >
> > > (comments inline)
> > >
> > > On Fri, Jun 28, 2019 at 5:12 PM Ian Romanick  wrote:
> > > > After a conversation yesterday with a couple of the other Intel devs,
> > > > I've come to the conclusion that *everyone* interprets WIP to mean
> > > > something different.  I heard no less than four interpretations.
> > > >
> > > > * This series is good.  It hasn't been reviewed, so don't click "merge."
> > > isn't that the point of a MR.. doesn't seem like a reason for "WIP"
> >
> >
> > > I agree with Rob here. My understanding of WIP is that there is some reason
> > > that prevents an MR from being merged, even if I still want it to be reviewed, or
> > > even if it is fully reviewed. For example, I send an MR with patches that I think
> > > are correct, so people can start to review it. But on a rebase, I find
> > > that the branch causes regressions on piglit/cts/whatever. I still think
> > > that the patches are correct, but those regressions need to be investigated
> > > before pushing. So I just add a WIP on the MR to prevent it from being merged by
> > > mistake.
>
> Kind of just saying "+1" here, but yeah that's also my understanding of "WIP".
>

A help page in our gitlab instance [1] suggests a more relaxed/broader usage
of the WIP tag. So, I had been using WIP to block accidental merges that don't
meet the usual merge criteria (e.g., the code isn't ready or it lacks
reviewed-by tags).
I don't mind avoiding it on MRs that are gated only on the review tags.

1. https://gitlab.freedesktop.org/help/user/project/merge_requests/work_in_progress_merge_requests.md

> >
> > >
> > > > * This series has some sketchy bits.  It probably isn't ready for review
> > > > unless you've been tagged for design feedback.
> > > I guess I'd also use WIP for "I want some early feedback, but it isn't
> > > ready yet".. but in this case I'd also poke people who I wanted to
> > > look at it
> >
> >
> > I thought that in this case RFC was used. Or RFC was dropped on gitlab MRs?
>
> Agreed; and "RFC" is definitely being used:
> https://gitlab.freedesktop.org/mesa/mesa/merge_requests?state=all=RFC
>
> Adding "WIP" as well prevents accidental merging, so it makes sense.
>
> >
> >
> > >
> > > > * This series has been reviewed.  Incorporation of detailed feedback is
> > > > in progress, but it's going to take some time.
> > > I suppose also a case for "WIP"..
> > >
> > > > * This series is good, but there are some questionable patches at the end.
> > > I guess in this case, I'd reform things into multiple MR's, one with
> > > the parts ready to go, and one w/ the remaining WIP bits
> > >
> > > BR,
> > > -R
> > >
> > > > Due to this lack of common understanding, we discovered at least one MR
> > > > that was ready to go but had been ignored for months. :(  This makes me
> > > > wonder if other MRs have similarly languished for no good reason.
> > > >
> > > > Can we formulate some guidelines for how people should apply WIP to
> > > > their MRs and how people should interpret WIP when they see it on an MR?
>
> Once we reach a consensus, we should write it down in docs/submittingpatches.html

Agreed.

-Nanley

> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/9] intel/blorp: Only double the fast-clear rect alignment on HSW

2019-05-31 Thread Nanley Chery
Thanks for reaching out to the HW team. Given that the internal
documentation was updated to set the Project field of this restriction
to HSW:GT3, what do you think about shortening the comment to mention
that? I'd like to give this an RB as is, but there are a lot of truth
claims I'd have to verify in order to do so.

-Nanley
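
To make the restriction under discussion concrete, here is a rough sketch of expanding a clear rectangle to a given alignment, doubling that alignment only for Haswell GT3. The sketch_* names and the is_hsw_gt3 flag are hypothetical; the real get_fast_clear_rect() also derives the alignment from the surface and scales the rectangle down, which is left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical rectangle type: [x0, x1) x [y0, y1) in pixels. */
struct sketch_rect {
   uint32_t x0, y0, x1, y1;
};

static uint32_t sketch_align_down(uint32_t v, uint32_t a)
{
   return v - (v % a);
}

static uint32_t sketch_align_up(uint32_t v, uint32_t a)
{
   return sketch_align_down(v + a - 1, a);
}

/* Grow the rectangle outward to the clear alignment.  The doubling is
 * applied only when is_hsw_gt3 is set, per the HW team's confirmation
 * quoted in this thread. */
static struct sketch_rect
sketch_fast_clear_rect(struct sketch_rect r, uint32_t x_align,
                       uint32_t y_align, bool is_hsw_gt3)
{
   assert(x_align > 0 && y_align > 0);

   if (is_hsw_gt3) {
      x_align *= 2;
      y_align *= 2;
   }

   r.x0 = sketch_align_down(r.x0, x_align);
   r.y0 = sketch_align_down(r.y0, y_align);
   r.x1 = sketch_align_up(r.x1, x_align);
   r.y1 = sketch_align_up(r.y1, y_align);
   return r;
}
```

For a 128x64 alignment, a rectangle from (10, 10) to (100, 100) grows to (0, 0)-(128, 128) without the doubling and to (0, 0)-(256, 128) with it, which shows how a doubled alignment can reach beyond a 128x64-aligned miplevel.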

On Mon, Dec 3, 2018 at 2:48 PM Jason Ekstrand  wrote:
>
> I've received confirmation from the HW team that the extra doubling is only needed on Haswell GT3.
>
> On Tue, May 15, 2018 at 5:28 PM Jason Ekstrand  wrote:
>>
>> The data in the commit message is a bit sketchy for Ivybridge.  We don't
>> run dEQP or any of the CTSs on Ivybridge in CI so all the data we have
>> is piglit.  On Haswell, piglit didn't catch anything so we don't have
>> anything to go off of for Ivybridge besides the fact that the restriction
>> wasn't added until Haswell.
>> ---
>>  src/intel/blorp/blorp_clear.c | 66 
>> ---
>>  1 file changed, 56 insertions(+), 10 deletions(-)
>>
>> diff --git a/src/intel/blorp/blorp_clear.c b/src/intel/blorp/blorp_clear.c
>> index 832e8ee..618625b 100644
>> --- a/src/intel/blorp/blorp_clear.c
>> +++ b/src/intel/blorp/blorp_clear.c
>> @@ -235,16 +235,62 @@ get_fast_clear_rect(const struct isl_device *dev,
>>x_scaledown = x_align / 2;
>>y_scaledown = y_align / 2;
>>
>> -  /* From BSpec: 3D-Media-GPGPU Engine > 3D Pipeline > Pixel > Pixel
>> -   * Backend > MCS Buffer for Render Target(s) [DevIVB+] > Table "Color
>> -   * Clear of Non-MultiSampled Render Target Restrictions":
>> -   *
>> -   *   Clear rectangle must be aligned to two times the number of
>> -   *   pixels in the table shown below due to 16x16 hashing across the
>> -   *   slice.
>> -   */
>> -  x_align *= 2;
>> -  y_align *= 2;
>> +  if (ISL_DEV_IS_HASWELL(dev)) {
>> + /* The following text was added in the Haswell PRM, "3D Media GPGPU
>> +  * Engine" >> "MCS Buffer for Render Target(s)" >> Table "Color Clear
>> +  * of Non-MultiSampled Render Target Restrictions":
>> +  *
>> +  *"Clear rectangle must be aligned to two times the number of
>> +  *pixels in the table shown below due to 16X16 hashing across the
>> +  *slice."
>> +  *
>> +  * It has persisted in the documentation for all platforms up until
>> +  * Cannonlake and possibly even beyond.  However, we believe that it
>> +  * is only needed on Haswell.
>> +  *
>> +  * There are a couple possible explanations for this restriction:
>> +  *
>> +  * 1) If you assume that the hardware is writing to the CCS as
>> +  *bytes, then the x/y_align computed above gives you an alignment
>> +  *in the CCS of 8x8 bytes and, if 16x16 is needed for hashing, we
>> +  *need to multiply by 2.
>> +  *
>> +  * 2) Haswell is a bit unique in that its CCS tiling does not line
>> +  *up with Y-tiling on a cache-line granularity.  Instead, it has
>> +  *an extra bit of swizzling in bit 9.  Also, bit 6 swizzling
>> +  *applies to the CCS on Haswell.  This means that Haswell CCS
>> +  *does not match on a cache-line granularity but it does match on
>> +  *a 2x2 cache line granularity.
>> +  *
>> +  * Clearly, the first explanation seems to follow documentation the
>> +  * best but they may be related.  In any case, empirical evidence
>> +  * seems to confirm that it is, indeed required on Haswell.
>> +  *
>> +  * On Broadwell things get a bit stickier.  Broadwell adds support
>> +  * for mip-mapped CCS with an alignment in the CCS of 256x128.  For a
>> +  * 32bpb main surface, the above computation will yield a x/y_align
>> +  * of 128x128 for a Y-tiled main surface and 256x64 for X-tiled.  In
>> +  * either case, if we double the alignment, we will get an alignment
>> +  * bigger than horizontal and vertical alignment of the CCS and fast
>> +  * clears of one LOD may leak into others.
>> +  *
>> +  * Starting with Skylake, the image alignment for the CCS is only
>> +  * 128x64 which is exactly the x/y_align computed above if the main
>> +  * surface has a 32bpb format.  Also, the "Render Target Resolve"
>> +  * page in the bspec (not the PRM) says, "The Resolve Rectangle size
>> +  * is same as Clear Rectangle size from SKL+".  The x/y_align
>> +  * computed above (without doubling) match the resolve rectangle
>> +  * calculation perfectly.
>> +  *
>> +  * Finally, to confirm all this, a full test run was performed on
>> +  * Feb. 9, 2018 with this doubling removed and the only platform
>> +  * which 

Re: [Mesa-dev] [PATCH v3 03/18] intel/blorp: Use the hardware op for CCS ambiguate on gen10+

2019-05-30 Thread Nanley Chery
Thanks. Landed.

On Thu, May 30, 2019 at 7:02 AM Jason Ekstrand  wrote:
>
> Feel free to land
>
> On Wed, May 29, 2019 at 4:50 PM Nanley Chery  wrote:
>>
>> On Wed, Feb 14, 2018 at 12:19 PM Jason Ekstrand  wrote:
>> >
>> > Cannonlake hardware adds a new resolve type in 3DSTATE_PS called
>> > FAST_CLEAR_0 which does an ambiguate.  Now that the hardware can do it
>> > directly, we should use that instead of binding the CCS as a render
>> > target and doing it manually.  This was tested with a full Vulkan CTS
>> > run on Cannonlake.
>> > ---
>> >  src/intel/blorp/blorp_clear.c | 12 +++-
>> >  src/intel/blorp/blorp_genX_exec.h |  6 ++
>> >  2 files changed, 17 insertions(+), 1 deletion(-)
>> >
>>
>> This patch is
>> Reviewed-by: Nanley Chery 
>>
>> > diff --git a/src/intel/blorp/blorp_clear.c b/src/intel/blorp/blorp_clear.c
>> > index 421a6c5..4ba65d0 100644
>> > --- a/src/intel/blorp/blorp_clear.c
>> > +++ b/src/intel/blorp/blorp_clear.c
>> > @@ -758,7 +758,11 @@ blorp_ccs_resolve(struct blorp_batch *batch,
>> > params.x1 = ALIGN(params.x1, x_scaledown) / x_scaledown;
>> > params.y1 = ALIGN(params.y1, y_scaledown) / y_scaledown;
>> >
>> > -   if (batch->blorp->isl_dev->info->gen >= 9) {
>> > +   if (batch->blorp->isl_dev->info->gen >= 10) {
>> > +  assert(resolve_op == ISL_AUX_OP_FULL_RESOLVE ||
>> > + resolve_op == ISL_AUX_OP_PARTIAL_RESOLVE ||
>> > + resolve_op == ISL_AUX_OP_AMBIGUATE);
>> > +   } else if (batch->blorp->isl_dev->info->gen >= 9) {
>> >assert(resolve_op == ISL_AUX_OP_FULL_RESOLVE ||
>> >   resolve_op == ISL_AUX_OP_PARTIAL_RESOLVE);
>> > } else {
>> > @@ -893,6 +897,12 @@ blorp_ccs_ambiguate(struct blorp_batch *batch,
>> >  struct blorp_surf *surf,
>> >  uint32_t level, uint32_t layer)
>> >  {
>> > +   if (ISL_DEV_GEN(batch->blorp->isl_dev) >= 10) {
>> > +  /* On gen10 and above, we have a hardware resolve op for this */
>> > +  return blorp_ccs_resolve(batch, surf, level, layer, 1,
>> > +   surf->surf->format, ISL_AUX_OP_AMBIGUATE);
>> > +   }
>> > +
>> > struct blorp_params params;
>> > blorp_params_init(&params);
>> >
>> > diff --git a/src/intel/blorp/blorp_genX_exec.h b/src/intel/blorp/blorp_genX_exec.h
>> > index 5e1312a..85abf6b 100644
>> > --- a/src/intel/blorp/blorp_genX_exec.h
>> > +++ b/src/intel/blorp/blorp_genX_exec.h
>> > @@ -752,6 +752,12 @@ blorp_emit_ps_config(struct blorp_batch *batch,
>> >switch (params->fast_clear_op) {
>> >case ISL_AUX_OP_NONE:
>> >   break;
>> > +#if GEN_GEN >= 10
>> > +  case ISL_AUX_OP_AMBIGUATE:
>> > + ps.RenderTargetFastClearEnable = true;
>> > + ps.RenderTargetResolveType = FAST_CLEAR_0;
>> > + break;
>> > +#endif
>> >  #if GEN_GEN >= 9
>> >case ISL_AUX_OP_PARTIAL_RESOLVE:
>> >   ps.RenderTargetResolveType = RESOLVE_PARTIAL;
>> > --
>> > 2.5.0.400.gff86faf
>> >

Re: [Mesa-dev] [PATCH v3 03/18] intel/blorp: Use the hardware op for CCS ambiguate on gen10+

2019-05-29 Thread Nanley Chery
On Wed, Feb 14, 2018 at 12:19 PM Jason Ekstrand  wrote:
>
> Cannonlake hardware adds a new resolve type in 3DSTATE_PS called
> FAST_CLEAR_0 which does an ambiguate.  Now that the hardware can do it
> directly, we should use that instead of binding the CCS as a render
> target and doing it manually.  This was tested with a full Vulkan CTS
> run on Cannonlake.
> ---
>  src/intel/blorp/blorp_clear.c | 12 +++-
>  src/intel/blorp/blorp_genX_exec.h |  6 ++
>  2 files changed, 17 insertions(+), 1 deletion(-)
>

This patch is
Reviewed-by: Nanley Chery 
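
As a reading aid, the per-generation checks in the first hunk below can be summarized as a small predicate. This is only a sketch: the sketch_* names are hypothetical, and the pre-gen9 case (full resolve only) is an assumption, because that branch is cut off in the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the ISL_AUX_OP_* values used in the patch. */
enum sketch_aux_op {
   SKETCH_AUX_OP_FULL_RESOLVE,
   SKETCH_AUX_OP_PARTIAL_RESOLVE,
   SKETCH_AUX_OP_AMBIGUATE,
};

/* Returns whether the hardware resolve path accepts the given op. */
static bool sketch_hw_resolve_op_supported(int gen, enum sketch_aux_op op)
{
   assert(gen > 0);

   switch (op) {
   case SKETCH_AUX_OP_FULL_RESOLVE:
      return true;        /* assumed to be accepted on every gen */
   case SKETCH_AUX_OP_PARTIAL_RESOLVE:
      return gen >= 9;    /* matches the gen >= 9 assert */
   case SKETCH_AUX_OP_AMBIGUATE:
      return gen >= 10;   /* the FAST_CLEAR_0 resolve type is new in gen10 */
   }
   return false;
}
```

With a predicate like this, blorp_ccs_ambiguate() taking the hardware path corresponds to sketch_hw_resolve_op_supported(gen, SKETCH_AUX_OP_AMBIGUATE) being true, and the manual render-target path remains the fallback otherwise.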

> diff --git a/src/intel/blorp/blorp_clear.c b/src/intel/blorp/blorp_clear.c
> index 421a6c5..4ba65d0 100644
> --- a/src/intel/blorp/blorp_clear.c
> +++ b/src/intel/blorp/blorp_clear.c
> @@ -758,7 +758,11 @@ blorp_ccs_resolve(struct blorp_batch *batch,
> params.x1 = ALIGN(params.x1, x_scaledown) / x_scaledown;
> params.y1 = ALIGN(params.y1, y_scaledown) / y_scaledown;
>
> -   if (batch->blorp->isl_dev->info->gen >= 9) {
> +   if (batch->blorp->isl_dev->info->gen >= 10) {
> +  assert(resolve_op == ISL_AUX_OP_FULL_RESOLVE ||
> + resolve_op == ISL_AUX_OP_PARTIAL_RESOLVE ||
> + resolve_op == ISL_AUX_OP_AMBIGUATE);
> +   } else if (batch->blorp->isl_dev->info->gen >= 9) {
>assert(resolve_op == ISL_AUX_OP_FULL_RESOLVE ||
>   resolve_op == ISL_AUX_OP_PARTIAL_RESOLVE);
> } else {
> @@ -893,6 +897,12 @@ blorp_ccs_ambiguate(struct blorp_batch *batch,
>  struct blorp_surf *surf,
>  uint32_t level, uint32_t layer)
>  {
> +   if (ISL_DEV_GEN(batch->blorp->isl_dev) >= 10) {
> +  /* On gen10 and above, we have a hardware resolve op for this */
> +  return blorp_ccs_resolve(batch, surf, level, layer, 1,
> +   surf->surf->format, ISL_AUX_OP_AMBIGUATE);
> +   }
> +
> struct blorp_params params;
> blorp_params_init(&params);
>
> diff --git a/src/intel/blorp/blorp_genX_exec.h b/src/intel/blorp/blorp_genX_exec.h
> index 5e1312a..85abf6b 100644
> --- a/src/intel/blorp/blorp_genX_exec.h
> +++ b/src/intel/blorp/blorp_genX_exec.h
> @@ -752,6 +752,12 @@ blorp_emit_ps_config(struct blorp_batch *batch,
>switch (params->fast_clear_op) {
>case ISL_AUX_OP_NONE:
>   break;
> +#if GEN_GEN >= 10
> +  case ISL_AUX_OP_AMBIGUATE:
> + ps.RenderTargetFastClearEnable = true;
> + ps.RenderTargetResolveType = FAST_CLEAR_0;
> + break;
> +#endif
>  #if GEN_GEN >= 9
>case ISL_AUX_OP_PARTIAL_RESOLVE:
>   ps.RenderTargetResolveType = RESOLVE_PARTIAL;
> --
> 2.5.0.400.gff86faf
>

Re: [Mesa-dev] [PATCH] intel/isl: Align clear color buffer to full cacheline

2019-04-18 Thread Nanley Chery
On Thu, Apr 18, 2019 at 08:19:38AM -0700, Kenneth Graunke wrote:
> On Wednesday, April 17, 2019 1:31:28 PM PDT Nanley Chery wrote:
> > On Wed, Apr 17, 2019 at 11:34:15AM -0700, Rafael Antognolli wrote:
> > > On Wed, Apr 17, 2019 at 09:04:09AM -0700, Kenneth Graunke wrote:
> > > > On Wednesday, April 17, 2019 7:16:28 AM PDT Topi Pohjolainen wrote:
> > > > > From: Rafael Antognolli 
> > > > > 
> > > > > Fixes MCS fast clear gpu hangs with Vulkan CTS on ICL in CI.
> > > > > 
> > > > > CC: Anuj Phogat 
> > > > > CC: Kenneth Graunke 
> > > > > Tested-by: Topi Pohjolainen 
> > > > > Signed-off-by: Rafael Antognolli 
> > > > > ---
> > > > >  src/intel/isl/isl.c | 3 ++-
> > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> > > > > index 6b9e6c9e0f0..acfed5119ba 100644
> > > > > --- a/src/intel/isl/isl.c
> > > > > +++ b/src/intel/isl/isl.c
> > > > > @@ -122,7 +122,8 @@ isl_device_init(struct isl_device *dev,
> > > > > dev->ss.size = RENDER_SURFACE_STATE_length(info) * 4;
> > > > > dev->ss.align = isl_align(dev->ss.size, 32);
> > > > >  
> > > > > -   dev->ss.clear_color_state_size = CLEAR_COLOR_length(info) * 4;
> > > > > +   dev->ss.clear_color_state_size =
> > > > > +  isl_align(CLEAR_COLOR_length(info) * 4, 64);
> > > > > dev->ss.clear_color_state_offset =
> > > > >RENDER_SURFACE_STATE_ClearValueAddress_start(info) / 32 * 4;
> > > > >  
> > > > > 
> > > > 
> > > > I'm not as familiar with Vulkan, but it looks like we're storing this
> > > > clear color data as part of the underlying image's BO, rather than as
> > > > a separate piece of data.  I wonder if it has anything to do with that
> > > > BO being considered tiled, so something is trying to access an entire
> > > > cacheline around here.  Or it's offsetting following data to not be
> > > > cacheline aligned...
> > > 
> > > Hmmm... Yeah, we store it after the aux buffer, in the same BO as the
> > > image one.
> > > 
> > > What I think it's the biggest issue in Vulkan is that we store some
> > > data (resolve type and tracking) right after the clear color data. And
> > > the data size is 32B, but the docs say it should be the lower 32B of a
> > > cacheline. For some reason I thought it was safe to write stuff into the
> > > higher 32B, but apparently it wasn't :-/
> > > 
> > > > I did notice that the clear address has to be 64B aligned.
> > > 
> > > My understanding is that the image and aux surface were always 4K
> > > aligned, so this restriction would be met. I guess making an assert for
> > > it wouldn't hurt, though...
> > 
> > This fix looks good to me. I'm a little confused about the title. I see
> > how this change pads the buffer, but I don't see how it aligns it.
> > 
> > -Nanley
> 
> Agreed, perhaps we can change the title to "Resize" instead of "Align".
> 
> With that,
> Reviewed-by: Kenneth Graunke 
> 
> Nanley, would it get a R-b from you with that wording change?

Yep. With that change, this patch is
Reviewed-by: Nanley Chery 

Re: [Mesa-dev] [PATCH] intel/isl: Align clear color buffer to full cacheline

2019-04-17 Thread Nanley Chery
On Wed, Apr 17, 2019 at 11:34:15AM -0700, Rafael Antognolli wrote:
> On Wed, Apr 17, 2019 at 09:04:09AM -0700, Kenneth Graunke wrote:
> > On Wednesday, April 17, 2019 7:16:28 AM PDT Topi Pohjolainen wrote:
> > > From: Rafael Antognolli 
> > > 
> > > Fixes MCS fast clear gpu hangs with Vulkan CTS on ICL in CI.
> > > 
> > > CC: Anuj Phogat 
> > > CC: Kenneth Graunke 
> > > Tested-by: Topi Pohjolainen 
> > > Signed-off-by: Rafael Antognolli 
> > > ---
> > >  src/intel/isl/isl.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> > > index 6b9e6c9e0f0..acfed5119ba 100644
> > > --- a/src/intel/isl/isl.c
> > > +++ b/src/intel/isl/isl.c
> > > @@ -122,7 +122,8 @@ isl_device_init(struct isl_device *dev,
> > > dev->ss.size = RENDER_SURFACE_STATE_length(info) * 4;
> > > dev->ss.align = isl_align(dev->ss.size, 32);
> > >  
> > > -   dev->ss.clear_color_state_size = CLEAR_COLOR_length(info) * 4;
> > > +   dev->ss.clear_color_state_size =
> > > +  isl_align(CLEAR_COLOR_length(info) * 4, 64);
> > > dev->ss.clear_color_state_offset =
> > >RENDER_SURFACE_STATE_ClearValueAddress_start(info) / 32 * 4;
> > >  
> > > 
> > 
> > I'm not as familiar with Vulkan, but it looks like we're storing this
> > clear color data as part of the underlying image's BO, rather than as
> > a separate piece of data.  I wonder if it has anything to do with that
> > BO being considered tiled, so something is trying to access an entire
> > cacheline around here.  Or it's offsetting following data to not be
> > cacheline aligned...
> 
> Hmmm... Yeah, we store it after the aux buffer, in the same BO as the
> image one.
> 
> What I think it's the biggest issue in Vulkan is that we store some
> data (resolve type and tracking) right after the clear color data. And
> the data size is 32B, but the docs say it should be the lower 32B of a
> cacheline. For some reason I thought it was safe to write stuff into the
> higher 32B, but apparently it wasn't :-/
> 
> > I did notice that the clear address has to be 64B aligned.
> 
> My understanding is that the image and aux surface were always 4K
> aligned, so this restriction would be met. I guess making an assert for
> it wouldn't hurt, though...

This fix looks good to me. I'm a little confused about the title. I see
how this change pads the buffer, but I don't see how it aligns it.

-Nanley
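
To spell out the pad-versus-align distinction, here is a small sketch. Rounding the size up to 64B changes where the data that follows the clear color lands, while alignment is a property of the offset the clear color itself is placed at. All sketch_* names are hypothetical, and the 32B size and 4K offset are taken from the discussion above rather than from the code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round n up to a power-of-two multiple a. */
static uint64_t sketch_align_up(uint64_t n, uint64_t a)
{
   assert(a != 0 && (a & (a - 1)) == 0);
   return (n + a - 1) & ~(a - 1);
}

/* Padding: rounds the *size* of the clear color state up to a cacheline. */
static uint64_t sketch_padded_size(uint64_t size_B)
{
   return sketch_align_up(size_B, 64);
}

/* Alignment: a check on the *offset* at which the state is placed. */
static bool sketch_is_aligned(uint64_t offset_B, uint64_t a)
{
   assert(a != 0 && (a & (a - 1)) == 0);
   return (offset_B & (a - 1)) == 0;
}

/* Offset of whatever is stored immediately after the clear color. */
static uint64_t sketch_next_offset(uint64_t offset_B, uint64_t size_B)
{
   return offset_B + size_B;
}
```

With a 32B clear color at offset 4096, the data that follows starts at 4128, inside the same 64B cacheline. With the padded size it starts at 4160, in the next cacheline. In both cases the clear color's own offset is unchanged, which is why "pad" or "resize" describes the change better than "align".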


Re: [Mesa-dev] [PATCH] anv/pass: Flag the need for a depth flush for resolve attachments

2019-03-13 Thread Nanley Chery
On Tue, Mar 12, 2019 at 10:56:27PM -0500, Jason Ekstrand wrote:
> Cc: mesa-sta...@lists.freedesktop.org
> Cc: Nanley Chery 
> ---
>  src/intel/vulkan/anv_pass.c | 18 +-
>  1 file changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/src/intel/vulkan/anv_pass.c b/src/intel/vulkan/anv_pass.c
> index 5fac5bbb31c..ec217abfda0 100644
> --- a/src/intel/vulkan/anv_pass.c
> +++ b/src/intel/vulkan/anv_pass.c
> @@ -178,12 +178,28 @@ anv_render_pass_compile(struct anv_render_pass *pass)
>  * subpasses and checking to see if any of them don't have an external
>  * dependency.  Or, we could just be lazy and add a couple extra flushes.
>  * We choose to be lazy.
> +*
> +* From the documentation for vkCmdNextSubpass:
> +*
> +*"Moving to the next subpass automatically performs any multisample
> +*resolve operations in the subpass being ended. End-of-subpass
> +*multisample resolves are treated as color attachment writes for the
> +*purposes of synchronization. This applies to resolve operations for
> +*both color and depth/stencil attachments. That is, they are
> +*considered to execute in the
> +*VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT pipeline stage and
> +*their writes are synchronized with
> +*VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT."
> +*
> +* Therefore, the above flags concerning color attachments also apply to
> +* color and depth/stencil resolve attachments.
>  */
> if (all_usage & VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT) {
>pass->subpass_flushes[0] |=
>   ANV_PIPE_TEXTURE_CACHE_INVALIDATE_BIT;
> }
> -   if (all_usage & VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT) {
> +   if (all_usage & (VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT |
> +VK_IMAGE_USAGE_TRANSFER_DST_BIT)) {
>pass->subpass_flushes[pass->subpass_count] |=
>   ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT;

I'm assuming you meant to s/depth/color/ in the title of the patch?

If so and with that change, this patch is
Reviewed-by: Nanley Chery 
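
For reference, the flush selection in the hunk above can be sketched as a pure function of the combined usage mask. Everything here is a stand-in: the SKETCH_USAGE_* values are meant to mirror the Vulkan usage bits, the SKETCH_FLUSH_* bits are made up, and reading TRANSFER_DST usage as "this attachment is written by a resolve" is my assumption about how the driver tags resolve attachments.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the VK_IMAGE_USAGE_* bits referenced in the patch. */
#define SKETCH_USAGE_TRANSFER_DST      0x00000002u
#define SKETCH_USAGE_COLOR_ATTACHMENT  0x00000010u
#define SKETCH_USAGE_INPUT_ATTACHMENT  0x00000080u

/* Made-up flush bits standing in for the ANV_PIPE_* flags. */
#define SKETCH_FLUSH_TEXTURE_INVALIDATE 0x1u
#define SKETCH_FLUSH_RENDER_TARGET      0x2u

/* Extra flushes before the first subpass and after the last one. */
struct sketch_pass_flushes {
   uint32_t first;
   uint32_t last;
};

static struct sketch_pass_flushes sketch_compute_flushes(uint32_t all_usage)
{
   struct sketch_pass_flushes f = { 0, 0 };

   if (all_usage & SKETCH_USAGE_INPUT_ATTACHMENT)
      f.first |= SKETCH_FLUSH_TEXTURE_INVALIDATE;

   /* Resolve writes count as color attachment writes, so they need the
    * same render target flush at the end of the pass. */
   if (all_usage & (SKETCH_USAGE_COLOR_ATTACHMENT |
                    SKETCH_USAGE_TRANSFER_DST))
      f.last |= SKETCH_FLUSH_RENDER_TARGET;

   return f;
}
```

A pass whose only written attachment is a resolve target then still gets the end-of-pass render target flush, which is the behavior the patch adds.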

> }
> -- 
> 2.20.1
> 

Re: [Mesa-dev] [Mesa-stable] [PATCH v4] i965: fixed clamping in set_scissor_bits when the y is flipped

2019-02-26 Thread Nanley Chery
On Mon, Feb 25, 2019 at 03:40:24PM -0800, Nanley Chery wrote:
> On Mon, Feb 25, 2019 at 03:14:10PM -0800, Dylan Baker wrote:
> > Quoting Eleni Maria Stea (2019-02-22 13:02:30)
> > > Calculating the scissor rectangle fields with the y flipped (0 on top)
> > > can generate negative values that will cause assertion failure later on
> > > as the scissor fields are all unsigned. We must clamp the bbox values
> > > again to make sure they don't exceed the fb_height. Also fixed a
> > > calculation error.
> > > 
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999
> > >   https://bugs.freedesktop.org/show_bug.cgi?id=109594
> > > 
> > > v2:
> > > >- I initially clamped the values inside the if (Y is flipped) case
> > > >and I made a mistake in the calculation: the clamp of the bbox[2] should
> > > >be a check if (bbox[2] >= fbheight) bbox[2] = fbheight - 1 instead and I
> > > >shouldn't have changed the ScissorRectangleYMax calculation. As the
> > > >fixed code is equivalent with using CLAMP instead of MAX2 at the top of
> > > >the function when bbox[2] and bbox[3] are calculated, and the 2nd is more
> > > >clear, I replaced it. (Nanley Chery)
> > > 
> > > v3:
> > >- Reversed the CLAMP change in bbox[3] as the API guarantees that the
> > >viewport height is positive. (Nanley Chery)
> > > 
> > > v4:
> > >   - Added nomination for the mesa-stable branch and the link to the second
> > >   bugzilla bug (Nanley Chery)
> > > 
> > > CC: 
> > > Tested-by: Paul Chelombitko 
> > > Reviewed-by: Nanley Chery 
> > > ---
> > >  src/mesa/drivers/dri/i965/genX_state_upload.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > > diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
> > > index 027dad1e089..73c983ce742 100644
> > > --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
> > > +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
> > > @@ -2446,7 +2446,7 @@ set_scissor_bits(const struct gl_context *ctx, int 
> > > i,
> > >  
> > > bbox[0] = MAX2(ctx->ViewportArray[i].X, 0);
> > > bbox[1] = MIN2(bbox[0] + ctx->ViewportArray[i].Width, fb_width);
> > > -   bbox[2] = MAX2(ctx->ViewportArray[i].Y, 0);
> > > +   bbox[2] = CLAMP(ctx->ViewportArray[i].Y, 0, fb_height);
> > > bbox[3] = MIN2(bbox[2] + ctx->ViewportArray[i].Height, fb_height);
> > > _mesa_intersect_scissor_bounding_box(ctx, i, bbox);
> > >  
> > > -- 
> > > 2.20.1
> > > 
> > 
> > > Do you have push access? I'd like to get this merged so we can close said bugs,
> > > and Nanley or I can push this for you if you don't have access.
> > 
> 
> I haven't landed this patch because its piglit test isn't catching the
> error in CI. I'm hoping we could resolve that soon though.
> 

The test has been fixed, so I've pushed both patches.

> -Nanley

Re: [Mesa-dev] [Mesa-stable] [PATCH v4] i965: fixed clamping in set_scissor_bits when the y is flipped

2019-02-25 Thread Nanley Chery
On Mon, Feb 25, 2019 at 03:14:10PM -0800, Dylan Baker wrote:
> Quoting Eleni Maria Stea (2019-02-22 13:02:30)
> > Calculating the scissor rectangle fields with the y flipped (0 on top)
> > can generate negative values that will cause assertion failure later on
> > as the scissor fields are all unsigned. We must clamp the bbox values
> > again to make sure they don't exceed the fb_height. Also fixed a
> > calculation error.
> > 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999
> >   https://bugs.freedesktop.org/show_bug.cgi?id=109594
> > 
> > v2:
> >- I initially clamped the values inside the if (Y is flipped) case
> >and I made a mistake in the calculation: the clamp of the bbox[2] should
> >be a check if (bbox[2] >= fbheight) bbox[2] = fbheight - 1 instead and I
> >shouldn't have changed the ScissorRectangleYMax calculation. As the
> >fixed code is equivalent with using CLAMP instead of MAX2 at the top of
> >the function when bbox[2] and bbox[3] are calculated, and the 2nd is more
> >clear, I replaced it. (Nanley Chery)
> > 
> > v3:
> >- Reversed the CLAMP change in bbox[3] as the API guarantees that the
> >viewport height is positive. (Nanley Chery)
> > 
> > v4:
> >   - Added nomination for the mesa-stable branch and the link to the second
> >   bugzilla bug (Nanley Chery)
> > 
> > CC: 
> > Tested-by: Paul Chelombitko 
> > Reviewed-by: Nanley Chery 
> > ---
> >  src/mesa/drivers/dri/i965/genX_state_upload.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > > diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
> > index 027dad1e089..73c983ce742 100644
> > --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
> > +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
> > @@ -2446,7 +2446,7 @@ set_scissor_bits(const struct gl_context *ctx, int i,
> >  
> > bbox[0] = MAX2(ctx->ViewportArray[i].X, 0);
> > bbox[1] = MIN2(bbox[0] + ctx->ViewportArray[i].Width, fb_width);
> > -   bbox[2] = MAX2(ctx->ViewportArray[i].Y, 0);
> > +   bbox[2] = CLAMP(ctx->ViewportArray[i].Y, 0, fb_height);
> > bbox[3] = MIN2(bbox[2] + ctx->ViewportArray[i].Height, fb_height);
> > _mesa_intersect_scissor_bounding_box(ctx, i, bbox);
> >  
> > -- 
> > 2.20.1
> > 
> 
> Do you have push access? I'd like to get this merged so we can close said bugs,
> and Nanley or I can push this for you if you don't have access.
> 

I haven't landed this patch because its piglit test isn't catching the
error in CI. I'm hoping we could resolve that soon though.

-Nanley

Re: [Mesa-dev] [PATCH v3] i965: fixed clamping in set_scissor_bits when the y is flipped

2019-02-21 Thread Nanley Chery
On Thu, Feb 21, 2019 at 12:02:58PM +0200, Eleni Maria Stea wrote:
> Calculating the scissor rectangle fields with the y flipped (0 on top)
> can generate negative values that will cause assertion failure later on
> as the scissor fields are all unsigned. We must clamp the bbox values
> again to make sure they don't exceed the fb_height. Also fixed a
> calculation error.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999
> 

I guess we'll want to add the other bug and Cc stable.

> v2:
>- I initially clamped the values inside the if (Y is flipped) case
>and I made a mistake in the calculation: the clamp of the bbox[2] should
>be a check if (bbox[2] >= fbheight) bbox[2] = fbheight - 1 instead and I
>shouldn't have changed the ScissorRectangleYMax calculation. As the
>fixed code is equivalent with using CLAMP instead of MAX2 at the top of
>the function when bbox[2] and bbox[3] are calculated, and the 2nd is more
>clear, I replaced it. (Nanley Chery)
> 
> v3:
>- Reversed the CLAMP change in bbox[3] as the API guarantees that the
>viewport height is positive. (Nanley Chery)
> ---
>  src/mesa/drivers/dri/i965/genX_state_upload.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 

This patch is
Reviewed-by: Nanley Chery 

> diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
> index dcdfb3c9292..47f3741e673 100644
> --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
> @@ -2445,7 +2445,7 @@ set_scissor_bits(const struct gl_context *ctx, int i,
>  
> bbox[0] = MAX2(ctx->ViewportArray[i].X, 0);
> bbox[1] = MIN2(bbox[0] + ctx->ViewportArray[i].Width, fb_width);
> -   bbox[2] = MAX2(ctx->ViewportArray[i].Y, 0);
> +   bbox[2] = CLAMP(ctx->ViewportArray[i].Y, 0, fb_height);
> bbox[3] = MIN2(bbox[2] + ctx->ViewportArray[i].Height, fb_height);
> _mesa_intersect_scissor_bounding_box(ctx, i, bbox);
>  
> -- 
> 2.20.1
> 

Re: [Mesa-dev] [PATCH v2] i965: fixed clamping in set_scissor_bits when the y is flipped

2019-02-20 Thread Nanley Chery
On Wed, Feb 20, 2019 at 03:08:29PM +0200, Eleni Maria Stea wrote:
> Calculating the scissor rectangle fields with the y flipped (0 on top)
> can generate negative values that will cause assertion failure later on
> as the scissor fields are all unsigned. We must clamp the bbox values
> again to make sure they don't exceed the fb_height. Also fixed a
> calculation error.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999
> 
> v2:
>- I initially clamped the values inside the if (Y is flipped) case
>and I made a mistake in the calculation: the clamp of the bbox[2] should
>be a check if (bbox[2] >= fbheight) bbox[2] = fbheight - 1 instead and I
>shouldn't have changed the ScissorRectangleYMax calculation. As the
>fixed code is equivalent with using CLAMP instead of MAX2 at the top of
>the function when bbox[2] and bbox[3] are calculated, and the 2nd is more
>clear, I replaced it. (Nanley Chery)
> ---
>  src/mesa/drivers/dri/i965/genX_state_upload.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
> index dcdfb3c9292..dd695218fea 100644
> --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
> @@ -2445,8 +2445,8 @@ set_scissor_bits(const struct gl_context *ctx, int i,
>  
> bbox[0] = MAX2(ctx->ViewportArray[i].X, 0);
> bbox[1] = MIN2(bbox[0] + ctx->ViewportArray[i].Width, fb_width);
> -   bbox[2] = MAX2(ctx->ViewportArray[i].Y, 0);
> -   bbox[3] = MIN2(bbox[2] + ctx->ViewportArray[i].Height, fb_height);
> +   bbox[2] = CLAMP(ctx->ViewportArray[i].Y, 0, fb_height);
> +   bbox[3] = CLAMP(bbox[2] + ctx->ViewportArray[i].Height, 0, fb_height);

The API guarantees that viewport height is positive, so we can leave the
calculation of bbox[3] unmodified.
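
A standalone sketch of that argument, not the driver code: `compute_bbox_y` and its plain `int` parameters are hypothetical, and the macros are local stand-ins for the Mesa macros of the same names. Once the lower bound is clamped to [0, fb_height] and the height cannot be negative, the existing MIN2 already keeps the upper bound inside [lower bound, fb_height].

```c
#include <assert.h>

/* Local stand-ins for the Mesa macros of the same names. */
#define MIN2(a, b) ((a) < (b) ? (a) : (b))
#define MAX2(a, b) ((a) > (b) ? (a) : (b))
#define CLAMP(x, lo, hi) MIN2(MAX2((x), (lo)), (hi))

/* Hypothetical helper: computes only the Y half of the bounding box,
 * i.e. the values that end up in bbox[2] and bbox[3].
 */
static void
compute_bbox_y(int vp_y, int vp_height, int fb_height, int bbox_y[2])
{
   /* Assumption of this sketch: the viewport height is never negative. */
   assert(vp_height >= 0 && fb_height >= 0);

   bbox_y[0] = CLAMP(vp_y, 0, fb_height);
   /* bbox_y[0] >= 0 and vp_height >= 0, so the sum cannot be negative
    * and a MIN2 against fb_height is sufficient here.
    */
   bbox_y[1] = MIN2(bbox_y[0] + vp_height, fb_height);
}
```

With a viewport that starts above the framebuffer (for example Y = 700 with fb_height = 600), both values collapse to fb_height, so the range is empty rather than out of bounds.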

> _mesa_intersect_scissor_bounding_box(ctx, i, bbox);
>  
> if (bbox[0] == bbox[1] || bbox[2] == bbox[3]) {
> -- 
> 2.20.1
> 

Re: [Mesa-dev] [PATCH] i965: fixed clamping in set_scissor_bits when the y is flipped

2019-02-19 Thread Nanley Chery
On Mon, Dec 10, 2018 at 12:42:40PM +0200, Eleni Maria Stea wrote:
> Calculating the scissor rectangle fields with the y flipped (0 on top)
> can generate negative values that will cause assertion failure later on
> as the scissor fields are all unsigned. We must clamp the bbox values
> again to make sure they don't exceed the fb_height. Also fixed a
> calculation error.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999

Good find. Could you send the test to the piglit list?

> ---
>  src/mesa/drivers/dri/i965/genX_state_upload.c | 15 ++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
> index 8e3fcbf12e..5d8fc8214e 100644
> --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
> @@ -2424,8 +2424,21 @@ set_scissor_bits(const struct gl_context *ctx, int i,
>/* memory: Y=0=top */
>sc->ScissorRectangleXMin = bbox[0];
>sc->ScissorRectangleXMax = bbox[1] - 1;
> +
> +  /* Clamping to fb_height is necessary because otherwise the
> +   * subtractions below would produce a negative result, which would
> +   * then be assigned to the unsigned YMin/YMax scissor fields,
> +   * resulting in an assertion failure in GENX(SCISSOR_RECT_pack)
> +   */
> +
> +  if (bbox[3] > fb_height)
> + bbox[3] = fb_height;
> +
> +  if (bbox[2] > fb_height)
> + bbox[2] = fb_height;
> +

We should be able to fix this bug in a simpler manner by changing the
MAX2 calls at the top of this function to CLAMP calls.

>sc->ScissorRectangleYMin = fb_height - bbox[3];
> -  sc->ScissorRectangleYMax = fb_height - bbox[2] - 1;
> +  sc->ScissorRectangleYMax = fb_height - (bbox[2] - 1);

I don't think we want to start adding 1 instead of subtracting 1. The
subtraction is there to satisfy the requirement for the HW packet.
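
A standalone sketch of the flipped case, under stated assumptions: `struct scissor_y` and `flip_scissor_y` are hypothetical names, and the Min/Max fields are assumed to be inclusive bounds while bbox[2]/bbox[3] form a half-open range, which is what the existing `- 1` suggests.

```c
#include <assert.h>

struct scissor_y {
   unsigned y_min;
   unsigned y_max;
};

/* Hypothetical helper for the "memory: Y=0=top" case. Expects
 * 0 <= y0 < y1 <= fb_height, i.e. a non-empty, already clamped range.
 */
static struct scissor_y
flip_scissor_y(int y0, int y1, int fb_height)
{
   assert(0 <= y0 && y0 < y1 && y1 <= fb_height);

   struct scissor_y sc;
   /* Rows [y0, y1) counted from the bottom become rows
    * [fb_height - y1, fb_height - y0) counted from the top.
    */
   sc.y_min = fb_height - y1;
   /* Inclusive upper bound: one less than the exclusive end. */
   sc.y_max = fb_height - y0 - 1;
   return sc;
}
```

For a full-height range (y0 = 0, y1 = fb_height = 600) this gives 0..599. Under the same assumptions, `fb_height - (y0 - 1)` would give 601, past the last row.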

-Nanley

> }
>  }
>  
> -- 
> 2.20.0.rc2
> 

Re: [Mesa-dev] [PATCH v6 2/5] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.

2019-02-15 Thread Nanley Chery
On Fri, Feb 15, 2019 at 03:29:41PM +0200, Eleni Maria Stea wrote:
> GPUs Gen < 8 cannot sample ETC2 formats. So far, they converted the
> compressed EAC/ETC2 images to non-compressed RGBA images. When
> GetCompressed* functions were called, the pixels were returned in this
> RGBA format and not the compressed format that was expected.
> 
> To fix this problem, we use a secondary shadow miptree to store the
> decompressed data for rendering and the main miptree to store the
> compressed data so that the Get functions work. Each time the main miptree
> is written with compressed data, we decompress it to RGB and update the
> shadow. Then we use the shadow for rendering.
> 
> v2:
>- Fixes in the commit message (Nanley Chery)
>- Reversed the changes in brw_get_texture_swizzle and swapped the b, g
>values at the time that we decompress the data in the function:
>intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
>- Simplified the format checks in the miptree_create function of the
>intel_mipmap_tree.c and reserved the call of the
>intel_lower_compressed_format for the case that we are faking the ETC
>support (Nanley Chery)
>- Removed the check for the auxiliary usage for the shadow miptree at
>creation (miptree_create of intel_mipmap_tree.c) as we won't use
>auxiliary buffers with these types of trees (Nanley Chery)
>- Set the etc_format of the non-ETC miptrees to MESA_FORMAT_NONE and
>removed the unnecessary checks (Nanley Chery)
>- Fixed an unrelated indentation change (Nanley Chery)
>- Modified the function intel_miptree_finish_write to set the
>mt->shadow_needs_update to true to catch all the cases when we need to
>update the miptree (Nanley Chery)
>- In order to update the shadow miptree during the unmap of the
>main and always map the main (Nanley Chery), the following change was
>necessary: split the previous update function, which updated all the
>mipmap levels, into two functions: one that updates a single level and
>one that updates all of them. Used the first during unmap and the
>second before rendering.
>- Removed the BRW_MAP_ETC_BIT flag and the mechanism to decide which
>miptree should be mapped each time and reversed all the changes in the
>higher level texture functions that upload data to textures as they
>aren't needed anymore.
>- Replaced the boolean needs_fake_etc with an inline function that
>checks when we need to fake the ETC compression (Nanley Chery)
>- Removed the initialization of the strides in the update function as
>the values will be overwritten by the intel_miptree_map call (Nanley
>Chery)
>- Used minify instead of division in the new update function
>intel_miptree_update_etc_shadow_levels in intel_mipmap_tree.c (Nanley
>Chery)
>- Removed the depth from the calculation of the number of slices in
>the new update function (intel_miptree_update_etc_shadow_levels of
>intel_mipmap_tree.c) as we don't need to support 3D ETC images.
>(Nanley Chery)
> 
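
On the minify item in the v2 list above, a minimal sketch of such a helper. It is assumed to mirror what Mesa's `minify` does, so treat the exact definition as an assumption. Each mip level halves the dimension, but the result never drops below 1, which a plain division or shift would not guarantee for small textures.

```c
/* Sketch of a minify helper: size of a dimension `levels` mip levels down.
 * `levels` is expected to be smaller than the bit width of unsigned.
 */
static inline unsigned
minify(unsigned value, unsigned levels)
{
   unsigned v = value >> levels;
   return v > 0 ? v : 1;
}
```
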
> v3:
>   - Renamed the rgba_fmt in function miptree_create
>   (intel_mipmap_tree.c) to decomp_format as the format is not always in
>   rgba order. (Nanley Chery)
>   - Documented the new usage for the shadow miptree in the comment above
>   the field in the intel_miptree struct in intel_mipmap_tree.h (Nanley
>   Chery)
>   - Removed the redundant flags from the mapping of the miptrees in
>   intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
>   - Fixed the switch from surface's logical level to physical level in
>   the intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c
>   (Nanley Chery)
>   - Excluded the Baytrail GPUs from the check for the ETC emulation as
>   they support the ETC formats natively. (Nanley Chery)
>   - Simplified the check if the format is BGRA in
>   intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
> 
> v4:
>   - Removed the functions intel_miptree_(map|unmap)_etc and the check if
>we need to call them as with the new changes, they became unreachable.
>(Nanley Chery)
>   - We'd rather calculate the level width and height using the shadow
>   miptree instead of the main in intel_miptree_update_etc_shadow_levels of
>   intel_mipmap_tree.c (Nanley Chery)
>   - Fixed the format in the mt_surface_usage, set at the miptree creation,
>    in miptree_create of intel_mipmap_tree.c (Nanley Chery)
> 
> v5:
>   - Fixed the levels calculations in intel_mipmap_tree.c (Nanley Chery)
>   - Update the flag shadow_needs_update outside the function
>   intel_miptree_update_etc_shadow (Nanley Chery)
>   - Fixed indentation e

Re: [Mesa-dev] [PATCH v6 0/5] improved the support for ETC2 formats on Gen 7

2019-02-15 Thread Nanley Chery
On Fri, Feb 15, 2019 at 03:29:39PM +0200, Eleni Maria Stea wrote:
> Intel Gen7 GPUs don't support the ETC2 formats natively and in order to
> show the pixels properly we decompress them and create decompressed
> miptrees. The problem with that is that the functions that map the
> miptrees for reading (for example the GetCompressed* calls), which are
> supposed to read compressed pixel values, would read decompressed
> values instead, unless we prevented this with assertions that make
> the user programs either crash or malfunction.
> 
> These patches are an attempt to give a solution to this problem by using 2
> miptrees: the main to store the ETC values and the generic shadow
> (mt->shadow) to store the decompressed values. Each time that the main
> miptree is mapped for writing we set a flag that the shadow will need
> update and we check this flag before every draw call to update the
> shadow miptree. (We perform the check right before drawing to avoid
> missing changes from functions like the CopyImageSubData in the next 
> frame). Then we map the shadow for sampling. This way, we can render the
> images using the decompressed pixels of the shadow but we return the
> compressed ones from the main when the texture is mapped for reading.
> 
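
The scheme in the paragraph above can be sketched roughly as follows. This is a toy model only: `struct toy_miptree`, `toy_finish_write`, `toy_predraw` and the placeholder "decompression" are hypothetical stand-ins, not the i965 functions.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_miptree {
   int compressed;           /* stands in for the ETC2 data (main miptree) */
   int shadow;               /* stands in for the decompressed copy */
   bool shadow_needs_update; /* set on write, cleared before drawing */
   int decompress_count;     /* instrumentation for this sketch only */
};

/* Any write to the main miptree only marks the shadow as stale. */
static void
toy_finish_write(struct toy_miptree *mt, int new_data)
{
   mt->compressed = new_data;
   mt->shadow_needs_update = true;
}

/* Called right before a draw: refresh the shadow once if it is stale. */
static void
toy_predraw(struct toy_miptree *mt)
{
   if (mt->shadow_needs_update) {
      mt->shadow = mt->compressed * 2; /* placeholder for decompression */
      mt->decompress_count++;
      mt->shadow_needs_update = false;
   }
}

/* Reads return the main (compressed) data; sampling uses the shadow. */
static int
toy_read_compressed(const struct toy_miptree *mt)
{
   return mt->compressed;
}

static int
toy_sample(const struct toy_miptree *mt)
{
   return mt->shadow;
}
```

Deferring the refresh to draw time means that several writes between two draws cost a single decompression, and writes that do not go through a map/unmap pair are still picked up as long as they set the flag.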
> Also, the OES_copy_image extension, which couldn't work on Gen 7 due to the
> lack of ETC support, is now re-enabled.
> 
> Finally, the following glcts and piglit tests pass:
> 
> On HSW (previously failing):
> 
> KHR-GL46.direct_state_access.textures_compressed_subimage
> 
> On HSW and IVB (previously skipped):
> -
> dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_alpha8_etc2_eac.*
>(6 tests)
> dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_etc2.*
>(6 tests)
> dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_punchthrough_alpha1_etc2.*
>(6 tests)
> 
> On HSW, IVB, SNB (previously skipped):
> ---
> dEQP-GLES3.functional.texture.format.compressed.*
>(12 tests)
> dEQP-GLES3.functional.texture.wrap.etc2_eac_srgb8_alpha8.*
>(36 tests)
> dEQP-GLES3.functional.texture.wrap.etc2_srgb8.*
>(36 tests)
> dEQP-GLES3.functional.texture.wrap.etc2_srgb8_punchthrough_alpha1.*
>(36 tests)
> 
> piglit.spec.!opengl es 3_0.oes_compressed_etc2_texture-miptree_gles3 
>(srgb8, srgb8-alpha, srgb8-punchthrough-alpha1)
> piglit.spec.arb_es3_compatibility.oes_compressed_etc2_texture-miptree
>(srgb8 compat, srgb8 core, srgb8-alpha8 compat, srgb8-alpha8 core,
> srgb8-punchthrough-alpha1 compat, srgb8-punchthrough-alpha1 core)
> (9 tests)
> 
> Total tests passing: 148
> 
> Eleni Maria Stea (4):
>   i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
>   i965: Fixed the CopyImageSubData for ETC2 on Gen < 8
>   i965: Enabled the OES_copy_image extension on Gen 7 GPUs
>   i965: Removed the field etc_format from the struct intel_mipmap_tree
> 

These patches are
Reviewed-by: Nanley Chery 

I like how this series turned out. Thank you!

> Nanley Chery (1):
>   i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*
> 
>  src/mesa/drivers/dri/i965/brw_draw.c  |   5 +
>  .../drivers/dri/i965/brw_wm_surface_state.c   |  15 +-
>  src/mesa/drivers/dri/i965/intel_extensions.c  |  16 +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 170 ++
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  48 +++--
>  5 files changed, 149 insertions(+), 105 deletions(-)
> 
> -- 
> 2.20.1
> 

Re: [Mesa-dev] [PATCH v5 3/4] i965: Fixed the CopyImageSubData for ETC2 on Gen < 8

2019-02-14 Thread Nanley Chery
On Wed, Feb 13, 2019 at 12:05:00PM +0200, Eleni Maria Stea wrote:
> For CopyImageSubData to copy the data during the 1st draw call, we need
> to update the shadow tree right before the rendering.
> 
> v2:
>   - Added assertion that the miptree doesn't need update at the time we
>   update the texture surface. (Nanley Chery)
> 
> v3:
>   - As we now update the tree before the rendering we don't need to copy
>   the data during the unmap anymore. Removed the unnecessary update from
>   the intel_miptree_unmap in intel_mipmap_tree.c (Nanley Chery)
> ---
>  src/mesa/drivers/dri/i965/brw_draw.c |  5 +
>  src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  2 +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 13 -
>  3 files changed, 6 insertions(+), 14 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
> index 40bcf82ae8d..d07349419cc 100644
> --- a/src/mesa/drivers/dri/i965/brw_draw.c
> +++ b/src/mesa/drivers/dri/i965/brw_draw.c
> @@ -559,6 +559,11 @@ brw_predraw_resolve_inputs(struct brw_context *brw, bool rendering,
>tex_obj->mt->format == MESA_FORMAT_S_UINT8) {
>   intel_update_r8stencil(brw, tex_obj->mt);
>}
> +
> +  if (intel_miptree_has_etc_shadow(brw, tex_obj->mt) &&
> +  tex_obj->mt->shadow_needs_update) {
> + intel_miptree_update_etc_shadow_levels(brw, tex_obj->mt);
> +  }
> }
>  
> /* Resolve color for each active shader image. */
> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> index c3d267721e1..19a46fcf243 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> @@ -582,7 +582,7 @@ static void brw_update_texture_surface(struct gl_context *ctx,
>   mt = mt->shadow_mt;
>   format = ISL_FORMAT_R8_UINT;
>} else if (intel_miptree_needs_fake_etc(brw, mt)) {
> - assert(mt->shadow_mt);
> + assert(mt->shadow_mt && !mt->shadow_needs_update);
>   mt = mt->shadow_mt;
>}
>  
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 1643ce2eeb2..89b31c78bc4 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -3780,7 +3780,6 @@ intel_miptree_unmap(struct brw_context *brw,
>  unsigned int slice)
>  {
> struct intel_miptree_map *map = mt->level[level].slice[slice].map;
> -   int level_w, level_h;
>  
> assert(mt->surf.samples == 1);
>  
> @@ -3790,21 +3789,10 @@ intel_miptree_unmap(struct brw_context *brw,
> DBG("%s: mt %p (%s) level %d slice %d\n", __func__,
> mt, _mesa_get_format_name(mt->format), level, slice);
>  
> -   level_w = minify(mt->surf.phys_level0_sa.width,
> -level - mt->first_level);
> -   level_h = minify(mt->surf.phys_level0_sa.height,
> -level - mt->first_level);
> -
> if (map->unmap)
>  map->unmap(brw, mt, map, level, slice);
>  
> intel_miptree_release_map(mt, level, slice);
> -
> -   if (intel_miptree_has_etc_shadow(brw, mt) && mt->shadow_needs_update) {
> -  mt->shadow_needs_update = false;
> -  intel_miptree_update_etc_shadow(brw, mt, level, slice, level_w,
> -  level_h);
> -   }
>  }
>  
>  enum isl_surf_dim
> @@ -3984,6 +3972,5 @@ intel_miptree_update_etc_shadow_levels(struct brw_context *brw,
>   level_h);
>}
> }
> -

Unrelated change.

> mt->shadow_needs_update = false;
>  }
> -- 
> 2.20.1
> 

Re: [Mesa-dev] [PATCH v5 2/4] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.

2019-02-14 Thread Nanley Chery
On Wed, Feb 13, 2019 at 12:04:59PM +0200, Eleni Maria Stea wrote:
> GPUs Gen < 8 cannot sample ETC2 formats. So far, they converted the
> compressed EAC/ETC2 images to non-compressed RGBA images. When
> GetCompressed* functions were called, the pixels were returned in this
> RGBA format and not the compressed format that was expected.
> 
> To fix this problem, we use a secondary shadow miptree to store the
> decompressed data for rendering and the main miptree to store the
> compressed data so that the Get functions work. Each time the main miptree
> is written with compressed data, we decompress it to RGB and update the
> shadow. Then we use the shadow for rendering.
> 
> v2:
>- Fixes in the commit message (Nanley Chery)
>- Reversed the changes in brw_get_texture_swizzle and swapped the b, g
>values at the time that we decompress the data in the function:
>intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
>- Simplified the format checks in the miptree_create function of the
>intel_mipmap_tree.c and reserved the call of the
>intel_lower_compressed_format for the case that we are faking the ETC
>support (Nanley Chery)
>- Removed the check for the auxiliary usage for the shadow miptree at
>creation (miptree_create of intel_mipmap_tree.c) as we won't use
>auxiliary buffers with these types of trees (Nanley Chery)
>- Set the etc_format of the non-ETC miptrees to MESA_FORMAT_NONE and
>removed the unnecessary checks (Nanley Chery)
>- Fixed an unrelated indentation change (Nanley Chery)
>- Modified the function intel_miptree_finish_write to set the
>mt->shadow_needs_update to true to catch all the cases when we need to
>update the miptree (Nanley Chery)
>- In order to update the shadow miptree during the unmap of the
>main and always map the main (Nanley Chery), the following change was
>necessary: split the previous update function, which updated all the
>mipmap levels, into two functions: one that updates a single level and
>one that updates all of them. Used the first during unmap and the
>second before rendering.
>- Removed the BRW_MAP_ETC_BIT flag and the mechanism to decide which
>miptree should be mapped each time and reversed all the changes in the
>higher level texture functions that upload data to textures as they
>aren't needed anymore.
>- Replaced the boolean needs_fake_etc with an inline function that
>checks when we need to fake the ETC compression (Nanley Chery)
>- Removed the initialization of the strides in the update function as
>the values will be overwritten by the intel_miptree_map call (Nanley
>Chery)
>- Used minify instead of division in the new update function
>intel_miptree_update_etc_shadow_levels in intel_mipmap_tree.c (Nanley
>Chery)
>- Removed the depth from the calculation of the number of slices in
>the new update function (intel_miptree_update_etc_shadow_levels of
>intel_mipmap_tree.c) as we don't need to support 3D ETC images.
>(Nanley Chery)
> 
> v3:
>   - Renamed the rgba_fmt in function miptree_create
>   (intel_mipmap_tree.c) to decomp_format as the format is not always in
>   rgba order. (Nanley Chery)
>   - Documented the new usage for the shadow miptree in the comment above
>   the field in the intel_miptree struct in intel_mipmap_tree.h (Nanley
>   Chery)
>   - Removed the redundant flags from the mapping of the miptrees in
>   intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
>   - Fixed the switch from surface's logical level to physical level in
>   the intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c
>   (Nanley Chery)
>   - Excluded the Baytrail GPUs from the check for the ETC emulation as
>   they support the ETC formats natively. (Nanley Chery)
>   - Simplified the check if the format is BGRA in
>   intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
> 
> v4:
>   - Removed the functions intel_miptree_(map|unmap)_etc and the check if
>we need to call them as with the new changes, they became unreachable.
>(Nanley Chery)
>   - We'd rather calculate the level width and height using the shadow
>   miptree instead of the main in intel_miptree_update_etc_shadow_levels of
>   intel_mipmap_tree.c (Nanley Chery)
> 
> v5:
>   - Fixed the levels calculations in intel_mipmap_tree.c (Nanley Chery)
>   - Update the flag shadow_needs_update outside the function
>   intel_miptree_update_etc_shadow (Nanley Chery)
>   - Fixed indentation error (Nanley Cherry)
 ^
 Extra r here.

There's real

Re: [Mesa-dev] [PATCH v4 2/4] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.

2019-02-11 Thread Nanley Chery
On Sun, Feb 10, 2019 at 11:31:05PM +0200, Eleni Maria Stea wrote:
> GPUs Gen < 8 cannot sample ETC2 formats. So far, they converted the
> compressed EAC/ETC2 images to non-compressed RGBA images. When
> GetCompressed* functions were called, the pixels were returned in this
> RGBA format and not the compressed format that was expected.
>
> To fix this problem, we use a secondary shadow miptree to store the
> decompressed data for rendering and the main miptree to store the
> compressed data so that the Get functions work. Each time the main miptree
> is written with compressed data, we decompress it to RGB and update the
> shadow. Then we use the shadow for rendering.
>
> v2:
>- Fixes in the commit message (Nanley Chery)
>- Reversed the changes in brw_get_texture_swizzle and swapped the b, g
>values at the time that we decompress the data in the function:
>intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
>- Simplified the format checks in the miptree_create function of the
>intel_mipmap_tree.c and reserved the call of the
>intel_lower_compressed_format for the case that we are faking the ETC
>support (Nanley Chery)
>- Removed the check for the auxiliary usage for the shadow miptree at
>creation (miptree_create of intel_mipmap_tree.c) as we won't use
>auxiliary buffers with these types of trees (Nanley Chery)
>- Set the etc_format of the non-ETC miptrees to MESA_FORMAT_NONE and
>removed the unnecessary checks (Nanley Chery)
>- Fixed an unrelated indentation change (Nanley Chery)
>- Modified the function intel_miptree_finish_write to set the
>mt->shadow_needs_update to true to catch all the cases when we need to
>update the miptree (Nanley Chery)
>- In order to update the shadow miptree during the unmap of the
>main and always map the main (Nanley Chery), the following change was
>necessary: split the previous update function, which updated all the
>mipmap levels, into two functions: one that updates a single level and
>one that updates all of them. Used the first during unmap and the
>second before rendering.
>- Removed the BRW_MAP_ETC_BIT flag and the mechanism to decide which
>miptree should be mapped each time and reversed all the changes in the
>higher level texture functions that upload data to textures as they
>aren't needed anymore.
>- Replaced the boolean needs_fake_etc with an inline function that
>checks when we need to fake the ETC compression (Nanley Chery)
>- Removed the initialization of the strides in the update function as
>the values will be overwritten by the intel_miptree_map call (Nanley
>Chery)
>- Used minify instead of division in the new update function
>intel_miptree_update_etc_shadow_levels in intel_mipmap_tree.c (Nanley
>Chery)
>- Removed the depth from the calculation of the number of slices in
>the new update function (intel_miptree_update_etc_shadow_levels of
>intel_mipmap_tree.c) as we don't need to support 3D ETC images.
>(Nanley Chery)
>
> v3:
>   - Renamed the rgba_fmt in function miptree_create
>   (intel_mipmap_tree.c) to decomp_format as the format is not always in
>   rgba order. (Nanley Chery)
>   - Documented the new usage for the shadow miptree in the comment above
>   the field in the intel_miptree struct in intel_mipmap_tree.h (Nanley
>   Chery)
>   - Removed the redundant flags from the mapping of the miptrees in
>   intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
>   - Fixed the switch from surface's logical level to physical level in
>   the intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c
>   (Nanley Chery)
>   - Excluded the Baytrail GPUs from the check for the ETC emulation as
>   they support the ETC formats natively. (Nanley Chery)
>   - Simplified the check if the format is BGRA in
>   intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
>
> v4:
>   - Removed the functions intel_miptree_(map|unmap)_etc and the check if
>we need to call them as with the new changes, they became unreachable.
>(Nanley Chery)
>   - We'd rather calculate the level width and height using the shadow
>   miptree instead of the main in intel_miptree_update_etc_shadow_levels of
>   intel_mipmap_tree.c (Nanley Chery)
> ---
>  .../drivers/dri/i965/brw_wm_surface_state.c   |   5 +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 185 +++---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  24 +++
>  3 files changed, 147 insertions(+), 67 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
b/src/mesa/drivers/dri/i965/brw_w

Re: [Mesa-dev] [PATCH v3 2/5] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.

2019-02-08 Thread Nanley Chery
On Thu, Feb 07, 2019 at 06:00:19PM +0200, Eleni Maria Stea wrote:
> GPUs Gen < 8 cannot sample ETC2 formats. So far, they converted the
> compressed EAC/ETC2 images to non-compressed RGBA images. When
> GetCompressed* functions were called, the pixels were returned in this
> RGBA format and not the compressed format that was expected.
> 
> Trying to fix this problem, we use a secondary shadow miptree to store the
> decompressed data for the rendering and the main miptree to store the
> compressed for the Get functions to work. Each time that the main miptree
> is written with compressed data, we decompress them to RGB and update the
> shadow. Then we use the shadow for rendering.
> 
> v2:
>- Fixes in the commit message (Nanley Chery)
>- Reversed the changes in brw_get_texture_swizzle and swapped the b, g
>values at the time that we decompress the data in the function:
>intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
>- Simplified the format checks in the miptree_create function of the
>intel_mipmap_tree.c and reserved the call of the
>intel_lower_compressed_format for the case that we are faking the ETC
>support (Nanley Chery)
>- Removed the check for the auxiliary usage for the shadow miptree at
>creation (miptree_create of intel_mipmap_tree.c) as we won't use
>auxiliary buffers with these types of trees (Nanley Chery)
>- Set the etc_format of the non-ETC miptrees to MESA_FORMAT_NONE and
>removed the unnecessary checks (Nanley Chery)
>- Fixed an unrelated indentation change (Nanley Chery)
>- Modified the function intel_miptree_finish_write to set the
>mt->shadow_needs_update to true to catch all the cases when we need to
>update the miptree (Nanley Chery)
>- In order to update the shadow miptree during the unmap of the
>main and always map the main (Nanley Chery) the following change was
>necessary: Split the previous update function that was updating all
>the mipmap levels and use two functions instead: one that updates one
>level and one that updates all of them. Used the first during unmap
>and the second before the rendering.
>- Removed the BRW_MAP_ETC_BIT flag and the mechanism to decide which
>miptree should be mapped each time and reversed all the changes in the
>higher level texture functions that upload data to textures as they
>aren't needed anymore.
>- Replaced the boolean needs_fake_etc with an inline function that
>checks when we need to fake the ETC compression (Nanley Chery)
>- Removed the initialization of the strides in the update function as
>the values will be overwritten by the intel_miptree_map call (Nanley
>Chery)
>- Used minify instead of division in the new update function
>intel_miptree_update_etc_shadow_levels in intel_mipmap_tree.c (Nanley
>Chery)
>- Removed the depth from the calculation of the number of slices in
>the new update function (intel_miptree_update_etc_shadow_levels of
>intel_mipmap_tree.c) as we don't need to support 3D ETC images.
>(Nanley Chery)
> 
> v3:
>   - Renamed the rgba_fmt in function miptree_create
>   (intel_mipmap_tree.c) to decomp_format as the format is not always in
>   rgba order. (Nanley Chery)
>   - Documented the new usage for the shadow miptree in the comment above
>   the field in the intel_miptree struct in intel_mipmap_tree.h (Nanley
>   Chery)
>   - Removed the redundant flags from the mapping of the miptrees in
>   intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
>   - Fixed the switch from surface's logical level to physical level in
>   the intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c
>   (Nanley Chery)
>   - Excluded the Baytrail GPUs from the check for the ETC emulation as
>   they support the ETC formats natively. (Nanley Chery)
>   - Simplified the check if the format is BGRA in
>   intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
> ---
>  .../drivers/dri/i965/brw_wm_surface_state.c   |   5 +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 130 --
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  24 
>  3 files changed, 149 insertions(+), 10 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
> b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> index 618e2ab35bc..c2cf34aee71 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> @@ -521,7 +521,7 @@ static void brw_update_texture_surface(struct gl_context 
> *ctx,
>*/
>   mesa_fmt = mt->format;
>} else if (mt->etc_format != MESA_FORM

Re: [Mesa-dev] [PATCH v3 2/5] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.

2019-02-08 Thread Nanley Chery
On Fri, Feb 08, 2019 at 12:55:20PM +0200, Eleni Maria Stea wrote:
> Hi Nanley,
> 
> On Thu, 7 Feb 2019 15:46:29 -0800
> Nanley Chery  wrote:
>  >  
> > > @@ -3825,10 +3849,20 @@ intel_miptree_unmap(struct brw_context *brw,
> > > DBG("%s: mt %p (%s) level %d slice %d\n", __func__,
> > > mt, _mesa_get_format_name(mt->format), level, slice);
> > >  
> > > +   level_w = minify(mt->surf.phys_level0_sa.width,
> > > +level - mt->first_level);
> > > +   level_h = minify(mt->surf.phys_level0_sa.height,
> > > +level - mt->first_level);
> > > +
> > > if (map->unmap)
> > >  map->unmap(brw, mt, map, level, slice);
> > >  
> > > intel_miptree_release_map(mt, level, slice);
> > > +
> > > +   if (intel_miptree_has_etc_shadow(brw, mt) &&
> > > mt->shadow_needs_update) {
> > > +  intel_miptree_update_etc_shadow(brw, mt, level, slice,
> > > level_w,
> > > +  level_h);
> > > +   }  
> > 
> > With the next patch applied, the change in this function becomes
> > unnecessary. Is there any reason you're leaving it around?
> 
> On second thought, I believe that this change is necessary after all.
> There is a problem if we remove it:
> 
> When we generate mipmaps we need to update the shadow for each level.
> As the update is done per level during unmap, if we remove the call we
> end-up with the first level correctly updated but all the others empty.
> 
> An example:
> git clone https://github.com/hikiko/test-compression.git
> make
> ./test compressed/full.tex
> 

That's a nice test :)

> This test loads dumped compressed mipmap levels from the full.tex and
> displays them, if you run it with the per level update inside the unmap
> you will see all the mipmap levels. Without, you will see only the
> first, like here: https://imgur.com/a/VvS0CYC
> 
> Do you have any suggestion on how I could bypass this problem?
> 

Yes. The cause of this problem is that after
intel_miptree_update_etc_shadow_levels() calls
intel_miptree_update_etc_shadow() on the first level, the miptree is
marked as not needing an update. Therefore subsequent calls to
intel_miptree_update_etc_shadow() return early without updating other
levels. One way to fix this is to move the responsibility of marking the
miptree as updated from intel_miptree_update_etc_shadow() to its
callers.
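
As a rough sketch of that fix (the struct and function names below are
stand-ins I made up for illustration, not the actual i965 code), the
per-level helper stops touching the flag and the caller clears it once
the loop over all levels is done. The buggy shape is kept next to it for
comparison:

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_LEVELS 4

/* Hypothetical stand-in for the miptree; not the real intel_mipmap_tree. */
struct fake_miptree {
   bool shadow_needs_update;
   bool level_updated[NUM_LEVELS];
};

/* Buggy shape: the per-level helper clears the flag itself, so every
 * call after the first one returns early.
 */
static void
update_level_buggy(struct fake_miptree *mt, int level)
{
   if (!mt->shadow_needs_update)
      return;
   mt->level_updated[level] = true;
   mt->shadow_needs_update = false;
}

static void
update_all_levels_buggy(struct fake_miptree *mt)
{
   for (int level = 0; level < NUM_LEVELS; level++)
      update_level_buggy(mt, level);
}

/* Fixed shape: the helper only does the per-level work... */
static void
update_level(struct fake_miptree *mt, int level)
{
   mt->level_updated[level] = true;
}

/* ...and the caller marks the miptree as updated after all levels. */
static void
update_all_levels(struct fake_miptree *mt)
{
   if (!mt->shadow_needs_update)
      return;
   for (int level = 0; level < NUM_LEVELS; level++)
      update_level(mt, level);
   mt->shadow_needs_update = false;
}

/* Counts how many levels have been refreshed. */
static int
count_updated(const struct fake_miptree *mt)
{
   int n = 0;
   for (int level = 0; level < NUM_LEVELS; level++)
      n += mt->level_updated[level] ? 1 : 0;
   return n;
}
```

With the buggy shape only level 0 ends up refreshed, which matches the
"first level correct, the others empty" symptom described above.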

I also found another bug. I'll add a comment on the problematic patch.

> Thanks again,
> Eleni
> 
> 
> 
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 5/5] i965: Enabled the OES_copy_image extension on Gen 7 GPUs

2019-02-07 Thread Nanley Chery
On Thu, Feb 07, 2019 at 07:17:32PM +0200, Eleni Maria Stea wrote:
> On Thu, 7 Feb 2019 11:18:59 -0500
> Ilia Mirkin  wrote:
> 
> > On Thu, Feb 7, 2019 at 2:49 AM Eleni Maria Stea 
> > wrote:
> > >
> > > On Wed, 6 Feb 2019 12:12:27 -0800
> > > Nanley Chery  wrote:
> > >  
> > > > > +   * For now, we can't enable OES_texture_view on Gen 7
> > > > > because of
> > > > > +   * some piglit failures coming from
> > > > > +   * piglit/tests/spec/arb_texture_view/rendering-formats.c
> > > > > that need
> > > > > +   * investigation.
> > > > > */  
> > > >
> > > > What kind of failures are you seeing? I'd imagine texture views to
> > > > work with this version of your series.
> > > >  
> > >
> > > Hi Nanley,
> > >
> > > If you run the piglit test: arb_texture_view-rendering-format, and
> > > grep for failures on HSW:
> > >
> > > PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_R32F" : "fail"}}
> > > PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RG16F" : "fail"}}
> > > PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RG16I" : "fail"}}
> > > PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RG16_SNORM" : "fail"}}
> > > PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RGBA8I" : "fail"}}
> > > PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RGBA8_SNORM" :
> > > "fail"}}
> > >
> > > I remember seeing similar errors on Ivy too. They must be
> > > irrelevant to the ETC support but as this test passes on BDW where
> > > the extension is enabled, I didn't enable it on Gen 7 for the
> > > > moment. I think I had discussed these failures with Kenneth
> > > > before I disabled them, but I didn't investigate them further
> > > after that.  
> > 
> > Do you also see the failures with desktop GL (and ARB_texture_view)?
> > If not, that'd be very surprising.
> > 
> > Note that the piglit test arb_texture_view-rendering-formats is the
> > desktop GL test. arb_texture_view-rendering-formats_gles3 is the ES
> > version.
> > 

Good point.

> >   -ilia
> > 
> 
> Hi Ilia,
> 
> I just checked on HSW and IVY with my final patches (sent a few minutes
> before your reply) and:
> 
> HSW:
> 
> extension disabled: the desktop test passes but we receive the following
> error several times:
> 
> User Error: GL_INVALID_OPERATION in glTextureView(internalformat X not
> compatible with origtexture Y) in each subtest.
> 
> extension enabled: I see the same error but now both the desktop and
> gles versions pass (which wasn't the case when I checked last week with
> my previous patches)
> 
> I could probably enable it now on gen >= 75, if you and Nanley (CC-ed)
> are OK with this decision. What do you think?

If I execute the following on my HSW:

$ MESA_EXTENSION_OVERRIDE=GL_OES_texture_view \
  ./bin/arb_texture_view-rendering-formats_gles3 | grep fail

I still get the errors you mentioned above: 

PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_R32F" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RG16F" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RG16I" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RG16_SNORM" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RGBA8I" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RGBA8_SNORM" : "fail"}}

This is with a mesa that doesn't have your patches applied, but that
shouldn't matter with the formats listed above.

If I ran the test above correctly, I think we should leave it disabled.

-Nanley

> 
> on Ivy:
> ---
> extension disabled: the desktop version of the test
> fails with the failures below (and the gles is skipped) 
> 
> extension enabled: both the desktop and the gles versions
> fail and the failures are the same (see below)
> 
> PIGLIT: {"subtest": {"clear GL_RGBA16_SNORM as GL_RG32F" : "fail"}}
> PIGLIT: {"subtest": {"clear GL_RGBA16_SNORM as GL_RG32UI" : "fail"}}
> PIGLIT: {"subtest": {"clear GL_RGBA16_SNORM as GL_RG32I" : "fail"}}
> PIGLIT: {"subtest": {"clear GL_RGBA16_SNORM as GL_RGBA16F" : "fail"}}
> PI

Re: [Mesa-dev] [PATCH v3 2/5] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.

2019-02-07 Thread Nanley Chery
On Thu, Feb 07, 2019 at 06:00:19PM +0200, Eleni Maria Stea wrote:
> GPUs Gen < 8 cannot sample ETC2 formats. So far, they converted the
> compressed EAC/ETC2 images to non-compressed RGBA images. When
> GetCompressed* functions were called, the pixels were returned in this
> RGBA format and not the compressed format that was expected.
> 
> Trying to fix this problem, we use a secondary shadow miptree to store the
> decompressed data for the rendering and the main miptree to store the
> compressed for the Get functions to work. Each time that the main miptree
> is written with compressed data, we decompress them to RGB and update the
> shadow. Then we use the shadow for rendering.
> 
> v2:
>- Fixes in the commit message (Nanley Chery)
>- Reversed the changes in brw_get_texture_swizzle and swapped the b, g
>values at the time that we decompress the data in the function:
>intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
>- Simplified the format checks in the miptree_create function of the
>intel_mipmap_tree.c and reserved the call of the
>intel_lower_compressed_format for the case that we are faking the ETC
>support (Nanley Chery)
>- Removed the check for the auxiliary usage for the shadow miptree at
>creation (miptree_create of intel_mipmap_tree.c) as we won't use
>auxiliary buffers with these types of trees (Nanley Chery)
>- Set the etc_format of the non-ETC miptrees to MESA_FORMAT_NONE and
>removed the unnecessary checks (Nanley Chery)
>- Fixed an unrelated indentation change (Nanley Chery)
>- Modified the function intel_miptree_finish_write to set the
>mt->shadow_needs_update to true to catch all the cases when we need to
>update the miptree (Nanley Chery)
>- In order to update the shadow miptree during the unmap of the
>main and always map the main (Nanley Chery) the following change was
>necessary: Split the previous update function that was updating all
>the mipmap levels and use two functions instead: one that updates one
>level and one that updates all of them. Used the first during unmap
>and the second before the rendering.
>- Removed the BRW_MAP_ETC_BIT flag and the mechanism to decide which
>miptree should be mapped each time and reversed all the changes in the
>higher level texture functions that upload data to textures as they
>aren't needed anymore.
>- Replaced the boolean needs_fake_etc with an inline function that
>checks when we need to fake the ETC compression (Nanley Chery)
>- Removed the initialization of the strides in the update function as
>the values will be overwritten by the intel_miptree_map call (Nanley
>Chery)
>- Used minify instead of division in the new update function
>intel_miptree_update_etc_shadow_levels in intel_mipmap_tree.c (Nanley
>Chery)
>- Removed the depth from the calculation of the number of slices in
>the new update function (intel_miptree_update_etc_shadow_levels of
>intel_mipmap_tree.c) as we don't need to support 3D ETC images.
>(Nanley Chery)
> 
> v3:
>   - Renamed the rgba_fmt in function miptree_create
>   (intel_mipmap_tree.c) to decomp_format as the format is not always in
>   rgba order. (Nanley Chery)
>   - Documented the new usage for the shadow miptree in the comment above
>   the field in the intel_miptree struct in intel_mipmap_tree.h (Nanley
>   Chery)
>   - Removed the redundant flags from the mapping of the miptrees in
>   intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
>   - Fixed the switch from surface's logical level to physical level in
>   the intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c
>   (Nanley Chery)
>   - Excluded the Baytrail GPUs from the check for the ETC emulation as
>   they support the ETC formats natively. (Nanley Chery)
>   - Simplified the check if the format is BGRA in
>   intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
> ---
>  .../drivers/dri/i965/brw_wm_surface_state.c   |   5 +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 130 --
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  24 
>  3 files changed, 149 insertions(+), 10 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
> b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> index 618e2ab35bc..c2cf34aee71 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> @@ -521,7 +521,7 @@ static void brw_update_texture_surface(struct gl_context 
> *ctx,
>*/
>   mesa_fmt = mt->format;
>} else if (mt->etc_format != MESA_FORM

Re: [Mesa-dev] [PATCH v2 5/5] i965: Enabled the OES_copy_image extension on Gen 7 GPUs

2019-02-06 Thread Nanley Chery
On Sun, Feb 03, 2019 at 03:07:36PM +0200, Eleni Maria Stea wrote:
> OES_copy_image extension was disabled on Gen7 due to the lack of support
> for ETC2 images. Enabled it back. (Kenneth Graunke)
> ---
>  src/mesa/drivers/dri/i965/intel_extensions.c | 18 ++
>  1 file changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
> b/src/mesa/drivers/dri/i965/intel_extensions.c
> index 3a95be58a63..d2e232f3ff1 100644
> --- a/src/mesa/drivers/dri/i965/intel_extensions.c
> +++ b/src/mesa/drivers/dri/i965/intel_extensions.c
> @@ -287,14 +287,24 @@ intelInitExtensions(struct gl_context *ctx)
> }
>  
> if (devinfo->gen >= 8 || devinfo->is_baytrail) {
> -  /* For now, we only enable OES_copy_image on platforms that support
> -   * ETC2 natively in hardware.  We would need more hacks to support it
> -   * elsewhere. Same with OES_texture_view.
> +  /*

There's a new blank line here.

> +   * For now, we can't enable OES_texture_view on Gen 7 because of
> +   * some piglit failures coming from
> +   * piglit/tests/spec/arb_texture_view/rendering-formats.c that need
> +   * investigation.
> */

What kind of failures are you seeing? I'd imagine texture views to work
with this version of your series.

> -  ctx->Extensions.OES_copy_image = true;
>ctx->Extensions.OES_texture_view = true;
> }
>  
> +   if (devinfo->gen >= 7) {
> +  /*

There's a new blank line here.

-Nanley

> +   * We can safely enable OES_copy_image on Gen 7, since we emulate
> +   * the ETC2 support using the shadow_miptree to store the
> +   * compressed data.
> +   */
> +  ctx->Extensions.OES_copy_image = true;
> +   }
> +
> if (devinfo->gen >= 8) {
>ctx->Extensions.ARB_gpu_shader_int64 = true;
>/* requires ARB_gpu_shader_int64 */
> -- 
> 2.20.1
> 


Re: [Mesa-dev] [PATCH v2 2/5] i965: Removed assertions from intel_miptree_map_etc

2019-02-06 Thread Nanley Chery
On Sun, Feb 03, 2019 at 03:07:33PM +0200, Eleni Maria Stea wrote:
> The assertions on GL_MAP_WRITE_BIT and GL_MAP_INVALIDATE_RANGE_BIT
> in intel_miptree_map_etc will fail when the ETC miptree is mapped for
> reading. As we are about to fix the GetCompressed* functions in the
> following patches and allow the reading from etc miptrees, we have to
> remove them.
> 
> Fixes the crash of the test
> KHR-GL45.direct_state_access.textures_compressed_subimage on Gen 7 GPUs.
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 479188fd1c8..0a25dfd0161 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -3497,9 +3497,6 @@ intel_miptree_map_etc(struct brw_context *brw,
>assert(mt->format == MESA_FORMAT_R8G8B8X8_UNORM);
> }
>  
> -   assert(map->mode & GL_MAP_WRITE_BIT);
> -   assert(map->mode & GL_MAP_INVALIDATE_RANGE_BIT);
> -

Isn't this function unreachable with the next patch?

> intel_miptree_access_raw(brw, mt, level, slice, true);
>  
> map->stride = _mesa_format_row_stride(mt->etc_format, map->w);
> -- 
> 2.20.1
> 


Re: [Mesa-dev] [PATCH 4/8] i965: Update the shadow miptree from the main to fake the ETC2 compression

2019-02-06 Thread Nanley Chery
On Sun, Feb 03, 2019 at 03:59:42PM +0200, Eleni Maria Stea wrote:
> On Fri, 18 Jan 2019 17:09:03 -0800
> Nanley Chery  wrote:
> 
> > On Mon, Nov 19, 2018 at 10:54:08AM +0200, Eleni Maria Stea wrote:
> [...]
> > > +   int img_d = smt->surf.logical_level0_px.depth;  
> > 
> > I don't think 3D ETC textures are possible. From the GL4.6 spec:
> > 
> > An INVALID_OPERATION error is generated by
> > CompressedTexImage3D if internalformat is one of the EAC, ETC2, or
> > RGTC formats and either border is non-zero, or target is not
> > TEXTURE_2D_ARRAY.
> 
> Hi Nanley,
> 
> Thanks for pointing this out. I've made the change in my new series
> of patches but after giving it a second thought, I believe that I'd
> rather put back the depth in the calculation of num_slices:
> 
> As I understand the spec, if the border is zero, the 3D images should
> be supported. Mesa already checks the border value in the file:
> src/mesa/main/teximage.c function: compressed_texture_error_check and
> has a comment:
> 

OpenGL 4.6 says:

   An INVALID_OPERATION error is generated by CompressedTexImage3D if
   internalformat is one of the EAC, ETC2, or RGTC formats and either
   border is non-zero, or target is not TEXTURE_2D_ARRAY.

In this case, it means that we should return an error if the texture
target isn't a 2D array (regardless of the border value).

> /* No compressed formats support borders at this time */
> 
> and so only ETC/EAC compressed formats without border will reach the
> update function and we should support them.
> 

In this function, 3D ETC compressed textures return a GL error where it
calls _mesa_target_can_be_compressed() at line 2034:

   if (!_mesa_target_can_be_compressed(ctx, target, internalFormat, &error)) {
  reason = "target";
  goto error;
   }

> Also, I see that we have some CTS tests that call the
> CompressedTexImage3D for ETC/EAC formats with 0 border value, so I
> suppose it is expected to have 3D images of these formats.
> 

As seen from the citation above, ETC/EAC formats must have a 0 border
value and a target of TEXTURE_2D_ARRAY when calling
CompressedTexImage3D.
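
To spell out how I read that rule, here is a minimal sketch. The enum
and function names are hypothetical and only model the quoted sentence;
this is not Mesa's error-checking code. Both conditions have to hold, so
a zero border by itself does not make a TEXTURE_3D target legal:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the GL targets; the values are arbitrary. */
enum fake_target {
   FAKE_TEXTURE_3D,
   FAKE_TEXTURE_2D_ARRAY,
};

/* Returns true if a CompressedTexImage3D call with an EAC/ETC2/RGTC
 * format would pass the rule quoted above, false if it would generate
 * INVALID_OPERATION.
 */
static bool
etc2_teximage3d_is_legal(enum fake_target target, int border)
{
   /* The spec text errors out if "either" condition is violated. */
   return border == 0 && target == FAKE_TEXTURE_2D_ARRAY;
}
```

So the CTS tests that pass a border of 0 are only valid when they also
use a 2D array target, and the slices come from the array length rather
than from a 3D depth.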

> What do you think?
> 
> Thank you in advance,
> Eleni


Re: [Mesa-dev] [PATCH v2 3/5] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.

2019-02-05 Thread Nanley Chery
On Sun, Feb 03, 2019 at 03:07:34PM +0200, Eleni Maria Stea wrote:
> GPUs Gen < 8 cannot sample ETC2 formats. So far, they converted the
> compressed EAC/ETC2 images to non-compressed RGBA images. When
> GetCompressed* functions were called, the pixels were returned in this
> RGBA format and not the compressed format that was expected.
> 
> Trying to fix this problem, we use a secondary shadow miptree to store the
> decompressed data for the rendering and the main miptree to store the
> compressed for the Get functions to work. Each time that the main miptree
> is written with compressed data, we decompress them to RGB and update the
> shadow. Then we use the shadow for rendering.
> 
> v2:
>- Fixes in the commit message (Nanley Chery)
>- Reversed the changes in brw_get_texture_swizzle and swapped the b, g
>values at the time that we decompress the data in the function:
>intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
>- Simplified the format checks in the miptree_create function of the
>intel_mipmap_tree.c and reserved the call of the
>intel_lower_compressed_format for the case that we are faking the ETC
>support (Nanley Chery)
>- Removed the check for the auxiliary usage for the shadow miptree at
>creation (miptree_create of intel_mipmap_tree.c) as we won't use
>auxiliary buffers with these types of trees (Nanley Chery)
>- Set the etc_format of the non-ETC miptrees to MESA_FORMAT_NONE and
>removed the unnecessary checks (Nanley Chery)
>- Fixed an unrelated indentation change (Nanley Chery)
>- Modified the function intel_miptree_finish_write to set the
>mt->shadow_needs_update to true to catch all the cases when we need to
>update the miptree (Nanley Chery)
>- In order to update the shadow miptree during the unmap of the
>main and always map the main (Nanley Chery) the following change was
>necessary: Split the previous update function that was updating all
>the mipmap levels and use two functions instead: one that updates one
>level and one that updates all of them. Used the first during unmap
>and the second before the rendering.
>- Removed the BRW_MAP_ETC_BIT flag and the mechanism to decide which
>miptree should be mapped each time and reversed all the changes in the
>higher level texture functions that upload data to textures as they
>aren't needed anymore.
>- Replaced the boolean needs_fake_etc with an inline function that
>checks when we need to fake the ETC compression (Nanley Chery)
>- Removed the initialization of the strides in the update function as
>the values will be overwritten by the intel_miptree_map call (Nanley
>Chery)
>- Used minify instead of division in the new update function
>intel_miptree_update_etc_shadow_levels in intel_mipmap_tree.c (Nanley
>Chery)
>- Removed the depth from the calculation of the number of slices in
>the new update function (intel_miptree_update_etc_shadow_levels of
>intel_mipmap_tree.c) as we don't need to support 3D ETC images.
>(Nanley Chery)
> ---
>  .../drivers/dri/i965/brw_wm_surface_state.c   |   5 +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 133 --
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  22 +++
>  3 files changed, 150 insertions(+), 10 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
> b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> index 618e2ab35bc..c2cf34aee71 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> @@ -521,7 +521,7 @@ static void brw_update_texture_surface(struct gl_context 
> *ctx,
>*/
>   mesa_fmt = mt->format;
>} else if (mt->etc_format != MESA_FORMAT_NONE) {
> - mesa_fmt = mt->format;
> + mesa_fmt = mt->shadow_mt->format;
>} else if (plane > 0) {
>   mesa_fmt = mt->format;
>} else {
> @@ -581,6 +581,9 @@ static void brw_update_texture_surface(struct gl_context 
> *ctx,
>   assert(mt->shadow_mt && !mt->shadow_needs_update);
>   mt = mt->shadow_mt;
>   format = ISL_FORMAT_R8_UINT;
> +  } else if (intel_miptree_needs_fake_etc(brw, mt)) {
> + assert(mt->shadow_mt);

We can be even safer if we assert that the shadow doesn't need updating
at this time.
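
Something along these lines, shown here with a made-up struct and helper
so that it stands alone (the real code would use the existing miptree
and the needs-fake-ETC check from this patch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the miptree; not the real intel_mipmap_tree. */
struct fake_miptree {
   struct fake_miptree *shadow_mt;
   bool shadow_needs_update;
};

/* Picks the miptree to sample from. When ETC is being faked, the shadow
 * must exist and must already be up to date by the time we get here.
 */
static struct fake_miptree *
miptree_for_sampling(struct fake_miptree *mt, bool needs_fake_etc)
{
   if (needs_fake_etc) {
      assert(mt->shadow_mt && !mt->shadow_needs_update);
      return mt->shadow_mt;
   }
   return mt;
}
```

This mirrors the assertion already used for the R8_UINT shadow case a
few lines above in the same function.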

> + mt = mt->shadow_mt;
>}
>  
>const int surf_index = surf_offset - &brw->wm.base.surf_offset[0];
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 0a25dfd0161..3f

Re: [Mesa-dev] [PATCH v2 08/32] intel/isl: Add gen10 variants of Yf and Ys tiling

2019-02-01 Thread Nanley Chery
On Fri, Oct 12, 2018 at 01:46:38PM -0500, Jason Ekstrand wrote:
> ---
>  src/intel/isl/isl.c   |  9 +++--
>  src/intel/isl/isl.h   | 12 ++--
>  src/intel/isl/isl_drm.c   |  2 ++
>  src/intel/isl/isl_gen7.c  |  8 +++-
>  src/intel/isl/isl_surface_state.c |  2 ++
>  5 files changed, 28 insertions(+), 5 deletions(-)
> 

This patch is
Reviewed-by: Nanley Chery 
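
For reference, the width/height expressions in the isl_tiling_get_info() hunk quoted below can be checked in isolation. The following is only a minimal sketch under stated assumptions: example_ffs(), struct example_tile_size_B and example_std_y_tile_size_B() are hypothetical names that do not exist in isl, and example_ffs() is a local stand-in for ffs(). It shows that the formula yields a 4K tile for Yf and a 64K tile for Ys for power-of-two element sizes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for ffs(): 1-based index of the lowest set bit, 0 if v == 0. */
static inline int
example_ffs(uint32_t v)
{
   int i = 0;

   if (v == 0)
      return 0;

   while (!(v & 1u)) {
      v >>= 1;
      i++;
   }

   return i + 1;
}

/* Hypothetical type: tile width in bytes and tile height in rows. */
struct example_tile_size_B {
   uint32_t width;
   uint32_t height;
};

/* Same arithmetic as the quoted hunk; bs is the element size in bytes. */
static inline struct example_tile_size_B
example_std_y_tile_size_B(uint32_t bs, bool is_Ys)
{
   assert(bs > 0);

   const int shift = example_ffs(bs) / 2;

   return (struct example_tile_size_B) {
      .width  = 1u << (6 + shift + (2 * is_Ys)),
      .height = 1u << (6 - shift + (2 * is_Ys)),
   };
}
```

With bs = 4, the Yf case gives 128 bytes by 32 rows (4096 bytes) and the Ys case gives 512 bytes by 128 rows (65536 bytes), which matches the "four" and "sixty-four" naming in the enum comments.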

> diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> index 392c15ca3fb..3ffc6f627b2 100644
> --- a/src/intel/isl/isl.c
> +++ b/src/intel/isl/isl.c
> @@ -218,8 +218,11 @@ isl_tiling_get_info(enum isl_tiling tiling,
>break;
>  
> case ISL_TILING_GEN9_Yf:
> -   case ISL_TILING_GEN9_Ys: {
> -  bool is_Ys = tiling == ISL_TILING_GEN9_Ys;
> +   case ISL_TILING_GEN9_Ys:
> +   case ISL_TILING_GEN10_Yf:
> +   case ISL_TILING_GEN10_Ys: {
> +  bool is_Ys = tiling == ISL_TILING_GEN9_Ys ||
> +   tiling == ISL_TILING_GEN10_Ys;
>  
>assert(bs > 0);
>unsigned width = 1 << (6 + (ffs(bs) / 2) + (2 * is_Ys));
> @@ -375,7 +378,9 @@ isl_surf_choose_tiling(const struct isl_device *dev,
>CHOOSE(ISL_TILING_LINEAR);
> }
>  
> +   CHOOSE(ISL_TILING_GEN10_Ys);
> CHOOSE(ISL_TILING_GEN9_Ys);
> +   CHOOSE(ISL_TILING_GEN10_Yf);
> CHOOSE(ISL_TILING_GEN9_Yf);
> CHOOSE(ISL_TILING_Y0);
> CHOOSE(ISL_TILING_X);
> diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
> index 1c7990f2dc7..200bfbfa85b 100644
> --- a/src/intel/isl/isl.h
> +++ b/src/intel/isl/isl.h
> @@ -462,6 +462,8 @@ enum isl_tiling {
> ISL_TILING_Y0, /**< Legacy Y tiling */
> ISL_TILING_GEN9_Yf, /**< Standard 4K tiling. The 'f' means "four". */
> ISL_TILING_GEN9_Ys, /**< Standard 64K tiling. The 's' means "sixty-four". 
> */
> +   ISL_TILING_GEN10_Yf, /**< Standard 4K tiling. The 'f' means "four". */
> +   ISL_TILING_GEN10_Ys, /**< Standard 64K tiling. The 's' means 
> "sixty-four". */
> ISL_TILING_HIZ, /**< Tiling format for HiZ surfaces */
> ISL_TILING_CCS, /**< Tiling format for CCS surfaces */
>  };
> @@ -477,6 +479,8 @@ typedef uint32_t isl_tiling_flags_t;
>  #define ISL_TILING_Y0_BIT (1u << ISL_TILING_Y0)
>  #define ISL_TILING_GEN9_Yf_BIT(1u << ISL_TILING_GEN9_Yf)
>  #define ISL_TILING_GEN9_Ys_BIT(1u << ISL_TILING_GEN9_Ys)
> +#define ISL_TILING_GEN10_Yf_BIT   (1u << ISL_TILING_GEN10_Yf)
> +#define ISL_TILING_GEN10_Ys_BIT   (1u << ISL_TILING_GEN10_Ys)
>  #define ISL_TILING_HIZ_BIT(1u << ISL_TILING_HIZ)
>  #define ISL_TILING_CCS_BIT(1u << ISL_TILING_CCS)
>  #define ISL_TILING_ANY_MASK   (~0u)
> @@ -485,11 +489,15 @@ typedef uint32_t isl_tiling_flags_t;
>  /** Any Y tiling, including legacy Y tiling. */
>  #define ISL_TILING_ANY_Y_MASK (ISL_TILING_Y0_BIT | \
> ISL_TILING_GEN9_Yf_BIT | \
> -   ISL_TILING_GEN9_Ys_BIT)
> +   ISL_TILING_GEN9_Ys_BIT | \
> +   ISL_TILING_GEN10_Yf_BIT | \
> +   ISL_TILING_GEN10_Ys_BIT)
>  
>  /** The Skylake BSpec refers to Yf and Ys as "standard tiling formats". */
>  #define ISL_TILING_STD_Y_MASK (ISL_TILING_GEN9_Yf_BIT | \
> -   ISL_TILING_GEN9_Ys_BIT)
> +   ISL_TILING_GEN9_Ys_BIT | \
> +   ISL_TILING_GEN10_Yf_BIT | \
> +   ISL_TILING_GEN10_Ys_BIT)
>  /** @} */
>  
>  /**
> diff --git a/src/intel/isl/isl_drm.c b/src/intel/isl/isl_drm.c
> index 62fdd22d10d..03f433a1058 100644
> --- a/src/intel/isl/isl_drm.c
> +++ b/src/intel/isl/isl_drm.c
> @@ -46,6 +46,8 @@ isl_tiling_to_i915_tiling(enum isl_tiling tiling)
> case ISL_TILING_W:
> case ISL_TILING_GEN9_Yf:
> case ISL_TILING_GEN9_Ys:
> +   case ISL_TILING_GEN10_Yf:
> +   case ISL_TILING_GEN10_Ys:
> case ISL_TILING_HIZ:
> case ISL_TILING_CCS:
>return I915_TILING_NONE;
> diff --git a/src/intel/isl/isl_gen7.c b/src/intel/isl/isl_gen7.c
> index 91cea299abc..f6f7e1ba7dc 100644
> --- a/src/intel/isl/isl_gen7.c
> +++ b/src/intel/isl/isl_gen7.c
> @@ -197,16 +197,22 @@ isl_gen6_filter_tiling(const struct isl_device *dev,
> assert(ISL_DEV_USE_SEPARATE_STENCIL(dev));
>  
> /* Clear flags unsupported on this hardware */
> -   if (ISL_DEV_GEN(dev) < 9) {
> +

Re: [Mesa-dev] [PATCH v2 07/32] intel/isl: Rename ISL_TILING_Yf/s to ISL_TILING_GEN9_Yf/s

2019-02-01 Thread Nanley Chery
On Wed, Jan 23, 2019 at 02:25:14PM -0800, Nanley Chery wrote:
> On Fri, Oct 12, 2018 at 01:46:37PM -0500, Jason Ekstrand wrote:
> > The Yf and Ys tilings change a bit between gen9 and gen10 so we have to
> > be able to distinguish between them.
> > ---
> >  src/intel/isl/isl.c   | 12 ++--
> >  src/intel/isl/isl.h   | 16 
> >  src/intel/isl/isl_drm.c   |  4 ++--
> >  src/intel/isl/isl_gen7.c  |  8 
> >  src/intel/isl/isl_gen9.c  |  2 +-
> >  src/intel/isl/isl_surface_state.c |  4 ++--
> >  6 files changed, 23 insertions(+), 23 deletions(-)
> > 
> > diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> > index d6beee987b5..392c15ca3fb 100644
> > --- a/src/intel/isl/isl.c
> > +++ b/src/intel/isl/isl.c
> > @@ -217,9 +217,9 @@ isl_tiling_get_info(enum isl_tiling tiling,
> >phys_B = isl_extent2d(128, 32);
> >break;
> >  
> > -   case ISL_TILING_Yf:
> > -   case ISL_TILING_Ys: {
> > -  bool is_Ys = tiling == ISL_TILING_Ys;
> > +   case ISL_TILING_GEN9_Yf:
> > +   case ISL_TILING_GEN9_Ys: {
> > +  bool is_Ys = tiling == ISL_TILING_GEN9_Ys;
> >  
> >assert(bs > 0);
> >unsigned width = 1 << (6 + (ffs(bs) / 2) + (2 * is_Ys));
> > @@ -375,8 +375,8 @@ isl_surf_choose_tiling(const struct isl_device *dev,
> >CHOOSE(ISL_TILING_LINEAR);
> > }
> >  
> > -   CHOOSE(ISL_TILING_Ys);
> > -   CHOOSE(ISL_TILING_Yf);
> > +   CHOOSE(ISL_TILING_GEN9_Ys);
> > +   CHOOSE(ISL_TILING_GEN9_Yf);
> > CHOOSE(ISL_TILING_Y0);
> > CHOOSE(ISL_TILING_X);
> > CHOOSE(ISL_TILING_W);
> > @@ -715,7 +715,7 @@ isl_calc_phys_level0_extent_sa(const struct isl_device 
> > *dev,
> >   assert(dim_layout == ISL_DIM_LAYOUT_GEN4_2D ||
> >  dim_layout == ISL_DIM_LAYOUT_GEN6_STENCIL_HIZ);
> >  
> > -  if (tiling == ISL_TILING_Ys && info->samples > 1)
> > +  if (tiling == ISL_TILING_GEN9_Ys && info->samples > 1)
> >   isl_finishme("%s:%s: multisample TileYs layout", __FILE__, 
> > __func__);
> >  
> 
> Shouldn't the next patch be updated with a similar change?
> 

This block is never deleted in this series.

> >switch (msaa_layout) {
> > diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
> > index 4f8d38e22fb..1c7990f2dc7 100644
> > --- a/src/intel/isl/isl.h
> > +++ b/src/intel/isl/isl.h
> > @@ -460,8 +460,8 @@ enum isl_tiling {
> > ISL_TILING_W,
> > ISL_TILING_X,
> > ISL_TILING_Y0, /**< Legacy Y tiling */
> > -   ISL_TILING_Yf, /**< Standard 4K tiling. The 'f' means "four". */
> > -   ISL_TILING_Ys, /**< Standard 64K tiling. The 's' means "sixty-four". */
> > +   ISL_TILING_GEN9_Yf, /**< Standard 4K tiling. The 'f' means "four". */
> > +   ISL_TILING_GEN9_Ys, /**< Standard 64K tiling. The 's' means 
> > "sixty-four". */
> > ISL_TILING_HIZ, /**< Tiling format for HiZ surfaces */
> > ISL_TILING_CCS, /**< Tiling format for CCS surfaces */
> >  };
> > @@ -475,8 +475,8 @@ typedef uint32_t isl_tiling_flags_t;
> >  #define ISL_TILING_W_BIT  (1u << ISL_TILING_W)
> >  #define ISL_TILING_X_BIT  (1u << ISL_TILING_X)
> >  #define ISL_TILING_Y0_BIT (1u << ISL_TILING_Y0)
> > -#define ISL_TILING_Yf_BIT (1u << ISL_TILING_Yf)
> > -#define ISL_TILING_Ys_BIT (1u << ISL_TILING_Ys)
> > +#define ISL_TILING_GEN9_Yf_BIT(1u << ISL_TILING_GEN9_Yf)
> > +#define ISL_TILING_GEN9_Ys_BIT(1u << ISL_TILING_GEN9_Ys)
> >  #define ISL_TILING_HIZ_BIT(1u << ISL_TILING_HIZ)
> >  #define ISL_TILING_CCS_BIT(1u << ISL_TILING_CCS)
> >  #define ISL_TILING_ANY_MASK   (~0u)
> > @@ -484,12 +484,12 @@ typedef uint32_t isl_tiling_flags_t;
> >  
> >  /** Any Y tiling, including legacy Y tiling. */
> >  #define ISL_TILING_ANY_Y_MASK (ISL_TILING_Y0_BIT | \
> > -   ISL_TILING_Yf_BIT | \
> > -   ISL_TILING_Ys_BIT)
> > +   ISL_TILING_GEN9_Yf_BIT | \
> > +   ISL_TILING_GEN9_Ys_BIT)
> >  
> >  /** The Skylake BSpec refers to Yf and Ys as "standard tiling formats". */
> > -#define ISL_T

Re: [Mesa-dev] [PATCH v2 05/32] intel/isl: Use a 4D physical total extent for size calculations

2019-01-30 Thread Nanley Chery
On Fri, Oct 12, 2018 at 01:46:35PM -0500, Jason Ekstrand wrote:
> With Yf and Ys tiling, everything is actually four dimensional because
> we can have multiple depth or multisampled array slices in the same
> tile.  This commit just enhances the calculations so they can handle it.
> 
> Reviewed-by: Topi Pohjolainen 
> ---
>  src/intel/isl/isl.c | 73 ++---
>  1 file changed, 55 insertions(+), 18 deletions(-)
> 
> diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> index 6bc96e86cb5..a805facb1ae 100644
> --- a/src/intel/isl/isl.c
> +++ b/src/intel/isl/isl.c
> @@ -992,7 +992,7 @@ isl_calc_phys_total_extent_el_gen4_2d(
>const struct isl_extent4d *phys_level0_sa,
>enum isl_array_pitch_span array_pitch_span,
>uint32_t *array_pitch_el_rows,
> -  struct isl_extent2d *total_extent_el)
> +  struct isl_extent4d *phys_total_el)
>  {
> const struct isl_format_layout *fmtl = 
> isl_format_get_layout(info->format);
>  
> @@ -1005,10 +1005,12 @@ isl_calc_phys_total_extent_el_gen4_2d(
> image_align_sa, phys_level0_sa,
> array_pitch_span,
> &phys_slice0_sa);
> -   *total_extent_el = (struct isl_extent2d) {
> +   *phys_total_el = (struct isl_extent4d) {
>.w = isl_assert_div(phys_slice0_sa.w, fmtl->bw),
>.h = *array_pitch_el_rows * (phys_level0_sa->array_len - 1) +
> isl_assert_div(phys_slice0_sa.h, fmtl->bh),
> +  .d = 1,
> +  .a = 1,
> };
>  }
>  
> @@ -1023,7 +1025,7 @@ isl_calc_phys_total_extent_el_gen4_3d(
>const struct isl_extent3d *image_align_sa,
>const struct isl_extent4d *phys_level0_sa,
>uint32_t *array_pitch_el_rows,
> -  struct isl_extent2d *phys_total_el)
> +  struct isl_extent4d *phys_total_el)
>  {
> const struct isl_format_layout *fmtl = 
> isl_format_get_layout(info->format);
>  
> @@ -1070,9 +1072,11 @@ isl_calc_phys_total_extent_el_gen4_3d(
>  */
> *array_pitch_el_rows =
>isl_align_npot(phys_level0_sa->h, image_align_sa->h) / fmtl->bw;
> -   *phys_total_el = (struct isl_extent2d) {
> +   *phys_total_el = (struct isl_extent4d) {
>.w = isl_assert_div(total_w, fmtl->bw),
>.h = isl_assert_div(total_h, fmtl->bh),
> +  .d = 1,
> +  .a = 1,
> };
>  }
>  
> @@ -1088,7 +1092,7 @@ isl_calc_phys_total_extent_el_gen6_stencil_hiz(
>const struct isl_extent3d *image_align_sa,
>const struct isl_extent4d *phys_level0_sa,
>uint32_t *array_pitch_el_rows,
> -  struct isl_extent2d *phys_total_el)
> +  struct isl_extent4d *phys_total_el)
>  {
> const struct isl_format_layout *fmtl = 
> isl_format_get_layout(info->format);
>  
> @@ -1131,9 +1135,11 @@ isl_calc_phys_total_extent_el_gen6_stencil_hiz(
>  
> *array_pitch_el_rows =
>isl_assert_div(isl_align(H0, image_align_sa->h), fmtl->bh);
> -   *phys_total_el = (struct isl_extent2d) {
> +   *phys_total_el = (struct isl_extent4d) {
>.w = isl_assert_div(MAX(total_top_w, total_bottom_w), fmtl->bw),
>.h = isl_assert_div(total_h, fmtl->bh),
> +  .d = 1,
> +  .a = 1,
> };
>  }
>  
> @@ -1148,7 +1154,7 @@ isl_calc_phys_total_extent_el_gen9_1d(
>const struct isl_extent3d *image_align_sa,
>const struct isl_extent4d *phys_level0_sa,
>uint32_t *array_pitch_el_rows,
> -  struct isl_extent2d *phys_total_el)
> +  struct isl_extent4d *phys_total_el)
>  {
> MAYBE_UNUSED const struct isl_format_layout *fmtl = 
> isl_format_get_layout(info->format);
>  
> @@ -1168,9 +1174,11 @@ isl_calc_phys_total_extent_el_gen9_1d(
> }
>  
> *array_pitch_el_rows = 1;
> -   *phys_total_el = (struct isl_extent2d) {
> +   *phys_total_el = (struct isl_extent4d) {
>.w = isl_assert_div(slice_w, fmtl->bw),
>.h = phys_level0_sa->array_len,
> +  .d = 1,
> +  .a = 1,
> };
>  }
>  
> @@ -1188,7 +1196,7 @@ isl_calc_phys_total_extent_el(const struct isl_device 
> *dev,
>const struct isl_extent4d *phys_level0_sa,
>enum isl_array_pitch_span array_pitch_span,
>uint32_t *array_pitch_el_rows,
> -  struct isl_extent2d *total_extent_el)
> +  struct isl_extent4d *phys_total_el)
>  {
> switch (dim_layout) {
> case ISL_DIM_LAYOUT_GEN9_1D:
> @@ -1196,14 +1204,14 @@ isl_calc_phys_total_extent_el(const struct isl_device 
> *dev,
>isl_calc_phys_total_extent_el_gen9_1d(dev, info,
>  image_align_sa, phys_level0_sa,
>  array_pitch_el_rows,
> -total_extent_el);
> +phys_total_el);
>return;
> case ISL_DIM_LAYOUT_GEN4_2D:
>

Re: [Mesa-dev] [PATCH v2 04/32] intel/isl: Make tile logical extents four dimensional

2019-01-29 Thread Nanley Chery
On Fri, Oct 12, 2018 at 01:46:34PM -0500, Jason Ekstrand wrote:
> Reviewed-by: Topi Pohjolainen 
> ---
>  src/intel/isl/isl.c | 36 
>  src/intel/isl/isl.h |  2 +-
>  2 files changed, 25 insertions(+), 13 deletions(-)
> 
> diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> index a53fbf3da02..6bc96e86cb5 100644
> --- a/src/intel/isl/isl.c
> +++ b/src/intel/isl/isl.c
> @@ -164,7 +164,8 @@ isl_tiling_get_info(enum isl_tiling tiling,
>  struct isl_tile_info *tile_info)
>  {
> const uint32_t bs = format_bpb / 8;
> -   struct isl_extent2d logical_el, phys_B;
> +   struct isl_extent4d logical_el;
> +   struct isl_extent2d phys_B;
>  
> if (tiling != ISL_TILING_LINEAR && !isl_is_pow2(format_bpb)) {
>/* It is possible to have non-power-of-two formats in a tiled buffer.
> @@ -181,25 +182,25 @@ isl_tiling_get_info(enum isl_tiling tiling,
> switch (tiling) {
> case ISL_TILING_LINEAR:
>assert(bs > 0);
> -  logical_el = isl_extent2d(1, 1);
> +  logical_el = isl_extent4d(1, 1, 1, 1);
>phys_B = isl_extent2d(bs, 1);
>break;
>  
> case ISL_TILING_X:
>assert(bs > 0);
> -  logical_el = isl_extent2d(512 / bs, 8);
> +  logical_el = isl_extent4d(512 / bs, 8, 1, 1);
>phys_B = isl_extent2d(512, 8);
>break;
>  
> case ISL_TILING_Y0:
>assert(bs > 0);
> -  logical_el = isl_extent2d(128 / bs, 32);
> +  logical_el = isl_extent4d(128 / bs, 32, 1, 1);
>phys_B = isl_extent2d(128, 32);
>break;
>  
> case ISL_TILING_W:
>assert(bs == 1);
> -  logical_el = isl_extent2d(64, 64);
> +  logical_el = isl_extent4d(64, 64, 1, 1);
>/* From the Broadwell PRM Vol 2d, RENDER_SURFACE_STATE::SurfacePitch:
> *
> *"If the surface is a stencil buffer (and thus has Tile Mode set
> @@ -222,7 +223,7 @@ isl_tiling_get_info(enum isl_tiling tiling,
>unsigned width = 1 << (6 + (ffs(bs) / 2) + (2 * is_Ys));
>unsigned height = 1 << (6 - (ffs(bs) / 2) + (2 * is_Ys));
>  
> -  logical_el = isl_extent2d(width / bs, height);
> +  logical_el = isl_extent4d(width / bs, height, 1, 1);
>phys_B = isl_extent2d(width, height);
>break;
> }
> @@ -233,7 +234,7 @@ isl_tiling_get_info(enum isl_tiling tiling,
> * Y-tiling but actually has two HiZ columns per Y-tiled column.
> */
>assert(bs == 16);
> -  logical_el = isl_extent2d(16, 16);
> +  logical_el = isl_extent4d(16, 16, 1, 1);
>phys_B = isl_extent2d(128, 32);
>break;
>  
> @@ -256,7 +257,7 @@ isl_tiling_get_info(enum isl_tiling tiling,
> * is 128x256 elements.
> */
>assert(format_bpb == 1 || format_bpb == 2);
> -  logical_el = isl_extent2d(128, 256 / format_bpb);
> +  logical_el = isl_extent4d(128, 256 / format_bpb, 1, 1);
>phys_B = isl_extent2d(128, 32);
>break;
>  
> @@ -2307,7 +2308,10 @@ isl_tiling_get_intratile_offset_el(enum isl_tiling 
> tiling,
> struct isl_tile_info tile_info;
> isl_tiling_get_info(tiling, bpb, &tile_info);
>  
> +   /* Pitches must make sense with the tiling */
> assert(row_pitch_B % tile_info.phys_extent_B.width == 0);
> +   assert(array_pitch_el_rows % tile_info.logical_extent_el.d == 0);
> +   assert(array_pitch_el_rows % tile_info.logical_extent_el.a == 0);

I'm guessing this assertion is simply here for the divide operation
below.

>  
> /* For non-power-of-two formats, we need the address to be both tile and
>  * element-aligned.  The easiest way to achieve this is to work with a 
> tile
> @@ -2324,14 +2328,22 @@ isl_tiling_get_intratile_offset_el(enum isl_tiling 
> tiling,
> /* Compute the offset into the tile */
> *x_offset_el = total_x_offset_el % tile_info.logical_extent_el.w;
> *y_offset_el = total_y_offset_el % tile_info.logical_extent_el.h;
> -   assert(total_z_offset_el == 0);
> -   assert(total_array_offset == 0);
> -   *z_offset_el = 0;
> -   *array_offset = 0;
> +   *z_offset_el = total_z_offset_el % tile_info.logical_extent_el.d;
> +   *array_offset = total_array_offset % tile_info.logical_extent_el.a;
>  
> /* Compute the offset of the tile in units of whole tiles */
> uint32_t x_offset_tl = total_x_offset_el / tile_info.logical_extent_el.w;
> uint32_t y_offset_tl = total_y_offset_el / tile_info.logical_extent_el.h;
> +   uint32_t z_offset_tl = total_z_offset_el / tile_info.logical_extent_el.d;
> +   uint32_t a_offset_tl = total_array_offset / tile_info.logical_extent_el.a;
> +
> +   /* Compute an array pitch in number of tiles */
> +   uint32_t array_pitch_tl_rows =
> +  array_pitch_el_rows / MAX2(tile_info.logical_extent_el.d,
> + tile_info.logical_extent_el.a);

Shouldn't we be dividing the array pitch by the tile height?

-Nanley
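
For context, the hunk above splits each total offset into an intratile remainder (modulo the tile's logical extent) and a whole-tile index (divide by the same extent), once per dimension. A minimal sketch of that split follows; struct example_tile_split and example_split_offset() are hypothetical names used only for illustration and are not part of isl.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type, not part of isl. */
struct example_tile_split {
   uint32_t tile_index;  /* offset in units of whole tiles */
   uint32_t intratile;   /* remaining offset inside that tile, in elements */
};

/* Split a total offset along one axis by the tile's logical extent. */
static inline struct example_tile_split
example_split_offset(uint32_t total_offset_el, uint32_t tile_extent_el)
{
   assert(tile_extent_el > 0);

   return (struct example_tile_split) {
      .tile_index = total_offset_el / tile_extent_el,
      .intratile  = total_offset_el % tile_extent_el,
   };
}
```

When the tile extent along an axis is 1, as it is for the d and a dimensions of every tiling touched by this patch, the intratile part is always 0 and the tile index equals the total offset.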

> +
> +   /* Add the Z and array offset to the Y offset to get a 2D offset */
> +   y_offset_tl += 

Re: [Mesa-dev] [PATCH 3/8] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.

2019-01-28 Thread Nanley Chery
On Sat, Jan 26, 2019 at 05:22:06PM +0200, Eleni Maria Stea wrote:
> Hi Nanley,
> 
> On Fri, 18 Jan 2019 15:32:02 -0800
> Nanley Chery  wrote:
> 
> 
> > > diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> > > b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index
> > > e214fae140..4d1eafac91 100644 ---
> > > a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++
> > > b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -329,6
> [...]
> 
> > > @@ -474,6 +485,11 @@ static void brw_update_texture_surface(struct
> > > gl_context *ctx, struct intel_texture_object *intel_obj =
> > > intel_texture_object(obj); struct intel_mipmap_tree *mt =
> > > intel_obj->mt; 
> > > +  if (mt->needs_fake_etc) {
> > > + assert(mt->shadow_mt);
> > > + mt = mt->shadow_mt;
> > > +  }
> > > +
> > >if (plane > 0) {
> > >   if (mt->plane[plane - 1] == NULL)
> > >  return;
> > > @@ -512,7 +528,7 @@ static void brw_update_texture_surface(struct
> > > gl_context *ctx,
> > >* is safe because texture views aren't allowed on
> > > depth/stencil. */
> > >   mesa_fmt = mt->format;
> > > -  } else if (mt->etc_format != MESA_FORMAT_NONE) {
> > > +  } else if (intel_obj->mt->etc_format != MESA_FORMAT_NONE) {
> > >   mesa_fmt = mt->format;  
> > 
> > > For uniformity, let's access mt->shadow_mt->format here and move the
> > > mt->needs_fake_etc check from above to below this condition:
> > 
> > } else if (devinfo->gen <= 7 && mt->format ==
> > MESA_FORMAT_S_UINT8) {
> 
> I'd like to ask you one more question on this change: if I do the check
> for the fake etc later, the following code will run for the main
> miptree that contains the compressed data and has ETC2 format:
> 
> > >if (plane > 0) {
> > >   if (mt->plane[plane - 1] == NULL)
> > >  return;
> > > @@ -512,7 +528,7 @@ static void brw_update_texture_surface(struct
> > > gl_context *ctx,
> > >* is safe because texture views aren't allowed on
> > > depth/stencil. */
> > >   mesa_fmt = mt->format;
> 
> Wouldn't this be a problem?
> 

These miptrees won't have more than one plane so this isn't a problem.

-Nanley

> Thank you in advance,
> Eleni
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 02/32] intel/blorp: Use isl_surf_get_image_offset_B_tile_el in ccs_ambiguate

2019-01-25 Thread Nanley Chery
On Fri, Oct 12, 2018 at 01:46:32PM -0500, Jason Ekstrand wrote:
> Reviewed-by: Topi Pohjolainen 
> ---
>  src/intel/blorp/blorp_clear.c | 8 ++--
>  1 file changed, 2 insertions(+), 6 deletions(-)
> 


Patches 1 and 2 are:
Reviewed-by: Nanley Chery 

> diff --git a/src/intel/blorp/blorp_clear.c b/src/intel/blorp/blorp_clear.c
> index 5b575dccc22..dd974df35d2 100644
> --- a/src/intel/blorp/blorp_clear.c
> +++ b/src/intel/blorp/blorp_clear.c
> @@ -1086,12 +1086,8 @@ blorp_ccs_ambiguate(struct blorp_batch *batch,
> }
>  
> uint32_t offset_B, x_offset_el, y_offset_el;
> -   isl_surf_get_image_offset_el(surf->aux_surf, level, layer, z,
> -&x_offset_el, &y_offset_el);
> -   isl_tiling_get_intratile_offset_el(surf->aux_surf->tiling, aux_fmtl->bpb,
> -  surf->aux_surf->row_pitch_B,
> -  x_offset_el, y_offset_el,
> -  &offset_B, &x_offset_el, &y_offset_el);
> +   isl_surf_get_image_offset_B_tile_el(surf->aux_surf, level, layer, z,
> +   &offset_B, &x_offset_el,
> +   &y_offset_el);
> params.dst.addr.offset += offset_B;
>  
> const uint32_t width_px =
> -- 
> 2.19.1
> 


Re: [Mesa-dev] [PATCH v2 07/32] intel/isl: Rename ISL_TILING_Yf/s to ISL_TILING_GEN9_Yf/s

2019-01-23 Thread Nanley Chery
return I915_TILING_Y;
>  
> case ISL_TILING_W:
> -   case ISL_TILING_Yf:
> -   case ISL_TILING_Ys:
> +   case ISL_TILING_GEN9_Yf:
> +   case ISL_TILING_GEN9_Ys:
> case ISL_TILING_HIZ:
> case ISL_TILING_CCS:
>return I915_TILING_NONE;
> diff --git a/src/intel/isl/isl_gen7.c b/src/intel/isl/isl_gen7.c
> index a9db21fba52..91cea299abc 100644
> --- a/src/intel/isl/isl_gen7.c
> +++ b/src/intel/isl/isl_gen7.c
> @@ -198,15 +198,15 @@ isl_gen6_filter_tiling(const struct isl_device *dev,
>  
> /* Clear flags unsupported on this hardware */
> if (ISL_DEV_GEN(dev) < 9) {
> -  *flags &= ~ISL_TILING_Yf_BIT;
> -  *flags &= ~ISL_TILING_Ys_BIT;
> +  *flags &= ~ISL_TILING_GEN9_Yf_BIT;
> +  *flags &= ~ISL_TILING_GEN9_Ys_BIT;
> }
>  
> /* And... clear the Yf and Ys bits anyway because Anvil doesn't support
>  * them yet.
>  */
> -   *flags &= ~ISL_TILING_Yf_BIT; /* FINISHME[SKL]: Support Yf */
> -   *flags &= ~ISL_TILING_Ys_BIT; /* FINISHME[SKL]: Support Ys */
> +   *flags &= ~ISL_TILING_GEN9_Yf_BIT; /* FINISHME[SKL]: Support Yf */
> +   *flags &= ~ISL_TILING_GEN9_Ys_BIT; /* FINISHME[SKL]: Support Ys */
>  
>     if (isl_surf_usage_is_depth(info->usage)) {
>/* Depth requires Y. */
> diff --git a/src/intel/isl/isl_gen9.c b/src/intel/isl/isl_gen9.c
> index e5d0f95402a..8e460430a1c 100644
> --- a/src/intel/isl/isl_gen9.c
> +++ b/src/intel/isl/isl_gen9.c
> @@ -41,7 +41,7 @@ gen9_calc_std_image_alignment_sa(const struct isl_device 
> *dev,
> assert(isl_tiling_is_std_y(tiling));
>  
> const uint32_t bpb = fmtl->bpb;
> -   const uint32_t is_Ys = tiling == ISL_TILING_Ys;
> +   const uint32_t is_Ys = tiling == ISL_TILING_GEN9_Ys;

Was this forgotten on the next patch?

This patch is
Reviewed-by: Nanley Chery 


>  
> switch (info->dim) {
> case ISL_SURF_DIM_1D:
> diff --git a/src/intel/isl/isl_surface_state.c 
> b/src/intel/isl/isl_surface_state.c
> index 7ab260d701b..6ac0969f00c 100644
> --- a/src/intel/isl/isl_surface_state.c
> +++ b/src/intel/isl/isl_surface_state.c
> @@ -70,8 +70,8 @@ static const uint8_t isl_to_gen_tiling[] = {
> [ISL_TILING_LINEAR]  = LINEAR,
> [ISL_TILING_X]   = XMAJOR,
> [ISL_TILING_Y0]  = YMAJOR,
> -   [ISL_TILING_Yf]  = YMAJOR,
> -   [ISL_TILING_Ys]  = YMAJOR,
> +   [ISL_TILING_GEN9_Yf] = YMAJOR,
> +   [ISL_TILING_GEN9_Ys] = YMAJOR,
> [ISL_TILING_W]   = WMAJOR,
>  };
>  #endif
> -- 
> 2.19.1
> 


Re: [Mesa-dev] [PATCH 3/8] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.

2019-01-22 Thread Nanley Chery
On Tue, Jan 22, 2019 at 01:15:25PM +0200, Eleni Maria Stea wrote:
> On 1/22/19 12:46 PM, Eleni Maria Stea wrote:
> >>> +   /**
> >>> +* \brief Indicates that we fake the ETC2 compression support
> >>> +*
> >>> +* GPUs Gen < 8 don't support sampling and rendering of ETC2
> >>> formats so
> >>> +* we need to fake it. This variable is set to true when we
> >>> fake it.
> >>> +*/
> >>> +   bool needs_fake_etc;
> >>> +  
> >>
> >> Let's make a function to detect needs_fake_etc instead of adding to
> >> the data structure. That'd be easier to follow.
> >>
> >> -Nanley
> > 
> > 
> > Hi Nanley,
> > 
> > I'd like a small clarification here if you don't mind: I wasn't very
> > sure about this last change you suggest.
> > 
> > The reasons I preferred to extend the data structure instead of adding
> > a function were:
> > 
> > 1- that I need to check if we fake ETC in several different places in
> > which I don't always have access to the information that helped me
> > decide if we need to fake the ETC or not, so I found it much easier to
> > keep this information in the miptree that can be accessed from
> > everywhere. (That was the main reason).
> 
> Actually, now I better thought of it, I only need the GPU version and if
> the format is compressed, so I can probably get this information in all
> places but we would still need to make many unnecessary calls...
> Couldn't we avoid them by just checking this once at the beginning?
> 

The performance difference should be negligible if the function is
declared static inline in the intel_mipmap_tree.h header. The compiler
should inline the body of the function (which should be small) and avoid
the overhead of a function call.
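
A minimal sketch of what such a header helper could look like is shown here. It is an assumption-laden illustration, not the code from the series: struct example_miptree, its format_is_etc field, and example_needs_fake_etc() are hypothetical stand-ins for the real intel_mipmap_tree types, and the condition uses only the two inputs mentioned in the quoted message (the GPU generation and whether the format is an ETC format).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the miptree; only the field needed here. */
struct example_miptree {
   bool format_is_etc;  /* true when the main miptree holds ETC data */
};

/*
 * Derive the "fake ETC" condition on demand instead of caching it in the
 * struct. Small enough that a compiler can be expected to inline it.
 */
static inline bool
example_needs_fake_etc(int gen, const struct example_miptree *mt)
{
   return gen < 8 && mt->format_is_etc;
}
```

Because the result is recomputed from existing state at each call site, there is no extra field whose initialization or later modification has to be tracked.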

> Thanks again,
> Eleni
> 
> > The other reasons were that:
> > 2- I thought that it would be faster to check the miptree than call a
> > function.
> > 3- I was hoping that from the name of the variable it won't be
> > difficult to follow (but I could rename it to something better if you
> > prefer it).
> > 
> > Could you explain me why you'd like me to replace it? Is there an
> > advantage I hadn't thought of?
> > 

Firstly, it's not information that's generally useful for most
intel_mipmap_tree objects. Having too much of such state makes debugging
and reading the struct definition more difficult.

Secondly, it adds to the number of state-dependent variables I have to
keep in mind when looking at the code. I have to start asking, when is
needs_fake_etc initialized? Is needs_fake_etc ever modified later? I'm
already familiar with the other variables needs_fake_etc can be computed
from: the gen, the miptree format, and the shadow_mt. I hope that helps.

-Nanley

> > Thank you in advance,
> > Eleni
> 
> 


Re: [Mesa-dev] [PATCH 3/8] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.

2019-01-22 Thread Nanley Chery
On Tue, Jan 22, 2019 at 02:17:16PM +0200, Eleni Maria Stea wrote:
> On 1/19/19 1:32 AM, Nanley Chery wrote:
> >> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
> >> b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> >> index e214fae140..4d1eafac91 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> >> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> >> @@ -329,6 +329,17 @@ brw_get_texture_swizzle(const struct gl_context *ctx,
> >>  {
> >> const struct gl_texture_image *img = t->Image[0][t->BaseLevel];
> >>  
> >> +   struct brw_context *brw = brw_context((struct gl_context *)ctx);
> >> +   const struct gen_device_info *devinfo = &brw->screen->devinfo;
> >> +   bool is_fake_etc = _mesa_is_format_etc2(img->TexFormat) &&
> >> +  devinfo->gen < 8;
> >> +
> >> +   mesa_format format;
> >> +   if (is_fake_etc)
> >> +  format = intel_lower_compressed_format(brw, img->TexFormat);
> >> +   else
> >> +  format = img->TexFormat;
> >> +
> > 
> > Why is modifying this function necessary?
> 
> Hi,
> 
> I'll try to explain this modification:
> 
> After the changes we made:
> - the image TexFormat remains ETC2 to match the main miptree's format
> - the main miptree stores the compressed data (ETC2) so that the
> GetCompressed* functions work
> - the shadow miptree stores the RGBA data and we map it for the drawing
> 
> This texture swizzle function is called before the drawing and it can't
> access the miptrees. Instead it reads the format of the texture we are
> supposed to have in the memory from the gl_texture_image struct directly
> so in this case it reads the ETC2 format.
> 
> At this time, the texture that we have in the memory and is about to be
> used in the drawing is RGBA (from the shadow miptree).
> 
> As a result, we end up calculating the swizzle of the ETC2 format used
> in the original image (+the main miptree) for the RGBA texture that we
> have in the memory. As a result the texture is not rendered properly.
> 

Oh okay, I was thinking that the swizzles of the ETC2 formats wouldn't
conflict with their decompressed RGBA texture, but I see that the SRGB
ones currently need to have the 1st and 3rd swizzles swapped. 

To avoid having to modify this function, could you try the following?
* setting the bgra argument in the decoding function call to false
* updating the mapping in intel_lower_compressed_format() accordingly

-Nanley

> The solution was to use the corresponding RGBA format when we fake the
> ETC2, but as I couldn't read it from the shadow miptree inside this
> function, I took it by calling intel_lower_compressed_format for the
> original ETC2 format of the gl_texture_image.
> 
> I hope that this change is more clear now, I will add a comment
> explaining this just in case,
> 
> Thank you!
> Eleni
> 
> 


Re: [Mesa-dev] [PATCH 6/8] i965: Added support for ETC2 mipmaps

2019-01-18 Thread Nanley Chery
On Mon, Nov 19, 2018 at 10:54:10AM +0200, Eleni Maria Stea wrote:
> Extended the intel_update_decompress_shadow to update all the mipmap
> tree levels so that we can display and run Get functions on mipmaps.
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 48 +++
>  1 file changed, 29 insertions(+), 19 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index ef3e2c33d3..4886bb2b96 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -3962,15 +3962,16 @@ intel_update_decompressed_shadow(struct brw_context 
> *brw,
> int img_h = smt->surf.logical_level0_px.height;
> int img_d = smt->surf.logical_level0_px.depth;
>  
> -   ptrdiff_t shadow_stride = _mesa_format_row_stride(smt->format, img_w);
> +   int level_w = img_w;
> +   int level_h = img_h;
>  
> for (int level = smt->first_level; level <= smt->last_level; level++) {
> -  struct compressed_pixelstore store;
> -  _mesa_compute_compressed_pixelstore(mt->surf.dim,
> -  mt->format,
> -  img_w, img_h, img_d,
> -  &brw->ctx.Unpack,
> -  &store);
> +  ptrdiff_t shadow_stride = _mesa_format_row_stride(smt->format,
> +level_w);
> +
> +  ptrdiff_t main_stride = _mesa_format_row_stride(mt->format,
> +  level_w);
> +
>for (unsigned int slice = 0; slice < img_d; slice++) {
>   GLbitfield mmode = GL_MAP_READ_BIT | BRW_MAP_DIRECT_BIT |
>  BRW_MAP_ETC_BIT;
> @@ -3978,30 +3979,39 @@ intel_update_decompressed_shadow(struct brw_context 
> *brw,
>  GL_MAP_INVALIDATE_RANGE_BIT |
>  BRW_MAP_DIRECT_BIT;
>  
> - uint32_t img_x, img_y;
> - intel_miptree_get_image_offset(smt, level, slice, &img_x, &img_y);
> + uint32_t slevel_x, slevel_y;
> + intel_miptree_get_image_offset(smt, level, slice, &slevel_x,
> +&slevel_y);
> +
> + uint32_t mlevel_x, mlevel_y;
> + intel_miptree_get_image_offset(mt, level, slice, &mlevel_x,
> +&mlevel_y);
> +
> + void *mptr;
> + intel_miptree_map(brw, mt, level, slice, 0, 0,
> +   level_w, level_h, mmode, &mptr, &main_stride);
>  
> - void *mptr = intel_miptree_map_raw(brw, mt, mmode) + mt->offset
> -+ img_y * store.TotalBytesPerRow
> -+ img_x * store.TotalBytesPerRow / img_w;
>  
>   void *sptr;
> - intel_miptree_map(brw, smt, level, slice, img_x, img_y, img_w, img_h,
> -   smode, &sptr, &shadow_stride);
> + intel_miptree_map(brw, smt, level, slice, 0, 0, level_w,
> +   level_h, smode, &sptr, &shadow_stride);
>  
>   if (mt->format == MESA_FORMAT_ETC1_RGB8) {
>  _mesa_etc1_unpack_rgba(sptr, shadow_stride,
> -   mptr, store.TotalBytesPerRow,
> -   img_w, img_h);
> +   mptr, main_stride,
> +   level_w, level_h);
>   } else {
>  _mesa_unpack_etc2_format(sptr, shadow_stride,
> - mptr, store.TotalBytesPerRow,
> - img_w, img_h, mt->format, true);
> + mptr, main_stride,
> + level_w, level_h, mt->format, true);
>   }
>  
> - intel_miptree_unmap_raw(mt);
> + intel_miptree_unmap(brw, mt, level, slice);
>   intel_miptree_unmap(brw, smt, level, slice);
>}
> +
> +  level_w /= 2;
> +  level_h /= 2;

You want to use minify() to keep level_w and level_h from becoming 0.
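
To illustrate the point, here is a minimal sketch of per-level dimensions
clamped at 1. The helper names minify_sketch() and level_dim() are
hypothetical stand-ins written for this example; they only mirror the usual
"shift right, but never below 1" behaviour and are not copied from the tree.

```c
#include <assert.h>

/* Hypothetical stand-in for a minify()-style helper: shift the value
 * right by `levels`, but never let the result drop below 1.
 */
static unsigned
minify_sketch(unsigned value, unsigned levels)
{
   unsigned v = value >> levels;
   return v > 0 ? v : 1;
}

/* Dimension of miplevel `level` given the level-0 dimension `base`. */
static unsigned
level_dim(unsigned base, unsigned level)
{
   assert(base > 0);
   return minify_sketch(base, level);
}
```

With a 16x2 texture, plain "/= 2" gives a height of 0 at level 2, while
level_dim(2, 2) stays at 1, so the unpack functions never see a zero-sized
image.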

-Nanley

> }
>  
> mt->shadow_needs_update = false;
> -- 
> 2.19.0
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/8] i965: Update the shadow miptree from the main to fake the ETC2 compression

2019-01-18 Thread Nanley Chery
On Mon, Nov 19, 2018 at 10:54:08AM +0200, Eleni Maria Stea wrote:
> On GPUs gen < 8 that don't support ETC2 sampling/rendering we now fake
> the support using 2 mipmap trees: one (the main) that stores the
> compressed data for the Get* functions to work and one (the shadow) that
> stores the same data decompressed for the render/sampling to work.
> 
> Added the intel_update_decompressed_shadow function to update the shadow
> tree with the decompressed data whenever the main miptree with the
> compressed is changing.
> ---
>  .../drivers/dri/i965/brw_wm_surface_state.c   |  1 +
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 70 ++-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  3 +
>  3 files changed, 71 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
> b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> index 4d1eafac91..2e6d85e1fe 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> @@ -579,6 +579,7 @@ static void brw_update_texture_surface(struct gl_context 
> *ctx,
>  
>if (obj->StencilSampling && firstImage->_BaseFormat == 
> GL_DEPTH_STENCIL) {
>   if (devinfo->gen <= 7) {
> +assert(!intel_obj->mt->needs_fake_etc);
>  assert(mt->shadow_mt && !mt->stencil_mt->shadow_needs_update);
>  mt = mt->shadow_mt;
>   } else {
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index b24332ff67..ef3e2c33d3 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -3740,12 +3740,15 @@ intel_miptree_map(struct brw_context *brw,
> assert(mt->surf.samples == 1);
>  
> if (mt->needs_fake_etc) {
> -  if (!(mode & BRW_MAP_ETC_BIT)) {
> +  if (!(mode & BRW_MAP_ETC_BIT) && !(mode & GL_MAP_READ_BIT)) {
>   assert(mt->shadow_mt);
>  
> - mt->is_shadow_mapped = true;
> + if (mt->shadow_needs_update) {
> +intel_update_decompressed_shadow(brw, mt);
> +mt->shadow_needs_update = false;
> + }
>  
> - mt->shadow_needs_update = false;
> + mt->is_shadow_mapped = true;
>   mt = miptree->shadow_mt;
>} else {
>   mt->is_shadow_mapped = false;
> @@ -3762,6 +3765,8 @@ intel_miptree_map(struct brw_context *brw,
>  
> map = intel_miptree_attach_map(mt, level, slice, x, y, w, h, mode);
> if (!map){
> +  miptree->is_shadow_mapped = false;
> +
>*out_ptr = NULL;
>*out_stride = 0;
>return;
> @@ -3942,3 +3947,62 @@ intel_miptree_get_clear_color(const struct 
> gen_device_info *devinfo,
>return mt->fast_clear_color;
> }
>  }
> +
> +void
> +intel_update_decompressed_shadow(struct brw_context *brw,
> + struct intel_mipmap_tree *mt)
> +{
> +   struct intel_mipmap_tree *smt = mt->shadow_mt;
> +
> +   assert(smt);
> +   assert(mt->needs_fake_etc);
> +   assert(mt->surf.size_B > 0);
> +
> +   int img_w = smt->surf.logical_level0_px.width;
> +   int img_h = smt->surf.logical_level0_px.height;
> +   int img_d = smt->surf.logical_level0_px.depth;

I don't think 3D ETC textures are possible. From the GL4.6 spec:

An INVALID_OPERATION error is generated by CompressedTexImage3D
if internalformat is one of the EAC, ETC2, or RGTC formats and
either border is non-zero, or target is not TEXTURE_2D_ARRAY.
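
Read as code, the quoted rule might look like the sketch below. The
function name and the SKETCH_* constants are assumptions made for this
example (the values are the usual GL enum values for TEXTURE_3D and
TEXTURE_2D_ARRAY); it is not the validation code in Mesa.

```c
#include <assert.h>
#include <stdbool.h>

/* Local copies of the GL enum values so the sketch needs no GL headers. */
#define SKETCH_GL_TEXTURE_3D       0x806F
#define SKETCH_GL_TEXTURE_2D_ARRAY 0x8C1A

/* Hypothetical check mirroring the spec text for EAC/ETC2/RGTC formats:
 * CompressedTexImage3D is only valid with a zero border and a
 * TEXTURE_2D_ARRAY target.
 */
static bool
etc2_teximage3d_is_valid(unsigned target, int border)
{
   return border == 0 && target == SKETCH_GL_TEXTURE_2D_ARRAY;
}
```

So for these formats the third dimension is an array length, not a depth.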

> +
> +   ptrdiff_t shadow_stride = _mesa_format_row_stride(smt->format, img_w);
> +

This variable gets overwritten when calling intel_miptree_map().

> +   for (int level = smt->first_level; level <= smt->last_level; level++) {

Since we're already iterating levels here we should fold in patch 6 to
get the right level dimensions.

> +  struct compressed_pixelstore store;
> +  _mesa_compute_compressed_pixelstore(mt->surf.dim,
> +  mt->format,
> +  img_w, img_h, img_d,
> +  &brw->ctx.Unpack,
> +  &store);

store.TotalBytesPerRow will give you the pitch for a buffer allocated
without padding. mt->surf.row_pitch_B gives you the actual pitch.
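
A small sketch of why the two differ, assuming an 8-byte 4x4 block (as in
ETC1 and ETC2 RGB8) and a made-up 128-byte pitch alignment. The helper
names and the alignment value are hypothetical and only serve the example;
the real pitch is whatever the surface layout code chose.

```c
#include <assert.h>
#include <stddef.h>

#define ETC_BLOCK_DIM   4   /* blocks cover 4x4 pixels */
#define ETC_BLOCK_BYTES 8   /* ETC1 / ETC2 RGB8: 64 bits per block */

/* Bytes per row of blocks when the rows are packed back to back. */
static size_t
tight_row_pitch(unsigned width_px)
{
   size_t blocks = (width_px + ETC_BLOCK_DIM - 1) / ETC_BLOCK_DIM;
   return blocks * ETC_BLOCK_BYTES;
}

/* Bytes per row of blocks when each row is padded to `align` bytes.
 * `align` must be a power of two.
 */
static size_t
padded_row_pitch(unsigned width_px, size_t align)
{
   assert(align != 0 && (align & (align - 1)) == 0);
   size_t tight = tight_row_pitch(width_px);
   return (tight + align - 1) & ~(align - 1);
}

/* Byte offset of block row `block_row` for a given pitch. */
static size_t
block_row_offset(size_t pitch, unsigned block_row)
{
   return pitch * block_row;
}
```

For a 20-pixel-wide level the tight pitch is 40 bytes but the padded pitch
is 128, so offsets computed from the tight pitch land in the wrong row as
soon as block_row is nonzero.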

> +  for (unsigned int slice = 0; slice < img_d; slice++) {
> + GLbitfield mmode = GL_MAP_READ_BIT | BRW_MAP_DIRECT_BIT |
> +BRW_MAP_ETC_BIT;
> + GLbitfield smode = GL_MAP_WRITE_BIT |
> +GL_MAP_INVALIDATE_RANGE_BIT |
> +BRW_MAP_DIRECT_BIT;
> +
> + uint32_t img_x, img_y;
> + intel_miptree_get_image_offset(smt, level, slice, &img_x, &img_y);
> +
> + void *mptr = intel_miptree_map_raw(brw, mt, mmode) + mt->offset
> ++ img_y * store.TotalBytesPerRow
> ++ img_x * 

Re: [Mesa-dev] [PATCH 7/8] i965: Added support for ETC2 texture arrays on Gen7

2019-01-18 Thread Nanley Chery
On Mon, Nov 19, 2018 at 10:54:11AM +0200, Eleni Maria Stea wrote:
> Modified the calculation of the number of slices in the
> intel_update_decompressed_shadow function to take the array length into
> account to support arrays.
> ---

At this point, we can delete map_etc and unmap_etc, right?

-Nanley

>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 4886bb2b96..0840b3b243 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -3965,6 +3965,8 @@ intel_update_decompressed_shadow(struct brw_context 
> *brw,
> int level_w = img_w;
> int level_h = img_h;
>  
> +   int num_slices = img_d * smt->surf.logical_level0_px.array_len;
> +
> for (int level = smt->first_level; level <= smt->last_level; level++) {
>ptrdiff_t shadow_stride = _mesa_format_row_stride(smt->format,
>  level_w);
> @@ -3972,7 +3974,7 @@ intel_update_decompressed_shadow(struct brw_context 
> *brw,
>ptrdiff_t main_stride = _mesa_format_row_stride(mt->format,
>level_w);
>  
> -  for (unsigned int slice = 0; slice < img_d; slice++) {
> +  for (unsigned int slice = 0; slice < num_slices; slice++) {
>   GLbitfield mmode = GL_MAP_READ_BIT | BRW_MAP_DIRECT_BIT |
>  BRW_MAP_ETC_BIT;
>   GLbitfield smode = GL_MAP_WRITE_BIT |
> -- 
> 2.19.0
> 


Re: [Mesa-dev] [PATCH 3/8] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.

2019-01-18 Thread Nanley Chery
On Mon, Nov 19, 2018 at 10:54:07AM +0200, Eleni Maria Stea wrote:
> GPUs Gen < 8 cannot render ETC2 formats. So far, they converted the
> compressed EAC/ETC2 images to non-compressed RGB format images that they
> can render. When GetCompressed* functions were called, the pixels were
> returned in the RGB format and not the compressed format as expected.
> 
> Trying to fix this problem, we use the shadow miptree to store the
> decompressed data for the rendering and the main miptree to store the
> compressed. We use the BRW_MAP_ETC_BIT as a flag to indicate when we
> use the fake compression in order to map the main tree with the
> compressed data. The functions that upload the compressed data as well
> as the mapping/unmapping functions are now updated to use this flag.

Did you mean sample instead of render?

> ---
>  .../drivers/dri/i965/brw_wm_surface_state.c   | 26 +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 73 +--
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 17 
>  src/mesa/drivers/dri/i965/intel_tex_image.c   | 45 -
>  src/mesa/main/texstore.c  | 92 +++
>  src/mesa/main/texstore.h  |  9 ++
>  6 files changed, 204 insertions(+), 58 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
> b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> index e214fae140..4d1eafac91 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> @@ -329,6 +329,17 @@ brw_get_texture_swizzle(const struct gl_context *ctx,
>  {
> const struct gl_texture_image *img = t->Image[0][t->BaseLevel];
>  
> +   struct brw_context *brw = brw_context((struct gl_context *)ctx);
> +   const struct gen_device_info *devinfo = &brw->screen->devinfo;
> +   bool is_fake_etc = _mesa_is_format_etc2(img->TexFormat) &&
> +  devinfo->gen < 8;
> +
> +   mesa_format format;
> +   if (is_fake_etc)
> +  format = intel_lower_compressed_format(brw, img->TexFormat);
> +   else
> +  format = img->TexFormat;
> +

Why is modifying this function necessary?

> int swizzles[SWIZZLE_NIL + 1] = {
>SWIZZLE_X,
>SWIZZLE_Y,
> @@ -381,7 +392,7 @@ brw_get_texture_swizzle(const struct gl_context *ctx,
>}
> }
>  
> -   GLenum datatype = _mesa_get_format_datatype(img->TexFormat);
> +   GLenum datatype = _mesa_get_format_datatype(format);
>  
> /* If the texture's format is alpha-only, force R, G, and B to
>  * 0.0. Similarly, if the texture's format has no alpha channel,
> @@ -422,9 +433,9 @@ brw_get_texture_swizzle(const struct gl_context *ctx,
> case GL_RED:
> case GL_RG:
> case GL_RGB:
> -  if (_mesa_get_format_bits(img->TexFormat, GL_ALPHA_BITS) > 0 ||
> -  img->TexFormat == MESA_FORMAT_RGB_DXT1 ||
> -  img->TexFormat == MESA_FORMAT_SRGB_DXT1)
> +  if (_mesa_get_format_bits(format, GL_ALPHA_BITS) > 0 ||
> +  format == MESA_FORMAT_RGB_DXT1 ||
> +  format == MESA_FORMAT_SRGB_DXT1)
>   swizzles[3] = SWIZZLE_ONE;
>break;
> }
> @@ -474,6 +485,11 @@ static void brw_update_texture_surface(struct gl_context 
> *ctx,
>struct intel_texture_object *intel_obj = intel_texture_object(obj);
>struct intel_mipmap_tree *mt = intel_obj->mt;
>  
> +  if (mt->needs_fake_etc) {
> + assert(mt->shadow_mt);
> + mt = mt->shadow_mt;
> +  }
> +
>if (plane > 0) {
>   if (mt->plane[plane - 1] == NULL)
>  return;
> @@ -512,7 +528,7 @@ static void brw_update_texture_surface(struct gl_context 
> *ctx,
>* is safe because texture views aren't allowed on depth/stencil.
>*/
>   mesa_fmt = mt->format;
> -  } else if (mt->etc_format != MESA_FORMAT_NONE) {
> +  } else if (intel_obj->mt->etc_format != MESA_FORMAT_NONE) {
>   mesa_fmt = mt->format;

For uniformity, let's access mt->shadow_mt->format here and move the
mt->needs_fake_etc check from above to below this condition:

} else if (devinfo->gen <= 7 && mt->format == MESA_FORMAT_S_UINT8) {


>} else if (plane > 0) {
>   mesa_fmt = mt->format;
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 0e67e4d8f3..b24332ff67 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -689,6 +689,8 @@ miptree_create(struct brw_context *brw,
> if (devinfo->gen < 6 && _mesa_is_format_color_format(format))
>tiling_flags &= ~ISL_TILING_Y0_BIT;
>  
> +   bool fakes_etc_compression = devinfo->gen < 8 && 
> _mesa_is_format_etc2(format);
> +
> mesa_format mt_fmt;
> if (_mesa_is_format_color_format(format)) {
>mt_fmt = intel_lower_compressed_format(brw, format);

Why not reserve calling intel_lower_compressed_format() for the case in
which we're creating a 

Re: [Mesa-dev] [PATCH 2/8] i965: r8stencil_mt/needs_update renamed to shadow_mt/needs_update

2019-01-18 Thread Nanley Chery
On Mon, Nov 19, 2018 at 10:54:06AM +0200, Eleni Maria Stea wrote:
> Renamed the r8stencil_mt and r8stencil_needs_update to shadow_mt and
> shadow_needs_update respectively to allow reusing the shadow_mt as a
> generic purpose secondary mipmap tree.

The series I pointed you to earlier has a patch like this, but it's more
complete. It also modifies the comment above the data structure being
modified. Do you want to review it?

https://patchwork.freedesktop.org/patch/253197/

I think what people usually do in this case is send out their series
with the other person's patch included (and their rb tacked onto it).

-Nanley

> ---
>  src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  8 
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 16 
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h|  4 ++--
>  3 files changed, 14 insertions(+), 14 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
> b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> index 8d21cf5fa7..e214fae140 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> @@ -563,15 +563,15 @@ static void brw_update_texture_surface(struct 
> gl_context *ctx,
>  
>if (obj->StencilSampling && firstImage->_BaseFormat == 
> GL_DEPTH_STENCIL) {
>   if (devinfo->gen <= 7) {
> -assert(mt->r8stencil_mt && 
> !mt->stencil_mt->r8stencil_needs_update);
> -mt = mt->r8stencil_mt;
> +assert(mt->shadow_mt && !mt->stencil_mt->shadow_needs_update);
> +mt = mt->shadow_mt;
>   } else {
>  mt = mt->stencil_mt;
>   }
>   format = ISL_FORMAT_R8_UINT;
>} else if (devinfo->gen <= 7 && mt->format == MESA_FORMAT_S_UINT8) {
> - assert(mt->r8stencil_mt && !mt->r8stencil_needs_update);
> - mt = mt->r8stencil_mt;
> + assert(mt->shadow_mt && !mt->shadow_needs_update);
> + mt = mt->shadow_mt;
>   format = ISL_FORMAT_R8_UINT;
>}
>  
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 5e11ec0c30..0e67e4d8f3 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -1216,7 +1216,7 @@ intel_miptree_release(struct intel_mipmap_tree **mt)
>  
>brw_bo_unreference((*mt)->bo);
>intel_miptree_release(&(*mt)->stencil_mt);
> -  intel_miptree_release(&(*mt)->r8stencil_mt);
> +  intel_miptree_release(&(*mt)->shadow_mt);
>intel_miptree_aux_buffer_free((*mt)->aux_buf);
>free_aux_state_map((*mt)->aux_state);
>  
> @@ -2429,7 +2429,7 @@ intel_miptree_finish_write(struct brw_context *brw,
> switch (mt->aux_usage) {
> case ISL_AUX_USAGE_NONE:
>if (mt->format == MESA_FORMAT_S_UINT8 && devinfo->gen <= 7)
> - mt->r8stencil_needs_update = true;
> + mt->shadow_needs_update = true;
>break;
>  
> case ISL_AUX_USAGE_MCS:
> @@ -2935,9 +2935,9 @@ intel_update_r8stencil(struct brw_context *brw,
>  
> assert(src->surf.size_B > 0);
>  
> -   if (!mt->r8stencil_mt) {
> +   if (!mt->shadow_mt) {
>assert(devinfo->gen > 6); /* Handle MIPTREE_LAYOUT_GEN6_HIZ_STENCIL */
> -  mt->r8stencil_mt = make_surface(
> +  mt->shadow_mt = make_surface(
>  brw,
>  src->target,
>  MESA_FORMAT_R_UINT8,
> @@ -2951,13 +2951,13 @@ intel_update_r8stencil(struct brw_context *brw,
>  ISL_TILING_Y0_BIT,
>  ISL_SURF_USAGE_TEXTURE_BIT,
>  BO_ALLOC_BUSY, 0, NULL);
> -  assert(mt->r8stencil_mt);
> +  assert(mt->shadow_mt);
> }
>  
> -   if (src->r8stencil_needs_update == false)
> +   if (src->shadow_needs_update == false)
>return;
>  
> -   struct intel_mipmap_tree *dst = mt->r8stencil_mt;
> +   struct intel_mipmap_tree *dst = mt->shadow_mt;
>  
> for (int level = src->first_level; level <= src->last_level; level++) {
>const unsigned depth = src->surf.dim == ISL_SURF_DIM_3D ?
> @@ -2977,7 +2977,7 @@ intel_update_r8stencil(struct brw_context *brw,
> }
>  
> brw_cache_flush_for_read(brw, dst->bo);
> -   src->r8stencil_needs_update = false;
> +   src->shadow_needs_update = false;
>  }
>  
>  static void *
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> index b0333655ad..b955a2bab1 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> @@ -302,8 +302,8 @@ struct intel_mipmap_tree
>  *
>  * \see intel_update_r8stencil()
>  */
> -   struct intel_mipmap_tree *r8stencil_mt;
> -   bool r8stencil_needs_update;
> +   struct intel_mipmap_tree *shadow_mt;
> +   bool shadow_needs_update;
>  
> /**
>  * 

Re: [Mesa-dev] [PATCH 1/8] i965: Removed assertions from intel_miptree_map_etc

2019-01-18 Thread Nanley Chery
On Mon, Nov 19, 2018 at 10:54:05AM +0200, Eleni Maria Stea wrote:
> The assertions that the GL_MAP_WRITE_BIT and GL_MAP_INVALIDATE_RANGE_BIT
> in intel_miptree_map_etc should be removed since they will fail when the
  ^
  missing "bits are set"?

> ETC miptree is mapped for reading.
> 

The assertion is still valid at this point. Reading will give you
incorrect results. You'll want to do this later on in the series though.

> Fixes: KHR-GL45.direct_state_access.textures_compressed_subimage crash
   ^
   Should probably remove the colon so that you're not using the
   Fixes tag. I think that's reserved for fixing bugs in 
   commits. See the git log for more info.

-Nanley

> on Gen 7 GPUs.
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 8e50aabb3b..5e11ec0c30 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -3444,9 +3444,6 @@ intel_miptree_map_etc(struct brw_context *brw,
>assert(mt->format == MESA_FORMAT_R8G8B8X8_UNORM);
> }
>  
> -   assert(map->mode & GL_MAP_WRITE_BIT);
> -   assert(map->mode & GL_MAP_INVALIDATE_RANGE_BIT);
> -
> intel_miptree_access_raw(brw, mt, level, slice, true);
>  
> map->stride = _mesa_format_row_stride(mt->etc_format, map->w);
> -- 
> 2.19.0
> 


Re: [Mesa-dev] [PATCH v2 23/32] intel/isl: Allow Yf and Ys tiling

2019-01-16 Thread Nanley Chery
On Fri, Oct 12, 2018 at 01:46:53PM -0500, Jason Ekstrand wrote:
> They are both implemented in ISL now.  Instead of disabling them in ISL,
> we disable them in the two dirvers.

"drivers" is misspelled.

> 
> Reviewed-by: Topi Pohjolainen 
> ---
>  src/intel/isl/isl_gen7.c  | 8 
>  src/intel/vulkan/anv_image.c  | 3 +++
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 5 +
>  3 files changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/src/intel/isl/isl_gen7.c b/src/intel/isl/isl_gen7.c
> index fe420e4fbd8..51958f7e2d5 100644
> --- a/src/intel/isl/isl_gen7.c
> +++ b/src/intel/isl/isl_gen7.c
> @@ -206,14 +206,6 @@ isl_gen6_filter_tiling(const struct isl_device *dev,
>*flags &= ~ISL_TILING_GEN10_Ys_BIT;
> }
>  
> -   /* And... clear the Yf and Ys bits anyway because Anvil doesn't support
> -* them yet.
> -*/
> -   *flags &= ~ISL_TILING_GEN9_Yf_BIT; /* FINISHME[SKL]: Support Yf */
> -   *flags &= ~ISL_TILING_GEN9_Ys_BIT; /* FINISHME[SKL]: Support Ys */
> -   *flags &= ~ISL_TILING_GEN10_Yf_BIT; /* FINISHME[SKL]: Support Yf */
> -   *flags &= ~ISL_TILING_GEN10_Ys_BIT; /* FINISHME[SKL]: Support Ys */
> -
> if (isl_surf_usage_is_depth(info->usage)) {
>/* Depth requires Y. */
>*flags &= ISL_TILING_ANY_Y_MASK;
> diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
> index 388f9410564..82ce43ef2b9 100644
> --- a/src/intel/vulkan/anv_image.c
> +++ b/src/intel/vulkan/anv_image.c
> @@ -120,6 +120,9 @@ choose_isl_tiling_flags(const struct 
> anv_image_create_info *anv_info,
> if (isl_mod_info)
>flags &= 1 << isl_mod_info->tiling;
>  
> +   /* We don't support Yf or Ys tiling yet */
> +   flags &= ISL_TILING_STD_Y_MASK;

This is missing a bitwise complement.
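
A quick sketch of the difference, using made-up bit values and assuming
the mask covers only the Yf and Ys bits, as the comment in the patch
suggests. The TILING_* names and both helpers are hypothetical and exist
only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tiling bits; the real values live in the ISL headers. */
enum {
   TILING_LINEAR = 1u << 0,
   TILING_X      = 1u << 1,
   TILING_Y0     = 1u << 2,
   TILING_YF     = 1u << 3,
   TILING_YS     = 1u << 4,
};

#define TILING_STD_Y_MASK (TILING_YF | TILING_YS)

/* flags &= mask: keeps ONLY the bits in the mask. */
static uint32_t
keep_only(uint32_t flags, uint32_t mask)
{
   return flags & mask;
}

/* flags &= ~mask: clears the bits in the mask, keeps everything else. */
static uint32_t
clear_bits(uint32_t flags, uint32_t mask)
{
   return flags & ~mask;
}
```

Starting from LINEAR | X | Y0 | YF, keep_only() leaves just YF, which is
the opposite of "we don't support Yf or Ys", while clear_bits() leaves
LINEAR | X | Y0.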

> +
> assert(flags);
>  
> return flags;
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 36d080129fa..cfeb4d67d29 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -573,6 +573,11 @@ make_surface(struct brw_context *brw, GLenum target, 
> mesa_format format,
> num_samples, width0, height0, depth0,
> first_level, last_level, mt);
>  
> +   /* We don't support Yf or Ys in i965 yet because we use the blitter too
> +* much and it can't handle them.
> +*/

Didn't we stop using the blitter on newer platforms?

> +   tiling_flags &= ~ISL_TILING_STD_Y_MASK;
> +

Could we move this to miptree_create()?  There's another instance of the
requested set of tiling flags being modified there.

-Nanley

> struct isl_surf_init_info init_info = {
>.dim = get_isl_surf_dim(target),
>.format = translate_tex_format(brw, format, false),
> -- 
> 2.19.1
> 


Re: [Mesa-dev] [PATCH] intel/blorp: Define the clear value bounds for HiZ clears

2018-10-29 Thread Nanley Chery
On Mon, Oct 29, 2018 at 12:48:50PM +0100, Juan A. Suarez Romero wrote:
> On Thu, 2018-10-25 at 16:25 -0700, nanleych...@gmail.com wrote:
> > From: Nanley Chery 
> > 
> > Follow the restriction of making sure the clear value is between the min
> > and max values defined in CC_VIEWPORT. Avoids a simulator warning for
> > some piglit tests, one of them being:
> > 
> > ./bin/depthstencil-render-miplevels 146 d=z32f_s8
> > 
> > Jason found this to make a GPU hang go away on SKL.
> > 
> > Fixes: 09948151ab1d5184b4dd9052bb1f710fa1e00a7b
> >("intel/blorp: Add the BDW+ optimized HZ_OP sequence to BLORP")
> 
> 
> As 09948151ab1 ("intel/blorp: Add the BDW+ optimized HZ_OP sequence to BLORP")
> is included in 18.2 branch, adding this to 18.2 queue.
> 
> It doesn't apply cleanly, so I've fixed the conflicts. You can check the fixed
> commit at:
> 
> 
> https://gitlab.freedesktop.org/mesa/mesa/commit/aaff8c7a0ed55d71e9dd0a6fef6905d6a2536c3f
> 

Looks good to me. The placement of this instruction in the batch is
different in the stable branch (after 3DSTATE_WM) vs the master branch
(before 3DSTATE_WM). This will cause some noise when diff'ing dumps of
the batches, but it's not a big deal.

-Nanley

>   J.A.
> 
> > ---
> >  src/intel/blorp/blorp_genX_exec.h | 14 ++
> >  1 file changed, 14 insertions(+)
> > 
> > diff --git a/src/intel/blorp/blorp_genX_exec.h 
> > b/src/intel/blorp/blorp_genX_exec.h
> > index 50341ab0ecf..7a8c45dbee5 100644
> > --- a/src/intel/blorp/blorp_genX_exec.h
> > +++ b/src/intel/blorp/blorp_genX_exec.h
> > @@ -1628,6 +1628,20 @@ blorp_emit_gen8_hiz_op(struct blorp_batch *batch,
> >  */
> > blorp_emit_3dstate_multisample(batch, params);
> >  
> > +   /* From the BDW PRM Volume 7, Depth Buffer Clear:
> > +*
> > +*The clear value must be between the min and max depth values
> > +*(inclusive) defined in the CC_VIEWPORT. If the depth buffer 
> > format is
> > +*D32_FLOAT, then +/-DENORM values are also allowed.
> > +*
> > +* Set the bounds to match our hardware limits, [0.0, 1.0].
> > +*/
> > +   if (params->depth.enabled && params->hiz_op == ISL_AUX_OP_FAST_CLEAR) {
> > +  assert(params->depth.clear_color.f32[0] >= 0.0f);
> > +  assert(params->depth.clear_color.f32[0] <= 1.0f);
> > +  blorp_emit_cc_viewport(batch);
> > +   }
> > +
> > /* If we can't alter the depth stencil config and multiple layers are
> >  * involved, the HiZ op will fail. This is because the op requires that 
> > a
> >  * new config is emitted for each additional layer.
> 


Re: [Mesa-dev] [PATCH] intel/blorp: Define the clear value bounds for HiZ clears

2018-10-29 Thread Nanley Chery
On Mon, Oct 29, 2018 at 08:37:13AM -0500, Jason Ekstrand wrote:
> That's likely because Nanley forgot to CC this one to stable:
> 
> https://cgit.freedesktop.org/mesa/mesa/commit/?id=5bcf479524b96554cab7d2429dacf650b4054638
> 

Our submittingpatches.html doc says that using the Fixes tag should be
enough. Did I miss something here?

-Nanley

> On October 29, 2018 06:49:47 "Juan A. Suarez Romero" 
> wrote:
> 
> > On Thu, 2018-10-25 at 16:25 -0700, nanleych...@gmail.com wrote:
> > > From: Nanley Chery 
> > > 
> > > Follow the restriction of making sure the clear value is between the min
> > > and max values defined in CC_VIEWPORT. Avoids a simulator warning for
> > > some piglit tests, one of them being:
> > > 
> > > ./bin/depthstencil-render-miplevels 146 d=z32f_s8
> > > 
> > > Jason found this to make a GPU hang go away on SKL.
> > > 
> > > Fixes: 09948151ab1d5184b4dd9052bb1f710fa1e00a7b
> > >("intel/blorp: Add the BDW+ optimized HZ_OP sequence to BLORP")
> > 
> > 
> > As 09948151ab1 ("intel/blorp: Add the BDW+ optimized HZ_OP sequence to 
> > BLORP")
> > is included in 18.2 branch, adding this to 18.2 queue.
> > 
> > It doesn't apply cleanly, so I've fixed the conflicts. You can check the 
> > fixed
> > commit at:
> > 
> > 
> > https://gitlab.freedesktop.org/mesa/mesa/commit/aaff8c7a0ed55d71e9dd0a6fef6905d6a2536c3f
> > 
> > J.A.
> > 
> > > ---
> > >  src/intel/blorp/blorp_genX_exec.h | 14 ++
> > >  1 file changed, 14 insertions(+)
> > > 
> > > diff --git a/src/intel/blorp/blorp_genX_exec.h
> > > b/src/intel/blorp/blorp_genX_exec.h
> > > index 50341ab0ecf..7a8c45dbee5 100644
> > > --- a/src/intel/blorp/blorp_genX_exec.h
> > > +++ b/src/intel/blorp/blorp_genX_exec.h
> > > @@ -1628,6 +1628,20 @@ blorp_emit_gen8_hiz_op(struct blorp_batch *batch,
> > >  */
> > > blorp_emit_3dstate_multisample(batch, params);
> > > 
> > > +   /* From the BDW PRM Volume 7, Depth Buffer Clear:
> > > +*
> > > +*The clear value must be between the min and max depth values
> > > +*(inclusive) defined in the CC_VIEWPORT. If the depth buffer 
> > > format is
> > > +*D32_FLOAT, then +/-DENORM values are also allowed.
> > > +*
> > > +* Set the bounds to match our hardware limits, [0.0, 1.0].
> > > +*/
> > > +   if (params->depth.enabled && params->hiz_op == ISL_AUX_OP_FAST_CLEAR) 
> > > {
> > > +  assert(params->depth.clear_color.f32[0] >= 0.0f);
> > > +  assert(params->depth.clear_color.f32[0] <= 1.0f);
> > > +  blorp_emit_cc_viewport(batch);
> > > +   }
> > > +
> > > /* If we can't alter the depth stencil config and multiple layers are
> > >  * involved, the HiZ op will fail. This is because the op requires 
> > > that a
> > >  * new config is emitted for each additional layer.
> > 
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
> 
> 


Re: [Mesa-dev] [PATCH] intel/blorp: Define the clear value bounds for HiZ clears

2018-10-26 Thread Nanley Chery
On Fri, Oct 26, 2018 at 12:02:58PM -0500, Jason Ekstrand wrote:
> On Thu, Oct 25, 2018 at 6:25 PM  wrote:
> 
> > From: Nanley Chery 
> >
> > Follow the restriction of making sure the clear value is between the min
> > and max values defined in CC_VIEWPORT. Avoids a simulator warning for
> > some piglit tests, one of them being:
> >
> > ./bin/depthstencil-render-miplevels 146 d=z32f_s8
> >
> > Jason found this to make a GPU hang go away on SKL.
> >
> 
> It wasn't really hangs.  It was just incorrect clearing.  The hangs may be
> related but I have no proof of that.  With the commit message adjusted
> accordingly, this patch is
> 
> Reviewed-by: Jason Ekstrand 
> Tested-by: Jason Ekstrand 
> 

Ah, okay. I'll change the line to say:

   Jason found this to fix incorrect clearing on SKL.

Thanks!

> 
> > Fixes: 09948151ab1d5184b4dd9052bb1f710fa1e00a7b
> >("intel/blorp: Add the BDW+ optimized HZ_OP sequence to BLORP")
> > ---
> >  src/intel/blorp/blorp_genX_exec.h | 14 ++
> >  1 file changed, 14 insertions(+)
> >
> > diff --git a/src/intel/blorp/blorp_genX_exec.h
> > b/src/intel/blorp/blorp_genX_exec.h
> > index 50341ab0ecf..7a8c45dbee5 100644
> > --- a/src/intel/blorp/blorp_genX_exec.h
> > +++ b/src/intel/blorp/blorp_genX_exec.h
> > @@ -1628,6 +1628,20 @@ blorp_emit_gen8_hiz_op(struct blorp_batch *batch,
> >  */
> > blorp_emit_3dstate_multisample(batch, params);
> >
> > +   /* From the BDW PRM Volume 7, Depth Buffer Clear:
> > +*
> > +*The clear value must be between the min and max depth values
> > +*(inclusive) defined in the CC_VIEWPORT. If the depth buffer
> > format is
> > +*D32_FLOAT, then +/-DENORM values are also allowed.
> > +*
> > +* Set the bounds to match our hardware limits, [0.0, 1.0].
> > +*/
> > +   if (params->depth.enabled && params->hiz_op == ISL_AUX_OP_FAST_CLEAR) {
> > +  assert(params->depth.clear_color.f32[0] >= 0.0f);
> > +  assert(params->depth.clear_color.f32[0] <= 1.0f);
> > +  blorp_emit_cc_viewport(batch);
> > +   }
> > +
> > /* If we can't alter the depth stencil config and multiple layers are
> >  * involved, the HiZ op will fail. This is because the op requires
> > that a
> >  * new config is emitted for each additional layer.
> > --
> > 2.19.0
> >
> >


Re: [Mesa-dev] [RFC] docs: Add a copyright.c template we can copy when making new files.

2018-10-19 Thread Nanley Chery
On Fri, Oct 19, 2018 at 10:51:36AM -0700, Kenneth Graunke wrote:
> Usually when making a new file, people copy some random other file
> to get the copyright header comments.  Unfortunately, some of them
> are commented in a decades-old style, are word wrapped poorly, or
> worse, have a few subtle variations in the text.  While we've tried
> to clean those up, we're not going to get every copy to be perfect.
> 
> Instead, this commit adds docs/copyright.c, which contains a copy of
> the license header which is well-formatted and has the correct text.
> The idea is that you can start from this when making a new file, which
> should help with consistency.
> ---
>  docs/copyright.c | 22 ++
>  1 file changed, 22 insertions(+)
>  create mode 100644 docs/copyright.c
> 
> Hey all,
> 
> I noticed when writing my new Iris driver that I had a couple subtle
> variations of copyright headers creep in, even in a brand new project.
> Mostly word wrapping differences.  To combat that, I made a copyright.c
> and made sure to use it when I created new files.  It seemed to help.
> 
> So, the thinking is to just actually put that in the project under docs.
> Maybe it helps other people as well?
> 

Should we let people know about this file by adding a note in
docs/devinfo.html?

To help people who don't always keep up w/ the documentation, I was
initially thinking we could add a git hook. I think that'd cause
problems when importing code from elsewhere, though.

>  --Ken
> 
> diff --git a/docs/copyright.c b/docs/copyright.c
> new file mode 100644
> index 000..db92f27e641
> --- /dev/null
> +++ b/docs/copyright.c
> @@ -0,0 +1,22 @@
> +/*
> + * Copyright © 2018 
   ^
?

-Nanley

> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included
> + * in all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
> + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +
> -- 
> 2.19.0
> 


Re: [Mesa-dev] [PATCH] anv: Don't advertise ASTC support on BSW

2018-10-15 Thread Nanley Chery
On Mon, Oct 15, 2018 at 01:07:12PM -0500, Jason Ekstrand wrote:
> ---
>  src/intel/vulkan/anv_formats.c | 8 
>  1 file changed, 8 insertions(+)
> 

This patch is
Reviewed-by: Nanley Chery 

> diff --git a/src/intel/vulkan/anv_formats.c b/src/intel/vulkan/anv_formats.c
> index 33faf7cc37f..9199567f445 100644
> --- a/src/intel/vulkan/anv_formats.c
> +++ b/src/intel/vulkan/anv_formats.c
> @@ -521,6 +521,14 @@ get_image_format_features(const struct gen_device_info 
> *devinfo,
> isl_format_get_layout(plane_format.isl_format)->txc == ISL_TXC_ASTC)
>return 0;
>  
> +   /* ASTC requires nasty workarounds on BSW so we just disable it for now.
> +*
> +* TODO: Figure out the ASTC workarounds and re-enable on BSW.
> +*/
> +   if (devinfo->gen < 9 &&
> +   isl_format_get_layout(plane_format.isl_format)->txc == ISL_TXC_ASTC)
> +  return 0;
> +
> if (isl_format_supports_sampling(devinfo, plane_format.isl_format)) {
>flags |= VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT;
>  
> -- 
> 2.19.1
> 


Re: [Mesa-dev] [PATCH] mesa: Fix pack_uint_Z_FLOAT32()

2018-10-11 Thread Nanley Chery
On Thu, Oct 11, 2018 at 06:06:18PM +0300, Illia Iorin wrote:
> Fixed pack_uint_Z_FLOAT32 by casting row data to float instead of uint.
> Remove the duplicate function pack_uint_Z_FLOAT32_X24S8.
> Edited the case in "_mesa_get_pack_uint_z_func".
> Now it looks like "_mesa_get_pack_float_z_func".
> Remove the _mesa_problem call, which was added for debugging this issue.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91433
> Signed-off-by: Illia Iorin 
> ---
>  src/mesa/main/format_pack.py | 21 +
>  src/mesa/swrast/s_depth.c|  6 --
>  2 files changed, 9 insertions(+), 18 deletions(-)
> 

Thank you for the changes. This patch is
Reviewed-by: Nanley Chery 

I'll push it later today.

> diff --git a/src/mesa/main/format_pack.py b/src/mesa/main/format_pack.py
> index 0b9e0d424d..9fa4f412d4 100644
> --- a/src/mesa/main/format_pack.py
> +++ b/src/mesa/main/format_pack.py
> @@ -510,6 +510,10 @@ pack_float_Z_UNORM32(const GLfloat *src, void *dst)
> *d = (GLuint) (*src * scale);
>  }
>  
> +/**
> + ** Pack float to Z_FLOAT32 or Z_FLOAT32_X24S8.
> + **/
> +
>  static void
>  pack_float_Z_FLOAT32(const GLfloat *src, void *dst)
>  {
> @@ -582,18 +586,12 @@ pack_uint_Z_UNORM32(const GLuint *src, void *dst)
> *d = *src;
>  }
>  
> -static void
> -pack_uint_Z_FLOAT32(const GLuint *src, void *dst)
> -{
> -   GLuint *d = ((GLuint *) dst);
> -   const GLdouble scale = 1.0 / (GLdouble) 0xffffffff;
> -   *d = (GLuint) (*src * scale);
> -   assert(*d >= 0.0f);
> -   assert(*d <= 1.0f);
> -}
> +/**
> + ** Pack uint to Z_FLOAT32 or Z_FLOAT32_X24S8.
> + **/
>  
>  static void
> -pack_uint_Z_FLOAT32_X24S8(const GLuint *src, void *dst)
> +pack_uint_Z_FLOAT32(const GLuint *src, void *dst)
>  {
> GLfloat *d = ((GLfloat *) dst);
> const GLdouble scale = 1.0 / (GLdouble) 0xffffffff;
> @@ -617,9 +615,8 @@ _mesa_get_pack_uint_z_func(mesa_format format)
> case MESA_FORMAT_Z_UNORM32:
>return pack_uint_Z_UNORM32;
> case MESA_FORMAT_Z_FLOAT32:
> -  return pack_uint_Z_FLOAT32;
> case MESA_FORMAT_Z32_FLOAT_S8X24_UINT:
> -  return pack_uint_Z_FLOAT32_X24S8;
> +  return pack_uint_Z_FLOAT32;
> default:
>_mesa_problem(NULL, "unexpected format in 
> _mesa_get_pack_uint_z_func()");
>return NULL;
> diff --git a/src/mesa/swrast/s_depth.c b/src/mesa/swrast/s_depth.c
> index 4b9640d319..de7f14a4fc 100644
> --- a/src/mesa/swrast/s_depth.c
> +++ b/src/mesa/swrast/s_depth.c
> @@ -310,12 +310,6 @@ _swrast_depth_test_span(struct gl_context *ctx, SWspan 
> *span)
>zBufferVals = zStart;
> }
> else {
> -  if (_mesa_get_format_datatype(rb->Format) != GL_UNSIGNED_NORMALIZED) {
> - _mesa_problem(ctx, "Incorrectly writing swrast's integer depth "
> -   "values to %s depth buffer",
> -   _mesa_get_format_name(rb->Format));
> -  }
> -
>/* copy Z buffer values into temp buffer (32-bit Z values) */
>zBufferTemp = malloc(count * sizeof(GLuint));
>if (!zBufferTemp)
> -- 
> 2.17.1
> 


[Mesa-dev] [PATCH] anv: Clear WM_HZ_OP overrides in init_device_state

2018-10-10 Thread Nanley Chery
This is basically a port of commit,
3ade766684933ac84e41634429fb693f85353c11
("i965: Disable 3DSTATE_WM_HZ_OP fields.")

The BDW+ docs describe how to use the 3DSTATE_WM_HZ_OP instruction in
the section titled, "Optimized Depth Buffer Clear and/or Stencil Buffer
Clear." It mentions that the packet overrides GPU state for the clear
operation and needs to be reset to 0s to clear the overrides. Depending
on the kernel, we may not get a context with the GPU state for this
packet zeroed. Do it ourselves just in case.

Prevents a number of GPU hangs when running crucible on ICL. I tried to
get the exact number of hangs that occur without this patch, but was
unsuccessful. The test machine became unresponsive before completing the
full run.

Reviewed-by: Kenneth Graunke 
---

 src/intel/vulkan/genX_state.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/intel/vulkan/genX_state.c b/src/intel/vulkan/genX_state.c
index 75bcd96d78a..42800a2581e 100644
--- a/src/intel/vulkan/genX_state.c
+++ b/src/intel/vulkan/genX_state.c
@@ -157,6 +157,16 @@ genX(init_device_state)(struct anv_device *device)
   GEN_SAMPLE_POS_16X(sp._16xSample);
 #endif
}
+
+   /* The BDW+ docs describe how to use the 3DSTATE_WM_HZ_OP instruction in the
+* section titled, "Optimized Depth Buffer Clear and/or Stencil Buffer
+* Clear." It mentions that the packet overrides GPU state for the clear
+* operation and needs to be reset to 0s to clear the overrides. Depending
+* on the kernel, we may not get a context with the state for this packet
+* zeroed. Do it ourselves just in case. We've observed this to prevent a
+* number of GPU hangs on ICL.
+*/
+   anv_batch_emit(&batch, GENX(3DSTATE_WM_HZ_OP), hzp);
 #endif
 
 #if GEN_GEN == 10
-- 
2.19.0



Re: [Mesa-dev] [PATCH 0/9] i965: Re-implement the gen9 void-extent ASTC WA with BLORP

2018-10-05 Thread Nanley Chery
On Tue, Oct 02, 2018 at 04:59:17PM -0700, Nanley Chery wrote:
> On Wed, Sep 26, 2018 at 04:31:02PM -0700, Nanley Chery wrote:
> > The current workaround has two issues. It causes significant slow-downs [1] in
> > application startup times and uses the modified ASTC blocks for non-sampling
> > operations. This can result in incorrect texture downloads.
> > 
> > This series addresses the latter issue by keeping two copies of an ASTC
> > miptree: one that's been modified for the sampler bug (the shadow) and another
> > that hasn't (the main). The main copy is used for pixel transfer operations and
> > the shadow is used for sampling within a shader. The former issue is addressed
> > by exchanging multiple GTT-mapped memory accesses at texture upload time with a
> > render engine read and write at sampling time.
> > 
> > At the moment, I don't have any empirical data on the performance
> > implications nor on the bug fixes.

This series reduces the startup time of an internal benchmark from 7s to
0.6s. It doesn't seem to have any impact on the FPS numbers (if I'm
reading them correctly). The benchmark was run five times on a release
build of mesa.

-Nanley

> 
> I just sent out a piglit test to demonstrate the fixed texture download
> issue: https://patchwork.freedesktop.org/series/50474/
> 
> -Nanley
> 
> > I'm trying to get my hands on one of
> > the affected benchmarks. This series does pass our CI system.
> > 
> > 1. 17 seconds were saved by avoiding it in commit:
> >3e56e4642fb5875b3f5c4eb34798ba9f3d827705
> > 
> > Nanley Chery (9):
> >   i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*
> >   i965/miptree: Allocate a shadow_mt for an ASTC WA
> >   i965/miptree: Track the staleness of the ASTC shadow
> >   intel/blorp_blit: Fix ptr deref in convert_to_uncompressed
> >   intel/blorp_blit: Add blorp_copy_astc_wa
> >   i965/blorp: Drop tmp_surfs from surf_for_miptree
> >   i965: Do a WA blit between ASTC main and shadow
> >   i965/surface_state: Use the ASTC shadow_mt if present
> >   i965/tex_image: Drop intelCompressedTexSubImage
> > 
> >  src/intel/blorp/blorp.h   |   6 +
> >  src/intel/blorp/blorp_blit.c  | 158 +-
> >  src/intel/blorp/blorp_priv.h  |   1 +
> >  src/mesa/drivers/dri/i965/brw_blorp.c |  56 ---
> >  src/mesa/drivers/dri/i965/brw_blorp.h |   6 +
> >  src/mesa/drivers/dri/i965/brw_draw.c  |  16 ++
> >  .../drivers/dri/i965/brw_wm_surface_state.c   |  11 +-
> >  src/mesa/drivers/dri/i965/intel_mipmap_tree.c |  46 -
> >  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  21 ++-
> >  src/mesa/drivers/dri/i965/intel_tex_image.c   |  87 --
> >  10 files changed, 276 insertions(+), 132 deletions(-)
> > 
> > -- 
> > 2.19.0
> > 


Re: [Mesa-dev] [PATCH v3] mesa/format_pack: Fix pack_uint_Z_FLOAT32()

2018-10-04 Thread Nanley Chery
Since we're modifying swrast now, we should change the commit title tag
from mesa/format_pack: to just mesa: 

On Mon, Oct 01, 2018 at 07:18:41PM +0300, Illia Iorin wrote:
> Fixed pack_uint_Z_FLOAT32 by casting row data to float instead of uint.
> Remove the duplicate function pack_uint_Z_FLOAT32_X24S8.
> Edited the case in "_mesa_get_pack_uint_z_func".
> Now it looks like "_mesa_get_pack_float_z_func".

Please insert an empty line here.

> v2: by Nanley Chery
> -add comments
> -remove _mesa_problem call, which was added for debugging this issue

I didn't suggest these, so no need to attribute them to me.

> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91433
> Signed-off-by: Illia Iorin 
> ---
>  src/mesa/main/format_pack.py | 21 +
>  src/mesa/swrast/s_depth.c|  6 --
>  2 files changed, 9 insertions(+), 18 deletions(-)
> 
> diff --git a/src/mesa/main/format_pack.py b/src/mesa/main/format_pack.py
> index 0b9e0d424d..9f97c09ccc 100644
> --- a/src/mesa/main/format_pack.py
> +++ b/src/mesa/main/format_pack.py
> @@ -510,6 +510,10 @@ pack_float_Z_UNORM32(const GLfloat *src, void *dst)
> *d = (GLuint) (*src * scale);
>  }
>  
> +/**
> + ** Pack Z_FLOAT32 and Z_FLOAT32_X24S8 to float.
> + **/
> +

This comment and the one below aren't correct. The direction should be
swapped. In other words, we're packing a float into a Z_FLOAT32 and
Z_FLOAT32_X24S8.
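
To make the direction concrete, here is a minimal sketch of what I mean,
assuming the depth value is normalized against the maximum 32-bit unsigned
value. sketch_pack_uint_z_float32 is a hypothetical stand-in for
illustration, not the function in format_pack.py. The source is a uint and
the destination is written as a float:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for pack_uint_Z_FLOAT32(): take a 32-bit
 * unsigned-normalized depth value (the source) and pack it into a
 * float in [0, 1] (the destination format).
 */
static void
sketch_pack_uint_z_float32(const uint32_t *src, void *dst)
{
   /* The destination holds a float, so write through a float pointer. */
   float *d = (float *) dst;
   /* Assumption: normalize against the largest 32-bit unsigned value. */
   const double scale = 1.0 / (double) UINT32_MAX;

   *d = (float) (*src * scale);
   assert(*d >= 0.0f);
   assert(*d <= 1.0f);
}
```

So the comment should read "uint to Z_FLOAT32", since the uint is the input
and the float depth format is the output.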

-Nanley

>  static void
>  pack_float_Z_FLOAT32(const GLfloat *src, void *dst)
>  {
> @@ -582,18 +586,12 @@ pack_uint_Z_UNORM32(const GLuint *src, void *dst)
> *d = *src;
>  }
>  
> -static void
> -pack_uint_Z_FLOAT32(const GLuint *src, void *dst)
> -{
> -   GLuint *d = ((GLuint *) dst);
> -   const GLdouble scale = 1.0 / (GLdouble) 0xffffffff;
> -   *d = (GLuint) (*src * scale);
> -   assert(*d >= 0.0f);
> -   assert(*d <= 1.0f);
> -}
> +/**
> + ** Pack Z_FLOAT32 and Z_FLOAT32_X24S8 to uint.
> + **/
>  
>  static void
> -pack_uint_Z_FLOAT32_X24S8(const GLuint *src, void *dst)
> +pack_uint_Z_FLOAT32(const GLuint *src, void *dst)
>  {
> GLfloat *d = ((GLfloat *) dst);
> const GLdouble scale = 1.0 / (GLdouble) 0xffffffff;
> @@ -617,9 +615,8 @@ _mesa_get_pack_uint_z_func(mesa_format format)
> case MESA_FORMAT_Z_UNORM32:
>return pack_uint_Z_UNORM32;
> case MESA_FORMAT_Z_FLOAT32:
> -  return pack_uint_Z_FLOAT32;
> case MESA_FORMAT_Z32_FLOAT_S8X24_UINT:
> -  return pack_uint_Z_FLOAT32_X24S8;
> +  return pack_uint_Z_FLOAT32;
> default:
>_mesa_problem(NULL, "unexpected format in 
> _mesa_get_pack_uint_z_func()");
>return NULL;
> diff --git a/src/mesa/swrast/s_depth.c b/src/mesa/swrast/s_depth.c
> index 4b9640d319..de7f14a4fc 100644
> --- a/src/mesa/swrast/s_depth.c
> +++ b/src/mesa/swrast/s_depth.c
> @@ -310,12 +310,6 @@ _swrast_depth_test_span(struct gl_context *ctx, SWspan 
> *span)
>zBufferVals = zStart;
> }
> else {
> -  if (_mesa_get_format_datatype(rb->Format) != GL_UNSIGNED_NORMALIZED) {
> - _mesa_problem(ctx, "Incorrectly writing swrast's integer depth "
> -   "values to %s depth buffer",
> -   _mesa_get_format_name(rb->Format));
> -  }
> -
>/* copy Z buffer values into temp buffer (32-bit Z values) */
>zBufferTemp = malloc(count * sizeof(GLuint));
>if (!zBufferTemp)
> -- 
> 2.17.1
> 


Re: [Mesa-dev] [PATCH 9/9] i965/tex_image: Drop intelCompressedTexSubImage

2018-10-02 Thread Nanley Chery
On Wed, Sep 26, 2018 at 04:31:11PM -0700, Nanley Chery wrote:
> Effectively revert 710b1d2e665ed654fb8d52b146fa22469e1dc3a7.
> 
> This function was created to perform the ASTC void-extent workaround.
> Now that the workaround is handled prior to sampling, this function is
> no longer necessary.

Adding to the commit message:

Makes the following piglit test pass:
spec@khr_texture_compression_astc@void-extent-dl-bug

In hopes that the test makes it upstream.

-Nanley

> ---
>  src/mesa/drivers/dri/i965/intel_tex_image.c | 87 -
>  1 file changed, 87 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c 
> b/src/mesa/drivers/dri/i965/intel_tex_image.c
> index 9775f788788..31ff08217ac 100644
> --- a/src/mesa/drivers/dri/i965/intel_tex_image.c
> +++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
> @@ -843,98 +843,11 @@ intel_get_tex_sub_image(struct gl_context *ctx,
> DBG("%s - DONE\n", __func__);
>  }
>  
> -static void
> -flush_astc_denorms(struct gl_context *ctx, GLuint dims,
> -   struct gl_texture_image *texImage,
> -   GLint xoffset, GLint yoffset, GLint zoffset,
> -   GLsizei width, GLsizei height, GLsizei depth)
> -{
> -   struct compressed_pixelstore store;
> -   _mesa_compute_compressed_pixelstore(dims, texImage->TexFormat,
> -   width, height, depth,
> -   &ctx->Unpack, &store);
> -
> -   for (int slice = 0; slice < store.CopySlices; slice++) {
> -
> -  /* Map dest texture buffer */
> -  GLubyte *dstMap;
> -  GLint dstRowStride;
> -  ctx->Driver.MapTextureImage(ctx, texImage, slice + zoffset,
> -  xoffset, yoffset, width, height,
> -  GL_MAP_READ_BIT | GL_MAP_WRITE_BIT,
> -  &dstMap, &dstRowStride);
> -  if (!dstMap)
> - continue;
> -
> -  for (int i = 0; i < store.CopyRowsPerSlice; i++) {
> -
> - /* An ASTC block is stored in little endian mode. The byte that
> -  * contains bits 0..7 is stored at the lower address in memory.
> -  */
> - struct astc_void_extent {
> -uint16_t header : 12;
> -uint16_t dontcare[3];
> -uint16_t R;
> -uint16_t G;
> -uint16_t B;
> -uint16_t A;
> - } *blocks = (struct astc_void_extent*) dstMap;
> -
> - /* Iterate over every copied block in the row */
> - for (int j = 0; j < store.CopyBytesPerRow / 16; j++) {
> -
> -/* Check if the header matches that of an LDR void-extent block 
> */
> -if (blocks[j].header == 0xDFC) {
> -
> -   /* Flush UNORM16 values that would be denormalized */
> -   if (blocks[j].A < 4) blocks[j].A = 0;
> -   if (blocks[j].B < 4) blocks[j].B = 0;
> -   if (blocks[j].G < 4) blocks[j].G = 0;
> -   if (blocks[j].R < 4) blocks[j].R = 0;
> -}
> - }
> -
> - dstMap += dstRowStride;
> -  }
> -
> -  ctx->Driver.UnmapTextureImage(ctx, texImage, slice + zoffset);
> -   }
> -}
> -
> -
> -static void
> -intelCompressedTexSubImage(struct gl_context *ctx, GLuint dims,
> -struct gl_texture_image *texImage,
> -GLint xoffset, GLint yoffset, GLint zoffset,
> -GLsizei width, GLsizei height, GLsizei depth,
> -GLenum format,
> -GLsizei imageSize, const GLvoid *data)
> -{
> -   /* Upload the compressed data blocks */
> -   _mesa_store_compressed_texsubimage(ctx, dims, texImage,
> -  xoffset, yoffset, zoffset,
> -  width, height, depth,
> -  format, imageSize, data);
> -
> -   /* Fix up copied ASTC blocks if necessary */
> -   GLenum gl_format = _mesa_compressed_format_to_glenum(ctx,
> -texImage->TexFormat);
> -   bool is_linear_astc = _mesa_is_astc_format(gl_format) &&
> -!_mesa_is_srgb_format(gl_format);
> -   struct brw_context *brw = (struct brw_context*) ctx;
> -   const struct gen_device_info *devinfo = &brw->screen->devinfo;
> -   if (devinfo->gen == 9 && !gen_device_info_is_9lp(devinfo) && 
> is_linear_astc)
> -  flush_astc_denorms(ctx, dims, texImage,
> - xoffset, yoffset, zoffset,
> - width, height, depth);
> -}

Re: [Mesa-dev] [PATCH 0/9] i965: Re-implement the gen9 void-extent ASTC WA with BLORP

2018-10-02 Thread Nanley Chery
On Wed, Sep 26, 2018 at 04:31:02PM -0700, Nanley Chery wrote:
> The current workaround has two issues. It causes significant slow-downs [1] in
> application startup times and uses the modified ASTC blocks for non-sampling
> operations. This can result in incorrect texture downloads.
> 
> This series addresses the latter issue by keeping two copies of an ASTC
> miptree: one that's been modified for the sampler bug (the shadow) and another
> that hasn't (the main). The main copy is used for pixel transfer operations and
> the shadow is used for sampling within a shader. The former issue is addressed
> by exchanging multiple GTT-mapped memory accesses at texture upload time with a
> render engine read and write at sampling time.
> 
> At the moment, I don't have any empirical data on the performance
> implications nor on the bug fixes.

I just sent out a piglit test to demonstrate the fixed texture download
issue: https://patchwork.freedesktop.org/series/50474/
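
For anyone skimming the thread, the main/shadow scheme quoted above can be
sketched roughly as follows. This is only an illustration under stated
assumptions: every name here (sketch_miptree, sketch_upload, and so on) is
hypothetical, the "texture" is a flat array of UNORM16 values rather than
real ASTC blocks, and the fix-up (flushing values below 4 to zero) is
borrowed from the CPU-side workaround that the series removes. The real
series does the copy with BLORP on the render engine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_TEXELS 4

/* Hypothetical miniature of a miptree that keeps two copies of its data. */
struct sketch_miptree {
   uint16_t main_copy[SKETCH_TEXELS];   /* unmodified, for pixel transfers */
   uint16_t shadow_copy[SKETCH_TEXELS]; /* fixed up, for sampling */
   bool shadow_needs_update;            /* true when the shadow is stale */
};

/* Upload only touches the main copy and marks the shadow stale. */
static void
sketch_upload(struct sketch_miptree *mt, const uint16_t *data)
{
   assert(mt != NULL && data != NULL);
   memcpy(mt->main_copy, data, sizeof(mt->main_copy));
   mt->shadow_needs_update = true;
}

/* Downloads read the main copy, so the application sees what it uploaded. */
static const uint16_t *
sketch_download(const struct sketch_miptree *mt)
{
   assert(mt != NULL);
   return mt->main_copy;
}

/* Sampling refreshes the shadow lazily, then reads from it. */
static const uint16_t *
sketch_prepare_sampling(struct sketch_miptree *mt)
{
   assert(mt != NULL);
   if (mt->shadow_needs_update) {
      for (size_t i = 0; i < SKETCH_TEXELS; i++) {
         /* Flush values that would be denormalized, as the old WA did. */
         mt->shadow_copy[i] = mt->main_copy[i] < 4 ? 0 : mt->main_copy[i];
      }
      mt->shadow_needs_update = false;
   }
   return mt->shadow_copy;
}
```

The point of the split is that the cost moves from upload time to the first
sample after an upload, and downloads never see the modified data.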

-Nanley

> I'm trying to get my hands on one of
> the affected benchmarks. This series does pass our CI system.
> 
> 1. 17 seconds were saved by avoiding it in commit:
>    3e56e4642fb5875b3f5c4eb34798ba9f3d827705
> 
> Nanley Chery (9):
>   i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*
>   i965/miptree: Allocate a shadow_mt for an ASTC WA
>   i965/miptree: Track the staleness of the ASTC shadow
>   intel/blorp_blit: Fix ptr deref in convert_to_uncompressed
>   intel/blorp_blit: Add blorp_copy_astc_wa
>   i965/blorp: Drop tmp_surfs from surf_for_miptree
>   i965: Do a WA blit between ASTC main and shadow
>   i965/surface_state: Use the ASTC shadow_mt if present
>   i965/tex_image: Drop intelCompressedTexSubImage
> 
>  src/intel/blorp/blorp.h   |   6 +
>  src/intel/blorp/blorp_blit.c  | 158 +-
>  src/intel/blorp/blorp_priv.h  |   1 +
>  src/mesa/drivers/dri/i965/brw_blorp.c |  56 ---
>  src/mesa/drivers/dri/i965/brw_blorp.h |   6 +
>  src/mesa/drivers/dri/i965/brw_draw.c  |  16 ++
>  .../drivers/dri/i965/brw_wm_surface_state.c   |  11 +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c |  46 -
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  21 ++-
>  src/mesa/drivers/dri/i965/intel_tex_image.c   |  87 --
>  10 files changed, 276 insertions(+), 132 deletions(-)
> 
> -- 
> 2.19.0
> 


[Mesa-dev] [PATCH] anv: Restrict some fast-clear values gen7-8

2018-10-01 Thread Nanley Chery
The Vulkan API was recently clarified to allow image views of different
formats to be used for rendering to the same image. Due to certain
incompatible clear-color encodings on gen7-8, we must be more careful
about which fast-clear values we allow.

Makes the following crucible test pass pre-SKL:
func.renderpass.clear.color-view-one

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105826
Cc: 
---

The crucible test currently only exists in a merge request:
https://gitlab.freedesktop.org/mesa/crucible/merge_requests/11
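
To spell out the rule this patch applies on gen7-8, here is a small
standalone sketch. The helper names (sketch_float_bits,
sketch_fast_clear_allowed) are hypothetical and exist only for this
illustration; the logic is meant to mirror the check added to
color_attachment_compute_aux_usage, reduced to plain booleans. A true
return only means this particular check does not reject the fast clear.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Return the raw IEEE 754 bit pattern of a float. */
static uint32_t
sketch_float_bits(float f)
{
   uint32_t bits;
   assert(sizeof(bits) == sizeof(f));
   memcpy(&bits, &f, sizeof(bits));
   return bits;
}

/* Hypothetical reduction of the gen7-8 fast-clear value restriction.
 *
 * fmts_flt_xor_int: the image is only ever viewed with integer formats
 *                   or only ever with float formats, never both.
 */
static bool
sketch_fast_clear_allowed(int gen, bool fmts_flt_xor_int,
                          bool clear_color_is_zero,
                          bool clear_color_is_zero_one)
{
   if (gen > 8)
      return true;

   /* Views may mix int and float: only all-zero clears are unambiguous. */
   if (!fmts_flt_xor_int)
      return clear_color_is_zero;

   /* Format type is fixed: 0/1 clear colors are safe. */
   return clear_color_is_zero_one;
}
```

For example, sketch_float_bits(1.0f) differs from the integer 1, yet both
share the same 1 encoding in the surface state, which is why a clear to
ones is rejected when the views can mix types.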

 src/intel/vulkan/anv_image.c   | 36 ++
 src/intel/vulkan/anv_private.h |  6 +
 src/intel/vulkan/genX_cmd_buffer.c | 19 +---
 3 files changed, 58 insertions(+), 3 deletions(-)

diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index b0d8c560adb..4eb06501f8d 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -156,6 +156,40 @@ add_surface(struct anv_image *image, struct anv_surface 
*surf, uint32_t plane)
  surf->isl.alignment_B);
 }
 
+static bool
+all_formats_flt_xor_int(const struct gen_device_info *devinfo,
+   const struct VkImageCreateInfo *vk_info)
+{
+   if (!(vk_info->flags & VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT))
+  return true;
+
+   const VkImageFormatListCreateInfoKHR *fmt_list =
+  vk_find_struct_const(vk_info->pNext, IMAGE_FORMAT_LIST_CREATE_INFO_KHR);
+
+   if (!fmt_list || fmt_list->viewFormatCount == 0)
+  return false;
+
+   enum isl_format format =
+  anv_get_isl_format(devinfo, vk_info->format,
+ VK_IMAGE_ASPECT_COLOR_BIT, vk_info->tiling);
+
+   const bool is_int = isl_format_has_int_channel(format) ||
+   isl_format_has_uint_channel(format);
+
+   for (uint32_t i = 0; i < fmt_list->viewFormatCount; i++) {
+  enum isl_format view_format =
+ anv_get_isl_format(devinfo, fmt_list->pViewFormats[i],
+VK_IMAGE_ASPECT_COLOR_BIT, vk_info->tiling);
+
+  const bool view_is_int = isl_format_has_int_channel(view_format) ||
+   isl_format_has_uint_channel(view_format);
+
+  if (is_int != view_is_int)
+ return false;
+   }
+
+   return true;
+}
 
 static bool
 all_formats_ccs_e_compatible(const struct gen_device_info *devinfo,
@@ -593,6 +627,8 @@ anv_image_create(VkDevice _device,
image->usage = pCreateInfo->usage;
image->tiling = pCreateInfo->tiling;
image->disjoint = pCreateInfo->flags & VK_IMAGE_CREATE_DISJOINT_BIT;
+   image->fmts_flt_xor_int =
+  all_formats_flt_xor_int(&device->info, pCreateInfo);
image->needs_set_tiling = wsi_info && wsi_info->scanout;
image->drm_format_mod = isl_mod_info ? isl_mod_info->modifier :
   DRM_FORMAT_MOD_INVALID;
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 60f40c7e2ae..07dcc9f9b37 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2641,6 +2641,12 @@ struct anv_image {
VkImageUsageFlags usage; /**< Superset of VkImageCreateInfo::usage. */
VkImageTiling tiling; /** VkImageCreateInfo::tiling */
 
+   /**
+* True if this image may be used with integer or float formats, but not
+* both.
+*/
+   bool fmts_flt_xor_int;
+
/** True if this is needs to be bound to an appropriately tiled BO.
 *
 * When not using modifiers, consumers such as X11, Wayland, and KMS need
diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index 099c30f3d66..da01795db10 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -322,9 +322,22 @@ color_attachment_compute_aux_usage(struct anv_device * 
device,
   render_area.extent.height != iview->extent.height)
  att_state->fast_clear = false;
 
-  /* On Broadwell and earlier, we can only handle 0/1 clear colors */
-  if (GEN_GEN <= 8 && !att_state->clear_color_is_zero_one)
- att_state->fast_clear = false;
+  /* Broadwell and earlier have restrictions due to integer and float
+   * texture clears being viewed incorrectly for certain values. Float
+   * texture clears to 1.0 (0x3F80_0000) are encoded as 1 in the surface
+   * state. Integer texture clears to 1 (0x1) share the same encoding. To
+   * avoid this conflict, only allow clearing with ones if the image format
+   * type is immutable.
+   */
+  if (GEN_GEN <= 8) {
+ if (!iview->image->fmts_flt_xor_int &&
+ !att_state->clear_color_is_zero) {
+att_state->fast_clear = false;
+ } else if (iview->image->fmts_flt_xor_int &&
+!att_state->clear_color_is_zero_one) {
+att_state->fast_clear = false;
+ }
+  }
 
   /* We only allow fast clears to the first slice of an image (level 0,
* layer 0) and 

[Mesa-dev] [PATCH 8/9] i965/surface_state: Use the ASTC shadow_mt if present

2018-09-26 Thread Nanley Chery
When sampling from an ASTC texture in a shader, make sure to use the
miptree which has had the gen9 void-extent workaround applied to it.
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index e214fae140b..cad0f7faba1 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -573,6 +573,9 @@ static void brw_update_texture_surface(struct gl_context 
*ctx,
  assert(mt->shadow_mt && !mt->shadow_needs_update);
  mt = mt->shadow_mt;
  format = ISL_FORMAT_R8_UINT;
+  } else if (intel_miptree_has_astc_shadow(mt)) {
+ assert(!mt->shadow_needs_update);
+ mt = mt->shadow_mt;
   }
 
   const int surf_index = surf_offset - &brw->wm.base.surf_offset[0];
-- 
2.19.0



[Mesa-dev] [PATCH 4/9] intel/blorp_blit: Fix ptr deref in convert_to_uncompressed

2018-09-26 Thread Nanley Chery
Don't access the pointers x and y if they're NULL. Nothing hits this
path currently.
---
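
The pattern being fixed, as a standalone sketch. Everything here is
hypothetical and simplified for illustration (sketch_convert_to_blocks is
not the blorp function, and the block-size parameters replace the format
layout lookup); the handling of x and y in the second half is an
assumption about how the offsets are converted. The idea is that each
optional pair of pointers is only dereferenced after it has been checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Convert pixel coordinates and sizes to compressed-block units.
 * Both the (x, y) pair and the (width, height) pair are optional.
 */
static void
sketch_convert_to_blocks(uint32_t bw, uint32_t bh,
                         uint32_t *x, uint32_t *y,
                         uint32_t *width, uint32_t *height)
{
   assert(bw > 0 && bh > 0);

   /* Sizes round up so a partial block at the edge is still covered. */
   if (width && height) {
      *width = SKETCH_DIV_ROUND_UP(*width, bw);
      *height = SKETCH_DIV_ROUND_UP(*height, bh);
   }

   /* Offsets must already be block-aligned; only touch them if given. */
   if (x && y) {
      assert(*x % bw == 0);
      assert(*y % bh == 0);
      *x /= bw;
      *y /= bh;
   }
}
```

A caller that only cares about sizes can pass NULL for x and y, which is
the case the debug-only asserts in the patch did not account for.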
 src/intel/blorp/blorp_blit.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/intel/blorp/blorp_blit.c b/src/intel/blorp/blorp_blit.c
index ae3e3c50930..7c4e569e44c 100644
--- a/src/intel/blorp/blorp_blit.c
+++ b/src/intel/blorp/blorp_blit.c
@@ -2456,15 +2456,18 @@ blorp_surf_convert_to_uncompressed(const struct 
isl_device *isl_dev,
 */
blorp_surf_convert_to_single_slice(isl_dev, info);
 
-   if (width && height) {
 #ifndef NDEBUG
+   if (width && height && x && y) {
   uint32_t right_edge_px = info->tile_x_sa + *x + *width;
   uint32_t bottom_edge_px = info->tile_y_sa + *y + *height;
   assert(*width % fmtl->bw == 0 ||
  right_edge_px == info->surf.logical_level0_px.width);
   assert(*height % fmtl->bh == 0 ||
  bottom_edge_px == info->surf.logical_level0_px.height);
+   }
 #endif
+
+   if (width && height) {
   *width = DIV_ROUND_UP(*width, fmtl->bw);
   *height = DIV_ROUND_UP(*height, fmtl->bh);
}
-- 
2.19.0



[Mesa-dev] [PATCH 2/9] i965/miptree: Allocate a shadow_mt for an ASTC WA

2018-09-26 Thread Nanley Chery
shadow_mt will hold a miptree with ASTC LDR void-extent blocks that are
modified to work around a sampler bug.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 25 +++
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  1 +
 2 files changed, 26 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 332c5d88f58..5e99b563102 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -354,6 +354,19 @@ needs_separate_stencil(const struct brw_context *brw,
   intel_miptree_supports_hiz(brw, mt);
 }
 
+/* Determine if we may run into a problematic void-extent block for our sampler
+ * (see WA #0300). This isn't 100% accurate because we don't actually inspect
+ * the blocks.
+ */
+static bool
+may_need_astc_shadow(const struct gen_device_info *devinfo,
+ mesa_format format)
+{
+   return devinfo->gen == 9 && !gen_device_info_is_9lp(devinfo) &&
+  _mesa_get_format_layout(format) == MESA_FORMAT_LAYOUT_ASTC &&
+  _mesa_get_format_color_encoding(format) == GL_LINEAR;
+}
+
 /**
  * Choose the aux usage for this miptree.  This function must be called fairly
  * late in the miptree create process after we have a tiling.
@@ -719,6 +732,18 @@ miptree_create(struct brw_context *brw,
   }
}
 
+   if (may_need_astc_shadow(devinfo, format)) {
+  mt->shadow_mt =
+ make_surface(brw, target, format, first_level, last_level,
+  width0, height0, depth0, num_samples,
+  ISL_TILING_Y0_BIT, mt_surf_usage(format),
+  BO_ALLOC_BUSY, 0, NULL);
+  if (mt->shadow_mt == NULL) {
+ intel_miptree_release(&mt);
+ return NULL;
+  }
+   }
+
mt->etc_format = (_mesa_is_format_color_format(format) && mt_fmt != format) 
?
 format : MESA_FORMAT_NONE;
 
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index d75c93b8b42..b22514de386 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -301,6 +301,7 @@ struct intel_mipmap_tree
 *
 * This miptree may be used for:
 * - Stencil texturing (pre-BDW) as required by GL_ARB_stencil_texturing.
+* - Correctly sampling from ASTC LDR blocks on big-core gen9 platforms.
 */
struct intel_mipmap_tree *shadow_mt;
bool shadow_needs_update;
-- 
2.19.0



[Mesa-dev] [PATCH 6/9] i965/blorp: Drop tmp_surfs from surf_for_miptree

2018-09-26 Thread Nanley Chery
We've been using intel_mipmap_tree::surf instead. The tmp_surfs param
hasn't been used since commit: bf24c3539e4b6989512968cae12da2f88d2c53e9
("i965/miptree: Clean-up unused").
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 36 +--
 1 file changed, 12 insertions(+), 24 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index ad747e0766e..2ebd35ae49f 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -125,8 +125,7 @@ blorp_surf_for_miptree(struct brw_context *brw,
enum isl_aux_usage aux_usage,
bool is_render_target,
unsigned *level,
-   unsigned start_layer, unsigned num_layers,
-   struct isl_surf tmp_surfs[1])
+   unsigned start_layer, unsigned num_layers)
 {
const struct gen_device_info *devinfo = &brw->screen->devinfo;
 
@@ -406,12 +405,11 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
intel_miptree_prepare_access(brw, dst_mt, dst_level, 1, dst_layer, 1,
 dst_aux_usage, dst_clear_supported);
 
-   struct isl_surf tmp_surfs[2];
struct blorp_surf src_surf, dst_surf;
blorp_surf_for_miptree(brw, &src_surf, src_mt, src_aux_usage, false,
-  &src_level, src_layer, 1, &tmp_surfs[0]);
+  &src_level, src_layer, 1);
blorp_surf_for_miptree(brw, &dst_surf, dst_mt, dst_aux_usage, true,
-  &dst_level, dst_layer, 1, &tmp_surfs[1]);
+  &dst_level, dst_layer, 1);
 
struct isl_swizzle src_isl_swizzle = {
   .r = swizzle_to_scs(GET_SWZ(src_swizzle, 0)),
@@ -497,12 +495,11 @@ brw_blorp_copy_miptrees(struct brw_context *brw,
intel_miptree_prepare_access(brw, dst_mt, dst_level, 1, dst_layer, 1,
 dst_aux_usage, dst_clear_supported);
 
-   struct isl_surf tmp_surfs[2];
struct blorp_surf src_surf, dst_surf;
blorp_surf_for_miptree(brw, &src_surf, src_mt, src_aux_usage, false,
-  &src_level, src_layer, 1, &tmp_surfs[0]);
+  &src_level, src_layer, 1);
blorp_surf_for_miptree(brw, &dst_surf, dst_mt, dst_aux_usage, true,
-  &dst_level, dst_layer, 1, &tmp_surfs[1]);
+  &dst_level, dst_layer, 1);
 
/* The hardware seems to have issues with having a two different format
 * views of the same texture in the sampler cache at the same time.  It's
@@ -1300,10 +1297,9 @@ do_single_blorp_clear(struct brw_context *brw, struct 
gl_framebuffer *fb,
   irb->mt, irb->mt_level, irb->mt_layer, num_layers);
 
   /* We can't setup the blorp_surf until we've allocated the MCS above */
-  struct isl_surf isl_tmp[2];
   struct blorp_surf surf;
   blorp_surf_for_miptree(brw, &surf, irb->mt, irb->mt->aux_usage, true,
- &level, irb->mt_layer, num_layers, isl_tmp);
+ &level, irb->mt_layer, num_layers);
 
   /* Ivybrigde PRM Vol 2, Part 1, "11.7 MCS Buffer for Render Target(s)":
*
@@ -1346,10 +1342,9 @@ do_single_blorp_clear(struct brw_context *brw, struct 
gl_framebuffer *fb,
   intel_miptree_prepare_render(brw, irb->mt, level, irb->mt_layer,
num_layers, aux_usage);
 
-  struct isl_surf isl_tmp[2];
   struct blorp_surf surf;
   blorp_surf_for_miptree(brw, &surf, irb->mt, aux_usage, true,
- &level, irb->mt_layer, num_layers, isl_tmp);
+ &level, irb->mt_layer, num_layers);
 
   union isl_color_value clear_color;
   memcpy(clear_color.f32, ctx->Color.ClearColor.f, sizeof(float) * 4);
@@ -1444,7 +1439,6 @@ brw_blorp_clear_depth_stencil(struct brw_context *brw,
   return;
 
uint32_t level, start_layer, num_layers;
-   struct isl_surf isl_tmp[4];
struct blorp_surf depth_surf, stencil_surf;
 
struct intel_mipmap_tree *depth_mt = NULL;
@@ -1461,8 +1455,7 @@ brw_blorp_clear_depth_stencil(struct brw_context *brw,
 
   unsigned depth_level = level;
   blorp_surf_for_miptree(brw, &depth_surf, depth_mt, depth_mt->aux_usage,
- true, &depth_level, start_layer, num_layers,
- &isl_tmp[0]);
+ true, &depth_level, start_layer, num_layers);
   assert(depth_level == level);
}
 
@@ -1491,8 +1484,7 @@ brw_blorp_clear_depth_stencil(struct brw_context *brw,
   unsigned stencil_level = level;
   blorp_surf_for_miptree(brw, &stencil_surf, stencil_mt,
  ISL_AUX_USAGE_NONE, true,
- &stencil_level, start_layer, num_layers,
- &isl_tmp[2]);
+ &stencil_level, start_layer, num_layers);
}
 
assert((mask & BUFFER_BIT_DEPTH) || stencil_mask);
@@ -1527,11 +1519,9 @@ brw_blorp_resolve_color(struct brw_context *brw, struct 
intel_mipmap_tree *mt,
 
 

[Mesa-dev] [PATCH 3/9] i965/miptree: Track the staleness of the ASTC shadow

2018-09-26 Thread Nanley Chery
Track whether or not the ASTC shadow miptree will need to be updated
prior to sampling.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 5 -
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 6 ++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 5e99b563102..090e20e1d70 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -2451,8 +2451,11 @@ intel_miptree_finish_write(struct brw_context *brw,
 
switch (mt->aux_usage) {
case ISL_AUX_USAGE_NONE:
-  if (mt->format == MESA_FORMAT_S_UINT8 && devinfo->gen <= 7)
+  if (mt->format == MESA_FORMAT_S_UINT8 && devinfo->gen <= 7) {
  mt->shadow_needs_update = true;
+  } else if (intel_miptree_has_astc_shadow(mt)) {
+ mt->shadow_needs_update = true;
+  }
   break;
 
case ISL_AUX_USAGE_MCS:
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index b22514de386..3ae0117d68f 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -726,6 +726,12 @@ intel_miptree_blt_pitch(struct intel_mipmap_tree *mt)
return pitch;
 }
 
+static inline bool
+intel_miptree_has_astc_shadow(const struct intel_mipmap_tree *mt)
+{
+   return _mesa_get_format_layout(mt->format) == MESA_FORMAT_LAYOUT_ASTC &&
+   mt->shadow_mt != NULL;
+}
 #ifdef __cplusplus
 }
 #endif
-- 
2.19.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 9/9] i965/tex_image: Drop intelCompressedTexSubImage

2018-09-26 Thread Nanley Chery
Effectively revert 710b1d2e665ed654fb8d52b146fa22469e1dc3a7.

This function was created to perform the ASTC void-extent workaround.
Now that the workaround is handled prior to sampling, this function is
no longer necessary.
---
 src/mesa/drivers/dri/i965/intel_tex_image.c | 87 -
 1 file changed, 87 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c 
b/src/mesa/drivers/dri/i965/intel_tex_image.c
index 9775f788788..31ff08217ac 100644
--- a/src/mesa/drivers/dri/i965/intel_tex_image.c
+++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
@@ -843,98 +843,11 @@ intel_get_tex_sub_image(struct gl_context *ctx,
DBG("%s - DONE\n", __func__);
 }
 
-static void
-flush_astc_denorms(struct gl_context *ctx, GLuint dims,
-   struct gl_texture_image *texImage,
-   GLint xoffset, GLint yoffset, GLint zoffset,
-   GLsizei width, GLsizei height, GLsizei depth)
-{
-   struct compressed_pixelstore store;
-   _mesa_compute_compressed_pixelstore(dims, texImage->TexFormat,
-   width, height, depth,
-   &ctx->Unpack, &store);
-
-   for (int slice = 0; slice < store.CopySlices; slice++) {
-
-  /* Map dest texture buffer */
-  GLubyte *dstMap;
-  GLint dstRowStride;
-  ctx->Driver.MapTextureImage(ctx, texImage, slice + zoffset,
-  xoffset, yoffset, width, height,
-  GL_MAP_READ_BIT | GL_MAP_WRITE_BIT,
-  &dstMap, &dstRowStride);
-  if (!dstMap)
- continue;
-
-  for (int i = 0; i < store.CopyRowsPerSlice; i++) {
-
- /* An ASTC block is stored in little endian mode. The byte that
-  * contains bits 0..7 is stored at the lower address in memory.
-  */
- struct astc_void_extent {
-uint16_t header : 12;
-uint16_t dontcare[3];
-uint16_t R;
-uint16_t G;
-uint16_t B;
-uint16_t A;
- } *blocks = (struct astc_void_extent*) dstMap;
-
- /* Iterate over every copied block in the row */
- for (int j = 0; j < store.CopyBytesPerRow / 16; j++) {
-
-/* Check if the header matches that of an LDR void-extent block */
-if (blocks[j].header == 0xDFC) {
-
-   /* Flush UNORM16 values that would be denormalized */
-   if (blocks[j].A < 4) blocks[j].A = 0;
-   if (blocks[j].B < 4) blocks[j].B = 0;
-   if (blocks[j].G < 4) blocks[j].G = 0;
-   if (blocks[j].R < 4) blocks[j].R = 0;
-}
- }
-
- dstMap += dstRowStride;
-  }
-
-  ctx->Driver.UnmapTextureImage(ctx, texImage, slice + zoffset);
-   }
-}
-
-
-static void
-intelCompressedTexSubImage(struct gl_context *ctx, GLuint dims,
-struct gl_texture_image *texImage,
-GLint xoffset, GLint yoffset, GLint zoffset,
-GLsizei width, GLsizei height, GLsizei depth,
-GLenum format,
-GLsizei imageSize, const GLvoid *data)
-{
-   /* Upload the compressed data blocks */
-   _mesa_store_compressed_texsubimage(ctx, dims, texImage,
-  xoffset, yoffset, zoffset,
-  width, height, depth,
-  format, imageSize, data);
-
-   /* Fix up copied ASTC blocks if necessary */
-   GLenum gl_format = _mesa_compressed_format_to_glenum(ctx,
-texImage->TexFormat);
-   bool is_linear_astc = _mesa_is_astc_format(gl_format) &&
-!_mesa_is_srgb_format(gl_format);
-   struct brw_context *brw = (struct brw_context*) ctx;
-   const struct gen_device_info *devinfo = &brw->screen->devinfo;
-   if (devinfo->gen == 9 && !gen_device_info_is_9lp(devinfo) && is_linear_astc)
-  flush_astc_denorms(ctx, dims, texImage,
- xoffset, yoffset, zoffset,
- width, height, depth);
-}
-
 void
 intelInitTextureImageFuncs(struct dd_function_table *functions)
 {
functions->TexImage = intelTexImage;
functions->TexSubImage = intelTexSubImage;
-   functions->CompressedTexSubImage = intelCompressedTexSubImage;
functions->EGLImageTargetTexture2D = intel_image_target_texture_2d;
functions->BindRenderbufferTexImage = intel_bind_renderbuffer_tex_image;
functions->GetTexSubImage = intel_get_tex_sub_image;
-- 
2.19.0



[Mesa-dev] [PATCH 0/9] i965: Re-implement the gen9 void-extent ASTC WA with BLORP

2018-09-26 Thread Nanley Chery
The current workaround has two issues. It causes significant slow-downs [1] in
application startup times and uses the modified ASTC blocks for non-sampling
operations. This can result in incorrect texture downloads.

This series addresses the latter issue by keeping two copies of an ASTC
miptree: one that's been modified for the sampler bug (the shadow) and another
that hasn't (the main). The main copy is used for pixel transfer operations and
the shadow is used for sampling within a shader. The former issue is addressed
by exchanging multiple GTT-mapped memory accesses at texture upload time with a
render engine read and write at sampling time.

At the moment, I don't have any empirical data on the performance
implications nor on the bug fixes. I'm trying to get my hands on one of
the affected benchmarks. This series does pass our CI system.

1. 17 seconds were saved by avoiding it in commit:
   3e56e4642fb5875b3f5c4eb34798ba9f3d827705
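
To make the main/shadow split easier to picture, here is a minimal,
self-contained C sketch of the bookkeeping. Everything prefixed with toy_ is
a hypothetical stand-in for illustration and is not the i965 code: writes go
to the main copy and mark the shadow stale, and the shadow is refreshed
lazily just before sampling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_SIZE 16

/* Hypothetical stand-in for a miptree that may carry a shadow copy. */
struct toy_miptree {
   uint8_t main[TOY_SIZE];    /* unmodified data, used for pixel transfers */
   uint8_t shadow[TOY_SIZE];  /* patched data, used only for sampling */
   bool has_shadow;
   bool shadow_needs_update;
};

/* Any write to the main copy marks the shadow as stale. */
static void
toy_finish_write(struct toy_miptree *mt, const uint8_t *data, size_t size)
{
   assert(size <= TOY_SIZE);
   memcpy(mt->main, data, size);
   if (mt->has_shadow)
      mt->shadow_needs_update = true;
}

/* Called before sampling. Refreshes the shadow only if it is stale and
 * returns the buffer the sampler should read from.
 */
static const uint8_t *
toy_prepare_sample(struct toy_miptree *mt)
{
   if (!mt->has_shadow)
      return mt->main;

   if (mt->shadow_needs_update) {
      /* A real workaround would patch the blocks while copying. */
      memcpy(mt->shadow, mt->main, TOY_SIZE);
      mt->shadow_needs_update = false;
   }
   return mt->shadow;
}
```

The point of the flag is that repeated sampling without intervening writes
costs no extra copies.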

Nanley Chery (9):
  i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*
  i965/miptree: Allocate a shadow_mt for an ASTC WA
  i965/miptree: Track the staleness of the ASTC shadow
  intel/blorp_blit: Fix ptr deref in convert_to_uncompressed
  intel/blorp_blit: Add blorp_copy_astc_wa
  i965/blorp: Drop tmp_surfs from surf_for_miptree
  i965: Do a WA blit between ASTC main and shadow
  i965/surface_state: Use the ASTC shadow_mt if present
  i965/tex_image: Drop intelCompressedTexSubImage

 src/intel/blorp/blorp.h   |   6 +
 src/intel/blorp/blorp_blit.c  | 158 +-
 src/intel/blorp/blorp_priv.h  |   1 +
 src/mesa/drivers/dri/i965/brw_blorp.c |  56 ---
 src/mesa/drivers/dri/i965/brw_blorp.h |   6 +
 src/mesa/drivers/dri/i965/brw_draw.c  |  16 ++
 .../drivers/dri/i965/brw_wm_surface_state.c   |  11 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c |  46 -
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  21 ++-
 src/mesa/drivers/dri/i965/intel_tex_image.c   |  87 --
 10 files changed, 276 insertions(+), 132 deletions(-)

-- 
2.19.0



[Mesa-dev] [PATCH 1/9] i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*

2018-09-26 Thread Nanley Chery
Use more generic field names. We'll reuse these fields for a workaround
with ASTC miptrees.
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  8 
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 16 
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h| 14 +++---
 3 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 8d21cf5fa70..e214fae140b 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -563,15 +563,15 @@ static void brw_update_texture_surface(struct gl_context 
*ctx,
 
   if (obj->StencilSampling && firstImage->_BaseFormat == GL_DEPTH_STENCIL) 
{
  if (devinfo->gen <= 7) {
-assert(mt->r8stencil_mt && 
!mt->stencil_mt->r8stencil_needs_update);
-mt = mt->r8stencil_mt;
+assert(mt->shadow_mt && !mt->stencil_mt->shadow_needs_update);
+mt = mt->shadow_mt;
  } else {
 mt = mt->stencil_mt;
  }
  format = ISL_FORMAT_R8_UINT;
   } else if (devinfo->gen <= 7 && mt->format == MESA_FORMAT_S_UINT8) {
- assert(mt->r8stencil_mt && !mt->r8stencil_needs_update);
- mt = mt->r8stencil_mt;
+ assert(mt->shadow_mt && !mt->shadow_needs_update);
+ mt = mt->shadow_mt;
  format = ISL_FORMAT_R8_UINT;
   }
 
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index e32641f4098..332c5d88f58 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1214,7 +1214,7 @@ intel_miptree_release(struct intel_mipmap_tree **mt)
 
   brw_bo_unreference((*mt)->bo);
   intel_miptree_release(&(*mt)->stencil_mt);
-  intel_miptree_release(&(*mt)->r8stencil_mt);
+  intel_miptree_release(&(*mt)->shadow_mt);
   intel_miptree_aux_buffer_free((*mt)->aux_buf);
   free_aux_state_map((*mt)->aux_state);
 
@@ -2427,7 +2427,7 @@ intel_miptree_finish_write(struct brw_context *brw,
switch (mt->aux_usage) {
case ISL_AUX_USAGE_NONE:
   if (mt->format == MESA_FORMAT_S_UINT8 && devinfo->gen <= 7)
- mt->r8stencil_needs_update = true;
+ mt->shadow_needs_update = true;
   break;
 
case ISL_AUX_USAGE_MCS:
@@ -2933,9 +2933,9 @@ intel_update_r8stencil(struct brw_context *brw,
 
assert(src->surf.size_B > 0);
 
-   if (!mt->r8stencil_mt) {
+   if (!mt->shadow_mt) {
   assert(devinfo->gen > 6); /* Handle MIPTREE_LAYOUT_GEN6_HIZ_STENCIL */
-  mt->r8stencil_mt = make_surface(
+  mt->shadow_mt = make_surface(
 brw,
 src->target,
 MESA_FORMAT_R_UINT8,
@@ -2949,13 +2949,13 @@ intel_update_r8stencil(struct brw_context *brw,
 ISL_TILING_Y0_BIT,
 ISL_SURF_USAGE_TEXTURE_BIT,
 BO_ALLOC_BUSY, 0, NULL);
-  assert(mt->r8stencil_mt);
+  assert(mt->shadow_mt);
}
 
-   if (src->r8stencil_needs_update == false)
+   if (src->shadow_needs_update == false)
   return;
 
-   struct intel_mipmap_tree *dst = mt->r8stencil_mt;
+   struct intel_mipmap_tree *dst = mt->shadow_mt;
 
for (int level = src->first_level; level <= src->last_level; level++) {
   const unsigned depth = src->surf.dim == ISL_SURF_DIM_3D ?
@@ -2975,7 +2975,7 @@ intel_update_r8stencil(struct brw_context *brw,
}
 
brw_cache_flush_for_read(brw, dst->bo);
-   src->r8stencil_needs_update = false;
+   src->shadow_needs_update = false;
 }
 
 static void *
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index 708757c47b8..d75c93b8b42 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -294,16 +294,16 @@ struct intel_mipmap_tree
struct intel_mipmap_tree *stencil_mt;
 
/**
-* \brief Stencil texturing miptree for sampling from a stencil texture
+* \brief Shadow miptree for sampling when the main isn't supported by HW.
 *
-* Some hardware doesn't support sampling from the stencil texture as
-* required by the GL_ARB_stencil_texturing extenion. To workaround this we
-* blit the texture into a new texture that can be sampled.
+* To workaround various sampler bugs and limitations, we blit the main
+* texture into a new texture that can be sampled.
 *
-* \see intel_update_r8stencil()
+* This miptree may be used for:
+* - Stencil texturing (pre-BDW) as required by GL_ARB_stencil_texturing.
 */
-   struct intel_mipmap_tree *r8stencil_mt;
-   bool r8stencil_needs_update;
+   struct intel_mipmap_tree *shadow_mt;
+   bool shadow_needs_update;
 
/**
 * \brief CCS, MCS, or HiZ auxiliary buffer.
-- 
2.19.0


[Mesa-dev] [PATCH 7/9] i965: Do a WA blit between ASTC main and shadow

2018-09-26 Thread Nanley Chery
Perform a workaround blit prior to sampling from the ASTC miptree.
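
As a rough illustration of how many copies this implies, the sketch below
walks the same (level, layer) space the update loop covers. The toy_ names
are hypothetical; toy_minify assumes the usual "halve per level, never below
one" rule, and 3D surfaces shrink in depth per level while array surfaces
keep a constant layer count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: halve a dimension per mip level, clamped to 1. */
static unsigned
toy_minify(unsigned value, unsigned level)
{
   assert(level < 32);
   unsigned v = value >> level;
   return v > 0 ? v : 1;
}

/* Count the workaround copies needed for levels first_level..last_level. */
static unsigned
toy_count_wa_copies(bool is_3d, unsigned depth_or_array_len,
                    unsigned first_level, unsigned last_level)
{
   unsigned copies = 0;

   for (unsigned level = first_level; level <= last_level; level++) {
      const unsigned slices = is_3d ?
         toy_minify(depth_or_array_len, level) : depth_or_array_len;

      /* One workaround copy per (level, layer) pair. */
      for (unsigned layer = 0; layer < slices; layer++)
         copies++;
   }
   return copies;
}
```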
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 20 
 src/mesa/drivers/dri/i965/brw_blorp.h |  6 ++
 src/mesa/drivers/dri/i965/brw_draw.c  | 16 
 3 files changed, 42 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index 2ebd35ae49f..6fc0b441cd0 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -266,6 +266,26 @@ swizzle_to_scs(GLenum swizzle)
return (enum isl_channel_select)((swizzle + 4) & 7);
 }
 
+void
+brw_blorp_copy_astc_wa(struct brw_context *brw,
+   struct intel_mipmap_tree *src_mt,
+   struct intel_mipmap_tree *dst_mt,
+   unsigned level, unsigned layer)
+{
+   struct blorp_surf src_surf, dst_surf;
+   unsigned src_level = level;
+   unsigned dst_level = level;
+   blorp_surf_for_miptree(brw, &src_surf, src_mt, ISL_AUX_USAGE_NONE, false,
+  &src_level, layer, 1);
+   blorp_surf_for_miptree(brw, &dst_surf, dst_mt, ISL_AUX_USAGE_NONE, true,
+  &dst_level, layer, 1);
+
+   struct blorp_batch batch;
+   blorp_batch_init(&brw->blorp, &batch, brw, 0);
+   blorp_copy_astc_wa(&batch, &src_surf, &dst_surf, dst_level, layer);
+   blorp_batch_finish(&batch);
+}
+
 /**
  * Note: if the src (or dst) is a 2D multisample array texture on Gen7+ using
  * INTEL_MSAA_LAYOUT_UMS or INTEL_MSAA_LAYOUT_CMS, src_layer (dst_layer) is
diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h 
b/src/mesa/drivers/dri/i965/brw_blorp.h
index 551e1fcdcba..ba0d5679a04 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.h
+++ b/src/mesa/drivers/dri/i965/brw_blorp.h
@@ -34,6 +34,12 @@ extern "C" {
 
 void brw_blorp_init(struct brw_context *brw);
 
+void
+brw_blorp_copy_astc_wa(struct brw_context *brw,
+   struct intel_mipmap_tree *src_mt,
+   struct intel_mipmap_tree *dst_mt,
+   unsigned level, unsigned layer);
+
 void
 brw_blorp_blit_miptrees(struct brw_context *brw,
 struct intel_mipmap_tree *src_mt,
diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index 8536c040109..772f8f8fad7 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -558,6 +558,22 @@ brw_predraw_resolve_inputs(struct brw_context *brw, bool 
rendering,
   if (tex_obj->base.StencilSampling ||
   tex_obj->mt->format == MESA_FORMAT_S_UINT8) {
  intel_update_r8stencil(brw, tex_obj->mt);
+  } else if (intel_miptree_has_astc_shadow(tex_obj->mt) &&
+ tex_obj->mt->shadow_needs_update) {
+ struct intel_mipmap_tree *src = tex_obj->mt;
+ struct intel_mipmap_tree *dst = src->shadow_mt;
+
+ for (int level = src->first_level; level <= src->last_level; level++) 
{
+const unsigned depth = src->surf.dim == ISL_SURF_DIM_3D ?
+   minify(src->surf.logical_level0_px.depth, level) :
+   src->surf.logical_level0_px.array_len;
+
+for (unsigned layer = 0; layer < depth; layer++) {
+   brw_blorp_copy_astc_wa(brw, src, dst, level, layer);
+}
+ }
+ brw_cache_flush_for_read(brw, dst->bo);
+ src->shadow_needs_update = false;
   }
}
 
-- 
2.19.0



[Mesa-dev] [PATCH 5/9] intel/blorp_blit: Add blorp_copy_astc_wa

2018-09-26 Thread Nanley Chery
Add a function which copies blocks from one ASTC surface to another,
patching them up as necessary.
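
For readers who want the patch-up rule without the NIR plumbing, here is a
CPU-side sketch of the same logic on one 128-bit block viewed as four 32-bit
words. It is only an illustration under the assumptions stated in the shader
comments of this patch (low 12 bits are the header, 0xDFC marks an LDR
void-extent block, the high 64 bits hold four UNORM16 channels);
toy_astc_patch_block is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* texel[0] holds bits 31:0 of the ASTC block, texel[3] holds bits 127:96,
 * matching an RGBA32_UINT view of the block.
 */
static void
toy_astc_patch_block(uint32_t texel[4])
{
   assert(texel != NULL);

   /* Only LDR void-extent blocks need patching. */
   if ((texel[0] & 0xFFF) != 0xDFC)
      return;

   /* Each of the two high words packs two UNORM16 channels. Flush any
    * channel whose value is less than 4 to zero.
    */
   for (int word = 2; word < 4; word++) {
      uint32_t lo = texel[word] & 0xFFFF;
      uint32_t hi = texel[word] >> 16;
      if (lo < 4)
         lo = 0;
      if (hi < 4)
         hi = 0;
      texel[word] = (hi << 16) | lo;
   }
}
```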
---
 src/intel/blorp/blorp.h  |   6 ++
 src/intel/blorp/blorp_blit.c | 153 +++
 src/intel/blorp/blorp_priv.h |   1 +
 3 files changed, 160 insertions(+)

diff --git a/src/intel/blorp/blorp.h b/src/intel/blorp/blorp.h
index ee343a4a6bb..67df3ff26b0 100644
--- a/src/intel/blorp/blorp.h
+++ b/src/intel/blorp/blorp.h
@@ -152,6 +152,12 @@ blorp_copy(struct blorp_batch *batch,
uint32_t dst_x, uint32_t dst_y,
uint32_t src_width, uint32_t src_height);
 
+void
+blorp_copy_astc_wa(struct blorp_batch *batch,
+   const struct blorp_surf *src_surf,
+   const struct blorp_surf *dst_surf,
+   unsigned src_level, unsigned src_layer);
+
 void
 blorp_buffer_copy(struct blorp_batch *batch,
   struct blorp_address src,
diff --git a/src/intel/blorp/blorp_blit.c b/src/intel/blorp/blorp_blit.c
index 7c4e569e44c..442f7227c0a 100644
--- a/src/intel/blorp/blorp_blit.c
+++ b/src/intel/blorp/blorp_blit.c
@@ -2658,6 +2658,159 @@ blorp_copy(struct blorp_batch *batch,
do_blorp_blit(batch, &params, &wm_prog_key, &coords);
 }
 
+/* Try to add a pixel shader kernel for the ASTC WA to params. */
+static bool
+get_copy_astc_wa_kernel(struct blorp_context *blorp,
+struct blorp_params *params)
+{
+   /* Use the shader in our cache if it already exists. */
+   enum blorp_shader_type astc_wa_key = BLORP_SHADER_TYPE_ASTC_VOID_EXTENT_WA;
+   if (blorp->lookup_shader(blorp, &astc_wa_key, sizeof(astc_wa_key),
+&params->wm_prog_kernel, &params->wm_prog_data))
+  return true;
+
+   /* Otherwise, build the kernel now. */
+   void *mem_ctx = ralloc_context(NULL);
+
+   const unsigned *program;
+   struct brw_wm_prog_data prog_data;
+
+   nir_builder b;
+   nir_builder_init_simple_shader(&b, mem_ctx, MESA_SHADER_FRAGMENT, NULL);
+   b.shader->info.name =
+  ralloc_strdup(b.shader, "BLORP-ASTC-void-extent-wa-copy");
+
+   /* Input: Perform a texelfetch on the 2D RGBA32UI texture */
+   nir_ssa_def *frag_coord_u = nir_f2i32(&b, blorp_nir_frag_coord(&b));
+   nir_ssa_def *pos = nir_vec2(&b, nir_channel(&b, frag_coord_u, 0),
+   nir_channel(&b, frag_coord_u, 1));
+   nir_tex_instr *tex = nir_tex_instr_create(b.shader, 2);
+   nir_ssa_dest_init(&tex->instr, &tex->dest, 4, 32, NULL);
+
+   tex->texture_index = 0;
+   tex->sampler_index = 0;
+
+   tex->op = nir_texop_txf;
+   tex->sampler_dim = GLSL_SAMPLER_DIM_2D;
+   tex->dest_type = nir_type_uint;
+   tex->src[0].src_type = nir_tex_src_coord;
+   tex->src[0].src = nir_src_for_ssa(pos);
+   tex->coord_components = 2;
+   tex->src[1].src_type = nir_tex_src_lod;
+   tex->src[1].src = nir_src_for_ssa(nir_imm_int(&b, 0));
+
+   nir_builder_instr_insert(&b, &tex->instr);
+
+   /* Output: Declare the fragment color */
+   nir_variable *frag_color =
+  nir_variable_create(b.shader, nir_var_shader_out,
+  glsl_uvec4_type(), "gl_FragColor");
+   frag_color->data.location = FRAG_RESULT_COLOR;
+
+   /* Main: Patch up the fetched block as needed.
+*
+* An ASTC block is stored in little endian mode. The byte that contains
+* bits 0..7 is stored at the lower address in memory.
+*
+* The low 12 bits contain the header which can indicate an LDR void-extent
+* block.
+*
+* If this is such a block, the high 64 bits contain 4 UNORM16s which must
+* be set to 0 if their values are less than 4.
+*
+* The PRMs describe formats as being stored in little-endian pixel order.
+* Since we're viewing this texture as an RGBA32_UINT, this means R will
+* contain bits 31:0 of the ASTC block, G will contain 63:32, and so on.
+*/
+
+   /* Check if the header indicates an LDR void-extent block */
+   nir_ssa_def *header = nir_iand(&b, nir_channel(&b, &tex->dest.ssa, 0),
+  nir_imm_int(&b, 0xFFF));
+   nir_ssa_def *ve_header = nir_imm_int(&b, 0xDFC);
+   nir_if *if_stmt = nir_if_create(b.shader);
+   if_stmt->condition = nir_src_for_ssa(nir_ieq(&b, header, ve_header));
+   nir_cf_node_insert(b.cursor, &if_stmt->cf_node);
+   b.cursor = nir_after_cf_list(&if_stmt->then_list);
+   
+   /* Go from AB32 to ABGR16 */
+   nir_ssa_def *AB32 = nir_vec2(&b, nir_channel(&b, &tex->dest.ssa, 3),
+nir_channel(&b, &tex->dest.ssa, 2));
+   nir_ssa_def *ABGR16 = nir_format_bitcast_uvec_unmasked(&b, AB32, 32, 16);
+
+
+   /* Set the channels to 0 if less than 4. */
+   nir_ssa_def *chan_ge_4 = nir_ige(&b, ABGR16, nir_imm_ivec4(&b, 4, 4, 4, 4));
+   nir_ssa_def *ABGR16_mod = nir_iand(&b, ABGR16, chan_ge_4);
+
+
+   /* Store the modified block */
+   nir_ssa_def *AB32_mod =
+  nir_format_bitcast_uvec_unmasked(&b, ABGR16_mod, 16, 32);
+   nir_ssa_def *color = nir_vec4(&b, nir_channel(&b, &tex->dest.ssa, 0),
+ nir_channel(&b, &tex->dest.ssa, 1),
+ 

Re: [Mesa-dev] [PATCH 1/2] anv: s/batch/value_bo/ on anv_device_init_hiz_clear_batch

2018-09-26 Thread Nanley Chery
On Tue, Sep 25, 2018 at 04:26:57PM -0700, Jordan Justen wrote:
> Signed-off-by: Jordan Justen 
> Cc: Jason Ekstrand 
> ---
>  src/intel/vulkan/anv_device.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 

This series is
Reviewed-by: Nanley Chery 

> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index 4219a073d2d..265fc4a3347 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -1566,7 +1566,7 @@ vk_priority_to_gen(int priority)
>  }
>  
>  static void
> -anv_device_init_hiz_clear_batch(struct anv_device *device)
> +anv_device_init_hiz_clear_value_bo(struct anv_device *device)
>  {
> anv_bo_init_new(&device->hiz_clear_bo, device, 4096);
> uint32_t *map = anv_gem_mmap(device, device->hiz_clear_bo.gem_handle,
> @@ -1802,7 +1802,7 @@ VkResult anv_CreateDevice(
> anv_device_init_trivial_batch(device);
>  
> if (device->info.gen >= 10)
> -  anv_device_init_hiz_clear_batch(device);
> +  anv_device_init_hiz_clear_value_bo(device);
>  
> anv_scratch_pool_init(device, &device->scratch_pool);
>  
> -- 
> 2.18.0
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] anv: If softpin is supported, use it with hiz clear batch bo

2018-09-25 Thread Nanley Chery
On Tue, Sep 25, 2018 at 03:22:11PM -0700, Jordan Justen wrote:
> Signed-off-by: Jordan Justen 
> Cc: Nanley Chery 
> ---
>  src/intel/vulkan/anv_device.c | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index 0ea8be052fa..4e446c3280a 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -1574,6 +1574,15 @@ static void
>  anv_device_init_hiz_clear_batch(struct anv_device *device)
>  {
> anv_bo_init_new(&device->hiz_clear_bo, device, 4096);
> +
> +   if (device->instance->physicalDevice.has_exec_async)
> +  device->hiz_clear_bo.flags |= EXEC_OBJECT_ASYNC;
> +
> +   if (device->instance->physicalDevice.use_softpin)
> +  device->hiz_clear_bo.flags |= EXEC_OBJECT_PINNED;
> +
> +   anv_vma_alloc(device, &device->hiz_clear_bo);
> +

Seems like we should handle the return value of this function.
Maybe also hook into the block of gotos in anv_CreateDevice()?
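
Roughly the shape I have in mind, as a standalone sketch rather than a
drop-in change. All toy_ names are hypothetical, and toy_vma_alloc only
stands in for an allocation that reports failure through its return value:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_device {
   void *scratch;
   void *hiz_clear_bo;
   bool vma_ok;
};

/* Hypothetical allocator that can fail and says so via its return value. */
static bool
toy_vma_alloc(struct toy_device *dev, bool simulate_failure)
{
   dev->vma_ok = !simulate_failure;
   return dev->vma_ok;
}

/* Each failure jumps to the label that unwinds what was set up so far. */
static int
toy_create_device(struct toy_device *dev, bool simulate_failure)
{
   assert(dev != NULL);

   dev->scratch = malloc(64);
   if (!dev->scratch)
      goto fail;

   dev->hiz_clear_bo = malloc(4096);
   if (!dev->hiz_clear_bo)
      goto fail_scratch;

   if (!toy_vma_alloc(dev, simulate_failure))
      goto fail_bo;

   return 0;

 fail_bo:
   free(dev->hiz_clear_bo);
   dev->hiz_clear_bo = NULL;
 fail_scratch:
   free(dev->scratch);
   dev->scratch = NULL;
 fail:
   return -1;
}
```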

> uint32_t *map = anv_gem_mmap(device, device->hiz_clear_bo.gem_handle,
>  0, 4096, 0);
>  
> -- 
> 2.18.0
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] intel/isl: Add a _B suffix to some struct fields

2018-09-25 Thread Nanley Chery
On Wed, Sep 05, 2018 at 02:10:28PM -0500, Jason Ekstrand wrote:
> I was about to make the claim to someone that every field in isl_surf
> is either an enum or has explicit units.  Then I looked at isl_surf and
> discovered this claim was wrong.  We should fix that.
> 
> Cc: Chad Versace 
> ---
>  src/intel/blorp/blorp_blit.c  |   4 +-
>  src/intel/blorp/blorp_clear.c |   4 +-
>  src/intel/isl/isl.c   | 108 +-
>  src/intel/isl/isl.h   |  38 +++---
>  src/intel/isl/isl_emit_depth_stencil.c|   6 +-
>  src/intel/isl/isl_storage_image.c |   2 +-
>  src/intel/isl/isl_surface_state.c |  14 +--
>  src/intel/vulkan/anv_blorp.c  |   2 +-
>  src/intel/vulkan/anv_device.c |   6 +-
>  src/intel/vulkan/anv_image.c  |  57 -
>  src/intel/vulkan/anv_private.h|   4 +-
>  src/intel/vulkan/genX_cmd_buffer.c|   4 +-
>  src/mesa/drivers/dri/i965/brw_misc_state.c|   2 +-
>  .../drivers/dri/i965/brw_wm_surface_state.c   |   6 +-
>  src/mesa/drivers/dri/i965/intel_blit.c|  10 +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c |  66 +--
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |   2 +-
>  .../drivers/dri/i965/intel_pixel_bitmap.c |   2 +-
>  src/mesa/drivers/dri/i965/intel_pixel_read.c  |   2 +-
>  src/mesa/drivers/dri/i965/intel_screen.c  |  28 ++---
>  src/mesa/drivers/dri/i965/intel_tex_image.c   |   8 +-
>  21 files changed, 188 insertions(+), 187 deletions(-)
> 

In addition to adding _B to struct fields, this patch also:
* adds _B to some variables and parameters
* renames row_pitch_tiles -> row_pitch_tl

Perhaps update the title to reflect this? Not a big deal though.

With or without an updated title, this patch is
Reviewed-by: Nanley Chery 

> diff --git a/src/intel/blorp/blorp_blit.c b/src/intel/blorp/blorp_blit.c
> index 60cb32641d6..efa2cb08964 100644
> --- a/src/intel/blorp/blorp_blit.c
> +++ b/src/intel/blorp/blorp_blit.c
> @@ -2108,7 +2108,7 @@ shrink_surface_params(const struct isl_device *dev,
> x_offset_sa = (uint32_t)*x0 * px_size_sa.w + info->tile_x_sa;
> y_offset_sa = (uint32_t)*y0 * px_size_sa.h + info->tile_y_sa;
> isl_tiling_get_intratile_offset_sa(info->surf.tiling,
> -  info->surf.format, 
> info->surf.row_pitch,
> +  info->surf.format, 
> info->surf.row_pitch_B,
>x_offset_sa, y_offset_sa,
>&byte_offset,
>&info->tile_x_sa, &info->tile_y_sa);
> @@ -2708,7 +2708,7 @@ do_buffer_copy(struct blorp_batch *batch,
>.levels = 1,
>.array_len = 1,
>.samples = 1,
> -  .row_pitch = width * block_size,
> +  .row_pitch_B = width * block_size,
>.usage = ISL_SURF_USAGE_TEXTURE_BIT |
> ISL_SURF_USAGE_RENDER_TARGET_BIT,
>.tiling_flags = ISL_TILING_LINEAR_BIT);
> diff --git a/src/intel/blorp/blorp_clear.c b/src/intel/blorp/blorp_clear.c
> index b4c744020d9..5b575dccc22 100644
> --- a/src/intel/blorp/blorp_clear.c
> +++ b/src/intel/blorp/blorp_clear.c
> @@ -1089,7 +1089,7 @@ blorp_ccs_ambiguate(struct blorp_batch *batch,
> isl_surf_get_image_offset_el(surf->aux_surf, level, layer, z,
>  &x_offset_el, &y_offset_el);
> isl_tiling_get_intratile_offset_el(surf->aux_surf->tiling, aux_fmtl->bpb,
> -  surf->aux_surf->row_pitch,
> +  surf->aux_surf->row_pitch_B,
>x_offset_el, y_offset_el,
>&offset_B, &x_offset_el, &y_offset_el);
> params.dst.addr.offset += offset_B;
> @@ -1178,7 +1178,7 @@ blorp_ccs_ambiguate(struct blorp_batch *batch,
>  .levels = 1,
>  .array_len = 1,
>  .samples = 1,
> -.row_pitch = surf->aux_surf->row_pitch,
> +.row_pitch_B = surf->aux_surf->row_pitch_B,
>  .usage = ISL_SURF_USAGE_RENDER_TARGET_BIT,
>  .tiling_flags = ISL_TILING_Y0_BIT);
> assert(ok);
> diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> index f39d8a79995..359293cfcb2 100644
> --- a/src/intel/isl/isl.c
> +++ b/src/intel/isl/isl.c
> @@ -1261,12 +1261,12 @@ static uint32_t
>  isl_calc_linear_min_row_pitch

Re: [Mesa-dev] [PATCH] anv/cmd_buffer: Require HiZ for WM_HZ_OP stencil clears

2018-09-24 Thread Nanley Chery
On Fri, Sep 21, 2018 at 08:23:04PM +0200, Jason Ekstrand wrote:
> Rb. This also fixes simulation errors on gen9; might be worth mentioning that.
> 

Thanks for the review! It turns out that I had an old build of the gen11
simulator. It was later updated to remove this error. As such, I'm
guessing we should drop this patch... I'll send you more details.

-Nanley

> On September 21, 2018 19:12:28 Nanley Chery  wrote:
> 
> > Avoid an ICL fulsim failure. Makes 336 crucible tests under
> > func.depthstencil.stencil-triangles.clear-0x17.ref-0x17.* go from fail
> > to pass with the simulator.
> > 
> > Fixes: 2cc3445eb24af469537911277f7bc4e73a6c5670
> >   ("anv/cmd_buffer: Decide whether or not to HiZ clear up-front")
> > ---
> > src/intel/vulkan/genX_cmd_buffer.c | 12 ++--
> > 1 file changed, 6 insertions(+), 6 deletions(-)
> > 
> > diff --git a/src/intel/vulkan/genX_cmd_buffer.c
> > b/src/intel/vulkan/genX_cmd_buffer.c
> > index a9a8a41ac9d..5f5f2c114ff 100644
> > --- a/src/intel/vulkan/genX_cmd_buffer.c
> > +++ b/src/intel/vulkan/genX_cmd_buffer.c
> > @@ -374,12 +374,6 @@ depth_stencil_attachment_compute_aux_usage(struct
> > anv_device *device,
> >   return;
> >}
> > 
> > -   if (!(att_state->pending_clear_aspects & VK_IMAGE_ASPECT_DEPTH_BIT)) {
> > -  /* If we're just clearing stencil, we can always HiZ clear */
> > -  att_state->fast_clear = true;
> > -  return;
> > -   }
> > -
> >/* Default to false for now */
> >att_state->fast_clear = false;
> > 
> > @@ -387,6 +381,12 @@ depth_stencil_attachment_compute_aux_usage(struct
> > anv_device *device,
> >if (!(iview->image->aspects & VK_IMAGE_ASPECT_DEPTH_BIT))
> >   return;
> > 
> > +   if (att_state->pending_clear_aspects == VK_IMAGE_ASPECT_STENCIL_BIT) {
> > +  /* If we're just clearing stencil, we can always HiZ clear */
> > +  att_state->fast_clear = true;
> > +  return;
> > +   }
> > +
> >const enum isl_aux_usage first_subpass_aux_usage =
> >   anv_layout_to_aux_usage(&device->info, iview->image,
> >   VK_IMAGE_ASPECT_DEPTH_BIT,
> > --
> > 2.19.0
> 
> 
> 


[Mesa-dev] [PATCH] anv/cmd_buffer: Require HiZ for WM_HZ_OP stencil clears

2018-09-21 Thread Nanley Chery
Avoid an ICL fulsim failure. Makes 336 crucible tests under
func.depthstencil.stencil-triangles.clear-0x17.ref-0x17.* go from fail
to pass with the simulator.

Fixes: 2cc3445eb24af469537911277f7bc4e73a6c5670
   ("anv/cmd_buffer: Decide whether or not to HiZ clear up-front")
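
The effect of reordering the two checks can be sketched in isolation. This
is a simplified model of only the two conditions touched by the diff, with
hypothetical toy_ names and flag values; the remaining checks in the real
function are collapsed into TOY_NEEDS_MORE_CHECKS.

```c
#include <assert.h>

enum {
   TOY_ASPECT_DEPTH   = 1 << 0,
   TOY_ASPECT_STENCIL = 1 << 1,
};

enum toy_decision {
   TOY_NO_FAST_CLEAR,
   TOY_FAST_CLEAR,
   TOY_NEEDS_MORE_CHECKS,
};

/* Before the patch: a clear without the depth bit fast-clears right away. */
static enum toy_decision
toy_old_order(unsigned image_aspects, unsigned pending_clear_aspects)
{
   assert((pending_clear_aspects & ~3u) == 0);
   if (!(pending_clear_aspects & TOY_ASPECT_DEPTH))
      return TOY_FAST_CLEAR;
   if (!(image_aspects & TOY_ASPECT_DEPTH))
      return TOY_NO_FAST_CLEAR;
   return TOY_NEEDS_MORE_CHECKS;
}

/* After the patch: the image must have a depth aspect first. */
static enum toy_decision
toy_new_order(unsigned image_aspects, unsigned pending_clear_aspects)
{
   assert((pending_clear_aspects & ~3u) == 0);
   if (!(image_aspects & TOY_ASPECT_DEPTH))
      return TOY_NO_FAST_CLEAR;
   if (pending_clear_aspects == TOY_ASPECT_STENCIL)
      return TOY_FAST_CLEAR;
   return TOY_NEEDS_MORE_CHECKS;
}
```

In this model the only case that changes from fast clear to no fast clear is
a stencil clear on an image without a depth aspect.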
---
 src/intel/vulkan/genX_cmd_buffer.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index a9a8a41ac9d..5f5f2c114ff 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -374,12 +374,6 @@ depth_stencil_attachment_compute_aux_usage(struct 
anv_device *device,
   return;
}
 
-   if (!(att_state->pending_clear_aspects & VK_IMAGE_ASPECT_DEPTH_BIT)) {
-  /* If we're just clearing stencil, we can always HiZ clear */
-  att_state->fast_clear = true;
-  return;
-   }
-
/* Default to false for now */
att_state->fast_clear = false;
 
@@ -387,6 +381,12 @@ depth_stencil_attachment_compute_aux_usage(struct 
anv_device *device,
if (!(iview->image->aspects & VK_IMAGE_ASPECT_DEPTH_BIT))
   return;
 
+   if (att_state->pending_clear_aspects == VK_IMAGE_ASPECT_STENCIL_BIT) {
+  /* If we're just clearing stencil, we can always HiZ clear */
+  att_state->fast_clear = true;
+  return;
+   }
+
const enum isl_aux_usage first_subpass_aux_usage =
   anv_layout_to_aux_usage(&device->info, iview->image,
   VK_IMAGE_ASPECT_DEPTH_BIT,
-- 
2.19.0
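
For readers comparing the two conditions in this patch, here is a minimal
standalone sketch. The aspect bit values are local copies of the Vulkan
ones, and the helper names are hypothetical (they do not exist in anv).
The old test is true whenever depth is not being cleared, which includes
the case where no clear is pending at all; the new test is true only when
stencil is the sole pending aspect. In the patch the check is also moved
below the early return for images without a depth aspect, which this
sketch does not model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the Vulkan aspect bit values, so the sketch builds
 * without the Vulkan headers. */
#define ASPECT_DEPTH_BIT   0x2u
#define ASPECT_STENCIL_BIT 0x4u

/* Old check: true whenever depth is not being cleared, which also
 * covers the case where nothing is pending. */
static inline bool old_is_stencil_only_clear(uint32_t pending_clear_aspects)
{
   return !(pending_clear_aspects & ASPECT_DEPTH_BIT);
}

/* New check: true only when stencil is the sole pending clear aspect. */
static inline bool new_is_stencil_only_clear(uint32_t pending_clear_aspects)
{
   return pending_clear_aspects == ASPECT_STENCIL_BIT;
}
```

The two predicates agree for stencil-only and depth+stencil clears and
differ only when pending_clear_aspects is zero.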



Re: [Mesa-dev] [PATCH 2/4] anv/so_memcpy: Use the correct SO_BUFFER size on gen8+

2018-09-14 Thread Nanley Chery
On Wed, Sep 12, 2018 at 12:06:49AM -0500, Jason Ekstrand wrote:
> This shouldn't matter as we'll never write OOB anyway but we may as well
> get it right.  It's supposed to be in dwords - 1.
> ---
>  src/intel/vulkan/genX_gpu_memcpy.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 

This patch is
Reviewed-by: Nanley Chery 
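
As a small illustration of the "dwords - 1" encoding mentioned in the
commit message, the conversion from a byte count could be sketched as
below. The helper name is hypothetical, and the sketch assumes the size
is a non-zero multiple of 4 bytes; it is not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a byte count into the "dwords - 1"
 * encoding. Assumes size_in_bytes is a non-zero multiple of 4. */
static inline uint32_t so_buffer_surface_size(uint32_t size_in_bytes)
{
   assert(size_in_bytes > 0 && size_in_bytes % 4 == 0);
   return size_in_bytes / 4 - 1;
}
```

For example, a 64-byte copy is 16 dwords, so the encoded value is 15,
whereas the old "size - 1" expression would have produced 63.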

> diff --git a/src/intel/vulkan/genX_gpu_memcpy.c 
> b/src/intel/vulkan/genX_gpu_memcpy.c
> index 57abd8cd5c1..cba820a1866 100644
> --- a/src/intel/vulkan/genX_gpu_memcpy.c
> +++ b/src/intel/vulkan/genX_gpu_memcpy.c
> @@ -222,7 +222,7 @@ genX(cmd_buffer_so_memcpy)(struct anv_cmd_buffer 
> *cmd_buffer,
>  
>  #if GEN_GEN >= 8
>sob.SOBufferEnable = true;
> -  sob.SurfaceSize = size - 1;
> +  sob.SurfaceSize = size / 4 - 1;
>  #else
>sob.SurfacePitch = bs;
>sob.SurfaceEndAddress = sob.SurfaceBaseAddress;
> -- 
> 2.17.1
> 


Re: [Mesa-dev] [PATCH v2] anv/blorp: Do more flushing around HiZ clears

2018-08-31 Thread Nanley Chery
On Fri, Aug 31, 2018 at 05:15:53PM -0500, Jason Ekstrand wrote:
> We make the flush after a HiZ clear unconditional and add a flush/stall
> before the clear as well.
> 
> Cc: mesa-sta...@lists.freedesktop.org
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107760
> ---
>  src/intel/vulkan/anv_blorp.c | 44 +++-
>  1 file changed, 33 insertions(+), 11 deletions(-)
> 
> diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
> index 35b304f92b3..04bca4d261f 100644
> --- a/src/intel/vulkan/anv_blorp.c
> +++ b/src/intel/vulkan/anv_blorp.c
> @@ -1604,6 +1604,24 @@ anv_image_hiz_clear(struct anv_cmd_buffer *cmd_buffer,
> ISL_AUX_USAGE_NONE, );
> }
>  
> +   /* From the Sky Lake PRM Volume 7, "Depth Buffer Clear":
> +*
> +*"The following is required when performing a depth buffer clear with
> +*using the WM_STATE or 3DSTATE_WM:
> +*
> +*   * If other rendering operations have preceded this clear, a
> +* PIPE_CONTROL with depth cache flush enabled, Depth Stall bit
> +* enabled must be issued before the rectangle primitive used for
> +* the depth buffer clear operation.
> +*   * [...]"
> +*
> +* Even though the PRM only says that this is required if using 3DSTATE_WM
> +* and a 3DPRIMITIVE, it appears to also sometimes hang when doing a clear
> +* with WM_HZ_OP.
   ^
This part was a little hard to parse: the PRM text doesn't mention the
hardware hanging, and the subject changes from the pipecontrol to the
GPU. Maybe replace it with something like the following?

   the GPU appears to also need this to avoid occasional hangs when
   doing a clear with WM_HZ_OP.

We could discuss it more on IRC if you prefer.

With that changed, this patch is
Reviewed-by: Nanley Chery 


> +*/
> +   cmd_buffer->state.pending_pipe_bits |=
> +  ANV_PIPE_DEPTH_CACHE_FLUSH_BIT | ANV_PIPE_DEPTH_STALL_BIT;
> +
> blorp_hiz_clear_depth_stencil(, , ,
>   level, base_layer, layer_count,
>   area.offset.x, area.offset.y,
> @@ -1618,18 +1636,22 @@ anv_image_hiz_clear(struct anv_cmd_buffer *cmd_buffer,
>  
> /* From the SKL PRM, Depth Buffer Clear:
>  *
> -* Depth Buffer Clear Workaround
> -* Depth buffer clear pass using any of the methods (WM_STATE, 3DSTATE_WM
> -* or 3DSTATE_WM_HZ_OP) must be followed by a PIPE_CONTROL command with
> -* DEPTH_STALL bit and Depth FLUSH bits “set” before starting to render.
> -* DepthStall and DepthFlush are not needed between consecutive depth 
> clear
> -* passes nor is it required if the depth-clear pass was done with
> -* “full_surf_clear” bit set in the 3DSTATE_WM_HZ_OP.
> +*"Depth Buffer Clear Workaround
> +*
> +*Depth buffer clear pass using any of the methods (WM_STATE,
> +*3DSTATE_WM or 3DSTATE_WM_HZ_OP) must be followed by a PIPE_CONTROL
> +*command with DEPTH_STALL bit and Depth FLUSH bits “set” before
> +*starting to render.  DepthStall and DepthFlush are not needed 
> between
> +*consecutive depth clear passes nor is it required if the depth-clear
> +*pass was done with “full_surf_clear” bit set in the
> +*3DSTATE_WM_HZ_OP."
> +*
> +* Even though the PRM provides a bunch of conditions under which this is
> +* supposedly unnecessary, we choose to perform the flush unconditionally
> +* just to be safe.
>  */
> -   if (aspects & VK_IMAGE_ASPECT_DEPTH_BIT) {
> -  cmd_buffer->state.pending_pipe_bits |=
> - ANV_PIPE_DEPTH_CACHE_FLUSH_BIT | ANV_PIPE_DEPTH_STALL_BIT;
> -   }
> +   cmd_buffer->state.pending_pipe_bits |=
> +  ANV_PIPE_DEPTH_CACHE_FLUSH_BIT | ANV_PIPE_DEPTH_STALL_BIT;
>  }
>  
>  void
> -- 
> 2.17.1
> 


Re: [Mesa-dev] [PATCH] anv/blorp: Emit depth flush and stall prior to HiZ clears

2018-08-31 Thread Nanley Chery
On Fri, Aug 31, 2018 at 04:04:22PM -0500, Jason Ekstrand wrote:
> We had the flush/stall after the clear but missed the one that needs to
> go before the clear.
> 

Does this fix the GPU Hang in DiRT 3?

> Cc: mesa-sta...@lists.freedesktop.org
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107760
> ---
>  src/intel/vulkan/anv_blorp.c | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
> index 3dfc8087630..532e8185c0e 100644
> --- a/src/intel/vulkan/anv_blorp.c
> +++ b/src/intel/vulkan/anv_blorp.c
> @@ -1605,6 +1605,16 @@ anv_image_hiz_clear(struct anv_cmd_buffer *cmd_buffer,
> ISL_AUX_USAGE_NONE, );
> }
>  
> +   /* From the Sky Lake PRM Volume 7, "Depth Buffer Clear":
> +*
> +*"If other rendering operations have preceded this clear, a
> +*PIPE_CONTROL with depth cache flush enabled, Depth Stall bit enabled
> +*must be issued before the rectangle primitive used for the depth
> +*buffer clear operation."
> +*/
> +   cmd_buffer->state.pending_pipe_bits |=
> +  ANV_PIPE_DEPTH_CACHE_FLUSH_BIT | ANV_PIPE_DEPTH_STALL_BIT;
> +

The PRMs say this pipecontrol is needed only if you're doing a clear
with WM_STATE or 3DSTATE_WM.

I wonder if we should be doing the pipecontrol that comes after
blorp_hiz_clear_depth_stencil() in the case of stencil-only HIZ clears
as well?

If that doesn't fix it, I think it'd be good to comment that we've
observed this pipecontrol to be necessary for 3DSTATE_WM_HZ_OP.

> blorp_hiz_clear_depth_stencil(, , ,
>   level, base_layer, layer_count,
>   area.offset.x, area.offset.y,
> -- 
> 2.17.1
> 


Re: [Mesa-dev] [PATCH v2] i965/gen7_urb: Re-emit PUSH_CONSTANT_ALLOC on some gen9

2018-08-30 Thread Nanley Chery
On Thu, Aug 30, 2018 at 02:37:40PM -0700, Kenneth Graunke wrote:
> On Wednesday, August 29, 2018 1:38:51 PM PDT Nanley Chery wrote:
> > According to internal docs, some gen9 platforms have a pixel shader push
> > constant synchronization issue. Although not listed among said
> > platforms, this issue seems to be present on the GeminiLake 2x6's we've
> > tested.
> > 
> > We consider the available workarounds to be too detrimental to
> > performance. Instead, we mitigate the issue by applying part of one of
> > the workarounds. Re-emit PUSH_CONSTANT_ALLOC at the top of every batch
> > (as suggested by Ken).
> > 
> > Fixes ext_framebuffer_multisample-accuracy piglit test failures with the
> > following options:
> > * 6 depth_draw small depthstencil
> > * 8 stencil_draw small depthstencil
> > * 6 stencil_draw small depthstencil
> > * 8 depth_resolve small
> > * 6 stencil_resolve small depthstencil
> > * 4 stencil_draw small depthstencil
> > * 16 stencil_draw small depthstencil
> > * 16 depth_draw small depthstencil
> > * 2 stencil_resolve small depthstencil
> > * 6 stencil_draw small
> > * all_samples stencil_draw small
> > * 2 depth_draw small depthstencil
> > * all_samples depth_draw small depthstencil
> > * all_samples stencil_resolve small
> > * 4 depth_draw small depthstencil
> > * all_samples depth_draw small
> > * all_samples stencil_draw small depthstencil
> > * 4 stencil_resolve small depthstencil
> > * 4 depth_resolve small depthstencil
> > * all_samples stencil_resolve small depthstencil
> > 
> > v2: Include more platforms in WA (Ken).
> > 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106865
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93355
> > Cc: 
> > Tested-by: Mark Janes 
> > ---
> >  src/mesa/drivers/dri/i965/gen7_urb.c | 28 
> >  1 file changed, 28 insertions(+)
> > 
> > I'm not sure I have enough information about what's happening in the HW
> > to create a piglit test for this issue.
> > 
> > diff --git a/src/mesa/drivers/dri/i965/gen7_urb.c 
> > b/src/mesa/drivers/dri/i965/gen7_urb.c
> > index 2e5f8e60ba9..e7259fc1b8d 100644
> > --- a/src/mesa/drivers/dri/i965/gen7_urb.c
> > +++ b/src/mesa/drivers/dri/i965/gen7_urb.c
> > @@ -118,6 +118,33 @@ gen7_emit_push_constant_state(struct brw_context *brw, unsigned vs_size,
> > const struct gen_device_info *devinfo = &brw->screen->devinfo;
> > unsigned offset = 0;
> >  
> > +   /* From the SKL PRM, Workarounds section (#878):
> > +*
> > +*Push constant buffer corruption possible. WA: Insert 2 zero-length
> > +*PushConst_PS before every intended PushConst_PS update, issue a
> > +*NULLPRIM after each of the zero len PC update to make sure CS 
> > commits
> > +*them.
> > +*
> > +* This workaround is attempting to solve a pixel shader push constant
> > +* synchronization issue.
> > +*
> > +* There's an unpublished WA that involves re-emitting
> > +* 3DSTATE_PUSH_CONSTANT_ALLOC_PS for every 500-ish 3DSTATE_CONSTANT_PS
> > +* packets. Since our counting methods may not be reliable due to
> > +* context-switching and pre-emption, we instead choose to approximate 
> > this
> > +* behavior by re-emitting the packet at the top of the batch.
> > +*/
> > +   if (brw->ctx.NewDriverState == BRW_NEW_BATCH) {
> > +   /* SKL GT2 and GLK 2x6 have reliably demonstrated this issue thus 
> > far.
> > +* We've also seen some intermittent failures from SKL GT4 and BXT 
> > in
> > +* the past.
> > +*/
> > +  if (!devinfo->is_skylake &&
> > +  !devinfo->is_broxton &&
> > +  !devinfo->is_geminilake)
> > + return;
> > +   }
> > +
> > BEGIN_BATCH(10);
> > OUT_BATCH(_3DSTATE_PUSH_CONSTANT_ALLOC_VS << 16 | (2 - 2));
> > OUT_BATCH(vs_size | offset << GEN7_PUSH_CONSTANT_BUFFER_OFFSET_SHIFT);
> > @@ -154,6 +181,7 @@ const struct brw_tracked_state gen7_push_constant_space 
> > = {
> > .dirty = {
> >.mesa = 0,
> >.brw = BRW_NEW_CONTEXT |
> > + BRW_NEW_BATCH | /* Push constant workaround */
> >   BRW_NEW_GEOMETRY_PROGRAM |
> >   BRW_NEW_TESS_PROGRAMS,
> > },
> > 
> 
> Not sure we can do much better than this.  Thanks for taking care of
> this, Nanley.
> 
> Reviewed-by: Kenneth Graunke 

Same here. Thanks.


[Mesa-dev] [PATCH v2] i965/gen7_urb: Re-emit PUSH_CONSTANT_ALLOC on some gen9

2018-08-29 Thread Nanley Chery
According to internal docs, some gen9 platforms have a pixel shader push
constant synchronization issue. Although not listed among said
platforms, this issue seems to be present on the GeminiLake 2x6's we've
tested.

We consider the available workarounds to be too detrimental to
performance. Instead, we mitigate the issue by applying part of one of
the workarounds. Re-emit PUSH_CONSTANT_ALLOC at the top of every batch
(as suggested by Ken).

Fixes ext_framebuffer_multisample-accuracy piglit test failures with the
following options:
* 6 depth_draw small depthstencil
* 8 stencil_draw small depthstencil
* 6 stencil_draw small depthstencil
* 8 depth_resolve small
* 6 stencil_resolve small depthstencil
* 4 stencil_draw small depthstencil
* 16 stencil_draw small depthstencil
* 16 depth_draw small depthstencil
* 2 stencil_resolve small depthstencil
* 6 stencil_draw small
* all_samples stencil_draw small
* 2 depth_draw small depthstencil
* all_samples depth_draw small depthstencil
* all_samples stencil_resolve small
* 4 depth_draw small depthstencil
* all_samples depth_draw small
* all_samples stencil_draw small depthstencil
* 4 stencil_resolve small depthstencil
* 4 depth_resolve small depthstencil
* all_samples stencil_resolve small depthstencil

v2: Include more platforms in WA (Ken).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106865
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93355
Cc: 
Tested-by: Mark Janes 
---
 src/mesa/drivers/dri/i965/gen7_urb.c | 28 
 1 file changed, 28 insertions(+)

I'm not sure I have enough information about what's happening in the HW
to create a piglit test for this issue.

diff --git a/src/mesa/drivers/dri/i965/gen7_urb.c b/src/mesa/drivers/dri/i965/gen7_urb.c
index 2e5f8e60ba9..e7259fc1b8d 100644
--- a/src/mesa/drivers/dri/i965/gen7_urb.c
+++ b/src/mesa/drivers/dri/i965/gen7_urb.c
@@ -118,6 +118,33 @@ gen7_emit_push_constant_state(struct brw_context *brw, unsigned vs_size,
const struct gen_device_info *devinfo = &brw->screen->devinfo;
unsigned offset = 0;
 
+   /* From the SKL PRM, Workarounds section (#878):
+*
+*Push constant buffer corruption possible. WA: Insert 2 zero-length
+*PushConst_PS before every intended PushConst_PS update, issue a
+*NULLPRIM after each of the zero len PC update to make sure CS commits
+*them.
+*
+* This workaround is attempting to solve a pixel shader push constant
+* synchronization issue.
+*
+* There's an unpublished WA that involves re-emitting
+* 3DSTATE_PUSH_CONSTANT_ALLOC_PS for every 500-ish 3DSTATE_CONSTANT_PS
+* packets. Since our counting methods may not be reliable due to
+* context-switching and pre-emption, we instead choose to approximate this
+* behavior by re-emitting the packet at the top of the batch.
+*/
+   if (brw->ctx.NewDriverState == BRW_NEW_BATCH) {
+   /* SKL GT2 and GLK 2x6 have reliably demonstrated this issue thus far.
+* We've also seen some intermittent failures from SKL GT4 and BXT in
+* the past.
+*/
+  if (!devinfo->is_skylake &&
+  !devinfo->is_broxton &&
+  !devinfo->is_geminilake)
+ return;
+   }
+
BEGIN_BATCH(10);
OUT_BATCH(_3DSTATE_PUSH_CONSTANT_ALLOC_VS << 16 | (2 - 2));
OUT_BATCH(vs_size | offset << GEN7_PUSH_CONSTANT_BUFFER_OFFSET_SHIFT);
@@ -154,6 +181,7 @@ const struct brw_tracked_state gen7_push_constant_space = {
.dirty = {
   .mesa = 0,
   .brw = BRW_NEW_CONTEXT |
+ BRW_NEW_BATCH | /* Push constant workaround */
  BRW_NEW_GEOMETRY_PROGRAM |
  BRW_NEW_TESS_PROGRAMS,
},
-- 
2.18.0



Re: [Mesa-dev] [PATCH] i965/gen7_urb: Re-emit PUSH_CONSTANT_ALLOC on GLK

2018-08-27 Thread Nanley Chery
On Mon, Aug 27, 2018 at 03:25:37PM -0700, Kenneth Graunke wrote:
> On Monday, August 27, 2018 11:03:33 AM PDT Nanley Chery wrote:
> > On Fri, Aug 24, 2018 at 05:46:44PM -0700, Nanley Chery wrote:
> > > According to internal docs, some gen9 platforms have a pixel shader push
> > > constant synchronization issue. Although not listed among said
> > > platforms, this issue seems to be present on the GeminiLake 2x6's we've
> > > tested.
> > > 
> > > > We consider the available workarounds to be too detrimental to
> > > performance. Instead, we mitigate the issue by applying part of one of
> > > the workarounds. Re-emit PUSH_CONSTANT_ALLOC at the top of every batch
> > > (as suggested by Ken).
> > > 
> > > Fixes ext_framebuffer_multisample-accuracy piglit test failures with the
> > > following options:
> > > * 6 depth_draw small depthstencil
> > > * 8 stencil_draw small depthstencil
> > > * 6 stencil_draw small depthstencil
> > > * 8 depth_resolve small
> > > * 6 stencil_resolve small depthstencil
> > > * 4 stencil_draw small depthstencil
> > > * 16 stencil_draw small depthstencil
> > > * 16 depth_draw small depthstencil
> > > * 2 stencil_resolve small depthstencil
> > > * 6 stencil_draw small
> > > * all_samples stencil_draw small
> > > * 2 depth_draw small depthstencil
> > > * all_samples depth_draw small depthstencil
> > > * all_samples stencil_resolve small
> > > * 4 depth_draw small depthstencil
> > > * all_samples depth_draw small
> > > * all_samples stencil_draw small depthstencil
> > > * 4 stencil_resolve small depthstencil
> > > * 4 depth_resolve small depthstencil
> > > * all_samples stencil_resolve small depthstencil
> > > 
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106865
> > > Cc: 
> > > ---
> > >  src/mesa/drivers/dri/i965/gen7_urb.c | 23 +++
> > >  1 file changed, 23 insertions(+)
> > 
> > Ping?
> 
> So...I believe those tests are still intermittent on other platforms.
> 
> And...these platforms aren't listed as having the bug that you're trying
> to work around here.
> 
> Which makes me wonder if a GLK2x6 hack is really the right thing to do.

Right, it'd be good to know if this helps any other platforms. I'll see
if we could add a more generic patch to CI to see if stability improves
for other platforms.


Re: [Mesa-dev] [PATCH] i965/gen7_urb: Re-emit PUSH_CONSTANT_ALLOC on GLK

2018-08-27 Thread Nanley Chery
On Fri, Aug 24, 2018 at 05:46:44PM -0700, Nanley Chery wrote:
> According to internal docs, some gen9 platforms have a pixel shader push
> constant synchronization issue. Although not listed among said
> platforms, this issue seems to be present on the GeminiLake 2x6's we've
> tested.
> 
> We consider the available workarounds to be too detrimental to
> performance. Instead, we mitigate the issue by applying part of one of
> the workarounds. Re-emit PUSH_CONSTANT_ALLOC at the top of every batch
> (as suggested by Ken).
> 
> Fixes ext_framebuffer_multisample-accuracy piglit test failures with the
> following options:
> * 6 depth_draw small depthstencil
> * 8 stencil_draw small depthstencil
> * 6 stencil_draw small depthstencil
> * 8 depth_resolve small
> * 6 stencil_resolve small depthstencil
> * 4 stencil_draw small depthstencil
> * 16 stencil_draw small depthstencil
> * 16 depth_draw small depthstencil
> * 2 stencil_resolve small depthstencil
> * 6 stencil_draw small
> * all_samples stencil_draw small
> * 2 depth_draw small depthstencil
> * all_samples depth_draw small depthstencil
> * all_samples stencil_resolve small
> * 4 depth_draw small depthstencil
> * all_samples depth_draw small
> * all_samples stencil_draw small depthstencil
> * 4 stencil_resolve small depthstencil
> * 4 depth_resolve small depthstencil
> * all_samples stencil_resolve small depthstencil
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106865
> Cc: 
> ---
>  src/mesa/drivers/dri/i965/gen7_urb.c | 23 +++
>  1 file changed, 23 insertions(+)

Ping?


Re: [Mesa-dev] [PATCH] i965/gen7_urb: Re-emit PUSH_CONSTANT_ALLOC on GLK

2018-08-24 Thread Nanley Chery
On Fri, Aug 24, 2018 at 09:17:03PM -0400, Ilia Mirkin wrote:
> On Fri, Aug 24, 2018 at 8:46 PM, Nanley Chery  wrote:
> > According to internal docs, some gen9 platforms have a pixel shader push
> > constant synchronization issue. Although not listed among said
> > platforms, this issue seems to be present on the GeminiLake 2x6's we've
> > tested.
> >
> > We consider the available workarounds to be too detrimental to
> > performance. Instead, we mitigate the issue by applying part of one of
> > the workarounds. Re-emit PUSH_CONSTANT_ALLOC at the top of every batch
> > (as suggested by Ken).
> >
> > Fixes ext_framebuffer_multisample-accuracy piglit test failures with the
> > following options:
> > * 6 depth_draw small depthstencil
> > * 8 stencil_draw small depthstencil
> > * 6 stencil_draw small depthstencil
> > * 8 depth_resolve small
> > * 6 stencil_resolve small depthstencil
> > * 4 stencil_draw small depthstencil
> > * 16 stencil_draw small depthstencil
> > * 16 depth_draw small depthstencil
> > * 2 stencil_resolve small depthstencil
> > * 6 stencil_draw small
> > * all_samples stencil_draw small
> > * 2 depth_draw small depthstencil
> > * all_samples depth_draw small depthstencil
> > * all_samples stencil_resolve small
> > * 4 depth_draw small depthstencil
> > * all_samples depth_draw small
> > * all_samples stencil_draw small depthstencil
> > * 4 stencil_resolve small depthstencil
> > * 4 depth_resolve small depthstencil
> > * all_samples stencil_resolve small depthstencil
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106865
> > Cc: 
> > ---
> >  src/mesa/drivers/dri/i965/gen7_urb.c | 23 +++
> >  1 file changed, 23 insertions(+)
> >
> > diff --git a/src/mesa/drivers/dri/i965/gen7_urb.c 
> > b/src/mesa/drivers/dri/i965/gen7_urb.c
> > index 2e5f8e60ba9..cb045251236 100644
> > --- a/src/mesa/drivers/dri/i965/gen7_urb.c
> > +++ b/src/mesa/drivers/dri/i965/gen7_urb.c
> > @@ -118,6 +118,28 @@ gen7_emit_push_constant_state(struct brw_context *brw, unsigned vs_size,
> > const struct gen_device_info *devinfo = &brw->screen->devinfo;
> > unsigned offset = 0;
> >
> > +   /* From the SKL PRM, Workarounds section (#878):
> > +*
> > +*Push constant buffer corruption possible. WA: Insert 2 zero-length
> > +*PushConst_PS before every intended PushConst_PS update, issue a
> > +*NULLPRIM after each of the zero len PC update to make sure CS 
> > commits
> > +*them.
> > +*
> > +* This workaround is attempting to solve a pixel shader push constant
> > +* synchronization issue.
> > +*
> > +* There's an unpublished WA that involves re-emitting
> > +* 3DSTATE_PUSH_CONSTANT_ALLOC_PS for every 500-ish 3DSTATE_CONSTANT_PS
> > +* packets. Since our counting methods may not be reliable due to
> > +* context-switching and pre-emption, we instead choose to approximate 
> > this
> > +* behavior by re-emitting the packet at the top of the batch.
> > +*/
> > +   if (brw->ctx.NewDriverState == BRW_NEW_BATCH) {
> 
> Did you want & here?
> 

Using & would prevent push constant allocation on non-GLK 2x6 devices
if we had a NEW_BATCH and NEW_GEOMETRY_PROGRAM, which I think we don't
want.

If the equality fails, we'll emit push constant allocation packets,
which is what we want. This block basically filters out the cases in
which we're emitting this packet unnecessarily due to adding the
BRW_NEW_BATCH dirty flag below.
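
To make the difference concrete, here is a minimal standalone sketch of
the two filters. The flag values and function names are placeholders
chosen for illustration; the real BRW_NEW_* bits are defined by the
driver and are not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values; the real BRW_NEW_* flags are defined by the
 * driver and are not reproduced here. */
#define NEW_CONTEXT          (1ull << 0)
#define NEW_BATCH            (1ull << 1)
#define NEW_GEOMETRY_PROGRAM (1ull << 2)

/* Returns true when the emit can be skipped, using '==' as in the patch:
 * only a lone NEW_BATCH on an unaffected device is filtered out. */
static inline bool skip_emit_eq(uint64_t dirty, bool affected_device)
{
   return dirty == NEW_BATCH && !affected_device;
}

/* Same filter written with '&': any dirty set containing NEW_BATCH is
 * filtered out on unaffected devices, even if other flags need the emit. */
static inline bool skip_emit_and(uint64_t dirty, bool affected_device)
{
   return (dirty & NEW_BATCH) && !affected_device;
}
```

With dirty = NEW_BATCH | NEW_GEOMETRY_PROGRAM on an unaffected device,
skip_emit_eq() returns false (the packets are emitted) while
skip_emit_and() returns true (the emit would be wrongly skipped).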

-Nanley

> > +   /* Only GLK 2x6 has demonstrated this issue thus far. */
> > +  if (!devinfo->is_geminilake || devinfo->num_subslices[0] != 2)
> > + return;
> > +   }
> > +
> > BEGIN_BATCH(10);
> > OUT_BATCH(_3DSTATE_PUSH_CONSTANT_ALLOC_VS << 16 | (2 - 2));
> > OUT_BATCH(vs_size | offset << GEN7_PUSH_CONSTANT_BUFFER_OFFSET_SHIFT);
> > @@ -154,6 +176,7 @@ const struct brw_tracked_state gen7_push_constant_space 
> > = {
> > .dirty = {
> >.mesa = 0,
> >.brw = BRW_NEW_CONTEXT |
> > + BRW_NEW_BATCH | /* GLK workaround */
> >   BRW_NEW_GEOMETRY_PROGRAM |
> >   BRW_NEW_TESS_PROGRAMS,
> > },
> > --
> > 2.18.0
> >


[Mesa-dev] [PATCH] i965/gen7_urb: Re-emit PUSH_CONSTANT_ALLOC on GLK

2018-08-24 Thread Nanley Chery
According to internal docs, some gen9 platforms have a pixel shader push
constant synchronization issue. Although not listed among said
platforms, this issue seems to be present on the GeminiLake 2x6's we've
tested.

We consider the available workarounds to be too detrimental to
performance. Instead, we mitigate the issue by applying part of one of
the workarounds. Re-emit PUSH_CONSTANT_ALLOC at the top of every batch
(as suggested by Ken).

Fixes ext_framebuffer_multisample-accuracy piglit test failures with the
following options:
* 6 depth_draw small depthstencil
* 8 stencil_draw small depthstencil
* 6 stencil_draw small depthstencil
* 8 depth_resolve small
* 6 stencil_resolve small depthstencil
* 4 stencil_draw small depthstencil
* 16 stencil_draw small depthstencil
* 16 depth_draw small depthstencil
* 2 stencil_resolve small depthstencil
* 6 stencil_draw small
* all_samples stencil_draw small
* 2 depth_draw small depthstencil
* all_samples depth_draw small depthstencil
* all_samples stencil_resolve small
* 4 depth_draw small depthstencil
* all_samples depth_draw small
* all_samples stencil_draw small depthstencil
* 4 stencil_resolve small depthstencil
* 4 depth_resolve small depthstencil
* all_samples stencil_resolve small depthstencil

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106865
Cc: 
---
 src/mesa/drivers/dri/i965/gen7_urb.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/gen7_urb.c b/src/mesa/drivers/dri/i965/gen7_urb.c
index 2e5f8e60ba9..cb045251236 100644
--- a/src/mesa/drivers/dri/i965/gen7_urb.c
+++ b/src/mesa/drivers/dri/i965/gen7_urb.c
@@ -118,6 +118,28 @@ gen7_emit_push_constant_state(struct brw_context *brw, unsigned vs_size,
const struct gen_device_info *devinfo = &brw->screen->devinfo;
unsigned offset = 0;
 
+   /* From the SKL PRM, Workarounds section (#878):
+*
+*Push constant buffer corruption possible. WA: Insert 2 zero-length
+*PushConst_PS before every intended PushConst_PS update, issue a
+*NULLPRIM after each of the zero len PC update to make sure CS commits
+*them.
+*
+* This workaround is attempting to solve a pixel shader push constant
+* synchronization issue.
+*
+* There's an unpublished WA that involves re-emitting
+* 3DSTATE_PUSH_CONSTANT_ALLOC_PS for every 500-ish 3DSTATE_CONSTANT_PS
+* packets. Since our counting methods may not be reliable due to
+* context-switching and pre-emption, we instead choose to approximate this
+* behavior by re-emitting the packet at the top of the batch.
+*/
+   if (brw->ctx.NewDriverState == BRW_NEW_BATCH) {
+   /* Only GLK 2x6 has demonstrated this issue thus far. */
+  if (!devinfo->is_geminilake || devinfo->num_subslices[0] != 2)
+ return;
+   }
+
BEGIN_BATCH(10);
OUT_BATCH(_3DSTATE_PUSH_CONSTANT_ALLOC_VS << 16 | (2 - 2));
OUT_BATCH(vs_size | offset << GEN7_PUSH_CONSTANT_BUFFER_OFFSET_SHIFT);
@@ -154,6 +176,7 @@ const struct brw_tracked_state gen7_push_constant_space = {
.dirty = {
   .mesa = 0,
   .brw = BRW_NEW_CONTEXT |
+ BRW_NEW_BATCH | /* GLK workaround */
  BRW_NEW_GEOMETRY_PROGRAM |
  BRW_NEW_TESS_PROGRAMS,
},
-- 
2.18.0



[Mesa-dev] [PATCH v3 2/3] i965/miptree: Fix can_blit_slice()

2018-08-20 Thread Nanley Chery
Check the destination's row pitch against the BLT engine's row pitch
limitation as well.

Fixes: 0288fe8d0417730bdd5b3477130dd1dc32bdbcd3
("i965/miptree: Use the correct BLT pitch")

v2: Fix the Fixes tag (Dylan).
Check the destination row pitch (Chris).

Cc: 
Reported-by: Dylan Baker 
---

I decided against using the mesa row pitch helper to keep the
dst_blt_pitch assignment on one line.
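
For reference, the combined source/destination check can be sketched
outside the driver roughly as follows. align_pot() is a stand-in for the
driver's ALIGN macro, and pitches_fit_blitter() and its parameters are
hypothetical names for this illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round v up to a multiple of a; a must be a power of two. Stand-in for
 * the driver's ALIGN macro. */
static inline uint32_t align_pot(uint32_t v, uint32_t a)
{
   return (v + a - 1) & ~(a - 1);
}

/* src_blt_pitch: pitch the blitter would use for the source miptree.
 * map_width, cpp: width in pixels and bytes per pixel of the mapping. */
static inline bool pitches_fit_blitter(uint32_t src_blt_pitch,
                                       uint32_t map_width, uint32_t cpp)
{
   const uint32_t dst_blt_pitch = align_pot(map_width * cpp, 64);
   return src_blt_pitch < 32768 && dst_blt_pitch < 32768;
}
```

A mapping that is 8192 pixels wide at 4 bytes per pixel has a
destination pitch of exactly 32768 bytes, so it is rejected even when
the source pitch is small.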

 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index b477c97e51d..983f145afc9 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3545,10 +3545,9 @@ can_blit_slice(struct intel_mipmap_tree *mt,
const struct intel_miptree_map *map)
 {
/* See intel_miptree_blit() for details on the 32k pitch limit. */
-   if (intel_miptree_blt_pitch(mt) >= 32768)
-  return false;
-
-   return true;
+   const unsigned src_blt_pitch = intel_miptree_blt_pitch(mt);
+   const unsigned dst_blt_pitch = ALIGN(map->w * mt->cpp, 64);
+   return src_blt_pitch < 32768 && dst_blt_pitch < 32768;
 }
 
 static bool
-- 
2.18.0



[Mesa-dev] [PATCH v3 3/3] intel/isl: Avoid tiling some 16K-wide render targets

2018-08-20 Thread Nanley Chery
Fix rendering issues on BDW and SKL.

Fixes: 0288fe8d0417730bdd5b3477130dd1dc32bdbcd3
("i965/miptree: Use the correct BLT pitch")

Fixes the following regressions seen

exclusively on SKL:
* KHR-GL46.texture_barrier_ARB.disjoint-texels
* KHR-GL46.texture_barrier_ARB.overlapping-texels
* KHR-GL46.texture_barrier.disjoint-texels
* KHR-GL46.texture_barrier.overlapping-texels

and both on BDW and SKL:
* GTF-GL46.gtf21.GL2FixedTests.buffer_corners.buffer_corners
* GTF-GL46.gtf21.GL2FixedTests.stencil_plane_corners.stencil_plane_corners

v2: Note the fixed tests (Andres).
Don't cause failures with multisampled buffers (Andres).
Don't hamper SKL GT4 (Ken).
v3: Fix the Fixes tag (Dylan).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107359
Cc: 
---
 src/intel/isl/isl_gen7.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/src/intel/isl/isl_gen7.c b/src/intel/isl/isl_gen7.c
index 4fa9851233f..a9db21fba52 100644
--- a/src/intel/isl/isl_gen7.c
+++ b/src/intel/isl/isl_gen7.c
@@ -294,6 +294,29 @@ isl_gen6_filter_tiling(const struct isl_device *dev,
 */
if (ISL_DEV_GEN(dev) < 7 && isl_format_get_layout(info->format)->bpb >= 128)
   *flags &= ~ISL_TILING_Y0_BIT;
+
+   /* From the BDW and SKL PRMs, Volume 2d,
+* RENDER_SURFACE_STATE::Width - Programming Notes:
+*
+*   A known issue exists if a primitive is rendered to the first 2 rows and
+*   last 2 columns of a 16K width surface. If any geometry is drawn inside
+*   this square it will be copied to column X=2 and X=3 (arrangement on Y
+*   position will stay the same). If any geometry exceeds the boundaries of
+*   this 2x2 region it will be drawn normally. The issue also only occurs
+*   if the surface has TileMode != Linear.
+*
+* [Internal documentation notes that this issue isn't present on SKL GT4.]
+* To prevent this rendering corruption, only allow linear tiling for
+* surfaces with widths greater than 16K-2 pixels.
+*
+* TODO: Is this an issue for multisampled surfaces as well?
+*/
+   if (info->width > 16382 && info->samples == 1 &&
+   info->usage & ISL_SURF_USAGE_RENDER_TARGET_BIT &&
+   (ISL_DEV_GEN(dev) == 8 ||
+(dev->info->is_skylake && dev->info->gt != 4))) {
+  *flags &= ISL_TILING_LINEAR_BIT;
+   }
 }
 
 void
-- 
2.18.0
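
The filter added above can be sketched in isolation as follows. The
tiling bit values, the function name, and the flattened parameter list
are assumptions made for this illustration; the real code reads these
fields from the isl device and surface-init structures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder tiling bits; the real ISL_TILING_*_BIT values come from isl. */
#define TILING_LINEAR_BIT (1u << 0)
#define TILING_X_BIT      (1u << 1)
#define TILING_Y0_BIT     (1u << 2)

/* Restrict flags to linear when the surface matches the conditions in the
 * patch: single-sampled render target wider than 16K-2 pixels on gen8, or
 * on Skylake other than GT4. */
static inline uint32_t filter_16k_tiling(uint32_t flags, uint32_t width,
                                         uint32_t samples,
                                         bool is_render_target,
                                         int gen, bool is_skylake, int gt)
{
   const bool affected_hw = gen == 8 || (is_skylake && gt != 4);
   if (width > 16382 && samples == 1 && is_render_target && affected_hw)
      flags &= TILING_LINEAR_BIT;
   return flags;
}
```

Note that the mask keeps the linear bit only if it was already set, so a
caller that had excluded linear tiling ends up with no allowed tilings
for such a surface.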



[Mesa-dev] [PATCH v3 1/3] i965/miptree: Use miptree_map in map_blit functions

2018-08-20 Thread Nanley Chery
This struct contains all the data of interest. can_blit_slice() will use
it in the next patch to calculate the correct pitch.

Suggested-by: Chris Wilson 
Cc: 
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index a18d5ac3624..b477c97e51d 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3542,7 +3542,7 @@ intel_miptree_release_map(struct intel_mipmap_tree *mt,
 
 static bool
 can_blit_slice(struct intel_mipmap_tree *mt,
-   unsigned int level, unsigned int slice)
+   const struct intel_miptree_map *map)
 {
/* See intel_miptree_blit() for details on the 32k pitch limit. */
if (intel_miptree_blt_pitch(mt) >= 32768)
@@ -3554,9 +3554,7 @@ can_blit_slice(struct intel_mipmap_tree *mt,
 static bool
 use_intel_mipree_map_blit(struct brw_context *brw,
   struct intel_mipmap_tree *mt,
-  GLbitfield mode,
-  unsigned int level,
-  unsigned int slice)
+  const struct intel_miptree_map *map)
 {
 const struct gen_device_info *devinfo = &brw->screen->devinfo;
 
@@ -3564,19 +3562,19 @@ use_intel_mipree_map_blit(struct brw_context *brw,
   /* It's probably not worth swapping to the blit ring because of
* all the overhead involved.
*/
-   !(mode & GL_MAP_WRITE_BIT) &&
+   !(map->mode & GL_MAP_WRITE_BIT) &&
!mt->compressed &&
(mt->surf.tiling == ISL_TILING_X ||
 /* Prior to Sandybridge, the blitter can't handle Y tiling */
 (devinfo->gen >= 6 && mt->surf.tiling == ISL_TILING_Y0) ||
 /* Fast copy blit on skl+ supports all tiling formats. */
 devinfo->gen >= 9) &&
-   can_blit_slice(mt, level, slice))
+   can_blit_slice(mt, map))
   return true;
 
if (mt->surf.tiling != ISL_TILING_LINEAR &&
mt->bo->size >= brw->max_gtt_map_object_size) {
-  assert(can_blit_slice(mt, level, slice));
+  assert(can_blit_slice(mt, map));
   return true;
}
 
@@ -3625,7 +3623,7 @@ intel_miptree_map(struct brw_context *brw,
   intel_miptree_map_etc(brw, mt, map, level, slice);
} else if (mt->stencil_mt && !(mode & BRW_MAP_DIRECT_BIT)) {
   intel_miptree_map_depthstencil(brw, mt, map, level, slice);
-   } else if (use_intel_mipree_map_blit(brw, mt, mode, level, slice)) {
+   } else if (use_intel_mipree_map_blit(brw, mt, map)) {
   intel_miptree_map_blit(brw, mt, map, level, slice);
 #if defined(USE_SSE41)
} else if (!(mode & GL_MAP_WRITE_BIT) &&
-- 
2.18.0



Re: [Mesa-dev] [Mesa-stable] [PATCH v2 1/2] i965/miptree: Fix can_blit_slice()

2018-08-10 Thread Nanley Chery
On Fri, Aug 10, 2018 at 02:12:55PM -0700, Dylan Baker wrote:
> Quoting Nanley Chery (2018-08-10 10:23:34)
> > Satisfy the BLT engine's row pitch limitation on the destination
> > miptree. The destination miptree is untiled, so its row_pitch will be
> > slightly less than or equal to the source miptree's row_pitch. Use the
> > source miptree's row_pitch in can_blit_slice instead of its blt_pitch.
> > 
> > Fixes 0288fe8d0417730bdd5b3477130dd1dc32bdbcd3
> > ("i965/miptree: Use the correct BLT pitch")
> 
> For the scripts in stable to pick this up I believe you need a ":", as in
> Fixes: abc123 ("some patch")
> 

Thanks. Fixed locally.
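
For reference, a tiny sketch of the kind of check involved. I haven't looked at the exact pattern the stable scripts match, so the rule below (the literal prefix "Fixes: " followed by at least one hex digit) and the function name are my assumptions, not the scripts' actual logic:

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: returns true if `line` looks like a
 * "Fixes: <sha> (...)" tag. Assumed rule, not the real script. */
static bool
sketch_is_fixes_tag(const char *line)
{
   static const char prefix[] = "Fixes: ";
   const size_t n = strlen(prefix);

   /* The tag must start with "Fixes" followed by a colon and a space. */
   if (strncmp(line, prefix, n) != 0)
      return false;

   /* At least one hex digit of the commit SHA must follow. */
   return isxdigit((unsigned char)line[n]) != 0;
}
```

Under that assumption, the colon-less tag line from this commit message would not be picked up, while the form Dylan suggests would.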

-Nanley

> Dylan
> 
> > 
> > Cc: 
> > Reported-by: Dylan Baker 
> > ---
> >  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 9 +++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > index a18d5ac3624..d8e823e4826 100644
> > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > @@ -3544,8 +3544,13 @@ static bool
> >  can_blit_slice(struct intel_mipmap_tree *mt,
> > unsigned int level, unsigned int slice)
> >  {
> > -   /* See intel_miptree_blit() for details on the 32k pitch limit. */
> > -   if (intel_miptree_blt_pitch(mt) >= 32768)
> > +   /* The blit destination is untiled, so its row_pitch will be slightly less
> > +* than or equal to the source's row_pitch. The BLT engine only supports
> > +* linear row pitches up to but not including 32k.
> > +*
> > +* See intel_miptree_blit() for details on the 32k pitch limit.
> > +*/
> > +   if (mt->surf.row_pitch >= 32768)
> >return false;
> >  
> > return true;
> > -- 
> > 2.18.0
> > 
> > ___
> > mesa-stable mailing list
> > mesa-sta...@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-stable




[Mesa-dev] [PATCH v2 2/2] intel/isl: Avoid tiling some 16K-wide render targets

2018-08-10 Thread Nanley Chery
Fix rendering issues on BDW and SKL.

Fixes 0288fe8d0417730bdd5b3477130dd1dc32bdbcd3
("i965/miptree: Use the correct BLT pitch")

Fixes the following regressions seen exclusively on SKL:
* KHR-GL46.texture_barrier_ARB.disjoint-texels
* KHR-GL46.texture_barrier_ARB.overlapping-texels
* KHR-GL46.texture_barrier.disjoint-texels
* KHR-GL46.texture_barrier.overlapping-texels

and both on BDW and SKL:
* GTF-GL46.gtf21.GL2FixedTests.buffer_corners.buffer_corners
* GTF-GL46.gtf21.GL2FixedTests.stencil_plane_corners.stencil_plane_corners

v2: Note the fixed tests (Andres).
Don't cause failures with multisampled surfaces (Andres).
Don't hamper SKL GT4 (Ken).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107359
Cc: 
---
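
For anyone reviewing without the isl headers at hand, here is a minimal standalone sketch of the condition added below. The struct, field, and macro names (sketch_*, SKETCH_TILING_*_BIT) are hypothetical stand-ins rather than the real isl types; only the shape of the condition mirrors the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the isl tiling flag bits. */
#define SKETCH_TILING_LINEAR_BIT (1u << 0)
#define SKETCH_TILING_X_BIT      (1u << 1)
#define SKETCH_TILING_Y0_BIT     (1u << 2)

/* Hypothetical subset of the surface creation info. */
struct sketch_surf_info {
   uint32_t width;          /* in pixels */
   uint32_t samples;        /* 1 for single-sampled surfaces */
   bool is_render_target;
};

/* Hypothetical subset of the device info. */
struct sketch_dev_info {
   int gen;
   bool is_skylake;
   int gt;
};

/* Returns the tiling flags left after applying the 16K-width workaround. */
static uint32_t
sketch_filter_tiling(const struct sketch_dev_info *dev,
                     const struct sketch_surf_info *info,
                     uint32_t flags)
{
   /* BDW (gen 8) and SKL other than GT4 are treated as affected. */
   const bool affected_hw =
      dev->gen == 8 || (dev->is_skylake && dev->gt != 4);

   /* Widths above 16K-2 pixels on single-sampled render targets are
    * restricted to linear by masking out every other tiling bit. */
   if (info->width > 16382 && info->samples == 1 &&
       info->is_render_target && affected_hw)
      flags &= SKETCH_TILING_LINEAR_BIT;

   return flags;
}
```

Note that the mask keeps only the linear bit; if the caller never offered linear in the first place, the result is an empty set of tilings.
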
 src/intel/isl/isl_gen7.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/src/intel/isl/isl_gen7.c b/src/intel/isl/isl_gen7.c
index 4fa9851233f..a9db21fba52 100644
--- a/src/intel/isl/isl_gen7.c
+++ b/src/intel/isl/isl_gen7.c
@@ -294,6 +294,29 @@ isl_gen6_filter_tiling(const struct isl_device *dev,
 */
if (ISL_DEV_GEN(dev) < 7 && isl_format_get_layout(info->format)->bpb >= 128)
   *flags &= ~ISL_TILING_Y0_BIT;
+
+   /* From the BDW and SKL PRMs, Volume 2d,
+* RENDER_SURFACE_STATE::Width - Programming Notes:
+*
+*   A known issue exists if a primitive is rendered to the first 2 rows and
+*   last 2 columns of a 16K width surface. If any geometry is drawn inside
+*   this square it will be copied to column X=2 and X=3 (arrangement on Y
+*   position will stay the same). If any geometry exceeds the boundaries of
+*   this 2x2 region it will be drawn normally. The issue also only occurs
+*   if the surface has TileMode != Linear.
+*
+* [Internal documentation notes that this issue isn't present on SKL GT4.]
+* To prevent this rendering corruption, only allow linear tiling for
+* surfaces with widths greater than 16K-2 pixels.
+*
+* TODO: Is this an issue for multisampled surfaces as well?
+*/
+   if (info->width > 16382 && info->samples == 1 &&
+   info->usage & ISL_SURF_USAGE_RENDER_TARGET_BIT &&
+   (ISL_DEV_GEN(dev) == 8 ||
+(dev->info->is_skylake && dev->info->gt != 4))) {
+  *flags &= ISL_TILING_LINEAR_BIT;
+   }
 }
 
 void
-- 
2.18.0



[Mesa-dev] [PATCH v2 1/2] i965/miptree: Fix can_blit_slice()

2018-08-10 Thread Nanley Chery
Satisfy the BLT engine's row pitch limitation on the destination
miptree. The destination miptree is untiled, so its row_pitch will be
slightly less than or equal to the source miptree's row_pitch. Use the
source miptree's row_pitch in can_blit_slice instead of its blt_pitch.

Fixes 0288fe8d0417730bdd5b3477130dd1dc32bdbcd3
("i965/miptree: Use the correct BLT pitch")

Cc: 
Reported-by: Dylan Baker 
---
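
To illustrate why the check changes, here is a rough standalone sketch. It assumes, as I understand the helper, that intel_miptree_blt_pitch() returns the row pitch divided by 4 for tiled miptrees and the byte pitch unchanged for linear ones. The sketch_* names are hypothetical and exist only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* The BLT engine handles linear pitches up to but not including 32k. */
#define SKETCH_BLT_PITCH_LIMIT 32768u

/* Hypothetical subset of a miptree. */
struct sketch_miptree {
   uint32_t row_pitch;   /* in bytes */
   bool tiled;
};

/* Assumed behaviour of the blt_pitch helper: tiled pitches are
 * expressed in dwords, linear pitches in bytes. */
static uint32_t
sketch_blt_pitch(const struct sketch_miptree *mt)
{
   return mt->tiled ? mt->row_pitch / 4 : mt->row_pitch;
}

/* Check before this patch: tests the source's blt_pitch. */
static bool
sketch_can_blit_old(const struct sketch_miptree *src)
{
   return sketch_blt_pitch(src) < SKETCH_BLT_PITCH_LIMIT;
}

/* Check after this patch: tests the source's row_pitch, which bounds
 * the pitch of the untiled destination from above. */
static bool
sketch_can_blit_new(const struct sketch_miptree *src)
{
   return src->row_pitch < SKETCH_BLT_PITCH_LIMIT;
}
```

With a tiled source whose row_pitch is 65536 bytes, the old check passes (65536 / 4 = 16384) even though the linear destination may have a pitch close to 65536, which is past the limit. The new check rejects that case.
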
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index a18d5ac3624..d8e823e4826 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3544,8 +3544,13 @@ static bool
 can_blit_slice(struct intel_mipmap_tree *mt,
unsigned int level, unsigned int slice)
 {
-   /* See intel_miptree_blit() for details on the 32k pitch limit. */
-   if (intel_miptree_blt_pitch(mt) >= 32768)
+   /* The blit destination is untiled, so its row_pitch will be slightly less
+* than or equal to the source's row_pitch. The BLT engine only supports
+* linear row pitches up to but not including 32k.
+*
+* See intel_miptree_blit() for details on the 32k pitch limit.
+*/
+   if (mt->surf.row_pitch >= 32768)
   return false;
 
return true;
-- 
2.18.0



Re: [Mesa-dev] [PATCH] intel/isl: Avoid tiling on 16K-wide render targets

2018-08-09 Thread Nanley Chery
On Thu, Aug 09, 2018 at 04:26:06PM +0300, Andres Gomez wrote:
> Ugh!
> 
> Unfortunately, as I've commented at:
> https://bugs.freedesktop.org/show_bug.cgi?id=107359
> 
> This is now breaking these other CTS tests:
> GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_error_blitframebuffer_multisampled_framebuffers_different_formats
> GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_error_blitframebuffer_multisampled_framebuffers_different_origins
> GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_error_blitframebuffer_multisampled_framebuffers_different_sample_count
> GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_error_blitframebuffer_multisampled_framebuffers_different_sizes
> GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_error_blitframebuffer_multisampled_read_buffer_different_formats
> GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_error_blitframebuffer_multisampled_read_buffer_different_origins
> GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_error_blitframebuffer_multisampled_read_buffer_different_sizes
> GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_multisampled_to_singlesampled_blit
> 
> 

Sorry about that, I must not have tested this in CI. I'll send out a
better tested v2.

-Nanley

> On Mon, 2018-07-30 at 19:25 +0300, Andres Gomez wrote:
> > That was quick! ☺
> > 
> > On Fri, 2018-07-27 at 16:02 -0700, Nanley Chery wrote:
> > > Fix rendering issues on BDW and SKL.
> > > 
> > > Fixes 0288fe8d0417730bdd5b3477130dd1dc32bdbcd3
> > > ("i965/miptree: Use the correct BLT pitch")
> > 
> > I'd add here some lines listing the tests fixed by this patch.
> > 
> > > 
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107359
> > > Cc: 
> > 
> > This is:
> > 
> > Tested-by: Andres Gomez 
> > 
> > > ---
> > > 
> > > We could probably add an assert when filling out the surface state, but
> > > I think BLORP would need a non-trivial amount of work done as a
> > > prerequisite. I'm thinking specifically of the cases where we bind a
> > > depth buffer as a render target.
> > > 
> > > I won't be able to push anything until about a week from EOD today, so
> > > if this does end up getting reviewed, please feel free to push it.
> > > 
> > >  src/intel/isl/isl_gen7.c | 19 +++
> > >  1 file changed, 19 insertions(+)
> > > 
> > > diff --git a/src/intel/isl/isl_gen7.c b/src/intel/isl/isl_gen7.c
> > > index 4fa9851233f..2d85f4b568d 100644
> > > --- a/src/intel/isl/isl_gen7.c
> > > +++ b/src/intel/isl/isl_gen7.c
> > > @@ -294,6 +294,25 @@ isl_gen6_filter_tiling(const struct isl_device *dev,
> > >  */
> > > if (ISL_DEV_GEN(dev) < 7 && isl_format_get_layout(info->format)->bpb >= 128)
> > >*flags &= ~ISL_TILING_Y0_BIT;
> > > +
> > > +   /* From the BDW and SKL PRMs, Volume 2d,
> > > +* RENDER_SURFACE_STATE::Width - Programming Notes:
> > > +*
> > > +*   A known issue exists if a primitive is rendered to the first 2 rows and
> > > +*   last 2 columns of a 16K width surface. If any geometry is drawn inside
> > > +*   this square it will be copied to column X=2 and X=3 (arrangement on Y
> > > +*   position will stay the same). If any geometry exceeds the boundaries of
> > > +*   this 2x2 region it will be drawn normally. The issue also only occurs
> > > +*   if the surface has TileMode != Linear.
> > > +* To prevent this rendering corruption, only allow linear tiling for
> > > +* surfaces with widths greater than 16K-2 pixels.
> > > +*/
> > > +   if (info->width > 16382 &&
> > > +   info->usage & ISL_SURF_USAGE_RENDER_TARGET_BIT &&
> > > +   (ISL_DEV_GEN(dev) == 8 || dev->info->is_skylake)) {
> > > +  *flags &= ISL_TILING_LINEAR_BIT;
> > > +   }
> > >  }
> > >  
> > >  void
> -- 
> Br,
> 
> Andres


Re: [Mesa-dev] [PATCH v3] mesa: enable EXT_render_snorm extension

2018-08-08 Thread Nanley Chery
On Thu, Aug 02, 2018 at 02:14:31PM +0300, Tapani Pälli wrote:
> Patch sets additional formats renderable and enables the extension
> when OpenGL ES 3.1 is supported.
> 
> v2: instead of dummy_true, have a separate toggle for extension
> (Eric Anholt)
> 
> v3: add missing checks, simplify some existing checks and fix
> glCopyTexImage2D check (Nanley Chery)
> 
> add SHORT and BYTE support in read_pixels_es3_error_check
> 
> Signed-off-by: Tapani Pälli 
> ---
>  src/mesa/main/extensions_table.h |  1 +
>  src/mesa/main/fbobject.c | 31 +++
>  src/mesa/main/glformats.c|  9 +
>  src/mesa/main/mtypes.h   |  1 +
>  src/mesa/main/readpix.c  | 19 +++
>  src/mesa/main/teximage.c |  3 ++-
>  6 files changed, 55 insertions(+), 9 deletions(-)
> 
> diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
> index 3f01896cae..c3a69b8987 100644
> --- a/src/mesa/main/extensions_table.h
> +++ b/src/mesa/main/extensions_table.h
> @@ -246,6 +246,7 @@ EXT(EXT_polygon_offset_clamp, ARB_polygon_offset_clamp
>  EXT(EXT_primitive_bounding_box  , OES_primitive_bounding_box  ,  x ,  x ,  x ,  31, 2014)
>  EXT(EXT_provoking_vertex        , EXT_provoking_vertex        , GLL, GLC,  x ,  x , 2009)
>  EXT(EXT_read_format_bgra        , dummy_true                  ,  x ,  x , ES1, ES2, 2009)
> +EXT(EXT_render_snorm            , EXT_render_snorm            ,  x ,  x ,  x,   31, 2014)
>  EXT(EXT_rescale_normal          , dummy_true                  , GLL,  x ,  x ,  x , 1997)
>  EXT(EXT_robustness              , KHR_robustness              ,  x,   x,   x , ES2, 2011)
>  EXT(EXT_secondary_color         , dummy_true                  , GLL,  x ,  x ,  x , 1999)
> diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
> index cfe2174ef1..2534b1df29 100644
> --- a/src/mesa/main/fbobject.c
> +++ b/src/mesa/main/fbobject.c
> @@ -728,7 +728,15 @@ is_format_color_renderable(const struct gl_context *ctx, mesa_format format,
>  
> /* Reject additional cases for GLES */
> switch (internalFormat) {
> +   case GL_R8_SNORM:
> +   case GL_RG8_SNORM:
> case GL_RGBA8_SNORM:
> +  return _mesa_has_EXT_render_snorm(ctx);
> +   case GL_R16_SNORM:
> +   case GL_RG16_SNORM:
> +   case GL_RGBA16_SNORM:
> +  return _mesa_has_EXT_texture_norm16(ctx) &&
> + _mesa_has_EXT_render_snorm(ctx);
> case GL_RGB32F:
> case GL_RGB32I:
> case GL_RGB32UI:
> @@ -741,8 +749,6 @@ is_format_color_renderable(const struct gl_context *ctx, mesa_format format,
> case GL_SRGB8:
> case GL_RGB10:
> case GL_RGB9_E5:
> -   case GL_RG8_SNORM:
> -   case GL_R8_SNORM:
>return GL_FALSE;
> default:
>break;
> @@ -1997,25 +2003,34 @@ _mesa_base_fbo_format(const struct gl_context *ctx, GLenum internalFormat)
>return ctx->API != API_OPENGLES && ctx->Extensions.ARB_texture_rg
>   ? GL_RG : 0;
> /* signed normalized texture formats */
> -   case GL_RED_SNORM:
> case GL_R8_SNORM:
> +  return _mesa_has_EXT_texture_snorm(ctx) || _mesa_has_EXT_render_snorm(ctx)
> + ? GL_RED : 0;
> +   case GL_RED_SNORM:
> +  return _mesa_has_EXT_texture_snorm(ctx) ? GL_RED : 0;
> case GL_R16_SNORM:
> -  return _mesa_is_desktop_gl(ctx) && ctx->Extensions.EXT_texture_snorm
> +  return _mesa_has_EXT_texture_snorm(ctx) || _mesa_has_EXT_render_snorm(ctx)

Shouldn't the condition for R16, RG16, and RGBA16 be
_mesa_has_EXT_texture_snorm(ctx) ||
(_mesa_has_EXT_render_snorm(ctx) && _mesa_has_EXT_texture_norm16(ctx))
?

If so, with those changes applied, this series is
Reviewed-by: Nanley Chery 
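
Spelled out as a standalone sketch, the condition I have in mind looks like the following. The struct and function names are hypothetical and only stand in for the _mesa_has_* helpers; this is not meant as the final form of the code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-context extension queries. */
struct sketch_exts {
   bool EXT_texture_snorm;
   bool EXT_render_snorm;
   bool EXT_texture_norm16;
};

/* Suggested condition for the 16-bit SNORM formats (R16, RG16, RGBA16):
 * either the texture_snorm path applies, or render_snorm is present
 * together with texture_norm16. */
static bool
sketch_snorm16_is_renderable(const struct sketch_exts *e)
{
   return e->EXT_texture_snorm ||
          (e->EXT_render_snorm && e->EXT_texture_norm16);
}
```

The case this is meant to catch is a context exposing EXT_render_snorm without EXT_texture_norm16: the condition in the patch would accept the 16-bit formats there, while the one above rejects them.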

>   ? GL_RED : 0;
> -   case GL_RG_SNORM:
> case GL_RG8_SNORM:
> +  return _mesa_has_EXT_texture_snorm(ctx) || _mesa_has_EXT_render_snorm(ctx)
> + ? GL_RG : 0;
> +   case GL_RG_SNORM:
> +  _mesa_has_EXT_texture_snorm(ctx) ? GL_RG : 0;
> case GL_RG16_SNORM:
> -  return _mesa_is_desktop_gl(ctx) && ctx->Extensions.EXT_texture_snorm
> +  return _mesa_has_EXT_texture_snorm(ctx) || _mesa_has_EXT_render_snorm(ctx)
>   ? GL_RG : 0;
> case GL_RGB_SNORM:
> case GL_RGB8_SNORM:
> case GL_RGB16_SNORM:
>return _mesa_is_desktop_gl(ctx) && ctx->Extensions.EXT_texture_snorm
>   ? GL_RGB

[Mesa-dev] [PATCH] intel/isl: Avoid tiling on 16K-wide render targets

2018-07-27 Thread Nanley Chery
Fix rendering issues on BDW and SKL.

Fixes 0288fe8d0417730bdd5b3477130dd1dc32bdbcd3
("i965/miptree: Use the correct BLT pitch")

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107359
Cc: 
---

We could probably add an assert when filling out the surface state, but
I think BLORP would need a non-trivial amount of work done as a
prerequisite. I'm thinking specifically of the cases where we bind a
depth buffer as a render target.

I won't be able to push anything until about a week from EOD today, so
if this does end up getting reviewed, please feel free to push it.

 src/intel/isl/isl_gen7.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/src/intel/isl/isl_gen7.c b/src/intel/isl/isl_gen7.c
index 4fa9851233f..2d85f4b568d 100644
--- a/src/intel/isl/isl_gen7.c
+++ b/src/intel/isl/isl_gen7.c
@@ -294,6 +294,25 @@ isl_gen6_filter_tiling(const struct isl_device *dev,
 */
if (ISL_DEV_GEN(dev) < 7 && isl_format_get_layout(info->format)->bpb >= 128)
   *flags &= ~ISL_TILING_Y0_BIT;
+
+   /* From the BDW and SKL PRMs, Volume 2d,
+* RENDER_SURFACE_STATE::Width - Programming Notes:
+*
+*   A known issue exists if a primitive is rendered to the first 2 rows and
+*   last 2 columns of a 16K width surface. If any geometry is drawn inside
+*   this square it will be copied to column X=2 and X=3 (arrangement on Y
+*   position will stay the same). If any geometry exceeds the boundaries of
+*   this 2x2 region it will be drawn normally. The issue also only occurs
+*   if the surface has TileMode != Linear.
+*
+* To prevent this rendering corruption, only allow linear tiling for
+* surfaces with widths greater than 16K-2 pixels.
+*/
+   if (info->width > 16382 &&
+   info->usage & ISL_SURF_USAGE_RENDER_TARGET_BIT &&
+   (ISL_DEV_GEN(dev) == 8 || dev->info->is_skylake)) {
+  *flags &= ISL_TILING_LINEAR_BIT;
+   }
 }
 
 void
-- 
2.18.0



Re: [Mesa-dev] [PATCH] i965/miptree: Fix can_blit_slice()

2018-07-27 Thread Nanley Chery
On Tue, Jul 24, 2018 at 03:28:09PM +0300, Andres Gomez wrote:
> Hi Nanley,
> 
> I'm observing regressions for SKL and BDW which don't seem to be
> solved with this new patch, in git master. Therefore, I've gone ahead
> and reported:
> https://bugs.freedesktop.org/show_bug.cgi?id=107359
> 

Hi Andres,

Thank you for the bug report. It turned out to be a HW issue. I'll send
a fix soon.

-Nanley

> On Mon, 2018-07-23 at 10:17 -0700, Nanley Chery wrote:
> > Satisfy the BLT engine's row pitch limitation on the destination
> > miptree. The destination miptree is untiled, so its row_pitch will be
> > slightly less than or equal to the source miptree's row_pitch. Use the
> > source miptree's row_pitch in can_blit_slice instead of its blt_pitch.
> > 
> > Fixes 0288fe8d0417730bdd5b3477130dd1dc32bdbcd3
> > ("i965/miptree: Use the correct BLT pitch")
> > 
> > Cc: 
> > Reported-by: Dylan Baker 
> > ---
> >  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 9 +++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > index a18d5ac3624..d8e823e4826 100644
> > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > @@ -3544,8 +3544,13 @@ static bool
> >  can_blit_slice(struct intel_mipmap_tree *mt,
> > unsigned int level, unsigned int slice)
> >  {
> > -   /* See intel_miptree_blit() for details on the 32k pitch limit. */
> > -   if (intel_miptree_blt_pitch(mt) >= 32768)
> > +   /* The blit destination is untiled, so its row_pitch will be slightly less
> > +* than or equal to the source's row_pitch. The BLT engine only supports
> > +* linear row pitches up to but not including 32k.
> > +*
> > +* See intel_miptree_blit() for details on the 32k pitch limit.
> > +*/
> > +   if (mt->surf.row_pitch >= 32768)
> >return false;
> >  
> > return true;
> -- 
> Br,
> 
> Andres


Re: [Mesa-dev] [ANNOUNCE] Mesa 18.1.5 Candidate

2018-07-25 Thread Nanley Chery
On Wed, Jul 25, 2018 at 11:47:59AM -0700, Dylan Baker wrote:
> Greetings,
> 
> The Mesa staging/18.1 branch is currently slushed for 18.1.5, and assuming that
> there are no regressions or patches critically necessary this will be merged to
> the 18.1 branch for a release Friday morning, July 27th around 10 AM PDT.
> 
> Currently the branch has the following changes since 18.1.4:
>  - 45 queued
>  - 0 nominated (outstanding)
>  - and 1 rejected patch
> 
> The one rejected patch was de-nominated by the author.
> 
> Note: there are 4 cherry-ignore patches, but 3 are for patches that required
> manual backport, and thus the cherry-picked from line does not match the commit
> in master.
> 
> All merge conflicts that I resolved have already been verified by the original
> patch authors.
> 
> Dylan
> 
> Shortlog of changes:
> 
> Alex Smith (1):
>   anv: Pay attention to VK_ACCESS_MEMORY_(READ|WRITE)_BIT
> 
> Bas Nieuwenhuizen (7):
>   radv: Select correct entries for binning.
>   radv: Fix number of samples used for binning.
>   radv: Disable disabled color buffers in rbplus opts.
>   nir: Do not use continue block after removing it.
>   util/disk_cache: Fix disk_cache_get_function_timestamp with disabled cache.
>   nir: Fix end of function without return warning/error.
>   radv: Still enable inmemory & API level caching if disk cache is not enabled.
> 
> Chad Versace (2):
>   anv/android: Fix type error in call to vk_errorf()
>   anv/android: Fix Autotools build for VK_ANDROID_native_buffer
> 
> Chih-Wei Huang (1):
>   Android: fix a missing nir_intrinsics.h error
> 
> Danylo Piliaiev (1):
>   i965: Sweep NIR after linking phase to free held memory
> 
> Dave Airlie (1):
>   r600: enable tess_input_info for TES
> 
> Dylan Baker (5):
>   docs: Add sha256 sums for 18.1.4 tarballs
>   cherry-ignore: add 4a67ce886a7b3def5f66c1aedf9e5436d157a03c
>   cherry-ignore: Add 1f616a840eac02241c585d28e9dac8f19a297f39
>   cherry-ignore: add 11712b9ca17e4e1a819dcb7d020e19c6da77bc90
>   cherry-ignore: Add 0288fe8d0417730bdd5b3477130dd1dc32bdbcd3
> 
> Eric Anholt (2):
>   vc4: Don't automatically reallocate a PERSISTENT-mapped buffer.
>   meson: Move xvmc test tools from unit tests to installed tools.
> 
> Harish Krupo (1):
>   egl: Fix missing clamping in eglSetDamageRegionKHR
> 
> Jan Vesely (3):
>   radeonsi: Refuse to accept code with unhandled relocations
>   clover: Report error when pipe driver fails to create compute state
>   clover: Catch errors from executing event action
> 
> Jason Ekstrand (6):
>   anv: Stop setting 3DSTATE_PS_EXTRA::PixelShaderHasUAV
>   nir/serialize: Alloc constants off the variable
>   blorp: Handle the RGB workaround more like other workarounds
>   intel/blorp: Handle 3-component formats in clears
>   intel/compiler: Account for built-in uniforms in analyze_ubo_ranges
>   spirv: Fix a couple of image atomic load/store bugs
> 
> José Fonseca (1):
>   gallium/tests: Don't ignore S3TC errors.
> 
> Karol Herbst (1):
>   nir: fix printing of vec16 type
> 
> Lepton Wu (1):
>   virgl: Fix flush in virgl_encoder_inline_write.
> 
> Lucas Stach (1):
>   st/mesa: call resource_changed when binding a EGLImage to a texture
> 
> Mauro Rossi (2):
>   radv: winsys/amdgpu: include missing pthread.h header
>   android: util/disk_cache: fix building errors in gallium drivers
> 
> Michel Dänzer (1):
>   gallium: Check pipe_screen::resource_changed before dereferencing it
> 
> Nanley Chery (3):
>   i965: Make blt_pitch public
>   i965/miptree: Drop an if case from retile_as_linear
>   i965/miptree: Fix can_blit_slice()

Hi Dylan,

I noticed that the patch, "i965/miptree: Use the correct BLT pitch,"
isn't included here. Without it, none of these patches change Mesa's
behavior.

Since the "Fix can_blit_slice()" patch hasn't been reviewed yet, I'm
fine with none of these patches going in.

-Nanley

> 
> Roland Scheidegger (1):
>   draw: force draw pipeline if there's more than 65535 vertices
> 
> Samuel Iglesias Gonsálvez (1):
>   anv: fix assert in anv_CmdBindDescriptorSets()
> 
> Samuel Pitoiset (3):
>   radv: make sure to wait for CP DMA when needed
>   radv: emit a dummy ZPASS_DONE to prevent GPU hangs on GFX9
>   radv: fix a memleak for merged shaders on GFX9



> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev



Re: [Mesa-dev] [PATCH 1/2] mesa: add glRenderbufferStorage support for EXT_texture_norm16 formats

2018-07-24 Thread Nanley Chery
On Tue, Jul 24, 2018 at 08:58:20AM +0300, Tapani Pälli wrote:
> These bits were missing, found when extending the Piglit test.
> 
> Fixes: 7f467d4f73 "mesa: GL_EXT_texture_norm16 extension plumbing"
> Signed-off-by: Tapani Pälli 
> ---
>  src/mesa/main/fbobject.c | 10 +++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 

Shouldn't we also update is_format_color_renderable?

Nonetheless, this series is an improvement and is
Reviewed-by: Nanley Chery 

> diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
> index fa7a9361df..679e206c71 100644
> --- a/src/mesa/main/fbobject.c
> +++ b/src/mesa/main/fbobject.c
> @@ -1927,8 +1927,10 @@ _mesa_base_fbo_format(const struct gl_context *ctx, GLenum internalFormat)
> case GL_RGBA:
> case GL_RGBA2:
> case GL_RGBA12:
> -   case GL_RGBA16:
>return _mesa_is_desktop_gl(ctx) ? GL_RGBA : 0;
> +   case GL_RGBA16:
> +  return _mesa_is_desktop_gl(ctx) || _mesa_has_EXT_texture_norm16(ctx)
> + ? GL_RGBA : 0;
> case GL_RGB10_A2:
> case GL_SRGB8_ALPHA8_EXT:
>return _mesa_is_desktop_gl(ctx) || _mesa_is_gles3(ctx) ? GL_RGBA : 0;
> @@ -1963,15 +1965,17 @@ _mesa_base_fbo_format(const struct gl_context *ctx, GLenum internalFormat)
>   ctx->Extensions.ARB_depth_buffer_float)
>   ? GL_DEPTH_STENCIL : 0;
> case GL_RED:
> +  return _mesa_has_ARB_texture_rg(ctx) ? GL_RED : 0;
> case GL_R16:
> -  return _mesa_is_desktop_gl(ctx) && ctx->Extensions.ARB_texture_rg
> +  return _mesa_has_ARB_texture_rg(ctx) || _mesa_has_EXT_texture_norm16(ctx)
>   ? GL_RED : 0;
> case GL_R8:
>return ctx->API != API_OPENGLES && ctx->Extensions.ARB_texture_rg
>   ? GL_RED : 0;
> case GL_RG:
> +  return _mesa_has_ARB_texture_rg(ctx) ? GL_RG : 0;
> case GL_RG16:
> -  return _mesa_is_desktop_gl(ctx) && ctx->Extensions.ARB_texture_rg
> +  return _mesa_has_ARB_texture_rg(ctx) || _mesa_has_EXT_texture_norm16(ctx)
>   ? GL_RG : 0;
> case GL_RG8:
>return ctx->API != API_OPENGLES && ctx->Extensions.ARB_texture_rg
> -- 
> 2.14.4
> 


Re: [Mesa-dev] [PATCH] i965/miptree: Fix can_blit_slice()

2018-07-23 Thread Nanley Chery
On Mon, Jul 23, 2018 at 08:20:15PM +0100, Chris Wilson wrote:
> Quoting Nanley Chery (2018-07-23 18:17:15)
> > Satisfy the BLT engine's row pitch limitation on the destination
> > miptree. The destination miptree is untiled, so its row_pitch will be
> > slightly less than or equal to the source miptree's row_pitch. Use the
> > source miptree's row_pitch in can_blit_slice instead of its blt_pitch.
> > 
> > Fixes 0288fe8d0417730bdd5b3477130dd1dc32bdbcd3
> > ("i965/miptree: Use the correct BLT pitch")
> > 
> > Cc: 
> > Reported-by: Dylan Baker 
> > ---
> >  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 9 +++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > index a18d5ac3624..d8e823e4826 100644
> > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > @@ -3544,8 +3544,13 @@ static bool
> >  can_blit_slice(struct intel_mipmap_tree *mt,
> > unsigned int level, unsigned int slice)
> >  {
> > -   /* See intel_miptree_blit() for details on the 32k pitch limit. */
> > -   if (intel_miptree_blt_pitch(mt) >= 32768)
> > +   /* The blit destination is untiled, so its row_pitch will be slightly less
> > +* than or equal to the source's row_pitch. The BLT engine only supports
> > +* linear row pitches up to but not including 32k.
> > +*
> > +* See intel_miptree_blit() for details on the 32k pitch limit.
> > +*/
> > +   if (mt->surf.row_pitch >= 32768)
> >return false;
> 
> I see the difference, but do we copy the whole slice or a region of it?

I think it depends on the caller.

-Nanley

> -Chris


[Mesa-dev] [PATCH] i965/miptree: Fix can_blit_slice()

2018-07-23 Thread Nanley Chery
Satisfy the BLT engine's row pitch limitation on the destination
miptree. The destination miptree is untiled, so its row_pitch will be
slightly less than or equal to the source miptree's row_pitch. Use the
source miptree's row_pitch in can_blit_slice instead of its blt_pitch.

Fixes 0288fe8d0417730bdd5b3477130dd1dc32bdbcd3
("i965/miptree: Use the correct BLT pitch")

Cc: 
Reported-by: Dylan Baker 
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index a18d5ac3624..d8e823e4826 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3544,8 +3544,13 @@ static bool
 can_blit_slice(struct intel_mipmap_tree *mt,
unsigned int level, unsigned int slice)
 {
-   /* See intel_miptree_blit() for details on the 32k pitch limit. */
-   if (intel_miptree_blt_pitch(mt) >= 32768)
+   /* The blit destination is untiled, so its row_pitch will be slightly less
+* than or equal to the source's row_pitch. The BLT engine only supports
+* linear row pitches up to but not including 32k.
+*
+* See intel_miptree_blit() for details on the 32k pitch limit.
+*/
+   if (mt->surf.row_pitch >= 32768)
   return false;
 
return true;
-- 
2.18.0



Re: [Mesa-dev] [Mesa-stable] [PATCH 3/3] i965/miptree: Use the correct BLT pitch

2018-07-20 Thread Nanley Chery
On Fri, Jul 20, 2018 at 02:48:58PM -0700, Dylan Baker wrote:
> Hi Nanley,
> 
> This applies cleanly to the 18.1 branch, but there is something else missing as
> it adds roughly 2200 regressions, see here:
> http://otc-mesa-ci.jf.intel.com/view/dev/job/dcbaker_18.1/66/testReport/
> 
> For the moment I've reverted it out of the staging/18.1 branch. If we need to
> get it back in we can, but we need to sort out those regressions first.
> 

Thanks for letting me know. I guess I didn't figure out how to do stable
testing after all. I think I found the issue and am running the fix
through jenkins (rebuild of the job above and a mesa_master build).

-Nanley

> Dylan
> 
> Quoting Nanley Chery (2018-07-12 10:28:16)
> > Retile miptrees to a linear tiling less often. Retiling can cause issues
> > with imported BOs.
> > 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106738
> > Suggested-by: Chris Wilson 
> > Cc: 
> > ---
> >  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 12 ++--
> >  1 file changed, 6 insertions(+), 6 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > index 53e01120a92..1ddb945b085 100644
> > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > @@ -509,7 +509,7 @@ free_aux_state_map(enum isl_aux_state **state)
> >  }
> >  
> >  static bool
> > -need_to_retile_as_linear(struct brw_context *brw, unsigned row_pitch,
> > +need_to_retile_as_linear(struct brw_context *brw, unsigned blt_pitch,
> >   enum isl_tiling tiling, unsigned samples)
> >  {
> > if (samples > 1)
> > @@ -518,9 +518,9 @@ need_to_retile_as_linear(struct brw_context *brw, unsigned row_pitch,
> > if (tiling == ISL_TILING_LINEAR)
> >return false;
> >  
> > -   if (ALIGN(row_pitch, 512) >= 32768) {
> > -  perf_debug("row pitch %u too large to blit, falling back to untiled",
> > - row_pitch);
> > +   if (blt_pitch >= 32768) {
> > +  perf_debug("blt pitch %u too large to blit, falling back to untiled",
> > + blt_pitch);
> >return true;
> > }
> >  
> > @@ -600,7 +600,7 @@ make_surface(struct brw_context *brw, GLenum target, mesa_format format,
> > bool is_depth_stencil =
> >mt->surf.usage & (ISL_SURF_USAGE_STENCIL_BIT | ISL_SURF_USAGE_DEPTH_BIT);
> > if (!is_depth_stencil) {
> > -  if (need_to_retile_as_linear(brw, mt->surf.row_pitch,
> > +  if (need_to_retile_as_linear(brw, intel_miptree_blt_pitch(mt),
> > mt->surf.tiling, mt->surf.samples)) {
> >   init_info.tiling_flags = 1u << ISL_TILING_LINEAR;
> >   if (!isl_surf_init_s(&brw->isl_dev, &mt->surf, &init_info))
> > @@ -3577,7 +3577,7 @@ can_blit_slice(struct intel_mipmap_tree *mt,
> > unsigned int level, unsigned int slice)
> >  {
> > /* See intel_miptree_blit() for details on the 32k pitch limit. */
> > -   if (mt->surf.row_pitch >= 32768)
> > +   if (intel_miptree_blt_pitch(mt) >= 32768)
> >return false;
> >  
> > return true;
> > -- 
> > 2.18.0
> > 
> > ___
> > mesa-stable mailing list
> > mesa-sta...@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-stable




Re: [Mesa-dev] [PATCH 2/2] intel/isl/gen4: Make depth/stencil buffers Y-Tiled

2018-07-19 Thread Nanley Chery
On Thu, Jul 19, 2018 at 10:16:47AM -0700, Kenneth Graunke wrote:
> On Tuesday, July 17, 2018 10:45:28 AM PDT Nanley Chery wrote:
> > On Tue, Jul 17, 2018 at 08:19:30AM -0700, Kenneth Graunke wrote:
> > > Personally, I'd be inclined to simply make this
> > > 
> > >*flags &= ISL_TILING_Y0_BIT;
> 
> While I still think the above is simpler and perhaps safer, your
> patches seem correct to me, and are:
> 
> Reviewed-by: Kenneth Graunke 

Thank you for the review.


Re: [Mesa-dev] [PATCH v5] i965: Fix ETC2/EAC GetCompressed* functions on Gen7 GPUs

2018-07-18 Thread Nanley Chery
On Wed, Jul 18, 2018 at 05:34:13PM +0300, Eleni Maria Stea wrote:
> On 07/10/2018 03:10 AM, Nanley Chery wrote:
> > On Thu, Jun 14, 2018 at 10:50:57PM +0300, Eleni Maria Stea wrote:
> >> On 06/14/2018 10:27 PM, Nanley Chery wrote:
> >>
> >>> +Jason, Ken
> >>>
> >>> Hello,
> >>>
> >>> I recently did some miptree work relating to the r8stencil_mt and I
> >>> think I now have a more informed opinion about how things should be
> >>> structured. I'd like to propose an alternative solution.
> >>>
> >>> I had initially thought we should have a separate miptree to hold the
> >>> compressed data, like this patch does, but now I think we should
> >>> actually have the compressed data be the main miptree and to store the
> >>> decompressed miptree as part of the main one. The reasoning is that we
> >>> could reuse this structure to handle the r8stencil workaround and to
> >>> eventually handle the ASTC_LDR surfaces that are modified on gen9.
> >>>
> >>> I'm proposing something like the following:
> >>>
> >>> 1. Rename r8stencil_mt -> shadow_mt and
> >>>    r8stencil_needs_update -> shadow_needs_update.
> >>> 2. Make shadow_mt hold the decompressed ETC miptree
> >>> 3. Update shadow_needs_update whenever the main mt is modified
> >>> 4. Add a function to update the shadow_mt using the main mt as a source
> >>> 5. Sample from the shadow_mt as appropriate
> >>> 6. Make the main miptree hold the compressed data
> >>>
> >>> This method should also be able to handle the CopyImage functions. What
> >>> do you all think?
> >>>
> >>> -Nanley
> >>
> >> Hi Nanley,
> >>
> >> Thank you for your reply. I wasn't aware that there are other cases we
> >> might need to store a 2nd image. I agree that it's more reasonable to
> >> use one generic purpose miptree that can be accessible from different
> >> parts of the i965 code for such cases instead of storing miptrees in
> >> different places for different hacks when a feature is not supported.
> >>
> >> I will search your patch to get a look and I will also get a look at the
> >> mesa code to see how easy this fix would be (which parts of the code it
> >> might affect) and if everyone agrees that this is a good idea I will
> >> modify this patch according to your suggestions.
> >>
> >> BR :)
> >> Eleni
> > 
> > Hi Eleni,
> > 
> > I gave this more thought and am now thinking that what you have here is
> > fine. Having two different ways of working with a shadow miptree
> > suggests a refactor later on, but IMO this is ultimately a step in the
> > right direction. Sorry for the noise.
> > 
> > With code-sharing among shadow miptrees in mind, my two main
> > suggestions are 1) to perform mapping operations only with the cmt (if
> > it's present) and 2) to update the decompressed mt, on demand. Maybe
> > with intel_miptree_copy_slice_sw?
> > 
> > Regards,
> > Nanley
> > 
> 
> Hi Nanley,
> 
> I talked to you on IRC but I reply here as well:
> 
> Thank you for the suggestions, I had misunderstood something from our
> IRC conversation that followed this e-mail, so the patch v6 has several
> issues. I will send a new one soon and I will implement the solution you
> suggested earlier (suggestions 1-6) instead. Sorry for the noise with
> the patch v6.
> 

Sounds good. By the way, I think it'd be helpful if you sent out the
solution as a series of patches (see git format-patch - for example).
That way it's easier to confirm each step of the solution is correct.
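
To make the individual steps concrete, here is a minimal sketch of the bookkeeping behind suggestions 3-5 quoted above. It assumes a heavily simplified, hypothetical struct (sketch_miptree) in place of intel_mipmap_tree, and a plain copy in place of real ETC decompression; only the names shadow_mt and shadow_needs_update come from the proposal.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified miptree; not the real intel_mipmap_tree. */
struct sketch_miptree {
   struct sketch_miptree *shadow_mt; /* decompressed (or R8 stencil) copy */
   bool shadow_needs_update;         /* set whenever the main data changes */
   int contents;                     /* stand-in for the surface data */
};

/* Suggestion 3: any write to the main miptree marks the shadow stale. */
static void
sketch_write_main(struct sketch_miptree *mt, int value)
{
   mt->contents = value;
   if (mt->shadow_mt)
      mt->shadow_needs_update = true;
}

/* Suggestion 4: refresh the shadow using the main miptree as the source.
 * A real driver would decompress here; this sketch just copies. */
static void
sketch_update_shadow(struct sketch_miptree *mt)
{
   if (!mt->shadow_mt || !mt->shadow_needs_update)
      return;
   mt->shadow_mt->contents = mt->contents;
   mt->shadow_needs_update = false;
}

/* Suggestion 5: choose the miptree to sample from, updating on demand. */
static const struct sketch_miptree *
sketch_sample_source(struct sketch_miptree *mt)
{
   if (!mt->shadow_mt)
      return mt;
   sketch_update_shadow(mt);
   return mt->shadow_mt;
}
```

Each function maps onto one step, which is roughly the granularity a patch series could follow.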

-Nanley

> Thanks,
> Eleni
> 
> 
> 


Re: [Mesa-dev] [PATCH 2/2] intel/isl/gen4: Make depth/stencil buffers Y-Tiled

2018-07-17 Thread Nanley Chery
On Tue, Jul 17, 2018 at 01:29:26PM -0700, Kenneth Graunke wrote:
> On Tuesday, July 17, 2018 10:45:28 AM PDT Nanley Chery wrote:
> > On Tue, Jul 17, 2018 at 08:19:30AM -0700, Kenneth Graunke wrote:
> > > Wow, I had no idea we were actually using linear depth buffers.
> > 
> > Neither did I until Mark let me know that I regressed some tests after
> > landing commit fbe01625f6bf2cef6742e1ff0d3d44a2afec003e. Whoops!
> > 
> > I've included the Bugzilla tag locally.
> > https://bugs.freedesktop.org/show_bug.cgi?id=107248
> > 
> > > Is there any compelling reason to use them?  I can't think of one.
> > > 
> > 
> > Yes. They give us the best performance and memory usage for 1D textures
> > (see isl_surf_choose_tiling()).
> 
> While 1D depth textures are certainly supported, I don't think I have
> ever seen one.  shader-db has no instances of them in any shader.
> 
> It doesn't make much sense...when you use a depth buffer, you draw
> objects in a scene and record their distance from the camera.  Doing
> that with a 1D surface just seems...unlikely?
> 

Good to know.

> > > Y-tiling should offer better performance.  PBO upload needs to be
> > > done with a R32_UINT format anyway, as R24_X8 isn't renderable.
> > > 
> > 
> > I don't understand how the PBO format comes into play here. Care to
> > elaborate?
> 
> PBO upload/download of texture data uses linear buffers.  I was thinking
> of glTexImage2D with data in a PBO.  Not sure if you can do that with
> depth...
> 

I'm still a little lost, but maybe we can talk about this point offline.

> > > I'm pretty sure that we used to always use Y-tiling in the old
> > > miptree code...maybe this regressed when we switched to ISL?
> > 
> > Yeah, we did always use Y-tiling. I caused the regression when I pushed
> > that commit.
> > 
> > > The hardware timeline looks like:
> > > 
> > >    BROKEN    /-------------- Works??? --------------\ DISALLOWED
> > > [Broadwater, Crestline, Eaglelake, Cantiga, Ironlake, Sandybridge, ...]
> > > 
> > > Given that it was broken and ultimately became disallowed, I'm a bit
> > > skeptical whether it really works, or is useful, in the meantime.
> > > 
> > 
> > It seems to work. We're getting testing from these piglit tests:
> > * piglit.spec.arb_shader_texture_lod.execution.tex-miplevel-selection *projgradarb 1dshadow
> > * piglit.spec.arb_shader_texture_lod.execution.tex-miplevel-selection *gradarb 1dshadow
> > * piglit.spec.arb_shader_texture_lod.execution.tex-miplevel-selection *lod 1dshadow
> > * piglit.spec.!opengl 1_1.copyteximage 1d
> > * piglit.spec.ext_texture_array.copyteximage 1d_array
> > * piglit.spec.arb_shader_texture_lod.execution.tex-miplevel-selection *projlod 1dshadow
> > 
> > On ILK and g45 they go from fail->pass with the last patch. But g965
> > needed the linear option removed completely.
> 
> I guess if it's working and useful for 1D, while being basically no
> code, we can do it, but...I have a real hard time caring about 1D depth.

I don't blame you. The main reason I care is that I just broke them :/


Re: [Mesa-dev] [PATCH 2/2] intel/isl/gen4: Make depth/stencil buffers Y-Tiled

2018-07-17 Thread Nanley Chery
On Tue, Jul 17, 2018 at 08:19:30AM -0700, Kenneth Graunke wrote:
> On Monday, July 16, 2018 4:57:40 PM PDT Nanley Chery wrote:
> > Rendering to a linear depth buffer on gen4 is causing a GPU hang in the
> > CI system. Until a better explanation is found, assume that errata is
> > applicable to all gen4 platforms.
> > 
> > Fixes fbe01625f6bf2cef6742e1ff0d3d44a2afec003e
> > ("i965/miptree: Share tiling_flags in miptree_create").
> > 
> > Reported-by: Mark Janes 
> > ---
> >  src/intel/isl/isl_gen4.c | 9 -
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/intel/isl/isl_gen4.c b/src/intel/isl/isl_gen4.c
> > index 14706c895a5..a212d0ee0af 100644
> > --- a/src/intel/isl/isl_gen4.c
> > +++ b/src/intel/isl/isl_gen4.c
> > @@ -51,8 +51,15 @@ isl_gen4_filter_tiling(const struct isl_device *dev,
> >/* From the g35 PRM Vol. 2, 3DSTATE_DEPTH_BUFFER::Tile Walk:
> > *
> > *"The Depth Buffer, if tiled, must use Y-Major tiling"
> > +   *
> > +   *    Errata   Description    Project
> > +   *    BWT014   The Depth Buffer Must be Tiled, it cannot be linear. This
> > +   *             field must be set to 1 on DevBW-A.  [DevBW -A,B]
> > +   *
> > +   * In testing, the linear configuration doesn't seem to work on gen4.
> > */
> > -  *flags &= (ISL_TILING_LINEAR_BIT | ISL_TILING_Y0_BIT);
> > +  *flags &= (ISL_DEV_GEN(dev) == 4 && !ISL_DEV_IS_G4X(dev)) ?
> > +                ISL_TILING_Y0_BIT : (ISL_TILING_Y0_BIT | ISL_TILING_LINEAR_BIT);
> > }
> >  
> > if (info->usage & (ISL_SURF_USAGE_DISPLAY_ROTATE_90_BIT |
> > 
> 
> Wow, I had no idea we were actually using linear depth buffers.

Neither did I until Mark let me know that I regressed some tests after
landing commit fbe01625f6bf2cef6742e1ff0d3d44a2afec003e. Whoops!

I've included the Bugzilla tag locally.
https://bugs.freedesktop.org/show_bug.cgi?id=107248

> Is there any compelling reason to use them?  I can't think of one.
> 

Yes. They give us the best performance and memory usage for 1D textures
(see isl_surf_choose_tiling()).

> Y-tiling should offer better performance.  PBO upload needs to be
> done with a R32_UINT format anyway, as R24_X8 isn't renderable.
> 

I don't understand how the PBO format comes into play here. Care to
elaborate?

> I'm pretty sure that we used to always use Y-tiling in the old
> miptree code...maybe this regressed when we switched to ISL?
> 

Yeah, we did always use Y-tiling. I caused the regression when I pushed
that commit.

> The hardware timeline looks like:
> 
>    BROKEN    /-------------- Works??? --------------\ DISALLOWED
> [Broadwater, Crestline, Eaglelake, Cantiga, Ironlake, Sandybridge, ...]
> 
> Given that it was broken and ultimately became disallowed, I'm a bit
> skeptical whether it really works, or is useful, in the meantime.
> 

It seems to work. We're getting testing from these piglit tests:
* piglit.spec.arb_shader_texture_lod.execution.tex-miplevel-selection *projgradarb 1dshadow
* piglit.spec.arb_shader_texture_lod.execution.tex-miplevel-selection *gradarb 1dshadow
* piglit.spec.arb_shader_texture_lod.execution.tex-miplevel-selection *lod 1dshadow
* piglit.spec.!opengl 1_1.copyteximage 1d
* piglit.spec.ext_texture_array.copyteximage 1d_array
* piglit.spec.arb_shader_texture_lod.execution.tex-miplevel-selection *projlod 1dshadow

On ILK and g45 they go from fail->pass with the last patch. But g965
needed the linear option removed completely.

> Personally, I'd be inclined to simply make this
> 
>*flags &= ISL_TILING_Y0_BIT;
> 
> which I suppose gets into philosophy about whether ISL should represent
> exactly what the hardware can do, or instead do what we want...but...we
> ought to disallow linear somehow.

Yeah, I think ISL mostly tries to represent what the HW can do. Allowing
linear in i965 seems painless thus far, so I thought I might as well
give it a try.
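
For anyone comparing the two options, the difference can be sketched outside of ISL. The enum values and the function name below are hypothetical stand-ins (the real code uses ISL_TILING_*_BIT, ISL_DEV_GEN() and ISL_DEV_IS_G4X()); the branch mirrors the patch, whereas the simpler alternative would take the first return unconditionally.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for ISL's tiling flag bits; values are illustrative. */
enum {
   SKETCH_TILING_LINEAR_BIT = 1u << 0,
   SKETCH_TILING_Y0_BIT     = 1u << 1,
   SKETCH_TILING_X_BIT      = 1u << 2,
};

/* Narrow the tilings allowed for a depth buffer.
 * gen is the hardware generation; is_g4x is true for G4x parts. */
static uint32_t
sketch_filter_depth_tiling(int gen, bool is_g4x, uint32_t flags)
{
   /* Original gen4 (not G4x): linear depth did not work in testing. */
   if (gen == 4 && !is_g4x)
      return flags & SKETCH_TILING_Y0_BIT;

   /* Elsewhere in this code path: Y-tiled or linear. */
   return flags & (SKETCH_TILING_Y0_BIT | SKETCH_TILING_LINEAR_BIT);
}
```

Either way the result is a subset of the incoming flags, so the caller's later tiling choice still works unchanged.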

> 
> Good find with the errata!  I'm glad to see it cited here.

Thanks.



[Mesa-dev] [PATCH 1/2] i965/misc: Use depth/stencil surf's tiling on gen4-5

2018-07-16 Thread Nanley Chery
Make the 3D engine aware of the depth/stencil surface's tiling before
doing any render operations.

Fixes fbe01625f6bf2cef6742e1ff0d3d44a2afec003e
("i965/miptree: Share tiling_flags in miptree_create").

Reported-by: Mark Janes 
---
 src/mesa/drivers/dri/i965/brw_misc_state.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c b/src/mesa/drivers/dri/i965/brw_misc_state.c
index 9a663b1d61c..5cf704ff0e9 100644
--- a/src/mesa/drivers/dri/i965/brw_misc_state.c
+++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
@@ -267,6 +267,7 @@ brw_emit_depth_stencil_hiz(struct brw_context *brw,
uint32_t depthbuffer_format = BRW_DEPTHFORMAT_D32_FLOAT;
uint32_t depth_offset = 0;
uint32_t width = 1, height = 1;
+   bool tiled_surface = true;
 
/* If there's a packed depth/stencil bound to stencil only, we need to
 * emit the packed depth/stencil buffer packet.
@@ -282,6 +283,7 @@ brw_emit_depth_stencil_hiz(struct brw_context *brw,
   depth_offset = brw->depthstencil.depth_offset;
   width = depth_irb->Base.Base.Width;
   height = depth_irb->Base.Base.Height;
+  tiled_surface = depth_mt->surf.tiling != ISL_TILING_LINEAR;
}
 
const struct gen_device_info *devinfo = &brw->screen->devinfo;
@@ -292,7 +294,7 @@ brw_emit_depth_stencil_hiz(struct brw_context *brw,
OUT_BATCH((depth_mt ? depth_mt->surf.row_pitch - 1 : 0) |
  (depthbuffer_format << 18) |
  (BRW_TILEWALK_YMAJOR << 26) |
- (1 << 27) |
+ (tiled_surface << 27) |
  (depth_surface_type << 29));
 
if (depth_mt) {
-- 
2.18.0
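
As a standalone illustration of what the last hunk changes, the dword can be packed as below. The shifts (18, 26, 27, 29) are taken from the diff; the function name, the SKETCH_TILEWALK_YMAJOR constant and its value of 1 are assumptions made for this sketch, and a row pitch of 0 stands in for the "no depth miptree" case.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value standing in for BRW_TILEWALK_YMAJOR. */
#define SKETCH_TILEWALK_YMAJOR 1u

/* Pack the depth buffer dword the way the hunk does: pitch - 1 in the low
 * bits, format at bit 18, tile walk at bit 26, the tiled-surface flag at
 * bit 27 and the surface type at bit 29. */
static uint32_t
sketch_pack_depth_dw1(uint32_t row_pitch, uint32_t depth_format,
                      bool tiled_surface, uint32_t surface_type)
{
   uint32_t pitch_field = row_pitch ? row_pitch - 1 : 0;

   return pitch_field |
          (depth_format << 18) |
          (SKETCH_TILEWALK_YMAJOR << 26) |
          ((uint32_t)tiled_surface << 27) |
          (surface_type << 29);
}
```

Before the patch bit 27 was hard-coded to 1; with it, a linear surface clears that bit and nothing else in the dword changes.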


