Re: [Mesa-dev] [PATCH 03/18] nir: add new intrinsic field for storing component offset

2016-06-11 Thread Kenneth Graunke
On Saturday, June 11, 2016 9:03:23 AM PDT Timothy Arceri wrote:
> This offset is used for packing.
> ---
>  src/compiler/nir/nir.h| 6 ++
>  src/compiler/nir/nir_intrinsics.h | 8 
>  src/compiler/nir/nir_lower_io.c   | 8 
>  src/compiler/nir/nir_print.c  | 3 +++
>  4 files changed, 21 insertions(+), 4 deletions(-)
> 
> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> index ec7b0c7..d5e4733 100644
> --- a/src/compiler/nir/nir.h
> +++ b/src/compiler/nir/nir.h
> @@ -987,6 +987,11 @@ typedef enum {
>  */
> NIR_INTRINSIC_BINDING = 7,
>  
> +   /**
> +* Component offset.
> +*/
> +   NIR_INTRINSIC_COMPONENT = 8,
> +
> NIR_INTRINSIC_NUM_INDEX_FLAGS,
>  
>  } nir_intrinsic_index_flag;
> @@ -1053,6 +1058,7 @@ INTRINSIC_IDX_ACCESSORS(ucp_id, UCP_ID, unsigned)
>  INTRINSIC_IDX_ACCESSORS(range, RANGE, unsigned)
>  INTRINSIC_IDX_ACCESSORS(desc_set, DESC_SET, unsigned)
>  INTRINSIC_IDX_ACCESSORS(binding, BINDING, unsigned)
> +INTRINSIC_IDX_ACCESSORS(component, COMPONENT, unsigned)
>  
>  /**
>   * \group texture information
> diff --git a/src/compiler/nir/nir_intrinsics.h 
> b/src/compiler/nir/nir_intrinsics.h
> index 6f86c9f..4da36ff 100644
> --- a/src/compiler/nir/nir_intrinsics.h
> +++ b/src/compiler/nir/nir_intrinsics.h
> @@ -336,9 +336,9 @@ LOAD(uniform, 1, 2, BASE, RANGE, xx, 
> NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC
>  /* src[] = { buffer_index, offset }. No const_index */
>  LOAD(ubo, 2, 0, xx, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE | 
> NIR_INTRINSIC_CAN_REORDER)
>  /* src[] = { offset }. const_index[] = { base } */
> -LOAD(input, 1, 1, BASE, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE | 
> NIR_INTRINSIC_CAN_REORDER)
> +LOAD(input, 1, 2, BASE, COMPONENT, xx, NIR_INTRINSIC_CAN_ELIMINATE | 
> NIR_INTRINSIC_CAN_REORDER)
>  /* src[] = { vertex, offset }. const_index[] = { base } */
> -LOAD(per_vertex_input, 2, 1, BASE, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE | 
> NIR_INTRINSIC_CAN_REORDER)
> +LOAD(per_vertex_input, 2, 2, BASE, COMPONENT, xx, 
> NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
>  /* src[] = { buffer_index, offset }. No const_index */
>  LOAD(ssbo, 2, 0, xx, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE)
>  /* src[] = { offset }. const_index[] = { base } */

Don't nir_intrinsic_load_output and nir_intrinsic_load_per_vertex_output
need component support too?  (These are used to read TCS outputs.)

> @@ -362,9 +362,9 @@ LOAD(push_constant, 1, 2, BASE, RANGE, xx,
> INTRINSIC(store_##name, srcs, ARR(0, 1, 1, 1), false, 0, 0, num_indices, 
> idx0, idx1, idx2, flags)
>  
>  /* src[] = { value, offset }. const_index[] = { base, write_mask } */
> -STORE(output, 2, 2, BASE, WRMASK, xx, 0)
> +STORE(output, 2, 3, BASE, WRMASK, COMPONENT, 0)
>  /* src[] = { value, vertex, offset }. const_index[] = { base, write_mask } */
> -STORE(per_vertex_output, 3, 2, BASE, WRMASK, xx, 0)
> +STORE(per_vertex_output, 3, 3, BASE, WRMASK, COMPONENT, 0)
>  /* src[] = { value, block_index, offset }. const_index[] = { write_mask } */
>  STORE(ssbo, 3, 1, WRMASK, xx, xx, 0)
>  /* src[] = { value, offset }. const_index[] = { base, write_mask } */
> diff --git a/src/compiler/nir/nir_lower_io.c b/src/compiler/nir/nir_lower_io.c
> index a839924..0d6d8e4 100644
> --- a/src/compiler/nir/nir_lower_io.c
> +++ b/src/compiler/nir/nir_lower_io.c
> @@ -274,6 +274,10 @@ nir_lower_io_block(nir_block *block,
>  
>   nir_intrinsic_set_base(load,
>  intrin->variables[0]->var->data.driver_location);
> + if (mode == nir_var_shader_in) {

If you add load_output support, then this would need to be

   if (mode == nir_var_shader_in || mode == nir_var_shader_out)

With that changed, patches 1-3 are:
Reviewed-by: Kenneth Graunke 

> +nir_intrinsic_set_component(load,
> +   intrin->variables[0]->var->data.location_frac);
> + }
>  
>   if (load->intrinsic == nir_intrinsic_load_uniform) {
>  nir_intrinsic_set_range(load,
> @@ -322,6 +326,10 @@ nir_lower_io_block(nir_block *block,
>  
>   nir_intrinsic_set_base(store,
>  intrin->variables[0]->var->data.driver_location);
> + if (mode == nir_var_shader_out) {
> +nir_intrinsic_set_component(store,
> +   intrin->variables[0]->var->data.location_frac);
> + }
>   nir_intrinsic_set_write_mask(store, 
> nir_intrinsic_write_mask(intrin));
>  
>   if (per_vertex)
> diff --git a/src/compiler/nir/nir_print.c b/src/compiler/nir/nir_print.c
> index 36176ec..bca8a35 100644
> --- a/src/compiler/nir/nir_print.c
> +++ b/src/compiler/nir/nir_print.c
> @@ -570,6 +570,7 @@ print_intrinsic_instr(nir_intrinsic_instr *instr, 
> print_state *state)
>[NIR_INTRINSIC_RANGE] = "range",
>[NIR_INTRINSIC_DESC_SET] = "desc-set",
>[NIR_INTRINSIC_BINDING] = "binding",
> +  [NIR_INTRINSIC_COMPONENT] = "component",
> };
> for (unsigned idx = 1; idx < 

Re: [Mesa-dev] [PATCH 1/4] i965: Fix issues with number of VS URB entries on Cherryview/Broxton.

2016-06-11 Thread Jordan Justen
Series Reviewed-by: Jordan Justen 

On 2016-06-10 14:19:43, Kenneth Graunke wrote:
> Cherryview/Broxton annoyingly have a minimum number of VS URB entries
> of 34, which is not a multiple of 8.  When the VS size is less than 9,
> the number of VS entries has to be a multiple of 8.
> 
> Notably, BLORP programmed the minimum number of VS URB entries (34), with
> a size of 1 (less than 9), which is invalid.
> 
> It seemed like this could be a problem in the regular URB code as well,
> so I went ahead and updated that to be safe.
> 
> Cc: "12.0" 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/gen7_blorp.c | 5 +++--
>  src/mesa/drivers/dri/i965/gen7_urb.c   | 2 ++
>  2 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/gen7_blorp.c 
> b/src/mesa/drivers/dri/i965/gen7_blorp.c
> index 270fe57..235f0b5 100644
> --- a/src/mesa/drivers/dri/i965/gen7_blorp.c
> +++ b/src/mesa/drivers/dri/i965/gen7_blorp.c
> @@ -67,8 +67,9 @@ gen7_blorp_emit_urb_config(struct brw_context *brw)
>push_constant_bytes / chunk_size_bytes;
> const unsigned vs_size = 1;
> const unsigned vs_start = push_constant_chunks;
> +   const unsigned min_vs_entries = ALIGN(brw->urb.min_vs_entries, 8);
> const unsigned vs_chunks =
> -  DIV_ROUND_UP(brw->urb.min_vs_entries * vs_size * 64, chunk_size_bytes);
> +  DIV_ROUND_UP(min_vs_entries * vs_size * 64, chunk_size_bytes);
>  
> if (gen7_blorp_skip_urb_config(brw))
>return;
> @@ -83,7 +84,7 @@ gen7_blorp_emit_urb_config(struct brw_context *brw)
>   urb_size / 2 /* fs_size */);
>  
> gen7_emit_urb_state(brw,
> -   brw->urb.min_vs_entries /* num_vs_entries */,
> +   min_vs_entries /* num_vs_entries */,
> vs_size,
> vs_start,
> 0 /* num_hs_entries */,
> diff --git a/src/mesa/drivers/dri/i965/gen7_urb.c 
> b/src/mesa/drivers/dri/i965/gen7_urb.c
> index a412a42..387ed2e 100644
> --- a/src/mesa/drivers/dri/i965/gen7_urb.c
> +++ b/src/mesa/drivers/dri/i965/gen7_urb.c
> @@ -234,6 +234,8 @@ gen7_upload_urb(struct brw_context *brw)
>  */
> unsigned vs_min_entries =
>tess_present && brw->gen == 8 ? 192 : brw->urb.min_vs_entries;
> +   /* Min VS Entries isn't a multiple of 8 on Cherryview/Broxton; round up */
> +   vs_min_entries = ALIGN(vs_min_entries, vs_granularity);
>  
> unsigned vs_chunks =
>DIV_ROUND_UP(vs_min_entries * vs_entry_size_bytes, chunk_size_bytes);
> -- 
> 2.8.3
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
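
A minimal sketch of the rounding this patch adds, for illustration only.
The helpers align_up() and div_round_up() are stand-ins for Mesa's ALIGN
and DIV_ROUND_UP macros, vs_urb_chunks() is a hypothetical name, and the
8192-byte chunk size used in the note below is an assumption rather than
a value taken from the patch.

```c
#include <assert.h>

/* Round value up to the next multiple of alignment, which must be a
 * power of two.  Stand-in for Mesa's ALIGN macro. */
static inline unsigned align_up(unsigned value, unsigned alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Integer division rounding up.  Stand-in for DIV_ROUND_UP. */
static inline unsigned div_round_up(unsigned n, unsigned d)
{
   assert(d != 0);
   return (n + d - 1) / d;
}

/* Hypothetical helper: URB chunks needed for the minimum number of VS
 * entries once that minimum has been rounded up to a multiple of 8.
 * vs_size is in 64-byte units, as in the diff above. */
static inline unsigned vs_urb_chunks(unsigned min_vs_entries,
                                     unsigned vs_size,
                                     unsigned chunk_size_bytes)
{
   unsigned entries = align_up(min_vs_entries, 8);
   return div_round_up(entries * vs_size * 64, chunk_size_bytes);
}
```

With the Cherryview/Broxton minimum of 34 entries, align_up(34, 8) gives
40, and vs_urb_chunks(34, 1, 8192) gives 1 because 40 * 1 * 64 = 2560
bytes fits in a single chunk.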


Re: [Mesa-dev] [PATCH 2/4] i965: Fix multiplication of immediates on Cherryview/Broxton.

2016-06-11 Thread Jordan Justen
On 2016-06-10 14:19:44, Kenneth Graunke wrote:
> Cherryview and Broxton don't support DW x DW multiplication.  We have
> piles of code to handle this, but apparently weren't retyping in the
> immediate case.
> 
> For example,
> tests/spec/arb_tessellation_shader/execution/dvec3-vs-tcs-tes
> makes the simulator angry about instructions such as:
> 
>mul(8) r18<1>:D r10.0<8;8,1>:D 0x0003:D
> 
> Just retype to UW.  It should be safe everywhere.
> 
> Cc: "12.0" 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 4b29ee5..13246c2 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -3564,7 +3564,10 @@ fs_visitor::lower_integer_multiplication()
> ibld.MOV(imm, inst->src[1]);
> ibld.MUL(inst->dst, imm, inst->src[0]);
>  } else {
> -   ibld.MUL(inst->dst, inst->src[0], inst->src[1]);
> +   const bool ud = (inst->src[1].type == BRW_REGISTER_TYPE_UD);
> +   ibld.MUL(inst->dst, inst->src[0],
> +ud ? brw_imm_uw(inst->src[1].ud)
> +   : brw_imm_w(inst->src[1].d));

This change looks fine, but will it actually be possible to hit this
code path for negative numbers? Above, we have:

   if (inst->src[1].file == IMM &&
   inst->src[1].ud < (1 << 16)) {

Bit 31 would be set if inst->src[1].d held a negative number, so it
would fail the ud < (1 << 16) check.

-Jordan

>  }
>   } else {
>  /* Gen < 8 (and some Gen8+ low-power parts like Cherryview) 
> cannot
> -- 
> 2.8.3
> 
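Jordan's point can be checked with a small standalone model of the
guard.  This is only a sketch: passes_imm_guard() and imm_as_uw() are
hypothetical names, and the model covers just the unsigned comparison
quoted in the review, not the rest of lower_integer_multiplication().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of the quoted guard: the immediate's 32 bits are read as an
 * unsigned value and compared against 1 << 16. */
static inline bool passes_imm_guard(int32_t d)
{
   uint32_t ud = (uint32_t)d; /* same bits, viewed as unsigned */
   return ud < (1u << 16);
}

/* Any immediate that passes the guard fits in 16 unsigned bits. */
static inline uint16_t imm_as_uw(int32_t d)
{
   assert(passes_imm_guard(d));
   return (uint16_t)d;
}
```

passes_imm_guard(-1) is false because -1 reads as 0xffffffff, while 3
and 65535 both pass, so in this model no negative immediate reaches the
retyping code.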


Re: [Mesa-dev] [PATCH] i965/fs: Fix regs_written for SIMD-lowered instructions some more.

2016-06-11 Thread Jordan Justen
Reviewed-by: Jordan Justen 

On 2016-06-10 22:39:23, Francisco Jerez wrote:
> ISTR having suggested this during review of the recent FP64 changes to
> the SIMD lowering pass, but it doesn't look like it was taken into
> account in the end.  Using the fs_reg::component_size helper instead
> of this open-coded variant makes sure that the stride is taken into
> account correctly.  Fixes at least the following piglit tests with
> spilling forced on (since otherwise regs_written would be calculated
> incorrectly and the spilling code would be rather confused about how
> much data needs to be spilled):
> 
>  spec.arb_gpu_shader_fp64.shader_storage.layout-std140-fp64-shader
>  spec.arb_gpu_shader_fp64.shader_storage.layout-std140-fp64-mixed-shader
> 
> Cc: 
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 104c20b..0347b0a 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -5261,9 +5261,9 @@ fs_visitor::lower_simd_width()
> split_inst.src[j] = emit_unzip(lbld, block, inst, j);
>  
>  split_inst.dst = emit_zip(lbld, block, inst);
> -split_inst.regs_written =
> -   DIV_ROUND_UP(type_sz(inst->dst.type) * dst_size * lower_width,
> -REG_SIZE);
> +split_inst.regs_written = DIV_ROUND_UP(
> +   split_inst.dst.component_size(lower_width) * dst_size,
> +   REG_SIZE);
>  
>  lbld.emit(split_inst);
>   }
> -- 
> 2.7.3
> 
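To illustrate why the stride matters, here is a hedged sketch comparing
the open-coded computation with a stride-aware one.  component_size()
below is a simplified model of what the fs_reg helper is described as
doing (an assumption, not a copy of Mesa's code), and GRF_REG_SIZE = 32
bytes is likewise assumed.

```c
#include <assert.h>

#define GRF_REG_SIZE 32 /* assumed bytes per register */

static inline unsigned div_round_up(unsigned n, unsigned d)
{
   assert(d != 0);
   return (n + d - 1) / d;
}

/* The variant removed by the patch: it ignores the register stride. */
static inline unsigned regs_written_open_coded(unsigned type_sz,
                                               unsigned dst_size,
                                               unsigned width)
{
   return div_round_up(type_sz * dst_size * width, GRF_REG_SIZE);
}

/* Assumed model: bytes spanned by one component of `width` channels
 * when consecutive channels are `stride` elements apart. */
static inline unsigned component_size(unsigned type_sz, unsigned stride,
                                      unsigned width)
{
   unsigned elems = width * stride;
   return (elems > 0 ? elems : 1) * type_sz;
}

/* The stride-aware variant, following the shape of the new code. */
static inline unsigned regs_written_strided(unsigned type_sz,
                                            unsigned stride,
                                            unsigned dst_size,
                                            unsigned width)
{
   return div_round_up(component_size(type_sz, stride, width) * dst_size,
                       GRF_REG_SIZE);
}
```

For a 4-byte type at width 8 with stride 2, the open-coded variant
reports 1 register while the strided one reports 2; with stride 1 the
two agree.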


Re: [Mesa-dev] [PATCH 8/8] i965: Use the correct number of threads for compute shaders.

2016-06-11 Thread Jordan Justen
Series Reviewed-by: Jordan Justen 

On 2016-06-10 13:05:20, Kenneth Graunke wrote:
> We were programming the number of threads per subslice, when we should
> have been programming the total number of threads on the GPU as a whole.
> 
> Thanks to Curro and Jordan for helping track this down!
> 
> On Skylake GT3e:
> - Improves performance in Unreal's Elemental Demo by roughly 1.5-1.7x.
> - Improves performance in Synmark's Gl43CSDof by roughly 3.7x.
> - Improves performance in Synmark's Gl43GSCloth by roughly 1.18x.
> 
> On Broadwell GT2:
> - Improves performance in Unreal's Elemental Demo by roughly 1.2-1.5x.
> - Improves performance in Synmark's Gl43CSDof by roughly 2.0x.
> - Improves performance in Synmark's Gl43GSCloth by 1.47035% +/-
>   0.255654% (n=25).
> 
> On Haswell GT3e:
> - Improves performance in Unreal's Elemental Demo (in GL 4.3 mode)
>   by roughly 1.10x.
> - Improves performance in Synmark's Gl43CSDof by roughly 1.18x.
> - Decreases performance in Synmark's Gl43CSCloth by -1.99484% +/-
>   0.432771% (n=64).
> 
> On Ivybridge GT2:
> - Improves performance in Unreal's Elemental Demo (in GL 4.2 mode)
>   by roughly 1.03x.
> - Improves performance in Synmark's Gl43CSDof by roughly 1.25x.
> - No change in Synmark's Gl43CSCloth (n=28).
> 
> Cc: "12.0" 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/gen7_cs_state.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/gen7_cs_state.c 
> b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> index 9d83837..ba558a6 100644
> --- a/src/mesa/drivers/dri/i965/gen7_cs_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> @@ -95,7 +95,9 @@ brw_upload_cs_state(struct brw_context *brw)
> const uint32_t vfe_num_urb_entries = brw->gen >= 8 ? 2 : 0;
> const uint32_t vfe_gpgpu_mode =
>brw->gen == 7 ? SET_FIELD(1, GEN7_MEDIA_VFE_STATE_GPGPU_MODE) : 0;
> -   OUT_BATCH(SET_FIELD(brw->max_cs_threads - 1, MEDIA_VFE_STATE_MAX_THREADS) 
> |
> +   const uint32_t subslices = MAX2(brw->intelScreen->subslice_total, 1);
> +   OUT_BATCH(SET_FIELD(brw->max_cs_threads * subslices - 1,
> +   MEDIA_VFE_STATE_MAX_THREADS) |
>   SET_FIELD(vfe_num_urb_entries, MEDIA_VFE_STATE_URB_ENTRIES) |
>   SET_FIELD(1, MEDIA_VFE_STATE_RESET_GTW_TIMER) |
>   SET_FIELD(1, MEDIA_VFE_STATE_BYPASS_GTW) |
> -- 
> 2.8.3
> 
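The arithmetic behind the fix is small enough to sketch on its own.
vfe_max_threads_field() is a hypothetical name, and the thread and
subslice counts in the note below are made-up illustration values, not
figures for any particular GPU.

```c
#include <assert.h>
#include <stdint.h>

/* Value programmed into the max-threads field after the patch: total
 * threads across all subslices, minus one.  A reported subslice count
 * of zero is treated as one, mirroring MAX2(subslice_total, 1). */
static inline uint32_t vfe_max_threads_field(uint32_t max_cs_threads,
                                             uint32_t subslice_total)
{
   assert(max_cs_threads > 0);
   uint32_t subslices = subslice_total > 1 ? subslice_total : 1;
   return max_cs_threads * subslices - 1;
}
```

With 56 threads per subslice and 3 subslices this yields 167, where the
old per-subslice computation would have programmed 55.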


[Mesa-dev] [PATCH 2/3] i965: Keep track of the per-thread scratch allocation in brw_stage_state.

2016-06-11 Thread Francisco Jerez
This will be used to find out what per-thread slot size a previously
allocated scratch BO was used with in order to fix a hardware race
condition without introducing additional stalls or memory allocations.
Instead of calling brw_get_scratch_bo() manually from the various
codegen functions, call a new helper function that keeps track of the
per-thread scratch size and conditionally allocates a larger scratch
BO.
---
This patch and the next one apply on top of Ken's compute shader
scratch fixes from:
 https://lists.freedesktop.org/archives/mesa-dev/2016-June/120084.html
 
 src/mesa/drivers/dri/i965/brw_context.h | 10 +++
 src/mesa/drivers/dri/i965/brw_cs.c  | 48 -
 src/mesa/drivers/dri/i965/brw_gs.c  |  8 +++---
 src/mesa/drivers/dri/i965/brw_program.c | 17 
 src/mesa/drivers/dri/i965/brw_tcs.c |  8 +++---
 src/mesa/drivers/dri/i965/brw_tes.c |  8 +++---
 src/mesa/drivers/dri/i965/brw_vs.c  |  8 +++---
 src/mesa/drivers/dri/i965/brw_wm.c  |  7 +++--
 8 files changed, 65 insertions(+), 49 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index daa9ed2..9618b4a 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -677,6 +677,12 @@ struct brw_stage_state
 */
drm_intel_bo *scratch_bo;
 
+   /**
+* Scratch slot size allocated for each thread in the buffer object given
+* by \c scratch_bo.
+*/
+   uint32_t per_thread_scratch;
+
/** Offset in the program cache to the program */
uint32_t prog_offset;
 
@@ -1481,6 +1487,10 @@ brw_get_scratch_size(int size)
 }
 void brw_get_scratch_bo(struct brw_context *brw,
drm_intel_bo **scratch_bo, int size);
+void brw_alloc_stage_scratch(struct brw_context *brw,
+ struct brw_stage_state *stage_state,
+ unsigned per_thread_size,
+ unsigned thread_count);
 void brw_init_shader_time(struct brw_context *brw);
 int brw_get_shader_time_index(struct brw_context *brw,
   struct gl_shader_program *shader_prog,
diff --git a/src/mesa/drivers/dri/i965/brw_cs.c 
b/src/mesa/drivers/dri/i965/brw_cs.c
index 329adff..5c89d42 100644
--- a/src/mesa/drivers/dri/i965/brw_cs.c
+++ b/src/mesa/drivers/dri/i965/brw_cs.c
@@ -148,31 +148,29 @@ brw_codegen_cs_prog(struct brw_context *brw,
   }
}
 
-   if (prog_data.base.total_scratch) {
-  const unsigned subslices = MAX2(brw->intelScreen->subslice_total, 1);
-
-  /* WaCSScratchSize:hsw
-   *
-   * Haswell's scratch space address calculation appears to be sparse
-   * rather than tightly packed.  The Thread ID has bits indicating
-   * which subslice, EU within a subslice, and thread within an EU
-   * it is.  There's a maximum of two slices and two subslices, so these
-   * can be stored with a single bit.  Even though there are only 10 EUs
-   * per subslice, this is stored in 4 bits, so there's an effective
-   * maximum value of 16 EUs.  Similarly, although there are only 7
-   * threads per EU, this is stored in a 3 bit number, giving an effective
-   * maximum value of 8 threads per EU.
-   *
-   * This means that we need to use 16 * 8 instead of 10 * 7 for the
-   * number of threads per subslice.
-   */
-  const unsigned threads_per_subslice =
- brw->is_haswell ? 16 * 8 : brw->max_cs_threads;
-
-  brw_get_scratch_bo(brw, &brw->cs.base.scratch_bo,
- prog_data.base.total_scratch *
- threads_per_subslice * subslices);
-   }
+   const unsigned subslices = MAX2(brw->intelScreen->subslice_total, 1);
+
+   /* WaCSScratchSize:hsw
+*
+* Haswell's scratch space address calculation appears to be sparse
+* rather than tightly packed.  The Thread ID has bits indicating
+* which subslice, EU within a subslice, and thread within an EU
+* it is.  There's a maximum of two slices and two subslices, so these
+* can be stored with a single bit.  Even though there are only 10 EUs
+* per subslice, this is stored in 4 bits, so there's an effective
+* maximum value of 16 EUs.  Similarly, although there are only 7
+* threads per EU, this is stored in a 3 bit number, giving an effective
+* maximum value of 8 threads per EU.
+*
+* This means that we need to use 16 * 8 instead of 10 * 7 for the
+* number of threads per subslice.
+*/
+   const unsigned threads_per_subslice =
+  brw->is_haswell ? 16 * 8 : brw->max_cs_threads;
+
+   brw_alloc_stage_scratch(brw, &brw->cs.base,
+   prog_data.base.total_scratch,
+   threads_per_subslice * subslices);
 
if (unlikely(INTEL_DEBUG & DEBUG_CS))
   fprintf(stderr, "\n");
diff --git a/src/mesa/drivers/dri/i965/brw_gs.c 
b/src/mesa/drivers/dri/i965/brw_gs.c
index 
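
The rest of this patch is cut off in the archive, so the helper's body
is not shown here.  As a rough model of the behaviour the commit message
describes (track the per-thread size and allocate a larger buffer only
when needed), something like the following could work.  All names are
hypothetical and malloc() stands in for the BO allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the scratch-related part of a stage state. */
struct stage_scratch {
   void *bo;          /* current scratch allocation, or NULL */
   size_t per_thread; /* slot size the allocation was made for */
};

/* Grow-only policy: reallocate only if the requested per-thread size
 * exceeds the recorded one.  Returns 1 if a new buffer was allocated,
 * 0 if the existing one was kept, and -1 on allocation failure. */
static inline int alloc_stage_scratch(struct stage_scratch *s,
                                      size_t per_thread_size,
                                      size_t thread_count)
{
   assert(s != NULL);
   if (s->per_thread >= per_thread_size)
      return 0;

   void *bo = malloc(per_thread_size * thread_count);
   if (bo == NULL)
      return -1;

   free(s->bo);
   s->bo = bo;
   s->per_thread = per_thread_size;
   return 1;
}
```

In this model a request for a smaller slot size than the recorded one
keeps both the buffer and the recorded size, so later state setup can
keep using the size the buffer was originally partitioned with.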

[Mesa-dev] [PATCH 1/3] i965: Fix scratch overallocation if the original slot size was already a power of two.

2016-06-11 Thread Francisco Jerez
The bitwise arithmetic trick used in brw_get_scratch_size() to clamp
the scratch allocation to 1KB has the unintended side effect that it
will cause us to allocate 2x the required amount of scratch space if
the original per-thread scratch size happened to be already a power of
two.  Instead use the obvious MAX2 idiom to clamp the scratch
allocation to the expected range.
---
 src/mesa/drivers/dri/i965/brw_context.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 4b22201..daa9ed2 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1477,7 +1477,7 @@ void brwInitFragProgFuncs( struct dd_function_table 
*functions );
 static inline int
 brw_get_scratch_size(int size)
 {
-   return util_next_power_of_two(size | 1023);
+   return MAX2(1024, util_next_power_of_two(size));
 }
 void brw_get_scratch_bo(struct brw_context *brw,
drm_intel_bo **scratch_bo, int size);
-- 
2.7.3
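
The overallocation is easy to reproduce outside the driver.  The sketch
below uses a simple loop as a stand-in for util_next_power_of_two();
scratch_size_old() and scratch_size_new() are hypothetical names for the
two expressions in the diff.

```c
#include <assert.h>

/* Smallest power of two >= x (returns 1 for x == 0).  Stand-in for
 * util_next_power_of_two(); only valid up to 2^31. */
static inline unsigned next_power_of_two(unsigned x)
{
   assert(x <= (1u << 31));
   unsigned p = 1;
   while (p < x)
      p <<= 1;
   return p;
}

/* Expression before the patch. */
static inline unsigned scratch_size_old(unsigned size)
{
   return next_power_of_two(size | 1023);
}

/* Expression after the patch: MAX2(1024, next power of two). */
static inline unsigned scratch_size_new(unsigned size)
{
   unsigned p = next_power_of_two(size);
   return p > 1024 ? p : 1024;
}
```

For size = 2048, the old expression computes 2048 | 1023 = 3071 and
rounds up to 4096, while the new one returns 2048.  For sizes that are
not powers of two, such as 1500, both return 2048.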



[Mesa-dev] [PATCH 3/3] i965: Fix cross-primitive scratch corruption when changing the per-thread allocation.

2016-06-11 Thread Francisco Jerez
I haven't found any mention of this in the hardware docs, but
experimentally what seems to be going on is that when the per-thread
scratch slot size is changed between two pipelined draw calls, shader
invocations using the old and new scratch size setting may end up
being executed in parallel, causing their scratch offset calculations
to be based on a different partitioning of the scratch space, which
can cause their thread-local scratch space to overlap, leading to
cross-thread scratch corruption.

I've been experimenting with alternative workarounds, like emitting a
PIPE_CONTROL with DC flush and CS stall between draw (or dispatch
compute) calls using different per-thread scratch allocation settings,
or avoiding reuse of the scratch BO if the per-thread scratch
allocation doesn't exactly match the original.  Both seem to be as
effective as this workaround, but they have potential performance
implications, while this should be basically for free.

Fixes over 40 failures in our CI system with spilling forced on
(including CTS, dEQP and Piglit failures) on a number of different
platforms from Gen4 to Gen9.  The 'glsl-max-varyings' piglit test
seems to be able to reproduce this bug consistently in the vertex
shader on at least Gen4, Gen8 and Gen9 with spilling forced on.
---
 src/mesa/drivers/dri/i965/brw_context.h   | 13 +
 src/mesa/drivers/dri/i965/brw_vs_state.c  |  2 +-
 src/mesa/drivers/dri/i965/brw_wm_state.c  |  2 +-
 src/mesa/drivers/dri/i965/gen6_gs_state.c |  2 +-
 src/mesa/drivers/dri/i965/gen6_vs_state.c |  2 +-
 src/mesa/drivers/dri/i965/gen6_wm_state.c |  2 +-
 src/mesa/drivers/dri/i965/gen7_cs_state.c |  6 +++---
 src/mesa/drivers/dri/i965/gen7_ds_state.c |  2 +-
 src/mesa/drivers/dri/i965/gen7_gs_state.c |  2 +-
 src/mesa/drivers/dri/i965/gen7_hs_state.c |  2 +-
 src/mesa/drivers/dri/i965/gen7_vs_state.c |  2 +-
 src/mesa/drivers/dri/i965/gen7_wm_state.c |  2 +-
 src/mesa/drivers/dri/i965/gen8_ds_state.c |  2 +-
 src/mesa/drivers/dri/i965/gen8_gs_state.c |  2 +-
 src/mesa/drivers/dri/i965/gen8_hs_state.c |  2 +-
 src/mesa/drivers/dri/i965/gen8_ps_state.c |  2 +-
 src/mesa/drivers/dri/i965/gen8_vs_state.c |  2 +-
 17 files changed, 31 insertions(+), 18 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 9618b4a..6e84506 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -674,6 +674,19 @@ struct brw_stage_state
/**
 * Optional scratch buffer used to store spilled register values and
 * variably-indexed GRF arrays.
+*
+* The contents of this buffer are short-lived so the same memory can be
+* re-used at will for multiple shader programs (executed by the same fixed
+* function).  However reusing a scratch BO for which shader invocations
+* are still in flight with a per-thread scratch slot size other than the
+* original can cause threads with different scratch slot size and FFTID
+* (which may be executed in parallel depending on the shader stage and
+* hardware generation) to map to an overlapping region of the scratch
+* space, which can potentially lead to mutual scratch space corruption.
+* For that reason if you borrow this scratch buffer you should only be
+* using the slot size given by the \c per_thread_scratch member below,
+* unless you're taking additional measures to synchronize thread execution
+* across slot size changes.
 */
drm_intel_bo *scratch_bo;
 
diff --git a/src/mesa/drivers/dri/i965/brw_vs_state.c 
b/src/mesa/drivers/dri/i965/brw_vs_state.c
index c728f09..331949a 100644
--- a/src/mesa/drivers/dri/i965/brw_vs_state.c
+++ b/src/mesa/drivers/dri/i965/brw_vs_state.c
@@ -83,7 +83,7 @@ brw_upload_vs_unit(struct brw_context *brw)
   vs->thread2.scratch_space_base_pointer =
 stage_state->scratch_bo->offset64 >> 10; /* reloc */
   vs->thread2.per_thread_scratch_space =
-ffs(brw->vs.prog_data->base.base.total_scratch) - 11;
+ffs(stage_state->per_thread_scratch) - 11;
} else {
   vs->thread2.scratch_space_base_pointer = 0;
   vs->thread2.per_thread_scratch_space = 0;
diff --git a/src/mesa/drivers/dri/i965/brw_wm_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_state.c
index bf1bdc9..dda4f23 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_state.c
@@ -133,7 +133,7 @@ brw_upload_wm_unit(struct brw_context *brw)
   wm->thread2.scratch_space_base_pointer =
 brw->wm.base.scratch_bo->offset64 >> 10; /* reloc */
   wm->thread2.per_thread_scratch_space =
-ffs(prog_data->base.total_scratch) - 11;
+ffs(brw->wm.base.per_thread_scratch) - 11;
} else {
   wm->thread2.scratch_space_base_pointer = 0;
   wm->thread2.per_thread_scratch_space = 0;
diff --git a/src/mesa/drivers/dri/i965/gen6_gs_state.c 
b/src/mesa/drivers/dri/i965/gen6_gs_state.c
index 4e4b9463..da7322e 100644
--- 
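
The overlap described in the commit message can be sketched with a toy
model.  It assumes each thread's scratch slot starts at
FFTID * per-thread slot size; the struct and function names are
hypothetical, and the sizes in the note below are illustration values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Half-open byte range [start, end) within the scratch buffer. */
struct scratch_range {
   uint32_t start;
   uint32_t end;
};

/* Assumed address calculation: slot N starts at N * slot_size. */
static inline struct scratch_range thread_scratch(uint32_t fftid,
                                                  uint32_t slot_size)
{
   assert(slot_size > 0);
   struct scratch_range r = { fftid * slot_size,
                              (fftid + 1) * slot_size };
   return r;
}

static inline bool ranges_overlap(struct scratch_range a,
                                  struct scratch_range b)
{
   return a.start < b.end && b.start < a.end;
}
```

A thread with FFTID 1 and a 2048-byte slot covers [2048, 4096), and a
thread with FFTID 2 and a 1024-byte slot covers [2048, 3072), so the two
collide.  With the same slot size for both, distinct FFTIDs never
overlap in this model.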

Re: [Mesa-dev] [PATCH 4/8] i965: Account for poor address calculations in Haswell CS scratch size.

2016-06-11 Thread Jordan Justen
On 2016-06-10 13:05:16, Kenneth Graunke wrote:
> Curro figured this out by investigating the simulator.  Apparently
> there's also a workaround in the Windows driver.  I'm not sure it's
> actually documented anywhere.
> 
> We were underallocating the scratch buffer by a factor of 128/70.
> 
> Cc: "12.0" 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_cs.c | 21 -
>  1 file changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_cs.c 
> b/src/mesa/drivers/dri/i965/brw_cs.c
> index c8598d6..329adff 100644
> --- a/src/mesa/drivers/dri/i965/brw_cs.c
> +++ b/src/mesa/drivers/dri/i965/brw_cs.c
> @@ -150,9 +150,28 @@ brw_codegen_cs_prog(struct brw_context *brw,
>  
> if (prog_data.base.total_scratch) {
>const unsigned subslices = MAX2(brw->intelScreen->subslice_total, 1);
> +
> +  /* WaCSScratchSize:hsw
> +   *
> +   * Haswell's scratch space address calculation appears to be sparse
> +   * rather than tightly packed.  The Thread ID has bits indicating
> +   * which subslice, EU within a subslice, and thread within an EU
> +   * it is.  There's a maximum of two slices and two subslices, so these
> +   * can be stored with a single bit.  Even though there are only 10 EUs
> +   * per subslice, this is stored in 4 bits, so there's an effective
> +   * maximum value of 16 EUs.  Similarly, although there are only 7
> +   * threads per EU, this is stored in a 3 bit number, giving an 
> effective
> +   * maximum value of 8 threads per EU.
> +   *
> +   * This means that we need to use 16 * 8 instead of 10 * 7 for the
> +   * number of threads per subslice.
> +   */
> +  const unsigned threads_per_subslice =

How about naming the variable something like scratch_ids_per_subslice?

-Jordan

> + brw->is_haswell ? 16 * 8 : brw->max_cs_threads;
> +
>brw_get_scratch_bo(brw, &brw->cs.base.scratch_bo,
>   prog_data.base.total_scratch *
> - brw->max_cs_threads * subslices);
> + threads_per_subslice * subslices);
> }
>  
> if (unlikely(INTEL_DEBUG & DEBUG_CS))
> -- 
> 2.8.3
> 
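A small sketch of why 10 * 7 slots are not enough when the IDs are
sparse.  The bit layout used here (EU index in the upper 4 bits, thread
index in the lower 3) is an assumption made for illustration; the
comment in the patch only gives the field widths.  Function names are
hypothetical.

```c
#include <assert.h>

/* Assumed per-subslice ID layout: 4-bit EU index above a 3-bit thread
 * index. */
static inline unsigned sparse_thread_id(unsigned eu, unsigned thread)
{
   assert(eu < 16 && thread < 8);
   return (eu << 3) | thread;
}

/* Largest ID produced by any (EU, thread) pair on a subslice with the
 * given number of EUs and threads per EU. */
static inline unsigned max_sparse_thread_id(unsigned eus,
                                            unsigned threads_per_eu)
{
   unsigned max_id = 0;
   for (unsigned eu = 0; eu < eus; eu++) {
      for (unsigned t = 0; t < threads_per_eu; t++) {
         unsigned id = sparse_thread_id(eu, t);
         if (id > max_id)
            max_id = id;
      }
   }
   return max_id;
}
```

With 10 EUs and 7 threads per EU the largest ID under this layout is 78,
which already exceeds a 70-slot buffer, while sizing for 16 * 8 = 128
slots covers every possible ID.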


Re: [Mesa-dev] [PATCH] i965/compiler: Bring back the INTEL_PRECISE_TRIG environment variable

2016-06-11 Thread Kenneth Graunke
On Saturday, June 11, 2016 1:22:00 PM PDT Jason Ekstrand wrote:
> This was removed in d9546b0c5d and replaced with the precise_trig driconf
> option.  However, we still need precise trig in the Vulkan driver so this
> commit brings back the environment variable and compiler->precise_trig is
> effectively the logical OR of the two.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96484
> Cc: "12.0" 
> Cc: Mark Janes 
> ---
>  src/mesa/drivers/dri/i965/brw_compiler.c | 2 ++
>  src/mesa/drivers/dri/i965/brw_context.c  | 4 ++--
>  2 files changed, 4 insertions(+), 2 deletions(-)

Reviewed-by: Kenneth Graunke 




[Mesa-dev] [PATCH] i965/compiler: Bring back the INTEL_PRECISE_TRIG environment variable

2016-06-11 Thread Jason Ekstrand
This was removed in d9546b0c5d and replaced with the precise_trig driconf
option.  However, we still need precise trig in the Vulkan driver so this
commit brings back the environment variable and compiler->precise_trig is
effectively the logical OR of the two.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96484
Cc: "12.0" 
Cc: Mark Janes 
---
 src/mesa/drivers/dri/i965/brw_compiler.c | 2 ++
 src/mesa/drivers/dri/i965/brw_context.c  | 4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_compiler.c 
b/src/mesa/drivers/dri/i965/brw_compiler.c
index 9eda3fc..a4855a0 100644
--- a/src/mesa/drivers/dri/i965/brw_compiler.c
+++ b/src/mesa/drivers/dri/i965/brw_compiler.c
@@ -103,6 +103,8 @@ brw_compiler_create(void *mem_ctx, const struct 
brw_device_info *devinfo)
brw_fs_alloc_reg_sets(compiler);
brw_vec4_alloc_reg_set(compiler);
 
+   compiler->precise_trig = env_var_as_boolean("INTEL_PRECISE_TRIG", false);
+
compiler->scalar_stage[MESA_SHADER_VERTEX] =
   devinfo->gen >= 8 && !(INTEL_DEBUG & DEBUG_VEC4VS);
compiler->scalar_stage[MESA_SHADER_TESS_CTRL] =
diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 7bbc128..a0ee0e6 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -803,8 +803,8 @@ brw_process_driconf_options(struct brw_context *brw)
 
   brw->precompile = driQueryOptionb(&brw->optionCache, "shader_precompile");
 
-   brw->intelScreen->compiler->precise_trig =
-  driQueryOptionb(&brw->optionCache, "precise_trig");
+   if (driQueryOptionb(&brw->optionCache, "precise_trig"))
+  brw->intelScreen->compiler->precise_trig = true;
 
ctx->Const.ForceGLSLExtensionsWarn =
   driQueryOptionb(options, "force_glsl_extensions_warn");
-- 
2.5.0.400.gff86faf
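
The "logical OR of the two" behaviour can be modelled in isolation.
This is a sketch under assumptions: parse_bool_option() is a simplified,
hypothetical stand-in for env_var_as_boolean() that takes the variable's
value as a string, and the set of accepted spellings is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified boolean parser: NULL or unrecognised text yields the
 * default. */
static inline bool parse_bool_option(const char *str, bool default_value)
{
   if (str == NULL)
      return default_value;
   if (!strcmp(str, "1") || !strcmp(str, "true") || !strcmp(str, "yes"))
      return true;
   if (!strcmp(str, "0") || !strcmp(str, "false") || !strcmp(str, "no"))
      return false;
   return default_value;
}

/* The environment value is read first; the driconf option can only
 * turn the flag on afterwards, never clear it. */
static inline bool precise_trig_enabled(const char *env_value,
                                        bool driconf_value)
{
   bool precise = parse_bool_option(env_value, false);
   if (driconf_value)
      precise = true;
   return precise;
}

/* The four combinations of environment variable and driconf option. */
static inline void precise_trig_examples(void)
{
   assert(!precise_trig_enabled(NULL, false));
   assert(precise_trig_enabled("1", false));
   assert(precise_trig_enabled(NULL, true));
   assert(precise_trig_enabled("0", true));
}
```

The last case shows the design choice in the diff: because the driconf
path only ever sets the flag to true, either source alone is enough to
enable precise trig.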



Re: [Mesa-dev] [Mesa-stable] [PATCH 7/8] i965: Assert that the scratch spaces are in range.

2016-06-11 Thread Francisco Jerez
Kenneth Graunke  writes:

> I don't know that anything actually guarantees this, but if we exceed
> the limits, we may end up overflowing and trashing random buffers that
> happen to be nearby in the VMA space, leading to rendering corruption,
> hangs, or worse.
>
> We should really fix this properly.  However, the pitfall has existed
> for ages, so for now we should at least detect it.
>
> Cc: "12.0" 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 14 ++
>  1 file changed, 14 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index f1a1c87..104c20b 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -6001,7 +6001,21 @@ fs_visitor::allocate_registers(bool allow_spilling)
>* size linearly with a range of [1kB, 12kB] and 1kB granularity.
>*/
>   prog_data->total_scratch = ALIGN(last_scratch, 1024);
> +
> + assert(prog_data->total_scratch < 12 * 1024);

Looks like the following CTS and Piglit test cases hit this assertion
when run with INTEL_DEBUG=spill_fs:

  piglit.es31-cts.layout_binding.image2d_layout_binding_imageload_computeshader
  piglit.es31-cts.shader_storage_buffer_object.basic-stdlayout_ubo_ssbo-case2-cs
  piglit.spec.arb_compute_shader.linker.bug-93840

We can probably fix it later on; having a test case that uses more than
12 KB of scratch from the compute shader will likely make that easier.
Series is:

Reviewed-by: Francisco Jerez 
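
For illustration only, a minimal sketch of the range check discussed above.
`align_pot` is a hypothetical stand-in for Mesa's `ALIGN()` macro and
`scratch_in_range` is an invented name; neither is taken from the patch. The
sketch rounds the scratch size up to 1kB granularity and compares it against a
limit with the same strict `<` the quoted assertion uses, which means a total
of exactly 12kB already counts as out of range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for Mesa's ALIGN(): round `value` up to a multiple
 * of `alignment`, which must be a non-zero power of two. */
static uint32_t align_pot(uint32_t value, uint32_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Round the scratch size up to 1kB granularity, store it in *total_scratch,
 * and report whether it is strictly below `limit` (in bytes). */
static bool scratch_in_range(uint32_t last_scratch, uint32_t limit,
                             uint32_t *total_scratch)
{
   *total_scratch = align_pot(last_scratch, 1024);
   return *total_scratch < limit;
}
```

With a 12 * 1024 limit, a shader needing 11kB of scratch passes, while one
needing a single byte more is rounded up to 12kB and fails the check.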

>}
> +
> +  /* We currently only support up to 2MB of scratch space.  If we
> +   * need to support more eventually, the documentation suggests
> +   * that we could allocate a larger buffer, and partition it out
> +   * ourselves.  We'd just have to undo the hardware's address
> +   * calculation by subtracting (FFTID * Per Thread Scratch Space)
> +   * and then add FFTID * (Larger Per Thread Scratch Space).
> +   *
> +   * See 3D-Media-GPGPU Engine > Media GPGPU Pipeline >
> +   * Thread Group Tracking > Local Memory/Scratch Space.
> +   */
> +  assert(prog_data->total_scratch < 2 * 1024 * 1024);
> }
>  }
>  
> -- 
> 2.8.3
>
> ___
> mesa-stable mailing list
> mesa-sta...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-stable




[Mesa-dev] [RFC 3/3] u_vbuf: use single vertex buffer if needed

2016-06-11 Thread Christian Gmeiner
From: "Wladimir J. van der Laan" 

Put CONST, VERTEX and INSTANCE attributes into one vertex buffer if
necessary due to hardware constraints.

Signed-off-by: Wladimir J. van der Laan 
---
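
A standalone sketch of the slot assignment this patch describes, under stated
assumptions: the `VB_*` values, `NO_SLOT`, `lowest_set_bit` and
`assign_fallback_slots` are illustrative names for this note, not the u_vbuf
code itself. Each attribute type first tries to claim its own free slot; if
the free slots run out, everything is folded into the per-vertex buffer on the
first free slot. Unlike the patch, the sketch also clears slots that were
tentatively assigned before the shortage was detected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the attribute types used in u_vbuf.c. */
enum { VB_VERTEX = 0, VB_INSTANCE = 1, VB_CONST = 2, VB_NUM = 3 };

#define NO_SLOT (~0u)   /* hypothetical marker for "no slot assigned" */

/* Index of the lowest set bit; `mask` must be non-zero. */
static unsigned lowest_set_bit(uint32_t mask)
{
   unsigned i = 0;
   assert(mask != 0);
   while (!(mask & 1u)) {
      mask >>= 1;
      i++;
   }
   return i;
}

/* Assign one free slot per used type. If there are too few free slots,
 * merge all types into the per-vertex buffer. Returns false only when no
 * slot is free at all. */
static bool assign_fallback_slots(uint32_t unused_vb_mask,
                                  unsigned mask[VB_NUM],
                                  unsigned fallback_vbs[VB_NUM])
{
   uint32_t remaining = unused_vb_mask;
   bool insufficient = false;
   unsigned type;

   if (!unused_vb_mask)
      return false;

   for (type = 0; type < VB_NUM; type++)
      fallback_vbs[type] = NO_SLOT;

   for (type = 0; type < VB_NUM; type++) {
      if (!mask[type])
         continue;
      if (!remaining) {
         insufficient = true;
         break;
      }
      fallback_vbs[type] = lowest_set_bit(remaining);
      remaining &= ~(UINT32_C(1) << fallback_vbs[type]);
   }

   if (insufficient) {
      /* Share one buffer and use per-vertex frequency for everything. */
      for (type = 0; type < VB_NUM; type++)
         fallback_vbs[type] = NO_SLOT;
      fallback_vbs[VB_VERTEX] = lowest_set_bit(unused_vb_mask);
      mask[VB_VERTEX] |= mask[VB_CONST] | mask[VB_INSTANCE];
      mask[VB_CONST] = 0;
      mask[VB_INSTANCE] = 0;
   }
   return true;
}
```

With a single free slot and all three types in use, only `VB_VERTEX` ends up
with a slot and its mask becomes the union of the three input masks.
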
 src/gallium/auxiliary/util/u_vbuf.c | 28 
 1 file changed, 24 insertions(+), 4 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_vbuf.c 
b/src/gallium/auxiliary/util/u_vbuf.c
index 464c279..d35f3b0 100644
--- a/src/gallium/auxiliary/util/u_vbuf.c
+++ b/src/gallium/auxiliary/util/u_vbuf.c
@@ -539,25 +539,45 @@ u_vbuf_translate_find_free_vb_slots(struct u_vbuf *mgr,
uint32_t unused_vb_mask =
   (mgr->ve->incompatible_vb_mask_all | mgr->incompatible_vb_mask |
   ~mgr->enabled_vb_mask | extra_free_vb_mask) & mgr->allowed_vb_mask;
+   uint32_t unused_vb_mask_temp;
+   boolean insufficient_buffers = false;
+
+   /* No vertex buffers available at all */
+   if(!unused_vb_mask)
+  return FALSE;
 
memset(fallback_vbs, ~0, sizeof(fallback_vbs));
 
/* Find free slots for each type if needed. */
+   unused_vb_mask_temp = unused_vb_mask;
for (type = 0; type < VB_NUM; type++) {
   if (mask[type]) {
  uint32_t index;
 
- if (!unused_vb_mask) {
-return FALSE;
+ if (!unused_vb_mask_temp) {
+insufficient_buffers = TRUE;
+break;
  }
 
- index = ffs(unused_vb_mask) - 1;
+ index = ffs(unused_vb_mask_temp) - 1;
  fallback_vbs[type] = index;
- unused_vb_mask &= ~(1 << index);
+ unused_vb_mask_temp &= ~(1 << index);
  /*printf("found slot=%i for type=%i\n", index, type);*/
   }
}
 
+   if (insufficient_buffers) {
+  /* not enough vbs for all types supported by the hardware, they will have to
+   * share one buffer */
+  uint32_t index = ffs(unused_vb_mask) - 1;
+
+  /* When sharing one vertex buffer use per-vertex frequency for everything. */
+  fallback_vbs[VB_VERTEX] = index;
+  mask[VB_VERTEX] = mask[VB_VERTEX] | mask[VB_CONST] | mask[VB_INSTANCE];
+  mask[VB_CONST] = 0;
+  mask[VB_INSTANCE] = 0;
+   }
+
for (type = 0; type < VB_NUM; type++) {
   if (mask[type]) {
  mgr->dirty_real_vb_mask |= 1 << fallback_vbs[type];
-- 
2.5.5



[Mesa-dev] [RFC 2/3] u_vbuf: add logic to use a limited number of vbufs

2016-06-11 Thread Christian Gmeiner
From: "Wladimir J. van der Laan" 

Make it possible to limit the number of vertex buffers, as there exist
GPUs with fewer than 32 supported vertex buffers.

Signed-off-by: Wladimir J. van der Laan 
---
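
A small sketch of the mask arithmetic involved, for illustration.
`vb_mask_for_count` and `unused_vb_slots` are hypothetical helper names, not
functions from u_vbuf. The first builds a mask with the low `count` bits set.
In C, shifting a 32-bit value by 32 is undefined, so the sketch handles a
count of 32 separately. The second mirrors the expression the patch uses to
compute the free slots, restricted to the slots the hardware allows.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bitmask with the low `count` bits set.
 * count == 32 is special-cased because a 32-bit shift by 32 is undefined. */
static uint32_t vb_mask_for_count(unsigned count)
{
   assert(count <= 32);
   return count == 32 ? UINT32_MAX : (UINT32_C(1) << count) - 1;
}

/* Slots that are incompatible, not enabled, or declared free by the caller,
 * limited to the slots the hardware supports. */
static uint32_t unused_vb_slots(uint32_t incompatible, uint32_t enabled,
                                uint32_t extra_free, uint32_t allowed)
{
   return (incompatible | ~enabled | extra_free) & allowed;
}
```

For a GPU with 4 vertex buffers where slots 0 and 1 are enabled and slot 0 is
incompatible, the free slots come out as 0, 2 and 3.
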
 src/gallium/auxiliary/util/u_vbuf.c | 45 +++--
 src/gallium/auxiliary/util/u_vbuf.h |  3 +++
 2 files changed, 41 insertions(+), 7 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_vbuf.c 
b/src/gallium/auxiliary/util/u_vbuf.c
index 5b4e527..464c279 100644
--- a/src/gallium/auxiliary/util/u_vbuf.c
+++ b/src/gallium/auxiliary/util/u_vbuf.c
@@ -184,6 +184,8 @@ struct u_vbuf {
uint32_t incompatible_vb_mask; /* each bit describes a corresp. buffer */
/* Which buffer has a non-zero stride. */
uint32_t nonzero_stride_vb_mask; /* each bit describes a corresp. buffer */
+   /* Which buffers are allowed (supported by hardware). */
+   uint32_t allowed_vb_mask;
 };
 
 static void *
@@ -291,10 +293,14 @@ boolean u_vbuf_get_caps(struct pipe_screen *screen, 
struct u_vbuf_caps *caps)
caps->user_vertex_buffers =
   screen->get_param(screen, PIPE_CAP_USER_VERTEX_BUFFERS);
 
+   caps->max_vertex_buffers =
+  screen->get_param(screen, PIPE_CAP_MAX_VERTEX_BUFFERS);
+
if (!caps->buffer_offset_unaligned ||
!caps->buffer_stride_unaligned ||
!caps->velem_src_offset_unaligned ||
-   !caps->user_vertex_buffers) {
+   !caps->user_vertex_buffers ||
+   !caps->max_vertex_buffers) {
   fallback = TRUE;
}
 
@@ -313,6 +319,7 @@ u_vbuf_create(struct pipe_context *pipe,
mgr->cso_cache = cso_cache_create();
mgr->translate_cache = translate_cache_create();
memset(mgr->fallback_vbs, ~0, sizeof(mgr->fallback_vbs));
+   mgr->allowed_vb_mask = (1 << mgr->caps.max_vertex_buffers) - 1;
 
mgr->uploader = u_upload_create(pipe, 1024 * 1024,
PIPE_BIND_VERTEX_BUFFER,
@@ -523,14 +530,15 @@ u_vbuf_translate_buffers(struct u_vbuf *mgr, struct 
translate_key *key,
 
 static boolean
 u_vbuf_translate_find_free_vb_slots(struct u_vbuf *mgr,
-unsigned mask[VB_NUM])
+unsigned mask[VB_NUM],
+unsigned extra_free_vb_mask)
 {
unsigned type;
unsigned fallback_vbs[VB_NUM];
/* Set the bit for each buffer which is incompatible, or isn't set. */
uint32_t unused_vb_mask =
-  mgr->ve->incompatible_vb_mask_all | mgr->incompatible_vb_mask |
-  ~mgr->enabled_vb_mask;
+  (mgr->ve->incompatible_vb_mask_all | mgr->incompatible_vb_mask |
+  ~mgr->enabled_vb_mask | extra_free_vb_mask) & mgr->allowed_vb_mask;
 
memset(fallback_vbs, ~0, sizeof(fallback_vbs));
 
@@ -573,6 +581,7 @@ u_vbuf_translate_begin(struct u_vbuf *mgr,
unsigned i, type;
unsigned incompatible_vb_mask = mgr->incompatible_vb_mask &
mgr->ve->used_vb_mask;
+   unsigned extra_free_vb_mask = 0;
 
int start[VB_NUM] = {
   start_vertex, /* VERTEX */
@@ -618,8 +627,15 @@ u_vbuf_translate_begin(struct u_vbuf *mgr,
 
assert(mask[VB_VERTEX] || mask[VB_INSTANCE] || mask[VB_CONST]);
 
+   /* In the case of unroll_indices, we can regard all non-constant
+* vertex buffers with only non-instance vertex elements as incompatible
+* and thus free.
+*/
+   if (unroll_indices)
+   extra_free_vb_mask = mask[VB_VERTEX] & ~mask[VB_INSTANCE];
+
/* Find free vertex buffer slots. */
-   if (!u_vbuf_translate_find_free_vb_slots(mgr, mask)) {
+   if (!u_vbuf_translate_find_free_vb_slots(mgr, mask, extra_free_vb_mask)) {
   return FALSE;
}
 
@@ -778,6 +794,17 @@ u_vbuf_create_vertex_elements(struct u_vbuf *mgr, unsigned 
count,
   }
}
 
+   if (used_buffers & ~mgr->allowed_vb_mask) {
+  /* More vertex buffers are used than the hardware supports.  In
+   * principle, we only need to make sure that less vertex buffers are
+   * used, and mark some of the latter vertex buffers as incompatible.
+   * For now, mark all vertex buffers as incompatible.
+   */
+  ve->incompatible_vb_mask_any = used_buffers;
+  ve->compatible_vb_mask_any = 0;
+  ve->incompatible_elem_mask = (1 << count) - 1;
+   }
+
ve->used_vb_mask = used_buffers;
ve->compatible_vb_mask_all = ~ve->incompatible_vb_mask_any & used_buffers;
ve->incompatible_vb_mask_all = ~ve->compatible_vb_mask_any & used_buffers;
@@ -790,8 +817,12 @@ u_vbuf_create_vertex_elements(struct u_vbuf *mgr, unsigned 
count,
   }
}
 
-   ve->driver_cso =
-  pipe->create_vertex_elements_state(pipe, count, driver_attribs);
+   /* Only create driver CSO if no incompatible elements */
+   if (!ve->incompatible_elem_mask) {
+  ve->driver_cso =
+ pipe->create_vertex_elements_state(pipe, count, driver_attribs);
+   }
+
return ve;
 }
 
diff --git a/src/gallium/auxiliary/util/u_vbuf.h 
b/src/gallium/auxiliary/util/u_vbuf.h

[Mesa-dev] [RFC 1/3] gallium: add PIPE_CAP_MAX_VERTEX_BUFFERS

2016-06-11 Thread Christian Gmeiner
Signed-off-by: Christian Gmeiner 
---
 src/gallium/docs/source/screen.rst   | 1 +
 src/gallium/drivers/freedreno/freedreno_screen.c | 2 ++
 src/gallium/drivers/i915/i915_screen.c   | 2 ++
 src/gallium/drivers/ilo/ilo_screen.c | 2 ++
 src/gallium/drivers/llvmpipe/lp_screen.c | 2 ++
 src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 2 ++
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 2 ++
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 2 ++
 src/gallium/drivers/r300/r300_screen.c   | 2 ++
 src/gallium/drivers/r600/r600_pipe.c | 2 ++
 src/gallium/drivers/radeonsi/si_pipe.c   | 2 ++
 src/gallium/drivers/softpipe/sp_screen.c | 2 ++
 src/gallium/drivers/svga/svga_screen.c   | 2 ++
 src/gallium/drivers/swr/swr_screen.cpp   | 2 ++
 src/gallium/drivers/vc4/vc4_screen.c | 2 ++
 src/gallium/drivers/virgl/virgl_screen.c | 2 ++
 src/gallium/include/pipe/p_defines.h | 1 +
 17 files changed, 32 insertions(+)

diff --git a/src/gallium/docs/source/screen.rst 
b/src/gallium/docs/source/screen.rst
index 979b6c1..6c91c66 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -341,6 +341,7 @@ The integer capabilities:
 * ``PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES``: Whether primitive restart is
   supported for patch primitives.
 * ``PIPE_CAP_TGSI_VOTE``: Whether the ``VOTE_*`` ops can be used in shaders.
+* ``PIPE_CAP_MAX_VERTEX_BUFFERS``: Number of supported vertex buffers.
 
 
 .. _pipe_capf:
diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
b/src/gallium/drivers/freedreno/freedreno_screen.c
index c258074..3672c9d 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -351,6 +351,8 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
return 10;
case PIPE_CAP_UMA:
return 1;
+   case PIPE_CAP_MAX_VERTEX_BUFFERS:
+   return 32;
}
debug_printf("unknown param %d\n", param);
return 0;
diff --git a/src/gallium/drivers/i915/i915_screen.c 
b/src/gallium/drivers/i915/i915_screen.c
index a7ee381..e7d72d1 100644
--- a/src/gallium/drivers/i915/i915_screen.c
+++ b/src/gallium/drivers/i915/i915_screen.c
@@ -366,6 +366,8 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap 
cap)
}
case PIPE_CAP_UMA:
   return 1;
+   case PIPE_CAP_MAX_VERTEX_BUFFERS:
+  return 32;
 
default:
   debug_printf("%s: Unknown cap %u.\n", __FUNCTION__, cap);
diff --git a/src/gallium/drivers/ilo/ilo_screen.c 
b/src/gallium/drivers/ilo/ilo_screen.c
index 1ae0327..026cd6c 100644
--- a/src/gallium/drivers/ilo/ilo_screen.c
+++ b/src/gallium/drivers/ilo/ilo_screen.c
@@ -532,6 +532,8 @@ ilo_get_param(struct pipe_screen *screen, enum pipe_cap 
param)
   return false;
case PIPE_CAP_POLYGON_OFFSET_CLAMP:
   return true;
+   case PIPE_CAP_MAX_VERTEX_BUFFERS:
+  return 32;
 
default:
   return 0;
diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c 
b/src/gallium/drivers/llvmpipe/lp_screen.c
index f2a12a0..5beb746 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -326,6 +326,8 @@ llvmpipe_get_param(struct pipe_screen *screen, enum 
pipe_cap param)
case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
case PIPE_CAP_TGSI_VOTE:
   return 0;
+   case PIPE_CAP_MAX_VERTEX_BUFFERS:
+  return 32;
}
/* should only get here on unhandled cases */
debug_printf("Unexpected PIPE_CAP %d query\n", param);
diff --git a/src/gallium/drivers/nouveau/nv30/nv30_screen.c 
b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
index de798cf..763926a 100644
--- a/src/gallium/drivers/nouveau/nv30/nv30_screen.c
+++ b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
@@ -75,6 +75,8 @@ nv30_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
   return 1;
case PIPE_CAP_MAX_VERTEX_ATTRIB_STRIDE:
   return 2048;
+   case PIPE_CAP_MAX_VERTEX_BUFFERS:
+  return 32;
/* supported capabilities */
case PIPE_CAP_TWO_SIDED_STENCIL:
case PIPE_CAP_ANISOTROPIC_FILTER:
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index bcb1ae9..35cda61 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -141,6 +141,8 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
   return PIPE_ENDIAN_LITTLE;
case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS:
   return (class_3d >= NVA3_3D_CLASS) ? 4 : 0;
+   case PIPE_CAP_MAX_VERTEX_BUFFERS:
+  return 32;
 
/* supported caps */
case PIPE_CAP_TEXTURE_MIRROR_CLAMP:
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
index b9437b2..d9c8545 

[Mesa-dev] [RFC 0/3] Support GPUs with fewer than 32 vertex buffers

2016-06-11 Thread Christian Gmeiner
To upstream the etnaviv Mesa driver, all of its gallium changes need to be
reviewed (and upstreamed) first. I will start with the vertex buffer topic.

The current u_vbuf source assumes that every GPU supports 32
vertex buffers. Vivante GPUs support a different number of
vertex buffers depending on the model:

- GC600:  1
- GC1000: 4
- GC2000: 8

The patches were written by Wladimir about 3 years ago; I only rebased them
and made some cosmetic code changes.

Christian Gmeiner (1):
  gallium: add PIPE_CAP_MAX_VERTEX_BUFFERS

Wladimir J. van der Laan (2):
  u_vbuf: add logic to use a limited number of vbufs
  u_vbuf: use single vertex buffer if needed

 src/gallium/auxiliary/util/u_vbuf.c  | 73 
 src/gallium/auxiliary/util/u_vbuf.h  |  3 +
 src/gallium/docs/source/screen.rst   |  1 +
 src/gallium/drivers/freedreno/freedreno_screen.c |  2 +
 src/gallium/drivers/i915/i915_screen.c   |  2 +
 src/gallium/drivers/ilo/ilo_screen.c |  2 +
 src/gallium/drivers/llvmpipe/lp_screen.c |  2 +
 src/gallium/drivers/nouveau/nv30/nv30_screen.c   |  2 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   |  2 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   |  2 +
 src/gallium/drivers/r300/r300_screen.c   |  2 +
 src/gallium/drivers/r600/r600_pipe.c |  2 +
 src/gallium/drivers/radeonsi/si_pipe.c   |  2 +
 src/gallium/drivers/softpipe/sp_screen.c |  2 +
 src/gallium/drivers/svga/svga_screen.c   |  2 +
 src/gallium/drivers/swr/swr_screen.cpp   |  2 +
 src/gallium/drivers/vc4/vc4_screen.c |  2 +
 src/gallium/drivers/virgl/virgl_screen.c |  2 +
 src/gallium/include/pipe/p_defines.h |  1 +
 19 files changed, 97 insertions(+), 11 deletions(-)

-- 
2.5.5



[Mesa-dev] [PATCH] gallium: add API for setting window rectangles

2016-06-11 Thread Ilia Mirkin
Window rectangles apply to all framebuffer operations, either in
inclusive or exclusive mode. They may also be specified as part of a
blit operation.

In exclusive mode, any fragment inside any of the specified rectangles
will be discarded.

In inclusive mode, any fragment outside every rectangle will be
discarded.

The no-op state is to have 0 rectangles in exclusive mode.

Signed-off-by: Ilia Mirkin 
---

Wanted to get some early feedback on the interface, while I go around adding 
pipe caps and the state tracker code (which is easy but a little 
time-consuming).

This is for GL_EXT_window_rectangles.
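
To make the intended semantics concrete, here is a minimal sketch of the
per-fragment test as described in the commit message. `struct rect` is a
stand-in for `pipe_scissor_state`, and `rect_contains` and `fragment_passes`
are hypothetical names; a driver would normally do this in hardware rather
than in C. Minimum coordinates are inclusive and maximum coordinates are
exclusive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for pipe_scissor_state: min is inclusive, max is exclusive. */
struct rect {
   unsigned minx, miny, maxx, maxy;
};

static bool rect_contains(const struct rect *r, unsigned x, unsigned y)
{
   return x >= r->minx && x < r->maxx && y >= r->miny && y < r->maxy;
}

/* Returns true if the fragment at (x, y) survives the window rectangle test.
 * Inclusive mode keeps fragments inside at least one rectangle; exclusive
 * mode keeps fragments outside all of them. */
static bool fragment_passes(bool include, unsigned num_rects,
                            const struct rect *rects,
                            unsigned x, unsigned y)
{
   bool inside_any = false;
   unsigned i;

   assert(num_rects == 0 || rects != NULL);

   for (i = 0; i < num_rects; i++) {
      if (rect_contains(&rects[i], x, y)) {
         inside_any = true;
         break;
      }
   }
   return include ? inside_any : !inside_any;
}
```

With zero rectangles in exclusive mode every fragment passes, which matches
the no-op state; zero rectangles in inclusive mode discards everything.
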

 src/gallium/docs/source/context.rst  | 15 ---
 src/gallium/include/pipe/p_context.h |  5 +
 src/gallium/include/pipe/p_state.h   |  6 ++
 3 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/src/gallium/docs/source/context.rst 
b/src/gallium/docs/source/context.rst
index 3a45f40..667d9a2 100644
--- a/src/gallium/docs/source/context.rst
+++ b/src/gallium/docs/source/context.rst
@@ -79,6 +79,15 @@ objects. They all follow simple, one-method binding calls, 
e.g.
   should be the same as the number of set viewports and can be up to
   PIPE_MAX_VIEWPORTS.
 * ``set_viewport_states``
+* ``set_window_rectangle_states`` sets the window rectangles to be
+  used for rendering, as defined by GL_EXT_window_rectangles. There
+  are two modes - include and exclude, which define whether the
+  supplied rectangles are to be used for including fragments or
+  excluding them. All of the rectangles are ORed together, so in
+  exclude mode, any fragment inside any rectangle would be culled,
+  while in include mode, any fragment outside all rectangles would be
+  culled. xmin/ymin are inclusive, while xmax/ymax are exclusive (same
+  as scissor states above).
 * ``set_tess_state`` configures the default tessellation parameters:
   * ``default_outer_level`` is the default value for the outer tessellation
 levels. This corresponds to GL's ``PATCH_DEFAULT_OUTER_LEVEL``.
@@ -492,9 +501,9 @@ This can be considered the equivalent of a CPU memcpy.
 
 ``blit`` blits a region of a resource to a region of another resource, 
including
 scaling, format conversion, and up-/downsampling, as well as a destination clip
-rectangle (scissors). It can also optionally honor the current render condition
-(but either way the blit itself never contributes anything to queries currently
-gathering data).
+rectangle (scissors) and window rectangles. It can also optionally honor the
+current render condition (but either way the blit itself never contributes
+anything to queries currently gathering data).
 As opposed to manually drawing a textured quad, this lets the pipe driver 
choose
 the optimal method for blitting (like using a special 2D engine), and usually
 offers, for example, accelerated stencil-only copies even where
diff --git a/src/gallium/include/pipe/p_context.h 
b/src/gallium/include/pipe/p_context.h
index 9d7a8eb..0ea18c7 100644
--- a/src/gallium/include/pipe/p_context.h
+++ b/src/gallium/include/pipe/p_context.h
@@ -274,6 +274,11 @@ struct pipe_context {
unsigned num_scissors,
const struct pipe_scissor_state * );
 
+   void (*set_window_rectangle_states)( struct pipe_context *,
+boolean include,
+unsigned num_rectangles,
+const struct pipe_scissor_state * );
+
void (*set_viewport_states)( struct pipe_context *,
 unsigned start_slot,
 unsigned num_viewports,
diff --git a/src/gallium/include/pipe/p_state.h 
b/src/gallium/include/pipe/p_state.h
index 396f563..9c69355 100644
--- a/src/gallium/include/pipe/p_state.h
+++ b/src/gallium/include/pipe/p_state.h
@@ -69,6 +69,7 @@ extern "C" {
 #define PIPE_MAX_VIEWPORTS16
 #define PIPE_MAX_CLIP_OR_CULL_DISTANCE_COUNT 8
 #define PIPE_MAX_CLIP_OR_CULL_DISTANCE_ELEMENT_COUNT 2
+#define PIPE_MAX_WINDOW_RECTANGLES 8
 
 
 struct pipe_reference
@@ -710,6 +711,11 @@ struct pipe_blit_info
boolean scissor_enable;
struct pipe_scissor_state scissor;
 
+   /* Window rectangles can either be inclusive or exclusive. */
+   boolean window_rectangle_include;
+   unsigned num_window_rectangles;
+   struct pipe_scissor_state window_rectangles[PIPE_MAX_WINDOW_RECTANGLES];
+
boolean render_condition_enable; /**< whether the blit should honor the
 current render condition */
boolean alpha_blend; /* dst.rgb = src.rgb * src.a + dst.rgb * (1 - src.a) */
-- 
2.7.3



Re: [Mesa-dev] [PATCH] llvmpipe: don't use 3-component formats, except 32-bit x 3 formats

2016-06-11 Thread Jose Fonseca

On 10/06/16 22:02, Roland Scheidegger wrote:

Am 10.06.2016 um 20:58 schrieb Brian Paul:

This basically disallows all 8-bit x 3 and 16-bit x 3 formats for
textures and render targets.  Some 3-component formats were already
disallowed before.  This avoids problems with GL_ARB_copy_image.

v2: the previous version of this patch disallowed all 3-component formats

Reviewed-by: Charmaine Lee 
---
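
The new rule can be read as a small predicate. The sketch below is only an
illustration: `struct format_desc` is a reduced stand-in for the fields of the
gallium format description that the check reads, and
`rejects_three_channel_format` is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Reduced stand-in for the format description fields used by the check. */
struct format_desc {
   bool is_array;        /* all channels have the same size and layout */
   unsigned nr_channels; /* number of channels */
   unsigned block_bits;  /* total bits per block (per pixel here) */
};

/* True for 3-channel array formats whose channels are not 32 bits wide,
 * i.e. whose total size is not 96 bits. */
static bool rejects_three_channel_format(const struct format_desc *desc)
{
   assert(desc != NULL);
   return desc->is_array && desc->nr_channels == 3 && desc->block_bits != 96;
}
```

Under this rule 8-bit x 3 (24 bits) and 16-bit x 3 (48 bits) formats are
refused, while 32-bit x 3 (96 bits) and all 4-channel formats are unaffected.
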
  src/gallium/drivers/llvmpipe/lp_screen.c | 23 ---
  1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c 
b/src/gallium/drivers/llvmpipe/lp_screen.c
index f2a12a0..a44312c 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -450,19 +450,20 @@ llvmpipe_is_format_supported( struct pipe_screen *_screen,
if (!format_desc->is_array && !format_desc->is_bitmask &&
format != PIPE_FORMAT_R11G11B10_FLOAT)
   return FALSE;
+   }

-  /*
-   * XXX refuse formats known to crash in generate_unswizzled_blend().
-   * These include all 3-channel 24bit RGB8 variants, plus 48bit
-   * (except those using floats) 3-channel RGB16 variants (the latter
-   * seems to be more of a llvm bug though).
-   * The mesa state tracker only seems to use these for SINT/UINT formats.
+   if ((bind & (PIPE_BIND_RENDER_TARGET | PIPE_BIND_SAMPLER_VIEW)) &&
+   ((bind & PIPE_BIND_DISPLAY_TARGET) == 0)) {
+  /* Disable all 3-channel formats, where channel size != 32 bits.
+   * In some cases we run into crashes (in generate_unswizzled_blend()),
+   * for 3-channel RGB16 variants, there was an apparent LLVM bug.
+   * In any case, disabling the shallower 3-channel formats avoids a
+   * number of issues with GL_ARB_copy_image support.
 */
-  if (format_desc->is_array && format_desc->nr_channels == 3) {
- if (format_desc->block.bits == 24 || (format_desc->block.bits == 48 &&
-   !util_format_is_float(format))) {
-return FALSE;
- }
+  if (format_desc->is_array &&
+  format_desc->nr_channels == 3 &&
+  format_desc->block.bits != 96) {
+ return FALSE;
}
 }




Reviewed-by: Roland Scheidegger 



Reviewed-by: Jose Fonseca 


Re: [Mesa-dev] [PATCH v2 2/2] gallivm: Fix trivial sign warnings

2016-06-11 Thread Jose Fonseca

On 10/06/16 04:01, Jan Vesely wrote:

From: Jan Vesely 

v2: include whitespace fixes

Signed-off-by: Jan Vesely 
---
  src/gallium/auxiliary/gallivm/lp_bld_conv.c |  4 ++--
  src/gallium/auxiliary/gallivm/lp_bld_logic.c| 10 ++
  src/gallium/auxiliary/gallivm/lp_bld_pack.c |  2 +-
  src/gallium/auxiliary/gallivm/lp_bld_printf.c   |  7 +++
  src/gallium/auxiliary/gallivm/lp_bld_swizzle.c  |  2 +-
  src/gallium/auxiliary/gallivm/lp_bld_tgsi.c |  6 +++---
  src/gallium/auxiliary/gallivm/lp_bld_tgsi.h |  2 +-
  src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c | 10 +-
  8 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_conv.c 
b/src/gallium/auxiliary/gallivm/lp_bld_conv.c
index 7cf0dee..69d24a5 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_conv.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_conv.c
@@ -311,7 +311,7 @@ lp_build_clamped_float_to_unsigned_norm(struct 
gallivm_state *gallivm,
 * important, we also get exact results for 0.0 and 1.0.
 */

-  unsigned n = MIN2(src_type.width - 1, dst_width);
+  unsigned n = MIN2(src_type.width - 1u, dst_width);

double scale = (double)(1ULL << n);
unsigned lshift = dst_width - n;
@@ -445,7 +445,7 @@ int lp_build_conv_auto(struct gallivm_state *gallivm,
 unsigned num_srcs,
 LLVMValueRef *dst)
  {
-   int i;
+   unsigned i;
 int num_dsts = num_srcs;

 if (src_type.floating == dst_type->floating &&
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_logic.c 
b/src/gallium/auxiliary/gallivm/lp_bld_logic.c
index a26cc48..14bf236 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_logic.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_logic.c
@@ -88,8 +88,6 @@ lp_build_compare_ext(struct gallivm_state *gallivm,
 LLVMValueRef cond;
 LLVMValueRef res;

-   assert(func >= PIPE_FUNC_NEVER);
-   assert(func <= PIPE_FUNC_ALWAYS);
 assert(lp_check_value(type, a));
 assert(lp_check_value(type, b));

@@ -98,6 +96,9 @@ lp_build_compare_ext(struct gallivm_state *gallivm,
 if(func == PIPE_FUNC_ALWAYS)
return ones;

+   assert(func > PIPE_FUNC_NEVER);
+   assert(func < PIPE_FUNC_ALWAYS);
+
 if(type.floating) {
LLVMRealPredicate op;
switch(func) {
@@ -176,8 +177,6 @@ lp_build_compare(struct gallivm_state *gallivm,
 LLVMValueRef zeros = LLVMConstNull(int_vec_type);
 LLVMValueRef ones = LLVMConstAllOnes(int_vec_type);

-   assert(func >= PIPE_FUNC_NEVER);
-   assert(func <= PIPE_FUNC_ALWAYS);
 assert(lp_check_value(type, a));
 assert(lp_check_value(type, b));

@@ -186,6 +185,9 @@ lp_build_compare(struct gallivm_state *gallivm,
 if(func == PIPE_FUNC_ALWAYS)
return ones;

+   assert(func > PIPE_FUNC_NEVER);
+   assert(func < PIPE_FUNC_ALWAYS);
+
  #if defined(PIPE_ARCH_X86) || defined(PIPE_ARCH_X86_64)
 /*
  * There are no unsigned integer comparison instructions in SSE.
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_pack.c 
b/src/gallium/auxiliary/gallivm/lp_bld_pack.c
index 35b4c58..b0e76e6 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_pack.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_pack.c
@@ -236,7 +236,7 @@ lp_build_concat_n(struct gallivm_state *gallivm,
unsigned num_dsts)
  {
 int size = num_srcs / num_dsts;
-   int i;
+   unsigned i;

 assert(num_srcs >= num_dsts);
 assert((num_srcs % size) == 0);
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_printf.c 
b/src/gallium/auxiliary/gallivm/lp_bld_printf.c
index 14131b3..575ebdf 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_printf.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_printf.c
@@ -155,10 +155,10 @@ lp_build_print_value(struct gallivm_state *gallivm,
  }


-static int
+static unsigned
  lp_get_printf_arg_count(const char *fmt)
  {
-   int count =0;
+   unsigned count = 0;
 const char *p = fmt;
 int c;

@@ -195,8 +195,7 @@ lp_build_printf(struct gallivm_state *gallivm,
  {
 LLVMValueRef params[50];
 va_list arglist;
-   int argcount;
-   int i;
+   unsigned argcount, i;

 argcount = lp_get_printf_arg_count(fmt);
 assert(ARRAY_SIZE(params) >= argcount + 1);
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_swizzle.c 
b/src/gallium/auxiliary/gallivm/lp_bld_swizzle.c
index 92f387d..5a97c48 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_swizzle.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_swizzle.c
@@ -467,7 +467,7 @@ lp_build_swizzle_aos(struct lp_build_context *bld,
LLVMValueRef res;
struct lp_type type4;
unsigned cond = 0;
-  unsigned chan;
+  int chan;
int shift;

/*
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.c 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.c
index 614c655..3f5bfec 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.c
@@ -335,7 

Re: [Mesa-dev] [PATCH] llvmpipe: hack-fix bugs due to bogus bind flags

2016-06-11 Thread Jose Fonseca

On 11/06/16 00:19, srol...@vmware.com wrote:

From: Roland Scheidegger 

The gallium contract would be that bind flags must indicate all possible
bindings a resource might get used with, but the fact is the mesa state tracker
does not set bind flags correctly, and this is more or less unfixable due to GL.

This caused a bug with piglit arb_uniform_buffer_object-rendering-dsa
since 6e6fd911da8a1d9cd62fe0a8a4cc0fb7bdccfe02 - the commit is correct,
but it caused us to miss updates to fs UBOs completely, since the
corresponding buffer didn't have the appropriate bind flag set (thus we
wouldn't check if it is indeed currently bound).
See the discussion about this starting here:
https://lists.freedesktop.org/archives/mesa-dev/2016-June/119829.html

So, use a new bind_actual value in llvmpipe_resource instead of
the bind one in the pipe_resource, and update it accordingly whenever
a resource is bound "illegally".
Note we update this value for now only in places which matter for us - that
is creating sampler/surface view, or binding constant buffer. There's plenty
more places (setting streamout buffers, vertex/index buffers, ...) where
things can be set with the wrong bind flags, but the bind flags there never
matter.

While here also make sure we only set dirty constant bit when it's a fs
constant buffer - totally doesn't matter if it's vs/gs.
---
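
A minimal sketch of the bookkeeping described above, under stated assumptions:
`struct resource`, the `BIND_*` values and `note_binding` are hypothetical
stand-ins, not the llvmpipe types or the real `PIPE_BIND_*` values. The idea
is to leave the declared `bind` flags alone and accumulate what the resource
is really used for in a separate `bind_actual` field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values, for illustration only. */
#define BIND_SAMPLER_VIEW    (1u << 0)
#define BIND_CONSTANT_BUFFER (1u << 1)

/* Stand-in for a resource: `bind` is what the creator declared,
 * `bind_actual` is what it has been used for so far. */
struct resource {
   unsigned bind;
   unsigned bind_actual;
};

/* Record that `res` is being used for `usage`. Returns true when the usage
 * was not recorded yet, so the caller could emit a warning. */
static bool note_binding(struct resource *res, unsigned usage)
{
   bool undeclared;

   assert(res != NULL);
   undeclared = (res->bind_actual & usage) != usage;
   res->bind_actual |= usage;
   return undeclared;
}
```

Code that later needs to know whether a resource may currently be bound as a
constant buffer would test `bind_actual` rather than `bind`.
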
  src/gallium/drivers/llvmpipe/lp_state.h |  2 +-
  src/gallium/drivers/llvmpipe/lp_state_derived.c |  2 +-
  src/gallium/drivers/llvmpipe/lp_state_fs.c  | 12 ++--
  src/gallium/drivers/llvmpipe/lp_state_sampler.c |  9 ++---
  src/gallium/drivers/llvmpipe/lp_surface.c   | 10 +-
  src/gallium/drivers/llvmpipe/lp_texture.c   | 24 +++-
  src/gallium/drivers/llvmpipe/lp_texture.h   |  2 ++
  7 files changed, 40 insertions(+), 21 deletions(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_state.h 
b/src/gallium/drivers/llvmpipe/lp_state.h
index 78918cf..f15d70d 100644
--- a/src/gallium/drivers/llvmpipe/lp_state.h
+++ b/src/gallium/drivers/llvmpipe/lp_state.h
@@ -46,7 +46,7 @@
  #define LP_NEW_STIPPLE   0x40
  #define LP_NEW_FRAMEBUFFER   0x80
  #define LP_NEW_DEPTH_STENCIL_ALPHA 0x100
-#define LP_NEW_CONSTANTS 0x200
+#define LP_NEW_FS_CONSTANTS  0x200
  #define LP_NEW_SAMPLER   0x400
  #define LP_NEW_SAMPLER_VIEW  0x800
  #define LP_NEW_VERTEX0x1000
diff --git a/src/gallium/drivers/llvmpipe/lp_state_derived.c 
b/src/gallium/drivers/llvmpipe/lp_state_derived.c
index 9e29902..f76de6b 100644
--- a/src/gallium/drivers/llvmpipe/lp_state_derived.c
+++ b/src/gallium/drivers/llvmpipe/lp_state_derived.c
@@ -235,7 +235,7 @@ void llvmpipe_update_derived( struct llvmpipe_context 
*llvmpipe )
llvmpipe->stencil_ref.ref_value);
 }

-   if (llvmpipe->dirty & LP_NEW_CONSTANTS)
+   if (llvmpipe->dirty & LP_NEW_FS_CONSTANTS)
lp_setup_set_fs_constants(llvmpipe->setup,
  
ARRAY_SIZE(llvmpipe->constants[PIPE_SHADER_FRAGMENT]),
  llvmpipe->constants[PIPE_SHADER_FRAGMENT]);
diff --git a/src/gallium/drivers/llvmpipe/lp_state_fs.c 
b/src/gallium/drivers/llvmpipe/lp_state_fs.c
index 7dceff7..a30a051 100644
--- a/src/gallium/drivers/llvmpipe/lp_state_fs.c
+++ b/src/gallium/drivers/llvmpipe/lp_state_fs.c
@@ -2847,6 +2847,13 @@ llvmpipe_set_constant_buffer(struct pipe_context *pipe,
 /* note: reference counting */
     util_copy_constant_buffer(&llvmpipe->constants[shader][index], cb);

+   if (constants) {
+  struct llvmpipe_resource *lpr = llvmpipe_resource(constants);
+  if (!(lpr->bind_actual & PIPE_BIND_CONSTANT_BUFFER)) {


Let's add a warning like the one we already have for sampler views.


+ lpr->bind_actual |= PIPE_BIND_CONSTANT_BUFFER;
+  }
+   }
+
 if (shader == PIPE_SHADER_VERTEX ||
 shader == PIPE_SHADER_GEOMETRY) {
/* Pass the constants to the 'draw' module */
@@ -2869,8 +2876,9 @@ llvmpipe_set_constant_buffer(struct pipe_context *pipe,
draw_set_mapped_constant_buffer(llvmpipe->draw, shader,
index, data, size);
 }
-
-   llvmpipe->dirty |= LP_NEW_CONSTANTS;
+   else {
+  llvmpipe->dirty |= LP_NEW_FS_CONSTANTS;
+   }

 if (cb && cb->user_buffer) {
pipe_resource_reference(&constants, NULL);
diff --git a/src/gallium/drivers/llvmpipe/lp_state_sampler.c 
b/src/gallium/drivers/llvmpipe/lp_state_sampler.c
index 81b998a..2d61e24 100644
--- a/src/gallium/drivers/llvmpipe/lp_state_sampler.c
+++ b/src/gallium/drivers/llvmpipe/lp_state_sampler.c
@@ -168,12 +168,15 @@ llvmpipe_create_sampler_view(struct pipe_context *pipe,
  const struct pipe_sampler_view *templ)
  {
 struct pipe_sampler_view *view = CALLOC_STRUCT(pipe_sampler_view);
+   struct llvmpipe_resource *lpr = llvmpipe_resource(texture);
 /*
-* XXX we REALLY want to see the correct bind flag here but the 

Re: [Mesa-dev] [PATCH 00/64] i965: Start using ISL for filling out surface states

2016-06-11 Thread Jason Ekstrand
For those of you who like branches.  The whole series can be found here:

https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=review/i965-isl-v1

On Sat, Jun 11, 2016 at 9:02 AM, Jason Ekstrand 
wrote:

> We would like to eventually start using ISL inside of the GL driver to
> replace the fairly sprawling layout code in brw_tex_layout.c and
> intel_mipmap_tree.c.  However, that is a very big change that no one is
> ready to make yet.  A smaller change, I thought, would be to start using
> ISL in blorp.  In order to do that, I needed a function to get an isl_surf
> from an intel_mipmap_tree.  How do you test such a function to ensure that
> it's working in all of the cases?  Use ISL for emitting all surface states
> on everything and run it through Jenkins of course!  Hence this series.
>
> This series is one of the most educational projects I've worked on in a
> bit.  It turns out there are a lot of subtleties in surface layout and I
> found bugs in both the i965 and ISL state setup code.  I've tried to keep
> all of the functional changes contained to the first 8 or so patches which
> only touch the GL driver.  That way those fixes can be back-ported to
> stable and are bisectable.
>
> The next 20 patches or so are general ISL cleanups and fixes.  If no one is
> too opposed, I'd like to back-port the whole pile to 12.0.  There are two
> reasons for this: First, ISL is new and this is a substantial cleanup;
> back-porting it will keep the initial release of ISL
> cleaner and make back-porting other patches easier in the future.  Second,
> in the middle of the series are a couple of changes that fix some 850
> Vulkan CTS tests on Haswell.
>
> The next 9 patches add support to ISL for filling out surface states on
> gen4, 4x, 5, and 6 as well as support for color compression.  I'm not sure
> if the CCS formats are 100% correct or if that's even the exact approach we
> want to take.  Chad, I'd like you to chip in here.
>
> Finally, starting with blorp, we replace almost all of the surface state
> setup code in i965 with paths based on ISL.  For textures/renderbuffers we
> delete 1 path for gen4-5, 3 for gen6, 4 for gen7, and 3 for gen8 along with
> 3 different paths for emitting buffer surfaces.
>
> As far as review goes, I'd like to get the i965 bugfixes and ISL cleanups
> landed soon-ish and back-ported for 12.0.  Everything after that is a bit
> more up-in-the-air.  It won't be all that hard to rebase because it's mostly
> just wholesale replacing the code we have with new code.
>
> Cc: Chad Versace 
> Cc: Nanley Chery 
> Cc: Kenneth Graunke 
> Cc: Topi Pohjolainen 
>
> Jason Ekstrand (64):
>   i965: Drop Max3DTextureLevels to 512 on Sandy Bridge and prior
>   i965/blorp/gen8: Use the correct max level and layer in
> emit_surface_states
>   i965/gen8: Use the qpitch from the aux_mt for AUX_QPITCH
>   i965/fs: Use a default Y coordinate of 0 for TXF on gen9+
>   i965: Remove fake W-tiled render target support
>   i965/gen4: Subtract 1 from buffer sizes
>   i965/gen7,8: Set SURFACE_IS_ARRAY for all non-3D texture types
>   i965/blorp: Only set src_z for gen8+ 3D textures
>   genxml/gen8,9: Prefix the multisample format enum with MSFMT
>   isl/state: Don't use designated initializers for the surface state
>   isl/state: Remove some unused fields
>   isl/state: Put surface format setup at the top
>   isl/state: Put all dimension setup together and towards the top
>   isl/state: Put pitch calculations together
>   isl/state: Return an extent3d from the halign/valign helper
>   isl/state: Refactor the per-gen isl_to_gen_h/valign tables
>   isl/state: Refactor the setup of clear colors
>   isl/state: Don't force-disable L2 bypass for everything
>   isl/state: Set SurfaceArray based on the surface dimension
>   isl/format: Mark R9G9B9E5 as containing 9-bit unsigned float channels
>   isl/state: Set the IntegerSurfaceFormat bit on Haswell
>   isl/state: Use the layout for computing qpitch rather than dimensions
>   isl/state: Only set cube face enables if usage includes CUBE_BIT
>   isl/state: Emit no-op mip tail setup on SKL
>   isl/state: Use TILEWALK_XMAJOR for linear surfaces on gen7
>   isl/state: Don't set SurfacePitch for gen9 1-D textures
>   isl/state: Add assertions for buffer surface restrictions
>   isl/state: Don't use designated initializers for buffer surface state
>   isl/state: Allow for full 31-bit buffer texture sizes
>   anv,isl: Lower storage image formats in anv
>   genxml: Put append counter fields before MCS in RENDER_SURFACE_STATE
> on gen7
>   genxml: Add enough XML for gens 4, 4.5, and 5 to get SURFACE_STATE
>   genxml: Make X/Y Offset field of SURFACE_STATE a uint
>   genxml: Add macros and #includes for gens 4-6
>   isl: Add an ISL_DEV_IS_G4X macro
>   isl: Add support for filling out surface states all the way back to
> gen4
>   isl: 

[Mesa-dev] [PATCH 24/64] isl/state: Emit no-op mip tail setup on SKL

2016-06-11 Thread Jason Ekstrand
This hasn't ever been a problem in the past but it is recommended by the
hardware docs.
---
 src/intel/isl/isl_surface_state.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/intel/isl/isl_surface_state.c 
b/src/intel/isl/isl_surface_state.c
index 0d26619..fe0402f 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -277,6 +277,14 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
void *state,
   s.MIPCountLOD = MAX(info->view->levels, 1) - 1;
}
 
+#if GEN_GEN >= 9
+   /* We don't use miptails yet.  The PRM recommends that you set "Mip Tail
+* Start LOD" to 15 to prevent the hardware from trying to use them.
+*/
+   s.TiledResourceMode = NONE;
+   s.MipTailStartLOD = 15;
+#endif
+
const struct isl_extent3d image_align = get_image_alignment(info->surf);
s.SurfaceVerticalAlignment = isl_to_gen_valign[image_align.height];
s.SurfaceHorizontalAlignment = isl_to_gen_halign[image_align.width];
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 64/64] i965/context: Remove some unnecessary vfuncs

2016-06-11 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_context.h   | 17 -
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  |  3 +--
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c |  1 -
 src/mesa/drivers/dri/i965/gen8_surface_state.c|  1 -
 4 files changed, 1 insertion(+), 21 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 20c6d96..7ca3434 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -723,27 +723,10 @@ struct brw_context
 
struct
{
-  void (*update_texture_surface)(struct gl_context *ctx,
- unsigned unit,
- uint32_t *surf_offset,
- bool for_gather, uint32_t plane);
   uint32_t (*update_renderbuffer_surface)(struct brw_context *brw,
   struct gl_renderbuffer *rb,
   bool layered, unsigned unit,
   uint32_t surf_index);
-
-  void (*emit_texture_surface_state)(struct brw_context *brw,
- struct intel_mipmap_tree *mt,
- GLenum target,
- unsigned min_layer,
- unsigned max_layer,
- unsigned min_level,
- unsigned max_level,
- unsigned format,
- unsigned swizzle,
- uint32_t *surf_offset,
- int surf_index,
- bool rw, bool for_gather);
   void (*emit_null_surface_state)(struct brw_context *brw,
   unsigned width,
   unsigned height,
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index c80d4f2..7a262bc 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -1010,7 +1010,7 @@ update_stage_texture_surfaces(struct brw_context *brw,
 
  /* _NEW_TEXTURE */
  if (ctx->Texture.Unit[unit]._Current) {
-brw->vtbl.update_texture_surface(ctx, unit, surf_offset + s, 
for_gather, plane);
+brw_update_texture_surface(ctx, unit, surf_offset + s, for_gather, 
plane);
  }
   }
}
@@ -1589,7 +1589,6 @@ const struct brw_tracked_state brw_wm_image_surfaces = {
 void
 gen4_init_vtable_surface_functions(struct brw_context *brw)
 {
-   brw->vtbl.update_texture_surface = brw_update_texture_surface;
brw->vtbl.update_renderbuffer_surface = gen4_update_renderbuffer_surface;
brw->vtbl.emit_null_surface_state = brw_emit_null_surface_state;
 }
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index 742ac0e..5587a02 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -176,7 +176,6 @@ gen7_emit_null_surface_state(struct brw_context *brw,
 void
 gen7_init_vtable_surface_functions(struct brw_context *brw)
 {
-   brw->vtbl.update_texture_surface = brw_update_texture_surface;
brw->vtbl.update_renderbuffer_surface = brw_update_renderbuffer_surface;
brw->vtbl.emit_null_surface_state = gen7_emit_null_surface_state;
 }
diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c 
b/src/mesa/drivers/dri/i965/gen8_surface_state.c
index 1f86557..08f83f3 100644
--- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
@@ -80,7 +80,6 @@ gen8_emit_null_surface_state(struct brw_context *brw,
 void
 gen8_init_vtable_surface_functions(struct brw_context *brw)
 {
-   brw->vtbl.update_texture_surface = brw_update_texture_surface;
brw->vtbl.update_renderbuffer_surface = brw_update_renderbuffer_surface;
brw->vtbl.emit_null_surface_state = gen8_emit_null_surface_state;
 }
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 48/64] i965/blorp: Use the generic ISL path for texture surfaces on gen6

2016-06-11 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/gen6_blorp.c | 77 +-
 1 file changed, 2 insertions(+), 75 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_blorp.c 
b/src/mesa/drivers/dri/i965/gen6_blorp.c
index 3af9c95..4620da2 100644
--- a/src/mesa/drivers/dri/i965/gen6_blorp.c
+++ b/src/mesa/drivers/dri/i965/gen6_blorp.c
@@ -320,79 +320,6 @@ gen6_blorp_emit_wm_constants(struct brw_context *brw,
 }
 
 
-/* SURFACE_STATE for renderbuffer or texture surface (see
- * brw_update_renderbuffer_surface and brw_update_texture_surface)
- */
-static uint32_t
-gen6_blorp_emit_surface_state(struct brw_context *brw,
-  const struct brw_blorp_params *params,
-  const struct brw_blorp_surface_info *surface,
-  uint32_t read_domains, uint32_t write_domain)
-{
-   uint32_t wm_surf_offset;
-   uint32_t width = surface->width;
-   uint32_t height = surface->height;
-   if (surface->num_samples > 1) {
-  /* Since gen6 uses INTEL_MSAA_LAYOUT_IMS, width and height are measured
-   * in samples.  But SURFACE_STATE wants them in pixels, so we need to
-   * divide them each by 2.
-   */
-  width /= 2;
-  height /= 2;
-   }
-   struct intel_mipmap_tree *mt = surface->mt;
-   uint32_t tile_x, tile_y;
-
-   uint32_t *surf = (uint32_t *)
-  brw_state_batch(brw, AUB_TRACE_SURFACE_STATE, 6 * 4, 32,
-  &wm_surf_offset);
-
-   surf[0] = (BRW_SURFACE_2D << BRW_SURFACE_TYPE_SHIFT |
-  BRW_SURFACE_MIPMAPLAYOUT_BELOW << BRW_SURFACE_MIPLAYOUT_SHIFT |
-  BRW_SURFACE_CUBEFACE_ENABLES |
-  surface->brw_surfaceformat << BRW_SURFACE_FORMAT_SHIFT);
-
-   /* reloc */
-   surf[1] = (brw_blorp_compute_tile_offsets(surface, &tile_x, &tile_y) +
-  mt->bo->offset64);
-
-   surf[2] = (0 << BRW_SURFACE_LOD_SHIFT |
-  (width - 1) << BRW_SURFACE_WIDTH_SHIFT |
-  (height - 1) << BRW_SURFACE_HEIGHT_SHIFT);
-
-   uint32_t tiling = surface->map_stencil_as_y_tiled
-  ? BRW_SURFACE_TILED | BRW_SURFACE_TILED_Y
-  : brw_get_surface_tiling_bits(mt->tiling);
-   uint32_t pitch_bytes = mt->pitch;
-   if (surface->map_stencil_as_y_tiled)
-  pitch_bytes *= 2;
-   surf[3] = (tiling |
-  0 << BRW_SURFACE_DEPTH_SHIFT |
-  (pitch_bytes - 1) << BRW_SURFACE_PITCH_SHIFT);
-
-   surf[4] = brw_get_surface_num_multisamples(surface->num_samples);
-
-   /* Note that the low bits of these fields are missing, so
-* there's the possibility of getting in trouble.
-*/
-   assert(tile_x % 4 == 0);
-   assert(tile_y % 2 == 0);
-   surf[5] = ((tile_x / 4) << BRW_SURFACE_X_OFFSET_SHIFT |
-  (tile_y / 2) << BRW_SURFACE_Y_OFFSET_SHIFT |
-  (surface->mt->valign == 4 ?
-   BRW_SURFACE_VERTICAL_ALIGN_ENABLE : 0));
-
-   /* Emit relocation to surface contents */
-   drm_intel_bo_emit_reloc(brw->batch.bo,
-   wm_surf_offset + 4,
-   mt->bo,
-   surf[1] - mt->bo->offset64,
-   read_domains, write_domain);
-
-   return wm_surf_offset;
-}
-
-
 /* BINDING_TABLE.  See brw_wm_binding_table(). */
 uint32_t
 gen6_blorp_emit_binding_table(struct brw_context *brw,
@@ -1022,8 +949,8 @@ gen6_blorp_exec(struct brw_context *brw,
   I915_GEM_DOMAIN_RENDER, true);
   if (params->src.mt) {
  wm_surf_offset_texture =
-gen6_blorp_emit_surface_state(brw, params, &params->src,
-  I915_GEM_DOMAIN_SAMPLER, 0);
+brw_blorp_emit_surface_state(brw, &params->src,
+ I915_GEM_DOMAIN_SAMPLER, 0, false);
   }
   wm_bind_bo_offset =
  gen6_blorp_emit_binding_table(brw,
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 56/64] i965/gen7: Use the generic ISL-based path for texture surfaces

2016-06-11 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 169 +-
 1 file changed, 1 insertion(+), 168 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index 2a7ae31..bdb4f66 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -39,27 +39,6 @@
 #include "brw_defines.h"
 #include "brw_wm.h"
 
-/**
- * Convert an swizzle enumeration (i.e. SWIZZLE_X) to one of the Gen7.5+
- * "Shader Channel Select" enumerations (i.e. HSW_SCS_RED).  The mappings are
- *
- * SWIZZLE_X, SWIZZLE_Y, SWIZZLE_Z, SWIZZLE_W, SWIZZLE_ZERO, SWIZZLE_ONE
- *         0          1          2          3             4            5
- *         4          5          6          7             0            1
- *   SCS_RED, SCS_GREEN,  SCS_BLUE, SCS_ALPHA,     SCS_ZERO,     SCS_ONE
- *
- * which is simply adding 4 then modding by 8 (or anding with 7).
- *
- * We then may need to apply workarounds for textureGather hardware bugs.
- */
-static unsigned
-swizzle_to_scs(GLenum swizzle, bool need_green_to_blue)
-{
-   unsigned scs = (swizzle + 4) & 7;
-
-   return (need_green_to_blue && scs == HSW_SCS_GREEN) ? HSW_SCS_BLUE : scs;
-}
-
 uint32_t
 gen7_surface_tiling_mode(uint32_t tiling)
 {
@@ -264,151 +243,6 @@ gen7_emit_buffer_surface_state(struct brw_context *brw,
gen7_check_surface_setup(surf, false /* is_render_target */);
 }
 
-static void
-gen7_emit_texture_surface_state(struct brw_context *brw,
-struct intel_mipmap_tree *mt,
-GLenum target,
-unsigned min_layer, unsigned max_layer,
-unsigned min_level, unsigned max_level,
-unsigned format,
-unsigned swizzle,
-uint32_t *surf_offset,
-int surf_index /* unused */,
-bool rw, bool for_gather)
-{
-   const unsigned depth = max_layer - min_layer;
-   uint32_t *surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
-8 * 4, 32, surf_offset);
-
-   memset(surf, 0, 8 * 4);
-
-   surf[0] = translate_tex_target(target) << BRW_SURFACE_TYPE_SHIFT |
- format << BRW_SURFACE_FORMAT_SHIFT |
- gen7_surface_tiling_mode(mt->tiling);
-
-   /* mask of faces present in cube map; for other surfaces MBZ. */
-   if (target == GL_TEXTURE_CUBE_MAP || target == GL_TEXTURE_CUBE_MAP_ARRAY)
-  surf[0] |= BRW_SURFACE_CUBEFACE_ENABLES;
-
-   if (mt->valign == 4)
-  surf[0] |= GEN7_SURFACE_VALIGN_4;
-   if (mt->halign == 8)
-  surf[0] |= GEN7_SURFACE_HALIGN_8;
-
-   if (mt->target != GL_TEXTURE_3D)
-  surf[0] |= GEN7_SURFACE_IS_ARRAY;
-
-   if (mt->array_layout == ALL_SLICES_AT_EACH_LOD)
-  surf[0] |= GEN7_SURFACE_ARYSPC_LOD0;
-
-   surf[1] = mt->bo->offset64 + mt->offset; /* reloc */
-
-   surf[2] = SET_FIELD(mt->logical_width0 - 1, GEN7_SURFACE_WIDTH) |
- SET_FIELD(mt->logical_height0 - 1, GEN7_SURFACE_HEIGHT);
-
-   surf[3] = SET_FIELD(depth - 1, BRW_SURFACE_DEPTH) |
- (mt->pitch - 1);
-
-   if (brw->is_haswell && _mesa_is_format_integer(mt->format))
-  surf[3] |= HSW_SURFACE_IS_INTEGER_FORMAT;
-
-   surf[4] = gen7_surface_msaa_bits(mt->num_samples, mt->msaa_layout) |
- SET_FIELD(min_layer, GEN7_SURFACE_MIN_ARRAY_ELEMENT) |
- SET_FIELD(depth - 1, GEN7_SURFACE_RENDER_TARGET_VIEW_EXTENT);
-
-   surf[5] = (SET_FIELD(GEN7_MOCS_L3, GEN7_SURFACE_MOCS) |
-  SET_FIELD(min_level - mt->first_level, GEN7_SURFACE_MIN_LOD) |
-  /* mip count */
-  (max_level - min_level - 1));
-
-   surf[7] = mt->fast_clear_color_value;
-
-   if (brw->is_haswell) {
-  const bool need_scs_green_to_blue = for_gather && format == 
BRW_SURFACEFORMAT_R32G32_FLOAT_LD;
-
-  surf[7] |=
- SET_FIELD(swizzle_to_scs(GET_SWZ(swizzle, 0), 
need_scs_green_to_blue), GEN7_SURFACE_SCS_R) |
- SET_FIELD(swizzle_to_scs(GET_SWZ(swizzle, 1), 
need_scs_green_to_blue), GEN7_SURFACE_SCS_G) |
- SET_FIELD(swizzle_to_scs(GET_SWZ(swizzle, 2), 
need_scs_green_to_blue), GEN7_SURFACE_SCS_B) |
- SET_FIELD(swizzle_to_scs(GET_SWZ(swizzle, 3), 
need_scs_green_to_blue), GEN7_SURFACE_SCS_A);
-   }
-
-   if (mt->mcs_mt) {
-  gen7_set_surface_mcs_info(brw, surf, *surf_offset,
-mt->mcs_mt, false /* is RT */);
-   }
-
-   /* Emit relocation to surface contents */
-   drm_intel_bo_emit_reloc(brw->batch.bo,
-   *surf_offset + 4,
-   mt->bo,
-   surf[1] - mt->bo->offset64,
-   I915_GEM_DOMAIN_SAMPLER,
-   (rw ? I915_GEM_DOMAIN_SAMPLER : 0));
-
-   gen7_check_surface_setup(surf, false /* 

[Mesa-dev] [PATCH 53/64] i965/state: Add generic surface update functions based on ISL

2016-06-11 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_state.h|   9 ++
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 184 +++
 2 files changed, 193 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
b/src/mesa/drivers/dri/i965/brw_state.h
index ba441cd..1667ee0 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -283,6 +283,15 @@ void brw_emit_surface_state(struct brw_context *brw,
 uint32_t *surf_offset, int surf_index,
 unsigned read_domains, unsigned write_domains);
 
+void brw_update_texture_surface(struct gl_context *ctx,
+unsigned unit, uint32_t *surf_offset,
+bool for_gather, uint32_t plane);
+
+uint32_t brw_update_renderbuffer_surface(struct brw_context *brw,
+ struct gl_renderbuffer *rb,
+ bool layered, unsigned unit,
+ uint32_t surf_index);
+
 void brw_update_renderbuffer_surfaces(struct brw_context *brw,
   const struct gl_framebuffer *fb,
   uint32_t render_target_start,
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 190b42a..a9540b4 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -133,6 +133,54 @@ brw_emit_surface_state(struct brw_context *brw,
}
 }
 
+uint32_t
+brw_update_renderbuffer_surface(struct brw_context *brw,
+struct gl_renderbuffer *rb,
+bool layered, unsigned unit /* unused */,
+uint32_t surf_index)
+{
+   struct gl_context *ctx = &brw->ctx;
+   struct intel_renderbuffer *irb = intel_renderbuffer(rb);
+   struct intel_mipmap_tree *mt = irb->mt;
+
+   assert(brw_render_target_supported(brw, rb));
+   intel_miptree_used_for_rendering(mt);
+
+   mesa_format rb_format = _mesa_get_render_format(ctx, intel_rb_format(irb));
+   if (unlikely(!brw->format_supported_as_render_target[rb_format])) {
+  _mesa_problem(ctx, "%s: renderbuffer format %s unsupported\n",
+__func__, _mesa_get_format_name(rb_format));
+   }
+
+   const unsigned layer_multiplier =
+  (irb->mt->msaa_layout == INTEL_MSAA_LAYOUT_UMS ||
+   irb->mt->msaa_layout == INTEL_MSAA_LAYOUT_CMS) ?
+  MAX2(irb->mt->num_samples, 1) : 1;
+
+   struct isl_view view = {
+  .format = brw->render_target_format[rb_format],
+  .base_level = irb->mt_level - irb->mt->first_level,
+  .levels = 1,
+  .base_array_layer = irb->mt_layer / layer_multiplier,
+  .array_len = MAX2(irb->layer_count, 1),
+  .channel_select = {
+ ISL_CHANNEL_SELECT_RED,
+ ISL_CHANNEL_SELECT_GREEN,
+ ISL_CHANNEL_SELECT_BLUE,
+ ISL_CHANNEL_SELECT_ALPHA,
+  },
+  .usage = ISL_SURF_USAGE_RENDER_TARGET_BIT,
+   };
+
+   uint32_t offset;
+   brw_emit_surface_state(brw, mt, &view,
+  surface_state_infos[brw->gen].rb_mocs, false,
+  &offset, surf_index,
+  I915_GEM_DOMAIN_RENDER,
+  I915_GEM_DOMAIN_RENDER);
+   return offset;
+}
+
 GLuint
 translate_tex_target(GLenum target)
 {
@@ -300,6 +348,142 @@ brw_get_texture_swizzle(const struct gl_context *ctx,
 swizzles[GET_SWZ(t->_Swizzle, 3)]);
 }
 
+/**
+ * Convert an swizzle enumeration (i.e. SWIZZLE_X) to one of the Gen7.5+
+ * "Shader Channel Select" enumerations (i.e. HSW_SCS_RED).  The mappings are
+ *
+ * SWIZZLE_X, SWIZZLE_Y, SWIZZLE_Z, SWIZZLE_W, SWIZZLE_ZERO, SWIZZLE_ONE
+ *         0          1          2          3             4            5
+ *         4          5          6          7             0            1
+ *   SCS_RED, SCS_GREEN,  SCS_BLUE, SCS_ALPHA,     SCS_ZERO,     SCS_ONE
+ *
+ * which is simply adding 4 then modding by 8 (or anding with 7).
+ *
+ * We then may need to apply workarounds for textureGather hardware bugs.
+ */
+static unsigned
+swizzle_to_scs(GLenum swizzle, bool need_green_to_blue)
+{
+   unsigned scs = (swizzle + 4) & 7;
+
+   return (need_green_to_blue && scs == HSW_SCS_GREEN) ? HSW_SCS_BLUE : scs;
+}
+
+void
+brw_update_texture_surface(struct gl_context *ctx,
+   unsigned unit,
+   uint32_t *surf_offset,
+   bool for_gather,
+   uint32_t plane)
+{
+   struct brw_context *brw = brw_context(ctx);
+   struct gl_texture_object *obj = ctx->Texture.Unit[unit]._Current;
+
+   if (obj->Target == GL_TEXTURE_BUFFER) {
+  brw_update_buffer_texture_surface(ctx, unit, surf_offset);
+
+   } else {
+  struct intel_texture_object *intel_obj = intel_texture_object(obj);
+  struct 

[Mesa-dev] [PATCH 27/64] isl/state: Add assertions for buffer surface restrictions

2016-06-11 Thread Jason Ekstrand
---
 src/intel/isl/isl_surface_state.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/src/intel/isl/isl_surface_state.c 
b/src/intel/isl/isl_surface_state.c
index 8f223d1..ca13175 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -416,6 +416,17 @@ isl_genX(buffer_fill_state_s)(void *state,
 {
uint32_t num_elements = info->size / info->stride;
 
+   if (GEN_GEN >= 7) {
+  if (info->format == ISL_FORMAT_RAW) {
+ assert(num_elements <= (1ull << 31));
+ assert((num_elements & 3) == 0);
+  } else {
+ assert(num_elements <= (1ull << 27));
+  }
+   } else {
+  assert(num_elements <= (1ull << 27));
+   }
+
struct GENX(RENDER_SURFACE_STATE) surface_state = {
   .SurfaceType = SURFTYPE_BUFFER,
   .SurfaceArray = false,
-- 
2.5.0.400.gff86faf
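
For anyone reading along, the restrictions asserted in this patch can be
sketched as a standalone predicate.  This is only an illustration under
stated assumptions: sketch_buffer_size_ok() and its gen/is_raw parameters
are hypothetical names, not part of ISL, and the limits are copied from the
asserts in the hunk above rather than re-derived from the PRM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not an ISL function): returns true when a buffer of
 * `size` bytes with element `stride` satisfies the limits asserted in the
 * patch.  `gen` stands in for GEN_GEN and `is_raw` for
 * info->format == ISL_FORMAT_RAW.
 */
static bool
sketch_buffer_size_ok(int gen, bool is_raw, uint64_t size, uint32_t stride)
{
   assert(stride > 0);
   const uint64_t num_elements = size / stride;

   /* gen7+ RAW buffers: up to 2^31 elements, and a multiple of 4. */
   if (gen >= 7 && is_raw)
      return num_elements <= (1ull << 31) && (num_elements & 3) == 0;

   /* Everything else: up to 2^27 elements. */
   return num_elements <= (1ull << 27);
}
```

For example, sketch_buffer_size_ok(7, true, 6, 1) is false because 6 is not
a multiple of 4, while the same size as a typed buffer passes.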



[Mesa-dev] [PATCH 59/64] i965/gen4-6: Use the generic ISL-based path for texture surfaces

2016-06-11 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 94 +---
 1 file changed, 1 insertion(+), 93 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index a9540b4..c310b15 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -552,98 +552,6 @@ brw_update_buffer_texture_surface(struct gl_context *ctx,
false /* rw */);
 }
 
-static void
-gen4_update_texture_surface(struct gl_context *ctx,
-unsigned unit,
-uint32_t *surf_offset,
-bool for_gather,
-uint32_t plane)
-{
-   struct brw_context *brw = brw_context(ctx);
-   struct gl_texture_object *tObj = ctx->Texture.Unit[unit]._Current;
-   struct intel_texture_object *intelObj = intel_texture_object(tObj);
-   struct intel_mipmap_tree *mt = intelObj->mt;
-   struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit);
-   uint32_t *surf;
-
-   /* BRW_NEW_TEXTURE_BUFFER */
-   if (tObj->Target == GL_TEXTURE_BUFFER) {
-  brw_update_buffer_texture_surface(ctx, unit, surf_offset);
-  return;
-   }
-
-   surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
- 6 * 4, 32, surf_offset);
-
-   uint32_t tex_format = translate_tex_format(brw, mt->format,
-  sampler->sRGBDecode);
-
-   if (tObj->Target == GL_TEXTURE_EXTERNAL_OES) {
-  if (plane > 0)
- mt = mt->plane[plane - 1];
-  if (mt == NULL)
- return;
-
-  tex_format = translate_tex_format(brw, mt->format, sampler->sRGBDecode);
-   }
-
-   if (for_gather) {
-  /* Sandybridge's gather4 message is broken for integer formats.
-   * To work around this, we pretend the surface is UNORM for
-   * 8 or 16-bit formats, and emit shader instructions to recover
-   * the real INT/UINT value.  For 32-bit formats, we pretend
-   * the surface is FLOAT, and simply reinterpret the resulting
-   * bits.
-   */
-  switch (tex_format) {
-  case BRW_SURFACEFORMAT_R8_SINT:
-  case BRW_SURFACEFORMAT_R8_UINT:
- tex_format = BRW_SURFACEFORMAT_R8_UNORM;
- break;
-
-  case BRW_SURFACEFORMAT_R16_SINT:
-  case BRW_SURFACEFORMAT_R16_UINT:
- tex_format = BRW_SURFACEFORMAT_R16_UNORM;
- break;
-
-  case BRW_SURFACEFORMAT_R32_SINT:
-  case BRW_SURFACEFORMAT_R32_UINT:
- tex_format = BRW_SURFACEFORMAT_R32_FLOAT;
- break;
-
-  default:
- break;
-  }
-   }
-
-   surf[0] = (translate_tex_target(tObj->Target) << BRW_SURFACE_TYPE_SHIFT |
- BRW_SURFACE_MIPMAPLAYOUT_BELOW << BRW_SURFACE_MIPLAYOUT_SHIFT |
- BRW_SURFACE_CUBEFACE_ENABLES |
- tex_format << BRW_SURFACE_FORMAT_SHIFT);
-
-   surf[1] = mt->bo->offset64 + mt->offset; /* reloc */
-
-   surf[2] = ((intelObj->_MaxLevel - tObj->BaseLevel) << BRW_SURFACE_LOD_SHIFT 
|
- (mt->logical_width0 - 1) << BRW_SURFACE_WIDTH_SHIFT |
- (mt->logical_height0 - 1) << BRW_SURFACE_HEIGHT_SHIFT);
-
-   surf[3] = (brw_get_surface_tiling_bits(mt->tiling) |
- (mt->logical_depth0 - 1) << BRW_SURFACE_DEPTH_SHIFT |
- (mt->pitch - 1) << BRW_SURFACE_PITCH_SHIFT);
-
-   surf[4] = (brw_get_surface_num_multisamples(mt->num_samples) |
-  SET_FIELD(tObj->BaseLevel - mt->first_level, 
BRW_SURFACE_MIN_LOD));
-
-   surf[5] = mt->valign == 4 ? BRW_SURFACE_VERTICAL_ALIGN_ENABLE : 0;
-
-   /* Emit relocation to surface contents */
-   drm_intel_bo_emit_reloc(brw->batch.bo,
-   *surf_offset + 4,
-   mt->bo,
-   surf[1] - mt->bo->offset64,
-   I915_GEM_DOMAIN_SAMPLER, 0);
-}
-
 /**
  * Create the constant buffer surface.  Vertex/fragment shader constants will 
be
  * read from this buffer with Data Port Read instructions/messages.
@@ -1680,7 +1588,7 @@ const struct brw_tracked_state brw_wm_image_surfaces = {
 void
 gen4_init_vtable_surface_functions(struct brw_context *brw)
 {
-   brw->vtbl.update_texture_surface = gen4_update_texture_surface;
+   brw->vtbl.update_texture_surface = brw_update_texture_surface;
brw->vtbl.update_renderbuffer_surface = gen4_update_renderbuffer_surface;
brw->vtbl.emit_null_surface_state = brw_emit_null_surface_state;
brw->vtbl.emit_buffer_surface_state = gen4_emit_buffer_surface_state;
-- 
2.5.0.400.gff86faf
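
As a reading aid, the Sandybridge gather4 workaround described in the
removed gen4_update_texture_surface() comment amounts to a small format
override.  A minimal sketch, assuming a made-up enum in place of the real
BRW_SURFACEFORMAT_* values; sketch_gather_format() is a hypothetical name
and is not how the generic ISL path is necessarily structured.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical format enum; the real code uses BRW_SURFACEFORMAT_* defines,
 * whose numeric values this sketch does not try to reproduce.
 */
enum sketch_format {
   SKETCH_R8_SINT, SKETCH_R8_UINT, SKETCH_R8_UNORM,
   SKETCH_R16_SINT, SKETCH_R16_UINT, SKETCH_R16_UNORM,
   SKETCH_R32_SINT, SKETCH_R32_UINT, SKETCH_R32_FLOAT,
   SKETCH_FORMAT_COUNT
};

/* For gather4, pretend 8/16-bit integer surfaces are UNORM and 32-bit
 * integer surfaces are FLOAT; every other format is left untouched.
 */
static enum sketch_format
sketch_gather_format(enum sketch_format fmt, bool for_gather)
{
   assert(fmt < SKETCH_FORMAT_COUNT);

   if (!for_gather)
      return fmt;

   switch (fmt) {
   case SKETCH_R8_SINT:
   case SKETCH_R8_UINT:
      return SKETCH_R8_UNORM;
   case SKETCH_R16_SINT:
   case SKETCH_R16_UINT:
      return SKETCH_R16_UNORM;
   case SKETCH_R32_SINT:
   case SKETCH_R32_UINT:
      return SKETCH_R32_FLOAT;
   default:
      return fmt;
   }
}
```

The shader side still has to recover the real INT/UINT value afterwards, as
the original comment notes; the sketch covers only the surface format
selection.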



[Mesa-dev] [PATCH 50/64] i965/blorp: Use a generic ISL path for texture surfaces on gen8

2016-06-11 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/gen8_blorp.c | 47 +++---
 1 file changed, 38 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_blorp.c 
b/src/mesa/drivers/dri/i965/gen8_blorp.c
index b5c600b..918f3d6 100644
--- a/src/mesa/drivers/dri/i965/gen8_blorp.c
+++ b/src/mesa/drivers/dri/i965/gen8_blorp.c
@@ -499,6 +499,25 @@ gen8_blorp_emit_constant_ps(struct brw_context *brw,
ADVANCE_BATCH();
 }
 
+/**
+ * Convert an swizzle enumeration (i.e. SWIZZLE_X) to one of the Gen7.5+
+ * "Shader Channel Select" enumerations (i.e. HSW_SCS_RED).  The mappings are
+ *
+ * SWIZZLE_X, SWIZZLE_Y, SWIZZLE_Z, SWIZZLE_W, SWIZZLE_ZERO, SWIZZLE_ONE
+ *         0          1          2          3             4            5
+ *         4          5          6          7             0            1
+ *   SCS_RED, SCS_GREEN,  SCS_BLUE, SCS_ALPHA,     SCS_ZERO,     SCS_ONE
+ *
+ * which is simply adding 4 then modding by 8 (or anding with 7).
+ *
+ * We then may need to apply workarounds for textureGather hardware bugs.
+ */
+static unsigned
+swizzle_to_scs(GLenum swizzle)
+{
+   return (swizzle + 4) & 7;
+}
+
 static uint32_t
 gen8_blorp_emit_surface_states(struct brw_context *brw,
const struct brw_blorp_params *params)
@@ -531,21 +550,31 @@ gen8_blorp_emit_surface_states(struct brw_context *brw,
   mt->msaa_layout == INTEL_MSAA_LAYOUT_CMS) ?
  MAX2(mt->num_samples, 1) : 1;
 
-  /* Cube textures are sampled as 2D array. */
   const bool is_cube = mt->target == GL_TEXTURE_CUBE_MAP_ARRAY ||
mt->target == GL_TEXTURE_CUBE_MAP;
   const unsigned depth = (is_cube ? 6 : 1) * mt->logical_depth0;
-  const GLenum target = is_cube ? GL_TEXTURE_2D_ARRAY : mt->target;
   const unsigned layer = mt->target != GL_TEXTURE_3D ?
 surface->layer / layer_divider : 0;
 
-  brw->vtbl.emit_texture_surface_state(brw, mt, target,
-   layer, depth,
-   surface->level, mt->last_level + 1,
-   surface->brw_surfaceformat,
-   surface->swizzle,
-   &wm_surf_offset_texture,
-   -1, false, false);
+  struct isl_view view = {
+ .format = surface->brw_surfaceformat,
+ .base_level = surface->level,
+ .levels = mt->last_level - surface->level + 1,
+ .base_array_layer = layer,
+ .array_len = depth - layer,
+ .channel_select = {
+swizzle_to_scs(GET_SWZ(surface->swizzle, 0)),
+swizzle_to_scs(GET_SWZ(surface->swizzle, 1)),
+swizzle_to_scs(GET_SWZ(surface->swizzle, 2)),
+swizzle_to_scs(GET_SWZ(surface->swizzle, 3)),
+ },
+ .usage = ISL_SURF_USAGE_TEXTURE_BIT,
+  };
+
+  brw_emit_surface_state(brw, mt, &view,
+ brw->gen >= 9 ? SKL_MOCS_WB : BDW_MOCS_WB,
+ false, &wm_surf_offset_texture, -1,
+ I915_GEM_DOMAIN_SAMPLER, 0);
}
 
return gen6_blorp_emit_binding_table(brw,
-- 
2.5.0.400.gff86faf
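
The swizzle-to-SCS mapping in the comment above is easy to check by hand.
Here is a minimal standalone sketch, assuming the numeric values shown in
the comment's table (SWIZZLE_X..SWIZZLE_ONE = 0..5, SCS_ZERO = 0,
SCS_ONE = 1, SCS_RED..SCS_ALPHA = 4..7).  The SKETCH_* names are
hypothetical stand-ins, not the real Mesa definitions.

```c
#include <assert.h>

/* Hypothetical stand-ins for Mesa's SWIZZLE_* values, numbered as in the
 * comment's table.
 */
enum sketch_swizzle {
   SKETCH_SWIZZLE_X = 0,
   SKETCH_SWIZZLE_Y = 1,
   SKETCH_SWIZZLE_Z = 2,
   SKETCH_SWIZZLE_W = 3,
   SKETCH_SWIZZLE_ZERO = 4,
   SKETCH_SWIZZLE_ONE = 5,
};

/* Hypothetical stand-ins for the "Shader Channel Select" values. */
enum sketch_scs {
   SKETCH_SCS_ZERO = 0,
   SKETCH_SCS_ONE = 1,
   SKETCH_SCS_RED = 4,
   SKETCH_SCS_GREEN = 5,
   SKETCH_SCS_BLUE = 6,
   SKETCH_SCS_ALPHA = 7,
};

/* Add 4, then wrap modulo 8: X..W land on 4..7 and ZERO/ONE wrap to 0/1. */
static unsigned
sketch_swizzle_to_scs(unsigned swizzle)
{
   assert(swizzle <= SKETCH_SWIZZLE_ONE);
   return (swizzle + 4) & 7;
}
```

SKETCH_SWIZZLE_ONE is the case that shows why the mask is needed:
5 + 4 = 9, and 9 & 7 = 1, which is SKETCH_SCS_ONE.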



[Mesa-dev] [PATCH 54/64] i965/gen8: Use the generic ISL-based path for texture surfaces

2016-06-11 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/gen8_surface_state.c | 213 +
 1 file changed, 1 insertion(+), 212 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c 
b/src/mesa/drivers/dri/i965/gen8_surface_state.c
index f4375ea..ed26271 100644
--- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
@@ -42,23 +42,6 @@
 #include "brw_wm.h"
 #include "isl/isl.h"
 
-/**
- * Convert an swizzle enumeration (i.e. SWIZZLE_X) to one of the Gen7.5+
- * "Shader Channel Select" enumerations (i.e. HSW_SCS_RED).  The mappings are
- *
- * SWIZZLE_X, SWIZZLE_Y, SWIZZLE_Z, SWIZZLE_W, SWIZZLE_ZERO, SWIZZLE_ONE
- *         0          1          2          3             4            5
- *         4          5          6          7             0            1
- *   SCS_RED, SCS_GREEN,  SCS_BLUE, SCS_ALPHA,     SCS_ZERO,     SCS_ONE
- *
- * which is simply adding 4 then modding by 8 (or anding with 7).
- */
-static unsigned
-swizzle_to_scs(unsigned swizzle)
-{
-   return (swizzle + 4) & 7;
-}
-
 static uint32_t
 surface_tiling_resource_mode(uint32_t tr_mode)
 {
@@ -224,199 +207,6 @@ gen8_get_aux_mode(const struct brw_context *brw,
return GEN8_SURFACE_AUX_MODE_MCS;
 }
 
-static void
-gen8_emit_texture_surface_state(struct brw_context *brw,
-struct intel_mipmap_tree *mt,
-GLenum target,
-unsigned min_layer, unsigned max_layer,
-unsigned min_level, unsigned max_level,
-unsigned format,
-unsigned swizzle,
-uint32_t *surf_offset, int surf_index,
-bool rw, bool for_gather)
-{
-   const unsigned depth = max_layer - min_layer;
-   struct intel_mipmap_tree *aux_mt = mt->mcs_mt;
-   uint32_t mocs_wb = brw->gen >= 9 ? SKL_MOCS_WB : BDW_MOCS_WB;
-   unsigned tiling_mode, pitch;
-   const unsigned tr_mode = surface_tiling_resource_mode(mt->tr_mode);
-   const uint32_t surf_type = translate_tex_target(target);
-   uint32_t aux_mode = gen8_get_aux_mode(brw, mt);
-
-   if (mt->format == MESA_FORMAT_S_UINT8) {
-  tiling_mode = GEN8_SURFACE_TILING_W;
-  pitch = 2 * mt->pitch;
-   } else {
-  tiling_mode = gen8_surface_tiling_mode(mt->tiling);
-  pitch = mt->pitch;
-   }
-
-   /* Prior to Gen9, MCS is not uploaded for single-sampled surfaces because
-* the color buffer should always have been resolved before it is used as
-* a texture so there is no need for it. On Gen9 it will be uploaded when
-* the surface is losslessly compressed (CCS_E).
-* However, sampling engine is not capable of re-interpreting the
-* underlying color buffer in non-compressible formats when the surface
-* is configured as compressed. Therefore state upload has made sure the
-* buffer is in resolved state allowing the surface to be configured as
-* non-compressed.
-*/
-   if (mt->num_samples <= 1 &&
-   (aux_mode != GEN9_SURFACE_AUX_MODE_CCS_E ||
-!isl_format_supports_lossless_compression(
-brw->intelScreen->devinfo, format))) {
-  assert(!mt->mcs_mt ||
- mt->fast_clear_state == INTEL_FAST_CLEAR_STATE_RESOLVED);
-  aux_mt = NULL;
-  aux_mode = GEN8_SURFACE_AUX_MODE_NONE;
-   }
-
-   uint32_t *surf = gen8_allocate_surface_state(brw, surf_offset, surf_index);
-
-   surf[0] = SET_FIELD(surf_type, BRW_SURFACE_TYPE) |
- format << BRW_SURFACE_FORMAT_SHIFT |
- gen8_vertical_alignment(brw, mt, surf_type) |
- gen8_horizontal_alignment(brw, mt, surf_type) |
- tiling_mode;
-
-   if (surf_type == BRW_SURFACE_CUBE) {
-  surf[0] |= BRW_SURFACE_CUBEFACE_ENABLES;
-   }
-
-   /* From the CHV PRM, Volume 2d, page 321 (RENDER_SURFACE_STATE dword 0
-* bit 9 "Sampler L2 Bypass Mode Disable" Programming Notes):
-*
-*This bit must be set for the following surface types: BC2_UNORM
-*BC3_UNORM BC5_UNORM BC5_SNORM BC7_UNORM
-*/
-   if ((brw->gen >= 9 || brw->is_cherryview) &&
-   (format == BRW_SURFACEFORMAT_BC2_UNORM ||
-format == BRW_SURFACEFORMAT_BC3_UNORM ||
-format == BRW_SURFACEFORMAT_BC5_UNORM ||
-format == BRW_SURFACEFORMAT_BC5_SNORM ||
-format == BRW_SURFACEFORMAT_BC7_UNORM))
-  surf[0] |= GEN8_SURFACE_SAMPLER_L2_BYPASS_DISABLE;
-
-   if (mt->target != GL_TEXTURE_3D)
-  surf[0] |= GEN8_SURFACE_IS_ARRAY;
-
-   surf[1] = SET_FIELD(mocs_wb, GEN8_SURFACE_MOCS) | mt->qpitch >> 2;
-
-   surf[2] = SET_FIELD(mt->logical_width0 - 1, GEN7_SURFACE_WIDTH) |
- SET_FIELD(mt->logical_height0 - 1, GEN7_SURFACE_HEIGHT);
-
-   surf[3] = SET_FIELD(depth - 1, BRW_SURFACE_DEPTH) | (pitch - 1);
-
-   surf[4] = gen7_surface_msaa_bits(mt->num_samples, mt->msaa_layout) |
- SET_FIELD(min_layer, GEN7_SURFACE_MIN_ARRAY_ELEMENT) |
-

[Mesa-dev] [PATCH 49/64] i965/state: Add a helper for emitting a surface state using isl

2016-06-11 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_state.h|  8 +++
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 81 
 2 files changed, 89 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h
index eec4bae..ba441cd 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -275,6 +275,14 @@ GLuint translate_tex_format(struct brw_context *brw,
 int brw_get_texture_swizzle(const struct gl_context *ctx,
 const struct gl_texture_object *t);
 
+struct isl_view;
+void brw_emit_surface_state(struct brw_context *brw,
+struct intel_mipmap_tree *mt,
+const struct isl_view *view,
+uint32_t mocs, bool for_gather,
+uint32_t *surf_offset, int surf_index,
+unsigned read_domains, unsigned write_domains);
+
 void brw_update_renderbuffer_surfaces(struct brw_context *brw,
   const struct gl_framebuffer *fb,
   uint32_t render_target_start,
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index e1f4bcb..2888cc9 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -35,6 +35,7 @@
 #include "main/mtypes.h"
 #include "main/samplerobj.h"
 #include "main/shaderimage.h"
+#include "main/teximage.h"
 #include "program/prog_parameter.h"
 #include "program/prog_instruction.h"
 #include "main/framebuffer.h"
@@ -52,6 +53,86 @@
 #include "brw_defines.h"
 #include "brw_wm.h"
 
+struct surface_state_info {
+   unsigned num_dwords;
+   unsigned ss_align; /* Required alignment of RENDER_SURFACE_STATE in bytes */
+   unsigned reloc_dw;
+   unsigned aux_reloc_dw;
+   unsigned tex_mocs;
+   unsigned rb_mocs;
+};
+
+static const struct surface_state_info surface_state_infos[] = {
+   [4] = {6,  32, 1,  0},
+   [5] = {6,  32, 1,  0},
+   [6] = {6,  32, 1,  0},
+   [7] = {8,  32, 1,  6,  GEN7_MOCS_L3, GEN7_MOCS_L3},
+   [8] = {13, 64, 8,  10, BDW_MOCS_WB,  BDW_MOCS_PTE},
+   [9] = {16, 64, 8,  10, SKL_MOCS_WB,  SKL_MOCS_PTE},
+};
+
+void
+brw_emit_surface_state(struct brw_context *brw,
+   struct intel_mipmap_tree *mt,
+   const struct isl_view *view,
+   uint32_t mocs, bool for_gather,
+   uint32_t *surf_offset, int surf_index,
+   unsigned read_domains, unsigned write_domains)
+{
+   /* TODO: This should go in the context */
+   struct isl_device isl_dev;
+   isl_device_init(&isl_dev, brw->intelScreen->devinfo, brw->has_swizzling);
+
+   const struct surface_state_info ss_info = surface_state_infos[brw->gen];
+
+   struct isl_surf surf;
+   intel_miptree_get_isl_surf(brw, mt, &surf);
+
+   union isl_color_value clear_color = { .u32 = { 0, 0, 0, 0 } };
+
+   struct isl_surf *aux_surf = NULL, aux_surf_s;
+   uint64_t aux_offset = 0;
+   if (mt->mcs_mt &&
+   ((view->usage & ISL_SURF_USAGE_RENDER_TARGET_BIT) ||
+mt->fast_clear_state != INTEL_FAST_CLEAR_STATE_RESOLVED)) {
+  intel_miptree_get_ccs_isl_surf(brw, mt, &aux_surf_s);
+  aux_surf = &aux_surf_s;
+  assert(mt->mcs_mt->offset == 0);
+  aux_offset = mt->mcs_mt->bo->offset64;
+
+  /* We only really need a clear color if we also have an auxiliary
+   * surface.  Without one, it does nothing.
+   */
+  clear_color = intel_miptree_get_isl_clear_color(brw, mt);
+   }
+
+   uint32_t *dw = __brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
+ss_info.num_dwords * 4, ss_info.ss_align,
+surf_index, surf_offset);
+
+   isl_surf_fill_state(&isl_dev, dw, .surf = &surf, .view = view,
+   .address = mt->bo->offset64 + mt->offset,
+   .aux_surf = aux_surf, .aux_address = aux_offset,
+   .mocs = mocs, .clear_color = clear_color);
+
+   drm_intel_bo_emit_reloc(brw->batch.bo,
+   *surf_offset + 4 * ss_info.reloc_dw,
+   mt->bo, mt->offset,
+   read_domains, write_domains);
+
+   if (aux_surf) {
+  /* On gen7 and prior, the bottom 12 bits of the MCS base address are
+   * used to store other information.  This should be ok, however, because
+   * surface buffer addresses are always 4K page aligned.
+   */
+  assert((aux_offset & 0xfff) == 0);
+  drm_intel_bo_emit_reloc(brw->batch.bo,
+  *surf_offset + 4 * ss_info.aux_reloc_dw,
+  mt->mcs_mt->bo, dw[ss_info.aux_reloc_dw] & 0xfff,
+  read_domains, write_domains);
+   }
+}
+
 GLuint
 translate_tex_target(GLenum target)
 {
-- 
2.5.0.400.gff86faf

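A note for readers following the relocation math in brw_emit_surface_state: the per-gen table gives the dword index of each address field, and the relocation is emitted at the surface state's batch offset plus four bytes per dword. Below is a minimal sketch of just that arithmetic. The struct ss_layout type and the two helper names are hypothetical and are not part of the patch; the dword indices are copied from surface_state_infos above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed-down copy of the per-gen table from the patch:
 * only the fields needed to locate the two relocations.
 */
struct ss_layout {
   unsigned num_dwords;   /* size of the surface state in dwords */
   unsigned reloc_dw;     /* dword holding the surface base address */
   unsigned aux_reloc_dw; /* dword holding the aux address (0 = none) */
};

static const struct ss_layout ss_layouts[] = {
   [4] = {6,  1, 0},
   [5] = {6,  1, 0},
   [6] = {6,  1, 0},
   [7] = {8,  1, 6},
   [8] = {13, 8, 10},
   [9] = {16, 8, 10},
};

/* Byte offset in the batch of the main surface address relocation. */
static uint32_t
reloc_offset(uint32_t surf_offset, unsigned gen)
{
   assert(gen >= 4 && gen <= 9);
   return surf_offset + 4 * ss_layouts[gen].reloc_dw;
}

/* Byte offset in the batch of the auxiliary surface relocation.
 * Only gens with a non-zero aux_reloc_dw entry have one.
 */
static uint32_t
aux_reloc_offset(uint32_t surf_offset, unsigned gen)
{
   assert(gen >= 7 && gen <= 9);
   return surf_offset + 4 * ss_layouts[gen].aux_reloc_dw;
}
```

For example, a gen8 surface state placed at batch offset 256 would get its main relocation at byte 288 (dword 8) and its aux relocation at byte 296 (dword 10).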

[Mesa-dev] [PATCH 61/64] i965/state: Account for the element size in emit_buffer_surface_state

2016-06-11 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 11 ++-
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c |  9 +
 src/mesa/drivers/dri/i965/gen8_surface_state.c|  9 +
 3 files changed, 16 insertions(+), 13 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index c310b15..ba2ad7d 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -494,6 +494,7 @@ gen4_emit_buffer_surface_state(struct brw_context *brw,
unsigned pitch,
bool rw)
 {
+   unsigned elements = buffer_size / pitch;
uint32_t *surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
 6 * 4, 32, out_offset);
memset(surf, 0, 6 * 4);
@@ -502,9 +503,9 @@ gen4_emit_buffer_surface_state(struct brw_context *brw,
  surface_format << BRW_SURFACE_FORMAT_SHIFT |
  (brw->gen >= 6 ? BRW_SURFACE_RC_READ_WRITE : 0);
surf[1] = (bo ? bo->offset64 : 0) + buffer_offset; /* reloc */
-   surf[2] = ((buffer_size - 1) & 0x7f) << BRW_SURFACE_WIDTH_SHIFT |
- (((buffer_size - 1) >> 7) & 0x1fff) << BRW_SURFACE_HEIGHT_SHIFT;
-   surf[3] = (((buffer_size - 1) >> 20) & 0x7f) << BRW_SURFACE_DEPTH_SHIFT |
+   surf[2] = ((elements - 1) & 0x7f) << BRW_SURFACE_WIDTH_SHIFT |
+ (((elements - 1) >> 7) & 0x1fff) << BRW_SURFACE_HEIGHT_SHIFT;
+   surf[3] = (((elements - 1) >> 20) & 0x7f) << BRW_SURFACE_DEPTH_SHIFT |
  (pitch - 1) << BRW_SURFACE_PITCH_SHIFT;
 
/* Emit relocation to surface contents.  The 965 PRM, Volume 4, section
@@ -547,7 +548,7 @@ brw_update_buffer_texture_surface(struct gl_context *ctx,
brw->vtbl.emit_buffer_surface_state(brw, surf_offset, bo,
tObj->BufferOffset,
brw_format,
-   size / texel_size,
+   size,
texel_size,
false /* rw */);
 }
@@ -1478,7 +1479,7 @@ update_image_surface(struct brw_context *brw,
 
  brw->vtbl.emit_buffer_surface_state(
 brw, surf_offset, intel_obj->buffer, obj->BufferOffset,
-format, intel_obj->Base.Size / texel_size, texel_size,
+format, intel_obj->Base.Size, texel_size,
 access != GL_READ_ONLY);
 
  update_buffer_image_param(brw, u, surface_idx, param);
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index bb94f2d..65a1cb0 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -135,6 +135,7 @@ gen7_emit_buffer_surface_state(struct brw_context *brw,
unsigned pitch,
bool rw)
 {
+   unsigned elements = buffer_size / pitch;
uint32_t *surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
 8 * 4, 32, out_offset);
memset(surf, 0, 8 * 4);
@@ -143,12 +144,12 @@ gen7_emit_buffer_surface_state(struct brw_context *brw,
  surface_format << BRW_SURFACE_FORMAT_SHIFT |
  BRW_SURFACE_RC_READ_WRITE;
surf[1] = (bo ? bo->offset64 : 0) + buffer_offset; /* reloc */
-   surf[2] = SET_FIELD((buffer_size - 1) & 0x7f, GEN7_SURFACE_WIDTH) |
- SET_FIELD(((buffer_size - 1) >> 7) & 0x3fff, GEN7_SURFACE_HEIGHT);
+   surf[2] = SET_FIELD((elements - 1) & 0x7f, GEN7_SURFACE_WIDTH) |
+ SET_FIELD(((elements - 1) >> 7) & 0x3fff, GEN7_SURFACE_HEIGHT);
if (surface_format == BRW_SURFACEFORMAT_RAW)
-  surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3ff, BRW_SURFACE_DEPTH);
+  surf[3] = SET_FIELD(((elements - 1) >> 21) & 0x3ff, BRW_SURFACE_DEPTH);
else
-  surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3f, BRW_SURFACE_DEPTH);
+  surf[3] = SET_FIELD(((elements - 1) >> 21) & 0x3f, BRW_SURFACE_DEPTH);
surf[3] |= (pitch - 1);
 
surf[5] = SET_FIELD(GEN7_MOCS_L3, GEN7_SURFACE_MOCS);
diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c b/src/mesa/drivers/dri/i965/gen8_surface_state.c
index 00e4c48..9ac8a48 100644
--- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
@@ -63,6 +63,7 @@ gen8_emit_buffer_surface_state(struct brw_context *brw,
unsigned pitch,
bool rw)
 {
+   unsigned elements = buffer_size / pitch;
const unsigned mocs = brw->gen >= 9 ? SKL_MOCS_WB : BDW_MOCS_WB;
uint32_t *surf = gen8_allocate_surface_state(brw, out_offset, -1);
 
@@ -71,12 +72,12 @@ gen8_emit_buffer_surface_state(struct brw_context *brw,
  BRW_SURFACE_RC_READ_WRITE;
surf[1] = SET_FIELD(mocs, 

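To make the change above easier to follow: callers now pass the buffer size in bytes, and the emit function derives the element count as buffer_size / pitch before splitting (elements - 1) across the width, height and depth fields. The sketch below mirrors the gen7 masks and shifts shown in the diff. The struct buffer_extent type and the function name are hypothetical stand-ins, and it returns the raw field values instead of packing them with SET_FIELD.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical holder for the three size fields of a buffer surface. */
struct buffer_extent {
   uint32_t width;   /* bits 6:0 of (elements - 1) */
   uint32_t height;  /* bits 20:7 of (elements - 1) */
   uint32_t depth;   /* bits 21 and up; the mask depends on the format */
};

/* Split a buffer of buffer_size bytes with the given element pitch into
 * the gen7 width/height/depth fields, using the masks from the diff.
 */
static struct buffer_extent
gen7_split_buffer_elements(uint32_t buffer_size, uint32_t pitch, bool is_raw)
{
   assert(pitch > 0 && buffer_size >= pitch);

   const uint32_t n = buffer_size / pitch - 1;

   return (struct buffer_extent) {
      .width = n & 0x7f,
      .height = (n >> 7) & 0x3fff,
      .depth = (n >> 21) & (is_raw ? 0x3ff : 0x3f),
   };
}
```

A 4096-byte buffer with a 16-byte pitch holds 256 elements, so (elements - 1) is 255 and splits into width 127, height 1, depth 0.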
[Mesa-dev] [PATCH 58/64] i965/gen6: Use the generic ISL-based path for renderbuffer surfaces

2016-06-11 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/gen6_surface_state.c | 100 +
 1 file changed, 1 insertion(+), 99 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_surface_state.c b/src/mesa/drivers/dri/i965/gen6_surface_state.c
index d892c93..84b8ef4 100644
--- a/src/mesa/drivers/dri/i965/gen6_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_surface_state.c
@@ -40,107 +40,9 @@
 #include "brw_defines.h"
 #include "brw_wm.h"
 
-/**
- * Sets up a surface state structure to point at the given region.
- * While it is only used for the front/back buffer currently, it should be
- * usable for further buffers when doing ARB_draw_buffer support.
- */
-static uint32_t
-gen6_update_renderbuffer_surface(struct brw_context *brw,
- struct gl_renderbuffer *rb,
- bool layered, unsigned unit /* unused */,
- uint32_t surf_index)
-{
-   struct gl_context *ctx = &brw->ctx;
-   struct intel_renderbuffer *irb = intel_renderbuffer(rb);
-   struct intel_mipmap_tree *mt = irb->mt;
-   uint32_t *surf;
-   uint32_t format = 0;
-   uint32_t offset;
-   /* _NEW_BUFFERS */
-   mesa_format rb_format = _mesa_get_render_format(ctx, intel_rb_format(irb));
-   uint32_t surftype;
-   int depth = MAX2(irb->layer_count, 1);
-   const GLenum gl_target =
-  rb->TexImage ? rb->TexImage->TexObject->Target : GL_TEXTURE_2D;
-
-   intel_miptree_used_for_rendering(irb->mt);
-
-   surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE, 6 * 4, 32, &offset);
-
-   format = brw->render_target_format[rb_format];
-   if (unlikely(!brw->format_supported_as_render_target[rb_format])) {
-  _mesa_problem(ctx, "%s: renderbuffer format %s unsupported\n",
-__func__, _mesa_get_format_name(rb_format));
-   }
-
-   switch (gl_target) {
-   case GL_TEXTURE_CUBE_MAP_ARRAY:
-   case GL_TEXTURE_CUBE_MAP:
-  surftype = BRW_SURFACE_2D;
-  depth *= 6;
-  break;
-   case GL_TEXTURE_3D:
-  depth = MAX2(irb->mt->logical_depth0, 1);
-  /* fallthrough */
-   default:
-  surftype = translate_tex_target(gl_target);
-  break;
-   }
-
-   const int min_array_element = irb->mt_layer;
-   assert(!layered || irb->mt_layer == 0);
-
-   surf[0] = SET_FIELD(surftype, BRW_SURFACE_TYPE) |
- SET_FIELD(format, BRW_SURFACE_FORMAT);
-
-   /* reloc */
-   assert(mt->offset % mt->cpp == 0);
-   surf[1] = mt->bo->offset64 + mt->offset;
-
-   /* In the gen6 PRM Volume 1 Part 1: Graphics Core, Section 7.18.3.7.1
-* (Surface Arrays For all surfaces other than separate stencil buffer):
-*
-* "[DevSNB] Errata: Sampler MSAA Qpitch will be 4 greater than the value
-*  calculated in the equation above , for every other odd Surface Height
-*  starting from 1 i.e. 1,5,9,13"
-*
-* Since this Qpitch errata only impacts the sampler, we have to adjust the
-* input for the rendering surface to achieve the same qpitch. For the
-* affected heights, we increment the height by 1 for the rendering
-* surface.
-*/
-   int height0 = irb->mt->logical_height0;
-   if (brw->gen == 6 && irb->mt->num_samples > 1 && (height0 % 4) == 1)
-  height0++;
-
-   surf[2] = SET_FIELD(mt->logical_width0 - 1, BRW_SURFACE_WIDTH) |
- SET_FIELD(height0 - 1, BRW_SURFACE_HEIGHT) |
- SET_FIELD(irb->mt_level - irb->mt->first_level, BRW_SURFACE_LOD);
-
-   surf[3] = brw_get_surface_tiling_bits(mt->tiling) |
- SET_FIELD(depth - 1, BRW_SURFACE_DEPTH) |
- SET_FIELD(mt->pitch - 1, BRW_SURFACE_PITCH);
-
-   surf[4] = brw_get_surface_num_multisamples(mt->num_samples) |
- SET_FIELD(min_array_element, BRW_SURFACE_MIN_ARRAY_ELEMENT) |
- SET_FIELD(depth - 1, BRW_SURFACE_RENDER_TARGET_VIEW_EXTENT);
-
-   surf[5] = (mt->valign == 4 ? BRW_SURFACE_VERTICAL_ALIGN_ENABLE : 0);
-
-   drm_intel_bo_emit_reloc(brw->batch.bo,
-   offset + 4,
-   mt->bo,
-   surf[1] - mt->bo->offset64,
-   I915_GEM_DOMAIN_RENDER,
-   I915_GEM_DOMAIN_RENDER);
-
-   return offset;
-}
-
 void
 gen6_init_vtable_surface_functions(struct brw_context *brw)
 {
gen4_init_vtable_surface_functions(brw);
-   brw->vtbl.update_renderbuffer_surface = gen6_update_renderbuffer_surface;
+   brw->vtbl.update_renderbuffer_surface = brw_update_renderbuffer_surface;
 }
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

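For reference, the Qpitch errata workaround in the removed gen6 function boils down to a small height adjustment for multisampled render targets. The sketch below restates only that rule as a standalone helper. The function name and its flat integer parameters are hypothetical; the condition is taken from the removed code above.

```c
#include <assert.h>

/* Height to program for a render target, following the removed gen6 code:
 * on gen6 with MSAA, heights of 1, 5, 9, 13, ... are bumped by one so the
 * render surface ends up with the same qpitch the sampler computes.
 */
static int
gen6_render_target_height(int gen, int num_samples, int height0)
{
   assert(height0 > 0);

   if (gen == 6 && num_samples > 1 && (height0 % 4) == 1)
      return height0 + 1;

   return height0;
}
```

With this rule a 4x multisampled gen6 surface of height 5 is programmed as height 6, while single-sampled surfaces and other gens are left alone.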

[Mesa-dev] [PATCH 40/64] i965/miptree: Add a helper for getting an isl_surf from a miptree

2016-06-11 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 171 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |   6 +
 2 files changed, 175 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index b6265dc..83a9764 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -26,8 +26,6 @@
 #include 
 #include 
 
-#include "isl/isl.h"
-
 #include "intel_batchbuffer.h"
 #include "intel_mipmap_tree.h"
 #include "intel_resolve_map.h"
@@ -2999,3 +2997,172 @@ intel_miptree_unmap(struct brw_context *brw,
 
intel_miptree_release_map(mt, level, slice);
 }
+
+void
+intel_miptree_get_isl_surf(struct brw_context *brw,
+   const struct intel_mipmap_tree *mt,
+   struct isl_surf *surf)
+{
+   switch (mt->target) {
+   case GL_TEXTURE_1D:
+   case GL_TEXTURE_1D_ARRAY: {
+  surf->dim = ISL_SURF_DIM_1D;
+  if (brw->gen >= 9 && mt->tiling == I915_TILING_NONE)
+ surf->dim_layout = ISL_DIM_LAYOUT_GEN9_1D;
+  else
+ surf->dim_layout = ISL_DIM_LAYOUT_GEN4_2D;
+  break;
+   }
+   case GL_TEXTURE_2D:
+   case GL_TEXTURE_2D_ARRAY:
+   case GL_TEXTURE_RECTANGLE:
+   case GL_TEXTURE_CUBE_MAP:
+   case GL_TEXTURE_CUBE_MAP_ARRAY:
+   case GL_TEXTURE_2D_MULTISAMPLE:
+   case GL_TEXTURE_2D_MULTISAMPLE_ARRAY:
+   case GL_TEXTURE_EXTERNAL_OES:
+  surf->dim = ISL_SURF_DIM_2D;
+  surf->dim_layout = ISL_DIM_LAYOUT_GEN4_2D;
+  break;
+   case GL_TEXTURE_3D:
+  surf->dim = ISL_SURF_DIM_3D;
+  if (brw->gen >= 9)
+ surf->dim_layout = ISL_DIM_LAYOUT_GEN4_2D;
+  else
+ surf->dim_layout = ISL_DIM_LAYOUT_GEN4_3D;
+  break;
+   default:
+  unreachable("Invalid texture target");
+   }
+
+   switch (mt->msaa_layout) {
+   case INTEL_MSAA_LAYOUT_NONE:
+  surf->msaa_layout = ISL_MSAA_LAYOUT_NONE;
+  break;
+   case INTEL_MSAA_LAYOUT_IMS:
+  surf->msaa_layout = ISL_MSAA_LAYOUT_INTERLEAVED;
+  break;
+   case INTEL_MSAA_LAYOUT_UMS:
+   case INTEL_MSAA_LAYOUT_CMS:
+  surf->msaa_layout = ISL_MSAA_LAYOUT_ARRAY;
+  break;
+   default:
+  unreachable("Invalid MSAA layout");
+   }
+
+   if (mt->format == MESA_FORMAT_S_UINT8) {
+  surf->tiling = ISL_TILING_W;
+   } else {
+  switch (mt->tiling) {
+  case I915_TILING_NONE:
+ surf->tiling = ISL_TILING_LINEAR;
+ break;
+  case I915_TILING_X:
+ surf->tiling = ISL_TILING_X;
+ break;
+  case I915_TILING_Y:
+ switch (mt->tr_mode) {
+ case INTEL_MIPTREE_TRMODE_NONE:
+surf->tiling = ISL_TILING_Y0;
+break;
+ case INTEL_MIPTREE_TRMODE_YF:
+surf->tiling = ISL_TILING_Yf;
+break;
+ case INTEL_MIPTREE_TRMODE_YS:
+surf->tiling = ISL_TILING_Ys;
+break;
+ }
+ break;
+  default:
+ unreachable("Invalid tiling mode");
+  }
+   }
+
+   surf->format = translate_tex_format(brw, mt->format, false);
+
+   if (brw->gen >= 9) {
+  /* On gen9+, intel_mipmap_tree stores the horizontal and vertical
+   * alignment in terms of surface elements like we want.
+   */
+  surf->image_alignment_el = isl_extent3d(mt->halign, mt->valign, 1);
+   } else if (brw->gen == 6 && mt->valign > 4) {
+  /* Sandy Bridge hardware doesn't support multiple mip levels on stencil
+   * or HiZ buffers.  The miptree calculation code leaves us with a valign
+   * that we can't actually use.  Just pick something that won't assert
+   * inside ISL when we try to emit surface state.
+   */
+  assert(mt->array_layout == ALL_SLICES_AT_EACH_LOD);
+  assert(_mesa_is_depth_or_stencil_format(
+ _mesa_get_format_base_format(mt->format)));
+  surf->image_alignment_el = isl_extent3d(8, 2, 1);
+   } else {
+  /* On earlier gens it's stored in pixels. */
+  unsigned bw, bh;
+  _mesa_get_format_block_size(mt->format, &bw, &bh);
+  surf->image_alignment_el =
+ isl_extent3d(mt->halign / bw, mt->valign / bh, 1);
+   }
+
+   surf->logical_level0_px.width = mt->logical_width0;
+   surf->logical_level0_px.height = mt->logical_height0;
+   if (surf->dim == ISL_SURF_DIM_3D) {
+  surf->logical_level0_px.depth = mt->logical_depth0;
+  surf->logical_level0_px.array_len = 1;
+   } else if (mt->target == GL_TEXTURE_CUBE_MAP ||
+  mt->target == GL_TEXTURE_CUBE_MAP_ARRAY) {
+  /* For cube maps, mt->logical_depth0 is in number of cubes */
+  surf->logical_level0_px.depth = 1;
+  surf->logical_level0_px.array_len = mt->logical_depth0 * 6;
+   } else {
+  surf->logical_level0_px.depth = 1;
+  surf->logical_level0_px.array_len = mt->logical_depth0;
+   }
+
+   surf->phys_level0_sa.width = mt->physical_width0;
+   surf->phys_level0_sa.height = mt->physical_height0;
+   if (surf->dim == 

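The logical depth and array length mapping in the helper above can be summarized in a few lines: 3D surfaces keep their depth and have one layer, cube maps turn a count of cubes into six faces each, and everything else treats logical_depth0 as the layer count. Here is a rough sketch of that rule in isolation. The struct level0_extent type, the function name and the two boolean parameters are hypothetical simplifications of the surf->dim and mt->target checks in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pair standing in for the depth and array_len members of
 * surf->logical_level0_px.
 */
struct level0_extent {
   uint32_t depth;
   uint32_t array_len;
};

static struct level0_extent
level0_depth_and_layers(bool is_3d, bool is_cube, uint32_t logical_depth0)
{
   assert(!(is_3d && is_cube));

   if (is_3d) {
      /* 3D: real depth, never an array. */
      return (struct level0_extent) { .depth = logical_depth0, .array_len = 1 };
   }

   if (is_cube) {
      /* Cube maps: logical_depth0 counts cubes, each with six faces. */
      return (struct level0_extent) { .depth = 1,
                                      .array_len = logical_depth0 * 6 };
   }

   /* 1D and 2D (array) surfaces: logical_depth0 is the layer count. */
   return (struct level0_extent) { .depth = 1, .array_len = logical_depth0 };
}
```

A cube map array of two cubes therefore becomes depth 1 with 12 array layers.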
[Mesa-dev] [PATCH 60/64] isl/formats: Mark RAW as having a block size of 1 byte

2016-06-11 Thread Jason Ekstrand
---
 src/intel/isl/isl_format_layout.csv | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/isl/isl_format_layout.csv b/src/intel/isl/isl_format_layout.csv
index a39093e..7d3c3de 100644
--- a/src/intel/isl/isl_format_layout.csv
+++ b/src/intel/isl/isl_format_layout.csv
@@ -285,7 +285,7 @@ ETC2_EAC_RGBA8  , 128,  4,  4,  1,  un8,  un8,  un8,  un8, ,
 ETC2_EAC_SRGB8_A8   , 128,  4,  4,  1,  un8,  un8,  un8,  un8, ,     ,,   srgb,  etc2
 R8G8B8_UINT ,  24,  1,  1,  1,  ui8,  ui8,  ui8, , ,     ,, linear,
 R8G8B8_SINT ,  24,  1,  1,  1,  si8,  si8,  si8, , ,     ,, linear,
-RAW ,   0,  0,  0,  0, , , , , ,     ,,   ,
+RAW ,   8,  0,  0,  0, , , , , ,     ,,   ,
 ASTC_LDR_2D_4X4_U8SRGB  , 128,  4,  4,  1,  un8,  un8,  un8,  un8, ,     ,,   srgb,  astc
 ASTC_LDR_2D_5X4_U8SRGB  , 128,  5,  4,  1,  un8,  un8,  un8,  un8, ,     ,,   srgb,  astc
 ASTC_LDR_2D_5X5_U8SRGB  , 128,  5,  5,  1,  un8,  un8,  un8,  un8, ,     ,,   srgb,  astc
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 15/64] isl/state: Return an extent3d from the halign/valign helper

2016-06-11 Thread Jason Ekstrand
---
 src/intel/isl/isl_surface_state.c | 28 
 1 file changed, 8 insertions(+), 20 deletions(-)

diff --git a/src/intel/isl/isl_surface_state.c b/src/intel/isl/isl_surface_state.c
index 50570aa..1e94e60 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -110,9 +110,8 @@ get_surftype(enum isl_surf_dim dim, isl_surf_usage_flags_t usage)
  * Get the values to pack into RENDER_SUFFACE_STATE.SurfaceHorizontalAlignment
  * and SurfaceVerticalAlignment.
  */
-static void
-get_halign_valign(const struct isl_surf *surf,
-  uint32_t *halign, uint32_t *valign)
+static struct isl_extent3d
+get_image_alignment(const struct isl_surf *surf)
 {
if (GEN_GEN >= 9) {
   if (isl_tiling_is_std_y(surf->tiling) ||
@@ -121,8 +120,7 @@ get_halign_valign(const struct isl_surf *surf,
   * true alignment is likely outside the enum range of HALIGN* and
   * VALIGN*.
   */
- *halign = 0;
- *valign = 0;
+ return isl_extent3d(0, 0, 0);
   } else {
  /* In Skylake, RENDER_SUFFACE_STATE.SurfaceVerticalAlignment is in units
   * of surface elements (not pixels nor samples). For compressed formats,
@@ -131,11 +129,7 @@ get_halign_valign(const struct isl_surf *surf,
   * format (ETC2 has a block height of 4), then the vertical alignment is
   * 4 compression blocks or, equivalently, 16 pixels.
   */
- struct isl_extent3d image_align_el
-= isl_surf_get_image_alignment_el(surf);
-
- *halign = isl_to_gen_halign[image_align_el.width];
- *valign = isl_to_gen_valign[image_align_el.height];
+ return isl_surf_get_image_alignment_el(surf);
   }
} else {
   /* Pre-Skylake, RENDER_SUFFACE_STATE.SurfaceVerticalAlignment is in
@@ -144,11 +138,7 @@ get_halign_valign(const struct isl_surf *surf,
* format (compressed or not) the vertical alignment is
* 4 pixels.
*/
-  struct isl_extent3d image_align_sa
- = isl_surf_get_image_alignment_sa(surf);
-
-  *halign = isl_to_gen_halign[image_align_sa.width];
-  *valign = isl_to_gen_valign[image_align_sa.height];
+  return isl_surf_get_image_alignment_sa(surf);
}
 }
 
@@ -199,9 +189,6 @@ void
 isl_genX(surf_fill_state_s)(const struct isl_device *dev, void *state,
 const struct isl_surf_fill_state_info *restrict info)
 {
-   uint32_t halign, valign;
-   get_halign_valign(info->surf, &halign, &valign);
-
struct GENX(RENDER_SURFACE_STATE) s = { 0 };
 
s.SurfaceType = get_surftype(info->surf->dim, info->view->usage);
@@ -288,8 +275,9 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, void *state,
   s.MIPCountLOD = MAX(info->view->levels, 1) - 1;
}
 
-   s.SurfaceVerticalAlignment = valign;
-   s.SurfaceHorizontalAlignment = halign;
+   const struct isl_extent3d image_align = get_image_alignment(info->surf);
+   s.SurfaceVerticalAlignment = isl_to_gen_valign[image_align.height];
+   s.SurfaceHorizontalAlignment = isl_to_gen_halign[image_align.width];
 
if (info->surf->tiling == ISL_TILING_W) {
   /* From the Broadwell PRM documentation for this field:
-- 
2.5.0.400.gff86faf


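The shape of this refactor is easy to show in isolation: the helper returns the alignment as an extent by value, and the caller translates it to the hardware encoding at the point of use. The following is a sketch only. struct extent3d, image_alignment and encode_alignment are hypothetical stand-ins for isl_extent3d, get_image_alignment and the isl_to_gen_halign/isl_to_gen_valign tables, and the encoding values in the table are assumed for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for struct isl_extent3d. */
struct extent3d {
   uint32_t width, height, depth;
};

/* Return the alignment by value instead of through two out-pointers.
 * For standard-Y tilings the patch returns an all-zero extent because the
 * true alignment is likely outside the HALIGN and VALIGN enum range.
 */
static struct extent3d
image_alignment(bool is_std_y_tiling, struct extent3d align)
{
   if (is_std_y_tiling)
      return (struct extent3d) { 0, 0, 0 };

   return align;
}

/* Hypothetical alignment-to-enum table, indexed by alignment value. */
static const uint8_t alignment_encoding[17] = {
   [0] = 0, [4] = 1, [8] = 2, [16] = 3,
};

static uint8_t
encode_alignment(uint32_t align)
{
   assert(align <= 16);
   return alignment_encoding[align];
}
```

The caller then reads the two members it needs, e.g. encode_alignment(a.width) and encode_alignment(a.height), which keeps the table lookup next to the field being filled.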

[Mesa-dev] [PATCH 25/64] isl/state: Use TILEWALK_XMAJOR for linear surfaces on gen7

2016-06-11 Thread Jason Ekstrand
This better matches what happens on gen8, where the "Tiled Surface" and
"Tile Walk" bits are combined into a single two-bit value.  It is also
more consistent with what the GL driver does.
---
 src/intel/isl/isl_surface_state.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/intel/isl/isl_surface_state.c b/src/intel/isl/isl_surface_state.c
index fe0402f..e1159b2 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -313,8 +313,8 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, void *state,
s.TileMode = isl_to_gen_tiling[info->surf->tiling];
 #else
s.TiledSurface = info->surf->tiling != ISL_TILING_LINEAR,
-   s.TileWalk = info->surf->tiling == ISL_TILING_X ? TILEWALK_XMAJOR :
- TILEWALK_YMAJOR;
+   s.TileWalk = info->surf->tiling == ISL_TILING_Y0 ? TILEWALK_YMAJOR :
+  TILEWALK_XMAJOR,
 #endif
 
 #if GEN_GEN >= 8
-- 
2.5.0.400.gff86faf


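One way to see why the new default lines up with gen8 is to view the gen7 "Tiled Surface" and "Tile Walk" bits as a two-bit pair and compare the result before and after the change. The sketch below does that. It assumes that Tiled Surface is the high bit of the pair, that XMAJOR encodes as 0 and YMAJOR as 1, and that the gen8 tile mode values are 0 for linear, 2 for X and 3 for Y; the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the ISL tiling enum and the gen7 tile walk values. */
enum tiling { TILING_LINEAR, TILING_X, TILING_Y0 };
enum { WALK_XMAJOR = 0, WALK_YMAJOR = 1 };

/* Combine the gen7 tiled bit and tile walk bit into one two-bit value.
 * "patched" selects the tile walk rule from after this change.
 */
static unsigned
gen7_tile_bits(enum tiling t, bool patched)
{
   assert(t == TILING_LINEAR || t == TILING_X || t == TILING_Y0);

   const unsigned tiled = t != TILING_LINEAR;
   unsigned walk;

   if (patched)
      walk = t == TILING_Y0 ? WALK_YMAJOR : WALK_XMAJOR;
   else
      walk = t == TILING_X ? WALK_XMAJOR : WALK_YMAJOR;

   return tiled << 1 | walk;
}
```

Under those assumptions a linear surface goes from 1 before the change to 0 after it, while X and Y tiling stay at 2 and 3 in both cases.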

[Mesa-dev] [PATCH 36/64] isl: Add support for filling out surface states all the way back to gen4

2016-06-11 Thread Jason Ekstrand
---
 src/intel/isl/Makefile.am | 12 
 src/intel/isl/Makefile.sources| 13 +++--
 src/intel/isl/isl.c   | 28 +++
 src/intel/isl/isl_priv.h  | 24 
 src/intel/isl/isl_surface_state.c | 58 ---
 5 files changed, 129 insertions(+), 6 deletions(-)

diff --git a/src/intel/isl/Makefile.am b/src/intel/isl/Makefile.am
index 74f863a..ae367a9 100644
--- a/src/intel/isl/Makefile.am
+++ b/src/intel/isl/Makefile.am
@@ -22,6 +22,9 @@
 include Makefile.sources
 
 ISL_GEN_LIBS =   \
+   libisl-gen4.la   \
+   libisl-gen5.la   \
+   libisl-gen6.la   \
libisl-gen7.la   \
libisl-gen75.la  \
libisl-gen8.la   \
@@ -52,6 +55,15 @@ libisl_la_LIBADD = $(ISL_GEN_LIBS)
 
 libisl_la_SOURCES = $(ISL_FILES) $(ISL_GENERATED_FILES)
 
+libisl_gen4_la_SOURCES = $(ISL_GEN4_FILES)
+libisl_gen4_la_CFLAGS = $(libisl_la_CFLAGS) -DGEN_VERSIONx10=40
+
+libisl_gen5_la_SOURCES = $(ISL_GEN5_FILES)
+libisl_gen5_la_CFLAGS = $(libisl_la_CFLAGS) -DGEN_VERSIONx10=50
+
+libisl_gen6_la_SOURCES = $(ISL_GEN6_FILES)
+libisl_gen6_la_CFLAGS = $(libisl_la_CFLAGS) -DGEN_VERSIONx10=60
+
 libisl_gen7_la_SOURCES = $(ISL_GEN7_FILES)
 libisl_gen7_la_CFLAGS = $(libisl_la_CFLAGS) -DGEN_VERSIONx10=70
 
diff --git a/src/intel/isl/Makefile.sources b/src/intel/isl/Makefile.sources
index 89b1418..aa20ed4 100644
--- a/src/intel/isl/Makefile.sources
+++ b/src/intel/isl/Makefile.sources
@@ -2,12 +2,21 @@ ISL_FILES = \
isl.c \
isl.h \
isl_format.c \
+   isl_priv.h \
+   isl_storage_image.c
+
+ISL_GEN4_FILES = \
isl_gen4.c \
isl_gen4.h \
+   isl_surface_state.c
+
+ISL_GEN5_FILES = \
+   isl_surface_state.c
+
+ISL_GEN6_FILES = \
isl_gen6.c \
isl_gen6.h \
-   isl_priv.h \
-   isl_storage_image.c
+   isl_surface_state.c
 
 ISL_GEN7_FILES = \
isl_gen7.c \
diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index 77b570d..7343a55 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -1191,6 +1191,20 @@ isl_surf_fill_state_s(const struct isl_device *dev, void *state,
}
 
switch (ISL_DEV_GEN(dev)) {
+   case 4:
+  if (ISL_DEV_IS_G4X(dev)) {
+ isl_gen4_surf_fill_state_s(dev, state, info);
+  } else {
+ /* G45 surface state is the same as gen5 */
+ isl_gen5_surf_fill_state_s(dev, state, info);
+  }
+  break;
+   case 5:
+  isl_gen5_surf_fill_state_s(dev, state, info);
+  break;
+   case 6:
+  isl_gen6_surf_fill_state_s(dev, state, info);
+  break;
case 7:
   if (ISL_DEV_IS_HASWELL(dev)) {
  isl_gen75_surf_fill_state_s(dev, state, info);
@@ -1214,6 +1228,20 @@ isl_buffer_fill_state_s(const struct isl_device *dev, void *state,
 const struct isl_buffer_fill_state_info *restrict info)
 {
switch (ISL_DEV_GEN(dev)) {
+   case 4:
+  if (ISL_DEV_IS_G4X(dev)) {
+ isl_gen4_buffer_fill_state_s(state, info);
+  } else {
+ /* G45 surface state is the same as gen5 */
+ isl_gen5_buffer_fill_state_s(state, info);
+  }
+  break;
+   case 5:
+  isl_gen5_buffer_fill_state_s(state, info);
+  break;
+   case 6:
+  isl_gen6_buffer_fill_state_s(state, info);
+  break;
case 7:
   if (ISL_DEV_IS_HASWELL(dev)) {
  isl_gen75_buffer_fill_state_s(state, info);
diff --git a/src/intel/isl/isl_priv.h b/src/intel/isl/isl_priv.h
index d98e707..3a7af1a 100644
--- a/src/intel/isl/isl_priv.h
+++ b/src/intel/isl/isl_priv.h
@@ -136,6 +136,18 @@ isl_extent3d_el_to_sa(enum isl_format fmt, struct isl_extent3d extent_el)
 }
 
 void
+isl_gen4_surf_fill_state_s(const struct isl_device *dev, void *state,
+   const struct isl_surf_fill_state_info *restrict info);
+
+void
+isl_gen5_surf_fill_state_s(const struct isl_device *dev, void *state,
+   const struct isl_surf_fill_state_info *restrict info);
+
+void
+isl_gen6_surf_fill_state_s(const struct isl_device *dev, void *state,
+   const struct isl_surf_fill_state_info *restrict info);
+
+void
 isl_gen7_surf_fill_state_s(const struct isl_device *dev, void *state,
const struct isl_surf_fill_state_info *restrict info);
 
@@ -150,6 +162,18 @@ isl_gen9_surf_fill_state_s(const struct isl_device *dev, void *state,
const struct isl_surf_fill_state_info *restrict info);
 void
+isl_gen4_buffer_fill_state_s(void *state,
+ const struct isl_buffer_fill_state_info *restrict info);
+
+void
+isl_gen5_buffer_fill_state_s(void *state,
+ const struct 

[Mesa-dev] [PATCH 52/64] i965/surface_state: Rename brw_update to gen4_update

2016-06-11 Thread Jason Ekstrand
We're about to add generic versions which work across gens and those should
have the brw name.
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 9270372..190b42a 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -369,11 +369,11 @@ brw_update_buffer_texture_surface(struct gl_context *ctx,
 }
 
 static void
-brw_update_texture_surface(struct gl_context *ctx,
-   unsigned unit,
-   uint32_t *surf_offset,
-   bool for_gather,
-   uint32_t plane)
+gen4_update_texture_surface(struct gl_context *ctx,
+unsigned unit,
+uint32_t *surf_offset,
+bool for_gather,
+uint32_t plane)
 {
struct brw_context *brw = brw_context(ctx);
struct gl_texture_object *tObj = ctx->Texture.Unit[unit]._Current;
@@ -717,10 +717,10 @@ brw_emit_null_surface_state(struct brw_context *brw,
  * usable for further buffers when doing ARB_draw_buffer support.
  */
 static uint32_t
-brw_update_renderbuffer_surface(struct brw_context *brw,
-struct gl_renderbuffer *rb,
-bool layered, unsigned unit,
-uint32_t surf_index)
+gen4_update_renderbuffer_surface(struct brw_context *brw,
+ struct gl_renderbuffer *rb,
+ bool layered, unsigned unit,
+ uint32_t surf_index)
 {
   struct gl_context *ctx = &brw->ctx;
struct intel_renderbuffer *irb = intel_renderbuffer(rb);
@@ -1496,8 +1496,8 @@ const struct brw_tracked_state brw_wm_image_surfaces = {
 void
 gen4_init_vtable_surface_functions(struct brw_context *brw)
 {
-   brw->vtbl.update_texture_surface = brw_update_texture_surface;
-   brw->vtbl.update_renderbuffer_surface = brw_update_renderbuffer_surface;
+   brw->vtbl.update_texture_surface = gen4_update_texture_surface;
+   brw->vtbl.update_renderbuffer_surface = gen4_update_renderbuffer_surface;
brw->vtbl.emit_null_surface_state = brw_emit_null_surface_state;
brw->vtbl.emit_buffer_surface_state = gen4_emit_buffer_surface_state;
 }
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 46/64] i965/blorp: Use the generic ISL path for texture surfaces on gen7

2016-06-11 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/gen7_blorp.c | 97 ++
 1 file changed, 3 insertions(+), 94 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_blorp.c b/src/mesa/drivers/dri/i965/gen7_blorp.c
index 353a60f..3300adc 100644
--- a/src/mesa/drivers/dri/i965/gen7_blorp.c
+++ b/src/mesa/drivers/dri/i965/gen7_blorp.c
@@ -155,97 +155,6 @@ gen7_blorp_emit_depth_stencil_state_pointers(struct brw_context *brw,
 }
 
 
-/* SURFACE_STATE for renderbuffer or texture surface (see
- * brw_update_renderbuffer_surface and brw_update_texture_surface)
- */
-static uint32_t
-gen7_blorp_emit_surface_state(struct brw_context *brw,
-  const struct brw_blorp_surface_info *surface,
-  uint32_t read_domains, uint32_t write_domain,
-  bool is_render_target)
-{
-   uint32_t wm_surf_offset;
-   uint32_t width = surface->width;
-   uint32_t height = surface->height;
-   /* Note: since gen7 uses INTEL_MSAA_LAYOUT_CMS or INTEL_MSAA_LAYOUT_UMS for
-* color surfaces, width and height are measured in pixels; we don't need
-* to divide them by 2 as we do for Gen6 (see
-* gen6_blorp_emit_surface_state).
-*/
-   struct intel_mipmap_tree *mt = surface->mt;
-   uint32_t tile_x, tile_y;
-   const uint8_t mocs = GEN7_MOCS_L3;
-
-   uint32_t tiling = surface->map_stencil_as_y_tiled
-  ? I915_TILING_Y : mt->tiling;
-
-   uint32_t *surf = (uint32_t *)
-  brw_state_batch(brw, AUB_TRACE_SURFACE_STATE, 8 * 4, 32, 
&wm_surf_offset);
-   memset(surf, 0, 8 * 4);
-
-   surf[0] = BRW_SURFACE_2D << BRW_SURFACE_TYPE_SHIFT |
- surface->brw_surfaceformat << BRW_SURFACE_FORMAT_SHIFT |
- gen7_surface_tiling_mode(tiling);
-
-   if (surface->mt->valign == 4)
-  surf[0] |= GEN7_SURFACE_VALIGN_4;
-   if (surface->mt->halign == 8)
-  surf[0] |= GEN7_SURFACE_HALIGN_8;
-
-   if (surface->array_layout == ALL_SLICES_AT_EACH_LOD)
-  surf[0] |= GEN7_SURFACE_ARYSPC_LOD0;
-   else
-  surf[0] |= GEN7_SURFACE_ARYSPC_FULL;
-
-   /* reloc */
-   surf[1] = brw_blorp_compute_tile_offsets(surface, &tile_x, &tile_y) +
- mt->bo->offset64;
-
-   /* Note that the low bits of these fields are missing, so
-* there's the possibility of getting in trouble.
-*/
-   assert(tile_x % 4 == 0);
-   assert(tile_y % 2 == 0);
-   surf[5] = SET_FIELD(tile_x / 4, BRW_SURFACE_X_OFFSET) |
- SET_FIELD(tile_y / 2, BRW_SURFACE_Y_OFFSET) |
- SET_FIELD(mocs, GEN7_SURFACE_MOCS);
-
-   surf[2] = SET_FIELD(width - 1, GEN7_SURFACE_WIDTH) |
- SET_FIELD(height - 1, GEN7_SURFACE_HEIGHT);
-
-   uint32_t pitch_bytes = mt->pitch;
-   if (surface->map_stencil_as_y_tiled)
-  pitch_bytes *= 2;
-   surf[3] = pitch_bytes - 1;
-
-   surf[4] = gen7_surface_msaa_bits(surface->num_samples, 
surface->msaa_layout);
-   if (surface->mt->mcs_mt) {
-  gen7_set_surface_mcs_info(brw, surf, wm_surf_offset, surface->mt->mcs_mt,
-is_render_target);
-   }
-
-   surf[7] = surface->mt->fast_clear_color_value;
-
-   if (brw->is_haswell) {
-  surf[7] |= (SET_FIELD(HSW_SCS_RED,   GEN7_SURFACE_SCS_R) |
-  SET_FIELD(HSW_SCS_GREEN, GEN7_SURFACE_SCS_G) |
-  SET_FIELD(HSW_SCS_BLUE,  GEN7_SURFACE_SCS_B) |
-  SET_FIELD(HSW_SCS_ALPHA, GEN7_SURFACE_SCS_A));
-   }
-
-   /* Emit relocation to surface contents */
-   drm_intel_bo_emit_reloc(brw->batch.bo,
-   wm_surf_offset + 4,
-   mt->bo,
-   surf[1] - mt->bo->offset64,
-   read_domains, write_domain);
-
-   gen7_check_surface_setup(surf, is_render_target);
-
-   return wm_surf_offset;
-}
-
-
 /* 3DSTATE_VS
  *
  * Disable vertex shader.
@@ -847,9 +756,9 @@ gen7_blorp_exec(struct brw_context *brw,
   true /* is_render_target */);
   if (params->src.mt) {
  wm_surf_offset_texture =
-gen7_blorp_emit_surface_state(brw, &params->src,
-  I915_GEM_DOMAIN_SAMPLER, 0,
-  false /* is_render_target */);
+brw_blorp_emit_surface_state(brw, &params->src,
+ I915_GEM_DOMAIN_SAMPLER, 0,
+ false /* is_render_target */);
   }
   wm_bind_bo_offset =
  gen6_blorp_emit_binding_table(brw,
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 63/64] i965: Get rid of gen6_surface_state.c

2016-06-11 Thread Jason Ekstrand
The only useful thing left was gen6_init_vtable_surface_functions which we
can easily put in brw_wm_surface_state.c.
---
 src/mesa/drivers/dri/i965/Makefile.sources   |  1 -
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  7 
 src/mesa/drivers/dri/i965/gen6_surface_state.c   | 48 
 3 files changed, 7 insertions(+), 49 deletions(-)
 delete mode 100644 src/mesa/drivers/dri/i965/gen6_surface_state.c

diff --git a/src/mesa/drivers/dri/i965/Makefile.sources 
b/src/mesa/drivers/dri/i965/Makefile.sources
index f448551..433ce47 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -187,7 +187,6 @@ i965_FILES = \
gen6_scissor_state.c \
gen6_sf_state.c \
gen6_sol.c \
-   gen6_surface_state.c \
gen6_urb.c \
gen6_viewport_state.c \
gen6_vs_state.c \
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index fccd26b..c80d4f2 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -1594,6 +1594,13 @@ gen4_init_vtable_surface_functions(struct brw_context 
*brw)
brw->vtbl.emit_null_surface_state = brw_emit_null_surface_state;
 }
 
+void
+gen6_init_vtable_surface_functions(struct brw_context *brw)
+{
+   gen4_init_vtable_surface_functions(brw);
+   brw->vtbl.update_renderbuffer_surface = brw_update_renderbuffer_surface;
+}
+
 static void
 brw_upload_cs_work_groups_surface(struct brw_context *brw)
 {
diff --git a/src/mesa/drivers/dri/i965/gen6_surface_state.c 
b/src/mesa/drivers/dri/i965/gen6_surface_state.c
deleted file mode 100644
index 84b8ef4..000
--- a/src/mesa/drivers/dri/i965/gen6_surface_state.c
+++ /dev/null
@@ -1,48 +0,0 @@
-/*
- * Copyright (c) 2014 Intel Corporation
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice (including the next
- * paragraph) shall be included in all copies or substantial portions of the
- * Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
- * IN THE SOFTWARE.
- */
-
-
-#include "main/context.h"
-#include "main/blend.h"
-#include "main/mtypes.h"
-#include "main/samplerobj.h"
-#include "main/texformat.h"
-#include "program/prog_parameter.h"
-
-#include "intel_mipmap_tree.h"
-#include "intel_batchbuffer.h"
-#include "intel_tex.h"
-#include "intel_fbo.h"
-#include "intel_buffer_objects.h"
-
-#include "brw_context.h"
-#include "brw_state.h"
-#include "brw_defines.h"
-#include "brw_wm.h"
-
-void
-gen6_init_vtable_surface_functions(struct brw_context *brw)
-{
-   gen4_init_vtable_surface_functions(brw);
-   brw->vtbl.update_renderbuffer_surface = brw_update_renderbuffer_surface;
-}
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 38/64] isl/state: Add support for handling color control surfaces

2016-06-11 Thread Jason Ekstrand
---
 src/intel/isl/isl.h   |  6 ++
 src/intel/isl/isl_surface_state.c | 42 ---
 2 files changed, 45 insertions(+), 3 deletions(-)

diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
index 36038bc..a987482 100644
--- a/src/intel/isl/isl.h
+++ b/src/intel/isl/isl.h
@@ -833,6 +833,12 @@ struct isl_surf_fill_state_info {
uint32_t mocs;
 
/**
+* The auxiliary surface or NULL if no auxiliary surface is to be used.
+*/
+   const struct isl_surf *aux_surf;
+   uint64_t aux_address;
+
+   /**
 * The clear color for this surface
 *
 * Valid values depend on hardware generation.
diff --git a/src/intel/isl/isl_surface_state.c 
b/src/intel/isl/isl_surface_state.c
index 53ff56f..9bfc55f 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -185,6 +185,28 @@ get_qpitch(const struct isl_surf *surf)
 }
 #endif /* GEN_GEN >= 8 */
 
+#if GEN_GEN >= 8
+static uint32_t
+get_aux_mode_for_format(const struct isl_device *dev,
+enum isl_format view_format,
+enum isl_format aux_format)
+{
+   /* TODO: HiZ */
+#if GEN_GEN >= 9
+   if (aux_format == ISL_FORMAT_NOMSRT_CCS_E_32BPP ||
+   aux_format == ISL_FORMAT_NOMSRT_CCS_E_64BPP ||
+   aux_format == ISL_FORMAT_NOMSRT_CCS_E_128BPP) {
+  assert(isl_format_supports_lossless_compression(dev->info, view_format));
+  return AUX_CCS_E;
+   } else {
+  return AUX_CCS_D;
+   }
+#else
+   return AUX_MCS;
+#endif
+}
+#endif /* GEN_GEN >= 8 */
+
 void
 isl_genX(surf_fill_state_s)(const struct isl_device *dev, void *state,
 const struct isl_surf_fill_state_info *restrict 
info)
@@ -379,10 +401,24 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
void *state,
s.MOCS = info->mocs;
 #endif
 
+#if GEN_GEN >= 7
+   if (info->aux_surf) {
+  struct isl_tile_info tile_info;
+  isl_surf_get_tile_info(dev, info->aux_surf, &tile_info);
+  uint32_t pitch_in_tiles = info->aux_surf->row_pitch / tile_info.width;
+
 #if GEN_GEN >= 8
-   s.AuxiliarySurfaceMode = AUX_NONE;
-#elif GEN_GEN >= 7
-   s.MCSEnable = false;
+  s.AuxiliarySurfacePitch = pitch_in_tiles - 1;
+  s.AuxiliarySurfaceQPitch = get_qpitch(info->aux_surf) >> 2;
+  s.AuxiliarySurfaceBaseAddress = info->aux_address;
+  s.AuxiliarySurfaceMode = get_aux_mode_for_format(dev, info->view->format,
+   info->aux_surf->format);
+#else
+  s.MCSBaseAddress = info->aux_address,
+  s.MCSSurfacePitch = pitch_in_tiles - 1;
+  s.MCSEnable = true;
+#endif
+   }
 #endif
 
 #if GEN_GEN >= 8
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 43/64] i965/blorp: Add a generic ISL-based surface state emit path

2016-06-11 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 146 ++
 src/mesa/drivers/dri/i965/brw_blorp.h |   6 ++
 2 files changed, 152 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index 9590968..1089a49 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -240,6 +240,152 @@ brw_blorp_compile_nir_shader(struct brw_context *brw, 
struct nir_shader *nir,
return program;
 }
 
+static enum isl_msaa_layout
+get_isl_msaa_layout(enum intel_msaa_layout layout)
+{
+   switch (layout) {
+   case INTEL_MSAA_LAYOUT_NONE:
+  return ISL_MSAA_LAYOUT_NONE;
+   case INTEL_MSAA_LAYOUT_IMS:
+  return ISL_MSAA_LAYOUT_INTERLEAVED;
+   case INTEL_MSAA_LAYOUT_UMS:
+   case INTEL_MSAA_LAYOUT_CMS:
+  return ISL_MSAA_LAYOUT_ARRAY;
+   default:
+  unreachable("Invalid MSAA layout");
+   }
+}
+
+struct surface_state_info {
+   unsigned num_dwords;
+   unsigned ss_align; /* Required alignment of RENDER_SURFACE_STATE in bytes */
+   unsigned reloc_dw;
+   unsigned aux_reloc_dw;
+   unsigned tex_mocs;
+   unsigned rb_mocs;
+};
+
+static const struct surface_state_info surface_state_infos[] = {
+   [6] = {6,  32, 1,  0},
+   [7] = {8,  32, 1,  6,  GEN7_MOCS_L3, GEN7_MOCS_L3},
+   [8] = {13, 64, 8,  10, BDW_MOCS_WB,  BDW_MOCS_PTE},
+   [9] = {16, 64, 8,  10, SKL_MOCS_WB,  SKL_MOCS_PTE},
+};
+
+uint32_t
+brw_blorp_emit_surface_state(struct brw_context *brw,
+ const struct brw_blorp_surface_info *surface,
+ uint32_t read_domains, uint32_t write_domain,
+ bool is_render_target)
+{
+   /* TODO: This should go in the context */
+   struct isl_device isl_dev;
+   isl_device_init(&isl_dev, brw->intelScreen->devinfo, brw->has_swizzling);
+
+   const struct surface_state_info ss_info = surface_state_infos[brw->gen];
+
+   struct isl_surf surf;
+   intel_miptree_get_isl_surf(brw, surface->mt, &surf);
+
+   /* Stomp surface dimensions and tiling (if needed) with info from blorp */
+   surf.dim = ISL_SURF_DIM_2D;
+   surf.dim_layout = ISL_DIM_LAYOUT_GEN4_2D;
+   surf.msaa_layout = get_isl_msaa_layout(surface->msaa_layout);
+   surf.logical_level0_px.width = surface->width;
+   surf.logical_level0_px.height = surface->height;
+   surf.levels = 1;
+   surf.samples = MAX2(surface->num_samples, 1);
+
+   if (brw->gen == 6 && surface->num_samples > 1) {
+  /* Since gen6 uses INTEL_MSAA_LAYOUT_IMS, width and height are measured
+   * in samples.  But SURFACE_STATE wants them in pixels, so we need to
+   * divide them each by 2.
+   */
+  surf.logical_level0_px.width /= 2;
+  surf.logical_level0_px.height /= 2;
+   }
+
+   if (surface->map_stencil_as_y_tiled) {
+  /* We need to fake W-tiling with Y-tiling */
+  surf.tiling = ISL_TILING_Y0;
+  surf.row_pitch *= 2;
+   }
+
+   union isl_color_value clear_color = { .u32 = { 0, 0, 0, 0 } };
+
+   struct isl_surf *aux_surf = NULL, aux_surf_s;
+   uint64_t aux_offset;
+   if (surface->mt->mcs_mt) {
+  /* We should probably do similar stomping to above but most of the aux
+   * surf gets ignored when we fill out the surface state anyway so
+   * there's no point.
+   */
+  intel_miptree_get_ccs_isl_surf(brw, surface->mt, &aux_surf_s);
+  aux_surf = &aux_surf_s;
+  assert(surface->mt->mcs_mt->offset == 0);
+  aux_offset = surface->mt->mcs_mt->bo->offset64;
+
+  /* We only really need a clear color if we also have an auxiliary
+   * surface.  Without one, it does nothing.
+   */
+  clear_color = intel_miptree_get_isl_clear_color(brw, surface->mt);
+   }
+
+   struct isl_view view = {
+  .format = surface->brw_surfaceformat,
+  .base_level = 0,
+  .levels = 1,
+  .base_array_layer = 0,
+  .array_len = 1,
+  .channel_select = {
+ ISL_CHANNEL_SELECT_RED,
+ ISL_CHANNEL_SELECT_GREEN,
+ ISL_CHANNEL_SELECT_BLUE,
+ ISL_CHANNEL_SELECT_ALPHA,
+  },
+  .usage = is_render_target ? ISL_SURF_USAGE_RENDER_TARGET_BIT :
+  ISL_SURF_USAGE_TEXTURE_BIT,
+   };
+
+   uint32_t offset, tile_x, tile_y;
+   offset = brw_blorp_compute_tile_offsets(surface, &tile_x, &tile_y);
+
+   uint32_t surf_offset;
+   uint32_t *dw = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
+  ss_info.num_dwords * 4, ss_info.ss_align,
+  &surf_offset);
+
+   const uint32_t mocs = is_render_target ? ss_info.rb_mocs : ss_info.tex_mocs;
+
+   isl_surf_fill_state(&isl_dev, dw, .surf = &surf, .view = &view,
+   .address = surface->mt->bo->offset64 + offset,
+   .aux_surf = aux_surf, .aux_address = aux_offset,
+   .mocs = mocs, .clear_color = clear_color,
+   .x_offset = tile_x, .y_offset = tile_y);
+
+   /* Emit relocation to surface contents */
+   

[Mesa-dev] [PATCH 45/64] i965/blorp: Use the generic ISL path for renderbuffer surfaces on gen7

2016-06-11 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/gen7_blorp.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_blorp.c 
b/src/mesa/drivers/dri/i965/gen7_blorp.c
index 270fe57..353a60f 100644
--- a/src/mesa/drivers/dri/i965/gen7_blorp.c
+++ b/src/mesa/drivers/dri/i965/gen7_blorp.c
@@ -841,10 +841,10 @@ gen7_blorp_exec(struct brw_context *brw,
   wm_push_const_offset = gen6_blorp_emit_wm_constants(brw, params);
   intel_miptree_used_for_rendering(params->dst.mt);
   wm_surf_offset_renderbuffer =
- gen7_blorp_emit_surface_state(brw, &params->dst,
-   I915_GEM_DOMAIN_RENDER,
-   I915_GEM_DOMAIN_RENDER,
-   true /* is_render_target */);
+ brw_blorp_emit_surface_state(brw, &params->dst,
+  I915_GEM_DOMAIN_RENDER,
+  I915_GEM_DOMAIN_RENDER,
+  true /* is_render_target */);
   if (params->src.mt) {
  wm_surf_offset_texture =
 gen7_blorp_emit_surface_state(brw, &params->src,
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 57/64] i965/gen7: Use the generic ISL-based path for renderbuffer surfaces

2016-06-11 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_state.h |   7 -
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 194 +-
 2 files changed, 1 insertion(+), 200 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
b/src/mesa/drivers/dri/i965/brw_state.h
index fec224e..604467b 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -298,13 +298,6 @@ void brw_update_renderbuffer_surfaces(struct brw_context 
*brw,
   uint32_t *surf_offset);
 
 /* gen7_wm_surface_state.c */
-uint32_t gen7_surface_tiling_mode(uint32_t tiling);
-uint32_t gen7_surface_msaa_bits(unsigned num_samples, enum intel_msaa_layout 
l);
-void gen7_set_surface_mcs_info(struct brw_context *brw,
-   uint32_t *surf,
-   uint32_t surf_offset,
-   const struct intel_mipmap_tree *mcs_mt,
-   bool is_render_target);
 void gen7_check_surface_setup(uint32_t *surf, bool is_render_target);
 void gen7_init_vtable_surface_functions(struct brw_context *brw);
 
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index bdb4f66..bb94f2d 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -39,79 +39,6 @@
 #include "brw_defines.h"
 #include "brw_wm.h"
 
-uint32_t
-gen7_surface_tiling_mode(uint32_t tiling)
-{
-   switch (tiling) {
-   case I915_TILING_X:
-  return GEN7_SURFACE_TILING_X;
-   case I915_TILING_Y:
-  return GEN7_SURFACE_TILING_Y;
-   default:
-  return GEN7_SURFACE_TILING_NONE;
-   }
-}
-
-
-uint32_t
-gen7_surface_msaa_bits(unsigned num_samples, enum intel_msaa_layout layout)
-{
-   uint32_t ss4 = 0;
-
-   assert(num_samples <= 16);
-
-   /* The SURFACE_MULTISAMPLECOUNT_X enums are simply log2(num_samples) << 3. 
*/
-   ss4 |= (ffs(MAX2(num_samples, 1)) - 1) << 3;
-
-   if (layout == INTEL_MSAA_LAYOUT_IMS)
-  ss4 |= GEN7_SURFACE_MSFMT_DEPTH_STENCIL;
-   else
-  ss4 |= GEN7_SURFACE_MSFMT_MSS;
-
-   return ss4;
-}
-
-
-void
-gen7_set_surface_mcs_info(struct brw_context *brw,
-  uint32_t *surf,
-  uint32_t surf_offset,
-  const struct intel_mipmap_tree *mcs_mt,
-  bool is_render_target)
-{
-   /* From the Ivy Bridge PRM, Vol4 Part1 p76, "MCS Base Address":
-*
-* "The MCS surface must be stored as Tile Y."
-*/
-   assert(mcs_mt->tiling == I915_TILING_Y);
-
-   /* Compute the pitch in units of tiles.  To do this we need to divide the
-* pitch in bytes by 128, since a single Y-tile is 128 bytes wide.
-*/
-   unsigned pitch_tiles = mcs_mt->pitch / 128;
-
-   /* The upper 20 bits of surface state DWORD 6 are the upper 20 bits of the
-* GPU address of the MCS buffer; the lower 12 bits contain other control
-* information.  Since buffer addresses are always on 4k boundaries (and
-* thus have their lower 12 bits zero), we can use an ordinary reloc to do
-* the necessary address translation.
-*/
-   assert ((mcs_mt->bo->offset64 & 0xfff) == 0);
-
-   surf[6] = GEN7_SURFACE_MCS_ENABLE |
- SET_FIELD(pitch_tiles - 1, GEN7_SURFACE_MCS_PITCH) |
- mcs_mt->bo->offset64;
-
-   drm_intel_bo_emit_reloc(brw->batch.bo,
-   surf_offset + 6 * 4,
-   mcs_mt->bo,
-   surf[6] & 0xfff,
-   is_render_target ? I915_GEM_DOMAIN_RENDER
-   : I915_GEM_DOMAIN_SAMPLER,
-   is_render_target ? I915_GEM_DOMAIN_RENDER : 0);
-}
-
-
 void
 gen7_check_surface_setup(uint32_t *surf, bool is_render_target)
 {
@@ -291,130 +218,11 @@ gen7_emit_null_surface_state(struct brw_context *brw,
gen7_check_surface_setup(surf, true /* is_render_target */);
 }
 
-/**
- * Sets up a surface state structure to point at the given region.
- * While it is only used for the front/back buffer currently, it should be
- * usable for further buffers when doing ARB_draw_buffer support.
- */
-static uint32_t
-gen7_update_renderbuffer_surface(struct brw_context *brw,
- struct gl_renderbuffer *rb,
- bool layered, unsigned unit /* unused */,
- uint32_t surf_index)
-{
-   struct gl_context *ctx = &brw->ctx;
-   struct intel_renderbuffer *irb = intel_renderbuffer(rb);
-   struct intel_mipmap_tree *mt = irb->mt;
-   uint32_t format;
-   /* _NEW_BUFFERS */
-   mesa_format rb_format = _mesa_get_render_format(ctx, intel_rb_format(irb));
-   uint32_t surftype;
-   bool is_array = false;
-   int depth = MAX2(irb->layer_count, 1);
-   const uint8_t mocs = GEN7_MOCS_L3;
-   uint32_t offset;
-
-   int min_array_element = irb->mt_layer / MAX2(mt->num_samples, 1);
-
-   

[Mesa-dev] [PATCH 35/64] isl: Add an ISL_DEV_IS_G4X macro

2016-06-11 Thread Jason Ekstrand
---
 src/intel/isl/isl.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
index ef86228..3bf7469 100644
--- a/src/intel/isl/isl.h
+++ b/src/intel/isl/isl.h
@@ -65,6 +65,10 @@ struct brw_image_param;
(assert(ISL_DEV_GEN(__dev) == (__dev)->info->gen))
 #endif
 
+#ifndef ISL_DEV_IS_G4X
+#define ISL_DEV_IS_G4X(__dev) ((__dev)->info->is_g4x)
+#endif
+
 #ifndef ISL_DEV_IS_HASWELL
 /**
  * @brief Get the hardware generation of isl_device.
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 47/64] i965/blorp: Use the generic ISL path for renderbuffer surfaces on gen6

2016-06-11 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/gen6_blorp.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_blorp.c 
b/src/mesa/drivers/dri/i965/gen6_blorp.c
index 5f84ab0..3af9c95 100644
--- a/src/mesa/drivers/dri/i965/gen6_blorp.c
+++ b/src/mesa/drivers/dri/i965/gen6_blorp.c
@@ -1017,9 +1017,9 @@ gen6_blorp_exec(struct brw_context *brw,
   wm_push_const_offset = gen6_blorp_emit_wm_constants(brw, params);
   intel_miptree_used_for_rendering(params->dst.mt);
   wm_surf_offset_renderbuffer =
- gen6_blorp_emit_surface_state(brw, params, &params->dst,
-   I915_GEM_DOMAIN_RENDER,
-   I915_GEM_DOMAIN_RENDER);
+ brw_blorp_emit_surface_state(brw, &params->dst,
+  I915_GEM_DOMAIN_RENDER,
+  I915_GEM_DOMAIN_RENDER, true);
   if (params->src.mt) {
  wm_surf_offset_texture =
 gen6_blorp_emit_surface_state(brw, params, &params->src,
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 51/64] i965/state: Use ISL for emitting image surfaces

2016-06-11 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 32 
 1 file changed, 21 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 2888cc9..9270372 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -1402,22 +1402,32 @@ update_image_surface(struct brw_context *brw,
access != GL_READ_ONLY);
 
  } else {
-const unsigned min_layer = obj->MinLayer + u->_Layer;
-const unsigned min_level = obj->MinLevel + u->Level;
 const unsigned num_layers = (!u->Layered ? 1 :
  obj->Target == GL_TEXTURE_CUBE_MAP ? 
6 :
  mt->logical_depth0);
-const GLenum target = (obj->Target == GL_TEXTURE_CUBE_MAP ||
-   obj->Target == GL_TEXTURE_CUBE_MAP_ARRAY ?
-   GL_TEXTURE_2D_ARRAY : obj->Target);
+
+struct isl_view view = {
+   .format = format,
+   .base_level = obj->MinLevel + u->Level,
+   .levels = 1,
+   .base_array_layer = obj->MinLayer + u->_Layer,
+   .array_len = num_layers,
+   .channel_select = {
+  ISL_CHANNEL_SELECT_RED,
+  ISL_CHANNEL_SELECT_GREEN,
+  ISL_CHANNEL_SELECT_BLUE,
+  ISL_CHANNEL_SELECT_ALPHA,
+   },
+   .usage = ISL_SURF_USAGE_STORAGE_BIT,
+};
+
 const int surf_index = surf_offset - &brw->wm.base.surf_offset[0];
 
-brw->vtbl.emit_texture_surface_state(
-   brw, mt, target,
-   min_layer, min_layer + num_layers,
-   min_level, min_level + 1,
-   format, SWIZZLE_XYZW,
-   surf_offset, surf_index, access != GL_READ_ONLY, false);
+brw_emit_surface_state(brw, mt, &view,
+   surface_state_infos[brw->gen].rb_mocs, 
false,
+   surf_offset, surf_index,
+   I915_GEM_DOMAIN_SAMPLER,
+   I915_GEM_DOMAIN_SAMPLER);
  }
 
  update_texture_image_param(brw, u, surface_idx, param);
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 16/64] isl/state: Refactor the per-gen isl_to_gen_h/valign tables

2016-06-11 Thread Jason Ekstrand
This moves the #if's around so that halign and valign have different sets
of #if conditions.  This also prepares us for SNB because isl_to_gen_halign
is not defined at all on gen6.
---
 src/intel/isl/isl_surface_state.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/src/intel/isl/isl_surface_state.c 
b/src/intel/isl/isl_surface_state.c
index 1e94e60..db90936 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -47,18 +47,20 @@ static const uint8_t isl_to_gen_halign[] = {
 [8] = HALIGN8,
 [16] = HALIGN16,
 };
+#elif GEN_GEN >= 7
+static const uint8_t isl_to_gen_halign[] = {
+[4] = HALIGN_4,
+[8] = HALIGN_8,
+};
+#endif
 
+#if GEN_GEN >= 8
 static const uint8_t isl_to_gen_valign[] = {
 [4] = VALIGN4,
 [8] = VALIGN8,
 [16] = VALIGN16,
 };
-#else
-static const uint8_t isl_to_gen_halign[] = {
-[4] = HALIGN_4,
-[8] = HALIGN_8,
-};
-
+#elif GEN_GEN >= 6
 static const uint8_t isl_to_gen_valign[] = {
 [2] = VALIGN_2,
 [4] = VALIGN_4,
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 33/64] genxml: Make X/Y Offset field of SURFACE_STATE a uint

2016-06-11 Thread Jason Ekstrand
The offset type carries the special implication that the value is some form
of aligned memory address.  This assumption lets it handle the case where
the offset has an alignment requirement and the bottom bits are used for
other things.  However, the offsets in the surface state field are really
just unsigned integers.
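
The difference between the two field types can be sketched as follows.  This
is a minimal illustration, not the generated genxml pack code: the helper
names sketch_pack_offset and sketch_pack_uint are hypothetical, and the bit
positions used with them are only examples.  The assumption is that an
offset-typed field stores an already-aligned address in place, while a
uint-typed field shifts a small integer up to its start bit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pack an "offset"-typed field.  The value is assumed
 * to be an already-aligned address, so it is stored in place without a
 * shift and the bits outside [start, end] must be zero. */
static inline uint64_t
sketch_pack_offset(uint64_t v, uint32_t start, uint32_t end)
{
   const uint64_t mask = (~0ull >> (64 - (end - start + 1))) << start;
   assert((v & ~mask) == 0);
   return v;
}

/* Hypothetical helper: pack a "uint"-typed field.  The value is a plain
 * unsigned integer that is shifted up to the field's start bit. */
static inline uint64_t
sketch_pack_uint(uint64_t v, uint32_t start, uint32_t end)
{
   const uint32_t width = end - start + 1;
   if (width < 64)
      assert(v < (1ull << width));
   return v << start;
}
```

With these helpers, a small value such as 3 in a field spanning bits 25..31
only lands in the right place when it is packed as a uint; packed as an
offset it would trip the alignment assertion.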
---
 src/intel/genxml/gen45.xml | 4 ++--
 src/intel/genxml/gen5.xml  | 4 ++--
 src/intel/genxml/gen6.xml  | 4 ++--
 src/intel/genxml/gen7.xml  | 4 ++--
 src/intel/genxml/gen75.xml | 4 ++--
 src/intel/genxml/gen8.xml  | 4 ++--
 src/intel/genxml/gen9.xml  | 4 ++--
 7 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/src/intel/genxml/gen45.xml b/src/intel/genxml/gen45.xml
index 973b3bb..ae483b7 100644
--- a/src/intel/genxml/gen45.xml
+++ b/src/intel/genxml/gen45.xml
@@ -50,7 +50,7 @@
 
 
 
-
-
+
+
   
 
diff --git a/src/intel/genxml/gen5.xml b/src/intel/genxml/gen5.xml
index 37e1ac4..cb6a7b6 100644
--- a/src/intel/genxml/gen5.xml
+++ b/src/intel/genxml/gen5.xml
@@ -50,7 +50,7 @@
 
 
 
-
-
+
+
   
 
diff --git a/src/intel/genxml/gen6.xml b/src/intel/genxml/gen6.xml
index 7525fce..62a77de 100644
--- a/src/intel/genxml/gen6.xml
+++ b/src/intel/genxml/gen6.xml
@@ -355,12 +355,12 @@
   
 
 
-
+
 
   
   
 
-
+
 
 
   
diff --git a/src/intel/genxml/gen7.xml b/src/intel/genxml/gen7.xml
index 87057f3..9652e3f 100644
--- a/src/intel/genxml/gen7.xml
+++ b/src/intel/genxml/gen7.xml
@@ -388,8 +388,8 @@
 
 
 
-
-
+
+
 
 
 
diff --git a/src/intel/genxml/gen75.xml b/src/intel/genxml/gen75.xml
index dcceea5..37e4813 100644
--- a/src/intel/genxml/gen75.xml
+++ b/src/intel/genxml/gen75.xml
@@ -399,8 +399,8 @@
 
 
 
-
-
+
+
 
 
 
diff --git a/src/intel/genxml/gen8.xml b/src/intel/genxml/gen8.xml
index 09671ba..c33474d 100644
--- a/src/intel/genxml/gen8.xml
+++ b/src/intel/genxml/gen8.xml
@@ -317,8 +317,8 @@
   
 
 
-
-
+
+
 
 
   
diff --git a/src/intel/genxml/gen9.xml b/src/intel/genxml/gen9.xml
index f527838..26d86a0 100644
--- a/src/intel/genxml/gen9.xml
+++ b/src/intel/genxml/gen9.xml
@@ -324,8 +324,8 @@
   
 
 
-
-
+
+
 
 
   
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 20/64] isl/format: Mark R9G9B9E5 as containing 9-bit unsigned float channels

2016-06-11 Thread Jason Ekstrand
---
 src/intel/isl/isl_format_layout.csv | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/isl/isl_format_layout.csv 
b/src/intel/isl/isl_format_layout.csv
index bcfea0f..f90fbe0 100644
--- a/src/intel/isl/isl_format_layout.csv
+++ b/src/intel/isl/isl_format_layout.csv
@@ -149,7 +149,7 @@ B8G8R8X8_UNORM  ,  32,  1,  1,  1,  un8,  un8,  
un8,   x8, ,
 B8G8R8X8_UNORM_SRGB ,  32,  1,  1,  1,  un8,  un8,  un8,   x8, ,   
  ,,   srgb,
 R8G8B8X8_UNORM  ,  32,  1,  1,  1,  un8,  un8,  un8,   x8, ,   
  ,, linear,
 R8G8B8X8_UNORM_SRGB ,  32,  1,  1,  1,  un8,  un8,  un8,   x8, ,   
  ,,   srgb,
-R9G9B9E5_SHAREDEXP  ,  32,  1,  1,  1,  ui9,  ui9,  ui9, , ,   
  ,, linear,
+R9G9B9E5_SHAREDEXP  ,  32,  1,  1,  1,  uf9,  uf9,  uf9, , ,   
  ,, linear,
 B10G10R10X2_UNORM   ,  32,  1,  1,  1, un10, un10, un10,   x2, ,   
  ,, linear,
 L16A16_FLOAT,  32,  1,  1,  1, , , , sf16, sf16,   
  ,, linear,
 R32_UNORM   ,  32,  1,  1,  1, un32, , , , ,   
  ,, linear,
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 23/64] isl/state: Only set cube face enables if usage includes CUBE_BIT

2016-06-11 Thread Jason Ekstrand
It seems safe to set it all the time, but this reduces the diff between
the way i965 does it and what ISL does.
---
 src/intel/isl/isl_surface_state.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/src/intel/isl/isl_surface_state.c 
b/src/intel/isl/isl_surface_state.c
index 2d36881..0d26619 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -315,16 +315,18 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
void *state,
s.RenderCacheReadWriteMode = 0;
 #endif
 
+   if (info->view->usage & ISL_SURF_USAGE_CUBE_BIT) {
 #if GEN_GEN >= 8
-   s.CubeFaceEnablePositiveZ = 1;
-   s.CubeFaceEnableNegativeZ = 1;
-   s.CubeFaceEnablePositiveY = 1;
-   s.CubeFaceEnableNegativeY = 1;
-   s.CubeFaceEnablePositiveX = 1;
-   s.CubeFaceEnableNegativeX = 1;
+  s.CubeFaceEnablePositiveZ = 1;
+  s.CubeFaceEnableNegativeZ = 1;
+  s.CubeFaceEnablePositiveY = 1;
+  s.CubeFaceEnableNegativeY = 1;
+  s.CubeFaceEnablePositiveX = 1;
+  s.CubeFaceEnableNegativeX = 1;
 #else
-   s.CubeFaceEnables = 0x3f;
+  s.CubeFaceEnables = 0x3f;
 #endif
+   }
 
s.MultisampledSurfaceStorageFormat =
   isl_to_gen_multisample_layout[info->surf->msaa_layout];
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 28/64] isl/state: Don't use designated initializers for buffer surface state

2016-06-11 Thread Jason Ekstrand
---
 src/intel/isl/isl_surface_state.c | 46 +++
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/src/intel/isl/isl_surface_state.c 
b/src/intel/isl/isl_surface_state.c
index ca13175..2026f80 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -427,40 +427,40 @@ isl_genX(buffer_fill_state_s)(void *state,
   assert(num_elements <= (1ull << 27));
}
 
-   struct GENX(RENDER_SURFACE_STATE) surface_state = {
-  .SurfaceType = SURFTYPE_BUFFER,
-  .SurfaceArray = false,
-  .SurfaceFormat = info->format,
-  .SurfaceVerticalAlignment = isl_to_gen_valign[4],
-  .SurfaceHorizontalAlignment = isl_to_gen_halign[4],
-  .Height = ((num_elements - 1) >> 7) & 0x3fff,
-  .Width = (num_elements - 1) & 0x7f,
-  .Depth = ((num_elements - 1) >> 21) & 0x3f,
-  .SurfacePitch = info->stride - 1,
-  .NumberofMultisamples = MULTISAMPLECOUNT_1,
+   struct GENX(RENDER_SURFACE_STATE) s = { 0 };
+
+   s.SurfaceType = SURFTYPE_BUFFER,
+   s.SurfaceArray = false,
+   s.SurfaceFormat = info->format,
+   s.SurfaceVerticalAlignment = isl_to_gen_valign[4],
+   s.SurfaceHorizontalAlignment = isl_to_gen_halign[4],
+   s.Height = ((num_elements - 1) >> 7) & 0x3fff,
+   s.Width = (num_elements - 1) & 0x7f,
+   s.Depth = ((num_elements - 1) >> 21) & 0x3f,
+   s.SurfacePitch = info->stride - 1,
+   s.NumberofMultisamples = MULTISAMPLECOUNT_1,
 
 #if (GEN_GEN >= 8)
-  .TileMode = LINEAR,
+   s.TileMode = LINEAR,
 #else
-  .TiledSurface = false,
+   s.TiledSurface = false,
 #endif
 
 #if (GEN_GEN >= 8)
-  .RenderCacheReadWriteMode = WriteOnlyCache,
+   s.RenderCacheReadWriteMode = WriteOnlyCache,
 #else
-  .RenderCacheReadWriteMode = 0,
+   s.RenderCacheReadWriteMode = 0,
 #endif
 
-  .MOCS = info->mocs,
+   s.SurfaceBaseAddress = info->address,
+   s.MOCS = info->mocs,
 
 #if (GEN_GEN >= 8 || GEN_IS_HASWELL)
-  .ShaderChannelSelectRed = SCS_RED,
-  .ShaderChannelSelectGreen = SCS_GREEN,
-  .ShaderChannelSelectBlue = SCS_BLUE,
-  .ShaderChannelSelectAlpha = SCS_ALPHA,
+   s.ShaderChannelSelectRed = SCS_RED,
+   s.ShaderChannelSelectGreen = SCS_GREEN,
+   s.ShaderChannelSelectBlue = SCS_BLUE,
+   s.ShaderChannelSelectAlpha = SCS_ALPHA,
 #endif
-  .SurfaceBaseAddress = info->address,
-   };
 
-   GENX(RENDER_SURFACE_STATE_pack)(NULL, state, &surface_state);
+   GENX(RENDER_SURFACE_STATE_pack)(NULL, state, &s);
 }
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 21/64] isl/state: Set the IntegerSurfaceFormat bit on Haswell

2016-06-11 Thread Jason Ekstrand
This fixes 688 Vulkan CTS tests on Haswell.
---
 src/intel/isl/isl_surface_state.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/intel/isl/isl_surface_state.c 
b/src/intel/isl/isl_surface_state.c
index 35902e6..b16bcbf 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -202,6 +202,10 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
void *state,
   s.SurfaceFormat = info->view->format;
}
 
+#if GEN_IS_HASWELL
+   s.IntegerSurfaceFormat = isl_format_has_int_channel(s.SurfaceFormat);
+#endif
+
s.Width = info->surf->logical_level0_px.width - 1;
s.Height = info->surf->logical_level0_px.height - 1;
 
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 29/64] isl/state: Allow for full 31-bit buffer texture sizes

2016-06-11 Thread Jason Ekstrand
Ivy Bridge and above can handle up to 2^31 elements for RAW buffer
surfaces.
---
 src/intel/isl/isl_surface_state.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/isl/isl_surface_state.c 
b/src/intel/isl/isl_surface_state.c
index 2026f80..4c6563a 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -436,7 +436,7 @@ isl_genX(buffer_fill_state_s)(void *state,
s.SurfaceHorizontalAlignment = isl_to_gen_halign[4],
s.Height = ((num_elements - 1) >> 7) & 0x3fff,
s.Width = (num_elements - 1) & 0x7f,
-   s.Depth = ((num_elements - 1) >> 21) & 0x3f,
+   s.Depth = ((num_elements - 1) >> 21) & 0x3ff,
s.SurfacePitch = info->stride - 1,
s.NumberofMultisamples = MULTISAMPLECOUNT_1,
 
-- 
2.5.0.400.gff86faf
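
[Editor's note] The mask change in the patch above is easier to see with the bit layout written out. The sketch below is not ISL code: struct buffer_dims, split_num_elements() and join_num_elements() are hypothetical names used only for illustration. It assumes, as the diff suggests, that (num_elements - 1) is split into a 7-bit Width, a 14-bit Height and, after this patch, a 10-bit Depth, for 31 bits in total.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the three size fields; not an ISL type. */
struct buffer_dims {
   uint32_t width;   /* bits  6:0  of (num_elements - 1) */
   uint32_t height;  /* bits 20:7  of (num_elements - 1) */
   uint32_t depth;   /* bits 30:21 of (num_elements - 1) */
};

/* Split an element count the same way the patched code fills
 * Width/Height/Depth: 7 + 14 + 10 = 31 bits.
 */
static struct buffer_dims
split_num_elements(uint32_t num_elements)
{
   assert(num_elements >= 1 && num_elements <= (1u << 31));
   const uint32_t n = num_elements - 1;
   struct buffer_dims d = {
      .width  = n & 0x7f,
      .height = (n >> 7) & 0x3fff,
      .depth  = (n >> 21) & 0x3ff,   /* was 0x3f, i.e. only 6 bits */
   };
   return d;
}

/* Inverse of split_num_elements(), used to check that nothing is lost. */
static uint32_t
join_num_elements(struct buffer_dims d)
{
   return ((d.depth << 21) | (d.height << 7) | d.width) + 1;
}
```

With the old 0x3f mask, Depth kept only 6 bits, so any count above 2^27 would silently lose its top bits; with 0x3ff the round trip holds for every count up to 2^31.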



[Mesa-dev] [PATCH 62/64] i965: Use ISL for emitting buffer surface states

2016-06-11 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_binding_tables.c|  2 +-
 src/mesa/drivers/dri/i965/brw_context.h   |  8 --
 src/mesa/drivers/dri/i965/brw_state.h |  9 +++
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 97 +++
 src/mesa/drivers/dri/i965/gen7_cs_state.c |  2 +-
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 47 ---
 src/mesa/drivers/dri/i965/gen8_surface_state.c| 42 --
 7 files changed, 59 insertions(+), 148 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c 
b/src/mesa/drivers/dri/i965/brw_binding_tables.c
index 3bf2255..9ca841a 100644
--- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
+++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
@@ -100,7 +100,7 @@ brw_upload_binding_table(struct brw_context *brw,
} else {
   /* Upload a new binding table. */
   if (INTEL_DEBUG & DEBUG_SHADER_TIME) {
- brw->vtbl.emit_buffer_surface_state(
+ brw_emit_buffer_surface_state(
 brw, _state->surf_offset[
 prog_data->binding_table.shader_time_start],
 brw->shader_time.bo, 0, BRW_SURFACEFORMAT_RAW,
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 4b22201..20c6d96 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -744,14 +744,6 @@ struct brw_context
  uint32_t *surf_offset,
  int surf_index,
  bool rw, bool for_gather);
-  void (*emit_buffer_surface_state)(struct brw_context *brw,
-uint32_t *out_offset,
-drm_intel_bo *bo,
-unsigned buffer_offset,
-unsigned surface_format,
-unsigned buffer_size,
-unsigned pitch,
-bool rw);
   void (*emit_null_surface_state)(struct brw_context *brw,
   unsigned width,
   unsigned height,
diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
b/src/mesa/drivers/dri/i965/brw_state.h
index 604467b..fd7e0e8 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -283,6 +283,15 @@ void brw_emit_surface_state(struct brw_context *brw,
 uint32_t *surf_offset, int surf_index,
 unsigned read_domains, unsigned write_domains);
 
+void brw_emit_buffer_surface_state(struct brw_context *brw,
+   uint32_t *out_offset,
+   drm_intel_bo *bo,
+   unsigned buffer_offset,
+   unsigned surface_format,
+   unsigned buffer_size,
+   unsigned pitch,
+   bool rw);
+
 void brw_update_texture_surface(struct gl_context *ctx,
 unsigned unit, uint32_t *surf_offset,
 bool for_gather, uint32_t plane);
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index ba2ad7d..fccd26b 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -484,36 +484,36 @@ brw_update_texture_surface(struct gl_context *ctx,
}
 }
 
-static void
-gen4_emit_buffer_surface_state(struct brw_context *brw,
-   uint32_t *out_offset,
-   drm_intel_bo *bo,
-   unsigned buffer_offset,
-   unsigned surface_format,
-   unsigned buffer_size,
-   unsigned pitch,
-   bool rw)
+void
+brw_emit_buffer_surface_state(struct brw_context *brw,
+  uint32_t *out_offset,
+  drm_intel_bo *bo,
+  unsigned buffer_offset,
+  unsigned surface_format,
+  unsigned buffer_size,
+  unsigned pitch,
+  bool rw)
 {
-   unsigned elements = buffer_size / pitch;
-   uint32_t *surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
-6 * 4, 32, out_offset);
-   memset(surf, 0, 6 * 4);
+   /* TODO: This should go in the context */
+   struct isl_device isl_dev;
+   isl_device_init(&isl_dev, brw->intelScreen->devinfo, brw->has_swizzling);
+
+   const struct surface_state_info ss_info = surface_state_infos[brw->gen];
+
+   

[Mesa-dev] [PATCH 55/64] i965/gen8: Use the generic ISL-based path for renderbuffer surfaces

2016-06-11 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_state.h  |  16 --
 src/mesa/drivers/dri/i965/gen8_surface_state.c | 249 +
 2 files changed, 2 insertions(+), 263 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
b/src/mesa/drivers/dri/i965/brw_state.h
index 1667ee0..fec224e 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -325,22 +325,6 @@ void gen7_upload_3dstate_so_decl_list(struct brw_context 
*brw,
 
 void gen8_init_vtable_surface_functions(struct brw_context *brw);
 
-unsigned gen8_surface_tiling_mode(uint32_t tiling);
-unsigned gen8_vertical_alignment(const struct brw_context *brw,
- const struct intel_mipmap_tree *mt,
- uint32_t surf_type);
-unsigned gen8_horizontal_alignment(const struct brw_context *brw,
-   const struct intel_mipmap_tree *mt,
-   uint32_t surf_type);
-uint32_t *gen8_allocate_surface_state(struct brw_context *brw,
-  uint32_t *out_offset, int index);
-
-void gen8_emit_fast_clear_color(const struct brw_context *brw,
-const struct intel_mipmap_tree *mt,
-uint32_t *surf);
-uint32_t gen8_get_aux_mode(const struct brw_context *brw,
-   const struct intel_mipmap_tree *mt);
-
 /* brw_sampler_state.c */
 void brw_emit_sampler_state(struct brw_context *brw,
 uint32_t *sampler_state,
diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c 
b/src/mesa/drivers/dri/i965/gen8_surface_state.c
index ed26271..00e4c48 100644
--- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
@@ -42,83 +42,7 @@
 #include "brw_wm.h"
 #include "isl/isl.h"
 
-static uint32_t
-surface_tiling_resource_mode(uint32_t tr_mode)
-{
-   switch (tr_mode) {
-   case INTEL_MIPTREE_TRMODE_YF:
-  return GEN9_SURFACE_TRMODE_TILEYF;
-   case INTEL_MIPTREE_TRMODE_YS:
-  return GEN9_SURFACE_TRMODE_TILEYS;
-   default:
-  return GEN9_SURFACE_TRMODE_NONE;
-   }
-}
-
-uint32_t
-gen8_surface_tiling_mode(uint32_t tiling)
-{
-   switch (tiling) {
-   case I915_TILING_X:
-  return GEN8_SURFACE_TILING_X;
-   case I915_TILING_Y:
-  return GEN8_SURFACE_TILING_Y;
-   default:
-  return GEN8_SURFACE_TILING_NONE;
-   }
-}
-
-unsigned
-gen8_vertical_alignment(const struct brw_context *brw,
-const struct intel_mipmap_tree *mt,
-uint32_t surf_type)
-{
-   /* On Gen9+ vertical alignment is ignored for 1D surfaces and when
-* tr_mode is not TRMODE_NONE. Set to an arbitrary non-reserved value.
-*/
-   if (brw->gen > 8 &&
-   (mt->tr_mode != INTEL_MIPTREE_TRMODE_NONE ||
-surf_type == BRW_SURFACE_1D))
-  return GEN8_SURFACE_VALIGN_4;
-
-   switch (mt->valign) {
-   case 4:
-  return GEN8_SURFACE_VALIGN_4;
-   case 8:
-  return GEN8_SURFACE_VALIGN_8;
-   case 16:
-  return GEN8_SURFACE_VALIGN_16;
-   default:
-  unreachable("Unsupported vertical surface alignment.");
-   }
-}
-
-unsigned
-gen8_horizontal_alignment(const struct brw_context *brw,
-  const struct intel_mipmap_tree *mt,
-  uint32_t surf_type)
-{
-   /* On Gen9+ horizontal alignment is ignored when tr_mode is not
-* TRMODE_NONE. Set to an arbitrary non-reserved value.
-*/
-   if (brw->gen > 8 &&
-   (mt->tr_mode != INTEL_MIPTREE_TRMODE_NONE ||
-gen9_use_linear_1d_layout(brw, mt)))
-  return GEN8_SURFACE_HALIGN_4;
-
-   switch (mt->halign) {
-   case 4:
-  return GEN8_SURFACE_HALIGN_4;
-   case 8:
-  return GEN8_SURFACE_HALIGN_8;
-   case 16:
-  return GEN8_SURFACE_HALIGN_16;
-   default:
-  unreachable("Unsupported horizontal surface alignment.");
-   }
-}
-
-uint32_t *
+static uint32_t *
 gen8_allocate_surface_state(struct brw_context *brw,
 uint32_t *out_offset, int index)
 {
@@ -169,44 +93,6 @@ gen8_emit_buffer_surface_state(struct brw_context *brw,
}
 }
 
-void
-gen8_emit_fast_clear_color(const struct brw_context *brw,
-   const struct intel_mipmap_tree *mt,
-   uint32_t *surf)
-{
-   if (brw->gen >= 9) {
-  surf[12] = mt->gen9_fast_clear_color.ui[0];
-  surf[13] = mt->gen9_fast_clear_color.ui[1];
-  surf[14] = mt->gen9_fast_clear_color.ui[2];
-  surf[15] = mt->gen9_fast_clear_color.ui[3];
-   } else
-  surf[7] |= mt->fast_clear_color_value;
-}
-
-uint32_t
-gen8_get_aux_mode(const struct brw_context *brw,
-  const struct intel_mipmap_tree *mt)
-{
-   if (mt->mcs_mt == NULL)
-  return GEN8_SURFACE_AUX_MODE_NONE;
-
-   /*
-* From the BDW PRM, Volume 2d, page 260 (RENDER_SURFACE_STATE):
-* "When MCS is enabled for non-MSRT, HALIGN_16 must be used"
-*
-   

[Mesa-dev] [PATCH 34/64] genxml: Add macros and #includes for gens 4-6

2016-06-11 Thread Jason Ekstrand
---
 src/intel/genxml/genX_pack.h  | 10 +-
 src/intel/genxml/gen_macros.h | 15 ++-
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/src/intel/genxml/genX_pack.h b/src/intel/genxml/genX_pack.h
index 7967c29..0c25c4e 100644
--- a/src/intel/genxml/genX_pack.h
+++ b/src/intel/genxml/genX_pack.h
@@ -27,7 +27,15 @@
 #  error "The GEN_VERSIONx10 macro must be defined"
 #endif
 
-#if (GEN_VERSIONx10 == 70)
+#if (GEN_VERSIONx10 == 40)
+#  include "genxml/gen4_pack.h"
+#elif (GEN_VERSIONx10 == 45)
+#  include "genxml/gen45_pack.h"
+#elif (GEN_VERSIONx10 == 50)
+#  include "genxml/gen5_pack.h"
+#elif (GEN_VERSIONx10 == 60)
+#  include "genxml/gen6_pack.h"
+#elif (GEN_VERSIONx10 == 70)
 #  include "genxml/gen7_pack.h"
 #elif (GEN_VERSIONx10 == 75)
 #  include "genxml/gen75_pack.h"
diff --git a/src/intel/genxml/gen_macros.h b/src/intel/genxml/gen_macros.h
index 868bc22..1d591fa 100644
--- a/src/intel/genxml/gen_macros.h
+++ b/src/intel/genxml/gen_macros.h
@@ -57,9 +57,22 @@
 
 #define GEN_GEN ((GEN_VERSIONx10) / 10)
 #define GEN_IS_HASWELL ((GEN_VERSIONx10) == 75)
+#define GEN_IS_G4X ((GEN_VERSIONx10) == 45)
 
 /* Prefixing macros */
-#if (GEN_VERSIONx10 == 70)
+#if (GEN_VERSIONx10 == 40)
+#  define GENX(X) GEN4_##X
+#  define genX(x) gen4_##x
+#elif (GEN_VERSIONx10 == 45)
+#  define GENX(X) GEN45_##X
+#  define genX(x) gen45_##x
+#elif (GEN_VERSIONx10 == 50)
+#  define GENX(X) GEN5_##X
+#  define genX(x) gen5_##x
+#elif (GEN_VERSIONx10 == 60)
+#  define GENX(X) GEN6_##X
+#  define genX(x) gen6_##x
+#elif (GEN_VERSIONx10 == 70)
 #  define GENX(X) GEN7_##X
 #  define genX(x) gen7_##x
 #elif (GEN_VERSIONx10 == 75)
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 30/64] anv, isl: Lower storage image formats in anv

2016-06-11 Thread Jason Ekstrand
ISL was being a bit too clever for its own good and lowering the format for
us.  This is all well and good *if* we always want to lower it.  However,
the GL driver selectively lowers the format depending on whether the
surface is write-only or not.
---
 src/intel/isl/isl_surface_state.c | 8 +---
 src/intel/vulkan/anv_image.c  | 3 +++
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/src/intel/isl/isl_surface_state.c 
b/src/intel/isl/isl_surface_state.c
index 4c6563a..fb3fd99 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -190,13 +190,7 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
void *state,
struct GENX(RENDER_SURFACE_STATE) s = { 0 };
 
s.SurfaceType = get_surftype(info->surf->dim, info->view->usage);
-
-   if (info->view->usage & ISL_SURF_USAGE_STORAGE_BIT) {
-  s.SurfaceFormat =
- isl_lower_storage_image_format(dev->info, info->view->format);
-   } else {
-  s.SurfaceFormat = info->view->format;
-   }
+   s.SurfaceFormat = info->view->format;
 
 #if GEN_IS_HASWELL
s.IntegerSurfaceFormat = isl_format_has_int_channel(s.SurfaceFormat);
diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index 208e377..77d9931 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -537,12 +537,15 @@ anv_image_view_init(struct anv_image_view *iview,
   iview->color_rt_surface_state.alloc_size = 0;
}
 
+   /* NOTE: This one needs to go last since it may stomp isl_view.format */
if (image->usage & usage_mask & VK_IMAGE_USAGE_STORAGE_BIT) {
   iview->storage_surface_state = alloc_surface_state(device, cmd_buffer);
 
   if (isl_has_matching_typed_storage_image_format(>info,
   format.isl_format)) {
  isl_view.usage = cube_usage | ISL_SURF_USAGE_STORAGE_BIT;
+ isl_view.format = isl_lower_storage_image_format(>info,
+  isl_view.format);
  isl_surf_fill_state(>isl_dev,
  iview->storage_surface_state.map,
  .surf = >isl,
-- 
2.5.0.400.gff86faf
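
[Editor's note] The design point in the commit message above is that the caller, not the surface-state fill code, should decide whether a storage format gets lowered. The sketch below illustrates that split with invented names: enum ex_format, ex_lower_storage_format() and ex_pick_view_format() are hypothetical, and the single format mapping is a stand-in rather than the real ISL lowering table.

```c
#include <stdbool.h>

/* Hypothetical formats; the real code uses enum isl_format. */
enum ex_format {
   EX_FORMAT_RGBA8_UNORM,
   EX_FORMAT_R32_UINT,
};

/* Stand-in for a lowering helper: map a format that cannot be read
 * through a typed surface to a same-size format that can.
 * The mapping here is an assumption made for illustration only.
 */
static enum ex_format
ex_lower_storage_format(enum ex_format format)
{
   return format == EX_FORMAT_RGBA8_UNORM ? EX_FORMAT_R32_UINT : format;
}

/* Caller-side policy: a driver that only writes the image can keep the
 * original format, while one that also reads it lowers the format
 * before handing the view to the fill function.
 */
static enum ex_format
ex_pick_view_format(enum ex_format format, bool write_only)
{
   return write_only ? format : ex_lower_storage_format(format);
}
```

Keeping the fill function free of this policy means a driver that always lowers and a driver that lowers selectively can share the same code path.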



[Mesa-dev] [PATCH 22/64] isl/state: Use the layout for computing qpitch rather than dimensions

2016-06-11 Thread Jason Ekstrand
For depth/stencil 1-D textures on SKL, we want them laid out in the old
format that has been used since gen4.  In order for the surface state
fill-out code to handle this, it needs to distinguish based on layout
rather than just dimensionality.
---
 src/intel/isl/isl_surface_state.c | 34 +++---
 1 file changed, 15 insertions(+), 19 deletions(-)

diff --git a/src/intel/isl/isl_surface_state.c 
b/src/intel/isl/isl_surface_state.c
index b16bcbf..2d36881 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -148,27 +148,11 @@ get_image_alignment(const struct isl_surf *surf)
 static uint32_t
 get_qpitch(const struct isl_surf *surf)
 {
-   switch (surf->dim) {
+   switch (surf->dim_layout) {
default:
   unreachable("Bad isl_surf_dim");
-   case ISL_SURF_DIM_1D:
-  if (GEN_GEN >= 9) {
- /* QPitch is usually expressed as rows of surface elements (where
-  * a surface element is an compression block or a single surface
-  * sample). Skylake 1D is an outlier.
-  *
-  * From the Skylake BSpec >> Memory Views >> Common Surface
-  * Formats >> Surface Layout and Tiling >> 1D Surfaces:
-  *
-  *Surface QPitch specifies the distance in pixels between array
-  *slices.
-  */
- return isl_surf_get_array_pitch_el(surf);
-  } else {
- return isl_surf_get_array_pitch_el_rows(surf);
-  }
-   case ISL_SURF_DIM_2D:
-   case ISL_SURF_DIM_3D:
+   case ISL_DIM_LAYOUT_GEN4_2D:
+   case ISL_DIM_LAYOUT_GEN4_3D:
   if (GEN_GEN >= 9) {
  return isl_surf_get_array_pitch_el_rows(surf);
   } else {
@@ -183,6 +167,18 @@ get_qpitch(const struct isl_surf *surf)
   */
  return isl_surf_get_array_pitch_sa_rows(surf);
   }
+   case ISL_DIM_LAYOUT_GEN9_1D:
+  /* QPitch is usually expressed as rows of surface elements (where
+   * a surface element is an compression block or a single surface
+   * sample). Skylake 1D is an outlier.
+   *
+   * From the Skylake BSpec >> Memory Views >> Common Surface
+   * Formats >> Surface Layout and Tiling >> 1D Surfaces:
+   *
+   *Surface QPitch specifies the distance in pixels between array
+   *slices.
+   */
+  return isl_surf_get_array_pitch_el(surf);
}
 }
 #endif /* GEN_GEN >= 8 */
-- 
2.5.0.400.gff86faf
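
[Editor's note] A small sketch of why the switch moves from dimensionality to layout. All names here (enum ex_dim, enum ex_layout, struct ex_surf and both helper functions) are hypothetical stand-ins for the ISL types in the diff. The case of interest is a 1-D depth/stencil surface on gen9 that still uses the gen4-style 2D layout.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for isl_surf_dim and isl_dim_layout. */
enum ex_dim { EX_DIM_1D, EX_DIM_2D, EX_DIM_3D };
enum ex_layout { EX_LAYOUT_GEN4_2D, EX_LAYOUT_GEN4_3D, EX_LAYOUT_GEN9_1D };

struct ex_surf {
   enum ex_dim dim;
   enum ex_layout dim_layout;
};

/* Old rule: any 1-D surface on gen9+ expresses QPitch in pixels. */
static bool
qpitch_in_pixels_by_dim(const struct ex_surf *surf, int gen)
{
   return surf->dim == EX_DIM_1D && gen >= 9;
}

/* New rule: only surfaces that actually use the gen9 1D layout do. */
static bool
qpitch_in_pixels_by_layout(const struct ex_surf *surf)
{
   return surf->dim_layout == EX_LAYOUT_GEN9_1D;
}
```

For a surface with dim 1D but layout GEN4_2D, the old rule picks the pixel-based pitch while the new rule correctly falls through to the row-based one.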



[Mesa-dev] [PATCH 44/64] i965/blorp: Use the generic ISL path for renderbuffer surfaces on gen8-9

2016-06-11 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/gen8_blorp.c | 99 ++
 1 file changed, 4 insertions(+), 95 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_blorp.c 
b/src/mesa/drivers/dri/i965/gen8_blorp.c
index fcf5a53..b5c600b 100644
--- a/src/mesa/drivers/dri/i965/gen8_blorp.c
+++ b/src/mesa/drivers/dri/i965/gen8_blorp.c
@@ -33,97 +33,6 @@
 
 #include "brw_blorp.h"
 
-
-/* SURFACE_STATE for renderbuffer or texture surface (see
- * brw_update_renderbuffer_surface and brw_update_texture_surface)
- */
-static uint32_t
-gen8_blorp_emit_surface_state(struct brw_context *brw,
-  const struct brw_blorp_surface_info *surface,
-  uint32_t read_domains, uint32_t write_domain,
-  bool is_render_target)
-{
-   uint32_t wm_surf_offset;
-   const struct intel_mipmap_tree *mt = surface->mt;
-   const uint32_t mocs_wb = is_render_target ?
-   (brw->gen >= 9 ? SKL_MOCS_PTE : BDW_MOCS_PTE) :
-   (brw->gen >= 9 ? SKL_MOCS_WB : BDW_MOCS_WB);
-   const uint32_t tiling = surface->map_stencil_as_y_tiled
-  ? I915_TILING_Y : mt->tiling;
-   uint32_t tile_x, tile_y;
-
-   uint32_t *surf = gen8_allocate_surface_state(brw, &wm_surf_offset, -1);
-
-   surf[0] = BRW_SURFACE_2D << BRW_SURFACE_TYPE_SHIFT |
- surface->brw_surfaceformat << BRW_SURFACE_FORMAT_SHIFT |
- gen8_vertical_alignment(brw, mt, BRW_SURFACE_2D) |
- gen8_horizontal_alignment(brw, mt, BRW_SURFACE_2D) |
- gen8_surface_tiling_mode(tiling);
-
-   surf[1] = SET_FIELD(mocs_wb, GEN8_SURFACE_MOCS) | mt->qpitch >> 2;
-
-   surf[2] = SET_FIELD(surface->width - 1, GEN7_SURFACE_WIDTH) |
- SET_FIELD(surface->height - 1, GEN7_SURFACE_HEIGHT);
-
-   uint32_t pitch_bytes = mt->pitch;
-   if (surface->map_stencil_as_y_tiled)
-  pitch_bytes *= 2;
-   surf[3] = pitch_bytes - 1;
-
-   surf[4] = gen7_surface_msaa_bits(surface->num_samples,
-surface->msaa_layout);
-
-   if (surface->mt->mcs_mt) {
-  surf[6] = SET_FIELD(surface->mt->qpitch / 4, GEN8_SURFACE_AUX_QPITCH) |
-SET_FIELD((surface->mt->mcs_mt->pitch / 128) - 1,
-  GEN8_SURFACE_AUX_PITCH) |
-gen8_get_aux_mode(brw, mt);
-   } else {
-  surf[6] = 0;
-   }
-
-   gen8_emit_fast_clear_color(brw, mt, surf);
-   surf[7] |= SET_FIELD(HSW_SCS_RED,   GEN7_SURFACE_SCS_R) |
-  SET_FIELD(HSW_SCS_GREEN, GEN7_SURFACE_SCS_G) |
-  SET_FIELD(HSW_SCS_BLUE,  GEN7_SURFACE_SCS_B) |
-  SET_FIELD(HSW_SCS_ALPHA, GEN7_SURFACE_SCS_A);
-
-/* reloc */
-   *((uint64_t *)&surf[8]) =
-  brw_blorp_compute_tile_offsets(surface, &tile_x, &tile_y) +
-  mt->bo->offset64;
-
-   /* Note that the low bits of these fields are missing, so there's the
-* possibility of getting in trouble.
-*/
-   assert(tile_x % 4 == 0);
-   assert(tile_y % 4 == 0);
-   surf[5] = SET_FIELD(tile_x / 4, BRW_SURFACE_X_OFFSET) |
- SET_FIELD(tile_y / 4, GEN8_SURFACE_Y_OFFSET);
-
-   if (brw->gen >= 9) {
-  /* Disable Mip Tail by setting a large value. */
-  surf[5] |= SET_FIELD(15, GEN9_SURFACE_MIP_TAIL_START_LOD);
-   }
-
-   if (surface->mt->mcs_mt) {
-  *((uint64_t *) &surf[10]) = surface->mt->mcs_mt->bo->offset64;
-  drm_intel_bo_emit_reloc(brw->batch.bo,
-  wm_surf_offset + 10 * 4,
-  surface->mt->mcs_mt->bo, 0,
-  read_domains, write_domain);
-   }
-
-   /* Emit relocation to surface contents */
-   drm_intel_bo_emit_reloc(brw->batch.bo,
-   wm_surf_offset + 8 * 4,
-   mt->bo,
-   surf[8] - mt->bo->offset64,
-   read_domains, write_domain);
-
-   return wm_surf_offset;
-}
-
 static uint32_t
 gen8_blorp_emit_blend_state(struct brw_context *brw,
 const struct brw_blorp_params *params)
@@ -600,10 +509,10 @@ gen8_blorp_emit_surface_states(struct brw_context *brw,
intel_miptree_used_for_rendering(params->dst.mt);
 
wm_surf_offset_renderbuffer =
-  gen8_blorp_emit_surface_state(brw, &params->dst,
-I915_GEM_DOMAIN_RENDER,
-I915_GEM_DOMAIN_RENDER,
-true /* is_render_target */);
+  brw_blorp_emit_surface_state(brw, &params->dst,
+   I915_GEM_DOMAIN_RENDER,
+   I915_GEM_DOMAIN_RENDER,
+   true /* is_render_target */);
if (params->src.mt) {
   const struct brw_blorp_surface_info *surface = &params->src;
   struct intel_mipmap_tree *mt = surface->mt;
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH 41/64] i965/miptree: Add a helper for getting the ISL clear color from a miptree

2016-06-11 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 24 
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  4 
 2 files changed, 28 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 83a9764..8a746ec 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3166,3 +3166,27 @@ intel_miptree_get_isl_surf(struct brw_context *brw,
 
surf->usage = 0; /* TODO */
 }
+
+union isl_color_value
+intel_miptree_get_isl_clear_color(struct brw_context *brw,
+  const struct intel_mipmap_tree *mt)
+{
+   union isl_color_value clear_color;
+   if (brw->gen >= 9) {
+  clear_color.i32[0] = mt->gen9_fast_clear_color.i[0];
+  clear_color.i32[1] = mt->gen9_fast_clear_color.i[1];
+  clear_color.i32[2] = mt->gen9_fast_clear_color.i[2];
+  clear_color.i32[3] = mt->gen9_fast_clear_color.i[3];
+   } else if (_mesa_is_format_integer(mt->format)) {
+  clear_color.i32[0] = (mt->fast_clear_color_value & (1u << 31)) != 0;
+  clear_color.i32[1] = (mt->fast_clear_color_value & (1u << 30)) != 0;
+  clear_color.i32[2] = (mt->fast_clear_color_value & (1u << 29)) != 0;
+  clear_color.i32[3] = (mt->fast_clear_color_value & (1u << 28)) != 0;
+   } else {
+  clear_color.f32[0] = (mt->fast_clear_color_value & (1u << 31)) != 0;
+  clear_color.f32[1] = (mt->fast_clear_color_value & (1u << 30)) != 0;
+  clear_color.f32[2] = (mt->fast_clear_color_value & (1u << 29)) != 0;
+  clear_color.f32[3] = (mt->fast_clear_color_value & (1u << 28)) != 0;
+   }
+   return clear_color;
+}
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index cf5d1a6..a50f181 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -802,6 +802,10 @@ intel_miptree_get_isl_surf(struct brw_context *brw,
const struct intel_mipmap_tree *mt,
struct isl_surf *surf);
 
+union isl_color_value
+intel_miptree_get_isl_clear_color(struct brw_context *brw,
+  const struct intel_mipmap_tree *mt);
+
 void
 intel_get_image_dims(struct gl_texture_image *image,
  int *width, int *height, int *depth);
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 39/64] isl/state: Add support for OffsetX/Y in surface state

2016-06-11 Thread Jason Ekstrand
---
 src/intel/isl/isl.h   | 3 +++
 src/intel/isl/isl_surface_state.c | 9 +
 2 files changed, 12 insertions(+)

diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
index a987482..4dd4a2f 100644
--- a/src/intel/isl/isl.h
+++ b/src/intel/isl/isl.h
@@ -844,6 +844,9 @@ struct isl_surf_fill_state_info {
 * Valid values depend on hardware generation.
 */
union isl_color_value clear_color;
+
+   /* Intra-tile offset */
+   uint16_t x_offset, y_offset;
 };
 
 struct isl_buffer_fill_state_info {
diff --git a/src/intel/isl/isl_surface_state.c 
b/src/intel/isl/isl_surface_state.c
index 9bfc55f..65e4b8e 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -401,6 +401,15 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
void *state,
s.MOCS = info->mocs;
 #endif
 
+#if GEN_GEN > 4 || GEN_IS_G4X
+   const unsigned x_div = 4;
+   const unsigned y_div = GEN_GEN >= 8 ? 4 : 2;
+   assert(info->x_offset % x_div == 0);
+   assert(info->y_offset % y_div == 0);
+   s.XOffset = info->x_offset / x_div;
+   s.YOffset = info->y_offset / y_div;
+#endif
+
 #if GEN_GEN >= 7
if (info->aux_surf) {
   struct isl_tile_info tile_info;
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 05/64] i965: Remove fake W-tiled render target support

2016-06-11 Thread Jason Ekstrand
This hasn't been used since 1cfb4bc890b8 where we deleted the meta stencil
blit path.
---
 src/mesa/drivers/dri/i965/brw_state.h|  6 --
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 25 
 src/mesa/drivers/dri/i965/gen8_surface_state.c   | 25 +---
 3 files changed, 9 insertions(+), 47 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
b/src/mesa/drivers/dri/i965/brw_state.h
index 0a4c21f..eec4bae 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -264,12 +264,6 @@ void gen4_init_vtable_surface_functions(struct brw_context 
*brw);
 uint32_t brw_get_surface_tiling_bits(uint32_t tiling);
 uint32_t brw_get_surface_num_multisamples(unsigned num_samples);
 
-void brw_configure_w_tiled(const struct intel_mipmap_tree *mt,
-   bool is_render_target,
-   unsigned *width, unsigned *height,
-   unsigned *pitch, uint32_t *tiling,
-   unsigned *format);
-
 uint32_t brw_format_for_mesa_format(mesa_format mesa_format);
 
 GLuint translate_tex_target(GLenum target);
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 133a944..6bb1bec 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -105,31 +105,6 @@ brw_get_surface_num_multisamples(unsigned num_samples)
   return BRW_SURFACE_MULTISAMPLECOUNT_1;
 }
 
-void
-brw_configure_w_tiled(const struct intel_mipmap_tree *mt,
-  bool is_render_target,
-  unsigned *width, unsigned *height,
-  unsigned *pitch, uint32_t *tiling, unsigned *format)
-{
-   static const unsigned halign_stencil = 8;
-
-   /* In Y-tiling row is twice as wide as in W-tiling, and subsequently
-* there are half as many rows.
-* In addition, mip-levels are accessed manually by the program and
-* therefore the surface is setup to cover all the mip-levels for one slice.
-* (Hardware is still used to access individual slices).
-*/
-   *tiling = I915_TILING_Y;
-   *pitch = mt->pitch * 2;
-   *width = ALIGN(mt->total_width, halign_stencil) * 2;
-   *height = (mt->total_height / mt->physical_depth0) / 2;
-
-   if (is_render_target) {
-  *format = BRW_SURFACEFORMAT_R8_UINT;
-   }
-}
-
-
 /**
  * Compute the combination of DEPTH_TEXTURE_MODE and EXT_texture_swizzle
  * swizzling.
diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c 
b/src/mesa/drivers/dri/i965/gen8_surface_state.c
index 6a98d76..abd6016 100644
--- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
@@ -490,22 +490,15 @@ gen8_update_renderbuffer_surface(struct brw_context *brw,
}
 
/* _NEW_BUFFERS */
-   /* Render targets can't use IMS layout. Stencil in turn gets configured as
-* single sampled and indexed manually by the program.
-*/
-   if (mt->format == MESA_FORMAT_S_UINT8) {
-  brw_configure_w_tiled(mt, true, &width, &height, &pitch,
-&tiling, &format);
-   } else {
-  assert(mt->msaa_layout != INTEL_MSAA_LAYOUT_IMS);
-  assert(brw_render_target_supported(brw, rb));
-  mesa_format rb_format = _mesa_get_render_format(ctx,
-  intel_rb_format(irb));
-  format = brw->render_target_format[rb_format];
-  if (unlikely(!brw->format_supported_as_render_target[rb_format]))
- _mesa_problem(ctx, "%s: renderbuffer format %s unsupported\n",
-   __func__, _mesa_get_format_name(rb_format));
-   }
+   /* Render targets can't use IMS layout. */
+   assert(mt->msaa_layout != INTEL_MSAA_LAYOUT_IMS);
+   assert(brw_render_target_supported(brw, rb));
+   mesa_format rb_format = _mesa_get_render_format(ctx,
+   intel_rb_format(irb));
+   format = brw->render_target_format[rb_format];
+   if (unlikely(!brw->format_supported_as_render_target[rb_format]))
+  _mesa_problem(ctx, "%s: renderbuffer format %s unsupported\n",
+__func__, _mesa_get_format_name(rb_format));
 
struct intel_mipmap_tree *aux_mt = mt->mcs_mt;
const uint32_t aux_mode = gen8_get_aux_mode(brw, mt);
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 19/64] isl/state: Set SurfaceArray based on the surface dimension

2016-06-11 Thread Jason Ekstrand
According to the PRM, you can't set SurfaceArray for 3D or buffer textures.
There doesn't seem to be a good reason not to set it when we can.  On the
other hand, if we don't set it we can end up getting strange results for
1-layer array textures such as textureSize() returning the wrong results.
---
 src/intel/isl/isl_surface_state.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/isl/isl_surface_state.c 
b/src/intel/isl/isl_surface_state.c
index 60bfced..35902e6 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -257,7 +257,7 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
void *state,
   unreachable("bad SurfaceType");
}
 
-   s.SurfaceArray = info->surf->phys_level0_sa.array_len > 1;
+   s.SurfaceArray = info->surf->dim != ISL_SURF_DIM_3D;
 
if (info->view->usage & ISL_SURF_USAGE_RENDER_TARGET_BIT) {
   /* For render target surfaces, the hardware interprets field
-- 
2.5.0.400.gff86faf
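
[Editor's note] The behavioural difference described in the commit message above fits in two one-line predicates. The names below are hypothetical; only the two conditions are taken from the diff. The sketch covers image surfaces only, since buffers are filled by a separate function.

```c
#include <stdbool.h>

/* Hypothetical stand-in for isl_surf_dim. */
enum ex_surf_dim { EX_SURF_DIM_1D, EX_SURF_DIM_2D, EX_SURF_DIM_3D };

/* Old condition: only mark the surface as an array when it has more
 * than one layer, so a 1-layer array texture is treated as non-array.
 */
static bool
surface_array_old(unsigned array_len)
{
   return array_len > 1;
}

/* New condition: mark every surface as an array unless it is 3D. */
static bool
surface_array_new(enum ex_surf_dim dim)
{
   return dim != EX_SURF_DIM_3D;
}
```

A 2D array texture with a single layer is the case that changes: the old predicate yields false, the new one yields true, which is what queries such as textureSize() need in order to report a layer count.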



[Mesa-dev] [PATCH 17/64] isl/state: Refactor the setup of clear colors

2016-06-11 Thread Jason Ekstrand
This commit switches clear colors to use #if's instead of a C if.  This
lets us properly handle SNB where the clear color field doesn't exist.
---
 src/intel/isl/isl_surface_state.c | 44 +++
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/src/intel/isl/isl_surface_state.c 
b/src/intel/isl/isl_surface_state.c
index db90936..aa720d8 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -372,31 +372,31 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
void *state,
}
 #endif
 
-   if (GEN_GEN <= 8) {
-  /* Prior to Sky Lake, we only have one bit for the clear color which
-   * gives us 0 or 1 in whatever the surface's format happens to be.
-   */
-  if (isl_format_has_int_channel(info->view->format)) {
- for (unsigned i = 0; i < 4; i++) {
-assert(info->clear_color.u32[i] == 0 ||
-   info->clear_color.u32[i] == 1);
- }
-  } else {
- for (unsigned i = 0; i < 4; i++) {
-assert(info->clear_color.f32[i] == 0.0f ||
-   info->clear_color.f32[i] == 1.0f);
- }
+#if GEN_GEN >= 9
+   s.RedClearColor = info->clear_color.u32[0];
+   s.GreenClearColor = info->clear_color.u32[1];
+   s.BlueClearColor = info->clear_color.u32[2];
+   s.AlphaClearColor = info->clear_color.u32[3];
+#elif GEN_GEN >= 7
+   /* Prior to Sky Lake, we only have one bit for the clear color which
+* gives us 0 or 1 in whatever the surface's format happens to be.
+*/
+   if (isl_format_has_int_channel(info->view->format)) {
+  for (unsigned i = 0; i < 4; i++) {
+ assert(info->clear_color.u32[i] == 0 ||
+info->clear_color.u32[i] == 1);
   }
-  s.RedClearColor = info->clear_color.u32[0] != 0;
-  s.GreenClearColor = info->clear_color.u32[1] != 0;
-  s.BlueClearColor = info->clear_color.u32[2] != 0;
-  s.AlphaClearColor = info->clear_color.u32[3] != 0;
} else {
-  s.RedClearColor = info->clear_color.u32[0];
-  s.GreenClearColor = info->clear_color.u32[1];
-  s.BlueClearColor = info->clear_color.u32[2];
-  s.AlphaClearColor = info->clear_color.u32[3];
+  for (unsigned i = 0; i < 4; i++) {
+ assert(info->clear_color.f32[i] == 0.0f ||
+info->clear_color.f32[i] == 1.0f);
+  }
}
+   s.RedClearColor = info->clear_color.u32[0] != 0;
+   s.GreenClearColor = info->clear_color.u32[1] != 0;
+   s.BlueClearColor = info->clear_color.u32[2] != 0;
+   s.AlphaClearColor = info->clear_color.u32[3] != 0;
+#endif
 
GENX(RENDER_SURFACE_STATE_pack)(NULL, state, &s);
 }
-- 
2.5.0.400.gff86faf
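
A runtime `if (GEN_GEN <= 8)` still requires every branch to compile, so a branch that names a field missing from one generation's struct breaks that generation's build. The following is only a minimal sketch of the preprocessor approach; `sketch_surface_state`, `sketch_red_clear`, and the default `GEN_GEN` value are hypothetical stand-ins, not the real genxml output.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-generation build flag; the real value
 * comes from the build system, not from this default. */
#ifndef GEN_GEN
#define GEN_GEN 9
#endif

/* Hypothetical packed-state struct: the clear color field only exists on
 * generations that have it. */
struct sketch_surface_state {
   uint32_t Width;
#if GEN_GEN >= 7
   uint32_t RedClearColor;
#endif
};

/* Returns the value that would be written to the clear color field, or 0
 * when the field does not exist for this generation. */
static uint32_t
sketch_red_clear(uint32_t clear_u32)
{
   struct sketch_surface_state s = { 0 };
#if GEN_GEN >= 9
   s.RedClearColor = clear_u32;           /* full 32-bit value */
   return s.RedClearColor;
#elif GEN_GEN >= 7
   assert(clear_u32 == 0 || clear_u32 == 1);
   s.RedClearColor = clear_u32 != 0;      /* single bit */
   return s.RedClearColor;
#else
   (void)clear_u32;                       /* no such field: nothing to set */
   return s.Width;                        /* still zero-initialized */
#endif
}
```

Because the field reference sits inside `#if`, a build for a generation without the field never sees it.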



[Mesa-dev] [PATCH 32/64] genxml: Add enough XML for gens 4, 4.5, and 5 to get SURFACE_STATE

2016-06-11 Thread Jason Ekstrand
---
 src/intel/genxml/Makefile.am  |  3 +++
 src/intel/genxml/Makefile.sources |  3 +++
 src/intel/genxml/gen4.xml | 52 
 src/intel/genxml/gen45.xml| 56 +++
 src/intel/genxml/gen5.xml | 56 +++
 5 files changed, 170 insertions(+)
 create mode 100644 src/intel/genxml/gen4.xml
 create mode 100644 src/intel/genxml/gen45.xml
 create mode 100644 src/intel/genxml/gen5.xml

diff --git a/src/intel/genxml/Makefile.am b/src/intel/genxml/Makefile.am
index d6c1c5b..95c1ff9 100644
--- a/src/intel/genxml/Makefile.am
+++ b/src/intel/genxml/Makefile.am
@@ -35,6 +35,9 @@ $(BUILT_SOURCES): gen_pack_header.py
 CLEANFILES = $(BUILT_SOURCES)
 
 EXTRA_DIST = \
+   gen4.xml \
+   gen45.xml \
+   gen5.xml \
gen6.xml \
gen7.xml \
gen75.xml \
diff --git a/src/intel/genxml/Makefile.sources 
b/src/intel/genxml/Makefile.sources
index 9298b4a..86c0bbe 100644
--- a/src/intel/genxml/Makefile.sources
+++ b/src/intel/genxml/Makefile.sources
@@ -1,4 +1,7 @@
 GENXML_GENERATED_FILES = \
+   gen4_pack.h \
+   gen45_pack.h \
+   gen5_pack.h \
gen6_pack.h \
gen7_pack.h \
gen75_pack.h \
diff --git a/src/intel/genxml/gen4.xml b/src/intel/genxml/gen4.xml
new file mode 100644
index 000..1f89b1d
--- /dev/null
+++ b/src/intel/genxml/gen4.xml
@@ -0,0 +1,52 @@
+[XML body not preserved in this archive: 52 added lines]
diff --git a/src/intel/genxml/gen45.xml b/src/intel/genxml/gen45.xml
new file mode 100644
index 000..973b3bb
--- /dev/null
+++ b/src/intel/genxml/gen45.xml
@@ -0,0 +1,56 @@
+[XML body not preserved in this archive: 56 added lines]
diff --git a/src/intel/genxml/gen5.xml b/src/intel/genxml/gen5.xml
new file mode 100644
index 000..37e1ac4
--- /dev/null
+++ b/src/intel/genxml/gen5.xml
@@ -0,0 +1,56 @@
+[XML body not preserved in this archive: 56 added lines]
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 18/64] isl/state: Don't force-disable L2 bypass for everything

2016-06-11 Thread Jason Ekstrand
We already set the bit in the few cases where the docs require it, so
there's no need to set it all the time.  This has no noticeable perf impact
for Dota 2 on Vulkan with the time demo I have.
---
 src/intel/isl/isl_surface_state.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/src/intel/isl/isl_surface_state.c 
b/src/intel/isl/isl_surface_state.c
index aa720d8..60bfced 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -310,10 +310,6 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
void *state,
 #endif
 
 #if GEN_GEN >= 8
-   s.SamplerL2BypassModeDisable = true;
-#endif
-
-#if GEN_GEN >= 8
s.RenderCacheReadWriteMode = WriteOnlyCache;
 #else
s.RenderCacheReadWriteMode = 0;
@@ -426,7 +422,6 @@ isl_genX(buffer_fill_state_s)(void *state,
 #endif
 
 #if (GEN_GEN >= 8)
-  .SamplerL2BypassModeDisable = true,
   .RenderCacheReadWriteMode = WriteOnlyCache,
 #else
   .RenderCacheReadWriteMode = 0,
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 14/64] isl/state: Put pitch calculations together

2016-06-11 Thread Jason Ekstrand
---
 src/intel/isl/isl_surface_state.c | 42 +++
 1 file changed, 20 insertions(+), 22 deletions(-)

diff --git a/src/intel/isl/isl_surface_state.c 
b/src/intel/isl/isl_surface_state.c
index 0ada3e4..50570aa 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -291,6 +291,26 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
void *state,
s.SurfaceVerticalAlignment = valign;
s.SurfaceHorizontalAlignment = halign;
 
+   if (info->surf->tiling == ISL_TILING_W) {
+  /* From the Broadwell PRM documentation for this field:
+   *
+   *"If the surface is a stencil buffer (and thus has Tile Mode set
+   *to TILEMODE_WMAJOR), the pitch must be set to 2x the value
+   *computed based on width, as the stencil buffer is stored with
+   *two rows interleaved."
+   */
+  s.SurfacePitch = info->surf->row_pitch * 2 - 1;
+   } else {
+  s.SurfacePitch = info->surf->row_pitch - 1;
+   }
+
+#if GEN_GEN >= 8
+   s.SurfaceQPitch = get_qpitch(info->surf) >> 2;
+#elif GEN_GEN == 7
+   s.SurfaceArraySpacing = info->surf->array_pitch_span ==
+   ISL_ARRAY_PITCH_SPAN_COMPACT;
+#endif
+
 #if GEN_GEN >= 8
s.TileMode = isl_to_gen_tiling[info->surf->tiling];
 #else
@@ -299,11 +319,6 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
void *state,
  TILEWALK_YMAJOR;
 #endif
 
-#if (GEN_GEN == 7)
-   s.SurfaceArraySpacing = info->surf->array_pitch_span ==
-   ISL_ARRAY_PITCH_SPAN_COMPACT;
-#endif
-
 #if GEN_GEN >= 8
s.SamplerL2BypassModeDisable = true;
 #endif
@@ -325,10 +340,6 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
void *state,
s.CubeFaceEnables = 0x3f;
 #endif
 
-#if GEN_GEN >= 8
-   s.SurfaceQPitch = get_qpitch(info->surf) >> 2;
-#endif
-
s.MultisampledSurfaceStorageFormat =
   isl_to_gen_multisample_layout[info->surf->msaa_layout];
s.NumberofMultisamples = ffs(info->surf->samples) - 1;
@@ -349,19 +360,6 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
void *state,
s.MCSEnable = false;
 #endif
 
-   if (info->surf->tiling == ISL_TILING_W) {
-  /* From the Broadwell PRM documentation for this field:
-   *
-   *"If the surface is a stencil buffer (and thus has Tile Mode set
-   *to TILEMODE_WMAJOR), the pitch must be set to 2x the value
-   *computed based on width, as the stencil buffer is stored with
-   *two rows interleaved."
-   */
-  s.SurfacePitch = info->surf->row_pitch * 2 - 1;
-   } else {
-  s.SurfacePitch = info->surf->row_pitch - 1;
-   }
-
 #if GEN_GEN >= 8
/* From the CHV PRM, Volume 2d, page 321 (RENDER_SURFACE_STATE dword 0
 * bit 9 "Sampler L2 Bypass Mode Disable" Programming Notes):
-- 
2.5.0.400.gff86faf
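
The pitch rule grouped by this patch can be stated as a small pure function: the field holds the pitch minus one, and W-tiled (stencil) surfaces double the pitch first. This is a sketch only; `sketch_surface_pitch_field` and its parameters are hypothetical names, not part of ISL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Computes the value for the SurfacePitch field from a row pitch in bytes.
 * W-tiled stencil buffers store two rows interleaved, so the pitch is
 * doubled before the usual minus-one encoding is applied. */
static uint32_t
sketch_surface_pitch_field(uint32_t row_pitch, bool w_tiled)
{
   assert(row_pitch > 0);
   const uint32_t pitch = w_tiled ? row_pitch * 2 : row_pitch;
   return pitch - 1;
}
```

For a row pitch of 64 bytes this gives 63 for an ordinary surface and 127 for a W-tiled one.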



[Mesa-dev] [PATCH 42/64] i965/miptree: Add a helper for getting the aux isl_surf from a miptree

2016-06-11 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 70 +++
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  4 ++
 2 files changed, 74 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 8a746ec..e3e8cf6 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3167,6 +3167,76 @@ intel_miptree_get_isl_surf(struct brw_context *brw,
surf->usage = 0; /* TODO */
 }
 
+/* WARNING: THE SURFACE CREATED BY THIS FUNCTION IS NOT COMPLETE AND CANNOT BE
+ * USED FOR ANY REAL CALCULATIONS.  THE ONLY VALID USE OF SUCH A SURFACE IS TO
+ * PASS IT INTO isl_surf_fill_state.
+ */
+void
+intel_miptree_get_ccs_isl_surf(struct brw_context *brw,
+   const struct intel_mipmap_tree *mt,
+   struct isl_surf *surf)
+{
+   /* Much is the same as the regular surface */
+   intel_miptree_get_isl_surf(brw, mt->mcs_mt, surf);
+
+   switch (mt->num_samples) {
+   case 0:
+   case 1:
+  /*
+   * From the BDW PRM, Volume 2d, page 260 (RENDER_SURFACE_STATE):
+   * "When MCS is enabled for non-MSRT, HALIGN_16 must be used"
+   *
+   * From the hardware spec for GEN9:
+   * "When Auxiliary Surface Mode is set to AUX_CCS_D or AUX_CCS_E, HALIGN
+   *  16 must be used."
+   */
+  if (brw->gen >= 9 || mt->num_samples == 1)
+ assert(mt->halign == 16);
+
+  if (intel_miptree_is_lossless_compressed(brw, mt)) {
+ assert(mt->tiling == I915_TILING_Y);
+ switch (_mesa_get_format_bytes(mt->format)) {
+ case 4:  surf->format = ISL_FORMAT_NOMSRT_CCS_E_32BPP;   break;
+ case 8:  surf->format = ISL_FORMAT_NOMSRT_CCS_E_64BPP;   break;
+ case 16: surf->format = ISL_FORMAT_NOMSRT_CCS_E_128BPP;  break;
+ default:
+unreachable("Invalid format size for color compression");
+ }
+  } else if (mt->tiling == I915_TILING_Y) {
+ switch (_mesa_get_format_bytes(mt->format)) {
+ case 4:  surf->format = ISL_FORMAT_NOMSRT_CCS_D_32BPP_Y;break;
+ case 8:  surf->format = ISL_FORMAT_NOMSRT_CCS_D_64BPP_Y;break;
+ case 16: surf->format = ISL_FORMAT_NOMSRT_CCS_D_128BPP_Y;   break;
+ default:
+unreachable("Invalid format size for color compression");
+ }
+  } else {
+ assert(mt->tiling == I915_TILING_X);
+ switch (_mesa_get_format_bytes(mt->format)) {
+ case 4:  surf->format = ISL_FORMAT_NOMSRT_CCS_D_32BPP_X;break;
+ case 8:  surf->format = ISL_FORMAT_NOMSRT_CCS_D_64BPP_X;break;
+ case 16: surf->format = ISL_FORMAT_NOMSRT_CCS_D_128BPP_X;   break;
+ default:
+unreachable("Invalid format size for color compression");
+ }
+  }
+  break;
+
+   case 2:
+   case 4:
+  surf->format = ISL_FORMAT_R8_UINT;
+  break;
+   case 8:
+  surf->format = ISL_FORMAT_R32_UINT;
+  break;
+   case 16:
+  surf->format = ISL_FORMAT_R32G32_UINT;
+  break;
+   default:
+  unreachable("Invalid number of samples");
+   }
+}
+
 union isl_color_value
 intel_miptree_get_isl_clear_color(struct brw_context *brw,
   const struct intel_mipmap_tree *mt)
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index a50f181..683abf3 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -801,6 +801,10 @@ void
 intel_miptree_get_isl_surf(struct brw_context *brw,
const struct intel_mipmap_tree *mt,
struct isl_surf *surf);
+void
+intel_miptree_get_ccs_isl_surf(struct brw_context *brw,
+   const struct intel_mipmap_tree *mt,
+   struct isl_surf *surf);
 
 union isl_color_value
 intel_miptree_get_isl_clear_color(struct brw_context *brw,
-- 
2.5.0.400.gff86faf
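
The non-MSAA format choice in this helper depends on three inputs: bytes per pixel, tiling, and whether lossless compression is in use. A table-style sketch of that selection follows. All `sketch_` names are hypothetical, and returning an invalid sentinel is an assumption made to keep the example testable; the patch itself uses assert() and unreachable() instead.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_tiling { SKETCH_TILING_X, SKETCH_TILING_Y };

/* Hypothetical mirror of the CCS formats; the order within each group of
 * three (32, 64, 128 bpp) is relied upon below. */
enum sketch_ccs_format {
   SKETCH_CCS_D_32BPP_X, SKETCH_CCS_D_64BPP_X, SKETCH_CCS_D_128BPP_X,
   SKETCH_CCS_D_32BPP_Y, SKETCH_CCS_D_64BPP_Y, SKETCH_CCS_D_128BPP_Y,
   SKETCH_CCS_E_32BPP,   SKETCH_CCS_E_64BPP,   SKETCH_CCS_E_128BPP,
   SKETCH_CCS_INVALID,
};

static enum sketch_ccs_format
sketch_pick_ccs_format(unsigned bytes_per_pixel, enum sketch_tiling tiling,
                       bool lossless_compressed)
{
   unsigned size_idx;
   switch (bytes_per_pixel) {
   case 4:  size_idx = 0; break;
   case 8:  size_idx = 1; break;
   case 16: size_idx = 2; break;
   default: return SKETCH_CCS_INVALID;
   }

   if (lossless_compressed) {
      /* The lossless (CCS_E) path only accepts Y tiling. */
      if (tiling != SKETCH_TILING_Y)
         return SKETCH_CCS_INVALID;
      return (enum sketch_ccs_format)(SKETCH_CCS_E_32BPP + size_idx);
   }

   /* CCS_D: pick the X or Y group, then offset by pixel size. */
   const enum sketch_ccs_format base =
      tiling == SKETCH_TILING_Y ? SKETCH_CCS_D_32BPP_Y : SKETCH_CCS_D_32BPP_X;
   return (enum sketch_ccs_format)(base + size_idx);
}
```

Deriving the format from a base plus a size index makes it harder for a single case to pick a format from the wrong tiling group.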



[Mesa-dev] [PATCH 02/64] i965/blorp/gen8: Use the correct max level and layer in emit_surface_states

2016-06-11 Thread Jason Ekstrand
We were adding in the base, which is wrong because the values given in the
miptree are relative to zero, not to the base layer/level.

Cc: "11.1 11.2 12.0" 
---
 src/mesa/drivers/dri/i965/gen8_blorp.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_blorp.c 
b/src/mesa/drivers/dri/i965/gen8_blorp.c
index a9a400d..fcf5a53 100644
--- a/src/mesa/drivers/dri/i965/gen8_blorp.c
+++ b/src/mesa/drivers/dri/i965/gen8_blorp.c
@@ -627,13 +627,12 @@ gen8_blorp_emit_surface_states(struct brw_context *brw,
mt->target == GL_TEXTURE_CUBE_MAP;
   const unsigned depth = (is_cube ? 6 : 1) * mt->logical_depth0;
   const GLenum target = is_cube ? GL_TEXTURE_2D_ARRAY : mt->target;
-  const unsigned max_level = surface->level + mt->last_level + 1;
   const unsigned layer = mt->target != GL_TEXTURE_3D ?
 surface->layer / layer_divider : 0;
 
   brw->vtbl.emit_texture_surface_state(brw, mt, target,
-   layer, layer + depth,
-   surface->level, max_level,
+   layer, depth,
+   surface->level, mt->last_level + 1,
surface->brw_surfaceformat,
surface->swizzle,
_surf_offset_texture,
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 13/64] isl/state: Put all dimension setup together and towards the top

2016-06-11 Thread Jason Ekstrand
---
 src/intel/isl/isl_surface_state.c | 154 ++
 1 file changed, 74 insertions(+), 80 deletions(-)

diff --git a/src/intel/isl/isl_surface_state.c 
b/src/intel/isl/isl_surface_state.c
index 0f21e34..0ada3e4 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -213,7 +213,81 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
void *state,
   s.SurfaceFormat = info->view->format;
}
 
+   s.Width = info->surf->logical_level0_px.width - 1;
+   s.Height = info->surf->logical_level0_px.height - 1;
+
+   switch (s.SurfaceType) {
+   case SURFTYPE_1D:
+   case SURFTYPE_2D:
+  s.MinimumArrayElement = info->view->base_array_layer;
+
+  /* From the Broadwell PRM >> RENDER_SURFACE_STATE::Depth:
+   *
+   *For SURFTYPE_1D, 2D, and CUBE: The range of this field is reduced
+   *by one for each increase from zero of Minimum Array Element. For
+   *example, if Minimum Array Element is set to 1024 on a 2D surface,
+   *the range of this field is reduced to [0,1023].
+   *
+   * In other words, 'Depth' is the number of array layers.
+   */
+  s.Depth = info->view->array_len - 1;
+
+  /* From the Broadwell PRM >> 
RENDER_SURFACE_STATE::RenderTargetViewExtent:
+   *
+   *For Render Target and Typed Dataport 1D and 2D Surfaces:
+   *This field must be set to the same value as the Depth field.
+   */
+  s.RenderTargetViewExtent = s.Depth;
+  break;
+   case SURFTYPE_CUBE:
+  s.MinimumArrayElement = info->view->base_array_layer;
+  /* Same as SURFTYPE_2D, but divided by 6 */
+  s.Depth = info->view->array_len / 6 - 1;
+  s.RenderTargetViewExtent = s.Depth;
+  break;
+   case SURFTYPE_3D:
+  s.MinimumArrayElement = info->view->base_array_layer;
+
+  /* From the Broadwell PRM >> RENDER_SURFACE_STATE::Depth:
+   *
+   *If the volume texture is MIP-mapped, this field specifies the
+   *depth of the base MIP level.
+   */
+  s.Depth = info->surf->logical_level0_px.depth - 1;
+
+  /* From the Broadwell PRM >> 
RENDER_SURFACE_STATE::RenderTargetViewExtent:
+   *
+   *For Render Target and Typed Dataport 3D Surfaces: This field
+   *indicates the extent of the accessible 'R' coordinates minus 1 on
+   *the LOD currently being rendered to.
+   */
+  s.RenderTargetViewExtent = 
isl_minify(info->surf->logical_level0_px.depth,
+info->view->base_level) - 1;
+  break;
+   default:
+  unreachable("bad SurfaceType");
+   }
+
s.SurfaceArray = info->surf->phys_level0_sa.array_len > 1;
+
+   if (info->view->usage & ISL_SURF_USAGE_RENDER_TARGET_BIT) {
+  /* For render target surfaces, the hardware interprets field
+   * MIPCount/LOD as LOD. The Broadwell PRM says:
+   *
+   *MIPCountLOD defines the LOD that will be rendered into.
+   *SurfaceMinLOD is ignored.
+   */
+  s.MIPCountLOD = info->view->base_level;
+  s.SurfaceMinLOD = 0;
+   } else {
+  /* For non render target surfaces, the hardware interprets field
+   * MIPCount/LOD as MIPCount.  The range of levels accessible by the
+   * sampler engine is [SurfaceMinLOD, SurfaceMinLOD + MIPCountLOD].
+   */
+  s.SurfaceMinLOD = info->view->base_level;
+  s.MIPCountLOD = MAX(info->view->levels, 1) - 1;
+   }
+
s.SurfaceVerticalAlignment = valign;
s.SurfaceHorizontalAlignment = halign;
 
@@ -255,20 +329,10 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
void *state,
s.SurfaceQPitch = get_qpitch(info->surf) >> 2;
 #endif
 
-   s.Width = info->surf->logical_level0_px.width - 1;
-   s.Height = info->surf->logical_level0_px.height - 1;
-   s.Depth = 0; /* TEMPLATE */
-
-   s.RenderTargetViewExtent = 0; /* TEMPLATE */
-   s.MinimumArrayElement = 0; /* TEMPLATE */
-
s.MultisampledSurfaceStorageFormat =
   isl_to_gen_multisample_layout[info->surf->msaa_layout];
s.NumberofMultisamples = ffs(info->surf->samples) - 1;
 
-   s.MIPCountLOD = 0; /* TEMPLATE */
-   s.SurfaceMinLOD = 0; /* TEMPLATE */
-
 #if (GEN_GEN >= 8 || GEN_IS_HASWELL)
s.ShaderChannelSelectRed = info->view->channel_select[0];
s.ShaderChannelSelectGreen = info->view->channel_select[1];
@@ -298,76 +362,6 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
void *state,
   s.SurfacePitch = info->surf->row_pitch - 1;
}
 
-   switch (s.SurfaceType) {
-   case SURFTYPE_1D:
-   case SURFTYPE_2D:
-  s.MinimumArrayElement = info->view->base_array_layer;
-
-  /* From the Broadwell PRM >> RENDER_SURFACE_STATE::Depth:
-   *
-   *For SURFTYPE_1D, 2D, and CUBE: The range of this field is reduced
-   *by one for each increase from zero of Minimum Array Element. For
-   *example, if Minimum Array Element is set to 1024 on a 2D surface,
-   *the range of this 

[Mesa-dev] [PATCH 07/64] i965/gen7, 8: Set SURFACE_IS_ARRAY for all non-3D texture types

2016-06-11 Thread Jason Ekstrand
There's no real reason why we shouldn't set this bit.  It does affect how
the sampler operates a bit but since you can have a 2D non-array view of a
2D_ARRAY texture that distinction is very weak.  Also, this is what ISL
will do and we would like this change to be isolated from using ISL.
---
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 2 +-
 src/mesa/drivers/dri/i965/gen8_surface_state.c| 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index 60589bc..2a7ae31 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -295,7 +295,7 @@ gen7_emit_texture_surface_state(struct brw_context *brw,
if (mt->halign == 8)
   surf[0] |= GEN7_SURFACE_HALIGN_8;
 
-   if (_mesa_is_array_texture(target) || target == GL_TEXTURE_CUBE_MAP)
+   if (mt->target != GL_TEXTURE_3D)
   surf[0] |= GEN7_SURFACE_IS_ARRAY;
 
if (mt->array_layout == ALL_SLICES_AT_EACH_LOD)
diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c 
b/src/mesa/drivers/dri/i965/gen8_surface_state.c
index abd6016..f4375ea 100644
--- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
@@ -297,7 +297,7 @@ gen8_emit_texture_surface_state(struct brw_context *brw,
 format == BRW_SURFACEFORMAT_BC7_UNORM))
   surf[0] |= GEN8_SURFACE_SAMPLER_L2_BYPASS_DISABLE;
 
-   if (_mesa_is_array_texture(mt->target) || mt->target == GL_TEXTURE_CUBE_MAP)
+   if (mt->target != GL_TEXTURE_3D)
   surf[0] |= GEN8_SURFACE_IS_ARRAY;
 
surf[1] = SET_FIELD(mocs_wb, GEN8_SURFACE_MOCS) | mt->qpitch >> 2;
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 06/64] i965/gen4: Subtract 1 from buffer sizes

2016-06-11 Thread Jason Ekstrand
Cc: "11.1 11.2 12.0" 
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 6bb1bec..e1f4bcb 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -237,9 +237,9 @@ gen4_emit_buffer_surface_state(struct brw_context *brw,
  surface_format << BRW_SURFACE_FORMAT_SHIFT |
  (brw->gen >= 6 ? BRW_SURFACE_RC_READ_WRITE : 0);
surf[1] = (bo ? bo->offset64 : 0) + buffer_offset; /* reloc */
-   surf[2] = (buffer_size & 0x7f) << BRW_SURFACE_WIDTH_SHIFT |
- ((buffer_size >> 7) & 0x1fff) << BRW_SURFACE_HEIGHT_SHIFT;
-   surf[3] = ((buffer_size >> 20) & 0x7f) << BRW_SURFACE_DEPTH_SHIFT |
+   surf[2] = ((buffer_size - 1) & 0x7f) << BRW_SURFACE_WIDTH_SHIFT |
+ (((buffer_size - 1) >> 7) & 0x1fff) << BRW_SURFACE_HEIGHT_SHIFT;
+   surf[3] = (((buffer_size - 1) >> 20) & 0x7f) << BRW_SURFACE_DEPTH_SHIFT |
  (pitch - 1) << BRW_SURFACE_PITCH_SHIFT;
 
/* Emit relocation to surface contents.  The 965 PRM, Volume 4, section
-- 
2.5.0.400.gff86faf
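
The diff above splits `buffer_size - 1` across three bitfields: the low 7 bits go to width, the next 13 to height, and the 7 bits starting at bit 20 to depth. A small sketch of that split, using the same masks and shifts as the patch; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct sketch_buffer_dims {
   uint32_t width;   /* bits  6:0  of (size - 1) */
   uint32_t height;  /* bits 19:7  of (size - 1) */
   uint32_t depth;   /* bits 26:20 of (size - 1) */
};

/* Encodes a buffer size as the three minus-one style fields. */
static struct sketch_buffer_dims
sketch_split_buffer_size(uint32_t buffer_size)
{
   assert(buffer_size > 0);
   const uint32_t n = buffer_size - 1;
   struct sketch_buffer_dims d = {
      .width  = n & 0x7f,
      .height = (n >> 7) & 0x1fff,
      .depth  = (n >> 20) & 0x7f,
   };
   return d;
}
```

Without the subtraction, a 128-entry buffer would encode as width 0 and height 1, which under the minus-one convention describes 129 entries.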



[Mesa-dev] [PATCH 04/64] i965/fs: Use a default Y coordinate of 0 for TXF on gen9+

2016-06-11 Thread Jason Ekstrand
Previously, we were incrementing length but not actually putting anything
in the Y coordinate.  This meant that 1-D TXF operations had a garbage
array index.  If the surface is emitted as 1-D non-array, the coordinate
gets discarded and it works fine.  If it happens to be bound as an array
surface, it may count as an out-of-bounds array access and you get zero.

Cc: "11.1 11.2 12.0" 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 4b29ee5..5d3d4d0 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -4274,6 +4274,8 @@ lower_sampler_logical_send_gen7(const fs_builder &bld, 
fs_inst *inst, opcode op,
  if (coord_components >= 2) {
 bld.MOV(retype(sources[length], BRW_REGISTER_TYPE_D),
 offset(coordinate, bld, 1));
+ } else {
+sources[length] = brw_imm_d(0);
  }
  length++;
   }
-- 
2.5.0.400.gff86faf
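
The bug pattern here is a payload slot that is counted but never written. A simplified sketch, with plain integers standing in for registers; `sketch_append_y_coord` is a hypothetical helper, not the i965 code.

```c
#include <assert.h>
#include <stdint.h>

/* Appends the Y coordinate to a message payload and returns the new length.
 * When the coordinate has fewer than two components, the slot is written
 * with 0 rather than left holding whatever was there before. */
static unsigned
sketch_append_y_coord(int32_t *payload, unsigned length,
                      const int32_t *coord, unsigned coord_components)
{
   payload[length] = coord_components >= 2 ? coord[1] : 0;
   return length + 1;
}
```

The length grows by one in both cases, so the only difference from the old behavior is that the 1-D case now has a defined value in the slot.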



[Mesa-dev] [PATCH 03/64] i965/gen8: Use the qpitch from the aux_mt for AUX_QPITCH

2016-06-11 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/gen8_surface_state.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c 
b/src/mesa/drivers/dri/i965/gen8_surface_state.c
index ee4781b..6a98d76 100644
--- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
@@ -325,7 +325,7 @@ gen8_emit_texture_surface_state(struct brw_context *brw,
   assert(aux_mt->tiling == I915_TILING_Y);
   intel_get_tile_dims(aux_mt->tiling, aux_mt->tr_mode,
   aux_mt->cpp, &tile_w, &tile_h);
-  surf[6] = SET_FIELD(mt->qpitch / 4, GEN8_SURFACE_AUX_QPITCH) |
+  surf[6] = SET_FIELD(aux_mt->qpitch / 4, GEN8_SURFACE_AUX_QPITCH) |
 SET_FIELD((aux_mt->pitch / tile_w) - 1,
   GEN8_SURFACE_AUX_PITCH) |
 aux_mode;
@@ -546,7 +546,7 @@ gen8_update_renderbuffer_surface(struct brw_context *brw,
   assert(aux_mt->tiling == I915_TILING_Y);
   intel_get_tile_dims(aux_mt->tiling, aux_mt->tr_mode,
   aux_mt->cpp, &tile_w, &tile_h);
-  surf[6] = SET_FIELD(mt->qpitch / 4, GEN8_SURFACE_AUX_QPITCH) |
+  surf[6] = SET_FIELD(aux_mt->qpitch / 4, GEN8_SURFACE_AUX_QPITCH) |
 SET_FIELD((aux_mt->pitch / tile_w) - 1,
   GEN8_SURFACE_AUX_PITCH) |
 aux_mode;
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 37/64] isl: Add surface formats for non-MSAA CCS surfaces

2016-06-11 Thread Jason Ekstrand
---
 src/intel/isl/isl.h | 16 
 src/intel/isl/isl_format_layout.csv |  9 +
 2 files changed, 25 insertions(+)

diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
index 3bf7469..36038bc 100644
--- a/src/intel/isl/isl.h
+++ b/src/intel/isl/isl.h
@@ -349,6 +349,22 @@ enum isl_format {
ISL_FORMAT_ASTC_LDR_2D_12X10_FLT16 =638,
ISL_FORMAT_ASTC_LDR_2D_12X12_FLT16 =639,
 
+   /* The formats that follow are internal to ISL and as such don't have an
+* explicit number.  We'll just let the C compiler assign it for us.  Any
+* actual hardware formats *must* come before these in the list.
+*/
+
+   /* Formats for representing a non-MSAA color control surface */
+   ISL_FORMAT_NOMSRT_CCS_D_32BPP_X,
+   ISL_FORMAT_NOMSRT_CCS_D_64BPP_X,
+   ISL_FORMAT_NOMSRT_CCS_D_128BPP_X,
+   ISL_FORMAT_NOMSRT_CCS_D_32BPP_Y,
+   ISL_FORMAT_NOMSRT_CCS_D_64BPP_Y,
+   ISL_FORMAT_NOMSRT_CCS_D_128BPP_Y,
+   ISL_FORMAT_NOMSRT_CCS_E_32BPP,
+   ISL_FORMAT_NOMSRT_CCS_E_64BPP,
+   ISL_FORMAT_NOMSRT_CCS_E_128BPP,
+
/* Hardware doesn't understand this out-of-band value */
ISL_FORMAT_UNSUPPORTED = UINT16_MAX,
 };
diff --git a/src/intel/isl/isl_format_layout.csv 
b/src/intel/isl/isl_format_layout.csv
index f90fbe0..a39093e 100644
--- a/src/intel/isl/isl_format_layout.csv
+++ b/src/intel/isl/isl_format_layout.csv
@@ -314,3 +314,12 @@ ASTC_LDR_2D_10X8_FLT16  , 128, 10,  8,  1, sf16, sf16, 
sf16, sf16, ,
 ASTC_LDR_2D_10X10_FLT16 , 128, 10, 10,  1, sf16, sf16, sf16, sf16, ,   
  ,, linear,  astc
 ASTC_LDR_2D_12X10_FLT16 , 128, 12, 10,  1, sf16, sf16, sf16, sf16, ,   
  ,, linear,  astc
 ASTC_LDR_2D_12X12_FLT16 , 128, 12, 12,  1, sf16, sf16, sf16, sf16, ,   
  ,, linear,  astc
+NOMSRT_CCS_D_32BPP_X,   8,  8,  4,  1,
+NOMSRT_CCS_D_64BPP_X,   8,  4,  4,  1,
+NOMSRT_CCS_D_128BPP_X   ,   8,  2,  4,  1,
+NOMSRT_CCS_D_32BPP_Y,   8, 16,  2,  1,
+NOMSRT_CCS_D_64BPP_Y,   8,  8,  2,  1,
+NOMSRT_CCS_D_128BPP_Y   ,   8,  4,  2,  1,
+NOMSRT_CCS_E_32BPP  ,  16, 16,  2,  1,
+NOMSRT_CCS_E_64BPP  ,  16,  8,  2,  1,
+NOMSRT_CCS_E_128BPP ,  16,  4,  2,  1,
-- 
2.5.0.400.gff86faf
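
The enum comment in this patch relies on a C rule: an enumerator without an explicit value takes the previous enumerator's value plus one. A small sketch with hypothetical names shows the effect, assuming the last hardware format is 639 as in the hunk above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum sketch_format {
   SKETCH_FORMAT_HW_LAST = 639,      /* last explicitly numbered value */
   SKETCH_FORMAT_INTERNAL_A,         /* compiler assigns 640 */
   SKETCH_FORMAT_INTERNAL_B,         /* compiler assigns 641 */
   SKETCH_FORMAT_UNSUPPORTED = UINT16_MAX,
};

/* True for formats that exist only inside the library and must never be
 * handed to hardware. */
static bool
sketch_format_is_internal(enum sketch_format f)
{
   return f > SKETCH_FORMAT_HW_LAST && f != SKETCH_FORMAT_UNSUPPORTED;
}
```

This is why real hardware formats must come first: any hardware value added after the internal entries would shift or collide with the compiler-assigned numbers.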



[Mesa-dev] [PATCH 10/64] isl/state: Don't use designated initializers for the surface state

2016-06-11 Thread Jason Ekstrand
While designated initializers are nice, they also force us to put some
things in the initializer and some things later.  Surface state setup is
complicated enough that this really hurts readability in the long run.
---
 src/intel/isl/isl_surface_state.c | 95 ---
 1 file changed, 48 insertions(+), 47 deletions(-)

diff --git a/src/intel/isl/isl_surface_state.c 
b/src/intel/isl/isl_surface_state.c
index 51c5953..ae8096f 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -202,89 +202,90 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
void *state,
uint32_t halign, valign;
get_halign_valign(info->surf, &halign, &valign);
 
-   struct GENX(RENDER_SURFACE_STATE) s = {
-  .SurfaceType = get_surftype(info->surf->dim, info->view->usage),
-  .SurfaceArray = info->surf->phys_level0_sa.array_len > 1,
-  .SurfaceVerticalAlignment = valign,
-  .SurfaceHorizontalAlignment = halign,
+   struct GENX(RENDER_SURFACE_STATE) s = { 0 };
+
+   s.SurfaceType = get_surftype(info->surf->dim, info->view->usage);
+
+   s.SurfaceArray = info->surf->phys_level0_sa.array_len > 1;
+   s.SurfaceVerticalAlignment = valign;
+   s.SurfaceHorizontalAlignment = halign;
 
 #if GEN_GEN >= 8
-  .TileMode = isl_to_gen_tiling[info->surf->tiling],
+   s.TileMode = isl_to_gen_tiling[info->surf->tiling];
 #else
-  .TiledSurface = info->surf->tiling != ISL_TILING_LINEAR,
-  .TileWalk = info->surf->tiling == ISL_TILING_X ? TILEWALK_XMAJOR :
-   TILEWALK_YMAJOR,
+   s.TiledSurface = info->surf->tiling != ISL_TILING_LINEAR;
+   s.TileWalk = info->surf->tiling == ISL_TILING_X ? TILEWALK_XMAJOR :
+ TILEWALK_YMAJOR;
 #endif
 
-  .VerticalLineStride = 0,
-  .VerticalLineStrideOffset = 0,
+   s.VerticalLineStride = 0;
+   s.VerticalLineStrideOffset = 0;
 
 #if (GEN_GEN == 7)
-  .SurfaceArraySpacing = info->surf->array_pitch_span ==
- ISL_ARRAY_PITCH_SPAN_COMPACT,
+   s.SurfaceArraySpacing = info->surf->array_pitch_span ==
+   ISL_ARRAY_PITCH_SPAN_COMPACT;
 #endif
 
 #if GEN_GEN >= 8
-  .SamplerL2BypassModeDisable = true,
+   s.SamplerL2BypassModeDisable = true;
 #endif
 
 #if GEN_GEN >= 8
-  .RenderCacheReadWriteMode = WriteOnlyCache,
+   s.RenderCacheReadWriteMode = WriteOnlyCache;
 #else
-  .RenderCacheReadWriteMode = 0,
+   s.RenderCacheReadWriteMode = 0;
 #endif
 
 #if GEN_GEN >= 8
-  .CubeFaceEnablePositiveZ = 1,
-  .CubeFaceEnableNegativeZ = 1,
-  .CubeFaceEnablePositiveY = 1,
-  .CubeFaceEnableNegativeY = 1,
-  .CubeFaceEnablePositiveX = 1,
-  .CubeFaceEnableNegativeX = 1,
+   s.CubeFaceEnablePositiveZ = 1;
+   s.CubeFaceEnableNegativeZ = 1;
+   s.CubeFaceEnablePositiveY = 1;
+   s.CubeFaceEnableNegativeY = 1;
+   s.CubeFaceEnablePositiveX = 1;
+   s.CubeFaceEnableNegativeX = 1;
 #else
-  .CubeFaceEnables = 0x3f,
+   s.CubeFaceEnables = 0x3f;
 #endif
 
 #if GEN_GEN >= 8
-  .SurfaceQPitch = get_qpitch(info->surf) >> 2,
+   s.SurfaceQPitch = get_qpitch(info->surf) >> 2;
 #endif
 
-  .Width = info->surf->logical_level0_px.width - 1,
-  .Height = info->surf->logical_level0_px.height - 1,
-  .Depth = 0, /* TEMPLATE */
+   s.Width = info->surf->logical_level0_px.width - 1;
+   s.Height = info->surf->logical_level0_px.height - 1;
+   s.Depth = 0; /* TEMPLATE */
 
-  .RenderTargetViewExtent = 0, /* TEMPLATE */
-  .MinimumArrayElement = 0, /* TEMPLATE */
+   s.RenderTargetViewExtent = 0; /* TEMPLATE */
+   s.MinimumArrayElement = 0; /* TEMPLATE */
 
-  .MultisampledSurfaceStorageFormat =
- isl_to_gen_multisample_layout[info->surf->msaa_layout],
-  .NumberofMultisamples = ffs(info->surf->samples) - 1,
-  .MultisamplePositionPaletteIndex = 0, /* UNUSED */
+   s.MultisampledSurfaceStorageFormat =
+  isl_to_gen_multisample_layout[info->surf->msaa_layout];
+   s.NumberofMultisamples = ffs(info->surf->samples) - 1;
+   s.MultisamplePositionPaletteIndex = 0; /* UNUSED */
 
-  .XOffset = 0,
-  .YOffset = 0,
+   s.XOffset = 0;
+   s.YOffset = 0;
 
-  .ResourceMinLOD = 0.0,
+   s.ResourceMinLOD = 0.0;
 
-  .MIPCountLOD = 0, /* TEMPLATE */
-  .SurfaceMinLOD = 0, /* TEMPLATE */
+   s.MIPCountLOD = 0; /* TEMPLATE */
+   s.SurfaceMinLOD = 0; /* TEMPLATE */
 
 #if (GEN_GEN >= 8 || GEN_IS_HASWELL)
-  .ShaderChannelSelectRed = info->view->channel_select[0],
-  .ShaderChannelSelectGreen = info->view->channel_select[1],
-  .ShaderChannelSelectBlue = info->view->channel_select[2],
-  .ShaderChannelSelectAlpha = info->view->channel_select[3],
+   s.ShaderChannelSelectRed = info->view->channel_select[0];
+   s.ShaderChannelSelectGreen = info->view->channel_select[1];
+   s.ShaderChannelSelectBlue = info->view->channel_select[2];
+   s.ShaderChannelSelectAlpha = 

[Mesa-dev] [PATCH 11/64] isl/state: Remove some unused fields

2016-06-11 Thread Jason Ekstrand
They're already zero-initialized and we have no plans of doing anything
more interesting with them.
---
 src/intel/isl/isl_surface_state.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/src/intel/isl/isl_surface_state.c 
b/src/intel/isl/isl_surface_state.c
index ae8096f..c36ef3b 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -218,9 +218,6 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
void *state,
  TILEWALK_YMAJOR;
 #endif
 
-   s.VerticalLineStride = 0;
-   s.VerticalLineStrideOffset = 0;
-
 #if (GEN_GEN == 7)
s.SurfaceArraySpacing = info->surf->array_pitch_span ==
ISL_ARRAY_PITCH_SPAN_COMPACT;
@@ -261,12 +258,6 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
void *state,
s.MultisampledSurfaceStorageFormat =
   isl_to_gen_multisample_layout[info->surf->msaa_layout];
s.NumberofMultisamples = ffs(info->surf->samples) - 1;
-   s.MultisamplePositionPaletteIndex = 0; /* UNUSED */
-
-   s.XOffset = 0;
-   s.YOffset = 0;
-
-   s.ResourceMinLOD = 0.0;
 
s.MIPCountLOD = 0; /* TEMPLATE */
s.SurfaceMinLOD = 0; /* TEMPLATE */
-- 
2.5.0.400.gff86faf
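
The removal is safe because `= { 0 }` zero-initializes every member that has no explicit initializer, so the explicit `= 0` assignments were redundant. A sketch with a hypothetical struct that borrows a few of the field names:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cut-down state struct. */
struct sketch_state {
   uint32_t XOffset;
   uint32_t YOffset;
   float    ResourceMinLOD;
   uint32_t Width;
};

/* Fills only the field that needs a non-zero value; everything else keeps
 * the zero it received from the initializer. */
static struct sketch_state
sketch_fill_state(uint32_t width_px)
{
   struct sketch_state s = { 0 };
   s.Width = width_px - 1;
   return s;
}
```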



[Mesa-dev] [PATCH 12/64] isl/state: Put surface format setup at the top

2016-06-11 Thread Jason Ekstrand
---
 src/intel/isl/isl_surface_state.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/intel/isl/isl_surface_state.c 
b/src/intel/isl/isl_surface_state.c
index c36ef3b..0f21e34 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -206,6 +206,13 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
void *state,
 
s.SurfaceType = get_surftype(info->surf->dim, info->view->usage);
 
+   if (info->view->usage & ISL_SURF_USAGE_STORAGE_BIT) {
+  s.SurfaceFormat =
+ isl_lower_storage_image_format(dev->info, info->view->format);
+   } else {
+  s.SurfaceFormat = info->view->format;
+   }
+
s.SurfaceArray = info->surf->phys_level0_sa.array_len > 1;
s.SurfaceVerticalAlignment = valign;
s.SurfaceHorizontalAlignment = halign;
@@ -291,13 +298,6 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
void *state,
   s.SurfacePitch = info->surf->row_pitch - 1;
}
 
-   if (info->view->usage & ISL_SURF_USAGE_STORAGE_BIT) {
-  s.SurfaceFormat =
- isl_lower_storage_image_format(dev->info, info->view->format);
-   } else {
-  s.SurfaceFormat = info->view->format;
-   }
-
switch (s.SurfaceType) {
case SURFTYPE_1D:
case SURFTYPE_2D:
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH 09/64] genxml/gen8, 9: Prefix the multisample format enum with MSFMT

2016-06-11 Thread Jason Ekstrand
This is what gen7 does and it's nice to have a prefix.
---
 src/intel/genxml/gen8.xml | 4 ++--
 src/intel/genxml/gen9.xml | 4 ++--
 src/intel/isl/isl_surface_state.c | 8 
 3 files changed, 4 insertions(+), 12 deletions(-)

diff --git a/src/intel/genxml/gen8.xml b/src/intel/genxml/gen8.xml
index 1d6a43f..09671ba 100644
--- a/src/intel/genxml/gen8.xml
+++ b/src/intel/genxml/gen8.xml
@@ -307,8 +307,8 @@
 
 
 
-  
-  
+  
+  
 
 
   
diff --git a/src/intel/genxml/gen9.xml b/src/intel/genxml/gen9.xml
index 2c01c56..f527838 100644
--- a/src/intel/genxml/gen9.xml
+++ b/src/intel/genxml/gen9.xml
@@ -313,8 +313,8 @@
 
 
 
-  
-  
+  
+  
 
 
   
diff --git a/src/intel/isl/isl_surface_state.c 
b/src/intel/isl/isl_surface_state.c
index e96d3b0..51c5953 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -76,19 +76,11 @@ static const uint8_t isl_to_gen_tiling[] = {
 };
 #endif
 
-#if GEN_GEN >= 8
-static const uint32_t isl_to_gen_multisample_layout[] = {
-   [ISL_MSAA_LAYOUT_NONE]   = MSS,
-   [ISL_MSAA_LAYOUT_INTERLEAVED]= DEPTH_STENCIL,
-   [ISL_MSAA_LAYOUT_ARRAY]  = MSS,
-};
-#else
 static const uint32_t isl_to_gen_multisample_layout[] = {
[ISL_MSAA_LAYOUT_NONE]   = MSFMT_MSS,
[ISL_MSAA_LAYOUT_INTERLEAVED]= MSFMT_DEPTH_STENCIL,
[ISL_MSAA_LAYOUT_ARRAY]  = MSFMT_MSS,
 };
-#endif
 
 static uint8_t
 get_surftype(enum isl_surf_dim dim, isl_surf_usage_flags_t usage)
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH 08/64] i965/blorp: Only set src_z for gen8+ 3D textures

2016-06-11 Thread Jason Ekstrand
Otherwise, we end up with a bogus value in the third component.  On gen6-7
where we always use 2D textures, this can cause problems if the
SurfaceArray bit is set in the SURFACE_STATE.
---
 src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
index 782d285..cdb6b33 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
@@ -1846,8 +1846,15 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
brw_blorp_setup_coord_transform(_push_consts.y_transform,
src_y0, src_y1, dst_y0, dst_y1, mirror_y);
 
-   params.wm_push_consts.src_z =
-  params.src.mt->target == GL_TEXTURE_3D ? params.src.layer : 0;
+   if (brw->gen >= 8 && params.src.mt->target == GL_TEXTURE_3D) {
+  /* On gen8+ we use actual 3-D textures so we need to pass the layer
+   * through to the sampler.
+   */
+  params.wm_push_consts.src_z = params.src.layer;
+   } else {
+  /* On gen7 and earlier, we fake everything with 2-D textures */
+  params.wm_push_consts.src_z = 0;
+   }
 
if (params.dst.num_samples <= 1 && dst_mt->num_samples > 1) {
   /* We must expand the rectangle we send through the rendering pipeline,
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH 31/64] genxml: Put append counter fields before MCS in RENDER_SURFACE_STATE on gen7

2016-06-11 Thread Jason Ekstrand
The pack header generation scripts can't handle the case where you have
two addresses in the same dword; they just take whatever is the last one.
This meant that the MCS address wasn't properly getting handled.  Since we
don't care about append counters, we can just re-arrange the XML for now.
---
 src/intel/genxml/gen7.xml  | 4 ++--
 src/intel/genxml/gen75.xml | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/intel/genxml/gen7.xml b/src/intel/genxml/gen7.xml
index 6f3e8cc..87057f3 100644
--- a/src/intel/genxml/gen7.xml
+++ b/src/intel/genxml/gen7.xml
@@ -394,10 +394,10 @@
 
 
 
-
-
 
 
+
+
 
 
 
diff --git a/src/intel/genxml/gen75.xml b/src/intel/genxml/gen75.xml
index ac1b6e4..dcceea5 100644
--- a/src/intel/genxml/gen75.xml
+++ b/src/intel/genxml/gen75.xml
@@ -405,10 +405,10 @@
 
 
 
-
-
 
 
+
+
 
 
 
-- 
2.5.0.400.gff86faf
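
As a rough illustration of the problem described in the commit message above
(not part of the patch, and not the real genxml generator): assume a packer
that remembers only one address field per dword, so a later address silently
replaces an earlier one.  The struct, the function, and the field names below
are hypothetical stand-ins chosen for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one field inside a single dword. */
struct field {
   const char *name;
   int is_address;
};

/* Toy model of a generator that keeps a single address per dword:
 * every address field overwrites the previous one, so the last wins. */
static const char *
address_for_dword(const struct field *fields, size_t count)
{
   const char *addr = NULL;
   for (size_t i = 0; i < count; i++) {
      if (fields[i].is_address)
         addr = fields[i].name;
   }
   return addr;
}

/* Illustrative field order with the MCS address listed first ... */
static const struct field mcs_first[] = {
   { "MCS Enable", 0 },
   { "MCS Base Address", 1 },
   { "Append Counter Enable", 0 },
   { "Append Counter Address", 1 },
};

/* ... and with the append counter fields moved ahead of MCS. */
static const struct field mcs_last[] = {
   { "Append Counter Enable", 0 },
   { "Append Counter Address", 1 },
   { "MCS Enable", 0 },
   { "MCS Base Address", 1 },
};
```

Under these assumptions, address_for_dword(mcs_first, 4) picks the append
counter address and drops the MCS one, while address_for_dword(mcs_last, 4)
picks the MCS address, which is why re-ordering the XML is enough as long as
nobody needs append counters.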


[Mesa-dev] [PATCH 01/64] i965: Drop Max3DTextureLevels to 512 on Sandy Bridge and prior

2016-06-11 Thread Jason Ekstrand
The RenderTargetViewExtent field of RENDER_SURFACE_STATE is supposed to be
set to the depth of a 3-D texture when rendering.  Unfortunately, that
field is only 9 bits on Sandy Bridge and prior so we can't actually bind
a 3-D texture for rendering if it has depth > 512.  On Ivy Bridge, this
field was bumped to 11 bits so we can go all the way up to 2048.

Cc: "11.1 11.2 12.0" 
---
 src/mesa/drivers/dri/i965/brw_context.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 7bbc128..3b11bef 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -467,7 +467,10 @@ brw_initialize_context_constants(struct brw_context *brw)
ctx->Const.MaxImageUnits = MAX_IMAGE_UNITS;
ctx->Const.MaxRenderbufferSize = 8192;
ctx->Const.MaxTextureLevels = MIN2(14 /* 8192 */, MAX_TEXTURE_LEVELS);
-   ctx->Const.Max3DTextureLevels = 12; /* 2048 */
+   if (brw->gen >= 7)
+  ctx->Const.Max3DTextureLevels = 12; /* 2048 */
+   else
+  ctx->Const.Max3DTextureLevels = 10; /* 512 */
ctx->Const.MaxCubeTextureLevels = 14; /* 8192 */
ctx->Const.MaxArrayTextureLayers = brw->gen >= 7 ? 2048 : 512;
ctx->Const.MaxTextureMbytes = 1536;
-- 
2.5.0.400.gff86faf
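
A small sketch of the arithmetic behind the numbers in this patch (not part of
the patch itself).  It assumes the hardware field stores the extent minus one,
so an N-bit field can describe at most 2^N slices, and that a power-of-two
maximum size S corresponds to log2(S) + 1 mip levels.  Both helper names are
made up for this example.

```c
#include <assert.h>

/* Largest extent a field of the given width can express, assuming the
 * field holds (extent - 1): 9 bits -> 512, 11 bits -> 2048. */
static unsigned
max_extent_for_field_bits(unsigned bits)
{
   assert(bits > 0 && bits < 32);
   return 1u << bits;
}

/* Number of mip levels for a power-of-two maximum size: log2(size) + 1. */
static unsigned
levels_for_max_size(unsigned size)
{
   assert(size > 0 && (size & (size - 1)) == 0);

   unsigned levels = 1;
   while (size > 1) {
      size >>= 1;
      levels++;
   }
   return levels;
}
```

With these helpers, a 9-bit field gives a maximum depth of 512 and therefore
10 levels, and an 11-bit field gives 2048 and 12 levels, matching the two
values the patch assigns to Max3DTextureLevels.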


[Mesa-dev] [PATCH 26/64] isl/state: Don't set SurfacePitch for gen9 1-D textures

2016-06-11 Thread Jason Ekstrand
This field is ignored by the hardware in this case and, on very large 1-D
textures, it can end up being larger than the maximum allowed value.
---
 src/intel/isl/isl_surface_state.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/intel/isl/isl_surface_state.c 
b/src/intel/isl/isl_surface_state.c
index e1159b2..8f223d1 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -298,6 +298,9 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
void *state,
*two rows interleaved."
*/
   s.SurfacePitch = info->surf->row_pitch * 2 - 1;
+   } else if (info->surf->dim_layout == ISL_DIM_LAYOUT_GEN9_1D) {
+  /* For gen9 1-D textures, surface pitch is ignored */
+  s.SurfacePitch = 0;
} else {
   s.SurfacePitch = info->surf->row_pitch - 1;
}
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH 00/64] i965: Start using ISL for filling out surface states

2016-06-11 Thread Jason Ekstrand
We would like to eventually start using ISL inside of the GL driver to
replace the fairly sprawling layout code in brw_tex_layout.c and
intel_mipmap_tree.c.  However, that is a very big change that no one is
ready to make yet.  A smaller change, I thought, would be to start using
ISL in blorp.  In order to do that, I needed a function to get an isl_surf
from an intel_mipmap_tree.  How do you test such a function to ensure that
it's working in all of the cases?  Use ISL for emitting all surface states
on everything and run it through Jenkins of course!  Hence this series.

This series is one of the most educational projects I've worked on in a
bit.  It turns out there are a lot of subtleties in surface layout and I
found bugs in both the i965 and ISL state setup code.  I've tried to keep
all of the functional changes contained to the first 8 or so patches which
only touch the GL driver.  That way those fixes can be back-ported to
stable and are bisectable.

The next 20 patches or so are general ISL cleanups and fixes.  If no one is
too opposed, I'd like to back-port the whole pile to 12.0.  There are two
reasons for this: First, ISL is new and this is a substantial cleanup;
back-porting it will keep the initial release of ISL cleaner and make
back-porting other patches easier in the future.  Second,
in the middle of the series are a couple of changes that fix some 850
Vulkan CTS tests on Haswell.

The next 9 patches add support to ISL for filling out surface states on
gen4, 4x, 5, and 6 as well as support for color compression.  I'm not sure
if the CCS formats are 100% correct or if that's even the exact approach we
want to take.  Chad, I'd like you to chip in here.

Finally, starting with blorp, we replace almost all of the surface state
setup code in i965 with paths based on ISL.  For textures/renderbuffers we
delete 1 path for gen4-5, 3 for gen6, 4 for gen7, and 3 for gen8 along with
3 different paths for emitting buffer surfaces.

As far as review goes, I'd like to get the i965 bugfixes and ISL cleanups
landed soon-ish and back-ported for 12.0.  Everything after that is a bit
more up-in-the-air.  It won't be all that hard to rebase because it's mostly
just whole-sale replacing the code we have with new code.

Cc: Chad Versace 
Cc: Nanley Chery 
Cc: Kenneth Graunke 
Cc: Topi Pohjolainen 

Jason Ekstrand (64):
  i965: Drop Max3DTextureLevels to 512 on Sandy Bridge and prior
  i965/blorp/gen8: Use the correct max level and layer in
emit_surface_states
  i965/gen8: Use the qpitch from the aux_mt for AUX_QPITCH
  i965/fs: Use a default Y coordinate of 0 for TXF on gen9+
  i965: Remove fake W-tiled render target support
  i965/gen4: Subtract 1 from buffer sizes
  i965/gen7,8: Set SURFACE_IS_ARRAY for all non-3D texture types
  i965/blorp: Only set src_z for gen8+ 3D textures
  genxml/gen8,9: Prefix the multisample format enum with MSFMT
  isl/state: Don't use designated initializers for the surface state
  isl/state: Remove some unused fields
  isl/state: Put surface format setup at the top
  isl/state: Put all dimension setup together and towards the top
  isl/state: Put pitch calculations together
  isl/state: Return an extent3d from the halign/valign helper
  isl/state: Refactor the per-gen isl_to_gen_h/valign tables
  isl/state: Refactor the setup of clear colors
  isl/state: Don't force-disable L2 bypass for everything
  isl/state: Set SurfaceArray based on the surface dimension
  isl/format: Mark R9G9B9E5 as containing 9-bit unsigned float channels
  isl/state: Set the IntegerSurfaceFormat bit on Haswell
  isl/state: Use the layout for computing qpitch rather than dimensions
  isl/state: Only set cube face enables if usage includes CUBE_BIT
  isl/state: Emit no-op mip tail setup on SKL
  isl/state: Use TILEWALK_XMAJOR for linear surfaces on gen7
  isl/state: Don't set SurfacePitch for gen9 1-D textures
  isl/state: Add assertions for buffer surface restrictions
  isl/state: Don't use designated initializers for buffer surface state
  isl/state: Allow for full 31-bit buffer texture sizes
  anv,isl: Lower storage image formats in anv
  genxml: Put append counter fields before MCS in RENDER_SURFACE_STATE
on gen7
  genxml: Add enough XML for gens 4, 4.5, and 5 to get SURFACE_STATE
  genxml: Make X/Y Offset field of SURFACE_STATE a uint
  genxml: Add macros and #includes for gens 4-6
  isl: Add an ISL_DEV_IS_G4X macro
  isl: Add support for filling out surface states all the way back to
gen4
  isl: Add surface formats for non-MSAA CCS surfaces
  isl/state: Add support for handling color control surfaces
  isl/state: Add support for OffsetX/Y in surface state
  i965/miptree: Add a helper for getting an isl_surf from a miptree
  i965/miptree: Add a helper for getting the ISL clear color from a
miptree
  i965/miptree: Add a helper for getting the aux isl_surf from a miptree